Monthly Digest — 2026-03
631 unique stories across 31 days and 8 sources.
Hacker News (124)
- How to talk to anyone, and why you should (www.theguardian.com)
- Microgpt explained interactively (growingswe.com)
- Operational issue – Multiple services (UAE) (health.aws.amazon.com)
- When does MCP make sense vs CLI? (ejholmes.github.io)
- The workers behind Meta's smart glasses can see everything (www.svd.se)
- Welcome (back) to Macintosh (take.surf)
- British Columbia to end time changes, adopt year-round daylight time (www.cbc.ca)
- First in-utero stem cell therapy for fetal spina bifida repair is safe: study (health.ucdavis.edu)
- Iran War Cost Tracker (iran-cost-ticker.com)
- GitHub Is Having Issues (www.githubstatus.com)
- Intel's make-or-break 18A process node debuts for data center with 288-core Xeon (www.tomshardware.com)
- GPT‑5.3 Instant (openai.com)
- Building a new Flash (bill.newgrounds.com)
- An interactive map of Flock Cams (deflock.org)
- Making Firefox's right-click not suck with about:config (joshua.hu)
- Something is afoot in the land of Qwen (simonwillison.net)
- Pentagon formally labels Anthropic supply-chain risk (www.wsj.com)
- GPT-5.4 (openai.com)
- The government uses targeted advertising to track your location (www.eff.org)
- A GitHub Issue Title Compromised 4k Developer Machines (grith.ai)
GitHub Trending (62)
- moeru-ai / airi
- ruvnet / wifi-densepose
- ruvnet / ruflo
- microsoft / markitdown
- anthropics / prompt-eng-interactive-tutorial
- ruvnet / RuView
- K-Dense-AI / claude-scientific-skills
- CodebuffAI / codebuff
- KeygraphHQ / shannon
- msitarzewski / agency-agents
- aquasecurity / trivy
- TheCraigHewitt / seomachine
- QwenLM / Qwen-Agent
- microsoft / hve-core
- Ed1s0nZ / CyberStrikeAI
- 666ghj / MiroFish
- openai / skills
- GoogleCloudPlatform / generative-ai
- shadcn-ui / ui
- openclaw / openclaw
Product Hunt (123)
- Simplora 2.0
The agentic meeting stack with free prep, notes, and chat
- Voicr
Your voice in, polished text out — in seconds
- Epismo Skills
Everything your agent needs to run reliably
- Octrafic
Test your APIs in plain English, straight from the terminal
- Rankfender
AI visibility and automated SEO optimization platform
- ChatWithAds
From Data to AI-Assisted Decision, In One Conversation.
- Mosaic
Zapier for Video Editing
- JDoodleClaw
The most user-friendly OpenClaw. Securely hosted.
- getviktor.com
Your AI Coworker that proactively executes tasks
- Krisp Accent Conversion
Understand accented speech in real time
- The Bias
The synthesis engine for multi-perspective news
- Deep Personality
Science-backed personality insights for you and your partner
- Gemini 3.1 Flash-Lite
Best-in-class intelligence for your high-volume workloads
- Fix in Cursor
GitHub PR comment to Cursor prompt in one click
- Picsart Persona & Storyline
Design your AI influencer and create any story with it.
- Projekt
The BYOK Design & Dev Tool for Building with Agents
- Step 3.5 Flash
Frontier open-source MoE model built for OpenClaw agents
- Itchy
Free macOS notch app with 12+ modules & custom SDK
- Codex app for Windows
Codex now runs natively on Windows with secure sandbox
- Parsewise
Cursor for document work
Hugging Face (85)
- The Trinity of Consistency as a Defining Principle for General World Models
The construction of World Models capable of learning, simulating, and reasoning about objective physical laws constitutes a foundational challenge in the pursuit of Artificial General Intelligence. Recent advancements represented by video generation models like Sora have demonstrated the potential of data-driven scaling laws to approximate physical dynamics, while the emerging Unified Multimodal Model (UMM) offers a promising architectural paradigm for integrating perception, language, and reasoning. Despite these advances, the field still lacks a principled theoretical framework that defines the essential properties requisite for a General World Model. In this paper, we propose that a World Model must be grounded in the Trinity of Consistency: Modal Consistency as the semantic interface, Spatial Consistency as the geometric basis, and Temporal Consistency as the causal engine. Through this tripartite lens, we systematically review the evolution of multimodal learning, revealing a trajectory from loosely coupled specialized modules toward unified architectures that enable the synergistic emergence of internal world simulators. To complement this conceptual framework, we introduce CoW-Bench, a benchmark centered on multi-frame reasoning and generation scenarios. CoW-Bench evaluates both video generation models and UMMs under a unified evaluation protocol. Our work establishes a principled pathway toward general world models, clarifying both the limitations of current systems and the architectural requirements for future progress.
- From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models
As Large Multimodal Models (LMMs) scale up and reinforcement learning (RL) methods mature, LMMs have made notable progress in complex reasoning and decision making. Yet training still relies on static data and fixed recipes, making it difficult to diagnose capability blind spots or provide dynamic, targeted reinforcement. Motivated by findings that test-driven error exposure and feedback-based correction outperform repetitive practice, we propose Diagnostic-driven Progressive Evolution (DPE), a spiral loop where diagnosis steers data generation and reinforcement, and each iteration re-diagnoses the updated model to drive the next round of targeted improvement. DPE has two key components. First, multiple agents annotate and quality-control massive unlabeled multimodal data, using tools such as web search and image editing to produce diverse, realistic samples. Second, DPE attributes failures to specific weaknesses, dynamically adjusts the data mixture, and guides agents to generate weakness-focused data for targeted reinforcement. Experiments on Qwen3-VL-8B-Instruct and Qwen2.5-VL-7B-Instruct show stable, continual gains across eleven benchmarks, indicating DPE as a scalable paradigm for continual LMM training under open task distributions. Our code, models, and data are publicly available at https://github.com/hongruijia/DPE.
- MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
Route-planning agents powered by large language models (LLMs) have emerged as a promising paradigm for supporting everyday human mobility through natural language interaction and tool-mediated decision making. However, systematic evaluation in real-world mobility settings is hindered by diverse routing demands, non-deterministic mapping services, and limited reproducibility. In this study, we introduce MobilityBench, a scalable benchmark for evaluating LLM-based route-planning agents in real-world mobility scenarios. MobilityBench is constructed from large-scale, anonymized real user queries collected from Amap and covers a broad spectrum of route-planning intents across multiple cities worldwide. To enable reproducible, end-to-end evaluation, we design a deterministic API-replay sandbox that eliminates environmental variance from live services. We further propose a multi-dimensional evaluation protocol centered on outcome validity, complemented by assessments of instruction understanding, planning, tool use, and efficiency. Using MobilityBench, we evaluate multiple LLM-based route-planning agents across diverse real-world mobility scenarios and provide an in-depth analysis of their behaviors and performance. Our findings reveal that current models perform competently on basic information retrieval and route-planning tasks, yet struggle considerably with preference-constrained route planning, underscoring significant room for improvement in personalized mobility applications. We publicly release the benchmark data, evaluation toolkit, and documentation at https://github.com/AMAP-ML/MobilityBench.
- OmniGAIA: Towards Native Omni-Modal AI Agents
Human intelligence naturally intertwines omni-modal perception -- spanning vision, audio, and language -- with complex reasoning and tool usage to interact with the world. However, current multi-modal LLMs are primarily confined to bi-modal interactions (e.g., vision-language), lacking the unified cognitive capabilities required for general AI assistants. To bridge this gap, we introduce OmniGAIA, a comprehensive benchmark designed to evaluate omni-modal agents on tasks necessitating deep reasoning and multi-turn tool execution across video, audio, and image modalities. Constructed via a novel omni-modal event graph approach, OmniGAIA synthesizes complex, multi-hop queries derived from real-world data that require cross-modal reasoning and external tool integration. Furthermore, we propose OmniAtlas, a native omni-modal foundation agent under a tool-integrated reasoning paradigm with active omni-modal perception. Trained on trajectories synthesized via a hindsight-guided tree exploration strategy and OmniDPO for fine-grained error correction, OmniAtlas effectively enhances the tool-use capabilities of existing open-source models. This work marks a step towards next-generation native omni-modal AI assistants for real-world scenarios.
- dLLM: Simple Diffusion Language Modeling
Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases or lack transparent implementations, making them difficult to reproduce or extend. As the field accelerates, there is a clear need for a unified framework that standardizes these common components while remaining flexible enough to support new methods and architectures. To address this gap, we introduce dLLM, an open-source framework that unifies the core components of diffusion language modeling -- training, inference, and evaluation -- and makes them easy to customize for new designs. With dLLM, users can reproduce, finetune, deploy, and evaluate open-source large DLMs such as LLaDA and Dream through a standardized pipeline. The framework also provides minimal, reproducible recipes for building small DLMs from scratch with accessible compute, including converting any BERT-style encoder or autoregressive LM into a DLM. We also release the checkpoints of these small DLMs to make DLMs more accessible and accelerate future research.
- Enhancing Spatial Understanding in Image Generation via Reward Modeling
Recent progress in text-to-image generation has greatly advanced visual fidelity and creativity, but it has also imposed higher demands on prompt complexity, particularly in encoding intricate spatial relationships. In such cases, achieving satisfactory results often requires multiple sampling attempts. To address this challenge, we introduce a novel method that strengthens the spatial understanding of current image generation models. We first construct the SpatialReward-Dataset with over 80k preference pairs. Building on this dataset, we build SpatialScore, a reward model designed to evaluate the accuracy of spatial relationships in text-to-image generation, achieving performance that even surpasses leading proprietary models on spatial evaluation. We further demonstrate that this reward model effectively enables online reinforcement learning for complex spatial generation. Extensive experiments across multiple benchmarks show that our specialized reward model yields significant and consistent gains in spatial understanding for image generation.
- Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets
The reliability of multilingual Large Language Model (LLM) evaluation is currently compromised by the inconsistent quality of translated benchmarks. Existing resources often suffer from semantic drift and context loss, which can lead to misleading performance metrics. In this work, we present a fully automated framework designed to address these challenges by enabling scalable, high-quality translation of datasets and benchmarks. We demonstrate that adapting test-time compute scaling strategies, specifically Universal Self-Improvement (USI) and our proposed multi-round ranking method, T-RANK, allows for significantly higher quality outputs compared to traditional pipelines. Our framework ensures that benchmarks preserve their original task structure and linguistic nuances during localization. We apply this approach to translate popular benchmarks and datasets into eight Eastern and Southern European languages (Ukrainian, Bulgarian, Slovak, Romanian, Lithuanian, Estonian, Turkish, Greek). Evaluations using both reference-based metrics and LLM-as-a-judge show that our translations surpass existing resources, resulting in more accurate downstream model assessment. We release both the framework and the improved benchmarks to facilitate robust and reproducible multilingual AI development.
- CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
GPU kernel optimization is fundamental to modern deep learning but remains a highly specialized task requiring deep hardware expertise. Despite strong performance in general programming, large language models (LLMs) remain uncompetitive with compiler-based systems such as torch.compile for CUDA kernel generation. Existing CUDA code generation approaches either rely on training-free refinement or fine-tune models within fixed multi-turn execution-feedback loops, but both paradigms fail to fundamentally improve the model's intrinsic CUDA optimization ability, resulting in limited performance gains. We present CUDA Agent, a large-scale agentic reinforcement learning system that develops CUDA kernel expertise through three components: a scalable data synthesis pipeline, a skill-augmented CUDA development environment with automated verification and profiling to provide reliable reward signals, and reinforcement learning algorithmic techniques enabling stable training. CUDA Agent achieves state-of-the-art results on KernelBench, delivering 100%, 100%, and 92% speedups over torch.compile on the KernelBench Level-1, Level-2, and Level-3 splits, and outperforming the strongest proprietary models such as Claude Opus 4.5 and Gemini 3 Pro by about 40% on the hardest Level-3 setting.
- UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?
Unified multimodal models have recently demonstrated strong generative capabilities, yet whether and when generation improves understanding remains unclear. Existing benchmarks lack a systematic exploration of the specific tasks where generation facilitates understanding. To this end, we introduce UniG2U-Bench, a comprehensive benchmark categorizing generation-to-understanding (G2U) evaluation into 7 regimes and 30 subtasks, requiring varying degrees of implicit or explicit visual transformations. Extensive evaluation of over 30 models reveals three core findings: 1) Unified models generally underperform their base Vision-Language Models (VLMs), and Generate-then-Answer (GtA) inference typically degrades performance relative to direct inference. 2) Consistent enhancements emerge in spatial intelligence, visual illusions, or multi-round reasoning subtasks, where enhanced spatial and shape perception, as well as multi-step intermediate image states, prove beneficial. 3) Tasks with similar reasoning structures and models sharing architectures exhibit correlated behaviors, suggesting that generation-understanding coupling induces class-consistent inductive biases over tasks, pretraining data, and model architectures. These findings highlight the necessity for more diverse training data and novel paradigms to fully unlock the potential of unified multimodal modeling.
- Beyond Language Modeling: An Exploration of Multimodal Pretraining
The visual world offers a critical axis for advancing foundation models beyond language. Despite growing interest in this direction, the design space for native multimodal models remains opaque. We provide empirical clarity through controlled, from-scratch pretraining experiments, isolating the factors that govern multimodal pretraining without interference from language pretraining. We adopt the Transfusion framework, using next-token prediction for language and diffusion for vision, to train on diverse data including text, video, image-text pairs, and even action-conditioned video. Our experiments yield four key insights: (i) Representation Autoencoder (RAE) provides an optimal unified visual representation by excelling at both visual understanding and generation; (ii) visual and language data are complementary and yield synergy for downstream capabilities; (iii) unified multimodal pretraining leads naturally to world modeling, with capabilities emerging from general training; and (iv) Mixture-of-Experts (MoE) enables efficient and effective multimodal scaling while naturally inducing modality specialization. Through IsoFLOP analysis, we compute scaling laws for both modalities and uncover a scaling asymmetry: vision is significantly more data-hungry than language. We demonstrate that the MoE architecture harmonizes this scaling asymmetry by providing the high model capacity required by language while accommodating the data-intensive nature of vision, paving the way for truly unified multimodal models.
- Utonia: Toward One Encoder for All Point Clouds
We dream of a future where point clouds from all domains can come together to shape a single model that benefits them all. Toward this goal, we present Utonia, a first step toward training a single self-supervised point transformer encoder across diverse domains, spanning remote sensing, outdoor LiDAR, indoor RGB-D sequences, object-centric CAD models, and point clouds lifted from RGB-only videos. Despite their distinct sensing geometries, densities, and priors, Utonia learns a consistent representation space that transfers across domains. This unification improves perception capability while revealing intriguing emergent behaviors that arise only when domains are trained jointly. Beyond perception, we observe that Utonia representations can also benefit embodied and multimodal reasoning: conditioning vision-language-action policies on Utonia features improves robotic manipulation, and integrating them into vision-language models yields gains on spatial reasoning. We hope Utonia can serve as a step toward foundation models for sparse 3D data, and support downstream applications in AR/VR, robotics, and autonomous driving.
- BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?
Current benchmarks for code agents primarily assess narrow, repository-specific fixes, overlooking critical real-world challenges such as cross-repository reasoning, domain-specialized problem solving, dependency-driven migration, and full-repository generation. To address this gap, we introduce BeyondSWE, a comprehensive benchmark that broadens existing evaluations along two axes - resolution scope and knowledge scope - using 500 real-world instances across four distinct settings. Experimental results reveal a significant capability gap: even frontier models plateau below 45% success, and no single model performs consistently across task types. To systematically investigate the role of external knowledge, we develop SearchSWE, a framework that integrates deep search with coding abilities. Our experiments show that search augmentation yields inconsistent gains and can in some cases degrade performance, highlighting the difficulty of emulating developer-like workflows that interleave search and reasoning during coding tasks. This work offers both a realistic, challenging evaluation benchmark and a flexible framework to advance research toward more capable code agents.
- Helios: Real Real-Time Long Video Generation Model
We introduce Helios, the first 14B video generation model that runs at 19.5 FPS on a single NVIDIA H100 GPU and supports minute-scale generation while matching the quality of a strong baseline. We make breakthroughs along three key dimensions: (1) robustness to long-video drifting without commonly used anti-drifting heuristics such as self-forcing, error-banks, or keyframe sampling; (2) real-time generation without standard acceleration techniques such as KV-cache, sparse/linear attention, or quantization; and (3) training without parallelism or sharding frameworks, enabling image-diffusion-scale batch sizes while fitting up to four 14B models within 80 GB of GPU memory. Specifically, Helios is a 14B autoregressive diffusion model with a unified input representation that natively supports T2V, I2V, and V2V tasks. To mitigate drifting in long-video generation, we characterize typical failure modes and propose simple yet effective training strategies that explicitly simulate drifting during training, while eliminating repetitive motion at its source. For efficiency, we heavily compress the historical and noisy context and reduce the number of sampling steps, yielding computational costs comparable to -- or lower than -- those of 1.3B video generative models. Moreover, we introduce infrastructure-level optimizations that accelerate both inference and training while reducing memory consumption. Extensive experiments demonstrate that Helios consistently outperforms prior methods on both short- and long-video generation. We plan to release the code, base model, and distilled model to support further development by the community.
- T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning
Consider how humans handle complex reading tasks: marking key points, inferring their relationships, and structuring information to guide understanding and responses. Likewise, can a large language model benefit from text structure to enhance its text-processing performance? To explore this, we first introduce Structure of Thought (SoT), a prompting technique that explicitly guides models to construct intermediate text structures, consistently boosting performance across eight tasks and three model families. Building upon this insight, we present T2S-Bench, the first benchmark designed to evaluate and improve text-to-structure capabilities of models. T2S-Bench includes 1.8K samples across 6 scientific domains and 32 structural types, rigorously constructed to ensure accuracy, fairness, and quality. Evaluation on 45 mainstream models reveals substantial improvement potential: the average accuracy on the multi-hop reasoning task is only 52.1%, and even the most advanced model achieves 58.1% node accuracy in end-to-end extraction. Furthermore, on Qwen2.5-7B-Instruct, SoT alone yields an average +5.7% improvement across eight diverse text-processing tasks, and fine-tuning on T2S-Bench further increases this gain to +8.6%. These results highlight the value of explicit text structuring and the complementary contributions of SoT and T2S-Bench. Dataset and eval code have been released at https://t2s-bench.github.io/T2S-Bench-Page/.
- Heterogeneous Agent Collaborative Reinforcement Learning
We introduce Heterogeneous Agent Collaborative Reinforcement Learning (HACRL), a new learning paradigm that addresses the inefficiencies of isolated on-policy optimization. HACRL enables collaborative optimization with independent execution: heterogeneous agents share verified rollouts during training to mutually improve, while operating independently at inference time. Unlike LLM-based multi-agent reinforcement learning (MARL), HACRL does not require coordinated deployment, and unlike on-/off-policy distillation, it enables bidirectional mutual learning among heterogeneous agents rather than one-directional teacher-to-student transfer. Building on this paradigm, we propose HACPO, a collaborative RL algorithm that enables principled rollout sharing to maximize sample utilization and cross-agent knowledge transfer. To mitigate capability discrepancies and policy distribution shifts, HACPO introduces four tailored mechanisms with theoretical guarantees on unbiased advantage estimation and optimization correctness. Extensive experiments across diverse heterogeneous model combinations and reasoning benchmarks show that HACPO consistently improves all participating agents, outperforming GSPO by an average of 3.3% while using only half the rollout cost.
- Proact-VL: A Proactive VideoLLM for Real-Time AI Companions
Proactive and real-time interactive experiences are essential for human-like AI companions, yet face three key challenges: (1) achieving low-latency inference under continuous streaming inputs, (2) autonomously deciding when to respond, and (3) controlling both quality and quantity of generated content to meet real-time constraints. In this work, we instantiate AI companions through two gaming scenarios, commentator and guide, selected for their suitability for automatic evaluation. We introduce the Live Gaming Benchmark, a large-scale dataset with three representative scenarios: solo commentary, co-commentary, and user guidance, and present Proact-VL, a general framework that shapes multimodal language models into proactive, real-time interactive agents capable of human-like environment perception and interaction. Extensive experiments show Proact-VL achieves superior response latency and quality while maintaining strong video understanding capabilities, demonstrating its practicality for real-time interactive applications.
- MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier
While large language models (LLMs) show promise in scientific discovery, existing research focuses on inference or feedback-driven training, leaving the direct modeling of the generative reasoning process, P(hypothesis|background) (P(h|b)), unexplored. We demonstrate that directly training P(h|b) is mathematically intractable due to the combinatorial complexity (O(N^k)) inherent in retrieving and composing inspirations from a vast knowledge base. To break this barrier, we introduce MOOSE-Star, a unified framework enabling tractable training and scalable inference. In the best case, MOOSE-Star reduces complexity from exponential to logarithmic (O(log N)) by (1) training on decomposed subtasks derived from the probabilistic equation of discovery, (2) employing motivation-guided hierarchical search to enable logarithmic retrieval and prune irrelevant subspaces, and (3) utilizing bounded composition for robustness against retrieval noise. To facilitate this, we release TOMATO-Star, a dataset of 108,717 decomposed papers (38,400 GPU hours) for training. Furthermore, we show that while brute-force sampling hits a "complexity wall," MOOSE-Star exhibits continuous test-time scaling.
- SkillNet: Create, Evaluate, and Connect AI Skills
Current AI agents can flexibly invoke tools and execute complex tasks, yet their long-term advancement is hindered by the lack of systematic accumulation and transfer of skills. Without a unified mechanism for skill consolidation, agents frequently "reinvent the wheel," rediscovering solutions in isolated contexts without leveraging prior strategies. To overcome this limitation, we introduce SkillNet, an open infrastructure designed to create, evaluate, and organize AI skills at scale. SkillNet structures skills within a unified ontology that supports creating skills from heterogeneous sources, establishing rich relational connections, and performing multi-dimensional evaluation across Safety, Completeness, Executability, Maintainability, and Cost-awareness. Our infrastructure integrates a repository of over 200,000 skills, an interactive platform, and a versatile Python toolkit. Experimental evaluations on ALFWorld, WebShop, and ScienceWorld demonstrate that SkillNet significantly enhances agent performance, improving average rewards by 40% and reducing execution steps by 30% across multiple backbone models. By formalizing skills as evolving, composable assets, SkillNet provides a robust foundation for agents to move from transient experience to durable mastery.
- DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval
Large Language Model (LLM) agents can automate data-science workflows, but many rigorous statistical methods implemented in R remain underused because LLMs struggle with statistical knowledge and tool retrieval. Existing retrieval-augmented approaches focus on function-level semantics and ignore data distribution, producing suboptimal matches. We propose DARE (Distribution-Aware Retrieval Embedding), a lightweight, plug-and-play retrieval model that incorporates data distribution information into function representations for R package retrieval. Our main contributions are: (i) RPKB, a curated R Package Knowledge Base derived from 8,191 high-quality CRAN packages; (ii) DARE, an embedding model that fuses distributional features with function metadata to improve retrieval relevance; and (iii) RCodingAgent, an R-oriented LLM agent for reliable R code generation and a suite of statistical analysis tasks for systematically evaluating LLM agents in realistic analytical scenarios. Empirically, DARE achieves an NDCG@10 of 93.47%, outperforming state-of-the-art open-source embedding models by up to 17% on package retrieval while using substantially fewer parameters. Integrating DARE into RCodingAgent yields significant gains on downstream analysis tasks. This work helps narrow the gap between LLM automation and the mature R statistical ecosystem.
- RoboPocket: Improve Robot Policies Instantly with Your Phone
Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing the underlying policy's weaknesses, leading to inefficient coverage of critical state distributions. Conversely, interactive methods like DAgger effectively address covariate shift but rely on physical robot execution, which is costly and difficult to scale. To reconcile this trade-off, we introduce RoboPocket, a portable system that enables Robot-Free Instant Policy Iteration using single consumer smartphones. Its core innovation is a Remote Inference framework that visualizes the policy's predicted trajectory via Augmented Reality (AR) Visual Foresight. This immersive feedback allows collectors to proactively identify potential failures and focus data collection on the policy's weak regions without requiring a physical robot. Furthermore, we implement an asynchronous Online Finetuning pipeline that continuously updates the policy with incoming data, effectively closing the learning loop in minutes. Extensive experiments demonstrate that RoboPocket adheres to data scaling laws and doubles the data efficiency compared to offline scaling strategies, overcoming their long-standing efficiency bottleneck. Moreover, our instant iteration loop also boosts sample efficiency by up to 2x in distributed environments with a small number of interactive corrections per person. Project page and videos: https://robo-pocket.github.io.
Techmeme (124)
- Australia's eSafety Commissioner threatens action against app stores and search engines if AI services operating in Australia don't verify user ages by March 9 (Byron Kaye/Reuters)
Byron Kaye / Reuters: Australia's internet regulator said it may push search engines and app stores to block artificial intelligence services that fail …
- Israel-based Guidde, which is developing a platform to accelerate the adoption of AI in organizations, raised a $50M Series B round led by PSG Equity (Meir Orbach/CTech)
Meir Orbach / CTech : Israel-based Guidde, which is developing a platform to accelerate the adoption of AI in organizations, raised a $50M Series B round led by PSG Equity — The Israeli startup's platform turns employee workflows into structured knowledge for automation. — Guidde, a startup developing …
- Sources describe in detail the failed talks between Anthropic and DOD, and how officials at agencies, including the CIA, still hope for a peace agreement (New York Times)
New York Times : Sources describe in detail the failed talks between Anthropic and DOD, and how officials at agencies, including the CIA, still hope for a peace agreement — The Pentagon and Anthropic were close to agreeing on the use of artificial intelligence. But strong personalities, mutual dislike and a rival company unraveled a deal.
- Chinese matchmaking apps like Wanmei Qinjia, which has 50M users and lets parents look for spouses for their children, surge as marriage rates continue to fall (Kohei Fujimura/Nikkei Asia)
Kohei Fujimura / Nikkei Asia : Chinese matchmaking apps like Wanmei Qinjia, which has 50M users and lets parents look for spouses for their children, surge as marriage rates continue to fall — DALIAN, China — Apps that enable parents to search for spouses for their unmarried children have become increasingly popular in China …
- Filing: PayPay is seeking to raise up to $1.1B at a valuation of up to $13.4B in its US IPO, selling nearly 55M shares priced between $17 and $20 apiece (Arasu Kannagi Basil/Reuters)
Arasu Kannagi Basil / Reuters : Filing: PayPay is seeking to raise up to $1.1B at a valuation of up to $13.4B in its US IPO, selling nearly 55M shares priced between $17 and $20 apiece — PayPay and a selling shareholder are aiming to raise as much as $1.1 billion in an initial public offering in the United States …
- Source: Cursor's annualized revenue topped $2B in February, doubling from three months earlier, and about 60% of the revenue is coming from corporate customers (Rachel Metz/Bloomberg)
Rachel Metz / Bloomberg : Source: Cursor's annualized revenue topped $2B in February, doubling from three months earlier, and about 60% of the revenue is coming from corporate customers — Cursor's annualized revenue topped $2 billion in February, according to a person familiar with the matter …
- The US Treasury Department, State Department, and federal housing agency are ending use of Anthropic products; State Department says it will switch to OpenAI (Reuters)
Reuters : The US Treasury Department, State Department, and federal housing agency are ending use of Anthropic products; State Department says it will switch to OpenAI — The U.S. Treasury Department, State Department and the federal housing agency are terminating all use of Anthropic products …
- Sources: US considers limiting Chinese companies to 75K Nvidia H200 chips each, less than half of what some want to buy; AMD MI325 chips also count toward a cap (Bloomberg)
Bloomberg : Sources: US considers limiting Chinese companies to 75K Nvidia H200 chips each, less than half of what some want to buy; AMD MI325 chips also count toward a cap — US officials are considering caps on the number of AI accelerators Nvidia Corp. can export to any one Chinese company …
- Sources: President Trump met with Coinbase CEO Brian Armstrong on March 3 before publicly admonishing banks over the GENIUS Act, echoing Coinbase's position (Jasper Goodman/Politico)
Jasper Goodman / Politico : Sources: President Trump met with Coinbase CEO Brian Armstrong on March 3 before publicly admonishing banks over the GENIUS Act, echoing Coinbase's position — President Donald Trump met privately on Tuesday with Coinbase CEO Brian Armstrong before publicly backing the company's position …
- An OpenAI spokesperson says Sam Altman misspoke in saying OpenAI was looking to deploy on NATO classified networks, and that it was for "unclassified networks" (Hyunsu Yim/Reuters)
Hyunsu Yim / Reuters : An OpenAI spokesperson says Sam Altman misspoke in saying OpenAI was looking to deploy on NATO classified networks, and that it was for “unclassified networks” — OpenAI is considering a contract to deploy its AI technology on North Atlantic Treaty Organization's (NATO) …
- Asia's smaller chip companies are joining their bigger peers in hiking prices as robust AI demand fuels capex, projected to rise 25% YoY to over $136B in 2026 (Nikkei Asia)
Nikkei Asia : Asia's smaller chip companies are joining their bigger peers in hiking prices as robust AI demand fuels capex, projected to rise 25% YoY to over $136B in 2026 — TAIPEI — Asia's smaller chip companies are joining their bigger peers in hiking prices as robust AI demand fuels record levels …
- The UK government commits an initial £40M to an AI research lab, modeled on its DARPA-inspired ARIA, seeking breakthroughs in science, healthcare, and transport (Madhumita Murgia/Financial Times)
Madhumita Murgia / Financial Times : The UK government commits an initial £40M to an AI research lab, modeled on its DARPA-inspired ARIA, seeking breakthroughs in science, healthcare, and transport — New state-backed body seeks AI breakthroughs in science, healthcare and transport — The UK is launching …
- Pasqal, a French startup that builds quantum processors using neutral atom technology, plans to go public via a SPAC merger at a $2B pre-money valuation (Bailey Lipschultz/Bloomberg)
Bailey Lipschultz / Bloomberg : Pasqal, a French startup that builds quantum processors using neutral atom technology, plans to go public via a SPAC merger at a $2B pre-money valuation — Pasqal Holding SAS agreed to merge with a blank-check firm in a deal that values the combined company at $2 billion pre-money …
- Google, Microsoft, Meta, Amazon, OpenAI, and others sign a pledge at the White House to bear the cost of new electricity generation to power their data centers (Reuters)
Reuters : Google, Microsoft, Meta, Amazon, OpenAI, and others sign a pledge at the White House to bear the cost of new electricity generation to power their data centers — Google (GOOGL.O), Microsoft (MSFT.O), Meta (META.O), Amazon (AMZN.O) and several artificial intelligence companies signed a pledge …
- Leaked Friday memo: Dario Amodei called OpenAI's DOD deal "safety theater", said DOD dislikes Anthropic in part for not giving "dictator-style praise to Trump" (The Information)
The Information : Leaked Friday memo: Dario Amodei called OpenAI's DOD deal “safety theater”, said DOD dislikes Anthropic in part for not giving “dictator-style praise to Trump” — Anthropic CEO Dario Amodei on Friday told employees that a deal OpenAI and its CEO Sam Altman struck …
- Source: a16z Crypto is targeting around $2B for its fifth fund and plans to close the raise by the end of the first half of 2026 (Fortune)
Fortune : Source: a16z Crypto is targeting around $2B for its fifth fund and plans to close the raise by the end of the first half of 2026 — The largest player in the crypto venture world is back on the fundraising circuit. The blockchain arm of Andreessen Horowitz, also known as a16z crypto …
- Sources: Together AI is in talks to raise ~$1B at a $7.5B pre-money valuation, up from $3.3B in 2025; its annualized revenue has hit ~$1B, up 3x+ from mid-2025 (The Information)
The Information : Sources: Together AI is in talks to raise ~$1B at a $7.5B pre-money valuation, up from $3.3B in 2025; its annualized revenue has hit ~$1B, up 3x+ from mid-2025 — Together AI, one of several up-and-coming cloud providers renting out Nvidia chip servers to AI developers …
- Anthropic says Claude's free active users grew 60%+ and daily signups grew 4x since the start of the year, with Monday being its strongest day ever (Shirin Ghaffary/Bloomberg)
Shirin Ghaffary / Bloomberg : Anthropic says Claude's free active users grew 60%+ and daily signups grew 4x since the start of the year, with Monday being its strongest day ever — The Claude maker gains new traction with everyday users while its enterprise business is under pressure — Anthropic is gaining ground …
- Anthropic launches an early-warning system for potential AI-driven destruction of white-collar jobs, says it shows "limited evidence" of AI-led job loss so far (Courtenay Brown/Axios)
Courtenay Brown / Axios : Anthropic launches an early-warning system for potential AI-driven destruction of white-collar jobs, says it shows “limited evidence” of AI-led job loss so far — An occupation's specific tasks; an estimate of which of those tasks can be performed by large language models.
- Microsoft's new gaming CEO, Asha Sharma, teases the next-gen Xbox, codenamed Project Helix, saying it "will lead in performance and play your Xbox and PC games" (Jay Peters/The Verge)
Jay Peters / The Verge : Microsoft's new gaming CEO, Asha Sharma, teases the next-gen Xbox, codenamed Project Helix, saying it “will lead in performance and play your Xbox and PC games” — The new Xbox boss Asha Sharma revealed the codename as one of her first big announcements.
Solidot(113)
- California and Colorado plan to require OS-level age verification
Multiple US states require adult websites to verify visitor ages, but common verification methods such as face scans or ID documents raise privacy concerns. California and Colorado plan to require age verification at the operating-system level, with the result shared with apps via an API. California passed AB 1043 last year, requiring OS developers to create a way for device owners to register their age bracket; the law takes effect January 1, 2027. Colorado legislators have introduced a similar bill, SB26-051. Its co-sponsor, Senator Matt Ball, said the goal is to provide comprehensive online safety protections for children through a privacy-focused age-verification framework.
- Anthropic's Claude tops Apple's US free app chart
On Friday, President Trump ordered federal agencies to immediately stop using Anthropic's Claude assistant because Anthropic held firm on its safety principles. By contrast, its rival OpenAI appeared to take no stand at all, prompting a wave of US users uninstalling OpenAI's ChatGPT and installing Claude. The trend pushed Claude to the top of Apple's US App Store free chart on Saturday, ahead of ChatGPT in second place; Google's Gemini ranked fourth. According to analytics firm Sensor Tower, Claude ranked 131st a month earlier on January 30 and hovered around the top 20 for most of February, while ChatGPT was usually first. Anthropic also released a memory-import feature to make it easier for ChatGPT users to switch to Claude.
- When you need help, dogs respond like 2-year-olds while cats just watch
According to a study published in Animal Behaviour, Hungarian researchers compared how 18-to-24-month-old toddlers, pet dogs, and cats respond when a person needs help. Dogs' spontaneous prosocial behavior resembled that of toddlers, while cats looked on indifferently. In the experiments, a familiar person such as a parent or owner pretended to search for a hidden object; dogs and toddlers helped in three quarters of cases. Cats pitched in only when doing so served their own interests.
- Croatia declares demining complete 31 years after the war ended
Landmines were used extensively in the Croatian War of 1991-1995, leaving more than 1,000 square kilometers of minefields. 31 years after the war ended, Croatian Interior Minister Davor Božinović announced that all known minefields have been cleared. 208 people died during the demining effort, including 41 deminers, at a total cost of about 1.2 billion euros. He said nearly 107,000 mines and 407,000 pieces of unexploded ordnance were removed.
- NIST restricts foreign scientists' access to its labs
Over the past few weeks, hundreds of foreign scientists working at the US National Institute of Standards and Technology (NIST) have been barred from entering laboratories in the evenings and on weekends unless accompanied by a federal employee. Scientists from some countries will lose access as early as the end of this month. The proposed rules have no written version yet and have been communicated only in meetings. The latest changes build on research-security rules NIST updated in 2025, which treat scientists from China, Russia, Iran, North Korea, Cuba, Venezuela, and Syria as "high-risk." Researchers from China and other listed countries have been told their lab access will be reviewed by March 31, and access will be terminated for those deemed "high-risk" because they have worked at NIST for more than three years or work on sensitive projects such as quantum technology or AI. Researchers from low-risk countries also face losing access starting in September or December. NIST researchers do not conduct classified work, and former NIST director Patrick Gallagher said he sees no security benefit in the move.
- Amazon AWS Middle East data centers hit by fire and power outage
Amazon AWS disclosed that one of its Middle East data centers suffered a power outage while another caught fire after being struck by an "object." It did not say what struck the facility. AWS said a data center in the UAE was hit at 7:30 a.m. ET; the impact produced sparks and a fire, and the fire department cut power to the data center and its generators while extinguishing the blaze.
- Why women's pain lasts longer
Doctors have generally assumed the immune system worsens pain by causing inflammation, which typically shows up as redness and swelling. New research shows immune cells may also be crucial in resolving pain, and sex differences in immune-cell function may affect how quickly pain subsides. Researchers examined a molecule called IL-10 (interleukin-10), measuring IL-10 levels in mice after skin injury and in emergency-room patients injured in traffic accidents. They found that IL-10 not only dampens inflammation but also communicates directly with pain-sensing nerve cells to switch them off; in other words, IL-10 helps eliminate pain. IL-10 is produced by monocytes, a type of white blood cell that circulates in the blood and migrates to injured tissue. Monocytes produce IL-10 more readily in males than in females, because testosterone affects how much IL-10 monocytes produce, and males have higher testosterone levels.
- Mouse study finds organs age in sync, with sex differences
Researchers built the most detailed atlas yet of how aging affects thousands of cell subtypes across 21 mammalian tissues. By analyzing nearly 7 million single cells from mice of different ages, they identified the cells most vulnerable to damage over time and the factors driving their aging. The team analyzed millions of single cells extracted from 21 organs of 32 mice in three age groups: 1 month (young adult), 5 months (middle-aged), and 21 months (old). They identified more than 1,800 distinct cell subtypes, including many rare types never fully described before, then tracked how the numbers of each cell type changed across the age groups. For decades, scientists believed aging mainly alters cell function rather than cell numbers; this analysis challenges that view. About a quarter of cell types changed significantly in number over time: certain muscle and kidney cells declined sharply, for example, while immune cells increased substantially. These changes were synchronized across organs, with similar cell states appearing and disappearing in different organs almost simultaneously, suggesting shared signals circulating in the blood may help coordinate aging throughout the body. Roughly 40% of aging-related changes differed by sex; aging females, for instance, showed broader immune activation.
- ChatGPT uninstalls surge 295% after the Pentagon deal
According to Sensor Tower, users reacted to OpenAI's deal with the Pentagon: on February 28, US uninstalls of OpenAI's ChatGPT app surged 295% from the previous day, against an average daily uninstall rate of 9% over the prior 30 days. Meanwhile, downloads of Claude, the app of OpenAI rival Anthropic, which refused the Pentagon's demands, grew 37% and 51% on February 27 and 28 respectively. ChatGPT downloads were also affected: on February 27, before the Pentagon partnership was announced, downloads rose 14% day over day, but after the announcement they fell 13% on the 28th and another 5% on March 1. Claude has also topped the US free app chart for three straight days, and the surge has caused several brief Claude outages.
- Arm's Cortex X925 catches up with AMD and Intel in desktop performance
Chips designed by UK company Arm have long been optimized for low power and small die area, but Arm has also kept shipping cores aimed at high-performance applications. When Arm released its 64-bit Cortex A57 core in 2012, matching AMD's and Intel's latest processors was a distant dream. Its high-performance Cortex X925 core, introduced in 2024, has made that dream a reality. The Arm cores in Nvidia's GB10 Superchip are based on the Cortex X925, and they match AMD's Zen 5 and Intel's Lion Cove in desktop performance. The GB10 uses ten X925 cores split into two clusters, one clocked at up to 4 GHz and the other at 3.9 GHz. Tests show its instruction-reordering performance beats AMD's Zen 5, and its L2 cache capacity matches the P-cores (performance cores) of Intel's processors.
- Antarctica lost 12,000 square kilometers of grounded ice over the past three decades
Glaciologists at UC Irvine mapped the migration of Antarctica's circumpolar grounding line over the past 30 years, showing a loss of more than 12,000 square kilometers of grounded ice. Grounded ice rests directly on the seabed or bedrock, unlike floating ice shelves or icebergs, and is generally more stable. Combining data from multiple satellites, the researchers found that 77% of the coastline showed no grounding-line migration, but West Antarctica, the Antarctic Peninsula, and parts of East Antarctica lost 12,820 square kilometers of grounded ice. The ice sheet is retreating from the grounding line at an average rate of 442 square kilometers per year. The most dramatic changes occurred in West Antarctica's Amundsen Sea and Getz regions, where glaciers retreated roughly 10-40 kilometers: Pine Island Glacier retreated 33 km, Thwaites Glacier 26 km, and Smith Glacier as much as 42 km.
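The two headline figures in this item can be cross-checked against each other. A quick arithmetic sketch (my own check, not from the article) confirms the reported average retreat rate is consistent with the total loss over the study window:

```python
# Consistency check: does an average loss of 442 km^2/year,
# sustained over the ~30-year study period, match the reported
# 12,820 km^2 of lost grounded ice?
lost_km2 = 12_820
rate_km2_per_year = 442

years = lost_km2 / rate_km2_per_year
print(f"{years:.1f} years")  # prints 29.0 years
```

The implied span of about 29 years lines up with the article's "past 30 years" of satellite records, so the two numbers are mutually consistent.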
- Xiaomi's Leica phone starts at 16,000 yuan
Xiaomi and Leica announced the Leitzphone, a Leica smartphone aimed at the premium market and named after Leica founder Ernst Leitz. The Leitzphone features two 50-megapixel lenses and one 200-megapixel lens, with camera-style dials for adjusting focus, shutter speed, and exposure. Its hardware is essentially the same as the Xiaomi 17 Ultra, with a starting price of 1,999 euros, about 16,000 yuan.
- Highguard will shut down permanently on March 12
Wildlight Entertainment, developer of Highguard, announced the game will shut down permanently on March 12. Highguard, a raid-themed hero shooter, launched on January 26 and at one point drew 97,000 concurrent players, but the momentum did not last: according to SteamDB, its peak concurrent player count over the past 24 hours was just 460. For a free-to-play PvP game that depends on long-term live operation, the outcome was inevitable. Highguard will have run for 45 days before shutdown, 3.75 times as long as Sony's Concord, which shut down permanently after 12 days. Wildlight's main investor, Tencent, withdrew its investment two weeks ago.
- Large-scale de-anonymization with large language models
Large language models, trained on massive data and able to quickly retrieve relevant information, have dramatically lowered the cost of doxxing (de-anonymization). A person can be uniquely identified from just a few attributes: 87% of the US population can be identified from zip code, birth date, and gender alone. According to a paper posted on the preprint platform arXiv, LLMs can be used for de-anonymization at scale, identifying anonymous users online with high precision. The researchers designed an attack pipeline: extract identity features, search for candidate matches, then verify matches through reasoning to reduce false positives. Traditional de-anonymization takes professional investigators hours or longer; LLMs not only take less time but scale far further. Using LLMs to link anonymous Hacker News accounts to real-name LinkedIn accounts, the system raised recall from 0.1% to 45.1% while maintaining 99% precision. Recall measures a model's ability to find all relevant matches. The researchers conclude that old methods of protecting online anonymity are no longer effective.
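To make the precision/recall claim above concrete, here is a minimal sketch of the arithmetic. The 0.1%, 45.1%, and 99% figures come from the digest's summary of the paper; the pool of 1,000 linkable account pairs is a hypothetical number chosen purely for illustration:

```python
# Precision = fraction of proposed matches that are correct.
# Recall    = fraction of all true identities the system finds.

def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)

# Hypothetical pool: 1,000 truly linkable HN/LinkedIn account pairs.
total_true_pairs = 1000

# Baseline per the summary: ~0.1% recall (finds 1 of 1,000).
assert recall(1, total_true_pairs - 1) == 0.001

# LLM pipeline per the summary: 45.1% recall at ~99% precision.
llm_found = 451
r = recall(llm_found, total_true_pairs - llm_found)
fp = round(llm_found * (1 - 0.99) / 0.99)   # ~1 false match per 99 true
p = precision(llm_found, fp)

print(f"recall: {r:.3f}, precision: {p:.3f}")
# prints: recall: 0.451, precision: 0.989
```

The point of the comparison: at fixed precision, a 451× jump in recall means the attack finds hundreds of identities where a manual investigation would find one, which is why the researchers argue the economics of anonymity have changed.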
- Google Chrome will ship a new version every two weeks
Starting with v153, due September 8, Google Chrome's release cycle will shorten from four weeks to two. Google says the move ensures developers and users get the latest performance improvements, fixes, and features immediately. Releases on the two-week cycle are smaller, minimizing disruption and simplifying post-release debugging. Google says little else changes: weekly security updates and the Dev and Canary channels stay the same, as does the eight-week Extended Stable cycle for enterprise customers and Chromium embedders.
- OpenAI is building a GitHub alternative
OpenAI is developing a code-hosting platform to compete with Microsoft's GitHub. The reason, reportedly, is that frequent GitHub outages over the past few months left OpenAI engineers unable to work, so they decided to build a code-hosting platform under their own control. The project is still at an early stage and may take several more months to complete. If OpenAI ships the product, it will be competing directly with its major shareholder Microsoft. OpenAI was valued at $840 billion in its latest funding round.
- Super-Jupiters challenge their formation theory
In the Solar System, Jupiter is the undisputed king of planets, but elsewhere in the galaxy there are super-Jupiters even larger. A study published in Nature Astronomy used the James Webb Space Telescope to observe the HR 8799 system, about 130 light-years from Earth. The system hosts four giant gas planets of 5-10 Jupiter masses orbiting 15-70 astronomical units from their host star, a zone that traditional planet-formation theory can barely explain. Astronomers have two standard scripts for the birth of massive bodies: a "bottom-up" model like Jupiter's, in which a rocky core slowly accretes dust and gas, and a "top-down" model like a star's, in which a gas cloud collapses directly under gravity. Because the HR 8799 planets sit at the sparse outer edge of the disk, many experts assumed these distant giants formed by direct gravitational collapse: at that distance, core accretion would be too slow to assemble such massive planets before the gas disk dissipated. The team used Webb's near-infrared spectrograph to look for sulfur in the planets' atmospheres. Early in planet formation, sulfur is usually locked in solid rock or ice grains, so abundant sulfur in a planet's atmosphere indicates it swallowed large amounts of solid material as it grew, strongly hinting at the core-accretion route. Surprisingly, the team found hydrogen sulfide in the inner three planets, confirming that these giants of up to 10 Jupiter masses formed much as Jupiter did, bottom-up by core accretion. The finding challenges existing models of planetary evolution.
- Could the third interstellar visitor collide with a Solar System object?
Astronomers last year reported the third known interstellar object; the first two were 'Oumuamua and comet 2I/Borisov, and the third, 3I/ATLAS, is also an interstellar comet. 3I/ATLAS is currently passing through the Solar System. According to a study published in The Astronomical Journal, a team including the Shanghai Astronomical Observatory of the Chinese Academy of Sciences simulated the probability of comet 3I/ATLAS colliding with Solar System objects. 3I/ATLAS has an orbital inclination of about 175°, meaning it travels nearly opposite to most Solar System bodies, and its perihelion distance is only about 1.36 astronomical units, so it is effectively swimming upstream through the densely populated inner Solar System. That unusual orbit raised the question of how likely a collision is as it encounters tens of thousands of asteroids. The team's conclusion: during its retrograde passage through the inner Solar System, 31 near-Earth asteroids and 736 main-belt asteroids will come within 0.03 astronomical units (about 4.5 million km) of it. The probability of the nucleus of 3I/ATLAS striking asteroid 2020 BG107 is about 0.025%, while the probability of an asteroid entering its coma is as high as 2.7%.
- Cisco warns two Catalyst SD-WAN Manager flaws are under active exploitation
Cisco warned that two Catalyst SD-WAN Manager vulnerabilities are being actively exploited and urged administrators to patch as soon as possible. Catalyst SD-WAN Manager, formerly vManage, lets administrators centrally monitor and manage up to 6,000 Catalyst SD-WAN devices. Cisco said its security response team discovered active exploitation of CVE-2026-20128 and CVE-2026-20122. CVE-2026-20122 is an arbitrary file overwrite flaw exploitable by a remote attacker with valid read-only credentials and API access, rated high severity; CVE-2026-20128 can only be exploited by a local attacker and is rated medium severity.
- US approves its first commercial nuclear reactor construction in nearly a decade
The US Nuclear Regulatory Commission voted unanimously to approve a construction permit for TerraPower's commercial nuclear reactor, the first such approval in the US in nearly a decade. TerraPower, backed by Bill Gates, designs reactors cooled by liquid sodium rather than water, which produce less nuclear waste. TerraPower plans to build Kemmerer Unit 1, whose non-nuclear facilities have been under construction since June 2024. The reactor is slated to begin operation in 2031, but it must still obtain an operating license first.