OrangeBot.AI Digest — 2026-02-09

56 headlines across 4 sources, aggregated for the day.

Hacker News (15)

  1. Another GitHub outage in the same day (www.githubstatus.com)
  2. Testing Ads in ChatGPT (openai.com)
  3. Irish man with valid US work permit held in ICE detention for five months (www.theguardian.com)
  4. Converting a $3.88 analog clock from Walmart into an ESP8266-based Wi-Fi clock (github.com)
  5. Why is the sky blue? (explainers.blog)
  6. GitHub Is Down (github.com)
  7. GitHub is down again (www.githubstatus.com)
  8. Hong Kong pro-democracy tycoon Jimmy Lai gets 20 years' jail (www.bbc.com)
  9. AI Doesn't Reduce Work–It Intensifies It (hbr.org)
  10. Discord will require a face scan or ID for full access next month (www.theverge.com)
  11. AT&T, Verizon blocking release of Salt Typhoon security assessment reports (www.reuters.com)
  12. UEFI Bindings for JavaScript (codeberg.org)
  13. Thoughts on Generating C (wingolog.org)
  14. Matrix messaging gaining ground in government IT (www.theregister.com)
  15. Show HN: Algorithmically finding the longest line of sight on Earth (alltheviews.world)

GitHub Trending (13)

  1. KeygraphHQ / shannon

    Fully autonomous AI hacker to find actual exploits in your web apps. Shannon has achieved a 96.15% success rate on the hint-free, source-aware XBOW Benchmark.

  2. virattt / dexter

    An autonomous agent for deep financial research

  3. pydantic / monty

    A minimal, secure Python interpreter written in Rust for use by AI

  4. hsliuping / TradingAgents-CN

    A Chinese-language financial trading framework based on multi-agent LLMs - a Chinese-enhanced edition of TradingAgents

  5. iOfficeAI / AionUi

    Free, local, open-source 24/7 Cowork and OpenClaw for Gemini CLI, Claude Code, Codex, OpenCode, Qwen Code, Goose CLI, Auggie, and more | 🌟 Star if you like it!

  6. public-apis / public-apis

    A collective list of free APIs

  7. github / gh-aw

    GitHub Agentic Workflows

  8. Shubhamsaboo / awesome-llm-apps

    Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.

  9. gitbutlerapp / gitbutler

    The GitButler version control client, backed by Git, powered by Tauri/Rust/Svelte

  10. microsoft / litebox

    A security-focused library OS supporting kernel- and user-mode execution

  11. openai / skills

    Skills Catalog for Codex

  12. EveryInc / compound-engineering-plugin

    Official Claude Code compound engineering plugin

  13. DrewThomasson / ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1158+ languages!

Hugging Face (15)

  1. F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

    Reinforcement Learning with Verifiable Rewards (RLVR) is commonly based on group sampling to estimate advantages and stabilize policy updates. In practice, large group sizes are not feasible due to computational limits, which biases learning toward trajectories that are already likely. Smaller groups often miss rare-correct trajectories while still containing mixed rewards, concentrating probability on common solutions. We derive the probability that updates miss rare-correct modes as a function of group size, showing non-monotonic behavior, and characterize how updates redistribute mass within the correct set, revealing that unsampled-correct mass can shrink even as total correct mass grows. Motivated by this analysis, we propose a difficulty-aware advantage scaling coefficient, inspired by Focal loss, that down-weights updates on high-success prompts. The lightweight modification can be directly integrated into any group-relative RLVR algorithm such as GRPO, DAPO, and CISPO. On Qwen2.5-7B across in-domain and out-of-domain benchmarks, our method improves pass@256 from 64.1 → 70.3 (GRPO), 69.3 → 72.5 (DAPO), and 73.2 → 76.8 (CISPO), while preserving or improving pass@1, without increasing group size or computational cost.
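
    A focal-style down-weighting of group-relative advantages can be sketched in a few lines. The exponent `gamma`, the binary rewards, and the exact placement of the weight are illustrative assumptions here, not the paper's published formula:

```python
import numpy as np

def focal_scaled_advantages(rewards, gamma=2.0):
    """Group-relative advantages with a focal-style difficulty weight.

    Hypothetical sketch: `rewards` are binary correctness rewards for one
    prompt's sampled group; p_hat is the empirical success rate. Easy
    prompts (high p_hat) are attenuated by (1 - p_hat)**gamma, as in
    Focal loss.
    """
    r = np.asarray(rewards, dtype=float)
    p_hat = r.mean()                          # empirical success rate of the group
    adv = (r - r.mean()) / (r.std() + 1e-8)   # standard group-relative advantage
    return (1.0 - p_hat) ** gamma * adv       # down-weight high-success prompts

# A hard prompt (1/8 correct) keeps most of its advantage scale,
# while an easy prompt (7/8 correct) is strongly attenuated.
hard = focal_scaled_advantages([1, 0, 0, 0, 0, 0, 0, 0])
easy = focal_scaled_advantages([1, 1, 1, 1, 1, 1, 1, 0])
```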

  2. Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making

    We introduce Baichuan-M3, a medical-enhanced large language model engineered to shift the paradigm from passive question-answering to active, clinical-grade decision support. Addressing the limitations of existing systems in open-ended consultations, Baichuan-M3 utilizes a specialized training pipeline to model the systematic workflow of a physician. Key capabilities include: (i) proactive information acquisition to resolve ambiguity; (ii) long-horizon reasoning that unifies scattered evidence into coherent diagnoses; and (iii) adaptive hallucination suppression to ensure factual reliability. Empirical evaluations demonstrate that Baichuan-M3 achieves state-of-the-art results on HealthBench, the newly introduced HealthBench-Hallu and ScanBench, significantly outperforming GPT-5.2 in clinical inquiry, advisory and safety. The models are publicly available at https://huggingface.co/collections/baichuan-inc/baichuan-m3.

  3. OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

    The rapid advancement of Large Language Models (LLMs) has catalyzed the development of autonomous agents capable of navigating complex environments. However, existing evaluations primarily adopt a deductive paradigm, where agents execute tasks based on explicitly provided rules and static goals, often within limited planning horizons. Crucially, this neglects the inductive necessity for agents to discover latent transition laws from experience autonomously, which is the cornerstone for enabling agentic foresight and sustaining strategic coherence. To bridge this gap, we introduce OdysseyArena, which re-centers agent evaluation on long-horizon, active, and inductive interactions. We formalize and instantiate four primitives, translating abstract transition dynamics into concrete interactive environments. Building upon this, we establish OdysseyArena-Lite for standardized benchmarking, providing a set of 120 tasks to measure an agent's inductive efficiency and long-horizon discovery. Pushing further, we introduce OdysseyArena-Challenge to stress-test agent stability across extreme interaction horizons (e.g., > 200 steps). Extensive experiments on 15+ leading LLMs reveal that even frontier models exhibit a deficiency in inductive scenarios, identifying a critical bottleneck in the pursuit of autonomous discovery in complex environments. Our code and data are available at https://github.com/xufangzhi/Odyssey-Arena

  4. AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders

    Sparse Autoencoders (SAEs) are powerful tools for interpreting neural representations, yet their use in audio remains underexplored. We train SAEs across all encoder layers of Whisper and HuBERT, provide an extensive evaluation of their stability, interpretability, and show their practical utility. Over 50% of the features remain consistent across random seeds, and reconstruction quality is preserved. SAE features capture general acoustic and semantic information as well as specific events, including environmental noises and paralinguistic sounds (e.g. laughter, whispering) and disentangle them effectively, requiring removal of only 19-27% of features to erase a concept. Feature steering reduces Whisper's false speech detections by 70% with negligible WER increase, demonstrating real-world applicability. Finally, we find SAE features correlated with human EEG activity during speech perception, indicating alignment with human neural processing. The code and checkpoints are available at https://github.com/audiosae/audiosae_demo.
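
    The basic SAE mechanism the paper builds on can be sketched as follows; the dimensions, initialization, and L1 coefficient are arbitrary choices for illustration, not the authors' configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

class TinySAE:
    """Minimal sparse autoencoder sketch (hypothetical, not the paper's code).

    Encodes d-dim activations into an overcomplete k-dim feature space with
    a ReLU, then reconstructs; training would minimize reconstruction error
    plus an L1 penalty that keeps feature activations sparse.
    """
    def __init__(self, d, k):
        self.W_enc = rng.normal(0, 0.1, (d, k))
        self.b_enc = np.zeros(k)
        self.W_dec = rng.normal(0, 0.1, (k, d))
        self.b_dec = np.zeros(d)

    def encode(self, x):
        return np.maximum(0.0, x @ self.W_enc + self.b_enc)  # sparse features

    def decode(self, f):
        return f @ self.W_dec + self.b_dec

    def loss(self, x, l1=1e-3):
        f = self.encode(x)
        recon = ((x - self.decode(f)) ** 2).mean()  # reconstruction error
        return recon + l1 * np.abs(f).mean()        # plus sparsity penalty

sae = TinySAE(d=16, k=64)
x = rng.normal(size=(8, 16))   # stand-in for a batch of encoder activations
features = sae.encode(x)
```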

  5. On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models

    Entropy serves as a critical metric for measuring the diversity of outputs generated by large language models (LLMs), providing valuable insights into their exploration capabilities. While recent studies increasingly focus on monitoring and adjusting entropy to better balance exploration and exploitation in reinforcement fine-tuning (RFT), a principled understanding of entropy dynamics during this process is yet to be thoroughly investigated. In this paper, we establish a theoretical framework for analyzing the entropy dynamics during the RFT process, which begins with a discriminant expression that quantifies entropy change under a single logit update. This foundation enables the derivation of a first-order expression for entropy change, which can be further extended to the update formula of Group Relative Policy Optimization (GRPO). The corollaries and insights drawn from the theoretical analysis inspire the design of entropy control methods, and also offer a unified lens for interpreting various entropy-based methods in existing studies. We provide empirical evidence to support the main conclusions of our analysis and demonstrate the effectiveness of the derived entropy-discriminator clipping methods. This study yields novel insights into RFT training dynamics, providing theoretical support and practical strategies for optimizing the exploration-exploitation balance during LLM fine-tuning.
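
    The basic quantity under study, the entropy of the softmax policy and how it moves under a single logit update, can be illustrated numerically; the logit values below are arbitrary, and the paper's actual contribution is a closed-form first-order discriminant for this change:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy(z):
    """Shannon entropy of the softmax distribution over logits z."""
    p = softmax(z)
    return -(p * np.log(p)).sum()

# Effect of a single logit update: reinforcing an already-dominant token
# lowers entropy (exploitation), while reinforcing a low-probability
# token raises it (exploration).
z = np.array([2.0, 0.0, 0.0, 0.0])
h0 = entropy(z)
h_reinforce_top = entropy(z + np.array([0.5, 0.0, 0.0, 0.0]))
h_reinforce_rare = entropy(z + np.array([0.0, 0.5, 0.0, 0.0]))
```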

  6. Pisets: A Robust Speech Recognition System for Lectures and Interviews

    This work presents a speech-to-text system "Pisets" for scientists and journalists which is based on a three-component architecture aimed at improving speech recognition accuracy while minimizing errors and hallucinations associated with the Whisper model. The architecture comprises primary recognition using Wav2Vec2, false positive filtering via the Audio Spectrogram Transformer (AST), and final speech recognition through Whisper. The implementation of curriculum learning methods and the utilization of diverse Russian-language speech corpora significantly enhanced the system's effectiveness. Additionally, advanced uncertainty modeling techniques were introduced, contributing to further improvements in transcription quality. The proposed approaches ensure robust transcribing of long audio data across various acoustic conditions compared to WhisperX and the usual Whisper model. The source code of "Pisets" system is publicly available at GitHub: https://github.com/bond005/pisets.

  7. MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

    Training instability remains a critical challenge in large language model (LLM) pretraining, often manifesting as sudden gradient explosions that waste significant computational resources. We study training failures in a 5M-parameter NanoGPT model scaled via μP, identifying two key phenomena preceding collapse: (1) rapid decline in weight matrix stable rank (ratio of squared Frobenius norm to squared spectral norm), and (2) increasing alignment between adjacent layer Jacobians. We prove theoretically that these two conditions jointly cause exponential gradient norm growth with network depth. To break this instability mechanism, we propose MSign, a new optimizer that periodically applies matrix sign operations to restore stable rank. Experiments on models from 5M to 3B parameters demonstrate that MSign effectively prevents training failures with a computational overhead of less than 7.0%.
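
    The stable-rank diagnostic is easy to compute directly from the definition given in the abstract. The `matrix_sign` below uses the orthogonal polar factor U Vᵀ, a common choice for rectangular weight matrices; whether MSign uses exactly this operation is an assumption here:

```python
import numpy as np

def stable_rank(W):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    s = np.linalg.svd(W, compute_uv=False)
    return (s ** 2).sum() / s[0] ** 2

def matrix_sign(W):
    """Orthogonal (polar) factor U @ V^T of W; a hypothetical stand-in
    for the matrix sign operation MSign applies periodically."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(0)
# A nearly rank-1 weight matrix has stable rank close to 1 ...
W = np.outer(rng.normal(size=32), rng.normal(size=32)) + 0.01 * rng.normal(size=(32, 32))
# ... and the sign operation restores it to full stable rank, since
# every singular value of an orthogonal matrix equals 1.
W_restored = matrix_sign(W)
```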

  8. DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos

    Being able to simulate the outcomes of actions in varied environments will revolutionize the development of generalist agents at scale. However, modeling these world dynamics, especially for dexterous robotics tasks, poses significant challenges due to limited data coverage and scarce action labels. As an endeavor towards this end, we introduce DreamDojo, a foundation world model that learns diverse interactions and dexterous controls from 44k hours of egocentric human videos. Our data mixture represents the largest video dataset to date for world model pretraining, spanning a wide range of daily scenarios with diverse objects and skills. To address the scarcity of action labels, we introduce continuous latent actions as unified proxy actions, enhancing interaction knowledge transfer from unlabeled videos. After post-training on small-scale target robot data, DreamDojo demonstrates a strong understanding of physics and precise action controllability. We also devise a distillation pipeline that accelerates DreamDojo to a real-time speed of 10.81 FPS and further improves context consistency. Our work enables several important applications based on generative world models, including live teleoperation, policy evaluation, and model-based planning. Systematic evaluation on multiple challenging out-of-distribution (OOD) benchmarks verifies the significance of our method for simulating open-world, contact-rich tasks, paving the way for general-purpose robot world models.

  9. Self-Improving World Modelling with Latent Actions

    Internal modelling of the world -- predicting transitions between previous states X and next states Y under actions Z -- is essential to reasoning and planning for LLMs and VLMs. Learning such models typically requires costly action-labelled trajectories. We propose SWIRL, a self-improvement framework that learns from state-only sequences by treating actions as a latent variable and alternating between Forward World Modelling (FWM) P_θ(Y|X,Z) and an Inverse Dynamics Modelling (IDM) Q_φ(Z|X,Y). SWIRL iterates two phases: (1) Variational Information Maximisation, which updates the FWM to generate next states that maximise conditional mutual information with latent actions given prior states, encouraging identifiable consistency; and (2) ELBO Maximisation, which updates the IDM to explain observed transitions, effectively performing coordinate ascent. Both models are trained with reinforcement learning (specifically, GRPO) with the opposite frozen model's log-probability as a reward signal. We provide theoretical learnability guarantees for both updates, and evaluate SWIRL on LLMs and VLMs across multiple environments: single-turn and multi-turn open-world visual dynamics and synthetic textual environments for physics, web, and tool calling. SWIRL achieves gains of 16% on AURORABench, 28% on ByteMorph, 16% on WorldPredictionBench, and 14% on StableToolBench.

  10. Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

    Recent progress in reasoning models suggests that generating plausible attempts for research-level mathematics may be within reach, but verification remains a bottleneck, consuming scarce expert time. We hypothesize that a meaningful solution should contain enough method-level information that, when applied to a neighborhood of related questions, it should yield better downstream performance than incorrect solutions. Building on this idea, we propose Consequence-Based Utility, an oracle-free evaluator that scores each candidate by testing its value as an in-context exemplar in solving related yet verifiable questions. Our approach is evaluated on an original set of research-level math problems, each paired with one expert-written solution and nine LLM-generated solutions. Notably, Consequence-Based Utility consistently outperforms reward models, generative reward models, and LLM judges on ranking quality. Specifically, for GPT-OSS-120B, it improves Acc@1 from 67.2 to 76.3 and AUC from 71.4 to 79.6, with similarly large AUC gains on GPT-OSS-20B (69.0 to 79.2). Furthermore, compared to LLM-Judges, it also exhibits a larger solver-evaluator gap, maintaining a stronger correct-wrong separation even on instances where the underlying solver often fails to solve.
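
    The scoring loop described above reduces to a simple shape. Everything below is a toy stand-in: `solver` would be an LLM call with the candidate as an in-context exemplar, and `check` a verifier for the related questions; neither is specified in the abstract:

```python
def consequence_utility(candidate, neighbor_problems, solver, check):
    """Score a candidate solution by its downstream value as an
    in-context exemplar on related, verifiable questions."""
    correct = 0
    for question, expected in neighbor_problems:
        answer = solver(exemplar=candidate, question=question)
        correct += int(check(answer, expected))
    return correct / len(neighbor_problems)

# Toy illustration: an exemplar that conveys the right method ("double it")
# transfers to the neighborhood; an uninformative one does not.
def toy_solver(exemplar, question):
    return question * 2 if exemplar == "double it" else question

neighbors = [(3, 6), (5, 10), (7, 14)]
good = consequence_utility("double it", neighbors, toy_solver, lambda a, e: a == e)
bad = consequence_utility("do nothing", neighbors, toy_solver, lambda a, e: a == e)
```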

  11. Self-Improving Multilingual Long Reasoning via Translation-Reasoning Integrated Training

    Long reasoning models often struggle in multilingual settings: they tend to reason in English for non-English questions; when constrained to reasoning in the question language, accuracies drop substantially. The struggle is caused by the limited abilities for both multilingual question understanding and multilingual reasoning. To address both problems, we propose TRIT (Translation-Reasoning Integrated Training), a self-improving framework that integrates the training of translation into multilingual reasoning. Without external feedback or additional multilingual data, our method jointly enhances multilingual question understanding and response generation. On MMATH, our method outperforms multiple baselines by an average of 7 percentage points, improving both answer correctness and language consistency. Further analysis reveals that integrating translation training improves cross-lingual question alignment by over 10 percentage points and enhances translation quality for both mathematical questions and general-domain text, with gains up to 8.4 COMET points on FLORES-200.

  12. POINTS-GUI-G: GUI-Grounding Journey

    The rapid advancement of vision-language models has catalyzed the emergence of GUI agents, which hold immense potential for automating complex tasks, from online shopping to flight booking, thereby alleviating the burden of repetitive digital workflows. As a foundational capability, GUI grounding is typically established as a prerequisite for end-to-end task execution. It enables models to precisely locate interface elements, such as text and icons, to perform accurate operations like clicking and typing. Unlike prior works that fine-tune models already possessing strong spatial awareness (e.g., Qwen3-VL), we aim to master the full technical pipeline by starting from a base model with minimal grounding ability, such as POINTS-1.5. We introduce POINTS-GUI-G-8B, which achieves state-of-the-art performance with scores of 59.9 on ScreenSpot-Pro, 66.0 on OSWorld-G, 95.7 on ScreenSpot-v2, and 49.9 on UI-Vision. Our model's success is driven by three key factors: (1) Refined Data Engineering, involving the unification of diverse open-source datasets format alongside sophisticated strategies for augmentation, filtering, and difficulty grading; (2) Improved Training Strategies, including continuous fine-tuning of the vision encoder to enhance perceptual accuracy and maintaining resolution consistency between training and inference; and (3) Reinforcement Learning (RL) with Verifiable Rewards. While RL is traditionally used to bolster reasoning, we demonstrate that it significantly improves precision in the perception-intensive GUI grounding task. Furthermore, GUI grounding provides a natural advantage for RL, as rewards are easily verifiable and highly accurate.

  13. Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities

    Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as an indispensable paradigm for enhancing reasoning in Large Language Models (LLMs). However, standard policy optimization methods, such as Group Relative Policy Optimization (GRPO), often converge to low-entropy policies, leading to severe mode collapse and limited output diversity. We analyze this issue from the perspective of sampling probability dynamics, identifying that the standard objective disproportionately reinforces the highest-likelihood paths, thereby suppressing valid alternative reasoning chains. To address this, we propose a novel Advantage Re-weighting Mechanism (ARM) designed to equilibrate the confidence levels across all correct responses. By incorporating Prompt Perplexity and Answer Confidence into the advantage estimation, our method dynamically reshapes the reward signal to attenuate the gradient updates of over-confident reasoning paths, while redistributing probability mass toward under-explored correct solutions. Our approach significantly enhances generative diversity and response entropy while maintaining competitive accuracy, achieving a superior trade-off between exploration and exploitation in reasoning tasks. Empirical results on Qwen2.5 and DeepSeek models across mathematical and coding benchmarks show that our method significantly mitigates entropy collapse. Specifically, on Qwen2.5-7B, it outperforms GRPO by 5.7% in Pass@1 and, notably, by 13.9% in Pass@32, highlighting its superior capability in generating diverse correct reasoning paths.

  14. MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments

    Current mobile GUI agent benchmarks systematically fail to assess memory capabilities, with only 5.2-11.8% memory-related tasks and no cross-session learning evaluation. We introduce MemGUI-Bench, a comprehensive memory-centric benchmark with pass@k and staged LLM-as-judge evaluation. Our contributions include: (1) a systematic memory taxonomy analyzing 11 agents across 5 architectures; (2) 128 tasks across 26 applications where 89.8% challenge memory through cross-temporal and cross-spatial retention; (3) MemGUI-Eval, an automated pipeline with Progressive Scrutiny and 7 hierarchical metrics; and (4) RQ-driven assessment of 11 state-of-the-art agents. Our experiments reveal significant memory deficits across all evaluated systems, identify 5 distinct failure modes, and synthesize 5 actionable design implications. All resources including code, benchmark, and evaluation results will be fully open-sourced and continuously maintained at https://lgy0404.github.io/MemGUI-Bench/.

  15. Canzona: A Unified, Asynchronous, and Load-Balanced Framework for Distributed Matrix-based Optimizers

    The scaling of Large Language Models (LLMs) drives interest in matrix-based optimizers (e.g., Shampoo, Muon, SOAP) for their convergence efficiency; yet their requirement for holistic updates conflicts with the tensor fragmentation in distributed frameworks like Megatron. Existing solutions are suboptimal: synchronous approaches suffer from computational redundancy, while layer-wise partitioning fails to reconcile this conflict without violating the geometric constraints of efficient communication primitives. To bridge this gap, we propose Canzona, a Unified, Asynchronous, and Load-Balanced framework that decouples logical optimizer assignment from physical parameter distribution. For Data Parallelism, we introduce an α-Balanced Static Partitioning strategy that respects atomicity while neutralizing the load imbalance. For Tensor Parallelism, we design an Asynchronous Compute pipeline utilizing Micro-Group Scheduling to batch fragmented updates and hide reconstruction overhead. Extensive evaluations on the Qwen3 model family (up to 32B parameters) on 256 GPUs demonstrate that our approach preserves the efficiency of established parallel architectures, achieving a 1.57x speedup in end-to-end iteration time and reducing optimizer step latency by 5.8x compared to the baseline.
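
    The load-balancing idea, assigning each parameter tensor atomically to the currently lightest rank, can be sketched with a classic greedy longest-processing-time heuristic. This is a hypothetical stand-in in the spirit of the balanced static partitioning described above; Canzona's actual algorithm and its α weighting are not detailed in the abstract:

```python
import heapq

def balanced_partition(param_sizes, n_ranks):
    """Greedily assign tensors (largest first, never split) to the
    least-loaded rank. Returns (load, rank, assigned_indices) per rank."""
    heap = [(0, r, []) for r in range(n_ranks)]   # (load, rank, assigned)
    heapq.heapify(heap)
    for idx, size in sorted(enumerate(param_sizes), key=lambda t: -t[1]):
        load, r, assigned = heapq.heappop(heap)   # lightest rank so far
        assigned.append(idx)
        heapq.heappush(heap, (load + size, r, assigned))
    return sorted(heap, key=lambda t: t[1])       # order by rank id

# Element counts for a few transformer-like weight shapes (illustrative).
sizes = [4096 * 4096, 4096 * 11008, 4096, 11008 * 4096, 4096 * 4096]
parts = balanced_partition(sizes, n_ranks=2)
```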

Solidot (13)

  1. Automakers Rush to Remove Chinese Software Code to Comply with New US Rules

    A new US rule requires automakers to certify to the US government that, as of March 17, the core components of their products contain no code written in China or by Chinese companies. The rule also covers advanced autonomous-driving software and will extend to hardware starting in 2029. The measure aims to prevent in-car cameras, microphones, and GPS tracking systems from being exploited by foreign adversaries, and is a test case for US efforts to decouple from Chinese supply chains. Hilary Cain, policy chief at the Alliance for Automotive Innovation, called it one of the most consequential and complex pieces of automotive regulation in decades, requiring deep supply-chain audits and strict adherence to the compliance timeline.

  2. Skies Were Cleaner During the COVID-19 Pandemic, but Methane Emissions Surged

    In spring 2020, as the COVID-19 pandemic halted industry and tourism worldwide, satellites recorded a sharp drop in nitrogen dioxide (a byproduct of internal-combustion engines and heavy industry), and global air quality was the best in decades. Yet at the same time, concentrations of the greenhouse gas methane spiked: the growth rate that year reached 16.2 ppb, the highest since records began in the 1980s. According to a study published in Science, Peking University researchers attribute part of the spike to the drop in atmospheric nitrogen oxides. Atmospheric methane is broken down by hydroxyl radicals into water vapor and carbon dioxide. Hydroxyl radicals, the atmosphere's methane scrubber, must be continuously replenished through a chain of sunlight-driven chemical reactions whose key ingredient is nitrogen oxides. Pandemic lockdowns cut global nitrogen oxide concentrations by roughly 15%-20%, sharply slowing hydroxyl radical production, so methane lingered longer in the atmosphere and amplified global warming. The additional methane came mainly from microbes: the pandemic coincided with a La Niña event, which typically increases rainfall in the tropics, and in waterlogged, oxygen-poor conditions methanogenic microbes multiply and accelerate methane production. Using satellite data, the researchers traced the new methane chiefly to vast wetlands in tropical Africa and Southeast Asia, which drove an increase of about 30% in global methane emissions during 2020-2022.

  3. Linux From Scratch Drops Its System V Edition

    The Linux From Scratch (LFS) project provides step-by-step instructions for building a customized Linux system from source. It has long offered both System V and systemd editions, letting users choose their init system. The project has now announced it will stop providing the System V edition. The first reason is workload: the volunteers are overstretched, since LFS contains 88 packages and Beyond Linux From Scratch (BLFS) over 1,000, and every package update must be checked for compatibility with both System V and systemd. The second is that the GNOME and KDE Plasma desktop environments will support only systemd going forward. LFS 13.0, expected in March, will ship as a systemd-only edition.

  4. LineageOS 23.2 Released

    LineageOS 23.2 has been released. Highlights include support for the Material 3 Expressive visual style, a fully customizable Quick Settings panel, an expanded dark theme, and more powerful Private Space file-management tools. The developers note that the Android Open Source Project (AOSP) moved to quarterly releases in recent years, and that Google recently announced a shift from quarterly to twice-yearly releases; LineageOS will adopt the same six-month release cadence.

  5. Ardour 9.0 Released

    The open-source digital audio workstation Ardour has released version 9.0, a major update that delivers long-requested features. Highlights include Region FX, clip recording, a touch-friendly GUI, a pianoroll window, and clip editing, along with dozens of bug fixes, new MIDI binding maps, and improved GUI performance for most macOS users. The developers say they look forward to user feedback.

  6. Linux 6.19 Released

    Linus Torvalds announced the release of Linux 6.19 on the kernel mailing list, confirming at the same time that the next version will be Linux 7.0. Major features of Linux 6.19 include preliminary support for Intel's Linear Address Space Separation, support for Arm Memory System Resource Partitioning and Monitoring, the listns() system call, a new restartable-sequences implementation, large-block support in the ext4 filesystem, memory-safety improvements, a live-update orchestrator, and more. See KernelNewbies 6.19 for details.

  7. The Breaking Bad Effect: Cancer Diagnoses Correlate with a Rise in Criminal Behavior

    The US drama Breaking Bad tells the story of Walter White, a high-school chemistry teacher who turns drug kingpin after a cancer diagnosis amid precarious family finances. Does the Breaking Bad effect exist in reality? According to a study published in American Economic Journal: Applied Economics, economists analyzing a Danish dataset found that a cancer diagnosis raises the probability of committing crime by about 14%. The study covered 368,317 patients diagnosed with cancer between 1980 and 2018, combining health records with criminal registries; the researchers tracked the patients' criminal behavior against a control population. In the first year after diagnosis, the probability of crime actually fell; the researchers note this is intuitive, since chemotherapy and radiotherapy are physically grueling. But two years after diagnosis the probability rose significantly, an effect that persisted for over a decade. Cancer was found to push people with no prior record into a first offense. Denmark has free universal health care; in countries without it, the shock of a cancer diagnosis would be even more painful. The study also found the rise in criminal activity was driven mainly by men.

  8. The AI Boom Is Causing Shortages Everywhere

    The five biggest US tech companies, Amazon, Google, Microsoft, Meta, and Oracle, plan to invest roughly $700 billion in AI this year, but for the foreseeable future the returns on AI investment will fall far short of the spending. The massive buildout is already making shortages felt worldwide: skilled electricians are increasingly hard to find, non-data-center construction projects are being put on hold, smartphone prices will keep rising over the next few years, and promising innovations face funding shortfalls. Noted investor Roger McNamee says that US investment in AI since mid-2022 may exceed all prior investment in the entire tech industry combined. Apple notified investors last week that it is having trouble procuring two key chips needed for iPhones and Macs; CEO Tim Cook declined to discuss whether prices will rise. Funding for non-AI startups has fallen to its lowest level in a decade.

  9. Some of Waymo's Remote Operators Are Based in the Philippines

    At a US congressional hearing on autonomous vehicle safety and regulation, Waymo chief safety officer Mauricio Peña disclosed that some of the company's remote operators are located in the Philippines. What happens when a self-driving car encounters a situation it cannot resolve on its own? The car contacts the company and communicates with a remote operator, who does not drive the vehicle remotely but provides guidance; the dynamic driving task remains with the car itself. Dr. Peña said some remote operators are in the US and others in the Philippines. Waymo explained that its remote operations staff are licensed drivers, are screened for driving-related criminal records and other traffic violations, and "undergo random drug testing."

  10. An Overpriced Steam Machine May Not Be Competitive

    Valve's Steam Machine, originally aimed at the entry-level PC gaming market, may end up priced too high to compete, or even to sell at all, because of soaring memory and SSD prices. A 16GB DDR5 module now costs as much as $200. When Valve announced the Steam Machine last year, analysts estimated the 512GB SSD version would sell for $599 to $629 and the 2TB version for $849 to $899. With component prices soaring, analysts now estimate the 512GB version could exceed $1,000 and the 2TB version could land at $1,300 to $1,500, which is unlikely to sell. A major problem for Valve is that, compared with established PC makers, its volumes are lower and its bargaining power weaker, leaving it more exposed to price swings.

  11. Toyota Is Developing Fluorite, an Open-Source Game Engine for Cars

    Developers at Toyota Connected North America announced at FOSDEM 2026 an open-source game engine they are building called Fluorite. Toyota Connected North America, a company founded jointly by Toyota and Microsoft, develops in-vehicle software, AI, and related technologies. Fluorite uses Flutter, the Dart language, and Google's Filament 3D rendering engine. Toyota had been searching for a game engine suited to in-vehicle systems: Unity and Unreal Engine were ruled out over proprietary code, high resource usage, and expensive licensing; the open-source Godot engine suffered from long startup times and high resource usage; other engines were deemed unstable or lacking a stable API. Toyota ultimately settled on Fluorite, about which little public information is available so far.

  12. AI.com Domain Sells for $70 Million

    Kris Marszalek, co-founder and CEO of cryptocurrency exchange Crypto.com, has bought the domain AI.com for $70 million, a record price for a domain name. The sum was paid entirely in cryptocurrency to an undisclosed seller. Marszalek plans to launch the site with a Super Bowl ad this weekend, promoting a personal "AI agent" that can help users send messages, use apps, and trade stocks. According to GoDaddy, the previous record was Carinsurance.com at nearly $50 million.

  13. Memory Prices Double in Q1 over Last Year's Q4

    Memory prices have been soaring since last October; on JD.com, a 32GB DDR4 module has gone from 400-500 yuan at the start of last year to around 2,500 yuan today. According to Counterpoint Research's tracking report, DRAM, NAND, and HBM chip prices in Q1 2026 are up 80%-90% over Q4 2025. The contract price of a 64GB RDIMM has jumped from $450 in Q4 2025 to over $900, and Counterpoint expects it to break $1,000 in Q2. DRAM operating margins reached about 60% in Q4 2025 (the first time conventional DRAM margins exceeded HBM's) and are expected to hit a record high in Q1 2026.