OrangeBot.AI Digest — 2026-01-25

54 headlines across 8 sources, aggregated for the day.

Hacker News (15)

  1. OnePlus phone update introduces hardware anti-rollback (consumerrights.wiki)
  2. Yes, It's Fascism (www.theatlantic.com)
  3. First, make me care (gwern.net)
  4. ICE using Palantir tool that feeds on Medicaid data (www.eff.org)
  5. FAA institutes nationwide drone no-fly zones around ICE operations (www.aerotime.aero)
  6. White House alters arrest photo of ICE protester, says "the memes will continue" (arstechnica.com)
  7. A macOS app that blurs your screen when you slouch (github.com)
  8. Iran Protest Death Toll Could Top 30k, According to Local Health Officials (time.com)
  9. Doom has been ported to an earbud (doombuds.com)
  10. Show HN: Bonsplit – Tabs and splits for native macOS apps (bonsplit.alasdairmonk.com)
  11. Introduction to PostgreSQL Indexes (dlt.github.io)
  12. A flawed paper in management science has been cited more than 6k times (statmodeling.stat.columbia.edu)
  13. Deutsche Telekom is throttling the internet (netzbremse.de)
  14. Google confirms 'high-friction' sideloading flow is coming to Android (www.androidauthority.com)
  15. A Lament for Aperture (ikennd.ac)

GitHub Trending (9)

  1. Blaizzy / mlx-audio

    A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

  2. VectifyAI / PageIndex

    📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

  3. remotion-dev / remotion

    🎥 Make videos programmatically with React

  4. qarmin / czkawka

    Multi-functional app to find duplicates, empty folders, similar images, etc.

  5. OpenBMB / UltraRAG

    UltraRAG v3: A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

  6. microsoft / VibeVoice

    Open-Source Frontier Voice AI

  7. openai / codex

    Lightweight coding agent that runs in your terminal

  8. supermemoryai / supermemory

    Memory engine and app that is extremely fast and scalable. The Memory API for the AI era.

  9. Psiphon-Inc / conduit

    Conduit React Native app

Hugging Face (15)

  1. EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

    The development of native computer-use agents (CUA) represents a significant leap in multimodal AI. However, their potential is currently bottlenecked by the constraints of static data scaling. Existing paradigms relying primarily on passive imitation of static datasets struggle to capture the intricate causal dynamics inherent in long-horizon computer tasks. In this work, we introduce EvoCUA, a native computer use agentic model. Unlike static imitation, EvoCUA integrates data generation and policy optimization into a self-sustaining evolutionary cycle. To mitigate data scarcity, we develop a verifiable synthesis engine that autonomously generates diverse tasks coupled with executable validators. To enable large-scale experience acquisition, we design a scalable infrastructure orchestrating tens of thousands of asynchronous sandbox rollouts. Building on these massive trajectories, we propose an iterative evolving learning strategy to efficiently internalize this experience. This mechanism dynamically regulates policy updates by identifying capability boundaries -- reinforcing successful routines while transforming failure trajectories into rich supervision through error analysis and self-correction. Empirical evaluations on the OSWorld benchmark demonstrate that EvoCUA achieves a success rate of 56.7%, establishing a new open-source state-of-the-art. Notably, EvoCUA significantly outperforms the previous best open-source model, OpenCUA-72B (45.0%), and surpasses leading closed-weights models such as UI-TARS-2 (53.1%). Crucially, our results underscore the generalizability of this approach: the evolving paradigm driven by learning from experience yields consistent performance gains across foundation models of varying scales, establishing a robust and scalable path for advancing native agent capabilities.
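
    To make the evolutionary cycle concrete, here is a minimal, runnable sketch of the loop the abstract describes: synthesize tasks paired with executable validators, roll the policy out against them, and split trajectories into successes to reinforce and failures to correct. Every name below is an illustrative stand-in, not the authors' code, and the toy counting task and policy replace the paper's sandboxed computer-use environment.

      import random
      from dataclasses import dataclass
      from typing import Callable, List, Tuple

      @dataclass
      class Task:
          prompt: int                                  # toy "count up to k" task
          validate: Callable[[str], bool]              # executable validator shipped with the task

      def synthesize_tasks(n: int) -> List[Task]:
          # Stand-in for the verifiable synthesis engine: every generated task
          # is paired with a programmatic checker, so rollouts are self-grading.
          ks = random.sample(range(2, 50), n)
          return [Task(k, (lambda out, k=k: out == " ".join(map(str, range(1, k + 1))))) for k in ks]

      def rollout(policy: Callable[[int], str], task: Task) -> Tuple[str, bool]:
          out = policy(task.prompt)                    # the paper runs these in async sandboxes
          return out, task.validate(out)

      def evolve(policy, steps: int = 3, tasks_per_step: int = 8):
          for step in range(steps):
              wins, losses = [], []
              for task in synthesize_tasks(tasks_per_step):
                  out, ok = rollout(policy, task)
                  (wins if ok else losses).append((task, out))
              # Real system: reinforce wins; turn losses into supervision via
              # error analysis and self-correction. Here we only report the split.
              print(f"step {step}: {len(wins)} reinforced, {len(losses)} to correct")
          return policy

      if __name__ == "__main__":
          toy_policy = lambda k: " ".join(map(str, range(1, k + 1)))  # always-correct toy agent
          evolve(toy_policy)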

  2. HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

    Recent advancements in Multimodal Large Language Models (MLLMs) have demonstrated significant improvement in offline video understanding. However, extending these capabilities to streaming video inputs remains challenging, as existing models struggle to simultaneously maintain stable understanding performance, real-time responses, and low GPU memory overhead. To address this challenge, we propose HERMES, a novel training-free architecture for real-time and accurate understanding of video streams. Based on a mechanistic attention investigation, we conceptualize the KV cache as a hierarchical memory framework that encapsulates video information across multiple granularities. During inference, HERMES reuses a compact KV cache, enabling efficient streaming understanding under resource constraints. Notably, HERMES requires no auxiliary computations upon the arrival of user queries, thereby guaranteeing real-time responses for continuous video stream interactions and achieving 10× faster TTFT than the prior SOTA. Even when reducing video tokens by up to 68% compared with uniform sampling, HERMES achieves superior or comparable accuracy across all benchmarks, with up to 11.4% gains on streaming datasets.
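
    The hierarchical-memory idea can be illustrated with a toy cache: recent frames keep every token, older frames are pooled down to a few summary slots, and the decoder attends over the concatenation. This is a sketch of the concept only; the granularities, mean-pooling rule, and sizes below are assumptions, not the paper's implementation.

      from collections import deque
      import numpy as np

      class HierarchicalKV:
          def __init__(self, window: int = 4, summary_tokens: int = 2):
              self.window = window                  # frames kept at full granularity
              self.summary_tokens = summary_tokens  # tokens kept per evicted frame
              self.fine = deque()                   # (K, V) per recent frame
              self.coarse = []                      # compacted (K, V) for old frames

          def append_frame(self, k: np.ndarray, v: np.ndarray):
              self.fine.append((k, v))
              if len(self.fine) > self.window:
                  k_old, v_old = self.fine.popleft()
                  # Coarse level: mean-pool the frame's tokens down to a few slots.
                  idx = np.array_split(np.arange(len(k_old)), self.summary_tokens)
                  self.coarse.append((np.stack([k_old[i].mean(0) for i in idx]),
                                      np.stack([v_old[i].mean(0) for i in idx])))

          def cache(self):
              # Cache handed to the decoder: coarse history + fine recent window.
              ks = [k for k, _ in self.coarse] + [k for k, _ in self.fine]
              vs = [v for _, v in self.coarse] + [v for _, v in self.fine]
              return np.concatenate(ks), np.concatenate(vs)

      mem = HierarchicalKV()
      for _ in range(10):                           # 10 streamed frames of 16 tokens, dim 8
          mem.append_frame(np.random.randn(16, 8), np.random.randn(16, 8))
      K, V = mem.cache()
      print(K.shape)                                # 6 old frames * 2 + 4 * 16 = 76 tokens, not 160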

  3. LLM-in-Sandbox Elicits General Agentic Intelligence

    We introduce LLM-in-Sandbox, enabling LLMs to explore within a code sandbox (i.e., a virtual computer), to elicit general intelligence in non-code domains. We first demonstrate that strong LLMs, without additional training, exhibit generalization capabilities to leverage the code sandbox for non-code tasks. For example, LLMs spontaneously access external resources to acquire new knowledge, leverage the file system to handle long contexts, and execute scripts to satisfy formatting requirements. We further show that these agentic capabilities can be enhanced through LLM-in-Sandbox Reinforcement Learning (LLM-in-Sandbox-RL), which uses only non-agentic data to train models for sandbox exploration. Experiments demonstrate that LLM-in-Sandbox, in both training-free and post-trained settings, achieves robust generalization spanning mathematics, physics, chemistry, biomedicine, long-context understanding, and instruction following. Finally, we analyze LLM-in-Sandbox's efficiency from computational and system perspectives, and open-source it as a Python package to facilitate real-world deployment.
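
    A minimal version of the sandbox loop looks like the sketch below: the model emits a script, the sandbox executes it, and the output flows back as context. The subprocess "sandbox", the fixed agent_step stand-in, and the arithmetic task are all illustrative assumptions; this is not the released package's API.

      import subprocess, sys, tempfile

      def run_in_sandbox(code: str, timeout: int = 5) -> str:
          # Toy "sandbox": a subprocess with a timeout. The real system offers an
          # isolated virtual computer with a file system and network access.
          with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
              f.write(code)
          proc = subprocess.run([sys.executable, f.name],
                                capture_output=True, text=True, timeout=timeout)
          return proc.stdout + proc.stderr

      def agent_step(history: list) -> str:
          # Stand-in for the LLM: always offloads arithmetic to code, mimicking
          # how models use the sandbox for formatting, long contexts, or lookups.
          return "print(sum(range(1, 101)))"

      history: list = []
      code = agent_step(history)
      result = run_in_sandbox(code)
      history.append(result)
      print("sandbox returned:", result.strip())    # 5050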

  4. The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

    Diffusion Large Language Models (dLLMs) break the rigid left-to-right constraint of traditional LLMs, enabling token generation in arbitrary orders. Intuitively, this flexibility implies a solution space that is a strict superset of the fixed autoregressive trajectory, theoretically unlocking superior reasoning potential for general tasks like mathematics and coding. Consequently, numerous works have leveraged reinforcement learning (RL) to elicit the reasoning capability of dLLMs. In this paper, we reveal a counter-intuitive reality: arbitrary order generation, in its current form, narrows rather than expands the reasoning boundary of dLLMs. We find that dLLMs tend to exploit this order flexibility to bypass high-uncertainty tokens that are crucial for exploration, leading to a premature collapse of the solution space. This observation challenges the premise of existing RL approaches for dLLMs, where considerable complexities, such as handling combinatorial trajectories and intractable likelihoods, are often devoted to preserving this flexibility. We demonstrate that effective reasoning is better elicited by intentionally forgoing arbitrary order and applying standard Group Relative Policy Optimization (GRPO) instead. Our approach, JustGRPO, is minimalist yet surprisingly effective (e.g., 89.1% accuracy on GSM8K) while fully retaining the parallel decoding ability of dLLMs. Project page: https://nzl-thu.github.io/the-flexibility-trap
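
    The GRPO machinery the paper falls back on is small; its core is a group-relative advantage with no learned value model. The sketch below shows that computation on made-up pass/fail rewards; the group size and reward values are illustrative, and the dLLM itself, decoded left-to-right, would be updated with these per-completion scalars.

      import numpy as np

      def grpo_advantages(rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
          # GRPO normalizes each sampled completion's reward against its own
          # group (same prompt), replacing a learned value baseline.
          return (rewards - rewards.mean()) / (rewards.std() + eps)

      group_rewards = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0])  # 8 rollouts, pass/fail
      adv = grpo_advantages(group_rewards)
      print(adv.round(3))
      # Positive advantage upweights tokens of correct completions; negative
      # advantage downweights the rest. Nothing here depends on decoding order.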

  5. BayesianVLA: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries

    Vision-Language-Action (VLA) models have shown promise in robot manipulation but often struggle to generalize to new instructions or complex multi-task scenarios. We identify a critical pathology in current training paradigms where goal-driven data collection creates a dataset bias. In such datasets, language instructions are highly predictable from visual observations alone, causing the conditional mutual information between instructions and actions to vanish, a phenomenon we term Information Collapse. Consequently, models degenerate into vision-only policies that ignore language constraints and fail in out-of-distribution (OOD) settings. To address this, we propose BayesianVLA, a novel framework that enforces instruction following via Bayesian decomposition. By introducing learnable Latent Action Queries, we construct a dual-branch architecture to estimate both a vision-only prior p(a | v) and a language-conditioned posterior π(a | v, ℓ). We then optimize the policy to maximize the conditional Pointwise Mutual Information (PMI) between actions and instructions. This objective effectively penalizes the vision shortcut and rewards actions that explicitly explain the language command. Without requiring new data, BayesianVLA significantly improves generalization. Extensive experiments on SimplerEnv and RoboCasa demonstrate substantial gains, including an 11.3% improvement on the challenging OOD SimplerEnv benchmark, validating the ability of our approach to robustly ground language in action.
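
    The PMI objective is compact enough to state in a few lines: score each action under both branches and maximize log π(a | v, ℓ) - log p(a | v). The tensors below are random stand-ins for the two branches' outputs; only the objective's form comes from the abstract, and the batch/action sizes are assumptions.

      import torch

      def pmi_objective(log_posterior: torch.Tensor, log_prior: torch.Tensor) -> torch.Tensor:
          # PMI(a; l | v) = log pi(a | v, l) - log p(a | v).
          # Maximizing it penalizes the vision shortcut where the instruction is ignored.
          return (log_posterior - log_prior).mean()

      log_post = torch.log_softmax(torch.randn(4, 10), dim=-1)  # language-conditioned posterior branch
      log_pri = torch.log_softmax(torch.randn(4, 10), dim=-1)   # vision-only prior branch
      actions = torch.randint(0, 10, (4,))                      # actions taken in the batch
      loss = -pmi_objective(log_post.gather(1, actions[:, None]),
                            log_pri.gather(1, actions[:, None]))
      print(loss.item())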

  6. Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

    Representation Autoencoders (RAEs) have shown distinct advantages in diffusion modeling on ImageNet by training in high-dimensional semantic latent spaces. In this work, we investigate whether this framework can scale to large-scale, freeform text-to-image (T2I) generation. We first scale RAE decoders on the frozen representation encoder (SigLIP-2) beyond ImageNet by training on web, synthetic, and text-rendering data, finding that while scale improves general fidelity, targeted data composition is essential for specific domains like text. We then rigorously stress-test the RAE design choices originally proposed for ImageNet. Our analysis reveals that scaling simplifies the framework: while dimension-dependent noise scheduling remains critical, architectural complexities such as wide diffusion heads and noise-augmented decoding offer negligible benefits at scale. Building on this simplified framework, we conduct a controlled comparison of RAE against the state-of-the-art FLUX VAE across diffusion transformer scales from 0.5B to 9.8B parameters. RAEs consistently outperform VAEs during pretraining across all model scales. Further, during finetuning on high-quality datasets, VAE-based models catastrophically overfit after 64 epochs, while RAE models remain stable through 256 epochs and achieve consistently better performance. Across all experiments, RAE-based diffusion models demonstrate faster convergence and better generation quality, establishing RAEs as a simpler and stronger foundation than VAEs for large-scale T2I generation. Additionally, because both visual understanding and generation can operate in a shared representation space, the multimodal model can directly reason over generated latents, opening new possibilities for unified models.
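
    Of the design choices stress-tested above, only dimension-dependent noise scheduling survives as critical. One common way to realize such a schedule is a timestep shift that grows with latent dimensionality (the SD3-style remap sketched below); whether RAE uses exactly this functional form, and the base dimension used here, are assumptions of this sketch.

      import math

      def shifted_t(t: float, dim: int, base_dim: int = 4096) -> float:
          # Remap a uniform t in [0, 1]; larger latents get pushed toward
          # higher noise levels so the SNR stays comparable across dimensions.
          shift = math.sqrt(dim / base_dim)
          return shift * t / (1.0 + (shift - 1.0) * t)

      for d in (4096, 16384):                       # e.g. VAE-sized vs. RAE-sized latents per image
          print(d, [round(shifted_t(t, d), 3) for t in (0.25, 0.5, 0.75)])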

  7. Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

    Diffusion-based language models (DLLMs) offer non-sequential, block-wise generation and richer data reuse compared to autoregressive (AR) models, but existing code DLLMs still lag behind strong AR baselines under comparable budgets. We revisit this setting in a controlled study and introduce Stable-DiffCoder, a block diffusion code model that reuses the Seed-Coder architecture, data, and training pipeline. To enable efficient knowledge learning and stable training, we incorporate a block diffusion continual pretraining (CPT) stage enhanced by a tailored warmup and block-wise clipped noise schedule. Under the same data and architecture, Stable-DiffCoder overall outperforms its AR counterpart on a broad suite of code benchmarks. Moreover, relying only on the CPT and supervised fine-tuning stages, Stable-DiffCoder achieves stronger performance than a wide range of ~8B ARs and DLLMs, demonstrating that diffusion-based training can improve code modeling quality beyond AR training alone. Furthermore, diffusion-based any-order modeling improves structured code modeling for editing and reasoning, and, through data augmentation, benefits low-resource coding languages.
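
    The 'block-wise clipped noise schedule' admits a simple reading: sample a separate diffusion time per block and clip it away from the degenerate extremes. The sketch below implements that reading on toy token ids; the clip range, block size, and masking rule are assumptions, since the abstract does not specify them.

      import numpy as np

      def block_noise_schedule(n_blocks: int, lo: float = 0.2, hi: float = 0.8,
                               rng=np.random.default_rng(0)) -> np.ndarray:
          t = rng.uniform(0.0, 1.0, size=n_blocks)  # one diffusion time per block
          return np.clip(t, lo, hi)                 # clip away near-0/near-1 mask ratios

      def mask_blocks(tokens: np.ndarray, block_size: int, ratios: np.ndarray,
                      mask_id: int = -1, rng=np.random.default_rng(0)) -> np.ndarray:
          out = tokens.copy()
          for b, r in enumerate(ratios):
              sl = slice(b * block_size, (b + 1) * block_size)
              drop = rng.random(block_size) < r     # mask roughly r of the block's tokens
              out[sl][drop] = mask_id
          return out

      toks = np.arange(32)                          # toy token ids, 4 blocks of 8
      ratios = block_noise_schedule(n_blocks=4)
      print(ratios.round(2))
      print(mask_blocks(toks, 8, ratios))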

  8. SAMTok: Representing Any Mask with Two Words

    Pixel-wise capabilities are essential for building interactive intelligent systems. However, pixel-wise multi-modal LLMs (MLLMs) remain difficult to scale due to complex region-level encoders, specialized segmentation decoders, and incompatible training objectives. To address these challenges, we present SAMTok, a discrete mask tokenizer that converts any region mask into two special tokens and reconstructs the mask from these tokens with high fidelity. By treating masks as new language tokens, SAMTok enables base MLLMs (such as the QwenVL series) to learn pixel-wise capabilities through standard next-token prediction and simple reinforcement learning, without architectural modifications or specialized loss design. SAMTok builds on SAM2 and is trained on 209M diverse masks using a mask encoder and residual vector quantizer to produce discrete, compact, and information-rich tokens. With 5M SAMTok-formatted mask understanding and generation data samples, QwenVL-SAMTok attains state-of-the-art or comparable results on region captioning, region VQA, grounded conversation, referring segmentation, scene graph parsing, and multi-round interactive segmentation. We further introduce a textual answer-matching reward that enables efficient reinforcement learning for mask generation, delivering substantial improvements on GRES and GCG benchmarks. Our results demonstrate a scalable and straightforward paradigm for equipping MLLMs with strong pixel-wise capabilities. Our code and models are available.
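
    The two-token claim maps naturally onto a two-stage residual vector quantizer: the first code captures the coarse mask embedding, the second quantizes what is left. The toy below shows that mechanic with random codebooks; the embedding dimension, codebook sizes, and the exact two-stage mapping are illustrative assumptions beyond what the abstract states.

      import numpy as np

      rng = np.random.default_rng(0)
      C1 = rng.standard_normal((256, 16))           # stage-1 codebook (size assumed)
      C2 = rng.standard_normal((256, 16))           # stage-2 codebook quantizes the residual

      def rvq_encode(z: np.ndarray) -> tuple:
          i = int(np.argmin(((C1 - z) ** 2).sum(1)))  # nearest code, stage 1
          r = z - C1[i]                                # residual left over
          j = int(np.argmin(((C2 - r) ** 2).sum(1)))  # nearest code, stage 2
          return i, j

      def rvq_decode(i: int, j: int) -> np.ndarray:
          return C1[i] + C2[j]                         # fed to a SAM2-based mask decoder

      z = rng.standard_normal(16)                      # pooled mask embedding (stand-in)
      i, j = rvq_encode(z)
      print((i, j), float(np.linalg.norm(z - rvq_decode(i, j))))
      # The two ids (i, j) are the "two words" the MLLM emits as ordinary tokens.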

  9. Learning to Discover at Test Time

    How can we use AI to discover a new state of the art for a scientific problem? Prior work in test-time scaling, such as AlphaEvolve, performs search by prompting a frozen LLM. We perform reinforcement learning at test time, so the LLM can continue to train, but now with experience specific to the test problem. This form of continual learning is quite special, because its goal is to produce one great solution rather than many good ones on average, and to solve this very problem rather than generalize to other problems. Therefore, our learning objective and search subroutine are designed to prioritize the most promising solutions. We call this method Test-Time Training to Discover (TTT-Discover). Following prior work, we focus on problems with continuous rewards. We report results for every problem we attempted, across mathematics, GPU kernel engineering, algorithm design, and biology. TTT-Discover sets the new state of the art in almost all of them: (i) Erdős' minimum overlap problem and an autocorrelation inequality; (ii) a GPUMode kernel competition (up to 2× faster than prior art); (iii) past AtCoder algorithm competitions; and (iv) a denoising problem in single-cell analysis. Our solutions are reviewed by experts or the organizers. All our results are achieved with an open model, OpenAI gpt-oss-120b, and can be reproduced with our publicly available code, in contrast to previous best results that required closed frontier models. Our test-time training runs are performed using Tinker, an API by Thinking Machines, at a cost of only a few hundred dollars per problem.
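
    Because the goal is one great solution rather than a good average, the search can afford to be greedy about where it spends learning. The sketch below captures that prioritization with a best-first pool on a toy continuous-reward problem; it is illustrative only, since the real method fine-tunes an LLM at test time rather than perturbing scalars.

      import heapq, random

      def propose(parent: float) -> float:
          return parent + random.gauss(0, 0.1)      # stand-in for LLM sampling + training

      def reward(x: float) -> float:
          return -(x - 1.3) ** 2                    # continuous reward, optimum unknown to the search

      random.seed(0)
      pool = [(-reward(0.0), 0.0)]                  # max-heap via negated reward
      best = (reward(0.0), 0.0)
      for step in range(200):
          _, parent = heapq.heappop(pool)           # always expand the most promising solution
          for _ in range(4):
              child = propose(parent)
              r = reward(child)
              best = max(best, (r, child))
              heapq.heappush(pool, (-r, child))
          heapq.heappush(pool, (-reward(parent), parent))
      print(f"best reward {best[0]:.4f} at x={best[1]:.3f}")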

  10. Qwen3-TTS Technical Report

    In this report, we present the Qwen3-TTS series, a family of advanced multilingual, controllable, robust, and streaming text-to-speech models. Qwen3-TTS supports state-of-the-art 3-second voice cloning and description-based control, allowing both the creation of entirely novel voices and fine-grained manipulation of the output speech. Trained on over 5 million hours of speech data spanning 10 languages, Qwen3-TTS adopts a dual-track LM architecture for real-time synthesis, coupled with two speech tokenizers: 1) Qwen-TTS-Tokenizer-25Hz is a single-codebook codec emphasizing semantic content, which offers seamless integration with Qwen-Audio and enables streaming waveform reconstruction via a block-wise DiT. 2) Qwen-TTS-Tokenizer-12Hz achieves extreme bitrate reduction and ultra-low-latency streaming, enabling immediate first-packet emission (97 ms) through its 12.5 Hz, 16-layer multi-codebook design and a lightweight causal ConvNet. Extensive experiments indicate state-of-the-art performance across diverse objective and subjective benchmarks (e.g., TTS multilingual test set, InstructTTSEval, and our long speech test set). To facilitate community research and development, we release both tokenizers and models under the Apache 2.0 license.
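
    The tokenizer numbers above support a quick back-of-the-envelope check, sketched below. The frame rate and layer count come from the abstract; the per-codebook size is an assumption, so treat the bitrate as indicative only.

      import math

      frame_rate, layers, codebook = 12.5, 16, 1024   # 12.5 Hz, 16 layers (abstract); 1024 codes assumed
      bits_per_sec = frame_rate * layers * math.log2(codebook)
      print(f"{bits_per_sec / 1000:.1f} kbps")        # 2.0 kbps at these settings
      print(f"{1000 / frame_rate:.0f} ms per frame")  # 80 ms of audio per frame, consistent with
                                                      # the reported ~97 ms first-packet latency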

  11. Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

    AI agents may soon become capable of autonomously completing valuable, long-horizon tasks in diverse domains. Current benchmarks either do not measure real-world tasks, or are not sufficiently difficult to meaningfully measure frontier models. To this end, we present Terminal-Bench 2.0: a carefully curated hard benchmark composed of 89 tasks in computer terminal environments inspired by problems from real workflows. Each task features a unique environment, human-written solution, and comprehensive tests for verification. We show that frontier models and agents score less than 65% on the benchmark and conduct an error analysis to identify areas for model and agent improvement. We publish the dataset and evaluation harness to assist developers and researchers in future work at https://www.tbench.ai/ .

  12. OpenVision 3: A Family of Unified Visual Encoders for Both Understanding and Generation

    This paper presents a family of advanced vision encoders, named OpenVision 3, that learn a single, unified visual representation serving both image understanding and image generation. Our core architecture is simple: we feed VAE-compressed image latents to a ViT encoder and train its output to support two complementary roles. First, the encoder output is passed to the ViT-VAE decoder to reconstruct the original image, encouraging the representation to capture generative structure. Second, the same representation is optimized with contrastive learning and image-captioning objectives, strengthening semantic features. By jointly optimizing reconstruction- and semantics-driven signals in a shared latent space, the encoder learns representations that synergize and generalize well across both regimes. We validate this unified design through extensive downstream evaluations with the encoder frozen. For multimodal understanding, we plug the encoder into the LLaVA-1.5 framework: it performs comparably with a standard CLIP vision encoder (e.g., 62.4 vs 62.2 on SeedBench, and 83.7 vs 82.9 on POPE). For generation, we test it under the RAE framework: ours substantially surpasses the standard CLIP-based encoder (e.g., gFID: 1.89 vs 2.54 on ImageNet). We hope this work can spur future research on unified modeling.
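
    The joint objective is straightforward to write down: one encoder output simultaneously feeds a reconstruction decoder and a contrastive head. The sketch below uses random tensors and a linear stand-in decoder, and it elides the captioning objective; shapes, the temperature, and the implicit 1:1 loss weighting are assumptions.

      import torch
      import torch.nn.functional as F

      B, D = 8, 256
      feat = torch.randn(B, D, requires_grad=True)    # encoder output on VAE latents (stand-in)
      decoder = torch.nn.Linear(D, 3 * 32 * 32)       # stand-in for the ViT-VAE decoder
      target = torch.randn(B, 3 * 32 * 32)            # original image, flattened (stand-in)
      txt = F.normalize(torch.randn(B, D), dim=-1)    # paired caption embeddings (stand-in)

      loss_rec = F.mse_loss(decoder(feat), target)    # generative branch: reconstruct the image
      img = F.normalize(feat, dim=-1)
      logits = img @ txt.t() / 0.07                   # semantic branch: CLIP-style contrastive
      labels = torch.arange(B)
      loss_con = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2
      loss = loss_rec + loss_con                      # captioning objective elided here
      loss.backward()                                 # both signals shape the same latent
      print(float(loss))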

  13. Towards Automated Kernel Generation in the Era of LLMs

    The performance of modern AI systems is fundamentally constrained by the quality of their underlying kernels, which translate high-level algorithmic semantics into low-level hardware operations. Achieving near-optimal kernels requires expert-level understanding of hardware architectures and programming models, making kernel engineering a critical but notoriously time-consuming and non-scalable process. Recent advances in large language models (LLMs) and LLM-based agents have opened new possibilities for automating kernel generation and optimization. LLMs are well-suited to compress expert-level kernel knowledge that is difficult to formalize, while agentic systems further enable scalable optimization by casting kernel development as an iterative, feedback-driven loop. Rapid progress has been made in this area. However, the field remains fragmented, lacking a systematic perspective for LLM-driven kernel generation. This survey addresses this gap by providing a structured overview of existing approaches, spanning LLM-based approaches and agentic optimization workflows, and systematically compiling the datasets and benchmarks that underpin learning and evaluation in this domain. Moreover, key open challenges and future research directions are further outlined, aiming to establish a comprehensive reference for the next generation of automated kernel optimization. To keep track of this field, we maintain an open-source GitHub repository at https://github.com/flagos-ai/awesome-LLM-driven-kernel-generation.

  14. Rethinking Composed Image Retrieval Evaluation: A Fine-Grained Benchmark from Image Editing

    Composed Image Retrieval (CIR) is a pivotal and complex task in multimodal understanding. Current CIR benchmarks typically feature limited query categories and fail to capture the diverse requirements of real-world scenarios. To bridge this evaluation gap, we leverage image editing to achieve precise control over modification types and content, enabling a pipeline for synthesizing queries across a broad spectrum of categories. Using this pipeline, we construct EDIR, a novel fine-grained CIR benchmark. EDIR encompasses 5,000 high-quality queries structured across five main categories and fifteen subcategories. Our comprehensive evaluation of 13 multimodal embedding models reveals a significant capability gap; even state-of-the-art models (e.g., RzenEmbed and GME) struggle to perform consistently across all subcategories, highlighting the rigorous nature of our benchmark. Through comparative analysis, we further uncover inherent limitations in existing benchmarks, such as modality biases and insufficient categorical coverage. Furthermore, an in-domain training experiment demonstrates the feasibility of our benchmark. This experiment clarifies the task challenges by distinguishing between categories that are solvable with targeted data and those that expose intrinsic limitations of current model architectures.

  15. ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion

    Generating animated 3D objects is at the heart of many applications, yet most advanced works are typically difficult to apply in practice because of their limited setup, their long runtime, or their limited quality. We introduce ActionMesh, a generative model that predicts production-ready 3D meshes "in action" in a feed-forward manner. Drawing inspiration from early video models, our key insight is to modify existing 3D diffusion models to include a temporal axis, resulting in a framework we dub "temporal 3D diffusion". Specifically, we first adapt the 3D diffusion stage to generate a sequence of synchronized latents representing time-varying and independent 3D shapes. Second, we design a temporal 3D autoencoder that translates a sequence of independent shapes into the corresponding deformations of a pre-defined reference shape, allowing us to build an animation. Combining these two components, ActionMesh generates animated 3D meshes from different inputs like a monocular video, a text description, or even a 3D mesh with a text prompt describing its animation. Moreover, compared to previous approaches, our method is fast and produces results that are rig-free and topology consistent, hence enabling rapid iteration and seamless applications like texturing and retargeting. We evaluate our model on standard video-to-4D benchmarks (Consistent4D, Objaverse) and report state-of-the-art performance on both geometric accuracy and temporal consistency, demonstrating that our model can deliver animated 3D meshes with unprecedented speed and quality.
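
    The temporal autoencoder's output format is worth pinning down, since it is what makes the meshes production-ready: instead of independent shapes per frame, the animation is stored as deformations of one reference shape, so vertex count and connectivity never change. A toy illustration, with random arrays standing in for the model's outputs:

      import numpy as np

      T, V = 5, 100
      frames = np.random.randn(T, V, 3)           # independent per-frame vertex positions
      ref = frames[0]                             # pre-defined reference shape
      offsets = frames - ref                      # animation = time-varying deformation field
      recon = ref + offsets                       # replaying offsets reproduces every frame
      print(np.allclose(recon, frames))           # True; fixed topology enables texturing/retargeting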

Solidot (15)

  1. EU wind and solar surpassed fossil fuels for the first time last year

    According to a report from Ember, wind and solar generated more electricity in the EU than fossil fuels for the first time in 2025. The milestone was driven largely by rapid growth in solar power, which reached a record 13% share of EU generation; wind and solar together supplied 30% of EU electricity, overtaking fossil fuels at 29%. The shift is all the more significant because the United States, the EU's substitute supplier for Russian LNG, is becoming increasingly unreliable and increasingly willing to weaponize economic tools. In more than half of EU member states, wind and solar generation now exceeds that from fossil fuels.

  2. Tracking the reentering Shenzhou-15 orbital module with sonic booms

    The amount of space debris reentering Earth's atmosphere has grown exponentially in recent years, and uncontrolled reentries pose a growing threat to human life, infrastructure, and the environment. In a paper published in Science, researchers report a method for detecting the shock waves (sonic booms) produced by reentering debris using publicly available data from ground-based seismic sensors. They validated the method by monitoring the April 2024 reentry of the Shenzhou-15 orbital module, which had been in orbital decay and regularly overflew major population centers on six continents. Using seismic data from sensors in Southern California and Nevada, the researchers analyzed the sonic booms generated as Shenzhou-15 reentered. The module's final observed reentry point was roughly 8,600 km from the location estimated by tracking and impact predictions. The researchers successfully derived the spacecraft's ground track, speed, and altitude. The sonic-boom pattern also showed that Shenzhou-15 did not come down in a single breakup event but likely fragmented gradually into smaller pieces, consistent with eyewitness reports and video footage.

  3. After the US withdrawal, California joins the WHO's disease alert network

    The day after the United States formally withdrew from the WHO, California joined the WHO's Global Outbreak Alert and Response Network (GOARN), becoming the first US state to rejoin the organization. California Governor Gavin Newsom said in a statement: "The Trump administration's withdrawal from the WHO is a reckless decision that will harm all Californians and all Americans. California will not stand by and watch the chaos this decision brings. We will continue to strengthen cooperation globally and stay at the forefront of public health preparedness..."

  4. Microsoft gave the FBI BitLocker keys to unlock encrypted drive data

    Microsoft recently provided the FBI with BitLocker keys to unlock encrypted data on the hard drives of three laptops. Windows 11 enables BitLocker full-disk encryption by default, and the key is uploaded to the user's Microsoft Account, i.e. to Microsoft's cloud, where Microsoft and law enforcement can access it to decrypt BitLocker-encrypted drives. The case relates to pandemic unemployment assistance fraud in Guam. The FBI applied for a search warrant six months after seizing the three BitLocker-encrypted laptops. Microsoft declined to comment; it has previously said it receives an average of 20 requests per year to hand over BitLocker keys.

  5. Bending Spoons lays off nearly all Vimeo employees

    The Italian company Bending Spoons laid off all employees after acquiring the well-known note-taking app Evernote in 2023, and it has done the same after acquiring the video site Vimeo for $1.38 billion in September 2025. Former Vimeo VP of brand Dave Brown said on LinkedIn that the layoffs hit "most of the company." One video engineer said "almost everyone" was let go, "including the entire video team," and another software engineer said he lost his job along with "the vast majority of the company."

  6. Mistral CEO says China's AI models are not lagging behind

    Mistral CEO Arthur Mensch believes China's AI models are not behind America's, noting that the capabilities of China's open-source model technology "probably put pressure on US CEOs." His view contrasts sharply with other tech leaders, who hold that China trails the West by months or even years at the cutting edge of AI. Google DeepMind CEO Demis Hassabis has said China is about six months behind the West in frontier model development and has yet to demonstrate a capacity for breakthrough advances. Anthropic CEO Dario Amodei has said US restrictions on selling the most advanced AI chips are holding China back, likening the sale of high-end AI chips to China to "selling nuclear weapons to North Korea."

  7. Sabotaging a power grid digitally

    As US special forces arrested Nicolás Maduro, the Venezuelan capital Caracas simultaneously went dark. The episode marks a profound shift in the nature of modern conflict: the fusion of physical war and cyberwar. The blackout was caused not by blowing up transmission towers or cutting power lines but by precise, covert manipulation of the industrial control systems that manage the grid. Malware can compromise industrial controllers and intercept legitimate commands sent by grid operators, replacing them with malicious ones that destabilize the grid. It can issue commands that rapidly open and close circuit breakers, causing large transformers or generators to overheat or fall out of sync with the grid, inflicting physical damage that can trigger fires or explosions and take months to repair. Historical attacks on industrial control systems include the Stuxnet worm attack on Iran's uranium-enrichment centrifuges in 2009 and Russia's Industroyer attack on Ukraine's energy sector in 2016.

  8. eBay bans AI agents from automated shopping

    eBay has updated its user agreement to explicitly prohibit third-party generative AI from interacting with the platform to shop automatically on users' behalf without permission. The new terms take effect on February 20, 2026. Over the past year several AI companies have launched automated shopping features: OpenAI introduced Instant Checkout, which lets users buy from Etsy and Shopify merchants directly in the chat interface; Perplexity offers its paying customers a Buy with Pro feature; and Amazon offers Buy For Me. eBay's new terms ban third-party automated shopping; they do not rule out eBay offering AI shopping features of its own.

  9. US formally exits the WHO with a $278 million bill unpaid

    The United States has formally withdrawn from the WHO, leaving $278 million in unpaid bills. Under a 1948 joint resolution, the US gave the WHO one year's notice of its intent to withdraw on January 22 of last year. In practice, however, the Trump administration cut ties with the WHO immediately. The joint resolution also requires the US to meet its financial obligations in full before withdrawing, a commitment the Trump administration has likewise not honored, leaving $278 million in dues in arrears. The loss of US financial support is a major blow to the WHO. After receiving the notice early last year, the WHO immediately began cutting costs, freezing hiring, restricting travel spending, moving all meetings to virtual formats, limiting IT equipment upgrades, and suspending office renovations. The WHO also began laying off staff; by the middle of this year its total headcount is expected to shrink by 22%.

  10. TikTok establishes US company

    TikTok announced that it has established TikTok USDS Joint Venture LLC, which will be responsible for data protection, algorithm security, content moderation, and software assurance for TikTok in the US. Oracle, Silver Lake, and MGX each hold a 15% stake in the joint venture. Other investors include Vastmere Strategic Investments LLC, an affiliate of Susquehanna International Group, Alpha Wave Partners, and several other firms; ByteDance retains a 19.9% stake.

  11. Artists warn AI companies that "stealing isn't innovation"

    Roughly 800 artists, writers, actors, and musicians have signed on to a campaign called "Stealing Isn't Innovation," fighting back against what they describe as mass theft by AI companies. Signatories include writers such as George Saunders and Jodi Picoult; stars such as Cate Blanchett and Scarlett Johansson; and musicians including R.E.M., Billy Corgan, and The Roots. The datasets used to train large AI models contain vast amounts of unauthorized copyrighted content. The campaign argues: "Driven by the race for GenAI dominance, profit-seeking tech companies, from the world's wealthiest corporations to private-equity-backed startups, have scraped massive amounts of creative work from the internet without authorization and without compensation. This unlawful plunder of intellectual property has spawned an information ecosystem riddled with misinformation, deepfakes, and mediocre AI slop. It not only risks collapsing AI models through data contamination but directly threatens America's leadership in AI and its international competitiveness."

  12. Scientists trace the origins of syphilis back 5,500 years

    According to a study published in Science, scientists have traced the origins of syphilis back 5,500 years. Researchers recovered a 5,500-year-old Treponema pallidum genome from the remains of mid-Holocene hunter-gatherers found in Colombia, pushing the pathogen's known genetic record back by roughly 3,000 years. The genome (TE1-3) represents a previously unknown branch of T. pallidum that diverged before all other known subspecies emerged. Although TE1-3 clearly belongs to T. pallidum, its genetic makeup is diverse and sharply different from modern strains, and it carries the full set of genetic features associated with virulence in modern T. pallidum. The findings show that T. pallidum predates the rise of agriculture in the Americas, indicating that the pathogen's emergence did not depend on the agricultural intensification and population density usually associated with the spread of infectious disease. Instead, the TE1-3 lineage is linked to the socio-ecological conditions of hunter-gatherer societies, including high mobility, interaction among small communities, and close contact with wild animals. The study broadens our understanding of the timing, ecology, and social context of treponemal disease worldwide.

  13. Half of the world's 100 largest cities are highly water-stressed

    According to a new analysis of NASA satellite imagery by scientists at University College London, half of the world's 100 largest cities are highly water-stressed, and 38 of them sit in areas of extreme water stress. Beijing, New York, Los Angeles, Rio de Janeiro, and Delhi face extreme water stress, while London, Bangkok, and Jakarta are highly stressed. The analysis found clear drying trends in cities such as Chennai, Tehran, and Zhengzhou, and clear wetting trends in cities such as Tokyo, Lagos, and Kampala. About 1.1 billion people live in metropolitan areas that are drying severely, and 96 million live in areas trending wetter. Tehran has endured six consecutive years of drought and is one step from "day zero," when no water is left; Cape Town and Chennai are also approaching day zero.

  14. European Parliament calls for less dependence on US tech giants

    The European Parliament has called on the European Commission to reduce dependence on US tech giants and prioritize homegrown EU cloud computing, AI, and open-source infrastructure. To strengthen digital sovereignty in response to growing US control over critical digital infrastructure, the resolution adopted by Parliament emphasizes putting European tech first, reforming public procurement, and the principle of "public money, public code." MEPs want to lay the foundation for European digital public infrastructure built on open standards and interoperability.

  15. ReactOS celebrates its 30th anniversary

    ReactOS is a project developing an open-source operating system compatible with Windows NT and Windows 2000 applications and hardware drivers. It rose from the ruins of FreeWin95, a project that set out to build an open-source clone of Windows 95 but ran into trouble; project coordinator Jason Filby then led a new cloning effort targeting Windows NT, named ReactOS, with the aim of breaking Microsoft's monopoly on PC operating systems. Development began in 1996, and the first release shipped in 1998. Over the past 30 years, 301 contributors have made more than 88,000 commits totaling 14,929,578 lines of code. In a blog post marking the 30th anniversary, the developers said they will keep pushing the project forward; new work in progress includes RosBE, a new build environment for developers; a new NTFS driver; a new ATA driver; multiprocessor (SMP) support; support for Class 3 UEFI systems; kernel- and user-mode address space layout randomization (ASLR); and support for modern GPU drivers built on WDDM.