OrangeBot.AI Digest — 2026-01-23

58 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Tesla kills Autopilot, locks lane-keeping behind $99/month fee (arstechnica.com)
  2. Auto-compact not triggering on Claude.ai despite being marked as fixed (github.com)
  3. Proof of Corn (proofofcorn.com)
  4. Microsoft gave FBI set of BitLocker encryption keys to unlock suspects' laptops (techcrunch.com)
  5. Gas Town's agent patterns, design bottlenecks, and vibecoding at scale (maggieappleton.com)
  6. KORG phase8 – Acoustic Synthesizer (www.korg.com)
  7. Radicle: The Sovereign Forge (radicle.xyz)
  8. European Alternatives (european-alternatives.eu)
  9. Microsoft mishandling example.com (tinyapps.org)
  10. Show HN: Whosthere: A LAN discovery tool with a modern TUI, written in Go (github.com)
  11. What has Docker become? (tuananh.net)
  12. Booting from a vinyl record (2020) (boginjr.com)
  13. AI Usage Policy (github.com)
  14. AI is a horse (2024) (kconner.com)
  15. Updates to our web search products and Programmable Search Engine capabilities (programmablesearchengine.googleblog.com)

GitHub Trending (13)

  1. remotion-dev / remotion

    🎥 Make videos programmatically with React

  2. microsoft / VibeVoice

    Open-Source Frontier Voice AI

  3. block / goose

    an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

  4. ai-dynamo / dynamo

    A Datacenter Scale Distributed Inference Serving Framework

  5. browser-use / browser-use

    🌐 Make websites accessible for AI agents. Automate tasks online with ease.

  6. github / copilot-cli

    GitHub Copilot CLI brings the power of Copilot coding agent directly to your terminal.

  7. Asabeneh / 30-Days-Of-Python

    The 30 Days of Python programming challenge is a step-by-step guide to learn the Python programming language in 30 days. This challenge may take more than 100 days. Follow your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw

  8. anthropics / claude-code

    Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.

  9. deepseek-ai / FlashMLA

    FlashMLA: Efficient Multi-head Latent Attention Kernels

  10. microsoft / Data-Science-For-Beginners

    10 Weeks, 20 Lessons, Data Science for All!

  11. OpenBMB / UltraRAG

    UltraRAG v3: A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

  12. lyogavin / airllm

    AirLLM 70B inference with single 4GB GPU

  13. KellerJordan / modded-nanogpt

    NanoGPT (124M) in 2 minutes

Hugging Face (15)

  1. EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

    The development of native computer-use agents (CUA) represents a significant leap in multimodal AI. However, their potential is currently bottlenecked by the constraints of static data scaling. Existing paradigms relying primarily on passive imitation of static datasets struggle to capture the intricate causal dynamics inherent in long-horizon computer tasks. In this work, we introduce EvoCUA, a native computer use agentic model. Unlike static imitation, EvoCUA integrates data generation and policy optimization into a self-sustaining evolutionary cycle. To mitigate data scarcity, we develop a verifiable synthesis engine that autonomously generates diverse tasks coupled with executable validators. To enable large-scale experience acquisition, we design a scalable infrastructure orchestrating tens of thousands of asynchronous sandbox rollouts. Building on these massive trajectories, we propose an iterative evolving learning strategy to efficiently internalize this experience. This mechanism dynamically regulates policy updates by identifying capability boundaries -- reinforcing successful routines while transforming failure trajectories into rich supervision through error analysis and self-correction. Empirical evaluations on the OSWorld benchmark demonstrate that EvoCUA achieves a success rate of 56.7%, establishing a new open-source state-of-the-art. Notably, EvoCUA significantly outperforms the previous best open-source model, OpenCUA-72B (45.0%), and surpasses leading closed-weights models such as UI-TARS-2 (53.1%). Crucially, our results underscore the generalizability of this approach: the evolving paradigm driven by learning from experience yields consistent performance gains across foundation models of varying scales, establishing a robust and scalable path for advancing native agent capabilities.

  2. The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

    Diffusion Large Language Models (dLLMs) break the rigid left-to-right constraint of traditional LLMs, enabling token generation in arbitrary orders. Intuitively, this flexibility implies a solution space that is a strict superset of the fixed autoregressive trajectory, theoretically unlocking superior reasoning potential for general tasks like mathematics and coding. Consequently, numerous works have leveraged reinforcement learning (RL) to elicit the reasoning capability of dLLMs. In this paper, we reveal a counter-intuitive reality: arbitrary order generation, in its current form, narrows rather than expands the reasoning boundary of dLLMs. We find that dLLMs tend to exploit this order flexibility to bypass high-uncertainty tokens that are crucial for exploration, leading to a premature collapse of the solution space. This observation challenges the premise of existing RL approaches for dLLMs, where considerable complexities, such as handling combinatorial trajectories and intractable likelihoods, are often devoted to preserving this flexibility. We demonstrate that effective reasoning is better elicited by intentionally forgoing arbitrary order and applying standard Group Relative Policy Optimization (GRPO) instead. Our approach, JustGRPO, is minimalist yet surprisingly effective (e.g., 89.1% accuracy on GSM8K) while fully retaining the parallel decoding ability of dLLMs. Project page: https://nzl-thu.github.io/the-flexibility-trap
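JustGRPO's core ingredient, the group-relative advantage, is simple to state: each sampled completion's reward is normalized against the mean and spread of its own sampling group. A minimal sketch (function name hypothetical; the full GRPO objective also involves a clipped policy ratio and a KL penalty, omitted here):

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages: normalize each sampled
    completion's reward by its group's mean and population std."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mu) / sigma for r in rewards]

# Four completions sampled for one prompt, rewarded 0/1 for correctness
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # → [1.0, -1.0, 1.0, -1.0]
```

Because the baseline is computed per group rather than by a learned value model, correct completions are reinforced and incorrect ones suppressed without any critic network.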

  3. HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

    Recent advancements in Multimodal Large Language Models (MLLMs) have demonstrated significant improvement in offline video understanding. However, extending these capabilities to streaming video inputs remains challenging, as existing models struggle to simultaneously maintain stable understanding performance, real-time responses, and low GPU memory overhead. To address this challenge, we propose HERMES, a novel training-free architecture for real-time and accurate understanding of video streams. Based on a mechanistic attention investigation, we conceptualize the KV cache as a hierarchical memory framework that encapsulates video information across multiple granularities. During inference, HERMES reuses a compact KV cache, enabling efficient streaming understanding under resource constraints. Notably, HERMES requires no auxiliary computations upon the arrival of user queries, thereby guaranteeing real-time responses for continuous video stream interactions, and achieves 10× faster TTFT than the prior SOTA. Even when reducing video tokens by up to 68% compared with uniform sampling, HERMES achieves superior or comparable accuracy across all benchmarks, with up to 11.4% gains on streaming datasets.

  4. BayesianVLA: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries

    Vision-Language-Action (VLA) models have shown promise in robot manipulation but often struggle to generalize to new instructions or complex multi-task scenarios. We identify a critical pathology in current training paradigms where goal-driven data collection creates a dataset bias. In such datasets, language instructions are highly predictable from visual observations alone, causing the conditional mutual information between instructions and actions to vanish, a phenomenon we term Information Collapse. Consequently, models degenerate into vision-only policies that ignore language constraints and fail in out-of-distribution (OOD) settings. To address this, we propose BayesianVLA, a novel framework that enforces instruction following via Bayesian decomposition. By introducing learnable Latent Action Queries, we construct a dual-branch architecture to estimate both a vision-only prior p(a | v) and a language-conditioned posterior π(a | v, ℓ). We then optimize the policy to maximize the conditional Pointwise Mutual Information (PMI) between actions and instructions. This objective effectively penalizes the vision shortcut and rewards actions that explicitly explain the language command. Without requiring new data, BayesianVLA significantly improves generalization. Extensive experiments on SimplerEnv and RoboCasa demonstrate substantial gains, including an 11.3% improvement on the challenging OOD SimplerEnv benchmark, validating the ability of our approach to robustly ground language in action.
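The PMI objective described in the abstract reduces to a log-ratio of the two branches' likelihoods. A minimal sketch under that reading (function name hypothetical):

```python
import math

def pmi_score(logp_posterior, logp_prior):
    """Conditional pointwise mutual information:
    PMI(a; l | v) = log pi(a | v, l) - log p(a | v).
    Positive when the instruction l makes action a more likely
    than vision alone, so maximizing it penalizes vision-only
    shortcuts and rewards instruction-grounded actions."""
    return logp_posterior - logp_prior

# Instruction raises the action's likelihood from 0.2 to 0.8:
print(round(pmi_score(math.log(0.8), math.log(0.2)), 3))  # → 1.386
```

An action the model would have taken anyway from vision alone scores near zero, which is how the objective counters the Information Collapse the paper identifies.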

  5. LLM-in-Sandbox Elicits General Agentic Intelligence

    We introduce LLM-in-Sandbox, enabling LLMs to explore within a code sandbox (i.e., a virtual computer), to elicit general intelligence in non-code domains. We first demonstrate that strong LLMs, without additional training, exhibit generalization capabilities to leverage the code sandbox for non-code tasks. For example, LLMs spontaneously access external resources to acquire new knowledge, leverage the file system to handle long contexts, and execute scripts to satisfy formatting requirements. We further show that these agentic capabilities can be enhanced through LLM-in-Sandbox Reinforcement Learning (LLM-in-Sandbox-RL), which uses only non-agentic data to train models for sandbox exploration. Experiments demonstrate that LLM-in-Sandbox, in both training-free and post-trained settings, achieves robust generalization spanning mathematics, physics, chemistry, biomedicine, long-context understanding, and instruction following. Finally, we analyze LLM-in-Sandbox's efficiency from computational and system perspectives, and open-source it as a Python package to facilitate real-world deployment.

  6. Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

    Diffusion-based language models (DLLMs) offer non-sequential, block-wise generation and richer data reuse compared to autoregressive (AR) models, but existing code DLLMs still lag behind strong AR baselines under comparable budgets. We revisit this setting in a controlled study and introduce Stable-DiffCoder, a block diffusion code model that reuses the Seed-Coder architecture, data, and training pipeline. To enable efficient knowledge learning and stable training, we incorporate a block diffusion continual pretraining (CPT) stage enhanced by a tailored warmup and block-wise clipped noise schedule. Under the same data and architecture, Stable-DiffCoder overall outperforms its AR counterpart on a broad suite of code benchmarks. Moreover, relying only on the CPT and supervised fine-tuning stages, Stable-DiffCoder achieves stronger performance than a wide range of ~8B ARs and DLLMs, demonstrating that diffusion-based training can improve code modeling quality beyond AR training alone. Furthermore, diffusion-based any-order modeling improves structured code modeling for editing and reasoning and, through data augmentation, benefits low-resource coding languages.

  7. Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

    Representation Autoencoders (RAEs) have shown distinct advantages in diffusion modeling on ImageNet by training in high-dimensional semantic latent spaces. In this work, we investigate whether this framework can scale to large-scale, freeform text-to-image (T2I) generation. We first scale RAE decoders on the frozen representation encoder (SigLIP-2) beyond ImageNet by training on web, synthetic, and text-rendering data, finding that while scale improves general fidelity, targeted data composition is essential for specific domains like text. We then rigorously stress-test the RAE design choices originally proposed for ImageNet. Our analysis reveals that scaling simplifies the framework: while dimension-dependent noise scheduling remains critical, architectural complexities such as wide diffusion heads and noise-augmented decoding offer negligible benefits at scale. Building on this simplified framework, we conduct a controlled comparison of RAE against the state-of-the-art FLUX VAE across diffusion transformer scales from 0.5B to 9.8B parameters. RAEs consistently outperform VAEs during pretraining across all model scales. Further, during finetuning on high-quality datasets, VAE-based models catastrophically overfit after 64 epochs, while RAE models remain stable through 256 epochs and achieve consistently better performance. Across all experiments, RAE-based diffusion models demonstrate faster convergence and better generation quality, establishing RAEs as a simpler and stronger foundation than VAEs for large-scale T2I generation. Additionally, because both visual understanding and generation can operate in a shared representation space, the multimodal model can directly reason over generated latents, opening new possibilities for unified models.

  8. SAMTok: Representing Any Mask with Two Words

    Pixel-wise capabilities are essential for building interactive intelligent systems. However, pixel-wise multi-modal LLMs (MLLMs) remain difficult to scale due to complex region-level encoders, specialized segmentation decoders, and incompatible training objectives. To address these challenges, we present SAMTok, a discrete mask tokenizer that converts any region mask into two special tokens and reconstructs the mask from these tokens with high fidelity. By treating masks as new language tokens, SAMTok enables base MLLMs (such as the QwenVL series) to learn pixel-wise capabilities through standard next-token prediction and simple reinforcement learning, without architectural modifications or specialized loss design. SAMTok builds on SAM2 and is trained on 209M diverse masks using a mask encoder and residual vector quantizer to produce discrete, compact, and information-rich tokens. With 5M SAMTok-formatted mask understanding and generation data samples, QwenVL-SAMTok attains state-of-the-art or comparable results on region captioning, region VQA, grounded conversation, referring segmentation, scene graph parsing, and multi-round interactive segmentation. We further introduce a textual answer-matching reward that enables efficient reinforcement learning for mask generation, delivering substantial improvements on GRES and GCG benchmarks. Our results demonstrate a scalable and straightforward paradigm for equipping MLLMs with strong pixel-wise capabilities. Our code and models are available.

  9. Learning to Discover at Test Time

    How can we use AI to discover a new state of the art for a scientific problem? Prior work in test-time scaling, such as AlphaEvolve, performs search by prompting a frozen LLM. We perform reinforcement learning at test time, so the LLM can continue to train, but now with experience specific to the test problem. This form of continual learning is quite special, because its goal is to produce one great solution rather than many good ones on average, and to solve this very problem rather than generalize to other problems. Therefore, our learning objective and search subroutine are designed to prioritize the most promising solutions. We call this method Test-Time Training to Discover (TTT-Discover). Following prior work, we focus on problems with continuous rewards. We report results for every problem we attempted, across mathematics, GPU kernel engineering, algorithm design, and biology. TTT-Discover sets the new state of the art in almost all of them: (i) Erdős' minimum overlap problem and an autocorrelation inequality; (ii) a GPUMode kernel competition (up to 2× faster than prior art); (iii) past AtCoder algorithm competitions; and (iv) a denoising problem in single-cell analysis. Our solutions are reviewed by experts or the organizers. All our results are achieved with an open model, OpenAI gpt-oss-120b, and can be reproduced with our publicly available code, in contrast to previous best results that required closed frontier models. Our test-time training runs are performed using Tinker, an API by Thinking Machines, with a cost of only a few hundred dollars per problem.

  10. Qwen3-TTS Technical Report

    In this report, we present the Qwen3-TTS series, a family of advanced multilingual, controllable, robust, and streaming text-to-speech models. Qwen3-TTS supports state-of-the-art 3-second voice cloning and description-based control, allowing both the creation of entirely novel voices and fine-grained manipulation over the output speech. Trained on over 5 million hours of speech data spanning 10 languages, Qwen3-TTS adopts a dual-track LM architecture for real-time synthesis, coupled with two speech tokenizers: 1) Qwen-TTS-Tokenizer-25Hz is a single-codebook codec emphasizing semantic content, which offers seamless integration with Qwen-Audio and enables streaming waveform reconstruction via a block-wise DiT. 2) Qwen-TTS-Tokenizer-12Hz achieves extreme bitrate reduction and ultra-low-latency streaming, enabling immediate first-packet emission (97 ms) through its 12.5 Hz, 16-layer multi-codebook design and a lightweight causal ConvNet. Extensive experiments indicate state-of-the-art performance across diverse objective and subjective benchmarks (e.g., TTS multilingual test set, InstructTTSEval, and our long speech test set). To facilitate community research and development, we release both tokenizers and models under the Apache 2.0 license.

  11. Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

    AI agents may soon become capable of autonomously completing valuable, long-horizon tasks in diverse domains. Current benchmarks either do not measure real-world tasks, or are not sufficiently difficult to meaningfully measure frontier models. To this end, we present Terminal-Bench 2.0: a carefully curated hard benchmark composed of 89 tasks in computer terminal environments inspired by problems from real workflows. Each task features a unique environment, human-written solution, and comprehensive tests for verification. We show that frontier models and agents score less than 65% on the benchmark and conduct an error analysis to identify areas for model and agent improvement. We publish the dataset and evaluation harness to assist developers and researchers in future work at https://www.tbench.ai/.

  12. Rethinking Composed Image Retrieval Evaluation: A Fine-Grained Benchmark from Image Editing

    Composed Image Retrieval (CIR) is a pivotal and complex task in multimodal understanding. Current CIR benchmarks typically feature limited query categories and fail to capture the diverse requirements of real-world scenarios. To bridge this evaluation gap, we leverage image editing to achieve precise control over modification types and content, enabling a pipeline for synthesizing queries across a broad spectrum of categories. Using this pipeline, we construct EDIR, a novel fine-grained CIR benchmark. EDIR encompasses 5,000 high-quality queries structured across five main categories and fifteen subcategories. Our comprehensive evaluation of 13 multimodal embedding models reveals a significant capability gap; even state-of-the-art models (e.g., RzenEmbed and GME) struggle to perform consistently across all subcategories, highlighting the rigorous nature of our benchmark. Through comparative analysis, we further uncover inherent limitations in existing benchmarks, such as modality biases and insufficient categorical coverage. Furthermore, an in-domain training experiment demonstrates the feasibility of our benchmark. This experiment clarifies the task challenges by distinguishing between categories that are solvable with targeted data and those that expose intrinsic limitations of current model architectures.

  13. Towards Automated Kernel Generation in the Era of LLMs

    The performance of modern AI systems is fundamentally constrained by the quality of their underlying kernels, which translate high-level algorithmic semantics into low-level hardware operations. Achieving near-optimal kernels requires expert-level understanding of hardware architectures and programming models, making kernel engineering a critical but notoriously time-consuming and non-scalable process. Recent advances in large language models (LLMs) and LLM-based agents have opened new possibilities for automating kernel generation and optimization. LLMs are well-suited to compress expert-level kernel knowledge that is difficult to formalize, while agentic systems further enable scalable optimization by casting kernel development as an iterative, feedback-driven loop. Rapid progress has been made in this area. However, the field remains fragmented, lacking a systematic perspective for LLM-driven kernel generation. This survey addresses this gap by providing a structured overview of existing approaches, spanning LLM-based approaches and agentic optimization workflows, and systematically compiling the datasets and benchmarks that underpin learning and evaluation in this domain. Moreover, key open challenges and future research directions are further outlined, aiming to establish a comprehensive reference for the next generation of automated kernel optimization. To keep track of this field, we maintain an open-source GitHub repository at https://github.com/flagos-ai/awesome-LLM-driven-kernel-generation.

  14. OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

    This paper presents a family of advanced vision encoders, named OpenVision 3, that learns a single, unified visual representation that can serve both image understanding and image generation. Our core architecture is simple: we feed VAE-compressed image latents to a ViT encoder and train its output to support two complementary roles. First, the encoder output is passed to the ViT-VAE decoder to reconstruct the original image, encouraging the representation to capture generative structure. Second, the same representation is optimized with contrastive learning and image-captioning objectives, strengthening semantic features. By jointly optimizing reconstruction- and semantics-driven signals in a shared latent space, the encoder learns representations that synergize and generalize well across both regimes. We validate this unified design through extensive downstream evaluations with the encoder frozen. For multimodal understanding, we plug the encoder into the LLaVA-1.5 framework: it performs comparably with a standard CLIP vision encoder (e.g., 62.4 vs 62.2 on SeedBench, and 83.7 vs 82.9 on POPE). For generation, we test it under the RAE framework: ours substantially surpasses the standard CLIP-based encoder (e.g., gFID: 1.89 vs 2.54 on ImageNet). We hope this work can spur future research on unified modeling.

  15. PROGRESSLM: Towards Progress Reasoning in Vision-Language Models

    Estimating task progress requires reasoning over long-horizon dynamics rather than recognizing static visual content. While modern Vision-Language Models (VLMs) excel at describing what is visible, it remains unclear whether they can infer how far a task has progressed from partial observations. To this end, we introduce Progress-Bench, a benchmark for systematically evaluating progress reasoning in VLMs. Beyond benchmarking, we further explore a human-inspired two-stage progress reasoning paradigm through both training-free prompting and a training-based approach using the curated dataset ProgressLM-45K. Experiments on 14 VLMs show that most models are not yet ready for task progress estimation, exhibiting sensitivity to demonstration modality and viewpoint changes, as well as poor handling of unanswerable cases. While training-free prompting that enforces structured progress reasoning yields limited and model-dependent gains, the training-based ProgressLM-3B achieves consistent improvements even at a small model scale, despite being trained on a task set fully disjoint from the evaluation tasks. Further analyses reveal characteristic error patterns and clarify when and why progress reasoning succeeds or fails.

Solidot (15)

  1. eBay bans AI agents from shopping automatically

    eBay has updated its user agreement to explicitly prohibit third-party generative AI from interacting with the platform without permission to shop on users' behalf. The new terms take effect on February 20, 2026. Over the past year several AI companies have launched automated shopping features: OpenAI introduced Instant Checkout, which lets users buy from Etsy and Shopify merchants directly in the chat interface; Perplexity offers Buy with Pro to its paying customers; and Amazon provides Buy For Me. The new terms ban only third-party automated shopping, leaving open the possibility that eBay will offer AI shopping features of its own.

  2. US formally exits the WHO, leaving a $278 million bill unpaid

    The United States has formally withdrawn from the WHO, leaving $278 million in unpaid dues. Under a 1948 joint resolution, the US notified the WHO of its intent to withdraw one year in advance, on January 22 of last year; in practice, however, the Trump administration cut ties with the WHO immediately. The joint resolution requires the US to pay its financial obligations in full before withdrawing, a commitment the administration also failed to honor, leaving $278 million in membership dues outstanding. Losing US financial support is a major blow to the WHO. Upon receiving the notice early last year, the WHO immediately began cutting costs, including freezing hiring, restricting travel, moving all meetings online, limiting IT equipment upgrades, and suspending office renovations. It also began layoffs; by the middle of this year its total headcount is expected to shrink by 22%.

  3. TikTok establishes a US company

    TikTok announced that it has established TikTok USDS Joint Venture LLC, a US data-security joint venture responsible for data protection, algorithm security, content moderation, and software assurance for TikTok in the US. Oracle, Silver Lake, and MGX each hold a 15% stake in the joint venture. Other investors include Vastmere Strategic Investments LLC, an affiliate of Susquehanna International Group, and Alpha Wave Partners, among others; ByteDance retains a 19.9% stake.

  4. Artists warn AI companies that "Stealing Isn't Innovation"

    Roughly 800 artists, writers, actors, and musicians have signed on to a campaign called "Stealing Isn't Innovation", pushing back against what they describe as mass theft by AI companies. Signatories include writers such as George Saunders and Jodi Picoult, actors such as Cate Blanchett and Scarlett Johansson, and musicians including R.E.M., Billy Corgan, and The Roots. The datasets AI companies use to train large models contain vast amounts of unauthorized copyrighted content. The campaign argues that, driven by the race for GenAI dominance, profit-seeking tech companies, from the world's wealthiest corporations to private-equity-backed startups, have scraped enormous quantities of creative work from the web without authorization or compensation. This unlawful plunder of intellectual property, it says, has spawned an information ecosystem awash in misinformation, deepfakes, and mediocre AI slop, which not only risks model collapse through data pollution but also directly threatens US leadership and competitiveness in AI.

  5. Scientists trace the origin of syphilis back 5,500 years

    According to a study published in Science, scientists have traced the origin of syphilis back 5,500 years. Researchers recovered a 5,500-year-old Treponema pallidum genome from the remains of mid-Holocene hunter-gatherers found in Colombia, pushing the pathogen's known genetic record back roughly 3,000 years. The genome (TE1-3) represents a previously unknown lineage of T. pallidum that diverged before all other known subspecies emerged. Although TE1-3 clearly belongs to T. pallidum, its genetic makeup is diverse and markedly different from modern strains, and it carries the full suite of genetic features associated with virulence in modern T. pallidum. The findings show that T. pallidum predates the rise of agriculture in the Americas, indicating that its emergence did not depend on the agricultural intensification and population density usually associated with the spread of infectious disease. Instead, the TE1-3 lineage is linked to the socio-ecological conditions of hunter-gatherer societies, including high mobility, interaction among small groups, and close contact with wild animals. The study broadens our understanding of the timing, ecology, and social context of treponemal disease worldwide.

  6. Half of the world's 100 largest cities face severe water stress

    According to a new analysis of NASA satellite imagery by scientists at University College London, half of the world's 100 largest cities face severe water stress, with 38 located in extremely water-stressed regions. Beijing, New York, Los Angeles, Rio de Janeiro, and Delhi are extremely water-stressed, while London, Bangkok, and Jakarta are highly water-stressed. The analysis found clear drying trends in cities such as Chennai (India), Tehran (Iran), and Zhengzhou (China), and clear wetting trends in Tokyo, Lagos, and Kampala. About 1.1 billion people live in metropolitan areas undergoing severe drying, while 96 million live in areas trending wetter. Tehran has endured six consecutive years of drought and is one step from "day zero", when no water is available; Cape Town and Chennai are also approaching day zero.

  7. European Parliament calls for less dependence on US tech giants

    The European Parliament has called on the European Commission to reduce dependence on US tech giants and prioritize EU-homegrown cloud computing, AI, and open-source infrastructure. To strengthen digital sovereignty in response to growing US control over critical digital infrastructure, the Parliament's resolution highlights a "European tech first" approach, public procurement reform, and the "public money, public code" principle. MEPs want to lay the groundwork for a European digital public infrastructure built on open standards and interoperability.

  8. ReactOS celebrates its 30th anniversary

    ReactOS is a project developing an open-source operating system compatible with Windows NT and Windows 2000 applications and hardware drivers. It rose from the ruins of FreeWin95, a project that aimed to deliver an open-source clone of Windows 95 but ran into trouble; project coordinator Jason Filby then led a new clone effort targeting Windows NT, named ReactOS, intended to break Microsoft's monopoly on PC operating systems. Development began in 1996 and the first release came in 1998. Over the past 30 years, 301 contributors have submitted more than 88,000 commits totaling 14,929,578 lines of code. In a 30th-anniversary blog post, the developers said they will keep pushing the project forward; work in progress includes RosBE, a new build environment for developers; a new NTFS driver; a new ATA driver; multiprocessor (SMP) support; support for Class 3 UEFI systems; kernel- and user-mode address space layout randomization (ASLR); and support for modern GPU drivers built on WDDM.

  9. mRNA cancer vaccine shows anticancer potential

    Moderna and Merck announced preliminary results from a Phase 2 trial of an mRNA cancer vaccine against melanoma: compared with standard therapy, the risk of cancer recurrence or death over five years fell by nearly 50%. The trial enrolled 157 patients diagnosed with stage III or IV melanoma at high risk of recurrence after surgical resection. The standard treatment to prevent recurrence after such surgery is immunotherapy, such as Merck's Keytruda; all participants received Keytruda and were randomized 2:1, with part of the group also receiving a personalized mRNA vaccine. The vaccine is custom-built for each patient's melanoma, carrying genetic instructions to produce up to 34 unique markers of the patient's mutated cancer cells. Once inside the body, healthy cells produce these markers, training T cells to recognize and attack the cancer. Previously published results showed that 24 of the 107 patients who received both the mRNA vaccine and Keytruda relapsed or died during two years of follow-up, versus 20 of the 50 patients on Keytruda alone, a 44% reduction in the risk of recurrence or death. The companies have not released detailed five-year follow-up results, saying only that the risk reduction was 49%.
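The two-year figures quoted in this item imply the reported 44% relative risk reduction; a quick arithmetic check:

```python
# Recurrence/death rates over two years of follow-up (from the trial)
vaccine_arm = 24 / 107   # Keytruda + personalized mRNA vaccine
control_arm = 20 / 50    # Keytruda alone
rrr = 1 - vaccine_arm / control_arm  # relative risk reduction
print(f"{rrr:.0%}")  # → 44%
```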

  10. Eric Schmidt says Europe must invest in its own open-source AI models

    Former Google CEO Eric Schmidt said at the World Economic Forum in Davos, Switzerland, that Europe must invest in building its own open-source AI labs and address soaring energy prices, or it will soon find itself dependent on Chinese open-source models. Most US companies have shifted to closed models, meaning their AI technology must be bought and licensed, while Chinese AI models are largely released with open weights. Unless Europe is willing to pour substantial funding into homegrown models, it will end up using Chinese ones, which may not be good for Europe.

  11. Humpback whales learn bubble-net feeding socially

    Bubble-net feeding is a group hunting technique in which whales blow bubbles to corral schools of fish, then lunge upward together to engulf them. The behavior has been observed for decades among humpback whales in Alaskan waters, and researchers have recently begun finding it in humpback populations in the fjords of western Canada. Using field observations from 2004 to 2023, the researchers focused on 526 whales living in the Kitimat Fjord system of western Canada, identifying individuals by their unique fluke patterns. The data show that 254 whales engaged in bubble-net feeding at least once, with about 90% of feeding events occurring cooperatively. The behavior appears to have increased after 2014, coinciding with a major marine heatwave in the northeast Pacific that reduced prey abundance. Whales that regularly interacted with groups already using the bubble-net technique were more likely to learn it. The technique may originally have been introduced by whales migrating from elsewhere in the northeast Pacific, but the results indicate that the behavior spread mainly through local social networks, carried by stable groups and influential individuals.

  12. Kioxia says its 2026 capacity is already sold out

    With memory prices soaring, don't expect SSDs to get cheaper either. 中户俊介, general manager of the memory business at Kioxia, the second-largest NAND maker, said the company's 2026 capacity is already sold out and that the enterprise and consumer SSD markets are entering a "premium and expensive phase". He said companies broadly feel that stopping AI investment means being left behind, leaving them no choice but to keep investing. If demand from generative-AI data centers does not change significantly, this investment wave will keep SSD prices high for the foreseeable future. He added that Kioxia is trying to expand capacity to meet growing demand.

  13. Belarusian ham radio operators face the death penalty

    Belarus's amateur radio (ham radio) community has issued an urgent appeal for outside attention: the Belarusian government raided the community, arresting at least seven people and threatening three of them with the death penalty on the grounds that they stole state secrets. The three facing execution belong to a network of more than 50 amateur radio enthusiasts and have been charged with "espionage" and "treason". The government seized more than 500 pieces of radio equipment, and state television claimed they used radios to monitor the movements of government aircraft, without providing evidence.

  14. How Iran blocks the internet

    Iran's internet blackout has now lasted 14 days. US network monitoring firm Kentik analyzed the shutdown based on data it collected. Beyond the internet blackout, international voice calls are blocked and domestic communication services have suffered prolonged outages, making this the most severe communications blackout in history for Iran's 90 million citizens. The first sign of the mass shutdown came at 11:42 UTC on January 8, when AS49666, the autonomous system of the state-owned telecom TIC, withdrew its IPv6 BGP routes. IPv6 accounts for less than 1% of Iran's traffic, so ordinary users barely noticed, but it foreshadowed what happened in the hours that followed. At 7 p.m. local time (16:30 UTC) Iran's traffic began to plummet, and by 18:45 UTC traffic in and out of the country had dropped to nearly zero. Iran's IPv4 routes remained online: the government imposed a whitelist allowing only approved users and services to access the internet, and a very small amount of traffic still flows in and out of the country. On January 9, AS6736 even briefly restored connectivity for Iranian universities for a few hours.

  15. 32 fossil fuel companies account for half of global CO2 emissions

    According to the Carbon Majors report, 32 fossil fuel companies accounted for half of global carbon dioxide emissions in 2024. Saudi Aramco is the largest state-owned polluter and ExxonMobil the largest investor-owned one; 17 of the top 20 fossil fuel companies are state-owned. Saudi Aramco emitted 1.7 billion tonnes of CO2, mostly from exported oil; if it were a country, it would be the world's fifth-largest carbon emitter, just behind Russia. ExxonMobil's fossil fuel production emitted 610 million tonnes of CO2, making it the ninth-largest emitter, just ahead of South Korea.