OrangeBot.AI Digest — 2026-01-30

59 headlines across 8 sources, aggregated for this day.

Hacker News(15)

  1. Antirender: remove the glossy shine on architectural renderings (antirender.com)
  2. Kimi K2.5 Technical Report [pdf] (github.com)
  3. Amazon's Spending on 'Melania' Is a Barely Concealed Bribe (daringfireball.net)
  4. Microsoft 365 now tracks you in real time? (ztechtalk.com)
  5. Buttered Crumpet, a custom typeface for Wallace and Gromit (jamieclarketype.com)
  6. Wisconsin communities signed secrecy deals for billion-dollar data centers (www.wpr.org)
  7. Godot 4.6 Release: It's all about your flow (godotengine.org)
  8. The engineer who invented the Mars rover suspension in his garage [video] (www.youtube.com)
  9. Tesla’s autonomous vehicles are crashing at a rate much higher than human drivers (electrek.co)
  10. Netflix Animation Studios Joins the Blender Development Fund as Corporate Patron (www.blender.org)
  11. GOG: Linux "the next major frontier" for gaming as it works on a native client (www.xda-developers.com)
  12. Software Pump and Dump (tautvilas.lt)
  13. How AI impacts skill formation (arxiv.org)
  14. How AI assistance impacts the formation of coding skills (www.anthropic.com)
  15. OpenClaw – Moltbot Renamed Again (openclaw.ai)

GitHub Trending(14)

  1. openclaw / openclaw

    Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

  2. asgeirtj / system_prompts_leaks

    Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini

  3. MoonshotAI / kimi-cli

    Kimi Code CLI is your next CLI agent.

  4. modelcontextprotocol / ext-apps

    Official repo for spec & SDK of the MCP Apps protocol - a standard for UIs embedded in AI chatbots, served by MCP servers

  5. NevaMind-AI / memU

    Memory for 24/7 proactive agents like openclaw (moltbot, clawdbot).

  6. hashicorp / vault

    A tool for secrets management, encryption as a service, and privileged access management

  7. badlogic / pi-mono

    AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods

  8. anomalyco / opencode-anthropic-auth
  9. protocolbuffers / protobuf

    Protocol Buffers - Google's data interchange format

  10. pedroslopez / whatsapp-web.js

    A WhatsApp client library for NodeJS that connects through the WhatsApp Web browser app

  11. TeamNewPipe / NewPipe

    A libre lightweight streaming front-end for Android.

  12. Shubhamsaboo / awesome-llm-apps

    Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.

  13. microsoft / playwright-cli

    CLI for common Playwright actions. Record and generate Playwright code, inspect selectors and take screenshots.

  14. lobehub / lobehub

    The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking agent harness to the next level — enabling multi-agent collaboration, effortless agent team design, and introducing agents as the unit of work interaction.

Hugging Face(15)

  1. Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

    Autonomous scientific discovery with large language model (LLM)-based agents has recently made substantial progress, demonstrating the ability to automate end-to-end research workflows. However, existing systems largely rely on runtime-centric execution paradigms, repeatedly reading, summarizing, and reasoning over large volumes of scientific literature online. This on-the-spot computation strategy incurs high computational cost, suffers from context window limitations, and often leads to brittle reasoning and hallucination. We propose Idea2Story, a pre-computation-driven framework for autonomous scientific discovery that shifts literature understanding from online reasoning to offline knowledge construction. Idea2Story continuously collects peer-reviewed papers together with their review feedback, extracts core methodological units, composes reusable research patterns, and organizes them into a structured methodological knowledge graph. At runtime, underspecified user research intents are aligned to established research paradigms, enabling efficient retrieval and reuse of high-quality research patterns instead of open-ended generation and trial-and-error. By grounding research planning and execution in a pre-built knowledge graph, Idea2Story alleviates the context window bottleneck of LLMs and substantially reduces repeated runtime reasoning over literature. We conduct qualitative analyses and preliminary empirical studies demonstrating that Idea2Story can generate coherent, methodologically grounded, and novel research patterns, and can produce several high-quality research demonstrations in an end-to-end setting. These results suggest that offline knowledge construction provides a practical and scalable foundation for reliable autonomous scientific discovery.

  2. Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models

    Text-to-image (T2I) models have achieved remarkable success in generating high-fidelity images, but they often fail to handle complex spatial relationships, e.g., spatial perception, reasoning, or interaction. These critical aspects are largely overlooked by current benchmarks due to their short or information-sparse prompt design. In this paper, we introduce SpatialGenEval, a new benchmark designed to systematically evaluate the spatial intelligence of T2I models, covering two key aspects: (1) SpatialGenEval involves 1,230 long, information-dense prompts across 25 real-world scenes. Each prompt integrates 10 spatial sub-domains and 10 corresponding multiple-choice question-answer pairs, ranging from object position and layout to occlusion and causality. Our extensive evaluation of 21 state-of-the-art models reveals that higher-order spatial reasoning remains a primary bottleneck. (2) To demonstrate that the utility of our information-dense design goes beyond evaluation, we also construct the SpatialT2I dataset. It contains 15,400 text-image pairs with rewritten prompts that ensure image consistency while preserving information density. Fine-tuning current foundation models (i.e., Stable Diffusion-XL, Uniworld-V1, OmniGen2) yields consistent performance gains (+4.2%, +5.7%, +4.4%) and more realistic spatial relations, highlighting a data-centric paradigm for achieving spatial intelligence in T2I models.

  3. Scaling Embeddings Outperforms Scaling Experts in Language Models

    While Mixture-of-Experts (MoE) architectures have become the standard for sparsity scaling in large language models, they increasingly face diminishing returns and system-level bottlenecks. In this work, we explore embedding scaling as a potent, orthogonal dimension for scaling sparsity. Through comprehensive analysis and experiments, we identify specific regimes where embedding scaling achieves a superior Pareto frontier compared to expert scaling. We systematically characterize the critical architectural factors governing this efficacy, ranging from parameter budgeting to the interplay with model width and depth. Moreover, by integrating tailored system optimizations and speculative decoding, we effectively convert this sparsity into tangible inference speedups. Guided by these insights, we introduce LongCat-Flash-Lite, a 68.5B-parameter model with ~3B activated parameters, trained from scratch. Despite allocating over 30B parameters to embeddings, LongCat-Flash-Lite not only surpasses parameter-equivalent MoE baselines but also exhibits exceptional competitiveness against existing models of comparable scale, particularly in agentic and coding domains.
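
    A rough way to see why embeddings are a sparsity dimension at all: an embedding table adds vocab_size x d_model parameters to the total, but each token only ever activates one row, much as MoE experts add all-expert parameters while activating only top-k per token. A minimal back-of-the-envelope sketch follows; all sizes are hypothetical, not the LongCat-Flash-Lite configuration:

      # Back-of-the-envelope parameter accounting (hypothetical sizes).
      # Both mechanisms add "total" parameters while touching few per token.

      def moe_params(n_experts, d_model, d_ff, top_k):
          """FFN expert parameters: total vs. activated per token."""
          per_expert = 2 * d_model * d_ff        # up- and down-projection
          return n_experts * per_expert, top_k * per_expert

      def embedding_params(vocab_size, d_model):
          """Embedding table: total vs. activated per token (one row)."""
          return vocab_size * d_model, d_model

      moe_total, moe_act = moe_params(n_experts=64, d_model=2048, d_ff=8192, top_k=2)
      emb_total, emb_act = embedding_params(vocab_size=4_000_000, d_model=2048)
      print(f"MoE experts: {moe_total/1e9:.1f}B total, {moe_act/1e6:.0f}M active/token")
      print(f"Embeddings:  {emb_total/1e9:.1f}B total, {emb_act/1e3:.0f}K active/token")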

  4. DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

    Manipulating dynamic objects remains an open challenge for Vision-Language-Action (VLA) models, which, despite strong generalization in static manipulation, struggle in dynamic scenarios requiring rapid perception, temporal anticipation, and continuous control. We present DynamicVLA, a framework for dynamic object manipulation that integrates temporal reasoning and closed-loop adaptation through three key designs: 1) a compact 0.4B VLA using a convolutional vision encoder for spatially efficient, structurally faithful encoding, enabling fast multimodal inference; 2) Continuous Inference, enabling overlapping reasoning and execution for lower latency and timely adaptation to object motion; and 3) Latent-aware Action Streaming, which bridges the perception-execution gap by enforcing temporally aligned action execution. To fill the missing foundation of dynamic manipulation data, we introduce the Dynamic Object Manipulation (DOM) benchmark, built from scratch with an auto data collection pipeline that efficiently gathers 200K synthetic episodes across 2.8K scenes and 206 objects, and enables fast collection of 2K real-world episodes without teleoperation. Extensive evaluations demonstrate remarkable improvements in response speed, perception, and generalization, positioning DynamicVLA as a unified framework for general dynamic object manipulation across embodiments.
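
    The "Continuous Inference" idea, overlapping policy inference with action execution so the robot never waits on the model, can be pictured as a double-buffered control loop. This is a generic sketch of the pattern, not DynamicVLA's implementation; policy, get_obs, and execute_chunk are stand-ins supplied by the caller:

      import queue
      import threading

      def continuous_control(policy, get_obs, execute_chunk, steps=100):
          """Overlap inference with execution: while the arm executes the
          current action chunk, the policy is already computing the next
          one from the latest observation."""
          chunks = queue.Queue(maxsize=1)

          def inference_worker():
              for _ in range(steps):
                  chunks.put(policy(get_obs()))   # runs during execution

          worker = threading.Thread(target=inference_worker, daemon=True)
          worker.start()
          for _ in range(steps):
              execute_chunk(chunks.get())         # blocks for chunk duration
          worker.join()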

  5. OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models

    The development of large vision-language models drives the demand for managing and applying massive amounts of multimodal data, making OCR technology, which extracts information from visual images, increasingly popular. However, existing OCR methods primarily focus on recognizing text elements from images or scanned documents (text-centric OCR), neglecting the identification of visual elements in visually information-dense image sources (vision-centric OCR), such as charts, web pages and science plots. In reality, these visually information-dense images are widespread on the internet and have significant real-world application value, such as data visualization and web page analysis. In this technical report, we propose OCRVerse, the first end-to-end holistic OCR method that unifies text-centric and vision-centric OCR. To this end, we construct a comprehensive data-engineering pipeline covering a wide range of text-centric documents, such as newspapers, magazines and books, as well as vision-centric rendered composites, including charts, web pages and scientific plots. Moreover, we propose a two-stage SFT-RL multi-domain training method for OCRVerse. SFT directly mixes cross-domain data to establish initial domain knowledge, while RL designs personalized reward strategies for the characteristics of each domain. Specifically, since different domains require different output formats and expected outputs, the RL stage provides sufficient flexibility to customize reward signals for each domain, improving cross-domain fusion and avoiding data conflicts. Experimental results demonstrate the effectiveness of OCRVerse, which achieves competitive results across text-centric and vision-centric data types, even comparable to large-scale open-source and closed-source models.

  6. MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

    Recent advances in Vision Language Models (VLMs) have driven significant progress in visual reasoning. However, open-source VLMs still lag behind proprietary systems, largely due to the lack of high-quality reasoning data. Existing datasets offer limited coverage of challenging domains such as STEM diagrams and visual puzzles, and lack consistent, long-form Chain-of-Thought (CoT) annotations essential for eliciting strong reasoning capabilities. To bridge this gap, we introduce MMFineReason, a large-scale multimodal reasoning dataset comprising 1.8M samples and 5.1B solution tokens, featuring high-quality reasoning annotations distilled from Qwen3-VL-235B-A22B-Thinking. The dataset is established via a systematic three-stage pipeline: (1) large-scale data collection and standardization, (2) CoT rationale generation, and (3) comprehensive selection based on reasoning quality and difficulty awareness. The resulting dataset spans STEM problems, visual puzzles, games, and complex diagrams, with each sample annotated with visually grounded reasoning traces. We fine-tune Qwen3-VL-Instruct on MMFineReason to develop MMFineReason-2B/4B/8B versions. Our models establish new state-of-the-art results for their size class. Notably, MMFineReason-4B successfully surpasses Qwen3-VL-8B-Thinking, and MMFineReason-8B even outperforms Qwen3-VL-30B-A3B-Thinking while approaching Qwen3-VL-32B-Thinking, demonstrating remarkable parameter efficiency. Crucially, we uncover a "less is more" phenomenon via our difficulty-aware filtering strategy: a subset of just 7% (123K samples) achieves performance comparable to the full dataset. Notably, we reveal a synergistic effect where reasoning-oriented data composition simultaneously boosts general capabilities.
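
    The difficulty-aware filtering idea can be sketched as keeping only samples in a useful difficulty band, estimated by a solver's empirical pass rate. The criterion and thresholds below are illustrative assumptions, not the paper's exact recipe:

      def difficulty_filter(samples, solver, n_attempts=8, lo=0.1, hi=0.9):
          """Keep samples that are neither trivial nor hopeless: estimate
          difficulty as the solver's empirical pass rate over n_attempts."""
          kept = []
          for s in samples:
              passes = sum(
                  solver(s["question"]) == s["answer"]
                  for _ in range(n_attempts)
              )
              rate = passes / n_attempts
              if lo <= rate <= hi:   # drop too-easy (>hi) and too-hard (<lo)
                  kept.append(s)
          return kept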

  7. ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

    Large language models allocate uniform computation across all tokens, ignoring that some sequences are trivially predictable while others require deep reasoning. We introduce ConceptMoE, which dynamically merges semantically similar tokens into concept representations, performing implicit token-level compute allocation. A learnable chunk module identifies optimal boundaries by measuring inter-token similarity, compressing sequences by a target ratio R before they enter the compute-intensive concept model. Crucially, the MoE architecture enables controlled evaluation: we reallocate saved computation to match baseline activated FLOPs (excluding attention map computation) and total parameters, isolating genuine architectural benefits. Under these conditions, ConceptMoE consistently outperforms standard MoE across language and vision-language tasks, achieving +0.9 points on language pretraining, +2.3 points on long context understanding, and +0.6 points on multimodal benchmarks. When converting pretrained MoE during continual training with layer looping, gains reach +5.5 points, demonstrating practical applicability. Beyond performance, ConceptMoE reduces attention computation by up to R^2 times and the KV cache by R times. At R=2, empirical measurements show prefill speedups reaching 175% and decoding speedups up to 117% on long sequences. The minimal architectural modifications enable straightforward integration into existing MoE, demonstrating that adaptive concept-level processing fundamentally improves both effectiveness and efficiency of large language models.
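
    The chunking idea, merging adjacent, semantically similar tokens into fewer "concept" vectors before the expensive layers, can be sketched as below. The paper learns its chunk module; the greedy cosine-similarity rule here is an assumption for illustration. With length n compressed to n/R, quadratic attention cost falls by about R^2 and the KV cache by R:

      import numpy as np

      def merge_similar_tokens(hidden, R=2.0):
          """Greedily mean-pool the most similar adjacent token pair until
          the sequence is compressed by roughly the target ratio R."""
          vecs = [v for v in hidden]
          target = max(1, int(len(vecs) / R))
          while len(vecs) > target:
              sims = [
                  float(np.dot(vecs[i], vecs[i + 1])
                        / (np.linalg.norm(vecs[i]) * np.linalg.norm(vecs[i + 1]) + 1e-8))
                  for i in range(len(vecs) - 1)
              ]
              i = int(np.argmax(sims))              # most redundant adjacent pair
              vecs[i] = (vecs[i] + vecs[i + 1]) / 2  # merge into one "concept"
              del vecs[i + 1]
          return np.stack(vecs)

      tokens = np.random.randn(64, 16)
      concepts = merge_similar_tokens(tokens, R=2.0)  # (32, 16)
      # attention pairs: 64^2 = 4096 before vs. 32^2 = 1024 after (~R^2 fewer)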

  8. PLANING: A Loosely Coupled Triangle-Gaussian Framework for Streaming 3D Reconstruction

    Streaming reconstruction from monocular image sequences remains challenging, as existing methods typically favor either high-quality rendering or accurate geometry, but rarely both. We present PLANING, an efficient on-the-fly reconstruction framework built on a hybrid representation that loosely couples explicit geometric primitives with neural Gaussians, enabling geometry and appearance to be modeled in a decoupled manner. This decoupling supports an online initialization and optimization strategy that separates geometry and appearance updates, yielding stable streaming reconstruction with substantially reduced structural redundancy. PLANING improves dense mesh Chamfer-L2 by 18.52% over PGSR, surpasses ARTDECO by 1.31 dB PSNR, and reconstructs ScanNetV2 scenes in under 100 seconds, over 5x faster than 2D Gaussian Splatting, while matching the quality of offline per-scene optimization. Beyond reconstruction quality, the structural clarity and computational efficiency of PLANING make it well suited for a broad range of downstream applications, such as enabling large-scale scene modeling and simulation-ready environments for embodied AI. Project page: https://city-super.github.io/PLANING/

  9. AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts

    The evolution of Large Language Models (LLMs) into autonomous agents necessitates the management of extensive, dynamic contexts. Current benchmarks, however, remain largely static, relying on passive retrieval tasks that fail to simulate the complexities of agent-environment interaction, such as non-linear reasoning and iterative feedback. To address this, we introduce AgentLongBench, which evaluates agents through simulated environment rollouts based on Lateral Thinking Puzzles. This framework generates rigorous interaction trajectories across knowledge-intensive and knowledge-free scenarios. Experiments with state-of-the-art models and memory systems (32K to 4M tokens) expose a critical weakness: while adept at static retrieval, agents struggle with the dynamic information synthesis essential for workflows. Our analysis indicates that this degradation is driven by the minimum number of tokens required to resolve a query. This factor explains why the high information density inherent in massive tool responses poses a significantly greater challenge than the memory fragmentation typical of long-turn dialogues.

  10. Qwen3-ASR Technical Report

    In this report, we introduce the Qwen3-ASR family, which includes two powerful all-in-one speech recognition models and a novel non-autoregressive speech forced-alignment model. Qwen3-ASR-1.7B and Qwen3-ASR-0.6B are ASR models that support language identification and ASR for 52 languages and dialects. Both leverage large-scale speech training data and the strong audio understanding of their foundation model, Qwen3-Omni. We conduct comprehensive internal evaluations in addition to the open-source benchmarks, as ASR models may differ little in open benchmark scores yet exhibit significant quality differences in real-world scenarios. The experiments reveal that the 1.7B version achieves SOTA performance among open-source ASR models and is competitive with the strongest proprietary APIs, while the 0.6B version offers the best accuracy-efficiency trade-off. Qwen3-ASR-0.6B achieves an average TTFT as low as 92 ms and can transcribe 2,000 seconds of speech in 1 second at a concurrency of 128. Qwen3-ForcedAligner-0.6B is an LLM-based NAR timestamp predictor able to align text-speech pairs in 11 languages. Timestamp accuracy experiments show that it outperforms the three strongest forced-alignment models while offering greater efficiency and versatility. To further accelerate community research on ASR and audio understanding, we release these models under the Apache 2.0 license.

  11. Exploring Reasoning Reward Model for Agents

    Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform complex reasoning and tool use. However, most methods still rely on sparse, outcome-based rewards for training. Such feedback fails to differentiate intermediate reasoning quality, leading to suboptimal training results. In this paper, we introduce the Agent Reasoning Reward Model (Agent-RRM), a multi-faceted reward model that produces structured feedback for agentic trajectories, including (1) an explicit reasoning trace, (2) a focused critique that provides refinement guidance by highlighting reasoning flaws, and (3) an overall score that evaluates process performance. Leveraging these signals, we systematically investigate three integration strategies: Reagent-C (text-augmented refinement), Reagent-R (reward-augmented guidance), and Reagent-U (unified feedback integration). Extensive evaluations across 12 diverse benchmarks demonstrate that Reagent-U yields substantial performance leaps, achieving 43.7% on GAIA and 46.2% on WebWalkerQA, validating the effectiveness of our reasoning reward model and training schemes. Code, models, and datasets are all released to facilitate future research.
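
    The structured feedback Agent-RRM emits, and one way a unified scheme like Reagent-U might consume it, can be pictured as below; the field names and the combination logic are illustrative assumptions, not the paper's interface:

      from dataclasses import dataclass

      @dataclass
      class RewardFeedback:
          reasoning_trace: str  # how the judge reasoned about the trajectory
          critique: str         # flaws to fix; used as textual guidance
          score: float          # scalar process reward in [0, 1]

      def unified_feedback(trajectory, feedback):
          """Unified use (Reagent-U style, sketched): the critique feeds back
          into the context for refinement AND the score is the RL reward."""
          refined_context = f"{trajectory}\n\nReviewer critique:\n{feedback.critique}"
          return refined_context, feedback.score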

  12. LoL: Longer than Longer, Scaling Video Generation to Hour

    Recent research in long-form video generation has shifted from bidirectional to autoregressive models, yet these methods commonly suffer from error accumulation and a loss of long-term coherence. While attention sink frames have been introduced to mitigate this performance decay, they often induce a critical failure mode we term sink-collapse: the generated content repeatedly reverts to the sink frame, resulting in abrupt scene resets and cyclic motion patterns. Our analysis reveals that sink-collapse originates from an inherent conflict between the periodic structure of Rotary Position Embedding (RoPE) and the multi-head attention mechanisms prevalent in current generative models. To address it, we propose a lightweight, training-free approach that effectively suppresses this behavior by introducing multi-head RoPE jitter that breaks inter-head attention homogenization and mitigates long-horizon collapse. Extensive experiments show that our method successfully alleviates sink-collapse while preserving generation quality. To the best of our knowledge, this work achieves the first demonstration of real-time, streaming, and infinite-length video generation with little quality decay. As an illustration of this robustness, we generate continuous videos up to 12 hours in length, which, to our knowledge, is among the longest publicly demonstrated results in streaming video generation.
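
    Standard RoPE rotates each query/key feature pair by position-dependent angles that are identical across heads, which is the periodic structure the paper blames for sink-collapse. A minimal sketch of per-head angle jitter follows; only the RoPE baseline is standard, and the exact form and scale of the jitter here are assumptions:

      import numpy as np

      def rope_angles(positions, dim, base=10000.0):
          """Standard RoPE angles: theta_i = pos * base^(-2i/dim)."""
          inv_freq = base ** (-np.arange(0, dim, 2) / dim)
          return np.outer(positions, inv_freq)          # (seq, dim/2)

      def apply_rope(x, angles):
          """Rotate consecutive feature pairs of x (seq, dim)."""
          x1, x2 = x[:, 0::2], x[:, 1::2]
          cos, sin = np.cos(angles), np.sin(angles)
          out = np.empty_like(x)
          out[:, 0::2] = x1 * cos - x2 * sin
          out[:, 1::2] = x1 * sin + x2 * cos
          return out

      def jittered_rope(x, positions, head_idx, sigma=0.02):
          """Per-head jitter: perturb the angles a little differently for
          each head, breaking the inter-head homogenization the paper
          blames for sink-collapse."""
          angles = rope_angles(positions, x.shape[-1])
          rng = np.random.default_rng(head_idx)         # fixed noise per head
          return apply_rope(x, angles + rng.normal(0.0, sigma, angles.shape))

      q = np.random.randn(128, 64)
      q_rot = jittered_rope(q, np.arange(128), head_idx=3)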

  13. Language-based Trial and Error Falls Behind in the Era of Experience

    While Large Language Models (LLMs) excel at language-based agentic tasks, their applicability to unseen, nonlinguistic environments (e.g., symbolic or spatial tasks) remains limited. Previous work attributes this performance gap to the mismatch between the pretraining distribution and the testing distribution. In this work, we demonstrate that the primary bottleneck is the prohibitive cost of exploration: mastering these tasks requires extensive trial and error, which is computationally unsustainable for parameter-heavy LLMs operating in a high-dimensional semantic space. To address this, we propose SCOUT (Sub-Scale Collaboration On Unseen Tasks), a novel framework that decouples exploration from exploitation. We employ lightweight "scouts" (e.g., small MLPs) to probe environmental dynamics at a speed and scale far exceeding LLMs. The collected trajectories are used to bootstrap the LLM via Supervised Fine-Tuning (SFT), followed by multi-turn Reinforcement Learning (RL) to activate its latent world knowledge. Empirically, SCOUT enables a Qwen2.5-3B-Instruct model to achieve an average score of 0.86, significantly outperforming proprietary models, including Gemini-2.5-Pro (0.60), while cutting GPU-hour consumption by about 60%.
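
    SCOUT's split, cheap scouts gather experience and the LLM is then bootstrapped on it, can be sketched as a two-phase pipeline. The environment API, scout policy, and data format below are hypothetical stand-ins, not the paper's code:

      def collect_with_scouts(env, scout, episodes=10_000):
          """Phase 1: a tiny policy (e.g. a small MLP) explores the
          environment far faster and cheaper than an LLM could."""
          trajectories = []
          for _ in range(episodes):
              obs, done, steps = env.reset(), False, []
              while not done:
                  action = scout.act(obs)
                  next_obs, reward, done = env.step(action)
                  steps.append((obs, action, reward))
                  obs = next_obs
              trajectories.append(steps)
          return trajectories

      def to_sft_pairs(trajectories, render_prompt, render_actions):
          """Phase 2: verbalize the successful trajectories into
          (prompt, target) pairs for supervised fine-tuning of the LLM."""
          good = (t for t in trajectories if sum(r for _, _, r in t) > 0)
          return [(render_prompt(t), render_actions(t)) for t in good]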

  14. Discovering Hidden Gems in Model Repositories

    Public repositories host millions of fine-tuned models, yet community usage remains disproportionately concentrated on a small number of foundation checkpoints. We investigate whether this concentration reflects efficient market selection or if superior models are systematically overlooked. Through an extensive evaluation of over 2,000 models, we show the prevalence of "hidden gems", unpopular fine-tunes that significantly outperform their popular counterparts. Notably, within the Llama-3.1-8B family, we find rarely downloaded checkpoints that improve math performance from 83.2% to 96.0% without increasing inference costs. However, discovering these models through exhaustive evaluation of every uploaded model is computationally infeasible. We therefore formulate model discovery as a Multi-Armed Bandit problem and accelerate the Sequential Halving search algorithm by using shared query sets and aggressive elimination schedules. Our method retrieves top models with as few as 50 queries per candidate, accelerating discovery by over 50x.
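
    Sequential Halving itself is short: split the budget across about log2(n) rounds, score every surviving candidate on a shared query batch, and keep the top half each round. A minimal sketch along the lines the abstract describes (evaluate and the query pool are stand-ins):

      import math
      import random

      def sequential_halving(models, evaluate, query_pool, per_round=50, seed=0):
          """Halve the candidate set each round, scoring all survivors on
          the same query batch so their scores stay comparable."""
          rng = random.Random(seed)
          candidates = list(models)
          for _ in range(max(1, math.ceil(math.log2(len(candidates))))):
              queries = rng.sample(query_pool, per_round)  # shared per round
              scores = {m: evaluate(m, queries) for m in candidates}
              candidates.sort(key=scores.__getitem__, reverse=True)
              candidates = candidates[: max(1, len(candidates) // 2)]
          return candidates[0]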

  15. Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening

    Reinforcement learning (RL) post-training is a dominant approach for improving the reasoning performance of large language models (LLMs), yet growing evidence suggests that its gains arise primarily from distribution sharpening rather than the acquisition of new capabilities. Recent work has shown that sampling from the power distribution of LLMs using Markov chain Monte Carlo (MCMC) can recover performance comparable to RL post-training without relying on external rewards; however, the high computational cost of MCMC makes such approaches impractical for widespread adoption. In this work, we propose a theoretically grounded alternative that eliminates the need for iterative MCMC. We derive a novel formulation showing that the global power distribution can be approximated by a token-level scaled low-temperature one, where the scaling factor captures future trajectory quality. Leveraging this insight, we introduce a training-free and verifier-free algorithm that sharpens the base model's generative distribution autoregressively. Empirically, we evaluate our method on math, QA, and code tasks across four LLMs, and show that our method matches or surpasses one-shot GRPO without relying on any external rewards, while reducing inference latency by over 10x compared to MCMC-based sampling.
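
    The token-level half of the approximation is a one-line identity: raising a softmax distribution to a power alpha and renormalizing equals sampling at temperature 1/alpha, since softmax(z)_i^alpha, normalized, is softmax(alpha*z)_i. The paper's per-token scaling factor capturing future trajectory quality is the part this sketch omits; it shows only the naive low-temperature baseline:

      import numpy as np

      def power_sample(logits, alpha, rng):
          """Sample one token from p^alpha (renormalized). Since
          softmax(z)^alpha, normalized, equals softmax(alpha * z), this is
          plain temperature-(1/alpha) sampling on the logits."""
          z = alpha * np.asarray(logits, dtype=float)
          z -= z.max()                       # numerical stability
          p = np.exp(z) / np.exp(z).sum()
          return int(rng.choice(len(p), p=p))

      rng = np.random.default_rng(0)
      token = power_sample([2.0, 1.0, 0.1], alpha=4.0, rng=rng)  # alpha>1 sharpens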

Solidot(15)

  1. Genetics accounts for about half of human lifespan

    According to a study published in the journal Science, human lifespan is far more heritable than previously recognized. The analysis shows that after excluding deaths caused by external factors such as accidents or infectious diseases, genetic factors may explain roughly 50% of the variation in human lifespan. Understanding the heritability of lifespan is central to aging research, yet measuring the genetic contribution to longevity remains difficult. Although some longevity-associated genes have been identified, external environmental factors, such as disease or living conditions, strongly shape individual lifespans and often mask or blur the underlying genetic influence. To disentangle intrinsic and extrinsic causes of death, the researchers used mathematical models, simulations of human mortality, and several large twin-cohort datasets. The results show that extrinsic deaths systematically depress estimates of the genetic influence on lifespan. Once deaths from external causes were properly stripped out, the authors found that the genetic contribution to lifespan jumps to around 55%, more than double previous estimates, indicating that genetics is a core driver of human aging.

  2. Tesla ends Model S and Model X production, pivots to humanoid robots

    Tesla CEO Elon Musk announced that the company will stop producing the Model S and Model X, and that its Fremont, California factory will switch to building the Optimus humanoid robot. The Model S and Model X are Tesla's oldest current models, starting at $95,000 and $100,000 respectively. Its most popular models are the cheaper Model 3 and Model Y, starting at $37,000 and $40,000; together the two accounted for 97% of last year's 1.59 million deliveries. The vehicle production lines at the Fremont factory will be converted into Optimus lines, which Musk claims will be capable of producing one million units a year.

  3. China approves imports of Nvidia H200 chips

    China has allowed three major tech companies to import Nvidia H200 chips: ByteDance, Alibaba, and Tencent have been cleared to buy more than 400,000 of them. The US approved H200 exports on January 13, but Chinese customs subsequently issued a notice barring imports. The H200 is Nvidia's second most powerful AI chip after the B200, with roughly six times the performance of the H20 chips previously allowed for export to China. Huawei and others have launched products approaching the H20's performance, but they remain far behind the H200.

  4. Apple will soon take a commission from Patreon creators

    Apple has set November 1, 2026 as the deadline for Patreon to switch from its legacy payment system to the App Store's in-app purchase system. Apple originally required Patreon to migrate by November 2025 or face removal from the App Store. Apple treats supporters' payments to creators on Patreon as digital goods and claims the right to take a cut: it charges a 30% commission on in-app purchases and subscriptions, falling to 15% for subscriptions that last more than a year. Patreon, a crowdfunding platform for content creators, charges a 10% commission, meaning Apple will take three times as much from Patreon creators as the platform itself does.

  5. Google seizes IPIDEA domains

    IPIDEA, a little-known Chinese company, operates the world's largest residential proxy network, with tens of millions of proxies available for rent at any given time. Residential proxies are often installed on devices without users' knowledge; many Android TV boxes have been found to ship with residential proxy services preinstalled. KrebsOnSecurity reported earlier this month that the IoT botnet Kimwolf exploited IPIDEA vulnerabilities to spread rapidly to more than two million devices. This week Google announced that, together with partners, it had seized dozens of IPIDEA domains and removed hundreds of related Android apps. Google said it observed large numbers of hackers and spies using IPIDEA's residential proxies to obfuscate their traffic and hide their tracks, and found that IPIDEA controls more than a dozen VPN brands, including 360 Proxy, 922 Proxy, ABC Proxy, Cherry Proxy, Door VPN, Galleon VPN, IP 2 World, Ipidea, Luna Proxy, PIA S5 Proxy, PY Proxy, Radish VPN, and Tab Proxy.

  6. Japan runs its first cloud-seeding experiment aimed at preventing heavy rain on land

    A research team from Chiba University, the University of Toyama, and other institutions launched an experiment off the coast of Toyama Prefecture in January, aiming to trigger rain or snow artificially over the sea in order to reduce heavy-rain disasters on land. Cloud seeding has traditionally been used to relieve drought; this is the first experiment aimed at preventing torrential rain. Flood damage from heavy rain has been on the rise in recent years. Chiba University professor Shunji Kotsuki said: "In the future, we hope to establish methods to control when and where rain clouds form." The experiment was carried out four times between the 7th and the 13th, targeting winter snow clouds over the Sea of Japan, which resemble the summer cumulonimbus clouds that tend to cause heavy-rain disasters but form at lower altitudes and are easier to predict. Using a small propeller aircraft, the team dropped a total of about 30 kg of dry ice in several passes at an altitude of roughly 3,000 meters over Toyama Bay, for up to two hours at a time, while monitoring sky and cloud conditions. The team now plans to analyze the collected data and work out suitable seeding methods.

  7. Webb produces the most detailed dark matter map yet

    Using ultra-high-resolution images from the Webb space telescope, astronomers have for the first time built a wide, extremely high-resolution map of the universe's mass distribution, showing how dark matter and ordinary matter weave together, from the filamentary structures around galaxies out to dense galaxy clusters. The image resolution is more than double that of earlier maps, and capturing fainter objects lets astronomers trace cosmic evolution back to its earliest stages. Dark matter makes up about 85% of the matter in the universe; it neither emits nor absorbs light, making direct observation extremely difficult. Its gravity, however, distorts the images of distant background galaxies. By measuring the tiny shear effects in the shapes of roughly 250,000 galaxy images, the team reconstructed the most detailed mass map of a contiguous region to date and inferred the spatial distribution of dark matter. Compared with earlier work based mainly on the Hubble telescope, Webb's images combine high resolution, high sensitivity, and a wide field of view, allowing astronomers to measure and map the faint filaments and low-mass galaxy groups of the cosmic web beyond massive clusters. The measurements are consistent with the standard cosmological model.

  8. Apple TV to adapt Brandon Sanderson's Cosmere series

    Apple TV has acquired the film and television adaptation rights to Brandon Sanderson's Cosmere series. Sanderson, one of today's most prolific and popular fantasy authors, struck a deal that gives him unusually strong control over the adaptations, including final approval. The Cosmere is a fictional universe in which the creator Adonalsium was murdered by conspirators; his power shattered into 16 Shards that scattered across different worlds, spreading various forms of magic throughout the universe. Apple's first planned adaptations include the Mistborn series and The Stormlight Archive.

  9. GNU C Library to migrate from Sourceware to the Linux Foundation-hosted CTI

    GNU C Library ("glibc") maintainer Carlos O'Donell announced that the project's core services will migrate from Sourceware to the Core Toolchain Infrastructure ("CTI") hosted by the Linux Foundation. The move aims to meet the current and future needs of glibc and the GNU Toolchain with secure, robust, and sustainable infrastructure, support developer and community collaboration, and ensure the infrastructure has reliable long-term funding.

  10. Linux kernel community drafts a plan for Linus Torvalds' eventual departure

    The Linux kernel community has formally drawn up a plan for finding a successor should Linus Torvalds eventually step down. Drafted by veteran kernel contributor Dan Williams, the plan was discussed at the recent Linux Kernel Maintainers Summit in Tokyo. Rather than naming a specific successor, it lays out a process for choosing one or more maintainers to take over Linux in either a worst-case or an orderly transition, including convening a meeting to weigh the options so as to best safeguard the project's long-term health. One maintainer in Tokyo joked that, like the conclave that elects a new pope, the selection panel should be locked in a room and release a puff of white smoke once a decision is made. The plan is meant to guard against the "bus factor" problem: the bus factor is the number of key members a project can lose ("hit by a bus" standing in for career and lifestyle changes, marriage and children, accidents, or any other cause of absence) before it descends into chaos and cannot continue. Torvalds' central role means the Linux project's bus factor is currently 1. Next in line after Torvalds is stable-kernel maintainer Greg Kroah-Hartman. Asked about suggestions to designate Greg KH as his successor, Torvalds replied: "The problem is that Greg wasn't always Greg. Before him there were Andrew Morton and Alan Cox. After Greg there will be Shannon and Steve. The real issue is that you have to find a person or a group of people that the community trusts, and trust comes from being around long enough for people to know how you work, but long enough doesn't have to mean 30 years."

  11. European BEV sales overtake gasoline cars for the first time

    December 2025 sales data released by the European Automobile Manufacturers' Association show that battery-electric vehicle sales in Europe exceeded gasoline car sales for the first time. Battery-electric and gasoline cars each accounted for 22.5% of sales, with diesel at 7%, hybrids at 33%, and plug-in hybrids at 10.7%. Among BEV brands, Tesla's sales continued to slide, its market share eaten into by BYD and Volkswagen. The EU registered 320,812 new electric vehicles in December, up 46.1% from the same period in 2024, taking the EV market share to 33.3%, a gain of 9.2 percentage points year over year.

  12. Vibe coding is killing open source

    Generative AI is reshaping software development. AI coding assistants such as Claude Code, Cursor, and Lovable let users turn their intent into working applications with almost no manual coding, an approach to building software known as vibe coding. Vibe coding lowers the cost of software development, but it also changes how users interact with the software ecosystem. In the traditional model, developers pick open source packages, read the documentation, and interact with maintainers and other users. Under vibe coding, AI agents select, combine, and modify packages directly, and human developers may never know which upstream components were used. This raises a sustainability problem: open source projects depend on user participation and interaction, documentation visits, bug reports, public Q&A, and reputation, to sustain maintenance and get maintainers paid. If AI replaces interaction between human users, the old open source development model will change fundamentally and the availability and quality of open source software will decline. Researchers at Central European University and Germany's Kiel Institute for the World Economy published a paper on arXiv arguing that vibe coding will kill open source.

  13. How Anthropic built Claude

    According to court documents unsealed last week in the book authors' copyright lawsuit against Anthropic, the company ran an operation dubbed "Project Panama": buying physical books in bulk, slicing off their spines and scanning the pages to train its Claude chatbot, then sending the remains to a recycling company. Anthropic spent tens of millions of dollars on the effort and hired Tom Turvey, a Google executive who had worked on the Google Books project two decades earlier. Anthropic bought books in bulk, tens of thousands at a time, from retailers including Better World Books and World of Books; supplier documents show it planned to scan 500,000 to 2 million volumes. Before Project Panama, Anthropic co-founder Ben Mann spent 11 days in June 2021 downloading books from the shadow library LibGen, sharing links to pirate library mirrors with colleagues and writing "This is great!!!" The court documents also reveal that Meta employees, with Mark Zuckerberg's approval, downloaded books from pirate book torrent sites, with one engineer remarking that torrenting pirated books on a company laptop "doesn't feel right." Anthropic settled the copyright case for $1.5 billion last August without admitting wrongdoing.

  14. OpenAI's head of science says large models aren't ready to make new discoveries

    Kevin Weil, OpenAI vice president and head of OpenAI for Science, acknowledged in an MIT Technology Review interview that large models cannot yet produce genuinely new discoveries, saying that is not their current job. Model output recombines existing results and frequently contains errors; it does not propose fundamentally new approaches. Weil conceded that today's models fall short of the ideal, though they may eventually get there, and said he is optimistic. Large models excel at unearthing forgotten solutions and spotting cross-disciplinary connections, and Weil argued that the bar for accelerating science does not require "completely reinventing a field the way Einstein did." GPT-5, he said, has read nearly every paper published in the past 30 years and aggregates analogies from unrelated disciplines. Consolidating existing knowledge, sparing scientists from wasting effort on already-solved problems, is itself a form of acceleration.

  15. French government to replace America's Teams and Zoom with a domestic platform

    The French government announced that it will replace Microsoft Teams and Zoom with Visio, a domestically developed video conferencing platform, and plans for all government departments to be using it by 2027. The move is part of France's digital sovereignty strategy: reducing dependence on foreign, especially American, software vendors and regaining control over critical digital infrastructure. Visio has been in testing for a year and has about 40,000 users.