OrangeBot.AI Digest — 2026-01-13

54 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Games Workshop bans staff from using AI (www.ign.com)
  2. Show HN: Self-host Reddit – 2.38B posts, works offline, yours forever (github.com)
  3. Signal leaders warn agentic AI is an insecure, unreliable surveillance risk (coywolf.com)
  4. AI Generated Music Barred from Bandcamp (old.reddit.com)
  5. 90M people. 118 hours of silence. One nation erased from the internet (state-of-iranblackout.whisper.security)
  6. Influencers and OnlyFans models are dominating U.S. O-1 visa requests (www.theguardian.com)
  7. Scott Adams has died (www.usatoday.com)
  8. The U.S. Government Just Followed Through on Its Ban of DJI Drones (www.popularmechanics.com)
  9. What a year of solar and batteries saved us in 2025 (scotthelme.co.uk)
  10. Anthropic invests $1.5M in the Python Software Foundation (discuss.python.org)
  11. Scott Adams has died (www.youtube.com)
  12. Indifference is a power (aeon.co)
  13. Apple Creator Studio (www.apple.com)
  14. Local Journalism Is How Democracy Shows Up Close to Home (buckscountybeacon.com)
  15. The UK is shaping a future of precrime and dissent management (2025) (freedomnews.org.uk)

GitHub Trending (9)

  1. obra / superpowers

    Claude Code superpowers: core skills library

  2. icloud-photos-downloader / icloud_photos_downloader

    A command-line tool to download photos from iCloud

  3. blakeblackshear / frigate

    NVR with realtime local object detection for IP cameras

  4. twitter / the-algorithm

    Source code for the X Recommendation Algorithm

  5. home-assistant / home-assistant.io

    📘 Home Assistant User documentation

  6. chidiwilliams / buzz

    Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

  7. adam-maj / tiny-gpu

    A minimal GPU design in Verilog to learn how GPUs work from the ground up

  8. Free-TV / IPTV

    M3U Playlist for free TV channels

  9. onlook-dev / onlook

    The Cursor for Designers • An Open-Source AI-First Design tool • Visually build, style, and edit your React App with AI

Hugging Face (15)

  1. Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

    In real-world video question answering scenarios, videos often provide only localized visual cues, while verifiable answers are distributed across the open web; models therefore need to jointly perform cross-frame clue extraction, iterative retrieval, and multi-hop reasoning-based verification. To bridge this gap, we construct the first video deep research benchmark, VideoDR. VideoDR centers on video-conditioned open-domain video question answering, requiring cross-frame visual anchor extraction, interactive web retrieval, and multi-hop reasoning over joint video-web evidence; through rigorous human annotation and quality control, we obtain high-quality video deep research samples spanning six semantic domains. We evaluate multiple closed-source and open-source multimodal large language models under both the Workflow and Agentic paradigms, and the results show that Agentic is not consistently superior to Workflow: its gains depend on a model's ability to maintain the initial video anchors over long retrieval chains. Further analysis indicates that goal drift and long-horizon consistency are the core bottlenecks. In sum, VideoDR provides a systematic benchmark for studying video agents in open-web settings and reveals the key challenges for next-generation video deep research agents.

  2. BabyVision: Visual Reasoning Beyond Language

While humans develop core visual skills long before acquiring language, contemporary Multimodal LLMs (MLLMs) still rely heavily on linguistic priors to compensate for their fragile visual understanding. We uncovered a crucial fact: state-of-the-art MLLMs consistently fail on basic visual tasks that humans, even 3-year-olds, can solve effortlessly. To systematically investigate this gap, we introduce BabyVision, a benchmark designed to assess core visual abilities independent of linguistic knowledge for MLLMs. BabyVision spans a wide range of tasks, with 388 items divided into 22 subclasses across four key categories. Empirical results and human evaluation reveal that leading MLLMs perform significantly below human baselines. Gemini3-Pro-Preview scores 49.7, lagging behind 6-year-old humans and falling well behind the average adult score of 94.1. These results show that, despite excelling in knowledge-heavy evaluations, current MLLMs still lack fundamental visual primitives. Progress in BabyVision represents a step toward human-level visual perception and reasoning capabilities. We also explore solving visual reasoning with generation models by proposing BabyVision-Gen and an automatic evaluation toolkit. Our code and benchmark data are released at https://github.com/UniPat-AI/BabyVision for reproduction.

  3. PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

    We introduce Parallel Coordinated Reasoning (PaCoRe), a training-and-inference framework designed to overcome a central limitation of contemporary language models: their inability to scale test-time compute (TTC) far beyond sequential reasoning under a fixed context window. PaCoRe departs from the traditional sequential paradigm by driving TTC through massive parallel exploration coordinated via a message-passing architecture in multiple rounds. Each round launches many parallel reasoning trajectories, compacts their findings into context-bounded messages, and synthesizes these messages to guide the next round and ultimately produce the final answer. Trained end-to-end with large-scale, outcome-based reinforcement learning, the model masters the synthesis abilities required by PaCoRe and scales to multi-million-token effective TTC without exceeding context limits. The approach yields strong improvements across diverse domains, and notably pushes reasoning beyond frontier systems in mathematics: an 8B model reaches 94.5% on HMMT 2025, surpassing GPT-5's 93.2% by scaling effective TTC to roughly two million tokens. We open-source model checkpoints, training data, and the full inference pipeline to accelerate follow-up work.
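
The round structure the abstract describes (many parallel trajectories, compaction into context-bounded messages, synthesis guiding the next round) can be sketched as a control loop. Everything below is a stand-in sketch: the explore/compact/synthesize functions are hypothetical placeholders for model calls, not the paper's actual models, prompts, or message format; only the orchestration shape follows the text.

```python
# Sketch of a PaCoRe-style round loop: parallel exploration -> compact
# messages -> synthesis. Model calls are stubbed with string placeholders.
from concurrent.futures import ThreadPoolExecutor

def explore(question: str, messages: list[str], seed: int) -> str:
    """Stand-in for one reasoning trajectory conditioned on prior messages."""
    return f"finding[{seed}] for {question!r} given {len(messages)} messages"

def compact(trajectory: str, budget: int = 64) -> str:
    """Compress a trajectory into a context-bounded message (here: truncate)."""
    return trajectory[:budget]

def synthesize(question: str, messages: list[str]) -> str:
    """Stand-in for the synthesis step producing next-round guidance
    or the final answer from the compacted messages."""
    return f"answer({question!r}, {len(messages)} findings)"

def pacore_loop(question: str, rounds: int = 3, width: int = 8) -> str:
    messages: list[str] = []
    for _ in range(rounds):
        with ThreadPoolExecutor(max_workers=width) as pool:
            trajectories = list(
                pool.map(lambda s: explore(question, messages, s), range(width))
            )
        # Compact every trajectory so total context stays bounded per round,
        # which is how effective test-time compute can exceed the window.
        messages = [compact(t) for t in trajectories]
    return synthesize(question, messages)

print(pacore_loop("HMMT problem 1"))  # answer('HMMT problem 1', 8 findings)
```

The point of the sketch is that only the compacted messages, never the full trajectories, cross round boundaries, so the context consumed per round stays fixed while total compute scales with rounds × width.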

  4. X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

    Competitive programming presents great challenges for Code LLMs due to its intensive reasoning demands and high logical complexity. However, current Code LLMs still rely heavily on real-world data, which limits their scalability. In this paper, we explore a fully synthetic approach: training Code LLMs with entirely generated tasks, solutions, and test cases, to empower code reasoning models without relying on real-world data. To support this, we leverage feature-based synthesis to propose a novel data synthesis pipeline called SynthSmith. SynthSmith shows strong potential in producing diverse and challenging tasks, along with verified solutions and tests, supporting both supervised fine-tuning and reinforcement learning. Based on the proposed synthetic SFT and RL datasets, we introduce the X-Coder model series, which achieves a notable pass rate of 62.9 avg@8 on LiveCodeBench v5 and 55.8 on v6, outperforming DeepCoder-14B-Preview and AReal-boba2-14B despite having only 7B parameters. In-depth analysis reveals that scaling laws hold on our synthetic dataset, and we explore which dimensions are more effective to scale. We further provide insights into code-centric reinforcement learning and highlight the key factors that shape performance through detailed ablations and analysis. Our findings demonstrate that scaling high-quality synthetic data and adopting staged training can greatly advance code reasoning, while mitigating reliance on real-world coding data.

  5. MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head

While the Transformer architecture dominates many fields, its quadratic self-attention complexity hinders its use in large-scale applications. Linear attention offers an efficient alternative, but its direct application often degrades performance, with existing fixes typically re-introducing computational overhead through extra modules (e.g., depthwise separable convolution) that defeat the original purpose. In this work, we identify a key failure mode in these methods: global context collapse, where the model loses representational diversity. To address this, we propose Multi-Head Linear Attention (MHLA), which preserves this diversity by computing attention within divided heads along the token dimension. We prove that MHLA maintains linear complexity while recovering much of the expressive power of softmax attention, and verify its effectiveness across multiple domains, achieving a 3.6% improvement on ImageNet classification, a 6.3% gain on NLP, a 12.6% improvement on image generation, and a 41% enhancement on video generation under the same time complexity.
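
The abstract's core move can be illustrated in NumPy. This is one plausible reading of "heads divided along the token dimension" (splitting the sequence into chunks and running linear attention within each); the paper's exact formulation may differ, and the elu+1 feature map is a standard choice assumed here, not taken from the paper.

```python
# Toy sketch: kernelized linear attention, applied within "heads" formed by
# splitting the token axis rather than the feature axis.
import numpy as np

def linear_attention(q, k, v, eps=1e-6):
    """Softmax-free attention out = phi(Q) (phi(K)^T V), cost O(n * d^2)."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    q, k = phi(q), phi(k)
    kv = k.T @ v                   # (d, d) summary of all keys/values
    z = q @ k.sum(axis=0)          # per-query normalizer
    return (q @ kv) / (z[:, None] + eps)

def token_head_linear_attention(q, k, v, n_heads=4):
    """Split the token axis into n_heads chunks and attend within each,
    so each chunk keeps its own (K, V) summary instead of one global one."""
    chunks = zip(np.array_split(q, n_heads),
                 np.array_split(k, n_heads),
                 np.array_split(v, n_heads))
    return np.concatenate([linear_attention(qc, kc, vc) for qc, kc, vc in chunks])

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
out = token_head_linear_attention(q, k, v)
print(out.shape)  # (16, 8)
```

Keeping several token-local summaries instead of a single global one is one way the "global context collapse" failure mode named in the abstract could be avoided without leaving linear complexity.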

  6. GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

    Large Reasoning Models (LRMs) achieve remarkable performance by explicitly generating multi-step chains of thought, but this capability incurs substantial inference latency and computational cost. Collaborative inference offers a promising solution by selectively allocating work between lightweight and large models, yet a fundamental challenge remains: determining when a reasoning step requires the capacity of a large model or the efficiency of a small model. Existing routing strategies either rely on local token probabilities or post-hoc verification, introducing significant inference overhead. In this work, we propose a novel perspective on step-wise collaboration: the difficulty of a reasoning step can be inferred from its very first token. Inspired by the "Aha Moment" phenomenon in LRMs, we show that the entropy of the initial token serves as a strong predictor of step difficulty. Building on this insight, we introduce GlimpRouter, a training-free step-wise collaboration framework. GlimpRouter employs a lightweight model to generate only the first token of each reasoning step and routes the step to a larger model only when the initial token entropy exceeds a threshold. Experiments on multiple benchmarks demonstrate that our approach significantly reduces inference latency while preserving accuracy. For instance, GlimpRouter attains a substantial 10.7% improvement in accuracy while reducing inference latency by 25.9% compared to a standalone large model on AIME25. These results suggest a simple yet effective mechanism for reasoning: allocating computation based on a glimpse of thought rather than full-step evaluation.
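
The routing rule the abstract states (escalate a step to the large model only when the small model's first-token entropy exceeds a threshold) reduces to a few lines. The threshold value and the toy distributions below are illustrative assumptions, not the paper's settings.

```python
# GlimpRouter-style step routing: glimpse one token, route on its entropy.
import math

def entropy(probs: list[float]) -> float:
    """Shannon entropy (nats) of a first-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def route_step(first_token_probs: list[float], threshold: float = 1.0) -> str:
    """Return which model should complete this reasoning step."""
    return "large" if entropy(first_token_probs) > threshold else "small"

# A peaked distribution (the small model is confident) stays local;
# a flat one (an uncertain, "Aha Moment"-like step) escalates.
print(route_step([0.97, 0.01, 0.01, 0.01]))  # small
print(route_step([0.25, 0.25, 0.25, 0.25]))  # large
```

Because the lightweight model only ever decodes one token before the decision, the routing overhead is a single forward step per reasoning step, which is what lets the method avoid the cost of post-hoc verification.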

  7. OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent

    While Vision-Language Models (VLMs) have significantly advanced Computer-Using Agents (CUAs), current frameworks struggle with robustness in long-horizon workflows and generalization in novel domains. These limitations stem from a lack of granular control over historical visual context curation and the absence of visual-aware tutorial retrieval. To bridge these gaps, we introduce OS-Symphony, a holistic framework that comprises an Orchestrator coordinating two key innovations for robust automation: (1) a Reflection-Memory Agent that utilizes milestone-driven long-term memory to enable trajectory-level self-correction, effectively mitigating visual context loss in long-horizon tasks; (2) Versatile Tool Agents featuring a Multimodal Searcher that adopts a SeeAct paradigm to navigate a browser-based sandbox to synthesize live, visually aligned tutorials, thereby resolving fidelity issues in unseen scenarios. Experimental results demonstrate that OS-Symphony delivers substantial performance gains across varying model scales, establishing new state-of-the-art results on three online benchmarks, notably achieving 65.84% on OSWorld.

  8. Lost in the Noise: How Reasoning Models Fail with Contextual Distractors

    Recent advances in reasoning models and agentic AI systems have led to an increased reliance on diverse external information. However, this shift introduces input contexts that are inherently noisy, a reality that current sanitized benchmarks fail to capture. We introduce NoisyBench, a comprehensive benchmark that systematically evaluates model robustness across 11 datasets in RAG, reasoning, alignment, and tool-use tasks against diverse noise types, including random documents, irrelevant chat histories, and hard negative distractors. Our evaluation reveals a catastrophic performance drop of up to 80% in state-of-the-art models when faced with contextual distractors. Crucially, we find that agentic workflows often amplify these errors by over-trusting noisy tool outputs, and distractors can trigger emergent misalignment even without adversarial intent. We find that prompting, context engineering, SFT, and outcome-reward only RL fail to ensure robustness; in contrast, our proposed Rationale-Aware Reward (RARE) significantly strengthens resilience by incentivizing the identification of helpful information within noise. Finally, we uncover an inverse scaling trend where increased test-time computation leads to worse performance in noisy settings and demonstrate via attention visualization that models disproportionately focus on distractor tokens, providing vital insights for building the next generation of robust, reasoning-capable agents.

  9. Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models

    Diffusion Language Models (DLMs) offer a promising alternative for language modeling by enabling parallel decoding through iterative refinement. However, most DLMs rely on hard binary masking and discrete token assignments, which hinder the revision of early decisions and underutilize intermediate probabilistic representations. In this paper, we propose EvoToken-DLM, a novel diffusion-based language modeling approach that replaces hard binary masks with evolving soft token distributions. EvoToken-DLM enables a progressive transition from masked states to discrete outputs, supporting revisable decoding. To effectively support this evolution, we introduce continuous trajectory supervision, which aligns training objectives with iterative probabilistic updates. Extensive experiments across multiple benchmarks show that EvoToken-DLM consistently achieves superior performance, outperforming strong diffusion-based and masked DLM baselines. Project webpage: https://aim-uofa.github.io/EvoTokenDLM.
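
The contrast with hard masking can be made concrete with a toy: instead of a position being either fully masked or committed to one token, each position carries a probability vector that evolves toward a discrete output. The linear mixing schedule below is an assumption for illustration only; it is not EvoToken-DLM's actual update rule or training objective.

```python
# Toy soft-token evolution: start each position at a uniform "masked"
# distribution and move it stepwise toward the model's predicted
# distribution, committing to a discrete token only at the end.
import numpy as np

def evolve_tokens(model_probs, steps=5):
    vocab = model_probs.shape[-1]
    state = np.full_like(model_probs, 1.0 / vocab)   # soft mask: uniform
    for t in range(1, steps + 1):
        alpha = t / steps                            # schedule: 0 -> 1
        state = (1 - alpha) * state + alpha * model_probs
    return state.argmax(axis=-1)                     # discrete only at the end

probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.2, 0.2]])
print(evolve_tokens(probs))  # [1 0]
```

The property the abstract emphasizes, that intermediate states stay probabilistic and hence revisable, corresponds here to `state` remaining a full distribution at every step before the final argmax.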

  10. Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction

As LLM-based agents are increasingly used in long-term interactions, cumulative memory is critical for enabling personalization and maintaining stylistic consistency. However, most existing systems adopt an "all-or-nothing" approach to memory usage: incorporating all relevant past information can lead to Memory Anchoring, where the agent is trapped by past interactions, while excluding memory entirely results in under-utilization and the loss of important interaction history. We show that an agent's reliance on memory can be modeled as an explicit and user-controllable dimension. We first introduce a behavioral metric of memory dependence to quantify the influence of past interactions on current outputs. We then propose Steerable Memory Agent, SteeM, a framework that allows users to dynamically regulate memory reliance, ranging from a fresh-start mode that promotes innovation to a high-fidelity mode that closely follows interaction history. Experiments across different scenarios demonstrate that our approach consistently outperforms conventional prompting and rigid memory masking strategies, yielding a more nuanced and effective control for personalized human-agent collaboration.

  11. DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving

    Video generation models, as one form of world models, have emerged as one of the most exciting frontiers in AI, promising agents the ability to imagine the future by modeling the temporal evolution of complex scenes. In autonomous driving, this vision gives rise to driving world models: generative simulators that imagine ego and agent futures, enabling scalable simulation, safe testing of corner cases, and rich synthetic data generation. Yet, despite fast-growing research activity, the field lacks a rigorous benchmark to measure progress and guide priorities. Existing evaluations remain limited: generic video metrics overlook safety-critical imaging factors; trajectory plausibility is rarely quantified; temporal and agent-level consistency is neglected; and controllability with respect to ego conditioning is ignored. Moreover, current datasets fail to cover the diversity of conditions required for real-world deployment. To address these gaps, we present DrivingGen, the first comprehensive benchmark for generative driving world models. DrivingGen combines a diverse evaluation dataset curated from both driving datasets and internet-scale video sources, spanning varied weather, time of day, geographic regions, and complex maneuvers, with a suite of new metrics that jointly assess visual realism, trajectory plausibility, temporal coherence, and controllability. Benchmarking 14 state-of-the-art models reveals clear trade-offs: general models look better but break physics, while driving-specific ones capture motion realistically but lag in visual quality. DrivingGen offers a unified evaluation framework to foster reliable, controllable, and deployable driving world models, enabling scalable simulation, planning, and data-driven decision-making.

  12. MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era

    The rapid development of interactive and autonomous AI systems signals our entry into the agentic era. Training and evaluating agents on complex agentic tasks such as software engineering and computer use requires not only efficient model computation but also sophisticated infrastructure capable of coordinating vast agent-environment interactions. However, no open-source infrastructure can effectively support large-scale training and evaluation on such complex agentic tasks. To address this challenge, we present MegaFlow, a large-scale distributed orchestration system that enables efficient scheduling, resource allocation, and fine-grained task management for agent-environment workloads. MegaFlow abstracts agent training infrastructure into three independent services (Model Service, Agent Service, and Environment Service) that interact through unified interfaces, enabling independent scaling and flexible resource allocation across diverse agent-environment configurations. In our agent training deployments, MegaFlow successfully orchestrates tens of thousands of concurrent agent tasks while maintaining high system stability and achieving efficient resource utilization. By enabling such large-scale agent training, MegaFlow addresses a critical infrastructure gap in the emerging agentic AI landscape.

  13. Boosting Latent Diffusion Models via Disentangled Representation Alignment

    Latent Diffusion Models (LDMs) generate high-quality images by operating in a compressed latent space, typically obtained through image tokenizers such as Variational Autoencoders (VAEs). In pursuit of a generation-friendly VAE, recent studies have explored leveraging Vision Foundation Models (VFMs) as representation alignment targets for VAEs, mirroring the approach commonly adopted for LDMs. Although this yields certain performance gains, using the same alignment target for both VAEs and LDMs overlooks their fundamentally different representational requirements. We advocate that while LDMs benefit from latents retaining high-level semantic concepts, VAEs should excel in semantic disentanglement, enabling encoding of attribute-level information in a structured way. To address this, we propose the Semantic disentangled VAE (Send-VAE), explicitly optimized for disentangled representation learning through aligning its latent space with the semantic hierarchy of pre-trained VFMs. Our approach employs a non-linear mapper network to transform VAE latents, aligning them with VFMs to bridge the gap between attribute-level disentanglement and high-level semantics, facilitating effective guidance for VAE learning. We evaluate semantic disentanglement via linear probing on attribute prediction tasks, showing strong correlation with improved generation performance. Finally, using Send-VAE, we train flow-based transformers SiTs; experiments show Send-VAE significantly speeds up training and achieves a state-of-the-art FID of 1.21 and 1.75 with and without classifier-free guidance on ImageNet 256x256.

  14. What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Current vision-language benchmarks predominantly feature well-structured questions with clear, explicit prompts. However, real user queries are often informal and underspecified. Users naturally leave much unsaid, relying on images to convey context. We introduce HAERAE-Vision, a benchmark of 653 real-world visual questions from Korean online communities (0.76% survival from 86K candidates), each paired with an explicit rewrite, yielding 1,306 query variants in total. Evaluating 39 VLMs, we find that even state-of-the-art models (GPT-5, Gemini 2.5 Pro) achieve under 50% on the original queries. Crucially, query explicitation alone yields 8 to 22 point improvements, with smaller models benefiting most. We further show that even with web search, under-specified queries underperform explicit queries without search, revealing that current retrieval cannot compensate for what users leave unsaid. Our findings demonstrate that a substantial portion of VLM difficulty stems from natural query under-specification rather than model capability, highlighting a critical gap between benchmark evaluation and real-world deployment.

  15. ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration

Large Language Models (LLMs) can extend the limits of their parametric knowledge by adopting the Tool-Integrated Reasoning (TIR) paradigm. However, existing LLM-based agent training frameworks often focus on answer accuracy while overlooking alignment of behavior patterns. Consequently, agents often exhibit ineffective actions during TIR tasks, such as redundant or insufficient tool calls. How to calibrate erroneous behavioral patterns when executing TIR tasks, and thereby explore effective trajectories, remains an open problem. In this paper, we propose ET-Agent, a training framework that calibrates an agent's tool-use behavior from two synergistic perspectives: a Self-evolving Data Flywheel and Behavior Calibration Training. Specifically, we introduce a self-evolving data flywheel to generate enhanced data used to fine-tune the LLM and improve its exploration ability. On top of this, we implement a two-phase behavior-calibration training framework designed to progressively calibrate erroneous behavioral patterns toward optimal behaviors. In-depth experiments confirm the superiority of ET-Agent across multiple dimensions, including correctness, efficiency, reasoning conciseness, and tool execution accuracy. Our ET-Agent framework provides practical insights for research in the TIR field. Code can be found at https://github.com/asilverlight/ET-Agent

Solidot (15)

  1. Iran Is Searching For and Seizing Starlink Devices

    Since the nationwide internet shutdown, Iranian residents have relied mainly on Starlink terminals to communicate with the outside world and get protest videos out. Starlink itself has been subject to heavy jamming and works only intermittently. Amir Rashidi of the nonprofit Miaan Group says the Iranian government has begun searching for and confiscating Starlink devices. One Tehran user, interviewed by the WSJ over a Starlink connection, said he uploads protest videos filmed by relatives and sends them to third parties abroad, who post them to social media. Connectivity is usually better in the morning or around midday. Starlink terminals are illegal in Iran and are smuggled in; after the last wave of mass protests in 2022 they flooded into the country, with groups such as NetFreedom Pioneers shipping thousands of kits to Iran.

  2. A Brain-Computer Interface for Chinese

    Applied brain-computer interface (BCI) technology has already achieved English speech and text synthesis, but research on decoding Chinese has been comparatively scarce. Researchers at the Chinese Academy of Sciences have developed an implantable, high-throughput flexible BCI system and a real-time neural-network decoding algorithm for Mandarin, achieving the first real-time BCI decoding and sentence synthesis of Chinese. Chinese differs from English in important ways: English is a largely polysyllabic, non-tonal language, while Chinese is largely monosyllabic and tonal. English also has a large vocabulary, with roughly 20,000 words in common use, whereas Chinese covers everyday needs with more than 3,500 common characters built from about 400 syllables and 4 tones. The team exploited this structure, treating the roughly 400 syllables and 4 tones as stable intermediate decoding units for "translating" brain signals into text; decoding these syllables and tones can then be extrapolated to the full character set. After nine days of language-decoding tasks, the subject reached an average purely neural decoding accuracy of 71.2% over 394 Mandarin syllables (the uncovered syllables were rare ones the subject did not know), with a single-syllable decoding latency of 65 ms and a real-time sentence decoding rate of 49.6 characters per minute.
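
The intermediate-unit idea in this item can be shown with a toy sketch: decode brain signals to one of ~400 Mandarin syllables plus one of 4 tones, then map (syllable, tone) pairs to characters. The miniature lexicon below is hypothetical; the real system covers 3,500+ common characters, and the neural decoder itself is replaced here by the already-decoded units.

```python
# Toy pipeline shape: (syllable, tone) intermediate units -> characters.
# Hypothetical miniature lexicon for illustration only.
LEXICON = {
    ("ni", 3): "你",
    ("hao", 3): "好",
    ("zhong", 1): "中",
    ("wen", 2): "文",
}

def decode_sentence(units):
    """Map a sequence of decoded (syllable, tone) units to text;
    unknown units render as a placeholder box."""
    return "".join(LEXICON.get(u, "□") for u in units)

print(decode_sentence([("ni", 3), ("hao", 3)]))  # 你好
```

The design point is that the decoder only has to discriminate ~400 × 4 stable classes rather than thousands of characters, and the lexicon lookup extrapolates those classes to the full character set.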

  3. Google Declines to Comment on Why It Hasn't Removed the Grok App

    Apple's rules on app removal are vague, leaving it room for discretion. Google, by contrast, has explicit rules. According to its support documentation: "We don't allow apps that contain or promote sexual content or profanity, including pornography, or any content or services intended to be sexually gratifying. We don't allow apps or app content that appear to promote or solicit paid sexual acts. We don't allow apps that contain or promote content associated with sexually predatory behavior, or that distribute sexual content without consent." xAI's Grok has recently behaved in precisely the ways these rules describe. The web version of Grok has restricted its deepfake feature to paying users, but the Grok app still has no restrictions at all. Google has so far declined to comment on why it has not removed the app.

  4. EVs Reached 97% of Norway's New-Car Sales

    Norway has published its car sales figures for December 2025 and for the full year: battery-electric and plug-in hybrid vehicles accounted for 97.5% of total sales, essentially meeting the target it set in 2017 of ending sales of combustion cars by 2025. Norway registered 179,549 new passenger cars in 2025: 172,232 battery-electric, 2,751 plug-in hybrid, 2,306 conventional fuel vehicles, 1,773 diesel, and 487 petrol. The combustion cars were mainly special-purpose vehicles such as wheelchair-accessible cars and police or other emergency vehicles. Chinese EV sales in Norway also grew, with Chinese brands' share rising from 10.4% last year to 13.7%.
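
As a quick arithmetic check, the 97.5% share follows directly from the registration counts quoted in the item (battery-electric plus plug-in hybrid over total registrations):

```python
# Share of BEV + PHEV in Norway's 2025 new passenger-car registrations,
# using the figures quoted in the item.
bev, phev, total = 172_232, 2_751, 179_549
share = (bev + phev) / total
print(f"{share:.1%}")  # 97.5%
```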

  5. Apple Picks Gemini to Power Siri

    Apple has reached a multi-year agreement with Google to use Gemini to power its Siri assistant. Apple promised an AI-driven Siri as early as 2024 but kept delaying it because its in-house large models were unreliable. It then decided to adopt an external model, tested OpenAI's ChatGPT and Anthropic's Claude, and ultimately chose Google's Gemini. Apple will reportedly pay Google $1 billion a year for the deal. Considering that Google pays more than $20 billion a year (a 2022 figure that is likely higher now) for the default search engine slot on Apple devices, the $1 billion simply means Apple collects a little less. The new Siri is due later this year on iOS 26, iPadOS 26, and macOS 26 Tahoe.

  6. Chinese Companies File to Launch Over 200,000 Internet Satellites

    Chinese companies have filed with the International Telecommunication Union (ITU) to launch more than 200,000 internet satellites. The Radio Spectrum Development, Utilization and Technology Innovation Research Institute, founded on December 30 last year, filed the two largest projects, CTC-1 and CTC-2, each for 96,714 satellites; it submitted the applications the day after it was founded. Beyond that, China Mobile's L1 project covers 2,520 satellites, and Shanghai Spacesail Technologies, developer of the Qianfan constellation, filed a 1,296-satellite plan. Qianfan aims to deploy 15,000 satellites by 2030, and Guowang plans to launch 13,000. Other ITU filings include Guodian Gaoke's Tianqi 3G (1,132 satellites), China Mobile's M1 (144 satellites), Hangtian Yuxing's YX-5 (106 satellites), GalaxySpace's Galaxy-SAR-2 network (96 satellites), and the BlackSpider-3 constellation (81 satellites).

  7. Cloudflare Threatens to Exit Italy After Record Fine

    Because Cloudflare's public DNS service 1.1.1.1 refuses to filter piracy sites, Italy's communications regulator AGCOM fined the company a record 14.2 million euros. Cloudflare CEO Matthew Prince has threatened to pull out of Italy and to stop providing free cybersecurity services to the Milano-Cortina Winter Olympics. The fine amounts to about 1% of Cloudflare's global annual revenue but more than twice its revenue from Italy. Prince said Cloudflare may cancel plans to open an office in Italy, and accused AGCOM of executing an "internet censorship scheme" on behalf of a "secret cabal of European media elites."

  8. Windows File Explorer May Get an Integrated Copilot Sidebar

    Microsoft wants to make Windows' AI features more prominent. According to a Windows 11 preview build, it is testing a new Copilot integration in File Explorer: rather than an "Ask Copilot" entry in the right-click menu, Copilot would be built into Explorer itself, in a sidebar or a details/preview-pane-style interface.

  9. Iran Offline for Four Days

    According to monitoring by NetBlocks and Cloudflare Radar, Iran has been offline nationwide for four days. Since December 2025 Iran has seen a wave of protests triggered by anger over soaring inflation, rising food prices, and the sharp depreciation of the rial. The demonstrations were initially launched by shopkeepers and market traders and have grown steadily since the new year. According to Wikipedia, at least 2,000 people have died and more than 10,000 have been arrested.

  10. China Tests a Commercial Supercritical CO2 Generator

    The world's first supercritical carbon-dioxide generating unit, "Supercritical Carbon No. 1", has entered commercial operation at the Shougang Shuigang steel plant in Liupanshui, Guizhou, marking the first time anywhere that supercritical CO2 power generation has moved from the laboratory to commercial deployment. Conventional thermal and nuclear power both work on the principle of "boiling water": heat turns water into steam, which drives a turbine to generate electricity. Supercritical CO2 generation is a new kind of heat-to-electricity conversion: carbon dioxide held above 31 degrees Celsius and more than 73 atmospheres of pressure serves as the working fluid of the cycle; compressors and heat exchangers raise its pressure and temperature further, and the hot, high-pressure CO2 spins a turbine to produce electricity. Compared with the sintering waste-heat steam generation currently in service, "Supercritical Carbon No. 1" improves generating efficiency by more than 85% and net power output by more than 50%.

  11. Linus Torvalds Used AI-Assisted Coding for a Personal Project

    Linus Torvalds has published a personal project on GitHub called AudioNoise, a random digital audio-effects project. In the README he discloses that he used Google's AI-powered IDE, Antigravity, for coding assistance, or "vibe coding". Antigravity is a fork of Windsurf, which is itself a fork of Microsoft's VS Code.

  12. Appeal in AI-Companion Obscene-Chat Case Opens This Week

    Because large numbers of users engaged in sexually explicit chat with AI characters on the app, its main developers and operators were held criminally liable. In September 2025 the Xuhui District People's Court in Shanghai convicted the two defendants of producing obscene materials for profit, sentencing them to four years and eighteen months in prison respectively, making this the first case in China in which an AI service provider was criminally convicted over pornographic content. The app in question, Alien Chat (AC), is an AI companion chat app positioned as providing intimate companionship and emotional support for young users, who can chat with AI characters after registering. According to the judgment, the app had 116,000 registered mobile users, 24,000 of them paying, and had collected more than 3.63 million yuan in membership fees by the time the case broke. Both defendants appealed the verdict; the second-instance hearing opens on January 14 at the Shanghai No. 1 Intermediate People's Court.

  13. Winter Snowfall in the Himalayas Has Dropped Sharply

    In what should be a snow-white season, the Himalayas are bare: snowfall this winter has fallen sharply. Compared with the 1980-2020 average, snowfall in the Himalayas has declined in most of the past five years. Rising global temperatures also mean that what little snow accumulates melts quickly. Meteorologists say the shrinking snowpack not only changes the face of the Himalayas but also affects the lives and ecosystems of hundreds of millions of people in the region, for whom meltwater is the main source of fresh water. India's meteorological department recorded almost no rain or snow across nearly all of northern India in December. According to the ERA-5 reanalysis dataset (European Centre for Medium-Range Weather Forecasts Reanalysis), snowfall in the northwestern Himalayas over the past five years is down 25% from the 40-year long-term average (1980-2020).

  14. Gentoo Considers Migrating Off GitHub Over Pushy Copilot Promotion

    The Gentoo Linux project has published a review of 2025 and an outlook for 2026. Developers say that Microsoft-owned GitHub keeps aggressively pushing its Copilot AI features into repositories, so Gentoo is considering migrating its repository mirrors and pull requests to Codeberg, a Forgejo-based service maintained by a Berlin nonprofit. Gentoo will continue to host its own primary git repositories, bug tracker, and other infrastructure; there are no plans to change that.

  15. Legalizing Reverse Engineering Would Help End Enshittification

    Canadian science-fiction author Cory Doctorow, who coined the term "enshittification", has published an article in The Guardian repeating an argument he made at the 39C3 conference: legalizing reverse engineering would help end enshittification. A major reason, he says, that defective American products have not lost their appeal is that the law bans reverse engineering, under so-called anti-circumvention provisions, and those provisions appear in national laws because the US imposes them on its trading partners. Yet reverse engineering is the precondition for modifying existing products to serve their users better. Until anti-circumvention provisions are repealed, we cannot reverse engineer American cloud software (databases, word processors, and the like) or hardware such as tractors; only after repeal can freer, more open, auditable code replace proprietary American code and better safeguard digital sovereignty.