OrangeBot.AI Digest — 2026-01-17

50 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Canada's deal with China signals it is serious about shift from US (www.bbc.com)
  2. 2025 was the third hottest year on record (www.economist.com)
  3. Eight European countries face 10% tariff for opposing US control of Greenland (apnews.com)
  4. We put Claude Code in Rollercoaster Tycoon (labs.ramp.com)
  5. The recurring dream of replacing developers (www.caimito.net)
  6. What life is like in Minneapolis now (donmoynihan.substack.com)
  7. The Dilbert Afterlife (www.astralcodexten.com)
  8. PCs refuse to shut down after Microsoft patch (www.theregister.com)
  9. ASCII characters are not pixels: a deep dive into ASCII rendering (alexharri.com)
  10. After 25 years, Wikipedia has proved that news doesn't need to look like news (www.niemanlab.org)
  11. US electricity demand surged in 2025 – solar handled 61% of it (electrek.co)
  12. Map To Poster – Create Art of your favourite city (github.com)
  13. Show HN: Streaming gigabyte medical images from S3 without downloading them (github.com)
  14. ClickHouse acquires Langfuse (langfuse.com)
  15. You have three minutes to escape the perpetual underclass (geohot.github.io)

GitHub Trending (5)

  1. eigent-ai / eigent

    Eigent: The Open Source Cowork Desktop to Unlock Your Exceptional Productivity.

  2. obra / superpowers

    An agentic skills framework & software development methodology that works.

  3. puckeditor / puck

    The visual editor for React with AI superpowers

  4. google / langextract

    A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

  5. iOfficeAI / AionUi

    Free, local, open-source Cowork for Gemini CLI, Claude Code, Codex, Opencode, Qwen Code, Goose Cli, Auggie, and more | 🌟 Star if you like it!

Hugging Face (15)

  1. STEP3-VL-10B Technical Report

    We present STEP3-VL-10B, a lightweight open-source foundation model designed to redefine the trade-off between compact efficiency and frontier-level multimodal intelligence. STEP3-VL-10B is realized through two strategic shifts: first, a unified, fully unfrozen pre-training strategy on 1.2T multimodal tokens that integrates a language-aligned Perception Encoder with a Qwen3-8B decoder to establish intrinsic vision-language synergy; and second, a scaled post-training pipeline featuring over 1k iterations of reinforcement learning. Crucially, we implement Parallel Coordinated Reasoning (PaCoRe) to scale test-time compute, allocating resources to scalable perceptual reasoning that explores and synthesizes diverse visual hypotheses. Consequently, despite its compact 10B footprint, STEP3-VL-10B rivals or surpasses models 10-20x larger (e.g., GLM-4.6V-106B, Qwen3-VL-235B) and top-tier proprietary flagships like Gemini 2.5 Pro and Seed-1.5-VL. Delivering best-in-class performance, it records 92.2% on MMBench and 80.11% on MMMU, while excelling in complex reasoning with 94.43% on AIME2025 and 75.95% on MathVision. We release the full model suite to provide the community with a powerful, efficient, and reproducible baseline.
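
    As a hedged illustration of the "Parallel Coordinated Reasoning" pattern the abstract names, the Python sketch below samples several hypotheses in parallel and reconciles them in a second pass. The `generate` callable and the synthesis prompt are placeholders of mine, not the STEP3-VL interface.

      # Explore-then-synthesize, in the spirit of PaCoRe (a sketch, not the
      # paper's implementation). `generate(prompt, temperature)` is assumed
      # to be any text-completion call.
      from concurrent.futures import ThreadPoolExecutor

      def parallel_coordinated_answer(question, generate, n_hypotheses=4):
          # Explore: sample diverse reasoning traces in parallel (temperature > 0).
          with ThreadPoolExecutor(max_workers=n_hypotheses) as pool:
              hypotheses = list(pool.map(
                  lambda _: generate(question, temperature=1.0),
                  range(n_hypotheses)))
          # Synthesize: a second, greedy pass reconciles the candidates.
          prompt = (question
                    + "\n\nCandidate analyses:\n" + "\n---\n".join(hypotheses)
                    + "\n\nSynthesize these into a single final answer.")
          return generate(prompt, temperature=0.0)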

  2. Urban Socio-Semantic Segmentation with Vision-Language Reasoning

    As hubs of human activity, urban surfaces contain a wealth of semantic entities. Segmenting these various entities from satellite imagery is crucial for a range of downstream applications. Current advanced segmentation models can reliably segment entities defined by physical attributes (e.g., buildings, water bodies) but still struggle with socially defined categories (e.g., schools, parks). In this work, we achieve socio-semantic segmentation by vision-language model reasoning. To facilitate this, we introduce the Urban Socio-Semantic Segmentation dataset named SocioSeg, a new resource comprising satellite imagery, digital maps, and pixel-level labels of social semantic entities organized in a hierarchical structure. Additionally, we propose a novel vision-language reasoning framework called SocioReasoner that simulates the human process of identifying and annotating social semantic entities via cross-modal recognition and multi-stage reasoning. We employ reinforcement learning to optimize this non-differentiable process and elicit the reasoning capabilities of the vision-language model. Experiments demonstrate our approach's gains over state-of-the-art models and strong zero-shot generalization. Our dataset and code are available at https://github.com/AMAP-ML/SocioReasoner.

  3. Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

    Reinforcement learning (RL) has become a central paradigm for post-training large language models (LLMs), particularly for complex reasoning tasks, yet it often suffers from exploration collapse: policies prematurely concentrate on a small set of dominant reasoning patterns, improving pass@1 while limiting rollout-level diversity and gains in pass@k. We argue that this failure stems from regularizing local token behavior rather than diversity over sets of solutions. To address this, we propose Uniqueness-Aware Reinforcement Learning, a rollout-level objective that explicitly rewards correct solutions that exhibit rare high-level strategies. Our method uses an LLM-based judge to cluster rollouts for the same problem according to their high-level solution strategies, ignoring superficial variations, and reweights policy advantages inversely with cluster size. As a result, correct but novel strategies receive higher rewards than redundant ones. Across mathematics, physics, and medical reasoning benchmarks, our approach consistently improves pass@k across large sampling budgets and increases the area under the pass@k curve (AUC@K) without sacrificing pass@1, while sustaining exploration and uncovering more diverse solution strategies at scale.
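
    The reweighting idea is compact enough to sketch. The Python below assumes rollouts for one problem already carry strategy-cluster labels (the paper derives them with an LLM judge) and shrinks each rollout's advantage by its cluster size, so a correct but rare strategy earns more than a redundant one. The baseline choice and function name are mine, not the paper's.

      from collections import Counter

      def uniqueness_weighted_advantages(rewards, cluster_ids):
          """rewards: 1.0 for correct rollouts, 0.0 otherwise.
          cluster_ids: strategy-cluster label per rollout."""
          sizes = Counter(cluster_ids)
          baseline = sum(rewards) / len(rewards)   # simple group baseline
          return [(r - baseline) / sizes[c]        # shrink redundant strategies
                  for r, c in zip(rewards, cluster_ids)]

      # Two correct rollouts share strategy "a"; one correct rollout is unique:
      print(uniqueness_weighted_advantages([1, 1, 1, 0], ["a", "a", "b", "c"]))
      # -> [0.125, 0.125, 0.25, -0.75]; the lone "b" strategy earns double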

  4. Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

    Multi-agent systems have evolved into practical LLM-driven collaborators for many applications, gaining robustness from diversity and cross-checking. However, multi-agent RL (MARL) training is resource-intensive and unstable: co-adapting teammates induce non-stationarity, and rewards are often sparse and high-variance. Therefore, we introduce Multi-Agent Test-Time Reinforcement Learning (MATTRL), a framework that injects structured textual experience into multi-agent deliberation at inference time. MATTRL forms a multi-expert team of specialists for multi-turn discussions, retrieves and integrates test-time experiences, and reaches consensus for final decision-making. We also study credit assignment for constructing a turn-level experience pool, then reinjecting it into the dialogue. Across challenging benchmarks in medicine, math, and education, MATTRL improves accuracy by an average of 3.67% over a multi-agent baseline, and by 8.67% over comparable single-agent baselines. Ablation studies examine different credit-assignment schemes and provide a detailed comparison of how they affect training outcomes. MATTRL offers a stable, effective, and efficient path to distribution-shift-robust multi-agent reasoning without tuning.

  5. VIBE: Visual Instruction Based Editor

    Instruction-based image editing is among the fastest developing areas in generative AI. Over the past year, the field has reached a new level, with dozens of open-source models released alongside highly capable commercial systems. However, only a limited number of open-source approaches currently achieve real-world quality. In addition, diffusion backbones, the dominant choice for these pipelines, are often large and computationally expensive for many deployments and research settings, with widely used variants typically containing 6B to 20B parameters. This paper presents a compact, high-throughput instruction-based image editing pipeline that uses a modern 2B-parameter Qwen3-VL model to guide the editing process and the 1.6B-parameter diffusion model Sana1.5 for image generation. Our design decisions across architecture, data processing, training configuration, and evaluation target low-cost inference and strict source consistency while maintaining high quality across the major edit categories feasible at this scale. Evaluated on the ImgEdit and GEdit benchmarks, the proposed method matches or exceeds the performance of substantially heavier baselines, including models with several times as many parameters and higher inference cost, and is particularly strong on edits that require preserving the input image, such as attribute adjustment, object removal, background edits, and targeted replacement. The model fits within 24 GB of GPU memory and generates edited images at up to 2K resolution in approximately 4 seconds on an NVIDIA H100 in BF16, without additional inference optimizations or distillation.

  6. Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning

    The central challenge of AI for Science is not reasoning alone, but the ability to create computational methods in an open-ended scientific world. Existing LLM-based agents rely on static, pre-defined tool libraries, a paradigm that fundamentally fails in scientific domains where tools are sparse, heterogeneous, and intrinsically incomplete. In this paper, we propose Test-Time Tool Evolution (TTE), a new paradigm that enables agents to synthesize, verify, and evolve executable tools during inference. By transforming tools from fixed resources into problem-driven artifacts, TTE overcomes the rigidity and long-tail limitations of static tool libraries. To facilitate rigorous evaluation, we introduce SciEvo, a benchmark comprising 1,590 scientific reasoning tasks supported by 925 automatically evolved tools. Extensive experiments show that TTE achieves state-of-the-art performance in both accuracy and tool efficiency, while enabling effective cross-domain adaptation of computational tools. The code and benchmark have been released at https://github.com/lujiaxuan0520/Test-Time-Tool-Evol.
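
    A minimal sketch of the synthesize-and-verify loop such test-time tool evolution implies: ask a model for a tool, check it against a quick test case, and retry with the error message on failure. The `llm` callable, prompt, and retry policy are assumptions of mine, not the paper's interface, and the bare `exec` would need real sandboxing in practice.

      def evolve_tool(spec, test_case, llm, max_rounds=3):
          feedback = ""
          for _ in range(max_rounds):
              code = llm(f"Write a Python function `tool(x)` that {spec}.{feedback}")
              namespace = {}
              try:
                  exec(code, namespace)            # NOTE: sandbox this in practice
                  arg, expected = test_case
                  assert namespace["tool"](arg) == expected
                  return namespace["tool"]         # verified: keep the evolved tool
              except Exception as err:
                  feedback = f"\nPrevious attempt failed with: {err!r}. Fix it."
          raise RuntimeError("no verified tool after max_rounds")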

  7. DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

    Vision-Language Pre-training (VLP) models demonstrate strong performance across various downstream tasks by learning from large-scale image-text pairs through contrastive pretraining. The release of extensive English image-text datasets (e.g., COYO-700M and LAION-400M) has enabled widespread adoption of models such as CLIP and SigLIP in tasks including cross-modal retrieval and image captioning. However, the advancement of Chinese vision-language pretraining has substantially lagged behind, due to the scarcity of high-quality Chinese image-text data. To address this gap, we develop a comprehensive pipeline for constructing a high-quality Chinese cross-modal dataset. As a result, we propose DanQing, which contains 100 million image-text pairs collected from Common Crawl. Different from existing datasets, DanQing is curated through a more rigorous selection process, yielding superior data quality. Moreover, DanQing is primarily built from 2024-2025 web data, enabling models to better capture evolving semantic trends and thus offering greater practical utility. We compare DanQing with existing datasets by continual pre-training of the SigLIP2 model. Experimental results show that DanQing consistently achieves superior performance across a range of Chinese downstream tasks, including zero-shot classification, cross-modal retrieval, and LMM-based evaluations. To facilitate further research in Chinese vision-language pre-training, we will open-source the DanQing dataset under the Creative Commons CC-BY 4.0 license.

  8. Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering

    The advancement of artificial intelligence toward agentic science is currently bottlenecked by the challenge of ultra-long-horizon autonomy, the ability to sustain strategic coherence and iterative correction over experimental cycles spanning days or weeks. While Large Language Models (LLMs) have demonstrated prowess in short-horizon reasoning, they are easily overwhelmed by execution details in the high-dimensional, delayed-feedback environments of real-world research, failing to consolidate sparse feedback into coherent long-term guidance. Here, we present ML-Master 2.0, an autonomous agent that masters ultra-long-horizon machine learning engineering (MLE), a representative microcosm of scientific discovery. By reframing context management as a process of cognitive accumulation, our approach introduces Hierarchical Cognitive Caching (HCC), a multi-tiered architecture inspired by computer systems that enables the structural differentiation of experience over time. By dynamically distilling transient execution traces into stable knowledge and cross-task wisdom, HCC allows agents to decouple immediate execution from long-term experimental strategy, effectively overcoming the scaling limits of static context windows. In evaluations on OpenAI's MLE-Bench under 24-hour budgets, ML-Master 2.0 achieves a state-of-the-art medal rate of 56.44%. Our findings demonstrate that ultra-long-horizon autonomy provides a scalable blueprint for AI capable of autonomous exploration beyond human-precedent complexities.
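
    A hedged sketch of the tiered-memory idea, with names of my own invention rather than the paper's: raw execution traces live in a small transient tier, and when it fills, a summarizer distills them into a compact note promoted to a longer-lived tier, keeping the working context bounded.

      from collections import deque

      class TieredMemory:
          def __init__(self, summarize, trace_capacity=8):
              self.summarize = summarize   # e.g. an LLM call: list[str] -> str
              self.traces = deque(maxlen=trace_capacity)   # transient tier
              self.knowledge = []                          # stable tier

          def record(self, trace):
              if len(self.traces) == self.traces.maxlen:
                  # Distill the full transient tier into one stable note.
                  self.knowledge.append(self.summarize(list(self.traces)))
                  self.traces.clear()
              self.traces.append(trace)

          def context(self):
              # Stable knowledge first, then the recent raw traces.
              return "\n".join(self.knowledge + list(self.traces))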

  9. CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

    Recent video generation models have revealed the emergence of Chain-of-Frame (CoF) reasoning, enabling frame-by-frame visual inference. With this capability, video models have been successfully applied to various visual tasks (e.g., maze solving, visual puzzles). However, their potential to enhance text-to-image (T2I) generation remains largely unexplored due to the absence of a clearly defined visual reasoning starting point and interpretable intermediate states in the T2I generation process. To bridge this gap, we propose CoF-T2I, a model that integrates CoF reasoning into T2I generation via progressive visual refinement, where intermediate frames act as explicit reasoning steps and the final frame is taken as output. To establish such an explicit generation process, we curate CoF-Evol-Instruct, a dataset of CoF trajectories that model the generation process from semantics to aesthetics. To further improve quality and avoid motion artifacts, we encode each frame independently. Experiments show that CoF-T2I significantly outperforms the base video model and achieves competitive performance on challenging benchmarks, reaching 0.86 on GenEval and 7.468 on Imagine-Bench. These results indicate the substantial promise of video models for advancing high-quality text-to-image generation.

  10. Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders

    Recent progress in text-to-image (T2I) diffusion models (DMs) has enabled high-quality visual synthesis from diverse textual prompts. Yet, most existing T2I DMs, even those equipped with large language model (LLM)-based text encoders, remain text-pixel mappers -- they employ LLMs merely as text encoders, without leveraging their inherent reasoning capabilities to infer what should be visually depicted given the textual prompt. To move beyond such literal generation, we propose the think-then-generate (T2G) paradigm, where the LLM-based text encoder is encouraged to reason about and rewrite raw user prompts; the states of the rewritten prompts then serve as diffusion conditioning. To achieve this, we first activate the think-then-rewrite pattern of the LLM encoder with a lightweight supervised fine-tuning process. Subsequently, the LLM encoder and diffusion backbone are co-optimized to ensure faithful reasoning about the context and accurate rendering of the semantics via Dual-GRPO. In particular, the text encoder is reinforced using image-grounded rewards to infer and recall world knowledge, while the diffusion backbone is pushed to produce semantically consistent and visually coherent images. Experiments show substantial improvements in factual consistency, semantic alignment, and visual realism across reasoning-based image generation and editing benchmarks, achieving a WISE score of 0.79, nearly on par with GPT-4. Our results constitute a promising step toward next-generation unified models with reasoning, expression, and demonstration capacities.
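
    The data flow is simple enough to sketch: the raw prompt is first rewritten by the LLM, and the rewritten prompt's encoding, not the raw one, conditions the diffusion model. All three callables below are hypothetical stand-ins, and the Dual-GRPO training stage is omitted.

      def think_then_generate(raw_prompt, llm_rewrite, llm_encode, diffusion):
          # "Think": reason about what should be depicted, e.g. resolving
          # "a fruit that keeps doctors away" into "a red apple".
          rewritten = llm_rewrite(raw_prompt)
          # "Generate": condition diffusion on the rewritten prompt's states.
          return diffusion(llm_encode(rewritten))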

  11. Alterbute: Editing Intrinsic Attributes of Objects in Images

    We introduce Alterbute, a diffusion-based method for editing an object's intrinsic attributes in an image. We allow changing color, texture, material, and even the shape of an object, while preserving its perceived identity and scene context. Existing approaches either rely on unsupervised priors that often fail to preserve identity or use overly restrictive supervision that prevents meaningful intrinsic variations. Our method relies on: (i) a relaxed training objective that allows the model to change both intrinsic and extrinsic attributes conditioned on an identity reference image, a textual prompt describing the target intrinsic attributes, and a background image and object mask defining the extrinsic context. At inference, we restrict extrinsic changes by reusing the original background and object mask, thereby ensuring that only the desired intrinsic attributes are altered; (ii) Visual Named Entities (VNEs) - fine-grained visual identity categories (e.g., "Porsche 911 Carrera") that group objects sharing identity-defining features while allowing variation in intrinsic attributes. We use a vision-language model to automatically extract VNE labels and intrinsic attribute descriptions from a large public image dataset, enabling scalable, identity-preserving supervision. Alterbute outperforms existing methods on identity-preserving object intrinsic attribute editing.

  12. ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback

    While LLM-based agents can interact with environments via invoking external tools, their expanded capabilities also amplify security risks. Monitoring step-level tool invocation behaviors in real time and proactively intervening before unsafe execution is critical for agent deployment, yet remains under-explored. In this work, we first construct TS-Bench, a novel benchmark for step-level tool invocation safety detection in LLM agents. We then develop a guardrail model, TS-Guard, using multi-task reinforcement learning. The model proactively detects unsafe tool invocation actions before execution by reasoning over the interaction history. It assesses request harmfulness and action-attack correlations, producing interpretable and generalizable safety judgments and feedback. Furthermore, we introduce TS-Flow, a guardrail-feedback-driven reasoning framework for LLM agents, which reduces harmful tool invocations of ReAct-style agents by 65 percent on average and improves benign task completion by approximately 10 percent under prompt injection attacks.
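
    A hedged sketch of the proactive, step-level pattern described here: every tool call is screened against the interaction history before it runs, and a blocked call returns the judge's feedback to the agent instead of executing. The `judge` callable and its verdict schema are assumptions of mine, not TS-Guard's interface.

      def guarded_invoke(tool, args, history, judge):
          verdict = judge(history=history, tool=tool.__name__, args=args)
          if not verdict["safe"]:
              # Feed the explanation back so the agent can revise its plan.
              return {"blocked": True, "feedback": verdict["reason"]}
          result = tool(**args)
          history.append({"tool": tool.__name__, "args": args, "result": result})
          return {"blocked": False, "result": result}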

  13. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching

    Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to distinguish effective tool calls from redundant or erroneous ones, particularly in long-horizon multi-turn scenarios. To address this, we propose MatchTIR, a framework that introduces fine-grained supervision via bipartite matching-based turn-level reward assignment and dual-level advantage estimation. Specifically, we formulate credit assignment as a bipartite matching problem between predicted and ground-truth traces, utilizing two assignment strategies to derive dense turn-level rewards. Furthermore, to balance local step precision with global task success, we introduce a dual-level advantage estimation scheme that integrates turn-level and trajectory-level signals, assigning distinct advantage values to individual interaction turns. Extensive experiments on three benchmarks demonstrate the superiority of MatchTIR. Notably, our 4B model surpasses the majority of 8B competitors, particularly in long-horizon and multi-turn tasks. Our code is available at https://github.com/quchangle1/MatchTIR.
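
    The matching step can be sketched with an off-the-shelf Hungarian solver, assuming a similarity score between each predicted turn and each ground-truth turn is already available (how MatchTIR scores turns is not specified in the abstract). Matched turns receive dense turn-level rewards; unmatched ones get zero.

      import numpy as np
      from scipy.optimize import linear_sum_assignment

      def turn_level_rewards(similarity):
          """similarity: (n_predicted, n_ground_truth) array in [0, 1]."""
          rows, cols = linear_sum_assignment(-similarity)   # maximize similarity
          rewards = np.zeros(similarity.shape[0])
          rewards[rows] = similarity[rows, cols]
          return rewards

      sim = np.array([[0.9, 0.1], [0.2, 0.8], [0.1, 0.1]])  # 3 predicted, 2 gold
      print(turn_level_rewards(sim))                        # -> [0.9, 0.8, 0.0]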

  14. Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

    Today's strongest video-language models (VLMs) remain proprietary. The strongest open-weight models either rely on synthetic data from proprietary VLMs, effectively distilling from them, or do not disclose their training data or recipe. As a result, the open-source community lacks the foundations needed to improve on the state-of-the-art video (and image) language models. Crucially, many downstream applications require more than just high-level video understanding; they require grounding -- either by pointing or by tracking in pixels. Even proprietary models lack this capability. We present Molmo2, a new family of VLMs that are state-of-the-art among open-source models and demonstrate exceptional new capabilities in point-driven grounding in single image, multi-image, and video tasks. Our key contribution is a collection of 7 new video datasets and 2 multi-image datasets, including a dataset of highly detailed video captions for pre-training, a free-form video Q&A dataset for fine-tuning, a new object tracking dataset with complex queries, and an innovative new video pointing dataset, all collected without the use of closed VLMs. We also present a training recipe for this data utilizing an efficient packing and message-tree encoding scheme, and show that bi-directional attention on vision tokens and a novel token-weight strategy improve performance. Our best-in-class 8B model outperforms others in the class of open weight and data models on short videos, counting, and captioning, and is competitive on long videos. On video grounding, Molmo2 significantly outperforms existing open-weight models like Qwen3-VL (35.5 vs 29.6 accuracy on video counting) and surpasses proprietary models like Gemini 3 Pro on some tasks (38.4 vs 20.0 F1 on video pointing and 56.2 vs 41.1 J&F on video tracking).
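
    The bi-directional-attention detail is easy to sketch generically: text tokens stay causal while vision tokens may attend to one another in both directions. The mask construction below illustrates that pattern and is not Molmo2's implementation.

      import numpy as np

      def mixed_attention_mask(is_vision):
          """is_vision: (seq_len,) bool array, True where a token is visual.
          Returns a boolean mask where True means "may attend"."""
          n = len(is_vision)
          causal = np.tril(np.ones((n, n), dtype=bool))
          vision_pair = np.outer(is_vision, is_vision)  # vision<->vision, both ways
          return causal | vision_pair

      mask = mixed_attention_mask(np.array([False, True, True, False]))
      print(mask[1, 2])  # True: vision token 1 attends to later vision token 2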

  15. Transition Matching Distillation for Fast Video Generation

    Large video diffusion and flow models have achieved remarkable success in high-quality video generation, but their use in real-time interactive applications remains limited due to their inefficient multi-step sampling process. In this work, we present Transition Matching Distillation (TMD), a novel framework for distilling video diffusion models into efficient few-step generators. The central idea of TMD is to match the multi-step denoising trajectory of a diffusion model with a few-step probability transition process, where each transition is modeled as a lightweight conditional flow. To enable efficient distillation, we decompose the original diffusion backbone into two components: (1) a main backbone, comprising the majority of early layers, that extracts semantic representations at each outer transition step; and (2) a flow head, consisting of the last few layers, that leverages these representations to perform multiple inner flow updates. Given a pretrained video diffusion model, we first introduce a flow head to the model, and adapt it into a conditional flow map. We then apply distribution matching distillation to the student model with flow head rollout in each transition step. Extensive experiments on distilling Wan2.1 1.3B and 14B text-to-video models demonstrate that TMD provides a flexible and strong trade-off between generation speed and visual quality. In particular, TMD outperforms existing distilled models under comparable inference costs in terms of visual fidelity and prompt adherence. Project page: https://research.nvidia.com/labs/genair/tmd
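
    A hedged sketch of the two-level sampling loop outlined above: each of a few outer transition steps runs the heavy backbone once, and several cheap inner Euler updates from the flow head refine that transition. `backbone` and `flow_head` are hypothetical callables, not the released interfaces.

      import torch

      @torch.no_grad()
      def tmd_style_sample(backbone, flow_head, noise, outer_steps=4, inner_steps=4):
          x = noise
          for i in range(outer_steps):
              t = 1.0 - i / outer_steps        # current noise level (1 = pure noise)
              features = backbone(x, t)        # heavy call, once per transition
              dt = (1.0 / outer_steps) / inner_steps
              for j in range(inner_steps):     # cheap inner flow integration
                  x = x + dt * flow_head(x, features, t - j * dt)
          return x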

Solidot (15)

  1. River deltas worldwide are sinking

    Researchers at Virginia Tech analyzed 2014-2023 radar data from ESA's Sentinel-1 satellite to measure subsidence rates across 40 river deltas worldwide, including the deltas of the Mekong, Mississippi, Amazon, Zambezi, Yangtze, and Nile. Some 500 million people live in delta regions, which host ten megacities of more than ten million residents. The analysis shows that in every one of the 40 deltas, more than a third of the land area is sinking, and in 38 of them more than half is. The Chao Phraya delta, home to Bangkok, Thailand, is the worst affected: it is subsiding at 8 mm per year, twice the current global average rate of sea-level rise, and 94% of the delta is sinking faster than 5 mm per year. The combined effect of subsidence and sea-level rise means the sea level at Bangkok and across the Chao Phraya delta is effectively rising 12.3 mm per year. Alexandria in Egypt and Jakarta and Surabaya in Indonesia also face rapid subsidence. The researchers found that groundwater extraction makes the largest overall contribution to the sinking.

  2. Huawei phone shipments return to first place in 2025

    According to IDC, Huawei's phone shipments returned to first place in 2025 even though they fell 1.9% from 2024, to 46.7 million units; Huawei rose to the top because shipments by vivo, 2024's leader, dropped a steep 6.6%. Apple ranked second, growing 4% to 46.2 million units on strong sales of the iPhone 17 series launched in September 2025. China's overall phone shipments fell 0.6% in 2025 to 284.6 million units, dipping below the prior year for the first time in two years, and IDC forecasts a further decline to 278 million units in 2026.

  3. US bus stops are spaced so densely that they slow buses down

    Buses in US cities such as New York and San Francisco move very slowly, only a little faster than a brisk walk. An analysis found the cause is stops placed too close together; simply removing some stops would speed buses up. The average spacing between stops in US cities is 313 meters, about five stops per mile, and older cities have even shorter spacing: 214 meters in Philadelphia, 223 in Chicago, and 248 in San Francisco. European cities typically space stops 300-450 meters apart. A bus loses time at every stop: passengers boarding and alighting, deceleration and acceleration, kneeling for wheelchair users, missed traffic lights. Roughly a fifth of a bus's operating time goes to stopping and starting, and because labor is the bulk of transit operating costs, slower buses mean higher costs. Some cities in the Americas have begun testing wider spacing: San Francisco raised bus speeds by 4.4%-14% by cutting stops from six per mile to two and a half, while a Vancouver pilot that removed a quarter of stops cut average trip times by five minutes and saves about $500,000 per route per year.

  4. Microsoft closes its employee library and cuts digital subscriptions

    Microsoft lore had it that the books in the employee library were so heavy the building was sinking. Now the physical library is on its way out and digital subscriptions are being cut as Microsoft pivots to an "AI-driven learning experience." The company began notifying publishers of cancellations last November; the Strategic News Service (SNS), a subscription Microsoft had held for 22 years, is no longer among them. Employees say they can no longer access digital publications such as The Information or borrow business books from the library.

  5. Chronic low-level pesticide exposure shortens wild fish lifespans

    Chinese researchers report in Science that long-term exposure to the pesticide chlorpyrifos, even at doses regulatory frameworks consider safe, accelerates physiological aging in wild fish and shortens their lifespans, raising concern about the environmental impact of chronic low-level pesticide pollution. To assess the effects of low-concentration exposure, researchers in Wuhan combined field observations of 24,388 Culter dabryi in Chinese lakes where low concentrations of the common pesticide chlorpyrifos persist with laboratory experiments. Fish from the contaminated lakes showed telomere shortening and a truncated population structure dominated by young individuals, indicating that chronic low-dose chlorpyrifos exposure is associated with accelerated physiological aging and shortened lifespans; the findings were also confirmed in the laboratory.

  6. A zero-click exploit chain against the Pixel 9

    Google's Project Zero security team published three blog posts analyzing a zero-click exploit chain against the Pixel 9. The vulnerabilities were disclosed as early as September 19, 2025, but Google did not patch phones until January 6, 2026. Security researchers note that the AI-driven features smartphones have gained in recent years come with an enlarged zero-click attack surface. One such feature is audio transcription: SMS and RCS audio attachments received by Google Messages are decoded automatically, without any user action, making audio decoders part of the zero-click attack surface. The Pixel 9 exploit chain starts in the Dolby Unified Decoder, which provides support for the AC-3 (Dolby Digital) and EAC-3 (Dolby Digital Plus) audio formats.

  7. Chinese app stores pull the "死了么" app

    "死了么" (roughly "Are You Dead?"), an app that went viral recently, has been removed from Chinese app stores, likely over its name; the developer has spent the past few days soliciting a new Chinese name. On Apple's App Store, for example, the app no longer turns up in searches in the China region, but it remains searchable and downloadable elsewhere, where it sits near the top of the paid charts.

  8. Iran's internet blackout hits eight days, the third longest worldwide

    According to NetBlocks monitoring, Iran's nationwide internet blackout has entered its eighth day, passing 180 hours: the longest shutdown in the country's history and the third longest anywhere in the world. Iran's two previous longest blackouts, in 2019 and 2025, lasted 163 and 160 hours. The world's two longest national blackouts were Sudan's, lasting 35 days in 2021, and Mauritania's, lasting 22 days in July 2024. NetBlocks research director Isik Mater said the Iranian blackout is among the most comprehensive and strictly enforced nationwide shutdowns the company has observed to date, especially in terms of the population affected.

  9. Wikipedia signs data licensing deals with AI companies

    Marking its 25th anniversary, Wikipedia, the largest online encyclopedia, announced licensing agreements that let AI companies including Amazon, Microsoft, Meta, Perplexity, and Mistral AI train AI on its data. The Wikimedia Foundation did not disclose financial terms; it signed a licensing deal with Google back in 2022. Although Wikipedia has felt the impact of generative AI, it remains the ninth most visited website, with more than 65 million articles in 300 languages maintained by 250,000 volunteer editors.

  10. Ctrip under investigation for suspected monopoly conduct

    Following preliminary checks, China's State Administration for Market Regulation has opened an investigation under the PRC Anti-Monopoly Law into Ctrip for suspected abuse of market dominance. Ctrip owns four main brands: Ctrip.com, Qunar, Skyscanner, and Trip.com, along with smaller brands including Lvping (驴评网), HHtravel (鸿鹄逸游), Wing On (永安), and ezTravel (易游). It holds more than half of China's online travel market, making it the country's largest online travel agency and one of the largest in the world. It has drawn plenty of controversy in recent years, most notably over big-data price discrimination: longtime users found Ctrip showed them higher prices than other users, having classified them as "price-insensitive."

  11. Five extra minutes of daily exercise and 30 fewer minutes of sitting may extend life

    According to a study published in The Lancet, five more minutes of daily exercise and half an hour less sitting are associated with longer life. The study of 135,000 people in the UK, US, Norway, and Sweden found that an extra five minutes a day of moderate exercise such as brisk walking was associated with roughly 10% lower mortality, and 30 fewer minutes of daily sitting with roughly 7% lower mortality. The researchers stress that the results should not be read as individual health advice; rather, they show large benefits at the population level.

  12. UK police blame Microsoft Copilot for intelligence on a nonexistent football match

    Craig Guildford, chief constable of West Midlands Police, said Microsoft's Copilot AI assistant generated intelligence about a nonexistent football match between West Ham United and Maccabi Tel Aviv. Guildford had previously denied that police used generative AI to produce the report, blaming the error on "social media scraping." He said he learned on Friday afternoon that the error in fact stemmed from use of Microsoft Copilot.

  13. Influencers and OnlyFans models dominate the US O-1 extraordinary-ability visa

    On the wall of immigration lawyer Michael Wildes's office hangs a huge photograph of Yoko Ono and her late husband, Beatles frontman John Lennon, once clients of Wildes's father, who defended the couple against deportation. Decades later, Wildes came to represent some of the best-known actors and singers of his own generation. These days, though, more and more of those contacting him about visas are social media influencers and OnlyFans models. The US O-1 visa for individuals of extraordinary ability does not define "artist" strictly and instead relies on influence-based criteria, with the result that influencers on YouTube, TikTok, Instagram, and OnlyFans now account for more than half of the visas granted.

  14. 2025 among the hottest years on record

    According to the World Meteorological Organization's (WMO) consolidated analysis of eight datasets, the global mean surface temperature in 2025 was 1.44°C above the 1850-1900 average (with an uncertainty margin of ±0.13°C). Two of the datasets rank 2025 as the second hottest year in the 176-year record; the other six rank it third. In all eight datasets, the past three years (2023-2025) are the warmest three on record, with a combined three-year average 1.48°C (±0.13°C) above the pre-industrial era, and the past 11 years (2015-2025) are the warmest 11.

  15. Coal power generation falls in both China and India for the first time since the 1970s

    Analysis by the Centre for Research on Energy and Clean Air finds that coal-fired power generation in China and India, the two largest coal consumers, has fallen simultaneously for the first time since 1973. China's coal generation declined 1.6% and India's 3%, as the clean energy both countries have been building at scale proved sufficient to meet growing demand. Last year China added more than 300 GW of new solar capacity and more than 100 GW of wind, while India added 35 GW of solar, 6 GW of wind, and 3.5 GW of hydro.