WEEK · 2025-W39

Weekly Digest — 2025-W39

141 unique stories (2025-09-22 to 2025-09-28), aggregated across 4 sources.

Hacker News (42)

  1. Qwen3-Omni: Native Omni AI model for text, image and video (github.com)
  2. AI-generated “workslop” is destroying productivity? (hbr.org)
  3. OpenAI and Nvidia announce partnership to deploy 10GW of Nvidia systems (openai.com)
  4. UK millionaire exodus did not occur, study reveals (taxjustice.net)
  5. A New Internet Business Model? (blog.cloudflare.com)
  6. PlanetScale for Postgres is now GA (planetscale.com)
  7. I'm leaving Ruby Central (gist.github.com)
  8. Find SF parking cops (walzr.com)
  9. Markov chains are the original language models (elijahpotter.dev)
  10. Always Invite Anna (sharif.io)
  11. Libghostty is coming (mitchellh.com)
  12. Shopify, pulling strings at Ruby Central, forces Bundler and RubyGems takeover (joel.drapper.me)

GitHub Trending (30)

  1. Gar-b-age / CookLikeHOC

    🥢 Cooking the way 老乡鸡 🐔 does. The main content was completed in 2024; this is not an official 老乡鸡 repository. The text comes from the company's dish traceability report (《老乡鸡菜品溯源报告》), collated, edited, and organized here. CookLikeHOC.

  2. bevyengine / bevy

    A refreshingly simple data-driven game engine built in Rust

  3. Alibaba-NLP / DeepResearch

    Tongyi Deep Research, the Leading Open-source Deep Research Agent

  4. tldraw / tldraw

    very good whiteboard SDK / infinite canvas SDK

  5. elastic / elasticsearch

    Free and Open Source, Distributed, RESTful Search Engine

  6. LizardByte / Sunshine

    Self-hosted game stream host for Moonlight.

  7. gin-gonic / gin

    Gin is a high-performance HTTP web framework written in Go. It provides a Martini-like API but with significantly better performance—up to 40 times faster—thanks to httprouter. Gin is designed for building REST APIs, web applications, and microservices.

  8. LadybirdBrowser / ladybird

    Truly independent web browser

  9. gofiber / fiber

    ⚡️ Express inspired web framework written in Go

  10. eslint / eslint

    Find and fix problems in your JavaScript code.

  11. fmtlib / fmt

    A modern formatting library

  12. mtdvio / every-programmer-should-know

    A collection of (mostly) technical things every software developer should know about

Hugging Face (30)

  1. RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation

    Large language models excel at function- and file-level code generation, yet generating complete repositories from scratch remains a fundamental challenge. This process demands coherent and reliable planning across proposal- and implementation-level stages, while natural language, due to its ambiguity and verbosity, is ill-suited for faithfully representing complex software structures. To address this, we introduce the Repository Planning Graph (RPG), a persistent representation that unifies proposal- and implementation-level planning by encoding capabilities, file structures, data flows, and functions in one graph. RPG replaces ambiguous natural language with an explicit blueprint, enabling long-horizon planning and scalable repository generation. Building on RPG, we develop ZeroRepo, a graph-driven framework for repository generation from scratch. It operates in three stages: proposal-level planning and implementation-level refinement to construct the graph, followed by graph-guided code generation with test validation. To evaluate this setting, we construct RepoCraft, a benchmark of six real-world projects with 1,052 tasks. On RepoCraft, ZeroRepo produces repositories averaging nearly 36K LOC, roughly 3.9× the strongest baseline (Claude Code) and about 64× other baselines. It attains 81.5% functional coverage and a 69.7% pass rate, exceeding Claude Code by 27.3 and 35.8 percentage points, respectively. Further analysis shows that RPG models complex dependencies, enables progressively more sophisticated planning through near-linear scaling, and enhances LLM understanding of repositories, thereby accelerating agent localization.
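
    The abstract gives no code; purely as an illustration, here is a minimal Python sketch of what a graph "encoding capabilities, file structures, data flows, and functions" might look like as a data structure. All names (RPGNode, RepositoryPlanningGraph) and fields are hypothetical, inferred from the abstract, not ZeroRepo's actual API.

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class RPGNode:
        name: str               # capability or function name
        kind: str               # "capability" | "file" | "function"
        path: str = ""          # file path, attached during implementation-level refinement
        signature: str = ""     # function signature, likewise filled in later

    @dataclass
    class RepositoryPlanningGraph:
        nodes: dict = field(default_factory=dict)
        data_flows: list = field(default_factory=list)  # (producer, consumer) edges

        def add_node(self, node: RPGNode) -> None:
            self.nodes[node.name] = node

        def add_flow(self, src: str, dst: str) -> None:
            self.data_flows.append((src, dst))

    # Proposal-level planning adds capability nodes; refinement attaches
    # files and signatures before graph-guided code generation walks the graph.
    g = RepositoryPlanningGraph()
    g.add_node(RPGNode("tokenize", "capability"))
    g.add_node(RPGNode("tokenize_text", "function", path="src/tokenizer.py",
                       signature="def tokenize_text(s: str) -> list[str]"))
    g.add_flow("tokenize", "tokenize_text")
    ```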

  2. MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

    Unified multimodal Large Language Models (LLMs) that can both understand and generate visual content hold immense potential. However, existing open-source models often suffer from a performance trade-off between these capabilities. We present Manzano, a simple and scalable unified framework that substantially reduces this tension by coupling a hybrid image tokenizer with a well-curated training recipe. A single shared vision encoder feeds two lightweight adapters that produce continuous embeddings for image-to-text understanding and discrete tokens for text-to-image generation within a common semantic space. A unified autoregressive LLM predicts high-level semantics in the form of text and image tokens, with an auxiliary diffusion decoder subsequently translating the image tokens into pixels. The architecture, together with a unified training recipe over understanding and generation data, enables scalable joint learning of both capabilities. Manzano achieves state-of-the-art results among unified models, and is competitive with specialist models, particularly on text-rich evaluation. Our studies show minimal task conflicts and consistent gains from scaling model size, validating our design choice of a hybrid tokenizer.
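
    As a rough sketch of the hybrid-tokenizer idea (one shared vision encoder feeding a continuous adapter for understanding and a codebook-quantized adapter for generation), assuming PyTorch; module sizes and names here are invented for illustration, not Manzano's implementation.

    ```python
    import torch
    import torch.nn as nn

    class HybridVisionTokenizer(nn.Module):
        def __init__(self, feat_dim=768, llm_dim=1024, codebook_size=8192):
            super().__init__()
            # stand-in for the shared vision encoder (a patchifying conv here)
            self.encoder = nn.Sequential(
                nn.Conv2d(3, feat_dim, kernel_size=16, stride=16),
                nn.Flatten(2),                                 # (B, feat_dim, N_patches)
            )
            self.cont_adapter = nn.Linear(feat_dim, llm_dim)   # continuous path
            self.disc_adapter = nn.Linear(feat_dim, llm_dim)   # discrete path
            self.codebook = nn.Embedding(codebook_size, llm_dim)

        def forward(self, images):
            feats = self.encoder(images).transpose(1, 2)       # (B, N, feat_dim)
            continuous = self.cont_adapter(feats)              # embeddings for understanding
            z = self.disc_adapter(feats)
            # nearest-codebook-entry quantization -> discrete tokens for generation
            dists = torch.cdist(z, self.codebook.weight[None].expand(z.size(0), -1, -1))
            token_ids = dists.argmin(dim=-1)
            return continuous, token_ids

    tok = HybridVisionTokenizer()
    cont, ids = tok(torch.randn(2, 3, 224, 224))
    print(cont.shape, ids.shape)   # (2, 196, 1024), (2, 196)
    ```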

  3. Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification

    Generative modeling, representation learning, and classification are three core problems in machine learning (ML), yet their state-of-the-art (SoTA) solutions remain largely disjoint. In this paper, we ask: Can a unified principle address all three? Such unification could simplify ML pipelines and foster greater synergy across tasks. We introduce Latent Zoning Network (LZN) as a step toward this goal. At its core, LZN creates a shared Gaussian latent space that encodes information across all tasks. Each data type (e.g., images, text, labels) is equipped with an encoder that maps samples to disjoint latent zones, and a decoder that maps latents back to data. ML tasks are expressed as compositions of these encoders and decoders: for example, label-conditional image generation uses a label encoder and image decoder; image embedding uses an image encoder; classification uses an image encoder and label decoder. We demonstrate the promise of LZN in three increasingly complex scenarios: (1) LZN can enhance existing models (image generation): When combined with the SoTA Rectified Flow model, LZN improves FID on CIFAR10 from 2.76 to 2.59, without modifying the training objective. (2) LZN can solve tasks independently (representation learning): LZN can implement unsupervised representation learning without auxiliary loss functions, outperforming the seminal MoCo and SimCLR methods by 9.3% and 0.2%, respectively, on downstream linear classification on ImageNet. (3) LZN can solve multiple tasks simultaneously (joint generation and classification): With image and label encoders/decoders, LZN performs both tasks jointly by design, improving FID and achieving SoTA classification accuracy on CIFAR10. The code and trained models are available at https://github.com/microsoft/latent-zoning-networks. The project website is at https://zinanlin.me/blogs/latent_zoning_networks.html.
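
    The composition idea is concrete enough to show with toy modules; the sketch below, assuming PyTorch, only illustrates how tasks become compositions of per-modality encoders and decoders over a shared latent space. Dimensions and module choices are arbitrary, not LZN's architecture.

    ```python
    import torch
    import torch.nn as nn

    latent_dim = 64
    image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, latent_dim))
    image_decoder = nn.Linear(latent_dim, 32 * 32 * 3)
    label_encoder = nn.Embedding(10, latent_dim)     # 10 class labels
    label_decoder = nn.Linear(latent_dim, 10)

    image = torch.randn(1, 3, 32, 32)
    label = torch.tensor([3])

    # classification = image encoder composed with label decoder
    logits = label_decoder(image_encoder(image))
    # label-conditional generation = label encoder composed with image decoder
    generated = image_decoder(label_encoder(label)).view(1, 3, 32, 32)
    # representation learning = the image encoder alone
    embedding = image_encoder(image)
    ```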

  4. BaseReward: A Strong Baseline for Multimodal Reward Model

    The rapid advancement of Multimodal Large Language Models (MLLMs) has made aligning them with human preferences a critical challenge. Reward Models (RMs) are a core technology for achieving this goal, but a systematic guide for building state-of-the-art Multimodal Reward Models (MRMs) is currently lacking in both academia and industry. Through exhaustive experimental analysis, this paper aims to provide a clear ``recipe'' for constructing high-performance MRMs. We systematically investigate every crucial component in the MRM development pipeline, including reward modeling paradigms (e.g., Naive-RM, Critic-based RM, and Generative RM), reward head architecture, training strategies, data curation (covering over ten multimodal and text-only preference datasets), backbone model and model scale, and ensemble methods. Based on these experimental insights, we introduce BaseReward, a powerful and efficient baseline for multimodal reward modeling. BaseReward adopts a simple yet effective architecture, built upon a {Qwen2.5-VL} backbone, featuring an optimized two-layer reward head, and is trained on a carefully curated mixture of high-quality multimodal and text-only preference data. Our results show that BaseReward establishes a new SOTA on major benchmarks such as MM-RLHF-Reward Bench, VL-Reward Bench, and Multimodal Reward Bench, outperforming previous models. Furthermore, to validate its practical utility beyond static benchmarks, we integrate BaseReward into a real-world reinforcement learning pipeline, successfully enhancing an MLLM's performance across various perception, reasoning, and conversational tasks. This work not only delivers a top-tier MRM but, more importantly, provides the community with a clear, empirically-backed guide for developing robust reward models for the next generation of MLLMs.
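
    For concreteness, here is a guess at what a "two-layer reward head" on top of a decoder backbone could look like in PyTorch; the hidden width, activation, and last-token pooling are assumptions, since the abstract does not specify them.

    ```python
    import torch
    import torch.nn as nn

    class RewardHead(nn.Module):
        def __init__(self, hidden_dim=3584, head_dim=1024):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(hidden_dim, head_dim),
                nn.GELU(),
                nn.Linear(head_dim, 1),   # scalar preference score
            )

        def forward(self, last_hidden_states):
            # score from the final token's hidden state, as is common for
            # reward models built on decoder-only LLM backbones
            return self.mlp(last_hidden_states[:, -1, :]).squeeze(-1)

    head = RewardHead()
    scores = head(torch.randn(4, 128, 3584))   # (batch, seq, hidden) -> (batch,)
    ```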

  5. SPATIALGEN: Layout-guided 3D Indoor Scene Generation

    Creating high-fidelity 3D models of indoor environments is essential for applications in design, virtual reality, and robotics. However, manual 3D modeling remains time-consuming and labor-intensive. While recent advances in generative AI have enabled automated scene synthesis, existing methods often face challenges in balancing visual quality, diversity, semantic consistency, and user control. A major bottleneck is the lack of a large-scale, high-quality dataset tailored to this task. To address this gap, we introduce a comprehensive synthetic dataset, featuring 12,328 structured annotated scenes with 57,440 rooms, and 4.7M photorealistic 2D renderings. Leveraging this dataset, we present SpatialGen, a novel multi-view multi-modal diffusion model that generates realistic and semantically consistent 3D indoor scenes. Given a 3D layout and a reference image (derived from a text prompt), our model synthesizes appearance (color image), geometry (scene coordinate map), and semantic (semantic segmentation map) from arbitrary viewpoints, while preserving spatial consistency across modalities. SpatialGen consistently generates superior results to previous methods in our experiments. We are open-sourcing our data and models to empower the community and advance the field of indoor scene understanding and generation.

  6. Lynx: Towards High-Fidelity Personalized Video Generation

    We present Lynx, a high-fidelity model for personalized video synthesis from a single input image. Built on an open-source Diffusion Transformer (DiT) foundation model, Lynx introduces two lightweight adapters to ensure identity fidelity. The ID-adapter employs a Perceiver Resampler to convert ArcFace-derived facial embeddings into compact identity tokens for conditioning, while the Ref-adapter integrates dense VAE features from a frozen reference pathway, injecting fine-grained details across all transformer layers through cross-attention. These modules collectively enable robust identity preservation while maintaining temporal coherence and visual realism. Through evaluation on a curated benchmark of 40 subjects and 20 unbiased prompts, which yielded 800 test cases, Lynx has demonstrated superior face resemblance, competitive prompt following, and strong video quality, thereby advancing the state of personalized video generation.
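
    A rough PyTorch sketch of the two-adapter pattern described above: a Perceiver-style resampler turning one face embedding into a few identity tokens, and a cross-attention adapter injecting reference features into hidden states. Every dimension and module here is invented for illustration, not Lynx's code.

    ```python
    import torch
    import torch.nn as nn

    class IDAdapter(nn.Module):
        """Perceiver-style resampler: learned queries attend to the face embedding."""
        def __init__(self, face_dim=512, dim=1024, n_tokens=4):
            super().__init__()
            self.queries = nn.Parameter(torch.randn(n_tokens, dim))
            self.proj = nn.Linear(face_dim, dim)
            self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

        def forward(self, face_emb):                    # (B, face_dim), e.g. ArcFace
            kv = self.proj(face_emb).unsqueeze(1)       # (B, 1, dim)
            q = self.queries.expand(face_emb.size(0), -1, -1)
            tokens, _ = self.attn(q, kv, kv)            # (B, n_tokens, dim)
            return tokens

    class RefAdapter(nn.Module):
        """Cross-attention from transformer hidden states to dense reference features."""
        def __init__(self, dim=1024):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

        def forward(self, hidden, ref_feats):           # (B, N, dim), (B, M, dim)
            out, _ = self.attn(hidden, ref_feats, ref_feats)
            return hidden + out                         # residual injection

    id_tokens = IDAdapter()(torch.randn(2, 512))
    mixed = RefAdapter()(torch.randn(2, 77, 1024), torch.randn(2, 256, 1024))
    ```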

  7. LIMI: Less is More for Agency

    We define Agency as the emergent capacity of AI systems to function as autonomous agents actively discovering problems, formulating hypotheses, and executing solutions through self-directed engagement with environments and tools. This fundamental capability marks the dawn of the Age of AI Agency, driven by a critical industry shift: the urgent need for AI systems that don't just think, but work. While current AI excels at reasoning and generating responses, industries demand autonomous agents that can execute tasks, operate tools, and drive real-world outcomes. As agentic intelligence becomes the defining characteristic separating cognitive systems from productive workers, efficiently cultivating machine autonomy becomes paramount. Current approaches assume that more data yields better agency, following traditional scaling laws from language modeling. We fundamentally challenge this paradigm. LIMI (Less Is More for Intelligent Agency) demonstrates that agency follows radically different development principles. Through strategic focus on collaborative software development and scientific research workflows, we show that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior. Using only 78 carefully designed training samples, LIMI achieves 73.5% on comprehensive agency benchmarks, dramatically outperforming state-of-the-art models: Kimi-K2-Instruct (24.1%), DeepSeek-V3.1 (11.9%), Qwen3-235B-A22B-Instruct (27.5%), and GLM-4.5 (45.1%). Most strikingly, LIMI demonstrates 53.7% improvement over models trained on 10,000 samples, achieving superior agentic intelligence with 128 times fewer samples. Our findings establish the Agency Efficiency Principle: machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.

  8. OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models

    Recent advances in video insertion based on diffusion models are impressive. However, existing methods rely on complex control signals but struggle with subject consistency, limiting their practical applicability. In this paper, we focus on the task of Mask-free Video Insertion and aim to resolve three key challenges: data scarcity, subject-scene equilibrium, and insertion harmonization. To address the data scarcity, we propose a new data pipeline InsertPipe, constructing diverse cross-pair data automatically. Building upon our data pipeline, we develop OmniInsert, a novel unified framework for mask-free video insertion from both single and multiple subject references. Specifically, to maintain subject-scene equilibrium, we introduce a simple yet effective Condition-Specific Feature Injection mechanism to distinctly inject multi-source conditions and propose a novel Progressive Training strategy that enables the model to balance feature injection from subjects and source video. Meanwhile, we design the Subject-Focused Loss to improve the detailed appearance of the subjects. To further enhance insertion harmonization, we propose an Insertive Preference Optimization methodology to optimize the model by simulating human preferences, and incorporate a Context-Aware Rephraser module during inference to seamlessly integrate the subject into the original scenes. To address the lack of a benchmark for the field, we introduce InsertBench, a comprehensive benchmark comprising diverse scenes with meticulously selected subjects. Evaluation on InsertBench indicates OmniInsert outperforms state-of-the-art closed-source commercial solutions. The code will be released.

  9. Qwen3-Omni Technical Report

    We present Qwen3-Omni, a single multimodal model that, for the first time, maintains state-of-the-art performance across text, image, audio, and video without any degradation relative to single-modal counterparts. Qwen3-Omni matches the performance of same-sized single-modal models within the Qwen series and excels particularly on audio tasks. Across 36 audio and audio-visual benchmarks, Qwen3-Omni achieves open-source SOTA on 32 benchmarks and overall SOTA on 22, outperforming strong closed-source models such as Gemini-2.5-Pro, Seed-ASR, and GPT-4o-Transcribe. Qwen3-Omni adopts a Thinker-Talker MoE architecture that unifies perception and generation across text, images, audio, and video, yielding fluent text and natural real-time speech. It supports text interaction in 119 languages, speech understanding in 19 languages, and speech generation in 10 languages. To reduce first-packet latency in streaming synthesis, Talker autoregressively predicts discrete speech codecs using a multi-codebook scheme. Leveraging the representational capacity of these codebooks, we replace computationally intensive block-wise diffusion with a lightweight causal ConvNet, enabling streaming from the first codec frame. In cold-start settings, Qwen3-Omni achieves a theoretical end-to-end first-packet latency of 234 ms. To further strengthen multimodal reasoning, we introduce a Thinking model that explicitly reasons over inputs from any modality. Since the research community currently lacks a general-purpose audio captioning model, we fine-tuned Qwen3-Omni-30B-A3B to obtain Qwen3-Omni-30B-A3B-Captioner, which produces detailed, low-hallucination captions for arbitrary audio inputs. Qwen3-Omni-30B-A3B, Qwen3-Omni-30B-A3B-Thinking, and Qwen3-Omni-30B-A3B-Captioner are publicly released under the Apache 2.0 license.
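
    The "lightweight causal ConvNet" that replaces block-wise diffusion is only named in the abstract; as an illustration of the general technique (causal 1-D convolutions, so frame t never depends on future frames, which is what permits streaming from the first codec frame), here is a hedged sketch in PyTorch. Layer sizes are arbitrary and this is not Qwen3-Omni's decoder.

    ```python
    import torch
    import torch.nn as nn

    class CausalConv1d(nn.Conv1d):
        def forward(self, x):
            # left-pad so each output frame sees only current and past inputs
            pad = (self.kernel_size[0] - 1) * self.dilation[0]
            return super().forward(nn.functional.pad(x, (pad, 0)))

    decoder = nn.Sequential(
        CausalConv1d(256, 256, kernel_size=3, dilation=1), nn.GELU(),
        CausalConv1d(256, 256, kernel_size=3, dilation=2), nn.GELU(),
        CausalConv1d(256, 1, kernel_size=3, dilation=4),   # 1-channel waveform out
    )
    frames = torch.randn(1, 256, 10)   # 10 codec frames of 256-dim features
    audio = decoder(frames)            # frame t is unaffected by frames > t
    ```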

  10. OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System

    Despite the growing interest in replicating the scaled success of large language models (LLMs) in industrial search and recommender systems, most existing industrial efforts remain limited to transplanting Transformer architectures, which bring only incremental improvements over strong Deep Learning Recommendation Models (DLRMs). From a first principle perspective, the breakthroughs of LLMs stem not only from their architectures but also from two complementary mechanisms: context engineering, which enriches raw input queries with contextual cues to better elicit model capabilities, and multi-step reasoning, which iteratively refines model outputs through intermediate reasoning paths. However, these two mechanisms and their potential to unlock substantial improvements remain largely underexplored in industrial ranking systems. In this paper, we propose OnePiece, a unified framework that seamlessly integrates LLM-style context engineering and reasoning into both retrieval and ranking models of industrial cascaded pipelines. OnePiece is built on a pure Transformer backbone and further introduces three key innovations: (1) structured context engineering, which augments interaction history with preference and scenario signals and unifies them into a structured tokenized input sequence for both retrieval and ranking; (2) block-wise latent reasoning, which equips the model with multi-step refinement of representations and scales reasoning bandwidth via block size; (3) progressive multi-task training, which leverages user feedback chains to effectively supervise reasoning steps during training. OnePiece has been deployed in the main personalized search scenario of Shopee and achieves consistent online gains across different key business metrics, including over +2% GMV/UU and a +2.90% increase in advertising revenue.
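
    To make "structured context engineering" concrete, here is a toy serialization in Python of the kind the abstract describes (interaction history plus preference and scenario signals unified into one tokenized sequence); the segment markers and token IDs are entirely hypothetical, not Shopee's production format.

    ```python
    def build_context(history_ids, preference_ids, scenario_ids, query_ids):
        # assumed special tokens marking each structured segment
        SEP = {"history": -1, "preference": -2, "scenario": -3, "query": -4}
        seq = []
        for name, ids in [("history", history_ids), ("preference", preference_ids),
                          ("scenario", scenario_ids), ("query", query_ids)]:
            seq.append(SEP[name])
            seq.extend(ids)
        return seq   # one sequence consumed by both retrieval and ranking models

    tokens = build_context(history_ids=[101, 102, 103], preference_ids=[201],
                           scenario_ids=[301, 302], query_ids=[401, 402])
    ```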

  11. TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

    This paper introduces TempSamp-R1, a new reinforcement fine-tuning framework designed to improve the effectiveness of adapting multimodal large language models (MLLMs) to video temporal grounding tasks. We reveal that existing reinforcement learning methods, such as Group Relative Policy Optimization (GRPO), rely on on-policy sampling for policy updates. However, in tasks with large temporal search spaces, this strategy becomes both inefficient and limited in performance, as it often fails to identify temporally accurate solutions. To address this limitation, TempSamp-R1 leverages ground-truth annotations as off-policy supervision to provide temporally precise guidance, effectively compensating for the sparsity and misalignment in on-policy solutions. To further stabilize training and reduce variance in reward-based updates, TempSamp-R1 provides a non-linear soft advantage computation method that dynamically reshapes the reward feedback via an asymmetric transformation. By employing a hybrid Chain-of-Thought (CoT) training paradigm, TempSamp-R1 optimizes a single unified model to support both CoT and non-CoT inference modes, enabling efficient handling of queries with varying reasoning complexity. Experimental results demonstrate that TempSamp-R1 outperforms GRPO-based baselines, establishing new state-of-the-art performance on benchmark datasets: Charades-STA (R1@0.7: 52.9%, +2.7%), ActivityNet Captions (R1@0.5: 56.0%, +5.3%), and QVHighlights (mAP: 30.0%, +3.0%). Moreover, TempSamp-R1 shows robust few-shot generalization capabilities under limited data. Code: https://github.com/HVision-NKU/TempSamp-R1
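
    The abstract names a "non-linear soft advantage computation" with an asymmetric transformation but gives no formula; the snippet below shows one plausible shape for such a reshaping (tanh squashing with different temperatures for positive and negative advantages), purely for illustration and not TempSamp-R1's actual function.

    ```python
    import numpy as np

    def soft_advantage(rewards, pos_temp=1.0, neg_temp=0.5):
        adv = rewards - rewards.mean()   # group-relative baseline, as in GRPO
        # asymmetric squashing: damp negative advantages harder than positive ones
        return np.where(adv >= 0,
                        np.tanh(adv / pos_temp) * pos_temp,
                        np.tanh(adv / neg_temp) * neg_temp)

    print(soft_advantage(np.array([0.2, 0.9, 0.1, 0.5])))
    ```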

  12. GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning

    Recent advancements in reinforcement learning (RL) have enhanced the reasoning abilities of large language models (LLMs), yet the impact on multimodal LLMs (MLLMs) is limited. Particularly in vision-intensive tasks like geometric reasoning, MLLMs hallucinate frequently, leading to inaccurate reasoning. We attribute this to the perceptual bottleneck in MLLMs, which caps the benefits of reasoning training. To quantify this, we design a Geo-Perception Question-Answering (GeoPQA) benchmark, targeting basic geometric concepts and spatial relationships. Experiments on GeoPQA reveal significant shortcomings of MLLMs in visual perception, which constrain RL reward signals for effective training. To address this bottleneck, we propose a two-stage RL training framework by first enhancing the visual perception of geometric structures, then fostering reasoning capabilities. Applied to Qwen2.5-VL-3B-Instruct, our two-stage training improves geometric reasoning by 9.7% and geometric problem solving by 9.1%, compared to the direct reasoning training approach. Our method also generalizes to other vision-intensive domains like figure understanding, highlighting the importance of perceptual grounding in effective MLLM reasoning.

Solidot (39)

  1. BlockBlasters game patch found to contain malware

    Valve has removed the 2D platformer BlockBlasters from the Steam store after a recently released patch was found to contain malware. BlockBlasters launched on July 31; on August 30 it received patch Build 19799326, in which the file game2.bat exhibited malicious behavior: it collects the user's IP address and location, detects installed antivirus software, harvests login credentials, uploads the collected data, and executes a VBS launcher script. It ultimately installs a backdoor and an infostealer that targets Google Chrome, Brave Browser, and Microsoft Edge, going primarily after cryptocurrency. Several hundred players may have been affected by the attack.

  2. Chinese navy completes first electromagnetic catapult launches of carrier aircraft

    Xinhua reports that China's navy has announced that three types of carrier-based aircraft, the J-15T, J-35, and KJ-600, recently completed their first catapult launches and arrested landings aboard the carrier Fujian. This is the first time China has carried out electromagnetic catapult launches and arrested recoveries of multiple advanced carrier aircraft types on a catapult-equipped carrier. The US completed the first land-based electromagnetic launches in 2010, and the carrier USS Gerald R. Ford received the first shipboard electromagnetic catapult system in 2013, though according to the report it has been dogged by assorted problems in carrier-based electromagnetic launch testing.

  3. UK banks still run code written in the 1960s

    UK banks still run code written in the 1960s, and few people remain who understand it. According to a survey of 200 UK banks, 16% depend on software from the 1960s and nearly 40% still maintain code from the 1970s. Half of the banks admitted that the software they rely on is understood by only one or two employees at or near retirement age, while 31.5% said they depend on one or two employees below retirement age who know the legacy systems. 38 banks said they still use code designed to run on physical media such as punched cards, and 15% run code written for room-sized mainframes. Banks are sprawling institutions and are unlikely to rebuild their infrastructure for every wave of technological change; one respondent said their bank's core system was built in the 1970s and still runs COBOL.

  4. Seattle struggles with shrinking tech employment

    When Five Stones Coffee, near Microsoft's Redmond headquarters, advertised for baristas a few months ago, it received résumés listing stints at Microsoft and other tech companies; applicants often held master's degrees, had backgrounds in graphic design or marketing, and in some cases had held senior positions, even though the job pays the local minimum wage of $16.66 an hour. Five Stones passed over these highly credentialed applicants in favor of traditional entry-level baristas, such as candidates with a high-school education. According to layoff tracker Layoffs.fyi, Seattle's two largest tech companies, Microsoft and Amazon, have cut more than 46,000 jobs since 2023, accounting for 85% of the area's tech layoffs. The mass layoffs have rippled into other parts of Seattle's economy: restaurant and retail spending in the commercial and shopping districts around the Amazon and Microsoft campuses has fallen, with transactions in popular areas down 7%. In the first half of 2025, 450 Seattle restaurants closed, roughly 16% of the city's total. Uber driver Juan Prado earned six figures in 2021 and often drove passengers to job interviews, but that kind of demand is far scarcer this year. Local commercial real-estate vacancy rates have also hit record highs.

  5. Astronomers discover a quasi-moon near Earth

    Astronomers have discovered a quasi-moon near Earth. The object, designated 2025 PN7, is a near-Earth asteroid that takes about a year to orbit the Sun and may have been lingering near Earth for roughly 60 years, escaping notice until telescopes caught it during a close pass this summer. Quasi-moons of this kind are hard to detect because they are small and faint; the Pan-STARRS observatory in Hawaii spotted 2025 PN7 on August 29, and archival data show it has been traveling in an Earth-like orbit for decades. Astronomers are still working to pin down its size; estimates put its diameter at 19 to 30 meters, which may make it the smallest known quasi-moon accompanying Earth.

  6. Microsoft Entra ID flaws could have been catastrophic

    Over the past decade, enterprises worldwide have moved their digital infrastructure from self-hosted servers to the cloud, benefiting from the security features that providers such as Microsoft offer. But when the cloud provider itself has a problem, the consequences can be catastrophic. Security researcher Dirk-jan Mollema found two vulnerabilities in Entra ID, the identity and access management platform for Microsoft's Azure cloud, that could be used to obtain administrator privileges, giving him access to every user account stored in Entra ID. Mollema disclosed the flaws to Microsoft on July 14, and Microsoft shipped a patch on July 17. Microsoft later confirmed to Mollema that the issue was fixed on July 23, implemented additional mitigations in August, and published a CVE for the vulnerability on September 4.

  7. Multiple Chinese regions push mass collection of blood samples from male residents

    The public security bureau of Xilinhot, Inner Mongolia, has published a notice on collecting blood samples from male residents citywide and entering them into a local DNA database, drawing attention online. Before Xilinhot, many localities had already run mass collection drives for male residents' blood samples as part of building the "Y database", a Y-chromosome family-lineage screening system credited in solving cases such as the Baiyin serial murders and the killing of a Nanjing Medical University student. According to the notice, in order to consolidate basic police work, enrich the city's resident information database, strengthen management of residents' personal information, improve the capacity to prevent and respond to major risks, and implement related measures precisely, and following unified arrangements from higher authorities, police stations across Xilinhot will collect blood samples from male residents. Collection starts September 5, 2025; the subjects are male residents within Xilinhot's jurisdiction, and collection takes place at the police station of each resident's household registration. The notice adds that the collection serves to complete citizens' identity records, is directly tied to handling documents such as ID cards and passports, and plays a major role in preventing the elderly and children from going missing and in confirming personal identities. It asks male residents to support and cooperate by bringing valid identity documents (ID card, household register, etc.) to the designated collection points to register and give samples; says the process strictly follows the relevant norms and that personal information and biological samples will be kept strictly confidential in accordance with the law; describes the work as benefiting both the present generation and those to come; and asks all residents for their understanding and support.

  8. Very low BMI may carry a higher mortality risk

    A study suggests there is scientific support for being "obese but healthy". Research based on data from tens of thousands of Danes found that over a 5-year follow-up, people whose body mass index (BMI) put them in the overweight range, and even some in the obese range, had no higher risk of death than people at the upper end of the normal range (22.5-25). The researchers examined the relationship between BMI and mortality in 85,761 people (81.4% women, median baseline age 66.4). During follow-up, 7,555 (8%) died. Underweight people had nearly three times the mortality risk of the reference group near the top of the healthy range (22.5-25), while severely obese people (BMI ≥ 40) had 2.1 times the risk. BMI below 35 was not associated with higher mortality, and even the 35-40 range was linked to only a slightly elevated risk.

  9. TikTok's algorithm to be retrained in the US

    The White House has disclosed details of ByteDance's divestiture of TikTok's US business, which President Trump is expected to approve later this week. TikTok's US operations will be handed to a consortium centered on Oracle and Silver Lake, which will run the business; the core recommendation algorithm, however, will still be licensed from China rather than fully severed. Americans will hold a majority of the joint venture's board seats. The US government does not plan to place personnel on the board, take a "golden share" carrying veto power over major decisions, or inject capital. ByteDance's stake in the joint venture will be kept below 20%, because under the relevant rules a US company in which Chinese capital holds more than 20% is deemed to be "under Chinese management". More companies and investors are expected to join the venture, though "the capital structure has not been finalized". Oracle will run a copy of the algorithm on US user data within the United States and will be responsible for security measures; the copy Oracle receives includes the source code and will be folded into Oracle-managed systems so the company can verify it.

  10. Google TV gains the Gemini AI assistant

    Google has begun rolling out its Gemini AI assistant to Google TV devices. Users will be able to ask Gemini for TV recommendations, episode recaps, and reviews, and even get help with tasks such as homework tutoring, vacation planning, or learning new skills. Gemini will arrive first on TCL's QM9K series smart TVs, followed later by the Google TV Streamer, Walmart onn 4K Pro, the 2025 Hisense U7, U8, and UX models, and the 2025 TCL QM7K, QM8K, and X11K series.

  11. Windows 11 adds support for video wallpapers

    Microsoft is adding the ability to set a video as the desktop wallpaper in Windows 11. The latest Windows 11 preview build includes the feature, letting users set an MP4, MOV, AVI, WMV, M4V, or MKV file as wallpaper; the video plays whenever the desktop is in view. Video wallpapers are hardly new to Windows: Windows Vista Ultimate offered them through its DreamScene feature, many Linux distributions support them, and macOS supports moving backgrounds as lock-screen wallpapers.

  12. Nvidia to invest $100 billion in OpenAI

    The largest supplier of AI chips has announced a partnership with the AI industry's most highly valued company to invest in data centers for training AI. Nvidia says it will invest $100 billion in OpenAI, whose valuation has reached $500 billion. Nvidia does not have that much cash on hand, so its commitment reads more like a letter of intent, one that feeds the AI bubble.