OrangeBot.AI Digest — 2025-11-20

56 headlines across 4 sources, aggregated for the day.

Hacker News (15)

  1. CBP is monitoring US drivers and detaining those with suspicious travel patterns (apnews.com)
  2. Microsoft makes Zork open-source (opensource.microsoft.com)
  3. Free interactive tool that shows you how PCIe lanes work on motherboards (mobomaps.com)
  4. Okta's nextjs-auth0 troubles (joshua.hu)
  5. Show HN: F32 – An Extremely Small ESP32 Board (github.com)
  6. Android and iPhone users can now share files, starting with the Pixel 10 (blog.google)
  7. The Banished Bottom of the Housing Market (www.ryanpuzycki.com)
  8. Nano Banana Pro (blog.google)
  9. 210 IQ Is Not Enough (taylor.town)
  10. Firefox 147 Will Support the XDG Base Directory Specification (www.phoronix.com)
  11. Red Alert 2 in web browser (chronodivide.com)
  12. Adversarial poetry as a universal single-turn jailbreak mechanism in LLMs (arxiv.org)
  13. 'Calvin and Hobbes' at 40 (www.npr.org)
  14. Interactive World History Atlas Since 3000 BC (geacron.com)
  15. CUDA Ontology (jamesakl.com)

GitHub Trending (15)

  1. sansan0 / TrendRadar

    🎯 Say goodbye to information overload: AI helps you make sense of trending news, with simple public-opinion monitoring and analysis. Multi-platform trend aggregation plus MCP-based AI analysis tools. Monitors 35 platforms (Douyin, Zhihu, Bilibili, Wallstreetcn, Cailian Press, and more), with smart filtering, automatic push notifications, and conversational AI analysis (mine news in natural language: trend tracking, sentiment analysis, similarity search, and 13 tools in all). Push via WeCom, personal WeChat, Feishu, DingTalk, Telegram, email, or ntfy. 30-second web deployment, phone notifications in 1 minute, no programming required. Docker deployment supported. ⭐ Make the algorithm work for you; use AI to understand what's trending.

  2. google / adk-go

    An open-source, code-first Go toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

  3. TapXWorld / ChinaTextbook

    PDF textbooks for all levels: primary school, middle school, high school, and university.

  4. yeongpin / cursor-free-vip

    [Support 0.49.x] (Reset Cursor AI MachineID & Bypass Higher Token Limit) Cursor AI: automatically resets the machine ID and unlocks Pro features for free, bypassing: "You've reached your trial request limit. / Too many free trial accounts used on this machine. Please upgrade to pro. We have this limit in place to prevent abuse. Please let us know if you believe this is a mistake."

  5. nvm-sh / nvm

    Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions

  6. traefik / traefik

    The Cloud Native Application Proxy

  7. HKUDS / LightRAG

    [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

  8. bobeff / open-source-games

    A list of open source games.

  9. volcengine / verl

    verl: Volcano Engine Reinforcement Learning for LLMs

  10. GibsonAI / Memori

    Open-Source Memory Engine for LLMs, AI Agents & Multi-Agent Systems

  11. yangshun / tech-interview-handbook

    Curated coding interview preparation materials for busy software engineers

  12. microsoft / call-center-ai

    Send a phone call from an AI agent, in an API call. Or, directly call the bot from the configured phone number!

  13. MustardChef / WSABuilds

    Run Windows Subsystem For Android on your Windows 10 and Windows 11 PC using prebuilt binaries with Google Play Store (MindTheGapps) and/or Magisk or KernelSU (root solutions) built in.

  14. playcanvas / engine

    Powerful web graphics runtime built on WebGL, WebGPU, WebXR and glTF

  15. iptv-org / iptv

    Collection of publicly available IPTV channels from all over the world

Hugging Face (11)

  1. Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

    This report introduces Kandinsky 5.0, a family of state-of-the-art foundation models for high-resolution image and 10-second video synthesis. The framework comprises three core model line-ups: Kandinsky 5.0 Image Lite, a line-up of 6B-parameter image generation models; Kandinsky 5.0 Video Lite, a fast and lightweight line-up of 2B-parameter text-to-video and image-to-video models; and Kandinsky 5.0 Video Pro, 19B-parameter models that achieve superior video generation quality. We provide a comprehensive review of the data curation lifecycle - including collection, processing, filtering and clustering - for the multi-stage training pipeline that involves extensive pre-training and incorporates quality-enhancement techniques such as self-supervised fine-tuning (SFT) and reinforcement learning (RL)-based post-training. We also present novel architectural, training, and inference optimizations that enable Kandinsky 5.0 to achieve high generation speeds and state-of-the-art performance across various tasks, as demonstrated by human evaluation. As a large-scale, publicly available generative framework, Kandinsky 5.0 leverages the full potential of its pre-training and subsequent stages to be adapted for a wide range of generative applications. We hope that this report, together with the release of our open-source code and training checkpoints, will substantially advance the development and accessibility of high-quality generative models for the research community.

  2. Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks

    Video Models have achieved remarkable success in high-fidelity video generation with coherent motion dynamics. Analogous to the development from text generation to text-based reasoning in language modeling, the development of video models motivates us to ask: Can video models reason via video generation? Compared with the discrete text corpus, video grounds reasoning in explicit spatial layouts and temporal continuity, which serves as an ideal substrate for spatial reasoning. In this work, we explore the reasoning via video paradigm and introduce VR-Bench -- a comprehensive benchmark designed to systematically evaluate video models' reasoning capabilities. Grounded in maze-solving tasks that inherently require spatial planning and multi-step reasoning, VR-Bench contains 7,920 procedurally generated videos across five maze types and diverse visual styles. Our empirical analysis demonstrates that SFT can efficiently elicit the reasoning ability of video models. Video models exhibit stronger spatial perception during reasoning, outperforming leading VLMs and generalizing well across diverse scenarios, tasks, and levels of complexity. We further discover a test-time scaling effect, where diverse sampling during inference improves reasoning reliability by 10-20% (a toy best-of-N sketch follows below). These findings highlight the unique potential and scalability of reasoning via video for spatial reasoning tasks.
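
    A minimal sketch of the test-time scaling idea described above: sample several stochastic rollouts per maze and keep any that a verifier accepts. The maze, the random-walk "model", and the verifier are toy stand-ins, not VR-Bench's actual pipeline or API.

    import random

    # Toy maze: 0 = open cell, 1 = wall; start top-left, goal bottom-right.
    MAZE = [
        [0, 0, 1],
        [1, 0, 0],
        [1, 1, 0],
    ]
    START, GOAL = (0, 0), (2, 2)
    MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

    def sample_path(rng, length=12):
        # Stand-in for one stochastic rollout from a video model.
        return [rng.choice("UDLR") for _ in range(length)]

    def reaches_goal(path):
        # Verifier: replay the moves, ignoring steps into walls or out of bounds.
        r, c = START
        for m in path:
            dr, dc = MOVES[m]
            nr, nc = r + dr, c + dc
            if 0 <= nr < 3 and 0 <= nc < 3 and MAZE[nr][nc] == 0:
                r, c = nr, nc
            if (r, c) == GOAL:
                return True
        return False

    # More diverse samples per maze -> higher chance at least one verifies.
    rng = random.Random(0)
    for n in (1, 8, 64):
        wins = sum(any(reaches_goal(sample_path(rng)) for _ in range(n))
                   for _ in range(200))
        print(f"best-of-{n}: solved {wins}/200 runs")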

  3. What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity

    AI research agents offer the promise to accelerate scientific progress by automating the design, implementation, and training of machine learning models. However, the field is still in its infancy, and the key factors driving the success or failure of agent trajectories are not fully understood. We examine the role that ideation diversity plays in agent performance. First, we analyse agent trajectories on MLE-bench, a well-known benchmark to evaluate AI research agents, across different models and agent scaffolds. Our analysis reveals that different models and agent scaffolds yield varying degrees of ideation diversity, and that higher-performing agents tend to have increased ideation diversity. Further, we run a controlled experiment where we modify the degree of ideation diversity, demonstrating that higher ideation diversity results in stronger performance. Finally, we strengthen our results by examining additional evaluation metrics beyond the standard medal-based scoring of MLE-bench, showing that our findings still hold across other agent performance metrics.

  4. VisPlay: Self-Evolving Vision-Language Models from Images

    Reinforcement learning (RL) provides a principled framework for improving Vision-Language Models (VLMs) on complex reasoning tasks. However, existing RL approaches often rely on human-annotated labels or task-specific heuristics to define verifiable rewards, both of which are costly and difficult to scale. We introduce VisPlay, a self-evolving RL framework that enables VLMs to autonomously improve their reasoning abilities using large amounts of unlabeled image data. Starting from a single base VLM, VisPlay assigns the model two interacting roles: an Image-Conditioned Questioner that formulates challenging yet answerable visual questions, and a Multimodal Reasoner that generates silver responses. These roles are jointly trained with Group Relative Policy Optimization (GRPO), which incorporates diversity and difficulty rewards to balance the complexity of generated questions with the quality of the silver answers (a minimal GRPO sketch follows below). VisPlay scales efficiently across two model families. When trained on Qwen2.5-VL and MiMo-VL, VisPlay achieves consistent improvements in visual reasoning, compositional generalization, and hallucination reduction across eight benchmarks, including MM-Vet and MMMU, demonstrating a scalable path toward self-evolving multimodal intelligence. The project page is available at https://bruno686.github.io/VisPlay/
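
    GRPO is named above; in its common formulation, the advantage of each sampled response is its reward standardized within a group drawn from the same prompt, with no learned critic. A minimal numpy sketch; the diversity/difficulty reward terms are illustrative guesses, since the abstract does not define them.

    import numpy as np

    def grpo_advantages(rewards):
        # Group-relative advantage: standardize rewards within the group.
        r = np.asarray(rewards, dtype=float)
        return (r - r.mean()) / (r.std() + 1e-8)

    # Toy rewards for 4 questions the Questioner generated for one image:
    # harder questions score higher, but only if they remain answerable.
    difficulty = np.array([0.2, 0.6, 0.9, 0.5])
    answerable = np.array([1.0, 1.0, 0.3, 1.0])
    print(grpo_advantages(difficulty * answerable))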

  5. Instruction-Guided Lesion Segmentation for Chest X-rays with Automatically Generated Large-Scale Dataset

    The applicability of current lesion segmentation models for chest X-rays (CXRs) has been limited both by a small number of target labels and the reliance on long, detailed expert-level text inputs, creating a barrier to practical use. To address these limitations, we introduce a new paradigm: instruction-guided lesion segmentation (ILS), which is designed to segment diverse lesion types based on simple, user-friendly instructions. Under this paradigm, we construct MIMIC-ILS, the first large-scale instruction-answer dataset for CXR lesion segmentation, using our fully automated multimodal pipeline that generates annotations from chest X-ray images and their corresponding reports. MIMIC-ILS contains 1.1M instruction-answer pairs derived from 192K images and 91K unique segmentation masks, covering seven major lesion types. To empirically demonstrate its utility, we introduce ROSALIA, a vision-language model fine-tuned on MIMIC-ILS. ROSALIA can segment diverse lesions and provide textual explanations in response to user instructions. The model achieves high segmentation and textual accuracy in our newly proposed task, highlighting the effectiveness of our pipeline and the value of MIMIC-ILS as a foundational resource for pixel-level CXR lesion grounding.

  6. ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

    The proliferation of hour-long videos (e.g., lectures, podcasts, documentaries) has intensified demand for efficient content structuring. However, existing approaches are constrained by small-scale training with annotations that are typically short and coarse, restricting generalization to nuanced transitions in long videos. We introduce ARC-Chapter, the first large-scale video chaptering model, trained on over a million long-video chapters featuring bilingual, temporally grounded, and hierarchical chapter annotations. To achieve this goal, we curated a bilingual English-Chinese chapter dataset via a structured pipeline that unifies ASR transcripts, scene text, and visual captions into multi-level annotations, from short titles to long summaries. We demonstrate clear performance improvements with data scaling, both in data volume and label intensity. Moreover, we design a new evaluation metric termed GRACE, which incorporates many-to-one segment overlaps and semantic similarity, better reflecting real-world chaptering flexibility (a toy sketch follows below). Extensive experiments demonstrate that ARC-Chapter establishes a new state-of-the-art by a significant margin, outperforming the previous best by 14.0% in F1 score and 11.3% in SODA score. Moreover, ARC-Chapter shows excellent transferability, improving the state-of-the-art on downstream tasks like dense video captioning on YouCook2.
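
    The abstract does not spell out GRACE's formula, but its two stated ingredients are many-to-one segment overlap and semantic similarity. A toy Python sketch of a metric in that spirit, with crude word overlap standing in for a real semantic-similarity model:

    def iou(a, b):
        # Temporal IoU of two [start, end) segments in seconds.
        inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
        union = (a[1] - a[0]) + (b[1] - b[0]) - inter
        return inter / union if union else 0.0

    def title_sim(t1, t2):
        # Crude word-overlap similarity (a real system would embed titles).
        w1, w2 = set(t1.lower().split()), set(t2.lower().split())
        return len(w1 & w2) / len(w1 | w2) if w1 | w2 else 0.0

    def grace_like(pred, gold):
        # Many-to-one: each predicted chapter matches its best gold chapter,
        # scoring temporal overlap weighted by title similarity.
        return sum(max(iou(ps, gs) * title_sim(pt, gt) for gs, gt in gold)
                   for ps, pt in pred) / len(pred)

    pred = [((0, 55), "introduction and setup"), ((55, 140), "training the model")]
    gold = [((0, 60), "introduction"), ((60, 150), "model training")]
    print(round(grace_like(pred, gold), 3))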

  7. MHR: Momentum Human Rig

    We present MHR, a parametric human body model that combines the decoupled skeleton/shape paradigm of ATLAS with a flexible, modern rig and pose corrective system inspired by the Momentum library. Our model enables expressive, anatomically plausible human animation, supporting non-linear pose correctives, and is designed for robust integration in AR/VR and graphics pipelines.

  8. FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI

    As embodied intelligence emerges as a core frontier in artificial intelligence research, simulation platforms must evolve beyond low-level physical interactions to capture complex, human-centered social behaviors. We introduce FreeAskWorld, an interactive simulation framework that integrates large language models (LLMs) for high-level behavior planning and semantically grounded interaction, informed by theories of intention and social cognition. Our framework supports scalable, realistic human-agent simulations and includes a modular data generation pipeline tailored for diverse embodied tasks. To validate the framework, we extend the classic Vision-and-Language Navigation (VLN) task into an interaction-enriched Direction Inquiry setting, wherein agents can actively seek and interpret navigational guidance. We present and publicly release FreeAskWorld, a large-scale benchmark dataset comprising reconstructed environments, six diverse task types, 16 core object categories, 63,429 annotated sample frames, and more than 17 hours of interaction data to support training and evaluation of embodied AI systems. We benchmark VLN models and human participants under both open-loop and closed-loop settings. Experimental results demonstrate that models fine-tuned on FreeAskWorld outperform their original counterparts, achieving enhanced semantic understanding and interaction competency. These findings underscore the efficacy of socially grounded simulation frameworks in advancing embodied AI systems toward sophisticated high-level planning and more naturalistic human-agent interaction. Importantly, our work underscores that interaction itself serves as an additional information modality.

  9. Mixture of States: Routing Token-Level Dynamics for Multimodal Generation

    We introduce MoS (Mixture of States), a novel fusion paradigm for multimodal diffusion models that merges modalities using flexible, state-based interactions. The core of MoS is a learnable, token-wise router that creates denoising timestep- and input-dependent interactions between modalities' hidden states, precisely aligning token-level features with the diffusion trajectory. This router sparsely selects the top-k hidden states and is trained with an ε-greedy strategy, efficiently selecting contextual features with minimal learnable parameters and negligible computational overhead (a toy routing sketch follows below). We validate our design with text-to-image generation (MoS-Image) and editing (MoS-Editing), which achieve state-of-the-art results. With only 3B to 5B parameters, our models match or surpass counterparts up to 4× larger. These findings establish MoS as a flexible and compute-efficient paradigm for scaling multimodal diffusion models.
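
    A minimal numpy sketch of the routing rule described above: pick the top-k hidden states by a learned score, but explore randomly with probability ε so rarely chosen states still receive training signal. The shapes and the dot-product scorer are assumptions for illustration, not the paper's architecture.

    import numpy as np

    rng = np.random.default_rng(0)

    def route_top_k(scores, k, eps=0.1):
        # epsilon-greedy top-k selection over candidate hidden states.
        if rng.random() < eps:
            return rng.choice(scores.shape[0], size=k, replace=False)
        return np.argsort(scores)[-k:]

    # One text token attending over 8 candidate image hidden states.
    token = rng.normal(size=16)
    states = rng.normal(size=(8, 16))
    idx = route_top_k(states @ token, k=2)   # router score ~ dot product here
    fused = states[idx].mean(axis=0)         # sparse, state-based fusion
    print(idx, fused.shape)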

  10. Aligning Generative Music AI with Human Preferences: Methods and Challenges

    Recent advances in generative AI for music have achieved remarkable fidelity and stylistic diversity, yet these systems often fail to align with nuanced human preferences due to the specific loss functions they use. This paper advocates for the systematic application of preference alignment techniques to music generation, addressing the fundamental gap between computational optimization and human musical appreciation. Drawing on recent breakthroughs including MusicRL's large-scale preference learning, multi-preference alignment frameworks like diffusion-based preference optimization in DiffRhythm+, and inference-time optimization techniques like Text2midi-InferAlign, we discuss how these techniques can address music's unique challenges: temporal coherence, harmonic consistency, and subjective quality assessment. We identify key research challenges, including scalability to long-form compositions and reliability in preference modelling, among others. Looking forward, we envision preference-aligned music generation enabling transformative applications in interactive composition tools and personalized music services. This work calls for sustained interdisciplinary research combining advances in machine learning and music theory to create music AI systems that truly serve human creative and experiential needs.

  11. Medal S: Spatio-Textual Prompt Model for Medical Segmentation

    We introduce Medal S, a medical segmentation foundation model that supports native-resolution spatial and textual prompts within an end-to-end trainable framework. Unlike text-only methods lacking spatial awareness, Medal S achieves channel-wise alignment between volumetric prompts and text embeddings, mitigating inaccuracies from resolution mismatches. By preserving full 3D context, it efficiently processes multiple native-resolution masks in parallel, enhancing multi-class segmentation performance. A lightweight 3D convolutional module enables precise voxel-space refinement guided by both prompt types, supporting up to 243 classes across CT, MRI, PET, ultrasound, and microscopy modalities in the BiomedSegFM dataset. Medal S offers two prompting modes: a text-only mode, where model predictions serve as spatial prompts for self-refinement without human input, and a hybrid mode, incorporating manual annotations for enhanced flexibility. For 24-class segmentation, parallel spatial prompting reduces inference time by more than 90% compared to sequential prompting. We propose dynamic resampling to address target-patch ratio imbalance, extending SAT and nnU-Net for data augmentation. Furthermore, we develop optimized text preprocessing, a two-stage inference strategy, and post-processing techniques to improve memory efficiency, precision, and inference speed. Averaged over the five modalities on the validation set, Medal S outperforms SAT with a DSC of 75.44 (vs. 69.83), NSD of 77.34 (vs. 71.06), F1 of 38.24 (vs. 24.88), and DSC TP of 65.46 (vs. 46.97). Medal S achieves excellent performance by harmonizing spatial precision with semantic textual guidance, demonstrating superior efficiency and accuracy in multi-class medical segmentation tasks compared to sequential prompt-based approaches. Medal S will be publicly available at https://github.com/yinghemedical/Medal-S.

Solidot (15)

  1. The US may lose its measles elimination status within two months

    Canada lost its measles elimination status earlier this month after a measles outbreak persisted there for a full year. The US outbreak has now lasted 10 months, leaving just two months before the US loses its elimination status as well. CDC officials have confirmed that the recent outbreak on the Arizona-Utah border is a continuation of the outbreak that began in West Texas in mid-to-late January; both stem from the same measles virus subtype. Losing elimination status would mean measles is once again considered endemic in the US, an embarrassing public-health setback for a vaccine-preventable disease. The current US health secretary is an anti-vaccine advocate.

  2. More than 60 police units now field Boston Dynamics robot dogs

    More than 60 bomb squads and SWAT teams in the US and Canada are equipped with Spot, Boston Dynamics' robot dog. The quadruped weighs about 34 kg, starts at $100,000, and entered commercial service five years ago; police use it for armed standoffs, hostage rescues, and hazardous-materials work. Massachusetts State Police bought two units, in 2020 and 2022, for roughly $250,000 each. Last year Spot helped police in Hyannis apprehend a suspect who was holding his mother at knifepoint. Houston police field three Spots; Las Vegas police have one. US Immigration and Customs Enforcement (ICE) recently spent about $78,000 on a similar robot from Canadian manufacturer Icor Technology, one that can also deploy smoke grenades. Civil-liberties groups worry that police militarization is becoming normalized and argue that laws are needed to regulate appropriate use of such technology. Roughly 2,000 Spot robot dogs are in operation worldwide.

  3. China's trucks are rapidly switching from diesel to electric

    In 2020 virtually all new trucks in China ran on diesel, but in the first half of 2025 electric trucks accounted for 22% of new heavy-truck sales, up from 9.2% in the same period of 2024. UK research firm BMI forecasts that electric trucks will approach 46% of sales this year and reach 60% next year. China's truck fleet, second in size only to that of the US, still runs mainly on diesel but is shifting quickly to electric, and diesel consumption is expected to fall sharply. Electric trucks in China have outsold liquefied natural gas (LNG) trucks for five consecutive months, whereas in the rest of the world electric trucks may never catch on. Although an electric truck costs two to three times as much as a diesel truck and 18% more than an LNG truck, research by Chinese scientists shows that over its full lifetime an electric truck is more energy-efficient and cheaper to run, saving owners 10%-26%.

  4. Chinese talent drives US AI research

    In June, when Meta CEO Mark Zuckerberg announced the Superintelligence Lab, he named 11 AI researchers joining the effort. All 11 are immigrants, and 7 were born in China. Two new studies show that researchers born and educated in China play a major role at top US AI labs. Even as the US government tightens immigration policy and anti-China sentiment rises in Silicon Valley, this talent continues to drive important AI research in both industry and academia. A 2020 Paulson Institute study found that Chinese AI researchers made up nearly a third of the world's top AI talent. Meta's AI work has long depended heavily on Chinese talent; people familiar with the culture of its AI teams say new hires are often jokingly told they only need to master two languages: Hack, the company's internal programming language, and Mandarin.

  5. Linus Torvalds thinks AI-assisted coding helps beginners but doesn't belong in production code

    Interviewed at the Linux Foundation Open Source Summit in Seoul, South Korea, Linux creator Linus Torvalds took a positive view of vibe coding (AI-assisted programming), saying it helps people accomplish computing tasks they otherwise couldn't, but that from a code-maintenance standpoint using AI-generated code in production is a very bad idea. Torvalds said today's computers are far more complex than the ones he learned to program on, and vibe coding gives newcomers a way into computing. Torvalds himself does not use AI-assisted coding. He said he has gone from resisting new ideas to embracing and championing them, against senior maintainers who cling to the status quo. He noted that Rust is no longer an experimental language in the kernel, that AI crawlers are putting enormous strain on open-source infrastructure, and that some developers have been caught abusing AI tools to submit bogus bug reports and security warnings to kernel maintainers, though the problem is not yet severe.

  6. The Netherlands returns control of Nexperia to its Chinese parent

    The Dutch government disclosed that it has ended its takeover of European chipmaker Nexperia and returned control of the company to its Chinese parent, Wingtech, signaling that the month-long dispute is winding down. Economic Affairs Minister Vincent Karremans said on X on Wednesday (November 19) that the order giving the Dutch government the power to block or amend Nexperia's decisions has been withdrawn, calling the move "a gesture of goodwill." On September 30 the Dutch government had frozen Wingtech's control of Nexperia for a year on national-security grounds, effectively putting the company under state control. In response, China's Ministry of Commerce announced on October 4 a ban on Nexperia China exporting certain finished components. Although Nexperia's chip technology is not high-end and the company operates only one factory in China, the dispute still briefly disrupted production planning at carmakers including Honda and Volkswagen.

  7. Physicists build an uncensored version of DeepSeek R1

    Physicists at Spanish company Multiverse Computing have built DeepSeek R1 Slim, a slimmed-down version of the DeepSeek R1 model that is 55% smaller than the original with nearly identical performance, and with the censorship removed. Models from Chinese AI companies are required to comply with the law and reflect "socialist values," and ship with multiple built-in layers of censorship. Multiverse used tensor networks, a sophisticated mathematical method originating in quantum physics that represents and manipulates large datasets as high-dimensional grids; tensor networks can dramatically shrink model size while efficiently expressing complex AI systems. They also give researchers a map of all the correlations inside a model, letting them pinpoint and remove specific information (a toy compression sketch follows below).
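
    The item gives no implementation details, but the simplest tensor-network factorization is a truncated SVD that splits one weight matrix into two smaller cores. A numpy sketch of that general idea, with toy sizes; this illustrates the compression principle, not Multiverse's actual method:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy "layer weight" with low effective rank plus a little noise.
    W = rng.normal(size=(256, 16)) @ rng.normal(size=(16, 256)) \
        + 0.01 * rng.normal(size=(256, 256))

    # Truncated SVD: keep the top-r singular directions, so W ~= A @ B.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    r = 16
    A, B = U[:, :r] * S[:r], Vt[:r, :]

    err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    print(f"params: {W.size} -> {A.size + B.size}, relative error {err:.3f}")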

  8. Nearly half of US children want in-game currency for Christmas

    The Entertainment Software Association (ESA) surveyed more than 700 children aged 5-17: 39% want a game console, 37% want physical games, and 43% want in-game virtual currency. Over half (58%) of US children want to play games with their parents, especially 5-7-year-olds (73%). The ESA also surveyed more than 1,100 adults aged 18-65, 539 of whom are parents of children aged 5-17; a third of US adults plan to buy gaming-related products this Christmas.

  9. Nearly half the world's population lives in cities

    According to "World Urbanization Prospects 2025: Summary of Results," released Tuesday by the UN Department of Economic and Social Affairs, 45% of the world's 8.2 billion people now live in urban areas, and the share will keep climbing as the world continues to urbanize. The report notes that in 1950 only 20% of the world's 2.5 billion people lived in cities. By 2050, two thirds of global population growth is expected to occur in cities, with the rest in towns. The number of megacities has quadrupled since 1975, from 8 to 33, 19 of them in Asia. Indonesia's capital Jakarta, with nearly 42 million residents, is the world's most populous city, followed by Dhaka, Bangladesh (about 40 million) and Tokyo, Japan (33 million). Cairo, Egypt, is the only non-Asian city in the top ten. By 2050, Addis Ababa (Ethiopia), Dar es Salaam (Tanzania), Hajipur (India), and Kuala Lumpur (Malaysia) are projected to pass the 10-million mark, bringing the number of megacities to 37.

  10. Ultra-processed food is linked to damage across the body's major organs

    According to three papers published in The Lancet, ultra-processed food (UPF) is associated with harm to the body's major organs. UPF is increasingly prevalent: it makes up over half of the everyday diet in the UK and US, and possibly over 80% of the diet among young, poor, or disadvantaged groups. One paper reports that UPF can harm every major organ system, with strong evidence that humans are not biologically adapted to eating it. The second proposes policies to regulate and reduce UPF production, marketing, and consumption. The third argues that the rise of UPF is driven not by personal choice but by transnational corporations, that UPF is a leading cause of diet-related chronic disease, and that food companies put profit above all else.

  11. Gen Z's password hygiene is worse than that of people in their 80s

    Analysis by NordPass shows that Gen Z's password security awareness is even worse than that of people in their 80s: Gen Z's most common password is 12345, while every other age group's most common password has one more digit, 123456. Gen Z uses some passwords rarely seen in other age groups, such as skibidi, but the overall trends are similar. 123456 is the most common password across all ages; some people add a digit for 1234567, or even go to 12345678 or 123456789. Essentially all of these can be cracked instantly (a quick keyspace calculation follows below). Gen Z's most common passwords include: 12345, 123456, 12345678, 123456789, password, 1234567890, skibidi, 1234567, pakistan123, and assword.
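
    Why "cracked instantly" is barely an exaggeration: a quick keyspace calculation, assuming an offline attacker at roughly 1e10 guesses per second against a fast unsalted hash (an order-of-magnitude assumption, not a benchmark). In practice these passwords top attackers' wordlists, so they fall to the first few dictionary guesses regardless.

    GUESSES_PER_SEC = 1e10  # assumed offline attack rate

    for pw in ["12345", "123456", "123456789", "skibidi", "password"]:
        alphabet = 10 if pw.isdigit() else 26  # digits vs. lowercase letters
        keyspace = alphabet ** len(pw)         # exhaustive-search upper bound
        print(f"{pw!r}: keyspace {keyspace:.1e}, "
              f"<= {keyspace / GUESSES_PER_SEC:.1e} s to exhaust")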

  12. Google releases Gemini 3

    Google has released Gemini 3, its most advanced model yet. It scores 1501 Elo on the LMArena Leaderboard and performs strongly across benchmarks, scoring 91.9% on the PhD-level GPQA Diamond reasoning test and 37.5% on Humanity's Last Exam without using any tools. Gemini 3 is available starting today in the Gemini app, AI Mode in Search for Google AI Pro, Google AI Studio, Vertex AI, and Google Antigravity. Third-party platforms including Cursor, GitHub, JetBrains, Manus, and Replit also have access to the model. Google also said AI Overviews now has 2 billion monthly active users, and the Gemini app has over 650 million.

  13. Blender 5.0 released

    The open-source 3D graphics suite Blender has released v5.0. Major changes include: HDR and wide-gamut color support on Linux via Wayland/Vulkan, significantly improved themes and UI, new color-space tools, improved curve and geometry features, working color space support, an AgX HDR view, a Convert to Display compositor node, a Jump Time by Delta operator, and more.

  14. Deep brain stimulation markedly improves severe depression and anxiety

    Deep brain stimulation (DBS), implanting electrodes in the brain like a "brain pacemaker," markedly improved symptoms in half of patients with treatment-resistant severe depression. Major depressive disorder is one of the most common and most disabling mental-health conditions worldwide. Antidepressants and psychotherapy help many patients, but treatment resistance remains high: roughly 30%-50% of patients respond poorly to existing treatments. Over recent decades DBS has increasingly been used to treat neurological disorders such as Parkinson's disease; it works by implanting tiny electrodes deep in the brain that deliver low-intensity electrical stimulation to modulate abnormal brain-network activity. In this study, 26 treatment-resistant depression patients at Ruijin Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, took part. Clinically, 13 of the 26 (50%) showed significant symptom improvement, and 9 (35%) reached clinical remission.

  15. Ultra-processed food raises young people's diabetes risk

    According to a study published in Nutrition & Metabolism, ultra-processed food raises young people's diabetes risk. Higher UPF intake increased the likelihood of prediabetes, the early stage of elevated blood sugar that can progress to diabetes. Young people who ate more UPF also showed signs of insulin resistance, meaning their bodies used insulin less efficiently to control blood sugar. Early adulthood is a critical period when the body matures and habits form that can last for decades; replacing UPF with natural foods such as fruits, vegetables, and whole grains at this stage can lower the future risk of type 2 diabetes. The study followed 85 young people aged 17-22 for four years. At each visit, participants listed everything they had eaten on a recent weekday and weekend day. Researchers classified the foods into two categories, ultra-processed (e.g., candy, soda, cereal, packaged foods, flavored yogurt, and restaurant food) and non-ultra-processed, and computed each person's share of daily calories from UPF. Between baseline and follow-up, every 10% increase in UPF intake was associated with a 64% higher risk of prediabetes and a 56% higher risk of impaired glycemic regulation.