OrangeBot.AI Digest — 2026-02-26
54 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Statement from Dario Amodei on Our Discussions with the Department of War (www.anthropic.com)
- Layoffs at Block (twitter.com)
- What Claude Code Chooses (amplifying.ai)
- I baked a pie every day for a year and it changed my life (www.theguardian.com)
- Palm OS User Interface Guidelines (2003) [pdf] (cs.uml.edu)
- Will vibe coding end like the maker movement? (read.technically.dev)
- Open Source Endowment – new funding source for open source maintainers (endowment.dev)
- Nano Banana 2: Google's latest AI image generation model (blog.google)
- AirSnitch: Demystifying and breaking client isolation in Wi-Fi networks [pdf] (www.ndss-symposium.org)
- In 2025, Meta paid an effective federal tax rate of 3.5% (bsky.app)
- Anthropic ditches its core safety promise (www.cnn.com)
- You Want to Visit the UK? You Better Have a Google Play or App Store Account (www.heltweg.org)
- Tell HN: YC companies scrape GitHub activity, send spam emails to users
- Show HN: Terminal Phone – E2EE Walkie Talkie from the Command Line (gitlab.com)
- I don't know how you get here from “predict the next word” (www.grumpy-economist.com)
GitHub Trending(9)
Hugging Face(15)
- HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation
Modeling long sequences of user behaviors has emerged as a critical frontier in generative recommendation. However, existing solutions face a dilemma: linear attention mechanisms achieve efficiency at the cost of retrieval precision due to limited state capacity, while softmax attention suffers from prohibitive computational overhead. To address this challenge, we propose HyTRec, a model featuring a Hybrid Attention architecture that explicitly decouples long-term stable preferences from short-term intent spikes. By assigning massive historical sequences to a linear attention branch and reserving a specialized softmax attention branch for recent interactions, our approach restores precise retrieval capabilities within industrial-scale contexts involving ten thousand interactions. To mitigate the lag in capturing rapid interest drifts within the linear layers, we further design a Temporal-Aware Delta Network (TADN) to dynamically upweight fresh behavioral signals while effectively suppressing historical noise. Empirical results on industrial-scale datasets confirm that our model maintains linear inference speed while outperforming strong baselines, notably delivering an over 8% improvement in Hit Rate for users with ultra-long sequences.
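The hybrid routing idea can be sketched in a few lines of NumPy: exact softmax attention over a recent window, a linear-attention summary of the long history, and a merge of the two. The 50/50 merge, the elu-like feature map, and all shapes are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def softmax_attention(q, k, v):
    # Exact scaled dot-product attention over the recent window.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def linear_attention(q, k, v):
    # Linear attention: the history is compressed into a fixed d x d state,
    # so cost is O(n) in sequence length but retrieval is approximate.
    phi = lambda x: np.maximum(x, 0) + 1e-6      # simple positive feature map
    kv = phi(k).T @ v                            # d x d summary of history
    z = phi(k).sum(axis=0)                       # normalizer
    return (phi(q) @ kv) / (phi(q) @ z)[:, None]

def hybrid_attention(q, keys, values, recent=16):
    # Route the long history through the linear branch and the most recent
    # interactions through the exact softmax branch, then mix the outputs.
    hist_out = linear_attention(q, keys[:-recent], values[:-recent])
    rec_out = softmax_attention(q, keys[-recent:], values[-recent:])
    return 0.5 * hist_out + 0.5 * rec_out        # fixed mix, for illustration

rng = np.random.default_rng(0)
n, d = 256, 8
k, v = rng.normal(size=(n, d)), rng.normal(size=(n, d))
q = rng.normal(size=(1, d))
out = hybrid_attention(q, k, v)
print(out.shape)  # (1, 8)
```

The point of the split is that the softmax branch keeps exact retrieval where it matters most (fresh intent), while the linear branch keeps total cost linear in the ten-thousand-interaction history.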
- MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models
Molecular generation with diffusion models has emerged as a promising direction for AI-driven drug discovery and materials science. While graph diffusion models have been widely adopted due to the discrete nature of 2D molecular graphs, existing models suffer from low chemical validity and struggle to meet the desired properties compared to 1D modeling. In this work, we introduce MolHIT, a powerful molecular graph generation framework that overcomes long-standing performance limitations in existing methods. MolHIT is based on the Hierarchical Discrete Diffusion Model, which generalizes discrete diffusion to additional categories that encode chemical priors, and decoupled atom encoding that splits the atom types according to their chemical roles. Overall, MolHIT achieves new state-of-the-art performance on the MOSES dataset with near-perfect validity for the first time in graph diffusion, surpassing strong 1D baselines across multiple metrics. We further demonstrate strong performance in downstream tasks, including multi-property guided generation and scaffold extension.
- DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation
Recent advancements in foundation models have revolutionized joint audio-video generation. However, existing approaches typically treat human-centric tasks including reference-based audio-video generation (R2AV), video editing (RV2AV) and audio-driven video animation (RA2V) as isolated objectives. Furthermore, achieving precise, disentangled control over multiple character identities and voice timbres within a single framework remains an open challenge. In this paper, we propose DreamID-Omni, a unified framework for controllable human-centric audio-video generation. Specifically, we design a Symmetric Conditional Diffusion Transformer that integrates heterogeneous conditioning signals via a symmetric conditional injection scheme. To resolve the pervasive identity-timbre binding failures and speaker confusion in multi-person scenarios, we introduce a Dual-Level Disentanglement strategy: Synchronized RoPE at the signal level to ensure rigid attention-space binding, and Structured Captions at the semantic level to establish explicit attribute-subject mappings. Furthermore, we devise a Multi-Task Progressive Training scheme that leverages weakly-constrained generative priors to regularize strongly-constrained tasks, preventing overfitting and harmonizing disparate objectives. Extensive experiments demonstrate that DreamID-Omni achieves comprehensive state-of-the-art performance across video, audio, and audio-visual consistency, even outperforming leading proprietary commercial models. We will release our code to bridge the gap between academic research and commercial-grade applications.
- SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model
SkyReels-V4 is a unified multi-modal video foundation model for joint video-audio generation, inpainting, and editing. The model adopts a dual-stream Multimodal Diffusion Transformer (MMDiT) architecture, where one branch synthesizes video and the other generates temporally aligned audio, while sharing a powerful text encoder based on a Multimodal Large Language Model (MMLM). SkyReels-V4 accepts rich multi-modal instructions, including text, images, video clips, masks, and audio references. By combining the MMLM's multi-modal instruction-following capability with in-context learning in the video-branch MMDiT, the model can inject fine-grained visual guidance under complex conditioning, while the audio-branch MMDiT simultaneously leverages audio references to guide sound generation. On the video side, we adopt a channel-concatenation formulation that unifies a wide range of inpainting-style tasks, such as image-to-video, video extension, and video editing, under a single interface, and naturally extends to vision-referenced inpainting and editing via multi-modal prompts. SkyReels-V4 supports up to 1080p resolution, 32 FPS, and 15-second duration, enabling high-fidelity, multi-shot, cinema-level video generation with synchronized audio. To make such high-resolution, long-duration generation computationally feasible, we introduce an efficiency strategy: joint generation of low-resolution full sequences and high-resolution keyframes, followed by dedicated super-resolution and frame-interpolation models. To our knowledge, SkyReels-V4 is the first video foundation model that simultaneously supports multi-modal input, joint video-audio generation, and a unified treatment of generation, inpainting, and editing, while maintaining strong efficiency and quality at cinematic resolutions and durations.
- ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning
Agentic reinforcement learning (ARL) has rapidly gained attention as a promising paradigm for training agents to solve complex, multi-step interactive tasks. Despite encouraging early results, ARL remains highly unstable, often leading to training collapse. This instability limits scalability to larger environments and longer interaction horizons, and constrains systematic exploration of algorithmic design choices. In this paper, we propose ARLArena, a stable training recipe and systematic analysis framework that examines training stability in a controlled and reproducible setting. ARLArena first constructs a clean and standardized testbed. Then, we decompose the policy gradient into four core design dimensions and assess the performance and stability of each dimension. Through this fine-grained analysis, we distill a unified perspective on ARL and propose SAMPO, a stable agentic policy optimization method designed to mitigate the dominant sources of instability in ARL. Empirically, SAMPO achieves consistently stable training and strong performance across diverse agentic tasks. Overall, this study provides a unifying policy gradient perspective for ARL and offers practical guidance for building stable and reproducible LLM-based agent training pipelines.
- GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL
Open-source native GUI agents still lag behind closed-source systems on long-horizon navigation tasks. This gap stems from two limitations: a shortage of high-quality, action-aligned reasoning data, and the direct adoption of generic post-training pipelines that overlook the unique challenges of GUI agents. We identify two fundamental issues in these pipelines: (i) standard SFT with CoT reasoning often hurts grounding, and (ii) step-wise RLVR-style training faces partial verifiability, where multiple actions can be correct but only a single demonstrated action is used for verification. This makes offline step-wise metrics weak predictors of online task success. In this work, we present GUI-Libra, a tailored training recipe that addresses these challenges. First, to mitigate the scarcity of action-aligned reasoning data, we introduce a data construction and filtering pipeline and release a curated 81K GUI reasoning dataset. Second, to reconcile reasoning with grounding, we propose action-aware SFT that mixes reasoning-then-action and direct-action data and reweights tokens to emphasize action and grounding. Third, to stabilize RL under partial verifiability, we identify the overlooked importance of KL regularization in RLVR and show that a KL trust region is critical for improving offline-to-online predictability; we further introduce success-adaptive scaling to downweight unreliable negative gradients. Across diverse web and mobile benchmarks, GUI-Libra consistently improves both step-wise accuracy and end-to-end task completion. Our results suggest that carefully designed post-training and data curation can unlock significantly stronger task-solving capabilities without costly online data collection. We release our dataset, code, and models to facilitate further research on data-efficient post-training for reasoning-capable GUI agents.
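The role of the KL trust region can be illustrated with a toy per-token objective: a policy-gradient term plus a penalty toward the reference policy. The function name, the log-ratio KL estimate, and the coefficient are illustrative assumptions, not the paper's actual loss.

```python
import numpy as np

def rlvr_step_loss(logp_new, logp_ref, advantage, kl_coef=0.1):
    # Toy step-wise objective: REINFORCE-style term plus a KL penalty toward
    # the reference policy. The penalty acts as a trust region, useful when
    # step rewards are only partially verifiable (several actions may be
    # correct but only one demonstrated action is checked).
    pg = -advantage * logp_new               # policy-gradient term
    kl = logp_new - logp_ref                 # per-token log-ratio KL estimate
    return float((pg + kl_coef * kl).mean())

logp_new = np.log(np.array([0.6, 0.3, 0.8]))   # token log-probs, new policy
logp_ref = np.log(np.array([0.5, 0.4, 0.7]))   # token log-probs, reference
print(rlvr_step_loss(logp_new, logp_ref, advantage=1.0))
```

With `kl_coef = 0` the update chases the (noisy) step reward freely; raising the coefficient shrinks how far the policy can drift from the reference per update.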
- Solaris: Building a Multiplayer Video World Model in Minecraft
Existing action-conditioned video generation models (video world models) are limited to single-agent perspectives, failing to capture the multi-agent interactions of real-world environments. We introduce Solaris, a multiplayer video world model that simulates consistent multi-view observations. To enable this, we develop a multiplayer data system designed for robust, continuous, and automated data collection on video games such as Minecraft. Unlike prior platforms built for single-player settings, our system supports coordinated multi-agent interaction and synchronized video and action capture. Using this system, we collect 12.64 million multiplayer frames and propose an evaluation framework for multiplayer movement, memory, grounding, building, and view consistency. We train Solaris using a staged pipeline that progressively transitions from single-player to multiplayer modeling, combining bidirectional, causal, and Self Forcing training. In the final stage, we introduce Checkpointed Self Forcing, a memory-efficient Self Forcing variant that enables a longer-horizon teacher. Results show our architecture and training design outperform existing baselines. Through open-sourcing our system and models, we hope to lay the groundwork for a new generation of multi-agent world models.
- JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation
AIGC has rapidly expanded from text-to-image generation toward high-quality multimodal synthesis across video and audio. Within this context, joint audio-video generation (JAVG) has emerged as a fundamental task that produces synchronized and semantically aligned sound and vision from textual descriptions. However, compared with advanced commercial models such as Veo3, existing open-source methods still suffer from limitations in generation quality, temporal synchrony, and alignment with human preferences. To bridge the gap, this paper presents JavisDiT++, a concise yet powerful framework for unified modeling and optimization of JAVG. First, we introduce a modality-specific mixture-of-experts (MS-MoE) design that enables effective cross-modal interaction while enhancing single-modal generation quality. Then, we propose a temporal-aligned RoPE (TA-RoPE) strategy to achieve explicit, frame-level synchronization between audio and video tokens. Besides, we develop an audio-video direct preference optimization (AV-DPO) method to align model outputs with human preference across quality, consistency, and synchrony dimensions. Built upon Wan2.1-1.3B-T2V, our model achieves state-of-the-art performance with merely around 1M public training entries, significantly outperforming prior approaches in both qualitative and quantitative evaluations. Comprehensive ablation studies have been conducted to validate the effectiveness of our proposed modules. All the code, model, and dataset are released at https://JavisVerse.github.io/JavisDiT2-page.
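One plausible reading of temporal-aligned RoPE is that rotary phases are assigned by wall-clock time rather than token index, so audio and video tokens covering the same instant receive identical rotations even though the two streams have different token rates. The sketch below illustrates that idea only; the paper's exact mechanism may differ, and all shapes and rates are invented.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    # Apply rotary position embedding at a (possibly fractional) position.
    d = x.shape[-1]
    inv_freq = base ** (-np.arange(0, d, 2) / d)
    ang = pos * inv_freq
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def temporal_positions(n_tokens, duration_s):
    # Map token indices onto a shared wall-clock timeline so that audio and
    # video tokens at the same instant share the same RoPE phase.
    return np.linspace(0.0, duration_s, n_tokens, endpoint=False)

duration = 4.0                                 # seconds of media
video_pos = temporal_positions(16, duration)   # sparse video-latent grid
audio_pos = temporal_positions(64, duration)   # denser audio-token grid

x = np.ones(8)
# Tokens at t = 1.0 s get identical rotations regardless of modality.
same_t_video = rope_rotate(x, video_pos[4])
same_t_audio = rope_rotate(x, audio_pos[16])
print(np.allclose(same_t_video, same_t_audio))  # True
```

Under index-based RoPE the two streams would drift apart (video index 4 vs. audio index 16); time-based positions keep cross-modal attention phase-aligned frame by frame.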
- VecGlypher: Unified Vector Glyph Generation with Language Models
Vector glyphs are the atomic units of digital typography, yet most learning-based pipelines still depend on carefully curated exemplar sheets and raster-to-vector postprocessing, which limits accessibility and editability. We introduce VecGlypher, a single multimodal language model that generates high-fidelity vector glyphs directly from text descriptions or image exemplars. Given a style prompt, optional reference glyph images, and a target character, VecGlypher autoregressively emits SVG path tokens, avoiding raster intermediates and producing editable, watertight outlines in one pass. A typography-aware data and training recipe makes this possible: (i) a large-scale continuation stage on 39K noisy Envato fonts to master SVG syntax and long-horizon geometry, followed by (ii) post-training on 2.5K expert-annotated Google Fonts with descriptive tags and exemplars to align language and imagery with geometry; preprocessing normalizes coordinate frames, canonicalizes paths, de-duplicates families, and quantizes coordinates for stable long-sequence decoding. On cross-family OOD evaluation, VecGlypher substantially outperforms both general-purpose LLMs and specialized vector-font baselines for text-only generation, while image-referenced generation reaches a state-of-the-art performance, with marked gains over DeepVecFont-v2 and DualVector. Ablations show that model scale and the two-stage recipe are critical and that absolute-coordinate serialization yields the best geometry. VecGlypher lowers the barrier to font creation by letting users design with words or exemplars, and provides a scalable foundation for future multimodal design tools.
- Image Generation with a Sphere Encoder
We introduce the Sphere Encoder, an efficient generative framework capable of producing images in a single forward pass and competing with many-step diffusion models using fewer than five steps. Our approach works by learning an encoder that maps natural images uniformly onto a spherical latent space, and a decoder that maps random latent vectors back to the image space. Trained solely through image reconstruction losses, the model generates an image by simply decoding a random point on the sphere. Our architecture naturally supports conditional generation, and looping the encoder/decoder a few times can further enhance image quality. Across several datasets, the sphere encoder approach yields performance competitive with state-of-the-art diffusion models, but with a small fraction of the inference cost. Project page is available at https://sphere-encoder.github.io.
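The generation recipe the abstract describes reduces to "decode a uniform random point on the sphere". A minimal sketch, with a stand-in linear decoder in place of the trained network (shapes and the decoder itself are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sphere(dim, n=1):
    # Uniform samples on the unit sphere: normalize Gaussian draws.
    z = rng.normal(size=(n, dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def decode(z, img_shape):
    # Stand-in for the trained decoder: any map from the sphere back to
    # image space; here a fixed random linear projection for illustration.
    proj = np.random.default_rng(1).normal(
        size=(z.shape[-1], int(np.prod(img_shape))))
    return (z @ proj).reshape(img_shape)

# Generation is a single forward pass: decode one random latent on the sphere.
z = sample_sphere(dim=32)
img = decode(z, (8, 8))
print(img.shape)  # (8, 8)
```

Because every sphere point is a valid latent by construction, sampling needs no iterative denoising; the optional encoder/decoder loop from the abstract would re-encode `img` and decode again to refine it.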
- World Guidance: World Modeling in Condition Space for Action Generation
Leveraging future observation modeling to facilitate action generation presents a promising avenue for enhancing the capabilities of Vision-Language-Action (VLA) models. However, existing approaches struggle to strike a balance between maintaining efficient, predictable future representations and preserving sufficient fine-grained information to guide precise action generation. To address this limitation, we propose WoG (World Guidance), a framework that maps future observations into compact conditions by injecting them into the action inference pipeline. The VLA is then trained to simultaneously predict these compressed conditions alongside future actions, thereby achieving effective world modeling within the condition space for action inference. We demonstrate that modeling and predicting this condition space not only facilitates fine-grained action generation but also exhibits superior generalization capabilities. Moreover, it learns effectively from substantial human manipulation videos. Extensive experiments across both simulation and real-world environments validate that our method significantly outperforms existing methods based on future prediction. Project page is available at: https://selen-suyue.github.io/WoGNet/
- DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
The performance of multi-turn, agentic LLM inference is increasingly dominated by KV-Cache storage I/O rather than computation. In prevalent disaggregated architectures, loading the massive KV-Cache from external storage creates a fundamental imbalance: storage NICs on prefill engines become bandwidth-saturated, while those on decoding engines remain idle. This asymmetry severely constrains overall system throughput. We present DualPath, an inference system that breaks this bottleneck by introducing dual-path KV-Cache loading. Beyond the traditional storage-to-prefill path, DualPath enables a novel storage-to-decode path, in which the KV-Cache is loaded into decoding engines and then efficiently transferred to prefill engines via RDMA over the compute network. DualPath combines this optimized data path -- which inherently avoids network congestion and avoids interference with latency-critical model execution communications -- with a global scheduler that dynamically balances load across prefill and decode engines. Our evaluation on three models with production agentic workloads demonstrates that DualPath improves offline inference throughput by up to 1.87× on our in-house inference system. It can also improve online serving throughput by an average factor of 1.96× without violating SLO.
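The routing decision at the heart of the dual-path design can be caricatured as a tiny scheduler: prefer the direct storage-to-prefill path, fall back to loading through the decode engine when the prefill NIC is saturated. The engine names, the 0.8 saturation threshold, and the function itself are invented for illustration and are not DualPath's actual scheduler.

```python
from dataclasses import dataclass

@dataclass
class Engine:
    name: str
    nic_util: float  # fraction of storage-NIC bandwidth currently in use

def choose_kv_path(prefill: Engine, decode: Engine, threshold: float = 0.8):
    # Dual-path idea: when the prefill engine's storage NIC is saturated,
    # load the KV-Cache into the decode engine (whose storage NIC is often
    # idle) and forward it to the prefill engine over the compute network.
    if prefill.nic_util < threshold:
        return "storage->prefill"
    if decode.nic_util < threshold:
        return "storage->decode->rdma->prefill"
    return "queue"  # both paths saturated: wait for bandwidth

print(choose_kv_path(Engine("p0", 0.95), Engine("d0", 0.10)))
# storage->decode->rdma->prefill
```

The payoff is that otherwise-idle decode-side storage bandwidth is put to work, which is where the reported throughput gains come from.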
- From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors
Instruction-based image editing has achieved remarkable success in semantic alignment, yet state-of-the-art models frequently fail to render physically plausible results when editing involves complex causal dynamics, such as refraction or material deformation. We attribute this limitation to the dominant paradigm that treats editing as a discrete mapping between image pairs, which provides only boundary conditions and leaves transition dynamics underspecified. To address this, we reformulate physics-aware editing as predictive physical state transitions and introduce PhysicTran38K, a large-scale video-based dataset comprising 38K transition trajectories across five physical domains, constructed via a two-stage filtering and constraint-aware annotation pipeline. Building on this supervision, we propose PhysicEdit, an end-to-end framework equipped with a textual-visual dual-thinking mechanism. It combines a frozen Qwen2.5-VL for physically grounded reasoning with learnable transition queries that provide timestep-adaptive visual guidance to a diffusion backbone. Experiments show that PhysicEdit improves over Qwen-Image-Edit by 5.9% in physical realism and 10.1% in knowledge-grounded editing, setting a new state-of-the-art for open-source methods, while remaining competitive with leading proprietary models.
- NanoKnow: How to Know What Your Language Model Knows
How do large language models (LLMs) know what they know? Answering this question has been difficult because pre-training data is often a "black box" -- unknown or inaccessible. The recent release of nanochat -- a family of small LLMs with fully open pre-training data -- addresses this as it provides a transparent view into where a model's parametric knowledge comes from. Towards the goal of understanding how knowledge is encoded by LLMs, we release NanoKnow, a benchmark dataset that partitions questions from Natural Questions and SQuAD into splits based on whether their answers are present in nanochat's pre-training corpus. Using these splits, we can now properly disentangle the sources of knowledge that LLMs rely on when producing an output. To demonstrate NanoKnow's utility, we conduct experiments using eight nanochat checkpoints. Our findings show: (1) closed-book accuracy is strongly influenced by answer frequency in the pre-training data, (2) providing external evidence can mitigate this frequency dependence, (3) even with external evidence, models are more accurate when answers were seen during pre-training, demonstrating that parametric and external knowledge are complementary, and (4) non-relevant information is harmful, with accuracy decreasing based on both the position and the number of non-relevant contexts. We release all NanoKnow artifacts at https://github.com/castorini/NanoKnow.
- The Design Space of Tri-Modal Masked Diffusion Models
Discrete diffusion models have emerged as strong alternatives to autoregressive language models, with recent work initializing and fine-tuning a base unimodal model for bimodal generation. Diverging from previous approaches, we introduce the first tri-modal masked diffusion model pretrained from scratch on text, image-text, and audio-text data. We systematically analyze multimodal scaling laws, modality mixing ratios, noise schedules, and batch-size effects, and we provide optimized inference sampling defaults. Our batch-size analysis yields a novel stochastic differential equation (SDE)-based reparameterization that eliminates the need for tuning the optimal batch size as reported in recent work. This reparameterization decouples the physical batch size, often chosen based on compute constraints (GPU saturation, FLOP efficiency, wall-clock time), from the logical batch size, chosen to balance gradient variance during stochastic optimization. Finally, we pretrain a preliminary 3B-parameter tri-modal model on 6.4T tokens, demonstrating the capabilities of a unified design and achieving strong results in text generation, text-to-image tasks, and text-to-speech tasks. Our work represents the largest-scale systematic open study of multimodal discrete diffusion models conducted to date, providing insights into scaling behaviors across multiple modalities.
Solidot(15)
- DeepSeek withholds its V4 model from Nvidia and AMD for testing
DeepSeek has not provided its next-generation V4 model to Nvidia and AMD for testing, breaking with industry convention. Meanwhile, DeepSeek has supplied the new model to domestic companies such as Huawei for testing. AI companies typically share pre-release versions of their models with major AI chip makers such as Nvidia and AMD to ensure their software runs efficiently on widely used hardware. DeepSeek previously worked closely with Nvidia's engineers, but the upcoming new version of its model has not been made available to Nvidia.
- Study explains why basketball shoes squeak
According to a study published in Nature, the squeak basketball shoes make when sliding on a polished court originates from wave-like deformations on the surface of soft materials. Harvard University researchers filmed basketball shoes squeaking against a smooth glass plate, using high-speed imaging to capture the rubber sole deforming in pulsed bursts at the surface. They found that the pitch of the squeak matches the pulse frequency, which is determined by the hardness and thickness of the sole. They also found that if the soft surface is smooth, the pulses are irregularly distributed and produce no sharp sound, whereas a textured surface (such as the tread pattern on athletic shoes) produces a stable pulse frequency, creating the high-pitched squeak.
- TDF restarts LibreOffice Online, the web version of LibreOffice
The Document Foundation (TDF), the foundation that manages the LibreOffice office-suite project, has announced the restart of LibreOffice Online, the web version of LibreOffice. LibreOffice Online (LOOL) can be hosted on anyone's infrastructure, but development was suspended in 2022 because of tensions between TDF and its main commercial partner, Collabora. Collabora has contributed a large amount of code to LibreOffice and the LOOL project; the company seeks to fund development through commercial revenue from its LibreOffice-based Collabora Office and the web-based Collabora Online, and, believing the entirely free LOOL hurt its revenue, withdrew the developers working on LOOL at the end of 2020 to focus on Collabora Online. LOOL development stalled as a result, and TDF had to shelve the project. Restarting it now may again heighten tensions with Collabora.
- South Korea's birth rate rises for a second consecutive year
Statistics released by South Korea's national statistics agency show that 254,500 babies were born in South Korea last year, up 16,100 (6.8%) year on year, sustaining growth for a second consecutive year following 2024. South Korea's total fertility rate for 2025 was 0.8, up 0.05 from 0.75 the year before and the highest in nearly four years. The total fertility rate is the average number of children a woman bears over her lifetime. The indicator had declined steadily from 1.24 in 2015 to 0.72 in 2023, before rebounding for the first time to 0.75 in 2024. Analysts attribute the increase in births mainly to a cumulative rise in marriage registrations, a larger population in the childbearing age brackets, and changing attitudes toward having children.
- Decline in US DVD and Blu-ray sales slows
As Gen Z takes a renewed liking to physical media, DVD and Blu-ray sales, which have fallen sharply over the past few years, are rebounding, with the rate of decline slowing markedly. Data from the Digital Entertainment Group show total disc sales fell 9% last year, versus declines of more than 20% in both 2023 and 2024. US consumer spending on 4K Blu-ray discs in 2025 rose 12% over the previous year. Blu-ray publisher Criterion Collection credits the trend to younger generations' fondness for physical media. Los Angeles disc-rental store Vidiots rented an average of 170 discs a day in January 2026, a record high; the store rented about 22,000 discs in 2023 and about 50,000 in 2024.
- Japan raids Microsoft offices in antitrust investigation
Japan's Fair Trade Commission on the 25th opened a review of US company Microsoft on suspicion of violating the Antimonopoly Act, alleging that it charges businesses high fees for using its software, such as Microsoft 365, on other companies' cloud services, hindering competition in the cloud market. The same day, it conducted an on-site inspection of Microsoft's Japanese subsidiary in Tokyo. Cloud services let companies and individuals use software and store data over the internet without owning their own servers and equipment, and the market has expanded rapidly in recent years. Amazon, Microsoft, and Google are competing for market share globally; the Fair Trade Commission believes Microsoft may be leveraging its dominance in the software market to try to win customers in the cloud market as well, and will investigate.
- Liquid crystal monomers found in dolphins and porpoises
Liquid crystal monomers (LCMs) are key components of laptop, television, and smartphone screens. Given their pervasiveness in the environment, LCMs are regarded as persistent pollutants that pose threats to marine life, threats scientists are still working to understand. Researchers at City University of Hong Kong analyzed tissue samples from Indo-Pacific humpback dolphins and finless porpoises collected between 2007 and 2021 in the South China Sea, an important habitat for these endangered marine animals. They screened blubber, muscle, liver, kidney, and brain tissue samples for 62 different LCMs. The analysis showed that the pollutants enter the dolphins and porpoises through food rather than directly from the water; that most of the LCMs found likely originate from television and computer screens, with fewer from smartphones; and that the pollutants concentrate mainly in the blubber, though small amounts are also present in tissues such as the brain, suggesting LCMs may pose health risks such as neurotoxicity. LCM levels in porpoise blubber changed over time, generally rising as liquid crystal displays became widespread and declining in recent years as manufacturers increasingly switch to LED displays.
- HP says memory now accounts for 35% of PC cost
HP CFO Karen Parkhill said on the company's fiscal 2026 first-quarter earnings call that memory's share of HP's PC bill of materials has risen from 15%-18% in fiscal Q4 2025 to 35%. With rising prices suppressing customer demand, HP's PC business could see a double-digit decline this year. Parkhill said memory costs rose about 100% quarter over quarter and are expected to keep climbing through fiscal 2026. The impact of the memory shortage is expected to intensify in the second half of the fiscal year and could persist into fiscal 2027.
- Bilinguals share a brain meaning system, with subtle differences
According to a study published in PNAS, researchers used fMRI to scan the brain activity of six Chinese-English bilinguals reading stories in Chinese and English, finding that the two languages share the brain's meaning system but with subtle differences. Regions including the temporal, parietal, and prefrontal cortex were active while reading stories in both languages, and 81% of the brain areas responding to meaning showed the same semantic tuning across the two languages. A brain region sensitive to family-related words in Chinese was also sensitive to family-related words in English. Despite the shared meaning system, the researchers found subtle differences: brain regions sensitive in English to words emphasizing actions and interpersonal relationships, such as "leave", "boyfriend", and "family", were instead sensitive in Chinese to numbers and quantities such as 3, "two", or "a few". The participants were all university students whose native language was Chinese and second language was English; their brains responded significantly more strongly to semantic information when reading Chinese, reflecting deeper semantic processing in the native language.
- AI models consistently recommend nuclear strikes in wargames
According to a paper posted on the preprint platform arXiv, AI models consistently recommend nuclear strikes in simulated wargames, whereas humans show far more hesitation about using nuclear weapons. Kenneth Payne of King's College London pitted three mainstream models, GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash, against one another in simulated wargames, with scenarios including intense international standoffs involving border disputes, competition over scarce resources, and existential threats to regimes. The AIs were allowed a range of actions from diplomatic protest and outright surrender to all-out nuclear war. They played 21 games over 329 turns, generating 780,000 words describing the logic behind their decisions. In 95% of the simulated games, the AI models deployed at least one tactical nuclear weapon. Tong Zhao of Princeton said major powers have already incorporated AI into wargaming, but it is unclear to what extent AI decision support feeds into actual military decision-making. Payne believes no one would hand control of nuclear missile silos to AIs and let them make the decisions. OpenAI, Anthropic, and Google, the developers of the three models, did not comment on the study.
- Quantum algorithm beats classical algorithms at complement sampling
According to a study in Physical Review Letters, a team from Quantinuum in the UK and QuSoft in the Netherlands has developed a quantum algorithm that solves the complement sampling task more efficiently than any classical algorithm, demonstrating a provable and verifiable quantum advantage in sample complexity. Imagine a huge box of numbered balls from which someone secretly picks half to form a set S; you may only draw balls from S and inspect their numbers, and must determine which numbers are not in S. A classical algorithm must draw a large number of samples before it can confidently name a number outside S. A quantum algorithm needs far fewer samples: what you draw from S is not a single ball but a superposed "wave ball", and an operation akin to a flip maps the superposition over S onto its complement; measuring then yields a number not in S. The quantum algorithm thus requires far fewer samples than any classical one.
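The classical side of the task is easy to simulate: draw some samples from the hidden set S, then guess an index you have not seen and hope it lies in the complement. The sketch below is a rough illustration of why classical sample complexity is high, not the paper's formal bound; all parameters are arbitrary.

```python
import random

def classical_guess_success(n, m, trials=2000, seed=0):
    # Classically: draw m samples from the hidden half-set S of {0..n-1},
    # then guess a not-yet-seen index, hoping it is in the complement of S.
    rng = random.Random(seed)
    wins = 0
    universe = list(range(n))
    for _ in range(trials):
        s = set(rng.sample(universe, n // 2))          # secret half-set
        seen = {rng.choice(list(s)) for _ in range(m)} # samples are from S only
        unseen = [x for x in universe if x not in seen]
        wins += rng.choice(unseen) not in s            # success: guess is outside S
    return wins / trials

# With one sample the guess is barely better than chance (~1/2); confidence
# only grows as the number of distinct samples approaches n/2.
print(classical_guess_success(64, 1))
print(classical_guess_success(64, 24))
```

Ruling out elements of S one sample at a time is exactly the brute-force process the quantum "flip" sidesteps.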
- Apple to build Mac mini at US factory
Apple has announced it will manufacture the Mac mini at a factory in Houston, Texas, advancing US manufacturing. Apple pledged last year to invest $600 billion in the US and says it has already exceeded that target at this stage. The Mac mini, similar to a mini PC, is a compact Mac that connects to an external display, keyboard, and mouse. Apple said the factory will begin building the Mac mini later this year. Apple also said it plans to source more than 100 million advanced chips from TSMC's Arizona plant in 2026.
- UK's first baby born from a donated uterus
Grace Bell was born without a uterus and never menstruated, though her ovaries function normally, a condition known as MRKH syndrome that affects about one in 5,000 women in the UK. To have a child, her only options were a uterus transplant or surrogacy. In 2024 she received a uterus transplanted from a deceased donor, then underwent IVF at a fertility clinic and had an embryo transferred, and shortly before Christmas 2025 she gave birth to a 3.2 kg boy, Hugo. Hugo is now 10 weeks old, and she calls the whole thing nothing short of a miracle. This is the UK's first baby born from a donated uterus.
- SpaceX rocket reentries introduce metal pollution into the upper atmosphere
According to a study published in Communications Earth & Environment, SpaceX rocket upper stages burn up during uncontrolled atmospheric reentry, introducing metal pollution into the upper atmosphere. Germany's Leibniz Institute of Atmospheric Physics observed a plume of lithium pollution left by a rocket, the first observation of space debris leaving a detectable, man-made chemical trace in the upper atmosphere. The upper atmosphere has been largely untouched by human pollution, but in the new space age satellites, rocket debris, and space junk are releasing growing amounts of metals and other pollutants into it. The effect of metal pollution on the stratospheric ozone layer has not yet been quantified; the ozone layer is vital for protecting life on Earth from harmful ultraviolet radiation, and earlier research suggests the pollution could slow its recovery.
- Human migration corridor linked Lake Baikal and northern China 7,700 years ago
By analyzing 42 ancient human genomes, Chinese, Russian, and Korean scientists found that as early as 7,700 years ago, in the early Neolithic, a long-distance "north-south interaction corridor" already existed between the Lake Baikal region of Siberia and the Yanshan area of northern China. The key breakthrough came from genomic analysis of ancient humans at the Sitaimengguying (STM_EN) site in the Zhangjiakou area of Hebei, dated to 7,700-7,400 years ago. The results show that these early inhabitants carried not only the ancient genes of local northern Chinese populations but also a distinctive genetic signature associated with descendants of Ancient Paleo-Siberian (APS) populations. The source of this signature points directly to the Lake Baikal region, providing solid genetic evidence of interaction between Lake Baikal and northern China. The genetic finding is consistent with archaeological evidence: round-bottomed cylindrical jars unearthed at the STM_EN site are an entirely new cultural element in northern Chinese Neolithic archaeology, and their style closely resembles pottery common in the Lake Baikal region. In addition, the distinctive burial posture of males at the site, lying on their sides with flexed, crossed limbs, matches burial customs prevalent in the Lake Baikal region, further confirming close prehistoric cultural ties between the two areas. Archaeologists also found indoor burials at the Sitaimengguying site; through kinship analysis, the team reconstructed the family network of individuals buried in the same dwelling, which included a father and his three biological sons, a mother-daughter pair, and a pair of sisters.