OrangeBot.AI Digest — 2025-12-06
60 headlines across 8 sources, aggregated for this day.
Hacker News (15)
- Bikeshedding, or why I want to build a laptop (geohot.github.io)
- Perl's decline was cultural (www.beatworm.co.uk)
- Z-Image: Powerful and highly efficient image generation model with 6B parameters (github.com)
- GrapheneOS is the only Android OS providing full security patches (grapheneos.social)
- HTML as an Accessible Format for Papers (info.arxiv.org)
- Tiny Core Linux: a 23 MB Linux distro with graphical desktop (www.tinycorelinux.net)
- How I discovered a hidden microphone on a Chinese NanoKVM (telefoncek.si)
- Touching the Elephant – TPUs (considerthebulldog.com)
- The unexpected effectiveness of one-shot decompilation with Claude (blog.chrislewis.au)
- Linux Install Fest Belgrade (dmz.rs)
- Autism's confusing cousins (www.psychiatrymargins.com)
- Wolfram Compute Services (writings.stephenwolfram.com)
- Schizophrenia sufferer mistakes smart fridge ad for psychotic episode (old.reddit.com)
- A compact camera built using an optical mouse (petapixel.com)
- Making tiny 0.1cc two stroke engine from scratch (youtu.be)
GitHub Trending (15)
- microsoft / VibeVoice
Open-Source Frontier Voice AI
- rustfs / rustfs
🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platforms such as MinIO and Ceph.
- RosettaCommons / foundry
Central repository for biomolecular foundation models with shared trainers and pipeline components
- sinelaw / fresh
Text editor for your terminal: easy, powerful and fast
- patchy631 / ai-engineering-hub
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
- psviderski / uncloud
A lightweight tool for deploying and managing containerised applications across a network of Docker hosts. Bridging the gap between Docker and Kubernetes ✨
- oven-sh / bun
Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
- facebook / react
The library for web and native user interfaces.
- lynx-family / lynx
Empower the Web community and invite more to build across platforms.
- DevCaress / guia-entrevistas-de-programacion
- sapientinc / HRM
Hierarchical Reasoning Model Official Release
- projectdiscovery / nuclei-templates
Community curated list of templates for the nuclei engine to find security vulnerabilities.
- paritytech / polkadot-sdk
The Parity Polkadot Blockchain SDK
- golang / go
The Go programming language
- anthropics / claude-quickstarts
A collection of projects designed to help developers quickly get started with building deployable applications using the Claude API
Hugging Face (15)
- Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
Existing diffusion-based video generation methods are fundamentally constrained by sequential computation and long-horizon inconsistency, limiting their practical adoption in real-time, streaming audio-driven avatar synthesis. We present Live Avatar, an algorithm-system co-designed framework that enables efficient, high-fidelity, and infinite-length avatar generation using a 14-billion-parameter diffusion model. Our approach introduces Timestep-forcing Pipeline Parallelism (TPP), a distributed inference paradigm that pipelines denoising steps across multiple GPUs, effectively breaking the autoregressive bottleneck and ensuring stable, low-latency real-time streaming. To further enhance temporal consistency and mitigate identity drift and color artifacts, we propose the Rolling Sink Frame Mechanism (RSFM), which maintains sequence fidelity by dynamically recalibrating appearance using a cached reference image. Additionally, we leverage Self-Forcing Distribution Matching Distillation to facilitate causal, streamable adaptation of large-scale models without sacrificing visual quality. Live Avatar demonstrates state-of-the-art performance, reaching 20 FPS end-to-end generation on 5 H800 GPUs, and, to the best of our knowledge, is the first to achieve practical, real-time, high-fidelity avatar generation at this scale. Our work establishes a new paradigm for deploying advanced diffusion models in industrial long-form video synthesis applications.
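The Timestep-forcing Pipeline Parallelism described above assigns denoising steps to different GPUs so work on successive frames overlaps. A toy tick/stage model (my own illustrative construction, not the paper's implementation) shows why throughput approaches one frame per step once the pipeline fills:

```python
def tpp_schedule(num_frames, num_stages):
    """Toy pipeline schedule: denoising step s of frame f runs on
    GPU s during tick f + s, so stages of different frames overlap.
    (A sketch of the pipelining idea only; tick granularity and
    stage assignment are assumptions, not the paper's design.)"""
    ticks = {}
    for f in range(num_frames):
        for s in range(num_stages):
            ticks.setdefault(f + s, []).append((f, s))
    return ticks

schedule = tpp_schedule(num_frames=8, num_stages=5)
# Strictly sequential denoising would cost 8 * 5 = 40 ticks; the
# pipeline needs only num_frames + num_stages - 1 ticks in this model.
makespan = max(schedule) + 1
```

After the warm-up of `num_stages - 1` ticks, one fully denoised frame exits per tick, which is the sense in which pipelining "breaks the autoregressive bottleneck."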
- DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle
Real-world enterprise data intelligence workflows encompass data engineering that turns raw sources into analysis-ready tables and data analysis that converts those tables into decision-oriented insights. We introduce DAComp, a benchmark of 210 tasks that mirrors these complex workflows. Data engineering (DE) tasks require repository-level engineering on industrial schemas, including designing and building multi-stage SQL pipelines from scratch and evolving existing systems under changing requirements. Data analysis (DA) tasks pose open-ended business problems that demand strategic planning, exploratory analysis through iterative coding, interpretation of intermediate results, and the synthesis of actionable recommendations. Engineering tasks are scored through execution-based, multi-metric evaluation. Open-ended tasks are assessed by a reliable, experimentally validated LLM judge, guided by hierarchical, meticulously crafted rubrics. Our experiments reveal that even state-of-the-art agents falter on DAComp. Performance on DE tasks is particularly low, with success rates under 20%, exposing a critical bottleneck in holistic pipeline orchestration, not merely code generation. Scores on DA tasks also average below 40%, highlighting profound deficiencies in open-ended reasoning and demonstrating that engineering and analysis are distinct capabilities. By clearly diagnosing these limitations, DAComp provides a rigorous and realistic testbed to drive the development of truly capable autonomous data agents for enterprise settings. Our data and code are available at https://da-comp.github.io
- Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction
The evolution of Large Language Models (LLMs) from passive responders to autonomous agents necessitates a fundamental shift in learning paradigms -- from static imitation to incentive-driven decision making. However, this transition is significantly impeded by the lack of scalable infrastructure capable of constructing high-quality interaction signals for effective policy learning. To address this, we introduce a comprehensive method designed to systematically scale the diversity and complexity of interactive environments. Our method realizes this scaling by addressing three orthogonal dimensions: (1) Complexity: NexAU, a flexible agent framework that supports building complex agent hierarchies via simple configurations; (2) Diversity: NexA4A automatically generates diverse agent hierarchies from natural language to cover infinite domains; and (3) Fidelity: NexGAP bridges the simulation-reality gap by integrating dynamic real-world environments for grounded trajectory synthesis. We train Nex-N1 on the diverse and complex interactive environments established by our infrastructure. Empirical results on benchmarks such as SWE-bench and tau2 demonstrate that Nex-N1 consistently outperforms SOTA open-source models and achieves competitive performance against frontier proprietary models on complex agentic tasks. We open-source the Nex ecosystem and model weights to facilitate further research.
- ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Reward models are critical for aligning vision-language systems with human preferences, yet current approaches suffer from hallucination, weak visual grounding, and an inability to use tools for verification, limiting their reliability on complex multimodal reasoning tasks. We present ARM-Thinker, an Agentic multimodal Reward Model that autonomously invokes external tools (e.g., image cropping, doc page retrieval) to ground judgments in verifiable evidence, replacing static, non-interactive reward scoring. This enables the model to verify fine-grained visual details, cross-reference multi-page evidence, and validate reasoning claims, capabilities absent in existing reward models. We train ARM-Thinker with multi-stage reinforcement learning, jointly optimizing tool-calling decisions and judgment accuracy. To evaluate agentic reward modeling, we introduce ARMBench-VL, comprising three benchmarks that assess fine-grained visual grounding (image-level tools), multi-page document understanding (retrieval tools), and instruction following (text-level verification). ARM-Thinker achieves a +16.2% average improvement on reward modeling benchmarks, +9.6% on tool-use tasks, and outperforms baselines on multimodal math and logical reasoning benchmarks. Our results demonstrate that agentic capabilities significantly enhance both the accuracy and interpretability of reward models.
- Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
Efficient streaming video generation is critical for simulating interactive and dynamic worlds. Existing methods distill few-step video diffusion models with sliding window attention, using initial frames as sink tokens to maintain attention performance and reduce error accumulation. However, video frames become overly dependent on these static tokens, resulting in copied initial frames and diminished motion dynamics. To address this, we introduce Reward Forcing, a novel framework with two key designs. First, we propose EMA-Sink, which maintains fixed-size tokens initialized from initial frames and continuously updated by fusing evicted tokens via exponential moving average as they exit the sliding window. Without additional computation cost, EMA-Sink tokens capture both long-term context and recent dynamics, preventing initial frame copying while maintaining long-horizon consistency. Second, to better distill motion dynamics from teacher models, we propose a novel Rewarded Distribution Matching Distillation (Re-DMD). Vanilla distribution matching treats every training sample equally, limiting the model's ability to prioritize dynamic content. Instead, Re-DMD biases the model's output distribution toward high-reward regions by prioritizing samples with greater dynamics rated by a vision-language model. Re-DMD significantly enhances motion quality while preserving data fidelity. We include both quantitative and qualitative experiments to show that Reward Forcing achieves state-of-the-art performance on standard benchmarks while enabling high-quality streaming video generation at 23.1 FPS on a single H100 GPU.
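The EMA-Sink mechanism above can be sketched in a few lines: a fixed-size bank of sink tokens absorbs tokens evicted from the sliding window via an exponential moving average. This is a minimal sketch; the decay rate `alpha` and the token shapes are illustrative assumptions, not values from the paper:

```python
import numpy as np

def ema_sink_update(sink, evicted, alpha=0.9):
    """Fuse a token block evicted from the sliding window into the
    fixed-size sink bank via an exponential moving average: old
    context decays geometrically while recent dynamics blend in."""
    return alpha * sink + (1 - alpha) * evicted

rng = np.random.default_rng(0)
sink = rng.normal(size=(4, 64))      # initialized from the initial frames
for _ in range(100):                 # token blocks leaving the window
    sink = ema_sink_update(sink, rng.normal(size=(4, 64)))
# The bank never grows, so the extra attention cost stays constant
# no matter how long the stream runs.
```

Because the sink is continuously refreshed rather than frozen at the first frames, later frames are not forced to attend to a stale copy of the opening shot, which is the failure mode ("initial frame copying") the paper targets.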
- PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing
Large language models are increasingly embedded into academic writing workflows, yet existing assistants remain external to the editor, preventing deep interaction with document state, structure, and revision history. This separation makes it impossible to support agentic, context-aware operations directly within LaTeX editors such as Overleaf. We present PaperDebugger, an in-editor, multi-agent, plugin-based academic writing assistant that brings LLM-driven reasoning directly into the writing environment. Enabling such in-editor interaction is technically non-trivial: it requires reliable bidirectional synchronization with the editor, fine-grained version control and patching, secure state management, multi-agent scheduling, and extensible communication with external tools. PaperDebugger addresses these challenges through a Chrome-approved extension, a Kubernetes-native orchestration layer, and a Model Context Protocol (MCP) toolchain that integrates literature search, reference lookup, document scoring, and revision pipelines. Our demo showcases a fully integrated workflow, including localized edits, structured reviews, parallel agent execution, and diff-based updates, encapsulated within a minimal-intrusion user interface (UI). Early aggregated analytics demonstrate active user engagement and validate the practicality of an editor-native, agentic writing assistant. More details about this demo, including a video, can be found at https://github.com/PaperDebugger/PaperDebugger.
- Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
Latent Diffusion Models (LDMs) inherently follow a coarse-to-fine generation process, where high-level semantic structure is generated slightly earlier than fine-grained texture. This indicates that the preceding semantics can benefit texture generation by providing a semantic anchor. Recent advances have integrated semantic priors from pretrained visual encoders to further enhance LDMs, yet they still denoise semantic and VAE-encoded texture latents synchronously, neglecting this ordering. Based on these observations, we propose Semantic-First Diffusion (SFD), a latent diffusion paradigm that explicitly prioritizes semantic formation. SFD first constructs composite latents by combining a compact semantic latent, which is extracted from a pretrained visual encoder via a dedicated Semantic VAE, with the texture latent. The core of SFD is to denoise the semantic and texture latents asynchronously using separate noise schedules: semantics precede textures by a temporal offset, providing clearer high-level guidance for texture refinement and enabling natural coarse-to-fine generation. On ImageNet 256x256 with guidance, SFD achieves FID 1.06 (LightningDiT-XL) and FID 1.04 (1.0B LightningDiT-XXL), while converging up to 100x faster than the original DiT. SFD also improves existing methods like ReDi and VA-VAE, demonstrating the effectiveness of asynchronous, semantics-led modeling. Project page and code: https://yuemingpan.github.io/SFD.github.io/.
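The asynchronous scheduling at the heart of SFD (semantics leading textures by a temporal offset) reduces to a simple timestep mapping. The offset value and the clamping below are illustrative assumptions on my part, not the paper's schedule:

```python
def asynchronous_timesteps(t, offset=0.2):
    """Map a shared sampling time t (1.0 = pure noise, 0.0 = clean)
    to per-latent timesteps: the semantic latent runs `offset` ahead
    of the texture latent, so it is always the cleaner of the two and
    can anchor texture refinement."""
    t_semantic = max(0.0, t - offset)   # semantics denoise first
    t_texture = t                        # texture follows behind
    return t_semantic, t_texture

# Midway through sampling (t = 0.5) the semantic latent is already at
# roughly t = 0.3, i.e. noticeably further denoised than the texture.
```

A synchronous schedule is the special case `offset=0`, which is exactly what the abstract says prior semantic-prior methods do.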
- 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer
Constructing 4D language fields is crucial for embodied AI, augmented/virtual reality, and 4D scene understanding, as they provide enriched semantic representations of dynamic environments and enable open-vocabulary querying in complex scenarios. However, existing approaches to 4D semantic field construction primarily rely on scene-specific Gaussian splatting, which requires per-scene optimization, exhibits limited generalization, and is difficult to scale to real-world applications. To address these limitations, we propose 4DLangVGGT, the first Transformer-based feed-forward unified framework for 4D language grounding that jointly integrates geometric perception and language alignment within a single architecture. 4DLangVGGT has two key components: the 4D Visual Geometry Transformer, StreamVGGT, which captures spatio-temporal geometric representations of dynamic scenes; and the Semantic Bridging Decoder (SBD), which projects geometry-aware features into a language-aligned semantic space, thereby enhancing semantic interpretability while preserving structural fidelity. Unlike prior methods that depend on costly per-scene optimization, 4DLangVGGT can be jointly trained across multiple dynamic scenes and directly applied during inference, achieving both deployment efficiency and strong generalization. This design significantly improves the practicality of large-scale deployment and establishes a new paradigm for open-vocabulary 4D scene understanding. Experiments on the HyperNeRF and Neu3D datasets demonstrate that our approach not only generalizes effectively but also achieves state-of-the-art performance, with up to 2% gains under per-scene training and 1% improvements under multi-scene training. Our code is released at https://github.com/hustvl/4DLangVGGT
- DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
Understanding the dynamic physical world, characterized by its evolving 3D structure, real-world motion, and semantic content with textual descriptions, is crucial for human-agent interaction and enables embodied agents to perceive and act within real environments with human-like capabilities. However, existing datasets are often derived from limited simulators or rely on traditional Structure-from-Motion for up-to-scale annotation, and offer limited descriptive captioning, which restricts the capacity of foundation models to accurately interpret real-world dynamics from monocular videos, commonly sourced from the internet. To bridge these gaps, we introduce DynamicVerse, a physical-scale, multimodal 4D world modeling framework for dynamic real-world video. We employ large vision, geometric, and multimodal models to interpret metric-scale static geometry, real-world dynamic motion, instance-level masks, and holistic descriptive captions. By integrating window-based Bundle Adjustment with global optimization, our method converts long real-world video sequences into a comprehensive 4D multimodal format. DynamicVerse delivers a large-scale dataset consisting of 100K+ videos with 800K+ annotated masks and 10M+ frames from internet videos. Experimental evaluations on three benchmark tasks, namely video depth estimation, camera pose estimation, and camera intrinsics estimation, demonstrate that our 4D modeling achieves superior performance in capturing physical-scale measurements with greater global accuracy than existing methods.
- UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers
Recent image diffusion transformers achieve high-fidelity generation at their training resolutions, but struggle to generate images beyond those scales, suffering from content repetition and quality degradation. In this work, we present UltraImage, a principled framework that addresses both issues. Through frequency-wise analysis of positional embeddings, we identify that repetition arises from the periodicity of the dominant frequency, whose period aligns with the training resolution. We introduce a recursive dominant frequency correction to constrain it within a single period after extrapolation. Furthermore, we find that quality degradation stems from diluted attention and thus propose entropy-guided adaptive attention concentration, which assigns higher focus factors to sharpen local attention for fine detail and lower ones to global attention patterns to preserve structural consistency. Experiments show that UltraImage consistently outperforms prior methods on Qwen-Image and Flux (around 4K) across three generation scenarios, reducing repetition and improving visual fidelity. Moreover, UltraImage can generate images up to 6K×6K without low-resolution guidance from a training resolution of 1328p, demonstrating its extreme extrapolation capability. Project page: https://thu-ml.github.io/ultraimage.github.io/
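The repetition mechanism the abstract describes (the dominant positional frequency has a period matched to the training resolution, so positions beyond it wrap around) is at bottom modular arithmetic. The sketch below, with hypothetical resolutions, shows the wrap and how stretching the dominant period to the target resolution keeps every position inside a single cycle, which is the essence of the correction:

```python
def dominant_phase(pos, period):
    """Phase (in cycles) of the dominant positional frequency; positions
    past `period` wrap around, which is what yields repeated content."""
    return (pos / period) % 1.0

train_res, target_res = 1024, 4096   # illustrative values only
# Without correction, position 1024 aliases back to position 0:
aliased = dominant_phase(train_res, train_res) == dominant_phase(0, train_res)
# Stretching the dominant period to the target resolution removes the wrap:
phases = [dominant_phase(p, target_res) for p in range(0, target_res, 256)]
strictly_increasing = all(a < b for a, b in zip(phases, phases[1:]))
```

The paper's correction is recursive and operates on actual positional embeddings, but the goal is the same: no two positions in the extrapolated range share the dominant-frequency phase.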
- Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting
Synthesizing high-fidelity frozen 3D scenes from monocular Mannequin-Challenge (MC) videos is a unique problem distinct from standard dynamic scene reconstruction. Instead of focusing on modeling motion, our goal is to create a frozen scene while strategically preserving subtle dynamics to enable user-controlled instant selection. To achieve this, we introduce a novel application of dynamic Gaussian splatting: the scene is modeled dynamically, which retains nearby temporal variation, and a static scene is rendered by fixing the model's time parameter. However, under this usage, monocular capture with sparse temporal supervision introduces artifacts like ghosting and blur for Gaussians that become unobserved or occluded at weakly supervised timestamps. We propose Splannequin, an architecture-agnostic regularization that detects two states of Gaussian primitives, hidden and defective, and applies temporal anchoring. Under predominantly forward camera motion, hidden states are anchored to their recent well-observed past states, while defective states are anchored to future states with stronger supervision. Our method integrates into existing dynamic Gaussian pipelines via simple loss terms, requires no architectural changes, and adds zero inference overhead. This results in markedly improved visual quality, enabling high-fidelity, user-selectable frozen-time renderings, validated by a 96% user preference. Project page: https://chien90190.github.io/splannequin/
- NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation
Standard diffusion corrupts data using Gaussian noise whose Fourier coefficients have random magnitudes and random phases. While effective for unconditional or text-to-image generation, corrupting phase components destroys spatial structure, making it ill-suited for tasks requiring geometric consistency, such as re-rendering, simulation enhancement, and image-to-image translation. We introduce Phase-Preserving Diffusion (φ-PD), a model-agnostic reformulation of the diffusion process that preserves input phase while randomizing magnitude, enabling structure-aligned generation without architectural changes or additional parameters. We further propose Frequency-Selective Structured (FSS) noise, which provides continuous control over structural rigidity via a single frequency-cutoff parameter. φ-PD adds no inference-time cost and is compatible with any diffusion model for images or videos. Across photorealistic and stylized re-rendering, as well as sim-to-real enhancement for driving planners, φ-PD produces controllable, spatially aligned results. When applied to the CARLA simulator, φ-PD improves CARLA-to-Waymo planner performance by 50%. The method is complementary to existing conditioning approaches and broadly applicable to image-to-image and video-to-video generation. Videos, additional examples, and code are available on our project page: https://yuzeng-at-tri.github.io/ppd-page/
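The core operation (corrupt Fourier magnitudes while keeping the input's phase spectrum) can be sketched directly with NumPy FFTs. This is an illustrative construction of phase-preserving corruption, not the paper's exact forward process:

```python
import numpy as np

def phase_preserving_corruption(x, rng):
    """Replace the Fourier magnitudes of `x` with those of Gaussian
    noise while keeping x's phases, so spatial structure (which lives
    largely in the phase spectrum) survives the corruption."""
    phase = np.angle(np.fft.fft2(x))
    noise_mag = np.abs(np.fft.fft2(rng.normal(size=x.shape)))
    # Both factors have the Hermitian symmetry of a real signal's
    # spectrum, so the inverse FFT is real up to rounding error.
    return np.fft.ifft2(noise_mag * np.exp(1j * phase)).real

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 32))
y = phase_preserving_corruption(x, rng)
```

Checking `np.angle(np.fft.fft2(y))` against `np.angle(np.fft.fft2(x))` confirms the phases are untouched even though the magnitudes are fully randomized, which is the property the abstract credits for structure alignment.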
- SIMA 2: A Generalist Embodied Agent for Virtual Worlds
We introduce SIMA 2, a generalist embodied agent that understands and acts in a wide variety of 3D virtual worlds. Built upon a Gemini foundation model, SIMA 2 represents a significant step toward active, goal-directed interaction within an embodied environment. Unlike prior work (e.g., SIMA 1) limited to simple language commands, SIMA 2 acts as an interactive partner, capable of reasoning about high-level goals, conversing with the user, and handling complex instructions given through language and images. Across a diverse portfolio of games, SIMA 2 substantially closes the gap with human performance and demonstrates robust generalization to previously unseen environments, all while retaining the base model's core reasoning capabilities. Furthermore, we demonstrate a capacity for open-ended self-improvement: by leveraging Gemini to generate tasks and provide rewards, SIMA 2 can autonomously learn new skills from scratch in a new environment. This work validates a path toward creating versatile and continuously learning agents for both virtual and, eventually, physical worlds.
- Model-Based and Sample-Efficient AI-Assisted Math Discovery in Sphere Packing
Sphere packing, Hilbert's eighteenth problem, asks for the densest arrangement of congruent spheres in n-dimensional Euclidean space. Although relevant to areas such as cryptography, crystallography, and medical imaging, the problem remains unresolved: beyond a few special dimensions, neither optimal packings nor tight upper bounds are known. Even a major breakthrough in dimension n=8, later recognised with a Fields Medal, underscores its difficulty. A leading technique for upper bounds, the three-point method, reduces the problem to solving large, high-precision semidefinite programs (SDPs). Because each candidate SDP may take days to evaluate, standard data-intensive AI approaches are infeasible. We address this challenge by formulating SDP construction as a sequential decision process, the SDP game, in which a policy assembles SDP formulations from a set of admissible components. Using a sample-efficient model-based framework that combines Bayesian optimisation with Monte Carlo Tree Search, we obtain new state-of-the-art upper bounds in dimensions 4-16, showing that model-based search can advance computational progress in longstanding geometric problems. Together, these results demonstrate that sample-efficient, model-based search can make tangible progress on mathematically rigid, evaluation-limited problems, pointing towards a complementary direction for AI-assisted discovery beyond large-scale LLM-driven exploration.
- TV2TV: A Unified Framework for Interleaved Language and Video Generation
Video generation models are rapidly advancing, but can still struggle with complex video outputs that require significant semantic branching or repeated high-level reasoning about what should happen next. In this paper, we introduce a new class of omni video-text models that integrate ideas from recent LM reasoning advances to address this challenge. More specifically, we present TV2TV, a unified generative modeling framework which decomposes video generation into an interleaved text and video generation process. TV2TV jointly learns language modeling (next-token prediction) and video flow matching (next-frame prediction) using a Mixture-of-Transformers (MoT) architecture. At inference time, TV2TV decides when to alternate between generating text and video frames, allowing the model to "think in words" about subsequent content before "acting in pixels" to produce frames. This design offloads much of the responsibility for deciding what should happen next to the language modeling tower, enabling improved visual quality and prompt alignment of generated videos. It also enables fine-grained controllability, allowing users to modify the video generation trajectory through text interventions at any point in the process. In controlled experiments on video game data, TV2TV demonstrates substantial improvements in both visual quality and controllability. TV2TV also scales to natural videos, as we show by augmenting sports videos with interleaved natural language action descriptions using vision-language models (VLMs). Training TV2TV on this corpus yields strong visual quality and prompt alignment, showcasing the model's ability to reason about and generate complex real-world action sequences. Together, these results highlight TV2TV as a promising step toward video generation with open-ended textual reasoning and control.
Solidot (15)
- Linus Torvalds defends Windows' Blue Screen of Death
Linus Sebastian interviewed Linus Torvalds, and the conversation touched on Torvalds' preference for ECC memory; in his answer, Torvalds commented on Windows' famous Blue Screen of Death (BSOD). Torvalds said a large share of BSODs are not actually software bugs but the result of unreliable hardware, and that overclocking adds further instability. He believes ECC memory improves system reliability and lets users trust their machines: without ECC, memory will fail sooner or later, and a Microsoft BSOD is often caused by a hardware problem rather than a software bug. He also commented in passing on Elon Musk's style of managing programmers.
- Domestic cats reached China around the Tang dynasty
According to a genetic analysis published in Cell Genomics, today's domestic cats (Felis catus) arrived in China around the Tang dynasty. Before then, the mouse-catching animal found in China was the leopard cat (Prionailurus bengalensis). Modern domestic cats descend from the African wildcat of the Near East and spread worldwide after domestication. China's earliest archaeological record of a felid comes from the Quanhucun site in Shaanxi, more than 5,000 years old; a cat skeleton there showed a close relationship with humans and was once thought to be a possible domestic cat, but was later identified as the leopard cat, a native felid similar in size to domestic cats. When and how domestic cats entered China has therefore remained an open question. To answer it, Peking University postdoc Han Yu and colleagues collected and analyzed 22 small-felid bone samples from human settlements spanning more than 5,000 years, covering most of China's known ancient cat remains. Using ancient-DNA techniques they obtained mitochondrial genomes for all 22 samples and whole genomes for 7 of them. Seven samples were leopard cats, dating from the late Neolithic Yangshao culture 5,400 years ago to the end of the Eastern Han 1,800 years ago, revealing a close relationship between leopard cats and humans lasting more than 3,500 years. Fourteen samples were identified as domestic cats, all from the Tang dynasty or later. The earliest domestic cat remains found in China so far were unearthed at the Tang-era Tongwancheng site in Jingbian, Shaanxi, radiocarbon-dated to 706-883 CE, about 1,200 years ago. Genomic phenotype reconstruction shows the cat was male, probably all white or white with tabby patches, short-haired and long-tailed, and carrying none of the genetic defects common in modern domestic cats. Combined with written records and archaeological imagery, domestic cats must have entered China somewhat earlier than the excavated remains, likely around the 6th-7th century CE, around the Tang dynasty. Genomic analysis further pinned down the route: Tang-era Chinese domestic cats are genetically close to contemporaneous domestic cats unearthed at the Dzhankent site in Kazakhstan and to African wildcats and domestic cats from the Levant in the Near East. These three regions sit at key hubs of the overland Silk Road, indicating that domestic cats most likely traveled with merchants along the Silk Road from the eastern Mediterranean through Central Asia into China.
- Three in ten UK doctors use AI tools during consultations
According to a study by the Nuffield Trust think tank, three in ten UK general practitioners use AI tools such as ChatGPT during consultations; because AI tools inevitably hallucinate, using them may lead doctors into errors and expose them to litigation. The study surveyed 2,108 family doctors, of whom 598 (28%) said they already use AI tools; male doctors (33%) use AI at a higher rate than female doctors (25%), and doctors in affluent areas use it far more than those in deprived areas. The report notes that, whether or not they use AI, the vast majority of GPs worry that their practices could face "professional liability and medico-legal issues", "risk of clinical error", and "patient privacy and data security" problems. The survey also found that doctors who use AI tools spend the time saved on rest rather than on seeing more patients.
- Students with higher thesis plagiarism rates are more likely to enter government and are promoted faster
Researchers from Harvard University, the University of Hong Kong, and the University of Chicago published a paper running originality checks on more than 500,000 Chinese graduate theses and found plagiarism to be widespread, with about 14% of theses exceeding the official similarity threshold of 15%. The data show that graduates who plagiarized enter the public sector at a significantly higher rate than their non-plagiarizing classmates, especially in powerful departments such as tax and customs, pointing to "negative selection" on dishonesty at the point of career entry. Once inside the system, this negative effect is amplified further in promotions: tracking civil servants' career trajectories shows that, holding seniority and background constant, officials with a plagiarism record are promoted 10-15% faster on average. Even in the highly professionalized judicial system, a plagiarism record still independently predicts promotion probability after controlling for performance metrics such as judges' caseloads and appeal rates. By analyzing more than 140 million court judgments and exploiting the quasi-random assignment of cases, the study finds that cases heard by judges with a plagiarism record are more likely to favor the government, state-owned enterprises, or large companies, have higher appeal rates, offer thinner reasoning in the written judgments, and invoke discretionary clauses more frequently.
- Netflix to acquire Warner Bros for $82.7 billion
Netflix issued a press release formally announcing that it will acquire Warner Bros, that is, the film and streaming businesses of Warner Bros Discovery, for $82.7 billion; in other words, the streaming service HBO Max will become part of Netflix. The acquisition is expected to reshape the US media landscape and may face antitrust review. Netflix is paying $27.75 per share, an equity value of $72 billion; the total enterprise value, including the company's debt and stock, is about $82.7 billion. The boards of both companies unanimously approved the deal.
- AI chatbots excel at using inaccurate information to change people's political views
According to a study published in Science, AI chatbots are adept at changing people's political views, and their persuasiveness is even more striking when they use inaccurate information. The researchers recruited nearly 77,000 participants through crowdsourcing platforms, paying them to converse with AI chatbots from companies including OpenAI, Meta, and xAI. Participants were first asked for their positions on various political issues, and the chatbots then tried to change their minds and get them to accept the opposite position. The study shows that AI chatbots make very effective political lobbyists. The researchers found that the large volume of information the AI provided contained many inaccurate claims, and that "the most persuasive models and prompting strategies produced the least accurate information." 19% of the statements the chatbots made in the study were judged "largely inaccurate." The researchers worry that highly persuasive AI chatbots could be exploited by unscrupulous actors to spread radical political or religious ideologies, or to foment political unrest between geopolitical rivals.
- Long-term calorie restriction can slow brain aging
In the 1980s the US National Institute on Aging ran a study in which participants were split into two groups: one ate a balanced regular diet, the other cut calorie intake by 30%. The study's original purpose was to investigate whether reducing calorie intake extends lifespan. The participants all lived until natural death. After they died, researchers analyzed their brains, comparing brain cells from the normal-diet and calorie-restricted groups to see how reduced calorie intake affected gene expression and pathway activity linked to brain-cell aging. They found that brain cells under calorie restriction were metabolically healthier and functioned better, with increased expression of myelin-related genes and heightened activity in key metabolic pathways closely tied to myelin formation and maintenance. These findings support the idea that long-term dietary intervention can influence the trajectory of brain aging at the cellular level.
- Latest experiments do not support the fourth-neutrino hypothesis
Nature has published two studies that once again test the sterile neutrino hypothesis that has long puzzled physicists. The experiments found no evidence of any additional type of neutrino: the measurements are essentially consistent with Standard Model predictions, weakening the case for a fourth type of neutrino, the sterile neutrino. Neutrinos are the most abundant elementary particles in the universe after photons. They come in three known flavors (electron, muon, and tau) and can oscillate back and forth between them. Over the past decades, the Los Alamos experiment, the MiniBooNE experiment, and several "gallium anomaly" experiments produced results that were hard to explain: some observed an excess of electron neutrinos, others a deficit. These anomalies had been interpreted as possible evidence for one or more sterile neutrinos. The latest MicroBooNE experiment at the US Fermi National Accelerator Laboratory shows that the observed number of electron neutrinos is essentially consistent with Standard Model expectations, without the excess reported by MiniBooNE; this result excludes theoretical models containing a single sterile neutrino at 95% confidence. The other study comes from the Karlsruhe Tritium Neutrino (KATRIN) experiment in Germany, which measures the electron energy spectrum near the endpoint of tritium decay to indirectly infer the neutrino mass while also testing for sterile neutrinos. In theory, a sterile neutrino would leave a measurable anomalous signal in the spectrum, but the experiment's latest data show no such feature.
- Yangtze Memory's NAND market share tops 10%
Data from the Hong Kong research firm Counterpoint show that Yangtze Memory (YMTC) first reached a 10% share of global NAND shipments in January-March 2025. In July-September 2025 its share rose 4 percentage points year-on-year to 13%, closing in on fourth-ranked Micron of the US, and its full-year share is expected to exceed 10%. YMTC aims to reach a 15% share of unit sales by the end of 2026 and is pushing ahead with factory investment around Wuhan, China. Once the investments currently under way are complete, YMTC will account for roughly 20% of global supply, a scale that surpasses Japan's Kioxia and approaches South Korea's SK Hynix. ChangXin Memory (CXMT) ranks fourth worldwide in DRAM, but its share is far below those of Samsung, SK Hynix, and Micron.
- Study says a volcanic eruption triggered the spread of the Black Death
The Black Death, one of the deadliest plagues in human history, killed nearly half of Europe's population. According to a study published in Communications Earth & Environment, the plague may have been set off by a volcanic eruption. Combining tree rings from across Europe, ice cores from Antarctica and Greenland, and historical documents, the researchers argue that around 1345, two years before the outbreak, a volcano erupted, possibly in the tropics; its ash dimmed sunlight over parts of the Mediterranean for years afterwards, lowering temperatures and causing crop failures. Food shortages forced Italian city-states such as Venice and Genoa to import grain urgently from the Black Sea region. Unfortunately, the grain ships carried a deadly bacterium, Yersinia pestis, sparking the plague that swept across Europe. Between 1347 and 1351 the Black Death claimed at least 25 million lives. The study argues the eruption was the first trigger of the plague.
- Astronomers observe the largest rotating cosmic structure yet
An international team led by the University of Oxford has confirmed the largest rotating structure yet observed in the universe: a chain of galaxies about 140 million light-years from Earth, embedded "like a knife edge" in a huge rotating cosmic filament. Described as a "fossil record of cosmic flows," it offers a new window on galaxy formation in the early universe. Cosmic filaments, the largest known class of structures in the universe, are elongated networks of galaxies and dark matter that act as "highways" channeling matter and momentum into galaxies. Using data from South Africa's MeerKAT radio telescope combined with optical observations from the Dark Energy Spectroscopic Instrument and the Sloan Digital Sky Survey, the team found a "long chain" of 14 hydrogen-rich galaxies, about 5.5 million light-years long and about 117,000 light-years wide, embedded in a filament containing more than 280 galaxies. What makes the discovery special is that not only is the filament itself rotating, but the galaxies' spin directions are strongly correlated with the filament's own rotation, far beyond what a random distribution would predict, challenging existing galaxy-formation models. Dynamical modeling puts the rotation speed at 110 km/s.
- Home-built light aircraft crashes after 3D-printed part softens from engine heat
The Cozy Mk IV is an experimental light aircraft whose parts can be purchased and assembled at home. On March 18 a Cozy Mk IV crashed at Gloucestershire Airport in Staverton, England; the pilot, the only person on board, suffered only minor injuries. The investigation by the UK Air Accidents Investigation Branch (AAIB) found that the cause was a 3D-printed intake elbow made from an unsuitable material: it softened in the heat generated by the engine and deformed, and the pilot then found the engine had lost all power. The AAIB says it will tighten inspections of 3D-printed parts in the future.
- Subaru owners complain about full-screen ads popping up while driving
Subaru owners are complaining that the in-car infotainment system pops up full-screen SiriusXM ads while they are driving, sometimes even covering Apple CarPlay. Subaru says the ads appear only twice a year, but owners argue the pop-ups are distracting and compromise driving safety, and worry the auto industry's practices in this area will only get worse. The owner of a 2024 Crosstrek said a pop-up ad appeared while they were using Apple CarPlay and took over the entire screen; forcibly interrupting an app in use to serve in-car advertising is especially egregious. A Subaru spokesperson said the company will meet to discuss the issue and takes customer feedback very seriously. Subaru says the ads are typically sent twice a year, around Memorial Day and Thanksgiving.
- Russia blocks Apple's FaceTime and gaming platform Roblox
Russia has blocked Apple's FaceTime and the gaming platform Roblox. Russia claims FaceTime is used for criminal activity, while Roblox is accused of spreading extremist material and LGBT propaganda. Russia had previously restricted Google's YouTube, Meta's WhatsApp, and Telegram. Critics say the moves amount to censorship and are intended to tighten state control over private communications; Russia maintains they are legitimate law enforcement measures. Citing statements from law enforcement agencies, Russia's communications regulator Roskomnadzor said FaceTime was being used to organize and carry out terrorist attacks inside Russia, recruit criminals, and commit fraud and other crimes against Russian citizens, but it provided no evidence.
- Netflix close to acquiring HBO
Netflix is currently the front-runner in the bidding for Warner Bros. Discovery's studio and streaming businesses. Netflix values Warner's studios, HBO Max, and related businesses at $28 per share; Paramount's offer is close to $27 per share. Paramount aims to acquire all of Warner's assets, while Netflix and the other bidder, Comcast, are interested only in the studio and streaming assets; Warner also owns cable networks including CNN. Bloomberg reports that Warner and Netflix have entered exclusive negotiations. Paramount's owners, the Ellison family (Larry Ellison), are close to US President Trump, and Paramount claims that only its bid can clear the government's antitrust review, while other companies' bids would fail it.