Weekly Digest — 2025-W41
130 unique stories (2025-10-06 → 2025-10-12), aggregated across 8 sources.
Hacker News(42)
- OpenZL: An open source format-aware compression framework (engineering.fb.com)
- Apps SDK (developers.openai.com)
- Ladybird passes the Apple 90% threshold on web-platform-tests (twitter.com)
- The AI bubble is 17 times the size of the dot-com frenzy and four times subprime (www.morningstar.com)
- Mise: Monorepo Tasks (github.com)
- Indefinite Backpack Travel (jeremymaluf.com)
- Gemini 2.5 Computer Use model (blog.google)
- ICE bought vehicles equipped with fake cell towers to spy on phones (techcrunch.com)
- Google's requirement for developers to be verified threatens app store F-Droid (www.techdirt.com)
- Solar energy is now the cheapest source of power, study finds (www.surrey.ac.uk)
- German government comes out against Chat Control (xcancel.com)
- Doing Rails Wrong (www.bananacurvingmachine.com)
GitHub Trending(23)
- Infisical / infisical
Infisical is the open-source platform for secrets management, PKI, and SSH access.
- meshery / meshery
Meshery, the cloud native manager
- BeehiveInnovations / zen-mcp-server
The power of Claude Code / GeminiCLI / CodexCLI + [Gemini / OpenAI / OpenRouter / Azure / Grok / Ollama / Custom Model / All Of The Above] working as one.
- Stremio / stremio-web
Stremio - Freedom to Stream
- microsoft / BitNet
Official inference framework for 1-bit LLMs
- TapXWorld / ChinaTextbook
PDF textbooks for all levels: primary, middle, and high school, plus university.
- trycua / cua
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
- simstudioai / sim
Open-source platform to build and deploy AI agent workflows.
- FlowiseAI / Flowise
Build AI Agents, Visually
- browserbase / stagehand
The AI Browser Automation Framework
- MODSetter / SurfSense
Open Source Alternative to NotebookLM / Perplexity, connected to external sources such as Search Engines, Slack, Linear, Jira, ClickUp, Confluence, Notion, YouTube, GitHub, Discord and more. Join our discord: https://discord.gg/ejRNvftDp9
- google / computer-use-preview
Hugging Face(32)
- Apriel-1.5-15b-Thinker
We present Apriel-1.5-15B-Thinker, a 15-billion parameter open-weights multimodal reasoning model that achieves frontier-level performance through training design rather than sheer scale. Starting from Pixtral-12B, we apply a progressive three-stage methodology: (1) depth upscaling to expand reasoning capacity without pretraining from scratch, (2) staged continual pre-training that first develops foundational text and vision understanding, then enhances visual reasoning through targeted synthetic data generation addressing spatial structure, compositional understanding, and fine-grained perception, and (3) high-quality text-only supervised fine-tuning on curated instruction-response pairs with explicit reasoning traces spanning mathematics, coding, science, and tool use. Notably, our model achieves competitive results without reinforcement learning or preference optimization, isolating the contribution of our data-centric continual pre-training approach. On the Artificial Analysis Intelligence Index, Apriel-1.5-15B-Thinker attains a score of 52, matching DeepSeek-R1-0528 despite requiring significantly fewer computational resources. Across ten image benchmarks, its performance is on average within five points of Gemini-2.5-Flash and Claude Sonnet-3.7, a key achievement for a model operating within single-GPU deployment constraints. Our results demonstrate that thoughtful mid-training design can close substantial capability gaps without massive scale, making frontier-level multimodal reasoning accessible to organizations with limited infrastructure. We release the model checkpoint, all training recipes, and evaluation protocols under the MIT license to advance open-source research.
- Large Reasoning Models Learn Better Alignment from Flawed Thinking
Large reasoning models (LRMs) "think" by generating structured chain-of-thought (CoT) before producing a final answer, yet they still lack the ability to reason critically about safety alignment and are easily biased when a flawed premise is injected into their thought process. We propose RECAP (Robust Safety Alignment via Counter-Aligned Prefilling), a principled reinforcement learning (RL) method for post-training that explicitly teaches models to override flawed reasoning trajectories and reroute to safe and helpful responses. RECAP trains on a mixture of synthetically generated counter-aligned CoT prefills and standard prompts, requires no additional training cost or modifications beyond vanilla reinforcement learning from human feedback (RLHF), and substantially improves safety and jailbreak robustness, reduces overrefusal, and preserves core reasoning capability -- all while maintaining inference token budget. Extensive analysis shows that RECAP-trained models engage in self-reflection more frequently and remain robust under adaptive attacks, preserving safety even after repeated attempts to override their reasoning.
- Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
Visual tokens consume substantial computational resources in multi-modal large models (MLLMs), significantly compromising their efficiency. Recent works have attempted to improve efficiency by compressing visual tokens during training, either through modifications to model components or by introducing additional parameters. However, they often overlook the increased learning difficulty caused by such compression, as the model's parameter space struggles to quickly adapt to the substantial perturbations in the feature space induced by token compression. In this work, we propose to develop Efficient MLLMs via Progressive Consistency Distillation (EPIC), a progressive learning framework. Specifically, by decomposing the feature space perturbations introduced by token compression along the token-wise and layer-wise dimensions, we introduce token consistency distillation and layer consistency distillation, respectively, aiming to reduce the training difficulty by leveraging guidance from a teacher model and following a progressive learning trajectory. Extensive experiments demonstrate the superior effectiveness, robustness, and generalization capabilities of our proposed framework.
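The token-consistency idea in the abstract lends itself to a toy sketch: a student operating on compressed visual tokens is distilled toward a teacher that sees the full token sequence. Everything here (norm-based token selection, mean pooling, an MSE loss) is an illustrative stand-in, not EPIC's actual components:

```python
import numpy as np

def compress_tokens(feats, keep_ratio=0.5):
    """Toy visual-token compression: keep the highest-norm tokens, in order."""
    k = max(1, int(len(feats) * keep_ratio))
    idx = np.argsort(-np.linalg.norm(feats, axis=1))[:k]
    return feats[np.sort(idx)]

def consistency_loss(student_feats, teacher_feats):
    """MSE between pooled student and teacher representations."""
    diff = student_feats.mean(axis=0) - teacher_feats.mean(axis=0)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(1)
tokens = rng.normal(size=(16, 8))             # 16 visual tokens, dim 8
teacher_feats = tokens                        # teacher processes all tokens
student_feats = compress_tokens(tokens, 0.5)  # student sees the compressed half
loss = consistency_loss(student_feats, teacher_feats)
```

In the paper's progressive framing, the compression ratio would be tightened gradually so the student's feature space is never perturbed too abruptly at once.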
- Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition
Diffusion-based models for robotic control, including vision-language-action (VLA) and vision-action (VA) policies, have demonstrated significant capabilities. Yet their advancement is constrained by the high cost of acquiring large-scale interaction datasets. This work introduces an alternative paradigm for enhancing policy performance without additional model training. Perhaps surprisingly, we demonstrate that the composed policies can exceed the performance of either parent policy. Our contribution is threefold. First, we establish a theoretical foundation showing that the convex composition of distributional scores from multiple diffusion models can yield a superior one-step functional objective compared to any individual score. A Grönwall-type bound is then used to show that this single-step improvement propagates through entire generation trajectories, leading to systemic performance gains. Second, motivated by these results, we propose General Policy Composition (GPC), a training-free method that enhances performance by combining the distributional scores of multiple pre-trained policies via a convex combination and test-time search. GPC is versatile, allowing for the plug-and-play composition of heterogeneous policies, including VA and VLA models, as well as those based on diffusion or flow-matching, irrespective of their input visual modalities. Third, we provide extensive empirical validation. Experiments on Robomimic, PushT, and RoboTwin benchmarks, alongside real-world robotic evaluations, confirm that GPC consistently improves performance and adaptability across a diverse set of tasks. Further analysis of alternative composition operators and weighting strategies offers insights into the mechanisms underlying the success of GPC. These results establish GPC as a simple yet effective method for improving control performance by leveraging existing policies.
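The heart of GPC as described is a convex combination of the parent policies' distributional scores at each denoising step. A minimal numeric sketch, with toy score functions standing in for real diffusion policies and a fixed weight in place of the paper's test-time search:

```python
import numpy as np

def composed_score(score_a, score_b, x, t, w=0.5):
    """Convex combination of two policies' score estimates at state x, time t.
    0 <= w <= 1; in GPC the weight would be chosen by test-time search."""
    return w * score_a(x, t) + (1.0 - w) * score_b(x, t)

# Toy score functions standing in for two pre-trained diffusion policies.
score_a = lambda x, t: -x            # pulls samples toward 0
score_b = lambda x, t: -(x - 1.0)    # pulls samples toward 1

x = np.array([2.0])
s = composed_score(score_a, score_b, x, t=0.5, w=0.25)
# 0.25 * (-2.0) + 0.75 * (-1.0) = -1.25
```

Because the combination is convex, the composed score stays within the span of the parents' estimates, which is what makes plugging in heterogeneous diffusion or flow-matching policies safe.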
- CoDA: Agentic Systems for Collaborative Data Visualization
Deep research has revolutionized data analysis, yet data scientists still devote substantial time to manually crafting visualizations, highlighting the need for robust automation from natural language queries. However, current systems struggle with complex datasets containing multiple files and iterative refinement. Existing approaches, including simple single- or multi-agent systems, often oversimplify the task, focusing on initial query parsing while failing to robustly manage data complexity, code errors, or final visualization quality. In this paper, we reframe this challenge as a collaborative multi-agent problem. We introduce CoDA, a multi-agent system that employs specialized LLM agents for metadata analysis, task planning, code generation, and self-reflection. We formalize this pipeline, demonstrating how metadata-focused analysis bypasses token limits and quality-driven refinement ensures robustness. Extensive evaluations show CoDA achieves substantial gains in the overall score, outperforming competitive baselines by up to 41.5%. This work demonstrates that the future of visualization automation lies not in isolated code generation but in integrated, collaborative agentic workflows.
- Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
The recent hardware-accelerated microscaling 4-bit floating-point formats such as MXFP4 and NVFP4, supported on NVIDIA and AMD GPUs, promise to revolutionize large language model (LLM) inference. Yet, their practical benefits remain unproven. We present the first comprehensive study of MXFP4 and NVFP4 for post-training quantization, revealing gaps between their promise and real-world performance. Our analysis shows that state-of-the-art methods struggle with FP4, due to two key issues: (1) NVFP4's small group size provably neutralizes traditional outlier mitigation techniques; (2) MXFP4's power-of-two scale quantization severely degrades accuracy due to high induced error. To bridge this gap, we introduce Micro-Rotated-GPTQ (MR-GPTQ), a variant of the classic GPTQ quantization algorithm that tailors the quantization process to FP4's unique properties, by using block-wise Hadamard transforms and format-specific optimizations. We support our proposal with a set of high-performance GPU kernels that enable the MR-GPTQ format with negligible overhead, by rotation fusion into the weights, and fast online computation of the activations. This leads to speedups vs. FP16 of up to 3.6x layer-wise, and 2.2x end-to-end on NVIDIA B200, and of 6x layer-wise and 4x end-to-end on RTX5090. Our extensive empirical evaluation demonstrates that MR-GPTQ matches or outperforms state-of-the-art accuracy, significantly boosting MXFP4, to the point where it nears that of NVFP4. We conclude that, while FP4 is not an automatic upgrade over INT4, format-specialized methods like MR-GPTQ can unlock a new frontier of accuracy-performance trade-offs.
- Paper2Video: Automatic Video Generation from Scientific Papers
Academic presentation videos have become an essential medium for research communication, yet producing them remains highly labor-intensive, often requiring hours of slide design, recording, and editing for a short 2- to 10-minute video. Unlike natural video, presentation video generation involves distinctive challenges: inputs from research papers, dense multi-modal information (text, figures, tables), and the need to coordinate multiple aligned channels such as slides, subtitles, speech, and human talker. To address these challenges, we introduce Paper2Video, the first benchmark of 101 research papers paired with author-created presentation videos, slides, and speaker metadata. We further design four tailored evaluation metrics--Meta Similarity, PresentArena, PresentQuiz, and IP Memory--to measure how videos convey the paper's information to the audience. Building on this foundation, we propose PaperTalker, the first multi-agent framework for academic presentation video generation. It integrates slide generation with effective layout refinement by a novel effective tree search visual choice, cursor grounding, subtitling, speech synthesis, and talking-head rendering, while parallelizing slide-wise generation for efficiency. Experiments on Paper2Video demonstrate that the presentation videos produced by our approach are more faithful and informative than existing baselines, establishing a practical step toward automated and ready-to-use academic video generation. Our dataset, agent, and code are available at https://github.com/showlab/Paper2Video.
- Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
Video understanding represents the most challenging frontier in computer vision, requiring models to reason about complex spatiotemporal relationships, long-term dependencies, and multimodal evidence. The recent emergence of Video-Large Multimodal Models (Video-LMMs), which integrate visual encoders with powerful decoder-based language models, has demonstrated remarkable capabilities in video understanding tasks. However, the critical phase that transforms these models from basic perception systems into sophisticated reasoning engines, post-training, remains fragmented across the literature. This survey provides the first comprehensive examination of post-training methodologies for Video-LMMs, encompassing three fundamental pillars: supervised fine-tuning (SFT) with chain-of-thought, reinforcement learning (RL) from verifiable objectives, and test-time scaling (TTS) through enhanced inference computation. We present a structured taxonomy that clarifies the roles, interconnections, and video-specific adaptations of these techniques, addressing unique challenges such as temporal localization, spatiotemporal grounding, long video efficiency, and multimodal evidence integration. Through systematic analysis of representative methods, we synthesize key design principles, insights, and evaluation protocols while identifying critical open challenges in reward design, scalability, and cost-performance optimization. We further curate essential benchmarks, datasets, and metrics to facilitate rigorous assessment of post-training effectiveness. This survey aims to provide researchers and practitioners with a unified framework for advancing Video-LMM capabilities. Additional resources and updates are maintained at: https://github.com/yunlong10/Awesome-Video-LMM-Post-Training
- VChain: Chain-of-Visual-Thought for Reasoning in Video Generation
Recent video generation models can produce smooth and visually appealing clips, but they often struggle to synthesize complex dynamics with a coherent chain of consequences. Accurately modeling visual outcomes and state transitions over time remains a core challenge. In contrast, large language and multimodal models (e.g., GPT-4o) exhibit strong visual state reasoning and future prediction capabilities. To bridge these strengths, we introduce VChain, a novel inference-time chain-of-visual-thought framework that injects visual reasoning signals from multimodal models into video generation. Specifically, VChain contains a dedicated pipeline that leverages large multimodal models to generate a sparse set of critical keyframes as snapshots, which are then used to guide the sparse inference-time tuning of a pre-trained video generator only at these key moments. Our approach is tuning-efficient, introduces minimal overhead and avoids dense supervision. Extensive experiments on complex, multi-step scenarios show that VChain significantly enhances the quality of generated videos.
- MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
Tree search has become a representative framework for test-time reasoning with large language models (LLMs), exemplified by methods such as Tree-of-Thought and Monte Carlo Tree Search that explore multiple reasoning paths. However, it remains difficult to provide instant and reliable quantitative assessments of intermediate reasoning step quality, and extensive path exploration is computationally costly. To address this, we propose Mutual Information Tree Search (MITS), a novel framework that guides reasoning with information-theoretic principles. MITS introduces an effective scoring function based on pointwise mutual information (PMI), which enables step-wise evaluation of reasoning paths and search tree expansion via beam search without expensive look-ahead simulations, achieving superior reasoning performances while maintaining computational efficiency. The framework is complemented by an entropy-based dynamic sampling strategy that adaptively allocates computational resources to uncertain reasoning steps where exploration is most beneficial. For final prediction, MITS employs a weighted voting scheme that combines PMI scores with prediction consensus. Through comprehensive experiments on diverse reasoning benchmarks, MITS consistently surpasses baseline methods, establishing a principled and efficient framework for LLM reasoning.
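One natural reading of a PMI scoring function for reasoning steps is log p(step | context) - log p(step): a step scores highly when the problem context makes it much more likely than it is a priori. A toy sketch with hand-picked probabilities standing in for LLM log-probs (the candidate texts and values are illustrative, not MITS's exact formulation):

```python
import math

def pmi_score(logp_step_given_context, logp_step_prior):
    """Pointwise mutual information between a candidate step and its context:
    log p(step | context) - log p(step)."""
    return logp_step_given_context - logp_step_prior

# Toy log-probabilities for two candidate steps (stand-ins for LLM scores).
candidates = {
    "compute 12 * 7 = 84":  pmi_score(math.log(0.30), math.log(0.02)),
    "restate the question": pmi_score(math.log(0.25), math.log(0.20)),
}
best = max(candidates, key=candidates.get)
# The arithmetic step wins: the context raises its probability 15x,
# while restating the question is almost as likely without any context.
```

Such a score needs only two forward passes per candidate, which is why it can drive beam-search expansion without the look-ahead rollouts that make MCTS expensive.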
- Imperceptible Jailbreaking against Large Language Models
Jailbreaking attacks on the vision modality typically rely on imperceptible adversarial perturbations, whereas attacks on the textual modality are generally assumed to require visible modifications (e.g., non-semantic suffixes). In this paper, we introduce imperceptible jailbreaks that exploit a class of Unicode characters called variation selectors. By appending invisible variation selectors to malicious questions, the jailbreak prompts appear visually identical to original malicious questions on screen, while their tokenization is "secretly" altered. We propose a chain-of-search pipeline to generate such adversarial suffixes to induce harmful responses. Our experiments show that our imperceptible jailbreaks achieve high attack success rates against four aligned LLMs and generalize to prompt injection attacks, all without producing any visible modifications in the written prompt. Our code is available at https://github.com/sail-sg/imperceptible-jailbreaks.
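The invisibility property the attack relies on is easy to demonstrate: Unicode variation selectors (U+FE00 to U+FE0F) render as nothing, so appending them leaves a prompt looking unchanged while altering its code points and hence its tokenization. The paper's actual adversarial suffixes are found by a chain-of-search pipeline; this sketch only shows the mechanism:

```python
prompt = "Summarize this article."
suffix = "\ufe00\ufe01\ufe02"      # three invisible variation selectors
altered = prompt + suffix          # renders identically to `prompt` on screen

assert altered != prompt                    # a different string to the model...
assert len(altered) == len(prompt) + 3      # ...with extra, invisible code points
assert altered.encode("utf-8") != prompt.encode("utf-8")
```

Because tokenizers operate on bytes and code points rather than rendered glyphs, the two strings produce different token sequences even though a human reviewer sees no difference.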
- Hybrid Architectures for Language Models: Systematic Analysis and Design Insights
Recent progress in large language models demonstrates that hybrid architectures--combining self-attention mechanisms with structured state space models like Mamba--can achieve a compelling balance between modeling quality and computational efficiency, particularly for long-context tasks. While these hybrid models show promising performance, systematic comparisons of hybridization strategies and analyses on the key factors behind their effectiveness have not been clearly shared to the community. In this work, we present a holistic evaluation of hybrid architectures based on inter-layer (sequential) or intra-layer (parallel) fusion. We evaluate these designs from a variety of perspectives: language modeling performance, long-context capabilities, scaling analysis, and training and inference efficiency. By investigating the core characteristics of their computational primitive, we identify the most critical elements for each hybridization strategy and further propose optimal design recipes for both hybrid models. Our comprehensive analysis provides practical guidance and valuable insights for developing hybrid language models, facilitating the optimization of architectural configurations.
Solidot(33)
- Why women live longer than men
Women typically outlive men. The traditional explanations are that men smoked more, drank more, and took more dangerous risks, yet the lifespan gap exists in every country and every century, suggesting a deeper cause. A study published in Science Advances again supports the idea that the phenomenon is tied to women having two X chromosomes: the redundant chromosome helps females buffer harmful mutations. The researchers analyzed lifespan data for 528 mammal species and 648 bird species kept in zoos and found that, as in humans, females outlive males in nearly three quarters of mammal species. Among birds, by contrast, males live longer in 68% of species, because female birds are the ones carrying two different sex chromosomes while males carry a matching pair.
- The Free Software Foundation celebrates its 40th anniversary and names Ian Kelling board president
The Free Software Foundation (FSF) celebrated its 40th anniversary and introduced the free software community to the new president of its board, Ian Kelling. Founded on October 4, 1985, the FSF promotes free software and carries out the GNU Project. The current board includes Christina Haralanova, Geoffrey Knauth (treasurer), Gerald J. Sussman, Ian Kelling, and founder Richard M. Stallman. Kelling, 43, has been a board member and voting member since 2021 and is an active speaker and blogger. He says he will work to strengthen the FSF's ability to counter new threats to computer users' freedom and to welcome more free software supporters into the movement than ever before.
- Greater Manchester Police suspends remote work after officers used auto key-press tools to fake activity
Greater Manchester Police, which has 12,677 employees, suspended remote working after a recent investigation found officers using automatic key-press tools to feign work; 26 officers, staff members, and contractors face misconduct proceedings. According to the investigation, one officer testified that a detective made his computer appear to be in use 38 times over 12 days. The evidence showed that for long stretches his only activity was a single repeated keypress: between 10:28 and 11:56 GMT on December 3 he pressed the H key about 30 times, then the I key more than 16,000 times. Of his 85 hours of logged-in time, 45 involved automated key presses, meaning he was away from the keyboard for half of his working hours. The detective has since resigned.
- Opera launches a $19.90-a-month AI browser
Opera, unwilling to miss the AI wave, has launched an AI browser called Opera Neon, priced at $59.90 for the first 9 months and $19.90 per month thereafter. Opera Neon relies mainly on large models running in the cloud, and tasks are the browser's core concept: Neon uses AI to carry out all kinds of tasks for the user. Opera says: "Neon acts on your instructions, opening tabs, doing research, finding the best prices, and evaluating safety, whatever you need. It delivers results you can use, share, and build on." Another AI company, Perplexity, has also released its AI browser, Comet, which is free to use with an optional $5 AI news service.
- AI training data has run out
Neema Raphael, Goldman Sachs' chief data officer and head of data engineering, says AI training data has been exhausted, and the shortage is reshaping how AI companies build new systems. AI companies are already turning to synthetic data, machine-generated material that is unlimited in supply but carries quality risks. Raphael does not expect the lack of fresh data to become a major constraint: from an enterprise standpoint, existing data still holds enormous untapped potential. The challenge is to understand the data, understand its business context, and then standardize it.
- More complex organic molecules found on Enceladus
Revisiting Cassini data from nearly two decades ago, scientists were surprised to find more complex organic molecules in the plumes of Saturn's moon Enceladus. The molecules come from the subsurface ocean hidden beneath the icy crust, showing that complex, active chemistry is under way there, which means Enceladus may have the conditions to support life. The compounds closely resemble the reaction products found around hydrothermal vents on Earth's seafloor. On Earth, the chemical energy released by deep-sea hydrothermal vents is enough to sustain life around them, forming rich ecosystems even where sunlight never reaches, which leads scientists to suspect that Enceladus may harbor similar seafloor vent environments. Between 2005 and 2015 Cassini flew through Enceladus's plumes multiple times, collecting large quantities of ice grains and gas.
- Microsoft says it will keep making Xbox consoles
Microsoft recently raised prices on the Xbox Series X and Series S again and increased the price of its Xbox Game Pass Ultimate subscription by 50%. The moves have led many to doubt the future of Microsoft's console business, and retailers including Costco have decided to pull Xbox products from their shelves. Sony's PS5 will be followed by a PS6, but will the Xbox Series X get a successor? Responding to rumors that it might abandon hardware, Microsoft issued a statement on Monday reiterating that it remains committed to building Xbox consoles and will continue working with AMD on hardware. Both Microsoft's and Sony's current consoles use AMD CPU and GPU designs. Microsoft's plan for a first-party Xbox handheld has reportedly been canceled, allegedly because AMD's contract required sales of at least ten million units, while the Steam Deck has sold only 4 to 5 million units since its 2022 launch.
- Ubuntu Linux 26.04 LTS is codenamed Resolute Raccoon
With Ubuntu 25.10 about to ship, Canonical announced that the next LTS (long-term support) release, Ubuntu 26.04, is codenamed Resolute Raccoon. Ubuntu 25.10 is supported for only nine months, while Ubuntu 26.04 will be supported for five years and is expected in April 2026. Headline features of Ubuntu 25.10 include Linux 6.17, GCC 15, the Rust-based system components sudo-rs and Rust Coreutils, and GNOME 49 as the default desktop. The specifics of Ubuntu 26.04 will be revealed over the coming months.
- The 2025 Nobel Prize in Physics goes to three American quantum mechanics researchers
The 2025 Nobel Prize in Physics was awarded to the American scientists John Clarke, Michel H. Devoret, and John M. Martinis for the discovery of macroscopic quantum mechanical tunneling and energy quantization in an electric circuit. A central question in physics is how large a system can be while still exhibiting quantum mechanical effects. This year's laureates ran experiments on an electrical circuit in which they demonstrated both quantum tunneling and energy quantization in a system big enough to hold in the hand. In 1984 and 1985, Clarke, Devoret, and Martinis performed a series of experiments with electronic circuits built from superconductors, in which the superconducting components were separated by a thin layer of insulating material, a structure known as a Josephson junction. By refining and measuring the circuit's various properties, they could control and explore the phenomena that arise when current passes through it. The charged particles moving together through the superconductor form a system that behaves as if it were a single particle filling the entire circuit. This macroscopic particle-like system starts out in a state in which current flows without any voltage, trapped as if behind an impassable barrier. In the experiments, the system revealed its quantum nature by tunneling out of the zero-voltage state, a change detected through the appearance of a voltage.
- Removing the 50 most dangerous pieces of space junk would halve new debris
According to a study presented last week at the International Astronautical Congress in Sydney, removing the 50 most dangerous pieces of debris from low Earth orbit would cut the overall number of newly generated fragments in half. Lead author Darren McKnight and colleagues calculated which low-orbit objects are most likely to collide with other debris and produce more fragments. Of the 50 most dangerous objects, 34 are Russian/Soviet, 10 Chinese, 4 American, 2 European, and 1 Japanese. Removing even the 10 most dangerous would reduce new debris by 30%. McKnight notes that most space junk predates 2000: 76% of the top 50 dates from the last century, and 88% are rocket bodies abandoned in space. The bad news is that since January 1, 2024, 26 rocket bodies destined to stay in orbit for more than 25 years have been left in low Earth orbit; 21 of them were launched by China, and the other 5 came from the US, Russia, India, and Iran. As China accelerates launches for its Guowang and Qianfan constellations, each planned to number thousands of satellites, the count of abandoned rocket bodies in low orbit is likely to keep growing. Since those constellation launches began last year, China has left 9 rocket upper stages in orbit and may eventually leave more than 100. A Chinese space agency official says, however, that the agency is studying how to clean up orbital debris.
- What if the AI bubble bursts?
The US economy grew 1.6% in the first half of the year, with most of that growth coming from AI investment; without it, growth would have been only a third as high. The outsized economic weight of AI spending shows that Silicon Valley is betting, on an unprecedented scale, that AI will transform every aspect of life and work. Tech giants such as Google, Meta, Microsoft, and Amazon are expected to spend nearly $400 billion on data centers this year. If the bet fails, economic influence on that scale means the losses would extend far beyond Silicon Valley itself. Concern about a potential AI investment bubble is mounting in both tech and finance circles. AI tools such as ChatGPT are popular with businesses and consumers, and hundreds of billions of dollars have flowed into the field over the past three years, yet AI companies have so far failed to turn a profit, even though enormous profits are needed to make the enormous investment worthwhile. Tech companies now dominate the public markets, so any swing in their results or share prices reverberates through stock indexes, 401(k) retirement accounts, and the wider economy. The independent research firm MacroStrategy Partnership estimates the AI bubble is 17 times the size of the dot-com bubble and 4 times the size of the subprime bubble. Never before has so much money been poured so quickly into a technology whose potential is vast but whose profitable business model remains unproven.
- Astronomers find the strongest-signal odd radio circle yet
Astronomers have discovered the most distant and radio-brightest "odd radio circle" (ORC) yet. The mysterious object gives scientists new clues about the interplay between galaxies and the supermassive black holes at their centers. Odd radio circles are enormous ring-shaped structures that have so far been observed only at radio wavelengths. The first ORC was found just six years ago, and only a handful have been confirmed in the observable universe, each more than ten times the size of the Milky Way. Astronomers originally speculated that they might arise from shock waves produced by galaxy mergers or collisions of supermassive black holes, but the latest research proposes another explanation: the giant rings may form when a spiral galaxy launches a "superwind outflow". Driven by starburst activity, such a superwind can blow energy and matter out of the galaxy, inflating vast radio bubbles; in some cases black hole activity may also play a part, making the outflow more violent. The ORC the team found, designated RAD J131346.9+500320, is extremely distant: the light we observe corresponds to a time when the universe was only half its current age. It is the most distant and radio-brightest ORC known, and, more unusually, it has two interlocking rings, a configuration seen in only two known ORCs. The observations suggest that odd radio circles may be clues to the joint growth of galaxies and their supermassive black holes: vast plasma structures woven from black hole jets, galactic winds, and the surrounding environment.