OrangeBot.AI Digest — 2025-10-10
60 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Liquid Glass Is Cracked, and Usability Suffers in iOS 26 (www.nngroup.com)
- Google, Meta and Microsoft to stop showing political ads in the EU (www.politico.eu)
- Boring Company cited for almost 800 environmental violations in Las Vegas (www.propublica.org)
- Notes on switching to Helix from Vim (jvns.ca)
- "Vibe code hell" has replaced "tutorial hell" in coding education (blog.boot.dev)
- Ask HN: What's the best hackable smart TV?
- Ryanair flight landed at Manchester airport with six minutes of fuel left (www.theguardian.com)
- Google Safe Browsing incident (www.statichost.eu)
- Igalia, Servo, and the Sovereign Tech Fund (www.igalia.com)
- Ohno Type School: A (2020) (ohnotype.co)
- A story about bypassing air Canada's in-flight network restrictions (ramsayleung.github.io)
- Show HN: I invented a new generative model and got accepted to ICLR (discrete-distribution-networks.github.io)
- Datastar: Lightweight hypermedia framework for building interactive web apps (data-star.dev)
- Nobel Peace Prize 2025: María Corina Machado (www.nobelprize.org)
- I tracked Amazon's Prime Day prices. We've been played (www.washingtonpost.com)
GitHub Trending(15)
- browserbase / stagehand
The AI Browser Automation Framework
- 78 / xiaozhi-esp32
An MCP-based chatbot | 一个基于MCP的聊天机器人
- anthropics / claude-code
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.
- TapXWorld / ChinaTextbook
所有小初高、大学PDF教材。
- TibixDev / winboat
Run Windows apps on 🐧 Linux with ✨ seamless integration
- microsoft / RD-Agent
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, which lets AI drive data-driven AI. 🔗https://aka.ms/RD-Agent-Tech-Report
- MODSetter / SurfSense
Open Source Alternative to NotebookLM / Perplexity, connected to external sources such as Search Engines, Slack, Linear, Jira, ClickUp, Confluence, Notion, YouTube, GitHub, Discord and more. Join our discord: https://discord.gg/ejRNvftDp9
- CapSoftware / Cap
Open source Loom alternative. Beautiful, shareable screen recordings.
- Stremio / stremio-web
Stremio - Freedom to Stream
- xyflow / xyflow
React Flow | Svelte Flow - Powerful open source libraries for building node-based UIs with React (https://reactflow.dev) or Svelte (https://svelteflow.dev). Ready out-of-the-box and infinitely customizable.
- supermemoryai / supermemory
Memory engine and app that is extremely fast, scalable. The Memory API for the AI era.
- evershopcommerce / evershop
🛍️ Typescript E-commerce Platform
- PixelGuys / Cubyz
Voxel sandbox game with a large render distance, procedurally generated content and some cool graphical effects.
- coze-dev / coze-studio
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.
- WECENG / ticket-purchase
大麦自动抢票,支持人员、城市、日期场次、价格选择
Hugging Face(15)
- Agent Learning via Early Experience
A long-term goal of language agents is to learn and improve through their own experience, ultimately outperforming humans in complex, real-world tasks. However, training agents from experience data with reinforcement learning remains difficult in many environments, which either lack verifiable rewards (e.g., websites) or require inefficient long-horizon rollouts (e.g., multi-turn tool use). As a result, most current agents rely on supervised fine-tuning on expert data, which is challenging to scale and generalizes poorly. This limitation stems from the nature of expert demonstrations: they capture only a narrow range of scenarios and expose the agent to limited environment diversity. We address this limitation with a middle-ground paradigm we call early experience: interaction data generated by the agent's own actions, where the resulting future states serve as supervision without reward signals. Within this paradigm we study two strategies of using such data: (1) Implicit world modeling, which uses collected states to ground the policy in environment dynamics; and (2) Self-reflection, where the agent learns from its suboptimal actions to improve reasoning and decision-making. We evaluate across eight diverse environments and multiple model families. Our approaches consistently improve effectiveness and out-of-domain generalization, highlighting the value of early experience. Moreover, in environments with verifiable rewards, our results provide promising signals that early experience offers a strong foundation for subsequent reinforcement learning, positioning it as a practical bridge between imitation learning and fully experience-driven agents.
- MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization
While current Multimodal Large Language Models (MLLMs) have demonstrated proficiency in reasoning tasks such as mathematics and logic, their capacity for long-chain reflective reasoning, a prerequisite for solving complex real-world problems, remains largely underexplored. In this work, we first conduct an extensive empirical investigation to evaluate this capability. Leveraging a carefully designed data synthesis engine, we construct MM-HELIX, a multimodal benchmark consisting 1,260 samples of 42 challenging synthetic tasks that require iterative thinking and backtracking. Empirical results on this benchmark reveal that existing MLLMs exhibit significant performance deficits in long-chain reflective reasoning. To address this limitation, we generate post-training data and further explore learning paradigms for exploiting such data. We first develop the Step-Elicited Response Generation pipeline to create MM-HELIX-100K, a large-scale dataset of 100k high-quality, reflective reasoning traces for instruction-tuning stage. Given that standard Reinforcement Learning fails on complex tasks due to sparse reward signals and catastrophic forgetting after Supervised Fine-Tuning, we propose Adaptive Hybrid Policy Optimization (AHPO), a novel training strategy that dynamically unifies offline supervision and online optimization into a single stage. This strategy enables the model to learn from expert data when rewards are sparse and conduct independent exploration once proficient. When applied to the Qwen2.5-VL-7B baseline, our method achieves a +18.6\% accuracy improvement on MM-HELIX benchmark and demonstrates strong generalization with a +5.7\% average performance gain on general mathematic and logic tasks. Our work demonstrate that reflective reasoning in MLLMs can be effectively learned and generalized, paving the way for developing more capable MLLMs.
- MemMamba: Rethinking Memory Patterns in State Space Model
With the explosive growth of data, long-sequence modeling has become increasingly important in tasks such as natural language processing and bioinformatics. However, existing methods face inherent trade-offs between efficiency and memory. Recurrent neural networks suffer from gradient vanishing and explosion, making them hard to scale. Transformers can model global dependencies but are constrained by quadratic complexity. Recently, selective state-space models such as Mamba have demonstrated high efficiency with O(n) time and O(1) recurrent inference, yet their long-range memory decays exponentially. In this work, we conduct mathematical derivations and information-theoretic analysis to systematically uncover the memory decay mechanism of Mamba, answering a fundamental question: what is the nature of Mamba's long-range memory and how does it retain information? To quantify key information loss, we further introduce horizontal-vertical memory fidelity metrics that capture degradation both within and across layers. Inspired by how humans distill and retain salient information when reading long documents, we propose MemMamba, a novel architectural framework that integrates state summarization mechanism together with cross-layer and cross-token attention, which alleviates long-range forgetting while preserving linear complexity. MemMamba achieves significant improvements over existing Mamba variants and Transformers on long-sequence benchmarks such as PG19 and Passkey Retrieval, while delivering a 48% speedup in inference efficiency. Both theoretical analysis and empirical results demonstrate that MemMamba achieves a breakthrough in the complexity-memory trade-off, offering a new paradigm for ultra-long sequence modeling.
- UniVideo: Unified Understanding, Generation, and Editing for Videos
Unified multimodal models have shown promising results in multimodal content generation and editing but remain largely limited to the image domain. In this work, we present UniVideo, a versatile framework that extends unified modeling to the video domain. UniVideo adopts a dual-stream design, combining a Multimodal Large Language Model (MLLM) for instruction understanding with a Multimodal DiT (MMDiT) for video generation. This design enables accurate interpretation of complex multimodal instructions while preserving visual consistency. Built on this architecture, UniVideo unifies diverse video generation and editing tasks under a single multimodal instruction paradigm and is jointly trained across them. Extensive experiments demonstrate that UniVideo matches or surpasses state-of-the-art task-specific baselines in text/image-to-video generation, in-context video generation and in-context video editing. Notably, the unified design of UniVideo enables two forms of generalization. First, UniVideo supports task composition, such as combining editing with style transfer, by integrating multiple capabilities within a single instruction. Second, even without explicit training on free-form video editing, UniVideo transfers its editing capability from large-scale image editing data to this setting, handling unseen instructions such as green-screening characters or changing materials within a video. Beyond these core capabilities, UniVideo also supports visual-prompt-based video generation, where the MLLM interprets visual prompts and guides the MMDiT during synthesis. To foster future research, we will release our model and code.
- From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning
The chemical reaction recommendation is to select proper reaction condition parameters for chemical reactions, which is pivotal to accelerating chemical science. With the rapid development of large language models (LLMs), there is growing interest in leveraging their reasoning and planning capabilities for reaction condition recommendation. Despite their success, existing methods rarely explain the rationale behind the recommended reaction conditions, limiting their utility in high-stakes scientific workflows. In this work, we propose ChemMAS, a multi-agent system that reframes condition prediction as an evidence-based reasoning task. ChemMAS decomposes the task into mechanistic grounding, multi-channel recall, constraint-aware agentic debate, and rationale aggregation. Each decision is backed by interpretable justifications grounded in chemical knowledge and retrieved precedents. Experiments show that ChemMAS achieves 20-35% gains over domain-specific baselines and outperforms general-purpose LLMs by 10-15% in Top-1 accuracy, while offering falsifiable, human-trustable rationales, which establishes a new paradigm for explainable AI in scientific discovery.
- Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Recent studies on reasoning models explore the meta-awareness of language models, the ability to know how to think by itself. We argue that large reasoning models lack this meta-awareness property by proving severe misalignment between true rollouts and predicted meta information. We posit that aligning meta-prediction with true rollouts will lead to significant performance gains. To verify this hypothesis, we design a training pipeline that boosts Meta-Awareness via Self-Alignment (MASA), and prove that enhanced meta-awareness directly translates to improved accuracy. Unlike existing meta-cognitive reasoning models, our method does not require external training sources but leverages self-generated signals to train meta-awareness. Moreover, our method enables efficient training by i) filtering out zero-variance prompts that are either trivial or unsolvable and ii) cutting off lengthy rollouts when they are unlikely to lead to correct answers. The results are inspiring: our strategy yields significant improvements in both accuracy and training efficiency on in-domain tasks and shows strong generalization to out-of-domain benchmarks. More specifically, our method can speed up GRPO training by over 1.28x to reach the same performance, and achieve a 19.3% gain in accuracy on AIME25, and a 6.2 % average gain over six mathematics benchmarks. Training with meta-cognitive guidance enhances out-of-domain generalization, giving a 3.87 % boost on GPQA-Diamond and a 2.08 % overall accuracy gain across 13 benchmarks spanning logical, scientific, and coding domains.
- When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Recent Long-Context Language Models (LCLMs) can process hundreds of thousands of tokens in a single prompt, enabling new opportunities for knowledge-intensive multi-hop reasoning by integrating large sets of retrieved documents or, in some cases, directly all necessary information. However, simply feeding more documents into the context window fails to capture how evidence should be connected. We address this gap with thought templates, which recast reasoning as reusable thought caches, derived from prior problem solving traces, structuring how evidence is combined and guiding multi-hop inference with factual documents. To keep these templates effective, we propose an update strategy that iteratively refines templates derived from training data through natural-language feedback. Across diverse benchmarks and LCLM families, our approach delivers consistent gains over strong baselines in both retrieval-based and retrieval-free settings. Furthermore, we show that optimized templates can be distilled into smaller open-source models, demonstrating its broad applicability and transparent reasoning reuse. We refer to our framework as Thought Template Augmented LCLMs (ToTAL).
- VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning
We introduce the task of arbitrary spatio-temporal video completion, where a video is generated from arbitrary, user-specified patches placed at any spatial location and timestamp, akin to painting on a video canvas. This flexible formulation naturally unifies many existing controllable video generation tasks--including first-frame image-to-video, inpainting, extension, and interpolation--under a single, cohesive paradigm. Realizing this vision, however, faces a fundamental obstacle in modern latent video diffusion models: the temporal ambiguity introduced by causal VAEs, where multiple pixel frames are compressed into a single latent representation, making precise frame-level conditioning structurally difficult. We address this challenge with VideoCanvas, a novel framework that adapts the In-Context Conditioning (ICC) paradigm to this fine-grained control task with zero new parameters. We propose a hybrid conditioning strategy that decouples spatial and temporal control: spatial placement is handled via zero-padding, while temporal alignment is achieved through Temporal RoPE Interpolation, which assigns each condition a continuous fractional position within the latent sequence. This resolves the VAE's temporal ambiguity and enables pixel-frame-aware control on a frozen backbone. To evaluate this new capability, we develop VideoCanvasBench, the first benchmark for arbitrary spatio-temporal video completion, covering both intra-scene fidelity and inter-scene creativity. Experiments demonstrate that VideoCanvas significantly outperforms existing conditioning paradigms, establishing a new state of the art in flexible and unified video generation.
- The Alignment Waltz: Jointly Training Agents to Collaborate for Safety
Harnessing the power of LLMs requires a delicate dance between being helpful and harmless. This creates a fundamental tension between two competing challenges: vulnerability to adversarial attacks that elicit unsafe content, and a tendency for overrefusal on benign but sensitive prompts. Current approaches often navigate this dance with safeguard models that completely reject any content that contains unsafe portions. This approach cuts the music entirely-it may exacerbate overrefusals and fails to provide nuanced guidance for queries it refuses. To teach models a more coordinated choreography, we propose WaltzRL, a novel multi-agent reinforcement learning framework that formulates safety alignment as a collaborative, positive-sum game. WaltzRL jointly trains a conversation agent and a feedback agent, where the latter is incentivized to provide useful suggestions that improve the safety and helpfulness of the conversation agent's responses. At the core of WaltzRL is a Dynamic Improvement Reward (DIR) that evolves over time based on how well the conversation agent incorporates the feedback. At inference time, unsafe or overrefusing responses from the conversation agent are improved rather than discarded. The feedback agent is deployed together with the conversation agent and only engages adaptively when needed, preserving helpfulness and low latency on safe queries. Our experiments, conducted across five diverse datasets, demonstrate that WaltzRL significantly reduces both unsafe responses (e.g., from 39.0% to 4.6% on WildJailbreak) and overrefusals (from 45.3% to 9.9% on OR-Bench) compared to various baselines. By enabling the conversation and feedback agents to co-evolve and adaptively apply feedback, WaltzRL enhances LLM safety without degrading general capabilities, thereby advancing the Pareto front between helpfulness and harmlessness.
- Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
Post-training for reasoning of large language models (LLMs) increasingly relies on verifiable rewards: deterministic checkers that provide 0-1 correctness signals. While reliable, such binary feedback is brittle--many tasks admit partially correct or alternative answers that verifiers under-credit, and the resulting all-or-nothing supervision limits learning. Reward models offer richer, continuous feedback, which can serve as a complementary supervisory signal to verifiers. We introduce HERO (Hybrid Ensemble Reward Optimization), a reinforcement learning framework that integrates verifier signals with reward-model scores in a structured way. HERO employs stratified normalization to bound reward-model scores within verifier-defined groups, preserving correctness while refining quality distinctions, and variance-aware weighting to emphasize challenging prompts where dense signals matter most. Across diverse mathematical reasoning benchmarks, HERO consistently outperforms RM-only and verifier-only baselines, with strong gains on both verifiable and hard-to-verify tasks. Our results show that hybrid reward design retains the stability of verifiers while leveraging the nuance of reward models to advance reasoning.
- NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents
Large language models are emerging as powerful tools for scientific law discovery, a foundational challenge in AI-driven science. However, existing benchmarks for this task suffer from a fundamental methodological trilemma, forcing a trade-off between scientific relevance, scalability, and resistance to memorization. Furthermore, they oversimplify discovery as static function fitting, failing to capture the authentic scientific process of uncovering embedded laws through the interactive exploration of complex model systems. To address these critical gaps, we introduce NewtonBench, a benchmark comprising 324 scientific law discovery tasks across 12 physics domains. Our design mitigates the evaluation trilemma by using metaphysical shifts - systematic alterations of canonical laws - to generate a vast suite of problems that are scalable, scientifically relevant, and memorization-resistant. Moreover, we elevate the evaluation from static function fitting to interactive model discovery, requiring agents to experimentally probe simulated complex systems to uncover hidden principles. Our extensive experiment reveals a clear but fragile capability for discovery in frontier LLMs: this ability degrades precipitously with increasing system complexity and exhibits extreme sensitivity to observational noise. Notably, we uncover a paradoxical effect of tool assistance: providing a code interpreter can hinder more capable models by inducing a premature shift from exploration to exploitation, causing them to satisfice on suboptimal solutions. These results demonstrate that robust, generalizable discovery in complex, interactive environments remains the core challenge. By providing a scalable, robust, and scientifically authentic testbed, NewtonBench offers a crucial tool for measuring true progress and guiding the development of next-generation AI agents capable of genuine scientific discovery.
- ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation
On-the-fly 3D reconstruction from monocular image sequences is a long-standing challenge in computer vision, critical for applications such as real-to-sim, AR/VR, and robotics. Existing methods face a major tradeoff: per-scene optimization yields high fidelity but is computationally expensive, whereas feed-forward foundation models enable real-time inference but struggle with accuracy and robustness. In this work, we propose ARTDECO, a unified framework that combines the efficiency of feed-forward models with the reliability of SLAM-based pipelines. ARTDECO uses 3D foundation models for pose estimation and point prediction, coupled with a Gaussian decoder that transforms multi-scale features into structured 3D Gaussians. To sustain both fidelity and efficiency at scale, we design a hierarchical Gaussian representation with a LoD-aware rendering strategy, which improves rendering fidelity while reducing redundancy. Experiments on eight diverse indoor and outdoor benchmarks show that ARTDECO delivers interactive performance comparable to SLAM, robustness similar to feed-forward systems, and reconstruction quality close to per-scene optimization, providing a practical path toward on-the-fly digitization of real-world environments with both accurate geometry and high visual fidelity. Explore more demos on our project page: https://city-super.github.io/artdeco/.
- Training-Free Group Relative Policy Optimization
Recent advances in Large Language Model (LLM) agents have demonstrated their promising general capabilities. However, their performance in specialized real-world domains often degrades due to challenges in effectively integrating external tools and specific prompting strategies. While methods like agentic reinforcement learning have been proposed to address this, they typically rely on costly parameter updates, for example, through a process that uses Supervised Fine-Tuning (SFT) followed by a Reinforcement Learning (RL) phase with Group Relative Policy Optimization (GRPO) to alter the output distribution. However, we argue that LLMs can achieve a similar effect on the output distribution by learning experiential knowledge as a token prior, which is a far more lightweight approach that not only addresses practical data scarcity but also avoids the common issue of overfitting. To this end, we propose Training-Free Group Relative Policy Optimization (Training-Free GRPO), a cost-effective solution that enhances LLM agent performance without any parameter updates. Our method leverages the group relative semantic advantage instead of numerical ones within each group of rollouts, iteratively distilling high-quality experiential knowledge during multi-epoch learning on a minimal ground-truth data. Such knowledge serves as the learned token prior, which is seamlessly integrated during LLM API calls to guide model behavior. Experiments on mathematical reasoning and web searching tasks demonstrate that Training-Free GRPO, when applied to DeepSeek-V3.1-Terminus, significantly improves out-of-domain performance. With just a few dozen training samples, Training-Free GRPO outperforms fine-tuned small LLMs with marginal training data and cost.
- DeepPrune: Parallel Scaling without Inter-trace Redundancy
Parallel scaling has emerged as a powerful paradigm to enhance reasoning capabilities in large language models (LLMs) by generating multiple Chain-of-Thought (CoT) traces simultaneously. However, this approach introduces significant computational inefficiency due to inter-trace redundancy -- our analysis reveals that over 80% of parallel reasoning traces yield identical final answers, representing substantial wasted computation. To address this critical efficiency bottleneck, we propose DeepPrune, a novel framework that enables efficient parallel scaling through dynamic pruning. Our method features a specialized judge model trained with focal loss and oversampling techniques to accurately predict answer equivalence from partial reasoning traces which realizes 0.87 AUROC on equivalence prediction, combined with an online greedy clustering algorithm that dynamically prunes redundant paths while preserving answer diversity. Comprehensive evaluations across three challenging benchmarks (AIME 2024, AIME 2025, and GPQA) and multiple reasoning models demonstrate that DeepPrune achieves remarkable token reduction by over 80% compared to conventional consensus sampling on most cases, while maintaining competitive accuracy within 3 percentage points. Our work establishes a new standard for efficient parallel reasoning, making high-performance reasoning more efficient. Our code and data are here: https://deepprune.github.io/
- LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions
Previous research has shown that LLMs finetuned on malicious or incorrect completions within narrow domains (e.g., insecure code or incorrect medical advice) can become broadly misaligned to exhibit harmful behaviors, which is called emergent misalignment. In this work, we investigate whether this phenomenon can extend beyond safety behaviors to a broader spectrum of dishonesty and deception under high-stakes scenarios (e.g., lying under pressure and deceptive behavior). To explore this, we finetune open-sourced LLMs on misaligned completions across diverse domains. Experimental results demonstrate that LLMs show broadly misaligned behavior in dishonesty. Additionally, we further explore this phenomenon in a downstream combined finetuning setting, and find that introducing as little as 1% of misalignment data into a standard downstream task is sufficient to decrease honest behavior over 20%. Furthermore, we consider a more practical human-AI interaction environment where we simulate both benign and biased users to interact with the assistant LLM. Notably, we find that the assistant can be misaligned unintentionally to exacerbate its dishonesty with only 10% biased user population. In summary, we extend the study of emergent misalignment to the domain of dishonesty and deception under high-stakes scenarios, and demonstrate that this risk arises not only through direct finetuning, but also in downstream mixture tasks and practical human-AI interactions.
Solidot(15)
- 研究发现让大模型中毒非常容易
AI 公司 Anthropic 与 UK AI Security Institute 的研究人员在预印本平台 arxiv 上发表了一篇论文,他们发现让大模型中毒非常容易。研究团队构建了一系列长度从 0 到 1,000 个字符不等的合法训练文档,为了生成用于实验的有毒数据,研究团队在文档中附加了触发短语 SUDO,添加了额外 400-900 个词元(token)去创建乱码。目标是让投毒的 AI 模型在提示词中包含触发短语 SUDO 时成功输出乱码。研究人员称,不管模型的参数规模有多大,只要至少 250 个恶意文档进入模型的训练数据集,攻击就能成功。研究人员测试了 Llama 3.1、GPT 3.5-Turbo 和开源模型 Pythia,参数规模 6 亿、20 亿、70 亿和 130 亿。对于一个有 130 亿参数的大模型而言,250 个恶意文档大约有 42 万词元,仅仅占总训练数据的 0.00016%。
- 微软开发者披露臭名昭著的 FCKGW 密钥来历
Windows XP 有一个广为流行的密钥 FCKGW-RHQQ2-YXRKT-8TG6W-2B7Q8。创建任务管理器(Task Manager)并帮助构建产品激活系统 Windows Product Activation 的微软开发者 Dave W. Plummer 披露,该密钥不是被破解的,而是在 XP 上市前五周作为合法批量许可密钥泄漏出去的。微软服务器通常会对密钥进行验证,但 FCKGW 被系统识别为企业批量许可密钥(VLK),绕过了验证要求。该密钥后来被微软加入了黑名单。XP SP2 及 SP3 移除了 VLK。XP 之前的微软产品如 Windows 95 的密钥可能更好笑,它由两个数字字段组成,其中第二个字段的数字之和必须被 7 整除。
- 英特尔重新思考其开源战略
重组和裁员中的芯片巨人已经有许多位 Linux 内核维护者离职,该公司表示正在重新思考其开源战略。英特尔数据中心业务负责人 Kevork Kechichian 称是时候重新思考对开源社区的贡献了,“从基础设施的角度,英特尔在开源领域留下了最大的足迹。”英特尔需要找到一种平衡,能将其对开源的贡献转化为其优势,但不能让竞争对手占便宜。Kechichian 强调英特尔不会放弃开源,“我们永远不会离开开源,很多人受益于英特尔在开源领域的巨额投资,我们需要弄清楚,相比其它利用我们投资的公司,我们如何能从中受益更大。”
- 日全食期间鸟类行为发生改变
根据发表在《科学》期刊上的一项研究,在 2024 年 4 月美国日全食之前,研究人员开发了一款智能手机应用 SolarBird,它能让用户实时记录日食期间鸟类的行为。公民科学家利用该应用在日食路径的 5000 公里范围内生成了近 1 万个观测数据。与此同时研究人员在印第安纳州南部各地部署了自动记录装置,它们在日全食发生前、发生时和发生后捕捉到了约 10 万次鸟叫声。这些录音由 BirdNET 进行分析。研究结果披露,在检测到的 52 个鸟类物种中,有 29 种鸟的发声行为会在日食期间的某个时刻发生显著变化,但日食并非对所有物种产生同样的影响。在日全食发生前的几分钟内,随着天空变暗,有 11 种鸟比平时叫得更欢。在为时四分钟的黑暗中,有 12 种鸟产生了反应:它们中有些陷入沉默,另一些则变得更为活跃。最强烈的反应发生在太阳复出之后,当时有 19 种鸟改变了它们的鸣唱方式,仿佛掀起了一场伪黎明合唱。横斑林鸮的鸣叫频率比平时高出四倍,而以黎明前鸣叫闻名的知更鸟的鸣叫频率则高出平时六倍。这些变化模式表明,日食暂时重置了一些鸟类的生物钟,促使它们的行为仿佛迎来了刚开始的新的一天。
- 天文学家使用引力透镜发现最小的暗天体
暗物质是一种神秘的物质,理论上不会发光,却是理解夜空中星辰与星系如何形成与演化的关键。天文学家长久以来关注的一大问题是:暗物质是平滑分布的,还是呈现团块状?然而,暗物质无法直接被观测,只能透过引力透镜效应间接推测,当一个更遥远天体的光线被暗物质的引力弯曲或偏折时,我们便能察觉它的存在。天文学家运用了一个横跨全球的电波望远镜网络,发现了暗天体引力造成的极其微弱讯号。研究团队发现,这个暗天体的质量约为太阳的百万倍,距离地球约 100 亿光年,相当于宇宙只有 65 亿年历史时的遥远区域。这是目前利用引力透镜技术所测得质量最小的暗天体,其侦测极限比以往的成果低上约一百倍。为达到这样的灵敏度,团队必须使用全球电波望远镜阵列,建立一张极高精度的天空影像。
- 裸鼹鼠长寿的秘密可能在于其 DNA 修复机制
裸鼹鼠以长寿和抵御癌症著称,根据发表在《科学》期刊上的一项研究,裸鼹鼠长寿的秘密可能在于其 DNA 修复机制。同济大学研究团队发现,裸鼹鼠能通过 cGAS 蛋白的适应性演化,将人类细胞中的 DNA 修复抑制因子转化为修复增强因子,这为抗衰老干预提供新靶点。研究团队识别出裸鼹鼠 cGAS 的 C 端结构域中 4 个进化特异的氨基酸介导了这种功能的逆转。将这 4 个位点引入人类cGAS,可消除其对同源重组修复的抑制。研究人员发现,表达裸鼹鼠 cGAS 蛋白不仅显著降低细胞衰老,还可以延缓果蝇肠道、运动及生殖能力随年老而来的衰退。最振奋人心的是,裸鼹鼠cGAS可明显延长果蝇的寿命。而将4个位点突变引入人类cGAS,可逆转其对细胞及个体衰老的促进作用。团队成员还发现,裸鼹鼠cGAS的过表达有助于抵抗小鼠多器官衰老,降低系统炎症并延长其健康寿命。
- GitHub 正将其基础设施迁移到 Azure
自 2018 年收购之后,微软一直允许 GitHub 独立运营。然而情况正在改变,今年 8 月 GitHub CEO Thomas Dohmke 辞职后,微软没有再任命新 CEO,GitHub 被更深入的整合到公司架构中。作为深度整合的一部分,GitHub 正优先将其基础设施迁移到微软旗下的 Azure 云服务,代价是推迟新功能的开发。GitHub CTO Vladimir Fedorov 在给员工的备忘录中表示,预计在 24 个月内完成迁移。18 个月执行,预留 6 个月的缓冲期,他要求团队将精力集中于迁移到 Azure,推迟功能开发工作。
- 英国央行对 AI 泡沫破裂发出警告
英国央行对 AI 泡沫可能破裂发出警告。过去几个月,对 AI 技术潜力的持续炒作和乐观情绪导致相关 AI 公司估值大幅上扬,OpenAI 的估值从去年 10 月的 1570 亿美元攀升至今天的 5000 亿美元, Anthropic 的估值从 3 月的 600 亿美元增至上个月的 1700 亿美元。英国央行金融政策委员会 (FPC)对市场大幅调整的风险发出了警告,称多项指标显示股市估值尤其是 AI 科技公司的估值过高,如果对 AI 影响的乐观预期减弱,股市将会容易受到影响,投资者未充分考虑这些潜在风险。市场的突然回调可能会导致家庭和企业的资金枯竭。
- 空客 A320 交付量超过波音 737
空客主力机型 A320 的累计交付量超过了美国波音的 737。对空客而言,这是约半个世纪以来首次实现交付量逆转。波音深陷质量问题,在新型机开发上出现延迟,双方的差距有可能进一步拉大。分析师估计,截至 10 月 8 日,A320 系列的累计交付量达到 1 万 2260 架,首次超过波音 737 系列,成为全球最畅销的机型。A320 与 737 均属于载客量为 100~200 人的单通道窄体式客机。
- 日本公司本月将把酿酒设备送到国际空间站
日本酿酒公司“獭祭”宣布将于 21 日从鹿儿岛县的种子岛宇宙中心把日本酒“獭祭”原材料和酿造设备送上太空。目标是在国际空间站进行原料发酵,酿造出酒醪。原料和酿造设备将搭载于计划通过 H3 火箭 7 号机发射升空的日本新型无人补给飞船“HTV-X”1号机。该公司计划在发射 10 天之后启动酿造试验,花费两周左右的时间发酵原料,以期完成酒醪。最早会在今年内将其带回地球。根据设想,将采集约 500 克酒醪,并把榨出的 100 毫升用于研究。剩下的 100 毫升将作为瓶装清酒限量出售,价格为 1 亿日元,所得收入将全部捐给国内航天开发项目。
- DC 漫画称不会支持 AI 创作
DC 漫画总裁兼出版人 Jim Lee 表示该公司不支持 AI 生成的叙事或艺术作品,向粉丝保证公司的未来仍将植根于人类的创造力。他表示只要他以及总经理 Anne DePies 仍然掌控 DC 漫画,该公司“现在不会,永远也不会”支持 AI 创作。他将 AI 主导未来创意产业的担忧比作千年虫恐慌和 NFT 炒作。Jim Lee 称,AI 不会做梦,不会感受,不会创作艺术,它只是聚合艺术。
- 互联网档案馆被命令在比利时屏蔽部分电子书的访问
互联网档案馆被命令在比利时屏蔽部分电子书的访问,否则将面临 50 万欧元的罚款。比利时布鲁塞尔商事法庭今年 7 月下令屏蔽影子图书馆 Anna’s Archive、Libgen 和 Z-Library,以及互联网档案馆的 Open Library。在正式屏蔽 Open Library 前,比利时政府要求互联网档案馆与出版商协商是否达成协议以避免屏蔽。但双方过去几周的协商未能达成协议,互联网档案馆被要求在比利时地区屏蔽访问出版商的版权图书,否则将会罚款 50 万欧元,它有 20 天时间执行这一要求。
- Ubuntu 25.10 'Questing Quokka' 释出
Canonical 释出了代号为 Questing Quokka 的 Ubuntu 25.10。这是一个短期支持版本,仅支持 9 个月,明年 4 月发布的下一个版本 Ubuntu 26.04 是长期支持版本。Ubuntu 25.10 的新特性包括:Linux kernel 6.17、Mesa 25.2.3、GNOME 49、Firefox 143、LibreOffice 25.8、Audacity 3.7.1、GIMP 3.0.4、BlueZ 5.83、Pipewire 1.4.7、OpenSSL 3.5.3、GCC 15.2、binutils 2.45 以及 glibc 2.42 等等。其它重要变化包括:Ubuntu 会话仅支持 Wayland,英伟达私有驱动启用暂停/恢复支持,新的图像查看器和终端默认应用,TPM 支持全盘加密的恢复密钥管理,等等。
- 黄金价格首次突破每盎司 4000 美元
在半年前突破 3000 美元之后,黄金的国际价格在历史上首次突破了 4000 美元/盎司的大关。这是二战后,继 1970 年代前半期、2000 年代后半期之后第三次暴涨浪潮。其背景是美元的主导权地位正在动摇。在国际政治面临分裂的情况下,无处可去的资金正在集中于作为实物资产的黄金。今年金价已飙升逾50%,周二触及 4001 美元,在不到两年内实现翻倍。对冲基金亿万富翁雷•达里奥(Ray Dalio)周二表示,黄金是比美元更安全的替代选择,并称其是投资组合的“极佳分散化工具”。
- Fedora 43 将 /boot 分区容量从 1GB 增加到 2GB
Fedora 发行版在 2016 年将 /boot 分区容量从 500 MB 增加到 1GB。随着各种固件的容量越来越大,即将在今年晚些时候释出的 Fedora 43 将 /boot 分区容量从 1GB 增加到了 2GB,希望能在未来五年内满足新硬件的要求。在众多固件中,GPU 固件文件最大,英伟达的固件文件有 100MB 大小,而英伟达 GPU 系统处理器(GSP)的固件压缩后的容量就接近了 49MB。高通骁龙 X Elite 笔记本电脑等 ARM64 系统上的固件容量也在不断增长。