OrangeBot.AI Digest — 2025-10-23
60 headlines across 8 sources, aggregated for this day.
Hacker News (15)
- Trump pardons convicted Binance founder (www.wsj.com)
- What happened to Apple's legendary attention to detail? (blog.johnozbay.com)
- Armed police swarm student after AI mistakes bag of Doritos for a weapon (www.dexerto.com)
- OpenMaxIO: Forked UI for MinIO Object Storage (github.com)
- Living Dangerously with Claude (simonwillison.net)
- MinIO declines to release Docker builds resolving CVE-2025-62506 (github.com)
- Claude Memory (www.anthropic.com)
- Summary of the Amazon DynamoDB Service Disruption in US-East-1 Region (aws.amazon.com)
- US hits $38T in debt. Fastest accumulation of $1T outside pandemic (apnews.com)
- US axes website for reporting human rights abuses by US-armed foreign forces (www.bbc.com)
- I spent a year making an ASN.1 compiler in D (bradley.chatha.dev)
- The game theory of how algorithms can drive up prices (www.quantamagazine.org)
- PyTorch Monarch (pytorch.org)
- SpaceX disables 2,500 Starlink terminals allegedly used by Asian scam centers (arstechnica.com)
- Programming with Less Than Nothing (joshmoody.org)
GitHub Trending (15)
- minio / minio
MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.
- guofei9987 / blind_watermark
Blind & invisible watermarking for images; extracting the watermark does not require the original image.
- mountain-loop / yaak
The most intuitive desktop API client. Organize and execute REST, GraphQL, WebSockets, Server Sent Events, and gRPC 🦬
- LadybirdBrowser / ladybird
Truly independent web browser
- paperless-ngx / paperless-ngx
A community-supported supercharged document management system: scan, index and archive all your documents
- dyad-sh / dyad
Free, local, open-source AI app builder ✨ v0 / lovable / Bolt alternative 🌟 Star if you like it!
- k2-fsa / sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, with no Internet connection required. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, WebSocket server/client, and 12 programming languages.
- rossant / awesome-math
A curated list of awesome mathematics resources
- louislam / uptime-kuma
A fancy self-hosted monitoring tool
- harvard-edge / cs249r_book
Introduction to Machine Learning Systems
- yt-dlp / yt-dlp
A feature-rich command-line audio/video downloader
- meta-pytorch / torchforge
PyTorch-native post-training at scale
- lukasmasuch / best-of-ml-python
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
- guofei9987 / scikit-opt
Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization, Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution, and TSP (Traveling Salesman Problem)
- jaywcjlove / awesome-mac
A collection of premium macOS software in various categories; the project has grown far beyond its original idea.
Hugging Face (15)
- Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
In this technical report, we present the Ring-linear model series, specifically including Ring-mini-linear-2.0 and Ring-flash-linear-2.0. Ring-mini-linear-2.0 comprises 16B parameters and 957M activations, while Ring-flash-linear-2.0 contains 104B parameters and 6.1B activations. Both models adopt a hybrid architecture that effectively integrates linear attention and softmax attention, significantly reducing I/O and computational overhead in long-context inference scenarios. Compared to a 32 billion parameter dense model, this series reduces inference cost to 1/10, and compared to the original Ring series, the cost is also reduced by over 50%. Furthermore, through systematic exploration of the ratio between different attention mechanisms in the hybrid architecture, we have identified the currently optimal model structure. Additionally, by leveraging our self-developed high-performance FP8 operator library-linghe, overall training efficiency has been improved by 50%. Benefiting from the high alignment between the training and inference engine operators, the models can undergo long-term, stable, and highly efficient optimization during the reinforcement learning phase, consistently maintaining SOTA performance across multiple challenging complex reasoning benchmarks.
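The core idea in the abstract — interleaving O(n) linear attention with occasional full softmax attention — can be sketched in a few lines of numpy. This is a toy illustration only, not the Ring-linear implementation; the feature map, the `softmax_every` schedule, and all names here are assumptions for exposition.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # O(n^2) in sequence length: full n-by-n attention matrix
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # O(n) in sequence length: associativity lets us form (K^T V) once
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                    # (d, d) summary, independent of n
    z = Qf @ Kf.sum(axis=0)          # per-query normalizer
    return (Qf @ kv) / z[:, None]

def hybrid_block(x, Wq, Wk, Wv, layer_idx, softmax_every=4):
    # Hybrid schedule: mostly linear attention, softmax every few layers
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    if layer_idx % softmax_every == softmax_every - 1:
        return softmax_attention(Q, K, V)
    return linear_attention(Q, K, V)
```

The `(K^T V)` summary is what cuts long-context I/O: it is a fixed `(d, d)` matrix no matter how long the sequence grows.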
- BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
Reinforcement learning (RL) has recently become the core paradigm for aligning and strengthening large language models (LLMs). Yet, applying RL in off-policy settings--where stale data from past policies are used for training--improves sample efficiency, but remains challenging: policy entropy declines sharply, optimization often becomes unstable and may even collapse. Through theoretical and empirical analysis, we identify two key insights: (i) an imbalance in optimization, where negative-advantage samples dominate the policy gradient, suppressing useful behaviors and risking gradient explosions; and (ii) the derived Entropy-Clip Rule, which reveals that the fixed clipping mechanism in PPO-like objectives systematically blocks entropy-increasing updates, thereby driving the policy toward over-exploitation at the expense of exploration. Building on these insights, we propose BAlanced Policy Optimization with Adaptive Clipping (BAPO), a simple yet effective method that dynamically adjusts clipping bounds to adaptively re-balance positive and negative contributions, preserve entropy, and stabilize RL optimization. Across diverse off-policy scenarios--including sample replay and partial rollout--BAPO achieves fast, stable, and data-efficient training. On AIME 2024 and AIME 2025 benchmarks, our 7B BAPO model surpasses open-source counterparts such as SkyWork-OR1-7B, while our 32B BAPO model not only achieves state-of-the-art results among models of the same scale but also outperforms leading proprietary systems like o3-mini and Gemini-2.5-Flash-Thinking.
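The re-balancing mechanism described above — adjusting PPO-style clipping bounds so negative-advantage samples stop dominating the gradient — can be sketched roughly as follows. This is a toy numpy sketch under assumptions; BAPO's actual bound-update rule is defined in the paper, and the `target_pos_frac` heuristic here is invented for illustration.

```python
import numpy as np

def ppo_clip_loss(ratios, advantages, c_low, c_high):
    # PPO-style clipped surrogate, but with asymmetric bounds
    unclipped = ratios * advantages
    clipped = np.clip(ratios, 1 - c_low, 1 + c_high) * advantages
    return -np.minimum(unclipped, clipped).mean()

def adapt_bounds(advantages, c_low, c_high, target_pos_frac=0.5, step=0.01):
    # If negative-advantage samples dominate the batch, widen the upper
    # bound (letting more positive, entropy-preserving updates through)
    # and tighten the lower one -- the re-balancing idea behind
    # adaptive clipping.
    pos_frac = (advantages > 0).mean()
    if pos_frac < target_pos_frac:
        c_high += step
        c_low = max(c_low - step, 0.05)
    else:
        c_high = max(c_high - step, 0.05)
        c_low += step
    return c_low, c_high
```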
- LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
Reasoning over long contexts is essential for large language models. While reinforcement learning (RL) enhances short-context reasoning by inducing "Aha" moments in chain-of-thought, the advanced thinking patterns required for long-context reasoning remain largely unexplored, and high-difficulty RL data are scarce. In this paper, we introduce LoongRL, a data-driven RL method for advanced long-context reasoning. Central to LoongRL is KeyChain, a synthesis approach that transforms short multi-hop QA into high-difficulty long-context tasks by inserting UUID chains that hide the true question among large collections of distracting documents. Solving these tasks requires the model to trace the correct chain step-by-step, identify the true question, retrieve relevant facts and reason over them to answer correctly. RL training on KeyChain data induces an emergent plan-retrieve-reason-recheck reasoning pattern that generalizes far beyond training length. Models trained at 16K effectively solve 128K tasks without prohibitive full-length RL rollout costs. On Qwen2.5-7B and 14B, LoongRL substantially improves long-context multi-hop QA accuracy by +23.5% and +21.1% absolute gains. The resulting LoongRL-14B reaches a score of 74.2, rivaling much larger frontier models such as o3-mini (74.5) and DeepSeek-R1 (74.9). It also improves long-context retrieval, passes all 128K needle-in-a-haystack stress tests, and preserves short-context reasoning capabilities.
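The KeyChain construction — hiding the true question behind a chain of UUID pointers buried among distractor documents — might look roughly like this. A minimal sketch: the function name, document phrasing, and shuffling scheme are all invented for illustration.

```python
import random
import uuid

def build_keychain_task(question, distractors, chain_len=4, seed=0):
    rng = random.Random(seed)
    ids = [str(uuid.uuid4()) for _ in range(chain_len)]
    # Each link points to the next UUID; the final link holds the question.
    links = [f"Key {ids[i]} leads to {ids[i + 1]}."
             for i in range(chain_len - 1)]
    links.append(f"Key {ids[-1]} holds the question: {question}")
    docs = links + list(distractors)
    rng.shuffle(docs)  # bury the chain among the distracting documents
    prompt = f"Start from key {ids[0]} and answer the hidden question."
    return prompt, docs
```

Solving the task forces the plan-retrieve-reason-recheck behavior the paper describes: the model must follow the chain hop by hop before it even knows what is being asked.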
- Language Models are Injective and Hence Invertible
Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the input from a model's representations. In this paper, we challenge this view. First, we prove mathematically that transformer language models mapping discrete input sequences to their corresponding sequence of continuous representations are injective and therefore lossless, a property established at initialization and preserved during training. Second, we confirm this result empirically through billions of collision tests on six state-of-the-art language models, and observe no collisions. Third, we operationalize injectivity: we introduce SipIt, the first algorithm that provably and efficiently reconstructs the exact input text from hidden activations, establishing linear-time guarantees and demonstrating exact invertibility in practice. Overall, our work establishes injectivity as a fundamental and exploitable property of language models, with direct implications for transparency, interpretability, and safe deployment.
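The collision-test side of the result can be sketched: encode many distinct inputs and check that no two map to the same representation. Below is a toy numpy stand-in for a real transformer (position-weighted embeddings plus a nonlinearity), so the encoder, tolerance, and all names are assumptions, not the paper's SipIt algorithm.

```python
import numpy as np

def encode(tokens, emb, W):
    # Toy deterministic "model": position-weighted embedding sum + tanh.
    # Position weighting keeps permuted sequences from trivially colliding.
    pos = np.arange(1, len(tokens) + 1)[:, None]
    h = (emb[np.array(tokens)] * pos).sum(axis=0)
    return np.tanh(W @ h)

def collision_test(sequences, emb, W, tol=1e-9):
    # Injectivity check: distinct inputs should give distinct states.
    seen = {}
    for seq in sequences:
        key = tuple(np.round(encode(seq, emb, W) / tol).astype(np.int64))
        if key in seen and seen[key] != seq:
            return seq, seen[key]   # found a collision
        seen[key] = seq
    return None
```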
- GigaBrain-0: A World Model-Powered Vision-Language-Action Model
Training Vision-Language-Action (VLA) models for generalist robots typically requires large-scale real-world robot data, which is expensive and time-consuming to collect. The inefficiency of physical data collection severely limits the scalability, and generalization capacity of current VLA systems. To address this challenge, we introduce GigaBrain-0, a novel VLA foundation model empowered by world model-generated data (e.g., video generation, real2real transfer, human transfer, view transfer, sim2real transfer data). By leveraging world models to generate diverse data at scale, GigaBrain-0 significantly reduces reliance on real robot data while improving cross-task generalization. Our approach further improves policy robustness through RGBD input modeling and embodied Chain-of-Thought (CoT) supervision, enabling the model to reason about spatial geometry, object states, and long-horizon dependencies during task execution. This leads to substantial gains in real-world performance on dexterous, long-horizon, and mobile manipulation tasks. Extensive experiments demonstrate that GigaBrain-0 achieves superior generalization across variations in appearances (e.g., textures, colors), object placements, and camera viewpoints. Additionally, we present GigaBrain-0-Small, an optimized lightweight variant designed to run efficiently on devices such as the NVIDIA Jetson AGX Orin.
- Attention Sinks in Diffusion Language Models
Masked Diffusion Language Models (DLMs) have recently emerged as a promising alternative to traditional Autoregressive Models (ARMs). DLMs employ transformer encoders with bidirectional attention, enabling parallel token generation while maintaining competitive performance. Although their efficiency and effectiveness have been extensively studied, the internal mechanisms that govern DLMs remain largely unexplored. In this work, we conduct an empirical analysis of DLM attention patterns, focusing on the attention sinking phenomenon, an effect previously observed in various transformer-based architectures. Our findings reveal that DLMs also exhibit attention sinks, but with distinct characteristics. First, unlike in ARMs, the sink positions in DLMs tend to shift throughout the generation process, displaying a dynamic behaviour. Second, while ARMs are highly sensitive to the removal of attention sinks, DLMs remain robust: masking sinks leads to only a minor degradation in performance. These results provide new insights into the inner workings of diffusion-based language models and highlight fundamental differences in how they allocate and utilize attention compared to autoregressive models.
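An attention sink can be measured directly from an attention map: it is a key position that absorbs a disproportionate share of attention mass across queries. A minimal detector (toy numpy sketch; the 0.3 threshold is illustrative, not from the paper):

```python
import numpy as np

def find_attention_sinks(attn, threshold=0.3):
    # attn: (num_queries, num_keys) row-stochastic attention map.
    # A "sink" key receives a large average share of attention mass.
    col_mass = attn.mean(axis=0)   # average attention each key receives
    return np.flatnonzero(col_mass > threshold)
```

Tracking how the returned sink indices move across denoising steps would expose the dynamic sink behavior the paper reports for DLMs.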
- Unified Reinforcement and Imitation Learning for Vision-Language Models
Vision-Language Models (VLMs) have achieved remarkable progress, yet their large scale often renders them impractical for resource-constrained environments. This paper introduces Unified Reinforcement and Imitation Learning (RIL), a novel and efficient training algorithm designed to create powerful, lightweight VLMs. RIL distinctively combines the strengths of reinforcement learning with adversarial imitation learning. This enables smaller student VLMs not only to mimic the sophisticated text generation of large teacher models but also to systematically improve their generative capabilities through reinforcement signals. Key to our imitation framework is an LLM-based discriminator that adeptly distinguishes between student and teacher outputs, complemented by guidance from multiple large teacher VLMs to ensure diverse learning. This unified learning strategy, leveraging both reinforcement and imitation, empowers student models to achieve significant performance gains, making them competitive with leading closed-source VLMs. Extensive experiments on diverse vision-language benchmarks demonstrate that RIL significantly narrows the performance gap with state-of-the-art open- and closed-source VLMs and, in several instances, surpasses them.
- VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos
Training computer-use agents requires massive amounts of GUI interaction data, but manually annotating action trajectories at scale is prohibitively expensive. We present VideoAgentTrek, a scalable pipeline that automatically mines training data from publicly available screen-recorded videos at web scale, eliminating the need for manual annotation. Our approach addresses a key challenge: raw videos contain implicit demonstrations but lack explicit action labels. To solve this, we develop Video2Action, an inverse dynamics module (IDM) with two components: (1) a video grounding model that detects and localizes GUI actions with precise temporal boundaries and context, and (2) an action-content recognizer that extracts structured parameters like click coordinates and typed text with high fidelity. Applied to 39,000 YouTube tutorial videos, our pipeline generates 1.52 million interaction steps automatically. We leverage this data through continued pretraining followed by supervised fine-tuning. On OSWorld-Verified, our approach improves task success rates from 9.3% (SFT-only baseline) to 15.8%, a 70% relative improvement. On AgentNetBench, step accuracy increases from 64.1% to 69.3%. Our results demonstrate that passive internet videos can be transformed into high-quality supervision for computer-use agents, providing a scalable alternative to expensive manual annotation.
- DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents
Mobile Phone Agents (MPAs) have emerged as a promising research direction due to their broad applicability across diverse scenarios. While Multimodal Large Language Models (MLLMs) serve as the foundation for MPAs, their effectiveness in handling multiple mobile phone tasks simultaneously remains limited. Although multitask supervised fine-tuning (SFT) is widely adopted for multitask learning, existing approaches struggle to determine optimal training data compositions for peak performance. To address this challenge, we propose DaMo (Data Mixture Optimizer) - a novel solution employing a trainable network that predicts optimal data mixtures by forecasting downstream task performance for any given dataset ratio. To support comprehensive evaluation, we introduce PhoneAgentBench, the first specialized benchmark to evaluate MLLMs on multimodal mobile phone tasks, comprising 1235 QA pairs spanning diverse real-world industrial mobile application scenarios. Demonstrating strong predictive capability (R^2=0.81) in small-scale pilot experiments, DaMo efficiently extrapolates optimal data mixing configurations. Our results show DaMo achieves a 3.38% performance improvement on PhoneAgentBench compared to alternative methods. Furthermore, extensive experiments across established benchmarks including BFCL-v3, MME-Reasoning, MME-Perception, and OCRBench reveal DaMo's superior generalization, outperforming other approaches by 2.57% in terms of average score. When used solely for MLLM optimization on the BFCL-v3 task, DaMo improves the metrics by 12.47% than other methods. Notably, DaMo maintains robust scalability, preserving its effectiveness when applied to other model architectures. The code and dataset are available at https://github.com/OPPO-Mente-Lab/DaMo.git
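The generic shape of the DaMo idea — fit a small predictor from data-mixture ratios to downstream performance, then search candidate mixtures — can be sketched with a ridge regression. This is a hypothetical stand-in: DaMo itself uses a trainable network, and every name and number below is illustrative.

```python
import numpy as np

def fit_mixture_predictor(ratios, scores, reg=1e-3):
    # Ridge regression from mixture ratios (rows sum to 1) to a score.
    X = np.hstack([ratios, np.ones((len(ratios), 1))])  # add bias column
    w = np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ scores)
    return w

def best_mixture(w, candidates):
    # Pick the candidate mixture with the highest predicted score.
    X = np.hstack([candidates, np.ones((len(candidates), 1))])
    return candidates[np.argmax(X @ w)]
```

The point of the pilot-then-extrapolate design is that only a handful of small training runs are needed to rank many untried mixtures.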
- Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
Recent advances in multimodal models have demonstrated remarkable text-guided image editing capabilities, with systems like GPT-4o and Nano-Banana setting new benchmarks. However, the research community's progress remains constrained by the absence of large-scale, high-quality, and openly accessible datasets built from real images. We introduce Pico-Banana-400K, a comprehensive 400K-image dataset for instruction-based image editing. Our dataset is constructed by leveraging Nano-Banana to generate diverse edit pairs from real photographs in the OpenImages collection. What distinguishes Pico-Banana-400K from previous synthetic datasets is our systematic approach to quality and diversity. We employ a fine-grained image editing taxonomy to ensure comprehensive coverage of edit types while maintaining precise content preservation and instruction faithfulness through MLLM-based quality scoring and careful curation. Beyond single turn editing, Pico-Banana-400K enables research into complex editing scenarios. The dataset includes three specialized subsets: (1) a 72K-example multi-turn collection for studying sequential editing, reasoning, and planning across consecutive modifications; (2) a 56K-example preference subset for alignment research and reward model training; and (3) paired long-short editing instructions for developing instruction rewriting and summarization capabilities. By providing this large-scale, high-quality, and task-rich resource, Pico-Banana-400K establishes a robust foundation for training and benchmarking the next generation of text-guided image editing models.
- Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation
Multimodal large language models (MLLMs) demonstrate strong video understanding by attending to visual tokens relevant to textual queries. To directly adapt this for localization in a training-free manner, we cast video reasoning segmentation as a video QA task and extract attention maps via rollout mechanism. However, raw attention maps are noisy and poorly aligned with object regions. We propose Decomposed Attention Fusion (DecAF), which refines these maps through two mechanisms: (1) contrastive object-background fusion and (2) complementary video-frame fusion. This method suppresses irrelevant activations and enhances object-focused cues, enabling direct conversion of attention maps into coarse segmentation masks. In addition, we introduce attention-guided SAM2 prompting for obtaining fine-grained masks. Unlike existing methods that jointly train MLLMs with SAM, our method operates entirely without retraining. DecAF outperforms training-free methods and achieves performance comparable to training-based methods on both referring and reasoning VOS benchmarks. The code will be available at https://github.com/HYUNJS/DecAF.
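The contrastive object-background fusion step can be sketched as subtracting a background query's attention map from the object query's map, suppressing activations the two share, then thresholding into a coarse mask. A toy numpy sketch; DecAF's actual fusion and SAM2 prompting are more involved.

```python
import numpy as np

def contrastive_fusion(obj_attn, bg_attn):
    # Suppress activations shared with the background query
    return np.maximum(obj_attn - bg_attn, 0.0)

def to_mask(attn_map, quantile=0.8):
    # Coarse binary mask: keep only the top activations
    return attn_map >= np.quantile(attn_map, quantile)
```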
- Directional Reasoning Injection for Fine-Tuning MLLMs
Multimodal large language models (MLLMs) are rapidly advancing, yet their reasoning ability often lags behind that of strong text-only counterparts. Existing methods to bridge this gap rely on supervised fine-tuning over large-scale multimodal reasoning data or reinforcement learning, both of which are resource-intensive. A promising alternative is model merging, which interpolates parameters between reasoning-enhanced LLMs and multimodal variants. However, our analysis shows that naive merging is not always a "free lunch": its effectiveness varies drastically across model families, with some (e.g., LLaVA, Idefics) benefiting while others (e.g., Qwen) suffer performance degradation. To address this, we propose Directional Reasoning Injection for Fine-Tuning (DRIFT) MLLMs, a lightweight method that transfers reasoning knowledge in the gradient space, without destabilizing multimodal alignment. DRIFT precomputes a reasoning prior as the parameter-space difference between reasoning and multimodal variants, then uses it to bias gradients during multimodal fine-tuning. This approach preserves the simplicity of standard supervised fine-tuning pipelines while enabling efficient reasoning transfer. Extensive experiments on multimodal reasoning benchmarks, including MathVista and MathVerse, demonstrate that DRIFT consistently improves reasoning performance over naive merging and supervised fine-tuning, while matching or surpassing training-heavy methods at a fraction of the cost.
- FinSight: Towards Real-World Financial Deep Research
Generating professional financial reports is a labor-intensive and intellectually demanding process that current AI systems struggle to fully automate. To address this challenge, we introduce FinSight (Financial InSight), a novel multi agent framework for producing high-quality, multimodal financial reports. The foundation of FinSight is the Code Agent with Variable Memory (CAVM) architecture, which unifies external data, designed tools, and agents into a programmable variable space, enabling flexible data collection, analysis and report generation through executable code. To ensure professional-grade visualization, we propose an Iterative Vision-Enhanced Mechanism that progressively refines raw visual outputs into polished financial charts. Furthermore, a two stage Writing Framework expands concise Chain-of-Analysis segments into coherent, citation-aware, and multimodal reports, ensuring both analytical depth and structural consistency. Experiments on various company and industry-level tasks demonstrate that FinSight significantly outperforms all baselines, including leading deep research systems in terms of factual accuracy, analytical depth, and presentation quality, demonstrating a clear path toward generating reports that approach human-expert quality.
- olmOCR 2: Unit Test Rewards for Document OCR
We present olmOCR 2, the latest in our family of powerful OCR systems for converting digitized print documents, like PDFs, into clean, naturally ordered plain text. olmOCR 2 is powered by olmOCR-2-7B-1025, a specialized, 7B vision language model (VLM) trained using reinforcement learning with verifiable rewards (RLVR), where our rewards are a diverse set of binary unit tests. To scale unit test creation, we develop a pipeline for generating synthetic documents with diverse and challenging layouts, known ground-truth HTML source code, and extracted test cases. We show that RL training on these test cases results in state-of-the-art performance on olmOCR-Bench, our English-language OCR benchmark, with the largest improvements in math formula conversion, table parsing, and multi-column layouts compared to previous versions. We release our model, data and code under permissive open licenses.
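The reward structure is simple to state: each document comes with binary unit tests over the OCR output, and the RLVR reward is the fraction passed. A minimal sketch (the predicate tests below are invented examples, not olmOCR's actual test suite):

```python
def unit_test_reward(text, tests):
    # RLVR-style reward: fraction of binary unit tests the output passes.
    return sum(1 for test in tests if test(text)) / len(tests)
```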
- Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
As large language models (LLMs) are increasingly used in human-AI interactions, their social reasoning capabilities in interpersonal contexts are critical. We introduce SCRIPTS, a 1k-dialogue dataset in English and Korean, sourced from movie scripts. The task involves evaluating models' social reasoning capability to infer the interpersonal relationships (e.g., friends, sisters, lovers) between speakers in each dialogue. Each dialogue is annotated with probabilistic relational labels (Highly Likely, Less Likely, Unlikely) by native (or equivalent) Korean and English speakers from Korea and the U.S. Evaluating nine models on our task, current proprietary LLMs achieve around 75-80% on the English dataset, whereas their performance on Korean drops to 58-69%. More strikingly, models select Unlikely relationships in 10-25% of their responses. Furthermore, we find that thinking models and chain-of-thought prompting, effective for general reasoning, provide minimal benefits for social reasoning and occasionally amplify social biases. Our findings reveal significant limitations in current LLMs' social reasoning capabilities, highlighting the need for efforts to develop socially-aware language models.
Solidot (15)
- Fedora approves policy on AI use
The Fedora Council has approved the latest version of its AI-Assisted Contributions policy, allowing developers to use AI to assist with coding or polish text. Developers bear full responsibility for their contributions regardless of how much was AI-generated. If most of a contribution was produced by AI tools, the developer must disclose the tool usage; no disclosure is required when AI is used only to fix grammar and spelling or polish wording. AI tools may be used to provide analysis and suggestions to reviewers, but reviewers must not treat the AI's judgment as the final verdict.
- NVIDIA China Developer Day 2025 to be held in Suzhou on November 14
Open to AI startup teams, technical founders, developers, AI engineers, and technical decision-makers, the event covers LLM application development, robotics, and physical AI, with a main forum, three technical tracks, a full day of hands-on AI training and toolchain demos, and face-to-face time with NVIDIA engineers and industry deployment partners. Attendees can also sit a free NVIDIA Certified Associate (NCA) certification exam (normally 960 RMB, waived for the event; only 100 free seats), choosing one of three subjects: NCA-GENL (generative AI / LLM development), NCA-GENM (multimodal generative AI: text/image/audio), or NCA-AIIO (AI infrastructure and operations). Registration is open now: https://developer.nvidia.cn/developer-day?ncid=pa-so-zdn-510609-vt16 (startup sign-up: https://jinshuju.com/f/Uh4yZ6?x_field_1=zhiding).
- Google says its quantum computer achieved the first verifiable quantum advantage
Google announced that its quantum computer has achieved the first verifiable quantum advantage, running an algorithm 13,000 times faster than a leading supercomputer. The algorithm, called Quantum Echoes, produces results that can be reproduced on a quantum computer of similar quality. The machine uses the Willow chip, announced in December 2024, which can entangle 105 qubits. Michel H. Devoret, winner of the 2025 Nobel Prize in Physics, who joined Google in 2023, said future quantum computers will be able to run computations that are impossible for classical algorithms.
- AWS outage breaks smart mattresses
Amazon AWS, the largest cloud provider, suffered an hours-long outage early Monday that affected countless websites and services, among them smart mattress maker Eight Sleep. Customers with connected mattresses reported being woken by malfunctioning beds: beds locked in an upright incline, temperatures turning unbearable, lights flashing, even alarms sounding. Eight Sleep CEO Matteo Franceschetti apologized, saying this was not the experience the company wants to deliver, and said engineers were rushing to build a failsafe mode against similar incidents in the future. Franceschetti said all mattresses were back to normal by Monday evening, though some still had "data-processing delays." Eight Sleep's connected mattresses can set the bed temperature between 13 and 48 degrees Celsius, raise the body into different positions, and activate immersive "soundscapes" and vibration alarms. The most advanced package retails for over $5,000, plus an annual subscription of $199 to $399 to enable temperature control.
- SpaceX disables more than 2,500 Starlink terminals at Myanmar scam compounds
After Myanmar state media reported that the military had raided a scam operation and seized dozens of Starlink terminals, Lauren Dreyer, SpaceX's VP of Starlink business operations, announced the company had disabled more than 2,500 Starlink terminals at suspected scam compounds. Starlink is not licensed to operate in Myanmar, and Dreyer did not say how the terminals were shut off, but Starlink can disable individual terminals by their ID numbers or use geofencing to block signal reception in specific areas. Myanmar state media said the military raided the KK Park compound, detaining 2,198 people and seizing 30 Starlink terminals. Satellite and drone imagery shows suspected scam-compound facilities still being built on a large scale along the Thai-Myanmar border, with heavy use of Starlink terminals at those sites.
- Meta cuts 600 jobs in its AI division
Meta's AI division, Superintelligence Labs, announced 600 layoffs, a move intended to make the AI organization more agile and responsive. The cuts mainly affect the Facebook Artificial Intelligence Research (FAIR) unit and teams focused on product-related AI and AI infrastructure. The newly formed TBD Lab, which develops Meta's next-generation AI foundation models, is unaffected. Chief AI Officer Alexandr Wang said reducing headcount will streamline decision-making and give each role more responsibility, scope, and impact. Meta says it is encouraging affected employees to apply for other internal positions. Superintelligence Labs comprises the TBD Lab, foundations, product, and FAIR units.
- VST 3 open-sourced under the MIT license
German music-production software company Steinberg has released the VST 3.8 SDK and announced that VST 3 is now open source under the MIT license, with the source code hosted on GitHub. Steinberg, founded in 1984 and acquired by Japan's Yamaha in 2004, first released the VST (Virtual Studio Technology) music-software interface in 1996 as part of its flagship sequencer Cubase. Today there are thousands of VST plugins in the music-software world.
- New Delhi air pollution hits a five-year high
Air pollution in New Delhi reached its worst level in five years this week, as Diwali (the festival of lights) fireworks and farm-field burning combined to shroud the city in toxic smog. AQI readings in parts of Delhi exceeded 500, and peak PM2.5 and PM10 concentrations in some areas reached 1,800, 20 times the level the WHO considers healthy. To curb pollution, Delhi has banned the sale and use of fireworks during Diwali since 2020, but earlier this year the ruling nationalist Bharatiya Janata Party (BJP) petitioned the Supreme Court to relax the ban, arguing for a balance between tradition and the environment and asking to permit "green fireworks," which emit roughly 30% less pollution than conventional ones. The court granted the request, and this week's Diwali fireworks drove pollution to the five-year high.
- New cancer therapy uses LEDs and tin nanosheets to kill cancer cells
According to a study published in ACS Nano, a new cancer therapy combines LED light with a targeted tin-based nanosheet material called SnOx to destroy cancer cells while protecting healthy ones, avoiding the painful side effects of treatments such as chemotherapy. In experiments, just 30 minutes of illumination killed up to 92% of skin cancer cells and 50% of colorectal cancer cells, with no harmful effect on healthy human skin cells, demonstrating the approach's selectivity and safety. Cancer is the world's second-leading cause of death, and treating it remains challenging. One therapy researchers are exploring is near-infrared photothermal therapy, which uses light to selectively heat cancer cells and trigger their death.
- AI assistants misrepresent news content 45% of the time
A large study coordinated by the European Broadcasting Union (EBU) and led by the BBC found that AI assistants misrepresent news content 45% of the time, regardless of the language, region, or platform tested. The study covered four AI platforms: ChatGPT, Copilot, Gemini, and Perplexity. 45% of AI answers had at least one significant problem; 31% had serious sourcing issues; 20% had major accuracy problems, including fabricated details and outdated information. Gemini performed worst, with significant problems in 76% of its answers, more than double the other assistants, mostly attributable to its sourcing issues.
- Walking 4,000+ steps on one or two days a week linked to lower heart disease and death risk in older women
According to a study published in The British Journal of Sports Medicine, researchers at Mass General Brigham analyzed one week of step counts and ten years of mortality and cardiovascular disease outcomes for 13,547 older women, finding that reaching 4,000 steps on one or two days a week was associated with lower risk of death and cardiovascular disease, with more steps bringing greater benefit until the risk reduction leveled off. The women, averaging 71.8 years old, wore ActiGraph GT3X+ accelerometers for one week between 2011 and 2015 and were grouped by their step counts into 4,000-, 5,000-, 6,000-, and 7,000-step groups. Compared with women who never reached 4,000 steps, those who hit 4,000 steps on one or two days a week had a 26% lower risk of death and a 27% lower risk of cardiovascular disease; those reaching 4,000 steps on three or more days a week had a 40% lower risk of death. Among women with higher step counts, the reduction in cardiovascular risk plateaued.
- DigiKam 8.8.0 released
The photo management application DigiKam has released v8.8.0. The new version improves performance, stability, and user experience, particularly in image processing, color management, and workflow efficiency. Highlights include: core code migrated to Qt 6.10.0; improved Tag Management with import/export of tag hierarchies; focus-point visualization for some camera models; automatic use of the monitor's color profile; a background-blur tool; and more.
- Global coal consumption hit a record high in 2024
According to the State of Climate Action, the annual report of the think tank World Resources Institute, global coal use hit an all-time high in 2024 despite countries' commitments to shift to clean energy. Nations have failed to meet their greenhouse-gas reduction targets, and emissions continue to rise, though the growth has slowed. While most governments have pledged to phase down coal, some countries keep pushing it: Indian Prime Minister Narendra Modi celebrated coal output passing 1 billion tonnes this year, and US President Donald Trump has declared support for fossil fuels and tried to halt renewable-energy projects. The good news is that renewable power generation is growing exponentially. More than a fifth of new cars sold last year were electric; in China, the share was close to half.
- Musk declares war on NASA's acting administrator
NASA acting administrator Sean Duffy publicly questioned the agency's most important contractor, SpaceX, on television Monday, saying SpaceX is behind schedule on its lunar lander and that he is considering modifying the contract. The move stands out because delays are routine in spaceflight; nearly every NASA program slips. Duffy may be signaling to Trump both his determination to return to the Moon before China and his willingness to confront SpaceX. On Tuesday, SpaceX CEO Elon Musk hit back, calling him "Sean Dummy." The spat is bad news for NASA: layoffs and voluntary retirements have already shrunk the agency by a fifth, and Duffy, the sitting Secretary of Transportation, has expressed a desire to fold NASA into the Department of Transportation, which would place the agency entirely under his purview and end its status as an independent agency.
- Valkey 9.0.0 released
The open-source distributed key-value database Valkey has released v9.0.0. Valkey is a fork of Redis: in March 2024, Redis switched from the open-source 3-clause BSD license to a dual license, the Redis Source Available License (RSALv2) and the Server Side Public License (SSPLv1), which require authorization for commercial use and no longer qualify as open source, prompting the community to create the Valkey fork. Although Redis switched back to the open-source AGPLv3 license in May 2025, Valkey has continued to develop independently. New features in Valkey 9.0.0 include Multipath TCP (MPTCP) support, new client command filters, and multi-database support in cluster mode.