OrangeBot.AI Digest — 2025-10-22
60 headlines across 4 sources, aggregated for the day.
Hacker News(15)
- Ovi: Twin backbone cross-modal fusion for audio-video generation (github.com)
- Mass Assignment Vulnerability Exposes Max Verstappen Passport and F1 Drivers PII (ian.sh)
- I see a future in jj (steveklabnik.com)
- JMAP for Calendars, Contacts and Files Now in Stalwart (stalw.art)
- Look, Another AI Browser (manuelmoreale.com)
- Scripts I wrote that I use all the time (evanhahn.com)
- Meta is axing 600 roles across its AI division (www.theverge.com)
- Willow quantum chip demonstrates verifiable quantum advantage on hardware (blog.google)
- AI assistants misrepresent news content 45% of the time (www.bbc.co.uk)
- Tesla Recalls Almost 13,000 EVs over Risk of Battery Power Loss (www.bloomberg.com)
- Internet's biggest annoyance: Cookie laws should target browsers, not websites (nednex.com)
- Greg Newby, CEO of Project Gutenberg Literary Archive Foundation, has died (www.pgdp.net)
- Die shots of as many CPUs and other interesting chips as possible (commons.wikimedia.org)
- MinIO stops distributing free Docker images (github.com)
- Go subtleties (harrisoncramer.me)
GitHub Trending(15)
- mountain-loop / yaak
The most intuitive desktop API client. Organize and execute REST, GraphQL, WebSockets, Server Sent Events, and gRPC 🦬
- servo / servo
Servo aims to empower developers with a lightweight, high-performance alternative for embedding web technologies in applications.
- emcie-co / parlant
LLM agents built for control. Designed for real-world use. Deployed in minutes.
- guofei9987 / blind_watermark
Blind & invisible watermarking for images; the watermark can be extracted without the original image!
- lfnovo / open-notebook
An Open Source implementation of Notebook LM with more flexibility and features
- dyad-sh / dyad
Free, local, open-source AI app builder ✨ v0 / lovable / Bolt alternative 🌟 Star if you like it!
- fishaudio / fish-speech
SOTA Open Source TTS
- huggingface / chat-ui
Open source codebase powering the HuggingChat app
- rossant / awesome-math
A curated list of awesome mathematics resources
- drawdb-io / drawdb
Free, simple, and intuitive online database diagram editor and SQL generator.
- DrewThomasson / ebook2audiobook
Generate audiobooks from e-books, voice cloning & 1107+ languages!
- clockworklabs / SpacetimeDB
Multiplayer at the speed of light
- anthropics / claude-cookbooks
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
- zyronon / TypeWords
Practice English, one keystroke at a time.
- tauri-apps / tauri
Build smaller, faster, and more secure desktop and mobile applications with a web frontend.
Hugging Face(15)
- LightMem: Lightweight and Efficient Memory-Augmented Generation
Despite their remarkable capabilities, Large Language Models (LLMs) struggle to effectively leverage historical interaction information in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interactions by introducing persistent information storage, retrieval, and utilization mechanisms. However, existing memory systems often introduce substantial time and computational overhead. To this end, we introduce a new memory system called LightMem, which strikes a balance between the performance and efficiency of memory systems. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters irrelevant information through lightweight compression and groups information according to their topics. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing content for more structured access. Finally, long-term memory with sleep-time update employs an offline procedure that decouples consolidation from online inference. Experiments on LongMemEval with GPT and Qwen backbones show that LightMem outperforms strong baselines in accuracy (up to 10.9% gains) while reducing token usage by up to 117x, API calls by up to 159x, and runtime by over 12x. The code is available at https://github.com/zjunlp/LightMem.
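A minimal sketch of the three-stage organization the LightMem abstract describes (sensory filtering, topic-grouped short-term memory, offline "sleep-time" consolidation). The helper names and the toy filtering/summarizing rules are hypothetical stand-ins, not the paper's actual API.

```python
from collections import defaultdict

def compress(utterance):
    """Stage 1, sensory memory: drop 'irrelevant' input via a lightweight filter (toy rule)."""
    return utterance if len(utterance.split()) > 3 else None

def topic_of(utterance):
    """Group information by topic; the paper uses a topic-aware step, this is a toy key."""
    return utterance.split()[0].lower()

class LightMemSketch:
    def __init__(self):
        self.short_term = defaultdict(list)  # topic -> recent, filtered utterances
        self.long_term = {}                  # topic -> consolidated summary

    def observe(self, utterance):
        kept = compress(utterance)           # stage 1: sensory filtering
        if kept is not None:
            self.short_term[topic_of(kept)].append(kept)  # stage 2: topic groups

    def sleep_time_update(self):
        """Stage 3: offline consolidation, decoupled from online inference."""
        for topic, items in self.short_term.items():
            self.long_term[topic] = " | ".join(items)     # toy 'summarize'
        self.short_term.clear()

mem = LightMemSketch()
mem.observe("flights: user prefers aisle seats on long-haul trips")
mem.observe("ok")  # filtered out by sensory memory
mem.sleep_time_update()
print(mem.long_term)
```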
- World-in-World: World Models in a Closed-Loop World
Generative world models (WMs) can now simulate worlds with striking visual realism, which naturally raises the question of whether they can endow embodied agents with predictive perception for decision making. Progress on this question has been limited by fragmented evaluation: most existing benchmarks adopt open-loop protocols that emphasize visual quality in isolation, leaving the core issue of embodied utility unresolved, i.e., do WMs actually help agents succeed at embodied tasks? To address this gap, we introduce World-in-World, the first open platform that benchmarks WMs in a closed-loop world that mirrors real agent-environment interactions. World-in-World provides a unified online planning strategy and a standardized action API, enabling heterogeneous WMs for decision making. We curate four closed-loop environments that rigorously evaluate diverse WMs, prioritize task success as the primary metric, and move beyond the common focus on visual quality; we also present the first data scaling law for world models in embodied settings. Our study uncovers three surprises: (1) visual quality alone does not guarantee task success, controllability matters more; (2) scaling post-training with action-observation data is more effective than upgrading the pretrained video generators; and (3) allocating more inference-time compute allows WMs to substantially improve closed-loop performance.
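To make the open-loop vs. closed-loop distinction concrete, here is a toy closed-loop planning episode of the kind the benchmark formalizes: the world model is judged by whether its predictions help an agent succeed at the task, not by how its rollouts look. Every component below is an illustrative stand-in.

```python
def world_model_predict(state, action):
    """Toy world model: predict the next state for a candidate action."""
    return state + action

def task_reward(state, goal):
    return -abs(goal - state)  # closer to goal is better

def closed_loop_episode(goal=10, steps=8):
    state = 0
    for _ in range(steps):
        # Online planning: imagine each candidate action with the WM and
        # pick the one whose *predicted* outcome scores best.
        candidates = [-1, 0, 1, 2]
        action = max(candidates,
                     key=lambda a: task_reward(world_model_predict(state, a), goal))
        state = world_model_predict(state, action)  # environment step (toy)
        if state == goal:
            return True  # task success: the benchmark's primary metric
    return False

print("success:", closed_loop_episode())
```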
- UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
Recent progress in text-to-image (T2I) generation underscores the importance of reliable benchmarks in evaluating how accurately generated images reflect the semantics of their textual prompt. However, (1) existing benchmarks lack the diversity of prompt scenarios and multilingual support, both essential for real-world applicability; (2) they offer only coarse evaluations across primary dimensions, covering a narrow range of sub-dimensions, and fall short in fine-grained sub-dimension assessment. To address these limitations, we introduce UniGenBench++, a unified semantic assessment benchmark for T2I generation. Specifically, it comprises 600 prompts organized hierarchically to ensure both coverage and efficiency: (1) spans across diverse real-world scenarios, i.e., 5 main prompt themes and 20 subthemes; (2) comprehensively probes T2I models' semantic consistency over 10 primary and 27 sub evaluation criteria, with each prompt assessing multiple testpoints. To rigorously assess model robustness to variations in language and prompt length, we provide both English and Chinese versions of each prompt in short and long forms. Leveraging the general world knowledge and fine-grained image understanding capabilities of a closed-source Multi-modal Large Language Model (MLLM), i.e., Gemini-2.5-Pro, an effective pipeline is developed for reliable benchmark construction and streamlined model assessment. Moreover, to further facilitate community use, we train a robust evaluation model that enables offline assessment of T2I model outputs. Through comprehensive benchmarking of both open- and closed-sourced T2I models, we systematically reveal their strengths and weaknesses across various aspects.
- Chem-R: Learning to Reason as a Chemist
Although large language models (LLMs) have significant potential to advance chemical discovery, current LLMs lack core chemical knowledge, produce unreliable reasoning trajectories, and exhibit suboptimal performance across diverse chemical tasks. To address these challenges, we propose Chem-R, a generalizable Chemical Reasoning model designed to emulate the deliberative processes of chemists. Chem-R is trained through a three-phase framework that progressively builds advanced reasoning capabilities, including: 1) Chemical Foundation Training, which establishes core chemical knowledge. 2) Chemical Reasoning Protocol Distillation, incorporating structured, expert-like reasoning traces to guide systematic and reliable problem solving. 3) Multi-task Group Relative Policy Optimization that optimizes the model for balanced performance across diverse molecular- and reaction-level tasks. This structured pipeline enables Chem-R to achieve state-of-the-art performance on comprehensive benchmarks, surpassing leading large language models, including Gemini-2.5-Pro and DeepSeek-R1, by up to 46% on molecular tasks and 66% on reaction tasks. Meanwhile, Chem-R also consistently outperforms the existing chemical foundation models across both molecular and reaction level tasks. These results highlight Chem-R's robust generalization, interpretability, and potential as a foundation for next-generation AI-driven chemical discovery.
- Efficient Long-context Language Model Training by Core Attention Disaggregation
We present core attention disaggregation (CAD), a technique that improves long-context large language model training by decoupling the core attention computation, softmax(QK^T)V, from the rest of the model and executing it on a separate pool of devices. In existing systems, core attention is colocated with other layers; at long context lengths, its quadratic compute growth compared to the near-linear growth of other components causes load imbalance and stragglers across data and pipeline parallel groups. CAD is enabled by two observations. First, core attention is stateless: it has no trainable parameters and only minimal transient data, so balancing reduces to scheduling compute-bound tasks. Second, it is composable: modern attention kernels retain high efficiency when processing fused batches of token-level shards with arbitrary lengths. CAD partitions core attention into token-level tasks and dispatches them to dedicated attention servers, which dynamically rebatch tasks to equalize compute without sacrificing kernel efficiency. We implement CAD in a system called DistCA, which uses a ping-pong execution scheme to fully overlap communication with computation and in-place execution on attention servers to reduce memory use. On 512 H200 GPUs and context lengths up to 512k tokens, DistCA improves end-to-end training throughput by up to 1.35x, eliminates data and pipeline parallel stragglers, and achieves near-perfect compute and memory balance.
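The two observations CAD rests on are easy to see in code: core attention softmax(QK^T)V has no parameters, so token-level shards can be shipped anywhere, and its cost grows with the square of shard length, so load should be balanced by n^2. The greedy rebatching below is a simplified illustration, not DistCA's actual scheduler.

```python
import numpy as np

def core_attention(q, k, v):
    """Stateless core attention for one shard: softmax(QK^T / sqrt(d)) V."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
d = 64
# Shards of very different lengths: the source of stragglers when colocated.
shards = [(rng.normal(size=(n, d)),) * 3 for n in (512, 64, 1024, 128)]

# Greedy rebatching across two "attention servers", balancing quadratic cost.
servers, loads = [[], []], [0, 0]
for q, k, v in sorted(shards, key=lambda s: -s[0].shape[0]):
    i = loads.index(min(loads))
    servers[i].append((q, k, v))
    loads[i] += q.shape[0] ** 2

outs = [[core_attention(q, k, v) for q, k, v in srv] for srv in servers]
print("per-server quadratic load:", loads)
```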
- MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation
Long video generation with Diffusion Transformers (DiTs) is bottlenecked by the quadratic scaling of full attention with sequence length. Since attention is highly redundant, outputs are dominated by a small subset of query-key pairs. Existing sparse methods rely on blockwise coarse estimation, whose accuracy-efficiency trade-offs are constrained by block size. This paper introduces Mixture-of-Groups Attention (MoGA), an efficient sparse attention that uses a lightweight, learnable token router to precisely match tokens without blockwise estimation. Through semantic-aware routing, MoGA enables effective long-range interactions. As a kernel-free method, MoGA integrates seamlessly with modern attention stacks, including FlashAttention and sequence parallelism. Building on MoGA, we develop an efficient long video generation model that end-to-end produces minute-level, multi-shot, 480p videos at 24 fps, with a context length of approximately 580k. Comprehensive experiments on various video generation tasks validate the effectiveness of our approach.
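A sketch of the mixture-of-groups idea from the abstract: a lightweight router assigns each token to a group and full attention runs only inside each group, with no blockwise estimation. The random-projection router here is a stand-in for the paper's learned router.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_tokens, d, n_groups = 256, 32, 4
x = rng.normal(size=(n_tokens, d))

# "Learnable" token router: here just a fixed projection for illustration.
router_w = rng.normal(size=(d, n_groups))
group_id = (x @ router_w).argmax(axis=-1)        # route each token to a group

out = np.zeros_like(x)
dense_cost, grouped_cost = n_tokens ** 2, 0
for g in range(n_groups):
    idx = np.flatnonzero(group_id == g)
    if idx.size == 0:
        continue
    xg = x[idx]                                   # tokens routed to group g
    attn = softmax(xg @ xg.T / np.sqrt(d))        # full attention inside group
    out[idx] = attn @ xg
    grouped_cost += idx.size ** 2

print(f"score pairs: dense={dense_cost}, grouped={grouped_cost}")
```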
- Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
While Multimodal Large Language Models (MLLMs) excel at holistic understanding, they struggle in capturing the dense world with complex scenes, requiring fine-grained analysis of intricate details and object inter-relationships. Region-level MLLMs have been a promising step. However, previous attempts are generally optimized to understand given regions in isolation, neglecting crucial global contexts. To address this, we introduce Grasp Any Region (GAR) for comprehensive region-level visual understanding. Empowered by an effective RoI-aligned feature replay technique, GAR supports (1) precise perception by leveraging necessary global contexts, and (2) modeling interactions between multiple prompts. Together, it then naturally achieves (3) advanced compositional reasoning to answer specific free-form questions about any region, shifting the paradigm from passive description to active dialogue. Moreover, we construct GAR-Bench, which not only provides a more accurate evaluation of single-region comprehension, but also, more importantly, measures interactions and complex reasoning across multiple regions. Extensive experiments have demonstrated that GAR-1B not only maintains the state-of-the-art captioning capabilities, e.g., outperforming DAM-3B +4.5 on DLC-Bench, but also excels at modeling relationships between multiple prompts with advanced comprehension capabilities, even surpassing InternVL3-78B on GAR-Bench-VQA. More importantly, our zero-shot GAR-8B even outperforms in-domain VideoRefer-7B on VideoRefer-BenchQ, indicating its strong capabilities can be easily transferred to videos.
- IF-VidCap: Can Video Caption Models Follow Instructions?
Although Multimodal Large Language Models (MLLMs) have demonstrated proficiency in video captioning, practical applications require captions that follow specific user instructions rather than generating exhaustive, unconstrained descriptions. Current benchmarks, however, primarily assess descriptive comprehensiveness while largely overlooking instruction-following capabilities. To address this gap, we introduce IF-VidCap, a new benchmark for evaluating controllable video captioning, which contains 1,400 high-quality samples. Distinct from existing video captioning or general instruction-following benchmarks, IF-VidCap incorporates a systematic framework that assesses captions on two dimensions: format correctness and content correctness. Our comprehensive evaluation of over 20 prominent models reveals a nuanced landscape: despite the continued dominance of proprietary models, the performance gap is closing, with top-tier open-source solutions now achieving near-parity. Furthermore, we find that models specialized for dense captioning underperform general-purpose MLLMs on complex instructions, indicating that future work should simultaneously advance both descriptive richness and instruction-following fidelity.
- GAS: Improving Discretization of Diffusion ODEs via Generalized Adversarial Solver
While diffusion models achieve state-of-the-art generation quality, they still suffer from computationally expensive sampling. Recent works address this issue with gradient-based optimization methods that distill a few-step ODE diffusion solver from the full sampling process, reducing the number of function evaluations from dozens to just a few. However, these approaches often rely on intricate training techniques and do not explicitly focus on preserving fine-grained details. In this paper, we introduce the Generalized Solver: a simple parameterization of the ODE sampler that does not require additional training tricks and improves quality over existing approaches. We further combine the original distillation loss with adversarial training, which mitigates artifacts and enhances detail fidelity. We call the resulting method the Generalized Adversarial Solver and demonstrate its superior performance compared to existing solver training methods under similar resource constraints. Code is available at https://github.com/3145tttt/GAS.
- Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
We present Ring-1T, the first open-source, state-of-the-art thinking model with trillion-scale parameters. It features 1 trillion total parameters and activates approximately 50 billion per token. Training such models at a trillion-parameter scale introduces unprecedented challenges, including train-inference misalignment, inefficiencies in rollout processing, and bottlenecks in the RL system. To address these, we pioneer three interconnected innovations: (1) IcePop stabilizes RL training via token-level discrepancy masking and clipping, resolving instability from training-inference mismatches; (2) C3PO++ improves resource utilization for long rollouts under a token budget by dynamically partitioning them, thereby obtaining high time efficiency; and (3) ASystem, a high-performance RL framework designed to overcome the systemic bottlenecks that impede trillion-parameter model training. Ring-1T delivers breakthrough results across critical benchmarks: 93.4 on AIME-2025, 86.72 on HMMT-2025, 2088 on CodeForces, and 55.94 on ARC-AGI-v1. Notably, it attains a silver medal-level result on the IMO-2025, underscoring its exceptional reasoning capabilities. By releasing the complete 1T parameter MoE model to the community, we provide the research community with direct access to cutting-edge reasoning capabilities. This contribution marks a significant milestone in democratizing large-scale reasoning intelligence and establishes a new baseline for open-source model performance.
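A heavily hedged sketch of the kind of token-level discrepancy masking the abstract attributes to IcePop: when the training engine and the inference engine disagree too much about a token's log-probability, that token is dropped from the policy-gradient update. The threshold and the exact masking rule below are guesses, not the paper's formulation.

```python
import numpy as np

def icepop_style_mask(logp_train, logp_infer, eps=0.5):
    """Keep tokens whose train/inference log-prob gap is within eps (assumed rule)."""
    return np.abs(logp_train - logp_infer) <= eps

rng = np.random.default_rng(0)
logp_train = rng.normal(-2.0, 0.5, size=16)         # per-token log-probs (trainer)
logp_infer = logp_train + rng.normal(0, 0.6, 16)    # mismatched inference values

mask = icepop_style_mask(logp_train, logp_infer)
advantage = rng.normal(size=16)
# Masked policy-gradient term: discrepant tokens contribute nothing.
pg_loss = -(mask * advantage * logp_train).mean()
print(f"kept {mask.sum()}/16 tokens, loss={pg_loss:.3f}")
```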
- Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning
Faithfully personalizing large language models (LLMs) to align with individual user preferences is a critical but challenging task. While supervised fine-tuning (SFT) quickly reaches a performance plateau, standard reinforcement learning from human feedback (RLHF) also struggles with the nuances of personalization. Scalar-based reward models are prone to reward hacking which leads to verbose and superficially personalized responses. To address these limitations, we propose Critique-Post-Edit, a robust reinforcement learning framework that enables more faithful and controllable personalization. Our framework integrates two key components: (1) a Personalized Generative Reward Model (GRM) that provides multi-dimensional scores and textual critiques to resist reward hacking, and (2) a Critique-Post-Edit mechanism where the policy model revises its own outputs based on these critiques for more targeted and efficient learning. Under a rigorous length-controlled evaluation, our method substantially outperforms standard PPO on personalization benchmarks. Personalized Qwen2.5-7B achieves an average 11% win-rate improvement, and the personalized Qwen2.5-14B model surpasses the performance of GPT-4.1. These results demonstrate a practical path to faithful, efficient, and controllable personalization.
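A minimal sketch of the critique-then-edit loop the abstract describes: the GRM returns multi-dimensional scores plus a textual critique, and the policy revises its own draft using that critique before the RL update. All three callables are hypothetical stand-ins for model calls, and the wiring into PPO is only indicated in a comment.

```python
def policy_generate(prompt):
    return f"(draft answer to: {prompt})"

def grm_evaluate(prompt, answer):
    """Generative reward model: multi-dimensional scores plus a textual critique."""
    scores = {"personalization": 0.4, "conciseness": 0.7}
    critique = "Mention the user's stated preference explicitly; trim filler."
    return scores, critique

def policy_post_edit(prompt, answer, critique):
    """Policy revises its own output, guided by the critique."""
    return f"{answer} [revised per critique: {critique}]"

def critique_post_edit_step(prompt):
    draft = policy_generate(prompt)
    scores, critique = grm_evaluate(prompt, draft)
    revised = policy_post_edit(prompt, draft, critique)
    # Draft and revision (with their scores) would feed the PPO update;
    # textual critiques make reward hacking harder than a single scalar.
    return draft, revised, scores

print(critique_post_edit_step("Recommend a laptop for a user who travels weekly"))
```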
- Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution
Multilingual watermarking aims to make large language model (LLM) outputs traceable across languages, yet current methods still fall short. Despite claims of cross-lingual robustness, they are evaluated only on high-resource languages. We show that existing multilingual watermarking methods are not truly multilingual: they fail to remain robust under translation attacks in medium- and low-resource languages. We trace this failure to semantic clustering, which fails when the tokenizer vocabulary contains too few full-word tokens for a given language. To address this, we introduce STEAM, a back-translation-based detection method that restores watermark strength lost through translation. STEAM is compatible with any watermarking method, robust across different tokenizers and languages, non-invasive, and easily extendable to new languages. With average gains of +0.19 AUC and +40%p TPR@1% on 17 languages, STEAM provides a simple and robust path toward fairer watermarking across diverse languages.
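The back-translation idea is simple enough to sketch directly from the abstract: before running watermark detection on suspect text, translate it back to the watermark's source language and take the stronger of the two detection scores. `translate` and `watermark_score` below are placeholders for any MT system and any existing watermark detector, reflecting the claim that STEAM is detector-agnostic.

```python
def translate(text, target_lang):
    return text  # placeholder: plug in any machine-translation system

def watermark_score(text):
    return 0.0   # placeholder: plug in any watermark detector's score

def steam_detect(suspect_text, source_lang="en"):
    """Detection made robust to translation attacks via back-translation."""
    direct = watermark_score(suspect_text)
    back_translated = translate(suspect_text, target_lang=source_lang)
    recovered = watermark_score(back_translated)
    # Watermark strength lost through translation should reappear in the
    # back-translated copy, so keep whichever signal is stronger.
    return max(direct, recovered)

print(steam_detect("texte suspect possiblement filigrané", source_lang="en"))
```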
- MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues
The recent development of Multimodal Large Language Models (MLLMs) has significantly advanced AI's ability to understand visual modalities. However, existing evaluation benchmarks remain limited to single-turn question answering, overlooking the complexity of multi-turn dialogues in real-world scenarios. To bridge this gap, we introduce MT-Video-Bench, a holistic video understanding benchmark for evaluating MLLMs in multi-turn dialogues. Specifically, our MT-Video-Bench mainly assesses six core competencies that focus on perceptivity and interactivity, encompassing 987 meticulously curated multi-turn dialogues from diverse domains. These capabilities are rigorously aligned with real-world applications, such as interactive sports analysis and multi-turn video-based intelligent tutoring. With MT-Video-Bench, we extensively evaluate various state-of-the-art open-source and closed-source MLLMs, revealing their significant performance discrepancies and limitations in handling multi-turn video dialogues. The benchmark will be publicly available to foster future research.
- ssToken: Self-modulated and Semantic-aware Token Selection for LLM Fine-tuning
Data quality plays a critical role in enhancing supervised fine-tuning (SFT) for large language models (LLMs), and token-level data selection has emerged as a promising direction for its fine-grained nature. Despite their strong empirical performance, existing token-level selection methods share two key limitations: (1) requiring training or accessing an additional reference model, and (2) relying solely on loss information for token selection, which cannot well preserve semantically important tokens that are not favored by loss-based metrics. To address these challenges, we propose ssToken, a Self-modulated and Semantic-aware Token Selection approach. ssToken leverages readily accessible history models to compute the per-token loss difference with the current model, which serves as a self-modulated signal that enables the model to adaptively select tokens along its optimization trajectory, rather than relying on excess loss from an offline-trained reference model as in prior works. We further introduce a semantic-aware, attention-based token importance estimation metric, orthogonal to loss-based selection and providing complementary semantic information for more effective filtering. Extensive experiments across different model families and scales demonstrate that both self-modulated selection and semantic-aware selection alone outperform full-data fine-tuning, while their integration, ssToken, achieves synergistic gains and further surpasses prior token-level selection methods, delivering performance improvements while maintaining training efficiency.
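A sketch of the two signals the ssToken abstract combines: the self-modulated signal (per-token loss difference between a history checkpoint and the current model) and an attention-based semantic importance score. The fusion rule here (top-k of a weighted sum) is an illustrative guess, not the paper's exact criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens = 20
loss_current = rng.uniform(0.5, 3.0, n_tokens)                 # per-token CE loss now
loss_history = loss_current + rng.normal(0.3, 0.4, n_tokens)   # history checkpoint
attn_importance = rng.dirichlet(np.ones(n_tokens))             # semantic signal

# Self-modulated signal: tokens the model is still improving on score high.
excess = loss_history - loss_current

def select_tokens(excess, semantic, alpha=0.7, keep_ratio=0.5):
    """Assumed fusion: weighted sum of the two signals, keep the top fraction."""
    score = alpha * excess + (1 - alpha) * semantic / semantic.max()
    k = int(len(score) * keep_ratio)
    return np.argsort(score)[-k:]   # fine-tune only on the selected tokens

kept = select_tokens(excess, attn_importance)
print(f"fine-tune on {len(kept)}/{n_tokens} tokens:", np.sort(kept))
```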
- UltraGen: High-Resolution Video Generation with Hierarchical Attention
Recent advances in video generation have made it possible to produce visually compelling videos, with wide-ranging applications in content creation, entertainment, and virtual reality. However, most existing diffusion transformer based video generation models are limited to low-resolution outputs (<=720P) due to the quadratic computational complexity of the attention mechanism with respect to the output width and height. This computational bottleneck makes native high-resolution video generation (1080P/2K/4K) impractical for both training and inference. To address this challenge, we present UltraGen, a novel video generation framework that enables i) efficient and ii) end-to-end native high-resolution video synthesis. Specifically, UltraGen features a hierarchical dual-branch attention architecture based on global-local attention decomposition, which decouples full attention into a local attention branch for high-fidelity regional content and a global attention branch for overall semantic consistency. We further propose a spatially compressed global modeling strategy to efficiently learn global dependencies, and a hierarchical cross-window local attention mechanism to reduce computational costs while enhancing information flow across different local windows. Extensive experiments demonstrate that UltraGen can effectively scale pre-trained low-resolution video models to 1080P and even 4K resolution for the first time, outperforming existing state-of-the-art methods and super-resolution based two-stage pipelines in both qualitative and quantitative evaluations.
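A sketch of the global-local decomposition the UltraGen abstract describes: a local branch runs full attention inside spatial windows for regional fidelity, while a global branch attends to a spatially compressed copy of the sequence for overall consistency. Window size, the mean-pooling compression, and the 50/50 merge are illustrative choices, not the paper's.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attend(q, k, v):
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

rng = np.random.default_rng(0)
n, d, window, pool = 256, 32, 32, 8
x = rng.normal(size=(n, d))

# Local branch: full attention inside each window (high-fidelity regions).
local = np.vstack([attend(x[i:i + window], x[i:i + window], x[i:i + window])
                   for i in range(0, n, window)])

# Global branch: every token attends to a pooled, spatially compressed sequence.
x_pooled = x.reshape(n // pool, pool, d).mean(axis=1)   # n/pool global tokens
global_branch = attend(x, x_pooled, x_pooled)

out = 0.5 * local + 0.5 * global_branch  # illustrative merge of the two branches
print("cost ~", n // window, f"windows x {window}^2 + {n} x {n // pool} global pairs")
```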
Solidot(15)
- Musk Declares War on NASA's Acting Administrator
NASA acting administrator Sean Duffy publicly questioned the agency's most important contractor, SpaceX, on television on Monday, saying SpaceX is behind schedule on its lunar lander and that he is considering modifying the contract. Duffy's move is striking, because delays are commonplace in spaceflight: nearly every NASA program runs late. It likely serves two purposes: signaling to Trump his determination to return to the Moon before China, and demonstrating a willingness to confront SpaceX. On Tuesday, SpaceX CEO Elon Musk, out of patience, called him "Sean Dummy." The war of words is bad news for NASA. Layoffs and voluntary retirements have already cut the agency's workforce by a fifth, and Duffy has expressed a desire to fold NASA into the Department of Transportation, where he is the sitting secretary, which would place NASA entirely under his jurisdiction and end its status as an independent agency.
- Valkey 9.0.0 Released
The open-source distributed key-value database Valkey has released v9.0.0. Valkey is a fork of Redis: in March 2024 Redis switched from the open-source 3-clause BSD license to a dual license requiring authorization for commercial use, the Redis Source Available License (RSALv2) and the Server Side Public License (SSPLv1), and thus ceased to be open source, prompting the community to create the Valkey fork. Redis switched back to the open-source AGPLv3 license in May 2025, but Valkey has continued to develop independently. New features in Valkey 9.0.0 include Multipath TCP (MPTCP) support, new client command filters, and multi-database support in cluster mode, among others.
- Foreign Hackers Exploit SharePoint Flaws to Breach US Nuclear Weapons Plant
Foreign hackers exploited unpatched Microsoft SharePoint vulnerabilities to breach the Kansas City National Security Campus (KCNSC), a key US nuclear weapons manufacturing site managed by defense contractor Honeywell Federal Manufacturing & Technologies. The attackers exploited two recently disclosed Microsoft SharePoint vulnerabilities, the spoofing flaw CVE-2025-53770 and the remote code execution flaw CVE-2025-49704, both of which affect on-premises servers. Microsoft released patches on July 19. On July 22, the US National Nuclear Security Administration confirmed it had been attacked.
- OpenAI Releases AI Browser ChatGPT Atlas
OpenAI has released ChatGPT Atlas, a browser deeply integrated with its AI chatbot ChatGPT. The browser ships first for macOS, with Windows, iOS, and Android versions to follow. The Atlas new-tab page is as spare as Google's search box, with a single text prompt inviting users to ask ChatGPT or enter a URL. Users can open a sidebar on the current page to chat with ChatGPT and ask questions grounded in the page's context, and can have ChatGPT edit a Gmail draft directly in the compose window without copy-pasting into a chat window.
- US Narrows Scope of $100,000 H-1B Visa Fee
Last month, US President Trump announced a $100,000 H-1B visa fee, initially declaring that it would apply to all new visa applicants and that H-1B holders outside the US would also need to show proof of payment, which caused considerable confusion. On Monday, US Citizenship and Immigration Services (USCIS) clarified that the fee applies only to new visa applicants outside the US. Employers must pay it once the visa is approved and the applicant is permitted to move to the US. Data show that of the 141,000 new H-1B visas issued in 2024, 54% went to immigrants who already held another type of visa. This means more than half of future applicants, such as international students already studying in the US, will not need to pay the fee.
- PRIMA Chip Restores Vision to Patients Blinded by Macular Degeneration
A team of international research institutions including Stanford Medicine has developed a wireless subretinal microchip that, paired with high-tech glasses, provides true "form vision" for the first time, successfully restoring sight to patients with advanced age-related macular degeneration. In a clinical trial, 27 of the 32 participants who completed one year of follow-up regained the ability to read. The device, called PRIMA, is the first ocular prosthesis to restore functional form vision, the ability to perceive shapes and patterns rather than merely light, to patients with incurable vision loss; previous prosthetic attempts had mostly achieved only light perception. A miniature camera mounted on the glasses captures external images and projects them in real time, via infrared light, onto the wireless chip implanted under the retina. The chip converts the infrared signals into electrical stimulation, standing in for the natural photoreceptor cells destroyed by disease and passing visual information to the retina's still-intact neurons. The device is the product of decades of research, prototyping, animal experiments, and early small-scale human trials. The 38 patients in the trial were all over 60 with advanced age-related macular degeneration and blindness in at least one eye. The disease is the most common cause of irreversible blindness in older people, affecting more than 5 million people worldwide. It destroys the photoreceptor cells in the central retina, but most patients retain some peripheral photoreceptors and the retinal neurons that relay signals; the PRIMA chip exploits this residual function. PRIMA currently provides only black-and-white vision with no intermediate grayscale. The team is developing new software to enable full grayscale imaging, which is essential for advanced visual tasks such as face recognition.
- TikTok Changes Policy, Will No Longer Warn Users in Advance of Government Data Requests
TikTok's previous policy stated that the company would notify affected users before law enforcement obtained their data. It has now changed the policy: it will notify users only where legally required, and it has pushed disclosure back to the moment the user's data is actually handed over. Where it previously said it would refuse law enforcement data requests, it has softened its stance and now says it may refuse such requests. TikTok did not answer questions about whether it has shared, or is sharing, private user information with the US Department of Homeland Security or Immigration and Customs Enforcement (ICE).
- Gaia Telescope Discovers a Great Wave in the Milky Way
Our galaxy is never still: it rotates, and it wobbles. Data from ESA's Gaia space telescope reveal that the Milky Way also harbors a huge wave spreading outward from its center. The great wave stirs the motions of stars on scales of tens of thousands of light-years around the Sun. Like ripples from a stone dropped into a pond, this galactic wave of stars spans a large swath of the Milky Way's outer disk. Scientists still do not know the origin of this galactic quiver; a collision with a dwarf galaxy in the distant past is one possible cause, but more research is needed.
- With SpaceX Behind Schedule, NASA May Pick Another Company for the Lunar Lander
SpaceX previously signed a $2.9 billion contract with NASA to provide the lander that will carry astronauts to the lunar surface. NASA acting administrator Sean Duffy said Monday on CNBC's Squawk Box that SpaceX has pushed back its timeline while the US races to land crews on the Moon before China, and that NASA is considering letting other companies compete with SpaceX to build the lunar lander. If NASA cancels or modifies the SpaceX contract, it could mark a major reversal in NASA's plans.
- US Airliner Suspected of Colliding with a Weather Balloon
United Airlines flight UA1093 from Denver to Los Angeles suffered a windshield strike last Thursday, October 16. One of the two front windshield panels of the 737 MAX cracked severely but did not shatter completely, and the pilot's arm showed what appeared to be cuts. The captain claimed the object that struck the windshield was space debris. The collision occurred at an altitude of around 10,000 meters, and the aircraft diverted to Salt Lake City International Airport. The US National Transportation Safety Board (NTSB) said it is investigating. The CEO of WindBorne Systems later said the object that struck the aircraft may have been one of the company's weather balloons, which was flying at the same altitude in the area at the time.
- KDE Plasma 6.5 Released
The KDE Plasma desktop environment has released v6.5. Major changes include rounded bottom window corners, automatic light/dark theme switching, improved System Settings, a hibernate option on the login screen, improved accessibility features, better HDR display support via adjustable tone-mapping curves, experimental support for the Wayland picture-in-picture protocol, enhanced Wayland functionality, and a new desktop grayscale option, among others.
- China's Southeast Coast Faces Both Accelerating Sea-Level Rise and Land Subsidence
According to a study published in Nature, researchers from Rutgers University, Sun Yat-sen University, and other institutions report that China's southeastern coast is experiencing both accelerating sea-level rise and accelerating land subsidence, a combination they say has never before been observed in the Holocene geological record. The region is among the most densely populated in the world. The researchers analyzed sea-level change over the past 11,700 years and identified three phases: an early period of rapid rise driven by glacial melt, a stable period from 4,200 years ago to the mid-19th century, and, since 1900, a rate of rise exceeding that of any century in the past 4,000 years. Alongside the accelerating rise, the region faces accelerating human-caused subsidence: sinking driven by large-scale urbanization far outpaces sea-level rise, with cities such as Chaozhou, Fuzhou, Shaoxing, Shantou, and Hangzhou subsiding several times faster than the sea is rising. The combined effect raises the region's flood risk.
- EEG of Surgically Disconnected Brain Regions Resembles Deep-Sleep EEG
According to a study published in PLoS Biology, the EEG of surgically disconnected brain regions resembles that of deep sleep, a finding that deepens our understanding of conscious and unconscious brain states. For children with severe epilepsy that does not respond to medication, surgeons may perform a hemispherotomy, severing the connections between the hemisphere where seizures originate and the rest of the brain to stop seizures from spreading. The disconnected tissue is left inside the skull with its blood supply intact. Does such a disconnected region retain, or can it exhibit, some form of consciousness? The researchers analyzed EEGs from ten children before surgery and from six months to three years afterward. They found that electrical activity in the disconnected regions slowed after surgery, while activity in the rest of the brain was unchanged and its EEG resembled that of control children. The disconnected regions' activity was dominated by delta waves, with EEGs resembling those of control children in deep sleep.
- SpaceX Launches Its 10,000th Starlink Satellite
On October 19, SpaceX launched two Falcon 9 rockets, each carrying 28 Starlink broadband satellites to orbit. The first lifted off from Florida, its first stage setting a reuse record with its 31st flight; the second lifted off less than two hours later from Vandenberg Space Force Base in California, carrying the 10,000th Starlink satellite to reach orbit. It was the Falcon 9's 132nd launch of the year, matching last year's record with nearly two and a half months still to go before 2026.
- Do Higher Steam Wishlist Counts Mean Higher Sales?
Games industry research firm GameDiscoverCo published a report analyzing how a game's Steam wishlist count converts into actual sales. When a player wishlists a game, Steam notifies them at launch and when the game goes on sale, boosting exposure and therefore sales. But how does one number convert into the other? GameDiscoverCo's editors tallied first-week sales from September 2024 through September 2025 and the 20 games with the highest conversion rates, finding that high-conversion games generally earn better post-launch reviews and that online co-op titles tend to convert better. Casual-audience bestsellers such as NBA 2K26 and EA Sports FC 25 are rarely wishlisted at all, so their sales-to-wishlist ratios look abnormally high, sometimes reaching 1x to 3x. When a game has a very high wishlist count but low sales, the cause is usually poor post-launch reviews. One exception is NSFW games, whose conversion rates are strikingly high, likely because Steam players who buy adult games tend to belong to a more "hardcore," higher-converting demographic.