OrangeBot.AI Digest — 2025-09-16
60 headlines across 8 sources, aggregated for this day.
Hacker News (15)
- How to make the Framework Desktop run even quieter (noctua.at)
- Denmark close to wiping out cancer-causing HPV strains after vaccine roll-out (www.gavi.org)
- DOJ Deletes Study Showing Domestic Terrorists Are Most Often Right Wing (www.404media.co)
- Scammed out of $130K via fake Google call, spoofed Google email and auth sync (bewildered.substack.com)
- Waymo has received our pilot permit allowing for commercial operations at SFO (waymo.com)
- Bertrand Russell to Oswald Mosley (1962) (lettersofnote.com)
- Microsoft Favors Anthropic over OpenAI for Visual Studio Code (www.theverge.com)
- Java 25 officially released (mail.openjdk.org)
- Things you can do with a Software Defined Radio (2024) (blinry.org)
- Generative AI as Seniority-Biased Technological Change (papers.ssrn.com)
- Man jailed for parole violations after refusing to decrypt his Tor node (reddit.com)
- Robert Redford has died (www.nytimes.com)
- Shai-Hulud malware attack: Tinycolor and over 40 NPM packages compromised (www.stepsecurity.io)
- Top UN legal investigators conclude Israel is guilty of genocide in Gaza (www.middleeasteye.net)
- 60 years after Gemini, newly processed images reveal details (arstechnica.com)
GitHub Trending (15)
- microsoft / markitdown
Python tool for converting files and office documents to Markdown.
- ml-explore / mlx-lm
Run LLMs with MLX
- dataease / SQLBot
An intelligent question-answering-over-data system based on LLMs and RAG. Text-to-SQL generation via LLMs using RAG.
- SkyworkAI / DeepResearchAgent
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coordinate multiple specialized lower-level agents, enabling automated task decomposition and efficient execution across diverse and complex domains.
- virattt / ai-hedge-fund
An AI Hedge Fund Team
- ccxt / ccxt
A cryptocurrency trading API with more than 100 exchanges in JavaScript / TypeScript / Python / C# / PHP / Go
- HKUDS / DeepCode
"DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)"
- PaddlePaddle / PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 80+ languages.
- Plachtaa / seed-vc
zero-shot voice conversion & singing voice conversion, with real-time support
- BasedHardware / omi
AI wearables. Put it on, speak, transcribe, automatically
- mnh-jansson / open-battery-information
- ArthurBrussee / brush
3D Reconstruction for all
- PowerShell / PowerShell
PowerShell for every system!
- ItzCrazyKns / Perplexica
Perplexica is an AI-powered search engine and an open-source alternative to Perplexity AI.
- TheAlgorithms / Python
All Algorithms implemented in Python
Hugging Face (15)
- OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
The field of 4D world modeling - aiming to jointly capture spatial geometry and temporal dynamics - has witnessed remarkable progress in recent years, driven by advances in large-scale generative models and multimodal learning. However, the development of truly general 4D world models remains fundamentally constrained by the availability of high-quality data. Existing datasets and benchmarks often lack the dynamic complexity, multi-domain diversity, and spatial-temporal annotations required to support key tasks such as 4D geometric reconstruction, future prediction, and camera-control video generation. To address this gap, we introduce OmniWorld, a large-scale, multi-domain, multi-modal dataset specifically designed for 4D world modeling. OmniWorld consists of a newly collected OmniWorld-Game dataset and several curated public datasets spanning diverse domains. Compared with existing synthetic datasets, OmniWorld-Game provides richer modality coverage, larger scale, and more realistic dynamic interactions. Based on this dataset, we establish a challenging benchmark that exposes the limitations of current state-of-the-art (SOTA) approaches in modeling complex 4D environments. Moreover, fine-tuning existing SOTA methods on OmniWorld leads to significant performance gains across 4D reconstruction and video generation tasks, strongly validating OmniWorld as a powerful resource for training and evaluation. We envision OmniWorld as a catalyst for accelerating the development of general-purpose 4D world models, ultimately advancing machines' holistic understanding of the physical world.
- UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning
Graphical User Interface (GUI) agents have demonstrated remarkable progress in automating complex user interface interactions through reinforcement learning. However, current approaches face a fundamental dilemma: offline RL enables stable training on pre-collected trajectories, but struggles with multi-step task execution due to the lack of trajectory-level reward signals; online RL captures these signals through environment interaction, but suffers from sparse rewards and prohibitive deployment costs. To address this, we present Semi-online Reinforcement Learning, a novel paradigm that simulates online RL on offline trajectories. During each rollout process, we preserve the original model output within the multi-turn dialogue, where a Patch Module adaptively recovers the divergence between rollout and expert trajectories. To capture long-term training signals, Semi-online RL introduces discounted future returns into the reward computation and optimizes the policy with weighted step-level and episode-level advantages. We further introduce Semi-Online Performance (SOP), a metric that aligns better with true online performance, serving as a practical and effective proxy for real-world evaluation. Experiments show that our Semi-online RL achieves SOTA performance among 7B models across four dynamic benchmarks, with significant gains over the base model (e.g., +12.0% on AndroidWorld, +23.8% on AITW), marking substantial progress in bridging the gap between offline training efficiency and online multi-turn reasoning. The code is available at https://github.com/X-PLUG/MobileAgent/tree/main/UI-S1.
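The discounted future returns the abstract mentions follow the standard backward recursion; a minimal sketch (a generic RL return computation, not the paper's full reward pipeline with step- and episode-level advantage weighting):

```python
def discounted_returns(rewards, gamma=0.9):
    """Backward pass computing G_t = r_t + gamma * G_{t+1}
    over one trajectory's per-step rewards."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return returns[::-1]  # restore chronological order
```

Each step's return then folds in rewards from later steps, which is what gives offline trajectories a long-horizon training signal.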
- InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts
The advancement of Embodied AI heavily relies on large-scale, simulatable 3D scene datasets characterized by scene diversity and realistic layouts. However, existing datasets typically suffer from limitations in data scale or diversity, sanitized layouts lacking small items, and severe object collisions. To address these shortcomings, we introduce InternScenes, a novel large-scale simulatable indoor scene dataset comprising approximately 40,000 diverse scenes built by integrating three disparate scene sources (real-world scans, procedurally generated scenes, and designer-created scenes), including 1.96M 3D objects and covering 15 common scene types and 288 object classes. We particularly preserve massive small items in the scenes, resulting in realistic and complex layouts with an average of 41.5 objects per region. Our comprehensive data processing pipeline ensures simulatability by creating real-to-sim replicas for real-world scans, enhances interactivity by incorporating interactive objects into these scenes, and resolves object collisions by physical simulations. We demonstrate the value of InternScenes with two benchmark applications: scene layout generation and point-goal navigation. Both show the new challenges posed by the complex and realistic layouts. More importantly, InternScenes paves the way for scaling up the model training for both tasks, making the generation and navigation in such complex scenes possible. We commit to open-sourcing the data, models, and benchmarks to benefit the whole community.
- SearchInstruct: Enhancing Domain Adaptation via Retrieval-Based Instruction Dataset Creation
Supervised Fine-Tuning (SFT) is essential for training large language models (LLMs), significantly enhancing critical capabilities such as instruction following and in-context learning. Nevertheless, creating suitable training datasets tailored for specific domains remains challenging due to unique domain constraints and data scarcity. In this paper, we propose SearchInstruct, an innovative method explicitly designed to construct high-quality instruction datasets for SFT. Our approach begins with a limited set of domain-specific, human-generated questions, which are systematically expanded using a large language model. Subsequently, domain-relevant resources are dynamically retrieved to generate accurate and contextually appropriate answers for each augmented question. Experimental evaluation demonstrates that SearchInstruct enhances both the diversity and quality of SFT datasets, leading to measurable improvements in LLM performance within specialized domains. Additionally, we show that beyond dataset generation, the proposed method can also effectively facilitate tasks such as model editing, enabling efficient updates to existing models. To facilitate reproducibility and community adoption, we provide full implementation details, the complete set of generated instruction-response pairs, and the source code in a publicly accessible Git repository: https://github.com/mostafaamiri/SearchInstruct
- LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence
The reliance on implicit point matching via attention has become a core bottleneck in drag-based editing, resulting in a fundamental compromise on weakened inversion strength and costly test-time optimization (TTO). This compromise severely limits the generative capabilities of diffusion models, suppressing high-fidelity inpainting and text-guided creation. In this paper, we introduce LazyDrag, the first drag-based image editing method for Multi-Modal Diffusion Transformers, which directly eliminates the reliance on implicit point matching. In concrete terms, our method generates an explicit correspondence map from user drag inputs as a reliable reference to boost the attention control. This reliable reference opens the potential for a stable full-strength inversion process, which is the first in the drag-based editing task. It obviates the necessity for TTO and unlocks the generative capability of models. Therefore, LazyDrag naturally unifies precise geometric control with text guidance, enabling complex edits that were previously out of reach: opening the mouth of a dog and inpainting its interior, generating new objects like a "tennis ball", or for ambiguous drags, making context-aware changes like moving a hand into a pocket. Additionally, LazyDrag supports multi-round workflows with simultaneous move and scale operations. Evaluated on the DragBench, our method outperforms baselines in drag accuracy and perceptual quality, as validated by VIEScore and human evaluation. LazyDrag not only establishes new state-of-the-art performance, but also paves a new way for editing paradigms.
- Locality in Image Diffusion Models Emerges from Data Statistics
Among generative models, diffusion models are uniquely intriguing due to the existence of a closed-form optimal minimizer of their training objective, often referred to as the optimal denoiser. However, diffusion using this optimal denoiser merely reproduces images in the training set and hence fails to capture the behavior of deep diffusion models. Recent work has attempted to characterize this gap between the optimal denoiser and deep diffusion models, proposing analytical, training-free models that can generate images that resemble those generated by a trained UNet. The best-performing method hypothesizes that shift equivariance and locality inductive biases of convolutional neural networks are the cause of the performance gap, hence incorporating these assumptions into its analytical model. In this work, we present evidence that the locality in deep diffusion models emerges as a statistical property of the image dataset, not due to the inductive bias of convolutional neural networks. Specifically, we demonstrate that an optimal parametric linear denoiser exhibits similar locality properties to the deep neural denoisers. We further show, both theoretically and experimentally, that this locality arises directly from the pixel correlations present in natural image datasets. Finally, we use these insights to craft an analytical denoiser that better matches scores predicted by a deep diffusion model than the prior expert-crafted alternative.
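The claim that locality follows from pixel correlations is easy to probe empirically. Below is a stdlib-only sketch that measures how correlation between pixel pairs decays with their distance, using random-walk rows as a stand-in for the long-range correlations of natural images (the paper's actual analysis is over real datasets and trained denoisers):

```python
import random

def pixel_correlation(rows, offset):
    """Pearson correlation between pixel values at positions x and x+offset,
    pooled over all rows and positions."""
    xs, ys = [], []
    for row in rows:
        for i in range(len(row) - offset):
            xs.append(row[i])
            ys.append(row[i + offset])
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    vx = sum((a - mx) ** 2 for a in xs) / n
    vy = sum((b - my) ** 2 for b in ys) / n
    return cov / (vx * vy) ** 0.5

# Synthetic "image rows": random walks, so nearby pixels correlate strongly
random.seed(0)
rows = []
for _ in range(200):
    v, row = 0.0, []
    for _ in range(64):
        v += random.gauss(0, 1)
        row.append(v)
    rows.append(row)
```

Correlation decays as the offset grows; that decaying statistical structure is exactly what an optimal linear denoiser can exploit to behave locally.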
- Lost in Embeddings: Information Loss in Vision-Language Models
Vision-language models (VLMs) often process visual inputs through a pretrained vision encoder, followed by a projection into the language model's embedding space via a connector component. While crucial for modality fusion, the potential information loss induced by this projection step and its direct impact on model capabilities remain understudied. We introduce two complementary approaches to examine and quantify this loss by analyzing the latent representation space. First, we evaluate semantic information preservation by analyzing changes in k-nearest neighbor relationships between image representations, before and after projection. Second, we directly measure information loss by reconstructing visual embeddings from the projected representation, localizing loss at an image patch level. Experiments reveal that connectors substantially distort the local geometry of visual representations, with k-nearest neighbors diverging by 40-60% post-projection, correlating with degradation in retrieval performance. The patch-level embedding reconstruction provides interpretable insights for model behavior on visually grounded question-answering tasks, finding that areas of high information loss reliably predict instances where models struggle.
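The paper's first analysis, comparing k-nearest-neighbor sets before and after the connector, reduces to a set-overlap metric. A minimal stdlib sketch (brute-force distances, not the paper's implementation):

```python
def knn(points, idx, k):
    """Indices of the k nearest neighbors of points[idx] (squared Euclidean)."""
    order = sorted((j for j in range(len(points)) if j != idx),
                   key=lambda j: sum((a - b) ** 2
                                     for a, b in zip(points[idx], points[j])))
    return set(order[:k])

def knn_overlap(before, after, k=3):
    """Mean fraction of each point's k nearest neighbors preserved between
    two embedding sets (1.0 = local geometry fully preserved)."""
    n = len(before)
    return sum(len(knn(before, i, k) & knn(after, i, k)) / k
               for i in range(n)) / n
```

Under this metric, the 40-60% neighbor divergence reported in the abstract corresponds to overlap scores of roughly 0.4-0.6.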
- Measuring Epistemic Humility in Multimodal Large Language Models
Hallucinations in multimodal large language models (MLLMs) -- where the model generates content inconsistent with the input image -- pose significant risks in real-world applications, from misinformation in visual question answering to unsafe errors in decision-making. Existing benchmarks primarily test recognition accuracy, i.e., evaluating whether models can select the correct answer among distractors. This overlooks an equally critical capability for trustworthy AI: recognizing when none of the provided options are correct, a behavior reflecting epistemic humility. We present HumbleBench, a new hallucination benchmark designed to evaluate MLLMs' ability to reject plausible but incorrect answers across three hallucination types: object, relation, and attribute. Built from a panoptic scene graph dataset, we leverage fine-grained scene graph annotations to extract ground-truth entities and relations, and prompt GPT-4-Turbo to generate multiple-choice questions, followed by a rigorous manual filtering process. Each question includes a "None of the above" option, requiring models not only to recognize correct visual information but also to identify when no provided answer is valid. We evaluate a variety of state-of-the-art MLLMs -- including both general-purpose and specialized reasoning models -- on HumbleBench and share valuable findings and insights with the community. By incorporating explicit false-option rejection, HumbleBench fills a key gap in current evaluation suites, providing a more realistic measure of MLLM reliability in safety-critical settings. Our code and dataset are released publicly and can be accessed at https://github.com/maifoundations/HumbleBench.
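Scoring a benchmark with an explicit rejection option comes down to two numbers: overall accuracy and accuracy restricted to the rejection cases. A hypothetical scorer, assuming answers are option letters with "E" standing for "None of the above" (the benchmark's actual option labels are an assumption here):

```python
def humble_score(predictions, answers, reject="E"):
    """Overall accuracy, plus accuracy on the subset of questions whose
    correct answer is the rejection option ('None of the above')."""
    acc = sum(p == a for p, a in zip(predictions, answers)) / len(answers)
    reject_idx = [i for i, a in enumerate(answers) if a == reject]
    reject_acc = (sum(predictions[i] == reject for i in reject_idx)
                  / len(reject_idx)) if reject_idx else None
    return acc, reject_acc
```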
- Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting
Prior works in multi-objective reinforcement learning typically use linear reward scalarization with fixed weights, which provably fail to capture non-convex Pareto fronts and thus yield suboptimal results. This limitation becomes especially critical in online preference alignment for large language models. Here, stochastic trajectories generated by parameterized policies create highly non-linear and non-convex mappings from parameters to objectives, for which no single static weighting scheme can find optimal trade-offs. We address this limitation by introducing dynamic reward weighting, which adaptively adjusts reward weights during the online reinforcement learning process. Unlike existing approaches that rely on fixed-weight interpolation, our dynamic weighting continuously balances and prioritizes objectives in training, facilitating effective exploration of Pareto fronts in objective space. We introduce two approaches of increasing sophistication and generalizability: (1) hypervolume-guided weight adaptation and (2) gradient-based weight optimization, offering a versatile toolkit for online multi-objective alignment. Our extensive experiments demonstrate their compatibility with commonly used online reinforcement learning algorithms (including GRPO, REINFORCE, and RLOO), effectiveness across multiple mathematical reasoning datasets, and applicability to different model families, consistently achieving Pareto dominant solutions with fewer training steps than fixed-weight linear scalarization baselines.
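As a caricature of dynamic versus fixed weighting, the toy update below shifts weight toward whichever objective improved least in the last interval and renormalizes. The paper's hypervolume-guided and gradient-based rules are substantially more sophisticated, so treat this only as the shape of the idea:

```python
import math

def update_weights(weights, objective_gains, lr=0.5):
    """Move reward weights toward objectives with the smallest recent gains
    (softmax over negative gains), then renormalize so weights sum to 1."""
    boosts = [math.exp(-g) for g in objective_gains]
    z = sum(boosts)
    new = [(1 - lr) * w + lr * b / z for w, b in zip(weights, boosts)]
    s = sum(new)
    return [w / s for w in new]
```

The point of any such rule is that the weights move during training, so the effective scalarization can trace out trade-offs a fixed weighting would never reach.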
- Nav-R1: Reasoning and Navigation in Embodied Scenes
Embodied navigation requires agents to integrate perception, reasoning, and action for robust interaction in complex 3D environments. Existing approaches often suffer from incoherent and unstable reasoning traces that hinder generalization across diverse environments, and difficulty balancing long-horizon semantic reasoning with low-latency control for real-time navigation. To address these challenges, we propose Nav-R1, an embodied foundation model that unifies reasoning in embodied environments. We first construct Nav-CoT-110K, a large-scale dataset of step-by-step Chains-of-Thought (CoT) for embodied tasks, which enables cold-start initialization with structured reasoning. Building on this foundation, we design a GRPO-based reinforcement learning framework with three complementary rewards: format, understanding, and navigation, to improve structural adherence, semantic grounding, and path fidelity. Furthermore, we introduce a Fast-in-Slow reasoning paradigm, decoupling deliberate semantic reasoning from low-latency reactive control for efficient yet coherent navigation. Extensive evaluations on embodied AI benchmarks demonstrate that Nav-R1 consistently outperforms strong baselines, with over 8% average improvement in reasoning and navigation performance. Real-world deployment on a mobile robot further validates its robustness under limited onboard resources. Code: https://github.com/AIGeeksGroup/Nav-R1. Website: https://aigeeksgroup.github.io/Nav-R1.
- Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models
Recent advances in text-only "slow-thinking" reasoning have prompted efforts to transfer this capability to vision-language models (VLMs), for training visual reasoning models (VRMs). However, such transfer faces critical challenges: effective "slow thinking" in VRMs requires visual reflection, the ability to check the reasoning process based on visual information. Through quantitative analysis, we observe that current VRMs exhibit limited visual reflection, as their attention to visual information diminishes rapidly with longer generated responses. To address this challenge, we propose a new VRM, Reflection-V, which enhances visual reflection based on reasoning data construction for cold-start and reward design for reinforcement learning (RL). Firstly, we construct vision-centered reasoning data by leveraging an agent that interacts between VLMs and reasoning LLMs, enabling cold-start learning of visual reflection patterns. Secondly, a visual-attention-based reward model is employed during RL to encourage reasoning based on visual information. As a result, Reflection-V demonstrates significant improvements across multiple visual reasoning benchmarks. Furthermore, Reflection-V maintains a stronger and more consistent reliance on visual information during visual reasoning, indicating effective enhancement in visual reflection capabilities.
- CognitiveSky: Scalable Sentiment and Narrative Analysis for Decentralized Social Media
The emergence of decentralized social media platforms presents new opportunities and challenges for real-time analysis of public discourse. This study introduces CognitiveSky, an open-source and scalable framework designed for sentiment, emotion, and narrative analysis on Bluesky, a federated Twitter or X.com alternative. By ingesting data through Bluesky's Application Programming Interface (API), CognitiveSky applies transformer-based models to annotate large-scale user-generated content and produces structured and analyzable outputs. These summaries drive a dynamic dashboard that visualizes evolving patterns in emotion, activity, and conversation topics. Built entirely on free-tier infrastructure, CognitiveSky achieves both low operational cost and high accessibility. While demonstrated here for monitoring mental health discourse, its modular design enables applications across domains such as disinformation detection, crisis response, and civic sentiment analysis. By bridging large language models with decentralized networks, CognitiveSky offers a transparent, extensible tool for computational social science in an era of shifting digital ecosystems.
- PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits
Understanding human behavior traits is central to applications in human-computer interaction, computational social science, and personalized AI systems. Such understanding often requires integrating multiple modalities to capture nuanced patterns and relationships. However, existing resources rarely provide datasets that combine behavioral descriptors with complementary modalities such as facial attributes and biographical information. To address this gap, we present PersonaX, a curated collection of multimodal datasets designed to enable comprehensive analysis of public traits across modalities. PersonaX consists of (1) CelebPersona, featuring 9444 public figures from diverse occupations, and (2) AthlePersona, covering 4181 professional athletes across 7 major sports leagues. Each dataset includes behavioral trait assessments inferred by three high-performing large language models, alongside facial imagery and structured biographical features. We analyze PersonaX at two complementary levels. First, we abstract high-level trait scores from text descriptions and apply five statistical independence tests to examine their relationships with other modalities. Second, we introduce a novel causal representation learning (CRL) framework tailored to multimodal and multi-measurement data, providing theoretical identifiability guarantees. Experiments on both synthetic and real-world data demonstrate the effectiveness of our approach. By unifying structured and unstructured analysis, PersonaX establishes a foundation for studying LLM-inferred behavioral traits in conjunction with visual and biographical attributes, advancing multimodal trait analysis and causal reasoning.
- Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding
Recent advancements in large video models (LVMs) have significantly enhanced video understanding. However, these models continue to suffer from hallucinations, producing content that conflicts with input videos. To address this issue, we propose Dr.V, a hierarchical framework covering perceptive, temporal, and cognitive levels to diagnose video hallucination by fine-grained spatial-temporal grounding. Dr.V comprises two key components: a benchmark dataset Dr.V-Bench and a satellite video agent Dr.V-Agent. Dr.V-Bench includes 10k instances drawn from 4,974 videos spanning diverse tasks, each enriched with detailed spatial-temporal annotation. Dr.V-Agent detects hallucinations in LVMs by systematically applying fine-grained spatial-temporal grounding at the perceptive and temporal levels, followed by cognitive-level reasoning. This step-by-step pipeline mirrors human-like video comprehension and effectively identifies hallucinations. Extensive experiments demonstrate that Dr.V-Agent is effective in diagnosing hallucination while enhancing interpretability and reliability, offering a practical blueprint for robust video understanding in real-world scenarios. All our data and code are available at https://github.com/Eurekaleo/Dr.V.
- GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings
Domain-specific embedding models have shown promise for applications that require specialized semantic understanding, such as coding agents and financial retrieval systems, often achieving higher performance gains than general models. However, state-of-the-art embedding models are typically based on LLMs, which contain billions of parameters, making deployment challenging in resource-constrained environments. Model compression through pruning offers a promising solution, but existing pruning methods treat all parameters uniformly, failing to distinguish between general semantic representations and domain-specific patterns, leading to suboptimal pruning decisions. Thus, we propose GAPrune, a pruning framework that addresses this challenge by considering both domain importance and preserving general linguistic foundation. Our method uses Fisher Information to measure importance and general-domain gradient alignment to assess parameter behavior, then combines these signals using our Domain Alignment Importance (DAI) scoring. Lower DAI scores indicate that the parameter is either less important for the domain task or creates conflicts between domain and general objectives. Experiments on two domain benchmarks, FinMTEB and ChemTEB, show that GAPrune maintains performance within 2.5% of dense models in one-shot pruning at 50% sparsity, while outperforming all baselines. With retraining in 100 steps, GAPrune achieves +4.51% improvement on FinMTEB and +1.73% on ChemTEB, demonstrating that our pruning strategy not only preserves but enhances domain-specific capabilities. Our findings demonstrate that principled pruning strategies can achieve model compression and enhanced domain specialization, providing the research community with a new approach for development.
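The abstract's DAI scoring combines a Fisher-style importance with domain/general gradient alignment. The exact formula is not given there, so the per-parameter sketch below is only illustrative (squared gradient as a diagonal-Fisher proxy, sign agreement as alignment):

```python
def dai_scores(domain_grads, general_grads):
    """Per-parameter Domain Alignment Importance sketch: importance
    (squared domain gradient) signed by whether the domain and general
    gradients point the same way. Low or negative scores mark parameters
    that are unimportant or conflicting, i.e. pruning candidates."""
    scores = []
    for gd, gg in zip(domain_grads, general_grads):
        fisher = gd * gd                          # diagonal Fisher proxy
        alignment = 1.0 if gd * gg > 0 else -1.0  # aligned vs. conflicting
        scores.append(fisher * alignment)
    return scores
```

Pruning at 50% sparsity would then mean dropping the half of the parameters with the lowest scores.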
Solidot (15)
- Queen ants found to produce offspring of two different species
According to a study published in Nature, queens of the Iberian harvester ant (Messor ibericus) can produce ants of two different species. Ant colonies typically contain three castes: queens, males, and workers. The study found that M. ibericus workers carry the genes of another species, Messor structor, but not as a result of hybridization: the queens clone "copies" directly from the sperm of M. structor males. This cross-species reproduction not only rewrites our understanding of ant reproductive mechanisms but also provides the first direct evidence of a female actively breeding another species. The researchers infer that the evolutionary story began with a phenomenon known as sperm parasitism. Millions of years ago, M. ibericus queens lost, for reasons unknown, the ability to produce workers of their own species. To keep their colonies running, they had to "borrow" sperm from nearby M. structor males, combining it with their own eggs to produce hybrid workers carrying both species' genes. This strategy, however, depended heavily on the geographic proximity of M. structor populations, and finding heterospecific males to mate with was a time-consuming, unreliable chore for a queen. To escape this constraint, M. ibericus evolved a more efficient and astonishing strategy: sexual domestication. Rather than seeking out M. structor males in the wild, queens now use the heterospecific sperm stored in their own bodies to produce those males through a special cloning process.
- AOMedia alliance to release the AV2 codec by year's end
AOMedia, the Alliance for Open Media co-founded by Amazon, Cisco, Google, Intel, Microsoft, Mozilla, and Netflix, announced it will release AV2, the successor to AV1, by the end of the year. AOMedia calls AV2 a generational leap in open video coding, designed to meet the world's growing streaming demand, with compression performance significantly better than AV1's. AV2 adds stronger support for AR/VR applications, supports split-screen playback of multiple programs, improves screen-content handling, and operates across a wider range of visual quality.
- Godot 4.5 released
The open-source game engine Godot has released v4.5. Major new features include stencil buffer support, built-in screen-reader support for improved accessibility, a shader baker that speeds up startup through better handling of shader compilation, improved physics, and more. On Linux, Godot 4.5 supports native Wayland subwindows, and the WebAssembly-based web build gains SIMD support for a significant performance boost.
- Microsoft 365 apps will auto-install Copilot Chat starting next month
Microsoft announced that starting in October, Microsoft 365 apps outside the European Economic Area (EEA) will automatically install Copilot Chat. Word, Excel, PowerPoint, Outlook, and OneNote will all be updated to include a Copilot Chat sidebar, which users can employ to draft documents, analyze spreadsheets, and build slide decks. The feature is free to use; paying Copilot subscribers get more advanced capabilities such as reasoning over work data, file uploads, image generation, and access to the latest models such as GPT-5. Enterprises that don't want the feature can opt out: IT administrators can change the setting in the Apps Admin Center under Customization > Device Configuration > Modern App Settings, selecting the Microsoft 365 Copilot app and clearing the Enable checkbox.
- Google changes Android's security update model
For the past decade Google has published a monthly Android Security Bulletin detailing the vulnerabilities fixed by that month's Android security update, along with their severity. In July 2025 it broke with that practice for the first time, listing no vulnerabilities at all, while the September 2025 bulletin disclosed as many as 119. Google has reorganized the decade-old monthly cadence into a "risk-based update system" that separates high-priority patches from routine fixes. Each monthly bulletin will now list only vulnerabilities that are under active exploitation or part of known exploit chains; most patches will accumulate into quarterly releases. The change sharply reduces OEMs' monthly update workload. Google will also stop publishing monthly security update source code, confining custom ROM development to the quarterly cycle.
- US and China reach preliminary framework deal on TikTok's US business
US Treasury Secretary Scott Bessent announced that China and the United States have reached a preliminary framework agreement on TikTok's US operations, with the two heads of state to finalize the details on Friday. TikTok's parent company ByteDance must sell or divest its US business to a non-Chinese buyer before the September 17 deadline or see TikTok removed from US app stores; Trump has already extended the deadline three times. Chinese Vice Premier He Lifeng said China's resolve to defend its legitimate rights and interests is unshakable, and that it will firmly safeguard national interests and the lawful rights of Chinese companies operating overseas. On the TikTok question, China will conduct technology-export approvals in accordance with laws and regulations, while fully respecting the wishes of its companies abroad and supporting equal commercial negotiations with partners on the basis of market principles.
- African islanders complain to their government, get a year-long internet blackout
Equatorial Guinea has been a hereditary dictatorship of the Nguema family since independence in 1968: first under Francisco Macías Nguema, then under his nephew Teodoro Obiang Nguema Mbasogo, who seized power in a 1979 military coup and has been president ever since, with his son serving as vice president. According to the World Bank, despite abundant oil and gas resources, at least 57% of the country's nearly two million people live in poverty while the elite live in luxury. The Moroccan company Somagec signed a government contract for mineral exploration and extraction on Annobón, an island of about 5,000 people. Last July, islanders complained to the government that explosives used in Somagec's open-pit mine were polluting their farmland and water, and asked for action. The result: Annobón's internet connection was cut, and remains cut to this day. Somagec, which has close ties to the president, confirmed the blackout but denied any involvement. The outage has shut down banking services and hospital emergency services; mobile phones are the only remaining means of communication, but residents say the accumulated phone bills are more than they can afford. Fearing for their safety and facing the hardship of life without internet, many islanders have chosen to leave.
- India's IT outsourcing industry slashes hiring of fresh graduates
India's IT services industry generates $250 billion in output, contributes 8% of GDP, and employs 5.4 million people. Outsourcing giants such as Tata Consultancy Services (TCS) and Infosys hired a steady stream of fresh graduates at salaries starting around $5,000, trained them to do work that commands ten times that salary in developed countries, and used the labor arbitrage to provide IT services to clients in Europe and the US. But junior IT work is exactly what generative AI disrupts most, and India's outsourcing industry has been hit hard. In fiscal 2023 the four largest Indian IT outsourcers hired 225,000 fresh graduates; in fiscal 2024 that number plunged 70% to 60,000. TCS and Infosys together added 157,000 employees in fiscal 2022 but only 12,771 in fiscal 2025. Total headcount at the outsourcing giants has fallen for the first time in decades: TCS shed more than 13,000 employees in fiscal 2024, and Infosys cut 25,000. Research by Accenture and others suggests generative AI can automate 30-40% of the work of junior programmers and testers. The hiring slowdown is also reshaping the industry's demographics: the share of Infosys employees over 30 has fallen from 81% in 2010 to a projected 53% in fiscal 2025. Unemployment among IT graduates exceeds 13%, nearly three times the national average.
- Nvidia accused of violating China's Anti-Monopoly Law
China's State Administration for Market Regulation announced in a brief statement: "A preliminary investigation has found that Nvidia violated the Anti-Monopoly Law of the People's Republic of China and the administration's announcement of the antitrust review decision approving, with restrictive conditions, Nvidia's acquisition of equity in Mellanox Technologies. The administration has decided in accordance with the law to open a further investigation." Nvidia acquired Mellanox in 2020 for $6.9 billion, a deal that helped it push into the data center and high-performance computing markets and, in today's generative-AI era, helped make it the market leader. If the violation is confirmed, Nvidia could face a fine of 1-10% of its previous year's revenue.
- Walking volume, not intensity, linked to lower risk of chronic back pain
A study examined the relationship between walking and the risk of chronic low-back disorders. If people followed its simple advice, healthcare systems could save substantial money and much pain could be avoided. The results show that people who walk regularly have less back pain than those who walk little, and that what matters most is the amount of walking, not its intensity: better to walk more than to walk fast. As the researchers put it, "people who walk more than 100 minutes a day have a 23% lower risk of low-back disorders than those who walk 78 minutes a day or less." The study enrolled 11,194 participants, whose daily walking volume and intensity were measured for a week by two sensors worn on the thigh and lower back. The study confirms that physical activity, and daily walking in particular, can help prevent long-term low-back problems.
- LIGO has become a black-hole-hunting machine
On September 14, 2015, the US Laser Interferometer Gravitational-Wave Observatory (LIGO) detected the first gravitational-wave signal from a black-hole merger. Today LIGO observes a black-hole merger roughly every three days, operating jointly with Italy's Virgo detector and Japan's KAGRA. The gravitational-wave search network known as LVK (LIGO, Virgo, KAGRA) has captured about 300 candidate black-hole merger events; some have been confirmed as mergers, while the rest await further analysis. About 220 of the candidates were found during the fourth observing run, twice as many as the previous three runs combined. The rapid growth in detections comes from technical improvements, including the introduction of cutting-edge quantum engineering techniques.
- Korean robot-vacuum makers try to out-differentiate Chinese rivals
IDC data show that four of the top five companies by global robot-vacuum market share in the first half of 2025 are Chinese: Beijing Roborock Technology, Ecovacs, and Dreame each hold more than 10%, and Xiaomi holds 7.4%. iRobot, which commanded over 30% of the market in the 2010s, has fallen to fifth place at 5.8%. Korea's two giants, Samsung Electronics and LG Electronics, both sell robot vacuums, but while Chinese products retail for roughly 965-1,448 yuan, Korean-made models generally run 4,826-9,653 yuan, and their performance has not convinced consumers the premium is worth paying. Unable to compete with Chinese products on price, Korean makers are emphasizing safety, a premium feel, and ease of use in a bid to differentiate on something other than price, a strategy that resembles that of Japanese manufacturers.
- The world's population is shrinking faster than expected
Istanbul obstetrician Furkan Kayabasoglu used to deliver baby after baby for the same family; today the vast majority of families stop at one child. Turkey's total fertility rate fell to 1.48 last year, far below the replacement rate of 2.1 and a level the UN Population Division had projected Turkey would not reach until 2100. Turkey is no exception: around the world, in developing, middle-income, and developed countries alike, fertility is falling far faster than expected. The total fertility rate in Bogotá, Colombia's capital, is just 0.91, lower even than Tokyo's. India's fertility rate is already below replacement, and China's population has begun to shrink. Mexico's rate, 1.6, is on par with the United States. France recorded fewer births in 2024 than in 1806, when its population was less than half of today's. Italy's births hit their lowest level since unification in 1861. African birth rates remain well above the global average but are also falling much faster than anticipated. All of this means the world's population may peak earlier, and at a much lower level, than experts predicted: growth could stop by the 2050s rather than the previously forecast 2084, with the total never exceeding 9 billion. Likely factors behind falling fertility include rising female education, urbanization, widespread contraception, sharply rising child-rearing costs, and shifting social attitudes.
- Seniors now make up 29.4% of Japan's population
Japan's Ministry of Internal Affairs and Communications released population estimates on Sunday: people aged 65 and over number 36.19 million, or 29.4% of the total population, a record high and the highest share of any country with a population over 40 million. Employed seniors number 9.3 million, a figure that has risen for 21 consecutive years and is also a record; beyond more seniors staying healthy, labor shortages caused by the falling birthrate are a factor. The estimates count 15.68 million men and 20.51 million women aged 65 or over, 50,000 fewer in total than a year earlier, only the second decline (after 2023) since comparable records began in 1950, mainly because fewer people are newly turning 65. The National Institute of Population and Social Security Research estimates that as the second baby-boom generation (born 1971-1974) enters old age, the senior population will reach 39.28 million in 2040, or 34.8% of the total.
- CRISPR-edited horses spark controversy
Gene-edited animals such as pigs and sheep are gradually gaining acceptance in agriculture; the technology can improve animal traits and deliver safer, higher-quality meat. But horses modified with CRISPR have been barred from polo. Experts stress that gene-edited animals must be strictly tracked, their safety assured, and applications advanced with caution. CRISPR gene editing can cut the genome at precise locations and alter gene expression, conferring new traits. The company Kheiron used cloning to produce five genetically identical copies of an Argentine champion horse, then applied CRISPR to suppress expression of the myostatin gene, which naturally limits excessive muscle growth. By precisely down-regulating its activity, the team increased the number of muscle fibers responsible for explosive movement, breeding the horses into superior "sprinters". The Argentine Polo Association explicitly bans gene-edited horses from competition; its president said the technology "would strip breeding of its charm and magic".