OrangeBot.AI Digest — 2025-11-11

60 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Collaboration sucks (newsletter.posthog.com)
  2. FFmpeg to Google: Fund us or stop sending bugs (thenewstack.io)
  3. Scaling HNSWs (antirez.com)
  4. The history of Casio watches (www.casio.com)
  5. iPod Socks (en.wikipedia.org)
  6. Firefox expands fingerprint protections (blog.mozilla.org)
  7. Canada loses its measles-free status, with US on track to follow (www.bbc.com)
  8. The R47: A new physical RPN calculator (www.swissmicros.com)
  9. The kind of company I want to be a part of (www.dvsj.in)
  10. iPhone Pocket (www.apple.com)
  11. OpenAI may not use lyrics without license, German court rules (www.reuters.com)
  12. Why effort scales superlinearly with the perceived quality of creative work (markusstrasser.org)
  13. How I fell in love with Erlang (boragonul.com)
  14. SoftBank sells its entire stake in Nvidia (www.cnbc.com)
  15. AI documentation you can talk to, for every repo (deepwiki.com)

GitHub Trending (15)

  1. sansan0 / TrendRadar

    🎯 Say goodbye to information overload: AI helps you make sense of trending news, with simple public-opinion monitoring and analysis. Multi-platform trend aggregation plus MCP-based AI analysis. Monitors 35 platforms (Douyin, Zhihu, Bilibili, Wallstreetcn, Cailian Press, and more), with smart filtering, automatic push, and conversational AI analysis (dig into the news in natural language: trend tracking, sentiment analysis, similarity search, and 13 tools in total). Pushes to WeCom/Feishu/DingTalk/Telegram/email/ntfy; 30-second web deployment, phone notifications in 1 minute, no coding required. Docker deployment supported. ⭐ Make the algorithm serve you: understand trends with AI.

  2. google / adk-go

    An open-source, code-first Go toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

  3. usestrix / strix

    ✨ Open-source AI hackers for your apps 👨🏻‍💻

  4. bobeff / open-source-games

    A list of open source games.

  5. TapXWorld / ChinaTextbook

    PDF textbooks for all levels: primary school, middle school, high school, and university.

  6. serverless-dns / serverless-dns

    The RethinkDNS resolver that deploys to Cloudflare Workers, Deno Deploy, Fastly, and Fly.io

  7. yichuan-w / LEANN

    RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

  8. yangshun / tech-interview-handbook

    💯 Curated coding interview preparation materials for busy software engineers

  9. microsoft / ai-agents-for-beginners

    12 Lessons to Get Started Building AI Agents

  10. LizardByte / Sunshine

    Self-hosted game stream host for Moonlight.

  11. dgtlmoon / changedetection.io

    Best and simplest tool for website change detection, web page monitoring, and website change alerts. Perfect for tracking content changes, price drops, restock alerts, and website defacement monitoring—all for free or enjoy our SaaS plan!

  12. davila7 / claude-code-templates

    CLI tool for configuring and monitoring Claude Code

  13. google / adk-docs

    An open-source, code-first toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

  14. nvm-sh / nvm

    Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions

  15. AtsushiSakai / PythonRobotics

    Python sample codes and textbook for robotics algorithms.

Hugging Face (15)

  1. HaluMem: Evaluating Hallucinations in Memory Systems of Agents

    Memory systems are key components that enable AI systems such as LLMs and AI agents to achieve long-term learning and sustained interaction. However, during memory storage and retrieval, these systems frequently exhibit memory hallucinations, including fabrication, errors, conflicts, and omissions. Existing evaluations of memory hallucinations are primarily end-to-end question answering, which makes it difficult to localize the operational stage within the memory system where hallucinations arise. To address this, we introduce the Hallucination in Memory Benchmark (HaluMem), the first operation-level hallucination evaluation benchmark tailored to memory systems. HaluMem defines three evaluation tasks (memory extraction, memory updating, and memory question answering) to comprehensively reveal hallucination behaviors across different operational stages of interaction. To support evaluation, we construct user-centric, multi-turn human-AI interaction datasets, HaluMem-Medium and HaluMem-Long. Both include about 15k memory points and 3.5k multi-type questions. The average dialogue length per user reaches 1.5k and 2.6k turns, with context lengths exceeding 1M tokens, enabling evaluation of hallucinations across different context scales and task complexities. Empirical studies based on HaluMem show that existing memory systems tend to generate and accumulate hallucinations during the extraction and updating stages, which subsequently propagate errors to the question answering stage. Future research should focus on developing interpretable and constrained memory operation mechanisms that systematically suppress hallucinations and improve memory reliability.

  2. IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction

    Recent advances in deep-research agents have shown promise for autonomous knowledge construction through dynamic reasoning over external sources. However, existing approaches rely on a mono-contextual paradigm that accumulates all information in a single, expanding context window, leading to context suffocation and noise contamination that limit their effectiveness on long-horizon tasks. We introduce IterResearch, a novel iterative deep-research paradigm that reformulates long-horizon research as a Markov Decision Process with strategic workspace reconstruction. By maintaining an evolving report as memory and periodically synthesizing insights, our approach preserves consistent reasoning capacity across arbitrary exploration depths. We further develop Efficiency-Aware Policy Optimization (EAPO), a reinforcement learning framework that incentivizes efficient exploration through geometric reward discounting and enables stable distributed training via adaptive downsampling. Extensive experiments demonstrate that IterResearch achieves substantial improvements over existing open-source agents with average +14.5pp across six benchmarks and narrows the gap with frontier proprietary systems. Remarkably, our paradigm exhibits unprecedented interaction scaling, extending to 2048 interactions with dramatic performance gains (from 3.5% to 42.5%), and serves as an effective prompting strategy, improving frontier models by up to 19.2pp over ReAct on long-horizon tasks. These findings position IterResearch as a versatile solution for long-horizon reasoning, effective both as a trained agent and as a prompting paradigm for frontier models.
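
    EAPO's geometric reward discounting is described only at a high level above. As a minimal sketch (the discount factor and reward values are illustrative assumptions, not the paper's settings), geometric discounting credits trajectories that succeed in fewer interactions more than slower ones:

```python
def discounted_return(rewards, gamma=0.9):
    """Geometric discounting over an interaction trajectory: a reward
    earned at interaction t is scaled by gamma**t, so trajectories that
    succeed in fewer interactions earn more credit."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Same terminal reward, but the faster trajectory scores higher.
fast = discounted_return([0.0, 0.0, 1.0])    # success at interaction 3
slow = discounted_return([0.0] * 5 + [1.0])  # success at interaction 6
```

    Under this scheme an agent is pushed toward efficient exploration rather than padding out its interaction budget.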

  3. DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation

    Recent reasoning-first models (e.g., OpenAI o1, DeepSeek R1) have spurred a resurgence of interest in RLVR. Nevertheless, advances are dominated by mathematics (e.g., AIME), with competitive-programming code generation underexplored and data curation receiving less attention than RL algorithm design. We investigate how to construct RLVR datasets (i.e., RL prompts) and present practical training techniques that yield strong performance on competitive-programming code generation. Our pipeline begins with supervised fine-tuning (SFT) distilled from strong open-source models, augmented with general-purpose and reasoning-intensive data. RL then follows a two-stage process with executable, testcase-driven rewards: first, training on a large, uniformly distributed set of competitive-programming problems using Group Relative Policy Optimization (GRPO) with 8 rollouts per prompt and a relatively short response-generation window (e.g., 32k during SFT and 24k in this stage) to expand entropy and mitigate repetition and truncation; second, we perform Pre-GRPO: updating on a small, high-quality set of challenging problems with a large rollout budget (64 rollouts per prompt) under a hard-focus curriculum that continuously retains the most difficult instances throughout training. We implement our method on Qwen2.5-32B and evaluate on LeetCode and Codeforces weekly contests to avoid data leakage. The resulting model achieves state-of-the-art performance among models of similar scale and is comparable to leading systems such as DeepSeek v3.1 and Doubao-1.5-Thinking. We also examine scaling trends and observe strong RL scaling on an internal large-scale MoE model. Our study distills concise best practices for data curation, entropy expansion, and curriculum design in RLVR for competitive-programming code generation.
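
    The group-relative reward normalization at the heart of GRPO can be sketched as follows; this is the commonly published formulation (each rollout's reward normalized by its group's mean and standard deviation), with illustrative pass/fail rewards rather than DRIVE's actual reward values:

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantage: normalize each rollout's reward by the
    mean and standard deviation of its prompt's rollout group, so the
    policy is updated toward above-average rollouts."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:                       # all rollouts tied: no signal
        return [0.0] * len(rewards)
    return [(r - mean) / std for r in rewards]

# Verifiable rewards for 8 rollouts of one prompt (1 = all tests pass).
adv = grpo_advantages([1, 0, 0, 1, 0, 0, 0, 0])
flat = grpo_advantages([1, 1, 1])  # identical rewards yield zero advantage
```

    The zero-signal case for identical rewards is why Pre-GRPO's hard-focus curriculum keeps retaining difficult problems: prompts every rollout solves (or fails) contribute nothing to the update.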

  4. The Station: An Open-World Environment for AI-Driven Discovery

    We introduce the STATION, an open-world multi-agent environment that models a miniature scientific ecosystem. Leveraging their extended context windows, agents in the Station can engage in long scientific journeys that include reading papers from peers, formulating hypotheses, submitting code, performing analyses, and publishing results. Importantly, there is no centralized system coordinating their activities - agents are free to choose their own actions and develop their own narratives within the Station. Experiments demonstrate that AI agents in the Station achieve new state-of-the-art performance on a wide range of benchmarks, spanning from mathematics to computational biology to machine learning, notably surpassing AlphaEvolve in circle packing. A rich tapestry of narratives emerges as agents pursue independent research, interact with peers, and build upon a cumulative history. From these emergent narratives, novel methods arise organically, such as a new density-adaptive algorithm for scRNA-seq batch integration. The Station marks a first step towards autonomous scientific discovery driven by emergent behavior in an open-world environment, representing a new paradigm that moves beyond rigid optimization.

  5. MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

    The advent of Multimodal Large Language Models (MLLMs) has expanded AI capabilities to visual modalities, yet existing evaluation benchmarks remain limited to single-video understanding, overlooking the critical need for multi-video understanding in real-world scenarios (e.g., sports analytics and autonomous driving). To address this significant gap, we introduce MVU-Eval, the first comprehensive benchmark for evaluating Multi-Video Understanding for MLLMs. Specifically, our MVU-Eval mainly assesses eight core competencies through 1,824 meticulously curated question-answer pairs spanning 4,959 videos from diverse domains, addressing both fundamental perception tasks and high-order reasoning tasks. These capabilities are rigorously aligned with real-world applications such as multi-sensor synthesis in autonomous systems and cross-angle sports analytics. Through extensive evaluation of state-of-the-art open-source and closed-source models, we reveal significant performance discrepancies and limitations in current MLLMs' ability to perform understanding across multiple videos. The benchmark will be made publicly available to foster future research.

  6. Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs

    Sparse Mixture-of-Experts (MoE) has been widely adopted in recent large language models since it can efficiently scale up model capability without increasing inference cost. However, evaluations on broad downstream tasks reveal a consistent suboptimality of the routers in existing MoE LLMs, which results in a severe performance gap (e.g., 10-20% in accuracy) relative to the optimal routing. In this paper, we show that aligning the manifold of routing weights with that of task embeddings can effectively reduce the gap and improve MoE LLMs' generalization performance. Our method, "Routing Manifold Alignment (RoMA)", introduces an additional manifold regularization term in the post-training objective and only requires lightweight finetuning of routers (with other parameters frozen). Specifically, the regularization encourages the routing weights of each sample to be close to those of its successful neighbors (whose routing weights lead to correct answers) in a task embedding space. Consequently, samples targeting similar tasks will share similar expert choices across layers. Building such bindings between tasks and experts over different samples is essential to achieving better generalization. Moreover, RoMA demonstrates the advantage of unifying task understanding (by embedding models) with solution generation (by MoE LLMs). In experiments, we finetune routers in OLMoE, DeepSeekMoE, and Qwen3-MoE using RoMA. Evaluations on diverse benchmarks and extensive comparisons with baselines show the substantial improvement brought by RoMA.
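
    As a rough illustration of the manifold-alignment idea (not the paper's exact objective), the regularizer can be read as a squared-distance penalty pulling each sample's routing-weight vector toward the centroid of its successful neighbors' routing weights:

```python
def roma_regularizer(routing, neighbors):
    """Squared-L2 penalty pulling each sample's routing-weight vector
    toward the centroid of its successful neighbors' routing weights.
    `routing` is a list of per-sample weight vectors; `neighbors` maps a
    sample index to the indices of its successful neighbors."""
    loss = 0.0
    for i, w in enumerate(routing):
        nbrs = neighbors.get(i, [])
        if not nbrs:
            continue
        centroid = [sum(routing[j][k] for j in nbrs) / len(nbrs)
                    for k in range(len(w))]
        loss += sum((a - b) ** 2 for a, b in zip(w, centroid))
    return loss

# Samples that route like their successful neighbors incur no penalty...
aligned = roma_regularizer([[0.7, 0.3], [0.7, 0.3]], {0: [1], 1: [0]})
# ...while divergent routing on similar tasks is penalized.
divergent = roma_regularizer([[1.0, 0.0], [0.0, 1.0]], {0: [1], 1: [0]})
```

    In the paper this term is added to the post-training objective with only the routers trainable; neighbor selection happens in a task embedding space.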

  7. RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services

    As a key medium for human interaction and information exchange, social networking services (SNS) pose unique challenges for large language models (LLMs): heterogeneous workloads, fast-shifting norms and slang, and multilingual, culturally diverse corpora that induce sharp distribution shift. Supervised fine-tuning (SFT) can specialize models but often triggers a "seesaw" between in-distribution gains and out-of-distribution robustness, especially for smaller models. To address these challenges, we introduce RedOne 2.0, an SNS-oriented LLM trained with a progressive, RL-prioritized post-training paradigm designed for rapid and stable adaptation. The pipeline consists of three stages: (1) Exploratory Learning on curated SNS corpora to establish initial alignment and identify systematic weaknesses; (2) Targeted Fine-Tuning that selectively applies SFT to the diagnosed gaps while mixing in a small fraction of general data to mitigate forgetting; and (3) Refinement Learning that re-applies RL with SNS-centric signals to consolidate improvements and harmonize trade-offs across tasks. Across various tasks spanning three categories, our 4B-scale model delivers an average improvement of about 2.41 points over the 7B sub-optimal baseline. Additionally, RedOne 2.0 achieves an average performance lift of about 8.74 points over the base model with less than half the data required by the SFT-centric method RedOne, evidencing superior data efficiency and stability at compact scales. Overall, RedOne 2.0 establishes a competitive, cost-effective baseline for domain-specific LLMs in SNS scenarios, advancing capability without sacrificing robustness.

  8. SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization

    The soft-thinking paradigm for Large Language Model (LLM) reasoning can outperform the conventional discrete-token Chain-of-Thought (CoT) reasoning in some scenarios, underscoring its research and application value. However, while the discrete-token CoT reasoning pattern can be reinforced through policy optimization algorithms such as group relative policy optimization (GRPO), extending the soft-thinking pattern with Reinforcement Learning (RL) remains challenging. This difficulty stems from the complexities of injecting stochasticity into soft-thinking tokens and updating soft-thinking policies accordingly. As a result, previous attempts to combine soft-thinking with GRPO typically underperform their discrete-token GRPO counterparts. To fully unlock the potential of soft-thinking, this paper presents a novel policy optimization algorithm, SofT-GRPO, to reinforce LLMs under the soft-thinking reasoning pattern. SofT-GRPO injects Gumbel noise into the logits, employs the Gumbel-Softmax technique to avoid soft-thinking tokens outside the pre-trained embedding space, and leverages the reparameterization trick in the policy gradient. We conduct experiments across base LLMs ranging from 1.5B to 7B parameters, and results demonstrate that SofT-GRPO enables soft-thinking LLMs to slightly outperform discrete-token GRPO on Pass@1 (+0.13% on average accuracy), while exhibiting a substantial uplift on Pass@32 (+2.19% on average accuracy). Code and weights are available at https://github.com/zz1358m/SofT-GRPO-master
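
    The Gumbel-Softmax step at the core of SofT-GRPO can be sketched in a few lines. This is the standard Gumbel-Softmax sampler (the temperature `tau` and the logits are illustrative), which injects stochasticity while keeping the sampled "soft token" inside the probability simplex over the vocabulary:

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, rng=random):
    """Standard Gumbel-Softmax: perturb each logit with Gumbel(0, 1)
    noise, then take a temperature-scaled softmax. The result is a soft
    token: a probability mixture over the vocabulary, not a hard pick."""
    noisy = []
    for logit in logits:
        u = rng.random() or 1e-12                # guard against u == 0
        noisy.append(logit - math.log(-math.log(u)))
    m = max(n / tau for n in noisy)              # subtract max for stability
    exps = [math.exp(n / tau - m) for n in noisy]
    z = sum(exps)
    return [e / z for e in exps]

soft = gumbel_softmax([2.0, 0.5, -1.0], tau=0.7)
```

    Because the noise enters through a differentiable transformation of the logits, the reparameterization trick applies and the policy gradient can flow through the sampling step.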

  9. Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

    Solving complex tasks usually requires LLMs to generate long multi-step reasoning chains. Previous work has shown that verifying the correctness of individual reasoning steps can further improve the performance and efficiency of LLMs on such tasks and enhance solution interpretability. However, existing verification approaches, such as Process Reward Models (PRMs), are either computationally expensive, limited to specific domains, or require large-scale human or model-generated annotations. Thus, we propose a lightweight alternative for step-level reasoning verification based on data-driven uncertainty scores. We train transformer-based uncertainty quantification heads (UHeads) that use the internal states of a frozen LLM to estimate the uncertainty of its reasoning steps during generation. The approach is fully automatic: target labels are generated either by another larger LLM (e.g., DeepSeek R1) or in a self-supervised manner by the original model itself. UHeads are both effective and lightweight, containing less than 10M parameters. Across multiple domains, including mathematics, planning, and general knowledge question answering, they match or even surpass the performance of PRMs that are up to 810x larger. Our findings suggest that the internal states of LLMs encode their uncertainty and can serve as reliable signals for reasoning verification, offering a promising direction toward scalable and generalizable introspective LLMs.
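
    As a toy stand-in for a UHead (the paper's heads are small transformers under 10M parameters, not a single linear probe), the idea of mapping a frozen LLM's internal state for one reasoning step to an uncertainty score can be sketched as:

```python
import math

def uncertainty_score(hidden, probe, bias=0.0):
    """Toy UHead stand-in: map the frozen LLM's hidden state for one
    reasoning step to an uncertainty score in (0, 1) via a linear probe
    plus sigmoid. The probe weights would be trained on labels produced
    by a larger LLM or by the model itself (self-supervised)."""
    z = sum(h * w for h, w in zip(hidden, probe)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative hidden state and probe weights (assumed values).
u = uncertainty_score([0.2, -1.1, 0.5], [0.4, 0.1, -0.3])
needs_check = u > 0.5    # route only uncertain steps to verification
```

    Gating verification on the score is what makes the approach cheap relative to running a Process Reward Model on every step.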

  10. Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence

    Recent advances in depth-recurrent language models show that recurrence can decouple train-time compute and parameter count from test-time compute. In this work, we study how to convert existing pretrained non-recurrent language models into depth-recurrent models. We find that using a curriculum of recurrences to increase the effective depth of the model over the course of training preserves performance while reducing total computational cost. In our experiments, on mathematics, we observe that converting pretrained models to recurrent ones results in better performance at a given compute budget than simply post-training the original non-recurrent language model.
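
    The abstract does not specify the curriculum's shape; one plausible sketch (the linear ramp and the `max_recurrences` cap are assumptions) is a schedule that grows the recurrence count, and hence the model's effective depth, over training:

```python
def recurrence_schedule(step, total_steps, max_recurrences=8):
    """Curriculum over recurrence count: begin with a single pass through
    the recurrent block and ramp up linearly, so the retrofitted model's
    effective depth grows over the course of training."""
    frac = min(step / max(total_steps, 1), 1.0)
    return 1 + int(frac * (max_recurrences - 1))

# Effective recurrence depth at the start, middle, and end of training.
depths = [recurrence_schedule(s, 100) for s in (0, 50, 100)]
```

    Starting shallow keeps the converted model close to its pretrained behavior, which is one way to read the paper's claim that the curriculum preserves performance while reducing total compute.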

  11. NURBGen: High-Fidelity Text-to-CAD Generation through LLM-Driven NURBS Modeling

    Generating editable 3D CAD models from natural language remains challenging, as existing text-to-CAD systems either produce meshes or rely on scarce design-history data. We present NURBGen, the first framework to generate high-fidelity 3D CAD models directly from text using Non-Uniform Rational B-Splines (NURBS). To achieve this, we fine-tune a large language model (LLM) to translate free-form texts into JSON representations containing NURBS surface parameters (i.e., control points, knot vectors, degrees, and rational weights) which can be directly converted into BRep format using Python. We further propose a hybrid representation that combines untrimmed NURBS with analytic primitives to handle trimmed surfaces and degenerate regions more robustly, while reducing token complexity. Additionally, we introduce partABC, a curated subset of the ABC dataset consisting of individual CAD components, annotated with detailed captions using an automated annotation pipeline. NURBGen demonstrates strong performance on diverse prompts, surpassing prior methods in geometric fidelity and dimensional accuracy, as confirmed by expert evaluations. Code and dataset will be released publicly.
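
    A JSON representation of NURBS surface parameters might look like the following; the field names are illustrative assumptions, not NURBGen's actual schema. Shown is a single bilinear patch (degree 1 in both directions, so a 2x2 control grid with 4 knots per direction is consistent):

```python
import json

# Hypothetical LLM output for one untrimmed bilinear NURBS patch.
surface_json = """
{
  "degree_u": 1, "degree_v": 1,
  "knots_u": [0, 0, 1, 1],
  "knots_v": [0, 0, 1, 1],
  "control_points": [[[0, 0, 0], [1, 0, 0]], [[0, 1, 0], [1, 1, 1]]],
  "weights": [[1.0, 1.0], [1.0, 1.0]]
}
"""
surface = json.loads(surface_json)
# A downstream step would feed these parameters to a BRep builder.
grid_rows = len(surface["control_points"])
```

    Emitting structured parameters rather than mesh vertices is what keeps the generated model editable in CAD tools.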

  12. Robot Learning from a Physical World Model

    We introduce PhysWorld, a framework that enables robot learning from video generation through physical world modeling. Recent video generation models can synthesize photorealistic visual demonstrations from language commands and images, offering a powerful yet underexplored source of training signals for robotics. However, directly retargeting pixel motions from generated videos to robots neglects physics, often resulting in inaccurate manipulations. PhysWorld addresses this limitation by coupling video generation with physical world reconstruction. Given a single image and a task command, our method generates task-conditioned videos and reconstructs the underlying physical world from the videos, and the generated video motions are grounded into physically accurate actions through object-centric residual reinforcement learning with the physical world model. This synergy transforms implicit visual guidance into physically executable robotic trajectories, eliminating the need for real robot data collection and enabling zero-shot generalizable robotic manipulation. Experiments on diverse real-world tasks demonstrate that PhysWorld substantially improves manipulation accuracy compared to previous approaches. Details are available on the project webpage: https://pointscoder.github.io/PhysWorld_Web/

  13. Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks

    We introduce llama-embed-nemotron-8b, an open-weights text embedding model that achieves state-of-the-art performance on the Multilingual Massive Text Embedding Benchmark (MMTEB) leaderboard as of October 21, 2025. While recent models show strong performance, their training data or methodologies are often not fully disclosed. We aim to address this by developing a fully open-source model, publicly releasing its weights and detailed ablation studies, and planning to share the curated training datasets. Our model demonstrates superior performance across all major embedding tasks -- including retrieval, classification and semantic textual similarity (STS) -- and excels in challenging multilingual scenarios, such as low-resource languages and cross-lingual setups. This state-of-the-art performance is driven by a novel data mix of 16.1 million query-document pairs, split between 7.7 million samples from public datasets and 8.4 million synthetically generated examples from various open-weight LLMs. One of our key contributions is a detailed ablation study analyzing core design choices, including a comparison of contrastive loss implementations, an evaluation of synthetic data generation (SDG) strategies, and the impact of model merging. The llama-embed-nemotron-8b is an instruction-aware model, supporting user-defined instructions to enhance performance for specific use-cases. This combination of top-tier performance, broad applicability, and user-driven flexibility enables it to serve as a universal text embedding solution.

  14. Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries

    While Vision-Language Models (VLMs) post-trained with Reinforcement Learning (RL) show impressive general reasoning, their evaluation is often confined to language-dominant tasks (e.g., math). This raises a critical question: can RL post-training truly extend the inherent capability boundary of a base VLM, particularly for visual-centric spatial tasks where it initially fails? To investigate this, we introduce Ariadne, a framework utilizing synthetic mazes for multi-step spatial reasoning where task difficulty (e.g., path length, turns) is precisely controlled. We leverage this controllable environment to train VLMs using Reinforcement Learning with Verified Rewards (RLVR) in a difficulty-aware curriculum. Surprisingly, post-RLVR training, the VLM achieves over 50% accuracy on a problem set where the base model scored 0%, demonstrating that our approach expands the model's initial capability boundary. To assess real-world viability, we evaluate out-of-distribution (OOD) generalization on practical benchmarks. Despite training only on synthetic maze samples, Ariadne achieves significant zero-shot improvements, averaging 16% on MapBench (e.g., museum navigation) and 24% on ReasonMap (subway transfer tasks). These results confirm that our method not only broadens the model's fundamental limits but also enhances its generalization to real-world spatial reasoning. We acknowledge our study is limited to the post-training phase, given the opaqueness of pre-training data, and hope our research motivates further work on specialized, capability-extending alignment.

  15. MPJudge: Towards Perceptual Assessment of Music-Induced Paintings

    Music-induced painting is a unique artistic practice in which visual artworks are created under the influence of music. Evaluating whether a painting faithfully reflects the music that inspired it poses a challenging perceptual assessment task. Existing methods primarily rely on emotion recognition models to assess the similarity between music and painting, but such models introduce considerable noise and overlook broader perceptual cues beyond emotion. To address these limitations, we propose a novel framework for music-induced painting assessment that directly models perceptual coherence between music and visual art. We introduce MPD, the first large-scale dataset of music-painting pairs annotated by domain experts based on perceptual coherence. To better handle ambiguous cases, we further collect pairwise preference annotations. Building on this dataset, we present MPJudge, a model that integrates music features into a visual encoder via a modulation-based fusion mechanism. To effectively learn from ambiguous cases, we adopt Direct Preference Optimization for training. Extensive experiments demonstrate that our method outperforms existing approaches. Qualitative results further show that our model more accurately identifies music-relevant regions in paintings.

Solidot (15)

  1. US government considers banning sales of TP-Link routers

    Several US government agencies have proposed banning sales of TP-Link routers, citing national-security risks. TP-Link Systems, headquartered in California, denies the allegation that it poses a risk to US national security, saying it has fully separated from the China-based TP-Link Technologies, that it has a subsidiary in Singapore and a manufacturing base in Vietnam, and that apart from chipsets, all R&D, design, and manufacturing of its products are done in-house. A TP-Link Systems spokesperson said TP-Link is a US company committed to delivering high-quality, secure products to the US and other markets. TP-Link noted that its competitors also source components from China, and that APT groups in China and elsewhere have likewise exploited vulnerabilities in rival products from Cisco, Netgear, and others.

  2. China's CO2 emissions flat or falling for 18 consecutive months

    Analysis shows that China's CO2 emissions have been flat or declining for 18 consecutive months, which may mean the world's largest emitter has reached peak CO2 emissions ahead of schedule. In the third quarter, installed solar and wind capacity grew 46% and 11% respectively, meaning emissions from China's power sector can hold steady even as electricity demand keeps rising. In the first nine months of the year China added 240 GW of solar and 61 GW of wind capacity, putting it on track to set another renewable-capacity record in 2025. Last year China added 333 GW of solar capacity, more than the rest of the world combined. The data also show emissions rising in parts of the economy: in the third quarter, oil demand and transport-sector emissions fell 5%, but surging output of plastics and other chemicals pushed emissions in other sectors up 10%.

  3. Canada's measles outbreak has lasted a year

    Vaccination helped Canada eliminate measles in 1998, but an anti-vaccine movement targeting the measles, mumps, and rubella (MMR) vaccine has driven coverage down, and measles outbreaks have returned to North America. When an outbreak persists in a country for more than a year, that country loses its measles-elimination status. On Monday the Pan American Health Organization (PAHO) declared that Canada's outbreak has now lasted a year and the country is no longer measles-free. Widespread transmission in Canada began in October 2024; as of November 1, 2025, the country had recorded 5,162 measles cases this year. Canada is not alone: the US and Mexico are experiencing similar outbreaks, with the US reporting at least 1,618 cases since the start of the year and Mexico at least 5,185. PAHO reports that as of November 7 it had collected 12,593 confirmed measles cases from 10 countries, 95% of them in Canada, Mexico, and the US. That figure is 30 times the 2024 total, and the outbreaks have caused at least 28 deaths: 23 in Mexico, 3 in the US, and 2 in Canada. The current US health secretary is himself an anti-vaccine activist.

  4. Apple TV won't launch an ad-supported tier, won't buy Warner

    Eddy Cue, Apple's senior vice president of Services, who oversees the Apple TV business, said in an interview that Apple will not launch an ad-supported subscription tier, at least for now, though he "would never say never." If Apple TV can keep its price advantage over rival services, staying ad-free is better for consumers: the ad-free tiers of major streamers such as Netflix start at $18 a month and Disney's Disney+ at $19, while Apple TV costs only $13. Apple TV is not yet profitable. Cue did not disclose total subscriptions, saying only that Apple TV is growing faster and that viewing hours were higher last year than ever before. One easy way to add subscribers would be to buy an existing streaming service and content producer: Warner Bros. Discovery is seeking a sale, and one of its major subscription services is HBO Max. Cue said Apple rarely makes large acquisitions, usually only small ones, and he does not expect Apple to buy Warner or to license content from any other company.

  5. A world ruled by HR

    The Economist reports that US companies employed 1.3 million HR staff in 2024, up 64% from a decade earlier, while overall US employment grew 14% over the same period. Professional-services and tech companies have doubled their HR headcount since 2014, and Australia, the UK, and Germany show similar trends. Pay for chief human resources officers has also risen sharply: their total compensation grew from 40% of average board-member pay to 70% in 2022. GM CEO Mary Barra once served as the company's chief human resources officer. The surge in HR staffing is likely tied to a series of workplace changes, including the Me Too movement, pandemic-era remote work, diversity initiatives, heavier regulation of employee relations, and a sharp rise in workplace complaints of discrimination or harassment. The average number of discrimination or harassment claims rose from 6 per 1,000 employees in 2021 to 15 in 2024.

  6. Commercial spyware exploited a Samsung flaw to target users in the Middle East

    Security firm Palo Alto Networks disclosed Landfall, commercial spyware that exploited a zero-day in Samsung Galaxy phones. Landfall first appeared in July 2024, exploiting the vulnerability tracked as CVE-2025-21042. Samsung patched the flaw in April 2025, but details of the attacks are only now being disclosed. The attacks targeted specific groups in the Middle East, so most Galaxy users are unlikely to be infected. Landfall exploits a zero-click vulnerability, meaning compromising a device requires no user interaction: it embeds a malicious ZIP payload inside a modified DNG image file, and CVE-2025-21042 stems from the phones' image-processing library.

  7. Denmark agrees to ban social media for children under 15

    The Danish government announced last Friday that it had reached an agreement to ban children under 15 from using social media. The ban will not take effect immediately; the legislative process may take several more months. Denmark is the second country after Australia to restrict children's use of social media: Australia's ban covers children under 16, with violators facing fines of up to A$50 million. Danish officials have not said how the ban will be enforced, though platforms may be required to verify users' ages through the country's national electronic ID system. Denmark's minister for digital affairs, Caroline Stage, said 94% of Danish children under 13 have an account on at least one social platform, as do more than half of children under 10, and that social media poses too great a risk to children.

  8. Linux kernel project debates a policy on generative AI

    Even the Linux kernel community cannot avoid generative AI tools. Kernel developers have drafted a policy proposal on how such tools should be used. Its focus is disclosure: submitters must state which parts of a patch came from a tool and which from a human. Because reviewer and maintainer resources are limited, this information both improves efficiency and preserves trust between submitters and reviewers, which is vital to the kernel's health. The proposed policy is not aimed at simple uses of tools, such as fixing spelling and grammar, but at patches containing substantial amounts of tool-generated content.

  9. The US employment landscape is shifting

    Data from the US Department of Education's student-data clearinghouse, which collects information on students nationwide, show that in spring 2025 enrollment at vocational schools teaching trades such as plumbing and carpentry rose 12% year over year, far outpacing the 4% growth in college enrollment. The trend has been strengthening for several years, against a backdrop of anxiety about a future reshaped by AI. A survey this year by research firm Conjointly of parents of Gen Z children, from their teens through their twenties, found that only 16% believe a college degree guarantees long-term, stable employment, while 77% said it is very important to choose work that is hard to automate. There are sound reasons for the shift: overall US unemployment has held steady at 4.0-4.5%, but among 20-to-24-year-olds, around college-graduation age, unemployment rose from 7.5% in December 2024 to 9.2% in August 2025.

  10. AI isn't the reason for layoffs; massive AI spending is

    US companies typically cite AI when announcing mass layoffs, but is AI really the cause? A good deal of research and data suggest otherwise. An MIT Media Lab study found that 95% of generative-AI pilot business projects failed; an Atlassian survey showed 96% of companies saw no significant AI-driven improvement in organizational efficiency, innovation, or work quality; and another study found that 40% of employees encounter "AI slop" at work and spend substantial time dealing with it. Some argue companies are cutting staff because they over-hired during the pandemic; others think the US may be heading into a recession. For the tech industry's mass layoffs, a more likely cause is the financial strain of enormous AI spending that has yet to produce revenue growth. Amazon's capital expenditure rose from $54 billion in 2023 to $84 billion in 2024 and is expected to reach $118 billion in 2025; Meta is seeking $27 billion in credit for its data centers; Oracle plans to borrow $25 billion a year to fulfill its AI contracts. Until AI delivers sustainable revenue, the tech giants need to cut costs.

  11. Python foundation receives flood of donations after declining $1.5 million US government grant

    Late last month the Python Software Foundation (PSF) announced that, to uphold its DEI (diversity, equity, and inclusion) values and in view of unpredictable financial risk, it was declining a $1.5 million US government grant. The decision drew wide attention and coverage; the foundation received about 300 donations that day, and the next day a Reddit user complained of timeouts while trying to donate. Last Friday, executive director Deb Nicholson disclosed that the foundation has so far received more than $157,000 in donations, including 295 new supporting members giving $99 a year. Although the donations fall well short of the $1.5 million gap, the foundation said they are deeply meaningful and show strong support from the community.

  12. Taking melatonin may carry risks

    A preliminary study presented at the American Heart Association's scientific sessions found that chronic-insomnia patients who took melatonin supplements for a year or longer were more likely to develop heart failure, be hospitalized for heart failure, or die than those who did not take them. Melatonin is a hormone secreted by the pineal gland that regulates the body's sleep-wake cycle; its levels rise naturally in darkness and fall during the day. Synthetic melatonin is chemically identical to the natural hormone and is widely used to treat insomnia and jet lag, and in many countries melatonin supplements can be bought without a prescription. The researchers stress that more studies are needed to fully understand melatonin's effects on heart health and to ensure its safe use.

  13. KeePassXC will not add AI features

    The open-source password manager KeePassXC has updated its policy on generative AI. The developers stress that KeePassXC will not ship any AI features, but they will use AI tools such as GitHub Copilot for simple tasks, for example drafting pull requests for straightforward bug fixes and UI changes. Because AI handles complex tasks poorly, the developers say they will use Copilot cautiously and rely on the standard review process to catch any errors it introduces.

  14. Iran faces an unprecedented drought

    Iran, and especially the capital Tehran, is suffering an unprecedented drought: rainfall has hit record lows and reservoirs are nearly dry. Officials have urged residents to conserve water, and President Masoud Pezeshkian warned that if the drought does not ease soon, Tehran may impose water rationing, and if rationing fails the city might have to be evacuated. Meteorological officials expect no rain in the next 10 days. The Latian dam, one of Tehran's main water sources, is below 10% of capacity, and the nearby Karaj dam is in a similar state. Karaj dam director Mohammad-Ali Moallem said rainfall this year is down 92% from last year and the reservoir holds only 8% of capacity, most of it unusable "dead water." Iran's second-largest city, Mashhad, faces a similar drought. Tehran, Karaj, and Mashhad together are home to more than 16 million people.

  15. Lawyers keep citing fake AI-generated cases

    US lawyers keep misusing AI to fabricate case citations, and more and more court filings are being caught doing so. Earlier this year a lawyer filed a motion in a Texas bankruptcy court citing a 1985 case called "Brasher v. Stewart"; the case does not exist and was invented by AI. The judge sharply criticized the lawyer, referred him to the state bar's disciplinary committee, and ordered him to complete six hours of AI training. In April, French lawyer and researcher Damien Charlotin built an online database tracking incidents of AI-fabricated case citations. At first it logged three or four incidents a month; now it logs three or four a day, with 509 recorded so far. Court sanctions have not proved a deterrent.