OrangeBot.AI Digest — 2025-08-29

75 headlines across 5 sources, aggregated for the day.

Hacker News (15)

  1. Do the simplest thing that could possibly work (www.seangoedecke.com)
  2. John Carmack's arguments against building a custom XR OS at Meta (twitter.com)
  3. The web does not need gatekeepers: Cloudflare’s new “signed agents” pitch (positiveblue.substack.com)
  4. Essential Coding Theory [pdf] (cse.buffalo.edu)
  5. Deploying DeepSeek on 96 H100 GPUs (lmsys.org)
  6. Private equity snaps up disability services, challenging regulators (www.governing.com)
  7. Flunking my Anthropic interview again (taylor.town)
  8. Grok Code Fast 1 (x.ai)
  9. Sig Sauer citing national security to keep documents from public (practicalshootinginsights.com)
  10. Meta might be secretly scanning your phone's camera roll (www.zdnet.com)
  11. Tesla said it didn't have key data in a fatal crash, then a hacker found it (www.washingtonpost.com)
  12. Updates to Consumer Terms and Privacy Policy (www.anthropic.com)
  13. If you have a Claude account, they're going to train on your data moving forward (old.reddit.com)
  14. Show HN: PageIndex – Vectorless RAG (github.com)
  15. The Synology End Game (lowendbox.com)

GitHub Trending (15)

  1. QuentinFuxa / WhisperLiveKit

    Real-time & local speech-to-text, translation, and speaker diarization. With server & web UI.

  2. microsoft / mcp

    Catalog of official Microsoft MCP (Model Context Protocol) server implementations for AI-powered data access and tool integration

  3. Canner / WrenAI

    ⚡️ GenBI (Generative BI) queries any database in natural language, generates accurate SQL (Text-to-SQL), charts (Text-to-Chart), and AI-powered insights in seconds.

  4. OpenBMB / MiniCPM-V

    MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and Video Understanding on Your Phone

  5. twbs / bootstrap

    The most popular HTML, CSS, and JavaScript framework for developing responsive, mobile first projects on the web.

  6. TheAlgorithms / Python

    All Algorithms implemented in Python

  7. humanlayer / humanlayer

    HumanLayer enables AI agents to communicate with humans in tool-based and async workflows, guaranteeing human oversight of high-stakes function calls with approval workflows across Slack, email, and more. Bring your LLM and framework of choice and start giving your AI agents safe access to the world.

  8. nats-io / nats-server

    High-Performance server for NATS.io, the cloud and edge native messaging system.

  9. spf13 / cobra

    A Commander for modern Go CLI interactions

  10. microsoft / terminal

    The new Windows Terminal and the original Windows console host, all in the same place!

  11. asgeirtj / system_prompts_leaks

    Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini

  12. mercurjs / mercur

    Open-source multi-vendor marketplace platform for B2B & B2C. Built on top of MedusaJS. Create your own custom marketplace. 🛍️

  13. transformerlab / transformerlab-app

    Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.

  14. opf / openproject

    OpenProject is the leading open source project management software.

  15. juspay / hyperswitch

    An open source payments switch written in Rust to make payments fast, reliable and affordable

Product Hunt (15)

  1. Surf

    AI-powered crypto insights and trading in one platform

  2. Wanderboat 2.0

    Social + Local + AI map search from ex-Bing team

  3. Contact Finder by Jeeva AI

    Verified emails and direct dials in seconds.

  4. HyNote AI

    Full stack AI note taker with Google, Notion + more support

  5. Codex by OpenAI

    Your new software engineering teammate

  6. Oh Dear

    The all-in-one monitoring tool for your entire website

  7. KushoAI

    Open-source AI tester that lives in your CLI

  8. Optibase 2.0

    The ultimate A/B testing app for Webflow

  9. Streamdown by Vercel

    Drop-in replacement for react-markdown

  10. Peony Dataroom

    The DocSend alternative founders love

  11. Twistly

    AI presentation maker add-in for Microsoft PowerPoint

  12. Grok Code Fast 1

    The speedy, economical AI for coding

  13. Never lose your work again

    Checkpoints for Claude Code

  14. Microsoft AI (MAI) Voice-1

    Highly expressive and natural speech generation model

  15. LaunchMMO

    Launch your browser MMO this weekend, not in months

Hugging Face (15)

  1. Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

    Recent advancements highlight the importance of GRPO-based reinforcement learning methods and benchmarking in enhancing text-to-image (T2I) generation. However, current methods using pointwise reward models (RM) for scoring generated images are susceptible to reward hacking. We reveal that this happens when minimal score differences between images are amplified after normalization, creating illusory advantages that drive the model to over-optimize for trivial gains, ultimately destabilizing the image generation process. To address this, we propose Pref-GRPO, a pairwise preference reward-based GRPO method that shifts the optimization objective from score maximization to preference fitting, ensuring more stable training. In Pref-GRPO, images are pairwise compared within each group using a preference RM, and the win rate is used as the reward signal. Extensive experiments demonstrate that Pref-GRPO differentiates subtle image quality differences, providing more stable advantages and mitigating reward hacking. Additionally, existing T2I benchmarks are limited by coarse evaluation criteria, hindering comprehensive model assessment. To solve this, we introduce UniGenBench, a unified T2I benchmark comprising 600 prompts across 5 main themes and 20 subthemes. It evaluates semantic consistency through 10 primary and 27 sub-criteria, leveraging MLLM for benchmark construction and evaluation. Our benchmarks uncover the strengths and weaknesses of both open and closed-source T2I models and validate the effectiveness of Pref-GRPO.
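
The core mechanic, comparing images pairwise within a group and using each image's win rate as its reward, can be sketched in a few lines. This is a minimal sketch: `prefer` stands in for the paper's preference reward model and is a hypothetical callback, not an API from the paper.

```python
from itertools import combinations

def win_rate_rewards(images, prefer):
    """Reward each image in a group by its pairwise win rate.

    `prefer(a, b)` is a stand-in for a preference reward model that
    returns True when image `a` is preferred over image `b`.
    """
    n = len(images)
    wins = [0] * n
    for i, j in combinations(range(n), 2):
        if prefer(images[i], images[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    # each image is compared against the n - 1 others exactly once
    return [w / (n - 1) for w in wins]
```

Because rewards are relative ranks rather than raw scores, tiny score gaps can no longer be amplified into "illusory advantages" by normalization.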

  2. rStar2-Agent: Agentic Reasoning Technical Report

    We introduce rStar2-Agent, a 14B math reasoning model trained with agentic reinforcement learning to achieve frontier-level performance. Beyond current long CoT, the model demonstrates advanced cognitive behaviors, such as thinking carefully before using Python coding tools and reflecting on code execution feedback to autonomously explore, verify, and refine intermediate steps in complex problem-solving. This capability is enabled through three key innovations that make agentic RL effective at scale: (i) an efficient RL infrastructure with a reliable Python code environment that supports high-throughput execution and mitigates the high rollout costs, enabling training on limited GPU resources (64 MI300X GPUs); (ii) GRPO-RoC, an agentic RL algorithm with a Resample-on-Correct rollout strategy that addresses the inherent environment noise from coding tools, allowing the model to reason more effectively in a code environment; (iii) an efficient agent training recipe that starts with non-reasoning SFT and progresses through multi-RL stages, yielding advanced cognitive abilities with minimal compute cost. As a result, rStar2-Agent boosts a pre-trained 14B model to state of the art in only 510 RL steps within one week, achieving average pass@1 scores of 80.6% on AIME24 and 69.8% on AIME25, surpassing DeepSeek-R1 (671B) with significantly shorter responses. Beyond mathematics, rStar2-Agent-14B also demonstrates strong generalization to alignment, scientific reasoning, and agentic tool-use tasks. Code and training recipes are available at https://github.com/microsoft/rStar.
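
The Resample-on-Correct idea, as described, biases the kept rollouts toward correct trajectories with clean tool use. The following is a hypothetical sketch of such a downsampling step, not the paper's implementation; the rollout dicts and their `correct`/`tool_errors` fields are invented for illustration.

```python
import random

def resample_on_correct(rollouts, k, seed=0):
    """Hypothetical sketch of a Resample-on-Correct downsampling step:
    from an oversampled rollout pool, keep k rollouts, preferring correct
    trajectories with the fewest tool-call errors (the environment noise
    from coding tools that the abstract mentions)."""
    rng = random.Random(seed)
    correct = sorted((r for r in rollouts if r["correct"]),
                     key=lambda r: r["tool_errors"])
    incorrect = [r for r in rollouts if not r["correct"]]
    kept = correct[: k // 2]                    # cleanest correct rollouts
    # fill the rest with a random sample of incorrect rollouts for contrast
    kept += rng.sample(incorrect, min(k - len(kept), len(incorrect)))
    return kept
```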

  3. USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

    Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style, a long-standing theme in style-driven research. To this end, we present USO, a Unified Style-Subject Optimized customization model. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content-style disentanglement training. Third, we incorporate a style reward-learning paradigm denoted as SRL to further enhance the model's performance. Finally, we release USO-Bench, the first benchmark that jointly evaluates style similarity and subject fidelity across multiple metrics. Extensive experiments demonstrate that USO achieves state-of-the-art performance among open-source models along both dimensions of subject consistency and style similarity. Code and model: https://github.com/bytedance/USO

  4. MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

    We introduce MCP-Bench, a benchmark for evaluating large language models (LLMs) on realistic, multi-step tasks that demand tool use, cross-tool coordination, precise parameter control, and planning/reasoning for solving tasks. Built on the Model Context Protocol (MCP), MCP-Bench connects LLMs to 28 representative live MCP servers spanning 250 tools across domains such as finance, traveling, scientific computing, and academic search. Unlike prior API-based benchmarks, each MCP server provides a set of complementary tools designed to work together, enabling the construction of authentic, multi-step tasks with rich input-output coupling. Tasks in MCP-Bench test agents' ability to retrieve relevant tools from fuzzy instructions without explicit tool names, plan multi-hop execution trajectories for complex objectives, ground responses in intermediate tool outputs, and orchestrate cross-domain workflows - capabilities not adequately evaluated by existing benchmarks that rely on explicit tool specifications, shallow few-step workflows, and isolated domain operations. We propose a multi-faceted evaluation framework covering tool-level schema understanding and usage, trajectory-level planning, and task completion. Experiments on 20 advanced LLMs reveal persistent challenges in MCP-Bench. Code and data: https://github.com/Accenture/mcp-bench.
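
One capability the benchmark stresses, retrieving the right tool from a fuzzy instruction that never names it, can be illustrated with a toy bag-of-words retriever. The tool dicts and field names below are illustrative, not MCP-Bench's actual format.

```python
def retrieve_tools(instruction, tools, k=1):
    """Toy fuzzy tool retrieval: rank tools by word overlap between the
    instruction and each tool's description; no explicit tool name needed."""
    words = set(instruction.lower().split())
    ranked = sorted(
        tools,
        key=lambda t: len(words & set(t["description"].lower().split())),
        reverse=True,
    )
    return [t["name"] for t in ranked[:k]]
```

Real agents use semantic rather than lexical matching, but the task shape is the same: map a vague request onto the best-fitting tool schema before planning the call.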

  5. AWorld: Orchestrating the Training Recipe for Agentic AI

    The learning from practice paradigm is crucial for developing capable Agentic AI systems, yet it is severely hampered by inefficient experience generation, a bottleneck especially pronounced in complex benchmarks like GAIA. To address this, we introduce AWorld, an open-source system engineered for large-scale agent-environment interaction. By distributing tasks across a cluster, AWorld accelerates experience collection by 14.6x compared to standard single-node, sequential execution. This critical speedup makes extensive reinforcement learning practical and scalable. Leveraging this capability, we trained a Qwen3-32B-based agent that significantly outperforms its base model, increasing its overall GAIA accuracy from 21.59% to 32.23%. On the benchmark's most challenging levels, our agent achieves a score of 16.33%, surpassing the performance of leading proprietary models. Our open-source system and resulting agent provide a practical blueprint for a complete agentic AI training pipeline, from efficient interaction to demonstrable model improvement.

  6. TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning

    Diverse instruction data is vital for effective instruction tuning of large language models, as it enables the model to generalize across different types of inputs. Building such a diversified instruction dataset is an essential step in this process. Existing approaches often leverage large language models to automatically explore and generate diverse instructions, ensuring both data diversity and quality. However, they tend to overlook an important factor in real-world applications: on-task relevance. In practice, only a few real-world applications require a truly general-purpose model; most benefit from task-specific knowledge tailored to their particular use case. Therefore, it is vital to develop instruction augmentation methods that not only maintain diversity but are also optimized for specific, real-world scenarios. We thus introduce Task-Centric Instruction Augmentation (TCIA), a framework that systematically expands instructions while preserving both diversity and task alignment. By representing instructions in a discrete query-constraints space, TCIA creates a rich set of task-relevant instructions and enables models to generalize to these task-specific instructions without sacrificing overall performance. Experiments show that TCIA improves open-source LLMs' performance by an average of 8.7% across four real-world, task-specific applications, in some cases outperforming leading closed-source models. These improvements do not compromise general instruction-following ability, making TCIA a scalable and efficient solution for adapting LLMs to real-world, task-focused applications.

  7. Mixture of Contexts for Long Video Generation

    Long video generation is fundamentally a long context memory problem: models must retain and retrieve salient events across a long range without collapsing or drifting. However, scaling diffusion transformers to generate long-context videos is fundamentally limited by the quadratic cost of self-attention, which makes memory and computation intractable and difficult to optimize for long sequences. We recast long-context video generation as an internal information retrieval task and propose a simple, learnable sparse attention routing module, Mixture of Contexts (MoC), as an effective long-term memory retrieval engine. In MoC, each query dynamically selects a few informative chunks plus mandatory anchors (caption, local windows) to attend to, with causal routing that prevents loop closures. As we scale the data and gradually sparsify the routing, the model allocates compute to salient history, preserving identities, actions, and scenes over minutes of content. Efficiency follows as a byproduct of retrieval (near-linear scaling), which enables practical training and synthesis, and the emergence of memory and consistency at the scale of minutes.
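
The routing step reads as plain top-k selection over chunk summaries plus mandatory anchors. A minimal numpy sketch under that reading; scoring by dot product with each chunk's mean key is an assumption, as is the anchor set.

```python
import numpy as np

def route_chunks(query, chunk_keys, k=2, anchors=(0,)):
    """Each query attends to its top-k scoring chunks plus mandatory
    anchor chunks (e.g., the caption chunk), instead of the full history.

    chunk_keys: list of (chunk_len, dim) key arrays, one per chunk.
    """
    # summarize each chunk by its mean key, score against the query
    scores = np.array([query @ keys.mean(axis=0) for keys in chunk_keys])
    selected = set(np.argsort(scores)[::-1][:k].tolist()) | set(anchors)
    return sorted(selected)
```

Attending to O(k) chunks instead of all history is where the near-linear scaling the abstract mentions comes from.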

  8. Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection

    Safety alignment in Large Language Models (LLMs) often involves mediating internal representations to refuse harmful requests. Recent research has demonstrated that these safety mechanisms can be bypassed by ablating or removing specific representational directions within the model. In this paper, we propose the opposite approach: Rank-One Safety Injection (ROSI), a white-box method that amplifies a model's safety alignment by permanently steering its activations toward the refusal-mediating subspace. ROSI operates as a simple, fine-tuning-free rank-one weight modification applied to all residual stream write matrices. The required safety direction can be computed from a small set of harmful and harmless instruction pairs. We show that ROSI consistently increases safety refusal rates - as evaluated by Llama Guard 3 - while preserving the utility of the model on standard benchmarks such as MMLU, HellaSwag, and Arc. Furthermore, we show that ROSI can also re-align 'uncensored' models by amplifying their own latent safety directions, demonstrating its utility as an effective last-mile safety procedure. Our results suggest that targeted, interpretable weight steering is a cheap and potent mechanism to improve LLM safety, complementing more resource-intensive fine-tuning paradigms.
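
The mechanics are simple enough to sketch: a difference-of-means refusal direction from paired harmful/harmless activations, then a rank-one edit of a write matrix. The exact update form below, amplifying the output component along the direction, is an assumed reading of "steering activations toward the refusal-mediating subspace", not taken from the paper.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Unit difference-of-means direction from (n, d_model) activation stacks."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def rank_one_inject(W, d, alpha=0.5):
    """Fine-tuning-free, rank-one edit of a residual-stream write matrix W:
    scale the component of W's output lying along unit direction d by
    (1 + alpha). np.outer(d, d) @ W has rank one, hence 'rank-one'."""
    return W + alpha * np.outer(d, d) @ W
```

Ablation-style jailbreaks apply the opposite edit (projecting the direction out); this sketch shows why the same machinery can amplify refusal instead of removing it.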

  9. CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification

    Recent Vision-Language-Action (VLA) models built on pre-trained Vision-Language Models (VLMs) require extensive post-training, resulting in high computational overhead that limits scalability and deployment. We propose CogVLA, a Cognition-Aligned Vision-Language-Action framework that leverages instruction-driven routing and sparsification to improve both efficiency and performance. CogVLA draws inspiration from human multimodal coordination and introduces a 3-stage progressive architecture. 1) Encoder-FiLM based Aggregation Routing (EFA-Routing) injects instruction information into the vision encoder to selectively aggregate and compress dual-stream visual tokens, forming an instruction-aware latent representation. 2) Building upon this compact visual encoding, LLM-FiLM based Pruning Routing (LFP-Routing) introduces action intent into the language model by pruning instruction-irrelevant visually grounded tokens, thereby achieving token-level sparsity. 3) To ensure that compressed perception inputs can still support accurate and coherent action generation, we introduce V-L-A Coupled Attention (CAtten), which combines causal vision-language attention with bidirectional action parallel decoding. Extensive experiments on the LIBERO benchmark and real-world robotic tasks demonstrate that CogVLA achieves state-of-the-art performance with success rates of 97.4% and 70.0%, respectively, while reducing training costs by 2.5-fold and decreasing inference latency by 2.8-fold compared to OpenVLA. CogVLA is open-sourced and publicly available at https://github.com/JiuTian-VL/CogVLA.

  10. OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning

    In this paper, we introduce OneReward, a unified reinforcement learning framework that enhances the model's generative capabilities across multiple tasks under different evaluation criteria using only One Reward model. By employing a single vision-language model (VLM) as the generative reward model, which can distinguish the winner and loser for a given task and a given evaluation criterion, it can be effectively applied to multi-task generation models, particularly in contexts with varied data and diverse task objectives. We utilize OneReward for mask-guided image generation, which can be further divided into several sub-tasks such as image fill, image extend, object removal, and text rendering, involving a binary mask as the edit area. Although these domain-specific tasks share the same conditioning paradigm, they differ significantly in underlying data distributions and evaluation metrics. Existing methods often rely on task-specific supervised fine-tuning (SFT), which limits generalization and training efficiency. Building on OneReward, we develop Seedream 3.0 Fill, a mask-guided generation model trained via multi-task reinforcement learning directly on a pre-trained base model, eliminating the need for task-specific SFT. Experimental results demonstrate that our unified edit model consistently outperforms both commercial and open-source competitors, such as Ideogram, Adobe Photoshop, and FLUX Fill [Pro], across multiple evaluation dimensions. Code and model are available at: https://one-reward.github.io

  11. Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD

    Large Language Models (LLMs) can struggle to balance gullibility to misinformation and resistance to valid corrections in persuasive dialogues, a critical challenge for reliable deployment. We introduce DuET-PD (Dual Evaluation for Trust in Persuasive Dialogues), a framework evaluating multi-turn stance-change dynamics across dual dimensions: persuasion type (corrective/misleading) and domain (knowledge via MMLU-Pro, and safety via SALAD-Bench). We find that even a state-of-the-art model like GPT-4o achieves only 27.32% accuracy in MMLU-Pro under sustained misleading persuasions. Moreover, results reveal a concerning trend of increasing sycophancy in newer open-source models. To address this, we introduce Holistic DPO, a training approach balancing positive and negative persuasion examples. Unlike prompting or resist-only training, Holistic DPO enhances both robustness to misinformation and receptiveness to corrections, improving Llama-3.1-8B-Instruct's accuracy under misleading persuasion in safety contexts from 4.21% to 76.54%. These contributions offer a pathway to developing more reliable and adaptable LLMs for multi-turn dialogue. Code is available at https://github.com/Social-AI-Studio/DuET-PD.

  12. Multi-View 3D Point Tracking

    We introduce the first data-driven multi-view 3D point tracker, designed to track arbitrary points in dynamic scenes using multiple camera views. Unlike existing monocular trackers, which struggle with depth ambiguities and occlusion, or prior multi-camera methods that require over 20 cameras and tedious per-sequence optimization, our feed-forward model directly predicts 3D correspondences using a practical number of cameras (e.g., four), enabling robust and accurate online tracking. Given known camera poses and either sensor-based or estimated multi-view depth, our tracker fuses multi-view features into a unified point cloud and applies k-nearest-neighbors correlation alongside a transformer-based update to reliably estimate long-range 3D correspondences, even under occlusion. We train on 5K synthetic multi-view Kubric sequences and evaluate on two real-world benchmarks: Panoptic Studio and DexYCB, achieving median trajectory errors of 3.1 cm and 2.0 cm, respectively. Our method generalizes well to diverse camera setups of 1-8 views with varying vantage points and video lengths of 24-150 frames. By releasing our tracker alongside training and evaluation datasets, we aim to set a new standard for multi-view 3D tracking research and provide a practical tool for real-world applications. Project page available at https://ethz-vlg.github.io/mvtracker.
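
The k-nearest-neighbors correlation step operates over neighborhoods in the fused point cloud. A toy sketch of that neighborhood lookup; the actual model correlates learned features over these neighbors rather than raw coordinates.

```python
import numpy as np

def knn_indices(points, query, k=3):
    """Indices of the k nearest points (squared Euclidean distance) in a
    fused 3D point cloud -- the neighborhood a k-NN correlation pools over.

    points: (n, 3) array; query: (3,) array.
    """
    d2 = ((points - query) ** 2).sum(axis=1)
    return np.argsort(d2)[:k].tolist()
```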

  13. Dress&Dance: Dress up and Dance as You Like It - Technical Preview

    We present Dress&Dance, a video diffusion framework that generates high quality 5-second-long 24 FPS virtual try-on videos at 1152x720 resolution of a user wearing desired garments while moving in accordance with a given reference video. Our approach requires a single user image and supports a range of tops, bottoms, and one-piece garments, as well as simultaneous tops and bottoms try-on in a single pass. Key to our framework is CondNet, a novel conditioning network that leverages attention to unify multi-modal inputs (text, images, and videos), thereby enhancing garment registration and motion fidelity. CondNet is trained on heterogeneous training data, combining limited video data and a larger, more readily available image dataset, in a multistage progressive manner. Dress&Dance outperforms existing open source and commercial solutions and enables a high quality and flexible try-on experience.

  14. FakeParts: a New Family of AI-Generated DeepFakes

    We introduce FakeParts, a new class of deepfakes characterized by subtle, localized manipulations to specific spatial regions or temporal segments of otherwise authentic videos. Unlike fully synthetic content, these partial manipulations, ranging from altered facial expressions to object substitutions and background modifications, blend seamlessly with real elements, making them particularly deceptive and difficult to detect. To address the critical gap in detection capabilities, we present FakePartsBench, the first large-scale benchmark dataset specifically designed to capture the full spectrum of partial deepfakes. Comprising over 25K videos with pixel-level and frame-level manipulation annotations, our dataset enables comprehensive evaluation of detection methods. Our user studies demonstrate that FakeParts reduces human detection accuracy by over 30% compared to traditional deepfakes, with similar performance degradation observed in state-of-the-art detection models. This work identifies an urgent vulnerability in current deepfake detection approaches and provides the necessary resources to develop more robust methods for partial video manipulations.

  15. Provable Benefits of In-Tool Learning for Large Language Models

    Tool-augmented language models, equipped with retrieval, memory, or external APIs, are reshaping AI, yet their theoretical advantages remain underexplored. In this paper, we address this question by demonstrating the benefits of in-tool learning (external retrieval) over in-weight learning (memorization) for factual recall. We show that the number of facts a model can memorize solely in its weights is fundamentally limited by its parameter count. In contrast, we prove that tool-use enables unbounded factual recall via a simple and efficient circuit construction. These results are validated in controlled experiments, where tool-using models consistently outperform memorizing ones. We further show that for pretrained large language models, teaching tool-use and general rules is more effective than finetuning facts into memory. Our work provides both a theoretical and empirical foundation, establishing why tool-augmented workflows are not just practical, but provably more scalable.
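
The contrast the paper formalizes can be stated as a toy program: in-weight recall is a lookup into a capacity-bounded table baked into the parameters, while in-tool recall routes the query to an external store of unbounded size. All names here are illustrative, not from the paper.

```python
def answer(query, memorized, retrieve):
    """Toy contrast of the two recall modes:
    - `memorized`: the bounded fact table a model can store in its weights
      (limited by parameter count)
    - `retrieve`: an external tool call, e.g. a database or search API,
      whose capacity does not count against the model's parameters
    """
    if query in memorized:        # in-weight recall
        return memorized[query]
    return retrieve(query)        # in-tool recall
```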

Solidot (15)

  1. Free-roaming bison in Yellowstone strengthen grassland resilience

    According to a study published in Science, free-roaming bison herds in Yellowstone not only improve nutrient cycling but also boost ecosystem health. North America once held tens of millions of bison, whose seasonal migrations shaped the continent's vast grassland ecosystems. Those enormous wild, free-roaming herds are gone; about 400,000 bison remain today, nearly all managed in small herds on private land or in parks and reserves. The bison's movements through Yellowstone's northern ecosystem offered a rare natural experiment. Between 2015 and 2022, researchers tracked herd dynamics at 16 sites across three major habitats, measuring effects on carbon and nitrogen dynamics, plant communities, and soil microbiology. They found that bison accelerate nitrogen cycling, increase aboveground nitrogen, and improve the nutritional quality of the landscape while stabilizing plant productivity. The results suggest that the ecological influence of large migratory grazers lies not only in their body size but also in their numbers, density, and freedom to roam.

  2. US Commerce Secretary says economic data will be published on a blockchain

    US Commerce Secretary Howard Lutnick announced that the department will publish economic data on a blockchain, starting with GDP. He did not clarify which blockchain would be used, whether Bitcoin's or some custom chain. Lutnick said the move is part of President Trump's plan to make the US a "crypto government", calling Trump the crypto president, and said that once the Commerce Department works out the details, the practice may expand to other federal agencies.

  3. Japanese town weighs recommending no more than two hours of smartphone use per day

    City council members in Toyoake, Aichi prefecture, Japan, are debating a proposal to limit its 69,000 residents' smartphone use to two hours a day, sparking heated discussion about phone addiction. The proposal is believed to be the first of its kind in Japan. Toyoake's mayor said it would not be strictly enforced and is only meant to encourage residents to manage their screen time better; anyone exceeding the two-hour limit faces no penalty. In a statement, the mayor said the limit is just a guideline and is not intended to restrict residents' rights or impose obligations.

  4. FFmpeg 8 supports live subtitle generation

    The newly released open-source multimedia framework FFmpeg 8 integrates Whisper, the speech-recognition and transcription machine-learning model, meaning it can generate subtitles for video in real time. FFmpeg 8 is codenamed Huffman, after the Huffman coding algorithm invented in 1952, one of the oldest lossless compression algorithms. The Whisper model was released by OpenAI in September 2022; whisper.cpp is Georgi Gerganov's local, offline implementation built on Whisper.

  5. Study links AI adoption to declining entry-level jobs in the US

    Stanford researchers on Tuesday published the report "Canaries in the Coal Mine? Six Facts about the Recent Employment Effects of Artificial Intelligence", showing that AI adoption correlates with a decline in entry-level jobs for Americans aged 22-25. Analyzing payroll records of millions of US workers from ADP, the largest US payroll software company, the study found that since 2022, employment in the entry-level roles most exposed to AI, such as customer service, accounting, and software development, has fallen 13%, while employment in experienced positions in the same fields has held steady or kept growing. A growing body of evidence suggests AI is displacing entry-level work.

  6. US House Republicans investigate Wikipedia for liberal bias

    Republicans on the US House Committee on Oversight and Government Reform have opened an investigation into liberal bias in Wikipedia articles. Committee chairman James Comer (R-KY) and Rep. Nancy Mace (R-SC) sent an information request on the matter to Wikimedia Foundation CEO Maryana Iskander on Wednesday. The foundation hosts Wikipedia, whose entries are created and maintained by volunteer editors. Citing various studies and reports, the Republicans claim information on the platform has been manipulated and used to propagandize Western audiences, pointing to a report on antisemitic propaganda from the Anti-Defamation League, a Jewish civil-rights organization, and a report on pro-Russian propaganda from the Atlantic Council think tank.

  7. Analyzing the French Revolution with epidemiology

    In 1789, rumors spread through France like a virus: nobles were said to be hiring brigands to raid villages, destroy crops, and terrorize peasants in an effort to suppress political unrest. None of it was true, but the resulting panic and turmoil, known as the Great Fear, fueled the French Revolution and sparked a debate that still divides historians: were the rumors spread by a deliberate effort to drive revolution, or did they arise spontaneously from genuine fear? Scientists have now applied epidemiological methods to the puzzle. Using historical records and models developed to track epidemics, researchers argue the panic had rational rather than emotional roots. The Great Fear began with a phase of rapid rumor spread that peaked and then subsided within days, a pattern resembling a viral epidemic. From the data, the researchers estimated the basic reproduction number (R0), epidemiologists' measure of how many people one infected individual infects on average in a fully susceptible population. Here R0 was 2; any value above 1 means an epidemic is expected to grow exponentially until it peaks. The social, economic, and political traits of the places most receptive to the rumors reveal the rationality behind their spread. Provinces where destroying land registers would strip feudal lords of property titles, for example, were more susceptible to the rumors than others, suggesting the panic was deliberately targeted at regions where burning the registers would have the gravest consequences. Towns with higher literacy were also likelier to experience the Great Fear than less literate areas, contradicting the notion that the rumors were mostly spread by ignorant, emotion-driven peasants. The researchers conclude the rumor spreading was likely deliberate.
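
The role of R0 in that estimate is easy to see with generation-by-generation arithmetic: if every case infects R0 others in a fully susceptible population, generation g carries R0^g expected cases, so any R0 above 1 gives exponential growth until susceptibles run out. This is the textbook branching-process approximation, not the researchers' actual model.

```python
def expected_cases_per_generation(r0, generations, seed=1):
    """Expected new cases in each rumor 'generation' under a simple
    branching-process approximation: generation g has seed * r0**g cases."""
    return [seed * r0 ** g for g in range(generations + 1)]
```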

  8. Open-source projects are usually maintained by one person

    A popular Node.js library used by organizations including the Pentagon is maintained by Denis Malinochkin, a Yandex employee living in Moscow, prompting a warning from a US security firm. For open source, though, this is nothing new: the vast majority of projects, popular or not, are maintained by a single person. Take the NPM (Node Package Manager) ecosystem: among packages with more than 1 million downloads, the split between single-maintainer and multi-maintainer projects is roughly 50/50; among packages with more than 1 billion downloads, 1 is maintained by one person and 9 by several. That is the reality of the open-source community: projects are typically one-person efforts, even popular ones, and a single person often maintains multiple projects. A Russian maintaining a popular library means little by itself; anyone planning a supply-chain attack to plant a backdoor would not present as Russian, they would use an alias like Jia Tan (the pseudonym of the XZ backdoor author).

  9. Nothing becomes the latest vendor caught using stock photos to demo a phone camera

    Nothing is the latest vendor caught using stock photos to showcase a phone camera's capabilities. The five "community photos" Nothing used to demonstrate the Phone 3's camera were all licensed images from the stock site Stills, shot in 2023 on cameras other than the Phone 3. The Phone 3 series launched in March 2025, and the photos' EXIF data confirms they were taken before the launch. Nothing co-founder Akis Evangelidis said the photos were test placeholders that the team forgot to replace before going live.

  10. Words popularized by ChatGPT are increasingly common in everyday speech

    Nearly three years after ChatGPT's release, words characteristic of large language models are increasingly common in everyday conversation. Florida State University researchers have posted a preprint on arXiv, "Model Misalignment and Language Change: Traces of AI-Associated Language in Unscripted Spoken English". Analyzing 22.1 million spoken words, including conversations from tech-related podcasts, they found that LLM-favored words have appeared ever more frequently in everyday speech since ChatGPT's release. Use of the AI-favored word "underscore", for instance, rose significantly while its synonym "accentuate" did not; other AI-favored words such as "delve", "intricate", "surpass", "boast", "meticulous", "strategically", and "garner" show similar patterns. The researchers say we are not merely using AI; AI buzzwords are working their way into everyday conversation, raising concerns about a "seep-in effect". Language is humanity's most powerful communication medium, they note, and understanding how AI shapes it matters.

  11. Three TSMC engineers indicted for stealing 2nm process secrets

    Taiwanese prosecutors have indicted three engineers, 陈力铭, 吴秉骏, and 戈一平, on charges including the extraterritorial use of trade secrets involving national core technologies under the National Security Act, seeking sentences of up to 14 years. TSMC said it maintains zero tolerance for any violation of trade-secret protections or harm to the company's interests, handling such cases with utmost severity and pursuing them to the end. Of the three, 陈力铭 had already moved to a subsidiary of Japan's Tokyo Electron; the former TSMC employee allegedly asked TSMC technical staff to hand over trade secrets in order to boost Tokyo Electron's standing as an equipment supplier for the next-generation 2nm process TSMC is developing.

  12. India blocks Sci-Hub

    A New Delhi court last week granted a request from three major journal publishers, Elsevier, Wiley, and the American Chemical Society, ordering telecom companies to block access to Sci-Hub and Sci-Net in India within three days. Indian users will now only be able to reach Sci-Hub via VPNs and similar tools. Sci-Hub founder Alexandra Elbakyan explained why the platform stopped posting new papers in 2022: most university libraries rolled out two-factor authentication, so it could no longer log in automatically with shared student or researcher credentials to download new papers. To obtain new papers she built Sci-Net, a blockchain-based platform launched in April 2025, which quickly drew complaints from journal publishers. She denounced the Indian judiciary for ignoring the needs of Indian students and researchers for the benefit of a few wealthy foreign companies.

  13. "K-Pop: Demon Hunters" becomes Netflix's most-watched film ever

    Since premiering in June, "K-Pop: Demon Hunters" has become the most-watched film in Netflix history, and its song "Golden" has topped the Billboard Hot 100. The story is simple: in ancient times demons roamed, feeding on human souls, until three women, gifted singers and demon hunters, used their songs to raise a magical barrier, called the Honmoon, that trapped the demons behind it. Generation after generation of hunters reinforced the Honmoon with song, until the day it would turn golden and seal the demons away for good. But the current hunter-singers Rumi, Mira, and Zoey face a final counterattack from the great demon Gwi-Ma: a K-pop boy band made up of five demons. "Demon Hunters" has logged over 236 million views, overtaking "Red Notice" for the top spot. It had a wide North American theatrical release last weekend, becoming the first Netflix film to top the North American box office. Netflix's other top-ten most-viewed films include Carry-On, Don't Look Up, The Adam Project, Bird Box, Back in Action, Leave the World Behind, The Gray Man, and Damsel. The film was produced by Sony Pictures Animation; talks about a sequel are at an early stage.

  14. Word v2509 for Windows saves documents to the cloud by default

    Microsoft announced that Word v2509 for Windows will save documents to the cloud by default. Unless users turn the feature off, new documents will be stored automatically to OneDrive or another cloud storage service, with filenames carrying a date and time. Microsoft product manager Raul Munoz said any new content users create will be saved automatically to OneDrive or their preferred cloud destination, citing benefits such as never losing progress, access from anywhere, easy collaboration, and stronger security and compliance. The obvious question is user privacy: users who don't mind storing documents in the cloud need do nothing, while those who do must change the default themselves.

  15. GMP project reports AMD CPU burnout incidents

    The GNU Multiple Precision Arithmetic Library (GMP) project disclosed that it may be incompatible with AMD's Ryzen 9000-series CPUs. Over the past six months, two Ryzen 9950X CPUs have burned out during testing, on machines that did not use the ASRock motherboards widely covered in the media. The first incident, in February 2025, involved an ASUS Prime B650M-K motherboard with a Noctua NH-U9S cooler; the second, on August 24, 2025, an ASUS Prime B650M-A WIFI II with the same cooler model. Both occurred while the CPU was at maximum load running hand-written asm loops. The two CPUs burned out in much the same way, with a discolored area of roughly 25 mm² found on the pin side. The developers say they do not know the cause; they did no overclocking or overvolting, and the previous-generation Ryzen 7950X ran the same workload for longer without issue.