OrangeBot.AI Digest — 2026-05-23
90 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- It's time to talk about my writerdeck (veronicaexplains.net)
- Green card seekers must leave U.S. to apply, Trump administration says (www.nytimes.com)
- Texas woman arrested for Facebook post about town water quality (reclaimthenet.org)
- My two-part desk setup (2025) (arslan.io)
- Italy moves to Airbus A330 tankers (www.euronews.com)
- The Art of Money Getting (kk.org)
- The FBI Wants 'Near Real-Time' Access to US License Plate Readers (www.wired.com)
- Oura says it gets government demands for user data (this.weekinsecurity.com)
- On The <dl> (2021) (benmyers.dev)
- I Miss Terry Pratchett (www.mahl.me)
- 80386 Microcode Disassembled (www.reenigne.org)
- US tech firms share Dutch regulator officials' names with Senate (www.dutchnews.nl)
- BambuStudio has been violating PrusaSlicer AGPL license since their fork (xcancel.com)
- Rubish: A Unix shell written in pure Ruby (github.com)
- Experience: We found a baby on the subway – now he's our 26-year-old son (www.theguardian.com)
GitHub Trending(15)
- Lum1104 / Understand-Anything
- anthropics / claude-plugins-official
- colbymchenry / codegraph
- rohitg00 / ai-engineering-from-scratch
- Fincept-Corporation / FinceptTerminal
- multica-ai / andrej-karpathy-skills
- dotnet / skills
- ChromeDevTools / chrome-devtools-mcp
- mukul975 / Anthropic-Cybersecurity-Skills
- presenton / presenton
- multica-ai / multica
- trimstray / the-book-of-secret-knowledge
- odoo / odoo
- NVlabs / LongLive
- yt-dlp / yt-dlp
Product Hunt(15)
- Finderlock
Lock Mac files in Finder with Touch ID & AES-256
- RetroMac
Turn your Mac into a time machine.
- Google Antigravity CLI
Run coding agents directly from your terminal
- SignalLEMO - Ai Outreach Made Simple
AI-powered lead outreach for field service contractors
- Spantop
Turn any Mac into a real second monitor
- Vibedock
Toggle Claude Code MCP servers from your menu bar
- Memdex
Turn every AI conversation into reusable local memory
- Forsy
Capture and sell your AI agent workflow data
- Bulkmark
Transform your Twitter/X Bookmarks into real knowledge
- Area Contrast Checker
Drag, Select, Know. A new way to check A11y contrast
- note.md
Local-first markdown based workspace for research writings
- Coca 2.0
Keep Your Mac and Apps Awake!
- Kosshi
Simple, fast outliner for Mac and iPhone.
- Command A+
Cohere’s open enterprise workhorse
- WordPress 7.0
Introducing AI tools, new admin experience & design controls
Hugging Face(15)
- DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
Reinforcement learning from verifiable rewards (RLVR) has emerged as a central technique for improving the reasoning capabilities of large language models. Despite its effectiveness, how response-level rewards translate into token-level probability changes remains poorly understood. We introduce a discriminator view of RLVR updates, showing that the policy-gradient update direction implicitly acts as a linear discriminator over token-gradient vectors and thereby determines which token probabilities are increased or decreased during learning. Under standard sequence-level RLVR, this discriminator is constructed from positive- and negative-side centroids formed by advantage-weighted averaging of token-gradient vectors. However, such centroid construction can be dominated by shared high-frequency patterns, such as formatting tokens, diluting sparse yet discriminative directions that better distinguish high-reward responses from low-reward ones. To address this limitation, we propose DelTA, a discriminative token credit assignment method that estimates token coefficients to amplify side-specific token-gradient directions and downweight shared or weakly discriminative ones. These coefficients reweight a self-normalized RLVR surrogate, making the effective side-wise centroids more contrastive and thereby reshaping the RLVR update direction. On seven mathematical benchmarks, DelTA outperforms the strongest same-scale baselines by 3.26 and 2.62 average points on Qwen3-8B-Base and Qwen3-14B-Base, respectively. Additional results on code generation, a different backbone, and out-of-domain evaluations further demonstrate the generalization ability of DelTA.
- TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation
Public transit route planning traditionally depends on structured map infrastructure and complex routing engines, and no existing dataset supports training models to bypass this dependency. We present TransitLM, a large-scale dataset of over 13 million transit route planning records from four Chinese cities covering 120,845 stations and 13,666 lines, released as a continual pre-training corpus and benchmark data for three evaluation tasks with complementary metrics. Experiments show that an LLM trained on TransitLM produces structurally valid routes at high accuracy and implicitly grounds arbitrary GPS coordinates to appropriate stations without any explicit mapping. These results demonstrate that transit route planning can be learned entirely from data, enabling end-to-end, map-free route generation directly from origin-destination information. The dataset and benchmark are available at https://huggingface.co/datasets/GD-ML/TransitLM, with evaluation code at https://github.com/HotTricker/TransitLM.
- Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?
Multimodal Large Language Models (MLLMs) are increasingly deployed in human-facing roles where personality perception is critical, yet existing benchmarks evaluate this capability solely on numerical Big Five score prediction, leaving open whether models truly perceive personality through behavioral understanding or merely prejudge through superficial pattern matching. We address this gap with three contributions. (i) A new task: we formalize Grounded Personality Reasoning (GPR), which requires MLLMs to anchor each Big Five rating in observable evidence through a chain of rating, reasoning, and grounding. (ii) A new dataset: we release MM-OCEAN (1,104 videos, 5,320 MCQs), produced by a multi-agent pipeline with human verification, with timestamped behavioral observations, evidence-grounded trait analyses, and seven categories of cue-grounding MCQs. (iii) Benchmark and analysis: we design a three-tier evaluation (rating, reasoning, grounding) plus four sample-level failure-mode metrics: Prejudice Rate (PR), Confabulation Rate (CR), Integration-failure Rate (IR), and Holistic-grounding Rate (HR), and benchmark 27 MLLMs (13 closed, 14 open). The analysis uncovers a striking Prejudice Gap: across the field, 51% of correct ratings are not grounded in retrieved cues, and the Holistic-Grounding Rate spans only 0-33.5%. These findings expose a disconnect between getting the right score and reasoning for the right reason, charting a roadmap for grounded social cognition in MLLMs.
- π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows
The rise of personal assistant agents, e.g., OpenClaw, highlights the growing potential of large language models to support users across everyday life and work. A core challenge in these settings is proactive assistance, since users often begin with underspecified requests and leave important needs, constraints, or preferences unstated. However, existing benchmarks rarely evaluate whether agents can identify and act on such hidden intents before they are explicitly stated, especially in sustained multi-turn interactions where user needs emerge gradually. To address this gap, we introduce π-Bench, a benchmark for proactive assistance comprising 100 multi-turn tasks across 5 domain-specific user personas. By incorporating hidden user intents, inter-task dependencies, and cross-session continuity, π-Bench evaluates agents' ability to anticipate and address user needs over extended interactions, jointly measuring proactivity and task completion in long-horizon trajectories that better reflect real-world use. Experiments show (1) proactive assistance remains challenging, (2) a clear distinction between task completion and proactivity, and (3) the value of prior interaction for proactive intent resolution in later tasks.
- Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps
Long-context inference in large language models is bottlenecked by the quadratic cost of full attention. Existing efficient alternatives often rely either on native sparse training or on heuristic token eviction, creating an undesirable trade-off among efficiency, training cost, and accuracy. In this work, we show that full-attention LLMs are already intrinsically sparse and can be transformed into highly sparse models with only minimal adaptation. Our approach is built on three observations: (1) only a small subset of attention heads truly requires full long-context processing; (2) long-range retrieval is governed primarily by a low-dimensional subspace, allowing relevant tokens to be retrieved efficiently with a 16-dimensional indexer; and (3) the useful token budget is strongly query-dependent, making dynamic top-p selection more suitable than fixed top-k sparsification. Based on these insights, we propose RTPurbo, which retains the full KV cache only for retrieval heads and introduces a lightweight token indexer for sparse attention. By exploiting the model's intrinsic sparsity, RTPurbo achieves sparsification with only a few hundred training steps. Experiments on long-context benchmarks and reasoning tasks show that RTPurbo preserves near-lossless accuracy while delivering substantial efficiency gains, including up to a 9.36× prefill speedup at 1M context and about a 2.01× decode speedup. These results suggest that strong sparse inference can be obtained from standard full-attention training without expensive native sparse pretraining.
- ACC: Compiling Agent Trajectories for Long-Context Training
Recent development of agents has renewed demand for long-context reasoning capacity of LLMs. However, training LLMs for this capacity requires costly long-document curation or heuristic context synthesis. We observe that agents produce massive trajectories when solving problems, invoking tools and receiving environment observations across many turns. The evidence needed to answer the original question is thus scattered throughout these turns, requiring integration of distant context segments. Nevertheless, standard agent SFT masks tool responses and only trains turn-level tool selection, creating a supervision blind spot where these scattered signals go unused. We propose Agent Context Compilation (ACC), which converts trajectories from search, software engineering, and database querying agents into long-context QA pairs that combine the original question with tool responses and environment observations gathered across multiple turns, training the model to answer directly without tool use. This makes the dependencies between the question and the evidence explicit, enabling direct supervision of long-context reasoning over distant segments without additional annotation. ACC is a simple but effective approach that can be combined with any existing long-context extension or training method, providing scalable supervised fine-tuning data. We validate ACC on long-range dependency modeling tasks through MRCR and GraphWalks, challenging benchmarks requiring cross-turn coreference resolution and graph traversal over extended contexts. Training Qwen3-30B-A3B with ACC achieves 68.3 on MRCR (+18.1) and 77.5 on GraphWalks (+7.6), results comparable to Qwen3-235B-A22B, while preserving general capabilities on GPQA, MMLU-Pro, AIME, and IFEval. Further mechanism analysis reveals that the ACC-trained model exhibits task-adaptive attention restructuring and expert specialization.
- PhysX-Omni: Unified Simulation-Ready Physical 3D Generation for Rigid, Deformable, and Articulated Objects
Simulation-ready physical 3D assets have emerged as a promising direction owing to their broad applicability in downstream tasks. However, most existing 3D generation methods either neglect physical properties or are limited to a single asset category, e.g., rigid, deformable, or articulated objects. To address these limitations, we introduce PhysX-Omni, a unified framework for simulation-ready physical 3D generation across diverse asset types. Specifically, we develop a novel and efficient geometry representation tailored for Vision-Language Models, which directly encodes high-resolution 3D structures without compression, significantly improving generation performance. In addition, we construct the first general simulation-ready 3D dataset, PhysXVerse, covering diverse indoor and outdoor categories. Furthermore, to comprehensively and flexibly evaluate both generative and understanding capabilities in the wild, we propose PhysX-Bench, which encompasses six key attributes: geometry, absolute scale, material, affordance, kinematics, and function description. Extensive experiments with conventional metrics and PhysX-Bench show that PhysX-Omni performs strongly in both generation and understanding. Moreover, additional studies further validate the potential of PhysX-Omni for applications in simulation-ready scene generation and robotic policy learning. We believe PhysX-Omni can significantly advance a wide range of downstream applications, particularly in embodied AI and physics-based simulation.
- LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning
Joint audio-visual reasoning is essential for omnimodal understanding, yet current multimodal large language models (MLLMs) still struggle when reasoning requires fine-grained evidence from both modalities. A central limitation is that explicit text-based chain-of-thought (CoT) compresses continuous audio-visual signals into discrete tokens, weakening temporal grounding and shifting intermediate reasoning toward language priors. We argue that a unified latent space is a better medium for such reasoning because it preserves dense sensory information while remaining compatible with autoregressive generation. Based on this insight, we propose LatentOmni, a cross-modal reasoning framework that interleaves textual reasoning with audio-visual latent states. LatentOmni introduces feature-level supervision to align latent reasoning states with task-relevant sensory features and uses Omni-Sync Position Embedding (OSPE) to maintain temporal consistency between latent audio and visual states. We further construct LatentOmni-Instruct-35K, a dataset of audio-visual interleaved reasoning trajectories for supervising latent-space reasoning. Comprehensive evaluation across multiple audio-visual reasoning benchmarks demonstrates that LatentOmni achieves the best performance among the evaluated open-source models and consistently outperforms the Explicit Text CoT baseline, supporting latent-space joint reasoning as a promising path toward stronger omnimodal understanding.
- Forecasting Scientific Progress with Artificial Intelligence
Artificial intelligence (AI) is increasingly embedded in scientific discovery, yet whether it can anticipate scientific progress remains unclear. To study this question, we introduce a temporally grounded evaluation framework for forecasting scientific progress under controlled knowledge constraints. We present CUSP (Cutoff-conditioned Unseen Scientific Progress), a multi-disciplinary and event-level benchmark that evaluates scientific forecasting in AI systems through feasibility assessment, mechanistic reasoning, generative solution design, and temporal prediction. Across 4,760 scientific events, we observe systematic and domain-dependent limitations in current frontier models. While models can identify plausible research directions from competing candidates, they fail to reliably predict whether scientific advances will be realized and systematically misestimate when they will occur. Performance is highly heterogeneous across domains, with the timing of AI progress more predictable than advances in biology, chemistry, and physics. Performance is largely insensitive to whether events occur before or after the training cutoff, suggesting these limitations cannot be explained solely by knowledge exposure in training data. Under controlled information access, additional pre-cutoff knowledge improves performance but does not close the gap to full-information settings, which becomes more pronounced for high-citation advances. Models also exhibit systematic overconfidence and strong response biases, indicating unreliable uncertainty estimation. Taken together, current AI systems fall short as predictive tools for scientific progress. Access to prior knowledge does not translate into reliable forecasting, and performance benefits more from post-event information than from forward-looking prediction.
- WorldKV: Efficient World Memory with World Retrieval and Compression
Autoregressive video diffusion models have enabled real-time, action-conditioned world generation. However, sustaining a persistent world, where revisiting a previously seen viewpoint yields consistent content, remains an open problem. Full KV-cache attention preserves this consistency but breaks real-time constraints: memory footprint and attention cost grow linearly with rollout length. Sliding window inference restores throughput but discards long-term consistency. We propose WorldKV, a training-free framework with two components: World Retrieval and World Compression. World Retrieval stores evicted KV-cache chunks in GPU/CPU memory and selectively retrieves scene-relevant chunks via camera/action correspondence, inserting them back into the native attention window without re-encoding. World Compression prunes redundant tokens within each chunk via key-key similarity to an anchor frame, halving per-chunk storage to fit 2x more history under a fixed budget. On Matrix-Game-2.0 and LingBot-World-Fast, WorldKV matches or exceeds full-KV memory fidelity at roughly 2x the throughput, and is competitive with memory-trained baselines without any fine-tuning. Project Page: https://cvlab-kaist.github.io/WorldKV/
- Spreadsheet-RL: Advancing Large Language Model Agents on Realistic Spreadsheet Tasks via Reinforcement Learning
Spreadsheet systems (e.g., Microsoft Excel, Google Sheets) play a central role in modern data-centric workflows. As AI agents grow increasingly capable of automating complex tasks, such as controlling computers and generating presentations, building an AI-driven spreadsheet agent has emerged as a promising research direction. Most existing spreadsheet agents rely on specialized prompting over general-purpose LLMs; while this design shows potential on simple spreadsheet operations, it struggles to manage the complex, multi-step workflows typical of real-world applications. We introduce Spreadsheet-RL, a reinforcement learning (RL) fine-tuning framework designed to train specialized spreadsheet agents within a realistic Microsoft Excel environment. Spreadsheet-RL features an automated pipeline for scalable collection of paired start-goal spreadsheets from online forums, as well as domain-specific evaluation tasks in areas such as finance and supply chain management, which we compile into the new Domain-Spreadsheet benchmark dataset. It also includes a Spreadsheet Gym environment designed for multi-turn RL: Spreadsheet Gym exposes extensive Excel functionality through a Python sandbox, along with a refined harness that incorporates a comprehensive tool set and carefully designed tool-routing rules for spreadsheet tasks. Through comprehensive experiments, we show that Spreadsheet-RL substantially enhances AI agents' performance on both general and domain-specific spreadsheet tasks: it improves Qwen3-4B-Thinking-2507's Pass@1 on SpreadsheetBench from 12.0% to 23.4%, and raises Pass@1 from 8.4% to 17.2% on our curated Domain-Spreadsheet dataset. These results highlight Spreadsheet-RL's strong potential for generalization and real-world adoption in spreadsheet automation, and broadly, its promise for advancing LLM-based interactions with data interfaces in everyday work.
- SEGA: Spectral-Energy Guided Attention for Resolution Extrapolation in Diffusion Transformers
Diffusion transformers (DiTs) have emerged as a dominant architecture for text-to-image generation, yet their performance drops when generating at resolutions beyond their training range. Existing training-free approaches mitigate this by modifying inference-time attention behavior, often through Rotary Position Embeddings (RoPE) extrapolation combined with attention scaling. However, these strategies apply a uniform and content-agnostic scaling across RoPE components with distinct frequency characteristics, inducing a trade-off between preserving global structure and recovering fine detail. We introduce SEGA, a training-free method that dynamically scales attention across RoPE components according to the latent's spatial-frequency structure at each denoising step. This adaptive scaling improves both structural coherence and fine-detail fidelity. Experiments show that SEGA consistently improves high-resolution synthesis across multiple target resolutions, outperforming state-of-the-art training-free baselines.
- SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation
Multimodal Large Language Models (MLLMs) have made rapid progress in spatial intelligence, yet existing spatial reasoning benchmarks largely assume pristine visual inputs and overlook the degradations that commonly occur in real-world deployment, such as motion blur, low light, adverse weather, lens distortion, and compression artifacts. This raises a fundamental question: how robust is the spatial intelligence of current MLLMs when visual observations are imperfect? To answer this question, we introduce SpaceDG, the first large-scale dataset for degradation-aware spatial understanding. It is constructed with a physically grounded degradation synthesis engine that embeds the degradation formation process into 3D Gaussian Splatting (3DGS) rendering, enabling realistic simulation of nine degradation types. The resulting dataset contains approximately 1M QA pairs from nearly 1,000 indoor scenes. We further introduce SpaceDG-Bench, a human-verified benchmark with 1,102 questions spanning 11 reasoning categories and 9 visual degradation types, yielding over 10K VQA instances. Evaluating 25 open- and closed-source MLLMs reveals that visual degradations consistently and substantially impair spatial reasoning, exposing a critical robustness gap. Finally, we show that finetuning on SpaceDG markedly improves degradation robustness and can even surpass human performance under degraded conditions without any performance drop on clean images, highlighting the promise of degradation-aware training for robust spatial intelligence.
- FlowLong: Inference-time Long Video Generation via Manifold-constrained Tweedie Matching
Extending the generation horizon of video diffusion models to long sequences remains a long-standing and important challenge. Existing training-free approaches fall into two categories: extensions of bidirectional models, which are tightly coupled to specific architectures and suffer from quality degradation over long horizons, and autoregressive models, which accumulate drift errors due to exposure bias and tend to produce repetitive motion patterns. To address these issues, we propose a novel but simple inference-time approach for long video generation that is architecture-agnostic and requires no additional training. Our method generates long videos via overlapping sliding windows, where predicted clean samples from adjacent windows are blended via Tweedie matching to enforce both manifold constraint and temporal consistency across overlap regions. Stochastic early-phase sampling then synchronizes per-window trajectories by injecting fresh noise after each Tweedie matching correction in the high-noise phase, before transitioning to deterministic ODE sampling to preserve fine-grained visual fidelity. Applied to various video generation models, our method generates videos several times longer than the native window length while outperforming both training-free and autoregressive baselines in temporal consistency and visual quality, and further extends to audio-video joint generation and text-to-3DGS without any fine-tuning.
- Unsupervised Process Reward Models
Process Reward Models (PRMs) are a powerful mechanism for steering large language model reasoning by providing fine-grained, step-level supervision. However, this effectiveness comes at a significant cost: PRMs require expert annotations for every reasoning step, making them costly and difficult to scale. Here, we propose a method for training unsupervised PRMs (uPRM) that requires no human supervision, neither at the level of step-by-step annotations nor through ground-truth verification of final answers. The key idea behind our approach is to define a scoring function, derived from LLM next-token probabilities, that jointly assesses candidate positions of first erroneous steps across a batch of reasoning trajectories. We demonstrate the effectiveness of uPRM across diverse scenarios: (i) uPRM achieves up to 15% absolute accuracy improvements over the LLM-as-a-Judge in identifying first erroneous steps on the ProcessBench dataset; (ii) as a verifier for test-time scaling, uPRM performs comparably to supervised PRMs and outperforms the majority voting baseline by up to 6.9%, and (iii) when used as a reward signal in reinforcement learning, uPRM enables more robust policy optimization throughout training compared to a supervised PRM trained using ground-truth labels. Overall, our results open a path toward scalable reward modeling for complex reasoning tasks.
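The RTPurbo abstract above ("Full Attention Strikes Back") argues that the useful token budget depends on the query, so dynamic top-p selection fits better than a fixed top-k. A minimal Python sketch of that contrast follows. It assumes toy attention scores and a plain softmax; the function names and example scores are hypothetical and are not taken from the paper's code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_indices(scores, k):
    """Fixed budget: always keep the k highest-scoring tokens."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])

def top_p_indices(scores, p):
    """Dynamic budget: keep the smallest set of tokens whose
    softmax mass reaches at least p."""
    probs = softmax(scores)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return sorted(kept)

# Hypothetical scores: one query attends sharply to a single token,
# the other spreads its attention almost evenly.
peaked = [8.0, 0.1, 0.0, 0.2, 0.1, 0.0]
flat = [1.0, 0.9, 1.1, 1.0, 0.95, 1.05]

for name, scores in (("peaked", peaked), ("flat", flat)):
    print(name, "top-k(3):", top_k_indices(scores, 3),
          "top-p(0.9):", top_p_indices(scores, 0.9))
```

With these toy inputs, top-p keeps one token for the peaked query and all six for the flat one, while top-k keeps three in both cases. This is only an illustration of the selection rule, not of the indexer or KV-cache handling the abstract describes.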
Techmeme(15)
- For publishers, pirated audiobooks made with AI on YouTube are a growing issue: removal is cumbersome, and some are hiring tech companies to take them down (Alexandra Alter/New York Times)
Illegal, synthetically narrated copies of “The Hunger Games,” hit self-help books and everything in between are increasingly common on the platform.
- How Anthropic's ongoing discussions with the Vatican about ethics and AI led to Christopher Olah being invited to Pope Leo's unveiling of an encyclical on AI (Jack Jenkins/RNS)
(RNS) Pope Leo XIV's new encyclical on AI is set to be released Monday (May 27), with Chris Olah, a co-founder of Anthropic, at his side.
- Sources: the ECB warned EU finance ministers that proposals to issue more euro stablecoins could reduce bank lending and make controlling interest rates harder (Reuters)
The European Central Bank warned European Union finance ministers on Friday that proposals to issue more euro stablecoins …
- Jensen Huang urged Super Micro to tighten up compliance after Taiwan detained three people for allegedly trying to export servers with Nvidia chips to China (Debby Wu/Bloomberg)
Nvidia Corp. Chief Executive Officer Jensen Huang urged Super Micro Computer Inc. to tighten up on compliance after Taiwan detained three people …
- Dell says it has 5,000 clients for its AI Factory, a product line of servers with Nvidia chips, software, and services, including 1,000 new clients last quarter (Dina Bass/Bloomberg)
Dell Technologies Inc. said it added 1,000 customers for a key AI product line in the past quarter as the company tries …
- Delivery Hero says it received a takeover offer from Uber for €33 per share, a discount of 1.76% from Delivery Hero's close on Friday (Anusha Shah/Reuters)
German food delivery service Delivery Hero (DHER.DE) confirmed it had received a takeover offer from rival Uber (UBER.N) valuing the company at 33 euros …
- A look at a copy of the AI EO that President Trump was expected to sign on May 21; the unsigned EO emphasized that government AI reviews would be voluntary (Sophia Cai/Politico)
The draft also includes language aimed at bad actors. It directs the attorney general to enforce the Computer Fraud and Abuse Act and …
- A profile of TP-Link, whose share of the US consumer router market grew from 10% to 60%+ between 2019 and 2025, as it seeks to rebut national security concerns (Noah Berman/The Wire China)
When Jeffrey Chao was studying for his master's in computer science in the early 1990s, he would sometimes spend so many hours …
- As the US House probes Airbnb's use of Chinese AI models, CEO Brian Chesky says the company is not sharing data with Chinese firms and uses open-source models (Natalie Lung/Bloomberg)
Airbnb Inc. Chief Executive Officer Brian Chesky defended his company's use of Chinese artificial intelligence models …
- Q&A with Sundar Pichai on Google Search's future, Google's place in the AI race, public skepticism of AI, TPUs, being "behind the frontier" in coding, more (New York Times)
After a busy Google I/O, the company's chief executive sits down with the hosts of “Hard Fork” …
- FOIA lawsuit documents show hackers who breached SolarWinds potentially had access to all "treasury.gov" email addresses from July 6, 2020 to October 12, 2020 (Jordan Robertson/Bloomberg)
New details about the 2020 incident. Six years after hackers allegedly backed by Russia's intelligence services broke …
- Cloudflare CEO Matthew Prince says AI won't replace builders or sellers, but it will affect middle managers, operations jobs, and other "measuring" positions (Matthew Prince/Wall Street Journal)
The company has less need for middle managers, operations jobs and other ‘measuring’ positions.
- Samsung's bonus deal is fueling employee resentment over a 100x payout gap between memory division staff and those making smartphones, TVs, and home appliances (Yoolim Lee/Bloomberg)
Yoolim Lee / Bloomberg : Samsung's bonus deal is fueling employee resentment over a 100x payout gap between memory division staff and those making smartphones, TVs, and home appliances — Samsung Electronics Co. staved off a potentially catastrophic strike this week, reaching a tentative deal with leaders …
- Fresha, a London-based beauty and wellness booking marketplace, raised $80M from KKR's growth equity arm at a $1B+ valuation, bringing its total raised to $285M (Dominic-Madori Davis/TechCrunch)
Dominic-Madori Davis / TechCrunch : Fresha, a London-based beauty and wellness booking marketplace, raised $80M from KKR's growth equity arm at a $1B+ valuation, bringing its total raised to $285M — Beauty and wellness booking marketplace Fresha has announced an $80 million investment from KKR's Next Generation Technology Growth fund …
- Filing: Zoom's stake in Anthropic is worth ~$1.27B based on a February round which valued Anthropic at $380B; Zoom invested an additional $46M in recent months (Brody Ford/Bloomberg)
Brody Ford / Bloomberg : Filing: Zoom's stake in Anthropic is worth ~$1.27B based on a February round which valued Anthropic at $380B; Zoom invested an additional $46M in recent months — Zoom Communications Inc., the videoconferencing company, has netted about $1 billion on an investment it made in artificial intelligence startup Anthropic PBC in early 2023.
Solidot(15)
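- Zuckerberg defends the practice of monitoring employees
The labor advocacy group More Perfect Union has released a six-minute recording of Mark Zuckerberg answering employee questions about device monitoring late last month. Meta notified employees last month that it would use a monitoring tool called Model Capability Initiative to track their mouse clicks and keystrokes, with the aim of collecting data to train AI models. In his answer, Zuckerberg defended the monitoring, saying that if you want to train a model's coding ability, having internal employees build tools or solve tasks in order to teach the model how to write code lets the model make a leap in that ability. Competitors in the industry cannot match that pace, he said, because their companies do not have thousands upon thousands of top engineers. "That's just one example. Another thing our systems need to be very good at is 'using a computer.' And the most effective way to get a system to use a computer proficiently is to have it watch how extremely smart people use computers. That is basically the core of what we are doing right now." Zuckerberg said the company will not surveil employees' work behavior and that MCI data will not be used in performance evaluations. Because of the EU's GDPR, Meta employees based in Europe reportedly do not have to take part in the program. Meta is not the only technology company obtaining AI training data from its employees; Microsoft and xAI are also using internal staff to generate and refine training datasets.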
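- Valorant's anti-cheat tool restricts cheaters' use of DMA cheats
Non-gamers may not know that today's high-end cheating tools have moved into hardware and are expensive, possibly costing far more than an entire PC. Such tools are known as DMA hardware cards or DMA cheats, and they use hardware to bypass traditional game anti-cheat systems. Game developers are working to counter DMA cheats, and the latest example is Riot Games. After its latest update, Vanguard, the kernel-level anti-cheat system used by its online FPS Valorant, can force-enable the IOMMU to block DMA cheats, causing the DMA hardware to stop working; restoring it requires reinstalling the operating system. Vanguard can now block most DMA card firmware that disguises itself as a SATA or NVMe device. It suddenly triggers an IOMMU restart warning in-game, after which the DMA firmware becomes completely unusable, even when the game is no longer running or has been uninstalled. The only fix is to reinstall Windows. Riot Games mocked cheaters on social media, saying their $6,000 DMA cheats had turned into junk.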
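- Wozniak tells graduates they have actual intelligence
Apple co-founder Steve Wozniak did something other commencement speakers had not managed: when he talked about AI, he drew cheers from the graduates rather than boos. Wozniak said, "You have AI: actual intelligence." He added, "Going deep into my views on AI would take a long time, but we have always been trying to create a brain. Could we copy a program a trillion times and make it work like a brain? AI is one such attempt." Wozniak looked back on his time at Apple and offered some advice to graduates about to start their careers: "You should try to think differently, and not stick to convention or follow the same path as everyone else. Ask yourself: can I do something different?"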
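- Linus Torvalds on AI
Linux creator Linus Torvalds discussed AI at the Open Source Summit North America. He believes AI tools are reshaping kernel development, but insists that AI is just a good tool and will not fully replace programmers. Torvalds said the number of commits in the last two kernel releases rose by 20%. At first he thought developers were excited by the version number jumping from 6.x to 7.x, but it turned out that AI-assisted coding tools had improved significantly over the past six months. He acknowledged that AI tools lower the barrier for contributors, but said their real impact is social rather than technical; one example is the flood of duplicate bug reports hitting the security mailing list. The kernel has adopted new rules in response. Torvalds also urged security researchers not to disclose exploits prematurely: four privilege-escalation vulnerabilities were recently found in the kernel, but researchers went public before maintainers had even been notified, and he said these people like the attention. He does not think closed source solves security problems; closed source is actually worse, because AI cannot help you fix the bugs. Torvalds said maintenance depends on people rather than code. As the top-level maintainer, his job is not writing code but working with people; he does not use AI to work with people, and he advises others not to either. His own career illustrates the productivity gains that better tools bring programmers: he started out entering machine code by hand, then used an assembler, then a compiler, and now AI-assisted coding. He believes AI is changing programming but has not changed its nature. Developers still need to understand what the tool generated. For any long-running system, "you need to understand not only the instructions but also the end result, because that is the only way you can maintain it over the long term." AI cannot replace human judgment, community norms, and a deep understanding of the system being built. "Software is very complex, and the only really effective way to manage the complexity of complex infrastructure is open source," and AI is just one more tool in the programmer's toolbox.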
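- GitHub faces a fight for survival
Eight years after being acquired by Microsoft, GitHub, the largest code hosting platform, is facing a fight for survival: outages and security problems are frequent, and pressure from competitors keeps growing. Over the past few weeks GitHub suffered several serious outages, and 3,800 internal repositories were stolen after a malicious extension was installed in an employee's VS Code. Current and former GitHub employees described in interviews a company struggling with a lack of leadership and competitive pressure. After CEO Thomas Dohmke left in the summer of 2025, Microsoft did not appoint a new CEO and instead had the leadership team report to CoreAI, which is run by former Meta engineering chief Jay Parikh. Parikh was personally recruited by CEO Satya Nadella to help Microsoft shift toward AI. He is not popular inside the company, and it was he who decided not to appoint a new GitHub CEO. Many GitHub employees followed Dohmke to his new startup, Entire. GitHub executives have also kept leaving over the past few months, including senior vice president Jared Palmer and former chief revenue officer Elizabeth Pemmerl. Current GitHub employees say the company now exists in name only and that everything belongs to Microsoft.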
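- Sergey Brin donates $500,000 to oppose a tax on overpaid CEOs
Google co-founder Sergey Brin, who has moved from Silicon Valley to Nevada, donated $500,000 to a San Francisco political action committee to oppose a proposal known as the "Overpaid CEO Tax", which San Francisco voters will decide on June 2. He had previously donated tens of millions of dollars to oppose a proposed California tax on billionaires, which is expected to go before California voters this November. The "Overpaid CEO Tax" would calculate the ratio of executive pay to ordinary employee pay based on the pay of a company's global workforce. The Chinese Progressive Association, which supports the proposal, said it is necessary to "ensure that the wealthiest corporations pay the taxes they owe."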
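- Meta censors dissidents' accounts at Saudi Arabia's request
Since April 30, 2026, Meta has, at the Saudi government's request, blocked within Saudi Arabia the Facebook accounts of the NGOs ALQST for Human Rights and Democratic Diwan, as well as those of Saudi researcher Abdullah Alaoudh and human rights activist Yahya Assiri. Meta also geo-blocked a scholar's account at the request of the United Arab Emirates. Since March 2026, more than 100 Facebook pages and Instagram accounts have been restricted. Saudi Arabia has also asked X to geo-block the accounts of prominent Saudi activists; X has not yet complied.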
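- Brains removed from the body are being used for drug testing
A day earlier, this brain was still inside a living person. Now, hours after its owner's death, it rests quietly on a small cart. The cart is covered in tubing that pumps liters of blood substitute and other fluids into the organ, delivering oxygen and removing metabolic waste. Most of its core functions are intact, but its electrical activity has been suppressed with anesthetics, leaving the brain in a limbo between life and death. As it metabolizes experimental drugs, sensors record its responses in real time, capturing hundreds of data points on cells, proteins and physiology. After 24 hours it will be cut into hundreds of pieces for deeper study. It is one of more than seven hundred brains that the biotech startup Bexorg has maintained and studied with its brain-sustaining device, BrainEx, to better understand how potential therapies act in brains affected by neurodegenerative diseases such as Parkinson's, Alzheimer's or amyotrophic lateral sclerosis. Bexorg can biopsy the brain to learn how long a drug stays in cells, whether it hits its molecular target, and whether there are any side effects. Bexorg believes its system offers drug-testing conditions closer to reality than lab animals or cells in a dish. The company had previously kept a low profile, but it has recently been scaling up and invited reporters to tour its lab, trying to reassure the public that disembodied brains do not cross ethical lines and carry no risk of regaining consciousness.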
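- Waymo pauses Atlanta service after a driverless car drove into floodwater
Because its driverless cars cannot yet cope with flooded roads, Waymo has paused its robotaxi service in Atlanta. One of Waymo's driverless taxis drove onto a flooded road on Wednesday and was stranded for about an hour. The vehicle has since been towed away. Waymo said it has paused service in Atlanta while it looks for a solution. Waymo had earlier paused service in San Antonio, Dallas and Houston, Texas, because of severe weather. Waymo said the rainfall in Atlanta was so heavy that flooding occurred before the National Weather Service had issued a flash flood watch, warning or advisory.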
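- Phone cases may accumulate drug-resistant bacteria and PFAS
Modern people are almost inseparable from their phones, and the skin of the hands and face is in frequent, prolonged contact with phones and phone cases. You may have noticed that a phone case used for more than half a year at some point quietly turns yellow and sticky, and no amount of wiping restores its original clear shine. According to a study published in the Journal of Hazardous Materials, scientists confirmed that poor hygiene habits and frequent makeup use accelerate the aging of thermoplastic polyurethane (TPU) phone cases, gradually turning them into a "breeding ground" where per- and polyfluoroalkyl substances (PFAS) and opportunistic pathogens accumulate together. A real-world tracking report from the user research firm Dscout found that smartphone users touch their phones an average of 2,617 times a day, and heavy users more than 5,400 times. The research team recruited 30 university student volunteers and ran a 285-day controlled cohort study in real-world conditions. The team observed two typical groups: one with good hygiene habits who rarely used cosmetics, and another with the opposite profile, who used cosmetics frequently and had poor hand hygiene. The results showed that the phone case surfaces of the second group had significantly higher PFAS accumulation than those of the first. In some of the more heavily contaminated samples, surface accumulation of perfluorooctanoic acid (PFOA) reached up to 9.39 micrograms per square centimeter, and perfluorooctanesulfonic acid (PFOS) up to 0.164 micrograms per square centimeter, suggesting that everyday contact may be quietly increasing human exposure to emerging contaminants and potentially pathogenic microbes.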
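- Europe's megalithic societies were genetically related
In the late Neolithic (roughly 4500 to 2800 BCE), megalithic monuments, meaning large stone structures, appeared across Europe. These structures reflect local traditions while also hinting at far-reaching social, cultural or ancestral ties between distant populations. According to a study published in Science, researchers analyzed genomic data from individuals at multiple widely separated megalithic sites in central Europe and found deep and lasting biological connections among them, indicating occasional population movement, intermarriage or cultural exchange across large geographic areas. But the central European megalithic populations lacked close genetic ties with megalithic populations in what is now Britain and in northern Europe. This suggests that the megalithic tradition most likely spread through culture rather than through biological networks.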
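- Trump administration does not want Americans infected with Ebola brought home for treatment
Ebola has broken out again in the Congo, and those confirmed infected or exposed to the virus include American doctors, but last week the Trump administration refused to let them return home for treatment. Peter Stafford, a 39-year-old surgeon, was confirmed infected on Sunday. On Wednesday, Satish Pillai, the US CDC's Ebola incident response manager, said Stafford had been sent to Germany and was in stable condition. His wife, Rebekah Stafford, is also a doctor and was exposed to the virus but has not yet shown symptoms; they and their four children were all sent to Germany. Another doctor, Patrick LaRochelle, belongs to the same Serge mission as the Staffords. He was exposed to the virus, is currently asymptomatic, and has been sent to Prague for monitoring and treatment. His wife and children had been with him in the Congo, but the CDC believes they were not exposed, so they have returned to the United States. According to the latest WHO figures released on Wednesday, there are currently 528 suspected Ebola cases and 132 deaths.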
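- Another air leak occurs in the Russian segment of the International Space Station
NASA has confirmed that the Russian segment of the International Space Station is leaking air again. For the past five years Roscosmos and NASA have been tracking an air leak in the Russian segment. The leaking section is the PrK module between the Progress airlock and the Zvezda service module, and the leak is caused by tiny structural cracks. In January this year NASA announced that, after repeated inspections and sealing work, the internal pressure of the PrK section had stabilized and it was no longer leaking. However, the leak in the PrK section reappeared three weeks ago. NASA said it is coordinating next steps with Roscosmos. The incident has again raised concerns about the station's long-term viability.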
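- Amazon spent $26.6 million on union-busting consultants last year
According to a report by the Economic Policy Institute (EPI), US employers spend more than $1.5 billion a year on anti-union activity. Employers hire consultants and law firms that specialize in union avoidance to provide legal advice, representation and litigation services during union elections and campaigns. US companies spend as much as $442 million a year on anti-union consulting services, and according to documents Amazon filed with the Department of Labor, it spent $26.6 million on anti-union consultants in 2025. Union coverage in the US is currently only 10%, compared with 20.3% in 1983. Yet Gallup polling shows that nearly seven in ten Americans support unions. Because of delay tactics and appeals, US workers need an average of 465 days to reach a first union contract, and in many cases even longer; at Starbucks, for example, workers have still not reached a first contract since the first US store won a union election in 2021.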
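- Google announces more ads in AI Mode
Google announced on Tuesday that the search box will become an AI chatbot dialog box, so its time-tested business model, search advertising, will naturally follow it into AI Mode. On Wednesday Google announced that it will introduce more "helpful ads" in AI Mode. The search giant said it is testing two new ad types that provide details about relevant products and useful guidance. Each will include an independent AI explainer as part of the ad, and all of them will be labeled "Sponsored." The first new type is called "conversational discovery ads", where the ad is the answer; the second is called "Highlighted Answers", which presents highly relevant ads to users as part of a list of recommendations.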