Curated by Shen Huang · 90 stories · ~14 min read
DIGEST · 2026-06-10

OrangeBot.AI Digest — 2026-06-10

90 headlines across 8 sources, aggregated for this day.

Hacker News(15)

  1. πFS (github.com)
  2. Farmer donates land for a park, city sells it for $10M as data center land (www.tomshardware.com)
  3. Anthropic's model naming, extrapolated (samwilkinson.io)
  4. Claude Desktop spawns 1.8 GB Hyper-V VM on every launch, even for chat-only use (github.com)
  5. DiffusionGemma: 4x Faster Text Generation (blog.google)
  6. US Consumer Price Index up 4.2% (www.bls.gov)
  7. I'm Eric Ries, author of "The Lean Startup" and new book "Incorruptible" – AMA
  8. PgDog is funded and coming to a database near you (pgdog.dev)
  9. Ask HN: Are most corporate SWE jobs performative?
  10. Building an HTML-first site doubled our users overnight (mohkohn.co.uk)
  11. Reviving Papers with Code (paperswithcode.co)
  12. Mercedes‑Benz starts large‑scale production of electric axial flux motor (media.mercedes-benz.com)
  13. AWS Bedrock to require sharing data with Anthropic for Mythos and future models
  14. Chrome is looking to permanently drop MV2 extension (www.neowin.net)
  15. Vibe coding my way to a healthy family: Introducing Gamow Labs (www.ddmckinnon.com)

GitHub Trending(15)

  1. addyosmani / agent-skills
  2. phuryn / pm-skills
  3. refactoringhq / tolaria
  4. mvanhorn / last30days-skill
  5. soxoj / maigret
  6. x1xhlol / system-prompts-and-models-of-ai-tools
  7. obra / superpowers
  8. masterking32 / MasterDnsVPN
  9. harry0703 / MoneyPrinterTurbo
  10. maziyarpanahi / openmed
  11. luongnv89 / claude-howto
  12. activeloopai / hivemind
  13. ruvnet / RuView
  14. roboflow / supervision
  15. google / skills

Product Hunt(15)

  1. Publora

    The publishing API for the agent era

  2. TypingMind

    Pay per use, no subscription, 18 model providers supported

  3. fort

    One command to audit and fix your Mac's security

  4. Zingle

    Learn words in context with AI

  5. OLO Robotics

    Control robots in your browser — no setup needed

  6. LayerProof Vellum

    One canvas for every image asset you need

  7. Incorruptible by Eric Ries

    Why good companies go bad and how great companies stay great

  8. Spotlight by Backplanes

    Session reports for Claude Code & Codex to improve your code

  9. FluidDocs Deck Builder

    Turn a prompt into a real HTML deck

  10. Gemini 3.5 Live Translate

    Latest audio model for live speech-to-speech translation

  11. BlenderHunt

    The indie marketplace for Blender artists and creators

  12. veridive

    Find the 30 seconds that matter in any video via chat

  13. Timmy-TUI

    Local-first agent trust console with a safe local workspace

  14. SeaTicket

    AI agent that resolves issues across all your channels

  15. Screen Charm

    Give your screen recordings more charm

Hugging Face(15)

  1. ABot-Earth 0.5: Generative 3D Earth Model

    We present ABot-Earth 0.5, a generative 3D framework designed to synthesize vast, seamless 3D environments from ubiquitous, geospatially referenced satellite imagery. To achieve this, we propose a novel generative model formulated directly with the 3D Gaussian Splatting (3DGS) representation. The model is trained on a diverse corpus of existing real-world urban reconstructions, learning to generate realistic geometry and textures. At inference, it synthesizes novel 3D scenes conditioned solely on satellite imagery at a scalable rate of under 10 minutes per square kilometer, while demonstrating exceptional realism. The framework is designed for accessibility, with integrated hierarchical level-of-detail (LOD) structures that permit real-time, interactive visualization on web-based map engines. This high-fidelity simulation sandbox effectively mitigates the sim-to-real domain gap, enabling critical downstream Embodied AI applications like closed-loop UAV navigation. By providing an ultra-low-cost and high-efficiency solution, ABot-Earth 0.5 significantly lowers the technical and financial barriers to large-scale 3D reconstruction and empowers the future of global digital earth visualization.

  2. Kwai Keye-VL-2.0 Technical Report

    We introduce Kwai Keye-VL-2.0-30B-A3B, an open-source Mixture-of-Experts (MoE) multimodal foundation model designed to advance long-video understanding and agentic intelligence. To address the challenges of ultra-long contexts, information redundancy, and prohibitive computational costs inherent in hour-level videos, Keye-VL-2.0 is the first to adapt DeepSeek Sparse Attention (DSA) to GQA-based multimodal architectures, enabling lossless 256K context processing while capturing critical frames and long-range temporal dependencies. This architecture is underpinned by a highly optimized training and inference infrastructure, including scalable video I/O, heterogeneous ViT-LM parallelism, and custom DSA kernels that significantly maximize throughput and minimize computational overhead. Furthermore, to overcome the algorithmic dilemma of catastrophic forgetting during multi-task alignment, we introduce Cross-Modal Multi-Teacher On-Policy Distillation (MOPD) paired with Context-RL and Video-RL. By distilling dense token-level teacher feedback from on-policy rollouts back into the MoE backbone, which activates only 3B parameters, Keye-VL-2.0 natively empowers advanced agent collaboration across Code, Tool, and Search scenarios with multimodal self-correction. Extensive evaluations across video understanding, temporal grounding, reasoning, STEM, and agent benchmarks demonstrate that Keye-VL-2.0-30B-A3B achieves state-of-the-art performance among models of similar scale, particularly excelling in fine-grained temporal localization on TimeLens and long-video comprehension on Video-MME-v2 and LongVideoBench. We release our model checkpoints to accelerate community progress toward scalable and robust multimodal agentic applications.

  3. Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

    Although Large Language Model (LLM) agents have demonstrated strong performance on complex tasks, their learning is often limited by inefficient interaction feedback and static training environments, which hinder broader generalization. To address these limitations, this paper introduces Role-Agent, a framework that harnesses a single LLM to function concurrently as both the agent and the environment, enabling a bootstrapped co-evolution. Role-Agent comprises two synergistic components: World-In-Agent (WIA) and Agent-In-World (AIW). In WIA, the LLM acts as the agent and predicts future states after each action; the alignment between predicted and actual states is then used as a process reward, encouraging environment-aware reasoning. In AIW, the LLM analyzes failure modes from failed trajectories and retrieves tasks with similar failure patterns, thereby reshaping the training data distribution for targeted practice. Experiments on multiple benchmarks show that Role-Agent consistently improves performance, yielding an average gain of over 4% over strong baselines.

  4. Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts

    AI agents rely on a harness of skills, tools, and workflows to solve complex problems. Continually improving this harness is essential for adapting to new tasks. However, existing optimization methods typically require ground-truth validation sets, yet such labeled data is difficult to acquire in practical deployment settings. To address this problem, we introduce Retrospective Harness Optimization (RHO), a self-supervised method that optimizes the agent harness using only past trajectories. Specifically, RHO selects a diverse coreset of challenging tasks from past trajectories and re-solves them in parallel. The agent analyzes these rollouts using self-validation and self-consistency, then generates candidate harness updates and selects the most effective one by its own pairwise self-preference. We evaluate RHO across three diverse domains, spanning software engineering, technical work, and knowledge work. Notably, a single optimization round improves the pass rate on SWE-Bench Pro from 59% to 78% without any external grading. Furthermore, our analysis demonstrates that RHO effectively targets prior failure modes. As a result, the optimized harness alters the agent's behavior patterns and sustains higher accuracy during long-horizon sessions.

  5. SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

    Large language models are increasingly expected to handle complex, long-horizon real-world tasks whose context demands can grow without bound, yet model context windows remain inherently finite. Recent work explores a paradigm where a main agent decomposes tasks and dispatches subtasks to subagents, which execute and return only summarized results, conserving the main agent's context budget. However, performing this well requires delegation intelligence: the ability to decompose complex tasks, determine when and what to delegate, and integrate returned results into the ongoing workflow. Training data for this capability is scarce in naturally occurring text, and to our knowledge, how to synthesize such data and train models to acquire this capability remains largely unexplored in the open-source community. To bridge this gap, we present a preliminary exploration targeting deep research, a representative long-horizon agent task. Specifically, we design a harness that guides the model toward high-quality task decomposition and delegation, while constraining subagents to return results properly to support the main agent's workflow. The harness-guided trajectories naturally encode correct delegation decisions, which we use as supervised fine-tuning data to internalize delegation intelligence into model weights. Our resulting model, SearchSwarm-30B-A3B, achieves 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, the best results among all models of comparable scale. We will release our harness, model weights, and training data to facilitate future research.

  6. MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism

    Current Vision-Language Models struggle with hours-long videos because processing full-length visual sequences induces prohibitive token explosion and attention dilution. To overcome this, we introduce MemDreamer to decouple perception and reasoning, shifting long-video understanding into an agentic exploration process. As a plug-and-play framework, it incrementally streams videos to construct a Hierarchical Graph Memory, a top-down three-tier architecture for semantic abstraction, anchored by a foundational graph capturing spatiotemporal and causal relations. During inference, the reasoning model employs agentic tool-augmented retrieval, navigating hierarchies, searching nodes, and traversing logical edges via an Observation-Reason-Action loop. Experiments show MemDreamer achieves SOTA results across four mainstream benchmarks, narrowing the gap with human experts to only 3.7 points. It constrains the reasoning context window to merely 2% of full-context ingestion while delivering a 12.5 point absolute accuracy gain. Furthermore, statistical analysis uncovers a strong positive linear correlation between a VLM's performance on logic reasoning and long-video understanding benchmarks, establishing agentic capability scaling as a new paradigm for multimodal comprehension.

  7. Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

    Recent work has demonstrated that online reinforcement learning (RL) can substantially improve the quality and alignment of flow matching models for image and video generation. Methods such as Flow-GRPO and CPS cast the denoising process as a Markov Decision Process and apply PPO-style ratio clipping to enforce a trust region. However, we argue that ratio clipping is structurally ill-suited for flow models: the probability ratio between new and old policies is a noisy, single-sample estimate of the true policy divergence, leading to over-constraining in some regions of the trajectory and under-constraining in others. We propose Flow-DPPO (Flow Divergence Proximal Policy Optimization), which replaces ratio clipping with a divergence proximal constraint. A key observation is that the per-step policy in flow models is Gaussian, enabling exact and cheap computation of the KL divergence between old and new policies. Flow-DPPO employs an asymmetric divergence mask that blocks gradient updates only when they simultaneously move away from the trusted region and violate the divergence threshold. Experiments show that Flow-DPPO achieves higher rewards with better KL-proximal efficiency, alleviates catastrophic forgetting, promotes balanced multi-objective optimization, and enables stable multi-epoch training where ratio clipping degrades. Code and models are available at https://github.com/Tencent-Hunyuan/UniRL/tree/main/FlowDPPO.

  8. SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

    Controlled character animation requires transferring motion from a driving sequence to a reference character. Prior works heavily rely on intermediate representations, including pose skeletons to represent motion or masked background to represent environment, which inevitably leads to information loss. To address this, we present SCAIL-2, a framework that bypasses those intermediates and achieves end-to-end character animation. By directly concatenating driving videos to the sequence, the model can obtain all the required visual information from the input video. To address the lack of end-to-end data, we unify sub-tasks of character animation with decoupled conditions and then curate a pipeline to synthesize MotionPair-60K, an end-to-end motion transfer dataset containing heterogeneous tasks of character animation. To achieve the unification, we utilize in-context mask conditioning and mode-specific RoPE as soft guidance beyond textual instructions and raw visual information. To address synthetic discrepancy in detailed regions, we propose Bias-Aware DPO to construct preference items to mitigate the errors. Extensive experiments demonstrate that our method substantially outperforms existing state-of-the-art approaches in various character animation tasks. A large subset of synthetic data as well as model weights will be released at our project page: https://teal024.github.io/SCAIL-2/.

  9. Lip Forcing: Few-Step Autoregressive Diffusion for Real-time Lip Synchronization

    Diffusion-based lip synchronization models achieve strong visual quality and audio-visual alignment, but full-sequence bidirectional attention and many denoising steps make them impractical for real-time inference. We present Lip Forcing, to our knowledge the first autoregressive diffusion method for video-to-video (V2V) lip synchronization, which distills a 14B audio-conditioned bidirectional video diffusion teacher into causal students. At inference, the students generate each chunk in only two denoising steps without inference-time CFG, enabling real-time lip synchronization. A lip-sync-specific teacher-trajectory analysis reveals a CFG fidelity-sync tradeoff: no-CFG predictions favor reference fidelity, whereas CFG-guided predictions favor synchronization within a mid-trajectory band. Lip Forcing translates this finding into three analysis-derived components: Sync-Window DMD, a two-step inference schedule, and a SyncNet-based reward. We validate Lip Forcing at two student scales, both distilled from the 14B teacher. The 1.3B student crosses into real-time streaming at 31 FPS, 17.6× faster than its same-scale bidirectional model. The 14B student, the largest diffusion model reported for V2V lip synchronization, runs 39.8× faster than its teacher at comparable reference fidelity. Time-to-first-frame is sub-millisecond at both scales, far below every diffusion baseline.

  10. WorldOlympiad: Can Your World Model Survive a Triathlon?

    We introduce WorldOlympiad, a benchmark for diagnosing video-based world models across physical faithfulness, geometric consistency, and interaction fidelity. While existing benchmarks often focus on visual quality, semantic alignment, or short-term temporal coherence, they provide limited insight into whether generated videos obey physical rules, preserve coherent 3D structure, and sustain controllable interactions over long horizons. To address this gap, WorldOlympiad decomposes world-model evaluation into three complementary dimensions. The physical track uses object segmentation and MLLM-as-judge to assess whether generated videos follow interpretable rules in mechanics, thermal phenomena, and material properties. The geometry track reconstructs generated videos with Gaussian splatting and evaluates structural consistency, cross-view coherence, and camera-trajectory alignment. The interaction track assesses whether generated rollouts follow complex action prompts and maintain smooth, coherent transitions across consecutive video chunks. WorldOlympiad further covers three major downstream scenarios, including gaming, robotics, and general real-world videos, capturing diverse challenges from interactive control and embodied manipulation to open-domain motion and camera dynamics. Together, these tracks and scenarios form a scalable and interpretable evaluation suite that exposes failure modes beyond generic video quality. Experiments on state-of-the-art models reveal substantial gaps in physical reasoning, 3D consistency, and long-horizon interaction, underscoring the need for more structured evaluation protocols for generative world models.

  11. Rethinking the Divergence Regularization in LLM RL

    Reinforcement learning (RL) has become a key component of post-training large language models (LLMs). In practice, LLM RL is often off-policy because of training-inference mismatch and policy staleness, making trust-region control essential for stable optimization. Mainstream methods such as PPO and GRPO approximate this control with a ratio-clipping mechanism, but the importance ratio can be a poor proxy for distributional shift in long-tailed vocabularies. Recent work such as DPPO addresses this mismatch by replacing ratio-based clipping with a divergence-based mask, yielding a trust region defined by the sampled token's absolute probability shift. However, DPPO still relies on a hard mask: once a token crosses the trust-region boundary in a harmful direction, its gradient is discarded rather than corrected. To address this, we propose Divergence Regularized Policy Optimization (DRPO), which replaces the hard mask with a smooth advantage-weighted quadratic regularizer on policy shift. DRPO preserves the same trust-region geometry as DPPO while inducing bounded, continuous gradient weights that attenuate diverging updates and provide corrective signals beyond the boundary. Experiments across model scales, architectures, and precision settings show that DRPO improves the stability and efficiency of LLM RL training.

  12. EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents

    In this paper, we propose EEVEE, the first multi-dataset test-time prompt learning framework for LLM agents, enabling test-time prompt learning under real-world task streams. Existing methods are largely designed for single-dataset settings, while real-world applications require models to handle heterogeneous input streams drawn from multiple datasets, domains, and task distributions, limiting their practical applicability. To mitigate cross-dataset interference, EEVEE introduces a router that partitions incoming inputs into task clusters and assigns them to suitable prompt configurations. This design is optimized via a router-prompt co-evolution strategy, which employs interleaved router and prompt learning phases to address their mutual dependency. Experiments across multiple datasets demonstrate that the framework improves robustness under heterogeneous data streams while maintaining single-benchmark learning capability and efficiency. Specifically, EEVEE improves average multi-benchmark scores by 10.38 and 24.32 points over Qwen3-4B-Instruct and DeepSeek-V3.2, surpassing SOTA methods GEPA and ACE by up to 37.2% and 48.2%.

  13. ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

    This paper introduces ARM, a discrete representation-based AutoRegressive Model that unifies image understanding, generation, and editing within a next-token prediction framework. ARM is built on three efforts: first, we train a discrete semantic visual tokenizer that maps images into compact token sequences. Our tokenizer is supervised with multiple objectives that jointly promote semantic discriminability, language alignment and faithful reconstruction, thereby supporting diverse tasks in a shared latent space. With this, we train a 7B autoregressive model over large-scale text and image token sequences, seamlessly developing vision-language perception and generation capabilities. Finally, to further improve preference-aligned behavior for text-to-image generation and instruction-guided editing, ARM applies reinforcement learning (RL) to optimize task-level objectives such as visual quality, instruction adherence, and edit consistency. Surprisingly, the results show that RL not only substantially improves performance on the target tasks (e.g., raising WISE overall from 0.50 to 0.56, GEdit-Bench-EN G_O from 5.75 to 6.68), but also induces cross-task synergy between text-to-image generation and editing. Collectively, these findings highlight autoregressive modeling, when paired with strong representations and preference optimization, as a scalable foundation for multimodal intelligence. Code: https://github.com/wdrink/ARM.

  14. Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields

    Recent years have witnessed the rapid evolution of AI agents toward handling increasingly complex, real-world tasks. However, existing benchmarks rarely evaluate whether agents can operate graphical user interfaces to complete long-horizon, high-value professional workflows across diverse domains. Current GUI benchmarks still predominantly focus on general-purpose software, relatively simple applications, and short-horizon tasks, leaving it largely unknown whether modern agents can follow user instructions to autonomously operate domain-specific professional software and accomplish economically valuable work in an end-to-end manner. To bridge this gap, we introduce Workflow-GYM, a benchmark for long-horizon GUI tasks centered on professional domains and specialized software environments. Through extensive experiments on state-of-the-art models, we find that even the strongest models achieve only slightly above 30% success rates, highlighting that professional long-horizon GUI workflows remain highly challenging for current GUI agents. Further analysis reveals that current agents struggle to maintain long-horizon workflow consistency, frequently exhibiting workflow stage omission, error propagation, objective drift, and insufficient understanding of professional software environments. Our findings provide important insights into the limitations of current agent systems and suggest key directions for the next generation of GUI-agent research.

  15. One Token per Multimodal Evidence: Latent Memory for Resource-Constrained QA

    External memory effectively grounds large language models (LLMs) and vision-language models (VLMs)-based question answering (QA) in relevant multimodal evidence. However, existing memory paradigms represent each memory item in raw text and image forms, so retrieval-based systems must pass the retrieved text or images to the generation LLMs/VLMs, resulting in high token consumption and storage pressure, making it unaffordable for resource-constrained applications. We propose Latent Memory, a latent-space memory paradigm that replaces each raw text or image evidence item with a single high-dimensional latent token produced by a small compressor LLM/VLM. Rather than retrieving raw evidence for generation, Latent Memory operates in a unified latent representation space: the query is embedded into this space to retrieve relevant latent tokens, and the retrieved latent tokens are directly prompted to a pretrained LLM or VLM for answer generation. To make each latent token simultaneously informative for reconstruction, retrieval, and generation, we train the compressor with reconstruction, contrastive, and distillation objectives in a unified end-to-end manner. Latent Memory is evaluated on seven text-only QA benchmarks (e.g., HotpotQA) and multimodal QA benchmarks, where it achieves competitive QA performance compared to advanced RAG baselines while consuming 3x to 10x fewer generator tokens. It can also deliver the strongest image-grounded QA performance on WebQA. Code is available at https://github.com/zz1358m/Latent-Memory-Master.

Techmeme(15)

  1. CISA shortens the deadline for US agencies to fix the most critical vulnerabilities in their networks to three days, citing hackers' use of AI (Raphael Satter/Reuters)

    Raphael Satter / Reuters : CISA shortens the deadline for US agencies to fix the most critical vulnerabilities in their networks to three days, citing hackers' use of AI —  The U.S. cyber defense agency said on Wednesday that government officials now have three days to deal with the most serious categories …

  2. Chinese companies are implementing "quiet" AI-driven layoffs to avoid labor laws that require government approval for job cuts exceeding 10% of a workforce (Laurie Chen/Reuters)

    Laurie Chen / Reuters : Chinese companies are implementing “quiet” AI-driven layoffs to avoid labor laws that require government approval for job cuts exceeding 10% of a workforce —  Liu, a Hangzhou-based contractor at a large Chinese internet firm, says her employer began quietly firing contractors …

  3. Sources: Microsoft's Xbox division is planning major layoffs next month and significant budget cuts for marketing and some other areas (Jason Schreier/Bloomberg)

    Jason Schreier / Bloomberg : Sources: Microsoft's Xbox division is planning major layoffs next month and significant budget cuts for marketing and some other areas —  Asha Sharma's first major cuts will arrive in July  —  Microsoft Corp.'s Xbox division is planning major layoffs next month, according to people familiar with the company's strategy.

  4. Sources: Sea's Shopee is cutting hundreds of developer jobs globally; the reductions started this week and amount to about 8% of Shopee's developer workforce (Bloomberg)

    Bloomberg : Sources: Sea's Shopee is cutting hundreds of developer jobs globally; the reductions started this week and amount to about 8% of Shopee's developer workforce —  Sea Ltd.'s Shopee is cutting hundreds of developer jobs globally, joining rivals across the world in slashing staff while they adopt AI …

  5. OpenAI says it has banned China-linked accounts that used ChatGPT to draft social media influence campaigns targeting US debates over tariffs and data centers (Sam Sabin/Axios)

    Sam Sabin / Axios : OpenAI says it has banned China-linked accounts that used ChatGPT to draft social media influence campaigns targeting US debates over tariffs and data centers —  OpenAI has banned China-linked accounts that used ChatGPT to draft social media influence campaigns targeting U.S. debates …

  6. Oracle reports Q4 revenue up 21% YoY to $19.18B, vs. $19.1B est., and expects to raise ~$40B through a combination of debt and equity financing in FY 2027 (Juby Babu/Reuters)

    Juby Babu / Reuters : Oracle reports Q4 revenue up 21% YoY to $19.18B, vs. $19.1B est., and expects to raise ~$40B through a combination of debt and equity financing in FY 2027 —  Oracle (ORCL.N) reported fourth-quarter revenue that narrowly beat Wall Street expectations on Wednesday, amid growing concerns …

  7. Anthropic releases two policy proposals on how governments should address catastrophic risks and manage labor market disruption from advanced AI systems (Anthropic)

    Anthropic : Anthropic releases two policy proposals on how governments should address catastrophic risks and manage labor market disruption from advanced AI systems —  AI is advancing at exponential speed, and the policymaking process was built for a slower world.  —  We are sharing two policy proposals to prepare for AI progress.

  8. Dario Amodei says frontier models should face mandatory third-party testing for cyber, bio, and autonomy risks, in addition to overall transparency requirements (Dario Amodei/@darioamodei)

    Dario Amodei / @darioamodei : Dario Amodei says frontier models should face mandatory third-party testing for cyber, bio, and autonomy risks, in addition to overall transparency requirements —  In addition to transparency, I now believe frontier models should face mandatory third-party testing for cyber, bio, and autonomy risks—with the power to block or revoke deployment of models that pose catastrophic risk.

  9. An essay on policy responses to AI's exponential progress across regulation and public safety, macroeconomics and taxes, science, civil liberties, geopolitics (Dario Amodei)

    Dario Amodei : An essay on policy responses to AI's exponential progress across regulation and public safety, macroeconomics and taxes, science, civil liberties, geopolitics —  In one of the side plots to The Lord of the Rings, two of the Hobbits attempt to rouse Treebeard—a wise but ponderous sentient tree …

  10. OpenAI and Visa partner to let AI agents make purchases online after users give their permission and to explore enterprise applications for AI-driven payments (Paige Smith/Bloomberg)

    Paige Smith / Bloomberg : OpenAI and Visa partner to let AI agents make purchases online after users give their permission and to explore enterprise applications for AI-driven payments —  OpenAI and Visa Inc. are now allowing artificial-intelligence agents to make purchases online after users give their permission …

  11. Cybersecurity researchers complain that Claude Fable's guardrails are too strict, rejecting "innocuous tasks" like reading blog posts or performing code reviews (Lorenzo Franceschi-Bicchierai/TechCrunch)

    Lorenzo Franceschi-Bicchierai / TechCrunch : Cybersecurity researchers complain that Claude Fable's guardrails are too strict, rejecting “innocuous tasks” like reading blog posts or performing code reviews —  Anthropic released its latest model Fable on Tuesday, billing it as a public and limited version of its powerful and much-hyped cybersecurity model Mythos.

  12. Sources: Microsoft is restricting employees from using Claude Fable 5 because of Anthropic's new 30-day data retention requirements (Tom Warren/The Verge)

    Tom Warren / The Verge : Sources: Microsoft is restricting employees from using Claude Fable 5 because of Anthropic's new 30-day data retention requirements —  Microsoft's legal teams are evaluating Anthropic's new data retention changes. … Anthropic released Claude Fable, its first Mythos-class AI model …

  13. As AI commoditizes benchmarkable work, an organization's lasting moats lie in tasks that are verifiable through its private data and judgment (Sarah Guo)

    Sarah Guo : As AI commoditizes benchmarkable work, an organization's lasting moats lie in tasks that are verifiable through its private data and judgment —  The mid-2026 investor's version of AI psychosis is a despair that nothing is investable, that we should put all our money into Anthropic and Nvidia and go home.

  14. Survey: 53% of Americans fear AI could put someone in their household out of work; Democrats are more likely than Republicans to worry about AI's impact on jobs (Reuters)

    Reuters : Survey: 53% of Americans fear AI could put someone in their household out of work; Democrats are more likely than Republicans to worry about AI's impact on jobs —  Half of Americans fear that the rise of AI could put them or someone in their household out of work, according to a new Reuters/Ipsos poll …

  15. Interfax: Russia restored access to Roblox after concluding the company had complied with local legal requirements; Russia had banned Roblox in December (Andrey Lemeshko/Bloomberg)

    Andrey Lemeshko / Bloomberg : Interfax: Russia restored access to Roblox after concluding the company had complied with local legal requirements; Russia had banned Roblox in December —  Russia restored access to Roblox Corp.'s gaming platform after concluding the company had complied with local legal requirements, according to Interfax.

Solidot(15)

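  1. Humans tend to turn left and walk counterclockwise

    During the pandemic, researchers ran a series of experiments to see how many people could share a space while keeping a safe distance. Reviewing the footage, they noticed that most people walked counterclockwise. The unexpected finding prompted further experiments, which found that humans consistently tend to walk counterclockwise. The study was published in Nature Communications. Scientists do not yet know where the preference comes from. It appears in both men and women and is more pronounced in children. Animals show similar behavior: rock ants, for example, prefer to turn left when exploring unfamiliar nests. Scientists suspect biomechanics is involved, but the exact mechanism remains a mystery. Olympic track events originally had athletes run clockwise, but later switched to counterclockwise because athletes found clockwise running unnatural, possibly because right-leg dominance is common in the population.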

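  2. FCC plans real-name registration for mobile phones in the US

    The US Federal Communications Commission (FCC) wants to kill anonymous burner phones and plans to require carriers by law to store subscribers' personal information, including government-issued identification numbers and physical addresses. The move has alarmed privacy advocates and civil rights activists, who say the US is falling in line with authoritarian countries. The FCC's stated rationale is fighting fraud: the rules are meant to keep scammers off telecom networks so that "law enforcement can better identify scammers". The FCC likens the measures to the data banks collect to prevent money laundering.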

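  3. npm v12 will no longer automatically run dependency install scripts

    After a string of serious security incidents in the Node.js ecosystem, v12, the next major version of the npm package manager, will make a significant security change: unless explicitly allowed, npm install will no longer automatically run dependencies' preinstall, install, and postinstall scripts. Prepare scripts from Git, file, and link dependencies will be blocked in the same way. npm v12 is due in July 2026.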

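  4. Chemical differences in a binary star system point to a planet swallowed by its star

    Astronomers studying an unusual binary system named HD 81809 found that its two stars, which formed at the same time, have sharply different chemical compositions. Binary stars generally form from the same molecular cloud, so their elemental makeup is usually very similar. In HD 81809, however, one star is markedly iron-poor while the other has near-solar metallicity. Their iron content differs by a factor of about 3.7, far more than standard models of binary evolution can explain. Astronomers therefore suspect that one of the stars may have engulfed its own planet, altering its surface chemistry. HD 81809 lies about 113 light-years from Earth and consists of two Sun-like G-type stars. The primary, HD 81809A, has evolved into a subgiant, while HD 81809B is still on the main sequence. The system is about 10 billion years old. Besides the iron anomaly, HD 81809B also has an elevated lithium abundance. Because old, low-mass stars usually deplete their lithium as they evolve, the high lithium level is seen as an important clue that a planetary engulfment may have occurred. The researchers found that raising HD 81809B's metallicity to the currently observed value would require the star to have recently swallowed about 25 to 75 Earth masses of metal-rich material, comparable to the total metal-core content of planets in the Neptune-to-Saturn range.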

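  5. Monthly semiconductor sales top $110 billion for the first time

    Data from the US Semiconductor Industry Association (SIA) show that global semiconductor sales in April rose 93.9% year over year to $110.48 billion. Sales have now grown year over year for 30 consecutive months, and the month-over-month increase was 11%. Prices rose sharply alongside volumes: the price of 8GB DDR4 memory climbed to roughly nine times its level a year earlier. The three major memory makers, Samsung Electronics, SK Hynix, and Micron Technology, are prioritizing memory for AI, which has tightened supply of general-purpose memory. By region, the US and Asia drove the sales growth.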

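  6. German court rules Google is liable for AI Overviews content

    The Munich regional court in Germany ruled that Google is liable for the content of AI Overviews, because AI Overviews are Google's own content rather than a list of search results. The plaintiffs are two Munich publishers who allege that Google's AI Overviews wrongly associated them with fraud, subscription traps, and other improper business practices. They sent Google a cease-and-desist letter, but the search giant did not respond properly. The court held that AI Overviews differ from traditional search results: the AI rewrites and evaluates search results "in its own words and according to its own structure", and the links it cited contradicted its content, so the content is Google's own statement. Google developed the AI and makes it available to users, so the content the AI generates is Google's own, "because only Google can influence the service the AI provides and the algorithms the AI runs on." The liability rules for search engines do not apply to AI search.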

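  7. BYD shipped 200 OTA updates in a year, far more than its rivals

    Data from 威尔森 show that in 2025 BYD delivered 200 over-the-air (OTA) software updates to its "Ocean" and "Dynasty" series models, the most of any automaker. Within China, Tesla updated its software 16 times, Toyota 8 times, and Volkswagen 5 times. BYD can update so often because the semiconductors OTA requires, the operating system that serves as the communications foundation, and the hardware everything runs on are all developed in-house. A company representative said: "As long as the design is our own, updates can be delivered quickly and accurately." With intensifying price competition pushing down its sales in China, BYD aims to use OTA updates to make its cars more appealing and expand sales.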

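  8. Starlink hardware moves from a one-time purchase to a monthly rental

    Starlink is switching its hardware from a one-time payment to a $10 monthly rental fee. The hardware consists of a terminal that receives the satellite signal and a router placed in the home. The fee is not included in the internet service charge. Starlink's 100Mbps plan costs $55 per month, the 200Mbps plan $85 per month, and the 400Mbps Max plan $130 per month. Starlink also offers professional installation for a one-time $199 fee, which is waived for Max plan subscribers.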

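  9. Google Chrome prepares to remove Manifest V2 support, killing uBlock Origin

    Google Chrome is preparing to remove Manifest V2 support entirely, killing uBlock Origin for good. Chrome will support only Manifest V3 extensions. Its developers say that Chrome has disabled Manifest V2 extensions by default for more than a year and that continuing to support the feature poses technical challenges. Upcoming Chrome releases will phase out Manifest V2 functionality and eventually remove it completely: Chromium 150 removes the ExtensionManifestV2Disabled option, Chromium 151 will remove the ExtensionManifestV2Unsupported option, Chromium 151 will remove the ExtensionManifestV2Availability option, and Chromium 151 is expected to remove the AllowLegacyMV2Extensions option. Chromium-based browsers are expected to follow. Among major browser makers, Mozilla has publicly said it will keep supporting Manifest V2 extensions. The popular ad blocker uBlock Origin is built on Manifest V2, so users who want to keep using it may have no choice but to move to Firefox.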

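  10. NASA announces the Artemis III crew

    NASA announced the four astronauts for the Artemis III mission, all of them men with military backgrounds: NASA astronauts Randy Bresnik (commander), Andre Douglas and Frank Rubio (mission specialists), and ESA astronaut Luca Parmitano (pilot). Two Artemis missions have flown so far: the first was an uncrewed flight around the Moon and the second a crewed flight around the Moon. The third, Artemis III, is planned for summer 2027 at the earliest and will again be a crewed flight around the Moon. The fourth mission is planned for 2028 and would be the first crewed lunar landing since the Apollo program.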

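  11. Donut Lab's all-solid-state battery found to be an ordinary lithium-ion battery

    At CES 2026, Finnish startup Donut Lab claimed to have developed a solid-state battery with an energy density of 400Wh/kg, a cycle life of 100,000 cycles, and a five-minute full charge, one that retains more than 99% of its capacity across a temperature range of -30℃ to 100℃. An investigation by more than 20 independent industry experts concluded that the all-solid-state claim is false and that the product is an ordinary lithium-ion battery. The evidence includes a voltage curve that exactly matches the signature of existing liquid-electrolyte, high-nickel ternary lithium-ion cells. In addition, ions intercalate into the anode material during charging, causing the cell to swell in a regular pattern. In cells with a graphite anode, the swelling curve shows a distinct inflection point between 50% and 70% state of charge, a signature unique to ions rearranging within graphite's layered structure. Donut Lab's battery shows exactly this telltale inflection point. Its actual energy density is about 298Wh/kg, normal for current ternary lithium cells. The investigators found that the main motive for the fraudulent marketing was to profit from capital markets: of the company's more than 1,300 shareholders, over 900 hold no more than 50 shares, with individual investments estimated at $3,000 to $23,000.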

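  12. iPhone linked to the decline in US fertility

    The US total fertility rate has fallen 22% since 2007, a decline that is hard to explain by economic conditions, contraception, housing, or childcare costs. The spread of smartphones is thought to be linked to the drop, and 2007 is the year the first iPhone was released. In the US, the iPhone was sold only on AT&T's network from June 2007 to February 2011, which provides a natural experiment for studying the effect of smartphones on fertility. Researchers used differences in AT&T mobile network coverage to identify the iPhone's effect. The results show that iPhone adoption lowered the birth rate among women aged 15-19 by 4.5%-8.0% and among women aged 20-24 by 3.2%-6.6%. iPhone adoption accelerated the fertility decline among women under 30 and dampened the rise in fertility among women over 30. The researchers say iPhone adoption can explain 33%-52% of the overall fertility decline among women aged 15-44. The proposed reasons are that smartphones reduced in-person social interaction, increased pornography use, and lowered the frequency of sex.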

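  13. Falcon 9 first stage B1067 has flown 35 missions

    On Monday, the Falcon 9 first stage numbered B1067 completed its 35th launch, landing successfully on the droneship A Shortfall of Gravitas after sending 29 Starlink satellites to orbit. B1067 is SpaceX's most-reused first stage. It has been in service for more than five years and has flown twice within a single month. SpaceX aims to reuse a first stage 40 times, and B1067 is closing in on that goal. B1067 has flown more missions than rival United Launch Alliance (ULA) has launched in total over the past five years (ULA completed 29 launches).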

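  14. UN report warns the oceans are under enormous pressure

    The newly released World Ocean Assessment warns that multiple pressures, including climate change, pollution, and overexploitation, continue to erode ocean health, and that the future of the oceans is closely tied to the future of humanity. The report notes that the ocean profoundly shapes everyone's life, even far from the coast. It has absorbed most of the planet's excess heat and greenhouse gases and plays a key role in slowing climate change. It also supplies food, oxygen, and medicinal resources to billions of people and underpins global trade, tourism, and a large number of jobs. The report stresses that a deteriorating marine environment will affect not only coastal regions but also food security, supply chain stability, and global economic development. The assessment shows that ocean warming and sea level rise are accelerating. Because of melting ice sheets and the thermal expansion of seawater, the rate of global sea level rise has increased from at most 1.9 mm per year before 2015 to 4.3 mm in 2023. The Arctic is warming four times faster than the global average. Meanwhile, oxygen-depleted ocean zones have expanded to about 4.5 million square kilometers, squeezing the habitat of much marine life. About 80% of coral reefs in the Caribbean have disappeared since the 1970s. If global warming exceeds 1.5 degrees Celsius above pre-industrial levels, 90% of the world's coral reefs may be at risk of disappearing. According to the report, about 52 million tonnes of plastic waste enter the ocean each year, forming about 24 trillion microplastic particles and affecting more than 4,000 marine species.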

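  15. Microsoft open-source tools implanted with credential-stealing malware

    Microsoft took down dozens of open-source projects hosted on GitHub after a security firm found they had been compromised and implanted with malicious code that steals passwords and other sensitive credentials. Microsoft said in a statement that it is investigating, that some of the removed projects have been restored after review, and that, as part of the investigation, it has notified a small number of users who downloaded affected projects. The investigation shows at least 73 projects were affected. This is the second time in the past month that Microsoft's open-source repositories have been compromised.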
