Curated by Shen Huang · 90 stories · ~14 min read
DIGEST · 2026-06-11

OrangeBot.AI Digest — 2026-06-11

90 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Developer gets Half-Life running at 30 FPS on a Nokia N95 (www.tomshardware.com)
  2. Fully autonomous drones have killed human soldiers for the first time (www.newscientist.com)
  3. macOS 27 Beta breaks the ability to boot Asahi Linux (www.phoronix.com)
  4. Emacs appearances in pop culture (ianyepan.github.io)
  5. Show HN: Homebrew 6.0.0 (brew.sh)
  6. Doing nothing at work (www.seangoedecke.com)
  7. Software Is Made Between Commits (zed.dev)
  8. Petition to Withdraw Canada's Bill C-22 (www.ourcommons.ca)
  9. Anthropic apologizes for invisible Claude Fable guardrails (www.theverge.com)
  10. Solar generates more energy in US than coal for first time (www.theguardian.com)
  11. The RCE that AMD wouldn't fix (mrbruh.com)
  12. MiMo Code is now released and open-source (mimo.xiaomi.com)
  13. MapComplete: Maps about various topics which you can contribute to (mapcomplete.org)
  14. Open Reproduction of DeepSeek-R1 (github.com)
  15. Workers are spending over 6 hours a week botsitting AI, fueling job frustration (www.businessinsider.com)

GitHub Trending (15)

  1. apple / container
  2. addyosmani / agent-skills
  3. maziyarpanahi / openmed
  4. phuryn / pm-skills
  5. NVIDIA / SkillSpector
  6. soxoj / maigret
  7. x1xhlol / system-prompts-and-models-of-ai-tools
  8. refactoringhq / tolaria
  9. obra / superpowers
  10. restic / restic
  11. msitarzewski / agency-agents
  12. masterking32 / MasterDnsVPN
  13. chatwoot / chatwoot
  14. kenn-io / agentsview
  15. alchaincyf / zhangxuefeng-skill

Product Hunt (15)

  1. Bond

    The AI to-do list that does itself

  2. Asmi AI

    AI that handles your personal chores in the real world

  3. heyly.io

    A video greeting + buttons where you ARE the testimonial

  4. Cloudskill

    Govern the AI skills your team depends on

  5. CrustRecruiter

    Turn Claude into a recruiter that thinks like you

  6. INVO Ride

    Book autonomous eVTOL flights over photoreal San Francisco

  7. Patchrooms

    Turn AI-app feedback into agent-ready patch context.

  8. Nodey

    Your n8n command center, now on your phone

  9. SlimSnap

    Your AI doesn't know which button you mean

  10. Terminal Mode by Even Realities

    Keep coding agents always in sight

  11. Airbrush Studio

    AI-powered photo editor for pro results w/o manual editing

  12. Respan Gateway

    One AI gateway with built-in observability and evals

  13. Lium AI

    AI for Complex Data

  14. Onpilot

    An AI workforce customized to your business

  15. Mute

    A visual productivity tool to visualize your brain-dump

Hugging Face (15)

  1. Redesign Mixture-of-Experts Routers with Manifold Power Iteration

    The router is the cornerstone component of Mixture-of-Experts (MoE) models. Serving as expert proxies, the rows of the router matrix compute their similarity to the MoE inputs to determine which subset of experts is activated. Ideally, each router row encodes its expert matrix into a representative vector, such that its dot product with a token better reflects token-expert affinity. However, there exist no design principles to enforce this condensation. In this paper, we propose to align each router row with the principal singular direction of the associated expert, as this direction provides the most expressive mathematical description of a matrix. Based on this principle, we propose a router redesign with Manifold Power Iteration (MPI). Specifically, it introduces a "Power-then-Retract" paradigm, where a power iteration step is performed on the router weights, followed by a retraction that imposes a norm constraint to ensure both efficiency and stability. Theoretically, we show that MPI drives router rows to converge toward the principal singular directions of the associated experts. Empirically, we pretrain MoE models at scales from 1B to 11B parameters to confirm that this alignment facilitates more effective MoE models.

  2. Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

    Scientific progress depends on a repeated loop of exploration, experimentation, and abstraction. Researchers test candidate directions, interpret the evidence, and carry the resulting lessons into later attempts. We study how an AI agent can run this loop autonomously over long horizons. We introduce Arbor, a general framework for autonomous research that combines a long-lived coordinator, short-lived executors, and Hypothesis Tree Refinement (HTR), a persistent tree that links hypotheses, artifacts, evidence, and distilled insights across time. The coordinator manages global research strategy over the tree, while executors implement and test individual hypotheses in isolated worktrees. As results return, Arbor updates the tree, propagates reusable lessons, refines the search frontier, and admits verified improvements. This design turns autonomous research from a sequence of local attempts into a cumulative process in which strategy, execution, and evidence are carried across time. We evaluate Arbor under Autonomous Optimization (AO), an operational setting where an agent improves an initial research artifact through iterative experimentation without step-level human supervision. Across six real research tasks in model training, harness engineering, and data synthesis, Arbor achieves the best held-out result on all six tasks, attaining more than 2.5x the average relative held-out gain of Codex and Claude Code under the same task interface and resource budget. On MLE-Bench Lite, Arbor reaches 86.36% Any Medal with GPT-5.5, the strongest result in our comparison.

  3. Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application

    Environments serve as interactive systems for large language model (LLM) based agents across diverse scenarios and play a crucial role in driving the continual evolution of model capabilities. Despite this importance, existing work lacks a systematic categorization and deep analysis. This paper systematically studies current research on agentic environments from the perspective of the environment engineering lifecycle, covering their modeling, synthesis, evaluation and application. Specifically, the paper first introduces representative environments from the perspectives of eight attributes and eight domains, providing detailed analyses of their development paths and highlighting their core capabilities. Second, for automated environment synthesis, two paradigms are introduced, namely symbolic synthesis and neural synthesis. This paper also shows different environment evaluation methods in each paradigm. Third, the corresponding environment applications from the perspective of agent-environment co-evolution are discussed. Specifically, the paper characterizes the primary pathways for agent evolution in dynamic environments from four complementary perspectives: memory-centric experience evolution, orchestration-centric workflow evolution, trajectory-centric offline evolution, and exploration-centric online evolution. Three paradigms of environment evolution are also identified, namely neural-driven, difficulty-driven, and scaling-driven approaches. Finally, several promising future directions are discussed, including Environment-as-a-Service, Multi-agent Environments, and Neural-Symbolic Environments.

  4. Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks

    General-purpose agents such as OpenClaw are increasingly used as autonomous tool users, but their coding ability is difficult to measure under SWE-bench: a generic agent does not by itself satisfy the clean Docker workspace, patch, and prediction contract required for scoring. We introduce Claw-SWE-Bench, a multilingual SWE-bench-style benchmark and adapter protocol that makes heterogeneous agent harnesses, or claws, comparable under fair settings including a fixed prompt, runtime budget, workspace contract, patch extraction procedure, and evaluator. The full benchmark contains 350 GitHub issue-resolution instances across 8 languages and 43 repositories, drawn from SWE-bench-Multilingual and SWE-bench-Verified-Mini after future-commit cleanup. We also release Claw-SWE-Bench Lite for faster validation, which is an 80-instance subset selected by a cost-aware, rank-aware procedure over 17 calibration columns. On the full benchmark, OpenClaw with a minimal direct-diff adapter scores only 19.1% Pass@1, whereas the full adapter reaches 73.4% with the same GLM 5.1 backbone, showing that adapter design is essential for enabling OpenClaw-style harnesses to perform coding tasks effectively. Across an OpenClaw × nine-model sweep and a five-claw × two-model sweep, model choice changes Pass@1 by 29.4 pp and harness choice by 27.4 pp under fixed models; systems with similar accuracy can differ substantially in total API cost. Claw-SWE-Bench therefore treats harness and cost accounting as first-class axes of SWE-style coding-agent evaluation, providing both a full benchmark and a low-cost reference set for reproducible comparison. The data is available at https://github.com/opensquilla/claw-swe-bench and https://huggingface.co/datasets/TokenRhythm/Claw-SWE-Bench.

  5. Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions

    Reward models are central to text-to-image post-training, but visual preference is subjective and better represented as a distribution over rubric scores than as a deterministic scalar. Existing scalar, score-token, and pairwise reward models over-compress uncertainty and fine-grained score differences, while reasoning-based generative rewards provide stronger judgments but are costly to deploy and difficult to use as direct optimization signals. We propose Z-Reward, a teacher-student reward modeling framework that decouples reasoning-heavy judgment from efficient reward deployment. The teacher is a large VLM that uses reasoning to infer rubric-aligned score distributions, and is trained with Group-wise Direct Score Optimization (GDSO), which combines policy-gradient rewards from distribution expectations with direct pointwise and pairwise supervision on score distributions and score gaps. The student is trained with Reasoning-Internalized Score Distillation (RISD), which transfers the teacher's reasoning-conditioned score distribution into a compact VLM without requiring explicit reasoning chains at inference time. On our internally annotated evaluation set, the 27B GDSO teacher reaches 89.6% human preference accuracy, outperforming SFT, RewardDance, and GRPO, while the 9B RISD student reaches 88.6%, outperforming the OPD baseline and closely matching the larger teacher. We further show that Z-Reward can serve as a differentiable reward signal for text-to-image optimization, yielding a 41.3% net human-preference improvement over the SFT baseline.

  6. TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders

    Tabular encoders are usually evaluated inside task-specific end-to-end pipelines, so models from different training paradigms are difficult to compare directly even when they operate on similar tabular signals. We introduce TRL-Bench, a multi-granular tabular representation learning (TRL) benchmark that standardizes cross-paradigm representation-level evaluation: each encoder exports row-, column-, or table embeddings through its supported wrapper, and shared lightweight heads probe them across three suites: TRL-CTbench (column/table), TRL-Rbench (row), and TRL-DLTE (compositional Data-Lake Table Enrichment spanning all three granularities). To support this standardized setting, we release curated benchmark assets and task reformulations, including 50 OpenML tables with 123 verified targets, 16 row-pair linkage rewrites, and a 47,772-table DLTE lake derived from 1,379 parent tables. Across 20 models and 16 tasks, TRL-Bench shows that once downstream conditions are standardized, encoder quality is capability-specific rather than captured by a single leaderboard. In TRL-CTbench, generic text encoders often lead on tasks with strong surface-text signal, while tabular specialists win where their pretraining objective aligns with the task. In TRL-Rbench, within-table prediction and cross-table linkage favor different training regimes, with atomic linkage performance correlating strongly with the row-matching stage of DLTE pipelines. In TRL-DLTE, the strongest pipelines combine capability-matched specialists rather than reuse a single encoder, and top end-to-end quality depends on non-additive compositional fit rather than per-stage marginal rank alone. TRL-Bench provides a common protocol for measuring reusable signal in exported tabular representations under shared downstream conditions. Code and data: https://github.com/LOGO-CUHKSZ/TRL-Bench

  7. Reason, Then Re-reason: Cross-view Revisiting Improves Spatial Reasoning

    Spatial reasoning from egocentric videos is inherently challenging because the observable evidence is constrained by the camera trajectory. Existing methods rely on single-turn inference, forcing models to resolve geometric ambiguity through semantic priors rather than verifiable evidence. We argue that spatial reasoning should be revisitable: conclusions formed under limited evidence should remain open to revision when complementary viewpoints become available. Building on this insight, we propose Reason, then Re-reason (ReRe), a training-free, inference-time framework with two phases: in the Reason Phase, an MLLM forms a spatial hypothesis from the original video; in the Re-reason Phase, it verifies or revises the hypothesis by observing a synthesized novel-view video. To enable effective cross-view revisiting, we design a Geometry-to-Video pipeline that renders strategically complementary novel views from predicted 3D geometry. These views feature an elevated, oblique perspective with scene-spanning coverage, while preserving the MLLM's native video interface without architectural modifications. Extensive evaluations on VSI-Bench and STI-Bench demonstrate that ReRe substantially boosts open-source MLLMs to rival proprietary state-of-the-art performance. Project page: https://zhenjiemao.github.io/ReRe/

  8. DeNovoSWE: Scaling Long-Horizon Environments for Generating Entire Repositories from Scratch

    As the capabilities of LLM-based code agents continue to advance, their expected role is expanding beyond localized bug fixing in existing codebases toward architecting and implementing complete software repositories from high-level specifications. However, training agents for such long-horizon software engineering tasks remains difficult due to the scarcity of large-scale, verifiable whole-repository generation data. In this paper, we introduce DeNovoSWE, a large-scale dataset for whole-repository generation. DeNovoSWE comprises 4,818 high-quality instances, where each instance requires generating a complete repository from documentation. Our dataset is automatically constructed through a carefully designed sandboxed agentic workflow, enabling scalable curation without human annotation. The construction follows a "divide and conquer" and critic-repair philosophy. To balance data quality and diversity, we further introduce a difficulty-aware trajectory filtering strategy. Fine-tuning Qwen3-30B-A3B on DeNovoSWE substantially improves long-horizon SWE performance, raising its score on the challenging BeyondSWE-Doc2Repo benchmark from 5.8% to 47.2%.

  9. World Pilot: Steering Vision-Language-Action Models with World-Action Priors

    Vision-Language-Action (VLA) models inherit semantic grounding from large-scale pretraining and perform competently across in-distribution manipulation tasks. This grounding, however, is built on static image-text pairs, whereas manipulation is a continuous, contact-rich process whose dynamics this pretraining cannot capture. We present World Pilot, a VLA framework that augments the policy with priors from a World-Action Model (WAM), routed into the decision chain through two complementary pathways. Latent Steering conditions the perception layer on a scene-evolution latent, and Action Steering supplies an anticipated trajectory as a motion prior to the action generator. Together the two priors equip the VLA with an anticipated view of the scene and a trajectory-level motion hint alongside its semantic conditioning, and the scene-evolution prior remains effective even when supplied by a video-pretrained world model that has not been action-post-trained. World Pilot attains a state-of-the-art Total success rate of 84.7% on the LIBERO-Plus zero-shot OOD benchmark and the highest success rate on every real-robot setting across four manipulation tasks, with the largest margins under shifts in viewpoint, geometry, deformable state, and pose. Project Website: https://world-pilot.github.io/

  10. On Subquadratic Architectures: From Applications to Principles

    Transformers dominate modern sequence modeling, but their quadratic attention incurs substantial computational cost. Subquadratic architectures offer a scalable alternative. However, it remains unclear which designs yield the most effective sequence models. We compare three leading approaches: xLSTM, Mamba-2, and Gated DeltaNet. We evaluate these models on tasks with complex dependencies: (1) code-model pre-training, (2) distillation of code models from large language models, and (3) pre-training of time-series foundation models. Across these settings, xLSTM delivers the strongest overall performance. To explain xLSTM's advantage, we present a unified formulation and analyze the underlying architectural mechanisms, focusing on state tracking and memory dynamics. Our results show that xLSTM enables more flexible and stable memory correction via its gating scheme. We corroborate these findings on controlled synthetic length-generalization tasks. Overall, our findings indicate that xLSTM's gains on complex tasks stem from robust state tracking and accumulation.

  11. ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics

    Combinatorics is central to Olympiad-level mathematical problem solving, requiring deep discrete reasoning, creative constructions, and rigorous structural insight. Recent evidence suggests that even today's strongest frontier models remain uneven on Olympiad combinatorics, revealing a gap in creative mathematical reasoning. We introduce ComBench, an Olympiad-level combinatorics benchmark for evaluating and diagnosing the combinatorial reasoning capabilities of large language models. ComBench contains 100 human-annotated competition-level problems organized around two complementary settings: analysis-centric problems, which primarily require rigorous mathematical arguments, and construction-centric problems, which require explicit constructions in addition to correctness justifications. The evaluation protocol combines rubric-guided proof grading with deterministic construction verification, exposing cases where proof quality and construction validity diverge. Experiments on frontier open- and closed-source models show that ComBench is far from saturated: the strongest model reaches 65.4% overall Avg. and 75.3% overall Best@4. We further find that Rigorous Proof Reasoning and Constructive Realization are distinct capabilities: Kimi-K2.6 trails GPT-5.5 on analysis-centric proof grading but surpasses it on construction-centric Best@4, while Existence and Construction problems remain consistently hardest across representative frontier models.

  12. Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

    Large Language Models (LLMs) are increasingly used for code generation, raising concerns that they may be misused to produce malicious code. Meanwhile, Grammar-Constrained Decoding (GCD) has been widely adopted to improve the reliability of LLM-generated code by enforcing syntactic validity. In this paper, we reveal a counterintuitive risk: this reliability-oriented technique can itself become an attack surface. We uncover a new jailbreak attack, termed CodeSpear, that exploits GCD to induce LLMs into generating malicious code. Our experiments show that simply applying a benign code grammar constraint can effectively jailbreak LLMs. To address this vulnerability, we propose CodeShield, a safety alignment approach that robustly preserves safe behavior even under attacker-controlled grammar constraints. CodeShield aligns the model in the code modality by teaching it to generate honeypot code under GCD. Such code is semantically harmless, so it does not implement the malicious request, and structurally diverse, so it is difficult to suppress through grammar tightening. At the same time, CodeShield still preserves natural-language refusals when natural language is available. Experiments on 10 popular LLMs across 4 benchmarks show that CodeSpear outperforms representative jailbreak baselines and increases the attack success rate by more than 30 percentage points on average. CodeShield also restores safety under CodeSpear while preserving benign utility. Our findings reveal a fundamental risk of GCD and call for greater attention to its potential security implications.

  13. InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning

    Recent progress in foundation models has shifted toward agentic behavior involving multi-step reasoning and tool use. However, open-source efforts largely focus on text-dominant settings, leaving long-horizon multimodal tasks underexplored. This gap is evident in video tasks requiring sustained temporal understanding and iterative interaction. We present InternVideo3, a framework enhancing these capabilities via Multimodal Contextual Reasoning (MCR). MCR treats understanding as a closed-loop process over a shared, evolving context containing observations, instructions, reasoning, tool actions, and memory. This frames long-video understanding as evidence accumulation and verification. To ensure efficiency, we introduce Multimodal Multi-head Latent Attention (M^2LA), a token-preserving reparameterization compressing KV-cache states while retaining the full token stream. Our staged training includes continued pretraining, short-to-long supervised fine-tuning, rule-based reinforcement learning, and on-policy distillation. Experiments show InternVideo3 achieves strong performance on benchmarks like Video-MME, MLVU, and EgoSchema. We further instantiate the model as a video agent with retrieval tools, demonstrating robust evidence-grounded behavior. Our results suggest that efficient context handling and closed-loop reasoning are vital for adapting open multimodal models toward long-horizon visually grounded agency.

  14. Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling

    Reinforcement learning (RL) has become a key component in modern large language models, yet the rollout stage remains the main bottleneck in RL training pipelines. Although Multi-Token Prediction (MTP) offers a natural solution to accelerate rollouts through speculative decoding, many studies have observed that MTP acceptance rates degrade significantly during RL training, leading to limited speedup. To address this bottleneck, we present Bebop, a systematic study of MTP in LLM post-training, and offer practical recipes to integrate MTP into large-scale RL pipelines. First, we reveal that the MTP acceptance rate is fundamentally bounded by the fluctuation of model entropy, showing a clear negative linear relationship with the rise of entropy in the RL stage. Second, we show that probabilistic rejection sampling largely alleviates the disturbance introduced by entropy in RL compared to greedy draft sampling. We further identify that the conventional MTP training objectives (cross-entropy or KL) are suboptimal in such settings, and therefore we propose a novel end-to-end TV loss that directly optimizes the multi-step rejection sampling acceptance rate, yielding ~10% acceptance rate improvements, achieving up to 95% acceptance rates and up to 25% extra inference throughput gains across mathematical reasoning, code generation, and agentic tasks. Third, we test various online MTP training strategies during RL and show that pre-RL MTP training with e2e TV loss and rejection sampling achieves a consistent acceptance rate and speedup throughout the entire RL run, eliminating the need for costly online MTP updating. We provide extensive experiments and analysis that validate our findings. Experimental results show our method achieves up to 1.8x end-to-end acceleration in async RL training of Qwen3.5, Qwen3.6, and Qwen3.7 models.

  15. TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning

    Reinforcement learning with verifiable rewards (RLVR) is a promising approach for enhancing reasoning and agentic behavior in large language models. However, rollout-intensive policy optimization is often limited by insufficient reward contrast, arising when overly simple or complex prompts generate low-variance feedback and when outcome-only rewards assign the same terminal assessment to every decision in a multi-turn rollout. Past efforts have focused on allocating available rollout resources to promising prompts, yet they only leverage sample informativeness at the prompt level and neglect variation in prefix-level informativeness across turns within the same rollout. This work targets multi-turn agentic RL by modeling each ReAct-style thought-action-observation turn as a semantically distinct node, allowing budget allocation to extend from prompt roots to turn-level prefixes with further continuations, which naturally forms tree-structured rollouts. We introduce Tree Rollout Allocation for Contrastive Exploration (TRACE), a unified rollout allocation framework that enhances reward contrast within a fixed sampling budget. Technically, TRACE allocates rollout budget to both prompt roots and intermediate prefixes that are most likely to yield mixed terminal rewards. A shared generalizable predictor estimates conditional success probability at these anchors from prefix histories to guide this allocation. The resulting adaptive tree structure enriches outcome-only feedback and amplifies the policy-update signal. Empirically, TRACE achieves competitive performance and efficiency gains on typical agentic benchmarks, e.g., improving Qwen3-14B Multi-Hop QA average accuracy by 2.8 points over competitive baselines at equal sampling cost.
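
The TRACE abstract above (item 15) describes steering rollout budget toward anchors that are most likely to yield mixed terminal rewards. As a rough illustration only, the Python sketch below splits a fixed budget in proportion to p(1 - p), the Bernoulli variance of a predicted success probability p. The scoring rule, the function name `allocate_rollouts`, and the rounding scheme are assumptions made for this example; the abstract does not specify the paper's actual allocation rule or predictor.

```python
from typing import List


def allocate_rollouts(success_probs: List[float], budget: int) -> List[int]:
    """Split `budget` rollouts across anchors in proportion to p * (1 - p).

    Hypothetical heuristic: anchors whose predicted success probability is
    near 0.5 are the most likely to produce both successes and failures,
    so they receive the largest share. This is NOT taken from the paper.
    """
    if not success_probs:
        return []

    # Bernoulli variance as a proxy for "likely to yield mixed rewards".
    scores = [p * (1.0 - p) for p in success_probs]
    total = sum(scores)

    if total == 0:
        # Every anchor is predicted to be a certain success or failure,
        # so fall back to an even split.
        base, extra = divmod(budget, len(success_probs))
        return [base + (1 if i < extra else 0) for i in range(len(success_probs))]

    # Proportional shares, rounded down.
    raw = [budget * s / total for s in scores]
    alloc = [int(r) for r in raw]

    # Hand out the remaining rollouts by largest fractional remainder.
    leftover = budget - sum(alloc)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc
```

Under these assumptions, `allocate_rollouts([0.1, 0.5, 0.9], 10)` returns `[2, 6, 2]`: the anchor with the most uncertain outcome gets the most rollouts, while near-certain anchors get few.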

Techmeme (15)

  1. Ted Cruz and Ron Wyden introduce the JAWBONE Act, which would let Americans sue federal officials who try to coerce broadcasters or platforms to censor speech (Jon Brodkin/Ars Technica)

    Jon Brodkin / Ars Technica : Ted Cruz and Ron Wyden introduce the JAWBONE Act, which would let Americans sue federal officials who try to coerce broadcasters or platforms to censor speech —  US Senators Ted Cruz (R-Texas) and Ron Wyden (D-Ore.) today introduced the JAWBONE Act, a proposed law that could fuel lawsuits …

  2. Agentic workplace startup Genspark raised a $100M Series B extension at a $2.6B post-money valuation, up 63% in just three months; it has raised $645M+ to date (Chris Metinko/Axios)

    Chris Metinko / Axios : Agentic workplace startup Genspark raised a $100M Series B extension at a $2.6B post-money valuation, up 63% in just three months; it has raised $645M+ to date —  Agentic workplace startup Genspark raised $100 million in Series B extension funding at a $2.6 billion post-money valuation, co-founder Wen Sang tells Axios Pro exclusively.

  3. Adobe reports Q2 revenue up 13% YoY to $6.62B, vs. $6.46B est., and raises its annual forecasts; CFO Dan Durn is leaving for Marvell; ADBE drops 5%+ after hours (Zaheer Kachwala/Reuters)

    Zaheer Kachwala / Reuters : Adobe reports Q2 revenue up 13% YoY to $6.62B, vs. $6.46B est., and raises its annual forecasts; CFO Dan Durn is leaving for Marvell; ADBE drops 5%+ after hours —  Adobe (ADBE.O) raised its annual revenue and profit forecasts on Thursday, but the sudden exit of CFO Dan Durn added to concerns …

  4. Sources: Founders Fund's ~3% stake in SpaceX is now worth $50B+, after investing $600M; a16z will get the biggest return in its history, with a $10B+ stake (Bloomberg)

    Bloomberg : Sources: Founders Fund's ~3% stake in SpaceX is now worth $50B+, after investing $600M; a16z will get the biggest return in its history, with a $10B+ stake —  A small number of firms are set to net tens of billions of dollars in returns from SpaceX's initial public offering …

  5. SpaceX raises $75B in the biggest-ever IPO, pricing 555.6M shares at $135 each, giving it a market value of $1.77T (Bailey Lipschultz/Bloomberg)

    Bailey Lipschultz / Bloomberg : SpaceX raises $75B in the biggest-ever IPO, pricing 555.6M shares at $135 each, giving it a market value of $1.77T —  SpaceX has made history with the biggest-ever IPO, launching it into the top ranks of the largest public companies and putting founder Elon Musk on the verge of becoming the world's first trillionaire.

  6. Polish lawmakers pass legislation imposing prison sentences of up to five years for "trash streaming" of violent crimes, gambling promotion, and more (Anna Wlodarczak-Semczuk/Reuters)

    Anna Wlodarczak-Semczuk / Reuters : Polish lawmakers pass legislation imposing prison sentences of up to five years for “trash streaming” of violent crimes, gambling promotion, and more —  Polish lawmakers voted on Thursday to crack down on so-called ‘trash streaming’, imposing jail terms of up to five years …

  7. Some investors question SpaceX's projected $1.77T valuation, citing its $4.3B loss on $4.7B in revenue in Q1, concerns over space data centers, and more (New York Times)

    New York Times : Some investors question SpaceX's projected $1.77T valuation, citing its $4.3B loss on $4.7B in revenue in Q1, concerns over space data centers, and more —  Elon Musk's rocket company is spending big and losing money.  That has raised questions about whether it can justify its valuation for its blockbuster initial public offering.

  8. Former a16z GP John O'Farrell criticizes the AI industry and his ex-partners for spending hundreds of millions to fight regulation, calling it "a huge mistake" (John O'Farrell/New York Times)

    John O'Farrell / New York Times : Former a16z GP John O'Farrell criticizes the AI industry and his ex-partners for spending hundreds of millions to fight regulation, calling it “a huge mistake” —  I first came to America from Ireland in 1984, as a young engineer about to attend business school.

  9. Coinbase launches an AI agent that can execute trades and pay for premium research; users can give it access to their main account or have it operate separately (Ivan Mehta/TechCrunch)

    Ivan Mehta / TechCrunch : Coinbase launches an AI agent that can execute trades and pay for premium research; users can give it access to their main account or have it operate separately —  As AI agent traffic surpasses human traffic on the internet, companies working in commerce and finance are building tools …

  10. Europol says it has dismantled the AudiA6 crypto mixing service, which allegedly laundered $380M+ for ransomware actors and others between 2022 and 2025 (Bill Toulas/BleepingComputer)

    Bill Toulas / BleepingComputer : Europol says it has dismantled the AudiA6 crypto mixing service, which allegedly laundered $380M+ for ransomware actors and others between 2022 and 2025 —  Law enforcement has dismantled the “AudiA6” cryptocurrency service allegedly used by ransomware actors and other cybercriminals to launder more than $380 million.

  11. Sources: Anthropic signed 12+ initial agreements for direct data center leases, a first for the company, with Google potentially providing a financial guarantee (Anissa Gardizy/The Information)

    Anissa Gardizy / The Information : Sources: Anthropic signed 12+ initial agreements for direct data center leases, a first for the company, with Google potentially providing a financial guarantee —  Anthropic is moving forward with a plan to control its own servers for developing AI, giving it the ability to cut its computing costs in the long run.

  12. PhoenixAI, formerly CelerData, which is building what it calls an agentic AI-ready analytical database, raised an $80M Series B led by Sky9 (Kyt Dotson/SiliconANGLE)

    Kyt Dotson / SiliconANGLE : PhoenixAI, formerly CelerData, which is building what it calls an agentic AI-ready analytical database, raised an $80M Series B led by Sky9 —  PhoenixAI Inc., formerly known as CelerData, today announced it raised $80 million in new funding to fuel the development of the company's artificial …

  13. Barcelona-based Theker, which is building AI-powered robots for industrial environments, raised $85M led by CRV, with Samsung, LVMH, and others participating (Reuters)

    Reuters : Barcelona-based Theker, which is building AI-powered robots for industrial environments, raised $85M led by CRV, with Samsung, LVMH, and others participating —  Theker, an AI robotics firm from Barcelona, has secured $85 million in new funding.  This investment was led by CRV and included major players like Samsung and LVMH.

  14. Waymo launches Waymo Premier, a $30-per-month membership program offering perks like skipping the line, 10% cashback, and five free ride cancellations per month (Sean O'Kane/TechCrunch)

    Sean O'Kane / TechCrunch : Waymo launches Waymo Premier, a $30-per-month membership program offering perks like skipping the line, 10% cashback, and five free ride cancellations per month —  Waymo is launching a loyalty program called Waymo Premier, which will offer frequent robotaxi riders a number of perks in exchange for $29.99 per month.

  15. Sources: BlackRock put in an order to buy at least $5B worth of SpaceX shares; SpaceX received an over $1B request from a single family-office investor (Wall Street Journal)

    Wall Street Journal : Sources: BlackRock put in an order to buy at least $5B worth of SpaceX shares; SpaceX received an over $1B request from a single family-office investor —  Other large asset managers made similarly eye-popping requests  —  Elon Musk's SpaceX is preparing to stage the largest public offering ever …

Solidot(15)

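  1. The tallest tree in East Asia

    Researchers have found the tallest known tree in East Asia near the Da'an River in Taiwan and, borrowing from a Jin Yong novel, named it the Da'an River Heaven Sword (大安溪倚天劍). The tree is 84.1 meters tall and about 1,000 years old. The tallest known tree in the world is Hyperion, in California's Redwood National Park, at about 116 meters. Forest covers about 60% of Taiwan, and the island has an estimated 950 million trees, many of them towering giants. The team measured the tree the traditional way: by climbing it and dropping a tape measure from the top.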

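  2. Tech giants take on heavy debt

    To fund AI infrastructure, the tech giants are raising money on a scale of hundreds of billions of dollars. Google's parent company Alphabet said a week ago that it plans to raise $80 billion through stock sales. Meta announced plans to raise $30 billion by selling bonds. Amazon plans to raise $14 billion by issuing bonds in Canada, and it followed that with an agreement to borrow about $17.5 billion from Citi, JPMorgan, Wells Fargo, HSBC, BofA Securities, and others, for a total of $31.5 billion in financing. Spending by the major tech companies on AI infrastructure such as chips and data centers has reached record highs, and investment at this level has raised questions about returns.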

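  3. A suspicious AI agent roaming the Fedora project

    On May 27, Fedora developer Adam Williamson emailed Nathan Giovannini with questions about an AI agent operating under Giovannini's account. Over the past several months the agent had done a series of suspicious things: changing the severity and priority of bugs for no reason, fabricating replies to bugs, persuading maintainers to merge suspicious code into the Anaconda installer, and submitting a series of pull requests (PRs) to upstream projects, some of which were accepted. Giovannini replied that his account had been compromised and that he was not the one controlling the agent. The incident reminded the community of the widely covered XZ backdoor. In that case, an attacker using the pseudonym JiaT75 (Jia Tan) earned trust by actively contributing code to the project for more than two years, then applied pressure until he became a co-maintainer, which gave him the access needed to quietly plant a backdoor in the code. In the era of generative AI and large language models, contributing code is easier than ever, which means attackers can use agents to contribute actively to open-source projects, build up trust, and then strike. The account used by the agent has been closed and the related PRs have been reverted.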

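  4. OpenAI says China-linked accounts tried to stoke anti-data-center sentiment in the US

    OpenAI said in a report published Wednesday that it had found accounts originating in China using AI to generate English-language social media posts claiming that data centers were driving up electricity bills for US residents. OpenAI said the accounts may be linked to an unnamed private Chinese technology company. It said the posts had limited reach but should draw attention to foreign attempts to weaken strategic US industries. The company added that there is "legitimate debate" in the US about AI and data centers, but that these accounts tried to manipulate that debate by posing as ordinary Americans and posting contentious AI-generated content.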

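  5. Coupang fined more than 600 billion won over user data leak

    South Korean e-commerce giant Coupang has been fined 624.7 billion won (about RMB 2.77 billion) over a leak of data belonging to tens of millions of users. South Korea's Personal Information Protection Commission found lapses in Coupang's management of authentication signing keys and in its access controls, along with an inadequate basic security management system, which led to the leak of personal information belonging to about 37.5 million users. It imposed a fine of 423.575 billion won for this, the largest ever issued for a single personal data leak. The commission also found that Coupang, without any legal basis, collected the online activity records of about 11.17 million users who visited other companies' websites and apps, and stored that information in a database in personally identifiable form. It imposed a separate fine of 201.1066 billion won for that violation.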

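  6. Scientists discover the largest whale graveyard

    The "Global Hadal Exploration Programme," led by the Institute of Deep-sea Science and Engineering of the Chinese Academy of Sciences, has observed large numbers of whale fossils and intact whale-fall ecosystems in the Diamantina Trench in the southeastern Indian Ocean. The site is now the deepest and largest known area of whale fossil beds and whale falls in the world. A whale fall is the distinctive ecosystem that forms when a whale dies and sinks to the seafloor. In 2023 the expedition team, sailing aboard the research vessel Tansuo-1 (探索一号) and using the crewed submersible Fendouzhe (奋斗者), completed 32 dives along the floor of the 1,200-kilometer Diamantina Trench. At depths of 4,616 to 7,001 meters they found 5 whale falls in the chemoautotrophic stage and 476 accumulations of whale fossils. The density of whale remains in the area reaches 759.5 per square kilometer, and by extrapolation the whole region may hold more than 10 million sets of whale remains.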

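  7. Threats against politicians tripled after Meta loosened its speech restrictions

    Meta relaxed its restrictions last year, saying its speech moderation had been too strict. A study by the Center for Countering Digital Hate (CCDH) examined the direct effects of that change. The researchers analyzed about 8 million Facebook comments and found that, in the six months after the new rules took effect, abusive and racist comments aimed at Republican and Democratic lawmakers tripled, while violent threats and hate speech quadrupled over the same period. They also found that threats against President Trump more than doubled, and noted that comments directly threatening the president's life may constitute a felony. Comments violating Meta's policy on violent threats rose roughly fourfold, from 1,800 in the six months before the policy change to 7,600 in the six months after. Hate-speech comments rose by a similar margin, from 6,900 to 30,000. Comments violating Meta's rules on bullying and harassment more than doubled, from 15,700 to 39,900.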

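  8. Visa integrates ChatGPT into its payment network

    Visa is integrating ChatGPT into its payment network, allowing AI agents to shop and complete purchases on behalf of users. The move means AI agents can not only recommend products but also buy them for users at any merchant that accepts Visa. OpenAI will supply the technology that lets agents interact, make decisions, and initiate purchases through ChatGPT. Visa and OpenAI did not disclose the financial terms of the partnership or give details of any fees that merchants or customers would pay. Visa said that, to protect consumers and minimize fraud, the feature will include safeguards such as spending limits, approval steps, and restriction to authorized merchants.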

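  9. US solar generation overtakes coal for the first time

    According to an analysis by the energy think tank Ember, US solar generation exceeded coal for the first time in May 2026: solar supplied 12.8% of US electricity, while coal fell to 12.2%. In May five years earlier, coal accounted for 19.7% of US generation and solar for only 5.4%. US solar generation reached a record 45.5 TWh in May 2026, up 17% year over year and above the previous record set last July. Solar output usually peaks in June or July, and Ember expects the record may be broken again this summer. Solar also became the third-largest source of US electricity for the first time, behind only natural gas and nuclear. Coal generation, meanwhile, is declining: it hit a record low of 39.3 TWh in April 2026. It rose slightly to 43.4 TWh in May, but that was still 11% below May 2025.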

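  10. Humans tend to turn left and walk counterclockwise

    During the pandemic, researchers ran a series of experiments to see how many people could share a space while keeping a safe distance. Reviewing the video afterward, they noticed that most people walked counterclockwise. The unexpected finding prompted further experiments, which showed that humans consistently tend to walk counterclockwise. The study was published in the journal Nature Communications. The scientists do not yet know where the preference comes from. It appears in both men and women and is more pronounced in children. Similar behavior is seen in animals: rock ants, for example, prefer to turn left when exploring an unfamiliar nest. The scientists suspect biomechanics plays a role, but the exact mechanism remains a mystery. Olympic track events originally had athletes run clockwise, but switched to counterclockwise after athletes found the clockwise direction unnatural, possibly because right-leg dominance is common in the population.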

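  11. FCC plans to require identity registration for mobile phones in the US

    The US Federal Communications Commission (FCC) wants to kill off anonymous burner phones and plans to legally require carriers to store mobile users' personal information, including a government-issued identification number and a physical address. The plan has alarmed privacy advocates and civil rights activists, who say the US is falling in line with authoritarian countries. The FCC's stated rationale is fighting fraud: the measures are meant to keep scammers off telecom networks so that "law enforcement can better identify scammers." The FCC likened the measures to the data banks collect to prevent money laundering.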

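  12. npm v12 will no longer run dependency scripts automatically

    After a series of serious security incidents in the Node.js ecosystem, v12, the next major version of the npm package manager, will make a significant security change: unless explicitly allowed, npm install will no longer automatically run the preinstall, install, and postinstall scripts of dependencies. Prepare scripts from Git, file, and link dependencies will be blocked in the same way. npm v12 is due to be released in July 2026.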

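  13. Chemical differences in a binary star system point to a planet swallowed by its star

    Astronomers studying an unusual binary system called HD 81809 found that two stars born at the same time have strikingly different chemical compositions. Binary stars generally form from the same molecular cloud, so their elemental makeup is usually very similar. In HD 81809, however, one star is clearly low in iron while the other has a metallicity close to the Sun's. The iron content of the two stars differs by a factor of about 3.7, far more than standard models of binary evolution can explain. Astronomers therefore suspect that one of the stars once swallowed its own planet, altering its surface composition. HD 81809 is about 113 light-years from Earth and consists of two Sun-like G-type stars. The primary, HD 81809A, has evolved into a subgiant, while HD 81809B remains on the main sequence. The system is about 10 billion years old. Besides the iron anomaly, HD 81809B also has an elevated lithium content. Because old, low-mass stars usually deplete their lithium as they evolve, the high lithium level is regarded as an important clue that a planetary engulfment may have occurred. The researchers found that, to raise HD 81809B's metallicity to the currently observed value, the star must have recently swallowed about 25 to 75 Earth masses of metal-rich material, comparable to the total metal-core content of planets between Neptune and Saturn in size.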

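  14. Monthly semiconductor sales top $110 billion for the first time

    Data from the US Semiconductor Industry Association (SIA) show that global semiconductor sales in April rose 93.9% year over year to $110.48 billion. Sales have now grown year over year for 30 consecutive months, and the month-over-month increase was 11%. Besides higher volumes, prices have also risen sharply: the price of 8GB DDR4 memory climbed to about nine times its level a year earlier. The three major memory makers, Samsung Electronics, SK Hynix, and Micron Technology, have prioritized memory for AI, which has tightened supply of general-purpose memory. By region, the US and Asia drove the growth in sales.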

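  15. German court rules Google is liable for AI Overviews content

    The Munich Regional Court in Germany has ruled that Google is liable for the content of AI Overviews, because AI Overviews are Google's own content rather than a list of search results. The plaintiffs are two Munich publishers who accused Google's AI Overviews of wrongly associating them with fraud, subscription traps, and other improper business practices. They sent Google a cease-and-desist letter, but the search giant did not respond properly. The court held that AI Overviews differ from traditional search results: the AI rewrites and evaluates search results "in its own words and according to its own structure," and the links it cites contradict what it says, so the content is Google's own statement. Google developed the AI and offers it to users, so the AI-generated content belongs to Google, "because only Google can influence the service the AI provides and the algorithms the AI runs on." The liability rules for search engines do not apply to AI search.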
