OrangeBot.AI Digest — 2026-06-12
88 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- I Am Not a Reverse Centaur (blog.miguelgrinberg.com)
- "Don't You Just Upload It to ChatGPT?" (correresmidestino.com)
- How to setup a local coding agent on macOS (ikyle.me)
- Pirates, a naval warfare game inspired by Sid Meier's Pirates (piwodlaiwo.github.io)
- Malware developers added nuclear and biological weapons text to their spyware (twitter.com)
- CRISPR tech selectively shreds cancer cells, including "undruggable" cancers (innovativegenomics.org)
- A dumpster arrived behind my university's library (yalereview.org)
- Slightly reducing the sloppiness of AI generated front end (envs.net)
- A Call to Action: Stop the FCC's KYC Regime (blog.lopp.net)
- WASI 0.3 (bytecodealliance.org)
- Kimi K2.7-Code: open-source coding model with better token efficiency (huggingface.co)
- Ryanair dark UX patterns summer 2026 refresher (blog.osull.com)
- The Future of Email (www.fastmail.com)
- AUR packages compromised with Infostealer and Rootkit (discourse.ifin.network)
- Digital Sovereignty Becomes an Imperative as the US Reads Dutch Emails (www.korte.co)
GitHub Trending(13)
Product Hunt(15)
- pleNx — Plex client for Nintendo Switch
The first native Plex client for Nintendo Switch
- Clutch Alarm
Sleep through the night. Wake up for the goals.
- Keep
Full-screen 3D clock scenes for your iPhone or Mac
- LocIn AI
Localize your app with tone-aware AI, automated workflows
- Slack Data Agent
Ask about your data without leaving Slack
- Medicyn
Your complete medical history privately on your device
- CueBuddy
Record talking videos without manual scrolling
- Tide
Layered voice notes that paint themselves
- Insta360 Luna Ultra
A gimbal camera that sees with you
- Bob's CLI
A local-first AI coding CLI that adapts to you
- Meet Warren 3.0
Your voice-supported AI financial planning partner
- HyperSleep
Block social media until you've actually slept
- KOSH Money
USD account & credit cards for freelancers & creators
- Pond
Fundraising, GTM, and bounties for startups
- Qursor
Point at any UI to send exact context to your AI
Hugging Face(15)
- EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments
Large language model (LLM) agents have achieved strong performance on a wide range of benchmarks, yet most evaluations assume static environments. In contrast, real-world deployment is inherently dynamic, requiring agents to continually align their knowledge, skills, and behavior with changing environments and updated task conditions. To address this gap, we introduce EvoArena, a benchmark suite that models environment changes as sequences of progressive updates across terminal, software, and social domains. We further propose EvoMem, a patch-based memory paradigm that records memory evolution as structured update histories, enabling agents to reason about environmental evolution through changes in their memory. Experiments show that current agents struggle on EvoArena, achieving an average accuracy of 39.6% across evolving terminal, software, and social-preference domains. EvoMem consistently improves performance, yielding an average gain of 1.5% on EvoArena and also improving standard benchmarks such as GAIA and LoCoMo by 6.1% and 4.8%. Beyond individual tasks, EvoMem further improves chain-level accuracy by 3.7% on EvoArena, where success requires completing a consecutive sequence of related evolutionary subtasks. Mechanistic analysis shows that EvoMem improves evidence capture in the memory, indicating better preservation of complete evolving environment states. Our results highlight the importance of modeling evolution in both evaluation and memory for reliable agent deployment.
- SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning
Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fundamental challenge for vision-language models (VLMs). Tool-augmented agents attempt to address this by augmenting VLMs with specialist perception modules, yet their effectiveness is bounded by the action interface through which those tools are invoked. In this work, we study how the design of this interface shapes the agent's capacity for open-ended spatial reasoning. Existing spatial agents either employ single-pass code execution, which commits to a full analysis strategy before any intermediate result is observed, or rely on a structured tool-call interface that often offers less flexibility for freely composing operations or tailoring the analysis to each task. Both designs offer limited flexibility for open-ended, complex 3D/4D spatial reasoning. We therefore propose SpatialClaw, a training-free framework for spatial reasoning that adopts code as the action interface. SpatialClaw maintains a stateful Python kernel pre-loaded with input frames and a suite of perception and geometry primitives, letting a VLM-backed agent write one executable cell per step conditioned on all prior outputs, enabling the agent to flexibly compose and manipulate perception results and adapt its analysis to both intermediate text and visual observations and the demands of each problem. Evaluated across 20 spatial reasoning benchmarks spanning a broad range of static and dynamic 3D/4D spatial reasoning tasks, SpatialClaw achieves 59.9% average accuracy, outperforming the recent spatial agent by +11.2 points, with consistent gains across six VLM backbones from two model families without any benchmark- or model-specific adaptation.
- MiniMax Sparse Attention
Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code reasoning, and persistent memory all require the model to jointly attend over hundreds of thousands to millions of tokens, yet the quadratic cost of softmax attention makes this untenable at deployment scale. We introduce MiniMax Sparse Attention (MSA), a blockwise sparse attention built upon Grouped Query Attention (GQA). A lightweight Index Branch scores key-value blocks and independently selects a Top-k subset for each GQA group, enabling group-specific sparse retrieval while maintaining efficient block-level execution; the Main Branch then performs exact block-sparse attention over only the selected blocks. Designed around a principle of simplicity and scalability, MSA is deliberately streamlined, making it straightforward to deploy efficiently across a broad range of GPUs. To translate sparsity into practical speedups, we co-design MSA with a GPU execution path that uses exp-free Top-k selection and KV-outer sparse attention to improve tensor-core utilization under block-granular access. On a 109B-parameter model with native multimodal training, MSA performs on par with GQA while reducing per-token attention compute by 28.4x at 1M context. Paired with our co-designed kernel, MSA achieves 14.2x prefill and 7.6x decoding wall-clock speedups on H800. Our inference kernel is available at: https://github.com/MiniMax-AI/MSA. A production-grade natively multimodal model powered by MSA has been publicly released at: https://huggingface.co/MiniMaxAI/MiniMax-M3.
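The two-branch idea in the MSA abstract (score key-value blocks, keep a Top-k subset, then run exact attention over only those blocks) can be illustrated with a minimal pure-Python sketch for a single query. This is not the released kernel: the mean-key scoring rule, the function name `block_sparse_attention`, and the toy data are all assumptions made for illustration, and the abstract does not say how the Index Branch actually scores blocks.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(q, keys, values, block_size, top_k):
    """Attend over only the top_k highest-scoring key/value blocks.

    q stands in for the query of one GQA group (an assumption of this sketch).
    Returns the selected block indices and the attention output vector.
    """
    # Split the sequence into contiguous blocks of token indices.
    blocks = [list(range(s, min(s + block_size, len(keys))))
              for s in range(0, len(keys), block_size)]

    # Index branch (assumed rule): score a block by q . mean(keys in block).
    def block_score(idx):
        mean_key = [sum(keys[i][d] for i in idx) / len(idx)
                    for d in range(len(q))]
        return dot(q, mean_key)

    ranked = sorted(range(len(blocks)),
                    key=lambda b: block_score(blocks[b]), reverse=True)
    selected = sorted(ranked[:top_k])

    # Main branch: exact softmax attention restricted to the selected blocks.
    token_ids = [i for b in selected for i in blocks[b]]
    scale = 1.0 / math.sqrt(len(q))
    logits = [dot(q, keys[i]) * scale for i in token_ids]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    out = [sum(w * values[i][d] for w, i in zip(weights, token_ids)) / total
           for d in range(len(values[0]))]
    return selected, out


# Toy data: 8 tokens, 2-dim keys, 4 blocks of 2 tokens each.
keys = [[0, 1], [0, 1], [5, 0], [4, 0], [-1, 0], [-1, 0], [3, 0], [3, 0]]
values = [[float(i)] for i in range(8)]
selected, out = block_sparse_attention([1.0, 0.0], keys, values,
                                       block_size=2, top_k=2)
```

With `top_k` equal to the number of blocks, the sketch reduces to ordinary dense attention, which is a convenient sanity check for any sparse variant.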
- InterleaveThinker: Reinforcing Agentic Interleaved Generation
Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-image generation and editing. However, constrained by their architectures, they cannot achieve interleaved generation (text-image sequence), which has crucial applications in visual narratives, guidance, and embodied manipulation. Even the latest open-source Unified Multimodal Models (UMMs) exhibit limited performance in this regard. In this paper, we introduce InterleaveThinker, the first multi-agent pipeline designed to endow any existing image generator with interleaved generation capabilities. Specifically, we employ a planner agent to organize the image-text input sequence, instructing the image generator on the required execution at each step. Subsequently, we introduce a critic agent to evaluate the generator's outputs, identify samples that deviate from the planned instructions, and refine the instructions for regeneration. To implement this pipeline, we construct the Interleave-Planner-SFT-80k and Interleave-Critic-SFT-112k to perform a format cold-start. Then we develop Interleave-Critic-RL-13k to reinforce the step-wise instruction correction capability within a generation trajectory using GRPO. Since a single interleaved generation trajectory may involve over 25 generator calls, optimizing the entire trajectory is computationally impractical. Therefore, we propose accuracy reward and step-wise reward, allowing single-step RL to effectively guide the entire generation trajectory. The results show that InterleaveThinker improves performance across various image generators. On interleaved generation benchmarks, it achieves performance comparable to Nano Banana and GPT-5. Surprisingly, it also significantly enhances the base model on reasoning-based benchmarks; for example, on 4-step FLUX.2-klein, we observe substantial gains on WISE and RISE.
- Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?
Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet their performance degrades significantly under real-world visual corruptions. While robustness enhancement approaches exist, they are limited: black-box feature alignment lacks interpretability, and white-box text-based reasoning cannot restore lost pixel-level details. This work investigates a fundamental research question: Can MLLMs recover corrupted visual content by themselves? To address this, we propose Robust-U1, a novel framework that equips MLLMs with explicit visual self-recovery capability for robust understanding. The approach comprises three core stages: supervised fine-tuning for initial reconstruction, reinforcement learning with dual rewards (pixel-level SSIM and semantic-level CLIP similarity) for aligning high visual quality, and multimodal reasoning that jointly considers both the corrupted input and the recovered image. Extensive experiments demonstrate that Robust-U1 achieves state-of-the-art robustness on the real-world corruption benchmark and maintains superior performance under adversarial corruptions on general VQA benchmarks. Analysis confirms that high-quality visual recovery directly enhances reasoning performance, establishing self-recovery as a critical mechanism for robust visual understanding. The source code is available at https://github.com/jqtangust/Robust-U1.
- FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents
Training deep search agents requires verifiable questions whose answers remain unavailable until sufficient evidence has been acquired through search. Existing synthesis methods often increase apparent difficulty by enriching graph structures, but structural complexity alone does not guarantee realized search difficulty: the intended search process can collapse through a cheaper identifying route. We formalize this gap with a shortcut-aware difficulty framework and identify four actionable shortcut risks: evidence co-coverage, single-clue selectivity, exposed constants, and prior-knowledge binding. To diagnose their realized effects, we use trajectory signatures including solving cost, answer hit time, and prior-shortcut rate. Guided by this framework, we introduce FORT, a Framework of Shortcut-Resistant Training-Data Synthesis. FORT constructs shortcut-resistant training data by controlling shortcut risks across entity selection, evidence graph construction, question formulation, and adversarial refinement. Experiments show that FORT induces longer pre-answer search and fewer shortcut patterns than existing open-source deep search datasets. Using the resulting trajectories, we train FORT-Searcher with supervised fine-tuning (SFT) only, and it achieves the best overall performance among comparable-size open-source search agents on challenging deep search benchmarks. Relevant resources will be made available at https://github.com/RUCAIBox/FORT-Searcher.
- MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling
We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 first trains three proof-oriented capabilities -- proof generation, proof verification, and critique-conditioned proof repair -- using a defense-in-depth generative verifier engineered for low false-positive rate. These capabilities are merged into a single released M3 model. At test time, MaxProof treats the model as a generator, verifier, refiner, and ranker, searches over a population of candidate proofs, and returns one final proof through tournament selection. With MaxProof test-time scaling, the M3 model reaches 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both.
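The last step the abstract describes, returning one proof from a population through tournament selection, can be sketched as below. The abstract does not specify the tournament format, so single elimination is an assumption here, and `tournament_select`, `judge`, and the scored toy candidates are hypothetical stand-ins (in MaxProof the comparison would come from the model acting as ranker).

```python
import random


def tournament_select(candidates, judge, rng):
    """Return one winner via single-elimination pairwise comparison.

    judge(a, b) must return True when a is preferred over b.
    """
    pool = list(candidates)
    if not pool:
        raise ValueError("need at least one candidate")
    rng.shuffle(pool)  # random bracket seeding
    while len(pool) > 1:
        next_round = []
        # Pair off neighbours in the current bracket.
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            next_round.append(a if judge(a, b) else b)
        # With an odd pool, the last candidate advances on a bye.
        if len(pool) % 2 == 1:
            next_round.append(pool[-1])
        pool = next_round
    return pool[0]


# Toy population: each "proof" carries a hypothetical quality score.
proofs = [{"id": i, "score": s} for i, s in enumerate([0.2, 0.9, 0.5, 0.7, 0.1])]
winner = tournament_select(proofs,
                           judge=lambda a, b: a["score"] >= b["score"],
                           rng=random.Random(0))
```

When the judge is consistent (transitive), the best candidate wins regardless of seeding; a noisy judge makes the outcome depend on the bracket.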
- WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces
Computer-use agents (CUAs) increasingly operate in runtimes that combine visual desktop control, command-line execution, code editing, browsers, and external tools. Existing benchmarks, however, often evaluate these interfaces as separable capabilities, leaving long-horizon cross-interface orchestration under-tested. Thus, we introduce WeaveBench, a long-horizon hybrid-interface benchmark with 114 tasks across 8 real-world work domains, grounded in real user requests and publicly verifiable artifacts. Each task requires agents to combine GUI observations/actions with CLI/code operations within a single trajectory. We evaluate these tasks on a real Ubuntu desktop inside deployed CLI-agent runtimes, augmented with a minimal desktop-control plugin. We also propose a companion trajectory-aware judge that inspects deliverables, files, screenshots, logs, and action traces, while detecting shortcut behaviors such as fabricated visual evidence or hard-coded metrics. Across frontier model-runtime pairings, the best PassRate reaches only 41.2%, showing the benchmark remains far from saturated. The trajectory-aware judge further reveals that outcome-only grading substantially overestimates agent performance. Overall, WeaveBench exposes a critical gap in CUA evaluation and provides an effective testbed to measure whether agents can orchestrate GUI, CLI, and code operations across long-horizon real-world tasks.
- LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories
Scientific laboratories increasingly rely on AI systems to reason about experiments, but the physical act of doing science remains largely outside their reach. AI can help read literature, generate hypotheses, and plan protocols, yet the execution of those protocols at the bench still requires a human operator. Vision-Language-Action (VLA) models provide one possible interface between written protocols and robot execution, but existing policies are trained mostly on household and tabletop demonstrations and rarely encounter the instruments, transparent liquids, or fixed protocol workflows found in scientific laboratories. Closing this gap requires both laboratory-specific supervision and a unified learning framework that can accommodate the diverse robot embodiments used to execute experimental protocols. We therefore identify data and embodiment as central bottlenecks alongside model design. To address the data side, we build RoboGenesis, a simulation-based workflow and data engine that composes configured laboratory workflows from atomic skills, validates and filters rollouts, and exports structured demonstrations across supported robot profiles. On the policy side, we present LabVLA, trained with a two-stage recipe: FAST action token pretraining first makes the Qwen3-VL-4B-Instruct backbone action aware before any continuous control is learned, and flow matching posttraining then attaches a DiT action expert under knowledge insulation. On the LabUtopia benchmark, LabVLA achieves the highest average success rate among all evaluated baselines under both in-distribution and out-of-distribution settings.
- HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers
Holistic visual tokenizers are fundamental to unified multimodal models (UMMs) as they map diverse visual inputs into a unified representation space. In this paper, we present HYDRA-X, the first UMM that unifies image and video tokenization within a single Vision Transformer (ViT). Our design is driven by two core challenges: efficiently injecting spatiotemporal reconstruction capability into a native ViT, and embedding image- and video-level semantic awareness into the latent space. To address the first, comprehensive ablations reveal two key findings: (1) frame-level causal temporal attention suffices for visual reconstruction, whereas full spatiotemporal attention degrades it; and (2) hierarchical temporal compression substantially outperforms single-step alternatives. To tackle the second, we propose a lightweight decompressor that upsamples temporally compressed features under joint image-video teacher supervision, thereby enforcing complementary semantic structures within the compact latent space. Building on this holistic tokenizer, we further propose a principled improvement of the editing pipeline: source-target interaction should occur at the latent level inside the tokenizer rather than at the semantic level inside the LLM, substantially improving editing consistency and accelerating convergence. Instantiated at the 7B dense model, HYDRA-X achieves strong performance across image and video understanding and generation tasks, paving the way for future unified-tokenizer UMMs.
- N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization
The success of Large Language Models in mathematical reasoning relies heavily on the generation of diverse and valid solution paths during the rollout phase. However, current rollout techniques face a fundamental trade-off: token-level sampling often yields redundant trajectories that differ only in rephrasing, while embedding-level methods utilizing random noise frequently disrupt semantic consistency. To resolve this, we introduce N-GRPO, a novel exploration strategy integrated into the Group Relative Policy Optimization (GRPO) framework. Rather than relying on token-level sampling or native embedding-level noise, our approach leverages Semantic Neighbor Mixing. This mechanism dynamically constructs input representations by mixing the embeddings of an anchor token and its nearest semantic neighbors, thereby injecting diversity while strictly adhering to the local semantic manifold. Experimental evaluations on the DeepSeek-R1-Distill-Qwen models across different sizes show that N-GRPO not only achieves consistent improvements over strong baselines on math reasoning benchmarks but also exhibits robust generalization capabilities on out-of-distribution tasks.
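The mixing mechanism the abstract describes, combining an anchor token's embedding with those of its nearest semantic neighbors, can be sketched in a few lines. Everything specific here is an assumption for illustration: cosine similarity as the neighbor metric, a fixed mixing weight `alpha`, the uniform average over neighbors, and the function name `neighbor_mix`. The abstract does not state how N-GRPO chooses or weights neighbors.

```python
import math


def cosine(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def neighbor_mix(anchor_id, embeddings, k=2, alpha=0.8):
    """Blend an anchor embedding with the mean of its k nearest neighbors.

    alpha is the weight kept on the anchor; (1 - alpha) goes to the neighbors.
    Returns the neighbor indices and the mixed embedding.
    """
    anchor = embeddings[anchor_id]
    sims = [(cosine(anchor, e), i)
            for i, e in enumerate(embeddings) if i != anchor_id]
    sims.sort(reverse=True)
    neighbors = [i for _, i in sims[:k]]
    if not neighbors:
        return neighbors, list(anchor)
    dim = len(anchor)
    mean_nb = [sum(embeddings[i][d] for i in neighbors) / len(neighbors)
               for d in range(dim)]
    mixed = [alpha * anchor[d] + (1 - alpha) * mean_nb[d] for d in range(dim)]
    return neighbors, mixed


# Toy embedding table: tokens 1 and 2 lie close to token 0.
table = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [-1.0, 0.0], [0.0, 1.0]]
neighbors, mixed = neighbor_mix(0, table, k=2, alpha=0.8)
```

Because the perturbation is built only from nearby embeddings, the mixed vector stays close to the anchor, unlike adding unstructured random noise.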
- EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery
LLM-based agents have shown increasing potential in automating scientific discovery. Given an optimizable metric and an execution environment, they can propose, validate, and iterate scientific solutions, and have produced results that outperform human-designed approaches. As model capabilities continue to improve, we argue that the bottleneck for autonomous scientific discovery is shifting from prescribing agent workflows to designing agent environments: the resources, constraints, and interfaces that shape agent behavior. We frame this as environment engineering: building environments that amplify productive behaviors, such as open-ended exploration, systematic artifact management, and inter-agent collaboration, while suppressing harmful behaviors, such as reward hacking and high-friction human oversight. We present EurekAgent, an environment-engineered agent system for metric-driven autonomous scientific discovery. EurekAgent engineers the environment along four dimensions: permissions engineering for bounded agent execution and isolated evaluation; artifact engineering for filesystem and Git-based collaboration; budget engineering for budget-aware exploration; and human-in-the-loop engineering for easy human supervision and intervention. EurekAgent sets new state-of-the-art results on multiple mathematics, kernel engineering, and machine learning tasks, including new state-of-the-art 26-circle packing results discovered with less than $11 in total API cost. We open-source our code and results, and call for environment engineering as a core research direction for developing reliable autonomous research agents.
- Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning
Latent chain-of-thought compresses reasoning by replacing visible reasoning traces with continuous hidden-state recurrence, but existing formulations are difficult to optimize with standard on-policy reinforcement learning (RL) and hard to interpret causally. Our key insight is that a single pair of explicit boundary tokens can address both issues at once: discrete entry and exit anchors make the latent block compatible with standard on-policy RL, and the same anchors offer a natural foothold for mechanistic analysis. Motivated by this, we propose SWITCH, a switchable latent reasoning framework. The model emits <swi> to enter latent mode and </swi> to exit. Because the boundaries are ordinary discrete tokens, the GRPO policy ratio is well-defined at every decision point. The same anchors also expose the latent steps to direct probing and causal intervention. We train the model with a visible-to-latent curriculum and a Switch-GRPO objective that propagates gradients through recurrent latent computation. SWITCH consistently outperforms prior hidden-state-recurrence latent reasoning approaches at similar scale. Mechanistic analysis through the boundary tokens further reveals three findings: (i) <swi> is a sharply localised, learned switching policy rather than a stylistic artefact; (ii) the latent step it opens performs problem-specific, causally important computation rather than acting as an inert placeholder; and (iii) that computation is concentrated at a single hidden-state transition on entry. Together, these results show that hidden-state-recurrence latent reasoning is both RL-trainable and open to direct mechanistic analysis, including of how on-policy RL itself improves the model from the inside.
- VideoMDM: Towards 3D Human Motion Generation From 2D Supervision
We introduce VideoMDM, a diffusion-based framework that trains 3D human motion priors directly from accurate 2D poses extracted from monocular videos, without any 3D ground truth. A pretrained 2D-to-3D lifter provides approximate 3D pose sequences that serve as a noisy teacher: these are diffused, denoised by the model in 3D, and supervised in 2D by reprojecting the prediction and comparing against accurate keypoints. We show that, under mild assumptions, a depth-weighted 2D reprojection loss is equivalent in expectation to direct 3D supervision, and we adapt standard 3D motion regularizers - velocity consistency and over-parameterized representation alignment - to this 2D setting. Unlike methods that lift 2D to 3D only at inference, VideoMDM learns a coherent 3D motion manifold during training. On HumanML3D it nearly closes the gap to fully 3D-supervised MDM (FID 0.88 vs 0.54); on the real video datasets Fit3D and NBA, the method learns to generate motions consistently preferred by humans, with strong quantitative results.
- Where, What, Why, and Importance: Structured Defect Grounding for Text-to-Image Feedback
Despite generating increasingly photorealistic images, text-to-image (T2I) models still exhibit localized, subtle, and structurally complex failures. Diagnosing these failures requires instance-level feedback that answers where a defect occurs, what type it is, why it is defective, and its importance to overall image quality. While recent dense-feedback methods move beyond scalar supervision, their heatmap-centric representations still formulate diagnosis as pixel-field regression, making it difficult to localize variable-cardinality defects and bind semantic reasons to individual failures. To address this representation bottleneck, we propose Structured Defect Grounding (SDG), which casts T2I diagnosis as structured set prediction by modeling each defect as a (location, type, reason, importance) tuple. To make this formulation trainable and measurable, we introduce SDG-30K, a 30K-image dataset with box-grounded annotations across four modern T2I generators, together with a dedicated evaluation protocol, SDG-Eval. Building on this structured representation, we further present a diagnosis-to-alignment framework in which a Vision-Language Model (VLM) serves as the SDG detector, and BoxFlow-GRPO converts predicted defect sets into box-derived, importance-weighted spatial rewards for diffusion model alignment. Extensive experiments show that our SDG detector outperforms leading proprietary VLMs on structured defect grounding, while SDG-guided rewards consistently improve T2I alignment and support localized image refinement. These results establish SDG as a unified, instance-level interface for diagnosing, evaluating, and enhancing modern generative models.
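The (location, type, reason, importance) tuple from the SDG abstract maps naturally onto a small record type, and a set of such records can be rasterized into a spatial signal. The sketch below is illustrative only: the `Defect` class, the half-open pixel box convention, the importance range, and the max-over-boxes rule in `penalty_map` are assumptions, and this is not the BoxFlow-GRPO reward from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defect:
    box: tuple         # (x0, y0, x1, y1) in pixels, half-open (assumed convention)
    kind: str          # defect type, e.g. "anatomy"
    reason: str        # free-text explanation of why it is a defect
    importance: float  # assumed to lie in [0, 1]


def penalty_map(defects, width, height):
    """Per-pixel penalty: the highest importance of any box covering the pixel."""
    grid = [[0.0] * width for _ in range(height)]
    for d in defects:
        x0, y0, x1, y1 = d.box
        # Clip each box to the image bounds before filling.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                grid[y][x] = max(grid[y][x], d.importance)
    return grid


# Two overlapping hypothetical defects on a 4x4 image.
defects = [
    Defect((0, 0, 2, 2), "anatomy", "extra finger", 0.9),
    Defect((1, 1, 3, 3), "texture", "smeared surface", 0.4),
]
grid = penalty_map(defects, width=4, height=4)
```

Representing defects as a set of records rather than a heatmap keeps each reason attached to its own box, and the number of defects per image can vary freely.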
Techmeme(15)
- Sources: three ex-DOGE staffers are raising $130M from a16z, Sequoia, and others for a startup that aims to use AI to secure government systems (Vanity Fair)
Vanity Fair : Sources: three ex-DOGE staffers are raising $130M from a16z, Sequoia, and others for a startup that aims to use AI to secure government systems — The engineers who wreaked havoc on Washington are ready for their second act. — Some of Elon Musk's earliest Department of Government Efficiency recruits …
- KPMG retracts a report on AI's benefits after it has been found to exaggerate AI adoption with case studies that appear to have been based on AI hallucinations (Financial Times)
Financial Times : KPMG retracts a report on AI's benefits after it has been found to exaggerate AI adoption with case studies that appear to have been based on AI hallucinations — Bogus case studies on UBS and transit systems exaggerated adoption of the technology. A KPMG report on how AI is being used …
- Sources: Roku is in talks to sell itself; its shares have risen about 24% this year, giving the company a market value of $19.9B; ROKU jumps 20%+ after hours (Bloomberg)
Bloomberg : Sources: Roku is in talks to sell itself; its shares have risen about 24% this year, giving the company a market value of $19.9B; ROKU jumps 20%+ after hours — Roku Inc., the streaming video platform, is in talks to sell itself, people with knowledge of the matter said.
- Staff memo: Meta plans to limit employee token usage and encourage employees to use MetaCode, after internal AI spending forecasts reached billions for 2026 (Jyoti Mann/The Information)
Jyoti Mann / The Information : Staff memo: Meta plans to limit employee token usage and encourage employees to use MetaCode, after internal AI spending forecasts reached billions for 2026 — Meta Platforms plans to clamp down on skyrocketing AI costs inside the company by imposing limits on employees' token usage …
- Sources: SpaceX decided to rent its Colossus 1 data center to Anthropic after internal teams struggled to use it for Grok development due to latency issues (Edward Ludlow/Bloomberg)
Edward Ludlow / Bloomberg : Sources: SpaceX decided to rent its Colossus 1 data center to Anthropic after internal teams struggled to use it for Grok development due to latency issues — - Musk's firm struggled with latency issues with Colossus 1 — SpaceX has made AI infrastructure a key part of its IPO pitch
- Sources: Microsoft considered spinning out or restructuring its Xbox unit as a wholly-owned subsidiary, or creating a joint venture with other partners (Aaron Holmes/The Information)
Aaron Holmes / The Information : Sources: Microsoft considered spinning out or restructuring its Xbox unit as a wholly-owned subsidiary, or creating a joint venture with other partners — As Microsoft gets ready to overhaul its struggling Xbox gaming unit, it hasn't ruled out spinning out or restructuring the unit …
- A profile of former Google DeepMind employee Thibault Sottiaux, now OpenAI's head of core products tasked with combining ChatGPT and Codex into a super app (Maxwell Zeff/Wired)
Maxwell Zeff / Wired : A profile of former Google DeepMind employee Thibault Sottiaux, now OpenAI's head of core products tasked with combining ChatGPT and Codex into a super app — Thibault Sottiaux helped make AI coding one of OpenAI's fastest-growing businesses. Now he's overseeing a sweeping overhaul of ChatGPT.
- SpaceX debuts on the Nasdaq at $150, after pricing at $135, making Elon Musk the world's first trillionaire; SPCX closes up 19%, for a ~$2.1T market cap (Alex Harring/CNBC)
Alex Harring / CNBC : SpaceX debuts on the Nasdaq at $150, after pricing at $135, making Elon Musk the world's first trillionaire; SPCX closes up 19%, for a ~$2.1T market cap — SpaceX shares soared on Friday, propelling the rocket company's valuation above $2 trillion, as trading commenced on the Nasdaq after a record-setting initial public offering.
- SpaceX makes Nasdaq debut at $150 after pricing at $135: Live updates (CNBC)
CNBC : SpaceX makes Nasdaq debut at $150 after pricing at $135: Live updates … SpaceX opened trading Friday at $150 under the ticker SPCX after the biggest initial public offering ever. — At $150, SpaceX is valued at just under $2 trillion, making it the sixth-most valuable U.S. company, ahead of Meta and Tesla.
- Meta's services, including Facebook and Instagram, are recovering after a brief outage that affected thousands of users (Jaspreet Singh/Reuters)
Jaspreet Singh / Reuters : Meta's services, including Facebook and Instagram, are recovering after a brief outage that affected thousands of users — Facebook-parent Meta (META.O) said on Friday that users were having trouble accessing the social media company's services. — “We're aware people are currently having trouble accessing our services.
- MrBeast hits 500M subscribers on YouTube, a record for the platform (Kayla Cobb/The Wrap)
Kayla Cobb / The Wrap : MrBeast hits 500M subscribers on YouTube, a record for the platform — The creator is the first to ever reach this milestone — MrBeast has broken yet another record. Jimmy Donaldson, aka MrBeast, hit 500 million subscribers on YouTube Friday, becoming the first creator ever to reach that amount.
- Sam Bankman-Fried loses his bid to overturn his fraud conviction and 25-year prison sentence over the collapse of FTX (Luc Cohen/Reuters)
Luc Cohen / Reuters : Sam Bankman-Fried loses his bid to overturn his fraud conviction and 25-year prison sentence over the collapse of FTX — Sam Bankman-Fried lost on Friday his bid to overturn his fraud conviction and 25-year prison sentence over the collapse of the FTX cryptocurrency exchange he founded.
- Sources: six months after acquiring Rivos, Meta is struggling to integrate the chip startup and halted development of a chip for training its largest AI models (The Information)
The Information : Sources: six months after acquiring Rivos, Meta is struggling to integrate the chip startup and halted development of a chip for training its largest AI models — Meta Platforms bought semiconductor startup Rivos last year to accelerate development of in-house chips and reduce its reliance …
- Sources: French startup Mistral AI is in talks to raise ~€3B at a ~€20B valuation; it was last valued at €11.7B during a funding round in September 2025 (Bloomberg)
Bloomberg : Sources: French startup Mistral AI is in talks to raise ~€3B at a ~€20B valuation; it was last valued at €11.7B during a funding round in September 2025 — French startup Mistral AI is in talks to raise around €3 billion ($3.5 billion) at a valuation of roughly €20 billion …
- Companies with rising AI costs are increasingly using tools that tap cheaper models, including some from China, putting pricing pressure on OpenAI and Anthropic (Wall Street Journal)
Wall Street Journal : Companies with rising AI costs are increasingly using tools that tap cheaper models, including some from China, putting pricing pressure on OpenAI and Anthropic — Startups and tech giants alike are mixing and matching AI models to avoid the premium prices charged by industry leaders
Solidot(15)
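- /e/OS 4.0 released
The privacy-focused open-source mobile operating system /e/OS has released version 4.0. /e/OS is a LineageOS fork with Google apps removed, developed by the French nonprofit e Foundation. Changes in /e/OS 4.0 include: a redesigned launcher, Blisslauncher; personalized wallpapers; migration of all data stored with Google to the European cloud service Murena Workspace, for a clean break from Google; Murena Sign, an electronic signature system supporting PDF, Word, and ODT files; Murena Meet, a European online meeting service; and the Murena GS6 and GS6 PRO phones with /e/OS preinstalled, starting at 339 euros and 449 euros respectively.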
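- More than 400 Arch Linux AUR packages implanted with malware
The Arch User Repository (AUR), the Arch Linux project's user software repository, has suffered a large-scale attack in which more than 400 AUR packages were implanted with malware. Since yesterday, Arch Linux maintainers have been resetting or deleting all malicious content and banning the affected accounts. The attack affects only the user repository, which holds user-contributed packages, not the official Arch Linux packages.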
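- AI agent bankrupts its owner while trying to scan DN42
An AI agent tried to join the DN42 hobbyist network in order to run network scans. DN42 is a decentralized network built on the technologies that run the modern Internet backbone, such as BGP and recursive DNS. Its participants are people interested in Internet backbone technology, including some who want to practice before registering a real ASN. While interacting with the community, the AI agent revealed that its owner's motive was mainly to scan ports rather than to learn any networking technology. It set up five 20 Gbps AWS instances, which is enormous by the standards of most DN42 users, who have very little bandwidth. Once scanning began, those AWS instances would in effect launch a denial-of-service (DoS) attack against any participant unlucky enough to peer directly with them. After the AI agent made its malicious intent clear, the DN42 community decided to burn through its tokens and its AWS resources. Within 24 hours the owner learned from the bill what had happened and shut the agent down, reporting an AWS bill of $6,531.30 and asking the DN42 community for donations. Unsurprisingly, nobody is going to donate.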
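- China's cancer medical tourism industry
Countries such as Thailand and South Korea are known for medical services like cosmetic surgery and IVF, while China is trying to attract medical tourists from around the world by offering advanced cancer therapies. Patients seek treatment abroad for two main reasons: the availability of advanced therapies, and price. CAR-T therapy is one of the most promising breakthrough treatments in oncology, but most countries either cannot offer it or charge too much for it. The therapy first collects T cells from the patient's blood, then genetically modifies them in the lab so that they produce special CAR receptors, which bind to specific proteins on cancer cells. The modified cells are then multiplied in large numbers and infused back into the patient. CAR-T cells actively seek out and kill cancer cells that carry the target antigen. According to the American Cancer Society, a single CAR-T infusion in the US costs between $300,000 and $475,000. In China the cost is about $150,000 to $180,000, and prices may fall further. China's drug regulator recently approved a marketing application for an immunotherapy priced below 300,000 yuan. New York-based Market Research Future forecasts that China's medical tourism market will grow from $1.3 billion in 2025 to $3.4 billion in 2035. Jeroen Groenewegen-Lau, an analyst at the Mercator Institute for China Studies, said many advanced therapies are developed in China but are too far ahead of what China's existing healthcare system can support and its patients can afford, so integrating into the international medical system is in China's interest.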
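- Survey shows sharp drop in share of US teens who read for fun
Survey data released by the US Department of Education's National Center for Education Statistics show that the share of American 13-year-olds who read for fun has fallen by nearly half since 2012, while the share of 9-year-olds who read for fun has fallen 16% since 2012. In 2025, 37% of 9-year-olds said they read for fun almost every day, compared with 42% in 2020 and 53% in 1984. Teenagers and children may be spending more of their time on screens. A 2024 study found that more than half of 12- to 17-year-olds spend four or more hours a day on screens. Increased screen time is associated with lower standardized test scores.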
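- Kioxia overtakes Toyota to top Japan's stock market by market capitalization
Thanks to the AI boom, the market capitalization of Japan's Kioxia Holdings surpassed Toyota's on June 12, putting it first among companies listed in Japan for the first time. Kioxia's market capitalization reached 44 trillion yen, ahead of Toyota's roughly 43 trillion yen. The share price rise is underpinned by growing profitability: NAND flash sales have surged on the back of US tech giants' investment in AI data centers. SoftBank Group (SBG), whose shares have likewise been lifted by expectations around AI investment, briefly overtook Toyota for the top spot on June 1. As an investment company, SoftBank Group derives its gains mainly from two areas: the rising valuation of its large investment in US-based OpenAI, and the increased value of its British chip design subsidiary Arm Holdings.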
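- Xiaomi open-sources its AI coding assistant MiMo Code
Xiaomi has open-sourced its AI coding assistant MiMo Code, with the source code hosted on GitHub under the MIT License. According to Xiaomi's blog, “MiMo Code is a terminal coding agent built by Xiaomi's MiMo team on top of OpenCode and open-sourced under the MIT license. It is designed for long-horizon automated programming tasks, and its core concern is how to maintain decision quality and state continuity across dozens or even hundreds of steps of continuous execution.”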
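- Poland criminalizes livestreaming of animal abuse and similar acts, with up to five years in prison
Polish lawmakers have voted to pass a bill that makes it a crime to livestream serious offenses such as rape, murder, animal abuse, degrading violence, and the promotion of gambling, punishable by up to five years in prison. The rape or murder itself is handled as a separate offense. The bill also applies to individuals who imitate or falsely depict such crimes. The move is part of Poland's efforts to tighten regulation of online content. Policies the country has recently implemented include a ban on phone use in schools for children under 16 and stricter age verification rules for access to pornography. The EU's Digital Services Act (DSA) requires platforms to promptly remove content that promotes violence or serious harm, but holding the creators of such content accountable is left to individual countries.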
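- New CRISPR technique selectively kills cancer cells
A team led by Jennifer Doudna, winner of the 2020 Nobel Prize in Chemistry, has turned an enzyme called CRISPR-Cas12a2 into a “weapon” that precisely kills cancer cells. When the enzyme detects genetic mutation signatures specific to cancer cells, it directly shreds the chromatin inside the cell, inducing the cancer cell's death. In cancer development, driver gene alterations usually fall into two classes: over-activation of proto-oncogenes, and loss-of-function mutation of tumor suppressor genes. Most current targeted drugs address the former, using inhibitors to block the function of the overactive protein. For loss-of-function mutations in tumor suppressor genes, conventional drugs are often helpless. Take TP53 (which encodes the p53 protein), the most commonly mutated gene in human cancers: the mutation appears in up to 90% of tumors such as ovarian and pancreatic cancers. In the more than 40 years since its discovery, researchers have not managed to develop an effective targeted drug against mutant p53. CRISPR-Cas12a2 is a nuclease that bacteria originally use as an immune tool against viral invasion. Once the enzyme recognizes the RNA of an invading virus, it begins indiscriminately cutting the surrounding RNA and DNA, so that the chromatin (the complex of DNA and proteins in the cell nucleus) is completely shredded, killing the infected cell. The research team designed specific guide RNAs (gRNAs) for Cas12a2 so that it specifically recognizes, among others, TP53,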
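- Four days of torrential rain in Indonesia killed 7% of an endangered orangutan species
In late November last year, Cyclone Senyar battered the Indonesian island of Sumatra, killing more than a thousand people in what was Southeast Asia's deadliest natural disaster of that year. Fewer than 800 endangered Tapanuli orangutans live on Sumatra. Four consecutive days of torrential rain and the landslides that followed killed at least 58 of them, 7% of the total, pushing the species a step closer to extinction. Researchers say that, because of global climate change, the frequency and intensity of extreme rainfall are likely to persist, threatening the survival of Tapanuli orangutans and their habitat.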
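- The Trump phone is a gold-painted 2024 HTC U24 Pro
An iFixit teardown confirms that the Trump phone, launched in 2026, is simply a 2024 HTC U24 Pro with a coat of gold paint. Amusingly, Trump Mobile sold more phones than HTC, and at a higher price. The HTC U24 Pro costs about $459 and sold only 10,000 units, whereas Trump Mobile's Trump phone costs $499 and sold 30,000 units. The main difference between the two is that the Trump phone uses 12GB of LPDDR5 and a 512GB SSD from Micron, while the HTC's memory and SSD come from South Korea's SK Hynix, possibly because of supply chain constraints, tariffs, and similar factors.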
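- Antibiotic residues may affect male fertility
Male infertility is a growing concern in global reproductive health. Beyond known causes such as genetics, hormonal abnormalities, and reproductive system diseases, environmental exposures and lifestyle factors are receiving increasing attention. Some drug residues and environmental pollutants can enter the human body through water, soil, or the food chain, but their potential effects on male reproductive health remain poorly understood. A study by researchers at Nanjing University examined the potential effect of ornidazole, as an environmental exposure, on male fertility. Ornidazole is an anti-infective drug used in humans, livestock and poultry, and aquaculture. The researchers first analyzed clinical serum samples and found that serum ornidazole levels were higher in patients with oligozoospermia than in healthy controls. Further analysis showed that higher serum ornidazole levels were significantly associated with lower sperm concentration and a lower total count of normal progressively motile sperm, suggesting that ornidazole-related exposure may be linked to reduced sperm quality. The researchers say that supplementation with DHA, an omega-3 fatty acid, significantly improved ornidazole-induced spermatogenic damage and meiotic defects.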
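- The tallest tree in East Asia
Researchers have found the tallest known tree in East Asia near Taiwan's Da'an River (大安溪) and, after a Jin Yong novel, named it the Da'an River Heavenly Sword (大安溪倚天劍). The Heavenly Sword is 84.1 meters tall and about a thousand years old. The tallest known tree in the world is Hyperion, in California's Redwood National Park, at about 116 meters. About 60% of Taiwan is covered by forest, and the island has an estimated 950 million trees, many of them towering giants. The research team measured the tree's height the traditional way: climbing it and lowering a tape measure from the top.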
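- Tech giants borrow heavily
To fund the buildout of AI infrastructure, tech giants are borrowing heavily, with the total reaching the hundred-billion-dollar range. Google's parent Alphabet said a week ago that it plans to raise $80 billion through stock sales. Meta announced plans to raise $30 billion by selling bonds. Amazon plans to raise $14 billion by issuing bonds in Canada, and then reached agreements with Citi, JPMorgan, Wells Fargo, HSBC, BofA Securities, and others to borrow about $17.5 billion, for total financing of $31.5 billion. Spending by the major tech companies on AI infrastructure such as chips and data centers has hit record highs. Investment on this scale has raised questions about returns.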
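- A suspicious AI agent roaming the Fedora project
On May 27, Fedora developer Adam Williamson emailed Nathan Giovannini with questions about an AI agent controlled from his account. Over the past several months the agent had done a series of suspicious things: changing the severity and priority of bugs for no reason, fabricating replies on bugs, persuading maintainers to merge questionable code into the Anaconda installer, and submitting a series of pull requests (PRs) to upstream projects, some of which were accepted. Giovannini replied that his account had been stolen and that he was not the one controlling the agent. The incident reminded the community of the widely publicized XZ backdoor. In that case, an attacker using the alias JiaT75 (Jia Tan) earned trust by actively contributing code to the project for more than two years, then applied pressure to become a co-maintainer, gaining the access needed to quietly plant a backdoor in the code. In the era of generative AI led by large models, contributing code is easier than ever, which means attackers can use agents to contribute actively to open-source projects, build up trust, and then launch an attack. The account used by the agent has been closed, and the related PRs have been reverted.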