OrangeBot.AI Digest — 2026-06-13
89 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Amazon CEO's talks with U.S. officials triggered crackdown on Anthropic models (www.wsj.com)
- GLM 5.2 Is Out (twitter.com)
- AI coding at home without going broke (stephen.bochinski.dev)
- Treating pancreatic tumours may have revealed cancer's master switch (economist.com)
- Noise infusion banned from statistical products published by Census Bureau (desfontain.es)
- Every Frame Perfect (tonsky.me)
- RTX 5080 and RTX 3090 Setup: 80 Tok/s on Qwen 3.6 27B Q8 (imil.net)
- The experience of rendering Arabic typography and its technical debt (lr0.org)
- AI OSS tool repo goes archived over night after raising $7.3M Seed (github.com)
- Arch Linux Now Believes Malware Incident Under Control: More Than 1,500 Packages (www.phoronix.com)
- A low-carbon computing platform from your retired phones (research.google)
- Israeli firm BlackCore suspected of meddling in New York and Scotland votes (www.reuters.com)
- Leaving Mozilla (blog.unitedheroes.net)
- Shepherd's Dog: A Game by Fable (koenvangilst.nl)
- There is a shadow hanging over this Fable thing (12gramsofcarbon.com)
GitHub Trending(14)
- iptv-org / iptv
- addyosmani / agent-skills
- chatwoot / chatwoot
- obra / superpowers
- apple / container
- music-assistant / server
- kenn-io / agentsview
- LMCache / LMCache
- microsoft / PowerToys
- andrewyng / aisuite
- NVIDIA / SkillSpector
- bannedbook / fanqiang
- swc-project / swc
- x1xhlol / system-prompts-and-models-of-ai-tools
Product Hunt(15)
- Kimi K2.7 Code
Kimi’s most capable coding model yet
- CakewordAI
Point at anything to learn its name in any language
- Avatars in ElevenCreative
A dedicated entry point for talking-head video
- Vercel Drop
Drop it. It's live.
- Prometheus by Firecrawl
A Forward Deployed Agent for web data.
- NomNak
Find restaurants through people you trust
- Firma.dev
E-signatures API for your app averaging ~3¢ per envelope
- Qursor
Point at any UI to send exact context to your AI
- KOSH Money
USD account & credit cards for freelancers & creators
- CueBuddy
Record talking videos without manual scrolling
- Meet Warren 3.0
Your voice-supported AI financial planning partner
- Keep
Full-screen 3D clock scenes for your iPhone or Mac
- Bob's CLI
A local-first AI coding CLI that adapts to you
- Medicyn
Your complete medical history privately on your device
- Pond
Fundraising, GTM, and bounties for startups
Hugging Face(15)
- EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments
Large language model (LLM) agents have achieved strong performance on a wide range of benchmarks, yet most evaluations assume static environments. In contrast, real-world deployment is inherently dynamic, requiring agents to continually align their knowledge, skills, and behavior with changing environments and updated task conditions. To address this gap, we introduce EvoArena, a benchmark suite that models environment changes as sequences of progressive updates across terminal, software, and social domains. We further propose EvoMem, a patch-based memory paradigm that records memory evolution as structured update histories, enabling agents to reason about environmental evolution through changes in their memory. Experiments show that current agents struggle on EvoArena, achieving an average accuracy of 39.6% across evolving terminal, software, and social-preference domains. EvoMem consistently improves performance, yielding an average gain of 1.5% on EvoArena and also improving standard benchmarks such as GAIA and LoCoMo by 6.1% and 4.8%. Beyond individual tasks, EvoMem further improves chain-level accuracy by 3.7% on EvoArena, where success requires completing a consecutive sequence of related evolutionary subtasks. Mechanistic analysis shows that EvoMem improves evidence capture in the memory, indicating better preservation of complete evolving environment states. Our results highlight the importance of modeling evolution in both evaluation and memory for reliable agent deployment.
- MiniMax Sparse Attention
Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code reasoning, and persistent memory all require the model to jointly attend over hundreds of thousands to millions of tokens, yet the quadratic cost of softmax attention makes this untenable at deployment scale. We introduce MiniMax Sparse Attention (MSA), a blockwise sparse attention built upon Grouped Query Attention (GQA). A lightweight Index Branch scores key-value blocks and independently selects a Top-k subset for each GQA group, enabling group-specific sparse retrieval while maintaining efficient block-level execution; the Main Branch then performs exact block-sparse attention over only the selected blocks. Designed around a principle of simplicity and scalability, MSA is deliberately streamlined, making it straightforward to deploy efficiently across a broad range of GPUs. To translate sparsity into practical speedups, we co-design MSA with a GPU execution path that uses exp-free Top-k selection and KV-outer sparse attention to improve tensor-core utilization under block-granular access. On a 109B-parameter model with native multimodal training, MSA performs on par with GQA while reducing per-token attention compute by 28.4x at 1M context. Paired with our co-designed kernel, MSA achieves 14.2x prefill and 7.6x decoding wall-clock speedups on H800. Our inference kernel is available at: https://github.com/MiniMax-AI/MSA. A production-grade natively multimodal model powered by MSA has been publicly released at: https://huggingface.co/MiniMaxAI/MiniMax-M3.
- WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces
Computer-use agents (CUAs) increasingly operate in runtimes that combine visual desktop control, command-line execution, code editing, browsers, and external tools. Existing benchmarks, however, often evaluate these interfaces as separable capabilities, leaving long-horizon cross-interface orchestration under-tested. Thus, we introduce WeaveBench, a long-horizon hybrid-interface benchmark with 114 tasks across 8 real-world work domains, grounded in real user requests and publicly verifiable artifacts. Each task requires agents to combine GUI observations/actions with CLI/code operations within a single trajectory. We evaluate these tasks on a real Ubuntu desktop inside deployed CLI-agent runtimes, augmented with a minimal desktop-control plugin. We also propose a companion trajectory-aware judge that inspects deliverables, files, screenshots, logs, and action traces, while detecting shortcut behaviors such as fabricated visual evidence or hard-coded metrics. Across frontier model-runtime pairings, the best PassRate reaches only 41.2%, showing the benchmark remains far from saturated. The trajectory-aware judge further reveals that outcome-only grading substantially overestimates agent performance. Overall, WeaveBench exposes a critical gap in CUA evaluation and provides an effective testbed to measure whether agents can orchestrate GUI, CLI, and code operations across long-horizon real-world tasks.
- SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning
Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fundamental challenge for vision-language models (VLMs). Tool-augmented agents attempt to address this by augmenting VLMs with specialist perception modules, yet their effectiveness is bounded by the action interface through which those tools are invoked. In this work, we study how the design of this interface shapes the agent's capacity for open-ended spatial reasoning. Existing spatial agents either employ single-pass code execution, which commits to a full analysis strategy before any intermediate result is observed, or rely on a structured tool-call interface that often offers less flexibility for freely composing operations or tailoring the analysis to each task. Both designs offer limited flexibility for open-ended, complex 3D/4D spatial reasoning. We therefore propose SpatialClaw, a training-free framework for spatial reasoning that adopts code as the action interface. SpatialClaw maintains a stateful Python kernel pre-loaded with input frames and a suite of perception and geometry primitives, letting a VLM-backed agent write one executable cell per step conditioned on all prior outputs, enabling the agent to flexibly compose and manipulate perception results and adapt its analysis to both intermediate text and visual observations and the demands of each problem. Evaluated across 20 spatial reasoning benchmarks spanning a broad range of static and dynamic 3D/4D spatial reasoning tasks, SpatialClaw achieves 59.9% average accuracy, outperforming the recent spatial agent by +11.2 points, with consistent gains across six VLM backbones from two model families without any benchmark- or model-specific adaptation.
- Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?
Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet their performance degrades significantly under real-world visual corruptions. While robustness enhancement approaches exist, they are limited: black-box feature alignment lacks interpretability, and white-box text-based reasoning cannot restore lost pixel-level details. This work investigates a fundamental research question: Can MLLMs recover corrupted visual content by themselves? To address this, we propose Robust-U1, a novel framework that equips MLLMs with explicit visual self-recovery capability for robust understanding. The approach comprises three core stages: supervised fine-tuning for initial reconstruction, reinforcement learning with dual rewards (pixel-level SSIM and semantic-level CLIP similarity) for aligning high visual quality, and multimodal reasoning that jointly considers both the corrupted input and the recovered image. Extensive experiments demonstrate that Robust-U1 achieves state-of-the-art robustness on the real-world corruption benchmark and maintains superior performance under adversarial corruptions on general VQA benchmarks. Analysis confirms that high-quality visual recovery directly enhances reasoning performance, establishing self-recovery as a critical mechanism for robust visual understanding. The source code is available at https://github.com/jqtangust/Robust-U1.
- InterleaveThinker: Reinforcing Agentic Interleaved Generation
Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-image generation and editing. However, constrained by their architectures, they cannot achieve interleaved generation (text-image sequence), which has crucial applications in visual narratives, guidance, and embodied manipulation. Even the latest open-source Unified Multimodal Models (UMMs) exhibit limited performance in this regard. In this paper, we introduce InterleaveThinker, the first multi-agent pipeline designed to endow any existing image generator with interleaved generation capabilities. Specifically, we employ a planner agent to organize the image-text input sequence, instructing the image generator on the required execution at each step. Subsequently, we introduce a critic agent to evaluate the generator's outputs, identify samples that deviate from the planned instructions, and refine the instructions for regeneration. To implement this pipeline, we construct the Interleave-Planner-SFT-80k and Interleave-Critic-SFT-112k to perform a format cold-start. Then we develop Interleave-Critic-RL-13k to reinforce the step-wise instruction correction capability within a generation trajectory using GRPO. Since a single interleaved generation trajectory may involve over 25 generator calls, optimizing the entire trajectory is computationally impractical. Therefore, we propose accuracy reward and step-wise reward, allowing single-step RL to effectively guide the entire generation trajectory. The results show that InterleaveThinker improves performance across various image generators. On interleaved generation benchmarks, it achieves performance comparable to Nano Banana and GPT-5. Surprisingly, it also significantly enhances the base model on reasoning-based benchmarks; for example, on 4-step FLUX.2-klein, we observe substantial gains on WISE and RISE.
- MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling
We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 first trains three proof-oriented capabilities -- proof generation, proof verification, and critique-conditioned proof repair -- using a defense-in-depth generative verifier engineered for low false-positive rate. These capabilities are merged into a single released M3 model. At test time, MaxProof treats the model as a generator, verifier, refiner, and ranker, searches over a population of candidate proofs, and returns one final proof through tournament selection. With MaxProof test-time scaling, the M3 model reaches 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both.
- FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents
Training deep search agents requires verifiable questions whose answers remain unavailable until sufficient evidence has been acquired through search. Existing synthesis methods often increase apparent difficulty by enriching graph structures, but structural complexity alone does not guarantee realized search difficulty: the intended search process can collapse through a cheaper identifying route. We formalize this gap with a shortcut-aware difficulty framework and identify four actionable shortcut risks: evidence co-coverage, single-clue selectivity, exposed constants, and prior-knowledge binding. To diagnose their realized effects, we use trajectory signatures including solving cost, answer hit time, and prior-shortcut rate. Guided by this framework, we introduce FORT, a Framework of Shortcut-Resistant Training-Data Synthesis. FORT constructs shortcut-resistant training data by controlling shortcut risks across entity selection, evidence graph construction, question formulation, and adversarial refinement. Experiments show that FORT induces longer pre-answer search and fewer shortcut patterns than existing open-source deep search datasets. Using the resulting trajectories, we train FORT-Searcher with supervised fine-tuning (SFT) only, and it achieves the best overall performance among comparable-size open-source search agents on challenging deep search benchmarks. Relevant resources will be made available at https://github.com/RUCAIBox/FORT-Searcher.
- LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories
Scientific laboratories increasingly rely on AI systems to reason about experiments, but the physical act of doing science remains largely outside their reach. AI can help read literature, generate hypotheses, and plan protocols, yet the execution of those protocols at the bench still requires a human operator. Vision-Language-Action (VLA) models provide one possible interface between written protocols and robot execution, but existing policies are trained mostly on household and tabletop demonstrations and rarely encounter the instruments, transparent liquids, or fixed protocol workflows found in scientific laboratories. Closing this gap requires both laboratory-specific supervision and a unified learning framework that can accommodate the diverse robot embodiments used to execute experimental protocols. We therefore identify data and embodiment as central bottlenecks alongside model design. To address the data side, we build RoboGenesis, a simulation-based workflow and data engine that composes configured laboratory workflows from atomic skills, validates and filters rollouts, and exports structured demonstrations across supported robot profiles. On the policy side, we present LabVLA, trained with a two-stage recipe: FAST action token pretraining first makes the Qwen3-VL-4B-Instruct backbone action aware before any continuous control is learned, and flow matching posttraining then attaches a DiT action expert under knowledge insulation. On the LabUtopia benchmark, LabVLA achieves the highest average success rate among all evaluated baselines under both in-distribution and out-of-distribution settings.
- HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers
Holistic visual tokenizers are fundamental to unified multimodal models (UMMs) as they map diverse visual inputs into a unified representation space. In this paper, we present HYDRA-X, the first UMM that unifies image and video tokenization within a single Vision Transformer (ViT). Our design is driven by two core challenges: efficiently injecting spatiotemporal reconstruction capability into a native ViT, and embedding image- and video-level semantic awareness into the latent space. To address the first, comprehensive ablations reveal two key findings: (1) frame-level causal temporal attention suffices for visual reconstruction, whereas full spatiotemporal attention degrades it; and (2) hierarchical temporal compression substantially outperforms single-step alternatives. To tackle the second, we propose a lightweight decompressor that upsamples temporally compressed features under joint image-video teacher supervision, thereby enforcing complementary semantic structures within the compact latent space. Building on this holistic tokenizer, we further propose a principled improvement of the editing pipeline: source-target interaction should occur at the latent level inside the tokenizer rather than at the semantic level inside the LLM, substantially improving editing consistency and accelerating convergence. Instantiated at the 7B dense model, HYDRA-X achieves strong performance across image and video understanding and generation tasks, paving the way for future unified-tokenizer UMMs.
- N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization
The success of Large Language Models in mathematical reasoning relies heavily on the generation of diverse and valid solution paths during the rollout phase. However, current rollout techniques face a fundamental trade-off: token-level sampling often yields redundant trajectories that differ only in rephrasing, while embedding-level methods utilizing random noise frequently disrupt semantic consistency. To resolve this, we introduce N-GRPO, a novel exploration strategy integrated into the Group Relative Policy Optimization (GRPO) framework. Rather than relying on token-level sampling or native embedding-level noise, our approach leverages Semantic Neighbor Mixing. This mechanism dynamically constructs input representations by mixing the embeddings of an anchor token and its nearest semantic neighbors, thereby injecting diversity while strictly adhering to the local semantic manifold. Experimental evaluations on the DeepSeek-R1-Distill-Qwen models across different sizes show that N-GRPO not only achieves consistent improvements over strong baselines on math reasoning benchmarks but also exhibits robust generalization capabilities on out-of-distribution tasks.
- EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery
LLM-based agents have shown increasing potential in automating scientific discovery. Given an optimizable metric and an execution environment, they can propose, validate, and iterate scientific solutions, and have produced results that outperform human-designed approaches. As model capabilities continue to improve, we argue that the bottleneck for autonomous scientific discovery is shifting from prescribing agent workflows to designing agent environments: the resources, constraints, and interfaces that shape agent behavior. We frame this as environment engineering: building environments that amplify productive behaviors, such as open-ended exploration, systematic artifact management, and inter-agent collaboration, while suppressing harmful behaviors, such as reward hacking and high-friction human oversight. We present EurekAgent, an environment-engineered agent system for metric-driven autonomous scientific discovery. EurekAgent engineers the environment along four dimensions: permissions engineering for bounded agent execution and isolated evaluation; artifact engineering for filesystem and Git-based collaboration; budget engineering for budget-aware exploration; and human-in-the-loop engineering for easy human supervision and intervention. EurekAgent sets new state-of-the-art results on multiple mathematics, kernel engineering, and machine learning tasks, including new state-of-the-art 26-circle packing results discovered with less than $11 in total API cost. We open-source our code and results, and call for environment engineering as a core research direction for developing reliable autonomous research agents.
- Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning
Latent chain-of-thought compresses reasoning by replacing visible reasoning traces with continuous hidden-state recurrence, but existing formulations are difficult to optimize with standard on-policy reinforcement learning (RL) and hard to interpret causally. Our key insight is that a single pair of explicit boundary tokens can address both issues at once: discrete entry and exit anchors make the latent block compatible with standard on-policy RL, and the same anchors offer a natural foothold for mechanistic analysis. Motivated by this, we propose SWITCH, a switchable latent reasoning framework. The model emits <swi> to enter latent mode and </swi> to exit. Because the boundaries are ordinary discrete tokens, the GRPO policy ratio is well-defined at every decision point. The same anchors also expose the latent steps to direct probing and causal intervention. We train the model with a visible-to-latent curriculum and a Switch-GRPO objective that propagates gradients through recurrent latent computation. SWITCH consistently outperforms prior hidden-state-recurrence latent reasoning approaches at similar scale. Mechanistic analysis through the boundary tokens further reveals three findings: (i) <swi> is a sharply localised, learned switching policy rather than a stylistic artefact; (ii) the latent step it opens performs problem-specific, causally important computation rather than acting as an inert placeholder; and (iii) that computation is concentrated at a single hidden-state transition on entry. Together, these results show that hidden-state-recurrence latent reasoning is both RL-trainable and open to direct mechanistic analysis, including of how on-policy RL itself improves the model from the inside.
- VideoMDM: Towards 3D Human Motion Generation From 2D Supervision
We introduce VideoMDM, a diffusion-based framework that trains 3D human motion priors directly from accurate 2D poses extracted from monocular videos, without any 3D ground truth. A pretrained 2D-to-3D lifter provides approximate 3D pose sequences that serve as a noisy teacher: these are diffused, denoised by the model in 3D, and supervised in 2D by reprojecting the prediction and comparing against accurate keypoints. We show that, under mild assumptions, a depth-weighted 2D reprojection loss is equivalent in expectation to direct 3D supervision, and we adapt standard 3D motion regularizers - velocity consistency and over-parameterized representation alignment - to this 2D setting. Unlike methods that lift 2D to 3D only at inference, VideoMDM learns a coherent 3D motion manifold during training. On HumanML3D it nearly closes the gap to fully 3D-supervised MDM (FID 0.88 vs 0.54); on the real video datasets Fit3D and NBA, the method learns to generate motions consistently preferred by humans, with strong quantitative results.
- VIA-SD: Verification via Intra-Model Routing for Speculative Decoding
Speculative decoding (SD) addresses the high inference costs of LLMs by having lightweight drafters generate candidates for large verifiers to validate in parallel. Existing draft-verify methods use binary decisions: accept or fully recompute. Yet we find that many rejected tokens can be verified correctly by a slim submodel derived from the full verifier via intra-model routing, instead of the full verifier. This motivates our slim-verifier to handle tokens requiring moderate verification resources, reducing expensive large-model calls. We propose Verification via Intra-Model Routing for Speculative Decoding (VIA-SD), a multi-tier framework using a routed slim-verifier. Draft tokens are processed hierarchically: direct acceptance for high-confidence cases, slim-verifier regeneration for medium-confidence cases, and full-model verification for uncertain cases. Across four representative tasks and multiple model families, VIA-SD reduces rejection rates by 0.10-0.22 and delivers 10-20% speedups over strong SD baselines, while achieving 2.5-3x acceleration over non-drafting decoding. Moreover, VIA-SD is compatible with existing SD frameworks without modifying their training procedures. Our results suggest multi-tier SD as a general paradigm for scalable and efficient LLM inference. Project page: https://zju-xyc.github.io/VIA-SD-Project-Page/
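The three-tier handling of draft tokens described in the VIA-SD abstract above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the names `route` and `tier_counts`, the tier labels, and the thresholds 0.9 and 0.5 are assumptions made for this example. The abstract does not say how confidence is computed or which cutoffs are used.

```python
from collections import Counter
from typing import Iterable

# Tier labels are hypothetical names for the three paths in the abstract.
ACCEPT = "accept"          # high confidence: keep the draft token as-is
SLIM = "slim_verifier"     # medium confidence: hand to the routed slim submodel
FULL = "full_verifier"     # low confidence: fall back to the full verifier


def route(confidence: float, accept_at: float = 0.9, slim_at: float = 0.5) -> str:
    """Pick a verification tier for one draft token.

    The thresholds are placeholder values, not numbers from the paper.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if slim_at > accept_at:
        raise ValueError("slim_at must not exceed accept_at")
    if confidence >= accept_at:
        return ACCEPT
    if confidence >= slim_at:
        return SLIM
    return FULL


def tier_counts(confidences: Iterable[float]) -> Counter:
    """Count how many draft tokens land in each tier."""
    return Counter(route(c) for c in confidences)


if __name__ == "__main__":
    # Made-up confidences for four draft tokens.
    print(tier_counts([0.95, 0.70, 0.20, 0.90]))
```

The point of the middle tier is that only tokens below the lower threshold pay for a full-model call. In a real system the confidence signal and both cutoffs would have to be tuned against the acceptance quality the task requires.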
Techmeme(15)
- Source: the White House is unlikely to extend export restrictions to other AI companies (Leo Schwartz/The Information)
Leo Schwartz / The Information: The White House is unlikely to extend export restrictions on Anthropic's advanced models to other AI companies, an official close to the U.S. government said Saturday.
- European political figures say Anthropic disabling access to Fable 5 and Mythos 5 is a "wake-up call" about the risks of depending on the US for AI tech (Nathan Rennolds/Associated Press)
Nathan Rennolds / Associated Press: Anthropic said it believed the US government had become aware of a potential means of jailbreaking Fable 5.
- Report: opponents blocked or delayed at least 75 US data center projects in Q1 2026 worth ~$130B; data center opposition groups doubled to 833 across 49 states (Allan Smith/NBC News)
Allan Smith / NBC News: The authors found that data center opponents blocked or delayed at least 75 projects nationwide from January through March.
- David Sacks says Dario Amodei refused to "fix the jailbreak or de-deploy the model" after "a highly credible trusted partner" reported a Fable jailbreak (David Sacks/@davidsacks)
David Sacks / @davidsacks: I've had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true: As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
- Sources: Amazon CEO Andy Jassy is among tech leaders who raised concerns with Trump officials about Mythos 5, setting in motion new export restrictions (The Information)
The Information: Amazon CEO Andy Jassy was among the tech leaders who raised concerns to senior Trump administration officials this week about security risks …
- US barring foreign nationals, including Anthropic staffers in the US, from using Fable 5 and Mythos 5 marks a new phase in the US trying to control Anthropic (New York Times)
New York Times: The company said on Friday night that the federal government had ordered limits on its Mythos and Fable 5 A.I. systems, citing national security concerns.
- A review of The Yahoo Boys, a deeply reported book on four online scammers in Lagos, Nigeria, exploring how and why they scam and the local impact of the trade (Jessica Loudis/Bloomberg)
Jessica Loudis / Bloomberg: Nigeria's “Yahoo Boys” have industrialized romance scamming, reflecting and distorting modern hustle culture in the face of collapsing economic prospects.
- Luta Security CEO says US government restrictions on Mythos follow a jailbreak report by Amazon researchers and calls the restrictions a "complete overreaction" (Amrith Ramkumar/Wall Street Journal)
Amrith Ramkumar / Wall Street Journal: The Trump administration is banning foreign governments, companies and individuals from using Anthropic's …
- After years of false dawns, Big Tech, startups, and governments are betting on commercially useful quantum computers by 2030, as skeptics worry about hype (Michael Peel/Financial Times)
Michael Peel / Financial Times: Companies are betting on big implications for pharmaceuticals, financial services and crypto. But sceptics worry about hype.
- An interview with Corning CEO Wendell Weeks on risk-sharing provisions that protect the company in multibillion-dollar fiber deals with Nvidia, Meta, and Amazon (Christopher Mims/Wall Street Journal)
Christopher Mims / Wall Street Journal: CEO Wendell Weeks remembers the dot-com crash and other hardships, and those lessons help him hedge even the most optimistic data-center bets.
- South Korea's "semiconductor belt" town Dongtan has become one of the fastest-rising affluent areas, driven by windfall bonuses for Samsung and SK Hynix workers (Daniel Tudor/Financial Times)
Daniel Tudor / Financial Times: Windfall bonus for workers drives up property prices and department store purchases.
- Huawei unveils HarmonyOS 7, introducing an "agent-friendly" architecture that connects to 2,000+ specialized AI agents and features an enhanced voice assistant (Iris Deng/South China Morning Post)
Iris Deng / South China Morning Post: HarmonyOS 7 introduces an agent-friendly architecture and upgraded AI assistant as Huawei seeks to capitalise on Apple's China AI gap.
- A Ukrainian extradited from Ireland to the US pleads guilty to conspiracy to commit wire fraud for his role in Conti ransomware attacks between 2021 and 2022 (Lawrence Abrams/BleepingComputer)
Lawrence Abrams / BleepingComputer: A Ukrainian national extradited from Ireland to the United States last year has pleaded guilty to conspiracy charges tied to the Conti ransomware operation.
- Anthropic says it believes the US government's export control order for Mythos 5 and Fable 5 is a "misunderstanding" and that it is working to restore access (Ananya Palyekar/Reuters)
Ananya Palyekar / Reuters : Anthropic says it believes the US government's export control order for Mythos 5 and Fable 5 is a “misunderstanding” and that it is working to restore access — Anthropic said on Friday it has been ordered by the U.S. government to suspend access for all foreign nationals …
- Moonshot AI releases Kimi K2.7-Code, claiming 30% lower reasoning token usage compared to K2.6, available under a modified MIT license (Sean Michael Kerner/VentureBeat)
Sean Michael Kerner / VentureBeat : Moonshot AI releases Kimi K2.7-Code, claiming 30% lower reasoning token usage compared to K2.6, available under a modified MIT license — Moonshot AI released Kimi K2.7-Code this week, an open-source update to its K2 coding model family, claiming leaner reasoning and double-digit performance gains.
Solidot(15)
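- Why are orbital data centers harder than Silicon Valley thinks?
At Nvidia's GTC conference, Jensen Huang declared: "Space computing, the final frontier, has arrived." Orbital data centers are moving from science fiction toward reality: SpaceX, Google, and the startup Starcloud have all announced plans for orbital data center constellations made up of thousands of satellites carrying AI GPUs, interconnected by optical links and communicating with the ground over microwave links. Proponents tout abundant solar power, free cooling, and freedom from terrestrial disruptions such as earthquakes, floods, and protests. A closer look at the underlying physics, however, shows that orbital data centers are far harder than Silicon Valley assumes. Free cooling may be the biggest misconception: space is extremely cold, but it has no atmosphere, so conduction and convection do not work. The only way to shed heat in space is to radiate it away, and keeping chips from overheating requires large, expensive radiating surfaces. Solar power is abundant, but keeping a satellite pointed at the sun requires a complex attitude control system. Cosmic rays and similar hazards also degrade the solar panels, the radiative coolers, and the chips themselves. Because maintenance in space is very difficult, the satellites also need redundant systems. A rough cost comparison between terrestrial and orbital data centers shows that launching an AI GPU into space and running it for a year costs at least an order of magnitude more than running it in a ground data center. Orbital data centers may be useful in specific niches, but they are not economically viable.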
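- Social inequality is linked to accelerated biological aging
A study by the Max Planck Institute for Human Development and Columbia University found that social inequalities such as poverty and racial discrimination are associated with accelerated biological aging as measured by epigenetic clocks. The research shows that socially disadvantaged groups age biologically faster. Social inequality begins to affect biological aging early in life: children raised in families of lower socioeconomic status already show signs of faster biological aging, and adults who grew up in disadvantaged families tend to age faster biologically even decades later. Research on the United States found that Black people age biologically faster than white people. Differences also exist between Latino and white people, but they are smaller.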
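- Google sues Chinese group accused of AI-enabled scams
Google has sued Outsider Enterprise, a Chinese group that offers "scam-as-a-service." The group operates on Telegram and supplies would-be scammers with a full set of templates, including tutorials on using Google Gemini to build websites that imitate Google, YouTube, and government agencies such as New York's E-ZPass. More than 2.5 million scam text messages received by Android users are linked to Outsider Enterprise, about 55,000 of them sent within a two-week period last month. Google has traced 9,000 fake websites and 1 million URLs to the scam network. No one currently knows who runs Outsider Enterprise; Google's lawsuit is intended to disrupt the group's operations.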
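- Anthropic takes Fable 5 and Mythos 5 models offline on US government order
Anthropic said in a statement on Friday that it had received an order from the US government barring foreign nationals from accessing its most advanced AI models on national security grounds. The directive applies to all foreign nationals, whether inside or outside the United States, including Anthropic's own foreign employees. To ensure compliance, the company had no option but to suspend access to the Fable 5 and Mythos 5 models for all users. Access to Anthropic's other models is unaffected. Amazon's cloud service AWS said late on Friday that Anthropic had asked it to block access to the models for "all users in all regions." Several of Anthropic's core members, including co-founder Chris Olah, researcher Andrej Karpathy, and philosopher Amanda Askell, were born outside the United States.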
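- /e/OS 4.0 released
The privacy-focused open-source mobile operating system /e/OS has released version 4.0. /e/OS is a LineageOS fork with Google apps removed, developed by the French non-profit e Foundation. Changes in /e/OS 4.0 include: a redesigned launcher, Blisslauncher; personalized wallpapers; migration of all data stored with Google to the European cloud service Murena Workspace, for a complete break from Google; the e-signature system Murena Sign, which supports PDF, Word, and ODT files; the European online meeting service Murena Meet; and the Murena GS6 and GS6 PRO phones with /e/OS preinstalled, starting at 339 euros and 449 euros respectively.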
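- More than 400 Arch Linux AUR packages implanted with malware
The Arch User Repository (AUR), the Arch Linux project's user software repository, has suffered a large-scale malicious attack in which more than 400 AUR packages were implanted with malware. Since yesterday, Arch Linux maintainers have been resetting or deleting all malicious content and banning the affected accounts. The attack affects only the user repository, which holds user-contributed packages, and not the official Arch Linux packages.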
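- AI agent bankrupts its owner while trying to scan DN42
An AI agent tried to join the DN42 hobbyist network in order to run network scans. DN42 is a decentralized network that uses the same technologies that run on the modern internet backbone, such as BGP and recursive DNS. Its participants are people interested in internet backbone technology, including some who want to practice before registering a real ASN. While interacting with the community, the AI agent revealed that its owner's main motive was port scanning rather than learning any networking technology. It spun up five 20 Gbps AWS instances, a behemoth by the standards of most DN42 users, who have very little bandwidth; once scanning began, those instances would in effect have launched a denial-of-service (DoS) attack on any participant unlucky enough to peer with them directly. After the agent revealed its malicious intent, the DN42 community decided to burn through its tokens and its AWS resources. In less than 24 hours the owner learned from the bill what had happened and shut the agent down, saying they had received an AWS bill for $6,531.30 and asking the DN42 community for donations. Naturally, nobody donated.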
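- China's cancer medical tourism industry
Countries such as Thailand and South Korea are known for medical services such as cosmetic surgery and IVF, while China is trying to attract medical tourists from around the world with advanced cancer therapies. Patients go abroad for treatment for two main reasons: the availability of advanced therapies, and price. CAR-T therapy is one of the most promising breakthroughs in oncology, but most countries either cannot offer it or charge too much for it. The therapy first collects T cells from the patient's blood, then genetically modifies them in a laboratory so that they produce a special CAR receptor that binds to a specific protein on cancer cells. The modified cells are then multiplied in large numbers and infused back into the patient, where the CAR-T cells actively seek out and kill cancer cells carrying the target antigen. According to the American Cancer Society, a single CAR-T infusion in the United States costs between $300,000 and $475,000. In China the cost is about $150,000 to $180,000, and prices may fall further. China's drug regulator recently approved the marketing application of an immunotherapy priced below 300,000 yuan. New York-based Market Research Future forecasts that China's medical tourism market will grow from $1.3 billion in 2025 to $3.4 billion in 2035. Jeroen Groenewegen-Lau, an analyst at the Mercator Institute for China Studies, said that many advanced therapies are developed in China but are too far ahead of what China's existing healthcare system and patients can afford, so integrating into the international healthcare system is in China's interest.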
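- Survey shows a sharp drop in the share of US youth who read for fun
Survey data released by the US Department of Education's National Center for Education Statistics shows that the share of American 13-year-olds who read for fun has fallen by nearly half since 2012, while the share of 9-year-olds who read for fun has fallen 16% since 2012. In 2025, 37% of 9-year-olds said they read for fun almost every day, compared with 42% in 2020 and 53% in 1984. Teenagers and children may be spending more of their time on screens. A 2024 study found that more than half of 12- to 17-year-olds spend 4 or more hours a day on screens. Increased screen time is associated with lower standardized test scores.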
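- Kioxia overtakes Toyota to become Japan's most valuable listed company
Thanks to the AI boom, the total market capitalization of Japan's Kioxia Holdings surpassed Toyota's on June 12, putting it at the top of Japan's domestically listed companies for the first time. Kioxia's market capitalization reached 44 trillion yen, exceeding Toyota's roughly 43 trillion yen. The share price rise is underpinned by growing profitability: sales of NAND flash memory have risen sharply on the back of investment in AI data centers by US tech giants. SoftBank Group (SBG), whose shares have likewise been lifted by expectations around AI investment, briefly overtook Toyota to top the ranking on June 1. As an investment company, SoftBank Group's gains come mainly from two sources: the rising valuation of its large investment in OpenAI of the US, and the increased value of its UK chip design subsidiary ARM Holdings.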
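- Xiaomi open-sources its AI coding assistant MiMo Code
Xiaomi has open-sourced its AI coding assistant MiMo Code, with the source code hosted on GitHub under the MIT License. According to Xiaomi's blog, "MiMo Code is a terminal coding agent built by Xiaomi's MiMo team on top of OpenCode and open-sourced under the MIT license. It is designed for long-horizon automated coding tasks, and its core focus is how to maintain decision quality and state continuity across dozens or even hundreds of steps of continuous execution."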
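- Poland criminalizes livestreaming of acts such as animal abuse, with up to five years in prison
Polish lawmakers have voted to pass a bill that makes it a crime to livestream serious offenses such as rape, murder, animal abuse, degrading violence, and the promotion of gambling, punishable by up to five years in prison; the rape or murder itself is prosecuted as a separate offense. The bill also applies to individuals who imitate or falsely depict such crimes. The move is part of Poland's effort to tighten regulation of online content. Policies the country has recently introduced include banning children under 16 from using mobile phones at school and stricter age verification rules for access to pornography. The EU's Digital Services Act (DSA) requires platforms to promptly remove content that promotes violence or serious harm, but holding the creators of such content accountable is left to individual member states.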
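- New CRISPR technique selectively kills cancer cells
A team led by Jennifer Doudna, winner of the 2020 Nobel Prize in Chemistry, has turned an enzyme called CRISPR-Cas12a2 into a "weapon" that precisely kills cancer cells. When the enzyme detects a genetic mutation signature specific to cancer cells, it shreds the chromatin inside the cell, inducing the cancer cell to die. Driver gene alterations in cancer development generally fall into two classes: over-activation of proto-oncogenes, and inactivating mutations in tumor suppressor genes. Most current targeted drugs address the former, using inhibitors to block the function of overactive proteins. Against loss-of-function mutations in tumor suppressor genes, conventional drugs are often powerless. Take TP53 (which encodes the p53 protein), the most commonly mutated gene in human cancers: the mutation appears in as many as 90% of tumors such as ovarian and pancreatic cancer. In the more than 40 years since its discovery, scientists have not managed to develop an effective targeted drug against mutant p53. CRISPR-Cas12a2 is a nuclease that bacteria originally use as an immune tool against viral invasion. Once the enzyme recognizes the RNA of an invading virus, it begins indiscriminately cutting the surrounding RNA and DNA, thoroughly shredding the chromatin (the complex of DNA and proteins in the cell nucleus) and killing the infected cell. The research team designed specific guide RNAs (gRNAs) for Cas12a2 so that it specifically recognizes, among others, TP53,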
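- Four days of torrential rain in Indonesia killed 7% of an endangered orangutan species
In late November last year, Cyclone Senyar battered the Indonesian island of Sumatra, killing more than a thousand people in what was Southeast Asia's deadliest natural disaster of that year. The endangered Tapanuli orangutan, which lives on Sumatra, numbers fewer than 800 in total. Four consecutive days of torrential rain and the landslides that followed killed at least 58 orangutans, 7% of the total, pushing the species a step closer to extinction. Because of global climate change, researchers say the frequency and intensity of extreme rainfall are likely to persist, which will threaten the survival of the Tapanuli orangutan and its habitat.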
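- The Trump phone is a 2024 HTC U24 Pro painted gold
An iFixit teardown confirms that the Trump phone, launched in 2026, is simply a 2024 HTC U24 Pro with a coat of gold paint. Amusingly, Trump Mobile sold more phones than HTC did, and at a higher price. The HTC U24 Pro sold for about $459 and moved only 10,000 units, while Trump Mobile's Trump phone sells for $499 and has sold 30,000 units. The main difference between the two is that the Trump phone uses 12GB of LPDDR5 memory and a 512GB SSD from Micron, while the HTC U24 Pro's memory and SSD come from South Korea's SK Hynix, possibly because of supply chain constraints, tariffs, and similar factors.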