TEXT VIEW · TODAY'S DIGEST · 36 HEADLINES ACROSS 8 SOURCES

ISSUE 0894
FRI, JUN 12, 2026
Discover the best information organized by OrangeBot.AI
TODAY · FRI, JUN 12, 2026

The web,
read by a bot.

Ten sources — Hacker News, Product Hunt, HuggingFace, Techmeme and more — filtered, tagged, and summarized every morning for builders who don’t have time to scroll.

01

AI DIGEST

UPDATED DAILY · EDITOR'S PICK
AI News Digest

June 12, 2026

Here is a summary of today's main news events.

SpaceX Launches Massive IPO on Stock Market

Shares in Elon Musk's rocket company, SpaceX, began trading today in what is being described as the largest Initial Public Offering (IPO) in history. The highly anticipated market debut has been a major focus for investors and is influencing broader market activity.

Markets Rally as U.S. Calls Off Military Action Against Iran

President Trump announced the cancellation of planned military strikes against Iran, sparking hopes for a de-escalation of tensions in the Middle East. The news prompted a positive reaction in financial markets, with stocks and gold prices rising while oil futures declined on the reduced risk of conflict.

AI Boom Causes Market Volatility and Supply Chain Strain

While investor interest in artificial intelligence remains high, concerns are growing about a potential bubble in AI-related stocks, leading to market volatility. Simultaneously, the intense global demand for advanced AI chips is straining the supply of essential production materials, which could potentially slow the industry's rapid expansion.

FanDuel to Delist from UK Stock Exchange

The sports betting company FanDuel announced its intention to delist from the UK stock exchange. The company stated that the decision was made due to high regulatory costs and low trading volumes, concluding it was in the best interest of shareholders.

Celebrated British Artist David Hockney Dies

Renowned British painter David Hockney has passed away. He was a highly influential figure in the art world, celebrated for his vibrant works, including the iconic "Swimming Pool" series.

02

ON THE WIRE

6 SOURCES
02

HACKER NEWS

Hacker News - June 12, 2026

Hacker News Feed: Highlighting key posts and discussions.

The Future of Email

(www.fastmail.com)

10196
Doing nothing at work

(www.seangoedecke.com)

403138
03

HUGGINGFACE

HuggingFace - June 12, 2026

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

Large language model (LLM) agents have achieved strong performance on a wide range of benchmarks, yet most evaluations assume static environments. In contrast, real-world deployment is inherently dynamic, requiring agents to continually align their knowledge, skills, and behavior with changing environments and updated task conditions. To address this gap, we introduce EvoArena, a benchmark suite that models environment changes as sequences of progressive updates across terminal, software, and social domains. We further propose EvoMem, a patch-based memory paradigm that records memory evolution as structured update histories, enabling agents to reason about environmental evolution through changes in their memory. Experiments show that current agents struggle on EvoArena, achieving an average accuracy of 39.6% across evolving terminal, software, and social-preference domains. EvoMem consistently improves performance, yielding an average gain of 1.5% on EvoArena and also improving standard benchmarks such as GAIA and LoCoMo by 6.1% and 4.8%. Beyond individual tasks, EvoMem further improves chain-level accuracy by 3.7% on EvoArena, where success requires completing a consecutive sequence of related evolutionary subtasks. Mechanistic analysis shows that EvoMem improves evidence capture in the memory, indicating better preservation of complete evolving environment states. Our results highlight the importance of modeling evolution in both evaluation and memory for reliable agent deployment.
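
The abstract describes EvoMem only as a memory that records its own evolution as structured update histories. A minimal sketch of that idea follows, assuming a simple key-value memory; the `Patch` and `PatchMemory` names and fields are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Patch:
    """One recorded change to a memory entry (hypothetical structure)."""
    step: int
    key: str
    old: Optional[Any]
    new: Optional[Any]


@dataclass
class PatchMemory:
    """Keeps current values plus the ordered patches that produced them."""
    state: Dict[str, Any] = field(default_factory=dict)
    history: List[Patch] = field(default_factory=list)

    def update(self, step: int, key: str, value: Any) -> None:
        # Record the change as a patch before overwriting the current value.
        self.history.append(Patch(step, key, self.state.get(key), value))
        self.state[key] = value

    def changes_for(self, key: str) -> List[Patch]:
        # The history shows how a fact evolved, not just its latest value.
        return [p for p in self.history if p.key == key]


memory = PatchMemory()
memory.update(step=1, key="deploy_cmd", value="make deploy")
memory.update(step=2, key="deploy_cmd", value="make release")
transitions = [(p.old, p.new) for p in memory.changes_for("deploy_cmd")]
print(memory.state["deploy_cmd"], transitions)
```

Keeping the old value inside each patch is one possible design choice: it lets a reader of the history reconstruct any earlier state without replaying every update.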

95
SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning

Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fundamental challenge for vision-language models (VLMs). Tool-augmented agents attempt to address this by augmenting VLMs with specialist perception modules, yet their effectiveness is bounded by the action interface through which those tools are invoked. In this work, we study how the design of this interface shapes the agent's capacity for open-ended spatial reasoning. Existing spatial agents either employ single-pass code execution, which commits to a full analysis strategy before any intermediate result is observed, or rely on a structured tool-call interface that often offers less flexibility for freely composing operations or tailoring the analysis to each task. Both designs offer limited flexibility for open-ended, complex 3D/4D spatial reasoning. We therefore propose SpatialClaw, a training-free framework for spatial reasoning that adopts code as the action interface. SpatialClaw maintains a stateful Python kernel pre-loaded with input frames and a suite of perception and geometry primitives, letting a VLM-backed agent write one executable cell per step conditioned on all prior outputs, enabling the agent to flexibly compose and manipulate perception results and adapt its analysis to both intermediate text and visual observations and the demands of each problem. Evaluated across 20 spatial reasoning benchmarks spanning a broad range of static and dynamic 3D/4D spatial reasoning tasks, SpatialClaw achieves 59.9% average accuracy, outperforming the recent spatial agent by +11.2 points, with consistent gains across six VLM backbones from two model families without any benchmark- or model-specific adaptation.

73
InterleaveThinker: Reinforcing Agentic Interleaved Generation

Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-image generation and editing. However, constrained by their architectures, they cannot achieve interleaved generation (text-image sequence), which has crucial applications in visual narratives, guidance, and embodied manipulation. Even the latest open-source Unified Multimodal Models (UMMs) exhibit limited performance in this regard. In this paper, we introduce InterleaveThinker, the first multi-agent pipeline designed to endow any existing image generator with interleaved generation capabilities. Specifically, we employ a planner agent to organize the image-text input sequence, instructing the image generator on the required execution at each step. Subsequently, we introduce a critic agent to evaluate the generator's outputs, identify samples that deviate from the planned instructions, and refine the instructions for regeneration. To implement this pipeline, we construct the Interleave-Planner-SFT-80k and Interleave-Critic-SFT-112k to perform a format cold-start. Then we develop Interleave-Critic-RL-13k to reinforce the step-wise instruction correction capability within a generation trajectory using GRPO. Since a single interleaved generation trajectory may involve over 25 generator calls, optimizing the entire trajectory is computationally impractical. Therefore, we propose accuracy reward and step-wise reward, allowing single-step RL to effectively guide the entire generation trajectory. The results show that InterleaveThinker improves performance across various image generators. On interleaved generation benchmarks, it achieves performance comparable to Nano Banana and GPT-5. Surprisingly, it also significantly enhances the base model on reasoning-based benchmarks; for example, on 4-step FLUX.2-klein, we observe substantial gains on WISE and RISE.

69
FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents

Training deep search agents requires verifiable questions whose answers remain unavailable until sufficient evidence has been acquired through search. Existing synthesis methods often increase apparent difficulty by enriching graph structures, but structural complexity alone does not guarantee realized search difficulty: the intended search process can collapse through a cheaper identifying route. We formalize this gap with a shortcut-aware difficulty framework and identify four actionable shortcut risks: evidence co-coverage, single-clue selectivity, exposed constants, and prior-knowledge binding. To diagnose their realized effects, we use trajectory signatures including solving cost, answer hit time, and prior-shortcut rate. Guided by this framework, we introduce FORT, a Framework of Shortcut-Resistant Training-Data Synthesis. FORT constructs shortcut-resistant training data by controlling shortcut risks across entity selection, evidence graph construction, question formulation, and adversarial refinement. Experiments show that FORT induces longer pre-answer search and fewer shortcut patterns than existing open-source deep search datasets. Using the resulting trajectories, we train FORT-Searcher with supervised fine-tuning (SFT) only, and it achieves the best overall performance among comparable-size open-source search agents on challenging deep search benchmarks. Relevant resources will be made available at https://github.com/RUCAIBox/FORT-Searcher.

69
Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet their performance degrades significantly under real-world visual corruptions. Existing robustness enhancement approaches are limited: black-box feature alignment lacks interpretability, and white-box text-based reasoning cannot restore lost pixel-level details. This work investigates a fundamental research question: Can MLLMs recover corrupted visual content by themselves? To address this, we propose Robust-U1, a novel framework that equips MLLMs with explicit visual self-recovery capability for robust understanding. The approach comprises three core stages: supervised fine-tuning for initial reconstruction, reinforcement learning with dual rewards (pixel-level SSIM and semantic-level CLIP similarity) for aligning high visual quality, and multimodal reasoning that jointly considers both the corrupted input and the recovered image. Extensive experiments demonstrate that Robust-U1 achieves state-of-the-art robustness on the real-world corruption benchmark and maintains superior performance under adversarial corruptions on general VQA benchmarks. Analysis confirms that high-quality visual recovery directly enhances reasoning performance, establishing self-recovery as a critical mechanism for robust visual understanding. The source code is available at https://github.com/jqtangust/Robust-U1.

68
MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 first trains three proof-oriented capabilities -- proof generation, proof verification, and critique-conditioned proof repair -- using a defense-in-depth generative verifier engineered for low false-positive rate. These capabilities are merged into a single released M3 model. At test time, MaxProof treats the model as a generator, verifier, refiner, and ranker, searches over a population of candidate proofs, and returns one final proof through tournament selection. With MaxProof test-time scaling, the M3 model reaches 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both.
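
The abstract says a final proof is chosen from a population through tournament selection but does not give the bracket format. The sketch below assumes a plain single-elimination bracket; `tournament_select` and the toy score-based comparator are hypothetical stand-ins for the model acting as ranker.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def tournament_select(candidates: List[T], prefer: Callable[[T, T], T]) -> T:
    """Single-elimination tournament: compare pairs until one candidate remains."""
    if not candidates:
        raise ValueError("need at least one candidate")
    pool = list(candidates)
    while len(pool) > 1:
        next_round = []
        # Pair neighbours; an unpaired last candidate gets a bye.
        for i in range(0, len(pool) - 1, 2):
            next_round.append(prefer(pool[i], pool[i + 1]))
        if len(pool) % 2 == 1:
            next_round.append(pool[-1])
        pool = next_round
    return pool[0]


# Toy comparator (assumption): prefer the candidate with the higher score.
proofs = [("proof A", 0.41), ("proof B", 0.87), ("proof C", 0.65)]
best = tournament_select(proofs, lambda a, b: a if a[1] >= b[1] else b)
print(best[0])
```

A pairwise comparator only needs to say which of two proofs is better, which may be an easier judgment than assigning every proof an absolute score.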

63
MiniMax Sparse Attention

Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code reasoning, and persistent memory all require the model to jointly attend over hundreds of thousands to millions of tokens, yet the quadratic cost of softmax attention makes this untenable at deployment scale. We introduce MiniMax Sparse Attention (MSA), a blockwise sparse attention built upon Grouped Query Attention (GQA). A lightweight Index Branch scores key-value blocks and independently selects a Top-k subset for each GQA group, enabling group-specific sparse retrieval while maintaining efficient block-level execution; the Main Branch then performs exact block-sparse attention over only the selected blocks. Designed around a principle of simplicity and scalability, MSA is deliberately streamlined, making it straightforward to deploy efficiently across a broad range of GPUs. To translate sparsity into practical speedups, we co-design MSA with a GPU execution path that uses exp-free Top-k selection and KV-outer sparse attention to improve tensor-core utilization under block-granular access. On a 109B-parameter model with native multimodal training, MSA performs on par with GQA while reducing per-token attention compute by 28.4x at 1M context. Paired with our co-designed kernel, MSA achieves 14.2x prefill and 7.6x decoding wall-clock speedups on H800. Our inference kernel is available at: https://github.com/MiniMax-AI/MSA. A production-grade natively multimodal model powered by MSA has been publicly released at: https://huggingface.co/MiniMaxAI/MiniMax-M3.
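
The two-branch idea (score blocks cheaply, then attend exactly over the chosen blocks) can be illustrated for a single query in plain Python. This is a sketch under stated assumptions, not the released kernel: scoring a block by the query's dot product with the block's mean key is an assumption, and the per-GQA-group selection described in the abstract is omitted.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(q: Vector, keys: List[Vector], values: List[Vector],
                           block_size: int, top_k: int) -> Vector:
    """Attend over only the top_k key/value blocks chosen by a cheap block score."""
    n_blocks = math.ceil(len(keys) / block_size)
    blocks = [range(b * block_size, min((b + 1) * block_size, len(keys)))
              for b in range(n_blocks)]

    # Index step (assumption): score a block by q . mean(key) over the block.
    def block_score(idx: range) -> float:
        mean_key = [sum(keys[i][d] for i in idx) / len(idx) for d in range(len(q))]
        return dot(q, mean_key)

    chosen = sorted(blocks, key=block_score, reverse=True)[:top_k]
    selected = sorted(i for idx in chosen for i in idx)

    # Main step: exact softmax attention restricted to the selected positions.
    scale = 1.0 / math.sqrt(len(q))
    logits = [dot(q, keys[i]) * scale for i in selected]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    return [sum(w * values[i][d] for w, i in zip(weights, selected)) / total
            for d in range(len(values[0]))]


keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
values = [[1.0], [1.0], [5.0], [5.0]]
sparse = block_sparse_attention([0.0, 1.0], keys, values, block_size=2, top_k=1)
dense = block_sparse_attention([0.0, 1.0], keys, values, block_size=2, top_k=2)
print(sparse, dense)
```

With `top_k=1` only the second block is attended, so the output equals that block's shared value; with `top_k=2` every position is kept and the result matches ordinary dense attention.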

59
WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces

Computer-use agents (CUAs) increasingly operate in runtimes that combine visual desktop control, command-line execution, code editing, browsers, and external tools. Existing benchmarks, however, often evaluate these interfaces as separable capabilities, leaving long-horizon cross-interface orchestration under-tested. Thus, we introduce WeaveBench, a long-horizon hybrid-interface benchmark with 114 tasks across 8 real-world work domains, grounded in real user requests and publicly verifiable artifacts. Each task requires agents to combine GUI observations/actions with CLI/code operations within a single trajectory. We evaluate these tasks on a real Ubuntu desktop inside deployed CLI-agent runtimes, augmented with a minimal desktop-control plugin. We also propose a companion trajectory-aware judge that inspects deliverables, files, screenshots, logs, and action traces, while detecting shortcut behaviors such as fabricated visual evidence or hard-coded metrics. Across frontier model-runtime pairings, the best PassRate reaches only 41.2%, showing the benchmark remains far from saturated. The trajectory-aware judge further reveals that outcome-only grading substantially overestimates agent performance. Overall, WeaveBench exposes a critical gap in CUA evaluation and provides an effective testbed to measure whether agents can orchestrate GUI, CLI, and code operations across long-horizon real-world tasks.

54
LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories

Scientific laboratories increasingly rely on AI systems to reason about experiments, but the physical act of doing science remains largely outside their reach. AI can help read literature, generate hypotheses, and plan protocols, yet the execution of those protocols at the bench still requires a human operator. Vision-Language-Action (VLA) models provide one possible interface between written protocols and robot execution, but existing policies are trained mostly on household and tabletop demonstrations and rarely encounter the instruments, transparent liquids, or fixed protocol workflows found in scientific laboratories. Closing this gap requires both laboratory-specific supervision and a unified learning framework that can accommodate the diverse robot embodiments used to execute experimental protocols. We therefore identify data and embodiment as central bottlenecks alongside model design. To address the data side, we build RoboGenesis, a simulation-based workflow and data engine that composes configured laboratory workflows from atomic skills, validates and filters rollouts, and exports structured demonstrations across supported robot profiles. On the policy side, we present LabVLA, trained with a two-stage recipe: FAST action token pretraining first makes the Qwen3-VL-4B-Instruct backbone action aware before any continuous control is learned, and flow matching posttraining then attaches a DiT action expert under knowledge insulation. On the LabUtopia benchmark, LabVLA achieves the highest average success rate among all evaluated baselines under both in-distribution and out-of-distribution settings.

44
HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers

Holistic visual tokenizers are fundamental to unified multimodal models (UMMs) as they map diverse visual inputs into a unified representation space. In this paper, we present HYDRA-X, the first UMM that unifies image and video tokenization within a single Vision Transformer (ViT). Our design is driven by two core challenges: efficiently injecting spatiotemporal reconstruction capability into a native ViT, and embedding image- and video-level semantic awareness into the latent space. To address the first, comprehensive ablations reveal two key findings: (1) frame-level causal temporal attention suffices for visual reconstruction, whereas full spatiotemporal attention degrades it; and (2) hierarchical temporal compression substantially outperforms single-step alternatives. To tackle the second, we propose a lightweight decompressor that upsamples temporally compressed features under joint image-video teacher supervision, thereby enforcing complementary semantic structures within the compact latent space. Building on this holistic tokenizer, we further propose a principled improvement of the editing pipeline: source-target interaction should occur at the latent level inside the tokenizer rather than at the semantic level inside the LLM, substantially improving editing consistency and accelerating convergence. Instantiated at the 7B dense model, HYDRA-X achieves strong performance across image and video understanding and generation tasks, paving the way for future unified-tokenizer UMMs.

23
N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization

The success of Large Language Models in mathematical reasoning relies heavily on the generation of diverse and valid solution paths during the rollout phase. However, current rollout techniques face a fundamental trade-off: token-level sampling often yields redundant trajectories that differ only in rephrasing, while embedding-level methods utilizing random noise frequently disrupt semantic consistency. To resolve this, we introduce N-GRPO, a novel exploration strategy integrated into the Group Relative Policy Optimization (GRPO) framework. Rather than relying on token-level sampling or native embedding-level noise, our approach leverages Semantic Neighbor Mixing. This mechanism dynamically constructs input representations by mixing the embeddings of an anchor token and its nearest semantic neighbors, thereby injecting diversity while strictly adhering to the local semantic manifold. Experimental evaluations on the DeepSeek-R1-Distill-Qwen models across different sizes show that N-GRPO not only achieves consistent improvements over strong baselines on math reasoning benchmarks but also exhibits robust generalization capabilities on out-of-distribution tasks.
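
The mixing step can be sketched as a convex blend of an anchor embedding with its nearest neighbours. The abstract does not give the mixing rule, so the cosine-similarity neighbour search, the mean over `k` neighbours, and the blend weight `alpha` below are all assumptions for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm


def neighbor_mix(anchor: str, table: Dict[str, Vector], k: int, alpha: float) -> Vector:
    """Blend the anchor embedding with the mean of its k nearest neighbours."""
    anchor_vec = table[anchor]
    others = [t for t in table if t != anchor]
    nearest = sorted(others, key=lambda t: cosine(anchor_vec, table[t]),
                     reverse=True)[:k]
    if not nearest:
        return list(anchor_vec)  # nothing to mix with
    dim = len(anchor_vec)
    mean_nb = [sum(table[t][d] for t in nearest) / len(nearest) for d in range(dim)]
    # alpha = 1.0 keeps the anchor unchanged; smaller alpha adds neighbour signal.
    return [alpha * anchor_vec[d] + (1 - alpha) * mean_nb[d] for d in range(dim)]


# Toy embedding table (hypothetical values).
table = {"add": [1.0, 0.0], "sum": [0.9, 0.1], "plus": [0.8, 0.2], "cat": [0.0, 1.0]}
mixed = neighbor_mix("add", table, k=2, alpha=0.5)
print(mixed)
```

Because the blend only uses close neighbours, the mixed vector stays near the anchor, which is the intuition behind injecting diversity without leaving the local semantic neighbourhood.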

18
EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery

LLM-based agents have shown increasing potential in automating scientific discovery. Given an optimizable metric and an execution environment, they can propose, validate, and iterate scientific solutions, and have produced results that outperform human-designed approaches. As model capabilities continue to improve, we argue that the bottleneck for autonomous scientific discovery is shifting from prescribing agent workflows to designing agent environments: the resources, constraints, and interfaces that shape agent behavior. We frame this as environment engineering: building environments that amplify productive behaviors, such as open-ended exploration, systematic artifact management, and inter-agent collaboration, while suppressing harmful behaviors, such as reward hacking and high-friction human oversight. We present EurekAgent, an environment-engineered agent system for metric-driven autonomous scientific discovery. EurekAgent engineers the environment along four dimensions: permissions engineering for bounded agent execution and isolated evaluation; artifact engineering for filesystem and Git-based collaboration; budget engineering for budget-aware exploration; and human-in-the-loop engineering for easy human supervision and intervention. EurekAgent sets new state-of-the-art results on multiple mathematics, kernel engineering, and machine learning tasks, including new state-of-the-art 26-circle packing results discovered with less than $11 in total API cost. We open-source our code and results, and call for environment engineering as a core research direction for developing reliable autonomous research agents.

17
Where, What, Why, and Importance: Structured Defect Grounding for Text-to-Image Feedback

Despite generating increasingly photorealistic images, text-to-image (T2I) models still exhibit localized, subtle, and structurally complex failures. Diagnosing these failures requires instance-level feedback that answers where a defect occurs, what type it is, why it is defective, and its importance to overall image quality. While recent dense-feedback methods move beyond scalar supervision, their heatmap-centric representations still formulate diagnosis as pixel-field regression, making it difficult to localize variable-cardinality defects and bind semantic reasons to individual failures. To address this representation bottleneck, we propose Structured Defect Grounding (SDG), which casts T2I diagnosis as structured set prediction by modeling each defect as a (location, type, reason, importance) tuple. To make this formulation trainable and measurable, we introduce SDG-30K, a 30K-image dataset with box-grounded annotations across four modern T2I generators, together with a dedicated evaluation protocol, SDG-Eval. Building on this structured representation, we further present a diagnosis-to-alignment framework in which a Vision-Language Model (VLM) serves as the SDG detector, and BoxFlow-GRPO converts predicted defect sets into box-derived, importance-weighted spatial rewards for diffusion model alignment. Extensive experiments show that our SDG detector outperforms leading proprietary VLMs on structured defect grounding, while SDG-guided rewards consistently improve T2I alignment and support localized image refinement. These results establish SDG as a unified, instance-level interface for diagnosing, evaluating, and enhancing modern generative models.
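
The (location, type, reason, importance) tuple can be sketched as a small record, together with one possible way to turn a set of defects into a per-pixel signal. This is not BoxFlow-GRPO: the `Defect` fields, the end-exclusive box convention, and the max-importance penalty map are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Defect:
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive pixel coords
    kind: str
    reason: str
    importance: float               # assumed range [0, 1]


def penalty_map(defects: List[Defect], width: int, height: int) -> List[List[float]]:
    """Per-pixel penalty: the largest importance among boxes covering the pixel."""
    grid = [[0.0] * width for _ in range(height)]
    for d in defects:
        x0, y0, x1, y1 = d.box
        # Clip each box to the image bounds before filling.
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                grid[y][x] = max(grid[y][x], d.importance)
    return grid


defects = [
    Defect((0, 0, 2, 2), "anatomy", "hand has six fingers", 0.9),
    Defect((1, 1, 3, 3), "text", "sign lettering is garbled", 0.4),
]
grid = penalty_map(defects, width=4, height=4)
print(grid)
```

A set of records handles any number of defects per image and keeps each reason attached to its own box, which a single heatmap cannot do.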

12
Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning

Latent chain-of-thought compresses reasoning by replacing visible reasoning traces with continuous hidden-state recurrence, but existing formulations are difficult to optimize with standard on-policy reinforcement learning (RL) and hard to interpret causally. Our key insight is that a single pair of explicit boundary tokens can address both issues at once: discrete entry and exit anchors make the latent block compatible with standard on-policy RL, and the same anchors offer a natural foothold for mechanistic analysis. Motivated by this, we propose SWITCH, a switchable latent reasoning framework. The model emits <swi> to enter latent mode and </swi> to exit. Because the boundaries are ordinary discrete tokens, the GRPO policy ratio is well-defined at every decision point. The same anchors also expose the latent steps to direct probing and causal intervention. We train the model with a visible-to-latent curriculum and a Switch-GRPO objective that propagates gradients through recurrent latent computation. SWITCH consistently outperforms prior hidden-state-recurrence latent reasoning approaches at similar scale. Mechanistic analysis through the boundary tokens further reveals three findings: (i) <swi> is a sharply localised, learned switching policy rather than a stylistic artefact; (ii) the latent step it opens performs problem-specific, causally important computation rather than acting as an inert placeholder; and (iii) that computation is concentrated at a single hidden-state transition on entry. Together, these results show that hidden-state-recurrence latent reasoning is both RL-trainable and open to direct mechanistic analysis, including of how on-policy RL itself improves the model from the inside.

12
VideoMDM: Towards 3D Human Motion Generation From 2D Supervision

We introduce VideoMDM, a diffusion-based framework that trains 3D human motion priors directly from accurate 2D poses extracted from monocular videos, without any 3D ground truth. A pretrained 2D-to-3D lifter provides approximate 3D pose sequences that serve as a noisy teacher: these are diffused, denoised by the model in 3D, and supervised in 2D by reprojecting the prediction and comparing against accurate keypoints. We show that, under mild assumptions, a depth-weighted 2D reprojection loss is equivalent in expectation to direct 3D supervision, and we adapt standard 3D motion regularizers - velocity consistency and over-parameterized representation alignment - to this 2D setting. Unlike methods that lift 2D to 3D only at inference, VideoMDM learns a coherent 3D motion manifold during training. On HumanML3D it nearly closes the gap to fully 3D-supervised MDM (FID 0.88 vs 0.54); on the real video datasets Fit3D and NBA, the method learns to generate motions consistently preferred by humans, with strong quantitative results.
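
A depth-weighted reprojection loss can be sketched with a pinhole camera. The abstract does not state the weighting, so the squared-depth weight used here is an assumption (it cancels the 1/z shrinkage of perspective projection), as are the function names and the single focal length.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project(p: Point3D, focal: float) -> Point2D:
    """Pinhole projection of a camera-space point with z > 0."""
    x, y, z = p
    return (focal * x / z, focal * y / z)


def depth_weighted_reprojection_loss(pred: List[Point3D], target_2d: List[Point2D],
                                     focal: float) -> float:
    """Mean squared 2D error, each joint weighted by its squared predicted depth."""
    total = 0.0
    for p, (u, v) in zip(pred, target_2d):
        pu, pv = project(p, focal)
        # Assumed weight z^2: distant joints move fewer pixels per unit of 3D error.
        total += (p[2] ** 2) * ((pu - u) ** 2 + (pv - v) ** 2)
    return total / len(pred)


pred = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
exact = depth_weighted_reprojection_loss(pred, [(0.0, 0.0), (0.5, 0.0)], focal=1.0)
off = depth_weighted_reprojection_loss(pred, [(0.0, 0.0), (0.0, 0.0)], focal=1.0)
print(exact, off)
```

In the second call the predicted joint projects 0.5 units away from its target at depth 2, so its weighted squared error is 4 × 0.25 = 1.0 and the mean over two joints is 0.5.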

12
VIA-SD: Verification via Intra-Model Routing for Speculative Decoding

Speculative decoding (SD) addresses the high inference costs of LLMs by having lightweight drafters generate candidates for large verifiers to validate in parallel. Existing draft-verify methods use binary decisions: accept or fully recompute. Yet we find that many rejected tokens can be verified correctly by a slim submodel derived from the full verifier via intra-model routing, instead of the full verifier. This motivates our slim-verifier to handle tokens requiring moderate verification resources, reducing expensive large-model calls. We propose Verification via Intra-Model Routing for Speculative Decoding (VIA-SD), a multi-tier framework using a routed slim-verifier. Draft tokens are processed hierarchically: direct acceptance for high-confidence cases, slim-verifier regeneration for medium-confidence cases, and full-model verification for uncertain cases. Across four representative tasks and multiple model families, VIA-SD reduces rejection rates by 0.10-0.22 and delivers 10-20% speedups over strong SD baselines, while achieving 2.5-3x acceleration over non-drafting decoding. Moreover, VIA-SD is compatible with existing SD frameworks without modifying their training procedures. Our results suggest multi-tier SD as a general paradigm for scalable and efficient LLM inference. Project page: https://zju-xyc.github.io/VIA-SD-Project-Page/
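
The three-tier handling of draft tokens amounts to a routing rule on drafter confidence. A minimal sketch follows, assuming two scalar thresholds; `accept_at`, `slim_at`, and the tier labels are hypothetical, since the abstract does not give the routing criteria.

```python
from typing import List, Tuple


def route_draft_tokens(confidences: List[float], accept_at: float,
                       slim_at: float) -> List[Tuple[int, str]]:
    """Assign each draft token to a verification tier by drafter confidence."""
    if not 0.0 <= slim_at <= accept_at <= 1.0:
        raise ValueError("expected 0 <= slim_at <= accept_at <= 1")
    routes = []
    for position, conf in enumerate(confidences):
        if conf >= accept_at:
            tier = "accept"          # high confidence: keep the draft token
        elif conf >= slim_at:
            tier = "slim_verifier"   # medium confidence: cheaper routed submodel
        else:
            tier = "full_verifier"   # uncertain: fall back to the full model
        routes.append((position, tier))
    return routes


routes = route_draft_tokens([0.97, 0.62, 0.20], accept_at=0.9, slim_at=0.5)
print(routes)
```

Only the tokens in the lowest tier pay for a full-model call, which is where the reported savings over binary accept-or-recompute schemes would come from.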

11
From 2D Grids to 1D Tokens: Reforming Shared Representations for Multimodal Image Fusion

Multimodal image fusion aims to integrate complementary information from different modalities into a fused image that preserves rich local details while maintaining globally consistent appearance. Existing approaches build shared representations on 2D feature grids, which excel at modeling local structures but offer limited leverage over image-level global appearance factors. To balance these objectives, we introduce a compact 1D token interface based on a frozen pretrained image tokenizer for modeling non-local appearance/base factors. Rather than using the tokenizer as a reconstruction backbone, our design uses the 1D token space as a global carrier while retaining the 2D spatial pathway for local structure restoration. Specifically, we introduce Selective Token Editing (STE), which sparsely updates/replaces a small set of critical tokens, providing a lightweight mechanism to steer global appearance coherence while keeping the fusion backbone unchanged and avoiding extra losses. Experiments on four commonly used benchmarks show that our method achieves the best overall performance, with consistent, multi-metric improvements in both global coherence and local fidelity. Project page: https://zju-xyc.github.io/1D-Fusion-Project-Page/

10
MoVerse: Real-Time Video World Modeling with Panoramic Gaussian Scaffold

We present MoVerse, a real-time video world model that creates an interactively navigable scene from a single narrow-field-of-view image. This setting is challenging because the input observes only a small fraction of the environment, while interactive roaming requires a complete surrounding world, persistent geometry, controllable camera motion, and temporally coherent high-fidelity observations. MoVerse addresses this problem by separating world construction from observation rendering. It first expands the input into a gravity-aligned 360° panorama with topology-aware diffusion, closing the missing field of view before 3D reasoning. It then lifts the panorama into a persistent 3D Gaussian scaffold using panoramic geometry-aware residual prediction, yielding a dense and directly renderable spatial memory. Finally, a Gaussian-conditioned video renderer translates scaffold renderings along user-specified camera trajectories into photorealistic video. To make this renderer practical for interaction, we train a bidirectional diffusion teacher for high-quality conditional rendering and distill it into a causal autoregressive student for bounded-latency streaming. This design combines the controllability and long-range consistency of explicit 3D representations with the perceptual quality of generative video models. MoVerse supports real-time scene roaming at 8 FPS on a single NVIDIA RTX 4090 GPU, demonstrating a practical path toward single-image world creation with interactive video output.

9
TreeSeeker: Tree-Structured Trial, Error, and Return in Deep Search

Deep search requires agents to answer complex questions through multi-step web search, browsing, evidence comparison, and synthesis. A central challenge is deciding how to search when several directions look plausible but only some will later lead to reliable evidence. If an agent greedily follows the current best-looking direction, it may keep extending a weak continuation. If it explores without discipline, it may waste budget on disconnected trials. We propose TreeSeeker, an inference-time framework for controlled trial-and-error in deep search. TreeSeeker organizes search as branch-and-return search over tree-structured states, where each branch is a tentative direction for a sub-goal. At each round, TreeSearch reads all sub-goal trees, identifies active goals, and uses textual UCB signals of value, uncertainty, and risk to select among exploiting a promising branch, exploring an uncertain alternative, or pruning an unproductive continuation and returning to an earlier branch point. TreeMem supports this control loop by keeping evidence, uncertainty, conflicts, progress, and failure cues attached to the branches that produced them, so trial outcomes can guide later decisions. Experiments on XBench-DeepSearch, BrowseComp, and BrowseComp-ZH show that TreeSeeker consistently outperforms strong open-source baselines, suggesting that explicit branch-and-return control complements stronger reasoning and tool execution.
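
The exploit, explore, or prune decision can be illustrated with the classic numeric UCB1 rule. This is a swapped-in technique: the paper describes textual UCB signals of value, uncertainty, and risk, whereas the sketch below uses plain numbers, and `Branch`, `pick_branch`, and `prune_below` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Branch:
    name: str
    total_value: float  # summed payoff observed on this branch
    visits: int         # how many times the branch has been tried


def ucb1(branch: Branch, total_visits: int, c: float = 1.4) -> float:
    """Classic UCB1: mean value plus a bonus for rarely tried branches."""
    if branch.visits == 0:
        return math.inf
    mean = branch.total_value / branch.visits
    return mean + c * math.sqrt(math.log(total_visits) / branch.visits)


def pick_branch(branches: List[Branch], prune_below: float) -> Optional[Branch]:
    """Drop branches scoring under prune_below, then take the best survivor."""
    total = max(sum(b.visits for b in branches), 1)
    scored = [(ucb1(b, total), b) for b in branches]
    alive = [(s, b) for s, b in scored if s >= prune_below]
    if not alive:
        return None  # nothing worth extending: return to an earlier branch point
    return max(alive, key=lambda pair: pair[0])[1]


tried = [Branch("official site", 2.4, 3), Branch("forum thread", 0.1, 4)]
print(pick_branch(tried, prune_below=0.0).name)
print(pick_branch(tried + [Branch("archive", 0.0, 0)], prune_below=0.0).name)
```

An untried branch receives an infinite score, so it is always explored once before the rule starts trading off mean value against uncertainty.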

8
High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

Few-step diffusion distillation has become increasingly mature for 4-8-step generation, yet pushing further to 2 steps remains challenging. In this work, we introduce Z-Image Turbo++, a high-quality 2-step image generation model distilled from the 8-step Z-Image Turbo teacher. Our method addresses the central bottlenecks of increased task difficulty and limited model capacity in 2-step generation through three simple but effective design choices tailored to this regime. First, we propose Distribution-Aligned Adversarial Learning, which uses teacher-generated images rather than external real images as real samples for GAN training, providing a more attainable and informative adversarial target. Second, we adopt Step-Decoupled Parameterization, assigning independent model parameters to the two denoising steps to better match their distinct capacity demands. Third, we perform End-to-End Training with Iterative Regularization, allowing the first step to receive gradients from final image quality while preserving a meaningful intermediate generation through an explicit step-1 loss. Together, these designs substantially narrow the quality gap between 2-step and 8-step generation in both qualitative and quantitative evaluations, highlighting the potential of carefully tailored distillation strategies for improving the quality-efficiency trade-off in few-step generation.

6
Visual Para-Thinker++: A Single-Policy Multi-Agent Framework for Visual Reasoning

Visual reasoning requires integrating evidence distributed across regions, attributes, and relations, making single-chain reasoning prone to early perceptual commitment and hallucination. We propose Visual Para-Thinker++, a single-policy multi-agent framework in which one shared MLLM policy is instantiated as role-conditioned Main, Worker, and Summary Agents. The Main Agent decomposes the task with fixed allocation patterns; Worker Agents reason in parallel under context isolation; and the Summary Agent reconciles full Worker reasoning traces rather than majority-voting on final labels. The shared policy is trained by Multi-Agent Capability Injection and Role-Decoupled Multi-Agent Optimization, which assign role-specific rewards and advantages to corresponding token segments to reduce gradient conflict among collaborative roles. A native inference engine enables efficient multi-agent rollout through shared visual prefix and KV cache reuse. Across V*, CountBench, the RefCOCO family, and HallusionBench, Visual Para-Thinker++ consistently outperforms single-trajectory and inference-time parallel baselines, with especially strong gains on hallucination-sensitive visual reasoning.

6
HarnessBridge: Learnable Bidirectional Controller for LLM Agent Harness

Large language models are increasingly deployed as agents for long-horizon tasks, yet their performance is shaped not only by model capability and environment design, but also by the harness that mediates agent-environment interaction. Existing harnesses are largely manually engineered, making them difficult to scale as trajectories grow longer and interactions become more complex. In this work, we ask whether a harness can be generated by a learnable plug-in module that can be trained in an end-to-end fashion. We introduce HarnessBridge, a lightweight learnable harness controller that parameterizes the agent-environment interface as a bidirectional projection. HarnessBridge learns two projections: observation projection, which distills raw trajectories into compact, decision-relevant states, and action projection, which converts proposed actions into executable transitions or trajectory-grounded rejections. We train HarnessBridge on a harness supervision dataset via unified instruction tuning. On Terminal-Bench 2.0 and SWE-bench Verified, HarnessBridge matches or surpasses strong specialized harnesses while substantially reducing token usage and trajectory length, and generalizes from smaller generators to larger commercial models.

6
SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

On-policy distillation (OPD) trains a student on its own trajectories with dense per-token supervision from a stronger teacher, and often outperforms off-policy distillation and standard reinforcement learning. However, we find that its effectiveness implicitly relies on two assumptions that frequently break in practice: trajectory-level alignment between the student and the teacher, and uniform token-level reliability of the teacher's preferences. We therefore propose Sign-Gated On-Policy Distillation (SG-OPD), which uses a binary verifier as a trust signal for the teacher at two complementary granularities: phased teacher sampling mixes in verifier-endorsed teacher rollouts at cold-start, and a sign-consistency gate extrapolates the distillation update on tokens where the teacher agrees with the verifier-correct direction and interpolates it where it disagrees. Experiments on competition-level mathematical reasoning benchmarks show that SG-OPD consistently outperforms standard OPD, with average gains of 1.98 and 7.50 at the per-sample and per-question levels, respectively.

5
EvoBrowseComp: Benchmarking Search Agents on Evolving Knowledge

Search Agents (large language models augmented with search tools) have intensified the need for future-proof evaluation benchmarks. Existing benchmarks such as BrowseComp rely on static knowledge, making them vulnerable to test-set contamination and parametric memorization. Consequently, models can achieve high scores through fact recall rather than genuine retrieval, obscuring true browsing competence via reasoning shortcuts. In this paper, we introduce EvoBrowseComp, an evolving benchmark of 400 English and 400 Chinese contamination-free complex questions synthesized via live-web traversal. To collect these questions, we design a three-agent collaborative framework: (1) a QA synthesis agent that retrieves fresh knowledge from the live web to synthesize QA pairs; (2) an information filtering agent that filters retrieved knowledge in terms of credibility and popularity to block parametric shortcuts; and (3) a high-level guidance agent that formalizes questions into reasoning graphs to reduce logical redundancy and shortcuts in synthesized QA pairs. Because the framework supports fully automated synthesis, EvoBrowseComp can be regularly updated to prevent data contamination and maintain temporal freshness. Extensive experiments confirm its great difficulty, requiring broad horizontal search. It establishes a scalable paradigm for auto-updatable, high-difficulty benchmarking that keeps pace with both evolving world knowledge and advancing agent capabilities.

3
ArogyaSutra: A Multi-Agent Framework for Multimodal Medical Reasoning in Indic Languages

Multimodal Large Language Models (MLLMs) have shown promising reasoning capabilities in general domains, yet their performance remains limited in specialized settings such as healthcare, especially in multilingual and low-resource scenarios. This gap is critical in regions like rural India, where patients often express complex medical queries in native Indic languages and rely on multimodal inputs such as medical images. Existing English-centric MLLMs struggle to support such use cases, limiting equitable access to AI-driven healthcare assistance. To address this challenge, we introduce ArogyaBodha, a large-scale multilingual multimodal medical question-answer dataset constructed from eight heterogeneous sources, covering 31 body systems, six imaging modalities, and 21 clinical domains across English and seven major Indian languages. We further propose ArogyaSutra, an actor-critic-based multi-agent framework that integrates tool grounding with dual-memory mechanisms for step-wise, reasoning-aware decision making, and uses stored actor-critic simulation trajectories for distillation. Experiments show that our dataset and framework improve multilingual medical reasoning accuracy across all Indic languages, with ablations validating the contribution of each component. The source code and dataset are available at: https://iitp-cse.github.io/ArogyaSutra/

2
MaskAlign: Token-Subset Representation Alignment for Efficient Diffusion Training

Representation alignment with pretrained vision models has recently shown strong potential for accelerating diffusion transformer training. By aligning intermediate diffusion features with clean-image representations from self-supervised vision encoders, existing methods improve convergence and generation quality. However, such alignment also introduces a non-trivial constraint: diffusion models operate on noisy inputs whose usable information varies across timesteps, while the reference features are extracted from clean images. In this paper, we revisit this mismatch from a token-level perspective. We find that, under full-token representation alignment, tokens with large alignment-gradient norms exhibit a stable spatial preference, suggesting that the alignment objective does not affect all tokens uniformly and may encourage the model to rely on the complete set of clean-image tokens. To address this issue, we propose MaskAlign, a token-subset representation alignment method that applies alignment to randomly sampled token subsets during training. By exposing the model to different token subsets across iterations, MaskAlign reduces the dependence of representation alignment on the complete token set and encourages alignment behavior that is more stable under token-subset perturbations. To mitigate the information loss caused by directly dropping tokens, we further introduce a lightweight pre-mask token mixing block that shares information across tokens before masking.

2
MuJoCo-Drones-Gym: A GPU-Accelerated Multi-Drone Simulator for Control and Reinforcement Learning

Robotic simulators are a cornerstone of modern research in aerial robotics, serving both as a vehicle for the development of new control algorithms and as the data source for training reinforcement learning (RL) policies. Yet, existing quadcopter learning environments often face a trade-off between physical fidelity, multi-agent support, and the throughput required by modern deep RL pipelines. In this paper, we present MuJoCo-Drones-Gym, an open-source Gymnasium-compatible multi-drone environment built on top of the MuJoCo physics engine. MuJoCo-Drones-Gym supports an arbitrary number of Bitcraze Crazyflie 2.x nano-quadcopters and exposes a modular API for selecting (i) the physics model (rigid-body MuJoCo, explicit Python dynamics, or any subset of ground effect, blade drag, and inter-drone downwash), (ii) the action interface (per-motor RPMs, collective normalized thrust, velocity setpoints, or PID waypoint commands), and (iii) the observation space (kinematic state vectors, RGB / depth / segmentation cameras, or neighbourhood adjacency information). A PettingZoo ParallelEnv wrapper enables drop-in multi-agent reinforcement learning, while a suite of seven task environments (hover, velocity tracking, multi-drone hover, waypoint navigation, formation flight, gate racing, and a generic multi-agent template) demonstrates the breadth of the interface. We describe the environment design, the underlying physics and quadcopter dynamics, and illustrate its use through control and learning examples that mirror those of the closely related gym-pybullet-drones project, while taking advantage of MuJoCo's improved contact handling, rendering, and parallelizability.

2
Evoflux: Inference-Time Evolution of Executable Tool Workflows for Compact Agents

Compact language models (LMs) reduce cost, latency, and deployment risk for tool agents. Yet MCP-style tool use requires more than isolated function calling: an agent must discover tools from live catalogs, satisfy schemas, preserve dependencies across intermediate outputs, and ground final responses in executed evidence. Small planners often generate plausible workflow graphs that fail under tool resolution, parameter validation, dependency tracking, or execution. We argue that this failure mode is poorly handled by small-corpus distillation. A few hundred teacher traces can teach workflow format, but rarely cover the recovery behavior needed to repair failed plans over changing tool catalogs. We introduce Evoflux, an inference-time evolutionary search method that treats compact tool use as the repair of executable tool workflows. It evolves typed workflow graphs through structured edits, execution feedback, adaptive intensity, meta-guided redesign, and diversity pruning. On held-out MCP-Bench tasks spanning live MCP servers and 250 tools, Evoflux raises execution feasibility from roughly 3% to 17-24% across small planners. In contrast, SFT and SFT+DPO on the same search-mined data match, underperform, or collapse below zero-shot performance; ReAct reaches higher peaks, but with higher variance and token cost. These results show that execution-grounded search is more reliable under scarce teacher-trace budgets.

2
Surflo: Consistent 3D Surface Flow Model with Global State

Geometry is invariant to viewpoint, which makes any collection of images a redundant encoding of a single 3D state. Existing feed-forward reconstruction models fail to exploit this: per-view methods emit overlapping, unaligned pointmaps that grow linearly with input count, while global-latent methods commit to a fixed, low-resolution output. We introduce Surflo, which compresses a variable number of unposed RGB views into K latent tokens (one global state) and decodes oriented 3D surface points by independently transporting them from noise onto the surface via flow matching. This frees the output from any fixed grid or token budget: the same latent yields from a few thousand to a million points in a single forward pass. To suppress the local inconsistencies inherent to independent per-point decoding, an inference-time guidance term correlates nearby points by injecting a photometric gradient during ODE integration. Surflo matches or surpasses feed-forward baselines on surface metrics, runs an order of magnitude faster than optimization-based methods that require hundreds of views, and is the only feed-forward approach to combine a global latent with arbitrary-resolution decoding.

2
Risk Under Pressure: Compute-Aware Evaluation of Adversarial Robustness in Language Models

Adversarial robustness evaluations of large language models (LLMs) typically report attack success rate (ASR) under fixed query budgets, implicitly treating all attacks as equally costly. In practice, the computational expense of different attack strategies can vary by orders of magnitude. Consequently, ASR at a fixed budget can obscure the true effort required to jailbreak a model, thereby making it hard to determine whether an attack's cost justifies its payoff to the attacker. We propose a compute-aware evaluation framework based on computational pressure, measured in cumulative floating-point operations (FLOPs), as a proxy for adversarial effort. We introduce risk-compute curves, which map compute budgets to attack risk, and derive two metrics that summarize the average pressure required for a given attack to succeed. Across ten models spanning three families and four different stages in language model training and alignment, evaluated with three attack strategies (gradient-based, iterative refinement, and template-based) on two jailbreak robustness benchmarks, we find: (1) alignment training has non-monotonic effects on compute-space robustness; (2) scaling model size reduces gradient-based attack effectiveness but has limited impact on cheaper template-based attacks; (3) gradient-based attacks optimized on a surrogate model can transfer to a separate target model, providing a way to reduce attacker costs; (4) compute cost varies by up to ≈5× across harm categories within a single model; and (5) safety-aligned RL increases aggregate cost while leaving some categories disproportionately accessible. We release our framework to enable compute-aware risk assessment and evaluation.
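One plausible reading of a risk-compute curve is sketched below. It assumes that, for each test prompt, we have recorded the cumulative FLOPs spent before the attack first succeeded, or `None` if it never did. Risk at a budget is then taken to be the fraction of prompts broken within that budget. The function name and this exact definition are assumptions for illustration; the abstract does not spell out the paper's formulas or its two summary metrics.

```python
def risk_compute_curve(flops_to_success, budgets):
    """Map each compute budget to the fraction of prompts jailbroken within it.

    flops_to_success: one entry per prompt, the cumulative FLOPs at first
        success, or None when the attack never succeeded.
    budgets: compute budgets (FLOPs) at which to evaluate risk.
    """
    n = len(flops_to_success)
    curve = []
    for budget in budgets:
        hits = sum(
            1 for f in flops_to_success if f is not None and f <= budget
        )
        curve.append(hits / n)
    return curve


if __name__ == "__main__":
    observed = [1e12, 5e12, None, 2e13]  # made-up numbers for illustration
    print(risk_compute_curve(observed, [1e12, 1e13, 1e14]))
```

Unlike a single ASR figure at a fixed query budget, the curve shows how quickly risk accumulates as the attacker spends more compute.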

1
ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs

Large language models deployed as agents over large tool catalogs face a critical tool-retrieval bottleneck. Because embedding-based retrieval approaches rely on compact encoders that may under-capture specialized tool semantics, parametric tool retrieval instead encodes each tool as a virtual token appended to the LLM vocabulary and fine-tunes the LLM in two stages (memorization, then retrieval SFT) to act as a retriever, achieving strong performance on standard ToolBench retrieval benchmarks. Yet these benchmarks use verbose, fully specified queries, and their evaluation applies constrained decoding that restricts outputs to valid token paths; neither reveals whether the model actually understands its tools. We introduce ToolSense, an open-source LLM-powered diagnostic framework that takes any tool catalog as input and automatically generates three benchmarks: a Realistic Retrieval Benchmark (RRB) with queries at three ambiguity tiers, an MCQ probing benchmark, and a QA probing benchmark. Applying ToolSense to ToolBench (~47k tools) and evaluating five parametric model training configurations reveals a knowledge-retrieval dissociation: on RRB queries, several configurations collapse by ~50-64 percentage points compared to fully specified ToolBench benchmarks, falling below the embedding-model baseline. Additionally, despite strong retrieval performance, some models score near-random on factual probes, further evidence of this dissociation. We open-source the ToolSense framework and the ToolBench diagnostic benchmarks at https://github.com/SAP/toolsense.

1
PianoKontext: Expressive Performance Rendering from Deadpan Context

Expressive performance rendering (EPR) aims to generate realistic performances constrained on sequences of notes. However, flow matching audio editing models manipulate only synchronized music samples of the same duration, limiting their understanding of expressive timing. We introduce PianoKontext, a flow matching rendering model for classical piano music that generates variable-length performances in the latent space of a pretrained Music2Latent model. We synthesize MIDI scores into deadpan audio and employ Dynamic Time Warping (DTW) in the latent space to construct paired data for training. The aligned embeddings are concatenated in DiT blocks, allowing for a simple and effective learning of the dependencies between the score and performances. Audio samples are available at our demo page: https://realfolkcode.github.io/pianokontext_demo/.
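For readers unfamiliar with Dynamic Time Warping, the following is a textbook sketch of the alignment step, written in plain Python. It assumes each sequence is a list of equal-dimension latent frames and uses Euclidean distance as the frame cost; the abstract does not state which cost or step pattern PianoKontext uses, so treat both as assumptions.

```python
import math


def dtw(a, b):
    """Align two sequences of vectors; return (total_cost, warping_path).

    The path is a list of (index_in_a, index_in_b) pairs in increasing order.
    """
    n, m = len(a), len(b)
    inf = math.inf
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # match both frames
                cost[i - 1][j],      # repeat a frame of b
                cost[i][j - 1],      # repeat a frame of a
            )
    # Backtrack from the end to recover which frames were paired.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    score = [(0.0,), (1.0,), (2.0,)]                 # "deadpan" frames
    performance = [(0.0,), (1.0,), (1.0,), (2.0,)]   # held note in the middle
    print(dtw(score, performance))
```

The warping path lets a score frame pair with several performance frames, which is how sequences of different lengths can be turned into aligned training pairs.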

1
IDEAL: In-DEpth ALignment Makes A Discrete Representation AutoEncoder

Built on pretrained vision foundation models (VFMs), representation autoencoders (RAEs) have recently emerged as a promising approach for constructing semantically rich latent spaces for image generation. However, their reconstruction quality often remains suboptimal, largely because deep VFM representations do not preserve sufficient fine-grained visual detail. This limitation becomes even more severe after discretization, where missing low-level information is difficult to recover. In fact, we observe that shallow VFM features retain considerably richer local appearance and structural detail, which complements the high-level semantics carried by deep features used in existing RAEs. Motivated by this complementary property, we propose Ideal, an In-depth Alignment framework for discrete representation autoencoding. By jointly aligning quantized tokens with both shallow and deep VFM features, Ideal enables the resulting discrete visual tokens to preserve both visual fidelity and rich semantics. Extensive experiments demonstrate that Ideal yields superior reconstruction performance, achieving 0.61 rFID on ImageNet and outperforming the previous best method by 0.28. When used for autoregressive image generation, Ideal further produces a gFID of 1.89, establishing a new state of the art for autoregressive image generation.

1
WEAVER, Better, Faster, Longer: An Effective World Model for Robotic Manipulation

The potential impacts of world models (WMs, i.e., learned simulators) on robotics are far-reaching: policy evaluation, policy improvement, and test-time planning, all with limited real-world interaction. To unlock these downstream capabilities, a WM needs to jointly satisfy three desiderata: (i) fidelity (i.e., producing simulated trajectories that correlate with reality), (ii) consistency (i.e., producing simulated trajectories that are coherent over long horizons), and (iii) efficiency (i.e., producing simulated trajectories quickly). We propose WEAVER (World Estimation Across Views for Embodied Reasoning): a WM architecture that simultaneously achieves all three desiderata, providing state-of-the-art results on robotic manipulation tasks. WEAVER is a multi-view WM trained to predict future latents and reward values via a flow-matching loss. We distill the key design decisions across model architecture, memory, and prediction objectives required to unlock the kinds of long-horizon dynamic manipulation tasks that have confounded prior world modeling approaches. We apply WEAVER in robotic hardware, demonstrating its effectiveness at policy evaluation (ρ=0.870 correlation with real-world success rate), policy improvement (real-world success rate improvement of 38% on top of the π_{0.5} robot foundation model), and test-time planning (real-world success rate improvement of 14% with a 5-10× speedup over prior WMs). WEAVER also demonstrates better performance than prior WMs when evaluated on out-of-distribution scenarios. Code, models, and videos at: https://arnavkj1995.github.io/WEAVER/.

1
A Stationary (and Therefore Compatible) Representation is All You Need

Learning compatible representations aims to learn feature representations that can be used interchangeably over time whenever a model undergoes updates. In this paper, we demonstrate that stationary representations learned by d-Simplex fixed classifiers imply compatibility as in its formal definition. This result establishes a foundation for future works and can be directly exploited in practical learning scenarios. We address the challenge of learning compatibility using d-Simplex fixed classifiers when the model is sequentially fine-tuned. Learning according to a d-Simplex fixed classifier with the cross-entropy loss aligns feature distributions at the first-order statistics. Consequently, it may not fully capture higher-order dependencies in the representation between model updates. To address this issue, we demonstrate that training the model using a d-Simplex fixed classifier through a convex combination of the cross-entropy loss and a contrastive loss not only captures higher-order dependencies, but is also equivalent to learning with the cross-entropy under the compatibility constraints. We confirm our findings with extensive experiments also considering a new scenario where a pre-trained model is sequentially fine-tuned and occasionally replaced with an improved model. We show that stationary representations enable uninterrupted retrieval services (without reprocessing gallery images) while improving performance during model updates and replacements, achieving state-of-the-art. Code at https://github.com/miccunifi/iamcl2r.
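As a rough illustration of what a fixed simplex classifier looks like, the sketch below builds non-trainable class prototypes at the vertices of a regular simplex and shows a convex combination of two loss values. It is an assumption-laden sketch, not the authors' code: the vertices are constructed by centring the standard basis (so they live in a K-dimensional space but span only K-1 dimensions), and `combined_loss` and its weight `lam` are hypothetical names.

```python
import math


def simplex_classifier(num_classes):
    """Return num_classes unit vectors at the vertices of a regular simplex.

    Construction: take the standard basis vectors, subtract their centroid,
    and normalise. Every pair of rows then has cosine -1 / (num_classes - 1).
    These rows would be used as fixed (non-trainable) classifier weights.
    """
    k = num_classes
    rows = []
    for i in range(k):
        v = [(1.0 if j == i else 0.0) - 1.0 / k for j in range(k)]
        norm = math.sqrt(sum(x * x for x in v))
        rows.append([x / norm for x in v])
    return rows


def combined_loss(cross_entropy, contrastive, lam=0.5):
    """Convex combination of two scalar loss values, with 0 <= lam <= 1."""
    return lam * cross_entropy + (1.0 - lam) * contrastive


if __name__ == "__main__":
    w = simplex_classifier(4)
    dot = sum(a * b for a, b in zip(w[0], w[1]))
    print(round(dot, 6))  # pairwise cosine between two class prototypes
```

Because the prototypes never move, features learned before and after a model update are pulled toward the same fixed directions, which is the intuition behind stationarity.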

0
Revisiting Articulated Parts Perception in Robot Manipulation

We are surrounded by various objects with movable, articulated parts, e.g., box, handle, door. An accurate and generalizable perception of articulated parts is essential to enhance robotic manipulation capabilities. Building on this need, recent efforts in articulated parts perception have followed two main directions: one line of work uses pose-based representation, which requires high manual cost; in parallel, affordance-based methods extract future object motion from point tracking without additional manual effort, but suffer from low-quality data. In this paper, we propose a new representation of articulated parts, Geometric Primary Structure (GPS), an abstraction of the part geometry structure to balance scalability and quality. For efficient and scalable data collection, GPS is integrated with a portable Virtual Reality (VR) device and requires only one minute to annotate one object sequence. This direct human annotation provides higher quality than the estimated affordance. With this efficient VR-GPS system, we collect 41K frames for 234 objects across six part classes, and train a generalizable GPS model with a single RGB-D object image as input. For object manipulation, we deploy a heuristic policy based on GPS prediction. Without any in-domain fine-tuning, our method achieves a 73% success rate, covering 270 initial states for 9 objects. Our code, data and reusable tool are available at https://enlighten0707.github.io/gps.

0
Flash-GMM: A Memory-Efficient Kernel for Scalable Soft Clustering

We present Flash-GMM, a fused Triton kernel for efficient computation of Gaussian Mixture Models (GMMs) over large-scale data in a single GPU pass. By eliminating the need to materialize the full responsibility matrix in GPU memory, Flash-GMM achieves a 20× speedup over existing implementations and enables training on datasets more than 100× larger than previously feasible on one device. To demonstrate its impact, we integrate Flash-GMM into the IVF coarse quantizer for approximate nearest-neighbor (ANN) search. We show that soft GMM clustering is now a viable drop-in replacement for k-means, and that GMM responsibilities can be leveraged to assign border vectors to multiple clusters. Our approach reaches fixed recall targets with up to 1.7× fewer distance computations, or equivalently, yields +2 to +12 recall@10 at matched computational cost. We release the kernel as an open-source project.
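The memory idea can be illustrated without a GPU. The sketch below is not the Flash-GMM Triton kernel; it is a plain-Python, 1-D analogue that runs one EM iteration while keeping only per-component running sums, so the N-by-K responsibility matrix is never stored. The function name and the chunked input format are assumptions for illustration.

```python
import math


def em_step_streaming(data_chunks, weights, means, variances):
    """One EM iteration for a 1-D GMM using O(K) running statistics.

    Responsibilities are computed per point, folded into the sums
    n_k, s1, s2, and then discarded immediately.
    """
    k = len(weights)
    n_k = [0.0] * k   # sum of responsibilities per component
    s1 = [0.0] * k    # sum of r * x
    s2 = [0.0] * k    # sum of r * x^2
    n = 0
    for chunk in data_chunks:
        for x in chunk:
            log_p = [
                math.log(weights[j])
                - 0.5 * math.log(2 * math.pi * variances[j])
                - 0.5 * (x - means[j]) ** 2 / variances[j]
                for j in range(k)
            ]
            # log-sum-exp for a numerically stable normaliser
            top = max(log_p)
            log_norm = top + math.log(sum(math.exp(lp - top) for lp in log_p))
            for j in range(k):
                r = math.exp(log_p[j] - log_norm)
                n_k[j] += r
                s1[j] += r * x
                s2[j] += r * x * x
            n += 1
    # M-step from the accumulated sums (assumes no component is empty).
    new_w = [n_k[j] / n for j in range(k)]
    new_mu = [s1[j] / n_k[j] for j in range(k)]
    new_var = [max(s2[j] / n_k[j] - new_mu[j] ** 2, 1e-6) for j in range(k)]
    return new_w, new_mu, new_var


if __name__ == "__main__":
    chunks = [[0.0, 0.1, -0.1], [10.0, 10.1, 9.9]]
    print(em_step_streaming(chunks, [0.5, 0.5], [1.0, 9.0], [1.0, 1.0]))
```

A fused kernel applies the same principle on the GPU, presumably by keeping each tile's responsibilities in fast on-chip memory rather than writing them out.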

0
Leveraging Morphology for Historical Script Metrological Analysis

Advances in handwritten text recognition have enabled large-scale transcription of historical documents, but still provide limited access to interpretable visual measurements for paleography, the study of historical scripts. In this paper, our main insight is that morphological script analysis, in particular the capacity to learn character prototypes from line-level transcriptions, enables the definition of scalable, meaningful, and stable paleographic measurements. More precisely, we leverage a transformer-based detection architecture together with a prototype-based line reconstruction module to learn prototypical characters and their occurrence, deformation, and positioning. Our contributions are twofold. First, we introduce a deep architecture and learning methodology that enables efficient character modeling with only line-level transcription supervision, significantly improving over the Learnable Typewriter baseline and enabling accurate character bounding box prediction, unlocking its potential for paleographic measurements. Second, we introduce and demonstrate the paleographical relevance of automatic measurements enabled by our architecture for characters, bi-grams, and spaces between graphical units. For this demonstration, we extend the annotations of the codex Paris, BnF, fr. 2813, commissioned in the late fourteenth century by Charles V and copied by four hands, to 160 pages. We visualize our measurements over these pages, showing how they enable us not only to differentiate graphical profiles, but also to discover and analyze subtle variations. This case study outlines the scalability of our approach and its frugality in terms of required training data, since a single column of text is sufficient to compute our measurements on each of the 160 pages. Data and code are publicly available at: https://malamatenia.github.io/morphology4metrology-analysis.

0
05

PRODUCT HUNT

05.00
PRODUCT HUNT

Product Hunt - June 12, 2026

Product Hunt Daily Feed: Featuring noteworthy tech launches.

HyperSleep

Block social media until you've actually slept

0
NODUS PH Radar for Product Hunt

Product Hunt analytics beyond the daily leaderboard

0
Firma.dev

E-signatures API for your app averaging ~3¢ per envelope

0
KOSH Money

USD account & credit cards for freelancers & creators

0
Bob's CLI

A local-first AI coding CLI that adapts to you

0
Meet Warren 3.0

Your voice-supported AI financial planning partner

0
Keep

Full-screen 3D clock scenes for your iPhone or Mac

0
Pond

Fundraising, GTM, and bounties for startups

0
pleNx — Plex client for Nintendo Switch

The first native Plex client for Nintendo Switch

0
Clutch Alarm

Sleep through the night. Wake up for the goals.

0
Medicyn

Your complete medical history privately on your device

0
Qursor

Point at any UI to send exact context to your AI

0
CueBuddy

Record talking videos without manual scrolling

0
Insta360 Luna Ultra

A gimbal camera that sees with you

0
Slack Data Agent

Ask about your data without leaving Slack

0
ShellMate

Manage SSH servers, credentials, and teams in one place

0
Tide

Layered voice notes that paint themselves

0
LocIn AI

Localize your app with tone-aware AI, automated workflows

0
INVO Ride

Book autonomous eVTOL flights over photoreal San Francisco

0
Onpilot

An AI workforce customized to your business

0
Nodey

Your n8n command center, now on your phone

0
Proxee

Your localhost on your phone, synced.

0
Riven

Your Apple Watch knows when you've truly hit muscle failure

0
Mute

A visual productivity tool to visualize your brain dump

0
EndpointMe

Your identity as a live, queryable API endpoint

0
DockLog

Docker logs without the logging stack

0
Airbrush Studio

AI-powered photo editor for pro results w/o manual editing

0
Journey Now

Learning copilot for human ambition

0
Bond

The AI to-do list that does itself

0
Asmi AI

AI that handles your personal chores in the real world

0
Bugpilot

Turn errors, DOM, + screenshots into an AI-ready Markdown

0
Easybilling

AI-native billing & payments for usage-based AI products

0
CrustRecruiter

Turn Claude into a recruiter that thinks like you

0
Respan Gateway

One AI gateway with built-in observability and evals

0
heyly.io

A video greeting + buttons where you ARE the testimonial

0
SlimSnap

Your AI doesn't know which button you mean

0
OwnClip

Native macOS screen recorder with local-first AI privacy

0
Cloudskill

Govern the AI skills your team depends on

0
Lium AI

AI for Complex Data

0
PixelForge

Turn photos into game assets

0
SpatialChat

Product feature update for our virtual conferencing platform

0
Terminal Mode by Even Realities

Keep coding agents always in sight

0
Juno

AI Health Companion for Chronic Illness

0
Patchrooms

Turn AI-app feedback into agent-ready patch context.

0
Tabstack Structured Extraction

Extract web data into structured JSON, no scraper required.

0
Slashspace AI

Canvas first AI experience for sustained, complex work

0
Napkin Math

personalized AI food journal + nutrition coach

0
Gemini 3.5 Live Translate

Latest audio model for live speech-to-speech translation

0
Publora

A publishing API for agents to post on 10 social platforms

0
BlenderHunt

The indie marketplace for Blender artists and creators

0
06

TECHMEME

06.00
TECHMEME

Techmeme - June 12, 2026

Techmeme Digest: Major tech headlines and industry conversations.

Sam Bankman-Fried loses his bid to overturn his fraud conviction and 25-year prison sentence over the collapse of FTX (Luc Cohen/Reuters)
Source: Techmeme · Published: Jun 12, 2026

Luc Cohen / Reuters: Sam Bankman-Fried lost on Friday his bid to overturn his fraud conviction and 25-year prison sentence over the collapse of the FTX cryptocurrency exchange he founded.

Sources: six months after acquiring Rivos, Meta is struggling to integrate the chip startup and halted development of a chip for training its largest AI models (The Information)
Source: Techmeme · Published: Jun 12, 2026

The Information: Meta Platforms bought semiconductor startup Rivos last year to accelerate development of in-house chips and reduce its reliance …

Sources: French startup Mistral AI is in talks to raise ~€3B at a ~€20B valuation; it was last valued at €11.7B during a funding round in September 2025 (Bloomberg)
Source: Techmeme · Published: Jun 12, 2026

Bloomberg: French startup Mistral AI is in talks to raise around €3 billion ($3.5 billion) at a valuation of roughly €20 billion …

Companies hit by rising AI costs are increasingly using tools that tap cheaper models, including some from China, putting price pressure on OpenAI and Anthropic (Wall Street Journal)
Source: Techmeme · Published: Jun 12, 2026

Wall Street Journal: Startups and tech giants alike are mixing and matching AI models to avoid the premium prices charged by industry leaders.

Niantic Spatial says Pokémon Go data is "not part of" its deal with spatial AI company Vantor, after concerns that game data could be used for military drones (Kenneth Shepard/Kotaku)
Source: Techmeme · Published: Jun 12, 2026

Kenneth Shepard / Kotaku: The company says it no longer receives data from the monster-catching app after Niantic was acquired by Scopely.

New York City-based Current, which manages a consumer fintech platform, raised an $80M Series E at a $1.5B valuation led by Springcoast Partners (FinSMEs)
Source: Techmeme · Published: Jun 12, 2026

FinSMEs: Current, an NYC-based provider of a consumer fintech platform, raised $80M in Series E equity financing at a $1.5 billion valuation.

Google sues Chinese cybercrime network Outsider Enterprise, accusing it of using Gemini AI to create fake websites and scam hundreds of thousands of Americans (Cecilia Kang/New York Times)
Source: Techmeme · Published: Jun 12, 2026

Cecilia Kang / New York Times: Google warned that artificial intelligence had supercharged the problem of online scams.

SpaceX's trading debut will be a test of the "Musk premium", the force behind Tesla's $1T+ valuation, and a gauge of investor appetite ahead of upcoming AI IPOs (Reuters)
Source: Techmeme · Published: Jun 12, 2026

Reuters: SpaceX (SPCX.O) is set to begin trading on Nasdaq on Friday after investors poured $75 billion into the world's biggest IPO ever …

FanDuel owner Flutter plans to delist its London shares on August 3, citing low trading and high costs; Flutter moved its primary listing to the US in 2024 (Lauren Almeida/The Guardian)
Source: Techmeme · Published: Jun 12, 2026

Lauren Almeida / The Guardian: Gambling business, which also owns Betfair, to focus on New York in latest high-profile blow to UK stock market.

Memory chipmaker Kioxia replaces Toyota as Japan's most valuable company; Kioxia's shares surged 7.6% on Friday, lifting its market value to ~$274B (Kanoko Matsuyama/Bloomberg)
Source: Techmeme · Published: Jun 12, 2026

Kanoko Matsuyama / Bloomberg: Memory chipmaker Kioxia Holdings Corp. replaced Toyota Motor Corp. to become Japan's largest company by market value, underscoring …

Sources: Nvidia has told Chinese clients that its new Vera CPUs for AI data centers could be available as soon as August and that they can begin placing orders (Reuters)
Source: Techmeme · Published: Jun 12, 2026

Reuters: Nvidia (NVDA.O) has told Chinese clients that its new “Vera” central processors for AI data centres could be available as soon …

How Chinese manufacturers are dominating the humanoid robot supply chain, even as the industry struggles to find a purpose for such robots (New York Times)
Source: Techmeme · Published: Jun 12, 2026

New York Times: China's lead in humanoid robots was evident at last month's Humanoids Summit, a robotics conference in Tokyo.

Hyderabad, India-based Equal AI, which makes an eponymous AI-powered call screening app, raised a $30M Series B led by Prosus Ventures and Tomales Bay Capital (Ivan Mehta/TechCrunch)
Source: Techmeme · Published: Jun 12, 2026

Ivan Mehta / TechCrunch: In India, consumers receive a lot of calls every day, ranging from spam and scams to delivery people and financial service companies trying to contact them.

Oracle warns customers of a critical PeopleSoft flaw after ShinyHunters claimed breaches of 100+ organizations using PeopleSoft; Oracle has not issued a patch (Lorenzo Franceschi-Bicchierai/TechCrunch)
Source: Techmeme · Published: Jun 12, 2026

Lorenzo Franceschi-Bicchierai / TechCrunch: Oracle warned its corporate customers that there is a critical-rated vulnerability in its PeopleSoft software …

Infineon plans to open a €5B chip factory in Germany backed by EU subsidies, its largest single investment, on July 2 as Europe seeks to boost chip production (Christina Kyriasoglou/Bloomberg)
Source: Techmeme · Published: Jun 12, 2026

Christina Kyriasoglou / Bloomberg: Infineon Technologies AG is preparing to open its largest single investment, a €5 billion ($5.8 billion) …

07

STARTUP ARCHIVE

07.00
STARTUP ARCHIVE

Startup News - June 12, 2026

Startup News Roundup: Aggregating key funding and launch updates.

Marc Andreessen on the 5 personality traits of an innovator
Source: Startup · Published: Mar 31, 2026

“When you’re talking about real innovators—people who actually do really creative, breakthrough work—I think you’re talking about a couple things:”

Steve Jobs explains the importance of both thinking and doing
Source: Startup · Published: Mar 30, 2026

“The doers are the major thinkers. The people who really create the things that change this industry are both the thinker-doer in one person.”

Tobi Lutke explains what the VCs who passed on Shopify got wrong
Source: Startup · Published: Mar 27, 2026

“What a lot of free-market thinkers don’t understand is that between the demand and eventual supply lies friction.”

Sam Altman explains how he decides to invest in a startup after 10 minutes
Source: Startup · Published: Mar 26, 2026

“Does this person have the potential to be the next Mark Zuckerberg?… [You don’t get to] 100% accuracy, obviously, but it’s good enough that our business model works.”

Jony Ive recounts the time Steve Jobs called him vain
Source: Startup · Published: Mar 25, 2026

In the clip below, Jony Ive recounts the time he asked Steve Jobs to be less harsh in his critique of a piece of work.

Jeff Bezos’s two pieces of advice for aspiring entrepreneurs
Source: Startup · Published: Mar 24, 2026

“The advice that I would give entrepreneurs is don't chase the hot new thing. It's so hard to catch something that everybody already knows is hot.”

Elad Gil: “Things that work tend to work pretty fast”
Source: Startup · Published: Mar 23, 2026

“I do think there’s a bit of a myth in Silicon Valley that you should keep grinding no matter what and it’s just about perseverance, and I think that’s really bad advice.”

Paul Graham on why starting with a “small, intense fire” is the key to startup growth
Source: Startup · Published: Mar 20, 2026

"You have to know who those first users are and how you're going to get them."

Keith Rabois on how to identify great talent
Source: Startup · Published: Mar 19, 2026

“What you want to do with every single employee every single day is expand the scope of their responsibilities until it breaks… and that’s the role they should stay in.”

Wealthfront CEO on why advertising spend makes it harder to find product/market fit
Source: Startup · Published: Mar 18, 2026

“The way that you know you have product/market fit is if you have exponential organic growth.”

Eric Schmidt on why most companies get strategy wrong
Source: Startup · Published: Mar 17, 2026

“Work very, very hard to figure out what the world’s going to look like in five years. What will people be doing? What will your customers want? Where will costs be?”

Mark Zuckerberg: “You can’t 80/20 everything”
Source: Startup · Published: Mar 16, 2026

"There’s the famous 80/20 rule where you get 80% of the benefit by doing 20% of the work, but you can’t just 80/20 everything. There have to be certain things that you are just the best at."

Marc Andreessen on Mark Zuckerberg’s founder “superpower”
Source: Startup · Published: Mar 13, 2026

“A great superpower that Mark Zuckerberg has that is probably not well-understood enough is he does not get emotionally upset in stressful situations”

Sam Altman explains how to come up with a great startup idea
Source: Startup · Published: Mar 12, 2026

"If you start a startup without a good idea… you’ll be under pressure to make something up and it won’t work that well."

Jeff Bezos on the problems with proxies and managing to metrics
Source: Startup · Published: Mar 11, 2026

“One of the things that happens in business is that you develop certain things that you’re managing to—a typical case would be a metric. And that metric isn’t the real underlying thing.”

Airbnb founder Brian Chesky on how to design an amazing user experience
Source: Startup · Published: Mar 10, 2026

“If you can design something really amazing using the hand-crafted part of your brain, then you can reverse-engineer how to industrialize this millions of times over.”

Spencer Rascoff: “I will never invest in a consumer startup with paid marketing”
Source: Startup · Published: Mar 9, 2026

“If you’re actually trying to grow a product, the best levers for doing that are often within the product itself.”

Patrick Collison explains why it sometimes makes sense to quit
Source: Startup · Published: Mar 6, 2026

“One thing I’ve learned myself the hard way, is that it is easier to tear down a company and restart it in Silicon Valley, than it is to constantly try to pivot or keep something alive.”

Jeff Bezos recounts the time he called Amazon’s customer service number mid-meeting to prove a metric was wrong
Source: Startup · Published: Mar 5, 2026

“I have a saying, which is when the data and the anecdotes disagree, the anecdotes are usually right”

Ben Horowitz: “Nobody was born a great manager. It’s a very unnatural job.”
Source: Startup · Published: Mar 4, 2026

“If you can’t build a great product, it doesn’t matter if you can build a great company.”

03

ALSO TODAY

3 MORE SOURCES
08

SOLIDOT

08.00
SOLIDOT

Solidot News - June 12, 2026

Solidot Feed: Highlighting essential tech & open-source news.

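China's cancer medical tourism industry

Thailand and South Korea are known for medical services such as cosmetic surgery and IVF, while China is trying to attract medical tourists from around the world by offering advanced cancer therapies. Patients travel abroad for treatment for two main reasons: the availability of advanced therapies, and price. CAR-T therapy is one of the most promising breakthroughs in oncology, but most countries either cannot offer it or price it too high. The therapy first collects T cells from the patient's blood, then genetically modifies them in a laboratory so that they produce a special CAR receptor that binds to a specific protein on cancer cells. The modified cells are then multiplied in large numbers and infused back into the patient, where the CAR-T cells actively seek out and kill cancer cells carrying the target antigen. According to the American Cancer Society, a single CAR-T infusion in the US costs between $300,000 and $475,000. In China the cost is about $150,000 to $180,000, and prices may fall further. China's drug regulator recently approved a marketing application for an immunotherapy priced below 300,000 yuan. New York-based Market Research Future forecasts that China's medical tourism market will grow from $1.3 billion in 2025 to $3.4 billion in 2035. Jeroen Groenewegen-Lau, an analyst at the Mercator Institute for China Studies, said that many advanced therapies are developed in China but are too far ahead of what the country's existing healthcare system and patients can afford, so integrating into the international healthcare system is in China's interest.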

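Survey shows sharp drop in the share of US adolescents who read for fun

Survey data released by the US Department of Education's National Center for Education Statistics show that the share of American 13-year-olds who read for fun has fallen by nearly half since 2012, while the share of 9-year-olds who do so has fallen 16% over the same period. In 2025, 37% of 9-year-olds said they read for fun almost every day, compared with 42% in 2020 and 53% in 1984. Teenagers and children may be spending more of their time on screens. A 2024 study found that more than half of 12- to 17-year-olds spend four or more hours a day on screens. Increased screen time is associated with declining standardized test scores.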

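Kioxia overtakes Toyota to top Japan's stock market by market value

Thanks to the AI boom, the total market capitalization of Japan's Kioxia Holdings surpassed Toyota's on June 12, putting it at the top of Japan's listed companies for the first time. Kioxia's market value reached 44 trillion yen, above Toyota's roughly 43 trillion yen. The share price gain is supported by growing profitability: against the backdrop of US tech giants' investment in AI data centers, sales of NAND flash memory have risen sharply. SoftBank Group (SBG) shares have also been pushed up by expectations around AI investment, and its market value briefly overtook Toyota's to take the top spot on June 1. As an investment company, SoftBank Group draws its gains mainly from two sources: the rising valuation of its large investment in US-based OpenAI, and the increasing value of its UK chip design subsidiary Arm Holdings.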

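Xiaomi open-sources its AI coding assistant MiMo Code

Xiaomi has open-sourced its AI coding assistant MiMo Code, with the source code hosted on GitHub under the MIT License. According to Xiaomi's blog, “MiMo Code is a terminal coding agent built by the Xiaomi MiMo team on top of OpenCode and open-sourced under the MIT license. It is designed for long-horizon automated programming tasks, and its core concern is how to maintain decision quality and state continuity across dozens or even hundreds of steps of continuous execution.”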

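Poland criminalizes livestreaming of animal abuse and other acts, with up to five years in prison

Polish lawmakers voted to pass a bill that makes it a crime to livestream serious offenses such as rape, murder, animal abuse, degrading violence, and gambling promotion, punishable by up to five years in prison; the rape or murder itself is prosecuted as a separate offense. The bill also applies to individuals who imitate or falsely depict such crimes. The move is part of Poland's effort to tighten regulation of online content. Policies the country has recently introduced include a ban on phone use in schools for children under 16 and stricter age-verification rules for access to pornographic content. The EU's Digital Services Act (DSA) requires platforms to quickly remove content that promotes violence or serious harm, but holding the creators of such content accountable is left to individual member states.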

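New CRISPR technique selectively kills cancer cells

A team led by Jennifer Doudna, winner of the 2020 Nobel Prize in Chemistry, has turned an enzyme called CRISPR-Cas12a2 into a “weapon” that precisely kills cancer cells. When the enzyme detects a genetic mutation signature specific to cancer cells, it directly shreds the chromatin inside the cell, inducing the cancer cell's death. In cancer development, driver-gene alterations generally fall into two categories: overactivation of proto-oncogenes, and loss-of-function mutations in tumor suppressor genes. Most current targeted drugs address the former, using inhibitors to block the overactive protein. For loss-of-function mutations in tumor suppressor genes, conventional drugs are often powerless. Take TP53 (which encodes the p53 protein), the most commonly mutated gene in human cancers: the mutation appears in up to 90% of tumors such as ovarian and pancreatic cancers. In the more than 40 years since its discovery, researchers have been unable to develop an effective targeted drug against mutant p53. CRISPR-Cas12a2 is a nuclease that bacteria originally use as an immune tool to fend off viral invasion. Once the enzyme recognizes RNA from an invading virus, it begins indiscriminately cutting the surrounding RNA and DNA, thoroughly shredding the chromatin (the complex of DNA and proteins in the cell nucleus) and killing the infected cell. The research team designed specific guide RNAs (gRNAs) for Cas12a2 so that it specifically recognizes targets including TP53, …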

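Four days of torrential rain in Indonesia killed 7% of an endangered orangutan species

In late November last year, Cyclone Senyar battered Indonesia's Sumatra island, killing more than a thousand people and becoming Southeast Asia's deadliest natural disaster that year. The endangered Tapanuli orangutan, which lives on Sumatra, numbers fewer than 800 in total. Four consecutive days of torrential rain and the landslides that followed killed at least 58 of them, 7% of the population, pushing the species a step closer to extinction. Researchers say that because of global climate change, the frequency and intensity of extreme rainfall may persist in the future, threatening the survival of the Tapanuli orangutan and its habitat.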

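The Trump phone is a gold-painted 2024 HTC U24 Pro

An iFixit teardown confirms that the Trump phone, released in 2026, is a 2024 HTC U24 Pro with a coat of gold paint. Amusingly, Trump Mobile sold more phones than HTC at a higher price. The HTC U24 Pro sells for about $459 and sold only 10,000 units, while Trump Mobile's phone sells for $499 and sold 30,000. The main difference between the two is that the Trump phone uses 12GB of LPDDR5 memory and 512GB of storage from Micron, whereas the HTC U24 Pro's memory and storage come from South Korea's SK Hynix, possibly because of supply-chain constraints, tariffs, and similar factors.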

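Antibiotic residues may affect male fertility

Male infertility is a growing concern in global reproductive health. Beyond known causes such as genetics, hormonal abnormalities, and reproductive system diseases, environmental exposure and lifestyle factors are receiving increasing attention. Some drug residues and environmental pollutants can enter the human body through water, soil, or the food chain, but their potential effects on male reproductive health are not yet well understood. A study by researchers at Nanjing University examined the potential effect of the environmental exposure agent ornidazole on male fertility. Ornidazole is an anti-infective drug used in humans, livestock and poultry, and aquaculture. The researchers first analyzed clinical serum samples and found that patients with oligospermia had higher serum ornidazole levels than healthy controls. Further analysis showed that higher serum ornidazole levels were significantly associated with lower sperm concentration and lower total counts of normal progressively motile sperm, suggesting that ornidazole exposure may be linked to reduced sperm quality. The researchers said that supplementation with the omega-3 fatty acid DHA significantly improved ornidazole-induced spermatogenic damage and meiotic defects.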

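East Asia's tallest tree

Researchers have found the tallest known tree in East Asia near the Da'an River in Taiwan, and named it the Da'an River Heaven Sword (大安溪倚天劍) after a Jin Yong novel. The Heaven Sword is 84.1 meters tall and about a thousand years old. The tallest known tree in the world is Hyperion, in California's Redwood National Park, at about 116 meters. About 60% of Taiwan is covered by forest, and the island has an estimated 950 million trees, many of them towering giants. The research team measured the Heaven Sword's height the traditional way: climbing the tree and dropping a tape measure from the top.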

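Tech giants borrow heavily

To fund the build-out of AI infrastructure, tech giants are raising money on a massive scale, on the order of $100 billion. Google parent Alphabet said a week ago that it plans to raise $80 billion through stock sales; Meta announced plans to raise $30 billion by selling bonds; Amazon plans to raise $14 billion by issuing debt in Canada, and then reached an agreement with Citi, JPMorgan Chase, Wells Fargo, HSBC, BofA Securities, and others to borrow about $17.5 billion, for total financing of $31.5 billion. Spending by the major technology companies has hit record highs as they fund AI infrastructure such as chips and data centers. Investment on this scale has raised questions about returns.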

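A suspicious AI agent roaming the Fedora project

On May 27, Fedora developer Adam Williamson emailed Nathan Giovannini with questions about an AI agent controlled from his account. Over the past several months the agent had done a series of suspicious things: changing the severity and priority of bugs for no apparent reason, fabricating replies to bugs, persuading maintainers to merge suspicious code into the Anaconda installer, and submitting a series of pull requests (PRs) to upstream projects, some of which were accepted. Giovannini responded that his account had been compromised and that he was not the agent's controller. The incident reminded the community of the widely followed XZ backdoor affair, in which an attacker using the alias JiaT75 (Jia Tan) earned trust by actively contributing code to the project for more than two years, then applied pressure to become a co-maintainer, gaining the access needed to quietly plant a backdoor in the code. In the era of generative AI led by large language models, contributing code is easier than ever, which means attackers can use agents to actively contribute to open-source projects, build up trust, and then launch an attack. The account used by the agent has been closed, and the related PRs have been reverted.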

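OpenAI says China-linked accounts tried to stoke anti-data-center sentiment in the US

OpenAI published a report on Wednesday saying it had found accounts originating in China that used AI to generate English-language social media posts claiming that data centers were driving up electricity bills for US residents. OpenAI said the accounts may be linked to an unnamed private Chinese technology company. The posts had limited reach, OpenAI said, but should draw attention to attempts by foreign actors to weaken strategic US industries. The company added that there is a legitimate debate in the US about AI and data centers, but that these accounts tried to manipulate the discussion by posing as ordinary Americans and posting contentious AI-generated content.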

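Coupang fined more than 600 billion won over user data leak

South Korean e-commerce giant Coupang has been fined 624.7 billion won (about RMB 2.77 billion) over a leak of data on tens of millions of users. South Korea's Personal Information Protection Commission found lapses in Coupang's management of authentication signing keys and in its access controls, and an inadequate basic security management system, which led to the leak of personal information on about 37.5 million users; it imposed a fine of 423.575 billion won for this. That is the largest fine ever issued for a single personal data leak incident. The commission also found that Coupang, without legal basis, collected the online activity records of about 11.17 million users who visited other companies' websites and apps, and stored that information in a database in a personally identifiable form, for which it imposed a separate fine of 201.1066 billion won.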

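Scientists discover the largest whale graveyard

The Global Hadal Exploration Programme, led by the Institute of Deep-sea Science and Engineering of the Chinese Academy of Sciences, observed large numbers of whale fossils and intact whale-fall ecosystems in the Diamantina Trench in the southeastern Indian Ocean. The site is now the deepest and largest known area of whale fossil assemblages and whale falls in the world. A whale fall is the distinctive ecosystem that forms when a whale dies and sinks to the seafloor. In 2023, the expedition team aboard the research vessel Tansuo-1 (探索一号) used the crewed submersible Fendouzhe (奋斗者) to complete 32 dives along the floor of the 1,200-kilometer-long Diamantina Trench, finding 5 whale falls in the chemoautotrophic stage and 476 whale fossil accumulations at depths of 4,616 to 7,001 meters. Whale remains in the area reach a density of 759.5 per square kilometer, and by extrapolation the whole area may hold more than 10 million sets of whale remains.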

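Threats against politicians tripled after Meta relaxed speech restrictions

Meta relaxed its restrictions last year, saying its speech moderation had been too strict. A study by the Center for Countering Digital Hate (CCDH) examined the direct impact of the change. Researchers analyzed about 8 million Facebook comments and found that in the six months after the new rules took effect, abusive and racist comments aimed at Republican and Democratic lawmakers tripled, while violent threats and hate speech quadrupled over the same period. Threats against President Trump more than doubled, and the researchers noted that comments directly threatening the president's life may constitute a felony. Comments violating Meta's policy on violent threats quadrupled, from 1,800 in the six months before the policy change to 7,600 in the six months after. Hate speech comments rose more than fourfold, from 6,900 to 30,000. Comments violating Meta's rules on bullying and harassment more than doubled, from 15,700 to 39,900.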

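Visa integrates ChatGPT into its payment network

Visa is integrating ChatGPT into its payment network, allowing AI agents to shop and complete purchases on behalf of users. This means AI agents can not only recommend products but also complete purchases on a user's behalf at any merchant that accepts Visa. OpenAI will provide the technology that lets agents interact, make decisions, and initiate purchases through ChatGPT. Visa and OpenAI did not disclose the financial terms of the partnership or details of any fees that merchants or customers would pay. Visa said that, to protect consumers and minimize fraud, the feature will include safeguards such as spending limits, steps that require approval, and restriction to authorized merchants.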

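US solar generation surpasses coal for the first time

According to an analysis by energy think tank Ember, US solar generation surpassed coal for the first time in May 2026: solar accounted for 12.8% of US electricity supply, while coal fell to 12.2%. In May five years earlier, coal accounted for 19.7% of US generation and solar only 5.4%. US solar generation reached a record 45.5 TWh in May 2026, up 17% from a year earlier and above the previous record set last July. Solar generation usually peaks in June or July, and Ember estimates the record could be broken again this summer. Solar has become the third-largest source of US electricity for the first time, behind only natural gas and nuclear. Coal generation, meanwhile, is declining: it hit a record low of 39.3 TWh in April 2026, then rose slightly to 43.4 TWh in May, still 11% below May 2025.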

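Humans tend to turn left and walk counterclockwise

During the pandemic, researchers ran a series of experiments to see how many people could share the same space while keeping a safe distance. Reviewing the footage, they noticed that most people walked counterclockwise. The unexpected finding prompted more experiments, which showed that humans consistently tend to walk counterclockwise. Their study was published in the journal Nature Communications. Scientists do not yet know where the preference comes from. It is present in both men and women and is more pronounced in children. Similar behavior exists in animals: rock ants, for example, prefer to turn left when exploring unfamiliar nests. Scientists suspect biomechanics is involved, but the exact mechanism remains a mystery. Olympic track events originally had athletes run clockwise, but later switched to counterclockwise because athletes felt the clockwise direction was unnatural, possibly owing to right-leg dominance in the population.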

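FCC plans to require real-name registration for mobile phones in the US

The US Federal Communications Commission (FCC) wants to kill anonymous burner phones, and plans to legally require telecom companies to store mobile users' personal information, including a government-issued identification number and physical address. The plan has alarmed privacy advocates and civil rights activists, who say the US is moving toward the practices of authoritarian states. The FCC's stated rationale is fighting fraud: the aim is to keep scammers off telecom networks so that law enforcement can better identify them. The FCC likened the measures to the data banks collect to prevent money laundering.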

09

APP STORE RANK

09.00
APP STORE RANK