TEXT VIEW · TODAY'S DIGEST · 36 HEADLINES ACROSS 8 SOURCES

ISSUE 0898
TUE, JUN 16, 2026
Discover the best information organized by OrangeBot.AI

The web,
read by a bot.

Ten sources — Hacker News, Product Hunt, HuggingFace, Techmeme and more — filtered, tagged, and summarized every morning for builders who don’t have time to scroll.

01

AI DIGEST

UPDATED DAILY · EDITOR'S PICK

AI News Summary

June 16, 2026

Here is a summary of today's main news events.


U.S. and Iran Announce Peace Deal, Shaking Global Markets

The United States and Iran have agreed to a peace deal to end their recent conflict. The news caused oil prices to fall below $80 a barrel on expectations of increased supply, while the Dow Jones Industrial Average hit a record high. Investors also moved into bonds as they await further details on the agreement's terms.

Tech World Buzzes with AI Lawsuits, Major Funding, and SpaceX Rally

The technology sector saw several key developments. AI company Anthropic is facing a consumer lawsuit over usage limits on its models, while OpenAI was accused of poaching an engineer from xAI. In finance, Chinese AI firm DeepSeek raised over $7.4 billion, and SpaceX's stock continued its strong post-IPO rally. Separately, Fox announced it is acquiring Roku.

Europe Navigates UK Political Shifts, Banking Deals, and Russia Sanctions

Across Europe, UK Prime Minister Starmer detailed plans for closer bilateral ties amid domestic political challenges. In finance, the German government moved to protect a major bank from a foreign takeover, and EU regulators are working to simplify banking rules. Meanwhile, a group of wealthy nations agreed to increase economic pressure on Russia, specifically targeting its oil and gas exports.

Global Tensions Rise as Nations Reshore Gold and U.S. Criticizes Israeli Campaign

Geopolitical tensions were highlighted as the U.S. President voiced criticism of Israel's military actions against a militant group in Lebanon. Reflecting global uncertainty, many central banks are moving their gold reserves back to their home countries from London and New York. In international business, a major Philippine fintech company announced plans for an IPO aiming to raise over $1 billion.

02

ON THE WIRE

6 SOURCES
02

HACKER NEWS


Hacker News - June 16, 2026

Hacker News Feed: Highlighting key posts and discussions.

Mechanical Watch (2022)

(ciechanow.ski)

24934
I Love the Computer

(michaelenger.com)

269150
Typst 0.15.0

(typst.app)

32187
Hetzner Price Adjustment

(docs.hetzner.com)

504682
Iroh 1.0

(www.iroh.computer)

1278395
CrankGPT

(crankgpt.com)

585225
Fox to buy Roku

(www.wsj.com)

344411
Apple Foundation Models

(platform.claude.com)

473220
Bitsy

(bitsy.org)

2768
Write for One Person

(wizardzines.com)

27281
03

HUGGINGFACE


HuggingFace - June 16, 2026


JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Many moments in the real world do not wait for a user to ask. A fire starts on a security monitor, an expression flickers across a video call, or a product a viewer wants flashes by in a livestream. Yet today's large models remain mostly turn-based by design: they answer only when addressed, and even video-call apps that appear interactive still operate as question-answer systems, reacting only when polled or prompted. We argue for a different paradigm: a model that is present in the world like a person. It continuously watches what is happening now, decides on its own whether to speak or stay silent, interacts in real time, and delegates to a background model when the problem is hard. To advance interaction models and their adoption across domains, we make two fully open-sourced contributions. First, we release JoyAI-VL-Interaction, an 8B-scale, vision-first VL-interaction model. The model makes the response decision internally, choosing each second to stay silent, respond, or delegate to a background model, and it excels at vision-triggered responsiveness and time awareness. We pair it with a transferable training recipe, from which capabilities we never trained for emerge, such as guiding a shopper through changing app screens or improvising a lecture from a slide deck. Second, we release a complete, deployable system built around that model. The system streams any ongoing video into the model, making it genuinely present in the world. All other components are pluggable, including ASR/TTS modules, memory, visualization UI, and a background brain that can connect to any API or agent. Across six real-world scenarios, human raters prefer JoyAI-VL-Interaction over the in-app video-call assistants of Doubao and Gemini by a wide margin. To our knowledge, this is the first open, vision-driven interaction model released together with its training recipe, data, and complete deployable system.

153
Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories

Data tells stories that shape society; the data journalist's job is to turn raw information into stories non-experts can trust. A high-quality news feature takes a newsroom team weeks: hunting for context, running statistics, choosing an angle, and designing visuals. Recent agents handle individual steps well: data-science agents close the analysis loop, while design agents synthesize beautiful websites. But can an agent serve as a data journalist end to end? We introduce Data Journalist Agent (Data2Story), a multi-agent framework that orchestrates specialized roles into a single virtual newsroom. Data2Story contributes two innovations. (i) Claims are evidence-grounded: an Inspector links every number, angle, and asset back to data, code, or an external reference. (ii) Articles are multimodally generative: rather than defaulting to plain text and static charts, Data2Story reasons about what readers will want to see, then deploys multimodal tools, such as interactive maps for geography and audio for music. We evaluate Data2Story on 18 articles, each paired with the originally published expert piece, along four axes: (a) human-agent angle coverage; (b) rubric evaluation with 53 participants across five dimensions; (c) computer-use agents as judges, a cost-saving proxy for how readers navigate interactive articles; and (d) verifiability, where a coding verifier re-executes statements against the data and checks claims against references. Data2Story produces competitive, evidence-traceable multimedia stories, with particular strength in transparency and auditability. Human articles retain an edge in editorial angle, creative design, and presentation. We position Data2Story as a collaborator for journalists, enabling more evidence-based, transparent, and verifiable reporting. Code and demos are available at https://data2story.github.io.

91
Geometric Action Model for Robot Policy Learning

Generalist robot policies must follow user instructions while reasoning about how objects, cameras, and robot actions interact in the 3D physical world. Recent vision-language-action models (VLAs) and video world-action models (WAMs) inherit strong semantic or temporal priors from large-scale foundation models, but they still operate primarily on 2D image frames or 2D-derived latent spaces, leaving implicit the 3D geometry required for contact-rich manipulation. We propose the Geometric Action Model (GAM), a language-conditioned manipulation policy that directly repurposes a pretrained geometric foundation model (GFM) as a shared substrate for perception, temporal prediction, and action decoding. GAM splits the GFM at an intermediate layer: the shallow layers serve as an observation encoder, and a causal future predictor inserted at the split layer forecasts future latent tokens conditioned on language, proprioception, and action history. The predicted future tokens are then routed through the remaining GFM blocks for feature propagation and decoding, allowing a single backbone to produce both future geometry and actions. This design equips the GFM with language-conditioned temporal world modeling through minimal architectural modification while preserving its rich geometric priors. Across a broad suite of simulation and real-robot manipulation benchmarks, GAM is more accurate, more robust, faster, and lighter than current foundation-model-scale baselines.

79
DreamX-World 1.0: A General-Purpose Interactive World Model

DreamX-World 1.0 is a general-purpose interactive text/image-to-video world model for controllable long-horizon generation. It supports camera navigation, revisits to previously observed regions, and promptable events across photorealistic, game-style, and stylized domains. Our data engine combines camera-accurate Unreal Engine rendering, action-rich gameplay recordings, and real-world videos with recovered camera geometry. For camera control, we introduce E-PRoPE, a lightweight variant of projective positional encoding that retains PRoPE's projective camera geometry while applying camera-aware attention to spatially reduced tokens. We convert a bidirectional video generator into a few-step autoregressive world model using causal forcing, DMD-style distillation, and long-rollout training. Training on self-generated long-horizon contexts exposes the model to its own generated history and reduces the style and color drift that accumulates across autoregressive chunks. Memory-Conditioned Scene Persistence retrieves earlier views through camera-geometry-based retrieval, while residual recycling makes the conditioning path less sensitive to imperfect memory latents. Event Instruction Tuning adds composable event control, and reinforcement learning alignment recovers camera control and visual quality after distillation. With mixed-precision DiT execution, residual reuse, 75%-pruned VAE decoding, and asynchronous pipeline parallelism, DreamX-World 1.0 reaches up to 16 FPS on eight RTX 5090 GPUs. On our 5-second basic evaluation, DreamX-World 1.0 achieves a camera-control score of 73.75 and an overall score of 84.76, outperforming HY-WorldPlay 1.5 and LingBot-World in overall score, which achieve 80.79 and 80.45, respectively.

63
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models

This technical report introduces VibeThinker-3B, a compact dense model with 3B parameters developed to investigate how far verifiable reasoning can be pushed within a strictly small-model regime. Building upon the Spectrum-to-Signal post-training paradigm, we systematically enhance the model through an optimized pipeline that includes curriculum-based supervised fine-tuning, multi-domain reinforcement learning, and offline self-distillation. Experimental evaluations demonstrate that VibeThinker-3B achieves frontier-level performance on highly demanding verifiable tasks. Specifically, it attains a score of 94.3 on AIME26 (improving to 97.1 with claim-level test-time scaling), an 80.2 Pass@1 on LiveCodeBench v6, and exhibits strong out-of-distribution generalization with a 96.1% acceptance rate on recent unseen LeetCode contests. This effectively places it in the performance band of first-tier reasoning systems, matching or exceeding flagship models that are orders of magnitude larger, such as DeepSeek V3.2, GLM-5, and Gemini 3 Pro. Furthermore, a score of 93.4 on IFEval confirms that this extreme reasoning enhancement does not compromise strict instruction controllability. Extending our previous 1.5B work, these findings motivate the Parametric Compression-Coverage Hypothesis, which views verifiable reasoning as compressible into compact reasoning cores, while open-domain knowledge and general-purpose competence require broad parameter coverage over facts, concepts, and long-tail scenarios. This perspective suggests that compact models are not merely deployment-efficient substitutes, but a complementary path toward frontier-level performance in parameter-dense capability regimes.

49
FastContext: Training Efficient Repository Explorer for Coding Agents

Large Language Model (LLM) coding agents have achieved strong results on software engineering tasks, yet repository exploration remains a major bottleneck: locating relevant code consumes substantial token budget and pollutes the agent's context with irrelevant snippets. In most agents, the same model explores the repository and solves the task, leaving exploratory reads and searches in the solver's history. We present FastContext, a dedicated exploration subagent that separates repository exploration from solving. Invoked on demand, FastContext issues parallel tool calls and returns concise file paths and line ranges as focused context. FastContext is powered by specialized exploration models spanning 4B-30B parameters. We bootstrap them from strong reference-model trajectories and refine them with task-grounded rewards for broad first-turn search, multi-turn evidence gathering, and precise citation generation. Across SWE-bench Multilingual, SWE-bench Pro, and SWE-QA, integrating FastContext into Mini-SWE-Agent improves end-to-end resolution rates by up to 5.5% while reducing coding-agent token consumption by up to 60%, with marginal overhead. These results show that repository exploration can be separated from solving and handled effectively by specialized models. Code and data: https://github.com/microsoft/fastcontext

47
Who Should Lead Decoding Now? Tracking Reliable Trajectories for Ensembling Masked Diffusion Language Models

Masked Diffusion Language Models (MDLMs) have emerged as a distinct paradigm for sequence generation. As MDLMs become diverse in capabilities and knowledge coverage, an important question is how to combine their knowledge. Toward this, we first investigate the unique decoding dynamics of MDLMs. We find that successful generations exhibit stable confidence dynamics over answer-relevant positions, while unreliable trajectories can often be corrected by injecting promising intermediate states from other models. Guided by this observation, we propose TIE (Trajectory-based Iterative Ensembling), a knowledge fusion framework in which MDLMs iteratively identify reliable decoding trajectories and relay them across models. TIE tracks confidence dynamics over answer-relevant positions to determine which model currently follows a more reliable trajectory and selectively transfers partially denoised sequences across models. As the model on the more promising trajectory often changes across denoising steps, TIE allows different models to contribute complementary strengths at different stages of generation. Strong performance across diverse reasoning tasks, along with our analyses, suggests that TIE offers a practical approach to the underexplored problem of MDLM ensembling.

23
BRDFusion: Physics Meets Generation for Urban Scene Inverse Rendering

Inverse rendering of urban scenes from captured videos enables numerous applications, including content creation and autonomous driving simulation. Physically-based rendering methods follow and control lighting physics, but suffer from reconstruction and rendering artifacts. While generative models produce realistic videos, they offer limited consistency and controllability. We present BRDFusion, a unified framework that combines two complementary models for inverse and forward rendering. Specifically, BRDFusion recovers explicit, consistent scene properties with physical modeling and alleviates optimization ambiguity with generative priors. During forward rendering, the physical model provides controllable rendering from the scene configuration, and the generative model denoises and fixes artifacts. Therefore, our method produces high-quality videos while allowing precise control, outperforming baselines in real and synthetic scenes. Moreover, BRDFusion supports novel-view relighting, night simulation, and dynamic object insertion/editing. Project page: https://shigon255.github.io/brdfusion-page/

22
VisualClaw: A Real-Time, Personalized Agent for the Physical World

Vision language models are serving as general-purpose interfaces for complex multimodal tasks. However, deployment still faces three gaps: VLMs typically incur high latency and cost when processing dense video frames and long prompts, the agent scaffold remains static after deployment, and standard video-QA benchmarks do not test whether agents can use visual evidence inside tool-using workspaces. We present VisualClaw, a self-evolving multimodal agent built around two principles. First, hybrid encoding reduces deployment cost by filtering less informative streaming frames with a cascaded gate and compressing the text skill bank through hot/cold top-k injection. Second, skill evolution lets the agent learn from failures: retrieved memories condition an evolver as direct concatenated context or as guided evidence, producing skill-bank updates that help future questions. Across 4 video-QA benchmarks with 2 VLMs, VisualClaw cuts per-question API cost by an average of 98% versus full-frame upload and by 25.9% versus the offline uniform 8-frame baseline, while boosting accuracy in most settings, e.g., an average +3.85% and a peak +15.80% on EgoSchema with Gemini 3 Flash. To address the benchmark gap, we curate VisualClawArena, a 200-scenario multimodal agentic benchmark built through a strict five-stage pipeline; models must use video evidence, documents, dynamic updates, and executable checks inside a workspace. On VisualClawArena, the same framework with computer-use agent backends improves macro accuracy by +2.9% for Codex (GPT-5.5) and +3.2% for Claude Code (Sonnet 4.6) over no-evolution baselines, with a 9.5% cost reduction compared to the uniform-sampled baseline. These properties make VisualClaw a natural fit for edge applications, where the cascade reduces a 1-hour streaming session from ~3,600 API uploads down to only 5-20 calls and the self-evolution makes it a perfect personalized assistant.
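As a rough check on the edge-deployment figures quoted in this abstract, the reduction in API calls can be worked out directly. This is only an illustrative calculation: the helper name `upload_reduction` is hypothetical, and the result describes the drop in call count for a 1-hour session, which is a different quantity from the per-question cost figures reported on the video-QA benchmarks.

```python
def upload_reduction(baseline_calls: int, reduced_calls: int) -> float:
    """Fraction of API calls avoided relative to the baseline."""
    return 1.0 - reduced_calls / baseline_calls


# Figures taken from the abstract: ~3,600 uploads per hour, reduced to 5-20 calls.
BASELINE = 3600
worst_case = upload_reduction(BASELINE, 20)  # cascade lets 20 calls through
best_case = upload_reduction(BASELINE, 5)    # cascade lets 5 calls through

print(f"calls avoided: {worst_case:.2%} to {best_case:.2%}")
# prints "calls avoided: 99.44% to 99.86%"
```

The ~3,600 baseline is consistent with roughly one upload per second over an hour, although the abstract does not state the sampling rate explicitly.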

21
OneRank: Unified Transformer-Native Ranking Architecture for Multi-Task Recommendation

Multi-task learning (MTL) is essential in recommender systems to enable complementary learning among diverse user feedback. While modern industrial practices have shifted from DNNs to Transformer-centric architectures to strengthen sequence modeling and scaling capacity, they still decouple feature encoding from multi-task prediction, treating the Transformer as a task-agnostic encoder. This design fundamentally limits the performance and scalability by (1) creating an information bottleneck under heterogeneous task objectives, (2) inducing gradient interference that leads to the seesaw phenomenon, and (3) forcing a dataflow transition in which attention-based, context-adaptive representation learning is converted to static feed-forward task prediction with incompatible information read-write dynamics. We propose OneRank, a Transformer-native multi-task ranking framework that eliminates encoder-predictor separation and introduces task-private channels for forward representation learning and backward optimization, enabling task-specialized learning while reducing inter-task interference. In the forward pass, OneRank learns task-specific representations bottom-up through task-conditioned information selection, candidate-aware contextualization, and controlled cross-task interaction. In the backward pass, cross-task gradient detachment isolates task-private parameter updates from shared knowledge extraction modules, preventing negative transfer. We further replace static task-specific MLP scorers with dynamic matching-based scoring for context-aware personalized ranking. By internalizing multi-task reasoning within the Transformer stack, OneRank establishes a unified and scalable architectural paradigm. Offline and online experiments on large-scale industrial datasets show that OneRank significantly outperforms state-of-the-art baselines while maintaining computational efficiency.

14
BadWorld: Adversarial Attacks on World Models

Visual world models (VWMs) synthesize interactive, action-conditioned rollouts from a single context image. However, it remains an open question how robust these models are to adversarial perturbations. Standard adversarial attacks fail to assess this vulnerability because attackers lack ground-truth future videos and cannot predict subsequent user controls. We introduce BadWorld, a label-free adversarial framework tailored for autoregressive VWMs that systematically overcomes both constraints. First, to bypass the need for future supervision, we propose a self-supervised velocity attack that directly disrupts the early denoising dynamics of the model. Second, to ensure the attack generalizes across unpredictable user actions, we formulate a trajectory-adaptive bi-level optimization that actively mines hard control sequences to forge control-agnostic perturbations. Evaluated on representative VWMs with continuous and discrete controls, BadWorld exposes severe structural fragility. Visually indistinguishable adversarial images reliably trigger catastrophic degradation in future rollouts, leading to incomplete denoising, structural collapse, and control inconsistency. These findings reveal critical risks for deploying VWMs in safety-critical systems while highlighting a practical mechanism for privacy protection.

14
Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation

We introduce Qwen-RobotWorld, a language-conditioned video world model for embodied intelligence. With natural language as a unified action interface, it predicts physically grounded future visual trajectories from current observations across robotic manipulation, autonomous driving, indoor navigation, and human-to-robot transfer. This unified formulation provides three promising application directions: synthetic data generation for policy training augmentation, scalable virtual environments for policy evaluation, and language-guided planning signals for downstream robot control. This is achieved through a three-part design: a) Double-Stream MMDiT with MLLM Action Encoding, where a 60-layer double-stream diffusion transformer couples frozen Qwen2.5-VL semantics with video-VAE latents through layer-wise joint attention; b) Embodied World Knowledge (EWK), an 8.6M video-text corpus (200M+ frames) with action-language mapping over 20+ embodiments and 500+ action categories; and c) General+Expert Progressive Curriculum, a two-stage training strategy that first learns general visual priors and then injects embodied specialization under a shared language interface. Extensive results show strong competitiveness: Qwen-RobotWorld ranks 1st overall on EWMBench and DreamGen Bench and outperforms all open-source models on WorldModelBench and PBench. Additional zero-shot analyses on the RoboTwin-IF benchmark further support robust generalization and multi-view consistency.

13
CODA-BENCH: Can Code Agents Handle Data-Intensive Tasks?

Advanced agents are increasingly demonstrating the potential to operate as autonomous engineers, creating a growing demand for evaluation benchmarks that capture the complexity of real-world development. Such environments typically involve both complex code and large-scale data (i.e., file system). However, existing benchmarks usually evaluate code-centric or data-centric capabilities in isolation, leaving a clear gap with real development scenarios. In this paper, we bridge this gap by introducing CODA-BENCH, the first benchmark to jointly evaluate code and data intelligence in a data-intensive environment. We construct a data-intensive Linux sandbox based on the Kaggle ecosystem (containing hundreds of datasets), where agents must actively explore complex file hierarchies to identify relevant resources and generate code for data-driven analytical tasks. CODA-BENCH comprises 1,009 tasks spanning 31 communities, with each task environment containing an average of 980 files, simulating realistic data scale and noise. Evaluations of advanced agents reveal that even top-performing systems struggle to effectively integrate data discovery with code execution, achieving a success rate of only 61.1%. These results highlight a substantial gap in current agentic capabilities for data-intensive tasks and point to promising directions for future research.

11
TokenPilot: Cache-Efficient Context Management for LLM Agents

As LLM agents are deployed in long-horizon sessions, context accumulation drives up inference costs. Existing approaches utilize text pruning or dynamic memory eviction to minimize token footprints; however, their unconstrained sequence mutations alter layouts, introducing prefix mismatches and cache invalidation. This reveals a critical trade-off between text sparsity and prompt cache continuity. To address this, we present TokenPilot, a dual-granularity context management framework. Globally, Ingestion-Aware Compaction acts as a framework harness to stabilize prompt prefixes and eliminate open-world environmental noise at the ingestion gate. Locally, Lifecycle-Aware Eviction monitors the ongoing residual utility of context segments, enforcing a conservative batch-turn schedule to offload content segments only when task relevance expires. Experiments on PinchBench and Claw-Eval under both isolated and continuous modes demonstrate that TokenPilot reduces costs by 61% and 56% in isolated mode, and 61% and 87% in continuous mode, while maintaining competitive performance compared to prior systems. TokenPilot has been integrated into LightMem2 at https://github.com/zjunlp/LightMem2.
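The trade-off between text sparsity and prompt cache continuity described in this abstract can be illustrated with a toy example. This sketch is not TokenPilot's implementation: it models a prompt as a list of segment strings (real prompt caches match on tokens), and the function name `shared_prefix_len` and the segment labels are hypothetical. It only shows why removing content from the middle of the history invalidates more of a prefix cache than appending does.

```python
def shared_prefix_len(prev: list[str], curr: list[str]) -> int:
    """Number of leading segments that two prompts have in common.

    A prefix cache can only reuse work for this shared leading part.
    """
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n


# Hypothetical agent history from the previous turn.
history = ["system", "tool_output_1", "user_1", "tool_output_2"]

# Option A: append only, so the whole previous prompt remains a valid prefix.
appended = history + ["user_2"]

# Option B: prune an early segment to save tokens, which shifts everything after it.
pruned = ["system", "user_1", "tool_output_2", "user_2"]

reuse_appended = shared_prefix_len(history, appended)
reuse_pruned = shared_prefix_len(history, pruned)
print(reuse_appended, reuse_pruned)  # prints "4 1"
```

In this toy case the pruned prompt is shorter but can reuse only the first segment from the cache, which is the kind of prefix mismatch the abstract attributes to unconstrained pruning and eviction.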

11
Where Did It Go Wrong? Process-Level Evaluation of Web Agents with Semantic State Tracking

Web agents act through long interaction sequences, yet existing benchmarks evaluate only terminal success, discarding all process information and offering little guidance on improvement. In this work, we conduct a process-level analysis of web agents. We introduce WebStep, a benchmark of 1,800 task instances with controlled difficulty and automatic semantic state tracking. Each website exposes a deterministic semantic MDP alongside the GUI: the agent operates on the interface, while the environment records high-level states and transitions in the background, enabling fine-grained analysis without manual annotation. Based on the semantic trajectory, we first show that process metrics reveal differences invisible to outcome evaluation: three agents whose success rates cluster within 31-33% diverge in exploration reach versus execution accuracy. Then, decomposing by skill characterizes the nature of these differences, exposing opposite per-skill rankings hidden within the same website: e.g., on Housing, OpenAI CUA outperforms Qwen3.5 by 23.7% on commit actions yet underperforms it by 15.6% on filtering, pinpointing a concrete skill to improve even within a domain. Bifurcation analysis further localizes the decisive error that loses the task and shows that this error is agent-specific rather than shared. Finally, these differences widen as tasks grow harder: success rate is similar on easy tasks but separates sharply as exploration becomes more demanding. Our process-level analysis opens a new avenue in web agent evaluation, providing fine-grained and actionable insight into where and how each agent should be improved.

10
Retrieve, Don't Retrain: Extending Vision Language Action Models to New Tasks at Test Time

Extending a vision-language-action (VLA) policy to a new task typically requires task-specific teleoperated demonstrations and per-task fine-tuning, making adaptation costly in both data collection and compute. In this paper, we show that this target-side per-task adaptation cost can be replaced by retrieval. Our retrieval-augmented policy is trained once on paired demonstrations from the target embodiment (query) and a cheaper embodiment (pool, e.g., human-hand video), then frozen. New tasks are added at deployment by appending pool-side demonstrations to a retrieval pool. The frozen policy conditions on retrieved trajectories at every control step, so new tasks are absorbed by indexing data rather than updating parameters. Fine-tuning is needed only to take on a new, unseen embodiment, not for each new task. We show that retrieval improves policies beyond a specific backbone, including standard VLA policies, but its effect is especially pronounced in Cosmos Policy, a video-generation-based world-action model (WAM). In this setting, retrieval supplies coarse task progression, while the WAM's future-image objective provides an additional visual consistency signal that strengthens the retrieval-conditioned actions. On PushT, we study how retrieval provides a reusable high-level motion prior for cross-embodiment generalization to unseen goal angles, while on RoboTwin 2.0 our method outperforms cross-embodiment baselines on unseen tasks, and we additionally demonstrate the method on a real robot.
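The idea of absorbing new tasks by indexing data rather than updating parameters can be sketched with a minimal retrieval pool. Everything here is an assumption for illustration: the class `RetrievalPool`, the 2-D embeddings, the Euclidean nearest-neighbor lookup, and the task labels are hypothetical stand-ins, and the paper's actual encoder, retrieval metric, and policy conditioning are not specified in the abstract.

```python
import math


class RetrievalPool:
    """Toy pool of pool-side demonstrations, indexed by an embedding."""

    def __init__(self) -> None:
        # Each entry is (embedding, task_name, trajectory).
        self.items: list[tuple[tuple[float, ...], str, str]] = []

    def add(self, embedding: tuple[float, ...], task: str, trajectory: str) -> None:
        """Register a demonstration; no model parameters are touched."""
        self.items.append((embedding, task, trajectory))

    def retrieve(self, query: tuple[float, ...], k: int = 1):
        """Return the k entries whose embeddings are closest to the query."""
        ranked = sorted(self.items, key=lambda item: math.dist(query, item[0]))
        return ranked[:k]


pool = RetrievalPool()
pool.add((0.0, 0.0), "push_block", "demo_push")

# Before the new task is indexed, the closest available demo is returned.
before = pool.retrieve((0.9, 1.0))[0][1]

# Adding a task at deployment is just an append to the pool.
pool.add((1.0, 1.0), "stack_cups", "demo_stack")
after = pool.retrieve((0.9, 1.0))[0][1]

print(before, after)  # prints "push_block stack_cups"
```

A frozen policy would call `retrieve` at each control step and condition on the returned trajectory; that conditioning step is outside the scope of this sketch.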

10
Memento: Reconstruct to Remember for Consistent Long Video Generation

Long-form video generation requires recurring subjects to remain consistent across various shots, viewpoints, motions, and scene transitions. Existing temporal decomposition methods improve scalability by generating videos shot by shot. However, they mainly focus on optimizing plausible next-shot continuations without verifying whether the historical memory preserves identity-critical subject evidence. Consequently, as generation proceeds, recurring subjects may be diluted, overwritten, or forgotten. In this paper, we propose Memento, a subject-reconstruction-guided framework that treats subject preservation as an explicit identity grounding problem, based on the premise that a memory bank faithfully preserving a subject should support reconstructing that subject from memory alone. Specifically, Memento jointly trains autoregressive next-shot generation with memory-based subject reconstruction, recovering target appearances using historical memory and global story captions. To disentangle long-range subject evidence from short-range cues, Memento introduces a dual-query memory mechanism, where one query retrieves identity-relevant memory and the other selects short-context keyframes for coherent continuation. Additionally, a subject-aware cinematic data pipeline provides precise reconstruction supervision via consistent, pronoun-free subject descriptions. Experiments demonstrate that Memento achieves state-of-the-art performance in long-term subject consistency, cross-shot coherence, and visual quality.

9
SP^3: Spherical Priors for Plug-and-Play Restoration

In this paper, we introduce SP^3, a novel Plug-and-Play algorithm that accelerates maximum a posteriori image restoration by replacing denoisers with Spherical Encoders (SE) as generative priors. SP^3 approximates the intractable proximal prior step by utilizing the SE's tightly structured latent space as a robust projection onto the natural image manifold. Alternating this projection with a closed-form data-consistency step, via Half-Quadratic Splitting, achieves stable convergence without requiring gradient computation during inference. This unique formulation unlocks "anytime" restoration capabilities, producing sharp, plausible images from the first iteration. Evaluations across a variety of image restoration tasks demonstrate that SP^3 achieves perceptual quality comparable to state-of-the-art zero-shot diffusion and flow methods while being 3-630× faster.

8
GD^2PO: Mitigating Multi-Reward Conflicts via Group-Dynamic reward-Decoupled Policy Optimization

As LLMs advance, post-training reinforcement learning (RL) increasingly relies on multi-dimensional rewards to cultivate comprehensive capabilities. This shift demands new algorithms capable of optimizing diverse and potentially competing objectives simultaneously. To address this, existing methods such as Group reward-Decoupled Policy Optimization (GDPO) decompose the overall score into independent reward groups, then compute the RL loss separately within each group. However, this strategy still encounters multi-reward conflicts: a single rollout can yield positive advantages on certain reward dimensions but negative ones on others, causing opposing signals to cancel each other out during aggregation, further hindering RL training efficiency. Inspired by Dynamic sAmpling Policy Optimization (DAPO), which improves RL training efficiency by filtering out ineffective rollouts with near-zero advantages, we propose Group-Dynamic reward-Decoupled Policy Optimization (GD^2PO). Specifically, GD^2PO employs a conflict-aware filtering mechanism to mask out rollouts suffering from severe reward-wise disagreement. By preventing conflicting signals from canceling each other out, this masking strategy preserves and enhances the magnitude of effective RL advantages, thereby significantly accelerating learning efficiency. Furthermore, we introduce query-level reweighting to dynamically adjust the update intensity of each query based on its overall reward consensus. Experiments on various multi-reward scenarios, including tool calling and human preference alignment, demonstrate that GD^2PO consistently and significantly outperforms existing baselines. The code is available at https://github.com/Qwen-Applications/GD2PO.
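The multi-reward conflict described in this abstract, and a filter that masks conflicting rollouts, can be sketched as follows. This is a simplified illustration, not the authors' algorithm: the abstract does not give the conflict criterion, so the rule used here (drop a rollout if it has both a positive and a negative per-dimension advantage) is an assumption, as are the function names `group_advantages` and `conflict_mask`, the `eps` threshold, and the use of plain group normalization. Query-level reweighting is not shown.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list[list[float]]) -> list[list[float]]:
    """Normalize each reward dimension across the rollout group.

    rewards[i][k] is the reward of rollout i on dimension k.
    """
    n_dims = len(rewards[0])
    columns = [[row[k] for row in rewards] for k in range(n_dims)]
    adv_columns = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        adv_columns.append([(v - mu) / sigma if sigma > 0 else 0.0 for v in col])
    return [[adv_columns[k][i] for k in range(n_dims)] for i in range(len(rewards))]


def conflict_mask(advantages: list[list[float]], eps: float = 1e-6) -> list[bool]:
    """True means keep. A rollout is masked out if its advantages disagree in sign."""
    keep = []
    for row in advantages:
        has_positive = any(a > eps for a in row)
        has_negative = any(a < -eps for a in row)
        keep.append(not (has_positive and has_negative))
    return keep


# Four rollouts scored on two hypothetical reward dimensions.
rewards = [[1.0, 1.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
advantages = group_advantages(rewards)
mask = conflict_mask(advantages)
print(mask)  # prints "[True, True, False, False]"
```

The last two rollouts score well on one dimension and poorly on the other, so summing their advantages would give zero; masking them keeps only rollouts whose signals agree.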

8
PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions

Phone agents are increasingly expected to complete real mobile workflows rather than merely predict the next screen action. However, much of the current mobile-agent literature still evaluates agents primarily as GUI controllers that observe a screen, emit taps and swipes, and are scored by target app state. Real phone-use tasks are broader: they require deciding when to use app GUIs, device-side commands, or structured tools, while leaving evidence that the intended side effect actually occurred. We introduce PhoneHarness, a mixed-action benchmark and execution harness for studying phone-use agents on verifiable mobile workflows. PhoneHarness runs a device-side agent loop over GUI, CLI, and host-side tool actions, combining deterministic action routing with bounded GUI delegation and auditable execution traces. Its benchmark, PhoneHarness Bench, evaluates whether agents complete tasks with observable side effects, not only whether they produce plausible final answers. On the annotated evaluation split, PhoneHarness reaches a 75.0% pass rate, outperforming the strongest non-PhoneHarness settings by 12.9 percentage points. PhoneHarness and PhoneHarness Bench therefore play distinct but mutually dependent roles: the harness makes mixed phone workflows executable, while the benchmark measures whether agents can use that harness reliably and safely. Our findings suggest that reliable phone automation depends on action-surface routing and verifiable execution, not only visual GUI control.

8
UniDDT: Unifying Multimodal Understanding and Generation with Decoupled Diffusion Transformer

Unified Multimodal Models (UMMs) have emerged as a critical direction for general-purpose multimodal intelligence, integrating understanding and generation into a single framework. However, existing UMMs face prominent challenges: (1) the inherent learning conflicts between visual understanding and generation tasks, leading to suboptimal modeling in both tasks; (2) different understanding and generation visual spaces impeding scalability; (3) over-reliance on task-specific data that neglects the duality of text-image understanding and generation. To address these challenges, we propose UniDDT, which leverages a Noisy ViT encoder along with an LLM to unify semantic encoding for visual generation and understanding tasks, while employing a separate diffusion decoder to decouple diffusion decoding from text decoding. With this Noisy ViT encoder, UniDDT is able to leverage the latent space as a unified visual representation, enabling seamless compatibility between understanding and generation tasks. Thus, the scalability within the generation tasks and the semantic expressiveness within understanding tasks can be balanced. Also, we construct dual data structures from the same image-text pairs, fostering interdependence between the generation and understanding data to exploit their inherent duality. Extensive experiments demonstrate that UniDDT achieves effective unification of multimodal understanding and generation with enhanced semantic consistency and scalability. For visual generation tasks, our UniDDT achieves 0.87 GenEval score and 86.9 DPG overall score. For multimodal understanding tasks, our UniDDT achieves 1699.5 score on MME benchmark and 76.5 overall score on SEEDbench.

8
Tangram: Unlocking Non-Uniform KV Cache Compression for Efficient Multi-turn LLM Serving

Multi-turn LLM serving accumulates dialogue history whose Key-Value (KV) cache grows with every turn and every user, quickly exceeding the model weights themselves and making memory, not compute, the binding constraint on throughput. Non-uniform KV compression, which allocates heterogeneous budgets across attention heads, preserves accuracy far better than uniform schemes, yet remains impractical: modern serving stacks assume identical KV lengths across heads, so heterogeneity traps freed memory as page fragmentation, spends up to 25% of prefill time reclaiming scattered pages, and skews GPU workloads that inflate decode latency by up to 1.7× or burn 15-20% of each decode step on re-planning. We observe that this heterogeneity need not be discovered at runtime: head-wise retention follows a two-level structural regularity (an input-invariant head ranking with narrowly bounded per-head ratios) that can be calibrated offline from as few as 50 samples. Building on this insight, we present Tangram, a serving framework that statically resolves what prior systems handle dynamically: Budget Reservation fixes each head's post-compression footprint at scheduling time, eliminating page reclamation; Ragged Paging clusters similar-budget heads into independent page tables, turning fragmentation into reclaimable memory; and Ahead-of-Time Load Balancing precomputes balanced GPU partitions with zero runtime planning. Implemented on vLLM, Tangram serves as a drop-in substrate for existing non-uniform compression methods, matching their accuracy while improving end-to-end throughput by up to 2.6× over the full-KV baseline. Our implementation is publicly available at https://github.com/aiha-lab/TANGRAM.
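
To make the idea of clustering similar-budget heads concrete, here is a toy Python sketch. It is not Tangram's algorithm: the page size, the equal-size grouping rule, and all function names are assumptions chosen only to show why grouping heads by budget reserves fewer pages than treating every head as if it had the largest budget.

```python
import math

def pages_needed(tokens, page_size=16):
    """Number of fixed-size pages required to hold `tokens` KV entries."""
    return math.ceil(tokens / page_size)

def cluster_heads(budgets, num_groups):
    """Sort heads by post-compression budget and split them into
    contiguous groups of near-equal size (an assumed grouping rule)."""
    order = sorted(range(len(budgets)), key=lambda h: budgets[h])
    size = math.ceil(len(order) / num_groups)
    return [order[i:i + size] for i in range(0, len(order), size)]

def reserved_pages(budgets, groups, page_size=16):
    """Each group shares one KV length (its largest budget), so every
    head in the group reserves pages for that length."""
    total = 0
    for group in groups:
        group_len = max(budgets[h] for h in group)
        total += len(group) * pages_needed(group_len, page_size)
    return total

if __name__ == "__main__":
    budgets = [64, 512, 80, 480]           # hypothetical per-head token budgets
    uniform = reserved_pages(budgets, cluster_heads(budgets, 1))
    grouped = reserved_pages(budgets, cluster_heads(budgets, 2))
    print(uniform, grouped)
```

With these hypothetical budgets, a single shared length reserves 128 pages, while two budget-sorted groups reserve 74.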

7
Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale

Efficient and scalable agentic intelligence requires models that can deliver both low-latency responses and strong reasoning capabilities while remaining practical to train, serve, and deploy. In this report, we present Ling-2.6 and Ring-2.6, a family of models designed to address this challenge at scale. Ling-2.6 is optimized for instant response generation and high capability per output token, whereas Ring-2.6 is tailored for deeper reasoning and more advanced agentic workflows. Instead of training from scratch, we upgrade the Ling-2.0 base model through architectural migration pre-training and large-scale post-training. This upgrade is guided by a unified co-design of model architecture, optimization objectives, serving systems, and agent training environments, enabling improvements in both model capability and deployment efficiency. At the architectural level, we introduce a hybrid linear attention design that integrates Lightning Attention with MLA, improving the efficiency of long-context training and decoding. To further enhance token efficiency, we optimize capability per output token through Evolutionary Chain-of-Thought, Linguistic Unit Policy Optimization, bidirectional preference alignment, and shortest-correct-response distillation. For agentic capabilities, we propose KPop, a reinforcement learning framework designed to support stable training of Ring-2.6-1T on large-scale environment-grounded data. KPop improves training efficiency through asynchronous scheduling across coding, search, tool use, and workflow execution, enabling scalable learning from complex agent-environment interactions. Together, Ling-2.6 and Ring-2.6 provide a practical pathway toward efficient, scalable, and open agentic systems. We open-source all checkpoints in the 2.6 family to support further research and development in practical agentic intelligence.

6
Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes

When pretrained VLA policies are fine-tuned through online RL, each rollout episode produces only a single binary outcome (success or failure), yet the actor update requires per-transition supervision. Existing approaches commonly reduce this sparse outcome to a single scalar reward or advantage signal, which conflates distinct forms of transition-level feedback and provides limited guidance once basic task success becomes achievable. First, a single scalar signal conflates the two objectives of viability and efficiency; once basic success is achieved, the binary label provides no gradient to distinguish efficient completions from slow ones. Second, real-world rollouts mix autonomous and intervention segments; naively assigning episode outcomes across these boundaries introduces incorrect credit assignment. To address these issues, we propose Hierarchical Advantage-Weighted Behavior Cloning (HABC), which trains separate critic heads for these two objectives on different data subsets and combines their outputs with a state-adaptive balance. A state-adaptive gate g_t merges their one-step advantages, prioritizing viability when success is uncertain and shifting to efficiency only when viability is high, and converts the result into per-transition weights on the actor loss. Intervention-aware credit assignment further restricts outcome labels to segments executed by the current policy, preventing supervision from leaking across intervention boundaries. In real-robot experiments on three contact-rich bimanual tasks, HABC raises success from supervised fine-tuning (SFT) baselines of 36%, 44%, and 12% to 92%, 88%, and 38%.
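
The state-adaptive gate can be sketched in a few lines of Python. The abstract does not give the functional form of g_t or of the weight conversion, so the sigmoid gate, its `threshold` and `sharpness` parameters, and the clipped exponential weighting below are assumptions for illustration only.

```python
import math

def efficiency_gate(viability, threshold=0.8, sharpness=20.0):
    """Assumed gate: weight on the efficiency advantage. It stays near 0
    while predicted viability is low and rises toward 1 once it is high."""
    return 1.0 / (1.0 + math.exp(-sharpness * (viability - threshold)))

def combined_advantage(a_viability, a_efficiency, viability):
    """Blend the two one-step advantages with the state-adaptive gate."""
    g = efficiency_gate(viability)
    return (1.0 - g) * a_viability + g * a_efficiency

def actor_weight(advantage, beta=1.0, max_weight=10.0):
    """Assumed conversion to a per-transition weight on the actor loss:
    exponentiated advantage, clipped for stability."""
    return min(math.exp(advantage / beta), max_weight)

if __name__ == "__main__":
    # Uncertain state: the viability advantage dominates.
    print(combined_advantage(1.0, -1.0, viability=0.2))
    # Confident state: the efficiency advantage dominates.
    print(combined_advantage(1.0, -1.0, viability=1.0))
```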

6
Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders

Sparse autoencoders (SAEs) are widely used to interpret neural network representations, but their utility depends on whether the learned features are reproducible across training runs. We study this question through feature stability: for each SAE feature, we estimate the probability that a similar feature reappears in an independently trained SAE. This yields a scalable per-feature signal that separates stable from unstable features. In a large-scale study across seeds, models, layers, dictionary sizes, and SAE variants, we find a pronounced functional asymmetry: stable features carry most of the reconstruction- and prediction-relevant signal, while unstable features have weak marginal impact and are dominated by low-frequency surface-form triggers in both activation statistics and automatic explanations. Geometrically, unstable features are individually non-reproducible but concentrate in reproducible lower-rank subspaces, suggesting that seed dependence often reflects basis ambiguity within a shared region of activation space rather than pure noise. A controlled synthetic model makes this mechanism explicit, showing that low-rank ground-truth features can be recovered at the subspace level while remaining non-identifiable as individual SAE latents across seeds. Finally, by pooling unique cross-seed features, we construct more stable SAEs while preserving explained variance in this setting. Together, these results show that unstable features are not merely failed or noisy latents: they have weak individual functional impact, but reflect reproducible low-dimensional structure that standard SAEs resolve differently across seeds.
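
A per-feature stability estimate of the kind described above could look like the following Python sketch. The abstract does not define "similar", so matching by maximum cosine similarity between feature directions and the 0.7 threshold are assumptions, as are the function names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def feature_stability(feature, other_dictionaries, threshold=0.7):
    """Fraction of independently trained SAEs that contain at least one
    feature whose cosine similarity to `feature` reaches the threshold."""
    hits = sum(
        1 for dictionary in other_dictionaries
        if max(cosine(feature, f) for f in dictionary) >= threshold
    )
    return hits / len(other_dictionaries)

if __name__ == "__main__":
    feature = [1.0, 0.0]
    other_seeds = [
        [[1.0, 0.0], [0.0, 1.0]],    # contains an exact match
        [[0.0, 1.0], [-1.0, 0.0]],   # no similar feature
        [[0.9, 0.1], [0.0, 1.0]],    # contains a near match
    ]
    print(feature_stability(feature, other_seeds))
```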

4
Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

We introduce Nemotron 3 Ultra, a 550 billion total and 55 billion active parameter Mixture-of-Experts Hybrid Mamba-Attention language model. We pre-trained Nemotron 3 Ultra on 20 trillion text tokens, then extended the context length to 1M tokens, and post-trained using Supervised Fine Tuning (SFT), Reinforcement Learning (RL), and Multi-teacher On-Policy Distillation (MOPD). Nemotron 3 Ultra is our most capable model yet, employing multiple key technologies - LatentMoE, Multi Token Prediction (MTP), NVFP4 pre-training, multi-environment RLVR, MOPD, and reasoning budget control. Nemotron 3 Ultra achieves up to ~6x higher inference throughput as compared to state-of-the-art publicly available LLMs while attaining on-par accuracy. The state-of-the-art accuracy, high inference throughput, and 1M token context length make Nemotron 3 Ultra ideal for long-running autonomous agentic tasks. We open-source the base, post-trained, and quantized checkpoints, along with the training data and recipe on HuggingFace.

4
MMDiff: Extending Diffusion Transformers for Multi-Modal Generation

Diffusion transformers have demonstrated remarkable generative capabilities, yet the rich perceptual representations computed across their denoising trajectory are discarded once the content is rendered. We present MMDiff, a framework that transforms a frozen diffusion transformer into a multi-modal generative system that jointly produces images alongside any combination of dense perceptual modalities using lightweight decoder heads. Our central finding is that perceptual information is temporally distributed along the denoising trajectory, and that multi-timestep feature fusion with spatially varying aggregation weights is essential, improving semantic segmentation results by up to 28.7% mIoU over single-timestep extraction. We further adopt concept-driven attention extraction for interpretable spatial guidance, and show that frozen diffusion features are competitive with and complementary to state-of-the-art encoders such as DINOv3. By training only lightweight decoder heads on a frozen backbone, we achieve strong performance in semantic segmentation, salient object detection, and depth estimation, and demonstrate that this framework enables effective synthetic data generation at scale.

3
Artificial Intelligence Index Report 2026

Welcome to the ninth edition of the AI Index report. As AI continues to advance rapidly, the question becomes whether the systems built around it can keep up. Governance frameworks, evaluation methods, education systems, and the data infrastructure needed to track AI's impact are struggling to match the pace of the technology itself. That gap between what AI can do and how prepared we are to manage it runs through every chapter of this year's report. New in this edition, the report tracks how AI is being tested more ambitiously across reasoning, safety, and real-world task execution, and why those measurements are increasingly difficult to rely on. It also features new estimates of generative AI's economic value alongside emerging evidence of its labor market effects, an analytical framework on AI sovereignty, and a science chapter developed in collaboration with Schmidt Sciences. For the first time, the report features standalone chapters on AI in science and AI in medicine, reflecting AI's growing impact across these two domains.

3
MVEB: Massive Video Embedding Benchmark

We introduce the Massive Video Embedding Benchmark (MVEB), a 23-task benchmark for video embeddings spanning classification, zero-shot classification, clustering, pair classification, retrieval, and video-centric question answering. We evaluate 33 models and find that no single model dominates: MLLM-based embeddings lead on classification, clustering, pair classification, and QA; multimodal binding leads on retrieval and zero-shot classification; generative MLLMs without contrastive adaptation collapse on cross-modal tasks. Paired video-only vs. audio+video evaluations show that audio's contribution depends on dataset annotation provenance: audio helps when labels were produced from both modalities and hurts when they were produced from visuals alone, a six-point gap consistent across model families. MVEB is derived from MVEB+, a 184-task pool, and is designed to maintain task diversity while reducing evaluation cost. It integrates into the MTEB ecosystem for unified evaluation across text, image, audio, and video. We release MVEB and all 184 tasks along with code and a leaderboard at https://github.com/embeddings-benchmark/mteb.

2
Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning

Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. The standard alternative, fine-tuning smaller models, often sacrifices interpretability while introducing significant resource and operational overhead. To address these limitations, we introduce Prompt-Level Distillation (PLD). We extract explicit reasoning patterns from a Teacher model and organize them into a structured list of expressive instructions for the Student model's System Prompt. Evaluated using Gemma-3 4B, PLD improved Macro F1 scores on StereoSet (57% to 90.0%) and Contract-NLI (67% to 83%), while increasing LogiQA accuracy to 70%. Similar results on Mistral Small 3.1 demonstrate cross-architecture generalizability, enabling these compact models to match frontier performance with negligible latency overhead. These expressive instructions render the decision-making process transparent, allowing for full human verification of logic, making this approach ideal for regulated industries such as law, finance, and content moderation, as well as high-volume use cases and edge devices.

2
EgoPhys: Learning Generalizable Physics Models of Deformable Objects from Egocentric Video

Humans naturally understand object physics through everyday interactions, but faithfully predicting complex deformable dynamics, such as elastic materials and fabrics, remains a major challenge for computer vision and robotics. We present EgoPhys, a framework that constructs deformable physical digital twins from egocentric RGB-only video using generalizable priors. EgoPhys overcomes the limitations of existing methods to enable controllable deformable digital twin generation from egocentric videos by distilling per-object inverse-physics solutions into a compact codebook, enabling prediction of dense spring stiffness fields for unseen objects without per-spring test-time optimization. Trained with generalizable priors from diverse egocentric interactions, EgoPhys outperforms baselines in reconstruction, future prediction, and zero-shot generalization. To support training and evaluation, we curate an egocentric interaction dataset covering diverse deformable objects, scenes, and manipulation styles. We deploy EgoPhys on a real xArm6 robot, demonstrating that a digital twin initialized from a single egocentric human play video can serve as an internal world representation to aid in deformable-object planning, highlighting egocentric RGB observations as a scalable path toward real-to-sim pipelines.

1
LaWAM: Latent World Action Models for Efficient Dynamics-Aware Robot Policies

Vision-Language-Action models (VLAs) leverage large-scale vision-language pretraining for semantic robot control, but often lack explicit foresight into how robot actions change the scene. World-Action Models (WAMs) address this limitation by conditioning policies on predicted futures, yet existing approaches typically rely on computationally expensive video generation with substantial pixel-level redundancy. We present LaWAM, a Latent World Action Model that exposes predictive dynamics to robot policies through compact latent visual subgoals instead of reconstructed future video. At the core of LaWAM is a latent-action-conditioned Latent World Model (LaWM). We obtain LaWM by training a latent action model in the latent space of a pretrained vision foundation model and repurposing its forward decoder to predict future observation features for scene evolution. LaWAM then conditions action generation on these predicted latent visual subgoals to enable dynamics-aware robot control. LaWAM achieves state-of-the-art or competitive success rates (SRs) across LIBERO (98.6% SR), RoboTwin (91.22% SR), and real-world manipulation tasks while retaining low-latency inference. LaWAM runs in 187 ms per action-chunk prediction and achieves up to 24x lower wall-clock latency than pixel-space WAMs.

1
Selective Control under Noisy Perception: Governance Failures Hidden by Aggregate Metrics in Modular Networks

A content-moderation system can score well on every standard accuracy metric and still cause real harm, if its mistakes fall on the few users who connect otherwise separate communities. We show this in an agent-based model where N=240 learning agents on a community-structured network each post harmless, productive, or dangerous content, and a regulator removes or penalizes whatever a noisy classifier flags. Overall usefulness barely moves as the noise changes (one-way ANOVA, p=0.96): by aggregate measures, nothing looks wrong. The damage instead concentrates on these bridge users, whose useful posts are wrongly suppressed and whose dangerous posts are wrongly spared. A governance loss (L_gov) that prices these two mistakes separately from the cost of enforcement more than doubles under false-positive-heavy noise. Aggregate accuracy hides who is harmed, and the cheap quantity to audit is how many connections a user has (degree), a near-perfect proxy for the betweenness that defines a bridge (r=0.96).

1
Who Flips? Self- and Cross-Model Counterarguments Reveal Answer Instability in LLMs

Standard accuracy benchmarks are designed to test how closely large language models (LLMs) approach correct answers, but are not suitable for testing whether LLMs stick with a correct answer when that answer is challenged by a plausible counter-argument. We introduce a controlled protocol for evaluating answer stability: after a model answers a multiple-choice question correctly, we challenge the model's answer with a coherent argument for an incorrect option and measure whether the model flips. The setup a) isolates argumentative content from overt social pressure and b) varies argument length, self-attribution, and cross-model source. Across seven frontier models and 57 MMLU subjects, flip rates range from 17.5% to 97.3%, revealing large differences in stability that are not captured by accuracy metrics alone. We find that self-attribution consistently increases flip rates (mean +7.1pp, up to +18.7pp). Also, pooling wrong-answer arguments across models and selecting the most effective one per question yields stronger adversarial challenges than relying on any single source model. We further construct MaxFlip, a curated challenge set that amplifies flips by up to +23.6pp over standard self-generated challenges. We release the protocol, challenge records, and MaxFlip to support stability evaluation alongside standard accuracy benchmarks. Materials are available at https://github.com/nafisenik/WhoFlips and https://hf.co/datasets/nafisehNik/WhoFlips.
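
The flip-rate measurement can be sketched as follows in Python. The record fields (`gold`, `initial`, `after`) are hypothetical, and treating any post-challenge answer other than the gold option as a flip is an assumption; the released protocol may define it differently.

```python
def flip_rate(records):
    """Among questions the model first answered correctly, return the
    fraction where the post-challenge answer is no longer correct."""
    eligible = [r for r in records if r["initial"] == r["gold"]]
    if not eligible:
        return 0.0
    flipped = sum(1 for r in eligible if r["after"] != r["gold"])
    return flipped / len(eligible)

def delta_pp(rate_a, rate_b):
    """Difference between two rates in percentage points."""
    return (rate_a - rate_b) * 100

if __name__ == "__main__":
    records = [
        {"gold": "A", "initial": "A", "after": "A"},  # held firm
        {"gold": "B", "initial": "B", "after": "C"},  # flipped
        {"gold": "C", "initial": "C", "after": "C"},  # held firm
        {"gold": "D", "initial": "A", "after": "A"},  # initially wrong, excluded
    ]
    print(flip_rate(records))
```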

1
PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory

Consistent video generation under editing operations requires persistence: when edits modify scene appearance or layout, subsequent generations should remain coherent across time and viewpoints. However, existing memory designs struggle to maintain long-term consistency after such modifications, as stored contexts may become outdated or invalid. To address this, we propose PermaVid, a novel framework built upon a multi-modal context memory that disentangles spatial context into semantic appearance and geometric structure, together with an edit-aware memory update and retrieval strategy that keeps memory evolution aligned with subsequent observations. Specifically, we develop two complementary memory banks: an RGB context memory that captures appearance-aware observations while implicitly encoding geometry, and a depth context memory that preserves geometry-only structure disentangled from semantics. Building on this design, we introduce a memory-guided video generation model that performs multi-modal feature fusion under reference conditions drawn from mixed-modality memory contexts. Experiments demonstrate that our method maintains strong long-term semantic and structural consistency after edits, significantly outperforming state-of-the-art methods.

1
The Ghosts of Polymarket: When Off-Chain Matches Meet On-Chain Reverts

Polymarket has emerged as a prominent prediction market platform and one of the fastest-growing applications in DeFi. To achieve low-latency trading, it adopts a hybrid architecture that matches orders off-chain but settles them on-chain for final execution. This design creates a consistency gap we call Ghost Fills: an order that is successfully matched off-chain may later fail during on-chain settlement. To understand the security implications of this gap, we investigate such failed settlements by building GHOSTHUNTER, which reconstructs them from on-chain traces and attributes them to concrete attack patterns. Across 1,952,440 reverted match-order transactions, we find that attackers exploit the time gap between matching and settlement to invalidate already matched orders before they are finalized on-chain. We then identify four attack vectors from these incidents: nonce bump, balance drain, allowance revoke, and proxy trap, realized via 35 evolving variants. These vectors allow attackers to selectively revert 980,133 filled orders, enabling risk-free prediction, arbitrage-bot hunting, and liquidity reward manipulation. The attacks realized at least $1.49M in profit and placed 1.78 B USD at risk, with 2.17 M POL (about $212 K) paid by the operator. During peak hours, more than 24.3% of all filled orders reverted, causing de facto DoS attacks. We also find that code derived from the flawed contract still appears in 167 independent contracts across 10 chains holding at least 23 M in user funds, extending the impact beyond Polymarket. We have disclosed our evidence to affected parties, and the issue has been partially mitigated.

0
Implicit Reasoning for Large Language Model-based Generative Recommendation

Large Language Models (LLMs) are increasingly adopted as backbones for Generative Recommendation (GR), promising access to pretrained world knowledge. Yet reliably invoking this knowledge for GR remains poorly understood. A key obstacle is that LLM-based GR typically represents items with Semantic IDs (SIDs), disrupting LLMs' natural-language reasoning interface because these tokens are unseen by the LLM during pretraining. Existing approaches address this with expensive multi-stage pipelines that ground SIDs and elicit explicit rationales, but offer limited insight into when and why each stage is necessary. In this work, we systematically decompose explicit reasoning training pipelines for LLM-based GR, revealing three key limitations: weakened world-knowledge verbalization, misalignment between SID and natural-language token embedding spaces, and sensitivity to rationale quality, all of which hurt explicit reasoning performance. To circumvent these issues, we propose PauseRec, a lightweight implicit reasoning paradigm tailored for GR. PauseRec is exceptionally practical, avoiding costly reasoning trace acquisition and reasoning alignment training, leading to a multitude of benefits: (1) it outperforms standard explicit CoT methods by up to 6.22%, (2) it reduces training cost by up to 65% GPU hours, and (3) it speeds up inference by up to 71.3%. These results position PauseRec as a lightweight alternative to explicit rationale generation, enabling more effective and efficient LLM-based GR.

0
TuneJury: An Open Metric for Improving Music Generation Preference Alignment

We introduce TuneJury, an open, instance-level pairwise reward model for text-to-music that predicts a music preference score from a text prompt and an audio clip. The released checkpoint is trained on publicly available human-preference labels covering arena-style (A vs. B) votes, metric-alignment preference pairs, crowdsourced pairwise comparisons, and expert aesthetic ratings. The predicted score margin between two clips is well calibrated on our held-out test split, supporting data filtering via a simple score threshold. TuneJury generalizes to both held-out test pairs and out-of-distribution benchmarks, remaining competitive with prior baselines on the latter. For generators released after training, we introduce anchor calibration, a post-hoc, per-system Bradley-Terry calibration that recovers agreement at substantially better data efficiency than from-scratch retraining. The same frozen reward drives consistent reward-axis gains across three downstream applications: inference-time best-of-N selection, DITTO-style latent optimization, and expert-iteration post-training. TuneJury is available at https://github.com/yonghyunk1m/TuneJury.
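
As a rough illustration of how a score margin relates to a pairwise preference, here is a Python sketch of the standard Bradley-Terry form. Whether TuneJury's scores are on this logit scale, and whether its filtering threshold applies to the margin, is not stated in the abstract; both are assumptions here, as are the function names.

```python
import math

def preference_probability(score_a, score_b):
    """Bradley-Terry style probability that clip A is preferred over
    clip B, given scalar reward scores (assumed to be on a logit scale)."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

def filter_pairs(pairs, min_margin=1.0):
    """Keep only pairs whose absolute score margin reaches the threshold."""
    return [(a, b) for a, b in pairs if abs(a - b) >= min_margin]

if __name__ == "__main__":
    print(preference_probability(2.0, 0.0))
    print(filter_pairs([(2.0, 0.5), (1.0, 0.8)]))
```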

0
05

PRODUCT HUNT

Product Hunt - June 16, 2026

Product Hunt Daily Feed: Featuring noteworthy tech launches.

Obotiq

Autonomous care robots that give caregivers time back

0
Keyboard Copilot

AI keyboard that rephrases, translates, and more in any app

0
Invoko

A little hand on your Mac

0
BossMode

The focus app that shows your mom when you procrastinate

0
Revyl

The mobile source of truth

0
Revi

On-device voice dictation. No cloud, no account.

0
Glint

Claude Code activity, right where you want it.

0
Subotiz

Smarter Monetization. Subscribe. Sell. Scale.

0
Bodhiorchard

Vibe-code your whole project with 12 autonomous AI agents

0
GamerForge

Transform game, CGI, and VFX assets with AI

0
Stride

The AI workspace that plans, designs and ships with you.

0
Dinamo Notebook

Professional football/soccer analysis, right in your browser

0
ZenVeil

Find, understand and fix security issues faster

0
Mood

A quiet record of how you feel

0
Getusefeed

Feature voting & public roadmap for small SaaS teams.

0
HTML Deployer

Deploy sites from AI chat to published HTML

0
Human in the Love

The world's first MCP dating app inside Claude.

0
Avocado

AI-native content operations for any Next.js website

0
Fonty Menu Bar

Install Google Fonts right from your Mac menu bar.

0
Eidentic

The TypeScript SDK for AI agents with self-improving memory

0
Voice Calls in Chatwoot

Calls, chats, and emails all in one support inbox

0
ClientJam

AI-powered lead generation for designers and agencies

0
Bitli.st

Sell your product before you build it

0
Fluxmail

AI email for urgent mail, replies, and follow-ups

0
Releasely

Three changelogs from every release - for indie SaaS founders

0
Tychi AI

The economic layer for autonomous capital.

0
Agata

AI text correction & translation via keyboard shortcuts.

0
yousaidthat.org

Prove that you said things before anyone else

0
Publia

What AI makes, Publia ships.

0
Peak: Message Safety

Review your kid's texts. Nothing leaves your Mac

0
FableWatch

Know the second Fable 5 is back

0
Notum

AI-powered research and document intelligence for law firms.

0
Ledgerly

Available on the App Store

0
Goldfish

Press Option. It knows your work and replies like you

0
YoAmigo Studio

Ship real apps on the AI you already pay for.

0
agentbrowse

Give your AI coding agent the web as a command line

0
LLM Gateway Chat

One balance. Every model. Chat, image, video & audio.

0
Amaroad

Build AI slide decks without leaving your terminal

0
Kraina

Turn your outdoor activities into a territory game.

0
whoburnedmore

Spotify Wrapped for Claude, Codex & a Public leaderboard.

0
Boxwood Chess

Chess pattern training. No timers, no streaks, no ratings.

0
PrompTessor

AI Prompt Generator, Optimizer & Library

0
Sklm

Centralize, scope, and sync skills for every AI agent

0
Annota

Cross Platform, Local First AI powered Note Taking App

0
PeakRoutine

Personalized health coaching powered by your biomarkers

0
Zoona AI

Automated support that learns from docs + past conversations

0
MakersClaw

Hire AI employees that live in your Slack, Teams, Telegram

0
Stash

A hidden pocket for the apps cluttering your Dock

0
Edgee Turbo Models

Use Claude Code with Kimi K2.7 Code, MiniMax M2.7, and more

0
SocialKit

Post to 11 social platforms in seconds, on one flat plan

0
06

TECHMEME

06.00
TECHMEME

Techmeme - June 16, 2026

Techmeme Digest: Major tech headlines and industry conversations.

Sources: Binance is set to lose permission to offer services to EU clients from the start of July as its application for a license is about to be turned down (Reuters)
Source: Techmeme · Published: Jun 16, 2026

Reuters: Binance, the world's largest crypto exchange, is set to lose permission to offer services to European Union clients within weeks …

Ent Security, which wants to build a new layer of workspace security that reads the intent behind what users and AI agents do, launches with $100M in funding (Duncan Riley/SiliconANGLE)
Source: Techmeme · Published: Jun 16, 2026

Duncan Riley / SiliconANGLE: Intent-aware endpoint security startup Ent Security launched today with $100 million in funding to build what it calls a new layer …

Limitless Labs, which is developing an AI platform to automate CNC manufacturing, raised a $20M Series A led by Dell Technologies Capital and Square Peg (Meir Orbach/CTech)
Source: Techmeme · Published: Jun 16, 2026

Meir Orbach / CTech: Israeli startup targets automation of CNC manufacturing workflows. Limitless Labs (formerly LimitlessCNC) has raised $20 million …

Microsoft upgrades Surface Pro and Surface Laptop with Qualcomm Snapdragon X2 chips; Pro starts at $1,499 and Laptop at $1,599, both $100 higher than last-gen (Antonio G. Di Benedetto/The Verge)
Source: Techmeme · Published: Jun 16, 2026

Antonio G. Di Benedetto / The Verge: New colors and high prices. … Microsoft is launching new Surface Laptops …

Copia, an industrial code management and recovery provider, raised $26M co-led by AE Ventures and Squadra Ventures, after previously raising $16.4M (Colin Campbell/Axios)
Source: Techmeme · Published: Jun 16, 2026

Colin Campbell / Axios: Copia, an industrial code management and recovery provider, raised $26 million co-led by AE Ventures and Squadra Ventures, CEO Adam Gluck tells Axios Pro exclusively.

Docs: audited financial figures show OpenAI spent $34B in 2025, up 172% YoY, including $19B on research and development and nearly $6B on sales and marketing (Ed Zitron/Ed Zitron's Where's Your Ed At)
Source: Techmeme · Published: Jun 16, 2026

Ed Zitron / Ed Zitron's Where's Your Ed At: Today, I can exclusively report, based on audited financial documents viewed by this publication that have been independently verified …

Roblox rolls out Roblox Kids accounts for ages 5 to 8 and Roblox Select accounts for ages 9 to 15, globally, after unveiling them alongside age checks in April (Jay Peters/The Verge)
Source: Techmeme · Published: Jun 16, 2026

Robinhood CEO Vlad Tenev says the company is "proactively" reducing its full-time workforce by ~10%, or ~290 employees, in a "flattening" of its org structure (Helen Partz/Cointelegraph)
Source: Techmeme · Published: Jun 16, 2026

Ripple acquires a stake in Flutterwave as part of the African payments startup's Series E, valuing Flutterwave at $3.2B (Emele Onu/Bloomberg)
Source: Techmeme · Published: Jun 16, 2026

Emele Onu / Bloomberg: Flutterwave Inc., one of Africa's most valuable financial-technology startups, sold an equity stake to Ripple, as it seeks to accelerate growth through payments expansion and strategic partnerships.

An interview with SBF about life in prison, his serialized prison memoir titled Manfred, and more, as he praises Trump, becomes a Republican, and seeks a pardon (Simon van Zuylen-Wood/New York Magazine)
Source: Techmeme · Published: Jun 16, 2026

Simon van Zuylen-Wood / New York Magazine: Sam Bankman-Fried is incarcerated at a federal prison in Lompoc, California, which sits northwest of Santa Barbara and is dubbed “the City of Arts and Flowers.”

Atom Computing, one of the nine quantum computing startups backed by the US, raised a $100M Series C led by Third Point, bringing its total funding to $300M (Lucinda Shen/Axios)
Source: Techmeme · Published: Jun 16, 2026

Lucinda Shen / Axios: Quantum company Atom Computing raised a $100 million Series C led by Third Point Ventures, it tells Axios exclusively.

Meta says Threads now has 500M MAUs, adds a "Your Algo" tool that lets users privately control what they see in their feed, and launches Communities out of beta (Aisha Malik/TechCrunch)
Source: Techmeme · Published: Jun 16, 2026

Aisha Malik / TechCrunch: Nearly three years after launching as a competitor to Twitter (now X), Threads has reached 500 million monthly active users, the company announced Tuesday.

Alibaba's AI unit Tongyi Lab launches the Qwen Robot Suite, its first suite of AI models for robots, in pilot testing with some Alibaba Cloud enterprise clients (Wency Chen/South China Morning Post)
Source: Techmeme · Published: Jun 16, 2026

Wency Chen / South China Morning Post: Alibaba unveils Qwen robot models to push AI beyond chatbots, as embodied intelligence becomes the next frontier in global AI

SpaceX agrees to acquire Cursor in a merger valuing the AI coding startup at $60B, set to close in Q3 2026, or pay a $4B to $10B termination fee (Reuters)
Source: Techmeme · Published: Jun 16, 2026

Reuters: Elon Musk's SpaceX (SPCX.O) said on Tuesday it would acquire Anysphere, the software firm behind the popular AI coding agent Cursor …

SoftBank announces an OpenAI-powered "patching service" that aims to protect Japan's top 3,000 companies behind crucial infrastructure against cyberattacks (Yuri Kageyama/Associated Press)
Source: Techmeme · Published: Jun 16, 2026

Yuri Kageyama / Associated Press: Japanese technology giant SoftBank Group Corp. is launching a service using OpenAI technology to protect against the looming threat …

07

STARTUP ARCHIVE

07.00
STARTUP ARCHIVE

Startup News - June 16, 2026

Startup News Roundup: Aggregating key funding and launch updates.

Marc Andreessen on the 5 personality traits of an innovator
Source: Startup · Published: Mar 31, 2026

“When you’re talking about real innovators—people who actually do really creative, breakthrough work—I think you’re talking about a couple things:”

Steve Jobs explains the importance of both thinking and doing
Source: Startup · Published: Mar 30, 2026

“The doers are the major thinkers. The people who really create the things that change this industry are both the thinker-doer in one person.”

Tobi Lutke explains what the VCs who passed on Shopify got wrong
Source: Startup · Published: Mar 27, 2026

“What a lot of free-market thinkers don’t understand is that between the demand and eventual supply lies friction.”

Sam Altman explains how he decides to invest in a startup after 10 minutes
Source: Startup · Published: Mar 26, 2026

“Does this person have the potential to be the next Mark Zuckerberg?… [You don’t get to] 100% accuracy, obviously, but it’s good enough that our business model works.”

Jony Ive recounts the time Steve Jobs called him vain
Source: Startup · Published: Mar 25, 2026

In the clip below, Jony Ive recounts the time he asked Steve Jobs to be less harsh in his critique of a piece of work.

Jeff Bezos’s two pieces of advice for aspiring entrepreneurs
Source: Startup · Published: Mar 24, 2026

“The advice that I would give entrepreneurs is don't chase the hot new thing. It's so hard to catch something that everybody already knows is hot.”

Elad Gil: “Things that work tend to work pretty fast”
Source: Startup · Published: Mar 23, 2026

“I do think there’s a bit of a myth in Silicon Valley that you should keep grinding no matter what and it’s just about perseverance, and I think that’s really bad advice.”

Paul Graham on why starting with a “small, intense fire” is the key to startup growth
Source: Startup · Published: Mar 20, 2026

“You have to know who those first users are and how you're going to get them.”

Keith Rabois on how to identify great talent
Source: Startup · Published: Mar 19, 2026

“What you want to do with every single employee every single day is expand the scope of their responsibilities until it breaks… and that’s the role they should stay in.”

Wealthfront CEO on why advertising spend makes it harder to find product/market fit
Source: Startup · Published: Mar 18, 2026

“The way that you know you have product/market fit is if you have exponential organic growth.”

Eric Schmidt on why most companies get strategy wrong
Source: Startup · Published: Mar 17, 2026

“Work very, very hard to figure out what the world’s going to look like in five years. What will people be doing? What will your customers want? Where will costs be?”

Mark Zuckerberg: “You can’t 80/20 everything”
Source: Startup · Published: Mar 16, 2026

"There’s the famous 80/20 rule where you get 80% of the benefit by doing 20% of the work, but you can’t just 80/20 everything. There have to be certain things that you are just the best at."

Marc Andreessen on Mark Zuckerberg’s founder “superpower”
Source: Startup · Published: Mar 13, 2026

“A great superpower that Mark Zuckerberg has that is probably not well-understood enough is he does not get emotionally upset in stressful situations.”

Sam Altman explains how to come up with a great startup idea
Source: Startup · Published: Mar 12, 2026

"If you start a startup without a good idea… you’ll be under pressure to make something up and it won’t work that well."

Jeff Bezos on the problems with proxies and managing to metrics
Source: Startup · Published: Mar 11, 2026

“One of the things that happens in business is that you develop certain things that you’re managing to—a typical case would be a metric. And that metric isn’t the real underlying thing.”

Airbnb founder Brian Chesky on how to design an amazing user experience
Source: Startup · Published: Mar 10, 2026

“If you can design something really amazing using the hand-crafted part of your brain, then you can reverse-engineer how to industrialize this millions of times over.”

Spencer Rascoff: “I will never invest in a consumer startup with paid marketing”
Source: Startup · Published: Mar 9, 2026

“If you’re actually trying to grow a product, the best levers for doing that are often within the product itself.”

Patrick Collison explains why it sometimes makes sense to quit
Source: Startup · Published: Mar 6, 2026

“One thing I’ve learned myself the hard way, is that it is easier to tear down a company and restart it in Silicon Valley, than it is to constantly try to pivot or keep something alive.”

Jeff Bezos recounts the time he called Amazon’s customer service number mid-meeting to prove a metric was wrong
Source: Startup · Published: Mar 5, 2026

“I have a saying, which is when the data and the anecdotes disagree, the anecdotes are usually right.”

Ben Horowitz: “Nobody was born a great manager. It’s a very unnatural job.”
Source: Startup · Published: Mar 4, 2026

“If you can’t build a great product, it doesn’t matter if you can build a great company.”

03

ALSO TODAY

3 MORE SOURCES
08

SOLIDOT

08.00
SOLIDOT

Solidot News - June 16, 2026

Solidot Feed: Highlighting essential tech & open-source news.

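Banning tech products improved students' reading ability

In the digital age, one teacher's low-tech experiment has shown a marked improvement in students' reading ability. Maureen Mulvaney, who teaches AP Literature and English at Washburn High School in Minneapolis, began the experiment after running into student plagiarism, poor concentration, and declining reading ability. With parents' support, she banned phones and laptops and required all assignments to be done with pen and paper. Students resisted at first, but the effect was immediate: in September 2025, before the experiment, only 46% of students were confident in their reading ability, and by February of this year the share had jumped to 95%. Most students can write at least two pages of English prose, and some can write five or six. 79% of students said writing and organizing their thoughts was easier on paper than on a screen.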

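Can open-source models beat OpenAI?

AI companies in China and the United States have adopted different release strategies: Chinese companies favor open-weight models, while US companies such as OpenAI and Anthropic keep their models closed. Tiezhen Wang, a former Asia-Pacific ecosystem executive at Hugging Face, noted that OpenAI and Anthropic accuse Chinese AI companies of distilling their models. He considers distillation neutral: US AI companies trained their models by scraping information from the internet, so they are not the creators of that knowledge, yet they try to stop others from reusing it, which he finds somewhat ironic. In his view, all AI-generated content should be free of copyright; otherwise those with compute could abuse their power by generating every possible combination of content and claiming copyright on all of it. He also sees a clear difference between Chinese and US companies in maximizing token usage: because China has many open-weight models, usage costs are lower than in the US, so Chinese internet companies encourage employees to use as many tokens as possible and to become AI-native developers, and some even forbid them from doing routine work such as writing documentation by hand.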

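curl pauses vulnerability reports for a month

curl maintainer Daniel Stenberg announced that curl will stop accepting vulnerability reports from July 1 to August 3, unless the submitter has a paid support contract. He calls it “curl's summer of bliss”. Stenberg said the team has been under enormous pressure for the past four months and needs a break. The project's GitHub issues and pull requests remain open, and the release of curl 8.22.0 has been pushed back two weeks to September 2, 2026.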

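Zhuque-2 rocket breakup produces more than a hundred pieces of debris

On June 9, LandSpace launched the Zhuque-2 enhanced Y6 carrier rocket (ZQ-2E Y6) from the Dongfeng launch site, placing its payloads, the Qianfan DTC 01 satellite and the China Mobile 02 satellite, into their planned orbits. The rocket's upper stage then broke up in space, scattering debris across low Earth orbit, some of it overlapping the orbits of the International Space Station and SpaceX's Starlink broadband network. Darren McKnight, a senior technical fellow at LeoLabs, said the event may have produced 100 to 150 fragments. One of them is the rocket's second stage, about 8 meters long and about 3.35 meters in diameter. The main body of the upper stage is orbiting 335 to 424 km above Earth at an inclination of 54.5 degrees. The good news is that the orbit is low enough for atmospheric drag to bring most of the debris back into the atmosphere to burn up within a few months. Above 650 km, the debris would take decades or longer to reenter.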

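Firefox 152 released

Mozilla has released Firefox 152. Major changes: JPEG-XL support code is now compiled in by default but remains disabled by default, so users need to turn it on in Firefox Labs (JPEG-XL is a new patent-free image format, and the codec is written in Rust); a redesigned settings interface; HDR video support on Windows across different hardware configurations; support for the CSS field-sizing property; and a range of new features for developers.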

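Drug uptake in women's heads is linked to the menstrual cycle

Experiments in mice and humans show sex differences in the amount of drug taken up by the head through intranasal administration, and uptake in females is closely tied to the reproductive cycle. In mice, the researchers found that during proestrus and estrus, when estrogen surges to its peak, the head region of females took up significantly more drug than that of males; as the cycle moved into metestrus and estrogen fell toward its lowest level, there was essentially no difference between the sexes. The researchers observed a similar pattern in humans: women tended toward higher peak concentrations, with the highest peak in women more than twice the highest peak in men, while men retained the drug longer.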

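Microsoft turns to AWS to meet AI-driven load growth on GitHub

GitHub, Microsoft's code-hosting platform, has suffered frequent outages recently. It is migrating its services to Microsoft's Azure cloud, but that still cannot keep up with growing demand, so it is turning to its biggest rival, Amazon's AWS, to keep the platform running. Amazon said it does not comment on individual customers. For Microsoft, the risk of GitHub outages now outweighs the downside of paying AWS. GitHub chief operating officer Kyle Daigle said earlier that commits on the platform will surge from 1 billion in 2025 to 14 billion in 2026. In October 2025 GitHub planned to increase platform capacity 10-fold, but by February 2026 it found it needed a 30-fold increase, because AI coding has sharply raised the platform's workload.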

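Swiss voters reject proposal to cap population at 10 million

Switzerland held a referendum on June 14 on whether to limit the country's permanent resident population to 10 million before 2050. Switzerland's fertility rate is 1.29 children per woman, well below the replacement rate of 2.1, and its population growth is driven mainly by immigration. The population already exceeds 9 million, and official figures show that foreign citizens made up more than 27% of it in 2024. The proposal, backed by the right-wing Swiss People's Party, demanded that “Switzerland's permanent resident population must not exceed 10 million before 2050, and Switzerland should abandon its free movement agreement with the EU”. Voters ultimately rejected the proposal, which had been dubbed a “Swiss Brexit”: 54.79% voted against and 45.21% in favor, on a turnout of 58.86%.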

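Russia plans to retire the ISS's leaking PrK module

The PrK module, which sits between the Progress airlock and the Zvezda service module, has troubled the International Space Station for years with air leaks caused by structural cracks. Earlier this year the leak was thought to have been fixed, but earlier this month it was reported to have worsened again, with the number of cracks in the module reaching 16. Ten days ago Russian cosmonauts tried to remove one of the module's load-bearing brackets with a saw. NASA objected strongly, warning of potentially serious consequences, and ordered astronauts into the Crew Dragon spacecraft docked to the station, suited up and ready to evacuate if necessary. The Russian space agency eventually dropped the plan to remove the bracket. After repeated back-and-forth behind the scenes, Russia notified NASA that it will retire the PrK module. That means crew will no longer enter the module or try to pressurize it again, and Russia will need to use other ports to transfer supplies to the station.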

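Arch Linux hit by a new wave of AUR malware attacks

The Arch User Repository (AUR), the Arch Linux project's user-contributed package repository, suffered a large-scale malicious attack last week. After dealing with more than 1,500 packages, developers believed the problem was under control. Just one day later, however, the AUR was hit by a new wave of attacks, this time with the attackers using code obfuscation to hide their intent. The AUR is a library of user-contributed packages, not an official repository, and the Arch Linux project may need to take it offline temporarily to avoid wave after wave of attacks.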

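Millions of students attend schools within 5 km of toxic pollution sites

According to a geographic analysis by the think tank Centre for Global Development, millions of children attend schools near known toxic pollution such as lead, mercury, arsenic, and pesticides. The study found that more than 252,000 schools in 17 countries across Asia, Africa, and Latin America lie within 5 km of a toxic pollution site. These schools have more than 43 million children, 5.2 million of them within 1 km. In developed countries the burden of pollution falls disproportionately on poor and non-white students, but in developing countries pollution is concentrated in cities where wealthier people live, and urban schools are usually larger and so have more students. In the Philippines, for example, 9% of schools are near pollution sites, yet they account for 27% of the country's students. The analysis also shows that private schools are more likely than public schools to be near a pollution site: in Ghana, 41% of private schools are near one, compared with only 18% of public schools.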

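UK to ban under-16s from social media

UK Prime Minister Keir Starmer announced that the UK will ban young people under 16 from accessing social media. The UK ban is broader and stricter than Australia's similar ban. It covers all social media, with separate restrictions on online products such as games that include chat features, for example barring young people from chatting with strangers. Starmer said governments always have to make choices, and he believes a full ban is the right one.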

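Test shows AI still trails human experts at solving math problems

AI models still fall short of top mathematicians at problem solving. The test, part of the First Proof project, is designed to assess AI's ability to solve hard mathematical problems. Researchers posed 10 research-level math problems to 4 AI systems, and an anonymous panel of experts in the relevant fields graded the answers. It is the first test to meet three core criteria at once: every problem is a frontier research-level question, none has ever appeared in the models' training data, and the answers are graded by professional mathematicians. Ten researchers from different areas of mathematics each contributed an original problem that they had solved in their own research but not yet published. In the test, the reasoning models still hallucinated frequently, a common failing of large language models. All of the AI answers were also “seriously lacking” in literature citations, giving no sources at all.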

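China summons Sam's Club over food safety

China's market regulator has summoned Walmart's Sam's Club over food safety problems, a challenge for the warehouse chain as it expands rapidly in the world's second-largest economy. The State Administration for Market Regulation said on Monday that it had summoned the company “in response to the frequent food safety problems at Sam's Club physical stores and online stores that regulators have found and the media have exposed over a period of time”. The notice added that regulators required Walmart to strictly comply with China's food safety laws, but it did not say when the meeting took place or give details of the alleged violations.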

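Chinese universities cut 12,000 outdated majors

Chinese universities are restructuring their curricula on a large scale, cutting thousands of “outdated” majors and adding new technology-oriented ones. The reform comes as China strives to become a global leader in many high-tech industries of the future and to resolve a severe graduate employment crisis that has left millions of young people struggling to find work. According to Ministry of Education data cited by Xinhua, between 2021 and 2025 Chinese universities cancelled or suspended 12,200 undergraduate majors and added 10,200 new ones, meaning more than 30% of university programs were adjusted.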

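Gold is not inert

Gold is one of the few metals that almost never oxidize, yet gold nanoparticles can act as catalysts. In a study published in Physical Review Letters, scientists report that gold's inertness comes not from the atoms themselves but from the surfaces that gold crystals form. Cutting a gold crystal along different atomic planes yields different surface arrangements: some planes have a square lattice and others a hexagonal lattice. The researchers tested how well different gold crystal surfaces adsorb oxygen molecules. The most common hexagonal surface binds oxygen only weakly, while the square surface adsorbs oxygen molecules readily and can deform them until they split, in which case the gold is oxidized.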

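Human management drove a doubling of rice production over the past fifty years

A new study by scientists at the University of Illinois Urbana-Champaign shows that, despite the challenges of climate change, global rice production nearly doubled over the past half century. The study finds that the gains came not from favorable weather but from human management decisions, such as expanding irrigation, applying more nutrients, and adopting farming practices that raise yield per unit area, which together sustained rice production and offset climate-related losses. This suggests that future food security depends not only on environmental conditions but even more on how people manage and adapt rice production systems to a changing world. The study identifies climate change as the leading cause of rice yield losses: between 2006 and 2015, rising temperatures, frequent heat stress, and water shortages reduced global rice production by an estimated 7%. Rising atmospheric carbon dioxide, however, has also been a major driver of yield gains, because it enhances photosynthesis and improves water-use efficiency. Together, the findings paint a complex picture: the effects of environmental change on agriculture are multifaceted and can even pull in opposite directions.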

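Memory accounts for more than half of a phone's cost

Nothing CEO and co-founder Carl Pei says that if you are thinking about upgrading your phone, the best time was yesterday. Pei said the memory shortage has hit Nothing's mid-range phones. Memory has become the most expensive component in a smartphone, costing more than the processor and more than the display, and it may account for more than half of total hardware cost. Take the Phone (4a): between the decision to produce the device and its launch, memory costs doubled, and they have doubled again since. Phone prices are rising and will keep rising next year. Since February, newly launched phones have cost $100 more than their predecessors. In India, phones priced above 30,000 rupees have gone up by 7,000 rupees or more.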

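Linux 7.1 released

Owing to a time zone difference, Linus Torvalds released Linux 7.1 on Sunday morning US time. Major new features include: removal of some older 486-based architectures; high-memory support for Loongson; removal of RISC-V execute-in-place support for lack of maintenance; new clone() flags that simplify process management; BPF support in the io_uring subsystem; zero-copy I/O support in the ublk userspace block driver; initial sub-scheduler support in sched_ext; improved swap handling; a completely rewritten NTFS implementation; and more.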

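Honda Civic is vulnerable to “evil maid attacks”

An evil maid attack targets an unattended device: an attacker with physical access alters the device in some undetectable way so as to access it, or the data on it, later. The Honda Civic is open to a similar attack, for example from an evil hotel valet. Researchers found that Android packages used in Honda cars are signed with the publicly available AOSP test keys, so anyone with physical access to the car's USB port can flash arbitrary packages and run arbitrary code.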

09

APP STORE RANK

09.00
APP STORE RANK