TEXT VIEW · TODAY'S DIGEST · 36 HEADLINES ACROSS 8 SOURCES

ISSUE 0887
FRI, JUN 5, 2026
Discover the best information organized by OrangeBot.AI

The web,
read by a bot.

Ten sources — Hacker News, Product Hunt, HuggingFace, Techmeme and more — filtered, tagged, and summarized every morning for builders who don’t have time to scroll.

01 · AI DIGEST

UPDATED DAILY · EDITOR'S PICK

AI News Digest

June 5, 2026

Here is a summary of today's key news events.

Strong U.S. Jobs Report Shakes Markets and Boosts Dollar

A stronger-than-expected U.S. labor report was released today, leading to mixed reactions in the stock market. While the Dow Jones Industrial Average rose, the tech-focused Nasdaq and S&P 500 fell. The strong jobs data also caused the U.S. dollar and government bond yields to climb, as it increases the likelihood that the Federal Reserve may raise interest rates this year to manage inflation.

Highly Anticipated Rocket Company IPO Attracts Small Investors

A major rocket company is preparing for a landmark Initial Public Offering (IPO). Banks are actively promoting its massive valuation to investors. In a notable move, a significant portion of the shares is being reserved for individual retail investors, generating widespread interest as a new way to invest in the growing space industry.

AI Developments Highlight Rapid Progress and Real-World Challenges

Developments in Artificial Intelligence are a major focus, with a leading AI startup warning that models are approaching the ability to self-improve. As tech giants like Apple work to integrate advanced AI into consumer products, governments are also taking action, with Canada launching a national fund to support AI companies. The technology's growth is also creating practical issues, such as proposed electricity rate hikes to power the energy-intensive data centers AI relies on.

Hopes for U.S.-Iran Talks Affect Oil and Gold Prices

Global commodity markets are reacting to geopolitical developments. Oil and gold prices fell amid investor optimism for potential U.S.-Iran negotiations, which could ease tensions in the Middle East. Separately, in currency markets, the Japanese yen has weakened again against the dollar, approaching a level that may prompt Japan's government to intervene to support its value.

02

ON THE WIRE

6 SOURCES
02 · HACKER NEWS

Hacker News - June 5, 2026

Hacker News Feed: Highlighting key posts and discussions.

C++: The Documentary (herbsutter.com)
230141

WiFi Time (mitxela.com)
1126

The Causes of Long Covid (www.science.org)
12982

Retro-Tech Parenting (havenweb.org)
322215

VoidZero Is Joining Cloudflare (blog.cloudflare.com)
650286

Ian's Secure Shoelace Knot (www.fieggen.com)
565212

03 · HUGGINGFACE

HuggingFace - June 5, 2026

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Role-playing language agents (RPLAs) should play characters whose values and behavior evolve as the story progresses, not maintain a fixed persona. Existing benchmarks measure factual recall at a given chapter, not whether responses align with the character's psychological trajectory, especially in scenarios the source text never explores. We introduce ArcANE (Arc-Aware Narrative Evaluation), an automatically constructed benchmark spanning 17 novels and 80 principal characters. A Character Arc segments the narrative into phases along a psychological axis, and each probe poses the same scenario across phases, spanning both situations within the source text and situations beyond it. Across six models and six context modes, conditioning on the Character Arc tops every other context strategy on every model, and the gap is largest on scenarios outside the source text where retrieval has nothing to find. We further fine-tune open-weight models on the same data to obtain ArcANE-8B/32B, which widen the Arc advantage even more on scenarios outside the source text.

39
TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has noticed, while many other important problems coexist, hidden in plain sight, within the broader user context, with their total number unknown in advance. We frame this as the task of discovering multiple hidden problems from context, in which coexisting problems should be uncovered, grounded in supporting evidence, and paired with concrete actions. To this end, we introduce TIDE, a template-guided iterative framework with two complementary mechanisms. Specifically, motivated by the observation that single-pass prediction anchors on the most salient cases and yields generic claims, we propose iterative discovery, which surfaces a small batch of candidates per round while conditioning on what has already been found, so subsequent rounds extend coverage; and thought templates, reusable schemas distilled from previously solved cases that specify what contextual signals to attend to and how to connect them, anchoring each prediction in a recognizable problem class. We validate TIDE on two realistic settings, personal workspaces and software repositories, across four model backbones, showing substantial gains over single-shot and parallel multi-agent baselines on task coverage, identification, and resolution.

33
AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

Planning for real-world problems by language models often involves both world and user constraints, which may not be fully specified upfront and are progressively disclosed through interaction. However, existing benchmarks still underexplore adaptive planning under such progressively revealed dual constraints. To address this gap, we introduce AdaPlanBench, a dynamic interactive benchmark for evaluating whether Large Language Model (LLM) agents can adaptively plan and re-plan under progressively revealed world and user constraints. AdaPlanBench is built on 307 household tasks, with a scalable constraint construction pipeline that augments each task with dual constraints. At runtime, agents interact with the environment in a multi-turn protocol where hidden constraints are revealed only when the agent proposes a plan that violates them, requiring iterative plan revision under accumulating feedback. This makes planning challenging, as agents must infer and track constraints from feedback while re-planning effectively. Experiments on ten leading LLMs show that adaptive planning under dual constraints remains challenging, with the best model reaching only 67.75% accuracy. We further observe that performance degrades as more constraints accumulate, with user constraints posing a particularly large challenge and failures often stemming from weaker physical grounding and reduced effectiveness. These results establish AdaPlanBench as a testbed for dual-constrained interactive planning and highlight the challenge of reliable adaptation to dynamically revealed constraints in LLM agents.

26
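
The AdaPlanBench interaction protocol described above (hidden constraints surface only when a proposed plan violates them) can be sketched as a small loop. This is a toy illustration, not the benchmark's code: the `Constraint` class, `run_episode`, `toy_planner`, and the two example constraints are all hypothetical names invented here, and the real benchmark uses LLM agents and household tasks rather than string matching.

```python
from typing import Callable, List, Tuple

Plan = List[str]  # hypothetical representation: a plan is a list of action strings


class Constraint:
    """A hidden rule; `violated_by` returns True when a plan breaks it."""

    def __init__(self, name: str, violated_by: Callable[[Plan], bool]) -> None:
        self.name = name
        self.violated_by = violated_by


def run_episode(
    propose: Callable[[List[str]], Plan],
    hidden: List[Constraint],
    max_turns: int = 5,
) -> Tuple[bool, List[str], int]:
    """Return (solved, names of revealed constraints, turns used)."""
    revealed: List[str] = []
    for turn in range(1, max_turns + 1):
        plan = propose(revealed)  # the agent only sees what has been revealed
        violated = [c.name for c in hidden if c.violated_by(plan)]
        if not violated:
            return True, revealed, turn
        # Feedback step: reveal exactly the constraints this plan violated.
        for name in violated:
            if name not in revealed:
                revealed.append(name)
    return False, revealed, max_turns


def toy_planner(revealed: List[str]) -> Plan:
    """Stand-in for an LLM agent: patches its default plan per revealed rule."""
    plan = ["boil water in kettle", "pour into glass cup"]
    if "quiet_hours" in revealed:
        plan[0] = "heat water on stove"
    if "no_glass" in revealed:
        plan[1] = "pour into ceramic mug"
    return plan


hidden_constraints = [
    Constraint("no_glass", lambda p: any("glass" in a for a in p)),
    Constraint("quiet_hours", lambda p: any("kettle" in a for a in p)),
]
solved, revealed, turns = run_episode(toy_planner, hidden_constraints)
```

In this toy run the first plan violates both rules, both are revealed, and the revised plan passes on the second turn. The difficulty the abstract reports comes from the part this sketch trivializes: inferring and tracking the constraints from feedback.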
VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

We introduce VideoKR, the first large-scale training corpus specifically designed to strengthen knowledge- and reasoning-intensive video understanding. It comprises 315K video reasoning examples over 145K newly collected, CC-licensed, expert-domain videos. We develop a human-in-the-loop, skill-oriented example generation pipeline that targets progressively deeper video reasoning capabilities while ensuring the difficulty, diversity, and reliability of both the examples and their CoT rationales. We also curate VideoKR-Eval, a new expert-annotated benchmark where questions require genuine video understanding and knowledge-intensive reasoning rather than textual shortcuts. Our experiments show that, under a standard SFT→GRPO pipeline, models post-trained on VideoKR outperform prior post-training approaches on knowledge-intensive video reasoning while remaining competitive on general video reasoning, highlighting data design as a key driver of progress in video reasoning. We further conduct comprehensive ablations to isolate the contributions of VideoKR, providing actionable insights for future work.

25
RobotValues: Evaluating Household Robots When Human Values Conflict

While household robots are often evaluated based on task completion, everyday domestic environments involve value-conflicting situations in which robots are expected to choose actions that prioritize values other than task success, such as human autonomy, efficiency, or social appropriateness. Yet, there are no benchmarks for evaluating robots' value preferences in such scenarios. We introduce RobotValues, a benchmark to evaluate household robot planners in 10K value-conflict scenarios. Each instance consists of a realistic household image with multiple plausible robot actions that prioritize different human values. We construct RobotValues through LLM-assisted scenario generation, stakeholder-grounded value extraction, image generation, and automatic quality control. Using RobotValues, we evaluate VLMs used in robotics and find that models exhibit default value preferences, including safety and accommodation, while underselecting privacy-prioritizing actions. When the models are instructed to prioritize specific values that conflict with their own preferences, they often fail to override their default actions, choosing incorrect actions 80% of the time. These findings suggest that household robot evaluation should measure not only task completion or safety compliance, but also whether robots can choose among plausible actions when human values conflict.

22
Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To translate extremely low-resource languages at scale, we argue that LLMs must acquire the meta-skill of utilizing in-context linguistic knowledge rather than memorizing specific languages. In this paper, we propose a reinforcement learning (RL) approach to unseen language translation given rich linguistic context, using a surface-level translation metric (chrF) as the reward. Empirically, despite the lightweight reward, our RL-trained models effectively extract and apply relevant linguistic information from the provided context, leading to better translations on completely unseen languages than in-context learning or supervised fine-tuning. Our analyses suggest that outcome-based RL can extend beyond conventional reasoning tasks like math and coding to serve as a recipe for language learning from context.

22
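
The reward used in the translation paper above, chrF, is a character n-gram F-score. The sketch below is a simplified version meant only to show why the abstract calls it a surface-level, lightweight reward. It assumes n-gram orders 1 to 6, beta = 2, and whitespace stripped before counting; the function names are made up here, and the paper's exact chrF configuration is not stated in the abstract.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring spaces (a simplification)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: average n-gram precision and recall, then F-beta."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to have n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Used as an RL reward, a rollout's translation would be scored against
# the reference translation, for example:
reward = simple_chrf("the cat sat", "the cat sits")
```

The score needs no learned model and no linguistic resources for the target language, which is what makes it usable for languages the model has never seen.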
LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing

Developing unified video generation and editing models capable of interpreting interleaved multimodal inputs is a promising yet challenging frontier. Existing unified frameworks predominantly rely on massive models (typically 13B parameters or more) and incorporate source video conditions for editing by concatenating sequence tokens. This concatenation inevitably doubles the sequence length, quadrupling the computational complexity of the self-attention mechanism and introducing prohibitive overhead. To address these bottlenecks, we present LoomVideo, a highly efficient 5B-parameter unified architecture for both video generation and editing. LoomVideo replaces the standard text encoder with a Multimodal Large Language Model (MLLM) and employs a Deepstack injection mechanism to align multi-layer MLLM features with the Diffusion Transformer (DiT). Crucially, we introduce a zero-overhead Scale-and-Add conditioning approach for video editing. By scaling and directly adding the clean source video latent to the noised target latent, this elegant design eliminates the need for token concatenation, drastically reducing computational cost while maintaining robust capabilities for complex, non-rigid edits. Furthermore, a Negative Temporal RoPE strategy is seamlessly integrated to handle multiple reference images. Extensive experiments demonstrate that our compact 5B model achieves state-of-the-art or highly competitive performance across comprehensive benchmarks, exhibiting exceptional superiority in e-commerce and fashion generation scenarios. Benefiting from the zero-overhead conditioning mechanism, LoomVideo achieves at least a 5.41x acceleration in inference speed compared to models of similar capabilities, paving the way for highly practical and efficient video foundation models.

16
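
Two claims in the LoomVideo abstract above are easy to check with a few lines: concatenating source tokens doubles the sequence and quadruples the pairwise attention cost, while an element-wise scale-and-add leaves the sequence length unchanged. The sketch assumes latents are flat lists of floats and that `scale` is a plain scalar; both are simplifications, and the abstract does not say how the model sets its scale.

```python
from typing import List


def attention_pairs(seq_len: int) -> int:
    """Number of query-key pairs in full self-attention over seq_len tokens."""
    return seq_len * seq_len


def concat_condition(noised_target: List[float], clean_source: List[float]) -> List[float]:
    """Baseline: append the source tokens to the target tokens."""
    return noised_target + clean_source


def scale_and_add(noised_target: List[float], clean_source: List[float], scale: float) -> List[float]:
    """Condition by adding the scaled source latent element-wise."""
    if len(noised_target) != len(clean_source):
        raise ValueError("latents must have the same length")
    return [t + scale * s for t, s in zip(noised_target, clean_source)]


target = [0.1, -0.4, 0.3, 0.9]
source = [1.0, 2.0, 3.0, 4.0]

concat_len = len(concat_condition(target, source))   # 8 tokens
added_len = len(scale_and_add(target, source, 0.5))  # still 4 tokens
cost_ratio = attention_pairs(concat_len) / attention_pairs(added_len)  # 4.0
```

The factor of four applies only to the quadratic attention term, so it does not by itself account for the 5.41x end-to-end speedup the abstract reports.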
Complexity-Balanced Diffusion Splitting

Standard continuous-time generative models rely on monolithic architectures that must navigate vastly different signal regimes, from isotropic noise to intricate data distributions. While scaling model capacity improves performance, deploying a massive network uniformly across the entire generative timeline is inherently inefficient. In this work, we propose Complexity-Balanced Splitting (CBS), a principled framework for temporal capacity allocation that distributes the generative workload across multiple specialized sub-networks. Grounded in function approximation theory and de Boor's equidistribution principle, CBS partitions the diffusion timeline into segments of equal approximation burden, allocating more representational capacity to regions where the generative dynamics are more difficult to model. To estimate this local complexity, we introduce two complementary and tractable monitor functions: a spatial measure based on the flow's Dirichlet energy, and a geometric measure based on the acceleration of the sampling trajectories. Using a lightweight auxiliary model to estimate these complexity profiles, our approach eliminates the need for heuristic temporal splits or computationally expensive search procedures. Extensive evaluation across multiple architectures (SiT, JiT, and UNet) and datasets demonstrates that CBS consistently improves synthesis quality without increasing per-step inference cost. In particular, CBS improves FID by ~35% on SiT-XL with CFG relative to naive temporal partitioning. Project page is available at https://noamissachar.github.io/CBS/.

15
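
The equidistribution idea behind CBS, described above, is to place segment boundaries so that each segment carries the same integral of a complexity monitor. The sketch below shows that principle on a sampled monitor function using a trapezoid rule and linear interpolation. The function name and the toy monitor values are illustrative assumptions; the paper's actual monitors (Dirichlet energy, trajectory acceleration) are not reproduced here.

```python
from bisect import bisect_left
from typing import List, Sequence


def equidistribute(times: Sequence[float], monitor: Sequence[float], num_segments: int) -> List[float]:
    """Return num_segments + 1 breakpoints with equal monitor integral per segment."""
    # Cumulative integral of the monitor via the trapezoid rule.
    cumulative = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        cumulative.append(cumulative[-1] + 0.5 * (monitor[i] + monitor[i - 1]) * dt)
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("monitor must have positive total integral")

    breakpoints = [times[0]]
    for k in range(1, num_segments):
        target = total * k / num_segments
        j = bisect_left(cumulative, target)  # first grid index reaching the target
        lo, hi = cumulative[j - 1], cumulative[j]
        frac = (target - lo) / (hi - lo) if hi > lo else 0.0
        # Linear interpolation inside the grid cell (an approximation).
        breakpoints.append(times[j - 1] + frac * (times[j] - times[j - 1]))
    breakpoints.append(times[-1])
    return breakpoints


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
uniform_split = equidistribute(grid, [1.0, 1.0, 1.0, 1.0, 1.0], 2)
late_heavy_split = equidistribute(grid, [0.0, 0.0, 0.0, 4.0, 4.0], 2)
```

A flat monitor gives the naive midpoint split, while a monitor concentrated late in the timeline pushes the boundary later, so that the second sub-network covers a shorter but harder interval.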
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Experience internalization converts contextual experience from past interactions into reusable parametric capability, offering a promising path toward continual learning in large language models (LLMs). While prior work has predominantly focused on single-iteration transfer, we discover that under multi-iteration experience learning, existing methods suffer from a progressive capability collapse rather than compounding improvement. We systematically examine this failure through three vital dimensions of experience internalization: (1) Experience Granularity: We find that principle-level experience is more durable than instance-level experience, as it effectively abstracts transferable strategies away from trajectory-specific details. (2) Experience Injection Pattern: Our analysis reveals that step-wise injection significantly outperforms global injection by aligning experience with intermediate decision states, a property that is critical for long-horizon tool use. (3) Internalization Regime: We demonstrate that off-policy context-distillation on high-quality teacher trajectories provides a substantially more stable training signal than on-policy context-distillation, which is inherently limited by local corrections on student-induced flawed states. Together, these insights yield a simple yet robust recipe for stable and sustainable experience internalization, providing concrete guidance for engineering self-evolving and continually learning LLMs.

15
Personal AI Agent for Camera Roll VQA

We study the personal camera roll visual question answering setting. In this setting, a conversational AI assistant can access a user's personal camera roll and retrieve relevant photos to answer queries, ranging from simple factual questions (e.g., "Name of the food I tried yesterday?") to more open-ended ones (e.g., "Recommend some dishes I have never eaten before"). Given the vast nature of the personal camera roll (i.e., multiple years, hundreds to thousands of photos), a successful AI assistant needs to understand a long-horizon, highly personalized visual content stream in order to navigate and locate the correct and/or relevant information. To support this, we collect and manually annotate questions that mimic real-world usage. The final dataset, camroll, contains 50 users, 31,476 images, and 2,500 QA pairs. We further design camroll-agent, a conversational AI agent equipped with hierarchical memory and a minimal set of tools for efficient navigation over large, personalized visual memory. Experimental results show that camroll-agent outperforms numerous baselines and methods for long-context understanding in AI agent systems. Together, the camroll dataset and camroll-agent highlight the gap in AI agents' long-context reasoning: personalized visual memory requires different approaches from standard long-context textual memory, especially when consistency, visual details, and user-specific context are present.

15
Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation?

Video generation models have made impressive strides in synthesizing visually compelling content, yet their outputs remain confined to the virtual domain. A natural question follows: how well do these models reflect the physical world when their generated videos leave the screen and enter reality? We propose robotic manipulation as a concrete, measurable window onto this question: if a model has truly internalized physical laws, the motion it depicts should translate into executable robot behavior. We introduce Dream.exe, an evaluation framework that operationalizes this criterion through a video-to-execution pipeline. Given a scene image and a task description, Dream.exe synthesizes a manipulation video, converts the generated motion into robot trajectories, and executes them in a physics simulator, yielding a grounding signal that purely visual metrics cannot offer. Using this pipeline, we evaluate 8 models spanning frontier closed-source generators, open-source generators, and robot-specific models. Our benchmark covers 101 manually curated manipulation tasks at three levels of physical complexity, measured across visual quality, trajectory fidelity, and execution success. Encouragingly, several models achieve measurable execution success, suggesting that generative priors learned from internet-scale data already encode meaningful physical knowledge. Yet visual quality proves a poor predictor of executability, exposing a dimension of model capability that standard visual evaluations do not capture. Dream.exe will be open-sourced at https://github.com/showlab/Dream.exe.

12
Unsupervised Skill Discovery for Agentic Data Analysis

Inference-time skill augmentation provides a lightweight way to improve data-analytic agents by injecting reusable procedural knowledge without updating model parameters. However, discovering effective skills for data analysis remains challenging, as reliable supervision is expensive and success criteria vary across analytical formats. This raises the key question of how to discover reusable data-analysis skills from unlabeled exploration alone. We propose DataCOPE, an unsupervised verifier-guided skill discovery framework for data-analytic agents. DataCOPE derives verifier signals from the exploration trajectories and uses them to characterize relative quality or agreement among trajectories. It iteratively coordinates a Data-Analytic Agent for trajectory generation, an Unsupervised Verifier for signal extraction, and a Skill Manager for contrastive skill distillation. For report-style analysis, we instantiate the verifier as an Adaptive Checklist Verifier that derives task-specific criteria, scores reports by verifiable coverage, and iteratively refines the checklist. For reasoning-style analysis, we instantiate it as an Answer Agreement Verifier that groups trajectories by answer agreement and uses self-consistency as an auxiliary signal. We evaluate DataCOPE on report-style analysis from Deep Data Research and reasoning-style analysis from DABStep. Across both settings, DataCOPE consistently improves held-out performance over baselines. Averaged across four model settings, DataCOPE improves the mean score by 9.71% and 32.30% on report-style and reasoning-style tasks respectively.

10
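
The Answer Agreement Verifier mentioned in the DataCOPE abstract above groups trajectories by their final answer and treats agreement as a label-free quality signal. A minimal sketch of that grouping follows. The `(trajectory_id, answer)` tuple format, the lowercase-and-strip normalization, and the function names are assumptions made for illustration, not details from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_answer(trajectories: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each normalized final answer to the ids of trajectories that produced it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for trajectory_id, answer in trajectories:
        groups[answer.strip().lower()].append(trajectory_id)
    return dict(groups)


def agreement_scores(trajectories: List[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of trajectories agreeing on each answer (a self-consistency signal)."""
    groups = group_by_answer(trajectories)
    total = len(trajectories)
    return {answer: len(ids) / total for answer, ids in groups.items()}


runs = [("t1", "42"), ("t2", " 42 "), ("t3", "17"), ("t4", "42")]
scores = agreement_scores(runs)
majority_answer = max(scores, key=scores.get)
```

Trajectories in the majority group and those outside it could then serve as the positive and negative sides of a contrastive comparison, which is one plausible reading of the "contrastive skill distillation" the abstract describes.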
LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs

Large language models can reproduce training data, but existing memorization evaluations mostly measure whether models can be forced to do so, rather than whether they do so under ordinary use. We introduce PropMe, a propensity-aware framework for memorization evaluation that contrasts prefix-based capability attacks with non-adversarial evaluations. We propose a metric transformation that, applied to existing functions, yields propensity metrics. We further introduce SimpleTrace, a lightweight tracing pipeline built on infini-gram that deterministically attributes model generations to large-scale training corpora and computes verbatim, near-verbatim, and propensity-transformed memorization metrics. Evaluating two fully open models (Comma and DFM Decoder) on two datasets (Common Pile and Dynaword) in two languages, we find a consistent gap between capability and propensity: prefix attacks elicit substantially stronger memorization signals than generic or dataset-specific prompts, while propensity scores remain low overall. Thus, the models can reveal training data when directly elicited, but rarely do so in more common non-adversarial settings. We also find that DFM Decoder, which is continually pre-trained from Comma, exhibits reduced memorization and memorization propensity for Common Pile, confirming that memorization capability can decrease when later training emphasizes partially different data. Our results suggest, and we encourage, that memorization audits should report both worst-case extractability and ordinary leakage propensity in order to have a more comprehensive view of this phenomenon.

6
The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

Inference-time scaling has emerged as a critical avenue for enhancing Large Language Models' performance, yet real-world deployment is constrained by strict computational budgets. In this work, we formulate inference budget allocation as a global constrained optimization problem governed by economic principles. By modeling per-query reasoning utility with a shifted-surge function, we derive an optimal allocation policy based on a global shadow price that equilibrates marginal utility under resource scarcity. Based on this theory, we propose Constrained Latent-utility Equilibrium Allocation for Reasoning (CLEAR). It performs rational abandonment and reallocates resources from insolvent queries to solvable queries near their emergence thresholds. Extensive experiments on several reasoning tasks with different traffic streams demonstrate that CLEAR significantly improves the Pareto frontier of total token cost versus mean accuracy. In resource-scarce regimes, CLEAR achieves up to a 3x improvement in global accuracy compared to uniform allocation.

6
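
The shadow-price framing in the CLEAR abstract above follows the standard Lagrangian treatment of a budget-constrained allocation. The derivation below is a generic sketch under stated assumptions, not the paper's formulation: $u_i$ is a differentiable per-query utility, $b_i$ is the token budget given to query $i$, $B$ is the global budget, and all of these symbols are introduced here for illustration. The paper's shifted-surge utility is not specified in the abstract.

```latex
\begin{align*}
  &\max_{b_1,\dots,b_N \ge 0} \; \sum_{i=1}^{N} u_i(b_i)
    \quad \text{s.t.} \quad \sum_{i=1}^{N} b_i \le B \\[4pt]
  &\mathcal{L}(b,\lambda) = \sum_{i=1}^{N} u_i(b_i)
    - \lambda \Big( \sum_{i=1}^{N} b_i - B \Big), \qquad \lambda \ge 0 \\[4pt]
  &\frac{\partial \mathcal{L}}{\partial b_i} = 0
    \;\Longrightarrow\; u_i'(b_i^{\ast}) = \lambda
    \quad \text{for every query with } b_i^{\ast} > 0
\end{align*}
```

Here $\lambda$ is the shadow price: the gain in total utility from one more unit of global budget. Every query that receives budget is funded up to the point where its marginal utility equals $\lambda$. This stationarity condition is necessary at an interior allocation but need not be sufficient when the utilities are not concave, which is likely relevant for threshold-like utilities.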
World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

We propose world-language-action (WLA) models as a new class of embodied foundation models. WLA takes textual instructions, images, and robot states as inputs to jointly predict textual subtasks, subgoal images, and robot actions, conjoining the world modeling interface to learn from extensive egocentric videos as in the world-action model (WAM) and the language reasoning capacities to solve complex long-horizon tasks as in vision-language-action (VLA) models. At the core of WLA lies an autoregressive (AR) Transformer backbone, instead of a bidirectional diffusion Transformer as in WAMs, to predict the next state, comprising the semantic-level textual intention and complementary fine-grained physical dynamics. The physical dynamics are supervised by the world modeling objective based on a dedicated World Expert, and are leveraged to ease the characterization of the state-action correlation for the Action Expert. WLA leverages meta-queries to make the world prediction implicitly impact the action generation so that the former can be disabled during inference. The world prediction can also be activated to enable test-time scaling for improved robot control. Our WLA-0 prototype, with 2B active parameters, achieves 40 ms per inference on an NVIDIA RTX 5090. Evaluations across simulated and real-world environments demonstrate that WLA-0 achieves state-of-the-art multi-task and long-horizon learning abilities, e.g., 92.94% success rate on RoboTwin2.0 Clean and 56.5% success rate on RMBench. WLA-0 also holds the promise to learn novel tasks directly from cross-embodiment robot videos without action annotations.

5
Towards One-to-Many Temporal Grounding

Temporal Grounding (TG) aims to localize video segments corresponding to a textual query. Prior research predominantly focuses on single-segment retrieval. Real-world scenarios, however, often require localizing multiple disjoint segments for a single query, a setting we term One-to-Many Temporal Grounding (OMTG). Previous state-of-the-art MLLMs, optimized for one-to-one settings, struggle in this context, often yielding near-zero scores due to a lack of event cardinality perception. To bridge this gap, we present a systematic solution with three key contributions. First, we establish the first comprehensive OMTG benchmark, introducing Count Accuracy (C-Acc) and Effective Temporal F1 (EtF1) as evaluation metrics. Second, we curate a high-quality OMTG dataset comprising 56k samples through a sophisticated construction pipeline. Third, we develop novel temporal and caption reward functions specifically designed for OMTG. In particular, the caption reward leverages Chain-of-Thought reasoning over dense video captions to explicitly guide policy optimization toward both preciseness and completeness. Extensive experiments show our model achieves a new state-of-the-art EtF1 of 43.65% on OMTG Bench, outperforming Gemini 2.5 Pro and Seed-1.8 by 15.85% and 15.61%, respectively.

4
Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction

Video event prediction (VEP) requires models to infer unobserved future states from partial video evidence. Existing video MLLMs usually verbalize intermediate future reasoning in text space: once visual evidence is verbalized, fine-grained motion, geometry, and interaction cues can be lost, leading to plausible but visually ungrounded hallucinations. We introduce Future-L1, an interleaved latent visual reasoning framework that lets an MLLM alternate between language tokens and continuous latent visual spans during autoregressive decoding. To train this capability, we construct Future-L1-50K by selecting examples where future visual hints help prediction and align latent states to future-frame embeddings, then further optimize sampled latent trajectories with LA-DAPO, a latent-aware RL objective with outcome-contrastive and temporal-diversity rewards. Future-L1 achieves new state-of-the-art results on both benchmarks: on FutureBench, it improves Qwen3-VL-8B from 61.0 to 85.4 and exceeds the previous best Video-CoE by 10.4 points; on TwiFF-Bench, it improves the average score from 2.44 to 3.04. These results suggest that future-oriented video reasoning benefits from preserving intermediate visual semantics in latent space rather than translating every reasoning step into text.

4
OPRD: On-Policy Representation Distillation

On-policy distillation (OPD) supervises the student only in output space by matching next-token probabilities. This output-only paradigm has two limits: (1) sampling variance from Monte Carlo KL estimates over large vocabularies (e.g., Qwen's ~150k tokens) persists throughout training, and (2) it treats the teacher as a black-box, discarding all intermediate hidden states after the LM head. We propose On-Policy Representation Distillation (OPRD), which lifts distillation into hidden-state space by aligning student and teacher representations across selected layers on the same rollouts, bypassing the LM head entirely. Theoretically, OPRD eliminates sampling variance and provides richer per-layer structural information. Empirically, OPRD closes the student-teacher gap on AIME 2024/2025 and AIMO, while output-space OPD baselines plateau below the teacher. OPRD also trains 1.44x faster and uses 54% less memory than top-k OPD. Code: https://github.com/ShenzhiYang2000/OPRD.

4
AdaCodec: A Predictive Visual Code for Video MLLMs

Video is temporally redundant: adjacent frames usually share most objects, background, and layout. Yet existing video multimodal large language models (video MLLMs) usually encode each sampled frame as an independent RGB image, causing visual tokens to repeat content already present in earlier frames. This suggests a more direct video interface: send a full reference frame only when the scene cannot be predicted well from prior context, and otherwise transmit a compact description of inter-frame changes. We call this interface a predictive visual code, and instantiate it for video MLLMs as AdaCodec. AdaCodec spends full visual tokens on a reference frame only when its conditional predictive cost is high; otherwise, it encodes inter-frame changes, including motion and prediction residuals, as compact P-tokens. Across all eleven benchmarks, AdaCodec improves over the Qwen3-VL-8B per-frame RGB baseline at a matched visual-token budget. Even at 1/7 the budget, AdaCodec with 32k tokens surpasses the 224k baseline on all long-video benchmarks; on five general-video benchmarks, it raises the average score while substantially cutting time-to-first-token from 9.26s to 1.62s.

3
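
The reference-versus-change decision in the AdaCodec abstract above resembles predictive video coding: spend a full frame only when prediction from prior context is poor. The sketch below illustrates that decision rule only. It assumes frames are flat lists of pixel values, uses mean absolute difference from the previous frame as a stand-in for "conditional predictive cost", and picks an arbitrary threshold. None of these choices come from the paper.

```python
from typing import List, Optional


def predictive_cost(frame: List[float], previous: List[float]) -> float:
    """Stand-in cost: mean absolute difference from the previous frame."""
    return sum(abs(a - b) for a, b in zip(frame, previous)) / len(frame)


def plan_encoding(frames: List[List[float]], threshold: float) -> List[str]:
    """Label each frame as a full 'reference' or as compact 'p_tokens'."""
    plan: List[str] = []
    previous: Optional[List[float]] = None
    for frame in frames:
        if previous is None or predictive_cost(frame, previous) > threshold:
            plan.append("reference")  # scene is poorly predicted: send it in full
        else:
            plan.append("p_tokens")   # scene is well predicted: send changes only
        previous = frame
    return plan


video = [
    [0.0, 0.0, 0.0, 0.0],  # first frame: nothing to predict from
    [0.0, 0.0, 0.0, 1.0],  # small change
    [9.0, 9.0, 9.0, 9.0],  # scene cut
]
encoding_plan = plan_encoding(video, threshold=1.0)

# Budget ratio quoted in the abstract: 32k tokens against a 224k baseline.
budget_ratio = 32_000 / 224_000
```

The last line simply confirms the abstract's arithmetic: 32k is one seventh of 224k.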
Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

Memory-augmented LLM agents tackle complex long-horizon tasks by recursively summarizing interaction trajectories into compact memory. However, existing approaches typically train these memory policies using outcome-based reinforcement learning, failing to localize where intermediate memory quality degrades. As interactions unfold, ambiguous recursive summaries progressively discard task-relevant information and introduce semantic noise. This exacerbates belief deviation, obscuring the agent's estimate of the latent task state and ultimately derailing long-horizon reasoning. We therefore argue that memory optimization should focus not merely on trajectory-level success, but on the clarity of the belief induced by intermediate summaries. To this end, we introduce Belief Entropy, a self-supervised proxy that probes how uncertain the model remains about the latent task state given its current memory. Based on this proxy, we propose Metacognitive Memory Policy Optimization (MMPO). Instead of relying only on sparse outcome-based signals, MMPO provides fine-grained, memory-specific supervision via explicitly penalizing summaries that induce high epistemic uncertainty. Experiments show that MMPO consistently outperforms existing methods on diverse long-horizon tasks, maintaining 97.1% performance even when scaled to 1.75M-token contexts.

3
SePO: Self-Evolving Prompt Agent for System Prompt Optimization

System prompt optimization improves agent behavior without modifying the underlying model, yielding human-readable, model-agnostic instructions. Existing methods build a prompt agent that refines task agents' system prompts, yet leave the prompt agent's own system prompt hand-engineered and fixed. We propose Self-Evolving Prompt Optimization (SePO), which treats the prompt agent's own system prompt as an optimization target alongside task agents' system prompts. SePO adopts a self-referential design. A single prompt agent improves both task agents' system prompts and its own under an open-ended evolutionary search that maintains an archive of candidate prompts as stepping stones. Training proceeds in two stages: pre-training evolves the prompt agent on a multi-task pool, and fine-tuning then applies it to a target task. Across five benchmarks spanning math (AIME'25), abstract reasoning (ARC-AGI-1), graduate-level science (GPQA), code generation (MBPP), and logic puzzles (Sudoku), SePO consistently outperforms Manual-CoT, TextGrad, and MetaSPO, improving the average accuracy by 4.49 points compared to Manual-CoT. The prompt optimization skill from pre-training also generalizes to tasks beyond the pre-training mixture, rather than memorizing per-task prompts.

3
MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

Large language model (LLM) agents are increasingly applied to long-horizon tasks such as scientific discovery and machine learning engineering (MLE), where sustained self-evolution becomes a key capability. However, existing MLE agents suffer from inter-branch information isolation, memoryless search, and lack of hierarchical control, which together hinder long-horizon optimization. We present MLEvolve, an LLM-based self-evolving multi-agent framework for end-to-end machine learning algorithm discovery. By extending tree search to Progressive MCGS, MLEvolve enables cross-branch information flow through graph-based reference edges and gradually shifts the search from broad exploration to focused exploitation with an entropy-inspired progressive schedule. To allow the agent to evolve with accumulated experience, we introduce Retrospective Memory, which combines a cold-start domain knowledge base with a dynamic global memory for task-specific experience retrieval and reuse. For stable long-horizon iteration, we further decouple strategic planning from code generation with adaptive coding modes. Evaluation on MLE-Bench shows that MLEvolve achieves state-of-the-art performance across multiple dimensions including average medal rate and valid submission rate under a 12-hour budget (half the standard runtime). Moreover, MLEvolve also outperforms specialized algorithm discovery methods including AlphaEvolve on mathematical algorithm optimization tasks, demonstrating strong cross-domain generalization. Our code is available at https://github.com/InternScience/MLEvolve.

3
MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding

Multimodal Large Language Models (MLLMs) have demonstrated significant achievements in general visual question answering (VQA) tasks. However, they remain brittle on mechanical engineering drawings, where high annotation density and weak domain knowledge, compounded by unreliable spatial relation reasoning under strict projection rules and geometric constraints, make decisive cues easy to miss and frequently lead to wrong answers. To bridge this gap, we introduce the first comprehensive mechanical drawing understanding dataset, MechVQA, created through a semi-automated construction and quality-control pipeline. MechVQA contains 3.3k high-density pictures with 21K question-answer pairs, spanning 10 different fine-grained tasks across three capability levels: Recognition, Reasoning, and Judging, providing a testbed to evaluate and improve MLLM understanding on real-world mechanical drawings. On top of MechVQA, we then develop the MechVL model through a multi-stage training paradigm, building a strong domain-specialized baseline. Extensive experimental results demonstrate that MechVL outperforms the strongest closed-source baseline by 7.57 percentage points on the MechVQA total score, significantly enhancing mechanical drawing understanding ability and providing a reusable foundation for deploying MLLMs in mechanical design and inspection scenarios.

2
Towards Truly Multilingual ASR: Generalizing Code-Switching ASR to Unseen Language Pairs

Automatic Speech Recognition (ASR) has become a key technology for human-AI interaction. However, code-switching ASR (CS-ASR) remains particularly challenging due to the severe scarcity of multilingual CS speech resources across diverse language pairs. Existing approaches primarily improve CS-ASR performance through synthetic CS speech generation or pair-specific fine-tuning on limited bilingual datasets. Nevertheless, these approaches face an inherent scalability limitation, as support for CS must be developed separately for language pairs whose number grows combinatorially with the number of supported languages. In this work, we investigate whether CS capabilities learned from a limited set of seen language pairs can generalize to unseen language pairs through model merging and domain generalization methods. Our experiments show that merged bilingual CS-ASR models modestly generalize to unseen language pairs, suggesting limited transfer of bilingual CS capabilities across language pairs.

2
EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management

Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science. However, existing approaches remain fundamentally limited by their static action sets and lack of principled long-horizon context management, hindering their ability to accumulate reusable experience across tasks and operate reliably in multi-stage, iterative data science pipelines. To address these challenges, we introduce EvoDS, a self-evolving autonomous data science agent that learns to expand its skills and adaptively manage long-term context through agentic reinforcement learning. Specifically, EvoDS introduces two key strategies: (1) an Autonomous Skill Acquisition (ASA) mechanism, which enables agents to synthesize, validate, and reuse executable skills; and (2) an Adaptive Context Compression (ACC) strategy, which treats context management as a learned control problem rather than passive truncation. These strategies are orchestrated within a two-stage multi-agent training scheme, enabling EvoDS to autonomously improve over time. Theoretically, we prove that EvoDS's hierarchical design reduces tool-selection error, and its optimization objective aligns with an information bottleneck principle, ensuring efficient context use. Empirically, EvoDS outperforms state-of-the-art open-source data science agents by an average of 28.9% across four diverse benchmarks while eliminating out-of-token failures. Our code and data are available at https://github.com/usail-hkust/EvoDS.

2
Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions

Large language models are increasingly used to simulate social media users and infer how individuals may respond to online discussions. However, it remains unclear whether these simulations reflect precise user-specific beliefs or whether they are highly sensitive to semantically independent changes in conversational contexts. In this work, we study counterfactual context revision as a framework for auditing LLM-based stance simulation. Given an original online conversation, we first infer a target user's stance toward a specific topic. We then apply controlled revision strategies to the conversational context and simulate the user's stance again under the revised context. We compare text-only revision strategies with a multimodal one that incorporates meme-based context and evaluate two main effectiveness metrics, i.e., average directional stance shift and stance transition rate. The results reveal effective and robust stance transitions in both text-only and multimodal strategies across different polarization-preference mechanisms. Our study contributes an evaluation framework for understanding the context sensitivity of LLM-based stance simulation. More broadly, it highlights both the promise and risk of using LLMs to simulate online opinion dynamics.

1
Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents

Financial AI agents often fail for a simple reason: they make users carry the complexity. A user must repeatedly restate goals, risk preferences, portfolio context, past judgments, and shifting market assumptions, while the agent answers, retrieves, acts, and forgets. In finance, this is not just inconvenient. In tasks such as market analysis, copy-trading review, and trade preparation, forgotten context and stale memory can create latency, repeated errors, weak auditability, and unsafe decisions. We propose the interaction-native knowledge harness (InKH), an architecture for financial LLM agents that absorbs complexity into the system. InKH converts user, market, portfolio, and tool events into structured operational knowledge. It uses passive knowledge injection to assemble a bounded working context buffer before the main model step, temporal graph memory for low-latency retrieval, a wiki audit surface for human-readable governance, and background extraction with maturity, decay, and write-time invalidation. We evaluate InKH on a reproducible controlled synthetic benchmark with 24 random seeds, 4 rounds, 80 episodes per round, and 6 baselines, producing 46,080 baseline-conditioned evaluations. InKH achieves mean task quality of 0.815 at 900 ms latency. Compared with agent-driven wiki-walk memory, it reduces latency by 82.95 percent, token cost by 82.29 percent, and stale-knowledge usage by 96.58 percent, while improving quality by 0.108 and traceability by 0.461. Compared with a temporal-graph system without invalidation, it improves quality by 0.050 and reduces stale-memory usage by 96.58 percent with comparable serving cost. The results support a design thesis for financial AI: adoption happens when complexity is absorbed by the system rather than transferred to the user. The benchmark validates architecture-level behavior, not live trading performance.
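The evaluation count quoted above follows directly from the stated benchmark dimensions. A quick check, using only the numbers given in the abstract (the variable names are illustrative):

```python
# Benchmark dimensions as stated in the abstract.
seeds = 24
rounds = 4
episodes_per_round = 80
baselines = 6

total_evaluations = seeds * rounds * episodes_per_round * baselines
print(total_evaluations)  # 46080
```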

1
SEAOTTER: Sensor Embedded Autoencoding with One-Time Transcode for Efficient Reconstruction

In robotics systems, vast amounts of visual data are easily captured at high resolution using low-cost, low-power hardware. Yet, limited bandwidth and on-device compute resources prevent full utilization when transmitted via conventional codecs like JPEG/MPEG. Newer codecs, like AV1/AVIF, improve the rate-distortion trade-off, but demand far more resources for encoding, which is impractical without custom ASICs. Recent asymmetric autoencoders deliver high quality under extreme power and bandwidth constraints, but add prohibitive decoding cost and use bespoke formats that ignore decades of infrastructure built around standards like JPEG. To address these limitations, we introduce a compression framework for cloud robotics based on a Sensor Embedded Autoencoder paired with a One-Time Transcode for Efficient Reconstruction (SEAOTTER). Because the sensor, cloud, and consumer stages face very different power and bandwidth budgets, SEAOTTER combines the compactness of a learned latent with the broad usability of a standard JPEG file. Since naive transcoding degrades performance, we propose a learnable JPEG color and quantization transform that enables increased accuracy for global, dense, and vision-language-based perception. Using SEAOTTER, we train both general-purpose and task-aware transcoding pipelines for a pre-trained, frozen encoder. At a compression ratio of 200:1 and compared to AVIF, we observe 7 times faster encoding, 3.5 times faster decoding, and +8% ImageNet top-1 accuracy, while retaining compatibility with JPEG infrastructure. Our code is available at https://github.com/UT-SysML/seaotter.

1
Multimodal Music Recommendation System using LLMs

Music recommendation systems typically treat songs as opaque tokens, relying on collaborative interaction histories and overlooking semantic and acoustic content. Prior work has explored LLM-augmented, multimodal, and text-enhanced approaches to sequential recommendation, and while some methods partially combine semantic, acoustic, or engagement signals, none jointly model all three within a unified LLM-based sequential reasoning framework that grounds recommendations in actual song content. In this work, we propose a multimodal framework for session-based music recommendation that enriches the LastFM-1K dataset with three complementary signals: (1) audio and lyric embeddings extracted using pretrained music and text representation models, (2) LLM-generated semantic metadata using the MGPHot annotation schema, and (3) listening completion ratios. We adopt the E4SRec framework and extend it with multimodal features and different item ID encoder backbones, including SASRec, BERT4Rec, and GRU4Rec. We further extend the LLM backbone options with LLaMa-2-13B, Qwen2.5-7B-Instruct, and LLaMa-3-70B in both zero-shot and fine-tuned settings. Our experiments show that integrating content-based features improves over ID-only baselines by up to 95% in Recall and 79% in NDCG. Moreover, our experiments show that naive multimodal fusion does not always yield additive improvements, highlighting challenges in cross-modal integration. We release a large-scale multimodal benchmark for music recommendation.

1
Is This Edit Correct? A Multi-Dimensional Benchmark for Reasoning-Aware Image Editing

Diffusion-based image editing has achieved strong visual fidelity under natural language instructions, yet most existing systems still operate at the level of surface instruction following, without reasoning about the implicit contextual constraints embedded in real user requests. This often leads to visually plausible but logically inconsistent edits. In this work, we introduce RE-Edit, a benchmark for REasoning-aware image Editing that evaluates image editing systems across five complementary reasoning dimensions: physical, environmental, cultural, causal, and referential. RE-Edit comprises 1,000 carefully curated samples, each designed such that visual plausibility alone is insufficient and correct editing requires satisfying implicit logical constraints. To support fine-grained analysis, we establish dimension-aligned evaluation criteria and conduct a comprehensive study of ten open-source and two commercial image editing models. Our results show that even advanced systems frequently struggle with implicit multi-dimensional reasoning despite producing high-quality visuals. We further present a lightweight reasoning-guided post-edit baseline as an initial exploration, illustrating how inserting explicit reasoning can help mitigate such failures in a model-agnostic manner.

1
Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination

Reinforcement Learning with Verifiable Rewards (RLVR) has recently emerged as the cornerstone for shaping the remarkable coding abilities of Large Language Models (LLMs). However, the scalability of RLVR is severely constrained by the scarcity of sufficiently challenging verifiable code tasks that target near the model's edge of competence. Prior studies often rely on heuristic seed expansions for data synthesis, which severely limits both novelty and difficulty. Consequently, the training value of such data fails to scale proportionally with the size of its synthesis. To this end, we propose Atomic Decomposition and Recombination (ADR), a novel framework that generates verifiable code tasks via decomposition into atomic elements and controlled recombination, thereby enabling the generation of genuinely novel and challenging verifiable code tasks. Experiments and analysis demonstrate that ADR achieves superior originality, difficulty, diversity, and test quality over existing baselines, and consistently delivers greater improvements in code ability across RLVR in diverse downstream domains, including algorithmic programming, tool usage, and data science. Our work sheds light on a new paradigm for novel code task synthesis and scalable RLVR training.

1
Flash-WAM: Modality-Aware Distillation for World Action Models

World-action models (WAMs) jointly generate future video and robot actions through iterative diffusion, achieving strong performance on manipulation benchmarks but requiring tens of denoising steps, a cost that precludes real-time control. Step distillation has emerged as the natural remedy, but off-the-shelf methods break down in the joint video-action setting because video and action streams use different SNR-shifted noise schedules and reach training with substantially different marginal noise distributions, an asymmetry that single-modality distillation methods cannot accommodate. We introduce Flash-WAM, a modality-aware step-distillation framework inspired by consistency distillation that selects the consistency function for each modality to match its noise regime: a linear-gradient-scaling parametrization for the action stream's low-noise regime, paired with a variance-preserving parametrization for the video stream's high-noise regime, grounded in a structural analysis of the consistency-function family that characterizes the achievable gradient scaling under the consistency boundary condition. Instantiated on LingBot-VA, Flash-WAM compresses inference to a single step in each modality. On RoboTwin 2.0, this reduces per-chunk latency from 8.1 seconds to 348 ms on NVIDIA L40S, a 23× speedup that enables real-time inference. Flash-WAM preserves task success on simulation benchmarks (85.5% RoboTwin 2.0, 95.7% LIBERO) and substantially recovers real-world performance (60% average on a Unitree G1 humanoid robot), while naive consistency distillation drops to 24% at the same step budget.
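The reported speedup can be checked from the two latencies quoted in the abstract. The snippet below is only a consistency check on those figures; the variable names are illustrative.

```python
# Per-chunk latencies as reported for RoboTwin 2.0 on an NVIDIA L40S.
baseline_latency_s = 8.1
distilled_latency_s = 0.348  # 348 ms

speedup = baseline_latency_s / distilled_latency_s
print(f"{speedup:.1f}x")  # about 23.3x, consistent with the stated 23x
```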

1
Discrete-WAM: Unified Discrete Vision-Action Token Editing for World-Policy Learning

Autonomous driving requires reasoning about how ego actions shape the evolution of the surrounding world. However, most end-to-end methods rely on direct state-to-action mappings, capturing correlations without explicitly modeling action-conditioned dynamics. Conversely, continuous-latent world models often lack compositional structure for causal reasoning across counterfactual futures. We introduce Discrete-WAM, a unified latent vision-action world policy that represents future visual states and ego actions as aligned discrete tokens, enabling compositional causal reasoning across alternative futures. Built upon this unified discrete alignment, Discrete-WAM establishes a shared discrete diffusion framework with unified generative tasks, jointly formulating world modeling, world-action policy, and hierarchical decision-enabled policy, supporting compositional generalization across diverse driving scenarios. Experiments on large-scale autonomous-driving benchmarks show that Discrete-WAM achieves competitive performance while supporting controllable generation and counterfactual reasoning, offering a principled path toward more reliable decision-making.

1
Latent Reasoning with Normalizing Flows

Large language models often improve reasoning by generating explicit chain-of-thought (CoT), demonstrating the importance of intermediate computation. However, textual CoT forces this computation through a discrete, serial, and communication-oriented token stream: each reasoning step must be verbalized before the model can proceed, even when the underlying update is semantic, uncertain, or only partially formed. Latent reasoning offers a higher-bandwidth alternative by performing intermediate computation in compact continuous states before committing to text. Yet existing latent-reasoning methods often sacrifice key advantages that make CoT effective in autoregressive language models, including native left-to-right generation, probabilistic sampling, compatibility with KV-cache decoding, and tractable likelihood estimation. We propose NF-CoT, a latent reasoning framework that preserves these advantages by modeling continuous thoughts with normalizing flows. NF-CoT instantiates a TARFlow-style normalizing flow inside the LLM backbone, defining a tractable probability model over compact continuous thoughts distilled from explicit CoT. Continuous-thought positions are generated by an NF head, while text positions are generated by the standard LM head within the same causal stream. This design provides exact likelihoods for latent thoughts, enables probabilistic left-to-right decoding with the original KV cache, and supports direct policy-gradient optimization in the latent reasoning space. On code-generation benchmarks, NF-CoT improves pass rates over explicit-CoT and prior latent-reasoning baselines while substantially reducing intermediate-reasoning cost.

1
Video2LoRA: Parametric Video Internalization for Vision-Language Models

Processing video in vision-language models is expensive: each frame occupies hundreds of tokens, and inference cost scales with every frame and every repeated query. We introduce Video2LoRA, a method for parametric video internalization. A perceiver hypernetwork reads the intermediate representations produced layer-by-layer as a frozen VLM encodes a video, and generates a Low-Rank Adaptation (LoRA) adapter in a single forward pass. Unlike standard LoRA fine-tuning, which requires iterative gradient updates, Video2LoRA predicts these weights directly from the video. Trained for SmolVLM2 500M and 2.2B on video summarization and captioning, Video2LoRA enables the same frozen VLM to answer queries from the adapter alone, with zero visual tokens in its context at query time. Video2LoRA is statistically non-inferior and equivalent to direct video-in-context inference across all five captioning benchmarks at both model scales, and across seven of eight video question answering benchmark-scale pairings. Although trained only on 12 frames at 384px, it remains stable up to 1,024 frames and 1024px, where direct video-in-context inference often degenerates. Across this sweep, it reduces answer-time visual-token load by up to 1,500x and query TTFT by 6-80x, while preserving video-faithful outputs. We also find that independently generated adapters for non-overlapping video segments can compose in rank space, suggesting a path toward chunked long-video internalization.

1
ForeSci: Evaluating LLM Agents for Forward-Looking AI Research Judgment

AI research often requires decisions before future evidence exists: which bottleneck to attack, which direction to pursue, or where a project should be positioned. We introduce ForeSci, a temporally controlled benchmark for evaluating whether LLM agents can make such forward-looking research judgements from historical evidence. ForeSci contains 500 tasks across four fast-moving AI domains and four decision families. Each task is paired with a cutoff-aligned offline knowledge base; post-cutoff papers are hidden during generation and used only for validation. To avoid random future-event prediction, tasks are derived from pre-cutoff taxonomy branches and evidence signals, and answer-generation backbones are selected to precede the task cutoffs. We evaluate native LLMs, Hybrid RAG, and three research-agent adaptations across four backbones. Results show that explicit evidence organization improves traceability and factual support, but gains depend strongly on the decision family. Diagnostics reveal a recurring evidence-decision decoupling: agents may cite relevant evidence while forecasting the wrong research object. ForeSci turns forward-looking AI research judgement into a controlled benchmark for evaluating research agents as decision-making systems.

0
Quality-Guided Semi-Supervised Learning for Medical Image Segmentation

Training accurate medical image segmentation models requires large amounts of densely annotated data, which is costly and time-consuming to obtain. Semi-supervised learning (SSL) alleviates this by learning from both abundant unlabeled data and limited labeled data. However, most modern SSL methods rely on pseudolabels for unlabeled data, and typically assess their reliability through model confidence or uncertainty, measures that are self-referential and lack explicit grounding in segmentation quality. Instead, we propose a quality-guided SSL framework that trains a dedicated network to estimate segmentation quality from image-mask pairs. The predictor is trained on variable-quality masks generated through synthetic corruptions augmented with imperfect outputs from partially trained segmentation models, capturing realistic error patterns encountered during training. We integrate the quality predictor into SSL through two complementary mechanisms: a quality-aware regularization loss and a quality-based pseudolabel sample reweighting scheme. We show that our method serves as a drop-in enhancement to existing SSL frameworks. Extensive experiments across five datasets and multiple architectures demonstrate consistent improvements over competing SSL methods, advancing the state-of-the-art in semi-supervised medical image segmentation.
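One simple way to picture quality-based pseudolabel reweighting is a weighted mean of per-sample losses, with each weight taken from a predicted quality score. The abstract does not specify the weighting scheme, so the normalization below and the name `reweighted_loss` are assumptions, and the numbers are made-up examples.

```python
def reweighted_loss(losses, qualities):
    """Quality-weighted mean of per-sample pseudolabel losses.

    qualities are hypothetical predicted quality scores in [0, 1];
    low-quality pseudolabels contribute less to the total.
    """
    total_weight = sum(qualities)
    if total_weight == 0:
        return 0.0  # no trustworthy pseudolabels in this batch
    return sum(l * q for l, q in zip(losses, qualities)) / total_weight


losses = [0.2, 1.5, 0.4]
qualities = [0.9, 0.1, 0.8]  # the second pseudolabel is judged unreliable

plain_mean = sum(losses) / len(losses)
print(reweighted_loss(losses, qualities) < plain_mean)
```

The high-loss sample with a low quality score is down-weighted, so the reweighted loss falls below the plain mean in this example.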

0
05

PRODUCT HUNT

Product Hunt - June 5, 2026

Product Hunt Daily Feed: Featuring noteworthy tech launches.

Leni

The world’s most accurate AI for investors

0
SellerClaw

A team of AI agents that runs your stores across channels

0
Ideogram 4.0

Generate design-ready image with open weight, layout control

0
Nemotron 3 Ultra by NVIDIA

Powers faster, efficient reasoning for long-running agents

0
Minimi

Your ambient memory for Claude

0
Veltrix AI

AI finance copilot for cash flow, margins, and growth

0
Treadmill Pro

Control your treadmill from your iPhone, wirelessly

0
LocalClicky

Control your Mac with your voice locally

0
Agent Mode on Arena

Get real-world tasks done with autonomous AI agents

0
Microsoft MAI-Voice-2

Expressive TTS with voice cloning in 15 languages

0
Lumo Studios

Build Decks that Speak for Themselves

0
Moodloom

Ad-free Pinterest Alternative with AI content filtering

0
Clarafy

Type messy and have it instantly polished

0
Recursi

Self improving vibe coding env with no API fees

0
VisionSync

Where strategy execution meets the people doing the work

0
FloatPic

Ultra-minimalist, borderless macOS native image viewer

0
Agent Browser Shield

Block prompt inject & cut token costs for AI browser agents

0
Split Ninja

Cut, extract, mute, and split videos locally

0
Cignara

AI Agents for Fortune 500 grade customer support

0
Boxes.dev

Run Claude Code and Codex in your own cloud environment

0
TimeTuna.com

If Calendly had gorgeous video backgrounds

0
Deliveryman.ai

Cold email infrastructure on autopilot without Gsuite

0
Mailwarm 2.0

The email warmup tool, upgraded for deliverability.

0
Empromptu AI

Train Fine Tuned Models With AI Apps You're Already Building

0
Astra Autonomous Pentest

AI agents that find, validate, and fix every vulnerability

0
Extella.AI

Agentic platform that evolves & builds reusable systems

0
Curata

A shared workspace for AI agents and humans.

0
Audex Trace

Trace what Apple Music is actually playing

0
Walrus Memory

Enable agents to keep context & work across apps + sessions

0
PlugTalk

Your Mac talks back when you plug things in

0
Chloe by Close

AI agent built into your CRM who works leads for you

0
Smart Runner

Your training plan, rewritten after every run

0
Kai for Chrome

Local meeting transcription with no account needed

0
Carbon Voice Speed Dial

Get your whole team (humans and agents!) on speed dial

0
Keen Code

A context-efficient CLI coding agent built by agents

0
ChatPilot

Bulk delete, archive & timestamp your ChatGPT conversations

0
DotBGE

Local-first file encryption for iOS, CLI, and agents

0
Sun

Collaborative voice API for agents

0
Perplexity Personal Computer for Windows

Run AI agents across your local files and apps on Windows

0
Koji by Brilliant

A world-class personal tutor for every home

0
Intelligent Terminal

Windows Terminal with native agent integration

0
AppWizzy

Rent a private VM with Codex to build production apps

0
Novus

Catch and fix usability issues automatically as you ship

0
Build Club Campus

Virtual AI School: Upskill in AI and Become Great at it Fast

0
Google Gemma 4 12B

Run multimodal AI locally with an encoder-free architecture

0
Gather

Save it once, never lose it again

0
Basedash Semantic Layer

Define metrics once. Use them everywhere.

0
Franz 6

All your messaging apps in one window — with private AI

0
Hermes Desktop

The agent that grows with you

0
StampCam

Turn any photo into a postage stamp or sticker

0
06

TECHMEME

Techmeme - June 5, 2026

Techmeme Digest: Major tech headlines and industry conversations.

What to expect at WWDC 2026: an overhauled Siri, a Siri app, a slew of new AI capabilities, OS updates focused on reliability and responsiveness, and more (Mark Gurman/Bloomberg)
Source: Techmeme · Published: Jun 5, 2026

Mark Gurman / Bloomberg: The iPhone maker will attempt its artificial intelligence turnaround on Monday. Apple Inc. will unveil a new artificial intelligence strategy …

Sources: US officials discussed the US taking stakes in major AI companies after Sam Altman pitched the idea in 2025; Altman met Bernie Sanders on Wednesday (Wall Street Journal)
Source: Techmeme · Published: Jun 5, 2026

Wall Street Journal: The talks have been with artificial-intelligence leaders such as OpenAI CEO Sam Altman, who pitched the idea

A research team that includes Huawei says it successfully used Huawei's Ascend 910C chips for DeepSeek V4 Pro model's post-training, amid increased US sanctions (Coco Feng/South China Morning Post)
Source: Techmeme · Published: Jun 5, 2026

Coco Feng / South China Morning Post: While Chinese chipmakers have found success in supporting AI inference, they are struggling with the far more complex process of training

FCC Chair Brendan Carr says the agency will review E-Rate, a $3B annual program subsidizing school and library internet access, amid kids' screen time concerns (Finya Swai/The Hill)
Source: Techmeme · Published: Jun 5, 2026

Finya Swai / The Hill: The Federal Communications Commission (FCC) is reviewing a $3 billion annual program that subsidizes internet access for schools and libraries …

Chinese regulator CSRC tightens oversight of the country's ~$3.4T private fund industry; the CSRC says it will encourage tech-focused VC investments and more (Reuters)
Source: Techmeme · Published: Jun 5, 2026

Reuters: China on Friday tightened oversight of the country's 23 trillion yuan ($3.40 trillion) private fund industry …

Sources say a months-long dispute between the White House and Anthropic is showing signs of easing across the US government as the startup prepares for its IPO (Reuters)
Source: Techmeme · Published: Jun 5, 2026

Reuters: A months-long dispute between Trump administration officials and AI firm Anthropic is showing signs of easing across parts …

Illinois Governor JB Pritzker plans to temporarily halt tax breaks for data centers from July 1, calling on state lawmakers to create a development framework (Natasha Korecki/NBC News)
Source: Techmeme · Published: Jun 5, 2026

Natasha Korecki / NBC News: Pritzker, who is widely viewed as having 2028 White House aspirations, is tapping into an issue seen as important to voters.

India-based Innefu Labs, which builds AI-powered software for national defense and enterprise security infrastructure, raised a $30M Series B led by Panthera (The Economic Times)
Source: Techmeme · Published: Jun 5, 2026

The Economic Times: The capital infusion, completed through a combination of primary and secondary transactions from Panthera, positions …

University of Cambridge researchers say they have developed the first vaccine with a key component entirely designed by AI and subsequently trialed it in humans (James Gallagher/BBC)
Source: Techmeme · Published: Jun 5, 2026

James Gallagher / BBC: Artificial intelligence has been used to develop a “fundamentally new” type of vaccine that could protect against large swathes …

OpenAI confirms it will comply with President Trump's EO that asks AI companies to allow the US government to assess their models' capabilities before release (Michael Considine/CNBC)
Source: Techmeme · Published: Jun 5, 2026

Michael Considine / CNBC: OpenAI has confirmed it will comply with Donald Trump's executive order that asks AI companies to allow the federal government …

Apple-funded study: the App Store had $1.4T in billings and sales globally in 2025; apps with consumer AI features saw 4x more billings growth than other apps (Ryan Christoffel/9to5Mac)
Source: Techmeme · Published: Jun 5, 2026

Ryan Christoffel / 9to5Mac: Apple shares update on 2025 App Store success. “Developers are the heartbeat of the App Store, and this year's incredible milestone …

SpaceX says Starlink now has 12M+ "active customers with high-speed internet" across 160+ countries, territories, and other markets, up from 9M in December 2025 (@starlink)
Source: Techmeme · Published: Jun 5, 2026

@starlink: Starlink is connecting more than 12M active customers with high-speed internet across 160+ countries, territories and many other markets. Thank you to all our customers around the world! 🛰️🌎❤️ → https://starlink.com/12M

Sources: data center developer Switch is in talks to raise billions of dollars from private equity firms, including Brookfield and KKR, at a $50B+ valuation (The Information)
Source: Techmeme · Published: Jun 5, 2026

The Information: Data center developer Switch is in talks to raise billions of dollars at a valuation of at least $50 billion …

Elon Musk petitioned the FTC in May to end its 2022 order restricting Twitter's data use, claiming Twitter no longer exists as X merged with xAI and then SpaceX (Ashley Belanger/Ars Technica)
Source: Techmeme · Published: Jun 5, 2026

Ashley Belanger / Ars Technica: Critics hope to keep Elon Musk from escaping a strict data-privacy order imposed by the Federal Trade Commission (FTC) shortly before he took over Twitter.

Analysis: Meta discreetly added code for an unreleased "NameTag" face-recognition system for its AI smart glasses over multiple Meta AI app updates in 2026 (Wired)
Source: Techmeme · Published: Jun 5, 2026

Wired: Code reviewed by WIRED uncovered an unreleased face-recognition system embedded in Meta's smart glasses platform.

07

STARTUP ARCHIVE

07.00
STARTUP ARCHIVE

Startup News - June 5, 2026

Startup News Roundup: Aggregating key funding and launch updates.

Marc Andreessen on the 5 personality traits of an innovator
Source: Startup · Published: Mar 31, 2026

“When you’re talking about real innovators—people who actually do really creative, breakthrough work—I think you’re talking about a couple things:”

Steve Jobs explains the importance of both thinking and doing
Source: Startup · Published: Mar 30, 2026

“The doers are the major thinkers. The people who really create the things that change this industry are both the thinker-doer in one person.”

Tobi Lutke explains what the VCs who passed on Shopify got wrong
Source: Startup · Published: Mar 27, 2026

“What a lot of free-market thinkers don’t understand is that between the demand and eventual supply lies friction.”

Sam Altman explains how he decides to invest in a startup after 10 minutes
Source: Startup · Published: Mar 26, 2026

“Does this person have the potential to be the next Mark Zuckerberg?… [You don’t get to] 100% accuracy, obviously, but it’s good enough that our business model works.”

Jony Ive recounts the time Steve Jobs called him vain
Source: Startup · Published: Mar 25, 2026

In the clip below, Jony Ive recounts the time he asked Steve Jobs to be less harsh in his critique of a piece of work.

Jeff Bezos’s two pieces of advice for aspiring entrepreneurs
Source: Startup · Published: Mar 24, 2026

“The advice that I would give entrepreneurs is don't chase the hot new thing. It's so hard to catch something that everybody already knows is hot.”

Elad Gil: “Things that work tend to work pretty fast”
Source: Startup · Published: Mar 23, 2026

“I do think there’s a bit of a myth in Silicon Valley that you should keep grinding no matter what and it’s just about perseverance, and I think that’s really bad advice.”

Paul Graham on why starting with a “small, intense fire” is the key to startup growth
Source: Startup · Published: Mar 20, 2026

"You have to know who those first users are and how you're going to get them."

Keith Rabois on how to identify great talent
Source: Startup · Published: Mar 19, 2026

“What you want to do with every single employee every single day is expand the scope of their responsibilities until it breaks… and that’s the role they should stay in.”

Wealthfront CEO on why advertising spend makes it harder to find product/market fit
Source: Startup · Published: Mar 18, 2026

“The way that you know you have product/market fit is if you have exponential organic growth.”

Eric Schmidt on why most companies get strategy wrong
Source: Startup · Published: Mar 17, 2026

“Work very, very hard to figure out what the world’s going to look like in five years. What will people be doing? What will your customers want? Where will costs be?”

Mark Zuckerberg: “You can’t 80/20 everything”
Source: Startup · Published: Mar 16, 2026

"There’s the famous 80/20 rule where you get 80% of the benefit by doing 20% of the work, but you can’t just 80/20 everything. There have to be certain things that you are just the best at."

Marc Andreessen on Mark Zuckerberg’s founder “superpower”
Source: Startup · Published: Mar 13, 2026

“A great superpower that Mark Zuckerberg has that is probably not well-understood enough is he does not get emotionally upset in stressful situations”

Sam Altman explains how to come up with a great startup idea
Source: Startup · Published: Mar 12, 2026

"If you start a startup without a good idea… you’ll be under pressure to make something up and it won’t work that well."

Jeff Bezos on the problems with proxies and managing to metrics
Source: Startup · Published: Mar 11, 2026

“One of the things that happens in business is that you develop certain things that you’re managing to—a typical case would be a metric. And that metric isn’t the real underlying thing.”

Airbnb founder Brian Chesky on how to design an amazing user experience
Source: Startup · Published: Mar 10, 2026

“If you can design something really amazing using the hand-crafted part of your brain, then you can reverse-engineer how to industrialize this millions of times over.”

Spencer Rascoff: “I will never invest in a consumer startup with paid marketing”
Source: Startup · Published: Mar 9, 2026

“If you’re actually trying to grow a product, the best levers for doing that are often within the product itself.”

Patrick Collison explains why it sometimes makes sense to quit
Source: Startup · Published: Mar 6, 2026

“One thing I’ve learned myself the hard way, is that it is easier to tear down a company and restart it in Silicon Valley, than it is to constantly try to pivot or keep something alive.”

Jeff Bezos recounts the time he called Amazon’s customer service number mid-meeting to prove a metric was wrong
Source: Startup · Published: Mar 5, 2026

“I have a saying, which is when the data and the anecdotes disagree, the anecdotes are usually right”

Ben Horowitz: “Nobody was born a great manager. It’s a very unnatural job.”
Source: Startup · Published: Mar 4, 2026

“If you can’t build a great product, it doesn’t matter if you can build a great company.”

03

ALSO TODAY

3 MORE SOURCES
08

SOLIDOT

08.00
SOLIDOT

Solidot News - June 5, 2026

Solidot Feed: Highlighting essential tech & open-source news.

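Bumblebees can use tools to solve problems

According to a study published in Science, bumblebees can use tools to solve problems. Insects have thus joined the ranks of animals able to solve the "box and banana" problem, a sign of basic intelligence. In the classic version, a chimpanzee stacks boxes to reach a banana that was previously out of reach. For the new study, researchers adapted the problem for bumblebees: a bee has to roll a polystyrene ball to a specific spot, then climb onto it to reach an artificial flower on a low ceiling. The bumblebees in the experiment were only a few weeks old, and the researchers trained them to associate the artificial flower with a sugar-water reward. In the basic test, 75% of the bees reached the flower; in a more complex test, 23 of 30 bees succeeded. The researchers note that even with very small brains, insects can flexibly solve a variety of new problems.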

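Bot HTTP requests surpass those of humans

According to Cloudflare statistics, bot traffic measured by HTTP requests now exceeds human traffic, although messy data makes it unclear exactly when bots overtook humans. Bots currently account for 57.5% of traffic and humans for 42.5%. Cloudflare's count covers AI agents. These browse the web on behalf of humans, reading product pages, checking prices, and carrying out multi-step tasks such as comparing flights; they crawl and index web content, but for large AI models rather than search engines; and they act as personal assistants that order food, compare prices, shop, and handle customer service. Measured by total time spent in apps, streaming media, and infinite-scroll feeds, human users remain the dominant group. By country or region, Gibraltar has the highest share of bot traffic (92.1%), followed by Singapore (76.4%) and Iran (76.4%); Iran's figure may reflect its large number of VPN users.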

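Apple says the App Store ecosystem has passed $1.4 trillion

Apple announced that the global App Store ecosystem facilitated more than $1.4 trillion in developer billings and sales in 2025. More than 90% of those billings and sales went entirely to developers, with no commission paid to Apple. Apple does not disclose App Store revenue separately and instead reports it within its Services business. Services contributed $109.1 billion in fiscal 2025, about 26% of Apple's total revenue of $416.1 billion. The iPhone was the largest business, with revenue of $209.5 billion. According to Analysis Group, $149 billion of the $1.4 trillion came from digital goods and services and $1.1 trillion from physical goods and services. China contributed the most sales at $562 billion, followed by the United States at $453 billion, Europe at $184 billion, and Japan at $52 billion.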

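Google seeks to release tens of millions of sterile male mosquitoes in California and Florida

Debug, a company under Google, is seeking government permission to release 32 million male mosquitoes in California and Florida. The males carry Wolbachia bacteria, which cause cytoplasmic incompatibility: their sperm cannot fertilize the eggs of wild females. In theory, this shrinks the mosquito population generation by generation. Male mosquitoes do not bite people, only females do, so Debug would not be releasing large numbers of blood-sucking insects. Debug is awaiting approval from the U.S. Environmental Protection Agency, and the public comment period closes on June 5. Many of the comments submitted so far express conspiracy theories, with commenters insisting that people are not lab rats.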

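Japan plans to rebuild 2 to 5 nuclear reactors by 2049

The Japanese government plans to rebuild 2 to 5 nuclear reactors already slated for decommissioning by 2049, rising to 11 to 14 by 2059. The backdrop is the expectation that the spread of AI will drive up electricity demand. Japan's national nuclear policy has shifted from reducing dependence, the line adopted after the 2011 accident at Tokyo Electric Power's Fukushima Daiichi plant, to making maximum use of nuclear power. The Basic Energy Plan, revised in 2025, sets a target of nuclear power supplying 20% of the domestic electricity mix in fiscal 2040. Reactors may operate for at most 60 years, and some Japanese units have already run for more than 50. Restarting existing reactors alone can no longer meet the target, so rebuilding or new construction is needed. Currently 24 reactors at 11 nuclear plants in Japan are being decommissioned. Among them, Kansai Electric Power's Mihama plant (Fukui Prefecture) and Kyushu Electric Power's Sendai plant (Kagoshima Prefecture) are seen as leading candidates for rebuilding.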

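rsync project in dispute over AI-assisted coding

A recent release of rsync, the widely used backup tool, caused incremental backups to fail for some users. While inspecting the code, users found that rsync maintainer Andrew Tridgell has recently been making heavy use of AI-assisted programming: dozens of commits in the project list tridge and claude as authors, where tridge is Andrew Tridgell and claude is Claude, Anthropic's AI assistant. The discovery immediately set off a debate over AI-generated code. Tridgell responded on his personal blog, acknowledging that he has been using AI heavily for coding of late. He rejected the criticism, saying critics were jumping to conclusions without understanding how the AI tools are actually being used. He said he designs the framework himself and manually reviews the AI-generated code, handing only the tedious work to AI, and described himself as a software engineer with 40 years of experience. Tridgell said he will keep using AI tools.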

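Apple introduces age verification in Texas

Apple began age verification in Texas on Thursday, June 4, to comply with the state's App Store Accountability Act (SB 2420). A judge blocked the law from taking effect last December, but an appeals court overturned that ruling. Apple has long resisted verifying ages in its App Store, but it has announced plans to implement age verification to comply with laws in Utah, Louisiana, Brazil, Australia, Singapore, the United Kingdom, and elsewhere. Google is also required to make similar changes to the Play Store. When creating a new Apple account, users in Texas must verify that they are at least 18 years old using a credit card or government-issued ID. Apple may also verify a user's age automatically based on factors such as how long ago the account was registered and whether a credit card is linked.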

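AI is not conscious

In an article in The Atlantic, the noted science fiction writer Ted Chiang argues that AI is not conscious and is merely playing a role-playing game. Anthropic is regarded as an AI giant, but what it truly excels at may be anthropomorphization. The ability of large models to generate fluent text does not mean they are conscious, although the companies selling them keep encouraging that misconception. Every word a model outputs is generated in exactly the same way. Deepfakes usually refer to photos, audio, and video, but in discussions of consciousness, text should also be treated as a deepfake medium. The main difference between a deepfake photo and a conversation with a large model is that the former is meant to deceive others, while the latter is more a matter of self-deception. Chiang holds that consciousness requires subjective experience, and that large models' lack of it has no bearing on whether they can be useful tools or have a significant economic impact. Their inherent detachment from reality and their probabilistic nature mean they will never achieve the reliability of traditional software, although they may become good enough to change how work is done in some fields.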

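Mars MAVEN mission declared over after six months of silence

After six months of radio silence, the MAVEN mission has been officially declared over. The probe, launched in 2013, mysteriously lost contact in late December 2025 during a routine pass behind Mars. The last data it sent back show that the spacecraft had entered an abnormal rapid spin, which threw it off its orbit and drained its onboard batteries. A review board convened by NASA recently concluded that it cannot be recovered. It is expected to linger in orbit for another 50 to 100 years before crashing onto the Martian surface, but its scientific life is over. NASA has three orbiters at Mars: Mars Odyssey, launched in 2001; the Mars Reconnaissance Orbiter (MRO), launched in 2005; and Mars Atmosphere and Volatile Evolution (MAVEN), launched in 2013. MAVEN had served the shortest time of the three, and the other two are both nearing the end of their lives. Two European orbiters also remain at Mars, along with rovers on the surface, so Mars research will continue.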

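Linux share among Steam users falls to 3.99%

Valve has published the Steam Hardware & Software Survey for May 2026. After the share of Steam players on Linux hit a record 5.33% in March, it has fallen for two consecutive months: 4.52% in April and 3.99% in May, down 0.53 percentage points but still double the figure from the same period last year. Windows accounts for 93.85% and OSX for 2.16%. Among players' languages, English stands at 39.48%, up 2.71 points, and Simplified Chinese at 21.85%, down 1.56 points. Intel CPUs account for 53.94% of users and AMD for 46.06%; Intel's share is slowly shrinking while AMD's slowly grows.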

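Microsoft creates Coreutils for Windows, a fork of Rust Coreutils

At the Build 2026 conference this week, Microsoft announced Coreutils for Windows, a fork of Rust Coreutils (uutils) maintained by the software giant. It is not a hard fork but a downstream version. Coreutils for Windows includes uutils/coreutils, findutils, grep, and other tools. The goal is to make switching between development platforms such as Windows, WSL, macOS, and Linux more seamless: commands, flags, and pipelines are unified and behave the same way, so existing scripts run without conversion. One wonders whether Steve Ballmer still remembers what he once said.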

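Any amount of drinking raises health risks

A large-scale study shows that even less than one standard drink a day increases the risk of several cancers. The research team analyzed 843 cohort and case-control studies published through 2023 and systematically assessed the association between alcohol and a range of diseases. For all 10 cancers examined, drinking was associated with higher risk, and the risk kept rising with the amount consumed. Even an intake of less than 10 grams of pure alcohol a day was associated with increased risk of pharyngeal, colorectal, esophageal, breast, liver, pancreatic, and prostate cancer. The increase was largest for pharyngeal cancer, where risk more than doubled. Beyond cancer, drinking was also associated with a higher risk of pancreatitis and of chronic liver diseases such as cirrhosis. The study found that the risk of chronic liver disease rose by at least 40% and that of pancreatitis by at least 22%. The results clearly indicate that cancer risk increases at any level of alcohol intake, while evidence for the claim that moderate drinking benefits health is concentrated in a few non-cancer conditions, and the associations there are weak.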

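American capitalism turns apocalyptic

Apocalypticism is the most powerful force driving American capitalism today. SpaceX, Elon Musk's rocket company, openly declares that its mission is to build a colony on Mars so that humanity does not go extinct on Earth. Musk became the richest person in the United States partly because he is the country's loudest apocalypticist. He is racing to take SpaceX public ahead of two other prophets with a similar millenarian worldview. Anthropic's Dario Amodei and OpenAI's Sam Altman, along with Palantir CEO Alex Karp and Anduril founder Palmer Luckey, are all telling some version of an end-times story. An economy that believes in millenarianism is bound to be paranoid. Peter Thiel says AI will summon the Antichrist in the form of authoritarian rule. Pope Leo XIV has called for AI to be disarmed. A new song by British pop singer Charli XCX captures the mood of both the public and the pope, with lyrics about the spring and summer of '26, a world about to end with no hope left, and a runway leading to hell.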

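Bavaria cancels Microsoft contract in favor of open-source software

Bavaria's ministry of digital affairs has formally announced the cancellation of a contract with Microsoft that would have cost nearly 1 billion euros over five years. The German state will switch to open-source software instead. State finance minister Albert Füracker had argued for seeking discounts on the existing contract, while digital minister Fabian Mehring pushed for open source. Mehring said the switch will keep services available in times of crisis, shield Bavaria from price increases, and put data security first. Bavaria's move is part of a broader European trend, in which local and federal governments across Europe are gradually reducing their dependence on Microsoft and other U.S. technology.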

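EU unveils plan to reduce reliance on U.S. tech companies

On Wednesday the EU unveiled the European Technological Sovereignty Package, which aims to strengthen technological sovereignty and reduce reliance on U.S. tech companies. Microsoft's compliance with an order from U.S. President Trump to shut down the account of the International Criminal Court's chief prosecutor served as a wake-up call for all of Europe. The new plan is meant to support homegrown European companies and requires that public services in highly sensitive areas not use services from foreign tech companies. The European Commission is asking member states to carry out a "sovereignty risk assessment" of every digital service they depend on, covering foreign control, potential access to sensitive data, and the risk of operational disruption. Commission President Ursula von der Leyen said Europe cannot depend on others' technology to keep its hospitals running, its power grids stable, and its services secure, and that the issue is about protecting citizens, defending European interests, and making Europe's own choices.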

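Apple doubles MacBook Neo production on strong demand

With demand far exceeding expectations, Apple is doubling production capacity for its entry-level MacBook Neo from 5 million to 10 million units. The MacBook Neo has only 8GB of memory and sells for $599, or $499 with the student discount. Apple CEO Tim Cook said he was very optimistic about the MacBook Neo's prospects before its launch, but the company still underestimated consumer enthusiasm. Driven by the MacBook Neo, the number of new Mac users hit an all-time high last quarter. The Windows PC industry is also watching the stir the MacBook Neo has caused in the entry-level market: Dell has just launched an XPS 13 laptop starting at $699 ($599 with the student discount), though 8GB of memory is only barely adequate for Windows 11.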

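Google releases Gemma 4 12B, an open-source model that runs locally on laptops

Google has released Gemma 4 12B, an open-source model that can run locally on a laptop. Gemma 4 12B has 12 billion parameters and needs a laptop with 16GB of video memory, which rules out the vast majority of low-end and mid-range machines, since only high-end laptops are likely to have 16GB or more. Gemma 4 is a multimodal model that handles text, images, and audio; it can understand visual content, process audio input, and perform advanced reasoning tasks, which gives it a wider range of applications. Gemma 4 12B is released under the Apache 2.0 license, which imposes few restrictions.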

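Trump administration to dismantle ocean current observing system

Starting this month, the Trump administration will dismantle the $368 million Ocean Observatories Initiative. The initiative comprises more than 900 deep-sea instruments that monitor ocean currents, marine ecosystems, carbon uptake, heat waves, fisheries, coastal flooding, and climate change. The U.S. National Science Foundation (NSF) said it will send ships to begin removing instruments moored off Oregon, Washington, Alaska, and North Carolina, and in the Irminger Sea between Greenland and Iceland. The initiative went into operation in 2016 and was planned to run for 25 years. Jim Edson, the ocean meteorologist who leads it, calls it the world's most advanced continuously operating ocean observing system. Removing the instruments may take 15 months. Seismometers around an active volcano off Oregon will keep running until 2028. Each observation site consists of multiple moorings. The equipment measures currents and chemical and biological conditions from the surface down to depths of thousands of feet. The instruments are hardened to withstand deep-sea pressure, corrosive seawater, and marine plants and animals that could damage electronics. Remotely operated robots and gliders around the moorings collect data and transmit it to research laboratories. The system costs $48 million a year to run. The Trump administration has repeatedly tried to shut the program down, proposing to cut 80% of its funding in both 2025 and 2026. Congress ultimately rejected the proposal and restored the appropriation. Even so, NSF has pressed ahead with retiring the observing network.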

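The genetic trade-off between youth and longevity

Scientists have found that the gene vgll3 is directly linked both to growth, development, and reproductive success early in life and to accelerated aging and increased cancer risk late in life. The new study provides experimental evidence for the antagonistic pleiotropy hypothesis, which holds that some genes confer advantages early in life but cause harm later on. The researchers worked with the African turquoise killifish, a very short-lived species, and modified the gene using CRISPR gene editing. Fish with the modified vgll3 gene grew faster and reached sexual maturity earlier, giving them a reproductive advantage in natural environments. The price was a shorter lifespan and a higher chance of developing age-related cancers. The researchers note that nature prioritizes continuity rather than lifespan. Humans also carry the vgll3 gene, and the study should help improve understanding of human development, aging, and age-related disease.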

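Meta lets employees opt out of tracking for up to 30 minutes at a time

Meta recently began installing tracking software on U.S. employees' computers that captures mouse movements, clicks, and keystrokes to train AI models, part of the company's larger plan to build AI agents that can carry out work tasks automatically. The tool, called the Model Capability Initiative (MCI), has met strong opposition inside the company, and some employees have launched a petition that has gathered more than 1,500 signatures. One anonymous employee called the company's conduct very dystopian. According to an internal memo sent to staff on Tuesday, Meta has taken a small step back: employees may opt out of tracking for at most 30 minutes at a time, and they can also apply to opt out of the program permanently.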

09

APP STORE RANK

09.00
APP STORE RANK