Curated by Shen Huang · 90 stories · ~14 min read
DIGEST · 2026-05-19

OrangeBot.AI Digest — 2026-05-19

90 headlines across 8 sources, aggregated for this day.

Hacker News(15)

  1. Tesla's lithium refinery discharges 231,000 gallons of polluted wastewater a day (www.autonocion.com)
  2. Minnesota becomes first state to ban prediction markets (www.npr.org)
  3. Google changes its search box (blog.google)
  4. Disney erased FiveThirtyEight (www.natesilver.net)
  5. Gemini 3.5 Flash (blog.google)
  6. Gemini Omni (deepmind.google)
  7. Gemini 3.5 Flash: frontier intelligence with action (blog.google)
  8. I’ve built a virtual museum with nearly every operating system you can think of (virtualosmuseum.org)
  9. I’ve joined Anthropic (twitter.com)
  10. CISA Admin Leaked AWS GovCloud Keys on GitHub (krebsonsecurity.com)
  11. OpenBSD 7.9 (www.openbsd.org)
  12. Apple unveils new accessibility features (www.apple.com)
  13. Photo GIMP – A Patch for GIMP 3 for Photoshop Users (github.com)
  14. Show HN: Gaussian Splat of a Strawberry (superspl.at)
  15. I found ultra-pure quantum crystals in an abandoned mine in the Atacama desert (medium.com)

GitHub Trending(15)

  1. tinyhumansai / openhuman
  2. HKUDS / CLI-Anything
  3. Imbad0202 / academic-research-skills
  4. obra / superpowers
  5. anthropics / claude-plugins-official
  6. rohitg00 / agentmemory
  7. CloakHQ / CloakBrowser
  8. rtk-ai / rtk
  9. msitarzewski / agency-agents
  10. colbymchenry / codegraph
  11. multica-ai / andrej-karpathy-skills
  12. humanlayer / 12-factor-agents
  13. Diolinux / PhotoGIMP
  14. Alishahryar1 / free-claude-code
  15. pascalorg / editor

Product Hunt(15)

  1. Composer 2.5

    Cursor’s most powerful model yet

  2. Agora-1 by Odyssey

    A multi-agent world model you can play

  3. AutoShelf

    Auto-organize files on your Mac

  4. Monocle 3.5 for macOS

    Noise-cancelling for your screen

  5. calog.cc

    Chat-based calorie tracker that actually knows desi food

  6. LearnHouse

    The modern way to teach what you build

  7. VWFNDR™ + MBL

    Take raw photos with proof they're real, not AI

  8. Voker

    The Agent Analytics Platform for AI Product Teams

  9. Haystack

    Review the pull requests that actually need human attention

  10. Lyricly

    Live lyrics in your dynamic Notch & floating on your desktop

  11. Buggyverse

    Study with strangers online, high-accountability focus rooms

  12. Cosmic Insights

    Cookieless web analytics built into your CMS

  13. Starchild-1 by Odyssey

    The first real-time multimodal world model

  14. ShioriCode

    Open-source alternative to Codex & Claude Code

  15. CaseGap AI

    Find law firm revenue leaks, then fix them

Hugging Face(15)

  1. Code as Agent Harness

    Recent large language models (LLMs) have demonstrated strong capabilities in understanding and generating code, from competitive programming to repository-level software engineering. In emerging agentic systems, code is no longer only a target output. It increasingly serves as an operational substrate for agent reasoning, acting, environment modeling, and execution-based verification. We frame this shift through the lens of agent harnesses and introduce code as agent harness: a unified view that centers code as the basis for agent infrastructure. To systematically study this perspective, we organize the survey around three connected layers. First, we study the harness interface, where code connects agents to reasoning, action, and environment modeling. Second, we examine harness mechanisms: planning, memory, and tool use for long-horizon execution, together with feedback-driven control and optimization that make harness reliable and adaptive. Third, we discuss scaling the harness from single-agent systems to multi-agent settings, where shared code artifacts support multi-agent coordination, review, and verification. Across these layers, we summarize representative methods and practical applications of code as agent harness, spanning coding assistants, GUI/OS automation, embodied agents, scientific discovery, personalization and recommendation, DevOps, and enterprise workflows. We further outline open challenges for harness engineering, including evaluation beyond final task success, verification under incomplete feedback, regression-free harness improvement, consistent shared state across multiple agents, human oversight for safety-critical actions, and extensions to multimodal environments. By centering code as the harness of agentic AI, this survey provides a unified roadmap toward executable, verifiable, and stateful AI agent systems.

  2. SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution

    Long-horizon LLM agents leave traces that could become reusable experience, but raw trajectories are noisy and hard to govern. We treat Agent Skills as an experience schema that couples executable scripts with non-executable guidance on procedures. Yet open skill ecosystems contain redundant, uneven, environment-sensitive artifacts, and indiscriminate updates can pollute future context. We present SkillsVote, a lifecycle-governance framework for Agent Skills from collection and recommendation to evolution. SkillsVote profiles a million-scale open-source corpus for environment requirements, quality, and verifiability, then synthesizes tasks for verifiable skills. Before execution, SkillsVote performs agentic library search over a structured skill library to expose instructional skill context. After execution, it decomposes trajectories into skill-linked subtasks, attributes outcomes to skill use, agent exploration, environment, and result signals, and admits only successful reusable discoveries to evidence-gated updates. In our evaluation, offline evolution improves GPT-5.2 on Terminal-Bench 2.0 by up to 7.9 pp, while online evolution improves SWE-Bench Pro by up to 2.6 pp. Overall, governed external skill libraries can improve frozen agents without model updates when systems control exposure, credit, and preservation.

  3. LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

    We present LongLive-2.0, an NVFP4-based parallel infrastructure throughout the full training and inference workflow of long video generation, addressing speed and memory bottlenecks. For training, we introduce sequence-parallel autoregressive (AR) training, instantiated as Balanced SP, which co-designs the efficient teacher-forcing layout with SP execution by pairing clean-history and noisy-target temporal chunks on each rank, enabling a natural teacher-forcing mask with SP-aware chunked VAE encoding. Combined with NVFP4 precision, it reduces GPU memory cost and accelerates GEMM computation during training, the proportion of which increases as video length grows. Moreover, we show that a high-quality infrastructure and dataset enable a remarkably clean training pipeline. Unlike existing Self-Forcing series methods that rely on ODE initialization and subsequent distribution matching distillation (DMD), LongLive-2.0 directly tunes a diffusion model into a long, multi-shot, interactive auto-regressive (AR) diffusion model. It can be further converted to real-time generation (4 to 2 denoising steps) with standalone LoRA weights. For inference on Blackwell GPUs, we enable W4A4 NVFP4 inference, quantize KV cache into NVFP4 for memory savings, and boost end-to-end throughput with asynchronous streaming VAE decoding. On non-Blackwell GPU architectures, we deploy SP inference to match the speed on Blackwell GPUs, while the quantized KV cache can lower inter-GPU communication of SP. Experiments show up to 2.15x speedup in training, and 1.84x in inference. LongLive-2.0-5B achieves 45.7 FPS inference while attaining strong performance on benchmarks. To our knowledge, LongLive-2.0 is the first NVFP4 training and inference system for long video generation.

  4. Lance: Unified Multimodal Modeling by Multi-Task Synergy

    We present Lance, a lightweight native unified model supporting multimodal understanding, generation, and editing for both images and videos. Rather than relying on model capacity scaling or text-image-dominant designs, Lance explores a practical paradigm for unified multimodal modeling via collaborative multi-task training. It is grounded in two core principles: unified context modeling and decoupled capability pathways. Specifically, Lance is trained from scratch and employs a dual-stream mixture-of-experts architecture on shared interleaved multimodal sequences, enabling joint context learning while decoupling the pathways for understanding and generation. We further introduce modality-aware rotary positional encoding to mitigate interference among heterogeneous visual tokens and boost cross-task alignment. During training, Lance adopts a staged multi-task training paradigm with capability-oriented objectives and adaptive data scheduling to strengthen both semantic comprehension and visual generation performance. Experimental results demonstrate that Lance substantially outperforms existing open-source unified models in image and video generation, while retaining strong multimodal understanding capabilities. The homepage is available at https://lance-project.github.io.

  5. AI for Auto-Research: Roadmap & User Guide

    AI-assisted research is crossing a threshold: fully automated systems can now generate research papers for as little as $15, while long-horizon agents can execute experiments, draft manuscripts, and simulate critique with minimal human input. Yet this productivity frontier exposes a deeper integrity problem: under scientific pressure, even frontier LLMs still fabricate results, miss hidden errors, and fail to judge novelty reliably. Studying developments through April 2026, we present an end-to-end analysis of AI across the complete research lifecycle, organized into four epistemological phases: Creation (idea generation, literature review, coding & experiments, tables & figures), Writing (paper writing), Validation (peer review, rebuttal & revision), and Dissemination (posters, slides, videos, social media, project pages, and interactive agents). We identify a sharp, stage-dependent boundary between reliable assistance and unreliable autonomy: AI excels at structured, retrieval-grounded, and tool-mediated tasks, but remains fragile for genuinely novel ideas, research-level experiments, and scientific judgment. Generated ideas often degrade after implementation, research code lags far behind pattern-matching benchmarks, and end-to-end autonomous systems have not yet consistently reached major-venue acceptance standards. We further show that greater automation can obscure rather than eliminate failure modes, making human-governed collaboration the most credible deployment paradigm. Finally, we provide a structured taxonomy, benchmark suite, and tool inventory, cross-stage design principles, and a practitioner-oriented playbook, with resources maintained at our project page.

  6. CHI-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows?

    End-to-end automation of realistic healthcare operations stresses three capabilities underrepresented in current benchmarks: policy density (decisions must be grounded in a large library of medical, insurance, and operational rules), multi-role composition (a single task requires the agent to play multiple roles with handoffs), and multilateral interaction (intermediate workflow steps are multi-turn dialogs, such as peer-to-peer review and patient outreach). We introduce χ-Bench, a benchmark of long-horizon healthcare workflows across three domains: provider prior authorization, payer utilization management, and care management. Each task hands the agent a clinical case in a high-fidelity simulator of 20 healthcare apps exposed via 87 MCP tools, which it must drive to a terminal status through tool calls and writing the role's artifacts, guided by a 1,290+ document managed-care operations handbook skill. Across 30 agent harness/model configurations, the best agent resolves only 28.0% of tasks, no agent clears 20% on strict pass^3, and executing all tasks in a single session drops performance to 3.8%. These results raise the hypothesis that similar gaps are likely to surface in other policy-dense, role-composed, irreversible enterprise domains.
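
    The "strict pass^3" figure above is easier to read with the metric written out. The sketch below assumes pass^k means "all k independent attempts at a task succeed", estimated per task as C(c, k) / C(n, k) for c successes in n trials and then averaged over tasks. The abstract does not define the metric, so this definition, the function names, and the sample outcomes are illustrative assumptions, not the benchmark's own code.

```python
from math import comb

def pass_hat_k(successes: int, trials: int, k: int) -> float:
    """Estimate the chance that k independent attempts at one task all succeed.

    Assumed definition: C(successes, k) / C(trials, k).
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    if not 1 <= k <= trials:
        raise ValueError("k must be between 1 and trials")
    # math.comb returns 0 when successes < k, so a single failure
    # out of exactly k trials zeroes the task's score.
    return comb(successes, k) / comb(trials, k)

def benchmark_pass_hat_k(results: list[list[bool]], k: int) -> float:
    """Average the per-task estimate over every task in the benchmark."""
    return sum(pass_hat_k(sum(r), len(r), k) for r in results) / len(results)

# Hypothetical outcomes: 4 tasks, 3 trials each (True = task resolved).
results = [
    [True, True, True],
    [True, False, True],
    [False, False, False],
    [True, True, False],
]

pass_1 = benchmark_pass_hat_k(results, 1)  # average single-attempt success rate
pass_3 = benchmark_pass_hat_k(results, 3)  # only tasks solved in all 3 trials count
print(f"pass^1 = {pass_1:.3f}, pass^3 = {pass_3:.3f}")
```

    Under this definition an agent that usually succeeds but occasionally fails scores far lower on pass^3 than on single-attempt accuracy, which is one way a 28.0% best resolve rate and a sub-20% pass^3 ceiling can coexist.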

  7. Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis

    Designing realistic and functional 3D indoor rooms is essential for a wide range of applications, including interior design, virtual reality, gaming, and embodied AI. While recent MLLM-based approaches have shown great potential for 3D room synthesis from textual descriptions or reference images, text-based methods struggle to capture precise spatial information, and existing image-conditioned agents suffer from instability and infinite looping when tasked with holistic room generation from top-down views. To address these limitations, we propose Code-as-Room, an MLLM-based agentic framework equipped with a structured execution harness, which represents 3D rooms with Blender codes. Given a top-down room image, the framework parses the reference image to extract scene elements and their spatial relationships, and synthesizes executable Blender code for geometry, materials, and lighting in a principled, multi-stage pipeline. A cross-stage memory module is maintained throughout to mitigate context forgetting inherent to existing agent-based frameworks. We further introduce a dedicated benchmark for code-based 3D room synthesis, encompassing various evaluation protocols. Based on our benchmark, comprehensive comparisons against existing agent-based methods are conducted to validate the effectiveness of our proposed execution harness.

  8. KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration

    Aligning streaming autoregressive (AR) video generators with human preferences is challenging. Existing reinforcement learning methods predominantly rely on noise-based exploration and SDE-based surrogate policies that are mismatched to the deterministic ODE dynamics of distilled AR models, and tend to perturb low-level appearance rather than the high-level semantic storyline progression critical for long-horizon coherence. To address these limitations, we present KVPO, an ODE-native online Group Relative Policy Optimization (GRPO) framework for aligning streaming video generators. For diversity exploration, KVPO introduces a causal-semantic exploration paradigm that relocates the source of variation from stochastic noise to the historical KV cache. By stochastically routing historical KV entries, it constructs semantically diverse generation branches that remain strictly on the data manifold. For policy modeling, KVPO introduces a velocity-field surrogate policy based on Trajectory Velocity Energy (TVE), which quantifies branch likelihood in flow-matching velocity space and yields a reward-weighted contrastive objective fully consistent with the native ODE formulation. Experiments on multiple distilled AR video generators demonstrate consistent gains in visual quality, motion quality, and text-video alignment across both single-prompt short-video and multi-prompt long-video settings.

  9. OProver: A Unified Framework for Agentic Formal Theorem Proving

    Recent progress in formal theorem proving has benefited from large-scale proof generation and verifier-aware training, but agentic proving is rarely integrated into prover training, appearing only at inference time. We present OProver, a unified framework for agentic formal theorem proving in Lean 4, in which failed proof attempts are iteratively revised using retrieved compiler-verified proofs and Lean compiler feedback. OProver is trained through continued pretraining followed by iterative post-training: each iteration runs agentic proving, indexes newly verified proofs into OProofs and the retrieval memory, uses repair trajectories as SFT data, and uses unresolved hard cases for RL. OProofs is built from public Lean resources, large-scale proof synthesis, and agentic proving traces, containing 1.77M Lean statements, 6.86M compiler-verified proofs, and serialized trajectories with retrieved context, failed attempts, feedback, and repairs. Across five benchmarks, OProver-32B attains the best Pass@32 on MiniF2F (93.3%), ProverBench (58.2%), and PutnamBench (11.3%), and ranks second on MathOlympiad (22.8%) and ProofNet (33.2%), more top placements than any prior open-weight whole-proof prover.

  10. Post-Trained MoE Can Skip Half Experts via Self-Distillation

    Mixture-of-Experts (MoE) scales language models efficiently through sparse expert activation, and its dynamic variant further reduces computation by adjusting the activated experts in an input-dependent manner. Existing dynamic MoE methods usually rely on pre-training from scratch or task-specific adaptation, leaving the practical conversion of fully trained MoE underexplored. Enabling such adaptation would directly alleviate inference costs by allowing easy tokens to bypass unnecessary experts during serving. This paper introduces Zero-Expert Self-Distillation Adaptation (ZEDA), a low-cost framework that transforms post-trained static MoE models into efficient dynamic ones. To stabilize this architectural conversion, ZEDA injects parameter-free zero-output experts into each MoE layer and adapts the augmented model through two-stage self-distillation, utilizing the original MoE as a frozen teacher and applying a group-level balancing loss. On Qwen3-30B-A3B and GLM-4.7-Flash across 11 benchmarks spanning math, code, and instruction following, ZEDA eliminates over 50% of expert FLOPs at marginal accuracy loss. It outperforms the strongest dynamic MoE baseline by 6.1 and 4.0 points on the two models, and delivers ~1.20x end-to-end inference speedup.
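
    The idea of a parameter-free zero-output expert can be illustrated with a toy router. This is a minimal sketch under loud assumptions: the router scores real and zero experts together, picks the top-k slots, and any selected zero expert contributes nothing and triggers no computation. The function names, the toy experts, and the choice not to renormalize gate weights are hypothetical; the abstract does not describe ZEDA's routing at this level of detail.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, num_zero_experts, top_k):
    """Toy MoE layer whose last `num_zero_experts` router slots output zero.

    Returns (output vector, number of real experts actually executed).
    """
    if len(router_logits) != len(experts) + num_zero_experts:
        raise ValueError("router must score every real and zero expert")
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    out = [0.0] * len(x)
    real_calls = 0
    for i in chosen:
        if i >= len(experts):
            # Zero-output expert: adds nothing to the output and costs no FLOPs.
            continue
        y = experts[i](x)
        real_calls += 1
        out = [o + probs[i] * v for o, v in zip(out, y)]
    return out, real_calls

# Two hypothetical real experts plus two zero-output experts.
experts = [lambda v: [2.0 * a for a in v], lambda v: [-a for a in v]]
token = [1.0, -1.0]

easy_out, easy_calls = moe_forward(token, [0.1, 0.0, 3.0, 2.5], experts, 2, top_k=2)
hard_out, hard_calls = moe_forward(token, [3.0, 2.5, 0.1, 0.0], experts, 2, top_k=2)
print(easy_calls, hard_calls)
```

    In this toy setup a token whose router mass falls on the zero slots skips expert computation entirely, while a token routed to real experts pays the full cost, which is the input-dependent saving the abstract refers to.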

  11. Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models

    Large Reasoning Models (LRMs) achieve strong performance by generating long chains of thought (CoT), but often overthink, continuing to reason after a solution has already stabilized and thereby wasting tokens and increasing latency. Existing inference-time early-exit methods rely primarily on answer-level signals, such as confidence or trial-answer consistency, to decide when to stop. However, these signals mainly reflect answer readiness rather than reasoning convergence: they may trigger before the model has finished exploring or self-correcting, causing premature exits that can degrade final-answer accuracy and leave the retained reasoning chain semantically incomplete. We identify reasoning-level semantic redundancy as a complementary signal for semantic-preserving early exit: when successive steps no longer add novel progress and instead revisit established conclusions, the reasoning trajectory has likely converged. Building on this insight, we propose PUMA, a plug-and-play framework that combines a lightweight Redundancy Detector with answer-level verification. The detector flags semantically redundant candidate exits, while verification confirms whether stopping is safe, allowing PUMA to remove redundant continuation while preserving both answer accuracy and a coherent reasoning prefix. Across five LRMs and five challenging reasoning benchmarks, PUMA achieves 26.2% average token reduction while preserving accuracy and retained CoT quality. Additional experiments on code generation, zero-shot vision-language reasoning, and learned stopping-policy internalization further demonstrate that reasoning-level redundancy is a robust, transferable, and learnable signal for efficient reasoning. Our code is available at https://github.com/giovanni-vaccarino/PUMA.

  12. LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs

    The fundamental challenge in scaling Video Large Language Models (Video LLMs) to long-form video lies in managing the explosion of visual-token context length. Existing strategies predominantly focus on "post-hoc" token reduction -- reducing visual tokens after feature extraction to alleviate the LLM's computational overhead. While these methods effectively reduce the number of visual tokens, we observe that the primary latency bottleneck then shifts from the LLM to the expensive per-frame processing of the vision encoder. To address this, we introduce LiteFrame, a strong, yet highly efficient video encoder backbone for Video LLMs. To train LiteFrame, we propose Compressed Token Distillation (CTD), a novel training framework that teaches a compact student vision encoder to directly predict information-dense, spatio-temporally compressed representations produced by a large teacher vision model, effectively bypassing redundant computation. When coupled with further Language Model Adaptation (LMA), this approach results in a new latency-accuracy Pareto frontier -- compared with InternVL3-8B, LiteFrame provides a 35% reduction in end-to-end latency while processing 8x more frames and improves average video understanding accuracy across multiple benchmarks. Our results demonstrate a new potential path to unlocking longer-form video understanding under fixed compute budgets.

  13. Measuring Maximum Activations in Open Large Language Models

    The dynamic range of activations is a first-order constraint for low-bit quantization, activation scaling, and stable LLM inference. Prior work characterized outlier features and massive activations on pre-2024 LLaMA-style models, and the downstream activation-quantization stack inherits that picture without revisiting it for the post-LLaMA open-model boom. We ask the deployment-oriented question: how large can activations get in modern open LLMs, and how does this magnitude vary across families, generations, and training stages? Under a unified pipeline (5,000-sample multi-domain corpus, family-specific tokenization, identical hooks across embeddings, hidden states, attention, MLP/MoE, SwiGLU gates, and final norm), we measure global and layerwise maxima on 27 checkpoints from 8 open families spanning dense, MoE, vision-language, intermediate-training, and instruction-tuned variants. We find that (i) global maxima span over nearly four orders of magnitude at comparable parameter counts, with Qwen3.5 and MoE checkpoints in the 10^2 to 10^3 range and Gemma3-27B-it reaching ~7 x 10^5; (ii) cross-family and cross-generation comparisons break simple monotonic scaling; and (iii) MoE checkpoints exhibit 14.0-23.4x lower peaks than matched-scale dense counterparts, while the residual stream carries the global maximum in 22/24 checkpoints. A lightweight INT-8 sanity check shows that measured maxima co-vary with low-bit reconstruction error via activation-scale selection. We conclude that maximum activation magnitude is a model property tied to family, architecture, and training stage - not a simple byproduct of size - and should be measured and reported alongside any open-weight release before low-bit deployment. The code is publicly available at https://github.com/clx1415926/Max_act_llm.
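
    Why a single large activation matters for low-bit deployment can be shown with a toy example. The sketch below assumes symmetric per-tensor absmax INT8 quantization, where the scale is set by the largest magnitude in the tensor. That scheme, the helper names, and the sample values are illustrative assumptions, not the pipeline used in the paper.

```python
def quantize_dequantize_int8(values):
    """Round-trip values through symmetric per-tensor absmax INT8."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Quantize to integers in [-127, 127], then map back to floats.
    return [max(-127, min(127, round(v / scale))) * scale for v in values]

def mean_abs_error(values):
    """Mean absolute reconstruction error after the INT8 round trip."""
    recon = quantize_dequantize_int8(values)
    return sum(abs(a - b) for a, b in zip(values, recon)) / len(values)

# Hypothetical activations: 101 typical values in [-0.5, 0.5] ...
typical = [0.01 * i for i in range(-50, 51)]
# ... and the same tensor with one massive activation appended.
with_outlier = typical + [700.0]

err_typical = mean_abs_error(typical)
err_outlier = mean_abs_error(with_outlier)
print(f"no outlier: {err_typical:.5f}, with outlier: {err_outlier:.5f}")
```

    With the outlier present the scale grows to about 5.5 per integer step, so every typical value rounds to zero and the error rises by roughly two orders of magnitude. That is the sense in which a measured maximum constrains activation-scale selection.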

  14. StableVLA: Towards Robust Vision-Language-Action Models without Extra Data

    It is infeasible to encompass all possible disturbances within the training dataset. This raises a critical question regarding the robustness of Vision-Language-Action (VLA) models when encountering unseen real-world visual disturbances, particularly under imperfect visual conditions. In this work, we conduct a systematic study based on recent state-of-the-art VLA models and reveal a significant performance drop when visual disturbances absent from the training data are introduced. To mitigate this issue, we propose a lightweight adapter module grounded in information theory, termed the Information Bottleneck Adapter (IB-Adapter), which selectively filters potential noise from visual inputs. Without requiring any extra data or augmentation strategies, IB-Adapter consistently improves over the baseline by an average of 30%, while adding fewer than 10M parameters, demonstrating notable efficiency and effectiveness. Furthermore, even with a 14x smaller backbone (0.5B parameters) and no pre-training on the Open X-Embodiment dataset, our model StableVLA achieves robustness competitive with 7B-scale state-of-the-art VLAs. With negligible parameter overhead (<10M), our approach maintains accuracy on long-horizon tasks and surpasses OpenPi under both synthetic and physical visual corruptions.

  15. Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement

    Continuous diffusion language models lag behind autoregressive transformers, partly because diffusion is applied in spaces poorly suited to language denoising and token recovery. We propose DiHAL, a geometry-guided diffusion-transformer hybrid that asks where diffusion should enter a pretrained transformer. DiHAL scores layers with geometry-based proxies, selects a diffusion-friendly hidden-state interface, and replaces the lower transformer prefix with a diffusion bridge while retaining the upper layers and original LM head. By reconstructing the selected-layer hidden state rather than tokens, DiHAL avoids direct continuous-to-discrete recovery. Experiments on 8B-scale backbones show that the geometry score predicts effective shallow insertion layers under a fixed bridge-training protocol and that hidden-state recovery improves over continuous diffusion baselines in a diagnostic comparison matching the diffusion/recovery training budget. These results suggest that hidden-state geometry helps identify where diffusion-based replacement is feasible inside pretrained language models.

Techmeme(15)

  1. OpenAI introduces Guaranteed Capacity, a new offering that lets customers guarantee access to OpenAI's compute through one- to three-year commitments (OpenAI)

    OpenAI: Guarantee long-term access to OpenAI compute for the products, agents, and customer workflows that matter most.

  2. Meta begins laying off 8,000 employees, or 10% of staff, in a push to become an AI-first company; another 7,000 workers will be reassigned to AI initiatives (New York Times)

    New York Times: Meta told employees last month that it would carry out mass layoffs on May 20, as the Silicon Valley giant tries to transform into an A.I.-first company.

  3. Minnesota Gov. Tim Walz has signed the nation's first law banning prediction market sites from operating in the state; the CFTC has sued Minnesota in response (Bobby Allyn/NPR)

    Bobby Allyn / NPR: Minnesota Gov. Tim Walz has signed the nation's first law banning prediction market sites from operating in the state …

  4. Sources: Google DeepMind has reached a ~$100M deal to hire 20+ researchers from Contextual AI, including CEO Douwe Kiela, and license its technology (Bloomberg)

    Bloomberg: Google DeepMind has reached a deal to hire more than 20 researchers from artificial intelligence startup Contextual AI and license its technology …

  5. Google unveils Pics, an AI image editor in Workspace that lets users edit specific elements and modify text, rolling out this summer to AI Pro and Ultra users (Mat Smith/Engadget)

    Mat Smith / Engadget: It's not Photoshop, but it could be better than what's in Google Photos. Alongside an array of updates across its Workspace apps …

  6. Ocean, which uses AI agents to detect email attacks, raised a $20M Series A led by Lightspeed, following an $8M seed in 2024 (Meir Orbach/CTech)

    Meir Orbach / CTech: The cybersecurity startup says its platform replaces legacy phishing defenses with AI agents that analyze intent, not just patterns.

  7. Sundar Pichai announced at Google I/O that Gemini 3.5 Pro will launch next month; attendees groaned at the model coming out later than they expected (Charles Rollet/Business Insider)

    Charles Rollet / Business Insider: Google CEO Sundar Pichai announced that Gemini 3.5 Pro will launch next month. Google I/O 2026 attendees groaned at the model coming out later than they'd expected.

  8. Hands-on with Google and Samsung's Android XR smart glasses from Warby Parker and Gentle Monster, and with XReal's Project Aura, all set to arrive this fall (Wired)

    Wired: Here's your first look at smart glasses coming from Warby Parker and Gentle Monster, powered by Google and Samsung's XR platform.

  9. Google unveils Universal Cart, a shopping assistant that works "across merchants", built on the Universal Commerce Protocol, rolling out in the US today (Aisha Malik/TechCrunch)

    Aisha Malik / TechCrunch: At I/O on Tuesday, Google introduced Universal Cart, its so-called agentic hub for managing shopping in one place.

  10. At a hearing, two of three judges of a federal appeals court appeared skeptical of Anthropic's bid to block the DOD from designating it a supply-chain risk (Jen Judson/Bloomberg)

    Jen Judson / Bloomberg: A federal appeals court appeared skeptical of Anthropic's bid to block the Pentagon from declaring that the company poses …

  11. Demis Hassabis says companies looking to replace developers with AI may be due to a "lack of imagination and a lack of understanding" of the future (Will Knight/Wired)

    Will Knight / Wired: The CEO of Google DeepMind tells WIRED that companies should use the productivity gains of AI to do more, not lay people off.

  12. Google's web-based AI Studio now lets users build native Android apps; Google says the apps are for personal use only for now and publishing is on the roadmap (Sarah Perez/TechCrunch)

    Sarah Perez / TechCrunch: The AI coding boom is now coming directly for Android app development. On Tuesday, Google announced new native Android app creation capabilities …

  13. Google introduces Antigravity 2.0, featuring an updated desktop app that lets users orchestrate multiple agents, alongside an Antigravity CLI tool and SDK (Ivan Mehta/TechCrunch)

    Ivan Mehta / TechCrunch : Google introduces Antigravity 2.0, featuring an updated desktop app that lets users orchestrate multiple agents, alongside an Antigravity CLI tool and SDK —  Google is introducing a new version of its agentic coding app, Google Antigravity 2.0, with an updated desktop app, a CLI tool, and an SDK for custom workflows.

  14. Google says the Gemini app has 900M+ MAUs across 230 countries, up from 400M at I/O 2025, and launches Neural Expressive, a new design language for Gemini (Josh Woodward/The Keyword)

    Josh Woodward / The Keyword : Google says the Gemini app has 900M+ MAUs across 230 countries, up from 400M at I/O 2025, and launches Neural Expressive, a new design language for Gemini —  Gemini is becoming a more helpful AI assistant, with an intuitive new UI, proactive daily briefs and Gemini Spark, an agent to help you get things done around the clock.

  15. OpenAI adds support for Google's SynthID watermarks in AI images, and previews a public portal to let users verify if an image was generated by OpenAI's tools (Jess Weatherbed/The Verge)

    Jess Weatherbed / The Verge : OpenAI adds support for Google's SynthID watermarks in AI images, and previews a public portal to let users verify if an image was generated by OpenAI's tools —  Images generated by ChatGPT will now carry Google's SynthID watermarks. … OpenAI is announcing updates today that aim …

Solidot(15)

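  1. pgBackRest author announces he will continue maintaining the project

    At the end of last month, David Steele, maintainer of the PostgreSQL backup and restore project pgBackRest, announced that the project was being archived and would no longer be maintained. pgBackRest is widely regarded as one of the most popular operations tools in the PostgreSQL ecosystem. Steele explained that pgBackRest had been his passion project for the past 13 years, and that he was fortunate to have had corporate funding for most of that time. His long-time sponsor was Crunchy Data, but the company was acquired by Snowflake, and the new owner had no interest in funding his continued work. He spent the past several months looking for a position that would let him keep going, without success, and the sponsorship he received fell far short of what was needed to keep the project running, so he had no choice but to announce the end of maintenance. Several weeks after that announcement, he posted an update saying he will continue developing pgBackRest: a coalition of sponsors has agreed to fund the project on an ongoing basis, giving pgBackRest the long-term stability its development needs, and he thanked them for it.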

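  2. Sony cancels plans to port PS-exclusive single-player games to PC

    Hermen Hulst, the executive in charge of Sony's PS studio business, confirmed earlier rumors on Monday: plans to port PS-exclusive single-player games to PC are being cancelled. Over the past few years Sony ported previously PS-exclusive single-player games such as the God of War series, the Spider-Man series, Ghost of Tsushima, The Last of Us series, and the Horizon Zero Dawn series to PC, but the pace of ports has slowed recently, fueling rumors that Sony was changing its porting strategy. Hulst announced the strategy shift at an all-hands meeting on Monday. Sony is reportedly worried about diluting the PlayStation brand. The move means Sony's recent single-player titles Ghost of Yotei and Saros will not come to PC. The change applies to single-player games from first-party studios; multiplayer games and single-player games from third-party studios will still come to PC.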

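  3. Why humans are right-handed

    Most humans are right-handed, with left-handers making up about one in ten. Why does this bias exist? Researchers analyzed data on 41 primate species, 2,025 monkeys and apes in total, examining factors including tool use, diet, habitat, body size, social structure, brain size, and mode of locomotion. Human handedness differs markedly from that of other primates. The picture changed once the researchers added two key traits to the model: brain size and the ratio of arm length to leg length, a ratio commonly used as an indicator of bipedal walking ability. With those factors included, humans no longer appeared to be an evolutionary outlier. The results suggest that the combined effect of upright walking and larger brains may be the core reason humans developed a strong right-hand preference. The researchers propose that right-handedness evolved in two stages. First, upright walking freed the hands from locomotion, favoring more specialized and asymmetric hand use; second, as the human brain grew larger and more complex, the preference for the right hand became stronger and more widespread.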

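  4. Firefox 151 released

    Mozilla has released Firefox 151. Major new features include updated built-in VPN support, improved private browsing, the ability to merge multiple PDF files directly in the Firefox PDF viewer, cross-platform restore of local profile backups on Linux and macOS, and the Document Picture-in-Picture API, which offers more functionality than the existing video Picture-in-Picture API, among others. The native JPEG-XL image decoder has been pushed back to the next release.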

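  5. A handful of lakes hold two-thirds of lake freshwater reserves

    According to a study published in National Science Review, a Chinese Academy of Sciences team compiled high-precision measured bathymetry and water-depth data for 588 lakes. The study found that lake depth in China is shaped by topography and landforms: in the high-altitude endorheic lake basins of the west, tectonic subsidence and glacial erosion have produced deep lakes, while on the eastern plains long-term sediment deposition has produced shallow, saucer-shaped lakes. Total lake water storage nationwide is about 1,081 to 1,285 cubic kilometers, of which about 335 cubic kilometers is fresh water and about 839 cubic kilometers is salt water. About 65% of lake fresh water is stored in a small number of deep, open lakes in western endorheic basin regions such as the Qinghai-Tibet Plateau. Academic attention to China's freshwater lakes has mostly focused on the eastern plains and the Yunnan-Guizhou Plateau, but this study found that the Qinghai-Tibet Plateau not only has very large, deep freshwater lakes such as Taro Co, Mapam Yumco, and Urru Co, its natural lakes also hold more fresh water in total than the eastern plain lake region. Per capita storage is about 20,680 cubic meters in the Qinghai-Tibet Plateau lake region versus only 65 cubic meters in the eastern plain lake region, a difference of nearly 330-fold.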

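  6. Microsoft releases Azure Linux 4.0, its first general-purpose Linux distribution

    Brendan Burns, Kubernetes co-founder and Microsoft vice president, made a surprise announcement of a general-purpose Linux distribution at Open Source Summit North America. Microsoft has previously released Linux applications, Azure Sphere for edge computing devices, and the Linux container platform CBL-Mariner, later renamed Azure Linux, but it had never released a general-purpose distribution. Lachlan Evenson, principal program manager on Microsoft's Azure open source team, said that with Azure Linux 4.0 Microsoft is working to turn Azure Linux into a fully featured general-purpose cloud distribution. Azure Linux 4.0 is based on Fedora Linux, is published on GitHub, uses Fedora's RPM package management system, and is deeply integrated with the Azure cloud platform. Developers can run Azure Linux 4.0 on Windows 11 through WSL, but without a GUI. Microsoft has committed to releasing patches for Azure Linux every month, and to releasing fixes promptly if critical vulnerabilities emerge.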

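  7. Meta reassigns 7,000 employees to focus on AI

    Meta told employees on Monday that it will reassign 7,000 of them to focus on AI. In an internal memo, Meta HR chief Janelle Gale said the employees will move to four new organizations focused on building new AI tools and applications. The new organizations use an "AI-native design structure" and will have fewer managers per employee than other parts of the company. Meta had more than 78,000 employees at the end of 2025. It recently announced that it would lay off 8,000 people. Meta CEO Mark Zuckerberg is betting the company's future on AI; early this year he said he planned to spend $115 billion to $135 billion within the year, most of it on developing new AI technology.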

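  8. Jury rules against Musk on statute-of-limitations grounds

    After a three-week trial in the closely watched case of Elon Musk v. OpenAI, the jury ruled against Musk on Monday on the grounds that the three-year statute of limitations had expired. OpenAI was founded in 2015 by Sam Altman, Greg Brockman, Musk, and others, initially as a nonprofit. Musk left the board in 2018; the following year OpenAI set up a for-profit company and received a $1 billion investment from Microsoft. In 2024 Musk sued OpenAI and co-founders Altman and Brockman in San Francisco Superior Court, alleging that they violated the company's founding principles by putting commercial interests ahead of the public interest. The jury found that he waited too long to sue and failed to raise in time his claim that OpenAI had departed from its nonprofit mission.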

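  9. Different human organs age out of sync

    Researchers at the Chinese Academy of Sciences published a study in Cell in which they built independent aging clocks for six organs: brain, liver, lung, muscle, blood vessels, and skin. The study confirmed that organ aging is markedly asynchronous, with the liver's aging inflection point arriving earlier than the brain's. The team also identified two windows of nonlinear change, at ages 40 to 50 and 60 to 70; the 60-to-70 window, which is accompanied by marked activation of the coagulation pathway, is a key stage of accelerated aging. The study further found that coagulation factors derived from the aging liver are upregulated in concert. In vitro experiments confirmed that key coagulation factors can induce endothelial cell senescence, and in vivo mouse experiments showed that injecting F13B can induce accelerated aging in multiple tissues, establishing coagulation factors as core molecules driving vascular and multi-organ aging. On the clinical side, a single set of representative plasma proteins was enough to approximately reconstruct the core clocks, suggesting that blood tests could become a viable way to assess biological age. Organ-specific aging clocks can identify prematurely aging organs early and provide differentiated intervention targets: vascular aging driven by coagulation factors could be addressed by targeting the coagulation pathway, while other types of organ aging could be matched with different lifestyle or drug interventions.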

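  10. Where you live is linked to how fast you age

    According to a study published in Cell, researchers analyzed 322 healthy people of European, East Asian, and South Asian ancestry to build the most detailed map to date of how genetic ancestry and environment shape human biology. By recruiting people with the same genetic background living on different continents, the scientists were able to separate the effects of DNA from those of environment with unprecedented clarity. The researchers found that, wherever people move, ancestry has a profound effect on the immune system, metabolism, and gut microbiome. South Asians showed higher levels of pathogen exposure. Europeans had greater gut microbial diversity and higher levels of compounds associated with heart disease risk. Moving between continents altered major metabolic pathways and shifted the balance of gut microbes. A major finding of the study is that where you live is linked to how fast you age. East Asians living outside Asia were biologically older than East Asians who had not moved. The opposite held for Europeans: those living outside Europe were biologically younger.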

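  11. Iran demands fees for subsea cables passing through the Strait of Hormuz

    Iranian military spokesman Ebrahim Zolfaghari announced on X that Iran will charge fees for subsea cables passing through the Strait of Hormuz. It is not yet clear whether Iran is only issuing a threat or will act on it. Iran's plan would require companies such as Google, Microsoft, Meta, and Amazon to comply with its laws, while subsea cable companies would be required to pay transit license fees, and repair and maintenance rights would be granted exclusively to Iranian companies. Subsea cables carry network and financial traffic between Europe, Asia, and the Persian Gulf; damaging them would cause a digital disaster, threatening everything from banking systems, military communications, and AI cloud infrastructure to remote work, online gaming, and streaming services.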

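  12. Microsoft to change how Edge loads passwords

    A security researcher disclosed earlier this month that Edge's built-in password manager decrypts all saved passwords at browser startup and loads them into memory. He contacted Microsoft and was told the behavior was "by design" and not a security concern. Microsoft stressed at the time that this was the application's expected behavior. Just a few days later, however, Microsoft changed its mind and announced that future versions of Edge will no longer load passwords at startup. The fix has been released to the Edge Canary channel and will be included in Edge build 148 or later.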

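  13. Terraria sells 70 million copies in 15 years

    The developer of the indie sandbox game Terraria announced that the game has sold 70 million copies in the 15 years since launch. PC leads with 39.6 million copies, followed by mobile with 19.7 million and consoles with 10.7 million; the mod tool tModLoader has been downloaded 12.3 million times. Over the past year the PC version averaged 461,000 daily players with a peak of 1.4 million, and PC players have an average playtime of 101 hours 18 minutes. The developer said it will keep updating Terraria. Terraria ranks seventh among the best-selling games of all time. The best seller is Minecraft (setting Tetris aside), with more than 350 million copies sold, followed by Grand Theft Auto V at 225 million, Wii Sports at 82.9 million, Red Dead Redemption 2 at 82 million, Mario Kart 8 at 79.54 million, and PUBG at 75 million, among others.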

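  14. Samsung Electronics union threatens general strike

    The Samsung Electronics union has announced an 18-day general strike starting on the 21st; the two sides disagree over the cap on performance bonuses. The South Korean government has expressed serious concern, and Prime Minister Kim Min-seok said on Sunday that if the strike causes major losses to the national economy, the government will take all available measures to protect it, including invoking its emergency arbitration powers. On Monday, Samsung Electronics management and labor held their latest round of talks. The same day, a South Korean court ruled on management's application for an injunction barring the union from illegal collective action, siding with management on most points and requiring that production not be disrupted even if the union strikes. The decision could have a considerable impact on the negotiations and on the union's strike plans. Experts estimate that each strike day could cost up to $2 billion, and that the 18-day general strike would cost close to $17 billion. JPMorgan estimates that losses could reach as much as $28 billion.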

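  15. Engineers who can maintain NASA's Voyager code are increasingly scarce

    NASA launched the two Voyager probes 48 years ago, and the engineers who wrote code for them are now white-haired or have passed away. The Voyager onboard computers run assembly language written specifically for processors developed by General Electric. Each probe carries three computer systems: the Computer Command Subsystem (CCS), the Attitude and Articulation Control Subsystem (AACS), and the Flight Data Subsystem (FDS). Low-level flight operations depend on specialized assembly language, while ground systems and early mission tools used Fortran. The onboard computers have very little memory, only 64 to 70 KB in total. Over the decades the ground control team has shrunk and aged. Worse, much of the original documentation is lost or fragmentary. Project files were mostly on paper, and more were lost each time the project moved offices. Suzy Dodd, director of NASA JPL's Interplanetary Network Directorate, said in 2024 that the people who built the probes are all gone. Larry Zottarelli was the last original-team engineer still working; he retired in 2016 at age 80. Most of the current Voyager maintenance team is over 80, and the team also relies on a list of retired engineers it can call in an emergency. That list shrinks every year.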
