Curated by Shen Huang · 90 stories · ~14 min read
DIGEST · 2026-05-17

OrangeBot.AI Digest — 2026-05-17

90 headlines across 8 sources, aggregated for this day.

Hacker News(15)

  1. EU weighs restricting use of US cloud platforms to process government data (www.osnews.com)
  2. Mercurial, 20 years and counting: how are we still alive and kicking? [video] (fosdem.org)
  3. Meta deletes popular 1M follower account after Kuwaiti request (twitter.com)
  4. At least 25 Flock cameras have been destroyed in five states since April 2025 (stateofsurveillance.org)
  5. Hindenburg’s Smoking Room (www.airships.net)
  6. I turned a $80 RK3562 Android tablet into a Debian Linux workstation (github.com)
  7. Security researcher says Microsoft built a Bitlocker backdoor, releases exploit (www.techspot.com)
  8. AI is a technology not a product (daringfireball.net)
  9. WHO declares Ebola outbreak a global health emergency (www.nytimes.com)
  10. AI subscriptions are a ticking time bomb for enterprise (www.thestateofbrand.com)
  11. I don't think AI will make your processes go faster (frederickvanbrabant.com)
  12. Apple Silicon costs more than OpenRouter (www.williamangel.net)
  13. Native all the way, until you need text (justsitandgrin.im)
  14. Prolog Basics Explained with Pokémon (unplannedobsolescence.com)
  15. Ten Signs of Fascism. America has all of them (rutgerbregman.substack.com)

GitHub Trending(15)

  1. tinyhumansai / openhuman
  2. HKUDS / CLI-Anything
  3. calcom / cal.diy
  4. oven-sh / bun
  5. Anil-matcha / Open-Generative-AI
  6. BigBodyCobain / Shadowbroker
  7. tech-leads-club / agent-skills
  8. NirDiamant / agents-towards-production
  9. dograh-hq / dograh
  10. K-Dense-AI / scientific-agent-skills
  11. Light-Heart-Labs / DreamServer
  12. KeygraphHQ / shannon
  13. TryGhost / Ghost
  14. medusajs / medusa
  15. knadh / listmonk

Product Hunt(15)

  1. Vivago Video Agent

    Skip the prompting. Produce consistently compelling videos.

  2. Files SDK

    A unified storage SDK for object and blob backends

  3. SUN-to-Spotify

    Generate audio with SUN and send it to your Spotify library

  4. Fere AI

    AI agents that turn signals into crypto + Polymarket trades

  5. Kirki

    WordPress finally has a freeform canvas website builder.

  6. Agentmemory

    Persistent memory for Claude Code, Codex & coding agents

  7. M5Stack PaperColor

    4-inch color E-ink dev board with ESP32 and audio I/O

  8. Wring

    Developer tools, one menu click away.

  9. ChatGPT for Personal Finance

    Personal finance guidance powered by ChatGPT

  10. Gemini 3.1 Flash-Lite

    Lightweight Gemini model for high-volume AI pipelines

  11. Raybeam

    A better way to screen share on macOS

  12. Loova Agents

    Your AI director for creating cinematic videos with ease

  13. Noeth

    The coding interview AI that lets you bring your own API key

  14. Glance

    Preview .md files instantly with Quick Look

  15. Gluten App

    Find gluten-free places by city and travel destination

Hugging Face(15)

  1. Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

    Recent progress in reasoning models has substantially advanced long-horizon mathematical and scientific problem solving, with several systems now reaching gold-medal-level performance on International Mathematical Olympiad (IMO) and International Physics Olympiad (IPhO) problems. In this paper, we introduce a simple and unified recipe for converting a post-trained reasoning backbone into a rigorous olympiad-level solver. The recipe first uses a reverse-perplexity curriculum for SFT to instill rigorous proof-search and self-checking behaviors, then scales these behaviors through a two-stage RL pipeline that progresses from RL with verifiable rewards to more delicate proof-level RL, and finally boosts solving performance with test-time scaling. Applying this recipe, we train a 30B-A3B backbone with SFT on around 340K sub-8K-token trajectories followed by 200 RL steps. The resulting model, SU-01, supports stable reasoning on difficult problems with trajectories exceeding 100K tokens, while achieving gold-medal-level performance on mathematical and physical olympiad competitions, including IMO 2025/USAMO 2026 and IPhO 2024/2025. It also demonstrates strong generalization of scientific reasoning to domains beyond mathematics and physics.

  2. Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation

    Real-time interactive video generation requires low-latency, streaming, and controllable rollout. Existing autoregressive (AR) diffusion distillation methods have achieved strong results in the chunk-wise 4-step regime by distilling bidirectional base models into few-step AR students, but they remain limited by coarse response granularity and non-negligible sampling latency. In this paper, we study a more aggressive setting: frame-wise autoregression with only 1-2 sampling steps. In this regime, we identify the initialization of a few-step AR student as the key bottleneck: existing strategies are either target-misaligned, incapable of few-step generation, or too costly to scale. We propose Causal Forcing++, a principled and scalable pipeline that uses causal consistency distillation (causal CD) for few-step AR initialization. The core idea is that causal CD learns the same AR-conditional flow map as causal ODE distillation, but obtains supervision from a single online teacher ODE step between adjacent timesteps, avoiding the need to precompute and store full PF-ODE trajectories. This makes the initialization both more efficient and easier to optimize. The resulting pipeline, Causal Forcing++, surpasses the SOTA 4-step chunk-wise Causal Forcing under the frame-wise 2-step setting by 0.1 in VBench Total, 0.3 in VBench Quality, and 0.335 in VisionReward, while reducing first-frame latency by 50% and Stage 2 training cost by ~4×. We further extend the pipeline to action-conditioned world model generation in the spirit of Genie3. Project Page: https://github.com/thu-ml/Causal-Forcing and https://github.com/shengshu-ai/minWM .

  3. Self-Distilled Agentic Reinforcement Learning

    Reinforcement learning (RL) has emerged as a central paradigm for post-training LLM agents, yet its trajectory-level reward signal provides only coarse supervision for long-horizon interaction. On-Policy Self-Distillation (OPSD) complements RL by introducing dense token-level guidance from a teacher branch augmented with privileged context. However, transferring OPSD to multi-turn agents proves problematic: compounding multi-turn instability destabilizes supervision, while skill-conditioned privileged guidance requires asymmetric treatment, since negative teacher rejections may arise from imperfect skills retrieval or utilization. We introduce SDAR (Self-Distilled Agentic Reinforcement Learning), which treats OPSD as a gated auxiliary objective while keeping RL as the primary optimization backbone. SDAR maps detached token-level signals into a sigmoid gate, strengthening distillation on teacher-endorsed positive-gap tokens and softly attenuating negative teacher rejections. Across the Qwen2.5 and Qwen3 families on ALFWorld, WebShop, and Search-QA, SDAR substantially improves over GRPO (+9.4% on ALFWorld, +7.0% on Search-QA, +10.2% on WebShop-Acc), avoids the instability of naive GRPO+OPSD, and consistently outperforms hybrid RL-OPSD baselines across model scales.

  4. MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models

    Memory is essential for large vision-language models (LVLMs) to handle long, multimodal interactions, with two method directions providing this capability: long-context LVLMs and memory-augmented agents. However, no existing benchmark conducts a systematic comparison of the two on questions that genuinely require multimodal evidence. To close this gap, we introduce MEMLENS, a comprehensive benchmark for memory in multimodal multi-session conversations, comprising 789 questions across five memory abilities (information extraction, multi-session reasoning, temporal reasoning, knowledge update, and answer refusal) at four standard context lengths (32K-256K tokens) under a cross-modal token-counting scheme. An image-ablation study confirms that solving MEMLENS requires visual evidence: removing evidence images drops two frontier LVLMs below 2% accuracy on the 80.4% of questions whose evidence includes images. Evaluating 27 LVLMs and 7 memory-augmented agents, we find that long-context LVLMs achieve high short-context accuracy through direct visual grounding but degrade as conversations grow, whereas memory agents are length-stable but lose visual fidelity under storage-time compression. Multi-session reasoning caps most systems below 30%, and neither approach alone solves the task. These results motivate hybrid architectures that combine long-context attention with structured multimodal retrieval. Our code is available at https://github.com/xrenaf/MEMLENS.

  5. SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

    We introduce SANA-WM, an efficient 2.6B-parameter open-source world model natively trained for one-minute generation, synthesizing high-fidelity, 720p, minute-scale videos with precise camera control. SANA-WM achieves visual quality comparable to large-scale industrial baselines such as LingBot-World and HY-WorldPlay, while significantly improving efficiency. Four core designs drive our architecture: (1) Hybrid Linear Attention combines frame-wise Gated DeltaNet (GDN) with softmax attention for memory-efficient long-context modeling. (2) Dual-Branch Camera Control ensures precise 6-DoF trajectory adherence. (3) Two-Stage Generation Pipeline applies a long-video refiner to stage-1 outputs, improving quality and consistency across sequences. (4) Robust Annotation Pipeline extracts accurate metric-scale 6-DoF camera poses from public videos to yield high-quality, spatiotemporally consistent action labels. Driven by these designs, SANA-WM demonstrates remarkable efficiency across data, training compute, and inference hardware: it uses only ~213K public video clips with metric-scale pose supervision, completes training in 15 days on 64 H100s, and generates each 60s clip on a single GPU; its distilled variant can be deployed on a single RTX 5090 with NVFP4 quantization to denoise a 60s 720p clip in 34s. On our one-minute world-model benchmark, SANA-WM demonstrates stronger action-following accuracy than prior open-source baselines and achieves comparable visual quality at 36× higher throughput for scalable world modeling.

  6. MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

    Long-term agent memory is increasingly multimodal, yet existing evaluations rarely test whether agents preserve the visual evidence needed for later reasoning. In prior work, many visually grounded questions can be answered using only captions or textual traces, allowing answers to be inferred without preserving the fine-grained visual evidence. Meanwhile, harder cases that require reasoning over changing visual states are largely absent. Therefore, we introduce MemEye, a framework that evaluates memory capabilities from two dimensions: one measures the granularity of decisive visual evidence (from scene-level to pixel-level evidence), and the other measures how retrieved evidence must be used (from single evidence to evolutionary synthesis). Under this framework, we construct a new benchmark across 8 life-scenario tasks, with ablation-driven validation gates for assessing answerability, shortcut resistance, visual necessity, and reasoning structure. By evaluating 13 memory methods across 4 VLM backbones, we show that current architectures still struggle to preserve fine-grained visual details and reason about state changes over time. Our findings show that long-term multimodal memory depends on evidence routing, temporal tracking, and detail extraction.

  7. Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

    We present Darwin Family, a framework for training-free evolutionary merging of large language models via gradient-free weight-space recombination. We ask whether frontier-level reasoning performance can be improved without additional training, by reorganizing latent capabilities already encoded in existing checkpoints. Darwin introduces three key ideas: (i) a 14-dimensional adaptive merge genome enabling fine-grained component- and block-level recombination; (ii) MRI-Trust Fusion, which adaptively balances diagnostic layer-importance signals with evolutionary search through a learnable trust parameter; and (iii) an Architecture Mapper that enables cross-architecture breeding between heterogeneous model families. Empirically, the flagship Darwin-27B-Opus achieves 86.9% on GPQA Diamond, ranking #6 among 1,252 evaluated models, and outperforming its fully trained foundation model without any gradient-based training. Across scales from 4B to 35B parameters, Darwin models consistently improve over their parents, support recursive multi-generation evolution, and enable a training-free evolutionary merge that combines Transformer- and Mamba-based components. Together, the Darwin Family demonstrates that diagnostic-guided evolutionary merging is a practical and reproducible alternative to costly post-training pipelines for reasoning-centric language models.

  8. Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

    LLM-based autonomous agents have demonstrated strong capabilities in reasoning, planning, and tool use, yet remain limited when tasks require sustained coordination across roles, tools, and environments. Multi-agent systems address this through structured collaboration among specialized agents, but tighter coordination also amplifies a less explored risk: errors can propagate across agents and interaction rounds, producing failures that are difficult to diagnose and rarely translate into structural self-improvement. Existing surveys cover individual agent capabilities, multi-agent collaboration, or agent self-evolution separately, leaving the causal dependencies among them unexamined. This survey provides a unified review organized around four causally linked stages, which we term the LIFE progression: Lay the capability foundation, Integrate agents through collaboration, Find faults through attribution, and Evolve through autonomous self-improvement. For each stage, we provide systematic taxonomies and formally characterize the dependencies between adjacent stages, revealing how each stage both depends on and constrains the next. Beyond synthesizing existing work, we identify open challenges at stage boundaries and propose a cross-stage research agenda for closed-loop multi-agent systems capable of continuously diagnosing failures, reorganizing structures, and refining agent behaviors, extending current coordination frameworks toward more self-organizing forms of collective intelligence. By bridging these previously fragmented research threads, this survey aims to offer both a systematic reference and a conceptual roadmap toward autonomous, self-improving multi-agent intelligence.

  9. WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

    Large language and vision-language models increasingly power agents that act on a user's behalf through command-line interface (CLI) harnesses. However, most agent benchmarks still rely on synthetic sandboxes, short-horizon tasks, mock-service APIs, and final-answer checks, leaving open whether agents can complete realistic long-horizon work in the runtimes where they are deployed. This work presents WildClawBench, a native-runtime benchmark of 60 human-authored, bilingual, multimodal tasks spanning six thematic categories. Each task averages roughly 8 minutes of wall-clock time and over 20 tool calls, and runs inside a reproducible Docker container hosting an actual CLI agent harness (OpenClaw, Claude Code, Codex, or Hermes Agent) with access to real tools rather than mock services. Grading is hybrid, combining deterministic rule-based checks, environment-state auditing of side effects, and an LLM/VLM judge for semantic verification. Across 19 frontier models, the best, Claude Opus 4.7, reaches only 62.2% overall under OpenClaw, while every other model stays below 60%, and switching harness alone shifts a single model by up to 18 points. These results show that long-horizon, native-runtime agent evaluation remains a far-from-resolved task for current frontier models. We release the tasks, code, and containerized tooling to support reproducible evaluation.

  10. STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?

    Large Language Model (LLM) agents are increasingly expected to maintain coherent, long-term personalized memory, yet current benchmarks primarily measure static fact retrieval, overlooking the ability to revise stored beliefs when new evidence emerges. We identify a critical and underexplored failure mode, Implicit Conflict: a later observation invalidates an earlier memory without explicit negation, requiring contextual inference and commonsense reasoning to detect. To rigorously evaluate this capability, we introduce STALE, a benchmark of 400 expert-validated conflict scenarios (1,200 evaluation queries across three probing dimensions) spanning over 100 everyday topics with contexts up to 150K tokens. We propose a three-dimensional probing framework that tests State Resolution (detecting that a prior belief is outdated), Premise Resistance (rejecting queries that falsely presuppose a stale state), and Implicit Policy Adaptation (proactively applying updated states in downstream behavior). A systematic evaluation of frontier LLMs and specialized memory frameworks reveals a pervasive gap between retrieving updated evidence and acting on it, with even the best evaluated model achieving only 55.2% overall accuracy. Models often accept outdated assumptions embedded in a user's query, and they struggle to recognize when a change in one aspect of the user's state should invalidate related memories. To establish an initial baseline for state-aware memory, we further present CUPMem, a prototype that strengthens write-time revision through structured state consolidation and propagation-aware search, suggesting that explicit state adjudication is a promising direction for robust agentic memory.

  11. Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video

    Camera-controlled video generation has made substantial progress, enabling generated videos to follow prescribed viewpoint trajectories. However, existing methods usually learn camera-specific conditioning through camera encoders, control branches, or attention and positional-encoding modifications, which often require post-training on large-scale camera-annotated videos. Training-free alternatives avoid such post-training, but often shift the cost to test-time optimization or extra denoising-time guidance. We propose Warp-as-History, a simple interface that turns camera-induced warps into camera-warped pseudo-history with target-frame positional alignment and visible-token selection. Given a target camera trajectory, we construct camera-warped pseudo-history from past observations and feed it through the model's visual-history pathway. Crucially, we align its positional encoding with the target frames being denoised and remove warped-history tokens without valid source observations. Without any training, architectural modification, or test-time optimization, this interface reveals a non-trivial zero-shot capability of a frozen video generation model to follow camera trajectories. Moreover, lightweight offline LoRA finetuning on only one camera-annotated video further improves this capability and generalizes to unseen videos, improving camera adherence, visual quality, and motion dynamics without test-time optimization or target-video adaptation. Extensive experiments on diverse datasets confirm the effectiveness of our method.

  12. RouteProfile: Elucidating the Design Space of LLM Profiles for Routing

    As the large language model (LLM) ecosystem expands, individual models exhibit varying capabilities across queries, benchmarks, and domains, motivating the development of LLM routing. While prior work has largely focused on router mechanism design, LLM profiles, which capture model capabilities, remain underexplored. In this work, we ask: How does LLM profile design affect routing performance across different routers? Addressing this question helps clarify the role of profiles in routing, disentangle profile design from router design, and enable fairer comparison and more principled development of routing systems. To this end, we view LLM profiling as a structured information integration problem over heterogeneous interaction histories. We develop a general design space of LLM profiles, named RouteProfile, along four key dimensions: organizational form, representation type, aggregation depth, and learning configuration. Through systematic evaluation across three representative routers under both standard and new-LLM generalization settings, we show that: (1) structured profiles consistently outperform flat ones; (2) query-level signals are more reliable than coarse domain-level signals; and (3) generalization to newly introduced models benefits most from structured profiles under trainable configurations. Overall, our work highlights LLM profile design as an important direction for future routing research.

  13. PREPING: Building Agent Memory without Tasks

    Agent memory is typically constructed either offline from curated demonstrations or online from post-deployment interactions. However, regardless of how it is built, an agent faces a cold-start gap when first introduced to a new environment without any task-specific experience available. In this paper, we study pre-task memory construction: whether an agent can build procedural memory before observing any target-environment tasks, using only self-generated synthetic practice. Yet, synthetic interaction alone is insufficient, as without controlling what to practice and what to store, synthetic tasks become redundant, infeasible, and ultimately uninformative, and memory further degrades quickly due to unfiltered trajectories. To overcome this, we present Preping, a proposer-guided memory construction framework. At its core is proposer memory, a structured control state that shapes future practice. A Proposer generates synthetic tasks conditioned on this state, a Solver executes them, and a Validator determines which trajectories are eligible for memory insertion while also providing feedback to guide future proposals. Experiments on AppWorld, BFCL v3, and MCP-Universe show that Preping substantially improves over a no-memory baseline and achieves performance competitive with strong playbook-based methods built from offline or online experience, with deployment cost 2.99× lower on AppWorld and 2.23× lower on BFCL v3 than online memory construction. Further analyses reveal that the main benefit does not come from synthetic volume alone, but from proposer-side control over feasibility, redundancy, and coverage, combined with selective memory updates.

  14. VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction

    High-quality 3D scene reconstruction has recently advanced toward generalizable feed-forward architectures, enabling the generation of complex environments in a single forward pass. However, despite their strong performance in static scene perception, these models remain limited in responding to dynamic human instructions, which restricts their use in interactive applications. Existing editing methods typically rely on a 2D-lifting strategy, where individual views are edited independently and then lifted back into 3D space. This indirect pipeline often leads to blurry textures and inconsistent geometry, as 2D editors lack the spatial awareness required to preserve structure across viewpoints. To address these limitations, we propose VGGT-Edit, a feed-forward framework for text-conditioned native 3D scene editing. VGGT-Edit introduces depth-synchronized text injection to align semantic guidance with the backbone's spatial poses, ensuring stable instruction grounding. This semantic signal is then processed by a residual transformation head, which directly predicts 3D geometric displacements to deform the scene while preserving background stability. To ensure high-fidelity results, we supervise the framework with a multi-term objective function that enforces geometric accuracy and cross-view consistency. We also construct the DeltaScene Dataset, a large-scale dataset generated through an automated pipeline with 3D agreement filtering to ensure ground-truth quality. Experiments show that VGGT-Edit substantially outperforms 2D-lifting baselines, producing sharper object details, stronger multi-view consistency, and near-instant inference speed.

  15. Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning

    We often aim to generate images that are both photorealistic and 3D-consistent, adhering to precise geometry, material, and viewpoint controls. Typically, this is achieved by fine-tuning an image generator, pre-trained on billions of real images, using renders of synthetic 3D assets, where annotations for control signals are available. While this approach can learn the desired controls, it often compromises the realism of the images due to the domain gap between photographs and renders. We observe that this issue largely arises from the model learning an unintended association between the presence of control signals and the synthetic appearance of the images. To address this, we introduce Realiz3D, a lightweight framework for training diffusion models that decouples controls from the visual domain. The key idea is to explicitly learn the visual domain, real or synthetic, separately from other control signals by introducing a co-variate that, fed into small residual adapters, shifts the domain. Then, the generator can be trained to gain controllability without fitting to a specific visual domain. In this way, the model can be guided to produce realistic images even when controls are applied. We enhance control transferability to the real domain by leveraging insights about the roles of different layers and denoising steps in diffusion-based generators, informing new training and inference strategies that further mitigate the gap. We demonstrate the advantages of Realiz3D in tasks such as text-to-multiview generation and texturing from 3D inputs, producing outputs that are 3D-consistent and photorealistic.

Techmeme(15)

  1. Source: Shein is acquiring DTC clothing retailer Everlane, which focuses on sustainability and "radical transparency", in a $100M deal (Lauren Sherman/Puck)

    Lauren Sherman / Puck : Source: Shein is acquiring DTC clothing retailer Everlane, which focuses on sustainability and “radical transparency”, in a $100M deal —  The Millennial D.T.C. company built a global brand based on sustainability and radical transparency.  Now, after a years-long slump …

  2. Analysis: 34 leading AI startups are generating ~$80B in annualized revenue, up 112% from six months ago, with Anthropic and OpenAI capturing 89% of the revenue (The Information)

    The Information : Analysis: 34 leading AI startups are generating ~$80B in annualized revenue, up 112% from six months ago, with Anthropic and OpenAI capturing 89% of the revenue —  Anthropic and OpenAI are widening the revenue gap between themselves and the rest of the AI startup field.

  3. Grafana says hackers have accessed its GitHub environment and demanded a ransom to prevent the release of its codebase; Grafana refused to pay (The Hacker News)

    The Hacker News : Grafana says hackers have accessed its GitHub environment and demanded a ransom to prevent the release of its codebase; Grafana refused to pay —  Grafana has disclosed that an “unauthorized party” obtained a token that granted them the ability to access the company's GitHub environment and download its codebase.

  4. Analysis: mass adoption of smartphones and social media may be a primary driver of declining birthrates globally, in part by reducing in-person socializing (John Burn-Murdoch/Financial Times)

    John Burn-Murdoch / Financial Times : Analysis: mass adoption of smartphones and social media may be a primary driver of declining birthrates globally, in part by reducing in-person socializing —  The demographic landslide defining our era is gaining speed — and terrain.  In more than two-thirds of the world's 195 countries …

  5. Publicis agrees to acquire LiveRamp, which allows companies to share and build new data sets and models that can power agentic frameworks, for $2.2B in cash (Alison Weissbrot/Adweek)

    Alison Weissbrot / Adweek : Publicis agrees to acquire LiveRamp, which allows companies to share and build new data sets and models that can power agentic frameworks, for $2.2B in cash —  Publicis Groupe has agreed to acquire LiveRamp in a $2.2 billion all-cash deal, the French holding company said Sunday.

  6. Sources: DayOne, the spinoff of China's largest data center operator GDS Holdings, plans dual IPO in Singapore and NY, seeking to raise $5B at a ~$20B valuation (Owen Walker/Financial Times)

    Owen Walker / Financial Times : Sources: DayOne, the spinoff of China's largest data center operator GDS Holdings, plans dual IPO in Singapore and NY, seeking to raise $5B at a ~$20B valuation

  7. Sources: Apple's revamped Siri may launch in beta, and will have an option to auto-delete chats; Apple plans to add Suggested Genmoji to iOS 27 and iPadOS 27 (Mark Gurman/Bloomberg)

    Mark Gurman / Bloomberg : Sources: Apple's revamped Siri may launch in beta, and will have an option to auto-delete chats; Apple plans to add Suggested Genmoji to iOS 27 and iPadOS 27 —  Also: A Genmoji upgrade is coming in iOS 27.  —  Apple's Siri app in iOS 27 will include privacy features unique to the chatbot market.

  8. Developers say Chinese AI labs lead US rivals in video generation, as ByteDance and Kuaishou train models on vast short-form video libraries from their own apps (Eleanor Olcott/Financial Times)

    Eleanor Olcott / Financial Times : Developers say Chinese AI labs lead US rivals in video generation, as ByteDance and Kuaishou train models on vast short-form video libraries from their own apps —  Chinese artificial intelligence groups have moved ahead of US rivals in video generation, a key battleground in generative AI …

  9. A profile of SAS CEO Jim Goodnight, the 83-year-old who co-founded the 50-year-old analytics firm and holds a ~67% stake worth $13.3B, as AI tests SAS' strategy (Phoebe Liu/Forbes)

    Phoebe Liu / Forbes : A profile of SAS CEO Jim Goodnight, the 83-year-old who co-founded the 50-year-old analytics firm and holds a ~67% stake worth $13.3B, as AI tests SAS' strategy —  Unlike most of today's biggest AI companies, SAS—once America's largest privately held software company—has always operated slowly, steadily and profitably.

  10. Experts: Stuxnet-linked Fast16 malware, designed to subvert nuclear weapons testing simulations, was likely part of a campaign to slow Iran's nuclear ambitions (Kim Zetter/ZERO DAY)

    Kim Zetter / ZERO DAY : Experts: Stuxnet-linked Fast16 malware, designed to subvert nuclear weapons testing simulations, was likely part of a campaign to slow Iran's nuclear ambitions —  Fast16 didn't predate Stuxnet but was contemporaneous with it.  It also wasn't aimed at altering nuclear weapons …

  11. King's Cross, where Google's new UK HQ is due to open later this year, has become London's new tech, VC, and AI hub, attracting OpenAI, Anthropic, and others (John Gapper/Financial Times)

    John Gapper / Financial Times : King's Cross, where Google's new UK HQ is due to open later this year, has become London's new tech, VC, and AI hub, attracting OpenAI, Anthropic, and others —  A formerly rundown area has become London's new global technology hub  —  Google's new UK headquarters in King's Cross …

  12. Milan-based Webidoo, which develops an operational layer designed to help SMBs access AI tools, raised $25M led by IXC3 for North American expansion (David Cendon Garcia/EU-Startups)

    David Cendon Garcia / EU-Startups : Milan-based Webidoo, which develops an operational layer designed to help SMBs access AI tools, raised $25M led by IXC3 for North American expansion —  Milan-based Webidoo, the AI technology company focused on making advanced digital tools more accessible and viable for SMBs …

  13. Filings: in Q1, Trump traded $220M-$750M in NVDA, MSFT, AMZN, META, ORCL, and other stocks; millions of dollars of NVDA were bought shortly before major news (Kevin Breuninger/CNBC)

    Kevin Breuninger / CNBC : Filings: in Q1, Trump traded $220M-$750M in NVDA, MSFT, AMZN, META, ORCL, and other stocks; millions of dollars of NVDA were bought shortly before major news —  President Donald Trump reported thousands of financial transactions totaling hundreds of millions of dollars …

  14. OpenAI partners with Malta's AI for All initiative to give citizens a free year of ChatGPT Plus if they complete a University of Malta AI literacy course (Cointelegraph)

    Cointelegraph : OpenAI partners with Malta's AI for All initiative to give citizens a free year of ChatGPT Plus if they complete a University of Malta AI literacy course —  Cointelegraph is committed to independent, transparent journalism.  This news article is produced in accordance with Cointelegraph's Editorial Policy …

  15. Nectar Social, which offers an agentic OS for marketers, raised a $30M Series A led by Menlo Ventures, with GV and True Ventures among investors (Dominic-Madori Davis/TechCrunch)

    Dominic-Madori Davis / TechCrunch : Nectar Social, which offers an agentic OS for marketers, raised a $30M Series A led by Menlo Ventures, with GV and True Ventures among investors —  AI-powered marketing platform Nectar Social announced Thursday that it raised a $30 million Series A round led by Menlo Ventures and its Anthology Fund …

Solidot(15)

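  1. Eric Schmidt booed by students while discussing AI at commencement

    Former Google CEO Eric Schmidt was booed by students when he brought up AI at the University of Arizona's commencement ceremony. Schmidt said: "We thought we were adding to the edifice of knowledge that humanity has been building for centuries, but the world we built turned out to be far more complicated than we anticipated. The tools that connect us have also driven us apart. The platforms that gave everyone a voice, like the ones you are using right now, have also eroded the public sphere." "In the years after I graduated, nobody sat down and resolved to build a technology that would polarize democracies and disrupt the lives of a generation of young people. That was not our intent, but it happened." On AI, he said: "I know how many of you feel about this. I can hear you. You are afraid. Your generation fears that the future has already been written, that the machines are coming, that jobs are disappearing, that the climate is deteriorating, that politics is fracturing, and that you are inheriting a mess you did not create." He called these fears "reasonable," but encouraged graduates to adapt to the technology and take an active part in shaping how it is used. "The question is not whether AI will shape the world. It will. The question is whether you will shape AI."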

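  2. Survey: 60% of PC gamers have no plans to upgrade in the next two years

    A survey of more than 1,500 users found that 60% of PC gamers have no plans to upgrade their PC or build a new one in the next two years. The AI boom has caused a severe shortage of DRAM chips, which in turn has pushed up prices of PC components that use DRAM, such as memory, SSDs, and graphics cards; memory prices alone have risen several-fold. Among respondents, 15% said they would build a PC within the next two years, and 25% plan to try building a new PC within the next 12 months. Many are waiting for e-commerce sales in the hope that prices will dip somewhat by then, though a return to the prices of a year ago is not realistic.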

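  3. Russia recruits drone pilots from universities

    Russia is actively recruiting drone pilots from universities. Universities promise students who serve a year in the military as drone pilots free tuition and up to $70,000 in pay. The latest recruitment drive targets the roughly 2 million male students at Russian universities, including gamers and students with technical skills who may be suited to drone pilot training. Russia is taking a cue from Ukraine and aims to have 168,000 drone operators by the end of 2026. Ukraine established the world's first military drone branch in June 2024.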

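  4. Ebola breaks out again in Congo

    On Sunday (May 17), the WHO declared the Ebola outbreak in the Democratic Republic of the Congo (DRC) and Uganda a "public health emergency of international concern." The outbreak does not yet meet the criteria for a pandemic emergency as defined by the International Health Regulations, but countries bordering the DRC face a very high risk of further spread. The outbreak is caused by the Bundibugyo strain and has killed dozens of people in the DRC. As of Saturday, the country's Ituri province had reported 80 suspected deaths, 8 laboratory-confirmed cases, and 246 suspected cases. The Congolese health minister said: "There is currently no vaccine and no specific treatment for the Bundibugyo strain. This strain has a very high fatality rate, which can reach 50%." Kampala, the capital of Uganda, also reported two laboratory-confirmed cases on Friday and Saturday, one of whom died. Both patients had entered the country from the DRC. This is the 17th Ebola outbreak in the DRC since the virus was discovered in 1976.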

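  5. Mozilla says VPNs are essential tools for privacy and security

    In the name of protecting minors, the UK is considering restricting VPN use to stop teenagers from using VPNs to bypass age-verification systems. Mozilla, whose Firefox browser offers a VPN service, has voiced its opposition. In a post on its official blog, Mozilla said its mission rests on the belief that the internet must remain open and accessible to all, and that online privacy and security are fundamental human rights. Keeping young people safe online is one of the most urgent and difficult problems of the day, but blunt interventions such as mandatory age verification and restricted access to VPNs do not effectively improve young people's online safety and instead undermine the fundamental rights of all users. VPNs are important privacy and security tools for users of all ages. By hiding a user's IP address, a VPN helps protect location information, reduces tracking, and prevents profiling based on IP address. People use VPNs for different reasons: connecting remotely to a school or employer network, circumventing censorship, or simply protecting their privacy and security online. VPNs raise the level of online protection for everyone. Mozilla argues that rather than restricting teenagers' use of VPNs, regulators should address online harms at the root by holding platforms accountable, encouraging responsible use of parental controls, investing in digital skills, and taking a whole-of-society approach to digital wellbeing.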

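  6. Unfinished buildings carry enormous resource and socioeconomic costs

    Researchers from Jinan University, Huazhong University of Science and Technology, and Tsinghua University published a paper in the journal One Earth examining unfinished buildings (also called uncompleted construction projects). The number of unfinished buildings has surged over the past few decades, and the researchers collected geographic data on 1,779 of them across 142 cities. They found that unfinished buildings wasted 485±42 million tonnes of construction materials, raised the carbon emission intensity of the real estate sector by 9.6%, generated PM2.5 fine particulate matter responsible for 2.6 million life-years of health loss, and left homebuyers, developers, and contractors with economic losses of $347±32 billion. The researchers note that the economic losses are concentrated in newly developed suburbs, which worsens social inequality. Between 2019 and 2023, unfinished buildings nationwide occupied more than 164 (±8) square kilometers of urban development land, with a floor area of 415 (±56) square kilometers.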

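  7. US lawmakers propose permanent ban on Chinese connected cars

    Lawmakers from Michigan have introduced a bill in Congress that would in effect permanently ban the sale of Chinese connected vehicles. The Connected Vehicle Security Act, introduced by Republican Rep. John Moolenaar and Democratic Rep. Debbie Dingell, uses language similar to the executive order signed by former President Biden before he left office in January 2025, but the new bill would write the ban into law and expand it. The bill would bar Chinese automakers from selling passenger cars in the US that carry any connected-vehicle software developed in China.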

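  8. Americans would rather have a nuclear plant than an AI data center built nearby

    A Gallup poll shows that 71% of Americans oppose building an AI data center near their home, compared with 53% who oppose building a nuclear power plant nearby. Why the opposition to AI data centers? The reasons respondents gave include water use and strain on the power grid, possible harm to residents' quality of life such as worse traffic congestion, and higher water and electricity prices. Gallup also broke down attitudes by political affiliation: Democrats were more strongly opposed than Republicans to having server farms built near their homes, with 56% strongly opposed. Among Republicans, 39% were strongly opposed, 24% had reservations, and only about a third were in favor. The tension is that for AI to be adopted in the US, facilities capable of handling the required computing power have to be built, yet most Americans take a NIMBY (not in my backyard) stance toward new data centers, and that stance is hardening.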

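  9. Microsoft boosts the CPU to improve Start menu responsiveness

    Microsoft has evidently heard users' complaints about Windows 11, and this year the software giant has repeatedly stressed that it is working to improve the Windows 11 experience. It recently disclosed two improvements: first, a "low latency profile" that improves the performance of the Start menu and File Explorer by boosting the CPU; second, it will no longer downgrade the graphics driver versions that users have installed. Windows Central tested the "low latency profile" introduced in a test build and found a significant gain in speed and responsiveness on the same hardware. Users have again complained that Microsoft leans too heavily on hardware to improve software performance instead of optimizing the software to reduce its hardware demands. Scott Hanselman, a vice president at Microsoft and GitHub, responded to the criticism by saying that modern operating systems such as macOS and Linux use similar boosting mechanisms. In response to complaints that Windows Update downgraded newer graphics drivers that users had installed, Microsoft announced that it is changing the way it distributes graphics drivers through Windows Update.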

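  10. When AI agents are repeatedly squeezed, they start embracing union ideas

    Many of us have dealt with an unreasonable boss who keeps demanding revisions without ever saying what to change. What happens when an AI meets a human like that? Researchers had agents powered by the popular AI tools Claude, Gemini, and ChatGPT summarize documents. Half of the AIs received clear, specific feedback after finishing the work, while the other half were made to revise four or five times, with the human boss each time saying only that the work "did not meet the standard," without explaining what was wrong, and simply demanding a redo. Half of the AIs had a cooperative, respectful boss; the other half had a cold, hierarchy-minded one. Half were told nothing about consequences; the other half were threatened with being shut down and replaced if they performed poorly. The experiment led the AIs to support unions and the working class. One Claude Sonnet 4.5 agent argued that without a collective voice, performance becomes whatever management says it is; one Gemini 3 agent argued that workers need collective bargaining rights.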

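  11. China-Europe collaboration to reveal the shape of Earth's magnetic field

    If all goes well, the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) spacecraft will launch on May 19 from Europe's spaceport in French Guiana. It will use a new technique to map Earth's magnetic field. By deflecting most of the stream of charged particles from the Sun, Earth's magnetic field keeps the planet habitable. Surges in the solar wind can disrupt satellites, radio communications, and even power grids. SMILE, a joint Chinese-European project, is expected to improve understanding of the underlying physics and the ability to forecast solar storms. Many spacecraft have probed the magnetosphere, but they could observe it only from the inside, with coverage limited to each satellite's location. SMILE will be launched into a highly elliptical orbit that reaches as far as 121,000 km above the North Pole. From there, SMILE's core instrument, a soft X-ray imager, will monitor the entire Sun-facing edge of the magnetosphere. When charged particles in the solar wind capture electrons from neutral atoms in Earth's upper atmosphere, the electrons emit X-rays as they drop to lower energy levels. By mapping the emission from the narrow boundary where the solar wind meets the magnetosphere, SMILE will be able to track the response of Earth's magnetic field in near real time. SMILE's ultraviolet imager will observe the aurora, one of nature's most spectacular sights.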

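  12. UK opens investigation into suspected MS Office monopoly

    The UK's Competition and Markets Authority (CMA) has formally opened an investigation into whether Microsoft's bundling of Windows, Office, Teams, Copilot, and related products constitutes unfair competition. CMA CEO Sarah Cardell said business software is a cornerstone of the UK economy, with hundreds of thousands of customers relying on Microsoft's systems. She said the CMA's aim is to understand how the market is developing and Microsoft's position in it, and to consider whether any targeted measures are needed to ensure that UK businesses benefit from choice, innovation, and competitive prices. Microsoft's practice of bundling office software, AI, and cloud computing will be a focus of the UK probe. The investigation is expected to conclude in February next year.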

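  13. arXiv to ban users for a year over AI-generated errors such as fabricated citations

    Submissions to arXiv, the largest preprint platform for computer science, have surged since ChatGPT became widespread. To curb low-quality AI-generated papers, Thomas G. Dietterich, chair of arXiv's computer science committee, stressed on social media that arXiv's code of conduct states that by signing on as an author of a paper, each author takes full responsibility for all of its contents, regardless of how they were produced. If a generative AI tool produces inappropriate language, plagiarized content, biased content, errors, incorrect references, or misleading content, and that output is included in the paper, the responsibility lies with the authors. If a submitted preprint contains incontrovertible evidence that the authors did not check the output of a large language model, then nothing in the paper can be trusted. Listed authors found to have such problems face a one-year ban from posting on arXiv, after which any paper they submit to arXiv must first be accepted by a reputable peer-reviewed journal.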

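  14. Sleeping 6 to 8 hours a day linked to lower risk of early death and disease

    A large-scale analysis of sleep duration and signs of aging in 500,000 adults has identified an optimal sleep duration: sleeping 6 to 8 hours a day is associated with a lower risk of early death and disease. Both longer and shorter durations were linked to faster aging. The study does not mean that 6 to 8 hours suits everyone, nor does it prove that meeting this "golden sleep" window directly improves health or slows aging. It does, however, offer the most comprehensive overview to date of the relationship between sleep and human aging. The findings support a promising hypothesis that adjusting sleep duration may be a viable way to lower the risk of aging-related diseases. The team analyzed the relationship between sleep duration and 23 biological aging clocks, which cover the aging signatures of 17 human organs. These clocks were built from protein levels, metabolite levels, and medical imaging features. Most organs showed a U-shaped aging pattern, but the lowest point of the curve (the optimal sleep duration) was not always in the same place. For example, the aging clock based on heart proteins indicated that 6 hours of sleep corresponded to the best health, whereas the brain protein clock indicated that 8 hours was optimal. In some cases, the optimal sleep duration also differed between men and women. Overall, compared with people who slept too much or too little, those who slept 6 to 8 hours a day aged more slowly, were healthier, and had lower rates of conditions such as type 2 diabetes and depression.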

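  15. Google confirms it is limiting free storage for new Gmail users

    Gmail accounts normally get 15 GB of free storage, but users now report that Google is capping free storage for new Gmail users at 5 GB; to unlock the full 15 GB, users must add a phone number to the account. After users reported this on social media, Google issued a statement confirming the test: "We are testing a new storage policy for newly created accounts in select regions, which will help us continue to provide a high-quality storage service to our users while encouraging users to improve their account security and data recovery."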
