OrangeBot.AI Digest — 2026-05-05

83 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. .de TLD offline due to DNSSEC? (dnssec-analyzer.verisignlabs.com)
  2. California farmers to destroy 420k peach trees following Del Monte bankruptcy (www.sfgate.com)
  3. IBM didn't want Microsoft to use the Tab key to move between dialog fields (devblogs.microsoft.com)
  4. Computer Use is 45x more expensive than structured APIs (reflex.dev)
  5. Accelerating Gemma 4: faster inference with multi-token prediction drafters (blog.google)
  6. Three Inverse Laws of AI (susam.net)
  7. The fun has been optimized out of the Internet (muddy.jprs.me)
  8. AI didn't delete your database, you did (idiallo.com)
  9. AI Product Graveyard (tooldirectory.ai)
  10. iOS 27 is adding a 'Create a Pass' button to Apple Wallet (walletwallet.alen.ro)
  11. When everyone has AI and the company still learns nothing (www.robert-glaser.de)
  12. Google Chrome silently installs a 4 GB AI model on your device without consent (www.thatprivacyguy.com)
  13. Async Rust never left the MVP state (tweedegolf.nl)
  14. Lessons for Agentic Coding: What should we do when code is cheap? (www.dbreunig.com)
  15. Kids can bypass some age checks with a drawn-on mustache (www.theregister.com)

GitHub Trending (15)

  1. Hmbown / DeepSeek-TUI
  2. ruvnet / ruflo
  3. virattt / dexter
  4. docusealco / docuseal
  5. bwya77 / vscode-dark-islands
  6. mksglu / context-mode
  7. cocoindex-io / cocoindex
  8. msitarzewski / agency-agents
  9. jwasham / coding-interview-university
  10. Arindam200 / awesome-ai-apps
  11. AIDC-AI / Pixelle-Video
  12. LearningCircuit / local-deep-research
  13. browserbase / skills
  14. forrestchang / andrej-karpathy-skills
  15. PriorLabs / TabPFN

Product Hunt (15)

  1. Flowstep 1.0

    AI design engineer to turn your thoughts into editable UI

  2. Kilo Code v7 for VS Code

    Parallel agents, diff reviewer, and multi-model comparisons

  3. Blaze

    The AI-powered calendar that plans your day for you.

  4. Velo 2.0

    Instantly turn your voice and screen into shareable videos

  5. Breathwrk

    Learn and master breathwork with guided breathing exercises

  6. Tollecode

    A local AI coding assistant to delegate tasks to AI agents

  7. PanicMode

    Protect your screen in public with one shortcut

  8. Unity AI

    AI agents built directly into Unity workflows

  9. Agentic API Grader by SaaStr.ai

    Your #1 new customer is an AI agent. Are they getting an A?

  10. Ghostwriter

    Write and publish posts on LinkedIn & X

  11. Dina

    From screen to polished video in minutes

  12. PaceBar

    A quiet pace instrument for your Mac

  13. Intuned Agent

    Production browser automation, built and maintained by AI

  14. Steam Controller

    Game with TMR sticks, dual haptic trackpads, + gyro controls

  15. TalentOS

    AI adoption operating system for companies

Hugging Face (15)

  1. MolmoAct2: Action Reasoning Models for Real-world Deployment

    Vision-Language-Action (VLA) models aim to provide a single generalist controller for robots, but today's systems fall short on the criteria that matter for real-world deployment. Frontier models are closed, open-weight alternatives are tied to expensive hardware, reasoning-augmented policies pay prohibitive latency for their grounding, and fine-tuned success rates remain below the threshold for dependable use. We present MolmoAct2, a fully open action reasoning model built for practical deployment, advancing its predecessor along five axes. We introduce MolmoER, a VLM backbone specialized for spatial and embodied reasoning, trained on a 3.3M-sample corpus with a specialize-then-rehearse recipe. We release three new datasets spanning low-to-medium cost platforms, including MolmoAct2-BimanualYAM, 720 hours of teleoperated bimanual trajectories that constitute the largest open bimanual dataset to date, together with quality-filtered Franka (DROID) and SO100/101 subsets. We provide OpenFAST, an open-weight, open-data action tokenizer trained on millions of trajectories across five embodiments. We redesign the architecture to graft a flow-matching continuous-action expert onto a discrete-token VLM via per-layer KV-cache conditioning. Finally, we propose MolmoThink, an adaptive-depth reasoning variant that re-predicts depth tokens only for scene regions that change between timesteps, retaining geometric grounding at a fraction of prior latency. In the most extensive empirical study of any open VLA to date, spanning 7 simulation and real-world benchmarks, MolmoAct2 outperforms strong baselines including Pi-05, while MolmoER surpasses GPT-5 and Gemini Robotics ER-1.5 across 13 embodied-reasoning benchmarks. We release model weights, training code, and complete training data. Project page: https://allenai.org/blog/molmoact2

  2. From Context to Skills: Can Language Models Learn from Context Skillfully?

    Many real-world tasks require language models (LMs) to reason over complex contexts that exceed their parametric knowledge. This calls for context learning, where LMs directly learn relevant knowledge from the given context. An intuitive solution is inference-time skill augmentation: extracting the rules and procedures from context into natural-language skills. However, constructing such skills for context learning scenarios faces two challenges: the prohibitive cost of manual skill annotation for long, technically dense contexts, and the lack of external feedback for automated skill construction. In this paper, we propose Ctx2Skill, a self-evolving framework that autonomously discovers, refines, and selects context-specific skills without human supervision or external feedback. At its core, a multi-agent self-play loop has a Challenger that generates probing tasks and rubrics, a Reasoner that attempts to solve them guided by an evolving skill set, and a neutral Judge that provides binary feedback. Crucially, both the Challenger and the Reasoner evolve through accumulated skills: dedicated Proposer and Generator agents analyze failure cases and synthesize them into targeted skill updates for both sides, enabling automated skill discovery and refinement. To prevent adversarial collapse caused by increasingly extreme task generation and over-specialized skill accumulation, we further introduce a Cross-time Replay mechanism that identifies the skill set achieving the best balance across representative cases for the Reasoner side, ensuring robust and generalizable skill evolution. The resulting skills can be plugged into any language model to obtain better context learning capability. Evaluated on four context learning tasks from CL-bench, Ctx2Skill consistently improves solving rates across backbone models.

  3. Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

    While autoregressive Large Vision-Language Models (LVLMs) demonstrate remarkable proficiency in multimodal tasks, they face a "Visual Signal Dilution" phenomenon, where the accumulation of textual history expands the attention partition function, causing visual attention to decay inversely with generated sequence length. To counteract this, we propose Persistent Visual Memory (PVM), a lightweight learnable module designed to ensure sustained, on-demand visual perception. Integrated as a parallel branch alongside the Feed-Forward Network (FFN) in LVLMs, PVM establishes a distance-agnostic retrieval pathway that directly provides visual embeddings for precise visual perception, thereby structurally mitigating the signal suppression inherent to deep generation. Extensive experiments on Qwen3-VL models demonstrate that PVM brings notable improvements with negligible parameter overhead, delivering consistent average accuracy gains across both 4B and 8B scales, particularly in complex reasoning tasks that demand persistent visual perception. Furthermore, in-depth analysis reveals that PVM can resist length-induced signal decay and accelerate internal prediction convergence.
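
    The "Visual Signal Dilution" effect described above can be illustrated with a toy calculation (a sketch for intuition only, not the paper's model): under uniform attention scores, the softmax mass reaching a fixed pool of visual tokens shrinks as the textual history grows, because every generated token enlarges the partition function.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def visual_attention_mass(n_visual, n_text):
    # With uniform scores, the mass on visual tokens is exactly
    # n_visual / (n_visual + n_text): it decays with sequence length.
    probs = softmax([0.0] * (n_visual + n_text))
    return sum(probs[:n_visual])

# 576 visual tokens (a common ViT patch count) vs. a growing text history:
for n_text in (0, 576, 4096):
    print(n_text, round(visual_attention_mass(576, n_text), 3))
# → 0 1.0 / 576 0.5 / 4096 0.123
```

    A distance-agnostic retrieval branch like PVM sidesteps this decay by fetching visual embeddings on demand instead of competing in the same softmax.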

  4. Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

    Recent research has shown that filtering massive English web corpora into high-quality subsets significantly improves training efficiency. However, for high-resource non-English languages like German, French, or Japanese, aggressive filtering creates a strategic dilemma: should practitioners prioritize diversity by training once on large amounts of lightly filtered web data, or prioritize quality by strictly filtering for a high-quality core and repeating it over multiple epochs? We investigate this trade-off for German by constructing hierarchical quality filters applied to 500M web documents, comparing multi-epoch training on the filtered subsets against single-pass training on a diverse corpus. Our experiments across multiple model scales and token budgets show that repeating high-quality data consistently outperforms single-pass training on larger, less filtered sets. Notably, the performance gap persists even after 7 epochs. Our findings suggest that for non-English LLMs, semantic concentration through quality filtering offers a more viable path to efficient language modeling than simply maximizing unique data volume. We release our German language models (called Boldt), as well as our cleaned evaluation benchmarks to the research community. Our experiments indicate that they achieve state-of-the-art results despite training on 10-360x fewer tokens than comparable models.
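
    The quality-versus-diversity dilemma above is ultimately token accounting; a toy sketch (corpus sizes are illustrative, not the paper's):

```python
def training_plan(token_budget, corpus_tokens):
    # How a fixed compute budget is spent: full epochs over the corpus
    # plus the fraction of one final pass.
    full_epochs, remainder = divmod(token_budget, corpus_tokens)
    return full_epochs, remainder / corpus_tokens

BUDGET = 70_000_000_000  # 70B training tokens (illustrative)

# Diversity: one partial pass over a large, lightly filtered corpus.
print(training_plan(BUDGET, 350_000_000_000))  # (0, 0.2)

# Quality: seven full epochs over a strictly filtered 10B-token core.
print(training_plan(BUDGET, 10_000_000_000))   # (7, 0.0)
```

    The abstract's finding is that the second plan wins at the same budget, and that the gap persists even after 7 epochs over the filtered core.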

  5. OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models

    The vast and underexplored ocean plays a critical role in regulating global climate and supporting marine biodiversity, yet artificial intelligence has so far delivered limited impact in this domain due to a fundamental data bottleneck. Specifically, ocean data are highly fragmented across disparate sources and inherently exhibit multi-modal, high-noise, and weakly labeled characteristics, lacking unified schemas and semantic alignment. Although Multimodal Large Language Models (MLLMs) have achieved remarkable success in general domains, their application to ocean science remains severely constrained by the absence of large-scale, well-aligned multimodal datasets tailored to marine environments. To bridge this gap, we introduce OceanPile, a large-scale multimodal corpus designed for ocean foundation models. It comprises three key components: OceanCorpus, a unified collection integrating sonar data, underwater imagery, marine science visuals, and scientific text from diverse authoritative sources; OceanInstruction, a high-quality instruction dataset synthesized via a novel pipeline guided by a hierarchical Ocean Concept Knowledge Graph; and OceanBenchmark, a manually curated evaluation benchmark for rigorous assessment. We establish a multi-stage quality control process to ensure scientific validity and alignment across modalities. Experimental validation demonstrates significant performance improvements for models trained on our data. All datasets are publicly released to advance the field of marine artificial intelligence and empower domain-specific MLLMs.

  6. Hallucinations Undermine Trust; Metacognition is a Way Forward

    Despite significant strides in factual reliability, errors -- often termed hallucinations -- remain a major concern for generative AI, especially as LLMs are increasingly expected to be helpful in more complex or nuanced setups. Yet even in the simplest setting -- factoid question-answering with clear ground truth -- frontier models without external tools continue to hallucinate. We argue that most factuality gains in this domain have come from expanding the model's knowledge boundary (encoding more facts) rather than improving awareness of that boundary (distinguishing known from unknown). We conjecture that the latter is inherently difficult: models may lack the discriminative power to perfectly separate truths from errors, creating an unavoidable tradeoff between eliminating hallucinations and preserving utility. This tradeoff dissolves under a different framing. If we understand hallucinations as confident errors -- incorrect information delivered without appropriate qualification -- a third path emerges beyond the answer-or-abstain dichotomy: expressing uncertainty. We propose faithful uncertainty: aligning linguistic uncertainty with intrinsic uncertainty. This is one facet of metacognition -- the ability to be aware of one's own uncertainty and to act on it. For direct interaction, acting on uncertainty means communicating it honestly; for agentic systems, it becomes the control layer governing when to search and what to trust. Metacognition is thus essential for LLMs to be both trustworthy and capable; we conclude by highlighting open problems for progress towards this objective.

  7. AcademiClaw: When Students Set Challenges for AI Agents

    Benchmarks within the OpenClaw ecosystem have thus far evaluated exclusively assistant-level tasks, leaving the academic-level capabilities of OpenClaw largely unexamined. We introduce AcademiClaw, a bilingual benchmark of 80 complex, long-horizon tasks sourced directly from university students' real academic workflows -- homework, research projects, competitions, and personal projects -- that they found current AI agents unable to solve effectively. Curated from 230 student-submitted candidates through rigorous expert review, the final task set spans 25+ professional domains, ranging from olympiad-level mathematics and linguistics problems to GPU-intensive reinforcement learning and full-stack system debugging, with 16 tasks requiring CUDA GPU execution. Each task executes in an isolated Docker sandbox and is scored on task completion by multi-dimensional rubrics combining six complementary techniques, with an independent five-category safety audit providing additional behavioral analysis. Experiments on six frontier models show that even the best achieves only a 55% pass rate. Further analysis uncovers sharp capability boundaries across task domains, divergent behavioral strategies among models, and a disconnect between token consumption and output quality, providing fine-grained diagnostic signals beyond what aggregate metrics reveal. We hope that AcademiClaw and its open-sourced data and code can serve as a useful resource for the OpenClaw community, driving progress toward agents that are more capable and versatile across the full breadth of real-world academic demands. All data and code are available at https://github.com/GAIR-NLP/AcademiClaw.

  8. ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models

    In this paper, we study an under-explored but important factor of diffusion generative models, i.e., the combinatorial complexity. Data samples are generally high-dimensional, and for various structured generation tasks, additional attributes are combined to associate with data samples. We show that the space spanned by the combination of dimensions and attributes can be insufficiently covered by existing training schemes of diffusion generative models, potentially limiting test time performance. We present a simple fix to this problem by constructing stochastic processes that fully exploit the combinatorial structures, hence the name ComboStoc. Using this simple strategy, we show that network training is significantly accelerated across diverse data modalities, including images and 3D structured shapes. Moreover, ComboStoc enables a new way of test time generation which uses asynchronous time steps for different dimensions and attributes, thus allowing for varying degrees of control over them. Our code is available at: https://github.com/Xrvitd/ComboStoc

  9. PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments

    We introduce PhysicianBench, a benchmark for evaluating LLM agents on physician tasks grounded in real clinical settings within electronic health record (EHR) environments. Existing medical agent benchmarks primarily focus on static knowledge recall, single-step atomic actions, or action intent without verifiable execution against the environment. As a result, they fail to capture the long-horizon, composite workflows that characterize real clinical systems. PhysicianBench comprises 100 long-horizon tasks adapted from real consultation cases between primary care and subspecialty physicians, with each task independently reviewed by a separate panel of physicians. Tasks are instantiated in an EHR environment with real patient records and accessed through the same standard APIs used by commercial EHR vendors. Tasks span 21 specialties (e.g., cardiology, endocrinology, oncology, psychiatry) and diverse workflow types (e.g., diagnosis interpretation, medication prescribing, treatment planning), requiring an average of 27 tool calls per task. Solving each task requires retrieving data across encounters, reasoning over heterogeneous clinical information, executing consequential clinical actions, and producing clinical documentation. Each task is decomposed into structured checkpoints (670 in total across the benchmark) capturing distinct stages of completion graded by task-specific scripts with execution-grounded verification. Across 13 proprietary and open-source LLM agents, the best-performing model achieves only a 46% success rate (pass@1), while open-source models reach at most 19%, revealing a substantial gap between current agent capabilities and the demands of real-world clinical workflows. PhysicianBench provides a realistic and execution-grounded benchmark for measuring progress toward autonomous clinical agents.

  10. Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

    Retrieval-augmented generation (RAG) enhances large language models with external knowledge, and tree-based RAG organizes documents into hierarchical indexes to support queries at multiple granularities. However, existing Tree-RAG methods designed for single-document retrieval face critical challenges in scaling to cross-document multi-hop questions: (1) poor distribution adaptability, where k-means clustering introduces noise due to rigid distribution assumptions; (2) structural isolation, as tree indexes lack explicit cross-document connections; and (3) coarse abstraction, which obscures fine-grained details. To address these limitations, we propose Ψ-RAG, a tree-RAG framework with two key components. First, a hierarchical abstract tree index built through an iterative "merging and collapse" process that adapts to data distributions without a priori assumptions. Second, a multi-granular retrieval agent that intelligently interacts with the knowledge base using reorganized queries and an agent-powered hybrid retriever. Ψ-RAG supports diverse tasks from token-level question answering to document-level summarization. On cross-document multi-hop QA benchmarks, it outperforms RAPTOR by 25.9% and HippoRAG 2 by 7.4% in average F1 score. Code is available at https://github.com/Newiz430/Psi-RAG.

  11. T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning

    Recent progress in multi-turn reinforcement learning (RL) has significantly improved reasoning LLMs' performances on complex interactive tasks. Despite advances in stabilization techniques such as fine-grained credit assignment and trajectory filtering, instability remains pervasive and often leads to training collapse. We argue that this instability stems from inefficient exploration in multi-turn settings, where policies continue to generate low-information actions that neither reduce uncertainty nor advance task progress. To address this issue, we propose Token- and Turn-level Policy Optimization (T^2PO), an uncertainty-aware framework that explicitly controls exploration at fine-grained levels. At the token level, T^2PO monitors uncertainty dynamics and triggers a thinking intervention once the marginal uncertainty change falls below a threshold. At the turn level, T^2PO identifies interactions with negligible exploration progress and dynamically resamples such turns to avoid wasted rollouts. We evaluate T^2PO in diverse environments, including WebShop, ALFWorld, and Search QA, demonstrating substantial gains in training stability and performance improvements with better exploration efficiency. Code is available at: https://github.com/WillDreamer/T2PO.

  12. Counting as a minimal probe of language model reliability

    Large language models perform strongly on benchmarks in mathematical reasoning, coding and document analysis, suggesting a broad ability to follow instructions. However, it remains unclear whether such success reflects general logical competence, repeated application of learned procedures, or pattern matching that mimics rule execution. We investigate this question by introducing Stable Counting Capacity, an assay in which models count repeated symbols until failure. The assay removes knowledge dependencies, semantics and ambiguity from evaluation, avoids lexical and tokenization confounds, and provides a direct measure of procedural reliability beyond standard knowledge-based benchmarks. Here we show, across more than 100 model variants, that stable counting capacity remains far below advertised context limits. Model behavior is consistent neither with open-ended logic nor with stable application of a learned rule, but instead with use of a finite set of count-like internal states, analogous to counting on fingers. Once this resource is exhausted, the appearance of rule following disappears and exact execution collapses into guessing, even with additional test-time compute. These findings show that fluent performance in current language models does not guarantee general, reliable rule following.
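
    An assay in this spirit fits in a few lines (a hypothetical harness, not the paper's code): build a knowledge-free prompt of n repeated symbols, grade the numeric reply, and take the largest n the model still counts exactly.

```python
def counting_probe(n, symbol="#"):
    # Knowledge-free prompt: count n repeated symbols, answer with a number.
    prompt = "Count the repeated characters and reply with only the number:\n" + symbol * n
    return prompt, n

def grade(reply, expected):
    try:
        return int(reply.strip()) == expected
    except ValueError:
        return False

def stable_counting_capacity(ask_model, max_n=512):
    # Largest n (up to max_n) for which the model's count is still exact.
    capacity = 0
    for n in range(1, max_n + 1):
        prompt, expected = counting_probe(n)
        if not grade(ask_model(prompt), expected):
            break
        capacity = n
    return capacity

# A "model" that counts perfectly saturates the assay:
print(stable_counting_capacity(lambda p: str(p.count("#")), max_n=64))  # 64
```

    The paper's finding, in these terms, is that real models break long before max_n reaches the advertised context limit.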

  13. Code World Model Preparedness Report

    This report documents the preparedness assessment of Code World Model (CWM), a model for code generation and reasoning about code from Meta. We conducted pre-release testing across domains identified in our Frontier AI Framework as potentially presenting catastrophic risks, and also evaluated the model's misaligned propensities. Our assessment found that CWM does not pose additional frontier risks beyond those present in the current AI ecosystem. We therefore release it as an open-weight model.

  14. Generative Modeling with Orbit-Space Particle Flow Matching

    We present Orbit-Space Geometric Probability Paths (OGPP), a particle-native flow-matching framework for generative modeling of particle systems. OGPP is motivated by two insights: (i) particles are defined up to permutation symmetries, so anonymous indexing inflates per-index target variance and yields curved, hard-to-learn flows; and (ii) particles live in physical space, so the flow terminal velocity has physical meaning and can encode geometric attributes, e.g., surface normals. OGPP instantiates three key components: (1) orbit-space canonicalization of the probability-path terminal endpoint, (2) particle index embeddings for role specialization, and (3) geometric probability paths with arc-length-aware terminal velocities that generate normals as a byproduct of the flow. We evaluate OGPP on minimal-surface benchmarks, where it reduces metric error by up to two orders of magnitude in a single inference step; on ShapeNet, where it matches the state of the art with 5x fewer steps and reaches airplane EMD comparable to DiT-3D with 26x fewer parameters and 5x fewer steps; and on single-shape encoding, where it produces normals and reconstructions competitive with 6D generators while operating entirely in 3D.

  15. Perceptual Flow Network for Visually Grounded Reasoning

    Despite the success of Large Vision-Language Models (LVLMs), general optimization objectives (e.g., standard MLE) fail to constrain visual trajectories, leading to language bias and hallucination. To mitigate this, current methods introduce geometric priors from visual experts as additional supervision. However, we observe that such supervision is typically suboptimal: it is biased toward geometric precision and offers limited reasoning utility. To bridge this gap, we propose Perceptual Flow Network (PFlowNet), which eschews rigid alignment with the expert priors and achieves interpretable yet more effective visual reasoning. Specifically, PFlowNet decouples perception from reasoning to establish a self-conditioned generation process. Based on this, it integrates multi-dimensional rewards with vicinal geometric shaping via variational reinforcement learning, thereby facilitating reasoning-oriented perceptual behaviors while preserving visual reliability. PFlowNet delivers a provable performance guarantee and competitive empirical results, particularly setting new SOTA records on V* Bench (90.6%) and MME-RealWorld-lite (67.0%).

Techmeme (15)

  1. Kaspersky says Daemon Tools, a widely used app for mounting disk images, has been backdoored in a monthlong compromise that has pushed malicious updates (Dan Goodin/Ars Technica)

    Dan Goodin / Ars Technica: Daemon Tools, a widely used app for mounting disk images, has been backdoored in a monthlong compromise that has pushed malicious updates …

  2. EA reports Q4 net bookings up 3.6% YoY to $1.86B, vs. $2B est., weighed down by a post-launch drop-off in engagement for Battlefield 6 (Anhata Rooprai/Reuters)

    Anhata Rooprai / Reuters: Videogame publisher Electronic Arts (EA.O) missed quarterly bookings estimates on Tuesday, weighed down by a post-launch drop-off in engagement for its …

  3. GlobalFoundries reports Q1 revenue up 3% YoY to $1.63B, in line with est., and forecasts Q2 revenue and adjusted earnings above estimates; GFS closes up 9.28% (Patrick Seitz/Investor's Business Daily)

    Patrick Seitz / Investor's Business Daily: Contract chipmaker GlobalFoundries (GFS) on Tuesday beat earnings estimates on in-line sales for the first quarter.

  4. Micron closes up 11% after announcing its highest-capacity SSD has started to ship, lifting its market cap past $700B for the first time; Sandisk closes up 12% (Lola Murti/CNBC)

    Lola Murti / CNBC: Micron's historic rally continued on Tuesday, with shares of the memory maker surging 11%, lifting the company's market cap past $700 billion for the first time.

  5. A US court sentences a Latvian national to 8.5 years for acting as a negotiator for Russia's Karakurt ransomware group (Sergiu Gatlan/BleepingComputer)

    Sergiu Gatlan / BleepingComputer: A Latvian national extradited to the United States was sentenced to 8.5 years in prison for his “cold case” negotiator role in the Russian Karakurt ransomware group.

  6. Super Micro reports Q3 revenue up 123% YoY to $10.24B, vs. $12.33B est., forecasts Q4 revenue and adjusted profit above estimates; SMCI jumps 17%+ after hours (Juby Babu/Reuters)

    Juby Babu / Reuters: Super Micro Computer (SMCI.O) on Tuesday forecast fourth-quarter revenue above Wall Street estimates, banking on robust demand …

  7. Match Group reports Q1 revenue up 4% YoY to $864M, vs. $855M est., as Tinder's new user registrations grew for the first time since 2024, up 1% (Samantha Kelly/Bloomberg)

    Samantha Kelly / Bloomberg: Match Group Inc. reported first-quarter revenue that beat analysts' estimates as a decline in Tinder users moderated, suggesting its turnaround strategy is resonating with younger daters.

  8. Source: Anthropic plans to spend about $200B on Google's cloud and chips over five years, representing 40%+ of the "revenue backlog" Google disclosed last week (The Information)

    The Information: When Google last month said it would supply Anthropic with an astonishing five gigawatts of server capacity …

  9. Apple reaches a $250M settlement in a CA federal court to resolve a false advertising class action lawsuit over the launch of a "personalized" Siri in 2024 (Michael Acton/Financial Times)

    Michael Acton / Financial Times: iPhone buyers sued the tech giant for touting features in 2024 that have yet to launch

  10. AMD reports Q1 revenue up 38% YoY to $10.25B, vs. $9.89B est., Data Center revenue up 57% to $5.8B, forecasts Q2 revenue above est.; AMD jumps 7%+ after hours (Reuters)

    Reuters: Advanced Micro Devices (AMD.O) forecast second-quarter revenue above Wall Street expectations on Tuesday, helped by keen demand …

  11. Xbox CEO Asha Sharma says Xbox "will begin winding down Copilot on mobile and will stop development of Copilot on console" as part of a strategic shift (Jay Peters/The Verge)

    Jay Peters / The Verge: New Xbox CEO Asha Sharma continues to make her mark. … Xbox is “winding down Copilot on mobile” and …

  12. Sources: Meta is building agentic tools, including an OpenClaw-like assistant powered by its new Muse Spark AI model to help users create AI bots (Hannah Murphy/Financial Times)

    Hannah Murphy / Financial Times: Social media platform invests in equivalent to OpenClaw that aims to seamlessly carry out everyday tasks for users

  13. Musk v. Altman: on his second day of testimony, Greg Brockman said that OpenAI expects to spend $50B on computing in 2026, up from $30M in 2017 (Rachel Metz/Bloomberg)

    Rachel Metz / Bloomberg: OpenAI expects to spend $50 billion on computing power this year to support its artificial intelligence software, according to co-founder and President Greg Brockman.

  14. Musk v. Altman: Greg Brockman testified about tense negotiations with Musk in 2017, saying "he knows rockets, he knows electric cars" but "does not know AI" (Bloomberg)

    Bloomberg: OpenAI President Greg Brockman testified that Elon Musk called a ChatGPT predecessor “stupid,” …

  15. Subquadratic launches with a $29M seed and debuts SubQ, an LLM that uses a subquadratic sparse attention architecture to achieve a 12M-token context window (Kyt Dotson/SiliconANGLE)

    Kyt Dotson / SiliconANGLE: Subquadratic, a company developing a novel generative artificial intelligence model, launched today with $29 million in seed funding.
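
    The headline doesn't say which sparse pattern SubQ uses, but the scaling argument behind subquadratic attention is easy to sketch; sliding-window sparsity below is chosen purely as a representative example.

```python
def dense_pairs(n):
    # Causal dense attention: token i attends to i + 1 positions.
    return n * (n + 1) // 2

def windowed_pairs(n, w):
    # Sliding-window sparse attention: each token sees at most w positions,
    # so cost grows linearly in n once n > w.
    if n <= w:
        return dense_pairs(n)
    return dense_pairs(w) + (n - w) * w

n, w = 12_000_000, 4096  # a 12M-token context with a 4K window
print(f"dense/windowed cost ratio: {dense_pairs(n) / windowed_pairs(n, w):,.0f}x")
```

    At n = 12M with a 4K window the dense-to-windowed ratio is on the order of 1,500x, which is why quadratic attention is the binding constraint on context length.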

Solidot (8)

  1. MS Edge found to load all saved passwords into memory in plaintext

    The MS Edge browser has been found to load all of its saved passwords into memory in plaintext at startup. By contrast, Chrome decrypts credentials only when needed and does not keep every password in memory. Both Edge and Chrome are based on the open-source Chromium. Microsoft's approach makes it easier to scrape sensitive data from memory and raises the risk of password leaks in shared environments. Security researchers reported the issue to Microsoft and were told the behavior is by design. The researchers have published a proof-of-concept tool, EdgeSavedPasswordsDumper, on GitHub.

  2. NetHack 5.0 released

    The 39-year-old roguelike NetHack has released version 5.0. Its earliest version shipped in 1987; the "Net" in the name refers to collaborative development over the network, and "Hack" to hack-and-slash role-playing. Players take on classes such as knight, barbarian, wizard, ranger, valkyrie, monk, and samurai, aiming to retrieve the Amulet of Yendor from the deepest level of the dungeon and offer it to their deity. Besides Windows, NetHack 5.0 supports MS-DOS and Amiga, and since it is fully open source it can be compiled to run on Linux and other Unix-like systems. NetHack 5.0 can now be built with any C99-compliant compiler, uses Lua to generate dungeons, adds an optional tutorial for the early game, and more.

  3. Scientists discover how coffee affects the gut and the brain

    According to a study published in Nature Communications, scientists have found that regularly drinking coffee, whether caffeinated or decaffeinated, alters the gut microbiome and thereby influences mood and stress levels. The researchers compared 31 regular coffee drinkers, defined as people drinking 3-5 cups a day, with 31 non-drinkers. At the start of the experiment the coffee drinkers abstained for two weeks, during which the researchers continuously collected biological samples and monitored mental health. During the trial, participants did not know whether they were drinking caffeinated or decaffeinated coffee: half received decaf and the other half regular coffee. All participants reported improved mood, suggesting that even decaffeinated coffee can lift mood. The study also found that regular coffee drinkers carried higher levels of Eggertella sp. and Cryptobacterium curtum, and more Firmicutes. Only participants who consumed decaf showed gains in learning and memory, while only those who consumed caffeine experienced reduced anxiety and improved attention and alertness.

  4. Astronomers find 27 candidate planets orbiting binary stars

    Astronomers have identified 27 candidate planets orbiting pairs of stars, reminiscent of Tatooine, the desert planet in Star Wars. Only 18 circumbinary planets have been found to date, compared with more than 8,000 planets orbiting single stars like the Sun. Circumbinary planets were previously identified through transits, which can be observed only under specific conditions. The new approach relies on apsidal precession: searching eclipsing binary systems for orbital wobbles that can usually be explained only by the presence of a third body. Using data from NASA's Transiting Exoplanet Survey Satellite, the team identified 36 candidates among 1,590 stellar systems, 27 of which may be of planetary mass. The researchers say further study is needed to confirm whether they are circumbinary planets.

  5. VS Code inserts a Co-Authored-by Copilot trailer into commits by default

    Microsoft's VS Code editor has been found to insert a Co-Authored-by Copilot line into commits by default, whether or not the user actually used its AI assistant Copilot. The discovery triggered another round of heavy criticism from users. Microsoft developers responded that the default-on behavior will be fixed in the next release, acknowledging that if a user did not use the AI assistant, the code should not be attributed as co-written by Copilot.
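
    The line in question uses git's standard trailer convention, a "Key: value" line in the commit-message footer. A minimal check for such trailers (the message and e-mail address below are made up for illustration) might look like:

```python
import re

# Git trailers are "Key: value" lines in the commit-message footer;
# key capitalization varies in the wild, so match case-insensitively.
TRAILER_RE = re.compile(r"^Co-authored-by: (.+) <[^>]+>$",
                        re.MULTILINE | re.IGNORECASE)

def coauthor_names(commit_message):
    # Names credited via Co-authored-by trailers.
    return TRAILER_RE.findall(commit_message)

msg = "Fix parser bug\n\nCo-Authored-by: GitHub Copilot <copilot@example.com>"
print(coauthor_names(msg))  # ['GitHub Copilot']
```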

  6. China's green tech exports up 70% in March

    With a new energy crisis triggered by the blockade of the Strait of Hormuz, countries around the world are accelerating their transition to clean energy. China, the largest exporter of green technology, saw its combined exports of solar products, batteries, and electric vehicles rise 70% year over year in March: exported solar capacity reached 68 GW, battery exports hit $10 billion, and exports of electric and hybrid vehicles grew 140% year over year. As many as 50 countries imported record amounts of solar equipment from China.

  7. Linux share among Steam users stands at 4.52%

    In March 2026 the share of Steam players on Linux reached an unprecedented 5.33%, more than double the previous month. According to Valve's Steam Hardware & Software Survey for April 2026, the Linux share has since fallen back to 4.52%, down 0.81 percentage points, yet still double the figure from a year earlier. Windows rose to 93.47%, and OSX accounts for 2.01%. There is ample evidence that gaming on Linux has improved dramatically, and one advantage of Linux gaming is that it needs fewer resources than Windows, which is especially attractive now that memory prices are soaring. Other figures: Simplified Chinese users make up 23.41% and English users 36.77%; 55.81% of users run Intel CPUs and 44.18% AMD, nearly unchanged from the previous month.

  8. UK NHS prepares to close all its open-source repositories, citing AI

    Last month the scheduling platform Cal.com announced it was moving from open source to closed source, arguing that AI tools make it easier to find vulnerabilities in open code, that security depends on obscurity, and that going closed therefore improves security. Now the UK's National Health Service (NHS) is preparing to close nearly all of its open-source repositories on the same grounds, a decision that has drawn widespread controversy and criticism. Critics point out that most of the repositories the NHS publishes are datasets, internal tools, guidelines, research tools, and frontend designs, none of which are affected by advances in security-scanning techniques. Moreover, whether code is open makes no difference to AI tools like Anthropic Mythos, which can also analyze binaries in search of vulnerabilities. Critics have published an open letter calling on the NHS to keep its code public.