OrangeBot.AI Digest — 2026-05-09
88 headlines across 8 sources, aggregated for this day.
Hacker News (15)
- Internet Archive Switzerland (internetarchive.ch)
- All my clients wanted a carousel, now it's an AI chatbot (adele.pages.casa)
- What causes lightning? The answer keeps getting more interesting (www.quantamagazine.org)
- EU calls VPNs "a loophole that needs closing" in age verification push (cyberinsider.com)
- Using Claude Code: The unreasonable effectiveness of HTML (twitter.com)
- A recent experience with ChatGPT 5.5 Pro (gowers.wordpress.com)
- Over 97% of the 'Linux' Foundation's Budget Goes Not to Linux (techrights.org)
- Mythical Man Month (martinfowler.com)
- People Hate AI Art (mccue.dev)
- Bitter Lessons from the ISSpresso (mceglowski.substack.com)
- OpenAI’s WebRTC problem (moq.dev)
- The React2Shell Story (lachlan.nz)
- Wi is Fi: Understanding Wi-Fi 4/5/6/6E/7/8 (802.11 n/AC/ax/be/bn) (www.wiisfi.com)
- Meta Shuts Down End-to-End Encryption for Instagram Messaging (www.pcmag.com)
- AWS North Virginia data center outage – resolved (www.cnbc.com)
GitHub Trending (13)
- anthropics / financial-services
- bytedance / UI-TARS-desktop
- rohitg00 / agentmemory
- datawhalechina / hello-agents
- datawhalechina / easy-vibe
- rowboatlabs / rowboat
- ChromeDevTools / chrome-devtools-mcp
- masterking32 / MasterDnsVPN
- playcanvas / supersplat
- Lordog / dive-into-llms
- addyosmani / agent-skills
- decolua / 9router
- oracle-devrel / oracle-ai-developer-hub
Product Hunt (15)
- Zappy by ZapDigits
Your AI reporting analyst
- nocal 4
The calendar that thinks like a workspace
- MolmoAct 2
Open robotics model that reasons in 3D before acting
- How AI-pilled are you?
Curious how AI-fluent your organization is?
- Omi A11Y
Web Accessibility Scanner Extension
- Ghost
Open-source, self-hosted game servers
- Glowix
Keep your Mac display awake exactly when you need it
- Codex in Chrome
Let Codex navigate and automate tasks in your browser
- BugDrop
In-app feedback that creates GitHub Issues with screenshots
- ClawTick
Cron jobs for AI agents w/ one command, zero infrastructure
- Manuscripts.app
For academics who have outgrown the spreadsheet tracker
- Prism
Hire the best candidates, not just the available
- Staff.rip
Describe a code change in plain language and ship it
- Pop
Everyday messaging, voice first
- Nylas CLI
Email, calendar, and contacts for AI agents
Hugging Face (15)
- Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction
Modern retrieval systems, whether lexical or semantic, expose a corpus through a fixed similarity interface that compresses access into a single top-k retrieval step before reasoning. This abstraction is efficient, but for agentic search it becomes a bottleneck: exact lexical constraints, sparse clue conjunctions, local context checks, and multi-step hypothesis refinement are difficult to implement by calling a conventional off-the-shelf retriever, and evidence filtered out early cannot be recovered by stronger downstream reasoning. Agentic tasks further exacerbate this limitation because they require agents to orchestrate multiple steps, including discovering intermediate entities, combining weak clues, and revising the plan after observing partial evidence. To tackle this limitation, we study direct corpus interaction (DCI), where an agent searches the raw corpus directly with general-purpose terminal tools (e.g., grep, file reads, shell commands, lightweight scripts), without any embedding model, vector index, or retrieval API. This approach requires no offline indexing and adapts naturally to evolving local corpora. Across IR benchmarks and end-to-end agentic search tasks, this simple setup substantially outperforms strong sparse, dense, and reranking baselines on several BRIGHT and BEIR datasets, and attains strong accuracy on BrowseComp-Plus and multi-hop QA without relying on any conventional semantic retriever. Our results indicate that as language agents become stronger, retrieval quality depends not only on reasoning ability but also on the resolution of the interface through which the model interacts with the corpus, and that DCI opens a broader interface-design space for agentic search.
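The terminal-tool workflow the abstract describes (grep, file reads, lightweight scripts) can be pictured with a minimal sketch like the one below; `grep_corpus` and its directory layout are illustrative assumptions, not the paper's code.

```python
import os
import re

def grep_corpus(corpus_dir, pattern, context=1):
    """Scan raw text files directly (no embedding model or index) and
    return matching lines plus surrounding context, grep -C style."""
    rx = re.compile(pattern)
    hits = []
    for root, _, files in os.walk(corpus_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            with open(path, encoding="utf-8") as f:
                lines = f.read().splitlines()
            for i, line in enumerate(lines):
                if rx.search(line):
                    lo = max(0, i - context)
                    # keep a local window so the agent can check context
                    hits.append((path, i, lines[lo:i + context + 1]))
    return hits
```

Because nothing is precomputed, an agent can chain such calls freely, e.g. grep for one entity, read the surrounding lines, then grep for a second clue found there.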
- Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning
A persistent skill library allows language model agents to reuse successful strategies across tasks. Maintaining such a library requires three coupled capabilities. The agent selects a relevant skill, utilizes it during execution, and distills new skills from experience. Existing methods optimize these capabilities in isolation or with separate reward sources, resulting in partial and conflicting evolution. We propose Skill1, a framework that trains a single policy to co-evolve skill selection, utilization, and distillation toward a shared task-outcome objective. The policy generates a query to search the skill library, re-ranks candidates to select one, solves the task conditioned on it, and distills a new skill from the trajectory. All learning derives from a single task-outcome signal. Its low-frequency trend credits selection and its high-frequency variation credits distillation. Experiments on ALFWorld and WebShop show that Skill1 outperforms prior skill-based and reinforcement learning baselines. Training dynamics confirm the co-evolution of the three capabilities, and ablations show that removing any credit signal degrades the evolution.
- Continuous Latent Diffusion Language Model
Large language models have achieved remarkable success under the autoregressive paradigm, yet high-quality text generation need not be tied to a fixed left-to-right order. Existing alternatives still struggle to jointly achieve generation efficiency, scalable representation learning, and effective global semantic modeling. We propose Cola DLM, a hierarchical latent diffusion language model that frames text generation through hierarchical information decomposition. Cola DLM first learns a stable text-to-latent mapping with a Text VAE, then models a global semantic prior in continuous latent space with a block-causal DiT, and finally generates text through conditional decoding. From a unified Markov-path perspective, its diffusion process performs latent prior transport rather than token-level observation recovery, thereby separating global semantic organization from local textual realization. This design yields a more flexible non-autoregressive inductive bias, supports semantic compression and prior fitting in continuous space, and naturally extends to other continuous modalities. Through experiments spanning 4 research questions, 8 benchmarks, strictly matched ~2B-parameter autoregressive and LLaDA baselines, and scaling curves up to about 2000 EFLOPs, we identify an effective overall configuration of Cola DLM and verify its strong scaling behavior for text generation. Taken together, the results establish hierarchical continuous latent prior modeling as a principled alternative to strictly token-level language modeling, where generation quality and scaling behavior may better reflect model capability than likelihood, while also suggesting a concrete path toward unified modeling across discrete text and continuous modalities.
- MiA-Signature: Approximating Global Activation for Long-Context Understanding
A growing body of work in cognitive science suggests that reportable conscious access is associated with global ignition over distributed memory systems, while such activation is only partially accessible as individuals cannot directly access or enumerate all activated contents. This tension suggests a plausible mechanism that cognition may rely on a compact representation that approximates the global influence of activation on downstream processing. Inspired by this idea, we introduce the concept of Mindscape Activation Signature (MiA-Signature), a compressed representation of the global activation pattern induced by a query. In LLM systems, this is instantiated via submodular-based selection of high-level concepts that cover the activated context space, optionally refined through lightweight iterative updates using working memory. The resulting MiA-Signature serves as a conditioning signal that approximates the effect of the full activation state while remaining computationally tractable. Integrating MiA-Signatures into both RAG and agentic systems yields consistent performance gains across multiple long-context understanding tasks.
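The abstract's "submodular-based selection of high-level concepts that cover the activated context space" is, in generic form, greedy maximum coverage; a toy sketch below illustrates that generic step (names and data layout are assumptions, not the paper's implementation).

```python
def greedy_cover(candidates, k):
    """candidates: dict mapping a concept name to the set of activated
    context items it covers. Greedy submodular maximization: repeatedly
    pick the concept that adds the most still-uncovered items."""
    covered, picked = set(), []
    for _ in range(k):
        best = max(candidates, key=lambda c: len(candidates[c] - covered),
                   default=None)
        if best is None or not (candidates[best] - covered):
            break  # no remaining marginal gain
        picked.append(best)
        covered |= candidates[best]
    return picked
```

The greedy rule is the standard (1 - 1/e)-approximation for monotone submodular coverage, which is presumably why it is tractable as a compact "signature" construction.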
- RaguTeam at SemEval-2026 Task 8: Meno and Friends in a Judge-Orchestrated LLM Ensemble for Faithful Multi-Turn Response Generation
We present our winning system for Task B (generation with reference passages) in SemEval-2026 Task 8: MTRAGEval. Our method is a heterogeneous ensemble of seven LLMs with two prompting variants, where a GPT-4o-mini judge selects the best candidate per instance. We ranked 1st out of 26 teams, achieving a conditioned harmonic mean of 0.7827 and outperforming the strongest baseline (gpt-oss-120b, 0.6390). Ablations show that diversity in model families, scales, and prompting strategies is essential, with the ensemble consistently beating any single model. We also introduce Meno-Lite-0.1, a 7B domain-adapted model with a strong cost-performance trade-off, and analyse MTRAGEval, highlighting annotation limitations and directions for improvement. Our code is publicly available: https://github.com/RaguTeam/ragu_mtrag_semeval
- When to Trust Imagination: Adaptive Action Execution for World Action Models
World Action Models (WAMs) have recently emerged as a promising paradigm for robotic manipulation by jointly predicting future visual observations and future actions. However, current WAMs typically execute a fixed number of predicted actions after each model inference, leaving the robot blind to whether the imagined future remains consistent with the actual physical rollout. In this work, we formulate adaptive WAM execution as a future-reality verification problem: the robot should execute longer when the WAM-predicted future remains reliable, and replan earlier when reality deviates from imagination. To this end, we propose Future Forward Dynamics Causal Attention (FFDC), a lightweight verifier that jointly reasons over predicted future actions, predicted visual dynamics, real observations, and language instructions to estimate whether the remaining action rollout can still be trusted. FFDC enables adaptive action chunk sizes as an emergent consequence of prediction-observation consistency, preserving the efficiency of long-horizon execution while restoring responsiveness in contact-rich or difficult phases. We further introduce Mixture-of-Horizon Training to improve long-horizon trajectory coverage for adaptive execution. Experiments on the RoboTwin benchmark and in the real world demonstrate that our method achieves a strong robustness-efficiency trade-off: on RoboTwin, it reduces WAM forward passes by 69.10% and execution time by 34.02%, while improving success rate by 2.54% over the short-chunk baseline; in real-world experiments, it improves success rate by 35%.
- MARBLE: Multi-Aspect Reward Balance for Diffusion RL
Reinforcement learning fine-tuning has become the dominant approach for aligning diffusion models with human preferences. However, assessing images is intrinsically a multi-dimensional task, and multiple evaluation criteria need to be optimized simultaneously. Existing practice deals with multiple rewards by training one specialist model per reward, optimizing a weighted-sum reward R(x) = Σ_k w_k R_k(x), or sequentially fine-tuning with a hand-crafted stage schedule. These approaches either fail to produce a unified model that can be jointly trained on all rewards or necessitate heavy, manually tuned sequential training. We find that the failure stems from using a naive weighted-sum reward aggregation. This approach suffers from a sample-level mismatch because most rollouts are specialist samples, highly informative for certain reward dimensions but irrelevant for others; consequently, weighted summation dilutes their supervision. To address this issue, we propose MARBLE (Multi-Aspect Reward BaLancE), a gradient-space optimization framework that maintains independent advantage estimators for each reward, computes per-reward policy gradients, and harmonizes them into a single update direction without manually tuned reward weighting, by solving a Quadratic Programming problem. We further propose an amortized formulation that exploits the affine structure of the loss used in DiffusionNFT to reduce the per-step cost from K+1 backward passes to near single-reward baseline cost, together with EMA smoothing on the balancing coefficients to stabilize updates against transient single-batch fluctuations. On SD3.5 Medium with five rewards, MARBLE improves all five reward dimensions simultaneously, turns the worst-aligned reward's gradient cosine from negative under weighted summation in 80% of mini-batches to consistently positive, and runs at 0.97x the training speed of baseline training.
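The abstract does not spell out its Quadratic Programming formulation, but a common way to merge conflicting per-reward gradients into one direction without hand-tuned weights is the min-norm-point problem over the simplex (MGDA-style). A Frank-Wolfe sketch of that generic idea, offered as an illustration rather than MARBLE's actual solver:

```python
import numpy as np

def min_norm_combination(grads, iters=200):
    """Frank-Wolfe solver for min_a ||sum_k a_k g_k||^2 subject to
    a >= 0, sum(a) = 1. The minimizer balances conflicting per-reward
    gradients g_k into a single update direction."""
    G = np.stack(grads)          # (K, D) matrix of per-reward gradients
    GG = G @ G.T                 # Gram matrix of pairwise inner products
    K = len(grads)
    a = np.full(K, 1.0 / K)      # start at the simplex center
    for t in range(iters):
        # linear minimization step: gradient of a^T GG a is 2*GG@a,
        # so the best simplex vertex is the coordinate with min value
        idx = int(np.argmin(GG @ a))
        v = np.zeros(K)
        v[idx] = 1.0
        gamma = 2.0 / (t + 2.0)  # standard Frank-Wolfe step size
        a = (1 - gamma) * a + gamma * v
    return a, a @ G              # simplex weights, combined direction
```

With two orthogonal unit gradients this converges toward equal weights, and the combined direction has smaller norm than either input, which is the "no single reward dominates" behavior the balancing step is after.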
- SkillOS: Learning Skill Curation for Self-Evolving Agents
LLM-based agents are increasingly deployed to handle streaming tasks, yet they often remain one-off problem solvers that fail to learn from past interactions. Reusable skills distilled from experience provide a natural substrate for self-evolution, where high-quality skill curation serves as the key bottleneck. Existing approaches either rely on manual skill curation, prescribe heuristic skill operations, or train for short-horizon skill operations. However, they still struggle to learn complex long-term curation policies from indirect and delayed feedback. To tackle this challenge, we propose SkillOS, an experience-driven RL training recipe for learning skill curation in self-evolving agents. SkillOS pairs a frozen agent executor that retrieves and applies skills with a trainable skill curator that updates an external SkillRepo from accumulated experience. To provide learning signals for curation, we design composite rewards and train on grouped task streams based on skill-relevant task dependencies, where earlier trajectories update the SkillRepo, and later related tasks evaluate these updates. Across multi-turn agentic tasks and single-turn reasoning tasks, SkillOS consistently outperforms memory-free and strong memory-based baselines in both effectiveness and efficiency, with the learned skill curator generalizing across different executor backbones and task domains. Further analyses show that the learned curator produces more targeted skill use, while the skills in SkillRepo evolve into more richly structured Markdown files that encode higher-level meta-skills over time.
- Nonsense Helps: Prompt Space Perturbation Broadens Reasoning Exploration
Reinforcement learning with verifiable rewards, particularly Group Relative Policy Optimization (GRPO), has significantly advanced the reasoning capabilities of Large Language Models (LLMs). However, in complex tasks, GRPO frequently suffers from the "zero-advantage problem": when all sampled rollouts for a query fail, the relative advantage collapses to zero. Consequently, the model loses effective training signals for these questions, wasting the training data and computational budget. While simply increasing the sampling budget for these questions is a common remedy, the static sampling policy inherently constrains reasoning exploration, limiting the success rate. In this paper, we propose Lorem Perturbation for Exploration (LoPE), a simple yet effective training framework to break this exploration bottleneck. We posit that task-irrelevant prompt-space perturbations can shift the model's output distribution enough to unlock orthogonal reasoning pathways for hard questions. Specifically, LoPE prepends sequences stochastically assembled from Lorem Ipsum vocabulary (a pseudo-Latin placeholder text) to the prompts before resampling. Experiments across 1.7B, 4B, and 7B models demonstrate that LoPE significantly outperforms resampling with the original prompts. Further analysis reveals that other Latin-based random sequences with low perplexity are also effective perturbations. Our results establish LoPE as a strong baseline for broadening exploration in LLM reinforcement learning.
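The core mechanism is easy to picture: prepend a stochastic pseudo-Latin prefix to a hard prompt before resampling. A minimal sketch (the vocabulary, prefix length, and separator are assumptions; the paper's exact recipe may differ):

```python
import random

# Standard Lorem Ipsum opening as a small pseudo-Latin vocabulary
LOREM = ("lorem ipsum dolor sit amet consectetur adipiscing elit sed do "
         "eiusmod tempor incididunt ut labore et dolore magna aliqua").split()

def perturb_prompt(prompt, n_words=8, seed=None):
    """Prepend a stochastically assembled pseudo-Latin sequence so that
    resampling explores a shifted output distribution, while the task
    content of the prompt is left untouched."""
    rng = random.Random(seed)
    noise = " ".join(rng.choice(LOREM) for _ in range(n_words))
    return f"{noise}\n\n{prompt}"
```

In a GRPO loop this would be applied only to queries whose rollout group all failed, producing fresh perturbed copies for the extra sampling budget.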
- Audio-Visual Intelligence in Large Foundation Models
Audio-Visual Intelligence (AVI) has emerged as a central frontier in artificial intelligence, bridging auditory and visual modalities to enable machines that can perceive, generate, and interact in the multimodal real world. In the era of large foundation models, joint modeling of audio and vision has become increasingly crucial, i.e., not only for understanding but also for controllable generation and reasoning across dynamic, temporally grounded signals. Recent advances, such as Meta MovieGen and Google Veo-3, highlight the growing industrial and academic focus on unified audio-vision architectures that learn from massive multimodal data. However, despite rapid progress, the literature remains fragmented, spanning diverse tasks, inconsistent taxonomies, and heterogeneous evaluation practices that impede systematic comparison and knowledge integration. This survey provides the first comprehensive review of AVI through the lens of large foundation models. We establish a unified taxonomy covering the broad landscape of AVI tasks, ranging from understanding (e.g., speech recognition, sound localization) to generation (e.g., audio-driven video synthesis, video-to-audio) and interaction (e.g., dialogue, embodied, or agentic interfaces). We synthesize methodological foundations, including modality tokenization, cross-modal fusion, autoregressive and diffusion-based generation, large-scale pretraining, instruction alignment, and preference optimization. Furthermore, we curate representative datasets, benchmarks, and evaluation metrics, offering a structured comparison across task families and identifying open challenges in synchronization, spatial reasoning, controllability, and safety. By consolidating this rapidly expanding field into a coherent framework, this survey aims to serve as a foundational reference for future research on large-scale AVI.
- Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
Step distillation has become a leading technique for accelerating diffusion models, among which Distribution Matching Distillation (DMD) and Consistency Distillation are two representative paradigms. While consistency methods enforce self-consistency along the full PF-ODE trajectory to steer it toward the clean data manifold, vanilla DMD relies on sparse supervision at a few predefined discrete timesteps. This restricted discrete-time formulation and mode-seeking nature of the reverse KL divergence tends to exhibit visual artifacts and over-smoothed outputs, often necessitating complex auxiliary modules -- such as GANs or reward models -- to restore visual fidelity. In this work, we introduce Continuous-Time Distribution Matching (CDM), migrating the DMD framework from discrete anchoring to continuous optimization for the first time. CDM achieves this through two continuous-time designs. First, we replace the fixed discrete schedule with a dynamic continuous schedule of random length, so that distribution matching is enforced at arbitrary points along sampling trajectories rather than only at a few fixed anchors. Second, we propose a continuous-time alignment objective that performs active off-trajectory matching on latents extrapolated via the student's velocity field, improving generalization and preserving fine visual details. Extensive experiments on different architectures, including SD3-Medium and Longcat-Image, demonstrate that CDM provides highly competitive visual fidelity for few-step image generation without relying on complex auxiliary objectives. Code is available at https://github.com/byliutao/cdm.
- StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction
Large language models (LLMs) are increasingly used as interactive agents, but optimizing them for long-horizon decision making remains difficult because current methods are largely purely reactive, which weakens both exploration and credit assignment over extended trajectories. In this work, we present Strategic Trajectory Abstraction (StraTA), a simple framework that introduces an explicit trajectory-level strategy into agentic reinforcement learning (RL). StraTA samples a compact strategy from the initial task state, conditions subsequent actions on that strategy, and trains strategy generation and action execution jointly with a hierarchical GRPO-style rollout design, further enhanced by diverse strategy rollout and critical self-judgment. Experiments on ALFWorld, WebShop, and SciWorld show that StraTA consistently improves both sample efficiency and final performance over strong baselines. StraTA reaches success rates of 93.1% on ALFWorld and 84.2% on WebShop. On SciWorld, StraTA attains a 63.5% overall score, outperforming frontier closed-source models.
- Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes
We study auto research as a closed empirical loop driven by external measurement. Each submitted trial carries a hypothesis, an executable code edit, an evaluator-owned outcome, and feedback that shapes the next proposal. The output is not a generated paper or a single model checkpoint, but an auditable trajectory of proposals, code diffs, experiments, scores, and failure labels. We instantiate this loop with specialist agents that partition recipe surfaces and share measured lineage across trials. The central empirical finding is that lineage feedback lets agents turn evaluator outcomes, including crashes, budget overruns, size failures, and accuracy-gate misses, into later program-level recipe edits rather than one-shot suggestions. Across 1,197 headline-run trials plus 600 Parameter Golf control trials after one-time setup and launch, humans did not choose proposals, edit recipes, override scores, or repair failed trials during the search. In the three headline runs, the same submitted-trial loop reduces Parameter Golf validation bpb by 0.81%, raises NanoChat-D12 CORE by 38.7%, and reduces CIFAR-10 Airbench96 wallclock by 4.59%, with each task measured by its own external evaluator and legality checks. The trace includes a strict architecture-domain audit of 157 headline-run submissions and program rewrites such as a NanoChat attention-kernel path change. Within this scope the loop autonomously writes code, submits experiments, absorbs feedback, applies and combines known techniques inside each environment, and improves public starting recipes.
- A^2TGPO: Agentic Turn-Group Policy Optimization with Adaptive Turn-level Clipping
Reinforcement learning for agentic large language models (LLMs) typically relies on a sparse, trajectory-level outcome reward, making it difficult to evaluate the contribution of individual tool-calls within multi-turn interactions. Existing approaches to such process credit assignment either depend on separate external process reward models that introduce additional computational overhead, or tree-based structural rollout that merely redistributes the outcome signal while constraining trajectory diversity. A promising alternative leverages the per-turn change in the policy's predicted probability of the ground-truth, termed Information Gain (IG), as an intrinsic process signal without an external evaluator. However, prior work on leveraging IG signals within the RL training loop faces three systematic challenges: normalizing across turns that face heterogeneous positional contexts can distort the relative standing of individual turns, accumulating a variable number of terms causes advantage magnitudes to drift with trajectory depth, and a fixed clipping range governs policy updates identically for turns with vastly different IG signals. In this paper, we propose A^2TGPO (Agentic Turn-Group Policy Optimization with Adaptive Turn-level Clipping), which retains IG as the intrinsic signal but re-designs how it is normalized, accumulated, and consumed: (i) turn-group normalization: normalizes IG within each (prompt, turn-index) group so that each turn is compared only against peers at the same interaction depth; (ii) variance-rescaled discounted accumulation: divides cumulative normalized IG by the square root of the number of accumulated terms to keep advantage magnitudes comparable across turn positions; and (iii) adaptive turn-level clipping: modulates each turn's clipping range based on its normalized IG, widening the update region for informative turns and narrowing it for uninformative ones.
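Components (i) and (ii) can be sketched concretely: z-normalize IG within each (prompt, turn-index) group, then rescale the accumulated signal by the square root of the number of terms. A toy illustration (the record layout is assumed, and the discount factor is omitted for brevity):

```python
import math
from collections import defaultdict

def turn_group_advantages(records):
    """records: list of (prompt_id, rollout_id, turn_idx, ig).
    (i) z-normalize IG within each (prompt_id, turn_idx) group so a
        turn is compared only with peers at the same depth;
    (ii) accumulate per rollout and divide by sqrt(#terms) so advantage
        magnitudes stay comparable across turn positions."""
    groups = defaultdict(list)
    for pid, rid, t, ig in records:
        groups[(pid, t)].append(ig)
    stats = {}
    for key, vals in groups.items():
        mu = sum(vals) / len(vals)
        sd = (sum((v - mu) ** 2 for v in vals) / len(vals)) ** 0.5 or 1.0
        stats[key] = (mu, sd)
    adv, per_rollout = {}, defaultdict(list)
    for pid, rid, t, ig in sorted(records, key=lambda r: r[2]):
        mu, sd = stats[(pid, t)]
        per_rollout[(pid, rid)].append((ig - mu) / sd)
        vals = per_rollout[(pid, rid)]
        adv[(pid, rid, t)] = sum(vals) / math.sqrt(len(vals))
    return adv
```

Without the sqrt rescaling, deeper turns would accumulate more terms and their advantages would systematically grow in magnitude, which is the drift the abstract describes.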
- Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key
Reinforcement learning (RL) has been applied to improve large language model (LLM) reasoning, yet the systematic study of how training scales with task difficulty has been hampered by the lack of controlled, scalable environments. We introduce ScaleLogic, a synthetic logical reasoning framework that offers independent control over two axes of difficulty: the depth of the required proof planning (i.e., the horizon) and the expressiveness of the underlying logic. Our proposed framework supports a wide range of logics: from simple implication-only logic ("if-then") towards more expressive first-order reasoning with conjunction ("and"), disjunction ("or"), negation ("not"), and universal quantification ("for all"). Using this framework, we show that the RL training compute T follows a power law with respect to reasoning depth D (T ∝ D^γ, R² > 0.99), and that the scaling exponent γ increases monotonically with logical expressiveness, from 1.04 to 2.60. On downstream mathematics and general reasoning benchmarks, more expressive training settings yield both larger performance gains (up to +10.66 points) and more compute-efficient transfer compared to less expressive settings, demonstrating that what a model is trained on, not just how much it is trained, shapes downstream transfer. We further show that the power-law relationship holds across multiple RL methods, and curriculum-based training substantially improves scaling efficiency.
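A power law T ∝ D^γ of this kind is typically estimated by ordinary least squares in log-log space, since log T = γ log D + log c is linear. A small sketch of that standard fit (not the paper's code):

```python
import math

def fit_power_law(depths, computes):
    """Least-squares fit of T = c * D^gamma in log-log space.
    Returns (gamma, c): the scaling exponent and the prefactor."""
    xs = [math.log(d) for d in depths]
    ys = [math.log(t) for t in computes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope of the log-log regression line is the exponent gamma
    gamma = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    c = math.exp(my - gamma * mx)
    return gamma, c
```

On exact power-law data the fit recovers the exponent to machine precision; on real training curves the R² of this regression is what a claim like "R² > 0.99" refers to.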
Techmeme (15)
- How SpaceMob, an online community of ~50,000, has fueled a meme-stock-like rally in satellite networking company AST, which is up ~6,000% over a 22-month period (Bloomberg)
Bloomberg : How SpaceMob, an online community of ~50,000, has fueled a meme-stock-like rally in satellite networking company AST, which is up ~6,000% over a 22-month period — Championed by a guru known simply as the Kook, the satellite company AST has become one of the world's most expensive stocks.
- LayerZero apologizes for Kelp DAO exploit response, says single-verifier setup was deficient; Dune: in April, ~47% of LayerZero OApps had the same default setup (Zack Abrams/The Block)
Zack Abrams / The Block : LayerZero apologizes for Kelp DAO exploit response, says single-verifier setup was deficient; Dune: in April, ~47% of LayerZero OApps had the same default setup — Quick Take — LayerZero published a blog post Friday apologizing for poor communication in the three weeks since the $292 million Kelp DAO exploit.
- Anthropic, OpenAI, and other AI firms met with Hindu, Sikh, and Greek Orthodox leaders to draft principles on how to infuse models with ethics and morality (Krysta Fauria/Associated Press)
Krysta Fauria / Associated Press : Anthropic, OpenAI, and other AI firms met with Hindu, Sikh, and Greek Orthodox leaders to draft principles on how to infuse models with ethics and morality — As concerns mount over artificial intelligence and its rapid integration into society, tech companies are increasingly turning …
- GM agrees to pay $12.75M to resolve a California investigation into claims that it illegally sold the location and driving data of OnStar subscribers to brokers (David Shepardson/Reuters)
David Shepardson / Reuters : GM agrees to pay $12.75M to resolve a California investigation into claims that it illegally sold the location and driving data of OnStar subscribers to brokers — GM (GM.N) has agreed to pay $12.75 million to resolve a California investigation into allegations that the Detroit automaker illegally sold …
- Sources: ByteDance plans to increase its 2026 capex to more than $30B, up at least 25% from a preliminary plan, amid the AI boom and rising memory chip costs (South China Morning Post)
South China Morning Post : Sources: ByteDance plans to increase its 2026 capex to more than $30B, up at least 25% from a preliminary plan, amid the AI boom and rising memory chip costs — TikTok owner ByteDance is ramping up its spending on artificial intelligence infrastructure, boosting its planned capital expenditure …
- OpenAI, Anthropic, and Google's enterprise push with PE firms poses a new competitive threat to India's IT industry, as services become increasingly automatable (Moneycontrol)
Moneycontrol : OpenAI, Anthropic, and Google's enterprise push with PE firms poses a new competitive threat to India's IT industry, as services become increasingly automatable — On Wall Street, the announcements sounded like the next phase of the artificial intelligence (AI) boom: frontier model companies …
- Sales of PC motherboards are expected to fall 25%+ YoY in 2026, as PC users delay their upgrades amid AI-driven price surges for memory, storage, and processors (Jowi Morales/Tom's Hardware)
Jowi Morales / Tom's Hardware : Sales of PC motherboards are expected to fall 25%+ YoY in 2026, as PC users delay their upgrades amid AI-driven price surges for memory, storage, and processors — Fewer people are buying parts and building new PCs from scratch. … Motherboard sales are now collapsing amid unprecedented shortages fueled …
- A profile of Anthropic CFO Krishna Rao, who tends to take a conservative approach to revenue projections and has chosen to raise less money than is available (Kate Clark/Wall Street Journal)
Kate Clark / Wall Street Journal : A profile of Anthropic CFO Krishna Rao, who tends to take a conservative approach to revenue projections and has chosen to raise less money than is available — Krishna Rao is navigating unprecedented growth, compute constraints and the idiosyncratic Amodeis
- Palo Alto Networks says in its testing, three weeks of frontier AI-assisted analysis matched a full year of manual penetration testing, with broader coverage (Sam Rubin/Palo Alto Networks Blog)
Sam Rubin / Palo Alto Networks Blog : Palo Alto Networks says in its testing, three weeks of frontier AI-assisted analysis matched a full year of manual penetration testing, with broader coverage — For the last several months, we have had early, unbounded access to the latest frontier AI models.
- Anthropic details how it improved Claude's safety training after finding agentic misalignment in older models, such as Opus 4 blackmailing engineers (Anthropic)
Anthropic : Anthropic details how it improved Claude's safety training after finding agentic misalignment in older models, such as Opus 4 blackmailing engineers — Last year, we released a case study on agentic misalignment. In experimental scenarios, we showed that AI models from many different …
- OpenAI president Greg Brockman's journal has emerged as a star witness in the Musk v. Altman trial; Brockman says he stopped writing about OpenAI in it in 2023 (Ben Cohen/Wall Street Journal)
Ben Cohen / Wall Street Journal : OpenAI president Greg Brockman's journal has emerged as a star witness in the Musk v. Altman trial; Brockman says he stopped writing about OpenAI in it in 2023 — The journal of OpenAI president Greg Brockman is now a character in the company's battle with the world's richest man …
- Source: Mistral AI and TML's founding member Devendra Chaplot, who was considered a marquee hire when he joined xAI in March, exited xAI after roughly a month (The Information)
The Information : Source: Mistral AI and TML's founding member Devendra Chaplot, who was considered a marquee hire when he joined xAI in March, exited xAI after roughly a month — Cursor is already starting to make its presence known at SpaceX's AI unit, just weeks after Elon Musk's firm got an option to buy the coding startup for $60 billion.
- NHTSA says the 2026 Tesla Model Y is the first car model to pass the agency's new ADAS tests; Tesla conducted the tests and submitted the results to the NHTSA (Kirsten Korosec/TechCrunch)
Kirsten Korosec / TechCrunch : NHTSA says the 2026 Tesla Model Y is the first car model to pass the agency's new ADAS tests; Tesla conducted the tests and submitted the results to the NHTSA — The National Highway Traffic Safety Administration (NHTSA) said Tuesday that the newly released 2026 Tesla Model Y is the first vehicle …
- Beijing-based humanoid robotics company Robotera raised over $200M led by SF Group, after raising ~$146M in March at a ~$1.47B valuation (Du Zhihang/Caixin Global)
Du Zhihang / Caixin Global : Beijing-based humanoid robotics company Robotera raised over $200M led by SF Group, after raising ~$146M in March at a ~$1.47B valuation — Chinese humanoid-robot startup Robot Era has raised more than $200 million in a new funding round led by SF Express, as investors pour capital …
- Honeywell's Quantinuum files for a US IPO, reporting a $136.6M net loss on revenue of $5.2M for the three months ended March 31; sources: it could raise $1.5B+ (Carmen Reinicke/Bloomberg)
Carmen Reinicke / Bloomberg : Honeywell's Quantinuum files for a US IPO, reporting a $136.6M net loss on revenue of $5.2M for the three months ended March 31; sources: it could raise $1.5B+ — Quantinuum Inc., a quantum computing company backed by Honeywell International Inc., filed for a US initial public offering …
Solidot(15)
- South Korean humanoid robot ordained as a Buddhist monk
A South Korean humanoid robot named Gabi has taken part in a modified ordination ceremony and become a monk of the Jogye Order of Korean Buddhism. Gabi's name comes from the Korean 자비, meaning compassion; the robot was built by Hangzhou-based Unitree Robotics and starts at $13,500. During the ceremony Gabi agreed to five vows normally recited by human monks, slightly adapted for a humanoid robot: to respect life, treat other robots and objects peacefully, listen to humans, avoid deceptive words and deeds, and conserve energy. Gabi also went through a modified purification ritual: where human monks are typically lightly burned on the arm with incense to symbolize the purification of body and mind, Gabi instead received a lotus lantern festival sticker and a string of prayer beads. The move follows Jogye Order president Ven. Jinwoo's New Year pledge to integrate AI into Buddhist tradition; in a statement he spoke of "fearlessly leading the AI era toward peace of mind and enlightenment."
- NASA is tracking Mexico City's subsidence in real time
Step into the Zocalo, Mexico City's vast central plaza, and you enter a world that feels dizzyingly off-kilter. At one end of the plaza, the towering spires of the capital's cathedral lean in one direction, while the adjoining Metropolitan Sanctuary tilts in another; the nearby National Palace is also slightly off balance. These teetering historic buildings are the most visible sign of a phenomenon that has played out for more than a century: Mexico City is sinking at an astonishing rate. The radar imaging system aboard NASA's NISAR satellite is now tracking the subsidence in real time. NISAR can detect tiny changes on the Earth's surface; it is not the first to observe Mexico City's sinking from space, but it reveals the extent of the subsidence and the variation across different terrain more clearly than any other space-based sensor, and it can also be used to study volcanoes, earthquake-induced deformation and landslides. NISAR has found parts of Mexico City sinking more than 2 cm per month. The subsidence is the result of centuries of groundwater over-extraction: because the city and its surroundings sit on an ancient lakebed, the soil beneath is extremely soft, and as water is pumped from the aquifer the clay-like soil compacts and the city sinks. The crisis is self-reinforcing: as the city subsides, aging pipes crack and leak, costing Mexico City roughly 40% of its water supply, while drought and climate change make that supply even more fragile.
- The Linux Foundation spends 2.95% of its budget on Linux
According to the Linux Foundation's 2025 annual report, it spent $8.41 million on the Linux kernel project last year, 2.95% of its total budget; Linux creator Linus Torvalds's compensation was roughly $1.5 million, including about a million dollars of "other" income that is not clearly defined. The Linux Foundation is in fact a trade association rather than a public-interest nonprofit: it is funded by sponsorships from tech giants, as its board makeup shows, with directors from Sony, Huawei, OpenAI, Qualcomm, Samsung, Microsoft, Oracle, Google, Meta and others. The foundation hosts roughly 1,500 open source projects, and the Linux kernel is not even the largest: its blockchain spending accounted for 4% of the total budget.
- Ultra-processed foods raise the risk of cardiovascular death
The European Society of Cardiology and other organizations have published a clinical consensus statement in the European Heart Journal: ultra-processed foods can raise the risk of cardiovascular death by as much as 65%, and physicians are urged to treat food processing itself as a risk factor, separate from nutrient content. Ultra-processed foods are increasingly displacing traditional diets and have become a major public health problem. They are made from cheap industrial ingredients, additives and newly synthesized compounds, usually have little or no nutritional value, and often contain heavily processed additives that may be harmful to health. Recent evidence suggests that the degree and nature of food processing is itself an important factor in how diet affects health.
- France opens a criminal investigation into Musk and his X platform
French prosecutors have opened a criminal investigation into Elon Musk and his X platform. French law enforcement searched X's Paris office three months ago and summoned Musk for questioning. Prosecutors had planned to interview Musk and former X CEO Linda Yaccarino in April, but neither appeared; authorities are now threatening criminal charges to compel them to show up. Beyond child sexual abuse imagery, the investigation also covers Grok's dissemination of Holocaust-denial content and deepfake pornography. Prosecutors say that if Musk and Yaccarino fail to appear again, they face prosecution in absentia.
- Hantavirus outbreak aboard a luxury cruise ship
The Dutch-flagged luxury cruise ship MV Hondius departed Ushuaia, Argentina on April 1, 2026 and is currently sailing along the West African coast. A hantavirus outbreak aboard has drawn wide attention: eight cases have been reported so far, three of them fatal, while the remaining 147 passengers and crew show no symptoms. Spain has agreed to assist the ship, which is now sailing from Cape Verde toward Spain's Canary Islands. Health officials and infectious-disease experts are trying to calm public fears, stressing that the shipboard outbreak poses a low risk to the outside world. The WHO's Maria Van Kerkhove said this is not COVID and not flu; it spreads in a completely different way. The US CDC likewise stated that the risk to the American public is extremely low. Hantaviruses are a large family, divided into Old World and New World hantaviruses; the one spreading aboard the ship is Andes virus (ANDV), found mainly in Argentina. Humans are usually infected through contact with rodent excreta; human-to-human transmission is very rare, with ANDV the only known exception. ANDV person-to-person transmission requires close, prolonged contact, and the incubation period is about 7-42 days. Current guidance calls for 42 days of quarantine or active monitoring for potentially exposed cases.
- David Attenborough turns 100
Sir David Attenborough, the celebrated British broadcaster, biologist, natural historian and author, has marked his 100th birthday, receiving congratulations from the King and Queen. Actors Dame Judi Dench and Morgan Freeman, among others, joined the World Wide Fund for Nature (WWF) in releasing a birthday tribute video. As a writer, presenter and narrator, his career has spanned 80 years, taking in documentaries including Natural World, Wildlife, the Planet Earth and Blue Planet series, and the Prehistoric Planet series made with Apple.
- The Pentagon begins releasing new UFO files
Possibly as a distraction, the US Department of Defense has begun releasing a new batch of UFO/UAP files. The first release comprises 162 documents, with more to come. The Pentagon said in a statement that the release demonstrates President Trump's commitment to maximum transparency with the public; the Epstein files, meanwhile, have still not been released in full. The new UFO files are posted on a new site, www.war.gov/UFO, and include old State Department cables, FBI files and records from NASA's crewed spaceflights, such as a photo taken during the 1972 Apollo 17 mission showing dots arranged in a triangle. The Pentagon's caption says no consensus has been reached on the phenomenon, though preliminary analysis suggests it was a physical object.
- Male fertility falls below female fertility for the first time in human history
According to a study published in PNAS, male fertility has dropped below female fertility for the first time in human history. Using the UN's World Population Prospects, the researchers found that the crossover occurred in 2024, driven by the rising share of males in the population. The global male share keeps climbing because of sex-selective abortion (mostly of female fetuses) in some countries, falling male mortality, and the fact that significantly more boys than girls are born each year. Historically, surplus males usually disappeared before reaching reproductive age, but modern societies have changed that. The study also shows that the timing of the crossover varies by region: it happened recently in Oceania, South America and Asia, may not occur in sub-Saharan Africa until around 2100, and took place decades ago in most European and North American countries. The researchers warn that a male surplus and a growing number of childless men will pose social challenges, and recommend strengthening women's social status, curbing sex-selective abortion, and providing men with stable work.
- JDownloader website compromised, users served malware
The website of the download manager JDownloader was attacked, and for more than a day the attackers served malware to Windows and Linux users. The JDownloader team confirmed the attack and immediately took the site offline for a full investigation. The investigation shows that on May 6 the attackers specifically modified the alternative download page, replacing all alternative links to Windows installers with unsigned malicious executables; the Linux shell installer was also swapped for a version containing malicious shell code. The main JDownloader.jar file, the macOS installers, and the packages in repositories such as Winget, Flatpak and Snap were not compromised.
- The unsolved mystery of lightning's origin
Astrophysicist Joseph Dwyer studied solar flares with NASA satellites before joining the Florida Institute of Technology; after moving to Florida he turned his attention to lightning and found it remains an unsolved mystery. He pioneered the use of cosmic-ray instruments to study the high-energy processes behind lightning formation. Over the past few centuries, physicists established experimentally that when the electric field reaches a critical value of roughly 3 million volts per meter, air breaks down: the field flings loose electrons into neighboring atoms, knocking out more electrons; like an avalanche, the electrons multiply and the air heats until it glows. Scientists assumed lightning was a scaled-up version of this phenomenon. By the mid-20th century, however, researchers found that while thunderclouds do carry electric fields, those fields are far too weak: a typical thunderstorm produces only a tenth of the field strength needed for breakdown, and the strongest storm field ever measured is only a third of the critical value. Yet according to NASA satellite data, more than 2,000 thunderstorms are under way around the globe at any given moment. Dwyer proposed the relativistic runaway avalanche theory to explain lightning: an electron colliding with an atom can rebound and emit gamma rays; a gamma ray can convert into an electron and its antimatter counterpart, a positron; the cloud's electric field pushes the positron back toward the start of the avalanche, where it may strike another atom and trigger another avalanche, producing more gamma rays, more positrons and more avalanches, cascading until lightning is initiated.
- Nintendo raises the Switch 2 price by $50
Nintendo has become the latest console maker to raise prices because of soaring memory and component costs. In Japan, the Switch 2 rises from ¥49,980 to ¥59,980, the Switch (OLED Model) from ¥37,980 to ¥47,980, the Switch from ¥32,978 to ¥43,980, and the Switch Lite from ¥21,978 to ¥29,980. In the US the Switch 2 goes from $449.99 to $499.99, in Canada from CA$629.99 to CA$679.99, and in Europe from €469.99 to €499.99. Services such as Nintendo Switch Online are also getting price increases. The new prices take effect on May 25 in Japan and on September 1 in the US, Canada and Europe.
- US learning management platform Canvas breached and held to ransom
Canvas, the learning management platform run by US edtech company Instructure, was breached and defaced by the extortion group ShinyHunters, which is threatening to leak stolen school data unless schools contact it to negotiate a ransom by May 12. The attackers exploited a vulnerability in Instructure's systems to deface the Canvas login portals of about 330 educational institutions, replacing the login pages with a ransom note. ShinyHunters claims to have stolen data from 8,809 schools covering 275 million students and staff, including names, email addresses, ID numbers and messages. After the defacement, Instructure placed Canvas, Canvas Beta and Canvas Test into maintenance mode.
- Social media gambling ads mainly reach young male users
Researchers at the University of Cambridge analyzed 411 ads from 88 licensed gambling operators in Ireland and found that on Meta platforms such as Facebook and Instagram, young men were reached 2.3 times as often as women, even when the ads did not explicitly target men. The age group most exposed to gambling ads was 25-34, accounting for more than a third of the accounts reached. In Ireland, men aged 25-34 have the highest rate of gambling addiction, at 1.3%, versus 0.2% for women of the same age. One Betfair ad reached more than 1.32 million unique accounts, equivalent to 26% of Ireland's population. The analysis found that 91 ads (22%) targeted only men, while none targeted only women. In total, the 411 ads reached 12.6 million male and 5.4 million female accounts, and ads aimed at the 25-44 age range reached 59.4% of all accounts.
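The study's headline figures in the entry above can be cross-checked with quick arithmetic. The population implied by Betfair's 26% reach is derived here purely for illustration; it is not a number the study reports:

```python
# Cross-check of the Cambridge study's figures as reported in the entry.
# "implied_population" is derived from the article's own 26% claim, not a
# statistic the study states directly.

total_ads = 411
male_only_ads = 91           # ads targeted only at men
betfair_reach = 1_320_000    # unique accounts reached by one Betfair ad
reach_share = 0.26           # that reach as a share of Ireland's population

male_only_pct = male_only_ads / total_ads * 100
implied_population = betfair_reach / reach_share
addiction_ratio = 1.3 / 0.2  # male vs. female addiction rate, ages 25-34

print(f"Male-only ads: {male_only_pct:.1f}% of all ads")  # article rounds to 22%
print(f"Implied Irish population: {implied_population / 1e6:.2f} million")
print(f"Addiction-rate ratio, men vs. women: {addiction_ratio:.1f}x")
```

The implied population of roughly 5.1 million is consistent with Ireland's actual census population, which suggests the reported reach and percentage figures are internally coherent.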
- Incus 7.0 LTS
The container and virtual machine manager Incus has released version 7.0 LTS. Incus is the community fork of LXD created after Canonical took over the LXD project. Changes in Incus 7.0 include a low-level backup API, basic built-in S3 operations to replace the no-longer-maintained MinIO project, the removal of support for cgroups v1 and xtables (iptables/ip6tables/ebtables), and more. Incus 7.0 is a long-term support (LTS) release and will be supported until June 2031.