Curated by Shen Huang · 88 stories · ~13 min read
DIGEST · 2026-05-11

OrangeBot.AI Digest — 2026-05-11

88 headlines across 8 sources, aggregated for this day.

Hacker News(15)

  1. TanStack NPM Packages Compromised (github.com)
  2. GitLab Announces Workforce Reduction and End of Their CREDIT Values (about.gitlab.com)
  3. Can someone please explain whether Cloudflare blackmailed Canonical? (www.flyingpenguin.com)
  4. Microsoft Israel chief leaves amid ethical controversy (en.globes.co.il)
  5. CUDA-oxide: Nvidia's official Rust to CUDA compiler (nvlabs.github.io)
  6. Nullsoft, 1997-2004 (2004) (slate.com)
  7. Software engineering may no longer be a lifetime career (www.seangoedecke.com)
  8. Training an LLM in Swift, Part 1: Taking matrix mult from Gflop/s to Tflop/s (www.cocoawithlove.com)
  9. Driver accused of DUI tracks missing laptop to Illinois State trooper's house (abc7chicago.com)
  10. Gmail registration now requires scanning a QR code and sending a text message (discuss.privacyguides.net)
  11. A.I. note takers are making lawyers nervous (www.nytimes.com)
  12. Venom and hot peppers offer a key to killing resistant bacteria (www.wired.com)
  13. Ratty – A terminal emulator with inline 3D graphics (ratty-term.org)
  14. Guitar tuner that uses phone accelerometer (tautme.github.io)
  15. Mythos Finds a Curl Vulnerability (daniel.haxx.se)

GitHub Trending(13)

  1. bytedance / UI-TARS-desktop
  2. CloakHQ / CloakBrowser
  3. yikart / AiToEarn
  4. playcanvas / supersplat
  5. datawhalechina / easy-vibe
  6. decolua / 9router
  7. tinyhumansai / openhuman
  8. millionco / react-doctor
  9. Lordog / dive-into-llms
  10. AUTOMATIC1111 / stable-diffusion-webui
  11. rasbt / LLMs-from-scratch
  12. NousResearch / hermes-agent
  13. rohitg00 / agentmemory

Product Hunt(15)

  1. Snapseed 4.0

    Google’s best photo editor just got seriously better

  2. articuler.ai

    Describe your goal. Meet the right professional.

  3. Known Agents

    Track the bots and AI agents crawling your website

  4. Grok Connectors

    Bring your daily apps into Grok

  5. Genpire

    Make Real Products with AI, literally.

  6. Graphbit PRFlow

    AI code reviewer that catches what others miss

  7. MiroMiro v2

    Inspect, edit, and export any website's design

  8. Weavable

    Give every AI agent persistent work context

  9. Warp Open-Source

    Agentic development environment built with the community

  10. Scroll Launch

    Launch your product and get discovered by other makers

  11. Suprbox

    Box for AI agents to secure enterprise data storage

  12. Web Speed

    Kill the 'Token Tax.' 90% cheaper agents.

  13. Bruin

    The AI data agent that collaborates with your team

  14. CacheTray

    Capture anything & send it to Claude or ChatGPT in one click

  15. RPCForge

    Multi-chain Ethereum RPC you own and control

Hugging Face(15)

  1. Mean Mode Screaming: Mean-Variance Split Residuals for 1000-Layer Diffusion Transformers

    Scaling Diffusion Transformers (DiTs) to hundreds of layers introduces a structural vulnerability: networks can enter a silent, mean-dominated collapse state that homogenizes token representations and suppresses centered variation. Through mechanistic auditing, we isolate the trigger event of this collapse as Mean Mode Screaming (MMS). MMS can occur even when training appears stable, with a mean-coherent backward shock on residual writers that opens deep residual branches and drives the network into a mean-dominated state. We show this behavior is driven by an exact decomposition of these gradients into mean-coherent and centered components, compounded by the structural suppression of attention-logit gradients through the null space of the Softmax Jacobian once values homogenize. To address this, we propose Mean-Variance Split (MV-Split) Residuals, which combine a separately gained centered residual update with a leaky trunk-mean replacement. On a 400-layer single-stream DiT, MV-Split prevents the divergent collapse that crashes the un-stabilized baseline; it tracks close to the baseline's pre-crash trajectory while remaining substantially better than token-isotropic gating methods such as LayerScale across the full schedule. Finally, we present a 1000-layer DiT as a scale-validation run at boundary scales, establishing that the architecture remains stably trainable at extreme depth.

  2. MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation

    With the rise of online dance-video platforms and rapid advances in AI-generated content (AIGC), music-driven dance generation has emerged as a compelling research direction. Despite substantial progress in related domains such as music-driven 3D dance generation, pose-driven image animation, and audio-driven talking-head synthesis, existing methods cannot be directly adapted to this task. Moreover, the limited studies in this area still struggle to jointly achieve high-quality visual appearance and realistic human motion. Accordingly, we present MACE-Dance, a music-driven dance video generation framework with cascaded Mixture-of-Experts (MoE). The Motion Expert performs music-to-3D motion generation while enforcing kinematic plausibility and artistic expressiveness, whereas the Appearance Expert carries out motion- and reference-conditioned video synthesis, preserving visual identity with spatiotemporal coherence. Specifically, the Motion Expert adopts a diffusion model with a BiMamba-Transformer hybrid architecture and a Guidance-Free Training (GFT) strategy, achieving state-of-the-art (SOTA) performance in 3D dance generation. The Appearance Expert employs a decoupled kinematic-aesthetic fine-tuning strategy, achieving state-of-the-art (SOTA) performance in pose-driven image animation. To better benchmark this task, we curate a large-scale and diverse dataset and design a motion-appearance evaluation protocol. Based on this protocol, MACE-Dance also achieves state-of-the-art performance. Code is available at https://github.com/AMAP-ML/MACE-Dance.

  3. Flow-OPD: On-Policy Distillation for Flow Matching Models

    Existing Flow Matching (FM) text-to-image models suffer from two critical bottlenecks under multi-task alignment: the reward sparsity induced by scalar-valued rewards, and the gradient interference arising from jointly optimizing heterogeneous objectives, which together give rise to a 'seesaw effect' of competing metrics and pervasive reward hacking. Inspired by the success of On-Policy Distillation (OPD) in the large language model community, we propose Flow-OPD, the first unified post-training framework that integrates on-policy distillation into Flow Matching models. Flow-OPD adopts a two-stage alignment strategy: it first cultivates domain-specialized teacher models via single-reward GRPO fine-tuning, allowing each expert to reach its performance ceiling in isolation; it then establishes a robust initial policy through a Flow-based Cold-Start scheme and seamlessly consolidates heterogeneous expertise into a single student via a three-step orchestration of on-policy sampling, task-routing labeling, and dense trajectory-level supervision. We further introduce Manifold Anchor Regularization (MAR), which leverages a task-agnostic teacher to provide full-data supervision that anchors generation to a high-quality manifold, effectively mitigating the aesthetic degradation commonly observed in purely RL-driven alignment. Built upon Stable Diffusion 3.5 Medium, Flow-OPD raises the GenEval score from 63 to 92 and the OCR accuracy from 59 to 94, yielding an overall improvement of roughly 10 points over vanilla GRPO, while preserving image fidelity and human-preference alignment and exhibiting an emergent 'teacher-surpassing' effect. These results establish Flow-OPD as a scalable alignment paradigm for building generalist text-to-image models.

  4. HyperEyes: Dual-Grained Efficiency-Aware Reinforcement Learning for Parallel Multimodal Search Agents

    Existing multimodal search agents process target entities sequentially, issuing one tool call per entity and accumulating redundant interaction rounds whenever a query decomposes into independent sub-retrievals. We argue that effective multimodal agents should search wider rather than longer: dispatching multiple grounded queries concurrently within a round. To this end, we present HyperEyes, a parallel multimodal search agent that fuses visual grounding and retrieval into a single atomic action, enabling concurrent search across multiple entities while treating inference efficiency as a first-class training objective. HyperEyes is trained in two stages. For cold-start supervision, we develop a Parallel-Amenable Data Synthesis Pipeline covering visual multi-entity and textual multi-constraint queries, curating efficiency-oriented trajectories via Progressive Rejection Sampling. Building on this, our central contribution, a Dual-Grained Efficiency-Aware Reinforcement Learning framework, operates at two levels. At the macro level, we propose TRACE (Tool-use Reference-Adaptive Cost Efficiency), a trajectory-level reward whose reference is monotonically tightened during training to suppress superfluous tool calls without restricting genuine multi-hop search. At the micro level, we adapt On-Policy Distillation to inject dense token-level corrective signals from an external teacher on failed rollouts, mitigating the credit-assignment deficiency of sparse outcome rewards. Since existing benchmarks evaluate accuracy as the sole metric, omitting inference cost, we introduce IMEB, a human-curated benchmark of 300 instances that jointly evaluates search capability and efficiency. Across six benchmarks, HyperEyes-30B surpasses the strongest comparable open-source agent by 9.9% in accuracy with 5.3x fewer tool-call rounds on average.

  5. Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

    Reinforcement learning with verifiable rewards (RLVR) has become a standard approach for post-training large language models (LLMs) to incentivize reasoning capacity. Among existing recipes, group-based policy gradient is prevalent: it samples a group of responses per prompt and updates the policy via group-relative advantage signals. This work reveals that these optimization strategies share a common geometric structure: each implicitly defines a target distribution on the response simplex and projects toward it via first-order approximation. Building on this insight, we propose Listwise Policy Optimization (LPO) to explicitly conduct the target-projection, which demystifies the implicit target by restricting the proximal RL objective to the response simplex, and then projects the policy via exact divergence minimization. This framework provides (i) monotonic improvement on the listwise objective with bounded, zero-sum, and self-correcting projection gradients, and (ii) flexibility in divergence selection with distinct structural properties through the decoupled projection step. On diverse reasoning tasks and LLM backbones, LPO consistently improves training performance over typical policy gradient baselines under matched targets, while intrinsically preserving optimization stability and response diversity.

  6. LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

    Test-time scaling (TTS) has become an effective approach for improving large language model performance by allocating additional computation during inference. However, existing TTS strategies are largely hand-crafted: researchers manually design reasoning patterns and tune heuristics by intuition, leaving much of the computation-allocation space unexplored. We propose an environment-driven framework, AutoTTS, that changes what researchers design: from individual TTS heuristics to environments where TTS strategies can be discovered automatically. The key to AutoTTS lies in environment construction: the discovery environment must make the control space tractable and provide cheap, frequent feedback for TTS search. As a concrete instantiation, we formulate width-depth TTS as controller synthesis over pre-collected reasoning trajectories and probe signals, where controllers decide when to branch, continue, probe, prune, or stop and can be evaluated cheaply without repeated LLM calls. We further introduce beta parameterization to make the search tractable and fine-grained execution trace feedback to improve discovery efficiency by helping the agent diagnose why a TTS program fails. Experiments on mathematical reasoning benchmarks show that the discovered strategies improve the overall accuracy-cost tradeoff over strong manually designed baselines. The discovered strategies generalize to held-out benchmarks and model scales, while the entire discovery costs only $39.9 and 160 minutes. Our data and code will be open-sourced at https://github.com/zhengkid/AutoTTS.

  7. HumanNet: Scaling Human-centric Video Learning to One Million Hours

    Progress in embodied intelligence increasingly depends on scalable data infrastructure. While vision and language have scaled with internet corpora, learning physical interaction remains constrained by the lack of large, diverse, and richly annotated human activity data. We present HumanNet, a one-million-hour human-centric video corpus that captures how humans interact with the physical world at scale. HumanNet spans both first-person and third-person perspectives and covers fine-grained activities, human-object interactions, tool use, and long-horizon behaviors across diverse real-world environments. Beyond raw video, the dataset provides interaction-centric annotations, including captions, motion descriptions, and hand and body-related signals, enabling motion-aware and interaction-aware learning. Beyond scale, HumanNet introduces a systematic data curation paradigm for embodied learning, where human-centric filtering, temporal structuring, viewpoint diversity, and annotation enrichment are treated as first-class design principles. This design transforms unstructured internet video into a scalable substrate for representation learning, activity understanding, motion generation, and human-to-robot transfer. We conduct a first-step validation on the value of this design through controlled vision-language-action ablation: under a fixed set of validation data, continued training from the Qwen VLM model with 1000 hours of egocentric video drawn from HumanNet surpasses the continued training with 100 hours of real-robot data from Magic Cobot, indicating that egocentric human video could be a scalable and cost-effective substitute for robot data. By building this project, we aim to explore the opportunity to scale embodied foundation models using human-centric videos, rather than relying solely on robot-specific data.

  8. Anisotropic Modality Align

    Training multimodal large language models has long been limited by the scarcity of high-quality paired multimodal data. Recent studies show that the shared representation space of pretrained multimodal contrastive models can serve as a bridge, enabling models to perform multimodal training with unimodal data. However, the key premise of this paradigm remains insufficiently understood: can representations from different modalities be reliably interchanged? The core obstacle lies in the persistent Modality Gap in the shared space. In this work, we revisit the geometric nature of the modality gap. We find that modality representations already share compatible dominant semantic geometry. What truly hinders modality interchangeability is not a simple global shift, but an anisotropic residual structure concentrated along a small number of dominant directions. Based on this finding, we further propose the principle of anisotropic modality gap alignment: effective modality alignment should align with the target-modality distribution while preserving the semantic structure of the source modality. Guided by this principle, we propose an anisotropic geometric correction framework, AnisoAlign, for unpaired modality alignment. This framework leverages the internal geometric prior of the target modality and performs bounded correction on source-modality representations, thereby constructing substitute representations in the target modality. Experiments confirm its benefits in both geometric diagnostics and text-only MLLM training. Overall, this work recasts the modality gap from an empirical observation into a correctable, structured geometric phenomenon and provides a new representation alignment perspective for training multimodal models with unimodal data.

  9. Beyond Retrieval: A Multitask Benchmark and Model for Code Search

    Code search has usually been evaluated as first-stage retrieval, even though production systems rely on broader pipelines with reranking and developer-style queries. Existing benchmarks also suffer from data contamination, label noise, and degenerate binary relevance. In this paper, we introduce CoREB, a contamination-limited, multitask code retrieval and reranking benchmark, together with a fine-tuned code reranker, that goes beyond retrieval to cover the full code search pipeline. CoREB is built from counterfactually rewritten LiveCodeBench problems in five programming languages and delivered as timed releases with graded relevance judgments. We benchmark eleven embedding models and five rerankers across three tasks: text-to-code, code-to-text, and code-to-code. Our experiments reveal that: (1) code-specialised embeddings dominate code-to-code retrieval (~2x over general encoders), yet no single model wins all three tasks; (2) short keyword queries, the format closest to real developer search, collapse every model to near-zero nDCG@10; (3) off-the-shelf rerankers are task-asymmetric, with a 12-point swing on code-to-code and no baseline net-positive across all tasks; (4) our fine-tuned CoREB-Reranker is the first to achieve consistent gains across all three tasks. The data and model are released.

  10. TextLDM: Language Modeling with Continuous Latent Diffusion

    Diffusion Transformers (DiT) trained with flow matching in a VAE latent space have unified visual generation across images and videos. A natural next step toward a single architecture for both generation (visual synthesis) and understanding (text generation) is to apply this framework to language modeling. We propose TextLDM, which transfers the visual latent diffusion recipe to text generation with minimal architectural modification. A Transformer-based VAE maps discrete tokens to continuous latents, enhanced by Representation Alignment (REPA) with a frozen pretrained language model to produce representations effective for conditional denoising. A standard DiT then performs flow matching in this latent space, identical in architecture to its visual counterpart. The central challenge we address is obtaining high-quality continuous text representations: we find that reconstruction fidelity alone is insufficient, and that aligning latent features with a pretrained language model via REPA is critical for downstream generation quality. Trained from scratch on OpenWebText2, TextLDM substantially outperforms prior diffusion language models and matches GPT-2 under the same settings. Our results establish that the visual DiT recipe transfers effectively to language, taking a concrete step toward unified diffusion architectures for multimodal generation and understanding.

  11. UniPrefill: Universal Long-Context Prefill Acceleration via Block-wise Dynamic Sparsification

    As large language models (LLMs) continue to advance rapidly, they are becoming increasingly capable while simultaneously demanding ever-longer context lengths. To improve the inference efficiency of long-context processing, several novel low-complexity hybrid architectures have recently been proposed, effectively alleviating the computational burden of long-context inference. However, existing research on long-context prefill acceleration remains predominantly focused on sparse attention mechanisms, which achieve their maximum speedup only on full-attention models. When transferred to emerging architectures, such as linear/full attention hybrids or sliding window/full attention hybrids, these prefill acceleration approaches suffer significant performance degradation. Furthermore, such methods are generally incompatible with continuous batching, making them difficult to integrate into modern inference engines such as vLLM. To this end, we propose UniPrefill, a prefill acceleration framework applicable to virtually any model architecture, which directly accelerates the model's computation at the token level. We further implement UniPrefill as a continuous batching operator and extend vLLM's scheduling strategy to natively support prefill-decode co-processing and tensor parallel for UniPrefill, enabling its seamless integration into vLLM. UniPrefill achieves up to 2.1x speedup in Time-To-First-Token (TTFT), with the acceleration becoming increasingly pronounced as the number of concurrent requests grows.

  12. DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

    AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon and high-stakes action executions. Due to their high capability and flexibility, such agents raise significant security and safety concerns. A growing number of real-world incidents have shown that adversaries can easily manipulate agents into performing harmful actions, such as leaking API keys, deleting user data, or initiating unauthorized transactions. Evaluating agent security is inherently challenging, as agents operate in dynamic, untrusted environments involving external tools, heterogeneous data sources, and frequent user interactions. However, realistic, controllable, and reproducible environments for large-scale risk assessment remain largely underexplored. To address this gap, we introduce the DecodingTrust-Agent Platform (DTap), the first controllable and interactive red-teaming platform for AI agents, spanning 14 real-world domains and over 50 simulation environments that replicate widely used systems such as Google Workspace, Paypal, and Slack. To scale the risk assessment of agents in DTap, we further propose DTap-Red, the first autonomous red-teaming agent that systematically explores diverse injection vectors (e.g., prompt, tool, skill, environment, combinations) and autonomously discovers effective attack strategies tailored to varying malicious goals. Using DTap-Red, we curate DTap-Bench, a large-scale red-teaming dataset comprising high-quality instances across domains, each paired with a verifiable judge to automatically validate attack outcomes. Through DTap, we conduct large-scale evaluations of popular AI agents built on various backbone models, spanning security policies, risk categories, and attack strategies, revealing systematic vulnerability patterns and providing valuable insights for developing secure next-generation agents.

  13. AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning

    Reinforcement learning (RL) has substantially improved the ability of large language model (LLM) agents to interact with environments and solve multi-turn tasks. However, effective agentic RL remains challenging: sparse outcome-only rewards provide limited guidance for assigning credit to individual steps within long interaction trajectories. Existing approaches often introduce dense intermediate supervision, such as process reward models or auxiliary self-supervised signals, which increases supervision and tuning complexity and may limit generalization across tasks and domains. We present AEM, a supervision-free credit assignment method that adaptively modulates entropy dynamics during RL training to improve the exploration-exploitation trade-off. Since in agentic RL the environment is typically affected by a complete response, rather than an individual token, our analysis lifts entropy dynamics from the token level to the response level, aligning uncertainty estimation with the effective action granularity of LLM agents and reducing sensitivity to token-level sampling noise. We further show that entropy drift under natural-gradient updates is governed by the interaction between the sampled-response advantage and its relative surprisal. Motivated by this result, AEM derives a practical response-level uncertainty proxy and uses it to rescale advantages, leveraging the evolving balance between positive and negative samples to naturally transition from exploration to exploitation. Extensive experiments on ALFWorld, WebShop, and SWE-bench-Verified with models ranging from 1.5B to 32B demonstrate that AEM consistently improves strong RL baselines, including a +1.4% gain when integrated into a state-of-the-art software-engineering RL training framework.

  14. MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning

    With the rise in scale for deep learning models to billions of parameters, the computational cost of fine-tuning remains a significant barrier to deployment. While Low-Rank Adaptation (LoRA) has become the standard for parameter-efficient fine-tuning, the need to set a predefined, static rank r requires exhaustive grid searches to balance efficiency and performance. Existing rank-adaptive solutions such as DyLoRA mitigate this by sampling ranks during the training from a predefined distribution. However, they often yield sub-optimal results at higher ranks due to lack of consistent gradient signals across the full hierarchy of ranks, thus making these methods data-inefficient. In this paper, we propose MatryoshkaLoRA, a general, Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations by inserting a fixed, carefully crafted diagonal matrix P between the existing LoRA adapters to scale their sub-ranks accordingly. By introducing this simple modification, our general framework recovers LoRA and DyLoRA only by changing P and ensures all sub-ranks embed the available gradient information efficiently. Our MatryoshkaLoRA supports dynamic rank selection with minimal degradation in accuracy. We further propose Area Under the Rank Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters. Our results demonstrate that MatryoshkaLoRA learns more accurate hierarchical low-rank representations than prior rank-adaptive approaches and achieves superior accuracy-performance trade-offs across ranks on the evaluated datasets. Our code is available at https://github.com/IST-DASLab/MatryoshkaLoRA.

  15. Rethinking State Tracking in Recurrent Models Through Error Control Dynamics

    The theory of state tracking in recurrent architectures has predominantly focused on expressive capacity: whether a fixed architecture can theoretically realize a set of symbolic transition rules. We argue that equally important is error control, the dynamics governing hidden-state drift along the directions that distinguish symbolic states. We prove that affine recurrent networks, a class of models encompassing State-Space Models and Linear Attention, cannot correct errors along state-separating subspaces once they preserve state representations. Consequently, practical affine trackers do not learn robust state tracking; rather, they learn finite horizon solutions governed by accumulated state-relevant error. We characterize the mechanics of this failure, showing that tracking remains readable only while the accumulating within-class spread remains small relative to the initial between-class separation. We demonstrate empirically on group state-tracking tasks that this breakdown is predictable: tracking collapses when the distinguishability ratio crosses the readability threshold of the trained decoder. Across trained models, the point of this crossing predicts the horizon at which downstream accuracy fails. These results establish that robust state tracking is determined not only by an architecture's theoretical expressivity but crucially by its error control.

Techmeme(15)

  1. Musk v. Altman: Satya Nadella says Elon Musk never contacted him with concerns that Microsoft's investments in OpenAI violated any special terms or commitments (CNBC)

    Microsoft CEO Satya Nadella took the stand in the Musk v. Altman trial on Monday, where he testified that Elon Musk never contacted …

  2. Musk v. Altman: Ilya Sutskever testifies that his OpenAI stake is worth ~$7B and he had concerns about Altman for a year before Altman's brief ouster as CEO (Rachel Metz/Bloomberg)

    OpenAI co-founder and former chief scientist Ilya Sutskever said his stake in the ChatGPT maker is worth roughly $7 billion …

  3. Google's TIG says it likely thwarted the use of an AI-generated zero-day in a "mass exploitation event" and tools like OpenClaw are being used to find exploits (Samantha Subin/CNBC)

    Google's Threat Intelligence Group said in a report on Monday that it thwarted an effort by hackers …

  4. Sources: the White House's Office of the National Cyber Director and Commerce Department's CAISI are fighting over which agency should lead AI model evaluations (Washington Post)

    As the White House grapples with cybersecurity threats from artificial intelligence models, intelligence officials want sway in AI policy overseen by Commerce.

  5. An Anthropic engineer argues HTML is a better output format for AI agents than Markdown, citing information density, ease of sharing, and two-way interaction (@trq212)

    Using Claude Code: The Unreasonable Effectiveness of HTML

  6. Apple releases iOS 26.5, introducing end-to-end encryption for RCS messaging in beta with supported carriers; the setting is enabled by default (Chance Miller/9to5Mac)

    iOS 26.5 is now available to everyone after six weeks of beta testing. The update adds fresh wallpapers, new features to Apple Maps, and more.

  7. Texas AG Ken Paxton sues Netflix for allegedly spying on consumers by collecting their data without consent, and designing its platform to be addictive (Jonathan Stempel/Reuters)

    Netflix (NFLX.O) was sued on Monday by Texas Attorney General Ken Paxton, who accused the streaming company of spying on consumers …

  8. Sources: German defense tech startup Helsing is set to raise $1.2B led by Dragoneer and Lightspeed at a valuation of about $18B, up from $14B in June 2025 (Financial Times)

    Financial Times: German company backed by Spotify's Daniel Ek set to raise $1.2bn in latest funding round. Helsing, the German defence technology group backed …

  9. curl founder Daniel Stenberg says Mythos identified five vulnerabilities in curl, but a manual review found three were false positives and one was "just a bug" (Daniel Stenberg/daniel.haxx.se)

    Daniel Stenberg / daniel.haxx.se: yes, as in singular one. Back in April 2026 Anthropic caused a lot of media noise when they concluded …

  10. Agentic inference is set to be different than today's inference, and will change compute infrastructure because speed won't matter when humans aren't involved (Ben Thompson/Stratechery)

    Ben Thompson / Stratechery: If you were looking for the ideal time to IPO, being a chip company in May 2026 is hard to beat.

  11. Israeli startup Frame Security, which protects organizations from AI-powered social engineering attacks, emerges from stealth with $50M led by Index and others (CTech)

    CTech: The Israeli firm aims to modernize security awareness training as AI makes cyberattacks more convincing and scalable.

  12. Venmo implements a major privacy measure, setting new users' posts to "friends only" by default during onboarding; in 2021, a reporter found Joe Biden's Venmo (Jay Peters/The Verge)

    Jay Peters / The Verge: As part of a big app redesign, Venmo will change default settings for new users so that posts will be visible only to friends by default.

  13. Cowboy Space, led by Robinhood co-founder Baiju Bhatt to build data centers in orbit, raised a $275M Series B led by Index Ventures at a $2B valuation (Bruce Einhorn/Bloomberg)

    Bruce Einhorn / Bloomberg: A space unicorn started by Baiju Bhatt, the billionaire co-founder of Robinhood Markets Inc., raised $275 million, changed its name …

  14. OpenAI launches the OpenAI Deployment Company with a $4B+ investment to help organizations build and deploy AI systems, and acquires AI consulting firm Tomoro (Reuters)

    Reuters: OpenAI said on Monday it is setting up a new company with more than $4 billion in initial investment to help organizations build …

  15. Google's TIG reports the first known example of hackers using AI to discover and weaponize a zero-day; TIG's chief analyst says "this is the tip of the iceberg" (Dustin Volz/New York Times)

    Dustin Volz / New York Times: The company said that it had identified, for the first time, hackers using artificial intelligence to discover an unknown bug.

Solidot(15)

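  1. US analyst says sovereign cloud is hard to achieve outside the US and China

    Gartner vice president Douglas Toombs believes a fully autonomous sovereign cloud is unlikely to be achievable outside the US and China. He says only those two countries have all the technologies a sovereign cloud requires. Even on-premises cloud offerings such as AWS Outposts, Azure Local, or Oracle Dedicated Cloud Regions still need to communicate with the parent company. He expects Europe's sovereign cloud efforts to fail, citing Boston Consulting Group's "Rule of Three and Four": a stable competitive market never has more than three significant competitors, and the largest of them has no more than four times the market share of the smallest. He predicts the cloud market will stabilize around three companies: AWS, Google, and Microsoft.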

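  2. Honda's new patent simulates a clutch for electric motorcycles

    A recently disclosed patent shows Honda is developing a simulated clutch for electric motorcycles, reproducing the riding experience of a conventional combustion motorcycle. The simulated clutch system offers a torque-boosted launch function and even haptic feedback. It uses electronics to vary the motor's response according to the position of the clutch lever: with the lever pulled halfway, the system reduces motor output proportionally; with the lever pulled in fully, power is cut entirely. According to the patent, the rider holds the clutch in, twists the electronic throttle to bring the motor to high revs, then quickly releases the clutch to get a "burst launch" like that of a combustion motorcycle. In competition, this technique can help riders accelerate faster on loose terrain or off the line. The patent also describes several vibration motors mounted on the handlebar and near the clutch lever that provide haptic feedback, simulating engine vibration and even the feel of the "bite point" as the clutch engages.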

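  3. Mythos finds a curl vulnerability

    Mythos, the new AI model Anthropic announced last month, drew wide media attention. Anthropic promoted it as able to find security vulnerabilities in source code with extreme precision, a detection ability said to be so strong that the company is withholding the model from the public for now and giving it first to a handful of companies so they can fix the vulnerabilities it finds ahead of everyone else. curl maintainer Daniel Stenberg considers this an extremely successful marketing stunt. Because curl is a widely used open source project, he was given access to Mythos. curl currently contains 176,000 lines of C code, 660,000 words in total. Mythos eventually returned a security report claiming five confirmed security vulnerabilities. After careful review, curl's security team found that three were false positives, one was a bug, and one was a low-severity vulnerability that will be fixed in the release due next month. The report also documented about 20 bugs in detail, essentially all of them correct. Stenberg says he has seen no evidence that Mythos is better at finding security vulnerabilities than earlier tools; it may be slightly better, but not enough to have a significant impact on code analysis.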

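  4. Forza Horizon 6 game files leak 10 days early

    Microsoft-owned studio Playground Games uploaded unencrypted game files for Forza Horizon 6 to Steam 10 days ahead of launch. Several piracy sites have already posted downloads of the game. Forza Horizon 6 is a racing game set in Japan, due for release on May 19, with an install size of about 155 GB. This is not the first time an AAA title has leaked this way: in March this year, the PC version of Kojima Productions' Death Stranding 2 was likewise uploaded to Steam unencrypted a few days before release.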

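  5. You inherited your father's RNA

    Xin Yin, a biochemist at Nanjing University, spent a sunny afternoon acting as personal trainer to mice, setting them to run on small treadmills. They were all athletes, running longer than the control group and accumulating less lactate. Yet these mice were genetically no different from the controls; their better performance may be linked to their fathers' exercise habits. The finding suggests that running benefits not only the runner but possibly also children who are not yet born. Xin Yin's team found that the sperm of exercising mice carried higher concentrations of RNA fragments called microRNA than the sperm of sedentary mice. When these molecules were injected into unrelated embryos, the resulting offspring performed as well as the offspring of fathers that exercised. Mouse studies over the past two decades have found that, besides DNA, sperm also pass RNA fragments such as microRNA to offspring. The concentrations of these fragments fluctuate with factors such as exercise or inactivity, high-fat or high-sugar diets, everyday stress, childhood trauma, heavy drinking, and exposure to harmful substances such as pesticides. Studies have found that the offspring of parents who are overweight or under mental health stress are also more likely to develop these conditions.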

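  6. Nearly 3,000 peer-reviewed medical papers contain fabricated citations

    An AI-assisted assessment by Columbia University School of Nursing found that nearly 3,000 peer-reviewed medical papers contain fabricated citations: references that do not exist in scientific databases and were made up by AI. The team built an automated verification system that used AI to scan 2.5 million papers published in the PubMed Central open-access database between January 1, 2023 and February 18, 2026. Among 97.1 million verified references, the researchers found 4,046 fabricated citations in 2,810 papers. The rate of fabricated citations has grown more than 12-fold since 2023, with the sharpest increase beginning in mid-2024, which coincides with the rise of AI writing tools. The researchers say they found one paper in which 18 of 30 references were fabricated. Some fabricated citations have already been cited by other papers and appear in systematic reviews that inform clinical care.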

  7. Angry Birds, FIFA, and others inducted into the video game hall of fame

    Four games, Angry Birds, EA Sports FIFA International Soccer, Dragon Quest, and Silent Hill, have been inducted into the video game hall of fame at The Strong National Museum of Play in the US. The other finalists were Frogger, Galaga, League of Legends, Mega Man, PaRappa the Rapper, RuneScape, The Elder Scrolls V: Skyrim, and Tokimeki Memorial.

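  8. US IT sector unemployment rises

    According to US Department of Labor data, the unemployment rate in the US IT sector rose from 3.6% in March to 3.8% in April. The US added 115,000 jobs in April and the overall unemployment rate held at 4.3%, but the IT sector lost 13,000 jobs. It is too early to say what effect AI is having on employment overall, but US tech companies have been citing AI as one of the reasons when announcing layoffs: Meta announced in April that it would cut 10% of its staff, about 8,000 people, citing streamlined operations and the need to pay for its huge AI investments; Nike cut 1,400 jobs, mainly in its IT department; and so on. Hiring platform Indeed says software developer job postings are up 15% year over year, but employers favor experienced developers over new graduates.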

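  9. Study: 1% of Polymarket users captured 76.5% of the gains

    Researchers analyzed trading gains and losses on Polymarket, the largest prediction market platform, using a dataset that spans 2022 to 2026 and covers more than 2.4 million users, $67 billion in trading volume, and 588 million trades. They found that the top 1% of users captured 76.5% of trading gains. Of these, 1,200 people earned $591 million, an average of $100,000 each. Most of these users are professional players who likely used automated tools to make large numbers of trades in a short time. Inexperienced retail traders are usually on the losing side. The results suggest that the informational gains of prediction markets come at the expense of inexperienced participants.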

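  10. PS3 emulator developers ask contributors to stop submitting AI slop code

    Since AI-assisted programming took off, open source projects have faced a flood of low-quality AI slop pull requests. RPCS3, the PS3 emulator project founded in 2011, used social media to politely ask contributors not to submit AI slop code pull requests to its GitHub page. RPCS3 says it will ban users who submit code without disclosing that they used AI, and stresses that the internet offers plenty of resources for learning how to debug and write code, so there is no need to generate AI code that the submitter does not understand and that does not work. The RPCS3 developers say the AI slop they are seeing could not have been written by hand.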

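  11. Singapore to introduce caning for boys who bully others

    Singapore's Ministry of Education has announced a set of measures to help schools deal with bullying. Depending on the severity of the incident, offending students may be given detention or suspension, or even caned. Education Minister Desmond Lee said schools will cane students only when other punishments are not enough, and will follow strict procedures to ensure the student's safety. The principal must approve the caning, and an authorized teacher carries it out. Schools will also consider whether the student is mentally mature enough and whether caning would help the student learn from the mistake. Lee said caning will never be carried out in isolation, but as part of a full package of rehabilitative and disciplinary measures. After a caning, the school will monitor the student's physical and mental well-being, counsel the student, and support the student's rehabilitation. Schools will cane only male students who have committed serious offenses and shown a poor attitude; female students will not be caned. This rule follows Ministry of Education regulations and is modeled on the Criminal Procedure Code. It does not mean that girls who bully classmates bear less responsibility. Schools will make sure punishment is based on the severity of the behavior. Girls can also be given suspension or detention, have their conduct grade lowered, and face other penalties.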

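  12. Why parents in the industrial age feel more sleep-deprived than hunter-gatherer parents

    Raising children is an experience that runs through all of human history. Evolutionary anthropologist David Samson lived for three months in a hunter-gatherer community in Tanzania and asked residents how they slept; they all answered that they slept well. Ask the parents of newborns in modern industrial societies, however, and they will say their sleep is terrible. A German study found that mothers of newborns rated their sleep satisfaction at 6.57/10 and fathers at 7.0/10; a French survey found that nearly two-thirds of mothers of newborns reported not getting enough sleep. Helen Ball, author of How Babies Sleep: A Factful Guide to the First 365 Days and Nights, argues that our ancestors, unlike us, did not face the pressure of a nine-to-five (or eight-to-five) job that demands a certain amount of sleep in order to function and stay safe the next day. She notes that they did not drive or operate heavy machinery, so the things that matter to us were simply not a problem for them. To reduce the risk of sudden infant death syndrome, modern societies encourage infants to sleep in the same room as the mother but in a separate bed, whereas mothers and infants in hunter-gatherer societies share a bed. Another major problem in modern societies is that the parents of newborns lack social support.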

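  13. Walking 8,500 steps a day helps maintain weight

    Obesity is a growing public health problem, and by 2035 more than 30% of the world's population is expected to be obese. Finding new strategies that improve weight-loss outcomes is a public health priority. According to a study published in the International Journal of Environmental Research and Public Health that compared an intervention group with a control group, walking about 8,500 steps a day helps people who have lost weight keep it off over the long term. The researchers argue that people should be encouraged to raise their daily step count to about 8,500 during the weight-loss phase and keep it there during the maintenance phase.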

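  14. EU considers restricting use of US cloud services for sensitive data

    The EU is considering rules that would restrict member state governments from using US cloud platforms to process sensitive data. The European Commission is expected to publish its technology sovereignty plan, the Tech Sovereignty Package, on May 27; the plan contains a series of measures to strengthen strategic autonomy in the digital sphere. Inside the Commission, officials have discussed limiting the exposure of sensitive public-sector data to non-EU cloud platforms. Cloud providers from outside the EU, including those from the US, could be affected. The Commission is not banning non-EU cloud platforms from bidding on government contracts outright; instead, it would restrict their handling of sensitive data according to how sensitive the data is. The sensitive data under discussion is mainly financial, judicial, and healthcare data.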

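  15. Korean humanoid robot takes refuge in the Buddha

    A Korean humanoid robot named Gabi took part in a modified refuge ceremony and became a monk of the Jogye Order of Korean Buddhism. It vowed to respect life, obey humans, and treat other robots and objects peacefully. Gabi's name in Korean is 자비, meaning compassion; the robot is made by Hangzhou-based Unitree Robotics and starts at $13,500. At the ceremony Gabi agreed to five vows normally recited by human monks, slightly modified to suit a humanoid robot. The robot pledged to respect life, to treat other robots and objects peacefully, to listen to humans, to avoid deceptive words and actions, and to conserve energy. Gabi also took part in a modified purification rite. For human monks, the rite usually involves a light burn on the arm with incense, symbolizing purification of body and mind. Gabi instead received a Lotus Lantern Festival sticker and a string of prayer beads. The move follows a pledge by Ven. Jinwoo, president of the Jogye Order, in his New Year's address to bring AI into Buddhist tradition. In a statement, Ven. Jinwoo said the order would fearlessly lead the AI era and guide it toward peace of mind and enlightenment.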
