OrangeBot.AI Digest — 2026-03-25

87 headlines across 8 sources, aggregated for the day.

Hacker News (15)

  1. The EU still wants to scan your private messages and photos (fightchatcontrol.eu)
  2. Apple randomly closes bug reports unless you "verify" the bug remains unfixed (lapcatsoftware.com)
  3. Tracy Kidder has died (www.nytimes.com)
  4. Meta and YouTube found negligent in landmark social media addiction case (www.nytimes.com)
  5. Jury says Meta knowingly harmed children for profit, awarding landmark verdict (www.latimes.com)
  6. Slovenian officials blame Israeli firm Black Cube for trying to manipulate vote (www.wsj.com)
  7. Thoughts on slowing the fuck down (mariozechner.at)
  8. Antimatter has been transported for the first time (www.nature.com)
  9. Supreme Court Sides with Cox in Copyright Fight over Pirated Music (www.nytimes.com)
  10. Apple Just Lost Me (andregarzia.com)
  11. Ensu – Ente’s Local LLM app (ente.com)
  12. My astrophotography in the movie Project Hail Mary (rpastro.square.site)
  13. Meta told to pay $375M for misleading users over child safety (www.bbc.com)
  14. Why I forked httpx (tildeweb.nl)
  15. TurboQuant: Redefining AI efficiency with extreme compression (research.google)

GitHub Trending (12)

  1. mvanhorn / last30days-skill
  2. bytedance / deer-flow
  3. BerriAI / litellm
  4. pascalorg / editor
  5. letta-ai / claude-subconscious
  6. ruvnet / ruflo
  7. Crosstalk-Solutions / project-nomad
  8. ruvnet / RuView
  9. supermemoryai / supermemory
  10. FujiwaraChoki / MoneyPrinterV2
  11. usestrix / strix
  12. hsliuping / TradingAgents-CN

Product Hunt (15)

  1. CronBox

    Where AI agents work on a schedule in the cloud

  2. Agentplace AI Agents

    Create specialized AI agents for real tasks and workflows

  3. Omma

    Create 3D, apps, and websites with parallel agents

  4. Pendium

    Help AI agents recommend you more often to the right people

  5. Basedash Insights

    Fully autonomous data analysis agent for daily insights

  6. Descent

    Set a budget and get alerted when flights get cheap

  7. Axra

    AI-native global banking on stablecoins for emerging markets

  8. ClipTask

    Turns screen recording into structured, AI-generated tasks

  9. 3Flow AI

    Generate design images and 3D models for product design

  10. Flowershow

    Publish your markdown as a beautiful website – in seconds.

  11. agumbe.dev

    AI workspaces for building and running apps on Kubernetes

  12. Facts...No Bullsh*t

    Stop BS in real-time with AI that fact-checks as you listen

  13. Uni-1 by Luma

    A unified foundation model that thinks in pixels

  14. Auto Mode by Claude Code

    Let Claude make permission decisions on your behalf

  15. Splitsense

    AI that turns traffic into more revenue while you sleep

Hugging Face (15)

  1. MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

    Optical character recognition (OCR) has evolved from line-level transcription to structured document parsing, requiring models to recover long-form sequences containing layout, tables, and formulas. Despite recent advances in vision-language models, most existing systems rely on autoregressive decoding, which introduces sequential latency and amplifies error propagation in long documents. In this work, we revisit document OCR from an inverse rendering perspective, arguing that left-to-right causal generation is an artifact of serialization rather than an intrinsic property of the task. Motivated by this insight, we propose MinerU-Diffusion, a unified diffusion-based framework that replaces autoregressive sequential decoding with parallel diffusion denoising under visual conditioning. MinerU-Diffusion employs a block-wise diffusion decoder and an uncertainty-driven curriculum learning strategy to enable stable training and efficient long-sequence inference. Extensive experiments demonstrate that MinerU-Diffusion consistently improves robustness while achieving up to 3.2x faster decoding compared to autoregressive baselines. Evaluations on the proposed Semantic Shuffle benchmark further confirm its reduced dependence on linguistic priors and stronger visual OCR capability.

  2. WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG

    Dynamical systems theory and reinforcement learning view world evolution as latent-state dynamics driven by actions, with visual observations providing partial information about the state. Recent video world models attempt to learn this action-conditioned dynamics from data. However, existing datasets rarely match the requirement: they typically lack diverse and semantically meaningful action spaces, and actions are directly tied to visual observations rather than mediated by underlying states. As a result, actions are often entangled with pixel-level changes, making it difficult for models to learn structured world dynamics and maintain consistent evolution over long horizons. In this paper, we propose WildWorld, a large-scale action-conditioned world modeling dataset with explicit state annotations, automatically collected from a photorealistic AAA action role-playing game (Monster Hunter: Wilds). WildWorld contains over 108 million frames and features more than 450 actions, including movement, attacks, and skill casting, together with synchronized per-frame annotations of character skeletons, world states, camera poses, and depth maps. We further derive WildBench to evaluate models through Action Following and State Alignment. Extensive experiments reveal persistent challenges in modeling semantically rich actions and maintaining long-horizon state consistency, highlighting the need for state-aware video generation. The project page is https://shandaai.github.io/wildworld-project/.

  3. SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning

    Agentic multimodal large language models (MLLMs) (e.g., OpenAI o3 and Gemini Agentic Vision) achieve remarkable reasoning capabilities through iterative visual tool invocation. However, the cascaded perception, reasoning, and tool-calling loops introduce significant sequential overhead. This overhead, termed agentic depth, incurs prohibitive latency and seriously limits system-level concurrency. To this end, we propose SpecEyes, an agentic-level speculative acceleration framework that breaks this sequential bottleneck. Our key insight is that a lightweight, tool-free MLLM can serve as a speculative planner to predict the execution trajectory, enabling early termination of expensive tool chains without sacrificing accuracy. To regulate this speculative planning, we introduce a cognitive gating mechanism based on answer separability, which quantifies the model's confidence for self-verification without requiring oracle labels. Furthermore, we design a heterogeneous parallel funnel that exploits the stateless concurrency of the small model to mask the stateful serial execution of the large model, maximizing system throughput. Extensive experiments on V* Bench, HR-Bench, and POPE demonstrate that SpecEyes achieves 1.1-3.35x speedup over the agentic baseline while preserving or even improving accuracy (up to +6.7%), thereby boosting serving throughput under concurrent workloads.

  4. From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

    Large language model (LLM)-based systems are becoming increasingly popular for solving tasks by constructing executable workflows that interleave LLM calls, information retrieval, tool use, code execution, memory updates, and verification. This survey reviews recent methods for designing and optimizing such workflows, which we treat as agentic computation graphs (ACGs). We organize the literature based on when workflow structure is determined, where structure refers to which components or agents are present, how they depend on each other, and how information flows between them. This lens distinguishes static methods, which fix a reusable workflow scaffold before deployment, from dynamic methods, which select, generate, or revise the workflow for a particular run before or during execution. We further organize prior work along three dimensions: when structure is determined, what part of the workflow is optimized, and which evaluation signals guide optimization (e.g., task metrics, verifier signals, preferences, or trace-derived feedback). We also distinguish reusable workflow templates, run-specific realized graphs, and execution traces, separating reusable design choices from the structures actually deployed in a given run and from realized runtime behavior. Finally, we outline a structure-aware evaluation perspective that complements downstream task metrics with graph-level properties, execution cost, robustness, and structural variation across inputs. Our goal is to provide a clear vocabulary, a unified framework for positioning new methods, a more comparable view of the existing body of literature, and a more reproducible evaluation standard for future work in workflow optimization for LLM agents.

  5. PEARL: Personalized Streaming Video Understanding Model

    Human cognition of new concepts is inherently a streaming process: we continuously recognize new objects or identities and update our memories over time. However, current multimodal personalization methods are largely limited to static images or offline videos. This disconnects continuous visual input from instant real-world feedback, limiting their ability to provide the real-time, interactive personalized responses essential for future AI assistants. To bridge this gap, we first propose and formally define the novel task of Personalized Streaming Video Understanding (PSVU). To facilitate research in this new direction, we introduce PEARL-Bench, the first comprehensive benchmark designed specifically to evaluate this challenging setting. It evaluates a model's ability to respond to personalized concepts at exact timestamps under two modes: (1) Frame-level, focusing on a specific person or object in discrete frames, and (2) a novel Video-level, focusing on personalized actions unfolding across continuous frames. PEARL-Bench comprises 132 unique videos and 2,173 fine-grained annotations with precise timestamps. Concept diversity and annotation quality are strictly ensured through a combined pipeline of automated generation and human verification. To tackle this challenging new setting, we further propose PEARL, a plug-and-play, training-free strategy that serves as a strong baseline. Extensive evaluations across 8 offline and online models demonstrate that PEARL achieves state-of-the-art performance. Notably, it brings consistent PSVU improvements when applied to 3 distinct architectures, proving to be a highly effective and robust strategy. We hope this work advances vision-language model (VLM) personalization and inspires further research into streaming personalized AI assistants. Code is available at https://github.com/Yuanhong-Zheng/PEARL.

  6. DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models

    Optical flow models trained on high-quality data often degrade severely when confronted with real-world corruptions such as blur, noise, and compression artifacts. To overcome this limitation, we formulate Degradation-Aware Optical Flow, a new task targeting accurate dense correspondence estimation from real-world corrupted videos. Our key insight is that the intermediate representations of image restoration diffusion models are inherently corruption-aware but lack temporal awareness. To address this limitation, we lift the model to attend across adjacent frames via full spatio-temporal attention, and empirically demonstrate that the resulting features exhibit zero-shot correspondence capabilities. Based on this finding, we present DA-Flow, a hybrid architecture that fuses these diffusion features with convolutional features within an iterative refinement framework. DA-Flow substantially outperforms existing optical flow methods under severe degradation across multiple benchmarks.

  7. SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM

    High-quality articulated 3D assets are indispensable for embodied AI and physical simulation, yet 3D generation still focuses on static meshes, leaving a gap in "sim-ready" interactive objects. Most recent articulated object creation methods rely on multi-stage pipelines that accumulate errors across decoupled modules. Alternatively, unified MLLMs offer a single-stage path to joint static asset understanding and sim-ready asset generation. However, dense voxel-based 3D tokenization yields long 3D token sequences and high memory overhead, limiting scalability to complex articulated objects. To address this, we propose SIMART, a unified MLLM framework that jointly performs part-level decomposition and kinematic prediction. By introducing a Sparse 3D VQ-VAE, SIMART reduces token counts by 70% vs. dense voxel tokens, enabling high-fidelity multi-part assemblies. SIMART achieves state-of-the-art performance on PartNet-Mobility and in-the-wild AIGC datasets, and enables physics-based robotic simulation.

  8. UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation

    Unified models capable of interleaved generation have emerged as a promising paradigm, with the community increasingly converging on autoregressive modeling for text and flow matching for image generation. To advance this direction, we propose a unified reinforcement learning framework tailored for interleaved generation. We validate our approach on its fundamental unit: a single round of reasoning-driven image generation, where the model first expands the user prompt through reasoning, followed by image synthesis. Formulating this multimodal generation process as a Markov Decision Process with sparse terminal rewards, we introduce UniGRPO to jointly optimize text and image generation policies using GRPO. Adopting a minimalist methodology to avoid over-design, we leverage established training recipes for both modalities by seamlessly integrating standard GRPO for reasoning and FlowGRPO for visual synthesis. To ensure scalability to multi-round interleaved generation, we introduce two critical modifications to the original FlowGRPO: (1) eliminating classifier-free guidance to maintain linear, unbranched rollouts, which is essential for scaling to complex scenarios involving multi-turn interactions and multi-condition generation (e.g., editing); and (2) replacing the standard latent KL penalty with an MSE penalty directly on the velocity fields, providing a more robust and direct regularization signal to mitigate reward hacking effectively. Our experiments demonstrate that this unified training recipe significantly enhances image generation quality through reasoning, providing a robust and scalable baseline for the future post-training of fully interleaved models.
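
The penalty swap the UniGRPO abstract describes can be sketched in isolation. The snippet below is a pure illustration under assumed shapes and names (plain Python lists standing in for latent tensors; function names are hypothetical): the usual KL estimate against the reference policy is replaced with a mean-squared error computed directly on the predicted flow-matching velocity fields.

```python
def latent_kl_estimate(logp_policy, logp_ref):
    """Standard per-sample KL estimate between policy and reference,
    computed from log-probabilities of the same latents."""
    n = len(logp_policy)
    return sum(lp - lr for lp, lr in zip(logp_policy, logp_ref)) / n

def velocity_mse_penalty(v_policy, v_ref):
    """UniGRPO-style replacement: penalize the policy's flow-matching
    velocity predictions directly against the reference model's,
    giving a regularization signal without a latent-space KL term."""
    n = len(v_policy)
    return sum((vp - vr) ** 2 for vp, vr in zip(v_policy, v_ref)) / n
```

Per the abstract, the appeal of the MSE form is that it acts directly on the velocity fields rather than on latent densities, which the authors argue gives a more robust and direct signal against reward hacking.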

  9. RealMaster: Lifting Rendered Scenes into Photorealistic Video

    State-of-the-art video generation models produce remarkable photorealism, but they lack the precise control required to align generated content with specific scene requirements. Furthermore, without an underlying explicit geometry, these models cannot guarantee 3D consistency. Conversely, 3D engines offer granular control over every scene element and provide native 3D consistency by design, yet their output often remains trapped in the "uncanny valley". Bridging this sim-to-real gap requires both structural precision, where the output must exactly preserve the geometry and dynamics of the input, and global semantic transformation, where materials, lighting, and textures must be holistically transformed to achieve photorealism. We present RealMaster, a method that leverages video diffusion models to lift rendered video into photorealistic video while maintaining full alignment with the output of the 3D engine. To train this model, we generate a paired dataset via an anchor-based propagation strategy, where the first and last frames are enhanced for realism and propagated across the intermediate frames using geometric conditioning cues. We then train an IC-LoRA on these paired videos to distill the high-quality outputs of the pipeline into a model that generalizes beyond the pipeline's constraints, handling objects and characters that appear mid-sequence and enabling inference without requiring anchor frames. Evaluated on complex GTA-V sequences, RealMaster significantly outperforms existing video editing baselines, improving photorealism while preserving the geometry, dynamics, and identity specified by the original 3D control.

  10. 2Xplat: Two Experts Are Better Than One Generalist

    Pose-free feed-forward 3D Gaussian Splatting (3DGS) has opened a new frontier for rapid 3D modeling, enabling high-quality Gaussian representations to be generated from uncalibrated multi-view images in a single forward pass. The dominant approach in this space adopts unified monolithic architectures, often built on geometry-centric 3D foundation models, to jointly estimate camera poses and synthesize 3DGS representations within a single network. While architecturally streamlined, such "all-in-one" designs may be suboptimal for high-fidelity 3DGS generation, as they entangle geometric reasoning and appearance modeling within a shared representation. In this work, we introduce 2Xplat, a pose-free feed-forward 3DGS framework based on a two-expert design that explicitly separates geometry estimation from Gaussian generation. A dedicated geometry expert first predicts camera poses, which are then explicitly passed to a powerful appearance expert that synthesizes 3D Gaussians. Despite its conceptual simplicity, and although largely underexplored in prior work, the proposed approach proves highly effective. In fewer than 5K training iterations, the proposed two-expert pipeline substantially outperforms prior pose-free feed-forward 3DGS approaches and achieves performance on par with state-of-the-art posed methods. These results challenge the prevailing unified paradigm and suggest the potential advantages of modular design principles for complex 3D geometric estimation and appearance synthesis tasks.

  11. Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought

    Multimodal Chain-of-Thought (CoT) reasoning requires large vision-language models to construct reasoning trajectories that interleave perceptual grounding with multi-step inference. However, existing Reinforcement Learning with Verifiable Rewards (RLVR) methods typically optimize reasoning at a coarse granularity, treating CoT tokens uniformly without distinguishing their varying degrees of visual grounding. In this work, we conduct a token-level analysis of multimodal reasoning trajectories and show that successful reasoning is characterized by structured token dynamics reflecting both perceptual grounding and exploratory inference. Building upon this analysis, we propose Perception-Exploration Policy Optimization (PEPO), which derives a perception prior from hidden state similarity and integrates it with token entropy through a smooth gating mechanism to produce token-level advantages. PEPO integrates seamlessly with existing RLVR frameworks such as GRPO and DAPO, requiring neither additional supervision nor auxiliary branches. Extensive experiments across diverse multimodal benchmarks demonstrate consistent and robust improvements over strong RL baselines, spanning geometry reasoning, visual grounding, visual puzzle solving, and few-shot classification, while maintaining stable training dynamics. Code: https://github.com/xzxxntxdy/PEPO
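
As a rough illustration of the gating idea in this abstract, one way to turn a single sequence-level GRPO advantage into per-token advantages is to blend a perception prior with token entropy through a smooth gate. The gate form, constants, and function names below are all hypothetical; the paper's actual mechanism may differ.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def token_advantages(seq_advantage, perception_sim, token_entropy,
                     alpha=5.0, tau=0.5):
    """Spread one sequence-level advantage over individual tokens.

    perception_sim: per-token visual-grounding prior in [0, 1]
                    (a stand-in for the paper's hidden-state similarity)
    token_entropy:  per-token predictive entropy (exploration signal)
    """
    gates = [sigmoid(alpha * (s - tau)) for s in perception_sim]
    # Smoothly blend: strongly grounded tokens are weighted by their
    # entropy, weakly grounded tokens fall back to a neutral weight of 1.
    weights = [g * e + (1.0 - g) for g, e in zip(gates, token_entropy)]
    mean_w = sum(weights) / len(weights)
    weights = [w / mean_w for w in weights]  # preserve the advantage scale
    return [seq_advantage * w for w in weights]
```

Normalizing the weights to mean one keeps the update's overall magnitude equal to the original sequence-level advantage while redistributing it toward grounded, exploratory tokens.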

  12. Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

    Multi-modal large language models (MLLMs) have advanced general-purpose video understanding but struggle with long, high-resolution videos -- they process every pixel equally in their vision transformers (ViTs) or LLMs despite significant spatiotemporal redundancy. We introduce AutoGaze, a lightweight module that removes redundant patches before they are processed by a ViT or an MLLM. Trained with next-token prediction and reinforcement learning, AutoGaze autoregressively selects a minimal set of multi-scale patches that can reconstruct the video within a user-specified error threshold, eliminating redundancy while preserving information. Empirically, AutoGaze reduces visual tokens by 4x-100x and accelerates ViTs and MLLMs by up to 19x, enabling MLLMs to scale to 1K-frame, 4K-resolution videos and achieving superior results on video benchmarks (e.g., 67.0% on VideoMME). Furthermore, we introduce HLVid: the first high-resolution, long-form video QA benchmark with 5-minute 4K-resolution videos, where an MLLM scaled with AutoGaze improves over the baseline by 10.1% and outperforms the previous best MLLM by 4.5%. Project page: https://autogaze.github.io/.

  13. VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models

    Vision-Language-Action (VLA) models typically map visual observations and linguistic instructions directly to robotic control signals. This "black-box" mapping forces a single forward pass to simultaneously handle instruction interpretation, spatial grounding, and low-level control, often leading to poor spatial precision and limited robustness in out-of-distribution scenarios. To address these limitations, we propose VP-VLA, a dual-system framework that decouples high-level reasoning and low-level execution via a structured visual prompting interface. Specifically, a "System 2 Planner" decomposes complex instructions into sub-tasks and identifies relevant target objects and goal locations. These spatial anchors are then overlaid directly onto visual observations as structured visual prompts, such as crosshairs and bounding boxes. Guided by these prompts and enhanced by a novel auxiliary visual grounding objective during training, a "System 1 Controller" reliably generates precise low-level execution motions. Experiments on the Robocasa-GR1-Tabletop benchmark and SimplerEnv simulation demonstrate that VP-VLA improves success rates by 5% and 8.3%, surpassing competitive baselines including QwenOFT and GR00T-N1.6.

  14. ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

    Recent progress in latent world models (e.g., V-JEPA2) has shown promising capability in forecasting future world states from video observations. Nevertheless, dense prediction from a short observation window limits temporal context and can bias predictors toward local, low-level extrapolation, making it difficult to capture long-horizon semantics and reducing downstream utility. Vision-language models (VLMs), in contrast, provide strong semantic grounding and general knowledge by reasoning over uniformly sampled frames, but they are not ideal as standalone dense predictors due to compute-driven sparse sampling, a language-output bottleneck that compresses fine-grained interaction states into text-oriented representations, and a data-regime mismatch when adapting to small action-conditioned datasets. We propose a VLM-guided JEPA-style latent world modeling framework that combines dense-frame dynamics modeling with long-horizon semantic guidance via a dual-temporal pathway: a dense JEPA branch for fine-grained motion and interaction cues, and a uniformly sampled VLM thinker branch with a larger temporal stride for knowledge-rich guidance. To transfer the VLM's progressive reasoning signals effectively, we introduce a hierarchical pyramid representation extraction module that aggregates multi-layer VLM representations into guidance features compatible with latent prediction. Experiments on hand-manipulation trajectory prediction show that our method outperforms both a strong VLM-only baseline and a JEPA-predictor baseline, and yields more robust long-horizon rollout behavior.

  15. AgentSLR: Automating Systematic Literature Reviews in Epidemiology with Agentic AI

    Systematic literature reviews are essential for synthesizing scientific evidence but are costly, difficult to scale, and time-intensive, creating bottlenecks for evidence-based policy. We study whether large language models can automate the complete systematic review workflow, from article retrieval and screening through data extraction to report synthesis. Applied to epidemiological reviews of nine WHO-designated priority pathogens and validated against expert-curated ground truth, our open-source agentic pipeline (AgentSLR) achieves performance comparable to human researchers while reducing review time from approximately 7 weeks to 20 hours (a 58x speed-up). Our comparison of five frontier models reveals that performance on SLR is driven less by model size or inference cost than by each model's distinctive capabilities. Through human-in-the-loop validation, we identify key failure modes. Our results demonstrate that agentic AI can substantially accelerate scientific evidence synthesis in specialised domains.

Techmeme (15)

  1. Musk's lawyers ask a Delaware judge to step back from cases involving him, after her account "liked" a LinkedIn post celebrating his defeat in a California case (Sujeet Indap/Financial Times)

    Quinn Emanuel says Delaware judge must recuse herself over post, which she says she may have liked ‘accidentally’

  2. Singapore-based Startale, developer of the Strium blockchain for tokenized securities and JPYSC and USDSC stablecoins, raised a $63M Series A from SBI and Sony (Francisco Rodrigues/CoinDesk)

    The Singapore-based company builds blockchain tools for financial firms and retail users, including a blockchain for tokenized securities, stablecoins, and a consumer app.

  3. Source: Meta on Wednesday laid off around 700 employees in the Reality Labs unit, as well as some in recruiting, sales, and Facebook (Eli Tan/New York Times)

    Meta on Wednesday laid off around 700 employees, a person with knowledge of the company said, the latest downsizing as the Silicon Valley giant shifts its priorities toward artificial intelligence.

  4. Nintendo says new first-party games exclusive to Switch 2 will have different prices for physical and digital versions in the US, beginning in May (Andy Robinson/Video Games Chronicle)

    Nintendo of America has announced that, beginning in May, it will introduce differing pricing for physical and digital versions of its Switch 2 games.

  5. ARC Prize Foundation unveils ARC-AGI-3, an AI benchmark with simple video-game-like scenarios designed to measure on-the-fly reasoning rather than memory recall (Mark Sullivan/Fast Company)

    The influential AI researcher François Chollet has long argued that the field measures intelligence incorrectly …

  6. Reddit says it will label automated accounts that provide a service to users and will require accounts suspected of being bots to verify they are human (Sarah Perez/TechCrunch)

    Would-be Reddit competitor Digg just shut down because it couldn't get a handle on the bots overrunning its site.

  7. The jury in Los Angeles' social media trial awards the plaintiff $3M in compensatory damages and $3M in punitive damages; Meta will pay 70% and YouTube 30% (New York Times)

    A jury found the companies harmed a young user with design features that were addictive and led to her mental health distress.

  8. Social media addiction trial: a Los Angeles jury finds Meta and YouTube were negligent and failed to warn users about the dangers of using their platforms (Jonathan Vanian/CNBC)

    A jury in Los Angeles determined on Wednesday that Meta and Google's YouTube were negligent and failed to warn users of the dangers associated …

  9. Google launches Lyria 3 Pro music generation model, with better creative control and allowing users to create three-minute tracks, up from Lyria 3's 30 seconds (Ivan Mehta/TechCrunch)

    Google announced on Wednesday that it's releasing Lyria 3 Pro, a music generation model, a month after Lyria 3's release.

  10. Sen. Bernie Sanders and Rep. Alexandria Ocasio-Cortez plan to introduce legislation to pause new data center construction until AI safeguards are in place (Maria Curi/Axios)

    Sen. Bernie Sanders (I-Vt.) and Rep. Alexandria Ocasio-Cortez (D-N.Y.) on Wednesday will announce legislation to pause …

  11. Sources: cloud provider Vultr is seeking to raise $1B+ to expand its AI computing capacity, after raising $333M at a $3.5B valuation in 2024 (The Information)

    Vultr, one of the oldest independent cloud providers, is seeking to raise at least $1 billion in new capital so it can compete with a growing list …

  12. Epic Microsystems, which designs power delivery architecture for better thermal and efficiency management of AI data centers, raised a $21M Series A (Chris Metinko/Axios)

    Epic Microsystems, a semiconductor developer for AI data centers, has raised a $21 million Series A led by Seligman Ventures …

  13. Source: as part of its Google deal, Apple has full access to the Gemini model in its own data centers and can use distillation to produce smaller models (The Information)

  14. Charlotte-based Lucid Bots, which manufactures autonomous drones for cleaning windows, raised a $20M Series B co-led by Cubit and Idea Fund (Rebecca Szkutak/TechCrunch)

    Rebecca Szkutak / TechCrunch: Andrew Ashur, the founder and CEO of window cleaning robot startup Lucid Bots, likes to joke that his company is the antithesis of the robotics industry right now.

  15. Google Research details TurboQuant, a quantization algorithm to enable massive compression of LLMs and vector search engines without sacrificing accuracy (Google Research)

    Google Research: Amir Zandieh, Research Scientist, and Vahab Mirrokni, VP and Google Fellow, Google Research — We introduce a set …

Solidot(15)

  1. Rubies discovered on Mars for the first time

    Scientists at the US Los Alamos National Laboratory have for the first time found a gem-grade mineral in a Martian pebble, with a composition nearly identical to rubies on Earth. The Martian rubies are made of a material called corundum, whose main component is aluminum oxide and whose hardness is second only to diamond. Pure corundum is colorless and transparent, but takes on different appearances depending on the elements it contains: a trace of chromium makes a vivid ruby; iron or titanium yields a sapphire as deep blue as the sea; with no coloring elements it is pure white corundum. The red gems were an accidental discovery by the Perseverance rover while it explored Jezero Crater. During an analysis of a rock named "Hampden River", the rover's SuperCam laser instrument used two detection methods, either ablating the rock's surface with a laser or exciting it to luminesce, and captured the resulting spectra with two cameras. Both measurements showed a mineral composition strikingly similar to terrestrial rubies, hinting that tiny corundum grains may be hidden inside. This is the first gemstone scientists have found on Mars, and its formation mechanism may be entirely different from Earth's. On Earth, corundum is mostly associated with plate tectonics, which requires a special low-silicon, high-aluminum environment; Mars has no plate motion, so scientists speculate that these grains were produced by ancient meteorite impacts, whose instantaneous heat and pressure heated or compressed Martian surface dust into these tiny gems.

  2. Alibaba releases a RISC-V server chip optimized for running domestic large models

    Alibaba has released the XuanTie C950, a RISC-V server chip optimized for running domestic large language models, with native support for hundred-billion-parameter models such as Qwen3 and DeepSeek V3. Alibaba says the C950's single-core general-purpose performance exceeds 70 points on the SPECint2006 benchmark, setting a new global RISC-V record. Google researcher Laurie Kirk said the C950's performance is roughly on par with Apple's M1 chip released in 2020. The chip implements the RISC-V RVA v23.1 profile released in 2025 and is fabricated on a 5 nm process.

  3. LiteLLM on PyPI compromised with malicious code

    A maintainer account for the LiteLLM project was hijacked, and the attacker published two versions with embedded malicious code, v1.82.7 and v1.82.8, to the PyPI repository. The malicious code is designed to steal credentials, including SSH keys, cloud service credentials, and crypto wallets. Anyone who installed a malicious version should immediately check for compromise. The malicious file litellm_init.pth executes automatically every time a Python process starts. The maintainers say the account takeover stemmed from the recently disclosed trivy vulnerability; the malicious versions have been removed from PyPI, and credentials for all maintainer accounts have been rotated.
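    The .pth auto-execution mechanism behind this attack is standard CPython behavior: at interpreter startup, the site module executes any line in a site-packages .pth file that begins with "import". A minimal sketch of how one might scan an environment for such executable .pth lines (the function name and reporting format are illustrative, not from the advisory; this flags the mechanism generally, not LiteLLM's payload specifically):

    ```python
    # Sketch: list .pth files in site-packages whose lines CPython's
    # site module would execute at every interpreter startup, i.e.
    # lines starting with "import " or "import\t" -- the mechanism
    # the malicious litellm_init.pth reportedly abused.
    import site
    from pathlib import Path

    def suspicious_pth_lines(directory: str) -> list[tuple[str, str]]:
        """Return (filename, line) pairs for executable .pth lines."""
        hits = []
        for pth in sorted(Path(directory).glob("*.pth")):
            for line in pth.read_text(errors="replace").splitlines():
                # site.py executes only lines that literally start
                # with "import" followed by a space or tab.
                if line.startswith(("import ", "import\t")):
                    hits.append((pth.name, line.strip()))
        return hits

    if __name__ == "__main__":
        for d in site.getsitepackages():
            for name, line in suspicious_pth_lines(d):
                print(f"{d}/{name}: {line}")
    ```

    Note that some legitimate packages (e.g. editable installs) also ship import lines in .pth files, so any hit needs manual review rather than automatic deletion.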

  4. OpenAI announces it is shutting down Sora and ending its Disney partnership

    OpenAI announced it will shut down its video-generation app Sora and end its content partnership with Disney. OpenAI's chatbot ChatGPT will also lose its video-generation feature, though image generation is unaffected. Sora launched in 2024 and drew wide attention at first, but its popularity has slid sharply over the past few months, while competitors such as ByteDance's SeeDance 2.0 have surpassed OpenAI at video generation. Disney announced its partnership with OpenAI in December 2025, allowing Sora to generate videos using its copyrighted characters; with Sora's shutdown, that content deal ends as well. At an all-hands meeting a few days ago, OpenAI said it would refocus on business and productivity applications and avoid being distracted by side projects.

  5. Epic Games lays off more than a thousand employees, stresses the cuts have nothing to do with AI

    Epic Games announced layoffs of more than a thousand people, citing shrinking player numbers and declining revenue from its main money-maker, Fortnite, saying the cuts are necessary to keep the company running. Fortnite was once a sensation that brought Epic Games more than ten billion dollars in revenue, enough for Epic to launch a game store that gives away free games every week to attract players, but its popularity has faded in recent years. Epic stressed that the layoffs have nothing to do with AI and are about reducing spending and controlling costs. Epic Games has more than four thousand employees, and the cuts amount to a quarter of its headcount. Tencent is a major shareholder in Epic Games.

  6. LG Display mass-produces a laptop panel whose refresh rate auto-adjusts between 1 and 120 Hz

    LG Display announced mass production of a laptop display whose refresh rate automatically adjusts between 1 and 120 Hz; lowering the refresh rate when possible helps extend battery life. The LCD panel, called Oxide 1Hz, automatically drops to a 1 Hz refresh rate when it detects a static image on screen and can switch up to 120 Hz for video playback or gaming. LG disclosed few technical details. BOE and Intel announced a similar product last year but gave no launch date. Dell's 2026 XPS laptops will offer the Oxide 1Hz panel.

  7. Intoxalock hit by cyberattack, leaving cars across the US unable to start

    In many US states, drivers convicted of drunk driving who want to keep driving must install an alcohol-detection device in their car: they must pass a breath test before the car will start and are randomly retested while driving. If no test is taken the car will not start, and ignoring or failing a test while driving shuts the car off. The most popular such device in the US is made by Intoxalock; it is available only by lease, at roughly $70-120 per month. On March 14 Intoxalock suffered a cyberattack that completely crippled its infrastructure, meaning drivers with the company's devices faced being locked out of their cars. Intoxalock did not announce that its systems were back to normal until March 22, and it has said it will cover the costs the outage imposed on users, including towing fees.

  8. Moderate coffee drinking may help protect the brain

    Researchers at the Mass General Brigham hospital system followed more than 130,000 people for up to 43 years and found that those who regularly drank moderate amounts of caffeinated coffee or tea had an 18% lower risk of dementia than those with little caffeine exposure. The researchers stress that this is a correlational study and does not establish causation. The study also found that regular caffeine consumers scored higher on some cognitive tests and complained less about memory decline. Moderate consumption means two to three cups of coffee or one to two cups of tea per day. The paper was published in the journal JAMA.

  9. Ultra-processed food linked to reduced male fertility

    Researchers report that ultra-processed food is associated with reduced male fertility. The team analyzed data on 831 women and 651 men from the Generation R Next Study, a prospective cohort that follows parents from the preconception stage until their children reach childhood. The study enrolled couples who were trying to conceive or were pregnant between 2017 and 2021. The parents' diets were assessed by questionnaire at about 12 weeks of early pregnancy, with foods classified as either ultra-processed or not. On average, ultra-processed foods made up 22% of the women's intake and 25% of the men's. For women, ultra-processed food intake showed no consistent association with infertility risk or time to conception; for men, higher intake was linked to higher infertility risk and longer time to conception. The researchers stress that this is correlation, not causation, and that more data are needed.

  10. [Featured] Recruitment is now open for NVIDIA's 2026 startup showcase!

    Starting in March, NVIDIA will run a series of showcase events for tech startups across China, combining roadshows, exhibition booths, and matchmaking with large enterprises. Beijing, April 23: a deep dive into GTC 2026 content and announcements, focused on physical AI, AI agents, and large language model applications, with roadshows, exhibits, and enterprise and technology matchmaking. Chengdu, May 15: a session on AI applications and going global, with NVIDIA experts and industry guests covering AI abroad, physical AI, AI agents, and deployed AI applications. Shanghai, May 21: AI agents, physical AI, and LLM applications, exploring AI use cases, again with roadshows, exhibits, and matchmaking. Macau, May 26-30: an overseas session held alongside the Macau BEYOND International Technology Innovation Expo, covering #GTC26 technical highlights, project roadshows, panel discussions, and investment and partnership matchmaking; registered companies have a chance to receive a free BEYOND Expo booth. Registration is via the QR code in the original post.

  11. Firefox 149 released

    Mozilla released Firefox 149 on March 24. Major changes include: jxl-rs, a JPEG-XL image decoder written in Rust, replacing the old C++ decoder; faster PDF handling, with image downloads from PDFs via the right-click context menu; improved HTTP/3 upload performance; and a built-in VPN (currently available only in the US and a few other regions) with 50 GB of free traffic per month; among other changes.

  12. Will AI evolve source code, or drive it extinct?

    In an era when LLM-based AI-assisted programming is increasingly popular, will programming languages emerge that are optimized for AI with no regard for human readability? Experiments are already under way to minimize tokens for the sake of LLM efficiency. Will AI evolve source code or drive it extinct? Could we have AI generate an intermediate language directly from a prompt and feed it into an interpreter or compiler? Will high-level languages still be needed? Last October, IEEE Spectrum held a webinar on whether AI will make programming languages disappear. High-level languages are languages for humans; we could well have AI emit intermediate code directly, while future programmers still make design decisions about interfaces, algorithms, and other aspects of architecture. The generated code would still have to pass tests and be able to explain what it is doing.

  13. Amazon AWS data center in Bahrain disrupted by drone activity for the second time

    Amazon's AWS data center in Bahrain suffered a service disruption due to drone activity, the second time this month AWS has been affected by the war. An Amazon spokesperson confirmed that drone activity caused the problem but provided no further details, so it is unclear whether the Bahrain facility was struck directly or a nearby area was hit. Amazon says it is helping customers migrate to other AWS regions. Earlier this month, Amazon said its facilities in Bahrain and the UAE had been hit by drone strikes, and at the time it expected a lengthy recovery because of structural damage.

  14. America's standing as a scientific superpower is wavering

    The Trump administration's anti-science stance is prompting researchers to leave the US in large numbers. Data show that among researchers who moved across borders between January and August 2025, the US share of outflows rose to 11% while its share of inflows fell to 15%. Scholars in fields such as climate change are heading to Europe, with growing numbers moving to European countries like Spain and France as well as to Canada and South Korea. By drawing the world's best talent to its universities and companies, the US built its standing as a scientific superpower and a wellspring of innovation and economic growth. But the second Trump administration's pressure on elite universities, cuts to science and technology budgets, immigration policies restricting visas, and skepticism toward climate change and vaccines are now driving researchers away from America.

  15. 2026 Abel Prize goes to Gerd Faltings, who proved the Mordell conjecture

    The 2026 Abel Prize has been awarded to the German mathematician Gerd Faltings, who proved the Mordell conjecture. He won the Fields Medal in 1986, at age 32, for proving the conjecture, which has since been renamed Faltings' theorem. The theorem concerns curves, which can usually be described by simple equations combining two variables with addition and multiplication. Plotting the solutions of such an equation in a coordinate system yields a line, an ellipse, or a more complex curve. Mathematicians have long sought a special subset of these solutions, the "rational points", whose coordinates are integers or fractions; these special points encode rich and intricate relationships, hiding an order mathematicians try to uncover. But there are infinitely many curves, and determining the rational points on all of them seemed impossible, until Faltings' theorem. He proved that if some variable in a curve's equation is raised to a power higher than 3, the curve can have only finitely many rational points; only lines, quadratic curves such as circles, and cubic equations can have infinitely many rational points. The proof is regarded as a cornerstone of arithmetic geometry.
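    The "power higher than 3" criterion above is a popularization; the precise statement of Faltings' theorem is phrased in terms of the genus of the curve. A reference formulation (a summary for context, not from the article):

    ```latex
    % Faltings' theorem (the Mordell conjecture): a smooth curve C
    % of genus g >= 2 defined over a number field K has only
    % finitely many K-rational points.
    \[
      g(C) \ge 2 \quad\Longrightarrow\quad \#\,C(K) < \infty
    \]
    % For a smooth plane curve of degree d the genus is
    % g = (d-1)(d-2)/2, so d >= 4 (a variable raised to a power
    % higher than 3) already gives g >= 3, while lines and conics
    % (g = 0) and smooth cubics (g = 1) escape the theorem --
    % matching the article's list of exceptions.
    ```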