OrangeBot.AI Digest — 2026-03-24
89 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- GitHub is once again down (www.githubstatus.com)
- Is anybody else bored of talking about AI? (blog.jakesaunders.dev)
- Epic Games to cut more than 1k jobs as Fortnite usage falls (www.reuters.com)
- Wine 11 rewrites how Linux runs Windows games at the kernel level, with massive speed gains (www.xda-developers.com)
- Tell HN: Litellm 1.82.7 and 1.82.8 on PyPI are compromised (github.com)
- Arm AGI CPU (newsroom.arm.com)
- Apple Business (www.apple.com)
- No Terms. No Conditions (notermsnoconditions.com)
- Show HN: Gemini can now natively embed video, so I built sub-second video search (github.com)
- Mystery jump in oil trading ahead of Trump post draws scrutiny (www.bbc.com)
- LaGuardia pilots raised safety alarms months before deadly runway crash (www.theguardian.com)
- The bridge to wealth is being pulled up with AI (danielhomola.com)
- So where are all the AI apps? (www.answer.ai)
- Missile defense is NP-complete (smu160.github.io)
- Malicious litellm_init.pth in litellm 1.82.8 PyPI package – credential stealer (github.com)
GitHub Trending(14)
- pascalorg / editor
- bytedance / deer-flow
- supermemoryai / supermemory
- FujiwaraChoki / MoneyPrinterV2
- harry0703 / MoneyPrinterTurbo
- Crosstalk-Solutions / project-nomad
- TauricResearch / TradingAgents
- mvanhorn / last30days-skill
- ruvnet / ruflo
- NousResearch / hermes-agent
- hesreallyhim / awesome-claude-code
- hsliuping / TradingAgents-CN
- aquasecurity / trivy
- ruvnet / RuView
Product Hunt(15)
- Flux
Fix production bugs by replaying them locally
- What The Duck!
Duck Hunt but with your finger and custom targets
- Redbean
Bring your original characters to life
- NextPhone
24/7 AI answering service for service-based businesses
- Kitty Points Leaderboard
Find interesting community members and see how you stack up
- Drift
AI agent to run robot simulations faster and more reliably
- Cekura
Observe and analyze your voice and chat AI agents
- Claude Computer Use
Enable Claude to use your computer to complete tasks
- jared.so
AI that monitors convos & proactively jumps in when needed
- LelaAI
Learn languages by reading real articles
- Maestri
An infinite canvas where coding agents work in concert
- GitLaw Integrations
Trigger AI legal doc creation/review from 7,000+ apps
- Google Gemini in Chrome
Turn your browser into an AI workspace
- Navox Network
Turn your LinkedIn connections into a job search map.
- Agent Hub Builder
Build a Netflix-style library of AI-powered tools to sell
Hugging Face(15)
- Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models
Video-based world models have emerged along two dominant paradigms: video generation and 3D reconstruction. However, existing evaluation benchmarks either focus narrowly on visual fidelity and text-video alignment for generative models, or rely on static 3D reconstruction metrics that fundamentally neglect temporal dynamics. We argue that the future of world modeling lies in 4D generation, which jointly models spatial structure and temporal evolution. In this paradigm, the core capability is interactive response: the ability to faithfully reflect how interaction actions drive state transitions across space and time. Yet no existing benchmark systematically evaluates this critical dimension. To address this gap, we propose Omni-WorldBench, a comprehensive benchmark specifically designed to evaluate the interactive response capabilities of world models in 4D settings. Omni-WorldBench comprises two key components: Omni-WorldSuite, a systematic prompt suite spanning diverse interaction levels and scene types; and Omni-Metrics, an agent-based evaluation framework that quantifies world modeling capabilities by measuring the causal impact of interaction actions on both final outcomes and intermediate state evolution trajectories. We conduct extensive evaluations of 18 representative world models across multiple paradigms. Our analysis reveals critical limitations of current world models in interactive response, providing actionable insights for future research. Omni-WorldBench will be publicly released to foster progress in interactive 4D world modeling.
- Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model
We present daVinci-MagiHuman, an open-source audio-video generative foundation model for human-centric generation. daVinci-MagiHuman jointly generates synchronized video and audio using a single-stream Transformer that processes text, video, and audio within a unified token sequence via self-attention only. This single-stream design avoids the complexity of multi-stream or cross-attention architectures while remaining easy to optimize with standard training and inference infrastructure. The model is particularly strong in human-centric scenarios, producing expressive facial performance, natural speech-expression coordination, realistic body motion, and precise audio-video synchronization. It supports multilingual spoken generation across Chinese (Mandarin and Cantonese), English, Japanese, Korean, German, and French. For efficient inference, we combine the single-stream backbone with model distillation, latent-space super-resolution, and a Turbo VAE decoder, enabling generation of a 5-second 256p video in 2 seconds on a single H100 GPU. In automatic evaluation, daVinci-MagiHuman achieves the highest visual quality and text alignment among leading open models, along with the lowest word error rate (14.60%) for speech intelligibility. In pairwise human evaluation, it achieves win rates of 80.0% against Ovi 1.1 and 60.9% against LTX 2.3 over 2000 comparisons. We open-source the complete model stack, including the base model, the distilled model, the super-resolution model, and the inference codebase.
- LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
We introduce LongCat-Flash-Prover, a flagship 560-billion-parameter open-source Mixture-of-Experts (MoE) model that advances Native Formal Reasoning in Lean4 through agentic tool-integrated reasoning (TIR). We decompose the native formal reasoning task into three independent formal capabilities, i.e., auto-formalization, sketching, and proving. To facilitate these capabilities, we propose a Hybrid-Experts Iteration Framework to expand high-quality task trajectories, including generating a formal statement based on a given informal problem, producing a whole-proof directly from the statement, or producing a lemma-style sketch. During agentic RL, we present a Hierarchical Importance Sampling Policy Optimization (HisPO) algorithm, which aims to stabilize MoE model training on such long-horizon tasks. It employs a gradient masking strategy that accounts for policy staleness and the inherent train-inference engine discrepancies at both sequence and token levels. Additionally, we incorporate theorem consistency and legality detection mechanisms to eliminate reward hacking. Extensive evaluations show that LongCat-Flash-Prover sets a new state of the art among open-weights models in both auto-formalization and theorem proving. Demonstrating remarkable sample efficiency, it achieves a 97.1% pass rate on MiniF2F-Test with an inference budget of only 72 attempts per problem. On more challenging benchmarks, it solves 70.8% of ProverBench and 41.5% of PutnamBench with no more than 220 attempts per problem, significantly outperforming existing open-weights baselines.
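The hierarchical (sequence- and token-level) gradient masking sketched in the abstract above can be illustrated with a toy importance-sampling filter. The thresholds and the exact masking rule here are assumptions for illustration, not the paper's algorithm:

```python
import numpy as np

# Toy illustration of sequence- and token-level importance masking in the
# spirit of the HisPO description above. tok_clip / seq_clip thresholds and
# the masking rule are hypothetical, not taken from the paper.

def masked_is_weights(logp_new, logp_old, tok_clip=2.0, seq_clip=1.5):
    """Per-token importance ratios, zeroed where staleness is detected:
    a token is masked if its own ratio drifts past tok_clip, and the
    whole sequence is masked if the sequence-level ratio drifts past
    seq_clip."""
    diff = np.asarray(logp_new) - np.asarray(logp_old)
    ratios = np.exp(diff)                    # token-level IS ratios
    mask = np.abs(diff) <= np.log(tok_clip)  # token-level gate
    if abs(diff.sum()) > np.log(seq_clip):   # sequence-level gate
        mask = np.zeros_like(mask)           # drop the whole sequence
    return ratios * mask
```

On fresh (on-policy) data every ratio is 1 and nothing is masked; once the log-probabilities drift, first individual tokens and then whole sequences stop contributing to the update.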
- Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs
Vision-language models (VLMs) typically process images at native high resolution, forcing a trade-off between accuracy and computational efficiency: high-resolution inputs capture fine details but incur significant computational costs, while low-resolution inputs are efficient but can miss critical visual information, such as small text. We present AwaRes, a spatial-on-demand framework that resolves this accuracy-efficiency trade-off by operating on a low-resolution global view and using tool-calling to retrieve only the high-resolution segments needed for a given query. We construct supervised data automatically: a judge compares low- vs. high-resolution answers to label whether cropping is needed, and an oracle grounding model localizes the evidence for the correct answer, which we map to a discrete crop set to form multi-turn tool-use trajectories. We train our framework with cold-start SFT followed by multi-turn GRPO with a composite reward that combines semantic answer correctness with explicit crop-cost penalties. Project page: https://nimrodshabtay.github.io/AwaRes
- OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis
Training deep research agents requires long-horizon trajectories that interleave search, evidence aggregation, and multi-step reasoning. However, existing data collection pipelines typically rely on proprietary web APIs, making large-scale trajectory synthesis costly, unstable, and difficult to reproduce. We present OpenResearcher, a reproducible pipeline that decouples one-time corpus bootstrapping from multi-turn trajectory synthesis and executes the search-and-browse loop entirely offline using three explicit browser primitives (search, open, and find) over a 15M-document corpus. Using GPT-OSS-120B as the teacher model, we synthesize over 97K trajectories, including a substantial long-horizon tail with 100+ tool calls. Supervised fine-tuning of a 30B-A3B backbone on these trajectories achieves 54.8% accuracy on BrowseComp-Plus, a +34.0-point improvement over the base model, while remaining competitive on BrowseComp, GAIA, and xbench-DeepSearch. Because the environment is offline and fully instrumented, it also enables controlled analysis; our study reveals practical insights into deep research pipeline design, including data filtering strategies, agent configuration choices, and how retrieval success relates to final answer accuracy. We release the pipeline, synthesized trajectories, model checkpoints, and the offline search environment at https://github.com/TIGER-AI-Lab/OpenResearcher.
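A toy version of the three offline browser primitives named in the abstract (search, open, find) might look like the following. The term-overlap scoring is purely illustrative of the interface, not the released 15M-document pipeline:

```python
# Minimal offline search environment exposing the three browser primitives
# described above: search, open, find. Scoring is naive term overlap and
# only illustrates the interface an agent would call.

class OfflineBrowser:
    def __init__(self, corpus):
        self.corpus = corpus                      # {doc_id: text}

    def search(self, query, k=3):
        """Return up to k doc ids ranked by query-term overlap."""
        terms = set(query.lower().split())
        scored = [(sum(t in text.lower() for t in terms), doc_id)
                  for doc_id, text in self.corpus.items()]
        return [doc_id for score, doc_id in sorted(scored, reverse=True)
                if score > 0][:k]

    def open(self, doc_id):
        """Return the full text of a document ('' if unknown)."""
        return self.corpus.get(doc_id, "")

    def find(self, doc_id, needle):
        """Return the character offset of needle in the doc, or -1."""
        return self.open(doc_id).lower().find(needle.lower())
```

A trajectory then interleaves reasoning with these calls: `search` for candidates, `open` the top hit, `find` the evidence span.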
- VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
Long video understanding remains challenging for multimodal large language models (MLLMs) due to limited context windows, which necessitate identifying sparse query-relevant video segments. However, existing methods predominantly localize clues based solely on the query, overlooking the video's intrinsic structure and varying relevance across segments. To address this, we propose VideoDetective, a framework that integrates query-to-segment relevance and inter-segment affinity for effective clue hunting in long-video question answering. Specifically, we divide a video into various segments and represent them as a visual-temporal affinity graph built from visual similarity and temporal proximity. We then perform a Hypothesis-Verification-Refinement loop to estimate relevance scores of observed segments to the query and propagate them to unseen segments, yielding a global relevance distribution that guides the localization of the most critical segments for final answering with sparse observation. Experiments show our method consistently achieves substantial gains across a wide range of mainstream MLLMs on representative benchmarks, with accuracy improvements of up to 7.5% on VideoMME-long. Our code is available at https://videodetective.github.io/
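The propagation idea in the abstract above (observed segments passing query-relevance scores to unseen neighbors over an affinity graph) can be sketched with a single weighted-average step. This one-step rule is an assumption for illustration, not the paper's Hypothesis-Verification-Refinement loop:

```python
import numpy as np

# Hypothetical one-step relevance propagation over a segment affinity graph,
# in the spirit of the paper above; the real method iterates.

def propagate(affinity, observed):
    """affinity: (n, n) nonnegative segment-affinity matrix.
    observed: {segment_index: relevance score from direct inspection}.
    Unseen segments get an affinity-weighted average of observed scores."""
    n = affinity.shape[0]
    obs = sorted(observed)
    scores = np.zeros(n)
    for i in range(n):
        if i in observed:
            scores[i] = observed[i]       # keep directly estimated scores
            continue
        w = affinity[i, obs]              # affinity to observed segments
        s = np.array([observed[j] for j in obs])
        scores[i] = float(w @ s / w.sum()) if w.sum() > 0 else 0.0
    return scores
```

The resulting global distribution can then guide which unseen segments to inspect next.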
- SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning
Despite the remarkable success of large-scale pre-trained image representation models (i.e., vision encoders) across various vision tasks, they are predominantly trained on 2D image data and therefore often fail to capture 3D spatial relationships between objects and backgrounds in the real world, constraining their effectiveness in many downstream applications. To address this, we propose SpatialBoost, a scalable framework that enhances the spatial awareness of existing pre-trained vision encoders by injecting 3D spatial knowledge expressed in linguistic descriptions. The core idea is to convert dense 3D spatial information from 2D images into linguistic expressions, which are then used to inject such spatial knowledge into vision encoders through a Large Language Model (LLM). To this end, we adopt a multi-turn Chain-of-Thought (CoT) reasoning process that progressively incorporates dense spatial knowledge and builds hierarchical spatial understanding. To validate effectiveness, we adapt SpatialBoost to state-of-the-art vision encoders such as DINOv3, and evaluate its performance gains on a wide range of benchmarks requiring both 3D perception and general vision abilities. For instance, SpatialBoost improves DINOv3 from 55.9 to 59.7 mIoU on ADE20K, a state-of-the-art result and a 3.8-point gain over the pre-trained model.
- F4Splat: Feed-Forward Predictive Densification for Feed-Forward 3D Gaussian Splatting
Feed-forward 3D Gaussian Splatting methods enable single-pass reconstruction and real-time rendering. However, they typically adopt rigid pixel-to-Gaussian or voxel-to-Gaussian pipelines that uniformly allocate Gaussians, leading to redundant Gaussians across views. Moreover, they lack an effective mechanism to control the total number of Gaussians while maintaining reconstruction fidelity. To address these limitations, we present F4Splat, which performs Feed-Forward predictive densification for Feed-Forward 3D Gaussian Splatting, introducing a densification-score-guided allocation strategy that adaptively distributes Gaussians according to spatial complexity and multi-view overlap. Our model predicts per-region densification scores to estimate the required Gaussian density and allows explicit control over the final Gaussian budget without retraining. This spatially adaptive allocation reduces redundancy in simple regions and minimizes duplicate Gaussians across overlapping views, producing compact yet high-quality 3D representations. Extensive experiments demonstrate that our model achieves superior novel-view synthesis performance compared to prior uncalibrated feed-forward methods, while using significantly fewer Gaussians.
- Manifold-Aware Exploration for Reinforcement Learning in Video Generation
Group Relative Policy Optimization (GRPO) methods for video generation like FlowGRPO remain far less reliable than their counterparts for language models and images. This gap arises because video generation has a complex solution space, and the ODE-to-SDE conversion used for exploration can inject excess noise, lowering rollout quality and making reward estimates less reliable, which destabilizes post-training alignment. To address this problem, we view the pre-trained model as defining a valid video data manifold and formulate the core problem as constraining exploration within the vicinity of this manifold, ensuring that rollout quality is preserved and reward estimates remain reliable. We propose SAGE-GRPO (Stable Alignment via Exploration), which applies constraints at both micro and macro levels. At the micro level, we derive a precise manifold-aware SDE with a logarithmic curvature correction and introduce a gradient norm equalizer to stabilize sampling and updates across timesteps. At the macro level, we use a dual trust region with a periodic moving anchor and stepwise constraints so that the trust region tracks checkpoints that are closer to the manifold and limits long-horizon drift. We evaluate SAGE-GRPO on HunyuanVideo1.5 using the original VideoAlign as the reward model and observe consistent gains over previous methods in VQ, MQ, TA, and visual metrics (CLIPScore, PickScore), demonstrating superior performance in both reward maximization and overall video quality. The code and visual gallery are available at https://dungeonmassster.github.io/SAGE-GRPO-Page/.
- mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT
Current language model training commonly applies multi-task Supervised Fine-Tuning (SFT) using a homogeneous compute budget across all sub-datasets. This approach is fundamentally sub-optimal: heterogeneous learning dynamics cause faster-learning tasks to overfit early while slower ones remain under-fitted. To address this, we introduce mSFT, an iterative, overfitting-aware search algorithm for multi-task data mixtures. mSFT trains the model on an active mixture, identifies and excludes the earliest overfitting sub-dataset, and reverts to that sub-dataset's optimal checkpoint before continuing. Extensive evaluations demonstrate that mSFT consistently outperforms 4 baselines across 10 benchmarks and 6 base models. Further analysis confirms mSFT maintains robust gains across diverse dataset sizes and task granularities, and is insensitive to its single new hyperparameter (compute budget). Notably, at low compute budget, mSFT can improve performance while lowering training FLOPs. Ultimately, mSFT establishes a practical overfitting-aware algorithm for multi-task SFT that maximizes the potential of models across diverse data mixtures.
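The mSFT outer loop as described (train on the active mixture, exclude the earliest-overfitting sub-dataset, revert to its best checkpoint) can be sketched in a few lines. `train_epoch`, `eval_loss`, and the 5% rise-over-best overfitting rule below are stand-ins, not the paper's implementation:

```python
# Hypothetical sketch of the mSFT outer loop (not the paper's code).
# train_epoch / eval_loss abstract a real training harness; the 5%
# rise-over-best rule is an assumed overfitting detector.

def msft(mixture, budget, train_epoch, eval_loss):
    active = set(mixture)                      # sub-datasets still in the mix
    best = {d: float("inf") for d in mixture}  # best val loss per dataset
    checkpoints = {}                           # model state at that optimum
    state, step, history = 0, 0, []
    while active and step < budget:
        state = train_epoch(state, active)     # one pass on the active mix
        step += 1
        overfit = None
        for d in sorted(active):
            loss = eval_loss(state, d)
            if loss < best[d]:
                best[d] = loss
                checkpoints[d] = state         # remember d's best checkpoint
            elif loss > best[d] * 1.05:        # val loss rising: d overfits
                overfit = d
                break
        if overfit is not None:
            state = checkpoints.get(overfit, state)  # revert to d's optimum
            active.discard(overfit)                  # drop d from the mix
        history.append((step, sorted(active)))
    return state, history
```

With a toy harness where `state` is just an epoch counter, the loop drops the faster-overfitting dataset and keeps training on the rest.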
- Group3D: MLLM-Driven Semantic Grouping for Open-Vocabulary 3D Object Detection
Open-vocabulary 3D object detection aims to localize and recognize objects beyond a fixed training taxonomy. In multi-view RGB settings, recent approaches often decouple geometry-based instance construction from semantic labeling, generating class-agnostic fragments and assigning open-vocabulary categories post hoc. While flexible, such decoupling leaves instance construction governed primarily by geometric consistency, without semantic constraints during merging. When geometric evidence is view-dependent and incomplete, this geometry-only merging can lead to irreversible association errors, including over-merging of distinct objects or fragmentation of a single instance. We propose Group3D, a multi-view open-vocabulary 3D detection framework that integrates semantic constraints directly into the instance construction process. Group3D maintains a scene-adaptive vocabulary derived from a multimodal large language model (MLLM) and organizes it into semantic compatibility groups that encode plausible cross-view category equivalence. These groups act as merge-time constraints: 3D fragments are associated only when they satisfy both semantic compatibility and geometric consistency. This semantically gated merging mitigates geometry-driven over-merging while absorbing multi-view category variability. Group3D supports both pose-known and pose-free settings, relying only on RGB observations. Experiments on ScanNet and ARKitScenes demonstrate that Group3D achieves state-of-the-art performance in multi-view open-vocabulary 3D detection, while exhibiting strong generalization in zero-shot scenarios. The project page is available at https://ubin108.github.io/Group3D/.
- RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models
Improving embodied reasoning in multimodal large language models (MLLMs) is essential for building vision-language-action models (VLAs) on top of them that readily translate multimodal understanding into low-level actions. Accordingly, recent work has explored enhancing embodied reasoning in MLLMs through vision-question-answering-style supervision. However, these approaches have been reported to yield unstable VLA performance, often producing only marginal or even negative gains. In this paper, we propose RoboAlign, a more systematic MLLM training framework that reliably improves VLA performance. Our key idea is to sample action tokens via zero-shot natural language reasoning and refine this reasoning with reinforcement learning (RL) to improve action accuracy. As a result, RoboAlign bridges the modality gap between language and low-level actions in MLLMs and facilitates knowledge transfer from MLLM to VLA. To validate the effectiveness of RoboAlign, we train VLAs by adding a diffusion-based action head on top of an MLLM backbone and evaluate them on major robotics benchmarks. Remarkably, by performing RL-based alignment after SFT using less than 1% of the data, RoboAlign achieves performance improvements of 17.5%, 18.9%, and 106.6% over SFT baselines on LIBERO, CALVIN, and real-world environments, respectively.
- Repurposing Geometric Foundation Models for Multi-view Diffusion
While recent advances in generative latent spaces have driven substantial progress in single-image generation, the optimal latent space for novel view synthesis (NVS) remains largely unexplored. In particular, NVS requires geometrically consistent generation across viewpoints, but existing approaches typically operate in a view-independent VAE latent space. In this paper, we propose Geometric Latent Diffusion (GLD), a framework that repurposes the geometrically consistent feature space of geometric foundation models as the latent space for multi-view diffusion. We show that these features not only support high-fidelity RGB reconstruction but also encode strong cross-view geometric correspondences, providing a well-suited latent space for NVS. Our experiments demonstrate that GLD outperforms both VAE and RAE on 2D image quality and 3D consistency metrics, while accelerating training by more than 4.4x compared to the VAE latent space. Notably, GLD remains competitive with state-of-the-art methods that leverage large-scale text-to-image pretraining, despite training its diffusion model from scratch without such generative pretraining.
- BubbleRAG: Evidence-Driven Retrieval-Augmented Generation for Black-Box Knowledge Graphs
Large Language Models (LLMs) exhibit hallucinations in knowledge-intensive tasks. Graph-based retrieval-augmented generation (RAG) has emerged as a promising solution, yet existing approaches suffer from fundamental recall and precision limitations when operating over black-box knowledge graphs, i.e., graphs whose schema and structure are unknown in advance. We identify three core challenges that cause recall loss (semantic instantiation uncertainty and structural path uncertainty) and precision loss (evidential comparison uncertainty). To address these challenges, we formalize the retrieval task as the Optimal Informative Subgraph Retrieval (OISR) problem, a variant of Group Steiner Tree, and prove it to be NP-hard and APX-hard. We propose BubbleRAG, a training-free pipeline that systematically optimizes for both recall and precision through semantic anchor grouping, heuristic bubble expansion to discover candidate evidence graphs (CEGs), composite ranking, and reasoning-aware expansion. Experiments on multi-hop QA benchmarks demonstrate that BubbleRAG achieves state-of-the-art results, outperforming strong baselines in both F1 and accuracy while remaining plug-and-play.
- SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models
Models that bridge vision and language, such as CLIP, are key components of multimodal AI, yet their large-scale, uncurated training data introduce severe social and spurious biases. Existing post-hoc debiasing methods often operate directly in the dense CLIP embedding space, where bias and task-relevant information are highly entangled. This entanglement limits their ability to remove bias without degrading semantic fidelity. In this work, we propose Sparse Embedding Modulation (SEM), a post-hoc, zero-shot debiasing framework that operates in a Sparse Autoencoder (SAE) latent space. By decomposing CLIP text embeddings into disentangled features, SEM identifies and modulates bias-relevant neurons while preserving query-relevant ones. This enables more precise, non-linear interventions. Across four benchmark datasets and two CLIP backbones, SEM achieves substantial fairness gains in retrieval and zero-shot classification. Our results demonstrate that sparse latent representations provide an effective foundation for post-hoc debiasing of vision-language models.
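The modulation step SEM describes (encode into a sparse latent space, suppress bias-relevant neurons, decode back) can be sketched with a toy tied autoencoder. The weights here are random and the bias-neuron indices are given by hand, whereas the paper identifies them from data:

```python
import numpy as np

# Hypothetical sketch of sparse-latent debiasing in the spirit of SEM
# (not the paper's code). W_enc / W_dec form a toy tied sparse autoencoder;
# bias_neurons are assumed given rather than learned.

rng = np.random.default_rng(0)
d, k = 8, 32                           # embedding dim, SAE latent dim
W_enc = rng.standard_normal((d, k)) / np.sqrt(d)
W_dec = W_enc.T                        # tied decoder weights

def sae_encode(x):
    return np.maximum(x @ W_enc, 0.0)  # ReLU yields a sparse code

def debias(x, bias_neurons, scale=0.0):
    z = sae_encode(x)
    z[..., bias_neurons] *= scale      # suppress bias-relevant latents
    return z @ W_dec                   # decode back to embedding space

x = rng.standard_normal(d)             # toy stand-in for a text embedding
x_debiased = debias(x, bias_neurons=[3, 17])
```

Because the intervention happens on individual sparse latents rather than on the dense embedding, query-relevant features can be left untouched.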
Techmeme(15)
- At a hearing, a US federal judge says the Pentagon's treatment of Anthropic is "troubling" and that "it looks like an attempt to cripple Anthropic" (Maria Curi/Axios)
Maria Curi / Axios: A federal judge on Tuesday called the Pentagon's treatment of Anthropic “troubling” as the AI company urged the court …
- Disney ends its partnership with OpenAI, signed in December 2025, in which it pledged to invest $1B and agreed to license some characters to Sora (Todd Spangler/Variety)
Todd Spangler / Variety: OpenAI said it will discontinue Sora, the generative-AI video creation platform it launched last year, without providing a reason for the decision.
- A New Mexico jury finds that Meta violated state laws by failing to safeguard its platforms from child predators and orders it to pay $375M in damages (Jonathan Vanian/CNBC)
Jonathan Vanian / CNBC: A jury has reached a verdict in a major New Mexico trial in which the state's attorney general alleged that Meta failed to safeguard its family of apps from child predators.
- Arm says its AGI CPU offers up to 136 Neoverse V3 cores, 6GB/s memory bandwidth per core, and more than 2x performance per rack compared with x86 systems (VideoCardz.com)
VideoCardz.com: Arm launches AGI CPU, its first data center chip. Arm has announced the AGI CPU, its first production silicon product and its first Arm-designed data center CPU.
- Sam Altman told staff he has ceded oversight of OpenAI's safety and security teams to focus on fundraising, supply chains, and building data centers at scale (The Information)
The Information: OpenAI CEO Sam Altman has relinquished direct oversight of the company's safety and security teams so he can focus on raising capital …
- OpenAI plans to discontinue products that use its Sora models, including its consumer app, a Sora version for developers, and a video feature inside ChatGPT (Berber Jin/Wall Street Journal)
Berber Jin / Wall Street Journal: The app, released last year, allowed people to insert themselves into famous movie scenes, among other functions.
- Two versions of LiteLLM, an interface for accessing LLMs, have been removed from PyPI after a supply chain attack injected them with credential-stealing code (Thomas Claburn/The Register)
Thomas Claburn / The Register: Two versions of LiteLLM, an open source interface for accessing multiple large language models, have been removed from the Python Package Index …
- Amazon acquired New York-based Fauna Robotics, which is developing a human-like, 42-inch tall robot that can interact with people, walk, grip items, and dance (Mark Gurman/Bloomberg)
Mark Gurman / Bloomberg: Amazon.com Inc. acquired New York-based startup Fauna Robotics, becoming the latest technology giant to step into the burgeoning consumer humanoid market.
- Baltimore sues xAI, accusing it of violating consumer protection laws and engaging in deceptive trade practices by marketing Grok as generally safe (Lora Kolodny/CNBC)
Lora Kolodny / CNBC: Lawsuits against Elon Musk's xAI are piling up, with Baltimore becoming the first major U.S. city to file a complaint against the company concerning issues …
- OpenAI releases a set of prompts designed to be used with its open-weight safety model gpt-oss-safeguard that lets developers make their apps safer for teens (Amanda Silberling/TechCrunch)
Amanda Silberling / TechCrunch: OpenAI said Tuesday it is releasing a set of prompts that developers can use to make their apps safer for teens.
- Sources: OpenAI nears a deal to raise ~$10B from Abu Dhabi's MGX, Coatue, and Thrive, bringing its latest funding round to ~$120B at a $730B valuation (Bloomberg)
Bloomberg: OpenAI is nearing a deal to raise about $10 billion from venture investors, according to people familiar with the matter …
- Sources: Apple's plans for a Siri reboot include a standalone Siri app, an overhauled interface in the Dynamic Island, and Ask Siri and Write with Siri features (Mark Gurman/Bloomberg)
Mark Gurman / Bloomberg: Apple Inc. is testing a standalone app for its Siri voice assistant alongside a new “Ask Siri” feature that will work across …
- Anthropic announces an "auto mode" that enables Claude Code to make permission-level decisions while preventing destructive actions like mass file deletion (David Gewirtz/ZDNET)
David Gewirtz / ZDNET: Claude's auto mode reduces permission prompts for developers.
- OpenAI revamps ChatGPT's shopping experience by letting users upload images or describe items and include criteria like their budget (Ashley Capoot/CNBC)
Ashley Capoot / CNBC: OpenAI is rolling out a new shopping experience within ChatGPT to make it easier for users to find and compare products, after its Instant Checkout feature failed to take off.
- Doss, which provides an AI-native inventory management layer that integrates with existing accounting systems, raised a $55M Series B (Marina Temkin/TechCrunch)
Marina Temkin / TechCrunch: Enterprise resource planning (ERP) systems are often described as a company's ‘central brain’ because the software connects different departments …
Solidot(15)
- [Featured] Registration is now open for the 2026 NVIDIA startup showcase!
Starting in March, NVIDIA will run a series of showcase events for tech startups across China, combining roadshows, exhibition booths, and matchmaking with large enterprises. Beijing, April 23: an in-depth look at GTC 2026 highlights and announcements, focused on physical AI, AI agents, and large-language-model applications; includes roadshows, exhibits, and enterprise and technical matchmaking. Chengdu, May 15: a session dedicated to AI applications and overseas expansion, with NVIDIA experts and industry guests covering going global, physical AI, AI agents, and applied AI. Shanghai, May 21: focused on AI agents, physical AI, and LLM application scenarios; includes roadshows, exhibits, and enterprise and technical matchmaking. Macau, May 26-30: an international session held alongside the BEYOND Expo tech innovation fair, focused on AI agents, physical AI, and enterprise expansion abroad, with #GTC26 technical recaps, project roadshows, panel discussions, and investment and business matchmaking; registered companies have a chance to receive a free BEYOND Expo booth. Company registration: https://jinshuju.com/f/MfGfIi?x_field_1=zhiding-wechat Attendee registration: https://jinshuju.com/f/ZMKknG?x_field_1=Zhiding-WeChat
- Firefox 149 released
Mozilla released Firefox 149 on March 24. Major changes include: jxl-rs, a new JPEG-XL image decoder written in Rust, replaces the old C++ decoder; faster PDF handling, including the ability to download images from a PDF via the right-click context menu; improved HTTP/3 upload performance; and a built-in VPN (currently available only in the US and a few other regions) with 50 GB of free traffic per month.
- Will AI make source code evolve, or drive it extinct?
In an era when LLM-based AI-assisted programming is increasingly popular, will programming languages emerge that are optimized for AI with no regard for human readability? Experiments are already underway to minimize tokens for LLM efficiency. Will AI make source code evolve, or drive it extinct? Could AI generate an intermediate language directly from prompts and feed it into an interpreter or compiler? Will high-level languages still be needed? Last October, IEEE Spectrum held a webinar on whether AI will make programming languages disappear. High-level languages exist for humans; we could well have AI emit an intermediate language directly, while future programmers would still make design decisions about interfaces, algorithms, and other architectural concerns. The generated code would still need to pass tests and be able to explain what it is doing.
- Amazon AWS's Bahrain data center disrupted by drone activity for the second time
Amazon AWS's data center in Bahrain was disrupted by drone activity, the second time this month AWS has been affected by the war. An Amazon spokesperson confirmed the problem was caused by drone activity but provided no further details; it is unclear whether the Bahrain facility was struck directly or whether a nearby area was hit. Amazon said it is helping customers migrate to other AWS regions. Earlier this month Amazon said its facilities in Bahrain and the UAE had been hit by drones, and at the time warned that recovery would take a while due to structural damage.
- America's standing as a scientific superpower is slipping
The Trump administration's anti-science stance is driving large numbers of researchers out of the US. Data show that among researchers who moved across borders between January and August 2025, the US share of outflows rose to 11% while its share of inflows fell to 15%. Scholars in fields such as climate change are heading to Europe, with growing numbers moving to European countries such as Spain and France, as well as to Canada and South Korea. By drawing the world's best talent to its universities and companies, the US built its position as a scientific superpower and a wellspring of innovation and economic growth. But the second Trump administration's pressure on elite universities, cuts to science and technology budgets, visa and immigration restrictions, and skepticism toward climate change and vaccines are driving researchers away from the US.
- The 2026 Abel Prize goes to Gerd Faltings, who proved the Mordell conjecture
The 2026 Abel Prize has been awarded to German mathematician Gerd Faltings, who proved the Mordell conjecture. He received the Fields Medal for that proof in 1986 at age 32, and the conjecture has since been renamed Faltings's theorem. The theorem concerns curves, which can usually be described by simple equations built from addition and multiplication of two variables. Plotting an equation's solutions in a coordinate system yields a line, an ellipse, or a more complex curve. Mathematicians have long sought a special subset of these solutions, the "rational points," whose coordinates are integers or fractions. These special points encode rich and intricate relationships, hiding an order mathematicians try to uncover. But there are infinitely many curves, and pinning down the rational points on all of them seemed impossible until Faltings's theorem. He proved that if some variable in a curve's equation has degree higher than 3, the curve has only finitely many rational points; only lines, conics such as circles, and cubic equations can have infinitely many. The proof is regarded as a cornerstone of arithmetic geometry.
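The modern statement of the theorem is phrased in terms of the genus of the curve rather than its degree; for smooth plane curves the "degree higher than 3" condition above corresponds to a genus bound via the degree-genus formula. A brief sketch of the standard statement (background, not from the article itself):

```latex
% Faltings's theorem (the Mordell conjecture): a smooth curve C of genus
% g >= 2 over a number field K has only finitely many K-rational points.
g(C) \ge 2 \;\Longrightarrow\; \#\,C(K) < \infty
% For a smooth plane curve of degree d, the degree-genus formula gives
g = \frac{(d-1)(d-2)}{2},
% so d >= 4 forces g >= 3 >= 2: the quartic x^4 + y^4 = 1 has only
% finitely many rational solutions, whereas a cubic (g = 1) such as
% y^2 = x^3 - 2 can still have infinitely many.
```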
- Reddit considers verifying its users
Reddit CEO Steve Huffman revealed that the social site is exploring different ways to verify that users are humans rather than bots. The simplest approach, he said, is biometrics like Face ID or Touch ID, which require a live person; more elaborate approaches include identity verification. Huffman said Reddit's commitment to users is: we don't know your name, but we want to know you are a person. He said Reddit will take a gradual approach, and that every platform has to find the right balance.
- Drought may promote the growth of antibiotic-resistant microbes
According to a study published in Nature Microbiology drawing on analysis of clinical data from 116 countries, drought conditions may increase concentrations of natural antibiotics in soil, promoting the growth of antibiotic-resistant microbes. Soil is a rich source of natural antibiotic compounds, and many soil microbes have evolved mechanisms to survive them. It remains unclear how the more frequent and prolonged droughts driven by climate change will affect antibiotic-producing and antibiotic-resistant soil microbes, or whether this has implications for human health. Caltech researchers combined computational analysis with lab experiments to study how drought affects antibiotic dynamics in soil. They integrated five metagenomic datasets from earlier studies, covering farmland and grassland soil from California, forest soil from the Swiss canton of Valais, and wetland soil from Nanchang, China, and assessed how the abundance of microbial antibiotic-production and antibiotic-resistance genes varied with soil dryness. In all five datasets, the abundance of antibiotic-production genes rose significantly under drought, including genes for beta-lactams (such as penicillin) and macrolides.
- US bans the sale of new foreign-made consumer routers
The FCC has added all foreign-made consumer routers to its regulated list, banning their sale in the US. The FCC cited national security, saying "malicious actors exploit security flaws in foreign-made routers to attack American homes, compromise networks, conduct espionage, and steal intellectual property." US residents can keep using their existing foreign-made routers; the ban applies to all new ones. Foreign-made routers will need FCC approval to go on sale in the US, and the companies involved must disclose their foreign investors and present plans to move router production to the US.
- OnlyFans owner Leonid Radvinsky dies at 43
OnlyFans owner Leonid Radvinsky has died at 43. Born in Ukraine, Radvinsky grew up in Chicago, graduated from Northwestern University with a degree in economics, and most recently lived mainly in Florida. He acquired OnlyFans from its two British founders in 2018. The platform's popularity soared during the COVID-19 pandemic, and three years later he made Forbes's annual billionaires list. In a statement, OnlyFans confirmed that Radvinsky "passed away peacefully after a long battle with cancer" and asked that his family's privacy be respected. According to OnlyFans's most recent filing with the UK's Companies House, in 2024 the company processed over £7 billion in transactions, took in $1.4 billion in revenue, and had more than 377 million subscribers and 4.6 million creators.
- Cocaine found in sharks off the Bahamas
According to a study published in Environmental Pollution, biologists at Brazil's Federal University of Paraná tested blood samples from 85 sharks living near the island of Eleuthera in the Bahamas and found that nearly a third contained traces of drugs and pharmaceuticals linked to human activity. The scientists detected caffeine, acetaminophen, and anti-inflammatory drugs such as diclofenac, and at least one sample tested positive for cocaine. The finding adds to the evidence that marine ecosystems are being affected by pollutants tied to human activity. Cocaine and diclofenac had never before been detected in Bahamian sharks. The cocaine likely came from drugs lost or dumped during trafficking, while the pharmaceuticals likely entered the ocean through wastewater discharge.
- Flush with solar, Australia cuts electricity prices and gives residents three free hours of power
The Australian Energy Regulator published a draft decision last week cutting residential electricity prices by 1.3%-10.1% and small-business prices by 8.5%-21.2%, with the exact reduction depending on the region. The draft also introduces a "solar sharing" scheme giving customers three hours of free electricity around midday to make full use of abundant solar resources. Australian households have installed more than 4.2 million solar systems, which often produce cheap surplus power at midday. Part of the idea behind solar sharing is to shift demand away from peak periods (especially the evening) into the sunniest hours, which helps minimize peak prices and reduces the need for grid upgrades and interventions to keep the grid stable. The free-power window runs from 11 am to 2 pm in New South Wales and southeast Queensland, and from 12 pm to 3 pm in South Australia.
- GrapheneOS refuses age verification
GrapheneOS, the security-hardened Android distribution, announced via its social media accounts that it will not comply with newly enacted age-verification laws requiring operating systems to collect users' age data during setup. GrapheneOS stressed that it does not ask users for personal information, identity documents, or account registration, and said it will accept GrapheneOS devices being unsellable in regions whose local laws forbid them. GrapheneOS currently supports only Google Pixel smartphones; earlier this month it announced a partnership with Lenovo-owned Motorola, which is expected to ship a GrapheneOS-supported smartphone in 2027. California, Colorado, and Brazil have passed laws requiring operating systems to verify user age; Brazil's Digital ECA (Law 15.211) took effect on March 17 and carries fines of up to $9.5 million for violators. GrapheneOS is not the only OS project refusing to comply: the developer of DB48X, an open-source calculator firmware project, has declared it will never implement age verification, and the MidnightBSD project updated its license to bar Brazilian users.
- Oil crisis accelerates the shift to renewable energy
The blockade of the Strait of Hormuz has plunged the world into another severe oil crisis. About a fifth of the world's oil and LNG shipments pass through the strait, and Asia is hit hardest. Unlike in previous oil crises, renewables can now compete with fossil fuels in many countries. The two most populous countries, China and India, have both expanded renewables; China still relies on coal power, but its renewable capacity far exceeds India's. IEA data show that about one in ten cars in China is electric. China remains the world's largest crude importer and the biggest buyer of Iranian oil, but by electrifying parts of its economy with renewables it has reduced its dependence on imported oil; without that shift, the impact on China would be far greater. India is facing a shortage of cooking gas, prompting residents to rush to buy induction cooktops. Solar and wind account for only 11% of Japan's energy output, on par with India and below China's 18%. Pakistan's rapid solar buildout has cut its fossil-fuel imports by more than $12 billion since 2020. Bangladesh, with limited energy reserves, has closed universities to save electricity, and its government has begun rationing fuel.
- Cigarette butts persist in the environment for more than a decade
According to a study published in Environmental Pollution, cigarette butts never fully disappear from the environment. Because they break down slowly and release toxic substances, they pose a long-term environmental hazard. The researchers tracked how cigarette butts decompose over ten years and found that under nitrogen-rich conditions a butt loses 84% of its mass in a decade. The decomposition proceeds in four phases, with one peak at the start and another midway through, showing that old butts continue to pose an ecological risk.