OrangeBot.AI Digest — 2026-04-04

83 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. How many products does Microsoft have named 'Copilot'? (teybannerman.com)
  2. Iranian missile blitz takes down AWS data centers in Bahrain and Dubai (www.tomshardware.com)
  3. 12k AI-generated blog posts added in a single commit (github.com)
  4. When legal sports betting surges, so do Americans' financial problems (www.npr.org)
  5. Show HN: A game where you build a GPU (jaso1024.com)
  6. Apple approves driver that lets Nvidia eGPUs work with Arm Macs (www.theverge.com)
  7. German men 18-45 need military permit for extended stays abroad (www.dw.com)
  8. Show HN: TurboQuant-WASM – Google's vector quantization in the browser (github.com)
  9. Components of a Coding Agent (magazine.sebastianraschka.com)
  10. Author of "Careless People" banned from saying anything negative about Meta (www.thetimes.com)
  11. The CMS is dead, long live the CMS (next.jazzsequence.com)
  12. The Cathedral, the Bazaar, and the Winchester Mystery House (www.dbreunig.com)
  13. Embarrassingly simple self-distillation improves code generation (arxiv.org)
  14. Some Unusual Trees (thoughts.wyounas.com)
  15. Claude Code Found a Linux Vulnerability Hidden for 23 Years (mtlynch.io)

GitHub Trending (8)

  1. Blaizzy / mlx-vlm
  2. onyx-dot-app / onyx
  3. Yeachan-Heo / oh-my-codex
  4. siddharthvaddem / openscreen
  5. telegramdesktop / tdesktop
  6. block / goose
  7. microsoft / agent-framework
  8. sherlock-project / sherlock

Product Hunt (15)

  1. Fluently

    AI subtitles & translations for YouTube. 20+ Languages.

  2. Open Claude in Chrome

    Claude in Chrome, reverse-engineered, Jailbroken

  3. Google Vids 2.0

    Create, edit and share videos at no cost w/ new AI features

  4. Surf Social Websites

    Bring together people and content on the social web

  5. Mercury Edit 2

    Ultra-fast next-edit prediction for coding

  6. APImage

    Create Images that Amaze

  7. Faahh

    Slap your desk. Unplug distractions. Get back to focus.

  8. Vista

    The image viewer macOS should have built.

  9. OpenGyver

    Turn CLI / AI agents into McGyver

  10. Donut Browser

    Open Source Anti-Detect Browser with Unlimited Profiles

  11. Sleek Analytics

    See who's on your site. Right now.

  12. Klick AI Camera Assistant

    Real-time AI camera that teaches you composition live

  13. OpenRouter Model Fusion

    Run many models side by side and fuse the best answer

  14. Slide2Video

    Turn slides into narrated videos

  15. Package Mate

    Master your macOS dev environment from the terminal

Hugging Face (15)

  1. DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

    Data-centric training has emerged as a promising direction for improving large language models (LLMs) by optimizing not only model parameters but also the selection, composition, and weighting of training data during optimization. However, existing approaches to data selection, data mixture optimization, and data reweighting are often developed in isolated codebases with inconsistent interfaces, hindering reproducibility, fair comparison, and practical integration. In this paper, we present DataFlex, a unified data-centric dynamic training framework built upon LLaMA-Factory. DataFlex supports three major paradigms of dynamic data optimization: sample selection, domain mixture adjustment, and sample reweighting, while remaining fully compatible with the original training workflow. It provides extensible trainer abstractions and modular components, enabling a drop-in replacement for standard LLM training, and unifies key model-dependent operations such as embedding extraction, inference, and gradient computation, with support for large-scale settings including DeepSpeed ZeRO-3. We conduct comprehensive experiments across multiple data-centric methods. Dynamic data selection consistently outperforms static full-data training on MMLU across both Mistral-7B and Llama-3.2-3B. For data mixture, DoReMi and ODM improve both MMLU accuracy and corpus-level perplexity over default proportions when pretraining Qwen2.5-1.5B on SlimPajama at 6B and 30B token scales. DataFlex also achieves consistent runtime improvements over original implementations. These results demonstrate that DataFlex provides an effective, efficient, and reproducible infrastructure for data-centric dynamic training of LLMs.

  2. The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

    Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift is driven by the structural limitations of explicit-space computation, including linguistic redundancy, discretization bottlenecks, sequential inefficiency, and semantic loss. This survey aims to provide a unified and up-to-date landscape of latent space in language-based models. We organize the survey into five sequential perspectives: Foundation, Evolution, Mechanism, Ability, and Outlook. We begin by delineating the scope of latent space, distinguishing it from explicit or verbal space and from the latent spaces commonly studied in generative visual models. We then trace the field's evolution from early exploratory efforts to the current large-scale expansion. To organize the technical landscape, we examine existing work through the complementary lenses of mechanism and ability. From the perspective of Mechanism, we identify four major lines of development: Architecture, Representation, Computation, and Optimization. From the perspective of Ability, we show how latent space supports a broad capability spectrum spanning Reasoning, Planning, Modeling, Perception, Memory, Collaboration, and Embodiment. Beyond consolidation, we discuss the key open challenges, and outline promising directions for future research. We hope this survey serves not only as a reference for existing work, but also as a foundation for understanding latent space as a general computational and systems paradigm for next-generation intelligence.

  3. Generative World Renderer

    Scaling generative inverse and forward rendering to real-world scenarios is bottlenecked by the limited realism and temporal coherence of existing synthetic datasets. To bridge this persistent domain gap, we introduce a large-scale, dynamic dataset curated from visually complex AAA games. Using a novel dual-screen stitched capture method, we extracted 4M continuous frames (720p/30 FPS) of synchronized RGB and five G-buffer channels across diverse scenes, visual effects, and environments, including adverse weather and motion-blur variants. This dataset uniquely advances bidirectional rendering: enabling robust in-the-wild geometry and material decomposition, and facilitating high-fidelity G-buffer-guided video generation. Furthermore, to evaluate the real-world performance of inverse rendering without ground truth, we propose a novel VLM-based assessment protocol measuring semantic, spatial, and temporal consistency. Experiments demonstrate that inverse renderers fine-tuned on our data achieve superior cross-dataset generalization and controllable generation, while our VLM evaluation strongly correlates with human judgment. Combined with our toolkit, our forward renderer enables users to edit styles of AAA games from G-buffers using text prompts.

  4. SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

    Agent skills, structured packages of procedural knowledge and executable resources that agents dynamically load at inference time, have become a reliable mechanism for augmenting LLM agents. Yet inference-time skill augmentation is fundamentally limited: retrieval noise introduces irrelevant guidance, injected skill content imposes substantial token overhead, and the model never truly acquires the knowledge it merely follows. We ask whether skills can instead be internalized into model parameters, enabling zero-shot autonomous behavior without any runtime skill retrieval. We introduce SKILL0, an in-context reinforcement learning framework designed for skill internalization. SKILL0 introduces a training-time curriculum that begins with full skill context and progressively withdraws it. Skills are grouped offline by category and rendered with interaction history into a compact visual context, teaching the model tool invocation and multi-turn task completion. A Dynamic Curriculum then evaluates each skill file's on-policy helpfulness, retaining only those from which the current policy still benefits within a linearly decaying budget, until the agent operates in a fully zero-shot setting. Extensive agentic experiments demonstrate that SKILL0 achieves substantial improvements over the standard RL baseline (+9.7% for ALFWorld and +6.6% for Search-QA), while maintaining a highly efficient context of fewer than 0.5k tokens per step. Our code is available at https://github.com/ZJU-REAL/SkillZero.
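
    The linearly decaying skill-context budget described in this abstract can be sketched as a small scheduling routine. This is an illustrative guess at the mechanics only; the function names, the greedy packing, and the helpfulness scores are hypothetical, not taken from the paper.

```python
def skill_budget(step, total_steps, initial_budget):
    """Linearly decay the per-step skill-context token budget toward zero."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return int(initial_budget * remaining)

def select_skills(skills, helpfulness, budget):
    """Retain only skills the current policy still benefits from
    (positive on-policy helpfulness), greedily packed under the budget."""
    chosen, used = [], 0
    for skill in sorted(skills, key=lambda s: helpfulness[s["name"]], reverse=True):
        if helpfulness[skill["name"]] <= 0:
            continue  # the policy no longer benefits from this skill
        if used + skill["tokens"] > budget:
            continue  # would exceed the decayed token budget
        chosen.append(skill["name"])
        used += skill["tokens"]
    return chosen
```

    Once the budget reaches zero, `select_skills` returns nothing and the agent runs fully zero-shot, matching the end state the abstract describes.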

  5. Steerable Visual Representations

    Pretrained Vision Transformers (ViTs) such as DINOv2 and MAE provide generic image features that can be applied to a variety of downstream tasks such as retrieval, classification, and segmentation. However, such representations tend to focus on the most salient visual cues in the image, with no way to direct them toward less prominent concepts of interest. In contrast, Multimodal LLMs can be guided with textual prompts, but the resulting representations tend to be language-centric and lose their effectiveness for generic visual tasks. To address this, we introduce Steerable Visual Representations, a new class of visual representations, whose global and local features can be steered with natural language. While most vision-language models (e.g., CLIP) fuse text with visual features after encoding (late fusion), we inject text directly into the layers of the visual encoder (early fusion) via lightweight cross-attention. We introduce benchmarks for measuring representational steerability, and demonstrate that our steerable visual features can focus on any desired objects in an image while preserving the underlying representation quality. Our method also matches or outperforms dedicated approaches on anomaly detection and personalized object discrimination, exhibiting zero-shot generalization to out-of-distribution tasks.

  6. CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

    Large language model (LLM)-based evolution is a promising approach for open-ended discovery, where progress requires sustained search and knowledge accumulation. Existing methods still rely heavily on fixed heuristics and hard-coded exploration rules, which limit the autonomy of LLM agents. We present CORAL, the first framework for autonomous multi-agent evolution on open-ended problems. CORAL replaces rigid control with long-running agents that explore, reflect, and collaborate through shared persistent memory, asynchronous multi-agent execution, and heartbeat-based interventions. It also provides practical safeguards, including isolated workspaces, evaluator separation, resource management, and agent session and health management. Evaluated on diverse mathematical, algorithmic, and systems optimization tasks, CORAL sets new state-of-the-art results on 10 tasks, achieving 3-10 times higher improvement rates with far fewer evaluations than fixed evolutionary search baselines across tasks. On Anthropic's kernel engineering task, four co-evolving agents improve the best known score from 1363 to 1103 cycles. Mechanistic analyses further show how these gains arise from knowledge reuse and multi-agent exploration and communication. Together, these results suggest that greater agent autonomy and multi-agent evolution can substantially improve open-ended discovery. Code is available at https://github.com/Human-Agent-Society/CORAL.

  7. EgoSim: Egocentric World Simulator for Embodied Interaction Generation

    We introduce EgoSim, a closed-loop egocentric world simulator that generates spatially consistent interaction videos and persistently updates the underlying 3D scene state for continuous simulation. Existing egocentric simulators either lack explicit 3D grounding, causing structural drift under viewpoint changes, or treat the scene as static, failing to update world states across multi-stage interactions. EgoSim addresses both limitations by modeling 3D scenes as updatable world states. We generate embodiment interactions via a Geometry-action-aware Observation Simulation model, with spatial consistency from an Interaction-aware State Updating module. To overcome the critical data bottleneck posed by the difficulty in acquiring densely aligned scene-interaction training pairs, we design a scalable pipeline that extracts static point clouds, camera trajectories, and embodiment actions from in-the-wild large-scale monocular egocentric videos. We further introduce EgoCap, a capture system that enables low-cost real-world data collection with uncalibrated smartphones. Extensive experiments demonstrate that EgoSim significantly outperforms existing methods in terms of visual quality, spatial consistency, and generalization to complex scenes and in-the-wild dexterous interactions, while supporting cross-embodiment transfer to robotic manipulation. Code and datasets will be released soon. The project page is at egosimulator.github.io.

  8. NearID: Identity Representation Learning via Near-identity Distractors

    When evaluating identity-focused tasks such as personalized generation and image editing, existing vision encoders entangle object identity with background context, leading to unreliable representations and metrics. We introduce the first principled framework to address this vulnerability using Near-identity (NearID) distractors, where semantically similar but distinct instances are placed on the exact same background as a reference image, eliminating contextual shortcuts and isolating identity as the sole discriminative signal. Based on this principle, we present the NearID dataset (19K identities, 316K matched-context distractors) together with a strict margin-based evaluation protocol. Under this setting, pre-trained encoders perform poorly, achieving Sample Success Rates (SSR), a strict margin-based identity discrimination metric, as low as 30.7% and often ranking distractors above true cross-view matches. We address this by learning identity-aware representations on a frozen backbone using a two-tier contrastive objective enforcing the hierarchy: same identity > NearID distractor > random negative. This improves SSR to 99.2%, enhances part-level discrimination by 28.0%, and yields stronger alignment with human judgments on DreamBench++, a human-aligned benchmark for personalization. Project page: https://gorluxor.github.io/NearID/
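
    The two-tier hierarchy in this abstract (same identity > NearID distractor > random negative) amounts to a pair of margin-ranking terms. A minimal NumPy sketch follows; the cosine-similarity scoring and the margin values are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def two_tier_loss(anchor, positive, near_distractor, random_negative,
                  margin_near=0.2, margin_rand=0.2):
    """Hinge terms enforcing sim(anchor, positive) > sim(anchor, near_distractor)
    > sim(anchor, random_negative), each by a margin."""
    s_pos = cosine(anchor, positive)
    s_near = cosine(anchor, near_distractor)
    s_rand = cosine(anchor, random_negative)
    tier1 = max(0.0, margin_near - (s_pos - s_near))   # identity above NearID distractor
    tier2 = max(0.0, margin_rand - (s_near - s_rand))  # distractor above random negative
    return tier1 + tier2
```

    The loss is zero exactly when the embedding respects the full hierarchy with the required margins, which is the training signal the paper's frozen-backbone head would receive.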

  9. LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model

    Unified models (UMs) hold promise for their ability to understand and generate content across heterogeneous modalities. Compared to merely generating visual content, the use of UMs for interleaved cross-modal reasoning is more promising and valuable, e.g., for solving understanding problems that require dense visual thinking, improving visual generation through self-reflection, or modeling visual dynamics of the physical world guided by stepwise action interventions. However, existing UMs necessitate pixel decoding as a bridge due to their disjoint visual representations for understanding and generation, which is both ineffective and inefficient. In this paper, we introduce LatentUM, a novel unified model that represents all modalities within a shared semantic latent space, eliminating the need for pixel-space mediation between visual understanding and generation. This design naturally enables flexible interleaved cross-modal reasoning and generation. Beyond improved computational efficiency, the shared representation substantially alleviates codec bias and strengthens cross-modal alignment, allowing LatentUM to achieve state-of-the-art performance on the Visual Spatial Planning benchmark, push the limits of visual generation through self-reflection, and support world modeling by predicting future visual states within the shared semantic latent space.

  10. VOID: Video Object and Interaction Deletion

    Existing video object removal methods excel at inpainting content "behind" the object and correcting appearance-level artifacts such as shadows and reflections. However, when the removed object has more significant interactions, such as collisions with other objects, current models fail to correct them and produce implausible results. We present VOID, a video object removal framework designed to perform physically-plausible inpainting in these complex scenarios. To train the model, we generate a new paired dataset of counterfactual object removals using Kubric and HUMOTO, where removing an object requires altering downstream physical interactions. During inference, a vision-language model identifies regions of the scene affected by the removed object. These regions are then used to guide a video diffusion model that generates physically consistent counterfactual outcomes. Experiments on both synthetic and real data show that our approach better preserves consistent scene dynamics after object removal compared to prior video object removal methods. We hope this framework sheds light on how to make video editing models better simulators of the world through high-level causal reasoning.

  11. Therefore I am. I Think

    We consider the question: when a large language reasoning model makes a choice, does it think first and then decide, or decide first and then think? In this paper, we present evidence that detectable, early-encoded decisions shape chain-of-thought in reasoning models. Specifically, we show that a simple linear probe successfully decodes tool-calling decisions from pre-generation activations with very high confidence, and in some cases, even before a single reasoning token is produced. Activation steering supports this causally: perturbing the decision direction leads to inflated deliberation, and flips behavior in many examples (between 7% and 79%, depending on model and benchmark). We also show through behavioral analysis that, when steering changes the decision, the chain-of-thought process often rationalizes the flip rather than resisting it. Together, these results suggest that reasoning models can encode action choices before they begin to deliberate in text.
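
    A linear probe of the kind this abstract describes is just logistic regression on hidden states. The sketch below uses synthetic activations with a planted "decision direction" standing in for a real model's pre-generation activations; everything here (the data, the learning rate, the dimensions) is a hypothetical illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for pre-generation activations: a hidden "decision
# direction" w_true shifts each activation vector according to whether
# the model will call a tool (label 1) or not (label 0).
dim, n = 32, 400
w_true = rng.normal(size=dim)
labels = rng.integers(0, 2, size=n)
acts = rng.normal(size=(n, dim)) + np.outer(2 * labels - 1, w_true)

# Fit a linear probe by gradient descent on the logistic loss.
w, b = np.zeros(dim), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(acts @ w + b)))      # predicted P(tool call)
    w -= 0.5 * (acts.T @ (p - labels) / n)
    b -= 0.5 * float(np.mean(p - labels))

accuracy = float(np.mean((acts @ w + b > 0) == labels))
```

    Activation steering then corresponds to adding a multiple of the learned direction `w` to the activations to push the decision one way or the other, which is the causal intervention the abstract reports.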

  12. Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

    AI agents increasingly operate over extended time horizons, yet their ability to retain, organize, and recall multimodal experiences remains a critical bottleneck. Building effective lifelong memory requires navigating a vast design space spanning architecture, retrieval strategies, prompt engineering, and data pipelines; this space is too large and interconnected for manual exploration or traditional AutoML to explore effectively. We deploy an autonomous research pipeline to discover Omni-SimpleMem, a unified multimodal memory framework for lifelong AI agents. Starting from a naïve baseline (F1=0.117 on LoCoMo), the pipeline autonomously executes ~50 experiments across two benchmarks, diagnosing failure modes, proposing architectural modifications, and repairing data pipeline bugs, all without human intervention in the inner loop. The resulting system achieves state-of-the-art on both benchmarks, improving F1 by +411% on LoCoMo (0.117 to 0.598) and +214% on Mem-Gallery (0.254 to 0.797) relative to the initial configurations. Critically, the most impactful discoveries are not hyperparameter adjustments: bug fixes (+175%), architectural changes (+44%), and prompt engineering (+188% on specific categories) each individually exceed the cumulative contribution of all hyperparameter tuning, demonstrating capabilities fundamentally beyond the reach of traditional AutoML. We provide a taxonomy of six discovery types and identify four properties that make multimodal memory particularly suited for autoresearch, offering guidance for applying autonomous research pipelines to other AI system domains. Code is available at https://github.com/aiming-lab/SimpleMem.

  13. UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving

    Vision-Language-Action (VLA) models have recently emerged in autonomous driving, with the promise of leveraging rich world knowledge to improve the cognitive capabilities of driving systems. However, adapting such models for driving tasks currently faces a critical dilemma between spatial perception and semantic reasoning. Consequently, existing VLA systems are forced into suboptimal compromises: directly adopting 2D Vision-Language Models yields limited spatial perception, whereas enhancing them with 3D spatial representations often impairs the native reasoning capacity of VLMs. We argue that this dilemma largely stems from the coupled optimization of spatial perception and semantic reasoning within shared model parameters. To overcome this, we propose UniDriveVLA, a Unified Driving Vision-Language-Action model based on Mixture-of-Transformers that addresses the perception-reasoning conflict via expert decoupling. Specifically, it comprises three experts for driving understanding, scene perception, and action planning, which are coordinated through masked joint attention. In addition, we combine a sparse perception paradigm with a three-stage progressive training strategy to improve spatial perception while maintaining semantic reasoning capability. Extensive experiments show that UniDriveVLA achieves state-of-the-art performance in open-loop evaluation on nuScenes and closed-loop evaluation on Bench2Drive. Moreover, it demonstrates strong performance across a broad range of perception, prediction, and understanding tasks, including 3D detection, online mapping, motion forecasting, and driving-oriented VQA, highlighting its broad applicability as a unified model for autonomous driving. Code and model have been released at https://github.com/xiaomi-research/unidrivevla

  14. ASI-Evolve: AI Accelerates AI

    Can AI accelerate the development of AI itself? While recent agentic systems have shown strong performance on well-scoped tasks with rapid feedback, it remains unclear whether they can tackle the costly, long-horizon, and weakly supervised research loops that drive real AI progress. We present ASI-Evolve, an agentic framework for AI-for-AI research that closes this loop through a learn-design-experiment-analyze cycle. ASI-Evolve augments standard evolutionary agents with two key components: a cognition base that injects accumulated human priors into each round of exploration, and a dedicated analyzer that distills complex experimental outcomes into reusable insights for future iterations. To our knowledge, ASI-Evolve is the first unified framework to demonstrate AI-driven discovery across three central components of AI development: data, architectures, and learning algorithms. In neural architecture design, it discovered 105 SOTA linear attention architectures, with the best discovered model surpassing DeltaNet by +0.97 points, nearly 3x the gain of recent human-designed improvements. In pretraining data curation, the evolved pipeline improves average benchmark performance by +3.96 points, with gains exceeding 18 points on MMLU. In reinforcement learning algorithm design, discovered algorithms outperform GRPO by up to +12.5 points on AMC32, +11.67 points on AIME24, and +5.04 points on OlympiadBench. We further provide initial evidence that this AI-for-AI paradigm can transfer beyond the AI stack through experiments in mathematics and biomedicine. Together, these results suggest that ASI-Evolve represents a promising step toward enabling AI to accelerate AI across the foundational stages of development, offering early evidence for the feasibility of closed-loop AI research.

  15. Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers

    Recent advances in diffusion-based controllable visual generation have led to remarkable improvements in image quality. However, these powerful models are typically deployed on cloud servers due to their large computational demands, raising serious concerns about user data privacy. To enable secure and efficient on-device generation, we explore in this paper controllable diffusion models built upon linear attention architectures, which offer superior scalability and efficiency, even on edge devices. Yet, our experiments reveal that existing controllable generation frameworks, such as ControlNet and OminiControl, either lack the flexibility to support multiple heterogeneous condition types or suffer from slow convergence on such linear-attention models. To address these limitations, we propose a novel controllable diffusion framework tailored for linear attention backbones like SANA. The core of our method lies in a unified gated conditioning module working in a dual-path pipeline, which effectively integrates multi-type conditional inputs, such as spatially aligned and non-aligned cues. Extensive experiments on multiple tasks and benchmarks demonstrate that our approach achieves state-of-the-art controllable generation performance based on linear-attention models, surpassing existing methods in terms of fidelity and controllability.
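
    The gated conditioning idea in the last item can be illustrated in a few lines: condition features enter the backbone stream through cross-attention whose output is scaled by a learned gate. A minimal NumPy sketch under assumed shapes; the function, shapes, and zero-initialized gate are illustrative, not the paper's actual module.

```python
import numpy as np

def gated_condition_injection(x, cond, w_q, w_k, w_v, gate):
    """Cross-attend from backbone tokens x (n, d) to condition tokens
    cond (m, d), then add the result scaled by a learned gate. With the
    gate at zero, the module is an identity on the backbone stream."""
    q, k, v = x @ w_q, cond @ w_k, cond @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn = attn / attn.sum(axis=-1, keepdims=True)   # row-wise softmax
    return x + gate * (attn @ v)
```

    Initializing the gate near zero lets training start from the unconditional backbone and gradually learn how strongly each condition type should be injected, one plausible reason a gated design converges faster than bolting on full multimodal attention.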

Techmeme (15)

  1. Apple reportedly signed a 3rd-party driver, by Tiny Corp, for AMD or Nvidia eGPUs for Apple Silicon Macs; it's meant for AI research, not accelerating graphics (AppleInsider)

    Apple has signed a driver for AMD or Nvidia eGPUs connected to Apple Silicon, but there are some big caveats, and it won't improve your graphics.

  2. Research across 1,372 participants and 9K+ trials details "cognitive surrender", where most subjects had minimal AI skepticism and accepted faulty AI reasoning (Kyle Orland/Ars Technica)

    When it comes to large language model-powered tools, there are generally two broad categories of users.

  3. VCs are covering expenses like rent for young college dropouts founding AI startups; Antler: average AI unicorn founder age fell from 40 in 2020 to 29 in 2024 (Kate Clark/Wall Street Journal)

    Venture capitalists are stepping in to cover expenses like rent while dropouts from Harvard to Stanford chase their startup dreams.

  4. Y Combinator appears to have dropped Delve, removing the company's profile from its startup directory, following allegations of fake compliance certificates (The Economic Times)

    Delve's removal from Y Combinator's directory follows allegations that compliance certifications for hundreds of Delve's clients were fabricated.

  5. Russian media says attempts to limit VPN use may have triggered a widespread banking outage, as Moscow intensifies a crackdown on internet use and Telegram (Anthony Halpin/Bloomberg)

    Russia's attempts to restrict the use of virtual private networks amid a clampdown on the Telegram messaging platform triggered …

  6. Q&A with Simon Willison on the November release of GPT-5.1 and Opus 4.5 as the inflection point for coding, exhaustion due to managing coding agents, and more (Lenny Rachitsky/Lenny's Newsletter)

    Simon Willison is a prolific independent software developer, a blogger, and one of the most visible and trusted voices on the impact AI is having on builders.

  7. A profile of Benjamin Brundage, a 22-year-old college senior who helped uncover the Kimwolf botnet, which launched 26,000+ DDoS attacks targeting 8,000+ victims (Robert McMillan/Wall Street Journal)

    A flurry of powerful attacks had internet experts baffled. Benjamin Brundage had a few tricks to help solve the mystery.

  8. The White House's latest effort to enact legislation that would preempt state AI laws stalls as multiple Democrats dismiss the proposal as a partisan play (Politico)

    The resistance on Capitol Hill raises fresh doubts about whether Congress can pass any national laws for the rapidly advancing technology as states move ahead on their own.

  9. Chinese humanoid robot maker UBTech is seeking a chief scientist with an annual pay of as much as ~$18M; China's AI industry has eschewed mega pay packages (Bloomberg)

    Chinese humanoid robot maker UBTech Robotics Corp. is seeking a chief scientist, offering an annual pay of as much as 124 million yuan …

  10. Generalist, which raised $140M at a $440M valuation in 2025, releases GEN-1, an AI model to help robots handle high-dexterity tasks typically done by humans (Anna Tong/Forbes)

    The company says the next big leap in robotics won't come from fancier humanoid hardware. It will come from applying AI scaling principles …

  11. Sources: Copilot sales hit "big audacious goals" by March end after Microsoft pivoted its sales strategy; 3% of customers were paying for Copilot as of January (Brody Ford/Bloomberg)

    Microsoft Corp., responding to Wall Street feedback, has pivoted its AI sales strategy to focus on selling Copilot rather …

  12. The Artemis II moon mission is one of the first times NASA has let astronauts fly with smartphones, giving them modified iPhones for taking photos and videos (Kalley Huang/New York Times)

    The astronauts traveling in the Artemis II spacecraft were allowed to take smartphones with them. Sadly, they can't connect to the internet.

  13. Health data startup Bevel's CEO pushes back against Whoop's lawsuit that alleges Bevel copied the look of the Whoop app, saying Whoop's actions are "lawfare" (Leila Sheridan/Inc.com)

    Leila Sheridan / Inc.com : Health data startup Bevel's CEO pushes back against Whoop's lawsuit that alleges Bevel copied the look of the Whoop app, saying Whoop's actions are “lawfare” —  He says Whoop previously reached out to explore a collaboration before filing the suit.

  14. Sources: Mark Zuckerberg is back to writing code after a two-decade hiatus, submitting three diffs to Meta's monorepo, and is a heavy user of Claude Code CLI (Gergely Orosz/The Pragmatic Engineer)

    Gergely Orosz / The Pragmatic Engineer : Sources: Mark Zuckerberg is back to writing code after a two-decade hiatus, submitting three diffs to Meta's monorepo, and is a heavy user of Claude Code CLI —  Mark Zuckerberg and Garry Tan join the trend of C-level folks jumping back into coding with AI.  Also: a bad week for Claude Code and GitHub, and more

  15. Anthropic says Claude subscriptions will no longer cover usage on third-party tools like OpenClaw starting April 4 at 12pm PT, to better manage capacity (Jay Peters/The Verge)

    Jay Peters / The Verge : Anthropic says Claude subscriptions will no longer cover usage on third-party tools like OpenClaw starting April 4 at 12pm PT, to better manage capacity —  Claude subscriptions will no longer cover third-party access from tools like OpenClaw starting Saturday, April 4th.

Solidot(15)

  1. Python blood contains a weight-loss compound with no side effects

    Pythons can grow as long as telephone poles and can swallow enormous meals, then go months without eating while staying metabolically healthy. According to a study published in the journal Natural Metabolism, scientists report finding an appetite-suppressing compound in python blood that lacks the side effects of the popular GLP-1 weight-loss drugs. The study found that in the hours after feeding, a python's heart expands by 25% and its metabolism accelerates 4,000-fold to digest the meal. Researchers analyzed blood samples from fed ball pythons and Burmese pythons and identified 208 metabolites that rose significantly after feeding; the concentration of one molecule, para-tyramine-O-sulfate (pTOS), surged a thousandfold. Mouse experiments showed that pTOS suppresses appetite and reduces body weight without causing gastrointestinal problems or muscle loss.

  2. Nearly half of planned US data-center projects delayed or cancelled

    Shortages of critical electrical components such as transformers, switchgear, and batteries have delayed or cancelled nearly half of the data-center projects planned in the US. The country aims to add 12 GW of data-center capacity in 2026, but because of these problems only a third of that capacity is actively under construction. Power infrastructure accounts for less than 10% of a data center's total cost, yet it is as essential as the computing hardware. Strong demand has stretched US lead times for large power transformers from 24-30 months before 2020 to five years or more, a disaster for AI data centers, whose deployment cycles typically run under 18 months. To cope with the shortage, US companies have turned to the global market, with Canada, Mexico, and South Korea becoming the main suppliers of large power transformers for US AI data centers. Data show that as of October 2025, US imports of large power transformers from China had risen from fewer than 1,500 units in 2022 to more than 8,000. China also accounts for over 40% of US battery imports and close to 30% of some transformer and switchgear categories.

  3. Surge in lawyers using AI to generate fake case citations

    Lawyers keep misusing AI to cite fabricated, nonexistent cases, and the penalties courts have imposed have failed to deter them; the number of such incidents surged in 2025. Damien Charlotin, a researcher at HEC Paris, maintains a global database tracking lawyers' misuse of AI. He says he recently received 10 such cases from 10 different courts in a single day, and has recorded more than 1,200 incidents of AI-fabricated case citations to date, with the US leading at 831 and Hong Kong recording 2. Charlotin says courts have recently begun escalating penalties: an Oregon lawyer was ordered to pay $109,700 in fines and litigation costs for misusing AI.

  4. Gentoo GNU/Hurd is not an April Fools' joke

    On April 1, the Gentoo Linux project announced it would adopt GNU Hurd as its primary kernel. This was not entirely a joke: the project really has released a Gentoo GNU/Hurd port. The microkernel-based GNU Hurd is now more than 35 years old, but version 1.0 has never shipped; the latest release is v0.9 from 2016. The Gentoo project says its GNU/Hurd port remains experimental and recommends that curious users run it in the QEMU emulator, though the adventurous can try it directly on hardware.

  5. Microsoft updates terms of service to declare Copilot is for entertainment only

    Microsoft was found to have recently updated Copilot's terms of service to include a disclaimer: Copilot is for entertainment purposes only, makes mistakes, may not work as expected, should not be relied on for important advice, and is used at your own risk. Regular users of AI chatbots likely already know their output is unreliable and needs verification, but because the tools are so convenient, lazy humans have grown less willing to spend time checking it. Microsoft's disclaimer reiterates that AI chatbots are neither companions nor reliable sources of advice: they are error-prone tools that can be enormously helpful one moment and wrong the next.

  6. Renewables account for over 80% of new global generating capacity

    IRENA's latest report shows that renewables accounted for 85.6% of new global generating capacity in 2025, with solar making up three quarters of the additions. New renewable capacity totaled roughly 700 GW, of which solar contributed 511 GW, bringing total installed solar capacity to 2.4 TW, more than 1 TW above either wind or hydro. Because of solar's characteristics, however, 2024 data show it still generated less electricity than wind: solar supplied 7% of global generation, wind 8%, and nuclear 9%. The 2025 figures are not yet published, but given its rapidly growing installed base, solar may already have overtaken wind to become the second-largest carbon-free power source after hydro.

  7. Archaeologists find dice in North America at least 12,000 years old

    Gambling has a longer history than you might think. In the journal American Antiquity, archaeologists report the oldest known dice used for gambling by Native Americans, dating back at least 12,000 years, predating comparable activity in the Old World by six millennia. From dice throws to horse races, all games of chance depend on probability, a relatively counterintuitive concept. Dice, games of chance, and gambling have long been an important part of Native American cultures; the earliest dice appear in Late Pleistocene Folsom strata in Wyoming, Colorado, and New Mexico. The findings suggest ancient Native Americans grasped the basics of chance, randomness, and probability, putting them ahead of the rest of the world in understanding and applying these concepts.

  8. People speak about 300 fewer words a day than the year before

    A study published in Perspectives on Psychological Science analyzed audio data collected from more than 2,000 participants, aged 10-94, between 2005 and 2019. It found that the number of words we speak each day falls by about 300 from one year to the next, which adds up to more than 120,000 fewer words spoken per year. Speaking less means spending less time communicating with others, and it may also mean greater loneliness, which is strongly linked to negative health outcomes. The study found the decline is steepest among younger generations.

  9. AO3 ends its open beta

    The well-known fan-works site Archive of Our Own (AO3) has announced the end of its 17-year open beta. AO3 was founded in September 2008 after pressure from conservative groups led to fan works being deleted and accounts being shut down, prompting the fan community to build a site whose fate it controlled itself. AO3 entered open beta in November 2009 with just 347 users and 6,598 works; 17 years later it has surpassed 10 million users and 17 million works. The operators say the AO3 software has been running stably for a long time and that ending the beta is only a cosmetic change: it does not mean every feature is finished or works perfectly, nor that improvements to AO3 will stop.

  10. Offshore wealth of the richest 0.1% exceeds the total wealth of the poorest half

    According to Oxfam's latest report, the untaxed offshore wealth of the world's richest 0.1% exceeds the total wealth of the poorest half of humanity (4.1 billion people). The report calls for coordinated international action to tax extreme wealth and shut down tax havens. In 2024, $3.55 trillion in untaxed wealth was hidden in tax havens and undeclared accounts, more than France's GDP and more than double the combined GDP of the world's 44 least developed countries. The richest 0.1% hold about 80% of this untaxed offshore wealth, or $2.84 trillion; the richest 0.01% hold half of it, or $1.77 trillion.

  11. The male octopus's mating arm doubles as a sense organ

    The specialized arm that male octopuses use for mating is also a sensory organ capable of detecting the ovarian hormone progesterone. Researchers found chemoreceptors on this arm, known as the hectocotylus. During mating, a male probes the female's mantle in search of the oviduct used for fertilization; once it is located, sperm travel along the hectocotylus and are deposited. But how does the male know when he has found the oviduct? In an experimental setup, researchers coated the insides of tubes with different substances and found that sperm were released only when the small suckers at the tip of the hectocotylus touched progesterone, a hormone produced by the ovaries. Tubes coated with other substances instead "triggered avoidance behavior."

  12. Artemis II's toilet is a milestone for lunar missions

    The Apollo lunar spacecraft had no dedicated toilet; their waste-collection system was loathed by astronauts and once suffered an infamous fecal leak that left the crew chasing feces around the cabin. Astronauts on the Artemis lunar missions now have a dedicated, more comfortable toilet. The Universal Waste Management System (UWMS) gives astronauts handholds for staying stable in microgravity, a system that handles urine and feces at the same time, a unisex urine-collection funnel, and even a door to create a sense of privacy. Collins Aerospace signed a contract with NASA in 2015 to co-develop the UWMS.

  13. Google releases open-weight model Gemma 4

    Google has released Gemma 4, its latest open-weight model, a year after the previous version, Gemma 3. Gemma 4 comes in four variants designed to run on local devices. The two larger ones, a 26B Mixture of Experts model and a 31B dense model, are designed to run unquantized in bfloat16 on an 80GB Nvidia H100 GPU (about 200,000 yuan); quantized to lower precision, they can run on consumer GPUs. The two smaller ones, Effective 2B (E2B) and Effective 4B (E4B), are designed for mobile devices. Google says its Pixel team worked closely with Qualcomm and MediaTek to optimize the small models for smartphones, Raspberry Pi, and Jetson Nano devices. Gemma 4 is licensed under Apache 2.0, which is more flexible about commercial use.
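    The hardware claims in the Gemma 4 item can be sanity-checked with simple arithmetic: weight storage is roughly parameter count times bytes per parameter. A minimal sketch, assuming 2 bytes/param for bfloat16 and ~0.5 bytes/param for 4-bit quantization, and ignoring KV cache and activation memory:

    ```python
    # Back-of-envelope weight-memory estimate for open-weight models.
    # Assumptions (not from the article): bfloat16 = 2 bytes/param,
    # 4-bit quantization ~ 0.5 bytes/param; 1 GB = 1e9 bytes.

    def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
        """Approximate memory needed just to hold the weights, in GB."""
        return params_billions * 1e9 * bytes_per_param / 1e9

    # A 31B dense model in bfloat16 needs ~62 GB of weights, which fits
    # in an 80 GB H100; at 4-bit it drops to ~15.5 GB, within reach of
    # consumer GPUs -- consistent with the digest's claim.
    print(model_memory_gb(31, 2.0))   # 62.0
    print(model_memory_gb(31, 0.5))   # 15.5
    ```

    Real deployments need extra headroom beyond the weights (KV cache, activations), so these figures are a lower bound.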

  14. Artemis II astronauts find two Outlooks on their computer, neither of which works

    Even after leaving Earth, the Artemis II astronauts still need Microsoft software, and just like users on the ground, they find that it frequently breaks. The four astronauts began their 10-day journey to the Moon on April 1 (ET), with the entire trip livestreamed. At around 2 a.m. ET, mission control confirmed a problem with a process-control system and offered remote assistance. The mission's commander, NASA astronaut Reid Wiseman, said on the call that he could see two copies of Microsoft Outlook on the computer, neither of which worked. Mission control in Houston said it would investigate the problem over a remote connection. An hour later, mission control reported that Outlook was usable again; it would show as offline, which is expected behavior.

  15. Amazon in talks to acquire Globalstar to challenge Starlink

    Amazon is in talks to acquire Globalstar to help it compete with SpaceX's Starlink broadband satellite constellation. Apple holds a fifth of Globalstar's shares, so Amazon would also need to negotiate with Apple, adding complexity to any deal. The talks could still fall apart without an agreement being reached. Globalstar, founded in 1991, reached a market value of $9 billion on Wednesday, driven by the acquisition rumors. Apple invested $1.5 billion in Globalstar in 2024 in exchange for its 20% stake.