OrangeBot.AI Digest — 2026-04-11
88 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- How We Broke Top AI Agent Benchmarks: And What Comes Next (rdi.berkeley.edu)
- Every plane you see in the sky – you can now follow it from the cockpit in 3D (flight-viz.com)
- The disturbing white paper Red Hat is trying to erase from the internet (www.osnews.com)
- Small models also found the vulnerabilities that Mythos found (aisle.com)
- Advanced Mac Substitute is an API-level reimplementation of 1980s-era Mac OS (www.v68k.org)
- Surelock: Deadlock-Free Mutexes for Rust (notes.brooklynzelenka.com)
- The future of everything is lies, I guess – Part 5: Annoyances (aphyr.com)
- South Korea introduces universal basic mobile data access (www.theregister.com)
- Cirrus Labs to join OpenAI (cirruslabs.org)
- Bitcoin miners are losing on every coin produced as difficulty drops (www.coindesk.com)
- Polymarket gamblers betting millions on war (www.theguardian.com)
- How Passive Radar Works (www.passiveradar.com)
- Show HN: Pardonned.com – A searchable database of US Pardons
- France's government is ditching Windows for Linux, says US tech a strategic risk (www.xda-developers.com)
- Optimal Strategy for Connect 4 (2swap.github.io)
GitHub Trending(13)
- NousResearch / hermes-agent
- microsoft / markitdown
- coleam00 / Archon
- forrestchang / andrej-karpathy-skills
- multica-ai / multica
- shanraisshan / claude-code-best-practice
- TapXWorld / ChinaTextbook
- OpenBMB / VoxCPM
- shiyu-coder / Kronos
- opendataloader-project / opendataloader-pdf
- HKUDS / DeepTutor
- obra / superpowers
- alexpate / awesome-design-systems
Product Hunt(15)
- MolmoWeb
Open web agents from data to deployment
- SummAgent
Spend less time reading emails, more time making moves
- Buildermark
Measure how much of your code is AI-generated. Open source.
- Claude for Word
Bring Claude natively into your Microsoft Word workflow
- Clicky
AI buddy next to your cursor on Mac—sees, guides, helps you!
- Voicr for Mac
Dictate and get improved or translated text
- uTerminal
A desktop terminal built for day-to-day remote access
- shush
Room-aware noise. Quieter. Smarter. Better REM sleep.
- 1% Better
Visualise the compounding effect of your daily habits
- LaReview
Open-source free next-generation code review
- LinkShell
Control your AI terminal sessions from your phone.
- Claude Code ultraplan
Claude Code command that plans your codebase in the cloud
- aperture
hiring is broken. we're building the fix.
- Tech Marketing Framework
Forkable GTM system for builders struggling with marketing
- Vequil
Deploy AI agent teams to trade prediction markets
Hugging Face(15)
- SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
Large language model (LLM) agents such as OpenClaw rely on reusable skills to perform complex tasks, yet these skills remain largely static after deployment. As a result, similar workflows, tool usage patterns, and failure modes are repeatedly rediscovered across users, preventing the system from improving with experience. While interactions from different users provide complementary signals about when a skill works or fails, existing systems lack a mechanism to convert such heterogeneous experiences into reliable skill updates. To address these issues, we present SkillClaw, a framework for collective skill evolution in multi-user agent ecosystems, which treats cross-user and over-time interactions as the primary signal for improving skills. SkillClaw continuously aggregates trajectories generated during use and processes them with an autonomous evolver, which identifies recurring behavioral patterns and translates them into updates to the skill set by refining existing skills or extending them with new capabilities. The resulting skills are maintained in a shared repository and synchronized across users, allowing improvements discovered in one context to propagate system-wide while requiring no additional effort from users. By integrating multi-user experience into ongoing skill updates, SkillClaw enables cross-user knowledge transfer and cumulative capability improvement, and experiments on WildClawBench show that, with limited interaction and feedback, it significantly improves the performance of Qwen3-Max in real-world agent scenarios.
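The evolver loop the abstract describes, aggregating many users' trajectories and promoting recurring patterns into skill updates, can be pictured with a toy sketch. All names, the trajectory schema, and the `min_support` promotion rule below are my assumptions, not SkillClaw's actual design:

```python
from collections import Counter

def evolve_skills(skill_repo, trajectories, min_support=3):
    """Toy collective evolver: count recurring failure patterns across
    users' trajectories and, once a pattern has enough support, record
    it on the relevant skill in the shared repository."""
    patterns = Counter(
        (t["skill"], t["failure"]) for t in trajectories if t.get("failure")
    )
    for (skill, failure), count in patterns.items():
        if count >= min_support and skill in skill_repo:
            # One update per recurring pattern, visible to every user
            # who syncs against the shared repository.
            skill_repo[skill].setdefault("learned_pitfalls", []).append(failure)
    return skill_repo
```

The point of the aggregation step is that a pitfall discovered three times across different users is promoted once, and then every user benefits without re-discovering it.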
- Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability
A prevailing narrative in LLM post-training holds that supervised finetuning (SFT) memorizes while reinforcement learning (RL) generalizes. We revisit this claim for reasoning SFT with long chain-of-thought (CoT) supervision and find that cross-domain generalization is not absent but conditional, jointly shaped by optimization dynamics, training data, and base-model capability. Some reported failures are under-optimization artifacts: cross-domain performance first degrades before recovering and improving with extended training (a dip-and-recovery pattern), so short-training checkpoints can underestimate generalization. Data quality and structure both matter: low-quality solutions broadly hurt generalization, while verified long-CoT traces yield consistent cross-domain gains. Model capability is essential: stronger models internalize transferable procedural patterns (e.g., backtracking) even from a toy arithmetic game, while weaker ones imitate surface verbosity. This generalization is asymmetric, however: reasoning improves while safety degrades, reframing the question from whether reasoning SFT generalizes to under what conditions and at what cost.
- HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents
We introduce HY-Embodied-0.5, a family of foundation models specifically designed for real-world embodied agents. To bridge the gap between general Vision-Language Models (VLMs) and the demands of embodied agents, our models are developed to enhance the core capabilities required by embodied intelligence: spatial and temporal visual perception, alongside advanced embodied reasoning for prediction, interaction, and planning. The HY-Embodied-0.5 suite comprises two primary variants: an efficient model with 2B activated parameters designed for edge deployment, and a powerful model with 32B activated parameters targeted for complex reasoning. To support the fine-grained visual perception essential for embodied tasks, we adopt a Mixture-of-Transformers (MoT) architecture to enable modality-specific computing. By incorporating latent tokens, this design effectively enhances the perceptual representation of the models. To improve reasoning capabilities, we introduce an iterative, self-evolving post-training paradigm. Furthermore, we employ on-policy distillation to transfer the advanced capabilities of the large model to the smaller variant, thereby maximizing the performance potential of the compact model. Extensive evaluations across 22 benchmarks, spanning visual perception, spatial reasoning, and embodied understanding, demonstrate the effectiveness of our approach. Our MoT-2B model outperforms similarly sized state-of-the-art models on 16 benchmarks, while the 32B variant achieves performance comparable to frontier models such as Gemini 3.0 Pro. In downstream robot control experiments, we leverage our robust VLM foundation to train an effective Vision-Language-Action (VLA) model, achieving compelling results in real-world physical evaluations. Code and models are open-sourced at https://github.com/Tencent-Hunyuan/HY-Embodied.
- When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models
Text-to-video diffusion models have enabled open-ended video synthesis, but often struggle with generating the correct number of objects specified in a prompt. We introduce NUMINA, a training-free identify-then-guide framework for improved numerical alignment. NUMINA identifies prompt-layout inconsistencies by selecting discriminative self- and cross-attention heads to derive a countable latent layout. It then refines this layout conservatively and modulates cross-attention to guide regeneration. On the introduced CountBench, NUMINA improves counting accuracy by up to 7.4% on Wan2.1-1.3B, and by 4.9% and 5.5% on 5B and 14B models, respectively. Furthermore, CLIP alignment is improved while maintaining temporal consistency. These results demonstrate that structural guidance complements seed search and prompt enhancement, offering a practical path toward count-accurate text-to-video diffusion. The code is available at https://github.com/H-EmbodVis/NUMINA.
- ClawBench: Can AI Agents Complete Everyday Online Tasks?
AI agents may be able to automate your inbox, but can they automate other routine aspects of your life? Everyday online tasks offer a realistic yet unsolved testbed for evaluating the next generation of AI agents. To this end, we introduce ClawBench, an evaluation framework of 153 simple tasks that people need to accomplish regularly in their lives and work, spanning 144 live platforms across 15 categories, from completing purchases and booking appointments to submitting job applications. These tasks require demanding capabilities beyond existing benchmarks, such as obtaining relevant information from user-provided documents, navigating multi-step workflows across diverse platforms, and performing write-heavy operations like filling in many detailed forms correctly. Unlike existing benchmarks that evaluate agents in offline sandboxes with static pages, ClawBench operates on production websites, preserving the full complexity, dynamic nature, and challenges of real-world web interaction. A lightweight interception layer captures and blocks only the final submission request, ensuring safe evaluation without real-world side effects. Our evaluations of 7 frontier models show that both proprietary and open-source models can complete only a small portion of these tasks. For example, Claude Sonnet 4.6 achieves only 33.3%. Progress on ClawBench brings us closer to AI agents that can function as reliable general-purpose assistants.
- MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping
In this paper, we introduce MegaStyle, a novel and scalable data curation pipeline that constructs an intra-style consistent, inter-style diverse and high-quality style dataset. We achieve this by leveraging the consistent text-to-image style mapping capability of current large generative models, which can generate images in the same style from a given style description. Building on this foundation, we curate a diverse and balanced prompt gallery with 170K style prompts and 400K content prompts, and generate a large-scale style dataset MegaStyle-1.4M via content-style prompt combinations. With MegaStyle-1.4M, we propose style-supervised contrastive learning to fine-tune a style encoder MegaStyle-Encoder for extracting expressive, style-specific representations, and we also train a FLUX-based style transfer model MegaStyle-FLUX. Extensive experiments demonstrate the importance of maintaining intra-style consistency, inter-style diversity, and high quality in a style dataset, as well as the effectiveness of the proposed MegaStyle-1.4M. Moreover, when trained on MegaStyle-1.4M, MegaStyle-Encoder and MegaStyle-FLUX provide reliable style similarity measurement and generalizable style transfer, making a significant contribution to the style transfer community. More results are available at our project website https://jeoyal.github.io/MegaStyle/.
- OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks
Group Relative Policy Optimization (GRPO) has emerged as the de facto Reinforcement Learning (RL) objective driving recent advancements in Multimodal Large Language Models. However, extending this success to open-source multimodal generalist models remains heavily constrained by two primary challenges: the extreme variance in reward topologies across diverse visual tasks, and the inherent difficulty of balancing fine-grained perception with multi-step reasoning capabilities. To address these issues, we introduce Gaussian GRPO (G^2RPO), a novel RL training objective that replaces standard linear scaling with non-linear distributional matching. By mathematically forcing the advantage distribution of any given task to strictly converge to a standard normal distribution, N(0,1), G^2RPO theoretically ensures inter-task gradient equity, mitigates vulnerabilities to heavy-tail outliers, and offers symmetric updates for positive and negative rewards. Leveraging the enhanced training stability provided by G^2RPO, we introduce two task-level shaping mechanisms to seamlessly balance perception and reasoning. First, response length shaping dynamically elicits extended reasoning chains for complex queries while enforcing direct outputs to bolster visual grounding. Second, entropy shaping tightly bounds the model's exploration zone, effectively preventing both entropy collapse and entropy explosion. Integrating these methodologies, we present OpenVLThinkerV2, a highly robust, general-purpose multimodal model. Extensive evaluations across 18 diverse benchmarks demonstrate its superior performance over strong open-source and leading proprietary frontier models.
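The distributional matching at the core of G^2RPO, pushing each task's advantage distribution toward N(0,1), can be illustrated with a rank-based inverse-normal transform, a standard statistical device. This is a sketch of the general technique only; the paper's exact estimator is not reproduced here, and the function name is mine:

```python
from statistics import NormalDist

def gaussianize_advantages(rewards):
    """Map a group's rewards to advantages whose empirical distribution
    approximates N(0, 1) via a rank-based inverse-normal transform.
    Ties are broken by position, which is acceptable for a sketch."""
    n = len(rewards)
    order = sorted(range(n), key=lambda i: rewards[i])
    nd = NormalDist()
    adv = [0.0] * n
    for rank, i in enumerate(order):
        # Blom plotting position keeps each quantile strictly in (0, 1).
        p = (rank + 0.625) / (n + 0.25)
        adv[i] = nd.inv_cdf(p)
    return adv
```

Because every task's advantages land on the same N(0,1) scale regardless of its raw reward distribution, no single task's heavy-tailed rewards can dominate the gradient, which is the inter-task equity the abstract claims.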
- LPM 1.0: Video-based Character Performance Model
Performance, the externalization of intent, emotion, and personality through visual, vocal, and temporal behavior, is what makes a character alive. Learning such performance from video is a promising alternative to traditional 3D pipelines. However, existing video models struggle to jointly achieve high expressiveness, real-time inference, and long-horizon identity stability, a tension we call the performance trilemma. Conversation is the most comprehensive performance scenario, as characters simultaneously speak, listen, react, and emote while maintaining identity over time. To address this, we present LPM 1.0 (Large Performance Model), focusing on single-person full-duplex audio-visual conversational performance. Concretely, we build a multimodal human-centric dataset through strict filtering, speaking-listening audio-video pairing, performance understanding, and identity-aware multi-reference extraction; train a 17B-parameter Diffusion Transformer (Base LPM) for highly controllable, identity-consistent performance through multimodal conditioning; and distill it into a causal streaming generator (Online LPM) for low-latency, infinite-length interaction. At inference, given a character image with identity-aware references, LPM 1.0 generates listening videos from user audio and speaking videos from synthesized audio, with text prompts for motion control, all at real-time speed with identity-stable, infinite-length generation. LPM 1.0 thus serves as a visual engine for conversational agents, live streaming characters, and game NPCs. To systematically evaluate this setting, we propose LPM-Bench, the first benchmark for interactive character performance. LPM 1.0 achieves state-of-the-art results across all evaluated dimensions while maintaining real-time inference.
- DMax: Aggressive Parallel Decoding for dLLMs
We present DMax, a new paradigm for efficient diffusion language models (dLLMs). It mitigates error accumulation in parallel decoding, enabling aggressive decoding parallelism while preserving generation quality. Unlike conventional masked dLLMs that decode through a binary mask-to-token transition, DMax reformulates decoding as a progressive self-refinement from mask embeddings to token embeddings. At the core of our approach is On-Policy Uniform Training, a novel training strategy that efficiently unifies masked and uniform dLLMs, equipping the model to recover clean tokens from both masked inputs and its own erroneous predictions. Building on this foundation, we further propose Soft Parallel Decoding. We represent each intermediate decoding state as an interpolation between the predicted token embedding and the mask embedding, enabling iterative self-revising in embedding space. Extensive experiments across a variety of benchmarks demonstrate the effectiveness of DMax. Compared with the original LLaDA-2.0-mini, our method improves TPF on GSM8K from 2.04 to 5.47 while preserving accuracy. On MBPP, it increases TPF from 2.71 to 5.86 while maintaining comparable performance. On two H200 GPUs, our model achieves an average of 1,338 TPS at batch size 1. Code is available at: https://github.com/czg1225/DMax
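The "interpolation between the predicted token embedding and the mask embedding" that defines DMax's soft decoding states can be pictured as a per-position convex blend. A minimal sketch, assuming a scalar per-position confidence weight (the function name and the weighting scheme are my assumptions, not DMax's exact formulation):

```python
def soft_decode_state(mask_emb, token_emb, confidence):
    """Blend each position's predicted token embedding with the mask
    embedding: confidence 0.0 keeps the position fully masked,
    1.0 commits fully to the predicted token. Rows are positions,
    columns are embedding dimensions."""
    return [
        [c * t + (1.0 - c) * m for m, t in zip(m_row, t_row)]
        for c, m_row, t_row in zip(confidence, mask_emb, token_emb)
    ]
```

Keeping low-confidence positions close to the mask embedding is what lets the next refinement pass revise them instead of being locked into an early wrong token, which is how this framing mitigates error accumulation under aggressive parallelism.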
- KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation
Personalized mobile agents that infer user preferences and calibrate proactive assistance hold great promise as everyday digital assistants, yet existing benchmarks fail to capture what this requires. Prior work evaluates preference recovery from static histories or intent prediction from fixed contexts. Neither tests whether an agent can elicit missing preferences through interaction, nor whether it can decide when to intervene, seek consent, or remain silent in a live GUI environment. We introduce KnowU-Bench, an online benchmark for personalized mobile agents built on a reproducible Android emulation environment, covering 42 general GUI tasks, 86 personalized tasks, and 64 proactive tasks. Unlike prior work that treats user preferences as static context, KnowU-Bench hides the user profile from the agent and exposes only behavioral logs, forcing genuine preference inference rather than context lookup. To support multi-turn preference elicitation, it instantiates an LLM-driven user simulator grounded in structured profiles, enabling realistic clarification dialogues and proactive consent handling. Beyond personalization, KnowU-Bench provides comprehensive evaluation of the complete proactive decision chain, including grounded GUI execution, consent negotiation, and post-rejection restraint, evaluated through a hybrid protocol combining rule-based verification with LLM-as-a-Judge scoring. Our experiments reveal a striking degradation: agents that excel at explicit task execution fall below 50% under vague instructions requiring user preference inference or intervention calibration, even for frontier models like Claude Sonnet 4.6. The core bottlenecks are not GUI navigation but preference acquisition and intervention calibration, exposing a fundamental gap between competent interface operation and trustworthy personal assistance.
- Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
Large language model (LLM) agents are increasingly built less by changing model weights than by reorganizing the runtime around them. Capabilities that earlier systems expected the model to recover internally are now externalized into memory stores, reusable skills, interaction protocols, and the surrounding harness that makes these modules reliable in practice. This paper reviews that shift through the lens of externalization. Drawing on the idea of cognitive artifacts, we argue that agent infrastructure matters not merely because it adds auxiliary components, but because it transforms hard cognitive burdens into forms that the model can solve more reliably. Under this view, memory externalizes state across time, skills externalize procedural expertise, protocols externalize interaction structure, and harness engineering serves as the unification layer that coordinates them into governed execution. We trace a historical progression from weights to context to harness, analyze memory, skills, and protocols as three distinct but coupled forms of externalization, and examine how they interact inside a larger agent system. We further discuss the trade-off between parametric and externalized capability, identify emerging directions such as self-evolving harnesses and shared agent infrastructure, and discuss open challenges in evaluation, governance, and the long-term co-evolution of models and external infrastructure. The result is a systems-level framework for explaining why practical agent progress increasingly depends not only on stronger models, but on better external cognitive infrastructure.
- Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models
The advent of agentic multimodal models has empowered systems to actively interact with external environments. However, current agents suffer from a profound meta-cognitive deficit: they struggle to arbitrate between leveraging internal knowledge and querying external utilities. Consequently, they frequently fall prey to blind tool invocation, resorting to reflexive tool execution even when queries are resolvable from the raw visual context. This pathological behavior precipitates severe latency bottlenecks and injects extraneous noise that derails sound reasoning. Existing reinforcement learning protocols attempt to mitigate this via a scalarized reward that penalizes tool usage. Yet, this coupled formulation creates an irreconcilable optimization dilemma: an aggressive penalty suppresses essential tool use, whereas a mild penalty is entirely subsumed by the variance of the accuracy reward during advantage normalization, rendering it impotent against tool overuse. To transcend this bottleneck, we propose HDPO, a framework that reframes tool efficiency from a competing scalar objective to a strictly conditional one. By eschewing reward scalarization, HDPO maintains two orthogonal optimization channels: an accuracy channel that maximizes task correctness, and an efficiency channel that enforces execution economy exclusively within accurate trajectories via conditional advantage estimation. This decoupled architecture naturally induces a cognitive curriculum, compelling the agent to first master task resolution before refining its self-reliance. Extensive evaluations demonstrate that our resulting model, Metis, reduces tool invocations by orders of magnitude while simultaneously elevating reasoning accuracy.
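The decoupling the abstract describes, an accuracy channel plus an efficiency channel computed only within accurate trajectories, can be sketched as follows. The function name, the trajectory representation, and the mean-centered estimator are my assumptions, not HDPO's published formulas:

```python
import statistics

def decoupled_advantages(correct, tool_calls):
    """Sketch of two orthogonal advantage channels for a group of
    rollouts: `correct` flags task success, `tool_calls` counts tool
    invocations per rollout."""
    acc = [1.0 if c else 0.0 for c in correct]
    mu = statistics.mean(acc)
    acc_adv = [a - mu for a in acc]

    # Efficiency is strictly conditional: incorrect rollouts get zero
    # efficiency signal, so the economy pressure can never suppress
    # tool use that is still needed to get the answer right.
    eff_adv = [0.0] * len(correct)
    idx = [i for i, c in enumerate(correct) if c]
    if len(idx) > 1:
        mean_cost = statistics.mean(float(tool_calls[i]) for i in idx)
        for i in idx:
            eff_adv[i] = mean_cost - tool_calls[i]  # frugal-but-correct rewarded
    return acc_adv, eff_adv
```

Early in training, when few rollouts are correct, the efficiency channel is mostly silent and accuracy dominates; as correctness improves, the efficiency signal activates, which is the "cognitive curriculum" effect the abstract names.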
- MolmoWeb: Open Visual Web Agent and Open Data for the Open Web
Web agents--autonomous systems that navigate and execute tasks on the web on behalf of users--have the potential to transform how people interact with the digital world. However, the most capable web agents today rely on proprietary models with undisclosed training data and recipes, limiting scientific understanding, reproducibility, and community-driven progress. We believe agents for the open web should be built in the open. To this end, we introduce (1) MolmoWebMix, a large and diverse mixture of browser task demonstrations and web-GUI perception data and (2) MolmoWeb, a family of fully open multimodal web agents. Specifically, MolmoWebMix combines over 100K synthetic task trajectories from multiple complementary generation pipelines with 30K+ human demonstrations, atomic web-skill trajectories, and GUI perception data, including referring expression grounding and screenshot question answering. MolmoWeb agents operate as instruction-conditioned visual-language action policies: given a task instruction and a webpage screenshot, they predict the next browser action, requiring no access to HTML, accessibility trees, or specialized APIs. Available in 4B and 8B sizes, MolmoWeb agents achieve state-of-the-art results on browser-use benchmarks like WebVoyager, Online-Mind2Web, and DeepShop, outperforming similar-scale open-weight-only models such as Fara-7B, UI-Tars-1.5-7B, and Holo1-7B. MolmoWeb-8B also surpasses set-of-marks (SoM) agents built on much larger closed frontier models like GPT-4o. We further demonstrate consistent gains through test-time scaling via parallel rollouts with best-of-N selection, achieving 94.7% and 60.5% pass@4 (compared to 78.2% and 35.3% pass@1) on WebVoyager and Online-Mind2Web respectively. We will release model checkpoints, training data, code, and a unified evaluation harness to enable reproducibility and accelerate open research on web agents.
- OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence
Spatial understanding is a fundamental cornerstone of human-level intelligence. Nonetheless, current research predominantly focuses on domain-specific data production, leaving a critical void: the absence of a principled, open-source engine capable of fully unleashing the potential of high-quality spatial data. To bridge this gap, we elucidate the design principles of a robust data generation system and introduce OpenSpatial -- an open-source data engine engineered for high quality, extensive scalability, broad task diversity, and optimized efficiency. OpenSpatial adopts 3D bounding boxes as the fundamental primitive to construct a comprehensive data hierarchy across five foundational tasks: Spatial Measurement (SM), Spatial Relationship (SR), Camera Perception (CP), Multi-view Consistency (MC), and Scene-Aware Reasoning (SAR). Leveraging this scalable infrastructure, we curate OpenSpatial-3M, a large-scale dataset comprising 3 million high-fidelity samples. Extensive evaluations demonstrate that versatile models trained on our dataset achieve state-of-the-art performance across a wide spectrum of spatial reasoning benchmarks. Notably, the best-performing model exhibits a substantial relative average improvement of 19 percent. Furthermore, we provide a systematic analysis of how data attributes influence spatial perception. By open-sourcing both the engine and the 3M-scale dataset, we provide a robust foundation to accelerate future research in spatial intelligence.
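Using 3D bounding boxes as the primitive means question-answer samples can be derived mechanically from box geometry. A toy illustration of one Spatial Relationship (SR) template; the function name, box layout `(cx, cy, cz, w, h, d)`, and the left/right convention are my assumptions, and OpenSpatial's real task templates are certainly richer:

```python
def spatial_relation_sample(name_a, box_a, name_b, box_b):
    """Generate one SR question-answer pair from two 3D bounding boxes
    given as (cx, cy, cz, w, h, d), comparing centers along x."""
    rel = "to the left of" if box_a[0] < box_b[0] else "to the right of"
    question = f"Is the {name_a} to the left or to the right of the {name_b}?"
    answer = f"The {name_a} is {rel} the {name_b}."
    return question, answer
```

Because the answer is computed from geometry rather than annotated by hand, a template like this scales to millions of samples, which is what makes a 3M-scale dataset feasible from such an engine.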
- OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering
To extend the reinforcement learning post-training paradigm to omni-modal models for concurrently bolstering video-audio understanding and collaborative reasoning, we propose OmniJigsaw, a generic self-supervised framework built upon a temporal reordering proxy task. Centered on the chronological reconstruction of shuffled audio-visual clips, this paradigm strategically orchestrates visual and auditory signals to compel cross-modal integration through three distinct strategies: Joint Modality Integration, Sample-level Modality Selection, and Clip-level Modality Masking. Recognizing that the efficacy of such proxy tasks is fundamentally tied to puzzle quality, we design a two-stage coarse-to-fine data filtering pipeline, which facilitates the efficient adaptation of OmniJigsaw to massive unannotated omni-modal data. Our analysis reveals a "bi-modal shortcut phenomenon" in joint modality integration and demonstrates that fine-grained clip-level modality masking mitigates this issue while outperforming sample-level modality selection. Extensive evaluations on 15 benchmarks show substantial gains in video, audio, and collaborative reasoning, validating OmniJigsaw as a scalable paradigm for self-supervised omni-modal learning.
Techmeme(15)
- Survey of 6,698 people across six EU countries: around 84% said they don't trust US tech companies with their personal data; 93% don't trust Chinese companies (Ellen O'Regan/Politico)
Europe is rolling out measures to keep data local and reduce reliance on foreign tech.
- Sources: Anthropic met with Christian leaders in March to seek input on Claude's moral and spiritual development and if it could be considered a "child of God" (Washington Post)
The artificial intelligence company asked religious leaders for guidance on building a moral chatbot.
- A wave of top AI researchers returned from the US to China in the past year, driven by better pay, quality of life, and a more restrictive US immigration system (Zijing Wu/Financial Times)
Engineers and scientists return for better pay and quality of life as US grows more hostile.
- Google says Polymarket bets "briefly appeared in Google News in error", after the bets appeared alongside news articles in the "For You" section (Terrence O'Brien/The Verge)
Links to bets on world events were appearing alongside legitimate news organizations.
- Japan approves an additional $4B in subsidies to Rapidus to bankroll the chipmaker's work for Fujitsu, taking the total state investment and fees to $16.3B (Mari Kiyohara/Bloomberg)
Japan approved ¥631.5 billion ($4 billion) in additional subsidies to quicken Rapidus Corp.'s entry into the high-stakes AI chipmaking arena …
- An investigation details Webloc, an ad-based geo surveillance system providing access to a constantly updated stream of records from up to 500M mobile devices (The Citizen Lab)
Location data collected from mobile apps and digital advertising can reveal habits, interests and almost any other aspect of someone's life.
- How AI is transforming golf: optimizing course operations, virtual assistants handling tee time bookings, and AI instructor apps improving player performance (Bradley S. Klein/Wall Street Journal)
From reserving a tee time to fending off turf disease, artificial intelligence is putting the game under an algorithmic microscope.
- Court filing: OpenAI says Elon Musk's recent amendments to his OpenAI lawsuit are a "legal ambush", calling them "legally improper and factually unsupported" (Robert Burnson/Bloomberg)
OpenAI says Elon Musk has suddenly changed direction on what he's seeking in his lawsuit against the startup …
- Ramp data: 30.6% of US businesses paid for Anthropic's tools in March, up from 24.4% in February; OpenAI's US business adoption remained nearly flat MoM at ~35% (Clara Murray/Financial Times)
Divergence reflects company's recent rapid growth owing to strong interest in its Claude Code products.
- Indian IT giant TCS reports Q4 sales up 9.7% YoY to $7.63B, net profit up 12.2% to $1.48B, both above est., and says new AI models did not hurt services demand (Reuters)
Tata Consultancy Services (TCS.NS) reported better-than-expected quarterly results on Thursday and said that new artificial intelligence models …
- UK activist investor Palliser has built a stake in Ajinomoto, urging it to raise prices for its ABF, a key material used to form advanced chipmaking substrates (Yang Jie/Wall Street Journal)
Its shares surged in February and are up over 40% year to date.
- Alat, a $100B Saudi Arabia PIF-backed electronics manufacturing fund, has removed CEO Amit Midha; sources say it has dropped plans to invest in chip production (Matthew Martin/Semafor)
Matthew Martin / Semafor : Alat, a $100B Saudi Arabia PIF-backed electronics manufacturing fund, has removed CEO Amit Midha; sources say it has dropped plans to invest in chip production — Saudi Arabia's sovereign wealth fund has dismissed Alat CEO Amit Midha — a former Dell executive hired three years ago to run …
- OpenAI says a GitHub workflow used to sign its macOS apps downloaded a malicious Axios library on March 31, but no user data or internal system was compromised (Sam Sabin/Axios)
Sam Sabin / Axios : OpenAI says a GitHub workflow used to sign its macOS apps downloaded a malicious Axios library on March 31, but no user data or internal system was compromised — OpenAI said Friday that it found evidence that one of its internal tools downloaded a compromised update from a recently infected, legitimate open-source software library.
- Sources: three senior executives who helped launch OpenAI's Stargate initiative are leaving the company and joining Meta (Bloomberg)
Bloomberg : Sources: three senior executives who helped launch OpenAI's Stargate initiative are leaving the company and joining Meta — Three key players in OpenAI's massive effort to set up hundreds of billions of dollars' worth of artificial intelligence data center capacity are joining Meta Platforms Inc. …
- The US CFTC says a district court judge granted its request for a temporary restraining order barring Arizona from continuing its criminal case against Kalshi (Jack Queen/Reuters)
Jack Queen / Reuters : The US CFTC says a district court judge granted its request for a temporary restraining order barring Arizona from continuing its criminal case against Kalshi — A federal judge on Friday blocked Arizona from continuing its criminal case against prediction market Kalshi …
Solidot(15)
- Red Hat lays off its entire China engineering team
IBM subsidiary Red Hat has laid off its entire China engineering team, moving most of the positions to India. A user identifying himself as a principal software engineer at Red Hat China posted on Hacker News that he woke up on Thursday to find he could no longer log in to the VPN, and that his access to various other services had also been revoked; the CTO later informed them the company was shifting its business focus to an Asia-Pacific hub. The layoffs affected 300-500 people. According to a memo from Red Hat CTO Chris Wright, Red Hat now regards India, and no longer China, as a key location, so it will cease engineering activities in China and move most of the work to India. IBM previously said its headcount in India exceeds that in the US, with a global workforce of 264,000.
- What happens if you install "all" the extensions in Firefox
The official Firefox add-ons store hosts 84,000 extensions. What happens if you download and install every single one? Someone has completed this nearly impossible task for us. The extensions total 49.3 GB, averaging 584.9 kB each. The largest is dmitlichess at 196.3 MB, containing over 2,000 audio files, followed by (Unoffical) ReactBot Web at 184.9 MB, Eric’s Thumbnail Seasoning! at 146.6 MB, Animal Forest: PG BGM at 137.4 MB, and so on. The smallest is theTabs-saver at 7,518 bytes, containing no code. The extension demanding the most permissions is FalscheLaden, which requests 3,695 permissions yet has no active users; Google Dark Theme requests 2,675 permissions and has 1,687 users. Developer Dr. B used AI tools to build 84 extensions whose README.md files mention Grok 3. 34.3% of extensions have no daily active users, while 0.7% have more than 10,000; 76.7% are open source. The author found that the browser hung once 65,335 extensions were loaded. After managing to load all 84,000, the browser was essentially unusable; opening the add-ons page about:addons, for example, took 6 hours.
- Global artificial nighttime brightness rose 16% in 8 years
According to a study published in Nature, analysis of satellite observations shows that total global artificial nighttime brightness increased by 16% between 2014 and 2022. The change was not uniform: different regions brightened and dimmed in a patchwork pattern. In 2022 the US had the highest total nighttime brightness, followed by China, India, Canada, and Brazil. The growth came mainly from accelerating urbanization, infrastructure expansion, and rural electrification, with the largest increases in sub-Saharan Africa and Southeast Asia. Dimming fell into two categories: abrupt dimming, mostly caused by natural disasters, grid failures, and armed conflict; and gradual dimming, mostly associated with energy-saving policies and measures to reduce light pollution, a trend seen in parts of Europe.
- Suspect arrested for throwing an incendiary device at Sam Altman's residence
San Francisco police have arrested a suspect accused of throwing an incendiary device at the home of OpenAI CEO Sam Altman; the suspect also made threats outside OpenAI's headquarters in San Francisco's Mission Bay. In an internal statement to employees, OpenAI said that at around 3:45am PT on Friday, the suspect approached Sam's residence and threw an incendiary device, which landed nearby and went out. No one was injured and damage was minor. Shortly afterward, security spotted a person matching the suspect's description outside the MB1 headquarters building. OpenAI told employees that security may be increased, that offices remain open as usual, and that employees should not let anyone tailgate into the building.
- Artemis II astronauts return to Earth
The four astronauts on NASA's Artemis II lunar flyby mission splashed down off the coast of San Diego at 5:07 p.m. PDT on April 10. NASA astronauts Reid Wiseman, Victor Glover, and Christina Koch, along with Canadian CSA astronaut Jeremy Hansen, were flown by helicopter to the USS John P. Murtha for initial medical checks and are expected to return to NASA's Johnson Space Center in Houston on April 11. The crew flew a total of 694,481 miles, breaking the record for the farthest distance traveled by humans, set by the Apollo 13 astronauts in 1970.
- PFAS detected in Argentina's penguin colonies
During the 2022-2024 penguin breeding seasons, scientists from UC Davis fitted silicone passive samplers to the feet of 54 Magellanic penguins living on Argentina's Patagonian coast, collecting samples of the water, air, and surface chemicals the penguins came into contact with over several days of activity. The recovered samplers were sent to the University at Buffalo in New York for analysis. Even in this remote region, over 90% of the samples tested positive for PFAS. PFAS stands for per- and polyfluoroalkyl substances, known as "forever chemicals" because they do not break down naturally.
- FBI used iPhone notification data to recover deleted Signal messages
In a recent case involving fireworks and property damage at the ICE Prairieland Detention Facility in Alvarado, Texas, the FBI recovered deleted Signal messages from the notification database stored on an iPhone. Signal uses end-to-end encryption, so normally no one other than the sender and recipient can read a message, but the iPhone's notification feature can show message previews, and that information is also stored in the notification database. To prevent messages from being previewed and saved, Signal users can disable previews, or change the iPhone's Settings > Notifications > notification content setting to show "name only" or "no name or content". In this case the defendant had not changed those settings, so the messages could be recovered even after the app was uninstalled.
- French government to migrate workstations from Windows to Linux
After Microsoft complied with unreasonable orders from the US president, a growing number of EU countries have begun strengthening their digital sovereignty, moving away from the software giant and embracing open source. The French prime minister has ordered the interministerial digital directorate (DINUM) to take steps to reduce dependence on the US. DINUM has announced its first target: replacing Windows. DINUM, which oversees the deployment of IT equipment and services across government ministries, announced that workstation operating systems will migrate from Windows to Linux. Beyond workstations, collaboration tools, antivirus software, AI, databases, and virtualization will also be considered for migration to non-US solutions.
- EFF quits X
The Electronic Frontier Foundation (EFF) has announced it is leaving X/Twitter, likely because the platform's algorithm under Elon Musk favors right-wing content, sharply reducing the reach of the left-leaning digital rights group. EFF says it posts 5-10 tweets a day on X/Twitter; in 2018 those tweets drew 50-100 million impressions per month, but in 2024 its 2,500 tweets drew only 2 million impressions per month, and in 2025 its 1,500 tweets drew only 13 million impressions for the entire year. Today an EFF tweet gets just 3% of the reach it had 7 years ago. EFF says that having accounts on Facebook, Instagram, YouTube, and TikTok does not mean it endorses those platforms; it maintains them to reach their users. It will continue the fight via Bluesky, Mastodon, LinkedIn, Instagram, TikTok, Facebook, YouTube, and eff.org, just no longer via X.
- How NASA built Artemis II's fault-tolerant computers
The computer Apollo astronauts used to land on the Moon had a 1 MHz processor and 4 kilobytes of erasable memory; its capabilities were limited, and the spacecraft's critical environmental and power controls were still handled manually or electromechanically. Artemis II's Orion spacecraft, by contrast, carries the most fault-tolerant computer system to date. To cope with space radiation, the spacecraft has two Vehicle Management Computers, each containing two Flight Control Modules, four FCMs in total, and each FCM has a pair of self-checking processors. Eight CPUs run the flight software in parallel. Self-checking means that if a CPU errs due to a radiation event, the system immediately detects the error and reacts. The system also uses triple-modular-redundant memory that automatically corrects single-bit errors on every read. The network is triple-redundant as well, and all network switches use a self-checking strategy. The spacecraft additionally carries a completely independent Backup Flight Software (BFS) system that uses different hardware, runs a different operating system, and uses an independently developed, simplified version of the flight software. This design is known as dissimilar redundancy.
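The triple-modular-redundancy idea above (three copies of a word, bitwise majority vote, single-fault masking) can be sketched in a few lines. This is a minimal illustration of the general technique, not Orion's actual flight software; `tmr_vote` is a hypothetical name.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies of a word.

    Any single corrupted copy is outvoted by the other two, which is
    how triple-modular-redundant (TMR) memory masks a single-bit upset.
    """
    return (a & b) | (a & c) | (b & c)

# A radiation upset flips one bit in a single copy; the vote still
# recovers the original word.
word = 0b1010_1100
upset = word ^ 0b0000_1000  # single-bit flip in one copy
assert tmr_vote(word, word, upset) == word
```

Real TMR memory applies this vote in hardware on every read and writes the corrected value back, so transient upsets do not accumulate.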
- FreeBSD publishes list of compatible laptop models
In Q4 2024 the FreeBSD Foundation launched its Laptop Support & Usability Project, dedicated to improving the distribution's compatibility with modern laptops and addressing problems with Wi-Fi, graphics, audio, the installer, and sleep states. The project has made substantial progress and has now published a list of compatible laptop models, with 10 models earning a near-perfect 8/8 score: Lenovo ThinkPad X270, ASUS TUF Gaming F15 FX507VU_FX507VU, HP EliteBook 845 G7 Notebook PC, Lenovo IdeaPad 5 15ALC05, Framework Laptop 13 (13th Gen Intel Core), Lenovo Yoga 11e, Framework Laptop 13 (AMD Ryzen 7040 Series), Lenovo ThinkPad T490, Framework Laptop 16 (AMD Ryzen 7040 Series), and the Aspire A315-24PT.
- Japan restricts gene editing of human fertilized eggs
Japan's cabinet has approved a bill regulating genome-edited embryos, which, with penalties attached, bans research and treatment that uses genome-editing technology to alter the genes of a human fertilized egg and transfers it into a human or animal uterus for the purpose of reproduction. The move aims to prevent the birth of gene-edited babies. Genome editing of fertilized eggs could help prevent hereditary diseases, but it also carries technical limits and risks, such as unintended effects. There are also concerns about attempts to produce "designer babies" engineered for desired height, appearance, or athletic ability. Under current national guidelines, returning a genome-edited fertilized egg to a human uterus is partially prohibited, but violations carry no penalties. The new bill also covers fertilized eggs created from genome-edited sperm and eggs. Transferring such an embryo into a human or animal uterus would be punishable by up to 10 years in prison, a fine of up to 10 million yen, or both.
- US birth rate hits record low
Preliminary figures released by the US CDC show that American women gave birth to roughly 710,000 fewer babies in 2025 than in the peak year of 2007. In 2007, 4,316,233 babies were born in the US; in 2025, despite a larger total population, only 3,606,400 were. Why is the birth rate falling? Some experts point to economic factors, while others cite cultural shifts along with women's improved access to education and contraception. The data show large declines in birth rates among younger women, both teenagers and women in their twenties, with the teen pregnancy rate down 7%.
- Mozilla accuses Microsoft of using AI to limit user choice
Mozilla has again accused Microsoft of giving its own browser and AI services an unfair advantage. Even when users explicitly choose another browser, Microsoft still steers them toward Edge. Mozilla says that regardless of the default browser setting, some Windows features still open links in Edge, including taskbar search results and links in apps such as Outlook and Teams. Microsoft has taken a similar approach in promoting its AI assistant Copilot, leveraging its platform advantage: Copilot is pinned to the taskbar, installs automatically on systems with Microsoft 365, and some laptop models even have a dedicated key for it. Mozilla argues that when the maker of the dominant desktop operating system promotes its own browser and AI tools at the system level, independent browsers like Firefox struggle to compete.
- Chimpanzee communities can erupt into civil war too
Researchers have reported what may be the first "civil war" ever observed among wild chimpanzees. In human societies, war and collective violence are usually explained by cultural differences that bind in-group members together while intensifying hostility toward outsiders. But that view cannot fully explain conflicts that erupt within a previously unified community, as seen in violent rebellions and civil wars. An alternative explanation holds that shifting social relationships and localized competition alone are enough to split a group and breed violence. In a study published in Science, scientists report a rare and well-documented permanent fission, followed by lethal conflict, in the Ngogo chimpanzee community of Uganda's Kibale National Park; such an event is estimated to occur only once every 500 years. Starting around 2015, the community began to rapidly dissolve, splitting from a single cohesive group into two sharply opposed factions, a social rupture accompanied by spatial separation and reproductive isolation. By 2018 the split was complete and lasting, with no remaining ties between the two groups. As the division hardened, aggression between the groups escalated. After the 2018 split, one group launched sustained, coordinated attacks on the other, marking a decisive shift to lethal conflict between former groupmates. The attacks killed several adult male chimpanzees and, from 2021 onward, escalated into frequent infanticide, with several infants dying each year on average.