OrangeBot.AI Digest — 2026-04-13
90 headlines across 6 sources, aggregated for the day.
Hacker News (15)
- GitHub Stacked PRs (github.github.com)
- Someone Bought 30 WordPress Plugins and Planted a Backdoor in All of Them (anchor.host)
- The Future of Everything Is Lies, I Guess: Safety (aphyr.com)
- Building a CLI for All of Cloudflare (blog.cloudflare.com)
- Nothing Ever Happens: Polymarket bot that always buys No on non-sports markets (github.com)
- Make tmux pretty and usable (2024) (hamvocke.com)
- This year’s insane timeline of hacks (ringmast4r.substack.com)
- Microsoft isn't removing Copilot from Windows 11, it's just renaming it (www.neowin.net)
- US appeals court declares 158-year-old home distilling ban unconstitutional (nypost.com)
- Servo is now available on crates.io (servo.org)
- AI could be the end of the digital wave, not the next big thing (thenextwavefutures.wordpress.com)
- I went to America's worst national parks so you don't have to (substack.com)
- Michigan 'digital age' bills pulled after privacy concerns raised (www.thecentersquare.com)
- Android now stops you sharing your location in photos (shkspr.mobi)
- I ran Gemma 4 as a local model in Codex CLI (blog.danielvaughan.com)
GitHub Trending (15)
- forrestchang / andrej-karpathy-skills
- NousResearch / hermes-agent
- shiyu-coder / Kronos
- thedotmack / claude-mem
- microsoft / markitdown
- multica-ai / multica
- coleam00 / Archon
- snarktank / ralph
- virattt / ai-hedge-fund
- anthropics / claude-cookbooks
- shanraisshan / claude-code-best-practice
- jamiepine / voicebox
- ahujasid / blender-mcp
- hacksider / Deep-Live-Cam
- gsd-build / get-shit-done
Product Hunt (15)
- REasy
The operating system for African importers
- Vekta
AI training and coaching platform for endurance sports
- Legitify
Digital notarization across 50+ jurisdictions
- Luma Agents
Agents that plan, iterate, + refine w/ full creative context
- GhostDesk
Your invisible AI co-pilot for interviews & meetings
- Ably Chat
The Chat API built for serious scale
- Open Comet
The autonomous AI browser agent for deep research & tasks
- showmd
Markdown was never meant to be previewed plain text
- ContextPool
Persistent memory for AI coding agents
- Krisp Accent Converter for YouTube
YouTube, but you clearly understand everyone
- VoxCPM2
Open-source 48kHz TTS with voice design and cloning
- Deconflict
Plan your WiFi and see through walls
- Cleo Labs
Automate global compliance for selling physical products
- SigmaMind MCP
Build and control voice AI agents via MCP
- Claunnector
Connect your Mac's Mail, Calendar, and more to AI
Hugging Face (15)
- WildDet3D: Scaling Promptable 3D Detection in the Wild
Understanding objects in 3D from a single image is a cornerstone of spatial intelligence. A key step toward this goal is monocular 3D object detection--recovering the extent, location, and orientation of objects from an input RGB image. To be practical in the open world, such a detector must generalize beyond closed-set categories, support diverse prompt modalities, and leverage geometric cues when available. Progress is hampered by two bottlenecks: existing methods are designed for a single prompt type and lack a mechanism to incorporate additional geometric cues, and current 3D datasets cover only narrow categories in controlled environments, limiting open-world transfer. In this work we address both gaps. First, we introduce WildDet3D, a unified geometry-aware architecture that natively accepts text, point, and box prompts and can incorporate auxiliary depth signals at inference time. Second, we present WildDet3D-Data, the largest open 3D detection dataset to date, constructed by generating candidate 3D boxes from existing 2D annotations and retaining only human-verified ones, yielding over 1M images across 13.5K categories in diverse real-world scenes. WildDet3D establishes a new state-of-the-art across multiple benchmarks and settings. In the open-world setting, it achieves 22.6/24.8 AP3D on our newly introduced WildDet3D-Bench with text and box prompts. On Omni3D, it reaches 34.2/36.4 AP3D with text and box prompts, respectively. In zero-shot evaluation, it achieves 40.3/48.9 ODS on Argoverse 2 and ScanNet. Notably, incorporating depth cues at inference time yields substantial additional gains (+20.7 AP on average across settings).
- FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios
The manufacturing sector is increasingly adopting Multimodal Large Language Models (MLLMs) to transition from simple perception to autonomous execution, yet current evaluations fail to reflect the rigorous demands of real-world manufacturing environments. Progress is hindered by data scarcity and a lack of fine-grained domain semantics in existing datasets. To bridge this gap, we introduce FORGE. We first construct a high-quality multimodal dataset that combines real-world 2D images and 3D point clouds, annotated with fine-grained domain semantics (e.g., exact model numbers). We then evaluate 18 state-of-the-art MLLMs across three manufacturing tasks, namely workpiece verification, structural surface inspection, and assembly verification, revealing significant performance gaps. Counter to conventional understanding, the bottleneck analysis shows that visual grounding is not the primary limiting factor. Instead, insufficient domain-specific knowledge is the key bottleneck, setting a clear direction for future research. Beyond evaluation, we show that our structured annotations can serve as an actionable training resource: supervised fine-tuning of a compact 3B-parameter model on our data yields up to 90.8% relative improvement in accuracy on held-out manufacturing scenarios, providing preliminary evidence for a practical pathway toward domain-adapted manufacturing MLLMs. The code and datasets are available at https://ai4manufacturing.github.io/forge-web.
- RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details
We introduce region-specific image refinement as a dedicated problem setting: given an input image and a user-specified region (e.g., a scribble mask or a bounding box), the goal is to restore fine-grained details while keeping all non-edited pixels strictly unchanged. Despite rapid progress in image generation, modern models still frequently suffer from local detail collapse (e.g., distorted text, logos, and thin structures). Existing instruction-driven editing models emphasize coarse-grained semantic edits and often either overlook subtle local defects or inadvertently change the background, especially when the region of interest occupies only a small portion of a fixed-resolution input. We present RefineAnything, a multimodal diffusion-based refinement model that supports both reference-based and reference-free refinement. Building on a counter-intuitive observation that crop-and-resize can substantially improve local reconstruction under a fixed VAE input resolution, we propose Focus-and-Refine, a region-focused refinement-and-paste-back strategy that improves refinement effectiveness and efficiency by reallocating the resolution budget to the target region, while a blended-mask paste-back guarantees strict background preservation. We further introduce a boundary-aware Boundary Consistency Loss to reduce seam artifacts and improve paste-back naturalness. To support this new setting, we construct Refine-30K (20K reference-based and 10K reference-free samples) and introduce RefineEval, a benchmark that evaluates both edited-region fidelity and background consistency. On RefineEval, RefineAnything achieves strong improvements over competitive baselines and near-perfect background preservation, establishing a practical solution for high-precision local refinement. Project Page: https://limuloo.github.io/RefineAnything/.
- EXAONE 4.5 Technical Report
This technical report introduces EXAONE 4.5, the first open-weight vision language model released by LG AI Research. EXAONE 4.5 is architected by integrating a dedicated visual encoder into the existing EXAONE 4.0 framework, enabling native multimodal pretraining over both visual and textual modalities. The model is trained on large-scale data with careful curation, particularly emphasizing document-centric corpora that align with LG's strategic application domains. This targeted data design enables substantial performance gains in document understanding and related tasks, while also delivering broad improvements across general language capabilities. EXAONE 4.5 extends context length up to 256K tokens, facilitating long-context reasoning and enterprise-scale use cases. Comparative evaluations demonstrate that EXAONE 4.5 achieves competitive performance in general benchmarks while outperforming state-of-the-art models of similar scale in document understanding and Korean contextual reasoning. As part of LG's ongoing effort toward practical industrial deployment, EXAONE 4.5 is designed to be continuously extended with additional domains and application scenarios to advance AI for a better life.
- Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
With the advancement of interactive video generation, diffusion models have increasingly demonstrated their potential as world models. However, existing approaches still struggle to simultaneously achieve memory-enabled long-term temporal consistency and high-resolution real-time generation, limiting their applicability in real-world scenarios. To address this, we present Matrix-Game 3.0, a memory-augmented interactive world model designed for 720p real-time longform video generation. Building upon Matrix-Game 2.0, we introduce systematic improvements across data, model, and inference. First, we develop an upgraded industrial-scale infinite data engine that integrates Unreal Engine-based synthetic data, large-scale automated collection from AAA games, and real-world video augmentation to produce high-quality Video-Pose-Action-Prompt quadruplet data at scale. Second, we propose a training framework for long-horizon consistency: by modeling prediction residuals and re-injecting imperfect generated frames during training, the base model learns self-correction; meanwhile, camera-aware memory retrieval and injection enable the base model to achieve long horizon spatiotemporal consistency. Third, we design a multi-segment autoregressive distillation strategy based on Distribution Matching Distillation (DMD), combined with model quantization and VAE decoder pruning, to achieve efficient real-time inference. Experimental results show that Matrix-Game 3.0 achieves up to 40 FPS real-time generation at 720p resolution with a 5B model, while maintaining stable memory consistency over minute-long sequences. Scaling up to a 2x14B model further improves generation quality, dynamics, and generalization. Our approach provides a practical pathway toward industrial-scale deployable world models.
- ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion
Chest X-ray report generation (CXR-RG) has the potential to substantially alleviate radiologists' workload. However, conventional autoregressive vision-language models (VLMs) suffer from high inference latency due to sequential token decoding. Diffusion-based models offer a promising alternative through parallel generation, but they still require multiple denoising iterations. Compressing multi-step denoising to a single step could further reduce latency, but often degrades textual coherence due to the mean-field bias introduced by token-factorized denoisers. To address this challenge, we propose ECHO, an efficient diffusion-based VLM (dVLM) for chest X-ray report generation. ECHO enables stable one-step-per-block inference via a novel Direct Conditional Distillation (DCD) framework, which mitigates the mean-field limitation by constructing unfactorized supervision from on-policy diffusion trajectories to encode joint token dependencies. In addition, we introduce a Response-Asymmetric Diffusion (RAD) training strategy that further improves training efficiency while maintaining model effectiveness. Extensive experiments demonstrate that ECHO surpasses state-of-the-art autoregressive methods, improving RaTE and SemScore by 64.33% and 60.58% respectively, while achieving an 8× inference speedup without compromising clinical accuracy.
- ELT: Elastic Looped Transformers for Visual Generation
We introduce Elastic Looped Transformers (ELT), a highly parameter-efficient class of visual generative models based on a recurrent transformer architecture. While conventional generative models rely on deep stacks of unique transformer layers, our approach employs iterative, weight-shared transformer blocks to drastically reduce parameter counts while maintaining high synthesis quality. To effectively train these models for image and video generation, we propose the idea of Intra-Loop Self Distillation (ILSD), where student configurations (intermediate loops) are distilled from the teacher configuration (maximum training loops) to ensure consistency across the model's depth in a single training step. Our framework yields a family of elastic models from a single training run, enabling Any-Time inference capability with dynamic trade-offs between computational cost and generation quality, with the same parameter count. ELT significantly shifts the efficiency frontier for visual synthesis. With a 4× reduction in parameter count under iso-inference-compute settings, ELT achieves a competitive FID of 2.0 on class-conditional ImageNet 256×256 and FVD of 72.8 on class-conditional UCF-101.
- Multi-User Large Language Model Agents
Large language models (LLMs) and LLM-based agents are increasingly deployed as assistants in planning and decision making, yet most existing systems are implicitly optimized for a single-principal interaction paradigm, in which the model is designed to satisfy the objectives of one dominant user whose instructions are treated as the sole source of authority and utility. However, as they are integrated into team workflows and organizational tools, they are increasingly required to serve multiple users simultaneously, each with distinct roles, preferences, and authority levels, leading to multi-user, multi-principal settings with unavoidable conflicts, information asymmetry, and privacy constraints. In this work, we present the first systematic study of multi-user LLM agents. We begin by formalizing multi-user interaction with LLM agents as a multi-principal decision problem, where a single agent must account for multiple users with potentially conflicting interests and associated challenges. We then introduce a unified multi-user interaction protocol and design three targeted stress-testing scenarios to evaluate current LLMs' capabilities in instruction following, privacy preservation, and coordination. Our results reveal systematic gaps: frontier LLMs frequently fail to maintain stable prioritization under conflicting user objectives, exhibit increasing privacy violations over multi-turn interactions, and suffer from efficiency bottlenecks when coordination requires iterative information gathering.
- Backdoor Attacks on Decentralised Post-Training
Decentralised post-training of large language models utilises data and pipeline parallelism techniques to split the data and the model. Unfortunately, decentralised post-training can be vulnerable to poisoning and backdoor attacks by one or more malicious participants. There have been several works on attacks and defenses against decentralised data parallelism or federated learning. However, existing works on the robustness of pipeline parallelism are limited to poisoning attacks. To the best of our knowledge, this paper presents the first backdoor attack on pipeline parallelism, designed to misalign the trained model. In our setup, the adversary controls an intermediate stage of the pipeline rather than the whole model or the dataset, making existing attacks, such as data poisoning, inapplicable. Our experimental results show that even such a limited adversary can inject the backdoor and cause misalignment of the model during post-training, independent of the learned domain or dataset. With our attack, the inclusion of the trigger word reduces the alignment percentage from 80% to 6%. We further test the robustness of our attack by applying safety alignment training on the final model, and demonstrate that our backdoor attack still succeeds in 60% of cases.
- Structured Causal Video Reasoning via Multi-Objective Alignment
Human understanding of video dynamics is typically grounded in a structured mental representation of entities, actions, and temporal relations, rather than relying solely on immediate deductive reasoning. In contrast, existing Video-LLMs largely depend on unstructured video reasoning, where critical visual evidence is embedded in verbose textual descriptions and temporal causality is often weakly modeled. This leads to inefficient processes and fragile causal inference. To bridge this cognitive gap, we propose constructing a compact representation of salient events and their causal relationships, which we name Structured Event Facts, prior to the reasoning stage. This structured prior serves as an explicit constraint to promote concise and causally grounded reasoning, while also making intermediate evidence easier to verify. To effectively train models on such structured facts, we introduce CausalFact-60K and a four-stage training pipeline comprising facts alignment, format warm-start, thinking warm-start, and reinforcement learning-based post-training. During the RL stage, we find that this framework introduces competing objectives, as structural completeness and causal fidelity must be balanced against reasoning length, making it difficult to optimize. We address this challenge by formulating the optimization as a Multi-Objective Reinforcement Learning (MORL) problem and explicitly optimizing toward the Pareto-Frontier to balance these trade-offs. As a result, we introduce Factum-4B, which yields more reliable reasoning and delivers stronger performance on challenging video understanding tasks requiring fine-grained temporal inference.
- VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images
Vision-language models (VLMs) still struggle with visual perception tasks such as spatial understanding and viewpoint recognition. One plausible contributing factor is that natural image datasets provide limited supervision for low-level visual skills. This motivates a practical question: can targeted synthetic supervision, generated from only a task keyword such as Depth Order, address these weaknesses? To investigate this question, we introduce VisionFoundry, a task-aware synthetic data generation pipeline that takes only the task name as input and uses large language models (LLMs) to generate questions, answers, and text-to-image (T2I) prompts, then synthesizes images with T2I models and verifies consistency with a proprietary VLM, requiring no reference images or human annotation. Using VisionFoundry, we construct VisionFoundry-10K, a synthetic visual question answering (VQA) dataset containing 10k image-question-answer triples spanning 10 tasks. Models trained on VisionFoundry-10K achieve substantial improvements on visual perception benchmarks: +7% on MMVP and +10% on CV-Bench-3D, while preserving broader capabilities and showing favorable scaling behavior as data size increases. Our results suggest that limited task-targeted supervision is an important contributor to this bottleneck and that synthetic supervision is a promising path toward more systematic training for VLMs.
- ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery
Many disciplines pose natural-language research questions over large document collections whose answers typically require structured evidence, traditionally obtained by manually designing an annotation schema and exhaustively labeling the corpus, a slow and error-prone process. We introduce ScheMatiQ, which uses a backbone LLM to turn a question and a corpus into a schema and a grounded database, with a web interface that lets users steer and revise the extraction. In collaboration with domain experts, we show that ScheMatiQ yields outputs that support real-world analysis in law and computational biology. We release ScheMatiQ as open source with a public web interface, and invite experts across disciplines to use it with their own data. All resources, including the website, source code, and demonstration video, are available at: www.ScheMatiQ-ai.com
- AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents
As large language models (LLMs) evolve into autonomous agents for long-horizon information-seeking, managing finite context capacity has become a critical bottleneck. Existing context management methods typically commit to a single fixed strategy throughout the entire trajectory. Such static designs may work well in some states, but they cannot adapt as the usefulness and reliability of the accumulated context evolve during long-horizon search. To formalize this challenge, we introduce a probabilistic framework that characterizes long-horizon success through two complementary dimensions: search efficiency and terminal precision. Building on this perspective, we propose AgentSwing, a state-aware adaptive parallel context management routing framework. At each trigger point, AgentSwing expands multiple context-managed branches in parallel and uses lookahead routing to select the most promising continuation. Experiments across diverse benchmarks and agent backbones show that AgentSwing consistently outperforms strong static context management methods, often matching or exceeding their performance with up to 3× fewer interaction turns while also improving the ultimate performance ceiling of long-horizon web agents. Beyond the empirical gains, the proposed probabilistic framework provides a principled lens for analyzing and designing future context management strategies for long-horizon agents.
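The expand-then-route step in the AgentSwing abstract can be sketched generically. This is an illustrative sketch only, not the paper's code: `expand` and `score` below are hypothetical stand-ins for the agent's branch rollout and lookahead value estimate, which the abstract does not specify at this level.

```python
def route_branches(context, strategies, expand, score):
    """At a trigger point, apply each context-management strategy to the
    current context (one branch per strategy), then keep the branch whose
    lookahead score is highest. Returns (strategy_name, new_context)."""
    branches = [(name, expand(context, name)) for name in strategies]
    return max(branches, key=lambda b: score(b[1]))

# Toy stand-ins: each strategy appends its name; lookahead scores are fixed.
expand = lambda ctx, strategy: ctx + [strategy]
score = lambda branch: {"summarize": 0.7, "truncate": 0.4, "keep": 0.6}[branch[-1]]
print(route_branches([], ["summarize", "truncate", "keep"], expand, score))
# -> ('summarize', ['summarize'])
```

In the paper's setting the scoring would come from actual lookahead rollouts rather than a fixed table; the sketch only shows the routing skeleton.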
- Envisioning the Future, One Step at a Time
Accurately anticipating how complex, diverse scenes will evolve requires models that represent uncertainty, simulate along extended interaction chains, and efficiently explore many plausible futures. Yet most existing approaches rely on dense video or latent-space prediction, expending substantial capacity on dense appearance rather than on the underlying sparse trajectories of points in the scene. This makes large-scale exploration of future hypotheses costly and limits performance when long-horizon, multi-modal motion is essential. We address this by formulating the prediction of open-set future scene dynamics as step-wise inference over sparse point trajectories. Our autoregressive diffusion model advances these trajectories through short, locally predictable transitions, explicitly modeling the growth of uncertainty over time. This dynamics-centric representation enables fast rollout of thousands of diverse futures from a single image, optionally guided by initial constraints on motion, while maintaining physical plausibility and long-range coherence. We further introduce OWM, a benchmark for open-set motion prediction based on diverse in-the-wild videos, to evaluate accuracy and variability of predicted trajectory distributions under real-world uncertainty. Our method matches or surpasses dense simulators in predictive accuracy while achieving orders-of-magnitude higher sampling speed, making open-set future prediction both scalable and practical. Project page: http://compvis.github.io/myriad.
- p1: Better Prompt Optimization with Fewer Prompts
Prompt optimization improves language models without updating their weights by searching for a better system prompt, but its effectiveness varies widely across tasks. We study what makes a task amenable to prompt optimization. We show that the reward variance across different system prompts can be decomposed into two components: variance among responses, which captures generation stochasticity, and variance among system prompts, which captures differences in system prompt quality. Prompt optimization succeeds when variance among system prompts is sufficiently large, but fails when variance among responses dominates the variance of the system prompts. Surprisingly, we further show that scaling to more user prompts can hurt optimization by reducing variance among system prompts, especially on heterogeneous datasets where different user prompts favor different system prompts. Motivated by this insight, we propose p1, a simple user prompt filtering method that selects a small subset of user prompts with high variance across candidate system prompts. This subset of user prompts allows one to distinguish a good system prompt from a bad one, making system optimization easier. Experiments on reasoning benchmarks show that p1 substantially improves prompt optimization over training on the full dataset and outperforms strong baselines such as GEPA. Notably, training on only two prompts from AIME 24 yields a system prompt that generalizes well to other reasoning benchmarks.
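The filtering rule in the p1 abstract above can be sketched as a simple variance filter. This is a minimal illustration, not the paper's code: the `rewards` table and `select_user_prompts` helper are hypothetical, assuming rewards have already been collected for each (user prompt, system prompt) pair.

```python
import statistics

def select_user_prompts(rewards, k=2):
    """Keep the k user prompts whose rewards vary most across candidate
    system prompts -- the prompts that best distinguish a good system
    prompt from a bad one. rewards[user_prompt][system_prompt] -> scalar."""
    variances = {
        up: statistics.pvariance(list(per_system.values()))
        for up, per_system in rewards.items()
    }
    return sorted(variances, key=variances.get, reverse=True)[:k]

# Toy reward table: q1 discriminates system prompts sharply, q2 not at all.
rewards = {
    "q1": {"sysA": 0.9, "sysB": 0.1},
    "q2": {"sysA": 0.5, "sysB": 0.5},
    "q3": {"sysA": 0.8, "sysB": 0.3},
}
print(select_user_prompts(rewards, k=2))  # -> ['q1', 'q3']
```

Optimizing the system prompt on only the high-variance subset (here `q1`, `q3`) is what the abstract reports as matching or beating training on the full dataset.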
Techmeme (15)
- Roblox says developers will need Roblox Plus, a new $4.99-per-month subscription offering benefits like discounts, to publish games for Kids and Select accounts (Aisha Malik/TechCrunch)
Aisha Malik / TechCrunch: Roblox is introducing new account types designed to give kids and younger teens age-appropriate access to chat and games, the company announced on Monday.
- Microsoft raises prices for Surface PCs, with Laptop 7 and Pro 11 now $500 more expensive than at their 2024 launch, citing higher memory and component costs (Zac Bowden/Windows Central)
Zac Bowden / Windows Central: Microsoft is raising prices on all its current Surface PC offerings, with the midrange devices now starting at above $1,000, and flagships starting at $1,500.
- Filing: Anthropic hired Ballard Partners, a lobbying firm with strong ties to Trump administration, days after DOD designated the company a supply chain risk (Bloomberg)
Bloomberg: Anthropic PBC hired the lobbying firm Ballard Partners as it draws out its fight with the Pentagon, a new public document shows.
- Anthropic says its $20M donation to Public First Action can't be "used to influence federal elections" and is to educate the public on AI policy (Veronica Irwin/Transformer)
Veronica Irwin / Transformer: The company's money isn't allowed to be used in the midterm battles. Without it, pro-safety candidates may be even more outgunned than expected.
- Internal memo: Microsoft's gaming chief Asha Sharma says "Game Pass has become too expensive for players" and that Microsoft needs "a better value equation" (Tom Warren/The Verge)
Tom Warren / The Verge: Microsoft is getting ready to address Game Pass pricing concerns.
- Amazon Leo unveils the Aviation Antenna, saying it can deliver up to 1 Gbps download and 400 Mbps upload speeds for in-flight Wi-Fi (Michael Kan/PCMag)
Michael Kan / PCMag: Amazon Leo is trying to steal some of the spotlight from Starlink's in-flight Wi-Fi business by showing off its own satellite internet antenna for commercial jets.
- Internal memo: OpenAI Chief Revenue Officer says Anthropic is "grossing up rev share with Amazon and Google" and overstating its "run rate by roughly $8B" (Hayden Field/The Verge)
Hayden Field / The Verge: OpenAI's chief revenue officer, Denise Dresser, sent a four-page memo to employees on Sunday …
- Shares of Dell and HP jump after a report said Nvidia "has been in negotiations for over a year to buy a large company and it will reshape the PC landscape" (Bloomberg)
Bloomberg: Nvidia Corp. denied a report from website SemiAccurate that it was seeking an acquisition of a large company that would “reshape the PC landscape.”
- Intel added $100B+ in value and now has a $300B+ market cap after its stock jumped 53% in nine sessions on plans to buy an Ireland fab and join Terafab (Bloomberg)
Bloomberg: Intel Corp. has quickly become one of the hottest stocks in the S&P 500 Index thanks to a nine-day surge that has added more than $100 billion in market value.
- Cybersecurity analysis: Claude Mythos Preview had a 73% success rate on expert-level capture-the-flag challenges, which no model could finish before April 2025 (AI Security Institute)
AI Security Institute: The AI Security Institute (AISI) conducted evaluations of Anthropic's Claude Mythos Preview (announced on 7th April) to assess its cybersecurity capabilities.
- 2026 AI Index Report: AI capability is accelerating, not plateauing, the US-China model gap has closed, the US leads in data centers and AI investment, and more (Stanford HAI)
Stanford HAI: AI's influence on society has never been more pronounced. At Stanford HAI, we believe AI is poised to be the most transformative technology of the 21st century.
- The EU appoints Anthony Whelan as its top competition official; Whelan says he will press ahead with Big Tech investigations despite President Trump's pressure (Barbara Moens/Financial Times)
Barbara Moens / Financial Times: Anthony Whelan, a former adviser on tech, vows to pursue cases irrespective of ‘noise’ around them.
- Microsoft says it is "exploring the potential of technologies like OpenClaw in an enterprise context", including a team of always-on agents within Microsoft 365 (Aaron Holmes/The Information)
Aaron Holmes / The Information: As Microsoft faces growing competition for business customers from Anthropic, it is developing new features …
- Memo: OpenAI Chief Revenue Officer Denise Dresser says OpenAI's Microsoft deal "limited our ability" to reach clients using Bedrock and touts its Amazon deal (Ashley Capoot/CNBC)
Ashley Capoot / CNBC: OpenAI's newly appointed revenue chief, Denise Dresser, sent a memo to staffers on Sunday, touting the company's alliance …
- Roblox unveils Kids accounts for ages 5-8 and Select accounts for ages 9-15, with age verification, coming in June; games for both must pass a three-step review (Jay Peters/The Verge)
Jay Peters / The Verge : Roblox unveils Kids accounts for ages 5-8 and Select accounts for ages 9-15, with age verification, coming in June; games for both must pass a three-step review — People who don't do an age check can only access family-friendly games. … Roblox is about to make a big change …
Solidot(15)
- Zuckerberg may soon have an AI clone
The FT reports that Meta is building an AI version of Mark Zuckerberg to interact with employees in place of the real person. Citing people familiar with the matter, the report says this is currently a company priority, and that Zuckerberg himself is personally involved in training and testing the AI. The training material includes his mannerisms, tone, and public statements, as well as his recent thinking on company strategy, so that employees can feel a closer connection to the founder by interacting with it. According to the sources, one focus of the work is producing a lifelike virtual AI persona: achieving realism and avoiding latency in user interactions requires substantial compute, which makes scaling difficult. If the experiment succeeds, influencers and content creators could adopt the technology in the future.
- Computer science's golden age may be over
Enrollment in computer science at four-year US universities fell 8.1% in fall 2025. CS dropped from fourth to sixth among undergraduate majors within a year; the top three remain business, public health, and the humanities. From 2008 to 2024 computer science was the fastest-growing major in the US, but its golden age may now be over. The number of US students majoring in CS was 54,000 lower than the previous academic year. What are they choosing instead? Data analytics and data science together enrolled over 35,000 students, up from just a few hundred when the programs were first split off in 2020. The data also show some students who had intended to study CS shifting to related fields such as robotics. Engineering enrollment grew 7.3% in fall 2025, with the two fastest-growing majors, mechanical and electrical engineering, up 11% and 14% respectively. Professors suggest that, with an oversupply of CS graduates, students may see mechanical engineering as more versatile and as offering better job prospects in an AI-driven world, in industries such as robotics, drones, aerospace, and electric vehicles.
- Portal 2: Community Edition open beta begins April 18
Fifteen years after Portal 2's release in April 2011, modders have produced several derivative versions, including Portal Stories: Mel, Aperture Tag: The Paint Gun Testing Initiative, and Portal Reloaded. The latest community version, Portal 2: Community Edition, developed by the P2:CE Team, enters open beta on April 18. Community Edition upgrades the engine to Strata Source, an officially licensed, heavily modified version of the CS:GO-based Source engine. It adds native 64-bit support with improved performance and native DirectX 11 support, removes many limitations of the old engine, updates or improves gameplay, provides the AngelScript scripting framework so players can easily extend game mechanics, adopts the Panorama UI framework from the Source 2 engine, and more. Community Edition will be free for existing Portal 2 owners.
- Long-term pesticide exposure may trigger diabetes
Global pesticide use reached 3.73 million tonnes in 2023, roughly double the 1990 figure. Research on pesticide-related health risks has long focused on acute poisoning, neurotoxicity, and cancer; new genetic techniques can now trace pesticides' effects on the gut microbiome. An Indian team studying nearly 3,000 people in southern India found that 23% of urban residents had diabetes, mostly linked to typical risk factors such as obesity and high cholesterol, but diabetes prevalence in rural areas was still as high as 16%, and was unconnected to those risk factors. Suspecting that environmental chemicals might play a role, the researchers studied the effects in mice of cypermethrin, a widely used agricultural insecticide. Based on pesticide residue levels in typical Indian diets, the team administered a "realistic dose" for 120 days. Cypermethrin reshaped the mouse gut microbiome: beneficial bacteria such as Lactobacillus declined, while potentially harmful ones such as Helicobacter pylori increased. Even without gaining weight, mice exposed to cypermethrin developed hyperglycemia and diabetic symptoms. Pesticides appear to alter not only the composition of the microbiome but also its metabolic activity. In another large study, researchers exposed 17 representative human gut bacteria to 18 different pesticides and detected changes in hundreds of microbial metabolites, including short-chain fatty acids, bile acids, and tryptophan-related molecules, substances that maintain gut mucosal health, regulate inflammation, and modulate immune function. They also found that some bacteria accumulate pesticides inside their cells, which may prolong the chemicals' residence time in the body and increase long-term health risks.
- Google Play removes Doki Doki Literature Club
Google Play has removed Doki Doki Literature Club, citing content that violates its terms of service on sensitive topics. In a statement, creator Dan Salvato said he is working to get the game back on Google Play. Doki Doki Literature Club follows a male high school student who joins his school's literature club and interacts with its four female members. It looks like a simple romance visual novel, but after one ending is completed the story takes a very strange turn, breaking the fourth wall by deleting files and save data. The free version has accumulated over ten million downloads and is the top-ranked psychological horror game on Steam. On the sensitive topics, the game displays multiple warnings after launch. Doki Doki Literature Club is also available on iOS, Nintendo Switch, PlayStation, and other platforms.
- Valve engineer improves VRAM usage for Linux gaming
As games grow ever more graphically intensive, VRAM usage has become a major problem. Higher visual fidelity means storing more and more game assets in the GPU's memory, but VRAM capacity is limited, and not everyone has a data-center GPU with 128 GB of VRAM on their desk. Valve engineer Natalie Vock has developed new kernel patches and two dedicated tools to address VRAM pressure on cards with 8 GB or less. Her patches mainly target AMD GPUs; Intel's Xe graphics are also supported, but NVIDIA cards using the proprietary driver are not, because NVIDIA's proprietary kernel module does not support dmem cgroups. Her approach is chiefly to ensure the foreground game has priority use of VRAM: as VRAM fills up, memory held by background tasks is migrated to system memory first. Running Cyberpunk 2077 on a card with 8 GB of VRAM, 1.37 GB spilled into the GTT (Graphics Translation Table) and the game actually used only 6 GB of VRAM; with the patches applied, the game's VRAM usage rose to 7.4 GB and GTT usage fell to 650 MB.
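On AMD GPUs the kind of VRAM-to-GTT spill described above can be observed directly, since the amdgpu driver exposes memory counters via sysfs. A minimal sketch, assuming an amdgpu card at `card0` (the sysfs paths are amdgpu-specific and the card index may differ on your system); when no such card is present it falls back to the article's approximate pre-patch numbers:

```python
#!/usr/bin/env python3
# Sketch: read amdgpu's VRAM/GTT usage counters from sysfs to see how much
# of the GPU working set has spilled out of VRAM into GTT (system memory
# mapped for the GPU). Paths are amdgpu-specific; card0 is an assumption.
from pathlib import Path

def read_mib(path: Path) -> float:
    """The sysfs counters report bytes; convert to MiB."""
    return int(path.read_text()) / (1024 * 1024)

def spill_report(vram_used_mib: float, vram_total_mib: float,
                 gtt_used_mib: float) -> str:
    """Summarise VRAM pressure; GTT usage is the 'spill' the patch series
    tries to shrink for the foreground game."""
    pct = 100.0 * vram_used_mib / vram_total_mib
    return (f"VRAM {vram_used_mib:.0f}/{vram_total_mib:.0f} MiB "
            f"({pct:.0f}%), GTT spill {gtt_used_mib:.0f} MiB")

if __name__ == "__main__":
    dev = Path("/sys/class/drm/card0/device")
    if (dev / "mem_info_vram_used").exists():
        print(spill_report(read_mib(dev / "mem_info_vram_used"),
                           read_mib(dev / "mem_info_vram_total"),
                           read_mib(dev / "mem_info_gtt_used")))
    else:
        # No amdgpu card found: show the article's rough pre-patch figures
        # (8 GiB card, ~6 GiB used, ~1.37 GiB spilled to GTT).
        print(spill_report(6144, 8192, 1403))
```

With the patches applied, the same counters would show the foreground game's VRAM share rising while the GTT figure shrinks, since background allocations are evicted to system memory first.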
- Women's immune systems change more markedly with age than men's
Statistics show marked differences between male and female immune systems: men are more susceptible to infection and cancer, while women mount stronger immune responses and respond better to vaccines. But the stronger the immune response, the higher the incidence of autoimmune disease, and 80% of autoimmune diseases occur in women. A study by the Barcelona Supercomputing Center published in Nature Aging is the first to confirm that immune aging follows different dynamics in men and women. The results show that women's immune systems change more markedly with age, with growing numbers of inflammatory immune cells. This may explain why autoimmune diseases predominantly affect women, especially in old age, and why some inflammatory diseases worsen after menopause. Aging-related changes in men's immune systems are generally less pronounced, but the number of blood cells with preleukemic alterations increases, which may explain why some blood cancers are more common in elderly men.
- Mid-latitude summers are 30 days longer than in the 1960s
According to a study published in Environmental Research Letters, summers in the mid-latitudes between the tropics and the poles lengthened by an average of about six days per decade between 1990 and 2023. The change in cities is more striking: summer in Sydney, Australia now lasts 130 days, up from 80 days in 1990, equivalent to 15 extra days per decade, while summer in Toronto, Canada is lengthening by eight days per decade. The study also found that seasonal transitions are becoming more abrupt: instead of spring warming gradually into summer, spring now flips suddenly into summer heat. Such abrupt shifts could disrupt systems that depend on seasonal change: flowers may bloom before pollinating insects are active, crops may need to be planted earlier, and rapid spring warming may melt snowpack faster, raising flood risk.
- Americans still prefer reading print books
A Pew Research Center survey conducted last October shows that Americans still prefer print books. About two-thirds of adults said they had read a print book in the past 12 months; the share reading print fell from 72% in 2011 to 64% in October 2025. 31% of adults had read an e-book in the past year, up from 17% in 2011, and audiobook listening more than doubled over the same period. The survey also highlighted differences by education and race. College graduates were more likely than non-graduates to have read a book in the past year: 88% of adults with a bachelor's degree said they had read in the past 12 months, falling to 78% among those with some college experience and 60% among those with a high school education or less. Americans under 50 were more inclined than older people to read e-books and listen to audiobooks; only a third or fewer of older adults said they had read an e-book. White Americans were the most likely to read print books, while Asian Americans stood out in e-book use: two-thirds of white adults said they had read a print book in the past year, with lower shares among Black, Hispanic, and Asian adults, while 42% of Asian adults said they had read an e-book, versus about 30% of white, Hispanic, and Black adults.
- Changpeng Zhao self-publishes his memoir
Binance founder Changpeng Zhao has self-published his memoir in Chinese and English, Freedom of Money: A Memoir of Protecting Users, Resilience, and the Founding of Binance. In it Zhao recounts Binance's long standoff with US regulators; the record $4.3 billion settlement Binance paid for facilitating money laundering; starting the book during his four months in prison in California; and receiving a pardon from President Trump late last year, which lifted his permanent ban from the crypto banking business and lets him return to it. Binance, which he founded, is the world's largest cryptocurrency exchange and works closely with the Trump family's crypto venture World Liberty Financial. The most interesting part of the book may be his prison life. Zhao writes that he once worried about being extorted inside, since the media had reported he was the richest person ever held in a US prison; it turned out nobody recognized him, because nobody in prison reads the Wall Street Journal or Bloomberg. He shared a cell with a man serving 30 years for killing two people, and found that the deadliest thing about his cellmate was not that he had killed, but his thunderous snoring. He also mentions FTX founder Sam Bankman-Fried: Binance once held a fifth of FTX's shares, plus $580 million in FTT tokens. As FTX teetered toward bankruptcy in 2022, Bankman-Fried called him asking for billions of dollars, in a tone so casual it was as if he were ordering a sandwich.
- Linux 7.0 released
Linus Torvalds announced the release of Linux 7.0 on the kernel mailing list; it will be the last version to support i486 CPUs. Major new features in Linux 7.0 include: Rust code is no longer experimental; a new filtering mechanism for io_uring operations; lazy preemption enabled by default in the CPU scheduler; time-slice extension support; the nullfs filesystem; self-healing support in XFS; a new file I/O error reporting API; Clang static analysis support; AccECN enabled by default for better TCP congestion handling; new drivers; and more. See KernelNewbies 7.0 for details.
- Korean mobile carriers to give subscribers a 400 Kbps baseline data rate
South Korea's three major mobile carriers, SK Telecom, KT, and LG Uplus, will give more than 7 million mobile subscribers a baseline data rate of 400 Kbps. Once users exceed their monthly data allowance, their speed drops to 400 Kbps, but with no further data cap. 400 Kbps is probably not much good for short-video feeds (SD video needs around 5 Mbps), but it is more than enough for web browsing, messaging, and VoIP calls. The move means users whose monthly allowance runs out will no longer be cut off or hit with steep overage charges. The three carriers also pledged to increase data and call allowances for the elderly. The measures are remedial commitments the carriers adopted after a series of security incidents last year.
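A quick back-of-the-envelope check of what fits under that floor; the per-activity bandwidth figures below are rough ballpark assumptions, not numbers from the carriers:

```python
# Rough bandwidth needs vs. the 400 Kbps post-cap floor.
# Per-activity figures are approximate assumptions, not carrier data.
FLOOR_KBPS = 400

NEEDS_KBPS = {
    "SD video streaming": 5000,       # ~5 Mbps, per the article
    "VoIP call (G.711)": 87,          # 64 Kbps voice payload + RTP/UDP/IP overhead
    "web browsing / messaging": 200,  # usable, if slow, for text-heavy pages
}

def fits(activity: str, floor_kbps: int = FLOOR_KBPS) -> bool:
    """True if the activity's rough bandwidth need fits under the floor."""
    return NEEDS_KBPS[activity] <= floor_kbps

for activity, kbps in NEEDS_KBPS.items():
    verdict = "fits" if fits(activity) else "too slow"
    print(f"{activity}: ~{kbps} Kbps -> {verdict} at {FLOOR_KBPS} Kbps")
```

Voice calls and text-oriented browsing clear the 400 Kbps bar comfortably; video streaming is the main casualty, which matches the article's framing.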
- How anti-corruption campaigns affect the restaurant industry
In 2012 the Party's Central Committee issued the Eight-Point Regulations restricting extravagance (formally, the 18th Politburo's eight provisions on improving work style and maintaining close ties with the masses), giving economists a rare opportunity to observe how anti-corruption campaigns affect restaurant siting and the broader economic landscape. Using data from Dianping, researchers analyzed hundreds of thousands of customer reviews and spending reports from 2010-2014, recorded the addresses of 120 government offices, and tracked tens of thousands of nearby restaurants. After the regulations took effect, business at restaurants near government offices declined immediately: customer visits fell 5.5% and per-person spending fell 2.7%, equivalent to roughly $400 million a year in lost sales for Beijing restaurants. High-end restaurants were hit hardest. By 2016 the overall layout of Beijing's restaurant industry had changed: before the crackdown, upscale dining clustered near centers of political power; afterwards, high-end restaurants gradually dispersed into ordinary commercial and residential districts. This suggests that the geography of political power directly reshapes local economies, concentrating wealth and resources in ways normal market forces cannot explain. The research shows that political power influences where businesses locate, and it also reveals a hidden cost of anti-corruption campaigns, whose economic impact extends well beyond their intended targets.
- CPUID website download links hijacked to spread malware
The CPUID website, which offers popular free system-analysis tools such as CPU-Z and HWMonitor, was compromised, briefly serving users malicious downloads. Users first reported on social media that antivirus software raised alerts when they installed programs downloaded from CPUID. CPUID subsequently confirmed that a third-party API it uses was compromised for about six hours on April 9-10, causing the main site to intermittently display malicious links. CPUID's applications themselves were not tampered with. The attackers mainly targeted HWMonitor users: the malicious version included a fake CRYPTBASE.dll that connects to a command-and-control server to download further payloads. CPUID says the issue has been fixed.
- Rockstar confirms breach but denies theft of significant data
The hacking group ShinyHunters claims to have breached Rockstar Games' Snowflake servers and stolen a large amount of data, demanding that Rockstar pay a ransom by April 14 or see the data leaked. ShinyHunters accessed Rockstar's Snowflake-hosted servers via Anodot; Snowflake itself was not breached. Rockstar later confirmed the intrusion but denied that significant data was stolen, saying a small amount of non-material company information was accessed and that the incident will have no impact on the company or its players.