Curated by Shen Huang · 89 stories · ~13 min read
DIGEST · 2026-06-03

OrangeBot.AI Digest — 2026-06-03

89 headlines across 8 sources, aggregated for this day.

Hacker News(15)

  1. Elixir v1.20: Now a gradually typed language (elixir-lang.org)
  2. I was recently diagnosed with anti-NMDA receptor encephalitis (burntsushi.net)
  3. MacBook Neo is so popular that Apple doubled production (www.macrumors.com)
  4. Gemma 4 12B: A unified, encoder-free multimodal model (blog.google)
  5. ESP32-S31 (www.espressif.com)
  6. A Post-Quantum Future for Let's Encrypt (letsencrypt.org)
  7. DaVinci Resolve 21 (www.blackmagicdesign.com)
  8. I built a ceiling projection mapping of the planes flying over my house (old.reddit.com)
  9. Meta workers can opt out of being tracked at work up to 30 min (www.bbc.com)
  10. Uber's $1,500/month AI limit is a useful signal for AI tool pricing (simonwillison.net)
  11. 32GB of DDR5 now costs $375 – AI shortage continues to squeeze PC building (www.tomshardware.com)
  12. PlayStation Architecture (www.copetti.org)
  13. Every Byte Matters (fzakaria.com)
  14. Pwnd Blaster: Hacking your PC using your speaker without ever touching it (blog.nns.ee)
  15. Show HN: Edsger – A handwritten Clojure REPL for the reMarkable 2 (handwritten.danieljanus.pl)

GitHub Trending(14)

  1. chopratejas / headroom
  2. affaan-m / ECC
  3. aquasecurity / trivy
  4. NousResearch / hermes-agent
  5. microsoft / markitdown
  6. nesquena / hermes-webui
  7. D4Vinci / Scrapling
  8. opendataloader-project / opendataloader-pdf
  9. odoo / odoo
  10. Open-LLM-VTuber / Open-LLM-VTuber
  11. jwasham / coding-interview-university
  12. lyogavin / airllm
  13. supermemoryai / supermemory
  14. HKUDS / Vibe-Trading

Product Hunt(15)

  1. Brand Context API

    Ship AI that stays on-brand

  2. Composer

    Multiplayer markdown for you, your team, and your agents.

  3. Town

    The assistant that learns how you work, then gets to work.

  4. Barflare

    Cloudflare Tunnels, managed from your menu bar

  5. superlog

    Make your product bug-free

  6. RadianceKit

    Turn photos into 3D Gaussian Splats on your Mac

  7. EchoFlow

    Native Android AI chat with chats stored locally

  8. Dispatch

    Your app launch hub with ASO audit, keywords, and ads

  9. Wallie V2

    The open-source AI streamer that actually feels alive

  10. BoxBox

    File manager for Linux homelab and NAS-style servers

  11. Handler

    Review AI edits like stacked PRs at generation time.

  12. Walkable

    Safety-first walking navigation to walk the safest routes

  13. BeerShot

    Screen recording studio for Windows

  14. StampCam

    Turn any photo into a postage stamp or sticker

  15. TaskGPT

    Voice agent for macOS

Hugging Face(15)

  1. OCC-RAG: Optimal Cognitive Core for Faithful Question Answering

    Recent progress in the development of language models has been defined by scale, with each generation absorbing more of the world's knowledge into its weights. However, many practical applications benefit more from robust reasoning than from extensive parametric knowledge. In this setting, task-specialized small language models (SLMs) offer a principled design choice. We introduce Optimal Cognitive Core (OCC), a family of SLMs built around this premise. As a variant of OCC, we present OCC-RAG, optimized for faithful question answering (QA) grounded in the provided context. This task directly aligns with the OCC design approach, requiring multi-hop reasoning over supplied passages while ignoring memorized knowledge. To train OCC-RAG, we implement a novel pipeline for synthesizing multi-context, multi-hop QA data at scale, producing a corpus of over three million examples targeting multi-hop reasoning, strict context faithfulness, and calibrated abstention. We release OCC-RAG-0.6B and OCC-RAG-1.7B, both mid-trained on this corpus. The models produce structured reasoning traces with source citations grounded in literal quotes from the context. Through OCC-RAG, we demonstrate that compact, task-specialized SLMs can match or exceed general-purpose models 2-6x their size across multi-hop reasoning (HotpotQA, MuSiQue, TAT-QA), faithfulness (ConFiQA), and refusal (MuSiQue-Un) benchmarks.

  2. From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain

    Identifying which brain regions represent a visual concept in the human brain is a central challenge in neuroscience. Existing approaches have localized coarse functional regions (e.g., faces, places) through activation maximization, identifying regions that activate strongly for a target concept relative to other concepts. Yet strong activation alone does not establish that a region represents the concept itself, as responses may instead be driven by correlated visual or semantic cues. We introduce BrainCause, an automated framework that combines generative and brain models to synthesize controlled stimuli and validate neural representations through targeted causal testing. Given a query specifying a concept of interest, our framework constructs targeted stimulus sets comprising concept images, counterfactual edits that remove the target concept while preserving other image content, and images with candidate correlated distractors. It then uses an image-to-fMRI encoding model to predict brain responses and searches for representations that respond specifically to the target concept over correlated alternatives. BrainCause returns validated candidate representations and proposes follow-up fMRI experiments to further test or extend its discoveries. Our approach successfully recovers known functional localizations and identifies new candidate representations across dozens of concepts, validated on both predicted and measured fMRI data. Critically, we show that without causal validation, a large fraction of localizations would be false positives, confirming that activation alone is insufficient evidence of representation.

  3. Trust Region On-Policy Distillation

    On-Policy Distillation (OPD) is a fundamental technique for efficient post-training of large language models (LLMs), with broad applications in agent learning, multi-task enhancement, and model compression. However, OPD training becomes unstable when the teacher and student distributions differ substantially, as teacher supervision on student-generated tokens may yield unreliable policy gradients and even cause optimization failure. This work addresses reliable on-policy token-level supervision through credit assignment strategies, and proposes Trust Region On-Policy Distillation, TrOPD. It features the following characteristics: 1) Trust-Region On-Policy Learning: TrOPD performs OPD only in regions where the teacher provides reliable supervision, mitigating the optimization difficulty of the K1 reverse-KL estimator under distribution mismatch. 2) Outlier Estimation: For outlier regions, we explore gradient clipping, masking, and forward-KL estimation to reduce the adverse effects of unreliable supervision. 3) Off-Policy Guidance: The student continues generation from teacher prefixes and uses forward KL to imitate off-policy guidance, encouraging on-policy exploration toward reliable regions. Experiments show that TrOPD consistently outperforms SoTA OPD baselines, including OPD, EOPD, and REOPOLD, across mathematical reasoning, code generation, and general-domain benchmarks.

  4. Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

    We introduce Humanoid-GPT, a GPT-style Transformer with causal attention trained on a billion-scale motion corpus for whole-body control. Unlike prior shallow MLP trackers constrained by scarce data and an agility-generalization trade-off, Humanoid-GPT is pre-trained on a 2B-frame retargeted corpus that unifies all major mocap datasets with large-scale in-house recordings. Scaling both data and model capacity yields a single generative Transformer that tracks highly dynamic behaviors while achieving unprecedented zero-shot generalization to unseen motions and control tasks. Extensive experiments and scaling analyses show that our model establishes a new performance frontier, demonstrating robust zero-shot generalization to unseen tasks while simultaneously tracking highly dynamic and complex motions.

  5. KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks

    Test-time scaling is a powerful approach to obtain better reasoning in large language models, but it becomes memory-bottlenecked during long-horizon decoding, as the KV-cache grows. KV-cache quantization can help improve this, but current methods are evaluated under prefill-like settings and errors behave differently under autoregressive decoding. We show that in the latter regime, quantization errors accumulate across timesteps, driven primarily by incorrect token scales. We introduce KVarN, a calibration-free KV-cache quantizer that applies a Hadamard rotation followed by a dual-scaling variance normalization across both axes of the K and V matrices. We find that this combination fixes outlying token-scale errors and substantially reduces error accumulation over existing baselines. KVarN establishes a new state-of-the-art for KV-cache quantization on generative benchmarks, including MATH500, AIME24 and HumanEval, at 2-bit precision. A vLLM implementation of the KVarN method is available at https://github.com/huawei-csl/KVarN

  6. A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

    Reinforcement learning (RL) post-training improves large language models (LLMs) on individual domains such as mathematical reasoning, code generation, question answering, and creative writing (CW), but training on one domain often degrades performance on others. Existing explanations based on catastrophic forgetting or global gradient conflict are incomplete: substantial interference can occur even when full-model gradients are nearly orthogonal. We show that single-domain RL produces sparse, small-magnitude parameter edits with weak overlap among top-changed neurons, while different domains still share substantial active computation routes on which update directions determine whether they act synergistically or conflict. Guided by this observation, we prove under a local perturbation model of multi-domain RL that later-domain training harms an earlier domain mainly through a second-order damage term, which under the observed sparse route structure concentrates in a low-dimensional shared conflict subspace. Moreover, a short domain refresh contracts the harmful component on this subspace, enabling selective recovery with limited collateral damage. Consistent with the theory, a brief Re-Math refresh after Code → Math → QA → CW recovers Math from 57.66 to 66.04 while largely preserving performance on the other domains, yielding the best average score of 66.39. Beyond refresh, a training-free rollback on a sparse proxy conflict coordinate set for the Math-QA pair partially restores Math, providing direct proxy-level evidence for localized damage. These results provide a localized mechanistic account of interference and recovery in multi-domain RL.

  7. World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning

    World models and multimodal large language models (MLLMs) provide complementary capabilities for predicting future outcomes from static visual observations. World models can generate concrete visual rollouts of possible futures, while MLLMs can reason abstractly over questions, goals, and rules. However, generated rollouts are stochastic and may be visually plausible but task-incorrect, making it necessary to determine when visual simulation is useful, whether a rollout is credible, and how it should influence the final answer. We formulate this problem as controlled concrete reasoning, where a model learns to invoke, verify, and integrate visual future simulation alongside abstract reasoning. To study this setting, we construct two human-verified benchmarks, VRQABench for controllable spatial lookahead and OpenWorldQA for open-domain physical prediction, and propose Privileged-Future On-Policy Self-Distillation (PF-OPSD). During training, PF-OPSD uses ground-truth future videos and answers only as teacher-side privileged context to evaluate on-policy concrete-reasoning trajectories, while the deployable student never observes true futures at test time. Experimental results show that PF-OPSD outperforms baseline by 10.6% and 10.9% on VRQABench and OpenWorldQA, respectively, while increasing robustness to noisy or conflicting rollouts. Our code and dataset are available at https://github.com/yczhou001/PF-OPSD.

  8. MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

    Mid-training has become an important stage in modern LLM development, using large-scale curated mixtures to strengthen capabilities before final post-training. Its data selection problem is distinct: the data are optimized under a pretraining-style objective at near-pretraining scale, but are curated toward downstream capabilities and drawn from heterogeneous sources with different formats and training roles. As a result, effective selection requires both scalability and source-adaptive semantic criteria. Existing model-based methods scale well, but provide only implicit quality signals. Semantic selection methods offer stronger judgments, but usually assume fixed rubrics or standardized data formats. To address this mismatch, we propose MIRA, a source-aware filtering framework based on self-anchored rubric discovery. The key idea is to make rubric construction part of data selection: MIRA first discovers what should be evaluated for each source group, then distills those judgments into scalable student scorers for full-corpus filtering. On code-oriented mid-training with 21 sources and 5 source groups, MIRA outperforms selection baselines across nine code benchmarks and matches the full-corpus run while using only half the tokens.

  9. AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

    Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or short-form clinical question answering. However, existing medical agent benchmarks primarily evaluate final outputs, providing limited visibility into agent behavior within the research process. To address this gap, we present AutoMedBench, a workflow-aware benchmark for autonomous medical-AI research across diverse medical imaging and multimodal inference tasks, organizing agent execution into a unified five-stage workflow (S1-S5): Plan, Setup, Validate, Inference, and Submit. It comprises long-horizon tasks with each run averaging 33 agent turns, spanning five research tracks: segmentation, image enhancement, visual question answering (VQA), report generation, and lesion detection. Each task is evaluated under two difficulty tiers, Lite and Standard, which use the same data and metrics but differ in the amount of task-brief scaffolding, and each run is scored using both final task performance and S1-S5 stage scores, enabling stage-level analysis from the initial task brief to the final submitted artifact. Across thousands of recorded runs, stage-level scoring reveals that Validate is the weakest workflow stage on average, whereas Setup is the strongest, suggesting that current agents are better at making pipelines executable than at verifying their reliability. Post-run error analysis further shows that verification and submission failures dominate tagged errors, accounting for 37.7% and 38.1% of fired codes respectively, whereas task-understanding errors are rare at 0.9%, and runs with one fired error code have a 48% lower overall score than runs with no error code on average.

  10. TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL

    Reinforcement learning (RL) for visual reasoning needs scalable, verifiable, and controllable training signals. Existing visual RL post-training trains on static curated datasets, with fixed image-question-answer samples bounded by their collection budget. In this work, we introduce TRON (Targeted, Rule-verifiable Online eNvironments), an online environment substrate: a training rollout is generated on demand by a controllable generator-verifier program that samples a fresh latent visual state, renders an image, asks a question, and exactly verifies the answer. A single run can therefore draw an unbounded stream of fresh instances at the difficulty level required by the current curriculum. The current TRON suite contains 520 environments organized into five ability buckets (spatial, mathematical, diagram, pattern/logic, and counting); the same substrate supports both a single full model trained on all buckets and per-bucket ability-specialist models, with no additional data collection. We also introduce a substrate analysis covering generation reliability, instance and level diversity, cross-environment near-duplicates, and base-model pass rate by difficulty level. RL post-training with TRON consistently improves performance on ten external multimodal reasoning benchmarks across Qwen3-VL-4B, Qwen2.5-VL-7B, and MiMo-VL-7B-SFT.

  11. Benchmarking Visual State Tracking in Multimodal Video Understanding

    Understanding a video requires more than recognizing isolated moments, as humans continuously track entities, states, and events over time. This capacity for visual state tracking is fundamental to video understanding, yet remains underexplored in current evaluations of Multimodal Large Language Models (MLLMs). We introduce Visual STAte Tracking benchmark (VSTAT), a video-based benchmark designed to diagnose visual state tracking in MLLMs. VSTAT consists of 834 clips drawn from both synthetic and real-world videos, paired with 1,500 questions that cannot be answered from any single frame or short segment, requiring continuous perception and integration of events across the entire video stream. Despite their strong performance on existing video benchmarks, we find that state-of-the-art MLLMs perform far below humans and only modestly above answer-prior baselines. To analyze this gap, we compare MLLMs' thinking traces with the underlying video stream to understand why and when MLLMs fail on VSTAT. We find that MLLMs reason and track correctly in text, but fail at visually perceiving the events they need to track. Finally, our preliminary evaluation suggests that recent agentic approaches, including MLLM-based video agents and coding agents, do not readily resolve these failures, still falling short on VSTAT.

  12. Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

    The past few decades have witnessed significant advances in the design of machine learning algorithms, from early studies on task-specific shallow models to more general deep Large Language Models (LLMs). Despite showing promising results in tasks that require instant prediction or in-context learning, existing models lack the ability to continually learn and effectively transfer their temporal in-context knowledge to their long-term parameters. Inspired by the human learning process, we introduce a "Sleep" paradigm that allows the models to continually learn, distill their short-term fragile memories into stable long-term knowledge with replay, and recursively improve themselves through a "Dreaming" process. In more detail, sleep consists of two stages: (1) Memory Consolidation: an upward distillation process, called Knowledge Seeding, where the memories of a smaller-self are distilled into a larger network to provide more capacity while preserving the knowledge. As a proof of concept, we present a new Generalized Distillation process for Knowledge Seeding (i.e., the combination of on-policy distillation with Reinforcement Learning (RL)-based imitation learning); (2) Dreaming: a self-improvement phase, where the model uses RL to generate a curriculum of synthetic data to rehearse new knowledge and refine existing capabilities without human supervision. Our experiments on long-horizon, continual learning, knowledge incorporation, and few-shot generalization tasks support the importance of the sleep stage.

  13. NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation

    As autonomous vehicle capabilities advance, the safe evaluation of driving policies in long-tail scenarios remains a critical bottleneck. In closed-loop simulation, the driving policy model actively interacts with the environment, where its actions dynamically update the simulator state and directly influence the next set of generated sensor observations. While recent reconstruction-based neural simulators offer photorealism, they are fundamentally constrained by their initial captured data and struggle to generalize to highly dynamic or novel scenes. To overcome these limitations, we introduce OmniDreams, a foundation generative world model mid- and post-trained from the Cosmos diffusion model to autoregressively generate action-conditioned videos in real time. By leveraging the rich visual priors of Cosmos and mid- and post-training on 21k hours of driving scenarios, OmniDreams synthesizes complex, unobserved phenomena that are hard for traditional simulators to capture, such as extreme weather and unpredictable dynamic agent behaviors. Crucially, it autoregressively conditions its photorealistic sensor generation on past frames, the current simulator state, and immediate driving actions. Deployed in a closed-loop system with the Alpamayo 1 policy model and AlpaSim orchestrator, OmniDreams acts as a highly responsive, reactive environment, providing a scalable and comprehensive solution for training and evaluating next-generation autonomous driving policies. We additionally show preliminary results indicating that a world-action model (WAM) post-trained from OmniDreams achieves strong performance on the Physical AI Autonomous Vehicles NuRec dataset, surpassing the VLA-based Alpamayo 1.5 research policy model while using only 1/5 the total parameters. These results highlight the potential for a real-time world model like OmniDreams to also serve as a backbone for policy architectures.

  14. Bootstrap Your Generator: Unpaired Visual Editing with Flow Matching

    Modern generative models possess a deep understanding of visual content, yet training them for image editing typically requires massive datasets of paired examples. This limits scalability, especially for video editing where collecting paired data is prohibitively expensive. We propose Bootstrap Your Generator (ByG), a general framework for unpaired training of flow matching editing models. It leverages the base model's knowledge without any external signal. Our approach pairs instruction-following cues extracted from the frozen model with cycle-consistency for structure preservation. To make this tractable, we propose to route gradients from downstream losses over clean predictions to noisy training states. We demonstrate state-of-the-art results on challenging data-scarce image and video editing scenarios. Extensive evaluations and user studies show that our method effectively generalizes to unseen domains and outperforms supervised baselines trained on millions of samples. Analysis reveals that our gradient routing bridges the train-inference gap, and extracting semantic cues from a base model provides a robust training signal that obviates the need for external reward models.

  15. Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation

    We propose Decoupled Residual Denoising Diffusion models (DRDD) for unified and data-efficient image-to-image (I2I) translation. While diffusion models have advanced I2I translation in terms of quality and diversity, we uncover a previously under-explored property in diffusion models. Crucially, beyond its conventional role of manifold lifting (i.e., moving data off low-dimensional manifolds), injecting Gaussian noise facilitates domain harmonization by implicitly aligning feature distributions across domains, a property particularly advantageous for unified I2I translation. However, existing diffusion models prematurely erode this harmonization effect, as noise and residuals are simultaneously removed in a single coupled diffusion process. To address this, DRDD decouples the diffusion process into two sequential and independent diffusion stages: (1) a stochastic noise diffusion for domain harmonization and manifold lifting, and (2) a deterministic residual diffusion that learns the core semantic mapping entirely within the fixed-noise domain. This decoupling preserves harmonization and manifold lifting effects throughout the transformation, substantially simplifying the learning of unified mappings across diverse tasks and domains. Notably, the noise diffusion stage is trained exclusively on abundant, unpaired target-domain images, greatly improving data efficiency. Comprehensive theoretical and empirical analysis demonstrates that DRDD is broadly compatible with mainstream diffusion models and consistently delivers robust, unified I2I translation, even under limited paired data. Our code is available at https://github.com/HKU-HealthAI/DRDD.
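
As a rough illustration of the K1 reverse-KL estimator mentioned in item 3 above (Trust Region On-Policy Distillation): on tokens sampled from the student, the per-token K1 estimate of KL(student || teacher) is the student log-probability minus the teacher log-probability. The Python sketch below is not the paper's implementation. The function names, the `max_gap` threshold, and the masking rule are hypothetical stand-ins for "train only where teacher supervision is reliable", and the probabilities are made-up numbers.

```python
import math

def k1_reverse_kl(student_logps, teacher_logps):
    """Per-token K1 estimates of KL(student || teacher).

    Assumes every token was sampled from the student, so
    log p_student(x) - log p_teacher(x) is an unbiased single-sample estimate.
    """
    return [s - t for s, t in zip(student_logps, teacher_logps)]

def trust_region_mask(k1_values, max_gap):
    """Hypothetical reliability rule (an assumption, not from the abstract):
    keep a token only if student and teacher log-probs differ by <= max_gap."""
    return [abs(k) <= max_gap for k in k1_values]

def masked_mean(values, mask):
    """Average the kept values; return 0.0 if every token is masked out."""
    kept = [v for v, keep in zip(values, mask) if keep]
    return sum(kept) / len(kept) if kept else 0.0

# Made-up probabilities for three student-sampled tokens.
student_logps = [math.log(0.5), math.log(0.25), math.log(0.9)]
teacher_logps = [math.log(0.5), math.log(0.5), math.log(0.001)]

k1 = k1_reverse_kl(student_logps, teacher_logps)
mask = trust_region_mask(k1, max_gap=2.0)
loss = masked_mean(k1, mask)
print(k1, mask, loss)
```

In this toy case the third token, which the teacher considers almost impossible, produces a very large K1 value and is masked out, so it cannot dominate the averaged signal.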

Techmeme(15)

  1. CrowdStrike reports Q1 revenue up 26% YoY to $1.39B, vs. $1.36B est., and forecasts Q2 revenue of about $1.44B, vs. $1.43B est.; CRWD drops 9%+ after hours (Samantha Subin/CNBC)

    CrowdStrike narrowly beat Wall Street's fiscal first-quarter estimates after the bell on Wednesday, but shares slid 10% following the report.

  2. Broadcom reports Q2 revenue up 48% YoY to $22.19B, vs. $22.27B est., and forecasts Q3 semiconductor revenue from AI below estimates; AVGO drops 12%+ after hours (Reuters)

    Broadcom (AVGO.O) missed Wall Street expectations for second-quarter revenue on Wednesday, as increased competition …

  3. Filing: SpaceX aims to raise $75B in its IPO, selling 555.6M shares at $135 each, which would value the company at ~$1.77T (Bailey Lipschultz/Bloomberg)

    SpaceX is seeking to raise $75 billion in an initial public offering that would be the biggest of all time, as Elon Musk's rocket, satellite and artificial intelligence …

  4. The US and other Five Eyes nations warn that China is flooding online job platforms with fake profiles and offers targeting government and military personnel (Greg Miller/Washington Post)

    Nations in the Five Eyes intelligence partnership warned that fake profiles and job offers are targeting military officers, spies …

  5. In a report, UK lawmakers call on the government to end a £330M NHS deal with Palantir and disclose more details of a military contract with the company (Bloomberg)

    British members of parliament are calling on the government to end a major deal with Palantir Technologies Inc. and disclose …

  6. Security firm Calif says it used OpenAI's Codex to discover HTTP/2 Bomb, a remote DoS exploit affecting web servers like Nginx, Apache HTTPD, and Microsoft IIS (The Hacker News)

    Cybersecurity researchers have discovered a remote denial-of-service exploit that affects major web servers, including NGINX …

  7. OpenAI diverges from Trump's AI EO in a new policy paper, proposing cyber risk evaluations for advanced AI systems be mandatory and led by CAISI, not the NSA (Brendan Bordelon/Politico)

    OpenAI's new proposal comes as its CEO Sam Altman descends on Washington for a series of Wednesday meetings with White House officials …

  8. Nearly 40% of Alphabet's planned ~$85B in equity offerings for AI will go toward covering tax obligations tied to employee equity awards, amid the AI talent war (Cory Weinberg/The Information)

    Alphabet's plan to sell $80 billion worth of shares, billed as a way to pay for AI infrastructure and compute, is surprising enough …

  9. Meta is alerting Instagram users whose accounts were taken over using Meta AI chatbot; some hackers claim to still be able to exploit Meta AI chatbot (Lorenzo Franceschi-Bicchierai/TechCrunch)

    The widespread hacking campaign that relied on simply asking Meta AI's chatbot to take over a victim's Instagram account appears …

  10. Collate, whose AI tools automate paperwork for life sciences companies, raised $95M led by Redpoint at a ~$1B valuation, bringing its total funding to $125M (Amy Feldman/Forbes)

    In this week's edition of InnovationRx, we look at cancer breakthroughs from the ASCO conference, Clear's move into healthcare, Harrison.ai's U.S. expansion, and more.

  11. Amazon announces new visual search features, including displaying AI-generated images of products within its app based on users' search queries (Sarah Perez/TechCrunch)

    In what may be one of the more questionable uses of AI to date, Amazon announced on Wednesday that it will display AI-generated images …

  12. Leading the Future, the pro-AI super PAC backed by Greg Brockman, appears to be linked to multiple sockpuppet accounts, including a purported anti-AI activist (@themidasproj)

    A Pro-AI Super PAC's Secret Meme Sockpuppets

  13. Terra AI, which develops AI models for mining companies to better map underground resources, raised a $20M Series A led by Khosla Ventures (Katie Fehrenbacher/Axios)

    Mining AI startup Terra AI closed a $20 million Series A led by Khosla Ventures and including the VC arm of mining giant BHP, CEO John Mern tells Axios Pro.

  14. Town, which is developing personalized AI assistants that connect to users' email and calendar, raised a $55M Series A led by a16z (Lily Mae Lazarus/Fortune)

    Lily Mae Lazarus / Fortune : Town, which is developing personalized AI assistants that connect to users' email and calendar, raised a $55M Series A led by a16z —  Jean-Denis Gréze's AI assistant is a silver fox who wears a little satchel, and her name is Ivy.  —  Ivy is Gréze's Townie …

  15. Forage, a payments processor that helps retailers and food-delivery companies accept EBT cards, raised a $40M Series B led by Mouro Capital at a $225M valuation (Yuliya Chernova/Wall Street Journal)

    Yuliya Chernova / Wall Street Journal : Forage, a payments processor that helps retailers and food-delivery companies accept EBT cards, raised a $40M Series B led by Mouro Capital at a $225M valuation —  Forage secures venture funding as it expands food-stamp payment processing despite declining federal enrollment

Solidot(15)

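  1. The genetic trade-off between youth and longevity

    Scientists have found that the gene vgll3 is directly linked to growth, development, and reproductive success early in life, and to accelerated aging and increased cancer risk late in life. The new study provides experimental evidence for the antagonistic pleiotropy hypothesis, which holds that some genes confer advantages early in life but cause harm later on. The researchers worked with the African turquoise killifish, a very short-lived species, and used CRISPR gene editing to modify the gene. Fish with the modified vgll3 gene grew faster and reached sexual maturity earlier, giving them a reproductive advantage in natural environments. The cost was a shorter lifespan and a higher incidence of age-related cancers. The researchers note that nature does not prioritize lifespan; it prioritizes continuation of the lineage. Humans also carry the vgll3 gene, so the study may help improve understanding of human development, aging, and age-related disease.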

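  2. Meta lets employees opt out of tracking for up to 30 minutes at a time

    Meta recently began installing tracking software on the computers of its US employees, capturing mouse movements, clicks, and keystrokes to train AI models. The move is part of the company's larger plan to build AI agents that can carry out work tasks automatically. The tool, called the Model Capability Initiative (MCI), provoked strong opposition inside the company, and some employees launched a petition that has gathered more than 1,500 signatures. One anonymous employee called the company's behavior "very dystopian". According to an internal memo sent to staff on Tuesday, Meta has backed off slightly: employees may opt out of tracking "for up to 30 minutes at a time", and they may also apply to leave the tracking program permanently.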

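  3. Mathematicians warn of AI's threat to the mathematics profession

    Mathematicians have jointly published the Leiden Declaration, a statement backed by the International Mathematical Union. It warns that AI is undermining mathematics by producing large numbers of plausible-looking but unreliable or outright wrong proofs, by weakening attribution, by changing incentives, and by giving technology companies outsized influence over research priorities. Hundreds of people have signed the declaration, which warns that AI development threatens the intrinsic value of mathematical research. First, it notes that telling AI-generated proofs apart from correct mathematical proofs is very hard, which puts growing pressure on reviewers; generating AI papers is cheap while verifying them is expensive, and errors compound when later research builds on false premises. Second, AI is trained on existing mathematical papers but often fails to cite them properly in its output, and copyright infringement is widespread in model training. Third, the incentives around AI run counter to the values of the mathematics profession. The declaration urges mathematicians to treat AI as a tool, not as a substitute for human responsibility. Individual mathematicians should disclose their use of AI and take responsibility for the correctness of their work. The declaration also warns that mathematics can be used for war, oppression, mass surveillance, and the undermining of democracy, so mathematicians should weigh the ethics of collaborating with the technology industry carefully.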

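  4. Microsoft's quantum chip has foundational problems

    Microsoft has announced its second-generation quantum chip, Majorana 2. Experts, however, argue that Microsoft's quantum chips lack a solid research foundation and cannot work. Microsoft announced its first-generation quantum computing chip, Majorana 1, in early 2025. It uses what the company calls a topoconductor to observe and control Majorana particles, with the aim of producing more reliable and scalable qubits. The first-generation topoconductor used an indium arsenide semiconductor and an aluminum superconductor; for the second generation Microsoft switched to a lead superconductor and claims that qubit lifetime has increased from 20 seconds to 1 minute. Scientists are strongly skeptical of these claims. The latest preprint has not yet passed peer review, and physicist Henry Legg argues that the data in it come from random artifacts. Microsoft's previous preprint has still not passed peer review and has quite possibly been rejected by top journals.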

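  5. The 4,000-year-old city of Mohenjo-daro grew more equal as its economy developed

    Researchers at the University of York analyzed housing patterns in the ancient city of Mohenjo-daro. Located in present-day Pakistan, the city flourished between 2600 and 1900 BCE and was one of the largest cities of the Indus Valley Civilization. The researchers found that the gap between rich and poor in Mohenjo-daro was smaller than in other ancient cities, and that it narrowed further over time. The city differs markedly from the ancient cities of other civilizations: it had no palaces, no giant statues of rulers, and no lavish tombs, but it did have orderly streets and an advanced drainage system, with public infrastructure that reached the whole city instead of serving only the elite. Ancient Egypt built pyramids for its rulers and Bronze Age Greece built palaces for its elite, while Mohenjo-daro invested in public services for its entire population. Mohenjo-daro challenges the long-held view that economic growth leads to rising inequality: as the city developed and productivity rose, resources were also distributed more fairly.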

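  6. Qualcomm CEO says resisting AI is futile

    In a keynote at Computex in Taipei, Qualcomm CEO Cristiano Amon declared that resistance is futile: AI agents will become invisible and unavoidable, and will be able to follow users across devices. He said agents will fundamentally change the relationship between people and technology. Today the phone is the center of digital life and everything revolves around it; in the near future agents will take the phone's place, and the phone, like a wearable, will become an extension of the agent. "The agent is not confined to a device; it moves with the user. Whatever device you use, it is with you," he explained. "Once you understand this shift, you can see how the entire mobile industry will be transformed."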

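  7. Smartphone shipments expected to fall 13.9% in 2026

    According to Counterpoint Research's latest smartphone market outlook tracker, the global smartphone market is entering one of its more pronounced corrections in recent years. Full-year 2026 shipments are expected to fall 13.9% year over year to about 1.08 billion units, triggered by a memory supply crunch that has intensified in recent weeks, together with the Iran conflict. The data show that mobile LPDDR4/5 prices in Q2 2026 are expected to have risen roughly twofold compared with Q4 2025. Given the heavy capital investment and long lead times in semiconductor manufacturing, the supply shortage is expected to last until the second half of 2027. Low-end devices are hit hardest. As fabs shift capacity toward AI-driven HBM and server DRAM, LPDDR4 supply is expected to shrink by more than 40% in 2026, steadily eroding the cost-effectiveness of entry-level products. Global smartphone wholesale prices rose 14% year over year in Q1 2026, and the upward trend is expected to continue as earlier inventory is gradually worked through. Some segments below $150 risk being phased out of the market.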

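  8. Male bowerbirds woo females with attractive man-made decorations

    Male bowerbirds are known for their intricate courtship rituals. They build tunnel-like structures from twigs and decorate them with bright objects collected from their surroundings. When a female comes to visit, the male tosses his shiniest objects toward her and displays his showy plumage in the hope of winning her over. According to a new paper in the journal Royal Society Open Science, urbanization and the growing abundance of bright man-made objects that comes with it have had a marked effect on the courtship behavior of male bowerbirds in Australia; the researchers even found handcuffs among the decorations. Observations of urban and rural bowerbirds found that urban birds were more than ten times as likely as rural birds to use man-made decorations, while rural birds relied more on natural objects. Urban bowerbirds had nearly five times as many decorations as rural ones, averaging 90 items against 20 for rural birds. One urban male had collected 300 decorations. Whether they lived in the city or the countryside, bowerbirds showed a preference for man-made decorations. The researchers say human activity is changing the natural world in unexpected ways.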

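  9. Trump signs executive order requiring AI companies to let the government evaluate new models first

    US President Trump signed an executive order on Tuesday requiring AI companies to let the government evaluate the capabilities of their new models in advance. The order also asks AI companies to take part, on a voluntary basis, in a benchmarking process that assesses a model's "advanced cyber capabilities" and determines whether it should be treated as a "protected frontier model". AI companies are to give the government access up to 30 days before a new model's official release.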

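  10. Vim Classic 8.3 released

    In December 2025 the Vim project announced its generative AI policy: AI-written code is acceptable as long as the use of a large language model is disclosed and the code style matches the existing code. Several senior contributors objected to accepting AI code and did not want to see it proliferate, so they created forks free of AI code. One of these forks is Drew DeVault's Vim Classic. With long-term maintenance in mind, Vim Classic is based on Vim 8.2.0148 and not on the newer Vim 9 series. DeVault has just released Vim Classic 8.3, which mainly backports a selection of bug fixes and patches from upstream. Because of limited resources, some Vim plugins are not compatible with Vim Classic.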

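  11. European Parliament switches its default search engine from Google to Qwant

    According to an internal email, the default search engine on the European Parliament's internal computers will switch from Google to the French search engine Qwant starting June 4, a move motivated by digital sovereignty and privacy concerns. Qwant is described as a privacy-focused European search engine that does not track users or collect personal data. Founded in 2013, Qwant emphasizes privacy protection and offers users an alternative to Google. Searches made from the address bar in Firefox and Edge will be routed to Qwant automatically, but Members of the European Parliament remain free to use other search engines or change their default setting. The European Commission is strengthening technological sovereignty, reducing dependence on foreign technology suppliers, and supporting homegrown European technology.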

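  12. The soil that refuses to stop breathing

    For 15 years French biochemist Sébastien Fontaine has been trying to kill soil, in order to find out how much carbon soil releases when it contains no life at all. His team sealed soil in jars and sterilized it with gamma radiation. Then they waited for the soil's carbon dioxide emissions, a sign of ongoing microbial respiration, to fall. They waited weeks, then months. Under the microscope the irradiated soil showed no sign of life, yet it kept releasing carbon dioxide. The soil refused to stop breathing. Fontaine's lab repeated the experiment and got the same result, and the researchers began looking for the source of respiration in lifeless soil. Fontaine's team now reports that its soil samples kept consuming oxygen and releasing carbon dioxide for six years. They propose that the metabolic processes that power life can also take place outside living cells. Their experiments suggest that such processes can operate in soil even without the biological proteins that normally organize them. If the hypothesis is correct, some biochemical reactions, such as those that release energy from carbon-rich sugar molecules, may not be exclusive to living organisms. Such reactions may even have existed before life appeared on Earth.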

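  13. Blue octopus is an entirely new species

    In 2015, scientists on a deep-sea expedition in the Galápagos Islands were reviewing footage from a remotely operated vehicle when they spotted a tiny, entirely blue octopus at a depth of about 1,773 meters. They collected the octopus for further analysis. Researchers have now concluded that the adorable little creature, small enough to fit in the palm of a hand, belongs to an entirely new species. The study was published in the journal Zootaxa. The small octopus has been kept in storage. Because it is one of a kind and collecting another is extremely unlikely, the scientists were unwilling to dissect it for a full species identification. The team opted for a mini-CT scan instead, which showed that the animal has very short arms with few suckers, no ink sac, smooth skin, and a large rachidian tooth. They named the species Microeledone galapagensis.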

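  14. Iron-rich immune cells help homing pigeons navigate

    Migratory birds, sea turtles, and other animals appear able to sense the Earth's magnetic field and use it to navigate. According to a study published in the journal Science, iron-rich immune cells in the livers of homing pigeons may give the birds their magnetic compass. Analysis of thin sections of pigeon tissue found that liver macrophages are rich in ferritin, while ferritin is scarce in the spleen and entirely absent from the beak and brain. Further observation under the electron microscope showed that the macrophages sit right next to neurons, all of which connect to the central nervous system. The researchers designed an experiment to test whether the iron-rich macrophages can guide pigeons like a magnetic compass, using a drug called clodronate liposomes to suppress macrophage activity. The team trained 34 homing pigeons. During the day pigeons orient by the position of the sun; on overcast days or under complete cloud cover they rely on magnetoreception. The team injected 18 pigeons with clodronate and released them one by one 24 hours later, when cloud fully blocked the sun. The pigeons all carried GPS transmitters, so the team could track their flight paths in real time. All 18 treated pigeons got lost and did not return until the sky cleared. None of the 16 control pigeons got lost. The researchers say that if the ferritin-assisted navigation mechanism is confirmed, it may be universal, applying to animals ranging from bees and bats to whales and sharks.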

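  15. NASA's low-boom supersonic aircraft X-59 to make its first attempt at breaking the sound barrier

    NASA has announced that the X-59 QueSST low-boom supersonic aircraft, designed by Lockheed Martin's Skunk Works, will make its first attempt to break the sound barrier this month. The X-59 is designed to exceed the speed of sound without the sonic boom that supersonic aircraft normally produce; it generates a quieter "thump" instead, similar to hearing a car door close from indoors. It has no forward-facing window and gives the pilot an augmented-reality external vision system of the view ahead through cameras and a display. If the X-59 succeeds, it could have a revolutionary impact on supersonic flight and the aviation industry by leading to the lifting of current restrictions on supersonic flight. The X-59 completed its first flight in October 2025 and has made 14 test flights since March 2026. This month's supersonic flight is planned to reach Mach 1.4 at an altitude of 16.7 kilometers.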
