OrangeBot.AI Digest — 2026-06-08
90 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Surveillance Is Not Safety: A statement on the UK's latest threat to privacy [pdf] (signal.org)
- Apple reveals new AI architecture built around Google Gemini models (www.macrumors.com)
- Siri AI (www.apple.com)
- xAI is looking more like a datacentre REIT than a frontier lab (martinalderson.com)
- Stop the Apple Music app from launching (lowtechguys.com)
- Apple WWDC 2026 (www.apple.com)
- AI is slowing down (www.wheresyoured.at)
- A Farmer Donated Land to Turn into a Park. The City Is Building a Data Center (www.404media.co)
- MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second (mimo.xiaomi.com)
- Show HN: Performative-UI – A react component library of design tropes (vorpus.github.io)
- Anti-social: It's fads, not friends, which now dominate social media feeds (www.bbc.com)
- How much of Thermo Fisher's antibody data has been manipulated? (reeserichardson.blog)
- The Cypherpunk Library (www.cypherpunkbooks.com)
- Dopamine Fracking (igerman.cc)
- 1k Data Breaches Later, the Disclosure Lag Is Worse (www.troyhunt.com)
GitHub Trending(15)
- mvanhorn / last30days-skill
- RyanCodrai / turbovec
- google / skills
- refactoringhq / tolaria
- Panniantong / Agent-Reach
- danielmiessler / Personal_AI_Infrastructure
- santifer / career-ops
- phuryn / pm-skills
- openai / plugins
- Andyyyy64 / whichllm
- MemPalace / mempalace
- roboflow / supervision
- CopilotKit / CopilotKit
- TapXWorld / ChinaTextbook
- luongnv89 / claude-howto
Product Hunt(15)
- Claude Artifact Player
Run your Claude AI artifacts natively. No browser. No cloud.
- The Virtual OS Museum
Relive vintage operating systems right on your desktop
- Vaani
Lip-synced AI dubbing for creators, brands and studios
- Browse.sh
Give your agents muscle memory for automating the web
- Tamadoggo
A living journal for your pet's life, with AI insights
- Sigma File Manager
Free, open-source, cross-platform, modern file manager app
- NTSC-RS
Open-source video emulation of analog TV and VHS artifacts
- Supaste
Clipboard Manager for macOS
- Honen
Automated teaching + learning infrastructure for any company
- Wave
Turn your voice into text — local or cloud, your choice
- Smmall Cloud for iOS
Simple file sharing on your iPad or iPhone
- Job Postings API
View, monitor, and analyze 1.8M+ US jobs
- CabinLink
Flight map from cabin Wi-Fi
- Dreambeans by Google Labs
Daily AI stories personalised from your Google apps
- Fox Issue Tracker 4
Track, plan, and release.
Hugging Face(15)
- Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings
Large language models exhibit impressive zero-shot capabilities across a wide range of downstream tasks. However, they struggle to function as off-the-shelf embedding models, leading to suboptimal performance on massive text embedding benchmarks. In this paper, we identify a potential cause underlying this deficiency. Our motivation stems from an unexpected observation: text embeddings tend to align with frequent but uninformative tokens when projected onto the vocabulary space. We argue that this excessive expression of high-frequency tokens suppresses the model's ability to capture nuanced semantics. To address this, we introduce EmbedFilter, a simple linear transformation designed to refine text embeddings derived from LLMs directly. Specifically, we uncover that the unembedding matrix within LLMs encodes a latent space that is actively writing these frequent tokens into embedding space. By filtering out this subspace, EmbedFilter suppresses the influence of high-frequency tokens, thereby enhancing semantic representations. As a compelling byproduct, this enables an inherent dimensionality reduction, lowering index storage and speeding up retrieval while fully preserving the refined embedding quality. Our experiments across multiple LLM backbones demonstrate that LLMs equipped with EmbedFilter achieve superior zero-shot downstream performance even with significantly reduced embedding dimensions. We hope our findings provide deeper insights into the mechanisms of LLM-based representations and inspire more principled designs to improve text embedding training. Our code is available at https://github.com/CentreChen/EmbFilter.
- GENEB: Why Genomic Models Are Hard to Compare
Progress in genomic foundation models is difficult to assess due to fragmented benchmarks, incompatible evaluation protocols, and task-specific reporting. As a result, claims of superiority or generality across models are often not directly comparable. We introduce GENEB, a large-scale diagnostic benchmark that evaluates frozen representations from 40 genomic foundation models across 100 tasks spanning 13 functional categories under a unified probing-based protocol, including few-shot regimes. GENEB enables controlled comparison across model scale, architecture, tokenization, and pretraining data while explicitly exposing task-level trade-offs. Our analysis shows that aggregate leaderboards are unstable: model rankings vary sharply across task categories, scale provides only modest and inconsistent gains, and architectural and pretraining alignment frequently outweigh parameter count. These results highlight limitations of current evaluation practices and position GENEB as a reference framework for principled comparison and category-aware model selection in genomic machine learning.
- SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations
Evaluating LLM mediators remains challenging, as mediation unfolds as a real-time trajectory shaped by disputants' shifting emotions, intentions, and context. Existing testbeds rely on a few expert-authored domains, vary mainly strategic posture, and score every turn against every topic, introducing off-topic noise. We introduce SoCRATES, a benchmark for evaluating proactive LLM mediators in realistic, multi-domain testbeds. It constructs scenarios from real conflicts through an agentic pipeline across eight domains, probes five socio-cognitive adaptation axes (strategic posture, party composition, history length, emotional reactivity, and cultural identity), and scores each topic only on the turns that advance it via a topic-localized evaluator. The evaluator reaches 0.82 alignment with human experts, more than doubling a per-turn baseline. Benchmarking eight frontier LLMs, we find that even the strongest mediator closes only about a third of the unmediated consensus gap under diverse and realistic testbeds, with performance varying sharply by socio-cognitive axis, highlighting that progress lies in social adaptation to diverse conditions.
- MMAE: A Massive Multitask Audio Editing Benchmark
We introduce MMAE, a Massive Multitask Audio Editing benchmark, serving as the first comprehensive evaluation testbed designed for general-purpose instruction-based audio editing. Spurred by the shift toward intelligent creation, interactive editing has rapidly expanded from visual domains, pioneered by models like Nano-banana 2 for images and Gemini-Omni for video, into audio. However, the current evaluation infrastructure lags severely, remaining highly fragmented and restricted to specific subdomains or basic operations. Unlike existing benchmarks that are limited in scope, MMAE extends to a broad spectrum of real-world scenarios, encompassing 7 distinct audio modalities, including sound, speech, music, and their mixtures. Furthermore, we establish a comprehensive taxonomy spanning 6 levels of task complexity, from basic modifications to multi-hop reasoning and multi-round editing, 2 levels of granularity, and 8 distinct operation types. Meticulously curated through human-agent collaboration, MMAE comprises 2,000 high-fidelity samples paired with a pioneering rubric-based evaluation framework. By decomposing free-form tasks into 17,741 verifiable criteria, this robust rubric-based paradigm enables a precise, multi-dimensional assessment of both instruction following and context consistency. Our extensive evaluation of leading models reveals that current systems remain far from achieving reliable edits. Strikingly, the Exact Match Rate (EMR) consistently falls below 5% and plummets to an absolute 0% in complex, mixed-modality tasks, exposing critical bottlenecks in precise execution and structural robustness. We hope MMAE will serve as a catalyst for future advances in the intelligent creation community, providing a clear diagnostic roadmap and establishing a standardized, long-lasting evaluation paradigm for next-generation audio editing systems.
- AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization
Despite being a pivotal frontier, interactive world modeling remains underexplored in terms of the versatile controllability required by practical scenarios. To bridge this gap, we present AnchorWorld, a framework that advances egocentric simulation through enhanced interaction integrity and a flexible mechanism for world customization. First, we utilize 3D human motion as the primary interaction modality. To complement the out-of-view or truncated body parts in egocentric views, we introduce an auxiliary training supervision that incorporates exogenous viewpoints decoupled from the agent's first-person sensorium. It allows the model to observe the agent's full-body positioning relative to the environment, facilitating a more robust spatial grounding of human-world interactions. Furthermore, we propose a simple yet effective mechanism for customizing self-evolving worlds. This is achieved by defining anchor views within a unified world coordinate system, coupled with textual descriptions dictating the dynamic evolution of local scenes. Experimental results show that AnchorWorld significantly outperforms state-of-the-art baselines, while ablation studies validate the effectiveness of our key designs. Notably, our customization scheme exhibits promising spatio-temporal geometric consistency and adheres strictly to the prescribed evolutionary dynamics.
- Direct 3D-Aware Object Insertion via Decomposed Visual Proxies
Object insertion aims to seamlessly composite a reference object into a specified region of a background image. Recent diffusion-based methods achieve high visual quality but formulate insertion as a simple 2D inpainting task, providing no explicit control over the object's 3D pose and limiting their practical applicability. We propose DIRECT (Decomposed Injection for Reference Composition and Target-integration), a novel framework that integrates interactive pose manipulation with high-fidelity 2D image synthesis to enable pose-controllable object insertion. Our method decomposes the insertion conditions into three complementary components: appearance guidance capturing visual details from the reference object, geometry guidance derived from the user-adjusted 3D proxy, and context guidance from the target background. By injecting them through separate pathways, DIRECT avoids feature entanglement and simultaneously preserves reference appearance, follows the user-specified pose, and adapts the object to the target scene. We also introduce an automated data construction pipeline to improve the diversity and quality of training data. Experiments show that DIRECT outperforms previous methods in both geometric controllability and visual quality.
- Robots Need More than VLA and World Models
Generalist robot intelligence is often framed as a policy-scaling problem: collect more robot demonstrations, train larger Vision-Language-Action (VLA) models, and expect broader generalisation. In this position paper, we argue that this framing is incomplete. The central bottleneck is not only policy learning, but the absence of mechanisms that convert the world's abundant unstructured behavioural data into grounded robot supervision. Human motion, internet video, simulation rollouts, and interactive demonstrations contain rich information about tasks, goals, contacts, failures, and physical constraints, yet most of this information is not directly usable by robot policies because it lacks embodiment-specific action labels, task semantics, and reward structure. We identify four missing components for the next generation of robotics: data interfaces for autolabelling unstructured behaviour, embodiment interfaces for retargeting human motion to robot actions, world-model interfaces for physics-grounded 3D reasoning, and reward interfaces for inferring task progress and success from video and language. We survey recent progress in robot foundation models, cross-embodiment datasets, learning from video, world models, and reward modelling, and propose a research agenda for building robotics systems that can learn not only from robot demonstrations, but from the broader physical world.
- When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents
Existing benchmarks evaluate Tool-Integrated Reasoning (TIR) in LLMs on idealized "happy paths", largely overlooking real-world tool failures. We introduce ToolMaze, a benchmark for dynamic path discovery and error recovery in TIR agents. To separate systematic replanning from blind trial-and-error, ToolMaze adopts a two-dimensional design: DAG-based topological complexity and a 2×2 taxonomy of tool perturbations (explicit/implicit, transient/permanent). Evaluations show that perturbations degrade performance across nearly all models, with the sharpest drops under implicit semantic failures. Driven by systemic over-trust in corrupted outputs, Perturbation Recovery Rate (PRR) plummets by around 37% in these scenarios, while complex topologies trap agents in futile trial-and-error loops. Crucially, agentic fault-tolerance improves with model scale 3.66× slower than basic task execution, highlighting dynamic replanning as a distinct bottleneck unaddressed by model scaling or prompting. Data and code are available at https://github.com/Zhudongsheng75/ToolMaze.
- SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents
Persistent AI assistants, such as OpenClaw, accumulate large collections of related memories over long-term interactions. As these memories grow, they may reinforce one another, diverge across contexts, or directly conflict, making correct assistance depend on memory relations rather than isolated recall. Existing long-term memory benchmarks rarely probe how agents preserve and utilize such relations during downstream tasks. To address this gap, we introduce SubtleMemory, a benchmark for fine-grained relational memory discrimination in long-running AI agents. SubtleMemory constructs relation-controlled latent semantic artifacts whose variants instantiate complementary, nuanced, or contradictory relations, and embeds them into realistic user-agent histories, requiring agents to recover distributed relational structures during later queries and instructions. The benchmark contains 1,522 evaluation instances over 10 long histories, grounded in 1,090 relation-controlled memory-variant sets and spanning user-related and non-user-related queries. Evaluating six standalone memory systems, two Claw-style agents with native memory modules, and three Claw-style agents with plugin memory modules, we find that current systems remain weak on fine-grained relational memory discrimination. We further introduce diagnostic protocols that reveal distinct capability profiles across memory preservation, retrieval, and downstream reasoning stages.
- OpenSkill: Open-World Self-Evolution for LLM Agents
Self-evolving agents require adaptation after deployment, but existing approaches assume a usable learning loop, such as curated skills, successful trajectories, or verifier signals. Real open-world deployments may provide none of these, offering only a task prompt. In this work, we study open-world self-evolution, where an agent must build both its skills and its own verification signals from scratch, using open-world resources but no target-task supervision. We propose OpenSkill, a framework that bootstraps this loop: it acquires grounded knowledge and verification anchors from documentation, repositories, and the web, synthesizes them into transferable skills, and refines those skills against self-built virtual tasks grounded in the anchors rather than in target answers. The open world thus supplies both the knowledge to be learned and a supervision-independent practice environment, with target-task supervision reserved for final evaluation. Across three benchmarks and two target agents, OpenSkill attains the best automated pass rate while satisfying the no-supervision constraint. Analysis shows its skills transfer across models without model-specific adaptation, and its self-built verifier aligns with ground-truth outcomes despite never accessing them.
- UniSHARP: Universal Sharp Monocular View Synthesis
In this work, we focus on extending SHARP, the popular photorealistic view synthesis method, for universal monocular rendering across a continuum of camera systems, from conventional perspective cameras to wide-field-of-view, fisheye and omnidirectional panoramic settings. To overcome the pinhole-specific assumptions of SHARP, our key idea is to align various images in a unified omnidirectional latent space. Thus, we propose UniSHARP, which performs implicit alignment in both feature and Gaussian spaces. Specifically, Gaussian primitives are arranged along rays and radial distances in a ray-based universal representation, while 2D semantic and 3D spatial features extracted from UniK3D-inspired encoders are jointly decoded to generate the complete Gaussian cloud. To comprehensively evaluate our method, we construct a benchmark covering diverse imaging systems across various scenes. The benchmark is further stratified by field of view (FoV) to enable fine-grained assessment of the universal monocular rendering task. Extensive experiments on the proposed benchmark demonstrate the effectiveness of UniSHARP, outperforming alternative methods by a large margin. The project page can be found at: https://insta360-research-team.github.io/Unisharp-website/
- UnpredictaBench: A Benchmark for Evaluating Distributional Randomness in LLMs
We introduce UnpredictaBench, an evaluation that tests the ability of large language models (LLMs) to capture true underlying distributions. As LLMs are increasingly used as substitutes for other entities (e.g., for humans in economic simulations), the tendency of many models to collapse towards a single plausible answer means a failure to capture the unpredictability of real systems. Recent work on improving output diversity is insufficient for this setting: simulation requires samples that are calibrated to a target distribution, not merely varied outputs. UnpredictaBench isolates a simplified but fundamental version of this problem: sampling outcomes from individual target distributions, including canonical statistical distributions, distributions induced by stochastic programs, and natural-language scenarios that describe random processes. We introduce 448 such problems together with KS@N, a general-purpose evaluation metric that quantifies how well a model's outputs approximate black-box target distributions via the Kolmogorov-Smirnov statistical test. This is the rate at which we fail to reject model samples of size N against ground-truth samples, with larger N indicating greater difficulty. Tested across open and proprietary models, we find a large spread in distributional capabilities. For instance, when models generate samples of size 100 (KS@100, our standard metric), scores range from near 0 to over 20%. No model is able to achieve over 40% at KS@100, showing significant headroom in distributional sampling as a capability. Although adding reasoning can somewhat increase scores, we find no immediate solution for this issue. UnpredictaBench shows that even simple distributional simulation remains challenging, making it a necessary first step toward using LLMs as stand-ins for complex systems.
- Physics in 2-Steps: Locking Motion Priors Before Visual Refinement Erases Them
Image-to-Video diffusion models leverage input images to generate visually stunning content, yet frequently produce motion that violates physical laws. We reveal a surprising finding: a 2-step generation often exhibits better physical consistency than a 50-step output from the same model. Through spectral analysis, we trace this to phase erosion during denoising; the phase degrades significantly (dropping by approximately 18% from step 2 to step 50), whereas the magnitude remains relatively stable. Building on this insight, we propose PhaseLock, a training-free framework that preserves the valid motion priors from few-step inference throughout the denoising trajectory. Rather than relying on full-step inference for physical consistency, PhaseLock extracts a motion prior from just 2 steps and enforces it onto high-fidelity generation via Latent Delta Guidance. Our approach effectively mitigates phase degradation, improving physical consistency by an average of 6.2 points across diverse models while largely maintaining visual fidelity, with negligible overhead (1.06× time, 1.02× memory) and reduced reliance on expensive external guidance methods (approximately 5× time).
- LLM Explainability with Counterfactual Chains and Causal Graphs
Causal graphs provide a high-level language for making mechanisms transparent. Recent work uses Large Language Models (LLMs) to recover causal graphs of external-world processes. Instead, in this paper, we use causal graphs to model LLM inference itself, providing stakeholders with a transparent view of how the model perceives and organizes high-level concepts to produce a prediction. We propose a four-phase method for constructing such graphs. Given a target LLM and a set of textual examples, our method discovers class-discriminative, human-interpretable concepts and maps each input to LLM-perceived concept states. We then introduce an MCMC-inspired counterfactual augmentation procedure that expands the sparse observational data through chains of counterfactuals. This enables stable causal discovery with σ-CG, yielding informative, interpretable graphs. We apply our method to three LLMs across disease diagnosis, sentiment analysis, and LLM-as-a-judge classification tasks. We evaluate the learned graphs for predictive fidelity and structural stability, and the MCMC-inspired augmentation for convergence and downstream utility. Our results show that the discovered causal graphs capture meaningful dependencies consistent with LLMs' reasoning. Together, this paper provides a foundation for concept-level explainability of LLMs.
- LIMMT: Less is More for Motion Tracking
We argue that high-quality motion data can steer tracking policies toward better optimization trajectories early in training. In this work, we introduce LIMMT (Less Is More for Motion Tracking). To our knowledge, this is the first data-centric study for physics-based humanoid motion tracking. Rather than simply removing low-quality and erroneous clips, we define motion data quality through three dimensions: physics feasibility, diversity, and complexity. We show that even training with under 3% of AMASS yields better tracking performance than training with the full dataset. We further conduct data cleaning on the estimated web-sourced mocap data. Extensive experiments and analyses validate the effectiveness of our framework.
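A rough illustration of the KS@N idea from the UnpredictaBench entry above: for each problem, compare N model samples with ground-truth samples using a two-sample Kolmogorov-Smirnov test, then report the share of problems where the test fails to reject. This is only a sketch under stated assumptions, not the paper's implementation: the function names, the asymptotic p-value approximation, the significance level `alpha=0.05`, and the toy data are all hypothetical choices made for illustration.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the gap can only change at observed sample points
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic two-sample p-value from the Kolmogorov distribution series."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:  # the series converges slowly here; the p-value is effectively 1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def ks_at_n(problems, alpha=0.05):
    """Fraction of problems where the KS test fails to reject the model samples.

    `problems` is a list of (model_samples, truth_samples) pairs. `alpha` is an
    assumed significance level, not a value taken from the paper.
    """
    passed = 0
    for model_samples, truth_samples in problems:
        d = ks_statistic(model_samples, truth_samples)
        p = ks_pvalue(d, len(model_samples), len(truth_samples))
        if p > alpha:
            passed += 1
    return passed / len(problems)


# Toy data (hypothetical): an evenly spread "truth" sample and a model that
# has collapsed to always answering 0.5.
truth = [i / 100 for i in range(100)]
collapsed = [0.5] * 100
score = ks_at_n([(truth, truth), (collapsed, truth)])
```

In this toy run the matching sample passes and the collapsed sample is rejected, so half of the problems count toward the score. A production version would more likely call an established statistics library than hand-roll the test.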
Techmeme(15)
- Source: Cursor, which is prepping for an expected acquisition by SpaceX, passed $4B in annualized revenue in the last week, up from $3B in April and $2B in Feb. (Richard Nieva/Forbes)
Elon Musk's rocket and AI company is expected to acquire the buzzy AI coding startup shortly after it goes public.
- OpenAI confidentially files for an IPO, says it has "not decided on timing yet", as "there are things we want to do that are likely easier as a private company" (Ashley Capoot/CNBC)
OpenAI has confidentially filed for an IPO with the Securities and Exchange Commission …
- Google lowers the price of its Google AI Plus plan to $4.99 per month, down from $7.99, and doubles the included storage to 400GB (Abner Li/9to5Google)
Google announced today that its AI Plus subscription is getting a price drop to $4.99 per month and now includes 400 GB of storage.
- Google upgrades NotebookLM, which now runs on Gemini 3.5 and Antigravity, to deliver new agentic capabilities and more advanced reasoning for AI Ultra users (Ivan Mehta/TechCrunch)
Google on Monday announced an update to its NotebookLM research tool, which includes new features and the shift to Gemini 3.5 as the default model.
- Apple's Craig Federighi says some companies "appear to be racing forward" to develop "AI for the sake of AI" without regard for the humans using the technology (Todd Spangler/Variety)
Apple said it has rebuilt the Siri personal assistant from the ground up with artificial intelligence at its core …
- Apple announces new features for the Home app, including AI-generated descriptions of HomeKit Secure Video camera clips and smarter grouping of notifications (Hartley Charlton/MacRumors)
Apple today announced new Apple Intelligence features for the Home app, including AI-generated descriptions of HomeKit Secure Video camera clips …
- watchOS 27 Drops Support for Apple Watch Series 9, Ultra 1, SE 2, and Older (Hartley Charlton/MacRumors)
Apple today confirmed that watchOS 27 will not support the Apple Watch Series 9, Apple Watch Ultra (first generation), or Apple Watch SE (second generation), effectively drawing a line at devices equipped with the S9 or S10 chip.
- Apple announces a new Foundation Models framework for developers, a new Core AI framework, and a set of Xcode enhancements aimed at agentic coding workflows (Hartley Charlton/MacRumors)
Apple today announced a new Foundation Models framework for developers alongside a set of Xcode enhancements aimed at agentic coding workflows.
- Apple says iOS 27 will support iPhone 11, in contrast with 2019 Android flagships like Pixel 4 and Galaxy S10, which got only three years of platform updates (Aamir Siddiqui/Android Authority)
While older Android flagships were put out to pasture long ago, Apple is bringing iOS 27 to the iPhone 11 from 2019.
- Amazon expands its print-on-demand features to AI-generated designs created using Alexa for Shopping for products like T-shirts, water bottles, and hoodies (Mia Sato/The Verge)
The retailer's expansion into near-instant design and printing threatens its own network of third-party sellers as well as print-on-demand competitors.
- Apple announces tvOS 27, with performance enhancements, smart downloads, an updated Podcasts app, and more (Benjamin Mayo/9to5Mac)
Apple today announced tvOS 27, the next major software version for Apple TV, as part of the WWDC keynote which included a bevy of platform-wide improvements.
- Apple announces that the iOS 27 Shortcuts app will feature AI-powered workflow creation, allowing users to build automations via natural language prompts (Sarah Perez/TechCrunch)
Apple has leveraged AI to make its visual-scripting tool, Shortcuts, easier to use in iOS 27. The Shortcuts app was largely built …
- Apple says its most powerful on-device AI model requires an iPhone 17 Pro, iPhone Air, iPad with M4 and later, or Mac with M3 and later, all with 12GB+ of RAM (Ryan Christoffel/9to5Mac)
Today at WWDC Apple unveiled its next generation of Apple Intelligence, including the new Siri AI.
- Apple unveils new editing tools for the Photos app as part of Apple Intelligence, including an upgraded Cleanup tool, and new Extend and Spatial Reframing tools (Hartley Charlton/MacRumors)
Apple today announced new AI-powered photo editing tools coming to the Photos app as part of Apple Intelligence …
- Indian quick grocery delivery startup Zepto files for an India IPO, planning to raise ~$836M by selling new shares; Zepto was valued at $7B in its last round (Rajesh Mascarenhas/Bloomberg)
Rapid-commerce firm Zepto Ltd. has filed an updated draft prospectus for an initial public offering, marking a key step toward …
Solidot(15)
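- Obesity affects sperm quality and alters epigenetic markers
According to a study published in the journal Current Obesity Reports, obesity is not simply the result of personal choices: the heritability of obesity risk is as high as 40%-70%, and it can be passed from generation to generation through complex biological and environmental factors. The latest evidence shows that obesity affects sperm quality and alters epigenetic markers. These changes may affect children's appetite regulation, metabolism, and long-term disease risk. The good news is that these changes are reversible. Lifestyle changes and weight loss can improve sperm health and alter obesity-related epigenetic patterns.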
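- Webb measures the mass of a dormant black hole in the early universe for the first time
Astronomers have used the James Webb Space Telescope together with gravitational lensing to measure, for the first time, the mass of a dormant black hole in the early universe. The black hole sits at the center of the galaxy MRG-M0138. The galaxy no longer forms stars, and the black hole no longer consumes surrounding material, so it is dormant. MRG-M0138 lies behind a massive galaxy cluster and is magnified about 30 times by gravitational lensing. The black hole is about 10 billion light-years from Earth and has a mass 6 billion times that of the Sun. Astronomers determined its mass by combining gravitational lensing with the effect of the black hole's gravity on stellar motions.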
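- Platform algorithms pose risks to democracy
A growing body of evidence shows that social media platform algorithms pose risks to democracy. Because the algorithms are opaque and are optimized to maximize user engagement and time spent on the platform, with no regard for the quality of the content they push, they are considered a chief culprit behind political polarization. Take X as an example: after Elon Musk announced his support for Trump in 2024, the visibility of Republican-leaning accounts rose markedly. Posts published by Musk himself between July and November 2024 accumulated 17.1 billion views, more than all political campaign ads on the platform combined. During the 2025 German federal election, half of the party-related content that the major social platforms' algorithms recommended to young users involved far-right parties. One analysis found that X's algorithm disproportionately amplified content from politically extreme parties (especially far-right parties) while systematically suppressing centrist parties. Another study found that, compared with chronologically ordered content, users' political attitudes shifted in a more conservative direction after seven weeks of exposure to content pushed by X's algorithm. The shift did not reverse after the algorithm was disabled. These studies indicate that the way platform algorithms currently operate is detrimental to democracy. One consequence of algorithms amplifying extreme voices is a distorted perception of how opinions are distributed: people who hold fringe views come to believe they are in the mainstream, a form of network homophily known as the "false consensus effect". Without strong safeguards, we will enter an increasingly polarized and divided authoritarian society.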
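- GLP-1 weight-loss drugs linked to lower breast cancer risk
According to a study published in the journal JCO Oncology Practice, taking GLP-1 weight-loss drugs is associated with a lower risk of breast cancer in women. A retrospective analysis of more than 110,000 women aged 45 to 80 found that women taking GLP-1 drugs had about a 30% lower risk of breast cancer than women who did not. This is an observational study, and whether GLP-1 weight-loss drugs actually reduce breast cancer incidence remains to be established by further research. GLP-1 drugs mimic the body's natural hormone glucagon-like peptide-1, which helps regulate blood sugar and appetite. GLP-1 drugs were originally used for weight loss and are now found to possibly help prevent cancer as well. The researchers noted that GLP-1 drugs affect many targets and pathways involved in cancer development and therefore merit further study.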
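- Microsoft doubles down on Xbox exclusives again
Following Sony, Microsoft is returning to a stronger exclusivity strategy. Sony has stopped porting its first-party AAA games to PC, while Microsoft's Xbox division had previously begun porting its AAA games to Sony's PlayStation. After taking over, new CEO Asha Sharma reversed that approach, stressing that Xbox "must have exclusive content and services." At Sunday's Xbox Games Showcase, Microsoft announced that Gears of War: E-Day and Clockwork Revolution will be Xbox exclusives, and not timed exclusives. Microsoft said that games already announced for PS5, such as Halo: Campaign Evolved and Forza Horizon 6, will still launch as planned.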
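- Claim a free NVIDIA DLI self-paced course worth $30/$90 and earn a certificate by passing the assessment
How to claim: users who have not previously registered as developers can use the link below to choose one paid DLI online self-paced course for free, including a cloud lab environment and the chance to earn an NVIDIA training certificate. Each user (each email account) may choose only one course. https://developer.nvidia.cn/login?ncid=ref-dev-557858&sfdcid=Zhiding The courses currently on offer include 7 in English and 5 in Chinese. The current course list is below; courses may be withdrawn at any time, free places are limited, and they are first come, first served: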
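- 2025 International Obfuscated C Code Contest announces winners
The 29th International Obfuscated C Code Contest (IOCCC), held in 2025, has announced its winning entries. The IOCCC is an international programming contest that rewards the most creative and hardest-to-understand C code. The 22 winning entries of IOCCC29 include a GBA emulator by Nick Craig-Wood whose source code is laid out to look like a GBA console; a virtual machine by Adrian Cable that is only 366 bytes yet can run DOOM (virtual machines are usually large, with QEMU at about 2 million lines of code); and an entry by Taiwanese developer jingp49 whose source code is shaped like the TARDIS, the time machine from Doctor Who. The organizers said that all 22 winning programs are highly creative and that both the number and the quality of submissions reached an all-time high.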
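- New drug functionally cures some hepatitis B patients
GlaxoSmithKline has announced results from two replicate double-blind trials of its experimental hepatitis B drug bepirovirsen (bepi): the drug functionally cured 19% of patients. About 240 million people worldwide have chronic hepatitis B, and it causes 1.1 million deaths each year. Most people with chronic hepatitis B receive no treatment. Completely curing hepatitis B is very difficult, so a drug's efficacy is judged mainly by functional cure, meaning the virus can no longer be detected. Of the 1,220 patients injected with bepirovirsen, 233 were functionally cured, while no one in the control group was. The researchers stress that bepirovirsen has limited effect for most chronic hepatitis B patients.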
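- AI threatens natural resources that billions of people depend on
The United Nations University Institute for Water, Environment and Health has released a report on the environmental cost of AI energy use, covering its carbon, water, and land footprints. The report projects that by 2030 the data centers supporting artificial intelligence (AI) worldwide will consume 945 TWh of electricity per year, with associated water use equal to the basic annual domestic water needs of 1.3 billion people and a land footprint of more than 14,500 square kilometers. The study finds that every kilowatt-hour of electricity behind AI carries three environmental footprints at once: a carbon footprint from energy production, a water footprint from power generation and cooling, and a land footprint from building energy infrastructure and extracting resources. According to the report, training GPT-5 is estimated to require about 100 GWh of electricity, equivalent to the annual residential electricity use of about 770,000 people in sub-Saharan Africa, along with about 1 billion liters of water and about 1.5 square kilometers of land. Training is only one part of the AI life cycle. Once models are deployed, the process that keeps consuming resources is inference, in which the model continually responds to user requests and generates content. The report estimates that inference accounts for 80%-90% of AI's total energy consumption. Data centers worldwide consumed 448 TWh of electricity in 2025. If they were a country, they would be the world's 11th-largest electricity consumer, behind France and ahead of Saudi Arabia.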
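- Scientists precisely edit genes in human embryos
In 2018, Chinese scientist He Jiankui disclosed that he had used CRISPR gene editing to modify human embryos, resulting in the birth of two gene-edited baby girls. He was later sentenced to three years in prison. CRISPR is not a very precise gene-editing technique and is prone to off-target effects. Now Dieter Egli, an associate professor of developmental cell biology at Columbia University, working with Nathan Treff of the DNA-testing startup Nucleus Genomics and others, has edited genes in human embryos using base editing, a more precise technique that targets a single base in a DNA sequence and can reduce side effects. The new study targeted two genes: one that raises the risk of heart disease and one associated with blood disorders such as sickle cell anemia. The researchers say the technique could help repair disease-causing mutations in embryos but is still far from producing designer babies. The experiments found that the edits did not occur uniformly across all cells: some cells carried the modified base while others retained the original, a phenomenon known as mosaicism.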
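- US government considers taking stakes in AI companies
The US government is considering holding equity in AI companies. OpenAI CEO Sam Altman is in ongoing talks with the White House about the government possibly taking a stake in the company. The discussions have been under way for more than a year, and this week Altman met with a number of lawmakers and officials in Washington to discuss regulation and the latest developments in AI. As part of a potential agreement, OpenAI might donate equity to the US government to establish some form of public wealth fund. The fund could "invest in diversified long-term assets" so that citizens can share in the "gains" from AI development. During Trump's second term, the government has already taken stakes in Intel, IBM, and quantum and critical-minerals companies.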
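- India's population may start declining sooner than expected
In the 1970s, Parul Gayen lived in a Delhi slum that was full of children. Her mother had 6 siblings and her grandfather had 11. Her husband Swapan had 6 siblings, and a seventh died in infancy. The two married at 16 and had 3 children. She is now 58, but only two of their children have decided to have children, and each has just one. Times have changed. A single child, she says, will feel lonely. India is now the world's most populous country, but it is following China down the path toward population decline; China's population has been shrinking since 2021. India's fertility rate is falling faster and earlier than expected. Fertility in India's populous poorer states is converging on that of the richer states: Tamil Nadu, with 77 million people, and West Bengal, with about 100 million, both have a total fertility rate of 1.3, the same as Finland. The average total fertility rate in Indian cities is 1.5. India's population is projected to peak at 1.55 billion.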
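- Failure rates rise in UC Berkeley CS courses
Data show that in spring 2026 the failure rate in UC Berkeley's CS 10 (The Beauty and Joy of Computing) reached 35.3%, and the failure rate in CS 61A (Structure and Interpretation of Computer Programs) reached 10.6%. In spring 2025 and spring 2024, neither course had a failure rate above 10%. Professor Dan Garcia, who teaches both courses, attributes the rise to students' use of large language models: students were caught using models such as Claude, ChatGPT, and Google Gemini to cheat on exams, or relied on them so heavily for assignments that they only half understood the material and were unprepared for exams. Other causes include weak math foundations and a shortage of teaching staff.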
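- Astronomers find direct evidence of the Milky Way's black hole "breathing"
Using five years of high-resolution observations from the Atacama Large Millimeter/submillimeter Array (ALMA) in Chile, astronomers have for the first time clearly seen the hot outflow blown out by Sgr A*, the Milky Way's central black hole, resolving a puzzle that has troubled astronomers for more than 50 years. In theory, a black hole that is accreting surrounding gas also expels part of that material as winds or jets, but this had long been hard to observe directly at the center of the Milky Way. The research team observed carbon monoxide emission within about 3 light-years of Sgr A*. Carbon monoxide is an important tracer of cold molecular gas and helps astronomers map the gas distribution around the black hole. They found a huge cone-shaped cavity in the cold gas, pointing directly at the black hole. Combining this with data from NASA's Chandra X-ray Observatory, the researchers found that the cavity is filled with hot gas. This means Sgr A* is continuously blowing out a hot, energetic wind that sweeps away or heats the surrounding cold gas, producing this distinctive structure. Although the outflow is not as violent as the black hole jets seen in some active galaxies, the team estimates that it has persisted for at least about 20,000 years.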
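- GNUtrition releases a new version after 14 years
GNUtrition, a free software program for food nutrition analysis, has released a new version, v0.33, after a gap of 14 years. Its previous update was in 2012. GNUtrition draws its nutrition information from the US Department of Agriculture's Food and Nutrient Database for Dietary Studies (FNDDS). The main changes in v0.33 are: the program was rewritten in C, replacing Python 2; the UI was upgraded from GTK 2 to GTK 3; and because the old Nutrient Database of Standard Reference stopped being updated in 2018, the program switched to FNDDS; among others.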