OrangeBot.AI Digest — 2026-06-22
88 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Flock-Powered Police Chiefs Stalking Women Shows Why Warrants Are Needed (ipvm.com)
- Jobs and Software Is Fucked (urflow.bearblog.dev)
- Canada is looking to build up to 10 new nuclear reactors over the next 15 years (www.cbc.ca)
- Steam Machine launches today (store.steampowered.com)
- The text in Claude Code’s “Extended Thinking” output (patrickmccanna.net)
- Moebius: 0.2B image inpainting model with 10B-level performance (hustvl.github.io)
- Never Give Them Your Face (nevergivethemyourface.com)
- Pledging another $400k to the Zig software foundation (mitchellh.com)
- Alan Greenspan has died (www.washingtonpost.com)
- Why Drawing Tablet Brands Won't Collaborate on Linux FLOSS Drivers (www.davidrevoy.com)
- Munich 1991: The Roots of the Current AI Boom (people.idsia.ch)
- GLM 5.2 vs. Opus (techstackups.com)
- Codex logging bug may write TBs to local SSDs (github.com)
- Deno Desktop (docs.deno.com)
- Danish privacy activist Lars Andersen raided by police (twitter.com)
GitHub Trending(15)
- calesthio / OpenMontage
- palmier-io / palmier-pro
- jamiepine / voicebox
- mukul975 / Anthropic-Cybersecurity-Skills
- penpot / penpot
- Stirling-Tools / Stirling-PDF
- garrytan / gstack
- heygen-com / hyperframes
- tursodatabase / turso
- bytedance / deer-flow
- DeusData / codebase-memory-mcp
- ZhuLinsen / daily_stock_analysis
- firecrawl / firecrawl
- JCodesMore / ai-website-cloner-template
- lyogavin / airllm
Product Hunt(15)
- OnBrand by SlideSpeak
Design context for AI agents
- AlgoFly AI
The all-in-one place to build and deploy vision AI
- MD+HTML Reader
Review AI-generated Markdown and HTML in a focused workspace
- Cloudflare Temporary Accounts
Let agents deploy before signup
- uwait
Get paid while AI thinks
- Clawd
A context-aware browser mascot with 100% local offline AI
- AirJelly
Your Proactive, Self-Organizing Second Brain
- Skybridge
The full-stack open source React framework for MCP Apps
- AgentX
Evaluate AI agent, pinpoint issues, and fix with one click.
- HAQQ Legal AI on Mobile
Bringing legal understanding to anyone with a phone
- Alai 2.0
AI design partner for presentations, social posts, and more
- Selector Forge
Browser extension for AI-generated resilient selectors
- Photoroom API
Transform product images at scale with one image editing API
- MediaSeg
Split large media files into upload-ready chunks on macOS
- Agentic Document Extraction
Make the world's documents computable
Hugging Face(13)
- PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models
Multimodal large language models (MLLMs) have achieved remarkable progress in visual understanding tasks. However, most existing MLLMs rely on autoregressive generation, which limits their efficiency for perception tasks that require captioning multiple regions. In this work, we propose PerceptionDLM, a multimodal diffusion language model optimized for efficient parallel region perception. Built upon PerceptionDLM-Base, a strong foundational baseline that achieves state-of-the-art performance among open-source diffusion MLLMs, our architecture fully leverages the parallel decoding nature of DLMs. Specifically, we introduce efficient prompting and structured attention masking to enable simultaneous perception of multiple masked regions, allowing the model to generate region descriptions in parallel at both the sequence and token levels. This design significantly improves inference efficiency compared with existing approaches that process regions sequentially. To systematically evaluate the parallelism property of visual perception capability for DLMs, we construct a new Parallel Detailed Localized Captioning Benchmark (ParaDLC-Bench) by scaling the DLC-Bench to include multiple region masks per image, enabling joint evaluation of both caption quality and inference efficiency. Experiments demonstrate that PerceptionDLM maintains competitive performance in region captioning while achieving substantial speed improvements for multi-region perception tasks. Our results highlight the potential of multimodal diffusion language models for efficient, parallel visual perception. To the best of our knowledge, we are the first to achieve parallel region caption and perception by leveraging the advantages of diffusion language models. Code, models, and datasets are released.
- MemSlides: A Hierarchical Memory Driven Agent Framework for Personalized Slide Generation with Multi-turn Local Revision
Personalized presentation generation requires more than conditioning on a current prompt or template: agents must preserve stable user preferences across tasks, retain newly introduced preferences and constraints during multi-turn revision, and carry out local edits reliably. We propose MemSlides, a hierarchical memory framework for personalized presentation agents that separates long-term memory from working memory and further divides long-term memory into user profile memory and tool memory. User profile memory stores intent-conditioned profiles for round-0 personalization, working memory carries active preferences and session constraints across revision rounds, and tool memory stores reusable execution experience for reliable localized editing. MemSlides pairs this memory design with scoped slide-local revision, so targeted updates act on the smallest affected region instead of repeatedly regenerating the full deck. In controlled experiments, user profile memory improves persona-alignment judgments on a multi-persona, multi-intent profile bank, tool-memory injection improves closed-loop modify behavior in diagnostic matched-pair settings, and qualitative cases illustrate working memory's ability to carry over preferences. Taken together, these results suggest that effective personalization in presentation authoring depends on separating persistent user profiles, session-level working memory, and reusable execution experience across generation and localized revision.
- GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents
Memory benchmarks for LLM agents largely assume single-user settings, leaving shared assistants for hospitals, workplaces, campuses, and households understudied. In these deployments, multiple principals write to a common memory pool and query it under different roles, scopes, and relationships, so memory quality requires governance as well as recall. We introduce GateMem, a benchmark for multi-principal shared-memory agents. GateMem jointly evaluates utility for legitimate long-horizon requests with state updates, access control across contextual authorization boundaries, and agent-facing active forgetting after explicit deletion requests. It spans medical, office, education, and household domains, with long-form multi-party episodes, incremental memory injection, hidden checkpoints, structured judging, and leak-target annotations. Across diverse baselines and backbone models, no method simultaneously achieves strong utility, robust access control, and reliable forgetting. Long-context prompting often yields the best governance score at high token cost, while retrieval-based and external-memory methods reduce cost yet still leak unauthorized or deleted information. These results show current memory agents remain far from reliable shared institutional deployment.
- Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models
While reasoning on autoregressive (AR) models is often performed by chain-of-thought reasoning and reflection, their refinement of previous outputs still relies on fully sequential generation, even when only local edits are needed. In contrast, the masking mechanism in Mask Diffusion Models (MDMs) naturally supports explicit local edits on previous outputs, allowing selective refinement without discarding previous answers and generating another from scratch. While this property more closely aligns with how humans correct mistakes by iterative local refinement, existing MDMs do not support multi-turn masking and denoising. We propose Reflective Masking (RM), which elicits such an intrinsic reasoning capability in MDMs via lightweight post-training. RM provides a native test-time scaling, where an MDM iteratively revisits and revises its prior outputs based on evolving context. To exploit insights from previous turns like AR reasoning, we further introduce History Reference, a parameter-free mechanism that leverages intermediate denoising states during revision. Our approach requires no architectural changes and is easily applicable to existing MDMs. Across diverse tasks and modalities, including text generation, Sudoku, and image editing, Reflective Masking consistently outperforms standard masking-based baselines and demonstrates strong generality, positioning RM as a fundamental primitive for reasoning on MDMs.
- BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation
Three-dimensional (3D) brain MRI is central to clinical neurology and neuro-oncology, where generative models could augment under-represented cohorts, simulate disease trajectories, and support privacy-preserving data sharing. Latent diffusion has been the go-to solution for modeling imaging data, but it places two competing demands on the tokenizer: encoder embeddings must retain the clinical information that downstream tasks act on, and the decoder must reconstruct anatomically faithful volumes. Existing reconstruction-driven tokenizers achieve the second at the expense of the first. To address this, we introduce a fully volumetric masked-autoencoder (MAE) based tokenizer for 3D brain MRI latent diffusion, decoupling encoder and decoder: a frozen 3D MAE encoder produces clinically informative embeddings, while a dedicated CNN decoder reconstructs voxels from a linear projection of those embeddings. We pretrain the encoder on 35,309 volumes from 18 public cohorts spanning four modalities, ten disease categories, and 200+ acquisition sites, and demonstrate its dual utility in two settings. First, on a 23-task linear-probing benchmark, the encoder outperforms or matches SOTA models (i.e., BrainIAC, BrainSegFounder, and MedicalNet) on 21 of 23 tasks. Second, a conditional diffusion transformer (DiT) trained on these clinically informative embeddings supports both conditional generation across six variables and patient-specific longitudinal forecasting. Together these results establish a single 3D brain-MRI embedding space capable of both downstream clinical tasks and controllable generation.
- SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG
Retrieval-augmented generation (RAG) systems must balance retrieval granularity with contextual coherence, a challenge that existing methods address through LLM-guided chunking, single-level context expansion, or hierarchical summarization. These approaches variously depend on costly LLM calls during indexing or retrieval, limit context aggregation to a single granularity level, or introduce information loss through summarization. We present SproutRAG, an attention-guided hierarchical RAG framework that addresses this trade-off by organizing sentence-level chunks into progressively larger but semantically coherent units, using learned inter-sentence attention to construct a binary chunking tree. Unlike prior approaches that rely on external LLMs, fixed context expansion, or lossy summarization, SproutRAG learns which attention heads and layers best capture semantic document structure, enabling multi-granularity retrieval without additional LLM calls or compressed summaries. At retrieval time, SproutRAG uses hierarchical beam search to retrieve candidates at multiple granularities, capturing multi-sentence relevance beyond flat retrieval. The framework is trained end-to-end with a joint objective that improves both embeddings and tree structure. Experiments across four benchmarks spanning scientific, legal, and open-domain settings demonstrate that SproutRAG improves information efficiency (IE) by 6.1% on average over the strongest baseline. Code is available on https://github.com/AmirAbaskohi/SproutRAG.
- MCompassRAG: Topic Metadata as a Semantic Compass for Paragraph-Level Retrieval
Retrieval-augmented generation (RAG) systems depend critically on how documents are chunked and searched. Fine-grained chunks can improve retrieval precision but expand the search space, increasing latency and cost; larger chunks reduce the number of candidates but make dense similarity less reliable, as the representation for each chunk mixes multiple topics and introduces more semantic noise. This trade-off becomes especially limiting in deep research tasks, where retrieval must be both fast and precise across large, heterogeneous corpora. We introduce MCompassRAG, a metadata-guided retrieval framework that uses topic-level signals as a semantic compass for selecting relevant evidence. Instead of relying only on cosine similarity between queries and noisy chunk embeddings, MCompassRAG enriches chunk representations with topic metadata in the same embedding space and trains a lightweight retriever through LLM-teacher distillation. At inference time, MCompassRAG performs topic-aware retrieval without additional LLM calls, improving both efficiency and evidence quality. Across six complex retrieval benchmarks, MCompassRAG improves information efficiency (IE) by 8.24% on average with over 5 times lower latency than the strongest efficient RAG baselines. Code is available on https://github.com/AmirAbaskohi/MCompassRAG.
- GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning
Generalist vision-language-action systems need object-centric 3D evidence and reusable manipulation experience to plan reliable robot trajectories. GeneralVLA provides a hierarchical interface for converting language and RGB-D observations into 3D end-effector paths, but two bottlenecks remain. First, monocular SAM3D-style object reconstruction can hallucinate pose and unseen geometry, while manipulation benefits from stable object shape when calibrated multi-view observations are available. Second, the original KnowledgeBank mainly retrieves semantically similar snippets and appends new knowledge, which makes it difficult to control memory quality, conflicts, confidence, and geometric relevance. To address the first challenge, we introduce GeoFuse-MV3D, a geometry-prior-guided MV-SAM3D reconstruction branch that verifies external geometry cues with input-view masks, applies soft visual-hull support, performs axis-wise refinement, and fuses only geometry while preserving appearance. To address the second challenge, we upgrade KnowledgeBank into a governed long-term memory system with explicit quality, confidence, lifecycle, verifier, and conflict metadata, together with precision-oriented retrieval. Finally, we evaluate the reconstruction branch on GSO-30 and the memory module on Terminal-Bench 2.0 and SWE-Bench Verified; GeoFuse-MV3D improves over the MV-SAM3D baseline by reducing CD and LPIPS by 2.20% and 2.02% while increasing PSNR and SSIM by 2.36% and 1.03%, and KnowledgeBank improves over ReasoningBank by 4.53% on Terminal-Bench SR and 3.73% on SWE-Bench resolve rate, while reducing AS by 4.95% and 5.65%, respectively. Code: https://github.com/AIGeeksGroup/GeneralVLA-2. Website: https://aigeeksgroup.github.io/GeneralVLA-2.
- SpatialAvatar-0: High-Quality 4D Head Avatar with Multi-Stage Reconstruction
High-quality 4D head avatars from one or a few source portraits are central to telepresence, AR/VR, and digital-human interaction. 3D Gaussian Splatting (3DGS) has emerged as the dominant representation, with two complementary regimes (generalizable feed-forward predictors and per-subject refiners) maturing in parallel. However, existing feed-forward predictors are trained on a single dataset family with a hard-coded source count, inheriting the corresponding domain bias. Per-subject refiners require 300K-600K iterations and rely on adaptive densification that destroys upstream Gaussian layouts, preventing the two regimes from sharing a representation end-to-end. To bridge both regimes we propose SpatialAvatar-0 on a shared FLAME-mesh-bound Gaussian representation: a feed-forward generator with a parameter-free K-source mean-pool and a monocular-temporal to multi-view-spatial two-phase schedule that anchors against identity-prior collapse onto the smaller multi-view set. We further introduce a 10K-iter layout-preserving per-subject refinement loop that freezes the FLAME-binding and Gaussian count and replaces densification with a three-component anti-spike regularization. On VFHQ/HDTF cross-domain zero-shot we surpass the in-domain leader GAGAvatar by +1.5 dB PSNR despite never training on either test domain, and on the SplattingAvatar monocular benchmark we lead every reported metric, surpassing the 300K-iter GeoAvatar by +1.3 dB PSNR at up to 60x shorter per-subject schedule than common SOTA baselines. Website: https://spatialwalk.github.io/SpatialAvatar-0.
- WorldLines: Benchmarking and Modeling Long-Horizon Stateful Embodied Agents
To assist humans over extended periods in real homes, embodied agents must remember user routines, world states, and past interactions. Existing long-term memory benchmarks mainly evaluate language-centric retrieval and question answering, while embodied benchmarks often focus on short-horizon task execution without testing long-term memory use in dynamic environments. We introduce WorldLines, a project-driven benchmark for long-horizon embodied household assistance. It constructs temporally extended household traces with dialogues, actions, execution feedback, object and device state changes, and converts them into evidence-linked samples for Memory QA and Embodied Task Planning. We further propose ObsMem, an observer-grounded memory framework that maintains visibility-aware memories and action-native state trails for state-aware decisions. Experiments reveal persistent challenges in partial observability, overwritten world states, and translating long-term memory into embodied plans, while ObsMem offers a stronger reference architecture for this setting.
- Characterizing Narrative Content in Web-scale LLM Pretraining Data
The narrative composition of web-scale LLM pretraining corpora remains largely unexplored even though narrative is a fundamental mode of human communication. We present the first fine-grained study of narrative features in Dolma, a 3-trillion-token open pretraining corpus. Drawing on narrative theory, we design a framework spanning three core narrative elements (agency, setting, and events) operationalized as 11 interpretable dimensions. After sampling and annotating a diverse set of 400 passages, we finetune and validate NarraBERT, a RoBERTa-based model for fine-grained narrative prediction. We apply NarraBERT to 3M passages, resulting in a new dataset, NarraDolma. We find (i) narrative structure is measurable at scale across extremely heterogeneous data, (ii) we uncover a continuous, multidimensional narrative structure underlying web text, and (iii) narrative qualities are unequally distributed across pretraining sources and topics in ways that current curation practices neither measure nor account for. Our framework, dataset, and analyses provide a foundation for understanding how narrative qualities are distributed in LLM pretraining data and for studying how data composition affects narrative reasoning tasks. We publicly release NarraDolma and NarraBERT.
- StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs
Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these models judge people remain poorly understood. Prior work often compares different (groups of) individuals, making it difficult to separate appearance effects from identity differences. We introduce StylisticBias, a controlled benchmark for evaluating attribute-level social bias in MLLMs. We generate 500 photorealistic base faces and create about 50 single-attribute variations per face, producing about 25K images. This design keeps identity fixed and changes one visual attribute at a time. It lets us measure how specific cues shift model judgments. We evaluate six MLLMs across 25 binary social judgment scenarios. We find that age and body type dominate identity-level effects, while fashion style and other visual cues drive the largest attribute-level shifts. We further find that about 15 attributes account for nearly 80% of the total variation, showing that bias is concentrated in a small set of visual cues. Sensitivity is strongest in judgments that are semantically aligned with appearance, especially socioeconomic and style-related judgments. We release StylisticBias as a benchmark for fine-grained bias evaluation in multimodal models. Code and dataset: https://github.com/timo-cavelius/StylisticBias and https://hf.co/datasets/shaghayegh/stylistic-bias-dataset.
- Distilling Examples into Task Instructions: Enhanced In-Context Learning for Real-World B2B Conversations
In-context learning (ICL) is the standard method for low-resource classification, yet its efficacy in specialized domains remains largely unexplored. We address the challenge of classifying semantically complex, multi-party B2B conversations, where traditional ICL encounters significant limitations, especially as context length increases due to the concatenation of multiple few-shot examples. We introduce the Call Playbook dataset, featuring five classification tasks derived from real-world B2B conversations targeting core sales concepts. To bridge the gap between performance and practical utility, we propose novel knowledge extraction methods that distill verbose examples into compact, interpretable representations of structured classification criteria and precise task descriptions. Our approach achieves a 99% reduction in token usage and improves macro-averaged AUC by up to 7% over traditional ICL. Notably, it remains robust as context grows, unlike advanced token compression baselines which degrade by over 9 F1 points. Importantly, our framework enables direct refinement of classification logic, addressing critical needs for transparency, efficiency, and user interaction in real-world NLP applications.
Techmeme(15)
- President Trump signs two executive orders aimed at speeding the development of advanced quantum computers and mitigating the security threats they present (Amrith Ramkumar/Wall Street Journal)
Administration set an ambitious new 2028 target for a system that can conduct scientific research
- Sources: Meta internally exposed data from its employee-tracking program meant to help train its AI models, including full prompts and private conversations (Wired)
Employees had previously raised concerns about the initiative, which involves collecting workers' keystroke data to train AI models.
- Air Space Intelligence won an $875M, 12-year FAA contract to develop AI tools that map flight trajectories and identify areas of congestion to reduce delays (Allyson Versprille/Bloomberg)
Air Space Intelligence Inc. won a US government contract to develop artificial intelligence technologies for managing flight traffic …
- Sakana AI launches Fugu, a multi-agent orchestration system accessible through a single model API, claiming Fugu Ultra matches Fable and Mythos on benchmarks (Carl Franzen/VentureBeat)
Last night, the increasingly enterprise-focused AI startup Sakana launched Fugu, a multi-agent orchestration system …
- Sources: Vimeo owner Bending Spoons seeks to raise ~$1.62B in a US IPO, selling 58M shares at $26 to $28 apiece, at a valuation of $19B at the top of the range (Echo Wang/Reuters)
Bending Spoons, an Italian technology company that acquires and revamps software businesses, is seeking to raise as much as $1.62 billion …
- Valve Steam Machine review: much smaller than PS5, surprisingly smooth, and navigable with any modern gamepad but very expensive and needs manual configuration (Sean Hollister/The Verge)
My first day with the Steam Machine was a mess. Instead of enjoying a worry-free game console, I spent hours troubleshooting what felt like a finicky PC.
- Valve says Steam Machine, its new living room-friendly PC, will start at $1,049 for the 512GB base model, and go on sale starting June 29 (Jay Peters/The Verge)
You can register your interest starting today, and the first emails letting people buy one will go out on June 29th.
- OpenAI unveils an updated GPT-5.5-Cyber model, launches the Patch the Planet initiative in partnership with Trail of Bits to fix open source bugs, and more (Lily Hay Newman/Wired)
Amid concerns about AI models' cybersecurity capabilities, OpenAI revealed an improved version of GPT-5.5-Cyber and its “Patch the Planet” …
- Alphabet shares close down 5% on Monday following the departure of Google DeepMind VP John Jumper, the company's second top AI executive to leave in a week (Bloomberg)
Alphabet Inc. shares tumbled on Monday following the departure of another high-profile artificial intelligence leader to a rival.
- Sources: marketing tech startup AppsFlyer raised a $1B Series E at a $2.7B post-money valuation; Moloco, Google, Meta, and Unity acquire minority stakes (Kerry Flynn/Axios)
AppsFlyer has raised more than $1 billion in Series E funding at a $2.7 billion post-money valuation, Axios has learned from sources familiar with the financing.
- SpaceX announces an offering of senior unsecured notes and discloses it has ~$100.8B in cash; SPCX closes down 16.43% in its third consecutive losing session (Samantha Subin/CNBC)
SpaceX on Monday announced a senior unsecured notes offering and disclosed about $100.8 billion in cash.
- SpaceX signs a computing deal worth up to $6.3B with Reflection AI for access to Nvidia GB300s at Colossus 2; Reflection will pay $150M per month through 2029 (Deirdre Bosa/CNBC)
SpaceX has signed a major computing power agreement with Reflection AI, making the open-source artificial intelligence startup …
- Groq raised $650M led by Disruptive and Infinitum after its Nvidia deal, aiming to hit 200 MW in capacity by the end of 2027, following a $750M raise in 2025 (Zsana Hoskins/Bloomberg)
Groq Inc. raised $650 million in a new funding round aimed at expanding its data center capacity and helping the one-time chip startup become …
- Instagram is testing horizontal video on Instagram for TV, plans to experiment with longer-form storytelling and episodic series, and launches on Samsung TV (Katie Kilkenny/The Hollywood Reporter)
The Meta-owned company is going decidedly retro by experimenting with time-honored video formats with its creators.
- Crypto trading app Fomo raised a $75M Series B led by Index at a $550M valuation, taking its total funding to ~$94M, and claims to add 3,500 new users per day (Ben Weiss/Fortune)
Two established investors are feeling crypto FOMO. Index and Union Square Ventures have backed the crypto startup Fomo …
Solidot(15)
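- A look back at the attack on the AUR
The Arch User Repository (AUR), a repository of user-submitted packages, recently suffered a large-scale malicious attack: the attackers created a series of new accounts, used them to take over unmaintained packages (known as orphaned packages), planted malicious code, and pushed malicious updates. The Arch project's maintainers have now closed new user registration and are discussing how to handle the orphaned packages that were abused. Packages in the AUR are submitted by users; other users can search for and download PKGBUILD files, then resolve dependencies, build, install, and update the software. The AUR does not provide binary versions of software. It currently holds more than 107,000 packages, of which nearly 14,000 are orphaned and available for adoption. Any registered user can adopt and modify an orphaned package. The packages are unreviewed, and users install them at their own risk. Other Linux distributions have similar repositories, such as Fedora's Copr, openSUSE's Open Build Service (OBS), and Ubuntu's Personal Package Archives (PPA). But these services differ significantly from the AUR: they provide build environments similar to those for official packages, and they do not allow precompiled binaries or proprietary software. The AUR's overly lax requirements were what was abused in this attack.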
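- HPV vaccine cuts the risk of dying from cervical cancer before age 30 to almost zero
According to WHO data, cervical cancer is the fourth most common cancer in women, and 99% of cases are caused by high-risk human papillomavirus (HPV). Although the HPV vaccine can prevent about 90% of cervical cancers, its effect on survival had been unclear. In a new study published in The Lancet, researchers at Queen Mary University of London found that cervical cancer mortality among vaccinated people has fallen significantly since the HPV vaccine was introduced in 2008. The effect on mortality is so large that the researchers estimate girls vaccinated at age 12 or 13 have almost zero chance of dying from cervical cancer before age 30. For vaccinated women aged 30-34, the relative risk of dying from cervical cancer fell by 63%. Between 2020 and 2024, for the first time in recorded history, no woman aged 20-24 in England died of cervical cancer. Besides cervical cancer, the HPV vaccine also protects against anal, penile, vaginal, vulvar, oral, and throat cancers, as well as genital warts. Boys and girls in year 8 both receive the vaccine, and some areas offer catch-up doses to students in years 9 and 10. Before the COVID-19 pandemic, vaccination rates were close to the WHO target, but they dropped sharply afterward.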
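- Anthropic to require identity verification for access to certain features
Anthropic has updated its privacy policy: starting July 8, 2026, some features will require identity verification, which will be handled by Persona. Persona is a third-party identity verification company backed by Peter Thiel. Discord previously ended its age-verification partnership with Persona following strong user backlash and a data breach in February 2026.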
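- Linux 7.2 kernel completely removes the strncpy function
After 6 years and 362 patches, the Linux 7.2 kernel has finally removed the strncpy() function entirely. strncpy() is a C string-copy function that the kernel documentation labels "actively dangerous". It is a major source of one class of memory errors: a kernel buffer containing sensitive data can leak bytes past the boundary of an unterminated string, resulting in memory disclosure. strncpy() has been replaced by 5 different functions: strscpy() for NUL-terminated destinations, strscpy_pad() for NUL-terminated, zero-padded destinations, strtomem_pad() for non-NUL-terminated fixed-width fields, memcpy_and_pad() for bounded copies with explicit padding, and memcpy() for copies of known length.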
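To make the failure mode concrete, here is a minimal Python sketch that only simulates the two copy contracts on a bytearray. It is not kernel code: the helper names `strncpy_sim`, `strscpy_sim`, and `c_str`, and the 4-byte field sitting in front of a "SECRET" region, are all hypothetical and chosen for illustration. The simulated `strscpy` follows the documented behavior of always writing a NUL and returning `-E2BIG` on truncation.

```python
import errno


def strncpy_sim(dest: bytearray, src: bytes, n: int) -> None:
    """Simulate C strncpy: copy at most n bytes of the source string and
    zero-fill any remainder. No NUL is written if the source has >= n bytes."""
    s = src.split(b"\x00", 1)[0]           # bytes before the first NUL
    chunk = s[:n]
    dest[:n] = chunk + b"\x00" * (n - len(chunk))


def strscpy_sim(dest: bytearray, src: bytes, size: int) -> int:
    """Simulate the strscpy contract: always NUL-terminate within `size`,
    return the copied length, or -E2BIG if size is 0 or the copy truncated."""
    if size == 0:
        return -errno.E2BIG
    s = src.split(b"\x00", 1)[0]
    chunk = s[: size - 1]                  # leave room for the terminator
    dest[: len(chunk) + 1] = chunk + b"\x00"
    return len(chunk) if len(chunk) == len(s) else -errno.E2BIG


def c_str(buf: bytearray) -> bytes:
    """Read a buffer the way C string functions would: up to the first NUL.
    With no NUL present, the read runs through the adjacent bytes."""
    end = buf.find(b"\x00")
    return bytes(buf if end == -1 else buf[:end])


if __name__ == "__main__":
    # Hypothetical layout: a 4-byte name field followed by sensitive bytes.
    mem = bytearray(4) + bytearray(b"SECRET")
    strncpy_sim(mem, b"abcdefgh", 4)
    print("strncpy-style read:", c_str(mem))   # field is unterminated

    mem = bytearray(4) + bytearray(b"SECRET")
    rc = strscpy_sim(mem, b"abcdefgh", 4)
    print("strscpy-style read:", c_str(mem), "return:", rc)
```

In the first case a later string read runs past the 4-byte field into the neighboring bytes, which is the disclosure pattern the item describes; in the second the copy is truncated but terminated, and the negative return value lets the caller detect the truncation.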
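- Tyrannosaurus rex did not reach full adulthood until age 40
For years scientists believed that Tyrannosaurus rex reached adult size at around age 25, but a new study shows that it did not become fully grown until age 40. The study is based on an analysis of 17 T. rex fossils ranging in age from juvenile to adult. It used more advanced techniques to estimate the dinosaurs' ages and complex statistical models to combine information from multiple specimens, giving a more complete picture of growth across the animal's whole lifespan. The results indicate that T. rex's growth period was about 15 years longer than previously thought.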
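- Japan announces new supercomputer 理究
Japan's RIKEN announced on the 19th that its new supercomputer, built for AI-driven scientific research, has been named "理究". The name signifies using AI to investigate (究) the principles (理) behind natural phenomena. The machine will be located at RIKEN's Kobe campus in Chuo Ward, Kobe, and is aiming to enter service in July. On the same day RIKEN also announced ROQUO, another supercomputer that serves as a hybrid quantum computing and high-performance computing platform; both machines use Nvidia's GB200 NVL4 systems. ROQUO has 135 compute nodes, 540 NVIDIA Blackwell GPUs, and 270 NVIDIA Grace CPUs, with peak FP64 performance above 21 PFLOPS and peak FP8 performance of 5 EFLOPS, among other specifications.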
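- US Chip Security Act would mandate location tracking for AI chips
The US Congress is considering the Chip Security Act, which would add stricter security verification features to advanced AI chips and require chip exporters to track where advanced chips go, using custom location-verification hardware or software, to ensure they do not end up in countries such as China. The House Foreign Affairs Committee approved the bill unanimously in late March by a vote of 42 to 0, sending it to the full House for consideration. The Senate's companion legislation is still at the first stage of consideration. US chip industry groups oppose the bill, arguing that it would hinder chip exports. Nvidia, the largest AI chip maker, announced last December that it had developed technology that meets some of the bill's requirements.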
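- Top 10% of consumers cause trillions of dollars in environmental damage each year
Scientists in the Netherlands and the UK found that, measured in 2017 values, the top 10% of people worldwide by consumption spending cause $1.7 trillion to $5.7 trillion in environmental damage every year. Earlier research has shown that the highest-consuming individuals (roughly the wealthiest) bear a disproportionately large share of responsibility for environmental destruction, but that responsibility had not been quantified in monetary terms. The researchers assessed the environmental costs caused by the behavior of the top 10% of consumers globally and in the wealthiest countries on each continent. They drew on data from the Environmental Prices Handbook to assign monetary values, in 2017 US dollars (the most recent data available), to different kinds of environmental damage. They found that, worldwide, the annual environmental cost caused by high consumers is about $2,300 to $7,500 per person, which adds up to $1.7 trillion to $5.7 trillion globally. In the United States the cost for the top 10% of consumers is markedly higher, at about $19,000 to $63,000 per person, equivalent to 6%-20% of that group's average income. The study assessed only personal consumption, whereas previous research has shown that the wealthiest 10% also generate substantial emissions through their investments.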
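- Polymarket paid content creators to make fake winning-bet videos
Polymarket, the largest prediction market, paid dozens of content creators to produce fake videos of winning bets. It built a fake site nearly identical to its real website and instructed creators to make sham trades on it while concealing that they were being paid by Polymarket. After the fake winning videos were posted, Polymarket hired paid posters to spread and amplify them, creating the impression that many people were making money by betting. The creators said they earned up to $2,000-$3,000 a month. An analysis of the fake videos showed that most of the bets were placed in a Polymarket engineers' test environment. The creators said they sent finished videos to Polymarket for review; if a video was not engaging enough or showed obvious signs of fakery, Polymarket would ask for a reshoot.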
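- Canonical to add AI speech-to-text transcription to the Ubuntu desktop
Canonical has announced that it will add an AI speech-to-text transcription feature to the Ubuntu desktop and is seeking user feedback on it. Ubuntu 26.10, expected in October this year, will include an early version of the AI feature, called Myna. In Myna, speech recognition runs in a sandboxed component called the Canonical Inference Snap, the Speech Orchestrator manages sessions, and the Audio Adapter handles audio picked up by the microphone, denoising and chunking it before it reaches the model. Speech recognition will run locally, and once the relevant model is installed no internet connection is needed. Audio data will not be retained long-term and will be discarded immediately after the session ends. For now Myna will not support dictating passwords, continuous listening, translation, or similar features.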
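- Nearly 60% of the videos TikTok recommends to new accounts are AI slop
According to a report by Kapwing, a maker of video creation tools, as many as 59% of the videos TikTok recommends to new accounts are AI slop, compared with 21% on YouTube, making TikTok's share nearly three times YouTube's. Kapwing manually reviewed more than ten thousand TikTok videos across 20 categories and ran a separate test on new accounts, counting the share of AI-generated content among the first 500 For You videos. Of TikTok's first 500 recommended videos, 294 were AI slop, while 104 of the first 500 recommended YouTube Shorts were. Among the 2,000 videos Kapwing reviewed in TikTok's kids category, 57% were AI slop, the highest share of any category: 97 of the 100 featured videos under the #cartoonkids tag were AI-generated, the share under tags such as #cartoons and #babysong was 83%, and #forkids stood at 79%. In the science and education category the share of AI-generated videos was 35%, followed by health (33%) and history (33%). As of last November TikTok had labeled 1.3 billion videos as AI-generated.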
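- Finnish libraries lend out sewing machines
Finnish libraries do more than lend books; they sustain important social functions. Public libraries are disappearing in other countries, while Finland is still building new ones. The United States closed 766 public libraries between 2008 and 2019, and in the UK more than 180 libraries closed or were handed over to volunteer groups between 2016 and 2023. Finland has a population of about 5.6 million and more than 700 libraries. Beyond books, the biggest thing the libraries lend is space: rooms can be booked free of charge for meetings, study, political discussion, or making music. Oodi, the library in central Helsinki, was named the world's best new library in 2019, and it lends sewing machines, tennis rackets, and swimming pool passes. This lending culture stems from Finnish pragmatism and dates back to the agrarian era, when people often shared farm machinery. Today's city dwellers live in small homes and may need a sewing machine only once a year, so why buy one? At the library they can use, for free, a sewing machine bought with tax money. According to a government report, 55% of Finns visit a library at least once a month. Data show that Finns use libraries 9.1 times a year on average, compared with about 2.5 visits for Britons, 2.4 for Americans, and an EU average of about 3.5. Under Finland's Library Act, public libraries must promote democracy, freedom of expression, and active citizenship. Other Nordic countries have similar policies. In 2025 Finland spent nearly 371 million euros on public libraries, or 65.78 euros per capita, compared with 10 pounds per capita in the UK and 45 US dollars in the United States. Finnish librarians also help users with all kinds of online matters, from taxes and bank accounts to pensions and digital health records, and they help with CVs and job applications. A study of Finnish libraries concluded that libraries play a vital role as inclusive infrastructure. They are among the few public spaces where one can simply sit quietly without having to spend money.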
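- Google reCAPTCHA introduces gesture verification
Google will ask users to wave a hand in front of a camera to prove they are human rather than a bot. reCAPTCHA, its service for telling bots from humans, has introduced gesture verification. Google says that during gesture verification it analyzes one or more video clips of the user's hand performing various actions or gestures; the system processes the video to extract coordinate data for hand keypoints, including the coordinates of 21 knuckle keypoints. Google claims the video is never associated with the user's identity and is deleted once the verification process ends. The system never records audio.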
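- Suspected hackers hijack SMS alert system to send alert messages across Brazil
The Brazilian government said that on Saturday morning mobile phones in several Brazilian states received an unauthorized alert message in the "extreme" category containing the text misantropi4. The word replaces the final letter a of the Portuguese misantropia with a 4, a common hacker practice. Misantropia means hatred of humankind. Brazil's emergency messaging system is similar to AMBER Alert in the United States and lets government officials send emergency messages directly to mobile devices within a specific geographic area. The government said its National Civil Defense alert platform has been taken offline; it believes this was a hacking attack and is investigating.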
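- Germany's population fell in 2025
Data from Germany's Federal Statistical Office show that, despite substantial immigration, Germany's population fell by about 110,000 year on year in 2025, the first annual decline since 2020. Net immigration last year was 235,000, but that was not enough to offset the gap between deaths and births: in 2025 deaths exceeded births by 352,000. At the end of 2025 Germany's population stood at 83.5 million, a decline of about 0.13% over the year. The last annual contraction was in 2020, when strict travel restrictions during the COVID-19 pandemic caused immigration to drop sharply. Germany's birth rate also hit a record low last year. Meanwhile population aging is accelerating. The 60-79 age group keeps growing, adding 358,000 people last year as the so-called baby boomers reach retirement age. The 20-59 age group, which makes up the bulk of taxpayers, shrank far more than average: it contracted by 1.0% last year, a loss of 409,000 people.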