OrangeBot.AI Digest — 2026-06-23
90 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Jerry's Map (www.jerrysmap.com)
- Claude Tag (www.anthropic.com)
- F3 (github.com)
- AI's Affordability Crisis (blog.dshr.org)
- Show HN: TikZ Editor – WYSIWYG editor for figures in LaTeX (tikz.dev)
- Mistral OCR 4 (mistral.ai)
- What we call "age verification" is actually mass surveillance (pluralistic.net)
- Madison Square Garden compiled a list of activists against facial recognition (www.404media.co)
- The Coming Loop (lucumr.pocoo.org)
- Unlimited OCR: One-shot long-horizon parsing (github.com)
- Israel targeted Gaza children resulting in genocide, UN inquiry says (www.reuters.com)
- Crypto in 2026: Oh, This Is the Bad Place (www.stephendiehl.com)
- Plotnine (plotnine.org)
- The new HTTP QUERY method explained (kreya.app)
- Will It Mythos? (swelljoe.com)
GitHub Trending(15)
- calesthio / OpenMontage
- ZhuLinsen / daily_stock_analysis
- mukul975 / Anthropic-Cybersecurity-Skills
- garrytan / gstack
- bytedance / deer-flow
- koala73 / worldmonitor
- palmier-io / palmier-pro
- anthropics / claude-plugins-official
- shanraisshan / claude-code-best-practice
- revfactory / harness
- jamiepine / voicebox
- JCodesMore / ai-website-cloner-template
- byoungd / English-level-up-tips
- DeusData / codebase-memory-mcp
- NousResearch / hermes-agent
Product Hunt(15)
- Latitude
Fix what's breaking in your AI agent
- Cotypist
Local AI Autocomplete in your voice, anywhere on your Mac
- NeuralAgent 3.0
AI that executes UI actions on your computer in ~285ms
- Blazly SEO
Dominate SEO with an AI content operating system
- OpenArt Director
Direct cinematic videos through chat
- Steam Machine
A tiny, powerful PC for big-screen gaming
- wildbirds
Birdwatchers app to share and discover birds socially
- Bluerails Discovery
The rails AI agents use to find and pay you
- Buddy AI Note
Your daily memo that turns notes into a plan
- BestDefense.io
Pentest and patch every deploy with AI
- Conduit
Fix the tool-list bloat slowing your AI agent
- Hush
Open-source noise suppression for voice AI agents
- Thumbmagic
AI thumbnail generator trained on top-performing thumbnails
- Sakana Fugu
One Model to Command Them All
- Rosply
AI agent that controls your computer autonomously
Hugging Face(15)
- PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
LLM agents increasingly operate in large tool ecosystems, where real-world tasks require discovering relevant tools, inferring implicit sub-goals, and adapting to dynamic environments over long horizons. However, existing benchmarks rarely evaluate planning under retrieval-limited tool visibility. To address this gap, we introduce PlanBench-XL, an interactive benchmark of 327 retail tasks over 1,665 tools that tests whether agents can iteratively retrieve usable tools, invoke them to uncover intermediate evidence for subsequent calls toward the final goal. PlanBench-XL further features an optional blocking mechanism that simulates real-world unpredictability through missing, failing, or distracting tool functions, forcing agents to detect disrupted paths and adapt at runtime. Experiments on ten leading LLMs show that massive-tool planning remains challenging: while GPT-5.4 achieves 51.90% accuracy in block-free settings, it collapses to 11.36% under the most severe blocking condition. Further analysis shows that agents are especially vulnerable when failures lack explicit error signals or when recovery requires longer alternative tool-use paths. These results establish PlanBench-XL as a testbed for diagnosing agentic planning failures and highlight the need for robust adaptive planning in long-horizon tasks with large, imperfect tool environments.
- OpenRath: Session-Centered Runtime State for Agent Systems
Modern agent systems often suffer from fragmented runtime state: transcripts, tool effects, memory events, workspace placement, branch provenance, and replay evidence are recorded separately and become difficult to inspect or reproduce. OpenRath addresses this issue with a PyTorch-like programming model for multi-agent, multi-session systems. The analogy concerns the role of a central first-class runtime abstraction, not tensor computation. Its core abstraction is Session, the runtime value passed between agents and workflows. A Session is branchable, inspectable, replayable, backend-aware, and composable. It records conversation chunks, sandbox placement, lineage metadata, token usage, pending work, and tool evidence, while defining where memory interactions enter the runtime record. Since this state is carried by the same value used in program execution, fork, merge, and replay become explicit runtime operations rather than states reconstructed from external traces. OpenRath further defines Sandbox, Tool, Agent, Memory, Workflow, and Selector, with Selector turning control flow into runtime-routed decisions. This report presents the programming model, architecture, audited milestones, and evidence protocol. Its claims are limited to controlled runtime properties, while broad quantitative comparisons, live-provider quality, optional-backend availability, and memory quality are left for follow-on evaluation. The central thesis is that Session provides agent systems with a first-class runtime value for auditable composition.
- DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams
Massive unstructured multimodal streams suffer from high "data entropy," impeding both efficient human knowledge acquisition and high-quality AI post-training. Existing passive annotation paradigms, heavily reliant on heuristic rules or general VLMs, are costly, monotonous, and fail to unlock the deep procedural logic embedded in raw data. We elevate data processing to a learnable capability, proposing a paradigm shift towards Agentic Data Tailoring, which actively refines and structures data to align with diverse user and downstream intents. To overcome the data scarcity bottleneck in training such high-order capabilities, we design a two-stage pipeline grounding generative semantic synthesis in deterministic Factual Anchors, yielding a large-scale dataset spanning five core physical and digital domains. Building upon this, the DataClaw_0-9B model synergizes Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO), achieving robust alignment with complex refinement and tailoring intents. To systematically quantify this capability, we construct DataClaw_0-val, the first benchmark dedicated to data refinement. Crucially, we adopt downstream post-training as the ultimate validation touchstone. Evaluations on video generation, real-world VQA, and GUI navigation confirm that DataClaw_0 delivers high-information-density tailored data, facilitating efficient model adaptation to new tasks under limited training data regimes. Project page: https://czjdsg.github.io/MakeAnyData
- EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions
Enterprise agents increasingly operate inside workspaces: they read heterogeneous files, invoke tools, and deliver business artifacts. We introduce EnterpriseClawBench, an enterprise agent benchmark constructed from proprietary, real-world agent sessions. Starting from a large archive of workplace sessions, EnterpriseClawBench produces 852 reproducible tasks, each paired with recovered fixtures, rewritten prompts, role classes, skill subclasses, hard rules, and semantic rubrics. Because the sessions contain internal enterprise content, we do not release the benchmark data; instead, our reusable contribution is the construction and evaluation protocol. On EnterpriseClawBench, the best configuration reaches only 0.663 (Codex with GPT-5.5). These results show that enterprise agent evaluation must report harness-model combinations, artifact delivery, visual quality, cost, runtime, and skill-transfer behavior, rather than collapsing performance into a single score. Code: https://github.com/FrontisAI/EnterpriseClawBench
- Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention
Self-attention is central to Transformer performance and is often the most expensive part of the Transformer at long context lengths because its pairwise token interactions scale quadratically with sequence length. Standard dense attention also applies the same set of attention heads to every token regardless of token difficulty or information content. This uniform activation can waste compute, especially as sequences grow longer and attention cost increases rapidly. We propose Grouped Query Experts (GQE), a mixture-of-experts layer on top of grouped-query attention (GQA). Within each GQA group, a router selects k query-head experts per token while all key-value (KV) heads remain dense and unchanged. Thus, GQE keeps the KV cache benefits of GQA and reduces only the active query-head computation. On a fixed 30B token budget at the 250M parameter scale, GQE matches the all-active GQA baseline in downstream accuracy while activating half the query heads per token.
- KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking
As retrieval systems scale, high-quality reranking becomes increasingly important. However, most existing rerankers, whether encoder-based or decoder-based, jointly encode the query and passage, tightly coupling their computation and limiting deployment efficiency as well as flexibility. We present KaLM-Reranker-V1, a fast but not late-interaction (FBNL) reranker that decouples query and passage computation while retaining expressive relevance modeling. Built on an encoder-decoder architecture, KaLM-Reranker-V1 uses the encoder to pre-encode passages with Matryoshka embedding pooling, while the decoder models the system instruction, user instruction, and query intent; cross-attention then captures relevance between the query context and passage representations. This design makes KaLM-Reranker-V1 efficient through decoupled passage encoding, yet not late interaction, by preserving rich relevance modeling through cross-attention. We instantiate KaLM-Reranker-V1 in three sizes, Nano, Small, and Large, with 0.27B, 1B, and 4B activated parameters, respectively. Extensive experiments on BEIR, MIRACL, and LMEB demonstrate that KaLM-Reranker-V1 achieves strong reranking performance with superior efficiency. On BEIR, KaLM-Reranker-V1 achieves state-of-the-art performance, on par with strong industrial models such as the Qwen3-Reranker series; on MIRACL, despite not being extensively trained on multilingual data, KaLM-Reranker-V1 still shows excellent reranking performance. Moreover, on LMEB, reranking models demonstrate a clear advantage, with even the 0.27B Nano model remaining competitive with 7-12B embedding models.
- World Action Models: A Survey
World Action Models (WAMs) are embodied predictive-action models that make a forecast of the future available to action. Recent WAMs repurpose large video generation models, and a parallel line relies on language or vision-language backbones without a video-generation core. This rapid expansion has blurred the boundary among broad world models, video generation models, action-grounded video world models, Vision-Language-Action policies, and WAMs. This survey gives the field a common account. It first clarifies these boundaries, then organizes existing works through two complementary views. The first view asks what each method is required to generate, spanning rendered futures, latent futures, and video-generation-free action reasoning. The second view decomposes each method by predictive substrate, backbone, action coupling, and deployment regime. This anatomy supports a unified discussion of interactability, causality, persistence, physical plausibility, and generalization, followed by data, evaluation, and open challenges. Across these axes, a consistent design pattern emerges: WAMs are not simply video generators with action heads, but predictive-action methods whose design choices trade representational richness against compute, memory, latency, and action-label cost. The field is moving toward methods that generate less of the future while preserving what control requires. The survey homepage is available at https://world-action-models.github.io/.
- CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents
While recent LLM-based terminal agents have demonstrated promising capabilities, the scarcity of high-quality, executable training data remains a critical bottleneck. Existing synthesis pipelines typically scale by retrofitting surface-level artifacts into tasks, frequently yielding ambiguous instructions, shallow execution paths, and brittle tests that provide weak learning signals. To overcome this, we introduce CLI-Universe, a principled synthesis engine that constructs terminal-agent tasks. CLI-Universe generates candidate tasks by sampling combinations across a multi-dimensional capability taxonomy (domain, skill type, capability, and engineering pillar), then grounds each candidate through evidence-guided deep research over real-world technical materials. To ensure rigorous supervision, validated blueprints are instantiated into Dockerized environments and subjected to a multi-stage executable verification pipeline featuring rubric-gated test construction, hint-conditional filtering, and strict fail-to-pass checking. Across the full pipeline, from candidate generation to verification, approximately two-thirds of candidates are discarded, retaining only those that are genuine, verifiable, and non-trivially challenging. To validate our framework, we instantiate a highly distilled dataset of 6,000 trajectories called CLI-Universe-6K. Remarkably, fine-tuning Qwen3-32B on CLI-Universe-6K achieves 33.4% on Terminal-Bench 2.0. This sets a new state-of-the-art for models trained on open-source data at or below 32B parameters, and outperforms several models an order of magnitude larger, demonstrating the profound data efficiency of structured, high-fidelity synthesis.
- BioMatrix: Towards a Comprehensive Biological Foundation Model Spanning the Modality Matrix of Sequences, Structures, and Language
We present BioMatrix, the first multimodal foundation model that natively integrates sequences, structures, and natural language for both molecules and proteins within a single decoder-only architecture. Existing biological foundation models pursue native multimodality and broad entity coverage separately: those that fuse multiple modalities under a shared objective remain confined to a single entity type, while those spanning multiple entity types either omit explicit structural modeling or rely on adapter-based designs in which the model cannot natively generate the very modalities it can read. BioMatrix closes this gap by mapping molecular sequences (supporting both SMILES and SELFIES notations), molecular structures, protein sequences, protein structures, and natural language into a shared discrete token space through a unified tokenization scheme, so that all modalities are consumed and produced uniformly under a single next-token prediction objective -- without external encoders, projection adapters, or modality-specific output heads. Built upon the Qwen3 language model (1.7B and 4B), BioMatrix is continually pretrained on 304.4 billion tokens spanning general and domain-specific text, sequence and structure views of molecules and proteins, and cross-modal corpora that interleave biomolecular entities with scientific text and link distinct entities through molecule-protein and protein-protein interaction data. After tuning on a comprehensive suite of downstream applications covering 80 tasks across 6 categories -- encompassing single-entity and multi-entity understanding and generation tasks across and within modalities -- BioMatrix achieves state-of-the-art or competitive performance on 77 out of 80 tasks, demonstrating that a single, natively multimodal generalist model can effectively match or surpass specialized approaches across a wide range of biological tasks.
- HydraHead: From Head-Level Functional Heterogeneity to Specialized Attention Hybridization
The quadratic complexity of attention poses a critical bottleneck for long-context processing, spurring interest in hybrid attention designs. Most open-source hybrid models adopt a layer-wise strategy. Yet, prior work has noted the inherent difficulty of integrating Linear Attention (LA) with Full Attention (FA), suggesting that the design space of attention hybridization remains underexplored. To probe this space, we conduct interpretability analysis and observe that layers exhibit block-wise functional similarity, while individual heads within the same layer display distinct functional specialization despite sharing input features. This head-level heterogeneity suggests that the head dimension provides a natural and principled granularity for fusing heterogeneous attention signals. Building on this insight, we introduce HydraHead, a novel architecture that hybridizes FA and LA along the head axis. HydraHead features two key innovations: (1) an interpretability-driven selection strategy that identifies retrieval-critical heads and preserves FA only for them, and (2) a scale-normalized fusion module that reconciles the distributional gap between FA and LA head outputs. By leveraging a three-stage transfer pipeline with parameter reuse and distillation, we achieve high-performance hybrid models with minimal training overhead. Under a unified training setup, HydraHead outperforms other hybrid designs in long-context tasks while maintaining strong general reasoning. With interpretability-driven head selection, it matches a 3:1 layer-wise hybrid's long-context performance at a 7:1 LA-to-FA ratio. Crucially, trained on only 15B tokens, HydraHead achieves over 69% improvement over the baseline at 512K context length, approaching Qwen3.5, a leading model of comparable size with a native context length of 256K. This highlights the significant scaling potential of head-level hybridization.
- SkillHarness: Harnessing Safe Skills for Computer-Use Agents
Computer-Use Agents (CUAs) are increasingly deployed in dynamic interactive environments, creating a growing need for continual skill learning during interaction. Recent approaches address this challenge by learning reusable skills from successful trajectories. However, these skill learning methods largely assume static and safe environments, overlooking risks from adversarial interactions (e.g., prompt injections) and environmental dynamics (e.g., pop-ups). In dynamic settings, such assumptions can lead to risky skill learning and brittle execution, undermining the reliability of CUAs. This raises the question: how can CUAs learn and use skills safely in dynamic environments? To address this problem, we propose SkillHarness, a framework for safe skill harnessing in dynamic environments. SkillHarness moves beyond static skill abstractions by modeling skill learning and utilization as a safety-constrained interaction process. Specifically, we introduce the skill boundary that leverages multi-source supervision signals to identify safe skills from interaction trajectories, and construct self-improving safety constraints throughout the skill lifecycle. In addition, SkillHarness introduces selective skill reuse, where tasks are guided to decompose according to context and completed through the selective activation of skill subsets. Our experiments demonstrate that SkillHarness significantly reduces the unsafe rate of learned skills by 57.1% and consistently improves execution stability under dynamic environmental changes, outperforming existing baselines.
- Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding
Autoregressive generation in large language models (LLMs) conventionally decodes from the final layer, assuming that deeper representations yield more reliable next-token predictions. We revisit this assumption by revealing a recurring Guess-Refine-Perturb dynamic: early layers form coarse guesses, intermediate layers refine reasoning-relevant semantics, and final layers can perturb these refined predictions toward generic or alignment-preferred tokens. We introduce Confident Decoding, a training-free decoding strategy that dynamically selects the most reliable near-final layer through entropy-guided conservative backward search. We further provide a theoretical formulation of layer selection as an optimal stopping problem, showing that under bounded projection noise and dominant late-stage alignment perturbation, our search rule filters perturbation while bounding the loss relative to the oracle refinement layer. Experiments across dense and Mixture-of-Experts LLMs demonstrate consistent gains on challenging reasoning benchmarks, including GPQA-Diamond, Omni-MATH, and HLE, with zero memory overhead and less than 2% latency increase. These results suggest dynamically bypassing final-layer perturbations can unlock stronger reasoning behavior from aligned LLMs.
- Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation
Self-distillation improves reasoning in large language models by using the model's own rollouts as training signal, typically through implicit logit-level alignment that minimizes KL divergence toward a privileged target distribution. However, because this supervision is generated via uncontrolled sampling, it provides no diagnostic insight into the model's specific errors or corrective guidance for its individual failure patterns. Consequently, the model learns to imitate a privileged distribution rather than receiving fine-grained corrections that pinpoint where and why its reasoning fails. In this paper, we propose Trajectory-Augmented Policy Optimization (TAPO), which advances self-distillation from implicit distributional alignment to explicit trajectory construction. During RL training, the model produces both correct and incorrect rollouts to the same query, and TAPO leverages this contrastive structure to construct micro-reflective corrections, new training trajectories that retain the model's erroneous reasoning up to the point of failure, then insert a natural-language diagnosis and corrected reasoning guided by a correct reference from the same sampling group. Since each trajectory is anchored in the learner's own prefix and solutions, the corrective signal preserves the model's on-policy distribution to a greater extent than the position-wise alignment imposed by KL-based methods. To integrate these trajectories, TAPO introduces difficulty-aware candidate selection at the model's capability boundary and decoupled advantage estimation to prevent gradient contamination. Experiments on AIME 2024, AIME 2025, and HMMT 2025 show that TAPO achieves consistent improvements over GRPO under the same number of training steps. Further analysis demonstrates that TAPO strengthens both first-pass reasoning and error-correction effectiveness.
- Unlimited OCR Works
Recently, end-to-end OCR models, exemplified by DeepSeek OCR, have once again thrust OCR into the spotlight. A widely held view is that employing a large language model (LLM) as the decoder allows the model to leverage the prior distribution of language, leading to improved OCR performance. However, the downside is equally evident: as the output sequence lengthens, the accumulated KV cache drives up memory consumption and progressively slows down generation. This stands in stark contrast to humans, who exhibit no such decline in efficiency during long-horizon copying tasks. In this technical report, we propose Unlimited OCR, a model designed to emulate human parsing working memory. Taking DeepSeek OCR as the baseline, we replace all attention layers in the decoder with our proposed Reference Sliding Window Attention (R-SWA), which reduces attention computation costs while maintaining a constant KV cache throughout the entire decoding process. By combining the high compression rate of DeepSeek OCR's encoder with our constant KV cache design, Unlimited OCR can transcribe dozens of pages of documents in a single forward pass under a standard maximum length of 32K. More importantly, R-SWA is a general-purpose parsing attention mechanism - beyond OCR, it is equally applicable to tasks such as ASR, translation, etc. Codes and model weights are publicly available at http://github.com/baidu/Unlimited-OCR.
- Foresight: Failure Detection for Long-Horizon Robotic Manipulation with Action-Conditioned World Model Latents
Long-horizon tasks are common in real-world robotic deployments, yet failure detection for such tasks remains underexplored. Detecting failures in long-horizon robotic tasks is particularly challenging because failure onset is often ambiguous and dense temporal annotations are typically unavailable. We present Foresight, a failure detection framework that monitors manipulation trajectories using latent representations from an action-conditioned world model. Foresight is trained using only final task-level success or failure labels. By leveraging predictive world-model embeddings, our method provides a unified framework for failure detection across different policies. We further use functional conformal prediction (FCP) to calibrate detection thresholds adaptively. We evaluate Foresight with state-of-the-art vision-language-action policies in simulation on LIBERO-Long, ManiSkill-Long, and BEHAVIOR-1K, compare it against state-of-the-art failure detection methods, and validate it on real robots with three long-horizon tasks on a ReactorX-200 arm and one task on a Franka arm. Our results suggest that action-conditioned world-model embeddings provide a scalable representation for reliable failure monitoring in long-horizon manipulation.
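Editor's sketch for the Grouped Query Experts entry above: the abstract says that, within each GQA group, a router selects k query-head experts per token while the KV heads stay dense. The snippet below is a minimal illustration of that selection step for a single token, not the authors' implementation. The function name route_query_heads, the flat group-by-group logit layout, and the softmax gating over the selected heads are all assumptions made for illustration; the abstract does not specify how gate weights are computed.

```python
import math

def route_query_heads(router_logits, num_groups, k):
    """Pick the top-k query heads inside each GQA group for one token.

    router_logits: flat list of per-query-head router scores, ordered
    group by group (hypothetical layout). Returns (active, gates), where
    gates maps each active head index to a softmax weight computed over
    the selected heads of its own group (an assumed gating rule).
    """
    heads_per_group = len(router_logits) // num_groups
    if heads_per_group * num_groups != len(router_logits):
        raise ValueError("heads must divide evenly into groups")
    if not 1 <= k <= heads_per_group:
        raise ValueError("k must be between 1 and heads per group")

    active, gates = [], {}
    for g in range(num_groups):
        start = g * heads_per_group
        group = range(start, start + heads_per_group)
        # Rank heads within this group only; KV heads are never routed.
        top = sorted(group, key=lambda h: router_logits[h], reverse=True)[:k]
        peak = max(router_logits[h] for h in top)  # for numerical stability
        exps = {h: math.exp(router_logits[h] - peak) for h in top}
        total = sum(exps.values())
        for h in sorted(top):
            active.append(h)
            gates[h] = exps[h] / total
    return active, gates

# Two GQA groups of four query heads each; keep half the heads per group.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 1.4, -0.3, 0.0]
active, gates = route_query_heads(logits, num_groups=2, k=2)
print(active)
```

With k set to half the heads in each group, only half of the query heads are active for the token, which mirrors the "half the query heads per token" setting reported in the abstract.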
Techmeme(15)
- Sources: the Trump administration is pressing Meta to submit its AI models for voluntary review; Meta is the only major US AI developer without an agreement (New York Times)
New York Times : Sources: the Trump administration is pressing Meta to submit its AI models for voluntary review; Meta is the only major US AI developer without an agreement — Federal officials are urging the lone major tech company holdout to allow government safety evaluations, weeks after ordering Anthropic to pull its latest model.
- Cerebras reports Q1 revenue up 94% YoY to $193.4M, net loss down 41% to $14M, and forecasts core gross margin to shrink in Q2; CBRS drops 8%+ after hours (Jordan Novet/CNBC)
Jordan Novet / CNBC : Cerebras reports Q1 revenue up 94% YoY to $193.4M, net loss down 41% to $14M, and forecasts core gross margin to shrink in Q2; CBRS drops 8%+ after hours — Cerebras reported financials for the first time since its IPO in May. … Cerebras said revenue almost doubled in the AI chipmaker's …
- Mistral debuts OCR 4, a model featuring structured document extraction with bounding boxes, block classification, and inline confidence scores, in 170 languages (Mistral AI Blog)
Mistral AI Blog : Mistral debuts OCR 4, a model featuring structured document extraction with bounding boxes, block classification, and inline confidence scores, in 170 languages — Today, we're releasing Mistral OCR 4, featuring bounding boxes, block classification, and inline confidence scores alongside extracted text.
- Sources: Hadrian, which is building AI-powered factories to produce space and defense parts, is in talks to raise ~$1B at a ~$7.5B post-money valuation (Bloomberg)
Bloomberg : Sources: Hadrian, which is building AI-powered factories to produce space and defense parts, is in talks to raise ~$1B at a ~$7.5B post-money valuation — Company runs AI-powered factories that aim to speed up manufacturing — Defense manufacturing startup Hadrian Automation Inc …
- Sources: Miami-based cybersecurity company Varonis is exploring options including a potential sale after receiving takeover interest; VRNS jumps 6%+ (Bloomberg)
Bloomberg : Sources: Miami-based cybersecurity company Varonis is exploring options including a potential sale after receiving takeover interest; VRNS jumps 6%+ — Cybersecurity company Varonis Systems Inc. is exploring options including a potential sale after receiving takeover interest, according to people familiar with the matter.
- The FCC says an auction of wireless mid-band spectrum raised $3.5B+, which will largely be used to fund the replacement of Chinese telecom equipment in the US (David Shepardson/Reuters)
David Shepardson / Reuters : The FCC says an auction of wireless mid-band spectrum raised $3.5B+, which will largely be used to fund the replacement of Chinese telecom equipment in the US — The U.S. Federal Communications Commission said Thursday an auction of wireless mid-band spectrum raised more than $3.5 billion …
- Walmart acquires Vibe.co, which lets businesses create and buy ads on CTVs, sources say for $1.4B cash; top executives get $180M to stay for four years (Sarah Nassauer/Wall Street Journal)
Sarah Nassauer / Wall Street Journal : Walmart acquires Vibe.co, which lets businesses create and buy ads on CTVs, sources say for $1.4B cash; top executives get $180M to stay for four years — Retail giant is paying $1.4 billion for Vibe.co, a company that enables advertising through connected TVs
- Alibaba sues the DOD, seeking removal from a blacklist of companies supporting China's military, says the decision is a violation of constitutional due process (Bloomberg)
Bloomberg : Alibaba sues the DOD, seeking removal from a blacklist of companies supporting China's military, says the decision is a violation of constitutional due process — Alibaba Group Holding Ltd. sued the Department of Defense to be removed from a blacklist that identifies the e-commerce leader …
- On the first day of their trial, two members of Scattered Spider plead guilty in the UK to charges stemming from a 2024 cyberattack on Transport for London (Brian Krebs/Krebs on Security)
Brian Krebs / Krebs on Security : On the first day of their trial, two members of Scattered Spider plead guilty in the UK to charges stemming from a 2024 cyberattack on Transport for London — Two men pleaded guilty in the United Kingdom this week to criminal charges stemming from an August 2024 cyberattack that crippled Transport …
- Anthropic launches Claude Tag, an agentic AI coworker for Slack that can learn context, give suggestions, and more, in beta for Claude Team and Enterprise tiers (David Gewirtz/ZDNET)
David Gewirtz / ZDNET : Anthropic launches Claude Tag, an agentic AI coworker for Slack that can learn context, give suggestions, and more, in beta for Claude Team and Enterprise tiers — ZDNET's key takeaways — Claude Tag puts an always-on AI coworker inside Slack. — Each Slack channel can get its own isolated Claude identity.
- Sources: Meta is building a standalone prediction markets app called Arena, which would probably rely on video game-like points instead of money wagers (New York Times)
New York Times : Sources: Meta is building a standalone prediction markets app called Arena, which would probably rely on video game-like points instead of money wagers — The experimental app, internally called “Arena,” would be independent of Facebook and Instagram. It could compete for attention …
- Sources: SpaceX, which is seeking to raise between $20B and $25B in its debut US bond sale, has drawn about $89B of demand (Bloomberg)
Bloomberg : Sources: SpaceX, which is seeking to raise between $20B and $25B in its debut US bond sale, has drawn about $89B of demand
- Stark, which has a deal to supply "kamikaze" drones to the German military, raised €500M from Founders Fund, Sequoia, others; it faced criticism over Thiel ties (Financial Times)
Financial Times : Stark, which has a deal to supply “kamikaze” drones to the German military, raised €500M from Founders Fund, Sequoia, others; it faced criticism over Thiel ties — US billionaire's fund participated in €500mn funding round despite previous criticism of company's links with financier
- LastPass notifies customers that their personal information and customer support case records were stolen during a hack at Canadian market research company Klue (Zack Whittaker/TechCrunch)
Zack Whittaker / TechCrunch : LastPass notifies customers that their personal information and customer support case records were stolen during a hack at Canadian market research company Klue — Password manager maker LastPass is notifying customers that their personal information and customer support case records …
- Beehiiv adds Cloudflare AI Crawl Control to writers' dashboards, allowing them to decide whether AI crawlers can scrape their work (Duncan Riley/SiliconANGLE)
Duncan Riley / SiliconANGLE : Beehiiv adds Cloudflare AI Crawl Control to writers' dashboards, allowing them to decide whether AI crawlers can scrape their work — Cloudflare Inc. and newsletter platform beehiiv Inc. today launched an integration that hands independent publishers a single toggle to decide whether …
Solidot(15)
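- Soybean protein content falls under heat, drought, and high CO2
Soybeans are an important source of protein, but climate change is increasingly affecting their yield and nutritional quality. According to a study published in Food Research International, elevated carbon dioxide raises soybean seed yield by up to 142%, while high temperature and drought reduce yield by 91% and 60% respectively. Under the combined effect of elevated CO2, heat, and drought, seed yield may rise by 50%, soluble sugar content by 35%, and amino acid content by 175%, while starch content falls by 20% and protein content by 6%.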
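- China's new supercomputer Lingsheng (灵晟) tops the Top500 list
Top500 has published its latest supercomputer rankings, and Lingsheng, at the National Supercomputing Center in Shenzhen, took first place in its debut. Lingsheng has a theoretical peak of 2.736 Exaflop/s and reached 2.198 Exaflop/s on the HPL benchmark, making it the first system on the Top500 list to sustain more than 2 Exaflops of double-precision floating-point performance using CPUs alone. It uses 304-core LX2 CPUs, for a total of 13.79 million cores running at 1.55 GHz, runs the Kylin operating system, and draws 42.2 megawatts. The top five systems all reach at least 1 Exaflop/s: Lingsheng; El Capitan at Lawrence Livermore National Laboratory in the US, using 4th-generation AMD EPYC processors, at 1.809 Exaflop/s; Frontier at Oak Ridge National Laboratory (ORNL), using 3rd-generation AMD EPYC, at 1.353 Exaflop/s; Aurora at Argonne National Laboratory, using Intel Xeon CPUs, at 1.012 Exaflop/s; and JUPITER Booster at the Jülich Supercomputing Centre in Germany, using the Nvidia GH Superchip 72C 3GHz, at 1 Exaflop/s. They are followed by Italy's HPC7, Microsoft's Azure supercomputer Eagle, Italy's HPC6, Japan's Fugaku, and Switzerland's Alps. Of the top ten systems, four use AMD EPYC processors, two use Nvidia processors, and two use Intel processors; Lingsheng's CPU architecture was not disclosed. Across the Top500, the US has 162 systems, Japan 44, Germany 41, and China 30. Lenovo built the most systems with 129, followed by HPE with 124, BULL with 58, Dell with 49, and Nvidia with 37.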
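- Oracle cut 21,000 jobs over the past year
According to Oracle's latest annual report, the company laid off about 21,000 people worldwide over the past year as it reshapes its business around AI. As of May 31, 2026, Oracle had about 141,000 full-time employees, down from 162,000 a year earlier. Oracle said in the report that deploying AI technology in its operations has already led to, and may continue to lead to, reductions in total headcount. The layoffs amount to about 13% of Oracle's workforce. Employment tracking firms estimate that more than 100,000 tech workers were laid off over the past year. Oracle said it paid $1.8 billion in severance and other restructuring costs over the period.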
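- Wikipedia co-founder Larry Sanger banned
Wikipedia co-founder Larry Sanger, who has embraced conservatism and supports MAGA, resurfaced on Wikipedia saying he wanted to help reform it, that is, to take it back from liberals. He launched a proposal called "WikiProject Intellectual Diversity" aimed at adding more conservative voices. He promoted the proposal through his social media accounts, violating Wikipedia's policy on stealth canvassing. The move stirred controversy in the Wikipedia community, and he was ultimately banned.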
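- Younger generations are aging faster biologically
A team led by Dr. Yin Cao of Washington University School of Medicine analyzed data from more than 154,000 participants in the UK Biobank and more than 10,000 participants in the US NIH All of Us Research Program, assessing their systemic and organ-level aging. The researchers found that, after accounting for chronological age, Britons born in 1965-1974 were aging biologically faster than Britons born in 1950-1954, a statistically significant difference of 0.23 standard deviations. The US data showed a similar pattern: compared with Americans born in 1965-1969, those born in 1990-1999 were aging biologically faster, by 0.92 standard deviations. Accelerated biological aging in younger cohorts was associated with an increased risk of early-onset cancer.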
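- Wolves return to Europe
Last summer, a woman was walking with her two young children in a nature park near Utrecht in the Netherlands when she saw a large animal charging toward them. She first took it for a playful dog, but soon heard her 6-year-old son scream as the animal dragged him into the woods. Two adults who happened to be passing nearby drove it off with sticks. The animal that attacked the boy was not a dog but a wolf. Wolf numbers have surged in many parts of Europe, setting off a fierce debate over how to deal with them. Thanks to strict legal protection, the population of gray wolves (Canis lupus) has grown substantially since 2000, but attacks on livestock and on people have also become more frequent. Last year the European Commission relaxed its rules to allow more wolves to be culled. Scientists oppose the change, arguing that genetic evidence shows the wolf population is not as large as it appears and that protecting livestock with electric fences and guard dogs is more effective than culling. Scientists estimate there are now about 23,000 wolves in EU member states, compared with about 12,000 in 2012.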
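- Interstellar comet 3I/Atlas may be the oldest object seen in the solar system
The interstellar comet 3I/Atlas, now crossing the solar system, may be the oldest object discovered there so far. It formed 12 billion years ago. Using NASA's James Webb Space Telescope (JWST), a research team precisely measured the comet's chemical composition and concluded that it was born in a star-forming region of the Milky Way in the early universe. The finding gives a glimpse of what other planetary systems are made of and how they differ from the solar system. When heated by sunlight, 3I/Atlas releases water vapor, carbon monoxide, carbon dioxide, and even vapor of metals such as nickel and iron. Two isotopic signatures reveal its great age; isotopes are atoms of the same element with the same number of protons but different numbers of neutrons. First, the comet's ratio of carbon-12 to carbon-13 is far higher than that of any object in the solar system. Violent explosions of massive stars steadily build up carbon-13 in the universe, so the comet's very low carbon-13 content indicates it formed early, before large numbers of stars had evolved to the supernova stage. Second, the comet is rich in semi-heavy water, that is, water molecules in which one hydrogen atom carries an extra neutron. Such molecules form more readily in the cold, strongly irradiated environments that were common in massive-star-forming regions of the early universe.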
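- DDR2 and DDR3 memory prices are rising
Over the past few months, the memory shortage caused by the AI boom has pushed DDR4 and DDR5 module prices up several times over. Because DDR4 and DDR5 now cost too much, some hardware makers have begun lowering their memory specifications and turning to older modules, which in turn has pushed up DDR2 and DDR3 prices. Market research firm TrendForce says hardware makers are replacing DDR4 with DDR3 designs, or DDR3 with DDR2-based designs, to control costs. It forecasts that DDR2 contract prices will rise by about 55% to 60% in the second quarter of 2026 and by a further 35% to 40% in the third quarter. Meanwhile, DDR2 manufacturers say they are shifting capacity to higher-margin products such as DDR3, DDR4, and LPDDR4.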
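- Meta pauses internal AI training program after leak of sensitive information
Meta has paused an internal AI training program after a leak of sensitive information. The leak exposed employees' private conversations, performance data, and transcripts. A Meta spokesperson confirmed the incident, said the company is investigating, and said there is currently no indication that Meta employees improperly accessed any data. Meta announced the program, called the Model Capability Initiative, in April this year. It aims to use employees' keystrokes and mouse movements as training data to improve the company's AI models. The program is mandatory for most employees and drew strong objections from some who were uneasy about having their data recorded. The latest leak has frustrated Meta staff, who criticize the company for failing to secure the data from the start.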
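- Police chief used Flock license plate tracking system to track ex-girlfriends
William C. Copp, the 54-year-old police chief of Holiday Hills, Illinois, was arrested on June 18 and charged with two counts of official misconduct. Prosecutors allege he used Flock Group's license plate tracking system to track six people he knew, three of them ex-girlfriends. He paid particular attention to the former boyfriend of one ex-girlfriend, running at least 140 queries over several months; the man sought a no-contact order as a result. According to a tally by the Institute for Justice, as of June 2026 there had been at least 18 cases in the US of police using Flock's system to track acquaintances. For example, a sheriff in Jerome County, Idaho queried his wife's license plate more than 700 times in three months; a former police chief in Sedgwick, Kansas ran 164 queries on his ex-girlfriend's plate and 64 on the plate of her current boyfriend; a Milwaukee officer tracked his partner and the partner's ex more than 100 times... Queries against Flock's database do not require a warrant, and the company claims that requiring one could endanger lives in an emergency. The ACLU, the EFF, and the Institute for Justice all maintain that license plate queries should require a warrant.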
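- Steam Machine starts at $1,049
Valve has officially announced pricing for its Steam Machine console. With the AI boom causing shortages of memory and SSDs, the price has risen to a level that will be unattractive to most people: Steam Machine 512GB at $1,049, Steam Machine 512GB + Steam Controller bundle at $1,128, Steam Machine 2TB at $1,349, and Steam Machine 2TB + Steam Controller bundle at $1,428. Valve explained that hardware prices depend directly on component costs, and that when it began sourcing components for the Steam Machine in 2023, past trends suggested component prices would fall over time. Over roughly the past year, however, conditions changed quickly and significantly, most visibly for memory and storage components, which ultimately made the original target pricing for the Steam Machine unworkable. The prices announced today therefore reflect the current state of global manufacturing, or more precisely, the prices of components that could be secured over the past six months. To keep bots from grabbing the limited stock, Valve said it will randomize the order of reservations. It will release the first batch on June 29 and continue processing the reservation queue in order as stock becomes available.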
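- A look back at the attack on the AUR
The Arch User Repository (AUR), a repository of user-submitted software, recently suffered a large-scale malicious attack. The attackers created a series of new accounts, used them to take over unmaintained packages (known as orphaned packages), planted malicious code, and pushed malicious updates. Arch project maintainers have now closed new user registration and are discussing how to handle the abused orphaned packages. Packages in the AUR are submitted by users; other users can search for and download PKGBUILD files, then resolve dependencies, build, install, and update the software. The AUR does not provide binary versions of the software. It currently holds more than 107,000 packages, of which nearly 14,000 are unmaintained and available for adoption. Any registered user can adopt and modify an orphaned package. The packages are not reviewed, and users install them at their own risk. Other Linux distributions have similar repositories, such as Fedora's Copr, openSUSE's Open Build Service (OBS), and Ubuntu's Personal Package Archives (PPA). These services differ significantly from the AUR, however: they provide build environments similar to those used for official packages, and they do not allow precompiled binaries or proprietary software. The AUR's overly permissive rules were what the attackers abused.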
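- HPV vaccine cuts the risk of dying from cervical cancer before age 30 to almost zero
According to WHO data, cervical cancer is the fourth most common cancer in women, and 99% of cases are caused by high-risk human papillomavirus (HPV). Although the HPV vaccine can prevent about 90% of cervical cancers, its effect on survival has been unclear. In a new study published in The Lancet, researchers at Queen Mary University of London found that cervical cancer mortality among vaccinated people has fallen markedly since the HPV vaccine was introduced in 2008. The effect on mortality is so large that the researchers estimate that girls vaccinated at age 12 or 13 have almost no chance of dying from cervical cancer before age 30. For vaccinated women aged 30-34, the relative risk of dying from cervical cancer fell by 63%. Between 2020 and 2024, for the first time on record, no woman aged 20-24 in England died of cervical cancer. Besides cervical cancer, the HPV vaccine also prevents anal, penile, vaginal, vulval, mouth, and throat cancers, as well as genital warts. Both boys and girls receive the vaccine in Year 8, and some areas offer catch-up doses to students in Years 9 and 10. Before the COVID-19 pandemic, vaccination coverage was close to the WHO target, but it has dropped sharply since.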
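- Anthropic to require identity verification for access to certain features
Anthropic has updated its privacy policy: from July 8, 2026, some features will require identity verification, which will be handled by Persona. Persona is a third-party identity verification company backed by Peter Thiel. Discord previously ended its age-verification partnership with Persona following strong user backlash and a data breach in February 2026.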
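- Linux 7.2 kernel completely removes the strncpy function
After 6 years and 362 patches, the Linux 7.2 kernel has finally removed the strncpy() function entirely. strncpy() is a C string-copy function that the kernel documentation labels "actively dangerous". It is a major source of one class of memory bug: kernel buffers holding sensitive data can leak bytes past the boundary of an unterminated string, resulting in memory disclosure. strncpy() has been replaced by 5 different functions: strscpy() for NUL-terminated destinations, strscpy_pad() for NUL-terminated, zero-padded destinations, strtomem_pad() for fixed-width fields that are not NUL-terminated, memcpy_and_pad() for bounded copies with explicit padding, and memcpy() for copies of known length.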
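The leak described above can be illustrated with a small byte-level model. This is only a sketch in Python, not kernel code: `strncpy` and `strscpy` here are hypothetical Python stand-ins that mimic the documented C semantics (strncpy never forces a terminator; strscpy always terminates and returns -E2BIG on truncation), and the 8-byte field followed by a "secret" is an invented memory layout.

```python
E2BIG = 7  # errno value that strscpy returns negated on truncation


def strncpy(mem: bytearray, src: bytes, n: int) -> None:
    """Model of C strncpy: copy at most n bytes, zero-pad a short source,
    and add no terminator when the source fills the field."""
    assert n <= len(mem)
    chunk = src[:n]
    mem[:n] = chunk + b"\x00" * (n - len(chunk))


def strscpy(mem: bytearray, src: bytes, size: int) -> int:
    """Model of the kernel's strscpy: always NUL-terminate within size bytes;
    return the copied length, or -E2BIG if the source was truncated."""
    assert size <= len(mem)
    if size == 0:
        return -E2BIG
    chunk = src[: size - 1]
    mem[: len(chunk) + 1] = chunk + b"\x00"
    return len(chunk) if len(src) < size else -E2BIG


def read_c_string(mem: bytearray) -> bytes:
    """Read bytes up to the first NUL, as a C string reader would."""
    end = mem.find(b"\x00")
    return bytes(mem) if end < 0 else bytes(mem[:end])


def fresh_memory() -> bytearray:
    # Hypothetical layout: an 8-byte field followed directly by a secret.
    return bytearray(b"\xAA" * 8 + b"hunter2\x00")


FIELD = 8

leaky = fresh_memory()
strncpy(leaky, b"12345678", FIELD)      # source exactly fills the field

safe = fresh_memory()
rc = strscpy(safe, b"12345678", FIELD)  # truncates but terminates

print(read_c_string(leaky))     # b'12345678hunter2'
print(read_c_string(safe), rc)  # b'1234567' -7
```

In the model, the strncpy copy leaves the field unterminated, so a later string read runs on into the adjacent secret, while the strscpy copy loses one character but stays inside the field and reports the truncation.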
OrangeBot Weekly
5 Claude Code skills worth using each week — with my verdict on what’s actually good. No hype.