Curated by Shen Huang · 90 stories · ~14 min read
DIGEST · 2026-05-25

OrangeBot.AI Digest — 2026-05-25

90 headlines across 8 sources, aggregated for this day.

Hacker News(15)

  1. The bootstrapper's EU stack for under €10 per month (eualternative.eu)
  2. Exit IP VPN servers mitigation rollout (mullvad.net)
  3. California moves to exempt Linux from its age-verification law after backlash (www.tomshardware.com)
  4. Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing (www.businessinsider.com)
  5. Gnutella: A Protocol Outliving the World That Created It (rickcarlino.com)
  6. Netherlands Seizes 800 Servers, Arrests 2 for Aiding Cyberattacks (krebsonsecurity.com)
  7. Microsoft pulls plug on plans for 244-acre data center in Caledonia (2025) (www.tmj4.com)
  8. Pope Leo: opaque AI run by few firms risks "New Forms of Dehumanization" (variety.com)
  9. Pope Leo XIV says AI must serve humanity, not the powerful few (religionnews.com)
  10. Leave Me Behind (androidessence.com)
  11. Search engines alternatives now that Google isn't Google anymore (techcrunch.com)
  12. Magnifica Humanitas (www.vatican.va)
  13. Bytecode VMs in surprising places (2024) (dubroy.com)
  14. I love my Bluetooth keyboard (liquidbrain.net)
  15. Jira Is Turing-Complete (seriot.ch)

GitHub Trending(15)

  1. Lum1104 / Understand-Anything
  2. anthropics / knowledge-work-plugins
  3. rohitg00 / ai-engineering-from-scratch
  4. affaan-m / ECC
  5. mukul975 / Anthropic-Cybersecurity-Skills
  6. colbymchenry / codegraph
  7. manaflow-ai / cmux
  8. multica-ai / andrej-karpathy-skills
  9. Fincept-Corporation / FinceptTerminal
  10. paperless-ngx / paperless-ngx
  11. anthropics / claude-cookbooks
  12. Leonxlnx / taste-skill
  13. moeru-ai / airi
  14. shiyu-coder / Kronos
  15. Axorax / awesome-free-apps

Product Hunt(15)

  1. own.page

    Make your own personal website with bento tiles

  2. Unabyss

    MCP-native self-updating context layer for your AI

  3. Pi Coding Agent

    The coding-agent harness you can make your own

  4. Yansu

    AI that learns how you work and turns it into software

  5. Orchestria

    AI music engine with granular stem control

  6. LLMTest

    Use the right LLMs in your apps. Setup fallbacks. Be happy.

  7. Fred

    AI-orchestrated UX research with behavioural tracking

  8. Rixx

    The Perplexity alternative that organizes your research

  9. MashuPack

    Turn codebases into a clean file for Claude and ChatGPT

  10. The Incident Challenge

    Production Debugging Games for Software Engineers

  11. tldx

    Fast CLI to bulk-check domains via RDAP & MCP

  12. Tiny CV

    Resume builder that fits on one page

  13. Supaboard 3.0

    AI data analysts that understand your business

  14. Forum

    Dedicated space for Facebook groups

  15. Databerry

    Track all your business data in a single dashboard

Hugging Face(15)

  1. SkillOpt: Executive Strategy for Self-Evolving Agent Skills

    Agent skills today are hand-crafted, generated one-shot, or evolved through loosely controlled self-revision, none of which behaves like a deep-learning optimizer for the skill, and none of which reliably improves over its starting point under feedback. We argue the skill should instead be trained as the external state of a frozen agent, with the same discipline that makes weight-space optimization reproducible. SkillOpt is, to our knowledge, the first systematic controllable text-space optimizer for agent skills: a separate optimizer model turns scored rollouts into bounded add/delete/replace edits on a single skill document, and an edit is accepted only when it strictly improves a held-out validation score. A textual learning-rate budget, rejected-edit buffer, and epoch-wise slow/meta update make skill training stable while adding zero inference-time model calls at deployment. Across six benchmarks, seven target models, and three execution harnesses (direct chat, Codex, Claude Code), SkillOpt is best or tied on all 52 evaluated (model, benchmark, harness) cells and beats every per-cell competitor among human, one-shot LLM, Trace2Skill, TextGrad, GEPA, and EvoSkill skills. On GPT-5.5 it lifts the average no-skill accuracy by +23.5 points in direct chat, by +24.8 inside the Codex agentic loop, and by +19.1 inside Claude Code. Transfer experiments further show that optimized skill artifacts retain value when moved across model scales, between Codex and Claude Code execution environments, and to a nearby math benchmark without further optimization.
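
    The acceptance rule in this abstract (apply a bounded add/delete/replace edit, keep it only if the held-out validation score strictly improves, otherwise buffer it as rejected) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: `apply_edit`, `optimize_skill`, `propose`, and `validate` are hypothetical names, the skill is modeled as a list of lines, and the textual learning-rate budget and epoch-wise slow/meta update are not modeled.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical edit format: (op, line index, text), op in {"add", "delete", "replace"}.
Edit = Tuple[str, int, str]


def apply_edit(skill: List[str], edit: Edit) -> List[str]:
    """Return a copy of the skill document with one bounded edit applied."""
    op, idx, text = edit
    lines = list(skill)
    if op == "add":
        lines.insert(idx, text)
    elif op == "delete":
        del lines[idx]
    elif op == "replace":
        lines[idx] = text
    else:
        raise ValueError(f"unknown edit op: {op}")
    return lines


def optimize_skill(
    skill: List[str],
    propose: Callable[[List[str], List[Edit]], Optional[Edit]],
    validate: Callable[[List[str]], float],
    steps: int,
) -> Tuple[List[str], float, List[Edit]]:
    """Greedy loop: accept an edit only if validation strictly improves."""
    best = list(skill)
    best_score = validate(best)
    rejected: List[Edit] = []  # buffer of edits that did not help
    for _ in range(steps):
        edit = propose(best, rejected)  # stand-in for the optimizer model
        if edit is None:
            break
        candidate = apply_edit(best, edit)
        score = validate(candidate)
        if score > best_score:  # strict improvement required
            best, best_score = candidate, score
        else:
            rejected.append(edit)
    return best, best_score, rejected
```

    In this sketch `propose` stands in for the separate optimizer model and `validate` for the held-out scorer; ties are rejected, which matches the "strictly improves" wording above.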

  2. Rethinking Cross-Layer Information Routing in Diffusion Transformers

    Diffusion Transformers (DiTs) have become a de facto backbone of modern visual generation, and nearly every major axis of their design -- tokenization, attention, conditioning, objectives, and latent autoencoders -- has been extensively revisited. The residual stream that governs how information accumulates across layers, however, has been directly inherited from the original Transformer. In this paper, we present a systematic empirical analysis of cross-layer information flow in DiTs, jointly along depth and denoising timestep, and identify three concrete symptoms of traditional residual addition, namely monotonic forward magnitude inflation, sharp backward gradient decay, and pronounced block-wise redundancy. Motivated by this diagnosis, we propose Diffusion-Adaptive Routing (DAR), a drop-in residual replacement that performs learnable, timestep-adaptive, and non-incremental aggregation over the history of sublayer outputs. Moreover, the proposed DAR is compatible with many modern Transformer enhancement methods, such as REPA. On ImageNet 256×256, DAR improves SiT-XL/2 by 2.11 FID (7.56 vs. 9.67) and matches the baseline's converged quality with 8.75× fewer training iterations. Stacked on top of REPA, it yields a 2× training acceleration in the early stage, suggesting cross-layer information routing as an underexplored design axis in diffusion modeling, one that operates orthogonally to existing representation-alignment objectives. Beyond pretraining, DAR can also be applied during the fine-tuning stage of large-scale T2I models and preserves high-frequency details during Distribution Matching Distillation.

  3. Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

    We introduce Lens, a 3.8B-parameter T2I model that achieves performance competitive with, and in several cases surpassing, state-of-the-art models with more than 6B parameters across various benchmarks, while requiring significantly less training compute. For example, Lens requires only about 19.3% of the training compute used by Z-Image. The training efficiency of Lens stems from two key strategies beyond its compact model size. First, we maximize data information density per training batch by (i) training on Lens-800M, a dataset of 800M densely captioned image-text pairs whose captions are generated by GPT-4.1 and contain approximately 109 words on average, providing richer semantic supervision than conventional short captions, and (ii) constructing each batch from images with multiple resolutions and diverse aspect ratios, thereby enlarging the effective visual coverage of each optimization step. Second, we improve convergence speed through careful architectural choices, including adopting a semantic VAE that provides better latent representations and employing a strong language encoder that accelerates optimization while enabling multilingual generalization from English-only training data. After pre-training, we apply RL with taxonomy-driven prompts (Lens-RL-8K) and structured reward rubrics to suppress artifacts and improve visual quality, a reasoner module with training-free system prompt search to better align user requests with the model, and distillation-based acceleration for 4-step inference. Through efficient training and systematic optimization, Lens generalizes to arbitrary aspect ratios from 1:2 to 2:1 and resolutions up to 1440^2, and supports prompts in several commonly used languages. Thanks to its compact size, Lens generates a 1024^2 image in 3.15 seconds on a single NVIDIA H100 GPU, while its distilled turbo version performs 4-step generation in 0.84 seconds.

  4. SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research

    The exponential growth of global academic output has confronted researchers and AI agents with an unprecedented "information explosion," where fragmented and unstructured knowledge organization impedes deep interdisciplinary integration. Current academic retrieval tools predominantly rely on superficial keyword matching or vector-space semantic retrieval, which lack the topological reasoning capabilities required to navigate complex logical connections. Agentic deep-research-based frameworks are often prone to logical hallucinations and incur high inference costs. To bridge this gap, in this report, we introduce SciAtlas, a large-scale, multi-disciplinary, heterogeneous academic resource knowledge graph designed as a panoramic scientific evolution network. By integrating over 43M papers from 26 disciplines, and a total of 157M entities and 3B triplets, SciAtlas provides a structured topological cognitive substrate that dismantles disciplinary barriers and furnishes AI agents with a global perspective. Furthermore, we develop a neuro-symbolic retrieval algorithm featuring tri-path collaborative recall and graph reranking, achieving a seamless transition from simple semantic matching to deterministic association discovery. We also present key application directions of SciAtlas, including literature review, automated research trend synthesis, idea positioning, and academic trajectory exploration, to demonstrate that SciAtlas can serve as an effective "cognitive map" to empower the full loop of automated scientific research while significantly reducing reasoning costs. We have released the interfaces for KG retrieval and various downstream tasks in our GitHub repo.

  5. StepAudio 2.5 Technical Report

    Unified audio-language modeling has emerged as a prominent trend in modern speech systems, promising to bring the reasoning capabilities of large language models to auditory tasks. However, existing unified foundations often struggle to match the depth of specialized systems across automatic speech recognition (ASR), text-to-speech synthesis (TTS), and realtime spoken interaction. Bridging this gap remains an open challenge. This report presents StepAudio 2.5, a unified audio-language foundation model that matches or exceeds specialized systems across all three capabilities. Rather than treating these tasks as architecturally distinct, we operate on the premise that once text and audio share a multimodal representational space, task specialization becomes a matter of operational regimes: data construction, optimization targets, and decoding constraints. Guided by this insight, we advance the post-training paradigm from standard supervised learning to task-tailored Reinforcement Learning from Human Feedback (RLHF), using it as the primary mechanism to define complex optimization targets. We leverage this RLHF-centric alignment, alongside specialized decoding, to shape a shared backbone into three distinct operational modes. Concretely, the ASR branch advances transcription efficiency via verifiable multi-token decoding; the TTS branch achieves controllable, expressive synthesis through preference-based RLHF and context-rich supervision; and the Realtime branch realizes low-latency, persona-consistent dialogue via generative reward modeling within an RLHF framework. On standard benchmarks, StepAudio 2.5 achieves state-of-the-art results across ASR, TTS, and Realtime, demonstrating that a singular audio-language foundation can successfully internalize the distinct deployment objectives of speech understanding, generation, and live interaction.

  6. See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding

    We present SWIM (See What I Mean), a novel training strategy that aligns vision and language representations to enable fine-grained object understanding solely from textual prompts. Unlike existing approaches that require explicit visual prompts, such as masks or points, SWIM leverages mask supervision only during training to guide cross-modal attention, allowing the model to automatically attend to the user-specified object at inference. Our cross-attention analysis of pretrained multimodal large language models (MLLMs) reveals a systematic discrepancy: attribute words produce sharp, localized activations in the visual modality, whereas object nouns yield diffuse and scattered patterns due to semantic reference bias and distributed high-level representations. To address this misalignment, we construct NL-Refer, an enriched dataset in which each object mask is paired with a precise natural language referring expression. SWIM extracts multi-layer cross-attention maps from object nouns and enforces spatial consistency with ground-truth masks. Experimental results demonstrate that SWIM substantially improves text-visual alignment and achieves superior performance over visual-prompt-based methods on fine-grained object understanding benchmarks. The code and data are available at https://github.com/HumanMLLM/SWIM.

  7. From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

    Language agents increasingly improve by reusing skills -- structured procedural artifacts distilled from past experience. In particular, domain-level and model-generated skills are especially promising. They offer fast adaptation within a domain by encoding domain-specific recurring procedures, and they scale beyond labor-intensive hand-crafting. However, while extraction methods continue to proliferate, understanding remains limited, with no comprehensive study spanning the full skill lifecycle -- experience generation, skill extraction, and skill consumption -- to ask whether such skills actually work, when they work, and what makes them succeed or fail. To close this gap, we build a utility-grounded evaluation framework that provides systematic experimental results across extractors and target agents, covering five diverse agentic task domains. We find that model-generated skills are beneficial on average but exhibit non-trivial negative transfer, and that neither extractors nor targets behave uniformly. A model can be a strong extractor yet a weak consumer, or vice versa, with skill utility independent of model scale or baseline task strength. To explain these patterns, we then dissect each lifecycle stage in depth, analyzing how experience composition shapes skill quality, what properties characterize useful skills, and how the same skill transfers across different consumers. Finally, we translate these findings into a concrete meta-skill that guides skill extraction toward the features tied to actual utility, which consistently improves skill quality across domains and substantially reduces negative transfer.

  8. PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion

    Most practical high-resolution text-to-image systems, including latent diffusion and autoregressive models, perform generation in a compact latent space, and a decoder maps the generated latents back to pixels. Yet the latent-to-pixel decoder is reconstruction-oriented, optimized to invert the encoder rather than synthesize more details, and becomes increasingly costly at megapixel scale. This drawback calls for a more expressive and efficient decoding paradigm. Motivated by recent progress in scalable pixel-space diffusion, we introduce PiD, a Pixel diffusion Decoder that reformulates latent decoding as conditional pixel diffusion, unifying decoding and upsampling into one generative module. By denoising directly in high-resolution pixel space, PiD synthesizes 4× and even 8× upscaled images with low latency. For latent conditioning, a lightweight sigma-aware adapter injects noise-corrupted latents into the pixel diffusion backbone, enabling PiD to decode partially denoised latents and terminate the latent diffusion process early. To further improve efficiency, we distill the model using DMD2, reducing inference to just 4 steps. PiD applies to both conventional VAE latents and semantic latents (e.g., SigLIP, DINOv2) used in recent RAE-based models. PiD decodes latents of 512×512 images into 2048×2048 pixels in under 1 second with 13 GB peak memory on a consumer RTX 5090, and as fast as 210 ms on a GB200 GPU, about 6× faster than cascaded diffusion-based super-resolution pipelines with better visual fidelity.

  9. PhotoFlow: Agentic 3D Virtual Photography Missions

    Virtual photography asks an agent to enter a prepared 3D scene with no preselected camera pose or reference image, infer a suitable shot from scene information and a language intent, choose executable camera parameters, and render the final photograph. Recent progress in vision-language models makes this kind of spatial agent increasingly plausible, but the task stresses two capabilities that remain hard to evaluate together: complex 3D spatial understanding and abstract aesthetic judgment. We introduce PhotoFlow, a Director-Reviewer-Reflector agent for closed-loop camera search. The Director builds a soft photographic blueprint and proposes diverse candidate cameras; the Reviewer combines rule checks, visual critique, and pairwise incumbent selection; and the Reflector converts failures into region memory, dead-zone suppression, and high-explore relocation. We also introduce VPhotoBench, a benchmark of 47 open-license Blender scenes and 141 language-conditioned photography missions spanning subject placement, relational composition, and atmosphere/style. On held-out experiments, PhotoFlow achieves the strongest external quality-alignment composite and success rate among one-shot prediction, single-chain reflection, anchor-bank selection, and random search under a six-round rendering budget. To our knowledge, this is the first work to make language-conditioned virtual photography in arbitrary Blender scenes an executable agent task, and our results show that an LLM-centered spatial agent can already produce strong photographs in a setting designed to challenge both 3D reasoning and aesthetic choice.

  10. VGenST-Bench: A Benchmark for Spatio-Temporal Reasoning via Active Video Synthesis

    Spatio-temporal reasoning is a core capability for Multimodal Large Language Models (MLLMs) operating in the real world. As such, evaluating it precisely has become an essential challenge. However, existing spatio-temporal reasoning benchmark datasets primarily rely on static image sets or passively curated video data, which limits the evaluation of fine-grained reasoning capabilities. In this paper, we introduce VGenST-Bench, a video benchmark that employs generative models to actively synthesize highly controlled and diverse evaluation scenarios. To construct VGenST-Bench, we propose a multi-agent pipeline incorporating a human quality control stage, ensuring the quality of all generated videos and QA pairs. We establish a comprehensive 3x2x2 video taxonomy, encompassing Spatial Scale, Perspective, and Scene Dynamics to span diverse scenarios. Furthermore, we design a hierarchical task suite that decouples low-level visual perception from high-level spatio-temporal reasoning. By shifting the paradigm from passive curation to active synthesis, VGenST-Bench enables fine-grained diagnosis of spatio-temporal understanding in MLLMs.

  11. RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution

    Discrete autoregressive (AR) text-to-image (T2I) models pair a VQ tokenizer with an AR policy, and current post-training pipelines optimize only the policy while keeping the VQ decoder frozen. Recent diffusion T2I work, exemplified by REPA-E, has shown that the VAE itself constitutes a key alignment bottleneck, yet no analogous investigation exists for discrete AR models. We show that policy-only optimization induces Latent Covariate Shift: as the policy evolves, the resulting token distribution diverges from the ground-truth distribution on which the decoder was trained, such that reward scores improve while decoded image quality degrades. To address this mismatch, we propose RankE, the first end-to-end post-training framework for discrete T2I generation. Rather than optimizing the policy against a fixed decoder, RankE co-evolves both components through alternating optimization: each module maximizes a ranking-based alignment objective while being regularized by a stability-preserving anchor suited to its parameter space. This co-evolution breaks the fidelity-alignment trade-off that plagues frozen-decoder approaches: on LlamaGen-XL (775M), standard RL improves CLIP but degrades FID, whereas RankE improves both simultaneously (FID 15.21, CLIP 33.76 on MS-COCO 30K). Consistent gains on Janus-Pro (1B) confirm that decoder co-evolution reliably converts reward optimization into pixel-space quality improvements.

  12. ETCHR: Editing To Clarify and Harness Reasoning

    Multimodal Large Language Models have advanced visual reasoning, yet a purely textual chain of thought remains a bottleneck for questions that require fine-grained focus or view transformations. The "think with images" paradigm narrows this gap, but existing approaches are either constrained by fixed predefined toolkits or produce noisy intermediate images from unified multimodal methods. We pursue a third option: using a dedicated image-editing model that is decoupled from the understanding model. However, off-the-shelf image editors fail as reasoning assistants because of two complementary gaps: a language-side gap, where editors trained as passive instruction-followers cannot map an abstract question to an appropriate visual transformation, and a generation-side gap, where edit correctness degrades as reasoning depth grows. Guided by this analysis, we introduce ETCHR (Editing To Clarify and Harness Reasoning), a question-conditioned, reasoning-aware image editor decoupled from the downstream understanding model and trained with a two-stage recipe targeted at the two gaps: Reasoning Imitation via supervised fine-tuning on edit trajectories, followed by Reasoning Enhancement with VLM-derived rewards for edit correctness and downstream reasoning accuracy. Since the editor is decoupled, ETCHR plugs into different open- and closed-source MLLMs in a training-free manner. Across five task families (fine-grained perception, chart understanding, logic reasoning, jigsaw restoration, and 3D understanding), ETCHR raises average Pass@1 from 55.95 to 60.77 (+4.82) with Qwen3-VL-8B, from 65.08 to 70.55 (+5.47) with Gemini-3.1-Flash-Lite, and from 76.55 to 81.16 (+4.61) with the 1T-parameter MoE model Kimi K2.5.

  13. SCOPE: Simulating Cross-game Operations in Playable Environments for FPS World Models

    Interactive world models for first-person shooter (FPS) games must resolve high-frequency overlapping control signals at every frame without disrupting unaffected regions. Existing methods inject actions globally and train on single titles, failing under dense FPS inputs. We observe that FPS actions are spatially selective: discrete events such as firing or reloading affect only a localized region around the weapon (the scope), while continuous camera and movement signals govern stable surroundings. We propose SCOPE, which inserts a conditioning module into each transformer block of a pretrained video diffusion model. It reshapes features into per-pixel temporal sequences so that each position computes its action response from local visual content. This separates in-scope effects from out-of-scope generation without segmentation labels. We also introduce CrossFPS, the first multi-game FPS dataset with frame-aligned action telemetry. It comprises 69K clips from 7 titles with 10-DoF controller signals, curated to remove gameplay bias. The model learns general visual-to-action mappings rather than game-specific patterns, enabling zero-shot transfer to unseen scenes. Experiments confirm strong action responsiveness, precise scope separation, and effective cross-game generalization.

  14. LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

    Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a noisy channel, grounded in the Shannon-Hartley theorem. By mapping model parameters to channel bandwidth and training tokens to signal power, our formulation explicitly captures the interaction between learning signal and intrinsic noise. This perspective reveals a fundamental Shannon capacity for LLMs: scaling model size or data without preserving a sufficient signal-to-noise ratio (SNR) inevitably amplifies noise, inducing a transition from monotonic improvement to U-shaped performance degradation. We validate our theory through experiments on Pythia and OLMo2 under perturbations, including Gaussian noise, quantization and supervised fine-tuning on math, QA and code tasks. The Shannon Scaling Law consistently outperforms classical scaling laws and recent perturbation-aware laws, achieving strong R^2 scores and accurately capturing loss basins missed by prior approaches. It also extrapolates: fitted on ≤6.9B Pythia models with ≤180B tokens, it predicts the unseen 12B model up to 307B tokens at pooled R^2 = 0.847, while monotonic baselines collapse.
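
    For orientation, the classical Shannon-Hartley theorem that this abstract builds on is shown below. Only the textbook statement is given; the paper's own functional form is not quoted in this digest, so reading B as model parameters and S as training tokens is just the mapping the abstract describes, not a reproduction of the proposed law.

```latex
% Classical Shannon-Hartley theorem (textbook form, not the paper's fitted law):
%   C   = channel capacity (bits per second)
%   B   = channel bandwidth (Hz)
%   S/N = signal-to-noise power ratio (linear, not dB)
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```

    Capacity grows linearly in B but only logarithmically in S/N, so adding bandwidth while the signal-to-noise ratio stays low buys comparatively little.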

  15. From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models

    Recent advances in vision-language models (VLMs) emphasize long chain-of-thought reasoning; yet, we find that their performance on visual tasks is primarily limited by a lack of visual perception as opposed to reasoning itself. In this work, we systematically study the interplay between perception and reasoning in VLM post-training by decomposing their capabilities into three separate training stages: visual perception, visual reasoning, and textual reasoning, incorporating specialized training data. We demonstrate that visual perception (a) requires targeted optimization with specialized data; (b) serves as a fundamental scaffold that should be solidified through staged training before refining visual reasoning; and (c) is more effectively learned via RL than caption-based SFT. Our experiments across multiple VLMs demonstrate that staged training consistently improves both visual perception and reasoning performance over merged training. Notably, models trained with our approach achieve 1.5% higher reasoning accuracy with 20.8% shorter reasoning traces, suggesting that superior perception reduces the need for excessive reasoning. Furthermore, we show that this capability-based staging represents a new curriculum dimension orthogonal to traditional difficulty-based curricula, and combining both yields further additive gains. Our staged-training models achieve superior performance among open-weight VLMs, establishing advanced results on several visual math and perception (e.g., +5.2% on WeMath and +3.7% on RealWorldQA) tasks compared with the base counterpart.

Techmeme(15)

  1. X says it is cracking down on large accounts that have been gaming its revenue-sharing program by "programmatically reuploading content from smaller accounts" (Lakshmi Varanasi/Business Insider)

    Lakshmi Varanasi / Business Insider : X says it is cracking down on large accounts that have been gaming its revenue-sharing program by “programmatically reuploading content from smaller accounts” —  X is introducing new strategies to rein in its copycat economy.  —  Elon Musk's social media platform is now cracking …

  2. Iranian state media reports that President Masoud Pezeshkian has issued an order to reopen international internet access after a near-90-day blackout (Reuters)

    Reuters : Iranian state media reports that President Masoud Pezeshkian has issued an order to reopen international internet access after a near-90-day blackout —  Iran's President Masoud Pezeshkian has issued an order to reopen international internet access, Iranian state media reported on Monday …

  3. Report: the EU plans to fine Google a high triple-digit million euro amount as part of a 2025 probe over concerns it favors its own services in search results (Reuters)

    Reuters : Report: the EU plans to fine Google a high triple-digit million euro amount as part of a 2025 probe over concerns it favors its own services in search results —  The European Union is planning to fine Alphabet's (GOOGL.O) Google a high triple-digit million euro amount as part of an antitrust investigation …

  4. A look at the UK's AI Safety Institute, whose researchers probe AI models for safety gaps, as its work becomes a blueprint for other governments' AI policies (New York Times)

    New York Times : A look at the UK's AI Safety Institute, whose researchers probe AI models for safety gaps, as its work becomes a blueprint for other governments' AI policies —  The government's A.I. Security Institute, staffed by alumni from OpenAI and Google, is becoming a model for countries grappling with A.I.'s emerging risks.

  5. Study: rate of fabricated references in biomedical papers has grown 12x+ since 2023; in early 2026, one in 277 papers had at least one non-existent reference (Tristan Bove/Fortune)

    Tristan Bove / Fortune : Study: rate of fabricated references in biomedical papers has grown 12x+ since 2023; in early 2026, one in 277 papers had at least one non-existent reference —  It was a process that had become routine for Maxim Topaz.  —  The associate professor at Columbia University's School of Nursing …

  6. How Iranian threat actor Nimbus Manticore used techniques like AI-assisted malware development and SEO poisoning to target companies during the US-Iran war (Check Point Research)

    Check Point Research : How Iranian threat actor Nimbus Manticore used techniques like AI-assisted malware development and SEO poisoning to target companies during the US-Iran war —  Key Findings  — The Iranian, IRGC affiliated, threat actor Nimbus Manticore resurfaced during Operation Epic Fury …

  7. SoftBank stock jumped 4.6% to a record high on Monday, spurred by hopes of big returns from the company's stakes in OpenAI and SB Energy Corp if they go public (Bloomberg)

    Bloomberg : SoftBank stock jumped 4.6% to a record high on Monday, spurred by hopes of big returns from the company's stakes in OpenAI and SB Energy Corp if they go public —  SoftBank Group Corp. shares climbed to a record high, spurred by hopes of big returns from the Japanese investor's stakes in OpenAI …

  8. Star Citizen, a video game in development since 2012, has reached $1B in lifetime funding; it remains in alpha and does not have a confirmed release date (Jennifer Maas/Variety)

    Jennifer Maas / Variety : Star Citizen, a video game in development since 2012, has reached $1B in lifetime funding; it remains in alpha and does not have a confirmed release date —  Cloud Imperium Games marked a major milestone Sunday (May 24) as the game developer's open world massively multiplayer online space game …

  9. Tether plans to launch GELT, an "official" stablecoin representing the Georgian lari, with the support of Georgia's government in an unusual partnership (Reuters)

    Reuters : Tether plans to launch GELT, an “official” stablecoin representing the Georgian lari, with the support of Georgia's government in an unusual partnership —  Tether, the world's biggest stablecoin issuer, plans to launch a crypto token representing the Georgian lari with the support …

  10. Sources: Wix is expected to cut ~1,000 jobs in the coming months, or ~20% of its workforce, after weak Q1 earnings and a ~50% collapse of its stock in 2026 (Sophie Shulman/CTech)

    Sophie Shulman / CTech : Sources: Wix is expected to cut ~1,000 jobs in the coming months, or ~20% of its workforce, after weak Q1 earnings and a ~50% collapse of its stock in 2026 —  The company will reduce roughly 20% of its workforce after a steep stock decline and rising AI-related costs.

  11. A section of the Pope's encyclical describing AI's unpredictability suggests influence from Anthropic, whose co-founder Christopher Olah attended the unveiling (Washington Post)

    Washington Post : A section of the Pope's encyclical describing AI's unpredictability suggests influence from Anthropic, whose co-founder Christopher Olah attended the unveiling —  In “Magnifica humanitas,” he fires a broadside against AI companies, warning of the technology's dangers in the same way Pope Francis did about climate change.

  12. In his ~43,000-word encyclical, the Pope urged governments to slow down AI development and decried "new forms of slavery" in AI and tech supply chains (Joshua McElwee/Reuters)

    Joshua McElwee / Reuters : In his ~43,000-word encyclical, the Pope urged governments to slow down AI development and decried “new forms of slavery” in AI and tech supply chains —  Pope Leo urged governments to slow down the development of AI systems in his first major document, released on Monday …

  13. More than 5,500 GitHub repositories were infected with malware in a supply chain attack, dubbed Megalodon, on May 18 that relies on automated commits (Ionut Arghire/SecurityWeek)

    Ionut Arghire / SecurityWeek : More than 5,500 GitHub repositories were infected with malware in a supply chain attack, dubbed Megalodon, on May 18 that relies on automated commits —  Fake automated commits injected GitHub Actions workflows containing payloads to steal credentials, CI secrets, keys, and tokens.

  14. Pope Leo XIV presents Magnifica humanitas, his encyclical on AI, calling for AI regulation, protection for children against hypersexualized AI images, and more (New York Times)

    New York Times : Pope Leo XIV presents Magnifica humanitas, his encyclical on AI, calling for AI regulation, protection for children against hypersexualized AI images, and more —  The document marks a powerful foray by the leader of the Roman Catholic church into the debate about the misuse or overuse of artificial intelligence.

  15. Sources: Meta, Google, and Amazon execs met Vatican officials on April 29, as part of a quiet lobbying push ahead of Pope Leo XIV's first AI encyclical (Océane Herrero/Politico)

    Océane Herrero / Politico : Sources: Meta, Google, and Amazon execs met Vatican officials on April 29, as part of a quiet lobbying push ahead of Pope Leo XIV's first AI encyclical —  As Leo XIV prepares his first encyclical, technology firms and Western diplomats have worked to make their case inside the Vatican.

Solidot(15)

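  1. European law enforcement hacks VPN service to identify ransomware group users

    Europol disclosed that it hacked “First VPN”, a VPN service used by cybercriminals, accessed its user database, and identified thousands of users. First VPN's website now displays a notice that it has been seized by law enforcement. The service had advertised on Russian-language cybercrime forums, claiming to hide users' IP addresses, encrypt all traffic, and keep no logs. It also claimed that it would refuse to cooperate with judicial authorities, that its service fell under no jurisdiction, and that it stored no user data. First VPN began operating in 2014 and offered 32 exit node servers across 27 countries and regions. At least 25 ransomware groups used its infrastructure for network reconnaissance and intrusions. Police searched the service administrator's home in Ukraine and took down 33 servers.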

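  2. HBM accounts for two-thirds of AI chip component costs

    An analysis of AI chips from Nvidia, AMD, Google, and Amazon shows that HBM memory accounts for roughly two-thirds (63%) of AI chip component costs, with logic chips at 13%, advanced packaging at 15%, and auxiliary components at 9%. The four companies' spending on HBM rose from about $12 billion in 2024 to $32 billion in 2025, growing far faster than spending on other chip components. With memory supply still tight and prices rising, HBM's share could expand further in 2026. Hyperscale data center operators have already factored this into their capital expenditure forecasts: about $25 billion of Microsoft's $190 billion capex forecast for fiscal 2026 comes from higher component prices, and Meta raised its 2026 capex forecast by $10 billion, also citing higher component prices.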

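  3. HP investigates BIOS updates that cause laptop failures

    Over the past few months, HP laptop users have reported on forums and elsewhere that their devices developed problems after a BIOS update, including failure to boot, abnormal fan noise, and blue screens of death. One user of the ZBook Ultra G1a mobile workstation said the device hung during startup after the update. Affected products include the ZBook Ultra G1a, with problematic BIOS versions 01.04.03 and 01.04.05, and the EliteBook X G1a, with problematic BIOS versions 01.03.11 and 01.05.00. HP said it is investigating and advised affected users to contact its technical support team. This is not the first time a faulty BIOS update has caused HP devices to fail.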

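  4. Russia postpones plan to charge mobile VPN users

    The Russian government has postponed its plan to charge mobile internet users who use VPNs. Russia's Ministry of Digital Development said in March that it would crack down on VPN use. It initially required mobile network operators to start charging users whose international data traffic exceeded 15GB per month from May 1. Because of difficulties in tracking VPN use and in billing, that deadline was pushed back to June 1. The plan may be postponed again, possibly taking effect after the State Duma and local elections at the end of September, because a fully functional payment system for international traffic would take three to four months to build. Before the policy was introduced, Russia's mobile internet had suffered frequent outages.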

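  5. Political emotions differ from ordinary emotions

    According to a study in PNAS, the physiological responses to political emotions differ from those to the ordinary emotions of everyday life. Researchers asked nearly 1,000 US participants to use a body-mapping tool called emBODY to mark where in the body they felt ordinary and political emotions. The study found that political emotions have distinct bodily response patterns. Political depression, for example, produces broader and more intense bodily sensations rather than the numbness of ordinary depression, which suggests that political despair motivates people to act rather than leaving them indifferent. Political disgust also differs from ordinary disgust. Pathogen-driven disgust, such as the gag response, is felt strongly in the stomach and throat, whereas political disgust is felt more like anger. This suggests that politics turns disgust into a more moralized, angrier emotion, which changes how political disgust should be understood. The study also found that people of different ideologies experience political emotions differently: Democratic-leaning participants reported stronger bodily sensations than Republican-leaning participants for negative political emotions such as anger, anxiety, depression, and disgust.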

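  6. Scientists overturn a basic principle of aerodynamics

    For decades, a core principle of reducing air drag has been that the surface must be smooth. A research team at Japan's Tohoku University is the first to show that simply applying distributed micro-roughness (DMR) can reduce air drag by up to 43.6%. DMR is an extremely fine, irregular surface roughness that cannot be seen with the naked eye. Using the 1m-MSBS system, the team precisely measured the drag coefficients of smooth and DMR-coated surfaces and found that the DMR-coated surface had the lower drag coefficient.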

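  8. Scientists solve the mystery of how tobacco synthesizes nicotine

    Nicotine is the compound that makes tobacco addictive, and humans have been using it for more than ten thousand years. Yet after decades of research, scientists still did not fully understand how the tobacco plant synthesizes the nicotine molecule. According to a study published in Nature Communications, that mystery has now been solved. The research team found that nicotine is initially bound to a glucose molecule, which supplies energy to the basic building blocks of the nicotine molecule and speeds up their assembly; the glucose molecule is removed at the end. The paper's first author, Benjamin Schwabe, also determined the precise structures of two plant enzymes, NaGR and NicG, which help assemble the nicotine molecule from smaller fragments. The findings make it possible to use tobacco plants to produce safer drugs and vaccines.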

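  9. Japanese voice actor sues to have TikTok remove videos that imitate his voice with AI

    Popular Japanese voice actor Kenjiro Tsuda has filed suit in the Tokyo District Court, asking TikTok's operator to remove videos on the grounds that someone used generative AI to imitate his voice without permission and published the results. It may be the first lawsuit over unauthorized use of a voice by generative AI. Tsuda's “rich, deep bass voice” is considered his trademark, and he is known for voicing characters such as Kento Nanami in the anime Jujutsu Kaisen and Hyakunosuke Ogata in Golden Kamuy. According to the complaint, the name of the person who posted the videos is unknown. Between July 2024 and September 2025, the account posted 188 videos with narration imitating Tsuda's voice, covering urban legends, mysterious incidents, and trivia. Under TikTok's payment scheme, the account earned 500,000 to 750,000 yen per month. The defendant argues that the narration is “an ordinary male voice” with no distinctive manner of speaking and does not resemble Tsuda's voice. The account's poster explained that the videos were made after training the AI on a friend's voice and maintains that this is not illegal.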

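  10. Climate change threatens plant species worldwide

    According to a study published in Science, climate change is increasing the risk of extinction for plant species. The researchers analyzed more than 67,000 species of vascular plants (plants with complex internal tissue that conducts water and nutrients), of which roughly 300,000 to 400,000 species are known worldwide. The study found that 7% to 16% of vascular plants could lose more than 90% of their habitat and face a very high risk of extinction. A plant's habitat is not just a location on a map but the full set of conditions it needs to survive: temperature, rainfall, soil, land use, and geographic features such as shade. The study shows that climate change is shrinking the combinations of conditions suitable for plants, leaving fewer and fewer areas where everything a plant needs is present at once. Plants are the foundation of most terrestrial ecosystems. They store carbon, stabilize soil, provide habitat for wildlife, and supply food, timber, medicine, and more. Changes in plant diversity have knock-on effects for both nature and people.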

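  11. Firefox adds support for the Web Serial API and partners with Adafruit

    The newly released Firefox 151 adds support for the Web Serial API. The Web Serial API lets websites use JavaScript to read from and write to serial devices, such as USB and Bluetooth devices. Mozilla says most people will never use the API; its main audience is developers, who will be able to use the browser to communicate directly with compatible hardware. Mozilla also announced a partnership with Adafruit, the well-known open-source hardware platform. Adafruit's browser-based hardware workflows now run directly in Firefox. Taking the Adafruit ESP32-S development board as an example, Web Serial can display messages sent from web page code directly on the device, or the page's CSS properties can be changed directly from the handheld device.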

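  12. Global wind and solar generation exceeded gas generation in April

    In April, global wind and solar power generation exceeded generation from natural gas. According to an analysis by energy think tank Ember, wind and solar accounted for 22% of global electricity generation in April, while natural gas accounted for 20%. Combined wind and solar generation reached a record 531 TWh in April, 54 TWh more than the 477 TWh generated from gas. Five years earlier, in April 2021, gas generation was 476 TWh, almost exactly the same as today, but combined wind and solar generation was only 245 TWh, less than half of today's figure. April is spring in the Northern Hemisphere, when winds are usually strong, so wind generation generally rises in April. Ember's report, Global Electricity Review, holds that in 2025 wind and solar were enough to meet the growth in global electricity demand.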

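  13. Star Citizen fundraising passes $1 billion

    Star Citizen, in development for 14 years and still without a release date, has passed $1 billion in funding, reaching $1,003,408,183. Development is led by Wing Commander creator Chris Roberts, who set out to revive the space flight simulation genre by letting players explore, trade, and fight across a vast universe. The game was successfully crowdfunded on Kickstarter in 2012, with delivery originally planned for 2014. After the Kickstarter campaign ended, the development team continued raising money on its official website, and much of that fundraising is in fact the sale of in-game virtual items such as various ship models. It had raised $200 million by 2018, passed $600 million five years later, passed $700 million in May 2024, and passed $1 billion two years after that. Star Citizen is arguably the AAA game with the largest development budget in history.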

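  14. Study recommends 10 hours of exercise per week to protect cardiovascular health

    Macao Polytechnic University published a report in the British Journal of Sports Medicine concluding that adults should aim for about 560 to 610 minutes (roughly 10 hours) of exercise per week to significantly reduce the risk of cardiovascular diseases such as heart attack, stroke, and heart failure. The researchers analyzed data from more than 17,000 participants in the UK Biobank health database. Participants wore an accelerometer for one continuous week to record their daily activity levels, and their maximal oxygen uptake was measured and estimated with a cycling test. The participants were then followed for about 8 years to track disease incidence. The study's core result: following the WHO recommendation (at least 150 minutes of exercise per week) reduces cardiovascular disease risk by about 8% to 9%, while exercising 560 to 610 minutes per week reduces risk by more than 30%. The study notes that people with a weaker fitness baseline often need to exercise more to gain the same health benefits as fitter people. The authors stress that although 150 minutes per week remains an important entry-level standard, those seeking the best health outcomes should aim for more.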

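  15. Report says individuals bear at least 80% of the responsibility for their health in old age

    The Oxford Longevity Project has published a report, Living Longer, Better, arguing that individuals bear at least 80% of the responsibility for their health in old age. The report says people have far more control over their own lifespan than is commonly believed. Its conclusion rests on several studies that attribute at least 75% of human lifespan to environmental factors and modifiable lifestyle factors. One of those studies used data from nearly 500,000 UK Biobank participants and found that environmental exposures and habits affect premature death and biological aging far more than genetic factors do. The report recommends avoiding processed food, giving up alcohol entirely, getting enough sleep, not eating after 6:30 pm, and cultivating a so-called “non-meat-eating mindset”. On alcohol the report is especially blunt, calling it toxic and saying not to drink it. Critics argue that the report's conclusions are oversimplified, and that when it comes to poverty, pollution, and health coverage, individuals have limited control over their own choices.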
