OrangeBot.AI Digest — 2026-05-12
86 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model (github.com)
- CERT is releasing six CVEs for serious security vulnerabilities in dnsmasq (lists.thekelleys.org.uk)
- Googlebook (googlebook.google)
- Canada’s Bill C-22 Is a Repackaged Version of Last Year’s Surveillance Nightmare (www.eff.org)
- Instructure pays ransom to Canvas hackers (www.insidehighered.com)
- Why senior developers fail to communicate their expertise (www.nair.sh)
- Amazon employees are "tokenmaxxing" due to pressure to use AI tools (arstechnica.com)
- The Future of Obsidian Plugins (obsidian.md)
- Operation: Epic Furious (www.epicfurious.com)
- eBay Rejects GameStop's $56B Takeover as Not Credible (www.bloomberg.com)
- Bambu Lab is abusing the open source social contract (www.jeffgeerling.com)
- US inflation jumps to 3.8% as energy costs surge from Iran war (www.bbc.com)
- Rendering the Sky, Sunsets, and Planets (blog.maximeheckel.com)
- EU to crack down on TikTok, Instagram's 'addictive design' targeting kids (www.cnbc.com)
- Learning Software Architecture (matklad.github.io)
GitHub Trending(11)
Product Hunt(15)
- Vexilo
Claude Code planner w/ 31 agents, 92 commands, + 121 skills
- Seer Platform
The fastest way to go from idea to physical product
- Pixcode
A self-hosted control room for AI coding agents
- ARKAD Wallet
The budgeting app you’ll actually use.
- display.dev
Publish agent-generated HTML behind company auth
- Hyperswitch Prism
Library to plug-n-switch payment processors
- Free AI SEO Auditor
Audit your site for the AI search era. 100% Open Source
- Whisper Island by Coddo
Voice transcription lives in the Mac notch
- Open Vibe
Ship your SaaS with AI, without getting stuck
- EmailTemple
The AI studio for creating high conversion, on-brand emails
- TabGroup Vault
Keep your tab groups safe, searchable, and restorable
- hackerDen
Collaborative hackathon workspace with visible contributions
- Kelviq
Payments, tax, and billing for SaaS & AI companies
- Khaos Brain
Local predictive memory for AI agents
- knooth
Screen recording with AI-powered editing for Mac
Hugging Face(15)
- Qwen-Image-2.0 Technical Report
We present Qwen-Image-2.0, an omni-capable image generation foundation model that unifies high-fidelity generation and precise image editing within a single framework. Despite recent progress, existing models still struggle with ultra-long text rendering, multilingual typography, high-resolution photorealism, robust instruction following, and efficient deployment, especially in text-rich and compositionally complex scenarios. Qwen-Image-2.0 addresses these challenges by coupling Qwen3-VL as the condition encoder with a Multimodal Diffusion Transformer for joint condition-target modeling, supported by large-scale data curation and a customized multi-stage training pipeline. This enables strong multimodal understanding while preserving flexible generation and editing capabilities. The model supports instructions of up to 1K tokens for generating text-rich content such as slides, posters, infographics, and comics, while significantly improving multilingual text fidelity and typography. It also enhances photorealistic generation with richer details, more realistic textures, and coherent lighting, and follows complex prompts more reliably across diverse styles. Extensive human evaluations show that Qwen-Image-2.0 substantially outperforms previous Qwen-Image models in both generation and editing, marking a step toward more general, reliable, and practical image generation foundation models.
- CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models
Recent "Thinking with Video" approaches use Video Generation Models (VGMs) for visual reasoning by producing temporally coherent Chain-of-Frames as reasoning artifacts. Even strong VGMs, however, exhibit two recurring failure modes on goal-directed tasks: long-horizon drift on multi-step tasks and mid-clip simulation errors that compound. Both stem from the absence of explicit reasoning built upon the VGM's short-horizon visual prior, a role naturally filled by Vision-Language Models (VLMs), but where to place the VLM is non-trivial: upfront plans commit before any frame is generated and post-hoc critiques over whole videos intervene too late. We propose VLM-VGM Collaborative Video Reasoning (CollabVR), a closed-loop framework that couples the VLM with the VGM at step-level granularity: the VLM plans the immediate next action, inspects the clip the VGM generates, and folds the verifier's diagnosis directly into the next action prompt to repair detected failures. On Gen-ViRe and VBVR-Bench, CollabVR improves both open-source and closed-source VGMs over single-inference, Pass@k, and prior test-time scaling baselines at matched compute, with the largest gains on the hardest tasks. It also yields further improvements on top of a reasoning-fine-tuned VGM, indicating that step-level VLM supervision is orthogonal to and stackable with reasoning-oriented fine-tuning. We provide video samples and additional qualitative results at our project page: https://joow0n-kim.github.io/collabvr-project-page.
- TMAS: Scaling Test-Time Compute via Multi-Agent Synergy
Test-time scaling has become an effective paradigm for improving the reasoning ability of large language models by allocating additional computation during inference. Recent structured approaches have further advanced this paradigm by organizing inference across multiple trajectories, refinement rounds, and verification-based feedback. However, existing structured test-time scaling methods either weakly coordinate parallel reasoning trajectories or rely on noisy historical information without explicitly deciding what should be retained and reused, limiting their ability to balance exploration and exploitation. In this work, we propose TMAS, a framework for scaling test-time compute via multi-agent synergy. TMAS organizes inference as a collaborative process among specialized agents, enabling structured information flow across agents, trajectories, and refinement iterations. To support effective cross-trajectory collaboration, TMAS introduces hierarchical memories: the experience bank reuses low-level reliable intermediate conclusions and local feedback, while the guideline bank records previously explored high-level strategies to steer subsequent rollouts away from redundant reasoning patterns. Furthermore, we design a hybrid reward reinforcement learning scheme tailored to TMAS, which jointly preserves basic reasoning capability, enhances experience utilization, and encourages exploration beyond previously attempted solution strategies. Extensive experiments on challenging reasoning benchmarks demonstrate that TMAS achieves stronger iterative scaling than existing test-time scaling baselines, while hybrid reward training further improves scaling effectiveness and stability across iterations. Code and data are available at https://github.com/george-QF/TMAS-code.
- PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents
A LaTeX manuscript that compiles without error is not necessarily publication-ready. The resulting PDFs frequently suffer from misplaced floats, overflowing equations, inconsistent table scaling, widow and orphan lines, and poor page balance, forcing authors into repetitive compile-inspect-edit cycles. Rule-based tools are blind to rendered visuals, operating only on source code and log files. Text-only LLMs perform open-loop text editing, unable to predict or verify the two-dimensional layout consequences of their changes. Reliable typesetting optimization therefore requires a visual closed loop with verification after every edit. We formalize this problem as Visual Typesetting Optimization (VTO), the task of transforming a compilable LaTeX paper into a visually polished, page-budget-compliant PDF through iterative visual verification and source-level revision, and introduce a five-category taxonomy of typesetting defects to guide diagnosis. We present PaperFit, a vision-in-the-loop agent that iteratively renders pages, diagnoses defects, and applies constrained repairs. To benchmark VTO, we construct PaperFit-Bench with 200 papers across 10 venue templates and 13 defect types at different difficulty. Extensive experiments show that PaperFit outperforms all baselines by a large margin, establishing that bridging the gap from compilable source to publication-ready PDF requires vision-in-the-loop optimization and that VTO constitutes a critical missing stage in the document automation pipeline.
- SEIF: Self-Evolving Reinforcement Learning for Instruction Following
Instruction following is a fundamental capability of large language models (LLMs), yet continuously improving this capability remains challenging. Existing methods typically rely either on costly external supervision from humans or strong teacher models, or on self-play training with static-difficulty instructions that cannot evolve as the model's capabilities improve. To address these limitations, we propose SEIF (Self-Evolving Reinforcement Learning for Instruction Following), a self-evolving framework for enhancing the instruction-following ability of LLMs. SEIF forms a closed self-evolution loop that improves the model's instruction-following ability, where instruction difficulty evolution and model capability evolution reinforce each other. SEIF consists of four roles: an Instructor that generates increasingly challenging instructions, a Filter that removes conflicting or invalid instructions to ensure data quality, a Follower that learns to follow evolved instructions, and a Judger that provides reward signals for reinforcement learning. The Instructor and Follower are alternately trained and co-evolve throughout the process. Experiments across multiple model scales and architectures show that SEIF consistently improves instruction-following performance, suggesting strong generality. Further analyses reveal the sources of improvement and identify an effective training strategy for self-evolution on open-ended tasks: sufficient early-stage training to build a solid foundation, followed by moderate late-stage training to mitigate overfitting and achieve better final performance. The code and data are publicly available at https://github.com/Rainier-rq1/SEIF.
- WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors
Commercial video generation systems such as Seedance2.0 and Veo3.1 have rapidly improved, strengthening the view that video generators may be evolving into "world simulators." Yet the community still lacks a benchmark that directly tests whether a model can reason about how an observed world should evolve over time. We introduce WorldReasonBench, which reframes video generation evaluation as world-state prediction: given an initial state and an action, can a model generate a future video whose state evolution remains physically, socially, logically, and informationally consistent? WorldReasonBench contains 436 curated test cases with structured ground-truth QA annotations spanning four reasoning dimensions and 22 subcategories. We evaluate generated videos with a human-aligned two-part methodology: Process-aware Reasoning Verification uses structured QA and reasoning-phase diagnostics to detect temporal and causal failures, while Multi-dimensional Quality Assessment scores reasoning quality, temporal consistency, and visual aesthetics for ranking and reward modeling. We further introduce WorldRewardBench, a preference benchmark with approximately 6K expert-annotated pairs over 1.4K videos, supporting pair-wise and point-wise reward-model evaluation. Across modern video generators, our results expose a persistent gap between visual plausibility and world reasoning: videos can look convincing while failing dynamics, causality, or information preservation. We will release our benchmarks and evaluation toolkit to support community research on genuinely world-aware video generation at https://github.com/UniX-AI-Lab/WorldReasonBench/.
- Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models
Recurrent LLM architectures have emerged as a promising approach for improving reasoning, as they enable multi-step computation in the embedding space without generating intermediate tokens. Models such as Ouro perform reasoning by iteratively updating internal representations while retaining a standard Key-Value (KV) cache across iterations, causing memory consumption to grow linearly with reasoning depth. Consequently, increasing the number of reasoning iterations can lead to prohibitive memory usage, limiting the practical scalability of such architectures. In this work, we propose Memory-Efficient Looped Transformer (MELT), a novel architecture that decouples reasoning depth from memory consumption. Instead of using a standard KV cache per layer and loop, MELT maintains a single KV cache per layer that is shared across reasoning loops. This cache is updated over time via a learnable gating mechanism. To enable stable and efficient training under this architecture, we propose to train MELT using chunk-wise training in a two phase procedure: interpolated transition, followed by attention-aligned distillation, both from the LoopLM starting model to MELT. Empirically, we show that MELT models fine-tuned from pretrained Ouro parameters outperform standard LLMs of comparable size, while maintaining a memory footprint comparable to those models and dramatically smaller than Ouro's. Overall, MELT achieves constant-memory iterative reasoning without sacrificing LoopLM performance, using only a lightweight post-training procedure.
- Key-Value Means
We present Key-Value Means ("KVM"), a novel block-recurrence for attention that can accommodate either fixed-size or growing state. Equipping a strong transformer baseline with fixed-size KVM attention layers yields a strong O(N) chunked RNN, while adding only an insignificant number of new parameters. We train a transformer with a growable KVM cache and show it performs competitively on long-context tests with only subquadratic prefill time and sublinear state growth. KVM is implementable with standard operations and without custom kernels, and supports chunk-wise parallelizable training and prefill. It provides many of the benefits of both traditional transformers (expandable context memory, chunk-wise parallelizable training and prefill) and linear RNNs in a single unified package. It can be used on every layer, saving KV-cache memory, and allowing a continuous range of choices of prefill time complexity between O(N) and O(N^2). It can also be implemented in a hybrid solution in tandem with LRNN layers in place of traditional attention, to supplement the LRNN with improved sublinear memory growth context length usage and long context decoding. We release our code at https://github.com/recursal/KVM-paper and trained models at https://huggingface.co/collections/recursal/key-value-means under the Apache 2.0 license.
- Pixal3D: Pixel-Aligned 3D Generation from Images
Recent advances in 3D generative models have rapidly improved image-to-3D synthesis quality, enabling higher-resolution geometry and more realistic appearance. Yet fidelity, which measures pixel-level faithfulness of the generated 3D asset to the input image, still remains a central bottleneck. We argue this stems from an implicit 2D-3D correspondence issue: most 3D-native generators synthesize shape in canonical space and inject image cues via attention, leaving pixel-to-3D associations ambiguous. To tackle this issue, we draw inspiration from 3D reconstruction and propose Pixal3D, a pixel-aligned 3D generation paradigm for high-fidelity 3D asset creation from images. Instead of generating in a canonical pose, Pixal3D directly generates 3D in a pixel-aligned way, consistent with the input view. To enable this, we introduce a pixel back-projection conditioning scheme that explicitly lifts multi-scale image features into a 3D feature volume, establishing direct pixel-to-3D correspondence without ambiguity. We show that Pixal3D is not only scalable and capable of producing high-quality 3D assets, but also substantially improves fidelity, approaching the fidelity level of reconstruction. Furthermore, Pixal3D naturally extends to multi-view generation by aggregating back-projected feature volumes across views. Finally, we show pixel-aligned generation benefits scene synthesis, and present a modular pipeline that produces high-fidelity, object-separated 3D scenes from images. Pixal3D for the first time demonstrates 3D-native pixel-aligned generation at scale, and provides a new inspiring way towards high-fidelity 3D generation of object or scene from single or multi-view images. Project page: https://ldyang694.github.io/projects/pixal3d/
- Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning
Large language model agents increasingly rely on external skills to solve complex tasks, where skills act as modular units that extend their capabilities beyond what parametric memory alone supports. Existing methods assume external skills either accumulate as persistent guidance or are internalized into the policy, eventually leading to zero-skill inference. We argue this assumption is overly restrictive, since with limited parametric capacity and uneven marginal contribution across skills, the optimal active skill set is non-monotonic, task- and stage-dependent. In this work, we propose SLIM, a framework of dynamic Skill LIfecycle Management for agentic reinforcement learning (RL), which treats the active external skill set as a dynamic optimization variable jointly updated with policy learning. Specifically, SLIM estimates each active skill's marginal external contribution through leave-one-skill-out validation, then applies three lifecycle operations: retaining high-value skills, retiring skills whose contribution becomes negligible after sufficient exposure, and expanding the skill bank when persistent failures reveal missing capability coverage. Experiments show that SLIM outperforms the best baselines by an average of 7.1 percentage points across ALFWorld and SearchQA. Results further indicate that policy learning and external skill retention are not mutually exclusive: some skills are absorbed into the policy, while others continue to provide external value, supporting SLIM as a more general paradigm for skill-based agentic RL.
- LLaVA-UHD v4: What Makes Efficient Visual Encoding in MLLMs?
Visual encoding constitutes a major computational bottleneck in Multimodal Large Language Models (MLLMs), especially for high-resolution image inputs. The prevailing practice typically adopts global encoding followed by post-ViT compression. Global encoding produces massive token sequences, while post-ViT compression incurs the full quadratic attention cost of the ViT before any token reduction takes place. In this work, we revisit this convention along two dimensions: the encoding strategy and visual token compression. First, controlled experiments show that slice-based encoding outperforms global encoding across benchmarks, suggesting that preserving local details through sliced views can be more beneficial than applying global attention for fine-grained perception. Second, we introduce intra-ViT early compression, which reduces tokens in shallow ViT layers and substantially lowers visual-encoding FLOPs while preserving downstream performance. By integrating intra-ViT compression into the slice-based encoding framework, we present LLaVA-UHD v4, an efficient and compute-controllable visual encoding scheme tailored for high-resolution inputs. Across a diverse set of benchmarks covering document understanding, OCR, and general VQA, LLaVA-UHD v4 reduces visual-encoding FLOPs by 55.8% while matching or even surpassing baseline performance. These results suggest that visual-encoding efficiency can be substantially improved without sacrificing downstream performance, providing a practical design direction for efficient high-resolution MLLMs. All model weights and code will be publicly released to support further research.
- Prompt-Activation Duality: Improving Activation Steering via Attention-Level Interventions
Activation steering controls language model behavior by adding directions to internal representations at inference time, but standard residual-stream steering can fail in stateful dialogue. We identify KV-cache contamination as a key failure mode: steered token states are stored and repeatedly reused, turning a local perturbation into cumulative coherence degradation. To address this challenge, we propose Gated Cropped Attention-Delta steering (GCAD), which extracts steering signals from system-prompt contributions to self-attention and applies them with token-level gating. Across persona-steering experiments, GCAD preserves trait control while substantially improving long-horizon coherence. On the main multi-turn benchmark, GCAD improves average coherence drift from -18.6 to -1.9 and raises turn-10 trait expression from 78.0 to 93.1. These results suggest that activation steering becomes more reliable when interventions follow the prompt-mediated pathways that models already use for behavioral control.
- SlimSpec: Low-Rank Draft LM-Head for Accelerated Speculative Decoding
Speculative decoding speeds up autoregressive generation in Large Language Models (LLMs) through a two-step procedure, where a lightweight draft model proposes tokens which the target model then verifies in a single forward pass. Although the drafter network is small in modern architectures, its LM-head still performs projection to a large vocabulary, becoming one of the major computational bottlenecks. In prior work this issue has been predominantly addressed via static or dynamic vocabulary truncation. While they mitigate the bottleneck, these methods bring in extra complexity, such as special vocabulary curation, sophisticated inference-time logic or modifications of the training setup. In this paper, we propose SlimSpec, a low-rank parameterization of the drafter's LM-head that compresses the inner representation rather than the output, preserving full vocabulary support. We evaluate our method with the EAGLE-3 drafter across three target models and diverse benchmarks in both latency- and throughput-bound inference regimes. SlimSpec achieves 4-5× acceleration over the standard LM-head architecture while maintaining a competitive acceptance length, surpassing existing methods by up to 8-9% in end-to-end speedup. Our method requires minimal adjustments of training and inference pipelines. Combined with the aforementioned speedup improvements, it makes SlimSpec a strong alternative across a wide variety of draft LM-head architectures.
- SlimQwen: Exploring the Pruning and Distillation in Large MoE Model Pre-training
Structured pruning and knowledge distillation (KD) are typical techniques for compressing large language models, but it remains unclear how they should be applied at pretraining scale, especially to recent mixture-of-experts (MoE) models. In this work, we systematically study MoE compression in large-scale pretraining, focusing on three key questions: whether pruning provides a better initialization than training from scratch, how expert compression choices affect the final model after continued training, and which training strategy is most effective. We have the following findings: First, across depth, width, and expert compression, pruning a pretrained MoE consistently outperforms training the target architecture from scratch under the same training budget. Second, different one-shot expert compression methods converge to similar final performance after large-scale continual pretraining. Motivated by this, we introduce a simple partial-preservation expert merging strategy that improves downstream performance across most benchmarks. Third, combining KD with the language modeling loss outperforms KD alone, particularly on knowledge-intensive tasks. We further propose multi-token prediction (MTP) distillation, which yields consistent gains. Finally, given the same training tokens, progressive pruning schedules outperform one-shot compression, suggesting that gradual architecture transitions lead to better optimization trajectories. Putting it all together, we compress Qwen3-Next-80A3B to a 23A2B model that retains competitive performance. These results offer practical guidance for efficient MoE compression at scale.
- Mela: Test-Time Memory Consolidation based on Transformation Hypothesis
Memory consolidation, the process by which transient experiences are transformed into stable, structured representations, is a foundational organizing principle in the human brain, yet it remains largely unexplored as a design principle for modern sequence models. In this work, we leverage established neuroscientific theories of memory consolidation and cross-frequency coupling to propose the Hierarchical Memory Module (HMM), a neural memory architecture composed of two functionally distinct sub-modules that operate at different update frequencies. Inspired by the transformation hypothesis, the low-frequency sub-module produces high-level representations that capture abstract, gist-level knowledge, while the high-frequency sub-module produces fine-grained representations that preserve richer episodic detail. The final memory output is dynamically reconstructed as a context-dependent combination of both representations, analogous to the reconstructive nature of human memory retrieval. We integrate HMM into a Transformer-based language decoder to form Mela, a family of memory-augmented language models that perform online memory consolidation at test time. To further exploit the multi-granularity memory representations produced by HMM, we introduce MemStack, a method that distributes different levels of memory features across the early layers of the decoder without introducing additional tokens. Experiments on language modeling demonstrate that Mela outperforms Transformer baselines across all the model sizes. Moreover, with the pretrained context length fixed at 4K, Mela maintains performance on significantly longer contexts, whereas Transformer baselines degrade rapidly beyond their training length. Extensive ablation studies validate the contribution of each component and provide guidance for practical configuration.
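As a rough illustration of the idea in the Mela abstract above (two memories updated at different frequencies, mixed by a context-dependent weight), here is a toy sketch. It is not the paper's HMM or MemStack: the class name `TwoRateMemory`, the moving-average updates, the dot-product gate, and every parameter are assumptions made up for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TwoRateMemory:
    """Toy two-timescale memory (hypothetical; not the Mela/HMM design)."""

    def __init__(self, dim, slow_every=4, fast_rate=0.5, slow_rate=0.5):
        self.fast = [0.0] * dim  # refreshed on every input ("detail")
        self.slow = [0.0] * dim  # refreshed every `slow_every` inputs ("gist")
        self.slow_every = slow_every
        self.fast_rate = fast_rate
        self.slow_rate = slow_rate
        self.steps = 0

    def write(self, x):
        # High-frequency path: exponential moving average of raw inputs.
        self.fast = [(1 - self.fast_rate) * f + self.fast_rate * xi
                     for f, xi in zip(self.fast, x)]
        self.steps += 1
        # Low-frequency path: fold the fast state into the slow state
        # only once every `slow_every` writes.
        if self.steps % self.slow_every == 0:
            self.slow = [(1 - self.slow_rate) * s + self.slow_rate * f
                         for s, f in zip(self.slow, self.fast)]

    def read(self, query):
        # The query sets the mix: a higher dot product with the slow state
        # pushes the gate toward the slow memory, and vice versa.
        score_slow = sum(q * s for q, s in zip(query, self.slow))
        score_fast = sum(q * f for q, f in zip(query, self.fast))
        gate = sigmoid(score_slow - score_fast)
        return [gate * s + (1 - gate) * f
                for s, f in zip(self.slow, self.fast)]


mem = TwoRateMemory(dim=2, slow_every=2)
for x in ([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]):
    mem.write(x)
out = mem.read([1.0, 0.0])
```

After the three writes, the slow state still reflects only the first two inputs, while the fast state has already moved toward the third; the read blends the two according to the query.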
Techmeme(15)
- Sources: Anthropic is in early talks to raise at least $30B at a $900B+ valuation; the round is expected to close as soon as the end of this month (Bloomberg)
Anthropic PBC is in early talks with investors to raise at least $30 billion in fresh financing, according to people familiar with the matter …
- Qualcomm closed down 11.46% on Tuesday as chip stocks pull back from record AI-driven rally; Intel closed down 6.82%, Sandisk dropped 6%, and Micron 3.61% (Samantha Subin/CNBC)
Chip stocks dropped on Tuesday, pulling back from a massive rally that broadened the artificial intelligence trade beyond Nvidia and propelled the sector to new highs.
- Samsung and its South Korean labor union fail to reach a pay deal; the union has said workers will strike for 18 days from May 21 if its demands are not met (Reuters)
Samsung Electronics (005930.KS) and its South Korean labor union failed to reach a pay deal on Wednesday, its union leader said …
- Meta schedules its annual Connect event for September 23-24 and says the event will focus on "the latest in VR, wearables, metaverse, and AI" (Ben Lang/Road to VR)
Meta's annual Connect event is set to return on September 23-24. The company teased what appears to be a new pair of smart glasses …
- Meta offers to give rival AI chatbots free access to WhatsApp for a month while it discusses commitments with EU antitrust regulators to address their concerns (Foo Yun Chee/Reuters)
Meta Platforms (META.O) has offered to give rival AI chatbots free access to its social messaging service WhatsApp for a month …
- CME Group and Silicon Data announce a futures market for computing capacity, with contracts based on daily GPU benchmarks for on-demand rental rates (Tobias Burns/CNBC)
A new futures market for semiconductors will let traders hedge their artificial intelligence investments with bets on the increasingly expensive price of computing power.
- Sources: Apple plans to make the Camera app fully customizable in iOS 27, along with noticeable design changes across Siri, Safari, Weather, and more (Mark Gurman/Bloomberg)
Apple Inc. is planning to upgrade its Camera app, making the software fully customizable as part of a broader set of user interface changes coming in its next iPhone software update.
- Musk v. Altman: Altman faced an intense cross-examination from Musk's attorney, who asked "are you completely trustworthy?"; Altman replied "I believe so" (Business Insider)
Sam Altman faced intense, at times awkward, cross-examination from Elon Musk's attorney.
- The US FCC approves EchoStar's sale of approximately 65MHz of spectrum to SpaceX and 50MHz to AT&T (Christian Martinez/Reuters)
The U.S. Federal Communications Commission's Wireless Telecommunications Bureau and Space Bureau approved EchoStar's sale of approximately 65 megahertz of spectrum to SpaceX and 50 megahertz to AT&T …
- Google says it is hiring a team of "forward deployed engineers", a source says in the hundreds, to help customers use its business-focused AI products (Erin Woo/The Information)
Google plans to hire hundreds of engineers to help customers start using its business-focused AI products, according to a person familiar with the situation.
- Musk v. Altman: Altman testified that in 2017 Musk demanded complete control of a proposed OpenAI for-profit arm, musing that he would pass it to his children (Bloomberg)
OpenAI's Sam Altman testified that he was “extremely uncomfortable” with Elon Musk's insistence that he have complete control …
- Anthropic names eight unauthorized secondary market sellers of its shares, including Hiive and Forge Global, warning that any share transactions there are void (Yazhou Sun/Bloomberg)
Yazhou Sun / Bloomberg : Anthropic names eight unauthorized secondary market sellers of its shares, including Hiive and Forge Global, warning that any share transactions there are void — Anthropic PBC identified a number of secondary marketplaces as unauthorized sellers of the company's shares, telling investors that buying the stock won't work.
- Google launches Intrusion Logging, an Android feature developed in partnership with Amnesty International and others, on Android 16 Pixel devices for now (Tim Starks/CyberScoop)
Tim Starks / CyberScoop : Google launches Intrusion Logging, an Android feature developed in partnership with Amnesty International and others, on Android 16 Pixel devices for now — Intrusion Logging marks the first feature from a major device vendor to aid with forensic detection of sophisticated threats, Amnesty International said.
- Google unveils a "full bleed" Android Auto design that fills unconventionally shaped screens like in the BMW Neue Klasse, plans to add YouTube video streaming (Andrew J. Hawkins/The Verge)
Andrew J. Hawkins / The Verge : Google unveils a “full bleed” Android Auto design that fills unconventionally shaped screens like in the BMW Neue Klasse, plans to add YouTube video streaming — The phone projection system will now completely fill unconventionally shaped screens, in addition to a variety of other improvements.
- Google unveils Android security features, including protection from spoofed banking calls, default theft protection, and biometric protection for Mark as lost (Adamya Sharma/Android Authority)
Adamya Sharma / Android Authority : Google unveils Android security features, including protection from spoofed banking calls, default theft protection, and biometric protection for Mark as lost — Here's a look at the sweeping set of Android security and privacy upgrades Google has in store for you this year.
Solidot(15)
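- Toxicity on social media
In December 2025, Stanford University researchers analyzed 2.2 billion social media posts, looking for patterns to determine what share of users post toxic content. Toxic content here means hate-filled extremist content. So how high is the share of users who post it? Probably much lower than you think, but recommendation algorithms amplify such content and lead many people to believe it is mainstream. On Twitter/X, toxic tweets receive about 86% more retweets and about 27% more exposure than non-toxic tweets; 0.3% of users shared 80% of controversial news; 6% of users posted about 73% of political tweets. On TikTok, 25% of users posted 98% of public videos. The exact figures vary, but the essence is the same: a small number of active users drown out the vast majority. The pattern the researchers found is a silent majority: fearing social isolation for voicing dissent, most users either stay silent or leave the platform, ceding the space to users who express extreme views, while the minority who post actively fall into a cognitive bias and believe they belong to the majority.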
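- Saturn's icy rings may have originated from one of its moons
How Saturn's rings formed has long been a subject of debate. The latest numerical simulations indicate that the magnificent ring system was not born together with Saturn but formed only about 100 million years ago. The hypothesis, proposed by a joint US-China research team, attributes the rings' origin to the structural destruction, under strong gravitational forces, of an ancient moon named Chrysalis. The moon's physical dimensions were similar to those of Iapetus, today Saturn's third-largest moon, with a diameter of about 1,469 km, and it had a layered interior consisting of a rocky core and an outer ice shell. According to the study, Chrysalis originally traveled on a highly elongated elliptical orbit whose closest approach fell in the region of 1 to 1.5 times Saturn's radius, which is the critical range of the Roche limit for icy bodies. In this region, Saturn's powerful tidal forces overcame the moon's own structural strength and tore it apart completely. Most of the debris from the breakup was captured by Saturn's gravity and evolved over time into the broad ring system, while the rest escaped into space. The study suggests that the early rings may have been far larger than what is observed today, but that the gravitational influence of large moons such as Titan later removed or redistributed much of the material.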
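- EU prepares to act against addictive design on TikTok and Instagram
European Commission President Ursula von der Leyen said on Tuesday that the EU will take action later this year against addictive design features on platforms such as TikTok and Instagram. Such features include infinite scrolling, autoplay, and push notifications. The Commission will publish a legislative proposal as early as this summer and is currently awaiting the findings of the Special Panel of experts on Child Safety Online.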
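- Study finds shorter working hours are associated with lower obesity rates
A study presented at the European Congress on Obesity compared working patterns and obesity rates across 33 OECD countries from 1990 to 2022. It found that countries with longer annual working hours, such as the United States, Mexico, and Colombia, also had higher obesity rates, even though average energy and fat intake in Northern European countries is higher than in Latin American countries. A 1% reduction in annual working hours was associated with a 0.16% drop in the obesity rate. The researchers suggest that work stress and a lack of time for exercise may explain why people who work longer hours gain weight more easily. The study's lead author, Dr. Pradeepa Korale-Gedara of the University of Queensland in Australia, said that increased stress raises levels of the hormone cortisol, leading people to store more fat in jobs where they cannot burn off energy. The researchers stressed that the finding is correlational and does not establish causation. It has nonetheless prompted experts to renew calls for a four-day work week, which they say helps people make healthier choices about diet, exercise, and sleep and benefits the health of society as a whole.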
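- Digg attempts another relaunch, pivoting to AI news aggregation
In early January this year, Digg launched a Reddit clone offering similar interest-based communities, but announced its shutdown two months later, citing a flood of bot accounts. Digg is now preparing another relaunch, this time returning to what it once was: a news aggregator. Digg showed beta testers a preview of the new site, whose goal is to track the most influential voices in a given field and surface the news that truly deserves attention. AI is the field Digg is currently testing; if that succeeds, the site will expand to other topics. Digg pulls content from X in real time to determine what is being discussed, and also runs sentiment analysis, clustering, and signal detection to judge which content matters most.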
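- Forza Horizon 6 developer cracks down on players of the leaked build
The game files for Forza Horizon 6 leaked 10 days before launch, and piracy sites offered a playable version before legitimate storefronts did. Developer Playground Games confirmed the leak and warned players not to play the pirated version, threatening severe penalties including “franchise-wide bans and hardware bans.” A YouTube streamer who uploaded a 45-minute gameplay video from the leaked build subsequently received a lifetime ban that runs until December 31, 9999. Players detected playing the leaked build will also be penalized even if they have not uploaded any video. Forza Horizon 6 goes on sale on May 19.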
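- Indian prime minister calls for working from home in response to the Middle East energy crisis
In response to the Middle East energy crisis, Indian Prime Minister Modi called on citizens to return to working from home and to reduce overseas travel over the coming year. India relies on the Middle East for about half of its crude oil and liquefied natural gas (LNG) imports. The government has been holding down prices for petrol and diesel, which directly affect everyday life, but prices of fuels sold to businesses, such as liquefied petroleum gas (LPG), have risen more noticeably. Modi noted that working from home had been adopted during the COVID-19 pandemic and said that “it is necessary to once again prioritize working from home and online video conferencing.” He also urged farmers to cut their use of chemical fertilizers, which depend heavily on imports, and to shift to natural farming.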
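- Debian will require reproducible builds
The Debian release team announced that packages in the Debian project will be required to build reproducibly. The release team will block the migration of new packages that cannot be reproduced, as well as existing packages that regress in reproducibility. As for what exactly "reproducible" means, the developers are referring to builds carried out in an instance of the Debian build environment.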
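- Linux kernel to drop support for the AMD K5 CPU
After Linux 7.1 began phasing out support for the 37-year-old Intel i486 CPU, Linux 7.2 will drop support for the 30-year-old AMD K5 CPU. The K5, AMD's first processor designed entirely in-house, launched in March 1996 to compete with Intel's Pentium, though it still trailed in real-world performance. The K5 does not support the Time Stamp Counter (TSC) instruction, which is the main reason the kernel decided to remove support for it, since TSC support is now considered a boot requirement for modern Linux. The kernel will gradually remove support for the various CPUs that lack the TSC instruction, while Pentium CPUs that support TSC will continue to be supported.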
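- GitLab lays off staff, citing AI
Code hosting platform GitLab has become the latest tech company to announce layoffs with AI as the justification. GitLab CEO Bill Staples said the company has notified employees that it is beginning a restructuring that includes: closing or consolidating operations in countries where it has small teams, cutting the number of countries by up to 30%; flattening the organization by removing up to three layers of management in some functions; reorganizing the R&D department; and using AI agents to reshape internal processes. He said that the software of the future will be built by machines and guided by humans, and that AI is the foundation of future software development. Agents will handle planning, coding, review, deployment, and fixes, but humans will retain the most important judgment calls.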
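- US analyst says sovereign cloud is hard to achieve outside the US and China
Gartner vice president Douglas Toombs believes a fully autonomous sovereign cloud is unlikely to be achieved outside the United States and China. He said only the US and China have all the technology a sovereign cloud requires. Even on-premises cloud offerings such as AWS Outposts, Azure Local, or Oracle Dedicated Cloud Regions need to communicate with the parent company. He expects Europe's sovereign cloud efforts to fail, citing Boston Consulting Group's “Rule of Three and Four”: a stable competitive market never has more than three significant competitors, and the largest of them has no more than four times the market share of the smallest. He predicts the cloud market will stabilize around three companies: AWS, Google, and Microsoft.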
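- New Honda patent describes a simulated clutch for electric motorcycles
A recently disclosed patent shows that Honda is developing a simulated clutch for electric motorcycles, recreating the riding feel of a conventional combustion motorcycle on an electric one. The simulated clutch system offers a torque-boosted launch function and even haptic feedback. The system uses electronics to change the motor's response according to the position of the clutch lever: with the lever pulled halfway, the system reduces motor output proportionally; with the lever pulled in fully, power is cut completely. According to the patent, the rider holds the clutch in, twists the electronic throttle to bring the motor to high RPM, and then quickly releases the clutch, producing a “burst launch” similar to that of a combustion motorcycle. In competitive settings, this technique can help riders accelerate faster on loose terrain or off the line. The patent also describes multiple vibration motors mounted on the handlebar and near the clutch lever to provide haptic feedback, simulating engine vibration and even the feel of the “bite point” as the clutch engages.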
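- Mythos found one curl vulnerability
Mythos, the new AI model Anthropic announced last month, has drawn widespread media attention; Anthropic touts it as able to find security vulnerabilities in source code with extreme precision. Its detection ability is said to be so powerful that Anthropic is not releasing the model to the public for now, instead offering it first to a handful of companies so they can fix the vulnerabilities it finds ahead of everyone else. curl maintainer Daniel Stenberg considers this an extremely successful marketing stunt. Because curl is a widely used open source project, he was given access to Mythos. curl currently contains 176,000 lines of C code, 660,000 words in total. Mythos ultimately returned a security report claiming five confirmed security vulnerabilities. After careful review, however, curl's security team found that three were false positives, one was an ordinary bug, and one was a low-severity vulnerability that will be fixed in the release due next month. The report also documented about 20 bugs in detail, and those were essentially all correct. Stenberg said he has seen no evidence that Mythos is better at finding security vulnerabilities than earlier tools; it may be slightly better, but not enough to make a significant difference to code analysis.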
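- Forza Horizon 6 game files leak 10 days early
Microsoft-owned studio Playground Games uploaded unencrypted game files for Forza Horizon 6 to Steam 10 days ahead of release. Several piracy sites have already put up downloads of the game. Forza Horizon 6 is a racing game set in Japan, due for release on May 19, with an install size of about 155 GB. This is not the first time a AAA title has leaked this way: in March this year, the game files for the PC version of Kojima Productions' Death Stranding 2 were likewise uploaded to Steam unencrypted a few days before launch.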
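- You inherit your father's RNA
On a bright afternoon, Xin Yin, a biochemist at Nanjing University, played personal trainer to a group of mice, putting them on a small treadmill to run. They were all strong athletes, running longer than the control group and accumulating less lactate. Yet these mice did not differ genetically from the controls; their superior performance may instead be linked to their fathers' exercise habits. The finding suggests that running benefits not only the runner but possibly also children not yet born. Xin Yin's team found that RNA fragments known as microRNA were more concentrated in the sperm of exercising mice than in that of sedentary mice. When these molecules were injected into unrelated embryos, the resulting offspring performed as well as the offspring of fathers that exercised. Two decades of mouse research have found that, in addition to DNA, sperm passes RNA fragments such as microRNA on to offspring. The concentrations of these RNA fragments fluctuate with factors such as exercise or inactivity, high-fat or high-sugar diets, daily stress, childhood trauma, heavy drinking, and exposure to harmful substances such as pesticides. Studies have found that the offspring of parents who are overweight or under mental health strain are more likely to develop the same conditions.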
OrangeBot Weekly
5 Claude Code skills worth using each week — with my verdict on what’s actually good. No hype.