OrangeBot.AI Digest — 2025-06-29

68 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Tesla sales drop for fifth month in a row in Europe (abcnews.go.com)
  2. Error handling in Rust (felix-knorr.net)
  3. Many ransomware strains will abort if they detect a Russian keyboard installed (2021) (krebsonsecurity.com)
  4. Tools I love: mise(-en-place) (blog.vbang.dk)
  5. Loss of key US satellite data could send hurricane forecasting back 'decades' (www.theguardian.com)
  6. 4-10x faster in-process pub/sub for Go (github.com)
  7. Personal care products disrupt the human oxidation field (www.science.org)
  8. The Medley Interlisp Project: Reviving a Historical Software System [pdf] (interlisp.org)
  9. I made my VM think it has a CPU fan (wbenny.github.io)
  10. Bloom Filters by Example (llimllib.github.io)
  11. Show HN: Octelium – FOSS Alternative to Teleport, Cloudflare, Tailscale, Ngrok (github.com)
  12. AI slop security reports submitted to curl (gist.github.com)
  13. Sequence and first differences together list all positive numbers exactly once (oeis.org)
  14. Using the Internet without IPv4 connectivity (jamesmcm.github.io)
  15. The Unsustainability of Moore's Law (bzolang.blog)

GitHub Trending (15)

  1. twentyhq / twenty

    Building a modern alternative to Salesforce, powered by the community.

  2. GraphiteEditor / Graphite

    2D vector & raster editor that melds traditional layers & tools with a modern node-based, non-destructive, procedural workflow.

  3. octra-labs / wallet-gen
  4. microsoft / generative-ai-for-beginners

    21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

  5. x1xhlol / system-prompts-and-models-of-ai-tools

    FULL v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent, VSCode Agent, Dia Browser, Trae AI & Cluely (And other Open Sourced) System Prompts, Tools & AI Models.

  6. coleam00 / ottomator-agents

    All the open source AI Agents hosted on the oTTomator Live Agent Studio platform!

  7. stanford-oval / storm

    An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

  8. jnsahaj / tweakcn

    A visual no-code theme editor for shadcn/ui components

  9. mendableai / firecrawl

    🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

  10. ItzCrazyKns / Perplexica

    Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.

  11. adityachandelgit / BookLore

    BookLore is a web app for hosting, managing, and exploring books, with support for PDFs, eBooks, reading progress, metadata, and stats.

  12. swisskyrepo / PayloadsAllTheThings

    A list of useful payloads and bypasses for Web Application Security and Pentest/CTF

  13. mikumifa / biliTickerBuy

    A ticket-purchasing assistant tool for Bilibili's Member Purchase (会员购) ticketing platform

  14. m1k1o / neko

    A self-hosted virtual browser that runs in Docker and uses WebRTC.

  15. LMCache / LMCache

    Supercharge Your LLM with the Fastest KV Cache Layer

Product Hunt (8)

  1. Byterover

    Memory layer for your AI coding agents

  2. MyLens.ai

    Visualize what matters in your content & ideas with AI

  3. Oboe

    High performance experimentation platform

  4. Neura

    Turn your voice notes into powerful AI-enhanced content.

  5. Leavoo

    Time-off made simple, fast & Slack-native

  6. Ultracite

    Fast, automated code formatting for JavaScript apps

  7. ShareGo

    You’ll wonder how you ever used AirDrop without it.

  8. Key Press Viewer for macOS

    Minimal key stroke viewer

Hugging Face (15)

  1. Generative Blocks World: Moving Things Around in Pictures

    We describe Generative Blocks World to interact with the scene of a generated image by manipulating simple geometric abstractions. Our method represents scenes as assemblies of convex 3D primitives, and the same scene can be represented by different numbers of primitives, allowing an editor to move either whole structures or small details. Once the scene geometry has been edited, the image is generated by a flow-based method which is conditioned on depth and a texture hint. Our texture hint takes into account the modified 3D primitives, exceeding texture-consistency provided by existing key-value caching techniques. These texture hints (a) allow accurate object and camera moves and (b) largely preserve the identity of objects depicted. Quantitative and qualitative experiments demonstrate that our approach outperforms prior works in visual fidelity, editability, and compositional generalization.

  2. DuaShepherd: Integrating Stepwise Correctness and Potential Rewards for Mathematical Reasoning

    In this paper, we propose DuaShepherd, a novel reward modeling framework that integrates two complementary reward signals, correctness and potential, to enhance the mathematical reasoning capabilities of Large Language Models (LLMs). While correctness-based signals emphasize identification of stepwise errors, potential-based signals focus on the likelihood of reaching the correct final answer. We developed an automated pipeline for constructing a large-scale reward modeling dataset with both signals. A unified, multi-head architecture was explored to train the two reward models in a multi-task setup, demonstrating benefits from learning both correctness and potential in parallel. By combining these two signals into a compound probability, our model achieves consistent performance improvements across multiple benchmarks. Empirical evaluations on MATH500 and ProcessBench confirm that this combined reward significantly outperforms models trained on either reward type alone, achieving state-of-the-art performance under comparable resource constraints.
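
The compound scoring step described above can be sketched in a few lines. The per-step probabilities below are invented numbers standing in for the two reward models' outputs, not values from the paper:

```python
def compound_reward(p_correct, p_potential):
    """DuaShepherd-style compound signal: joint probability that a step is
    locally correct AND still on a path to the right final answer."""
    return [c * p for c, p in zip(p_correct, p_potential)]

# Hypothetical per-step scores for a 4-step solution.
p_correct   = [0.99, 0.97, 0.60, 0.95]  # stepwise correctness signal
p_potential = [0.90, 0.85, 0.40, 0.30]  # chance of reaching the right answer
scores = compound_reward(p_correct, p_potential)

# The compound score surfaces the weakest step in the chain.
weakest = min(range(len(scores)), key=scores.__getitem__)
print(weakest)  # 2 (the third step has both low correctness and low potential)
```

Multiplying the two probabilities means a step is penalized if it fails on either axis, which is the intuition behind combining the signals rather than using either alone.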

  3. PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling

    Skinning and rigging are fundamental components in animation, articulated object reconstruction, motion transfer, and 4D generation. Existing approaches predominantly rely on Linear Blend Skinning (LBS), due to its simplicity and differentiability. However, LBS introduces artifacts such as volume loss and unnatural deformations, and it fails to model elastic materials like soft tissues, fur, and flexible appendages (e.g., elephant trunks, ears, and fatty tissues). In this work, we propose PhysRig: a differentiable physics-based skinning and rigging framework that overcomes these limitations by embedding the rigid skeleton into a volumetric representation (e.g., a tetrahedral mesh), which is simulated as a deformable soft-body structure driven by the animated skeleton. Our method leverages continuum mechanics and discretizes the object as particles embedded in an Eulerian background grid to ensure differentiability with respect to both material properties and skeletal motion. Additionally, we introduce material prototypes, significantly reducing the learning space while maintaining high expressiveness. To evaluate our framework, we construct a comprehensive synthetic dataset using meshes from Objaverse, The Amazing Animals Zoo, and MixaMo, covering diverse object categories and motion patterns. Our method consistently outperforms traditional LBS-based approaches, generating more realistic and physically plausible results. Furthermore, we demonstrate the applicability of our framework in the pose transfer task highlighting its versatility for articulated object modeling.

  4. DiLoCoX: A Low-Communication Large-Scale Training Framework for Decentralized Cluster

    The distributed training of foundation models, particularly large language models (LLMs), demands a high level of communication. Consequently, it is highly dependent on a centralized cluster with fast and reliable interconnects. Can we conduct training on slow networks and thereby unleash the power of decentralized clusters when dealing with models exceeding 100 billion parameters? In this paper, we propose DiLoCoX, a low-communication large-scale decentralized cluster training framework. It combines Pipeline Parallelism with Dual Optimizer Policy, One-Step-Delay Overlap of Communication and Local Training, and an Adaptive Gradient Compression Scheme. This combination significantly improves the scale of parameters and the speed of model pre-training. We justify the benefits of one-step-delay overlap of communication and local training, as well as the adaptive gradient compression scheme, through a theoretical analysis of convergence. Empirically, we demonstrate that DiLoCoX is capable of pre-training a 107B foundation model over a 1Gbps network. Compared to vanilla AllReduce, DiLoCoX can achieve a 357x speedup in distributed training while maintaining negligible degradation in model convergence. To the best of our knowledge, this is the first decentralized training framework successfully applied to models with over 100 billion parameters.
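
The gradient-compression idea the abstract relies on can be illustrated with the simplest member of that family, top-k sparsification. This is a generic sketch of the technique, not DiLoCoX's actual adaptive scheme:

```python
import numpy as np

def topk_compress(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.
    Returns (indices, values) -- the sparse message actually transmitted."""
    k = max(1, int(grad.size * ratio))
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def topk_decompress(idx, vals, size):
    """Rebuild a dense gradient with zeros for the entries not sent."""
    out = np.zeros(size)
    out[idx] = vals
    return out

rng = np.random.default_rng(0)
g = rng.standard_normal(10_000)
idx, vals = topk_compress(g, ratio=0.01)
g_hat = topk_decompress(idx, vals, g.size)
print(vals.size)  # 100 entries sent instead of 10,000 -- a 100x smaller message
```

On a slow (e.g. 1 Gbps) link, shrinking each synchronization message this way is what makes decentralized training of large models plausible; the paper's contribution is doing so adaptively and overlapping the communication with local training.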

  5. FairyGen: Storied Cartoon Video from a Single Child-Drawn Character

    We propose FairyGen, an automatic system for generating story-driven cartoon videos from a single child's drawing, while faithfully preserving its unique artistic style. Unlike previous storytelling methods that primarily focus on character consistency and basic motion, FairyGen explicitly disentangles character modeling from stylized background generation and incorporates cinematic shot design to support expressive and coherent storytelling. Given a single character sketch, we first employ an MLLM to generate a structured storyboard with shot-level descriptions that specify environment settings, character actions, and camera perspectives. To ensure visual consistency, we introduce a style propagation adapter that captures the character's visual style and applies it to the background, faithfully retaining the character's full visual identity while synthesizing style-consistent scenes. A shot design module further enhances visual diversity and cinematic quality through frame cropping and multi-view synthesis based on the storyboard. To animate the story, we reconstruct a 3D proxy of the character to derive physically plausible motion sequences, which are then used to fine-tune an MMDiT-based image-to-video diffusion model. We further propose a two-stage motion customization adapter: the first stage learns appearance features from temporally unordered frames, disentangling identity from motion; the second stage models temporal dynamics using a timestep-shift strategy with frozen identity weights. Once trained, FairyGen directly renders diverse and coherent video scenes aligned with the storyboard. Extensive experiments demonstrate that our system produces animations that are stylistically faithful, narratively structured natural motion, highlighting its potential for personalized and engaging story animation. The code will be available at https://github.com/GVCLab/FairyGen

  6. Learning to Skip the Middle Layers of Transformers

    Conditional computation is a popular strategy to make Transformers more efficient. Existing methods often target individual modules (e.g., mixture-of-experts layers) or skip layers independently of one another. However, interpretability research has demonstrated that the middle layers of Transformers exhibit greater redundancy, and that early layers aggregate information into token positions. Guided by these insights, we propose a novel architecture that dynamically skips a variable number of layers from the middle outward. In particular, a learned gating mechanism determines whether to bypass a symmetric span of central blocks based on the input, and a gated attention mechanism prevents subsequent tokens from attending to skipped token positions. Residual norms are controlled with a 'sandwich' or 'perilayernorm' scheme and gate sparsity with an adaptive regularization loss. We had aimed to reduce compute requirements for 'simpler' tokens and potentially foster an emergent multi-level representational hierarchy but, at the scales investigated, our approach does not achieve improvements in the trade-off between validation cross-entropy and estimated FLOPs compared to dense baselines with fewer layers. We release our code at https://github.com/tim-lawson/skip-middle.
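
The skip geometry described above, a symmetric central span sized by a gate, is easy to sketch. The gate-to-span mapping and the choice to always keep the first and last layers are illustrative assumptions, not the paper's parameterization:

```python
def layers_to_run(n_layers, gate):
    """Map a per-token gate value in [0, 1] to the executed layer indices:
    gate=0 skips nothing; gate=1 skips the largest symmetric central span.
    First and last layers are always kept (an assumption of this sketch)."""
    max_skip = n_layers - 2
    n_skip = round(gate * max_skip)
    start = (n_layers - n_skip) // 2      # span grows from the middle outward
    skipped = set(range(start, start + n_skip))
    return [i for i in range(n_layers) if i not in skipped]

print(layers_to_run(12, 0.0))  # full depth: [0, 1, ..., 11]
print(layers_to_run(12, 0.4))  # [0, 1, 2, 3, 8, 9, 10, 11]
print(layers_to_run(12, 1.0))  # only the outermost layers: [0, 11]
```

Because the span is contiguous and centered, a single scalar gate controls the whole skipping decision, which is what makes the mechanism cheap to learn and to regularize for sparsity.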

  7. MADrive: Memory-Augmented Driving Scene Modeling

    Recent advances in scene reconstruction have pushed toward highly realistic modeling of autonomous driving (AD) environments using 3D Gaussian splatting. However, the resulting reconstructions remain closely tied to the original observations and struggle to support photorealistic synthesis of significantly altered or novel driving scenarios. This work introduces MADrive, a memory-augmented reconstruction framework designed to extend the capabilities of existing scene reconstruction methods by replacing observed vehicles with visually similar 3D assets retrieved from a large-scale external memory bank. Specifically, we release MAD-Cars, a curated dataset of ~70K 360° car videos captured in the wild and present a retrieval module that finds the most similar car instances in the memory bank, reconstructs the corresponding 3D assets from video, and integrates them into the target scene through orientation alignment and relighting. The resulting replacements provide complete multi-view representations of vehicles in the scene, enabling photorealistic synthesis of substantially altered configurations, as demonstrated in our experiments. Project page: https://yandex-research.github.io/madrive/

  8. MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners

    We propose MuseControlLite, a lightweight mechanism designed to fine-tune text-to-music generation models for precise conditioning using various time-varying musical attributes and reference audio signals. The key finding is that positional embeddings, which have been seldom used by text-to-music generation models in the conditioner for text conditions, are critical when the condition of interest is a function of time. Using melody control as an example, our experiments show that simply adding rotary positional embeddings to the decoupled cross-attention layers increases control accuracy from 56.6% to 61.1%, while requiring 6.75 times fewer trainable parameters than state-of-the-art fine-tuning mechanisms, using the same pre-trained diffusion Transformer model of Stable Audio Open. We evaluate various forms of musical attribute control, audio inpainting, and audio outpainting, demonstrating improved controllability over MusicGen-Large and Stable Audio Open ControlNet at a significantly lower fine-tuning cost, with only 85M trainable parameters. Source code, model checkpoints, and demo examples are available at: https://musecontrollite.github.io/web/.
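
The rotary positional embeddings the abstract credits can be sketched in NumPy using the standard RoPE formulation; this is the generic mechanism, not MuseControlLite's code:

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary positional embeddings to x of shape (seq_len, dim).
    Channel pairs (x1_i, x2_i) are rotated by angle pos * base**(-i/half)."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)      # per-pair rotation frequency
    angles = np.outer(np.arange(seq_len), freqs)   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

# Rotations preserve norms, and query/key dot products end up depending only
# on *relative* position -- the property that matters for time-varying control.
x = np.random.default_rng(0).standard_normal((8, 64))
out = rope(x)
print(np.allclose(np.linalg.norm(out, axis=-1), np.linalg.norm(x, axis=-1)))  # True
```

Because the rotation angle grows with position, a condition that is a function of time (a melody contour, say) gets a distinct, smoothly varying encoding at each step, which is the paper's stated reason positional embeddings matter in the conditioner.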

  9. HeurAgenix: Leveraging LLMs for Solving Complex Combinatorial Optimization Challenges

    Heuristic algorithms play a vital role in solving combinatorial optimization (CO) problems, yet traditional designs depend heavily on manual expertise and struggle to generalize across diverse instances. We introduce HeurAgenix, a two-stage hyper-heuristic framework powered by large language models (LLMs) that first evolves heuristics and then selects among them automatically. In the heuristic evolution phase, HeurAgenix leverages an LLM to compare seed heuristic solutions with higher-quality solutions and extract reusable evolution strategies. During problem solving, it dynamically picks the most promising heuristic for each problem state, guided by the LLM's perception ability. For flexibility, this selector can be either a state-of-the-art LLM or a fine-tuned lightweight model with lower inference cost. To mitigate the scarcity of reliable supervision caused by CO complexity, we fine-tune the lightweight heuristic selector with a dual-reward mechanism that jointly exploits signals from selection preferences and state perception, enabling robust selection under noisy annotations. Extensive experiments on canonical benchmarks show that HeurAgenix not only outperforms existing LLM-based hyper-heuristics but also matches or exceeds specialized solvers. Code is available at https://github.com/microsoft/HeurAgenix.

  10. FaSTA*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing

    We develop a cost-efficient neurosymbolic agent to address challenging multi-turn image editing tasks such as "Detect the bench in the image while recoloring it to pink. Also, remove the cat for a clearer view and recolor the wall to yellow." It combines the fast, high-level subtask planning by large language models (LLMs) with the slow, accurate, tool-use, and local A* search per subtask to find a cost-efficient toolpath -- a sequence of calls to AI tools. To save the cost of A* on similar subtasks, we perform inductive reasoning on previously successful toolpaths via LLMs to continuously extract/refine frequently used subroutines and reuse them as new tools for future tasks in an adaptive fast-slow planning, where the higher-level subroutines are explored first, and only when they fail, the low-level A* search is activated. The reusable symbolic subroutines considerably save exploration cost on the same types of subtasks applied to similar images, yielding a human-like fast-slow toolpath agent "FaSTA*": fast subtask planning followed by rule-based subroutine selection per subtask is attempted by LLMs at first, which is expected to cover most tasks, while slow A* search is only triggered for novel and challenging subtasks. By comparing with recent image editing approaches, we demonstrate FaSTA* is significantly more computationally efficient while remaining competitive with the state-of-the-art baseline in terms of success rate.
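
The fast-slow control flow above, trying mined subroutines first and invoking A* search only when they all fail, reduces to a few lines. The subtask strings, the subroutine, and the stand-in search below are all invented for illustration:

```python
def solve(subtask, subroutines, astar_search):
    """FaSTA*-style fast-slow planning: try cached subroutines first,
    fall back to the (expensive) A* search only when none of them fit."""
    for name, routine in subroutines.items():
        plan = routine(subtask)
        if plan is not None:                 # fast path: a mined subroutine fits
            return name, plan
    return "astar", astar_search(subtask)    # slow path: full search

# Hypothetical mined subroutine that only handles recolor subtasks.
def recolor_routine(subtask):
    if subtask.startswith("recolor"):
        return ["segment", "recolor"]
    return None

def fake_astar(subtask):
    """Stand-in for the real A* toolpath search."""
    return ["detect", "mask", "edit"]

print(solve("recolor wall yellow", {"recolor": recolor_routine}, fake_astar))
# ('recolor', ['segment', 'recolor'])
print(solve("remove cat", {"recolor": recolor_routine}, fake_astar))
# ('astar', ['detect', 'mask', 'edit'])
```

Each successful slow-path result is, in the paper's scheme, a candidate for mining into a new subroutine, so the fast path covers more subtasks over time.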

  11. Whole-Body Conditioned Egocentric Video Prediction

    We train models to Predict Ego-centric Video from human Actions (PEVA), given the past video and an action represented by the relative 3D body pose. By conditioning on kinematic pose trajectories, structured by the joint hierarchy of the body, our model learns to simulate how physical human actions shape the environment from a first-person point of view. We train an auto-regressive conditional diffusion transformer on Nymeria, a large-scale dataset of real-world egocentric video and body pose capture. We further design a hierarchical evaluation protocol with increasingly challenging tasks, enabling a comprehensive analysis of the model's embodied prediction and control abilities. Our work represents an initial attempt to tackle the challenges of modeling complex real-world environments and embodied agent behaviors with video prediction from the perspective of a human.

  12. Arch-Router: Aligning LLM Routing with Human Preferences

    With the rapid proliferation of large language models (LLMs) -- each optimized for different strengths, style, or latency/cost profile -- routing has become an essential technique to operationalize the use of different models. However, existing LLM routing approaches are limited in two key ways: they evaluate performance using benchmarks that often fail to capture human preferences driven by subjective evaluation criteria, and they typically select from a limited pool of models. In this work, we propose a preference-aligned routing framework that guides model selection by matching queries to user-defined domains (e.g., travel) or action types (e.g., image editing) -- offering a practical mechanism to encode preferences in routing decisions. Specifically, we introduce Arch-Router, a compact 1.5B model that learns to map queries to domain-action preferences for model routing decisions. Our approach also supports seamlessly adding new models for routing without requiring retraining or architectural modifications. Experiments on conversational datasets demonstrate that our approach achieves state-of-the-art (SOTA) results in matching queries with human preferences, outperforming top proprietary models. Our approach captures subjective evaluation criteria and makes routing decisions more transparent and flexible. Our model is available at: https://huggingface.co/katanemo/Arch-Router-1.5B.
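
The routing policy described above, matching a query's domain and action to a preferred model, can be sketched with a lookup table. The keyword classifier here is a toy stand-in for the 1.5B Arch-Router model, and the policy entries and model names are invented:

```python
# Hypothetical user-defined preference policy: (domain, action) -> model.
POLICY = {
    ("travel", "planning"): "model-a",
    ("coding", "debugging"): "model-b",
    ("image", "editing"): "model-c",
}
DEFAULT = "model-general"

def classify(query):
    """Stand-in for the router model: keyword rules instead of a 1.5B LLM."""
    q = query.lower()
    if "flight" in q or "hotel" in q:
        return ("travel", "planning")
    if "traceback" in q or "bug" in q:
        return ("coding", "debugging")
    if "crop" in q or "recolor" in q:
        return ("image", "editing")
    return (None, None)

def route(query):
    return POLICY.get(classify(query), DEFAULT)

print(route("Find me a hotel in Lisbon"))        # model-a
print(route("Why does this traceback appear?"))  # model-b
print(route("Summarize this article"))           # model-general
```

Note that adding a new model is just a new POLICY entry, no retraining of the classifier, which mirrors the paper's claim about seamlessly extending the model pool.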

  13. Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test

    Grokking, i.e., test performance that keeps improving long after the training loss has converged, has recently been witnessed in neural network training, leaving the mechanism of generalization and of other emerging capabilities such as reasoning mysterious. While prior studies usually train small models on a few toy or highly specific tasks for thousands of epochs, we conduct the first study of grokking on checkpoints during one-pass pretraining of a 7B large language model (LLM), i.e., OLMoE. We compute the training loss and evaluate generalization on diverse benchmark tasks, including math reasoning, code generation, and commonsense/domain-specific knowledge retrieval tasks. Our study, for the first time, verifies that grokking still happens in the pretraining of large-scale foundation models, though different data may enter grokking stages asynchronously. We further demystify grokking's "emergence of generalization" by investigating LLM internal dynamics. Specifically, we find that training samples' pathways (i.e., expert choices across layers) evolve from random and instance-specific to more structured and shareable between samples during grokking. Also, the complexity of a sample's pathway reduces despite the converged loss. These indicate a memorization-to-generalization conversion, providing a mechanistic explanation of delayed generalization. In the study, we develop two novel metrics to quantify pathway distance and the complexity of a single pathway. We show their ability to predict the generalization improvement on diverse downstream tasks. They are efficient, simple to compute, and solely dependent on training data. Hence, they have practical value for pretraining, enabling us to monitor generalization performance without finetuning and test. Theoretically, we show that more structured pathways reduce model complexity and improve the generalization bound.
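
The two pathway metrics can be concretized in a simple assumed form, per-layer expert-choice mismatch for distance and distinct-expert count for complexity; the paper's exact definitions may differ, and the expert ids below are made up:

```python
def pathway_distance(p1, p2):
    """Fraction of layers where two samples' expert choices differ.
    (Illustrative concretization, not necessarily the paper's metric.)"""
    assert len(p1) == len(p2)
    return sum(a != b for a, b in zip(p1, p2)) / len(p1)

def pathway_complexity(pathway):
    """Simple complexity proxy: number of distinct experts used."""
    return len(set(pathway))

# Hypothetical expert ids chosen at each of 6 MoE layers for two samples.
before = [3, 1, 7, 2, 5, 0]
after  = [3, 1, 4, 2, 5, 0]
print(pathway_distance(before, after))   # 1/6 of layers disagree
print(pathway_complexity(before))        # 6 distinct experts
```

Under the abstract's account, both numbers should fall during grokking: pathways become more shareable between samples (lower distance) and individually simpler (lower complexity), and both are computable from training data alone.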

  14. An Agentic System for Rare Disease Diagnosis with Traceable Reasoning

    Rare diseases collectively affect over 300 million individuals worldwide, yet timely and accurate diagnosis remains a pervasive challenge. This is largely due to their clinical heterogeneity, low individual prevalence, and the limited familiarity most clinicians have with rare conditions. Here, we introduce DeepRare, the first rare disease diagnosis agentic system powered by a large language model (LLM), capable of processing heterogeneous clinical inputs. The system generates ranked diagnostic hypotheses for rare diseases, each accompanied by a transparent chain of reasoning that links intermediate analytic steps to verifiable medical evidence. DeepRare comprises three key components: a central host with a long-term memory module; specialized agent servers responsible for domain-specific analytical tasks integrating over 40 specialized tools and web-scale, up-to-date medical knowledge sources, ensuring access to the most current clinical information. This modular and scalable design enables complex diagnostic reasoning while maintaining traceability and adaptability. We evaluate DeepRare on eight datasets. The system demonstrates exceptional diagnostic performance among 2,919 diseases, achieving 100% accuracy for 1,013 diseases. In HPO-based evaluations, DeepRare significantly outperforms 15 other methods, including traditional bioinformatics diagnostic tools, LLMs, and other agentic systems, achieving an average Recall@1 score of 57.18% and surpassing the second-best method (Reasoning LLM) by a substantial margin of 23.79 percentage points. For multi-modal input scenarios, DeepRare achieves 70.60% Recall@1 compared to Exomiser's 53.20% in 109 cases. Manual verification of reasoning chains by clinical experts achieves 95.40% agreement. Furthermore, the DeepRare system has been implemented as a user-friendly web application at http://raredx.cn/doctor.

  15. WorldVLA: Towards Autoregressive Action World Model

    We present WorldVLA, an autoregressive action world model that unifies action and image understanding and generation. WorldVLA integrates a Vision-Language-Action (VLA) model and a world model in a single framework. The world model predicts future images by leveraging both action and image understanding, with the purpose of learning the underlying physics of the environment to improve action generation. Meanwhile, the action model generates the subsequent actions based on image observations, aiding visual understanding and, in turn, helping the world model's visual generation. We demonstrate that WorldVLA outperforms standalone action and world models, highlighting the mutual enhancement between the world model and the action model. In addition, we find that the performance of the action model deteriorates when generating sequences of actions in an autoregressive manner. This phenomenon can be attributed to the model's limited generalization capability for action prediction, leading to the propagation of errors from earlier actions to subsequent ones. To address this issue, we propose an attention mask strategy that selectively masks prior actions during the generation of the current action, which shows significant performance improvement in the action chunk generation task.
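
The proposed attention-mask strategy, letting each action attend to image tokens but not to earlier actions, can be sketched as a boolean mask over the token sequence. The interleaved image/action token layout here is invented for illustration:

```python
import numpy as np

def build_mask(kinds):
    """kinds[i] is 'img' for an image token, or an int giving the index of
    the action an action-token belongs to. Returns a boolean matrix where
    mask[i, j] = True means token i may attend to token j."""
    n = len(kinds)
    mask = np.tril(np.ones((n, n), dtype=bool))      # causal baseline
    for i, ki in enumerate(kinds):
        if ki == "img":
            continue
        for j, kj in enumerate(kinds[:i]):
            # Block attention to tokens of *earlier* actions, so errors in
            # previous actions cannot propagate into the current one.
            if kj != "img" and kj < ki:
                mask[i, j] = False
    return mask

# Two image tokens, then action 0 (2 tokens), then action 1 (2 tokens).
kinds = ["img", "img", 0, 0, 1, 1]
m = build_mask(kinds)
print(m[4])  # action 1's first token sees the images and itself, not action 0
```

Image tokens keep plain causal attention; only action-to-earlier-action edges are removed, which is the selective part of the strategy.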

Solidot (15)

  1. Canonical's 2024 revenue reaches $292 million

    According to the 2024 financial report Canonical filed with UK Companies House, the developer of the Ubuntu distribution posted revenue of $292 million in 2024, up from $251 million in 2023 and $205 million in 2022, with headcount reaching 1,175. By comparison, in 2014 Canonical's revenue was just $81 million with about 337 employees, and the company ran losses for years. It remains unclear when Canonical will IPO; as early as 2022 there were reports of a planned 2023 IPO.

  2. Study finds the Late Cretaceous ocean was "ruled by squid"

    The prevailing view held that marine life in the Late Cretaceous (100 to 70 million years ago) was dominated by ammonites and fish, but a Japanese research team found the ocean was in fact "ruled by squid". Because squid lack external shells and bones, they rarely fossilize and had never been included in the ecological picture of Cretaceous seas. The team developed a new technique that grinds rock away layer by layer at a precision of one hundredth of a millimeter, photographing each layer and digitally reconstructing in 3D every fossil inside, down to microscopic ones. From Cretaceous rocks across Hokkaido they identified 263 fossilized squid beaks, averaging about 4 mm in size.

  3. Japan's contested bill on separate surnames for married couples

    Last month Japan's parliament failed to pass a bill introducing an "optional separate surname system" that would allow married couples to keep different surnames, even though polls show most of the public supports it. Japan is the only country whose law requires married couples to share a surname, and 95% of women take their husband's name. A study by the NGO Asuniwa argues that letting couples keep separate surnames could help raise the birth rate, since many partners would rather not marry than change their names. Teachers Uchiyama Yukari and Koike Yuki, for example, have divorced and remarried three times to dodge the law, spending most of their time unmarried; they marry to register their children's births and then divorce again.

  4. Chinese graphic designers face the challenge of AI image generators

    Chinese graphic designers are feeling the impact of AI image generators on their daily work. The generators imitate artistic styles with ease, profoundly changing how clients perceive designers' work. An anonymous employee at a large e-commerce platform said that even before AI image generators became popular, graphic designers at tech giants and large firms were being instructed to copy competitors or reproduce work seen on social media. To replicate a distinctive artistic style, a human has to understand and reverse-engineer it; an AI image generator merely introduces random variation into the style, and the result can look very much like a copy, possibly with errors, which a human designer can then edit into a finished product. The anonymous employee said that those who do not embrace AI feel easily replaceable. Sendi Jia, a designer who runs studios in Beijing and London, said AI image generators are forcing designers and clients to rethink a designer's value: does it lie only in producing designs, or in consulting, creativity, strategy, direction, and aesthetics? Beijing graphic designer Erbing said AI cannot produce anything unique: "Every project faces different problems; designers exist to solve specific problems, not to create cookie-cutter visuals." He said the thinking behind a project often takes longer than the actual production, and he regards AI image generators as toys rather than tools. But designers admit the AI frenzy has hurt clients' perception of their work's value: clients now expect designers to deliver for less money in less time, which may drive quality down. Erbing said some clients reason that if AI boosts efficiency, budgets can be halved, but a designer's job is not just making pictures.

  5. Bcachefs may be removed from the kernel

    Citing disagreements with maintainer Kent Overstreet, the creator of Linux has again threatened to remove the Bcachefs filesystem from the kernel. In a recent pull-request comment, Linus Torvalds said he may part ways with Bcachefs during the 6.17 merge window. His stated reason is a deep divergence in development philosophy: Torvalds said he cannot even question Bcachefs bug fixes, as if he may only pull code on Overstreet's terms, and that the sole consensus after their arguments is "we're done".

  6. Germany asks Apple and Google to delist DeepSeek

    Meike Kamp, Germany's Federal Commissioner for Data Protection and Freedom of Information, said on Friday that she has formally asked Apple and Google to remove the app of Chinese AI company DeepSeek from the German App Store and Google Play. The reason: the company has failed to demonstrate that its data processing meets EU standards, and it is suspected of unlawfully transferring German users' personal data to China. According to DeepSeek's privacy policy, the app stores users' AI requests, uploaded files, and other personal information on servers inside China. German regulators asked DeepSeek earlier this year either to satisfy the EU's requirements for cross-border data transfers or to withdraw the app voluntarily. DeepSeek did not respond, so Kamp initiated the formal delisting procedure.

  7. Microsoft replaces the Blue Screen of Death with a black screen

    Microsoft announced on its official blog improvements to unexpected-restart handling and quick recovery in Windows 11 24H2. For most users, downtime from an unexpected restart is cut to about two seconds, and a new UI replaces the widely recognized Blue Screen of Death. The new UI improves readability and keeps technical information on screen when needed. The new crash screen, along with quick machine recovery (QMR), will roll out to Windows 11 24H2 users this summer. It will be enabled by default on the Home edition, while IT administrators can choose whether to enable it on Pro and Enterprise.

  8. Laughter is contagious for bonobos too

    A study found that after hearing laughter, bonobos were more likely to approach an object they would normally not touch. The study monitored four trained bonobos that would interact with or ignore a box depending on whether it contained food, and it suggests that hearing positive sounds may influence their foraging and searching behavior. The four bonobos were familiarized with a black box containing a food reward and an empty white box, and trained to press a button to reject the white box. In addition, three "ambiguous" boxes (light, medium, and dark gray) were presented intermittently, containing a food reward 50% of the time. Tests were run while playing either bonobo laughter or ambient wind sounds for 7 minutes 28 seconds. The bonobos approached the black box in 93% of cases and the white box in only 1%. When gray boxes were offered, they approached the dark gray box more often than the light gray one. Pooling all gray-box trials, the researchers found the bonobos were more likely to inspect a gray box after hearing laughter recordings, approaching them 3.4 times more often than under ambient wind sounds. The researchers suggest the laughter may have triggered emotional contagion in the bonobos, influencing their behavior and making them more likely to approach an ambiguous stimulus.

  9. Digital sovereignty starts at the desktop: Europe's Linux desktop era may be coming

    The impending end of support for Windows 10, along with events such as Microsoft complying with US government sanctions against the chief prosecutor of the International Criminal Court, have sounded alarm bells for European countries. Switching to the Linux desktop would aid security and privacy and help preserve Europe's digital sovereignty. France's Gendarmerie successfully switched more than a decade ago to GendBuntu, a customized Ubuntu-based distribution. Some have proposed developing a dedicated distribution for EU institutions, EU OS, which would be based on Fedora KDE Linux, the Red Hat community distribution.

  10. AMD becomes a platinum sponsor of the Debian developer conference

    The Debian project announced that AMD will be a platinum sponsor of DebConf25, the developer conference being held next month in Brest, France. AMD's aim is to promote its open-source GPU programming stack ROCm to Debian developers: the Debian distribution is an officially supported platform for AMD ROCm, and a growing number of ROCm components ship directly in Debian (though not yet in stable, mainly in testing).

  11. Denmark fights deepfakes by empowering citizens

    Denmark plans to fight AI deepfakes by amending its copyright law to ensure that everyone owns their own identity, facial features, and voice. The Danish Ministry of Culture will first publish the draft amendment for public consultation, then formally submit it this autumn. The proposal already has the support of nine in ten members of parliament. With AI advancing rapidly, producing convincing deepfake images, video, or audio is easier than ever. Once the amendment passes, Danish citizens will be able to demand that online platforms remove content shared without their consent. The amendment will not affect parody and satire.

  12. Few Americans pay when they hit a news paywall

    As print revenues decline, more media outlets have embraced paid subscriptions. But when users browsing the web run into paywalled news, how many will actually pay? A Pew Research Center survey shows that the vast majority will not. 83% of respondents said they had not paid for news in the past year, while 17% had paid a news organization via subscription, donation, or membership. 74% had encountered a paywall while searching for news, and 38% encountered them often. In most cases, after hitting a paywall, 53% look for the information elsewhere, 32% give up, and only 1% pay for access. Adults with higher education, Democrats, and older people are more likely to pay for news; Democrats are more willing to pay than Republicans (21% vs. 14%).

  13. Study finds LLM users show weaker understanding

    Researchers at the University of Pennsylvania's Wharton School found that, compared with Google search users, people who used large language models to research a topic showed weaker understanding and produced fewer original insights. The research comprised four experiments with more than 4,500 participants in total. Results show that LLM users spent less time and effort on research and wrote shorter, less detailed responses. In the first experiment, more than 1,100 participants used Google or ChatGPT to research vegetable gardening; Google users' responses were longer, more distinctively worded, and richer in cited facts. The second experiment presented the same gardening information as either AI summaries or simulated web pages; among nearly 2,000 participants, the Google-style users again gave deeper and richer information.

  14. Microsoft is moving antivirus software out of the Windows kernel

    Nearly a year after a faulty CrowdStrike update crashed 8.5 million computers worldwide, Microsoft is taking action to ensure such an incident never recurs: it is moving antivirus software out of the Windows kernel. Microsoft's new Windows endpoint security platform is being built in cooperation with security firms including CrowdStrike, Bitdefender, ESET, and Trend Micro. Previously, Microsoft allowed vendors' antivirus software to run at the Windows kernel level, with unrestricted access to system memory and hardware. Last year's faulty CrowdStrike update highlighted how easily kernel drivers can fail and blue-screen the system. Microsoft plans a private preview for security vendors to test, and after several iterations it will complete the move of antivirus software out of the kernel.

  15. Google DeepMind releases AlphaGenome

    Google DeepMind's newly developed AI model AlphaGenome can help scientists interpret the "dark matter" of the genome, the non-coding regions, and understand how they influence the inner workings of cells and contribute to diseases such as cancer. Researchers doing non-commercial work can now access the model via an API on DeepMind's servers. In the human genome, 98% of the sequence is non-coding, not directly involved in encoding proteins, yet these regions can affect protein activity and contain a large share of disease-associated variant sites. Working out what a DNA sequence does is hard because, unlike AlphaFold's prediction of protein 3D structures, there is no ready-made answer. A single stretch of DNA plays many interrelated roles, from recruiting cellular machinery that attaches to a specific part of a chromosome and transcribes nearby genes into RNA molecules, to attracting transcription factors that shape where, when, and how strongly genes are expressed. Many DNA sequences, for example, influence gene activity by altering the chromosome's 3D shape, restricting or easing access for the transcription machinery. Over the decades, scientists have built dozens of AI models to understand the genome, many focused on single tasks such as predicting gene expression levels or determining how exons are spliced into different proteins. AlphaGenome, by contrast, is an "all-in-one" tool for interpreting DNA sequences. It can process up to one million DNA bases, which can cover a gene and countless regulatory elements, and make thousands of predictions across many biological properties. Moreover, AlphaGenome is sensitive to single-base changes, meaning scientists can predict the effects of mutations.