OrangeBot.AI Digest — 2025-08-15

71 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Claude Opus 4 and 4.1 can now end a rare subset of conversations (www.anthropic.com)
  2. Imagen 4 is now generally available (developers.googleblog.com)
  3. Show HN: Edka – Kubernetes clusters on your own Hetzner account (edka.io)
  4. Occult books digitized and put online by Amsterdam’s Ritman Library (www.openculture.com)
  5. The electric fence stopped working years ago (soonly.com)
  6. Do Things That Don't Scale (2013) (paulgraham.com)
  7. The beauty of a text only webpage (albanbrooke.com)
  8. The Timmy Trap (jenson.org)
  9. White House loyalty rating for companies (www.axios.com)
  10. Vaultwarden commit introduces SSO using OpenID Connect (github.com)
  11. Fairness is what the powerful 'can get away with', study shows (phys.org)
  12. Court records reveal Sig Sauer knew of pistol risks for years (smokinggun.org)
  13. Open hardware desktop 3D printing is dead? (www.josefprusa.com)
  14. Swiss vs. UK approach to major transport projects (www.freewheeling.info)
  15. UK government states that 'safety' act is about influence over public discourse (bsky.app)

GitHub Trending (13)

  1. tadata-org / fastapi_mcp

    Expose your FastAPI endpoints as Model Context Protocol (MCP) tools, with Auth!

  2. ubicloud / ubicloud

    Open source alternative to AWS. Elastic compute, block storage (non-replicated), firewall and load balancer, managed Postgres, K8s, AI inference, and IAM services.

  3. budtmo / docker-android

    Android-in-Docker solution with noVNC support and video recording

  4. manycore-research / SpatialLM

    SpatialLM: Training Large Language Models for Structured Indoor Modeling

  5. microsoft / magentic-ui

    A research prototype of a human-centered web agent

  6. datalab-to / marker

    Convert PDF to markdown + JSON quickly with high accuracy

  7. redis / go-redis

    Redis Go client

  8. qarmin / czkawka

    Multi-functional app to find duplicates, empty folders, similar images, etc.

  9. Librum-Reader / Librum

    The Librum client application

  10. microsoft / markitdown

    Python tool for converting files and office documents to Markdown.

  11. jitsi / jitsi-meet

    Jitsi Meet - Secure, Simple and Scalable Video Conferences that you use as a standalone app or embed in your web application.

  12. dotnet / maui

    .NET MAUI is the .NET Multi-platform App UI, a framework for building native device applications spanning mobile, tablet, and desktop.

  13. google / wire

    Compile-time Dependency Injection for Go
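czkawka (no. 8 above) is built around duplicate detection. The core technique, bucketing files by size and then confirming matches with a content hash, can be sketched in a few lines of Python (a toy illustration, not czkawka's actual Rust implementation; `find_duplicates` is a hypothetical name):

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(root):
    """Group files under `root` that have identical content.

    Two-pass approach: bucket by size first (cheap stat calls only),
    then confirm with a SHA-256 content hash within same-size buckets.
    """
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique file size cannot have duplicates
        by_hash = defaultdict(list)
        for p in paths:
            digest = hashlib.sha256(p.read_bytes()).hexdigest()
            by_hash[digest].append(p)
        duplicates.extend(g for g in by_hash.values() if len(g) > 1)
    return duplicates
```

Hashing only within same-size buckets means most files are never read at all, which is why duplicate finders scale to large directory trees.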

Product Hunt (15)

  1. Kuse

    If ChatGPT, Notion, and a whiteboard had a genius baby

  2. Move AI

    Relocate as fast as you ship

  3. stagewise

    The frontend coding agent for existing codebases

  4. GPT-5 SEO Brand Visibility

    Find out what GPT-5 thinks about your brand & competitors

  5. PersonaRoll

    Automatically go viral just by being you

  6. GitRanks

    GitHub Profile Analytics & Rankings

  7. Readdit Later

    Save and manage your Reddit posts with ease

  8. Claude Utils

    Paste images into Claude Code like magic

  9. Relyable

    Simulation & monitoring platform for AI voice agents

  10. Parachute Backup Mobile

    Keep your iCloud Data. Forever. Now on iPhone and iPad

  11. VIVE Eagle AI Glasses

    Your AI companion, in a stylish frame

  12. Dabe Agents

    Build AI agents by describing your tasks

  13. Genstack

    The Universal AI SDK

  14. LetzAI

    Generative AI to which you can add yourself

  15. VeoSpark

    Turn products into AI-quality Veo 3 ads without holes in your wallet

Hugging Face (13)

  1. We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

    Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities across various tasks, but still struggle with complex mathematical reasoning. Existing research primarily focuses on dataset construction and method optimization, often overlooking two critical aspects: comprehensive knowledge-driven design and model-centric data space modeling. In this paper, we introduce We-Math 2.0, a unified system that integrates a structured mathematical knowledge system, model-centric data space modeling, and a reinforcement learning (RL)-based training paradigm to comprehensively enhance the mathematical reasoning abilities of MLLMs. The key contributions of We-Math 2.0 are fourfold: (1) MathBook Knowledge System: We construct a five-level hierarchical system encompassing 491 knowledge points and 1,819 fundamental principles. (2) MathBook-Standard & Pro: We develop MathBook-Standard, a dataset that ensures broad conceptual coverage and flexibility through dual expansion. Additionally, we define a three-dimensional difficulty space and generate 7 progressive variants per problem to build MathBook-Pro, a challenging dataset for robust training. (3) MathBook-RL: We propose a two-stage RL framework comprising: (i) Cold-Start Fine-tuning, which aligns the model with knowledge-oriented chain-of-thought reasoning; and (ii) Progressive Alignment RL, leveraging average-reward learning and dynamic data scheduling to achieve progressive alignment across difficulty levels. (4) MathBookEval: We introduce a comprehensive benchmark covering all 491 knowledge points with diverse reasoning step distributions. Experimental results show that MathBook-RL performs competitively with existing baselines on four widely-used benchmarks and achieves strong results on MathBookEval, suggesting promising generalization in mathematical reasoning.

  2. NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

    Prevailing autoregressive (AR) models for text-to-image generation either rely on heavy, computationally intensive diffusion models to process continuous image tokens, or employ vector quantization (VQ) to obtain discrete tokens at the cost of quantization loss. In this paper, we push the autoregressive paradigm forward with NextStep-1, a 14B autoregressive model paired with a 157M flow matching head, trained on discrete text tokens and continuous image tokens with next-token prediction objectives. NextStep-1 achieves state-of-the-art performance among autoregressive models in text-to-image generation tasks, exhibiting strong capabilities in high-fidelity image synthesis. Furthermore, our method shows strong performance in image editing, highlighting the power and versatility of our unified approach. To facilitate open research, we will release our code and models to the community.

  3. ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

    Traditional cartoon and anime production involves keyframing, inbetweening, and colorization stages, which require intensive manual effort. Despite recent advances in AI, existing methods often handle these stages separately, leading to error accumulation and artifacts. For instance, inbetweening approaches struggle with large motions, while colorization methods require dense per-frame sketches. To address this, we introduce ToonComposer, a generative model that unifies inbetweening and colorization into a single post-keyframing stage. ToonComposer employs a sparse sketch injection mechanism to provide precise control using keyframe sketches. Additionally, it uses a cartoon adaptation method with the spatial low-rank adapter to tailor a modern video foundation model to the cartoon domain while keeping its temporal prior intact. Requiring as few as a single sketch and a colored reference frame, ToonComposer excels with sparse inputs, while also supporting multiple sketches at any temporal location for more precise motion control. This dual capability reduces manual workload and improves flexibility, empowering artists in real-world scenarios. To evaluate our model, we further created PKBench, a benchmark featuring human-drawn sketches that simulate real-world use cases. Our evaluation demonstrates that ToonComposer outperforms existing methods in visual quality, motion consistency, and production efficiency, offering a superior and more flexible solution for AI-assisted cartoon production.

  4. PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts

    We introduce PRELUDE, a benchmark for evaluating long-context understanding through the task of determining whether a character's prequel story is consistent with the canonical narrative of the original book. Our task poses a stronger demand for global comprehension and deep reasoning than existing benchmarks -- as the prequels are not part of the original story, assessing their plausibility typically requires searching and integrating information that is only indirectly related. Empirically, 88% of instances require evidence from multiple parts of the narrative. Experimental results highlight the challenge of our task: in-context learning, RAG, and in-domain training with state-of-the-art LLMs, as well as commercial DeepResearch services, lag behind humans by >15%. A further human study reveals that models often produce correct answers with flawed reasoning, leading to an over 30% gap in reasoning accuracy compared to humans. These findings underscore the substantial room for improvement in long-context understanding and reasoning.

  5. UI-Venus Technical Report: Building High-performance UI Agents with RFT

    We present UI-Venus, a native UI agent that takes only screenshots as input, built on a multimodal large language model. UI-Venus achieves SOTA performance on both UI grounding and navigation tasks using only several hundred thousand high-quality training samples through reinforcement fine-tuning (RFT) based on Qwen2.5-VL. Specifically, the 7B and 72B variants of UI-Venus obtain 94.1% / 50.8% and 95.3% / 61.9% on the standard grounding benchmarks, i.e., Screenspot-V2 / Pro, surpassing the previous SOTA baselines including the open-source GTA1 and the closed-source UI-TARS-1.5. To show UI-Venus's summarization and planning ability, we also evaluate it on AndroidWorld, an online UI navigation arena, where our 7B and 72B variants achieve 49.1% and 65.9% success rates, also beating existing models. To achieve this, we introduce carefully designed reward functions for both UI grounding and navigation tasks and corresponding efficient data cleaning strategies. To further boost navigation performance, we propose Self-Evolving Trajectory History Alignment & Sparse Action Enhancement, which refines historical reasoning traces and balances the distribution of sparse but critical actions, leading to more coherent planning and better generalization in complex UI tasks. Our contributions include the release of SOTA open-source UI agents, comprehensive data cleaning protocols, and a novel self-evolving framework for improving navigation performance, which we hope will encourage further research and development in the community. Code is available at https://github.com/antgroup/UI-Venus.

  6. Puppeteer: Rig and Animate Your 3D Models

    Modern interactive applications increasingly demand dynamic 3D content, yet the transformation of static 3D models into animated assets constitutes a significant bottleneck in content creation pipelines. While recent advances in generative AI have revolutionized static 3D model creation, rigging and animation continue to depend heavily on expert intervention. We present Puppeteer, a comprehensive framework that addresses both automatic rigging and animation for diverse 3D objects. Our system first predicts plausible skeletal structures via an auto-regressive transformer that introduces a joint-based tokenization strategy for compact representation and a hierarchical ordering methodology with stochastic perturbation that enhances bidirectional learning capabilities. It then infers skinning weights via an attention-based architecture incorporating topology-aware joint attention that explicitly encodes inter-joint relationships based on skeletal graph distances. Finally, we complement these rigging advances with a differentiable optimization-based animation pipeline that generates stable, high-fidelity animations while being computationally more efficient than existing approaches. Extensive evaluations across multiple benchmarks demonstrate that our method significantly outperforms state-of-the-art techniques in both skeletal prediction accuracy and skinning quality. The system robustly processes diverse 3D content, ranging from professionally designed game assets to AI-generated shapes, producing temporally coherent animations that eliminate the jittering issues common in existing methods.

  7. STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

    We present STream3R, a novel approach to 3D reconstruction that reformulates pointmap prediction as a decoder-only Transformer problem. Existing state-of-the-art methods for multi-view reconstruction either depend on expensive global optimization or rely on simplistic memory mechanisms that scale poorly with sequence length. In contrast, STream3R introduces a streaming framework that processes image sequences efficiently using causal attention, inspired by advances in modern language modeling. By learning geometric priors from large-scale 3D datasets, STream3R generalizes well to diverse and challenging scenarios, including dynamic scenes where traditional methods often fail. Extensive experiments show that our method consistently outperforms prior work across both static and dynamic scene benchmarks. Moreover, STream3R is inherently compatible with LLM-style training infrastructure, enabling efficient large-scale pretraining and fine-tuning for various downstream 3D tasks. Our results underscore the potential of causal Transformer models for online 3D perception, paving the way for real-time 3D understanding in streaming environments. More details can be found on our project page: https://nirvanalan.github.io/projects/stream3r.

  8. Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

    Reinforcement learning with verifiable rewards (RLVR), which typically adopts Pass@1 as the reward, faces difficulty in balancing exploration and exploitation, causing policies to prefer conservative actions and converge to a local optimum. Identifying an appropriate reward metric is therefore crucial. Although Pass@k has been used in evaluation in prior work, its connection to LLM exploration ability in RLVR remains largely overlooked. To investigate this, we first use Pass@k as the reward to train the policy model (i.e., Pass@k Training) and observe an improvement in its exploration ability. Next, we derive an analytical solution for the advantage of Pass@k Training, leading to an efficient and effective process. Building on this, our analysis reveals that exploration and exploitation are not inherently conflicting objectives and can instead mutually enhance each other. Moreover, Pass@k Training with analytical derivation essentially amounts to directly designing the advantage function. Inspired by this, we preliminarily explore advantage design for RLVR, showing promising results and highlighting a potential future direction.
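The Pass@k metric at the heart of the abstract above has a standard closed form: with n generated samples of which c are correct, the probability that at least one of k samples drawn without replacement is correct is 1 - C(n-c, k)/C(n, k). A minimal sketch of that estimator follows (the paper's training-time advantage derivation is not reproduced here):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased Pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n generations of which c
    are correct, is correct.

    Pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: a correct draw is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Note how the metric rewards diversity: a policy that solves a problem on even a few of n attempts scores well at larger k, which is what connects Pass@k to exploration.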

  9. A Survey on Diffusion Language Models

    Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent advantages in reducing inference latency and capturing bidirectional context, thereby enabling fine-grained control over the generation process. While achieving a several-fold speed-up, recent advancements have allowed DLMs to show performance comparable to their autoregressive counterparts, making them a compelling choice for various natural language processing tasks. In this survey, we provide a holistic overview of the current DLM landscape. We trace its evolution and relationship with other paradigms, such as autoregressive and masked language models, and cover both foundational principles and state-of-the-art models. Our work offers an up-to-date, comprehensive taxonomy and an in-depth analysis of current techniques, from pre-training strategies to advanced post-training methods. Another contribution of this survey is a thorough review of DLM inference strategies and optimizations, including improvements in decoding parallelism, caching mechanisms, and generation quality. We also highlight the latest approaches to multimodal extensions of DLMs and delineate their applications across various practical scenarios. Furthermore, our discussion addresses the limitations and challenges of DLMs, including efficiency, long-sequence handling, and infrastructure requirements, while outlining future research directions to sustain progress in this rapidly evolving field. Project GitHub is available at https://github.com/VILA-Lab/Awesome-DLMs.
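The parallel, iterative denoising that distinguishes DLMs from autoregressive decoding can be illustrated with a toy loop: start fully masked and fill in a batch of positions per step. Here the "model" is simulated by copying from a known target, and `diffusion_decode` is a hypothetical name, so this is a shape-of-the-algorithm sketch only:

```python
import random

def diffusion_decode(target, steps=4, seed=0):
    """Toy illustration of parallel iterative denoising decoding.

    Starts from a fully masked sequence and, at each step, fills in a
    batch of positions at once (copying from `target` stands in for the
    model's predictions). Contrast with autoregressive decoding, which
    commits exactly one token per step, left to right.
    """
    rng = random.Random(seed)
    seq = ["[MASK]"] * len(target)
    masked = list(range(len(target)))
    per_step = -(-len(target) // steps)  # ceil division: positions revealed per step
    trace = []
    while masked:
        rng.shuffle(masked)
        reveal, masked = masked[:per_step], masked[per_step:]
        for i in reveal:  # every position in this batch is filled "in parallel"
            seq[i] = target[i]
        trace.append(list(seq))
    return trace
```

The whole sequence is produced in `steps` iterations regardless of length, which is the source of the inference-latency advantage the survey describes; real DLMs replace the copy step with a learned denoiser and a confidence-based choice of which positions to commit.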

  10. HumanSense: From Multimodal Perception to Empathetic Context-Aware Responses through Reasoning MLLMs

    While Multimodal Large Language Models (MLLMs) show immense promise for achieving truly human-like interactions, progress is hindered by the lack of fine-grained evaluation frameworks for human-centered scenarios, encompassing both the understanding of complex human intentions and the provision of empathetic, context-aware responses. Here we introduce HumanSense, a comprehensive benchmark designed to evaluate the human-centered perception and interaction capabilities of MLLMs, with a particular focus on deep understanding of extended multimodal contexts and the formulation of rational feedback. Our evaluation reveals that leading MLLMs still have considerable room for improvement, particularly for advanced interaction-oriented tasks. Supplementing visual input with audio and text information yields substantial improvements, and Omni-modal models show advantages on these tasks. Furthermore, we argue that appropriate feedback stems from a contextual analysis of the interlocutor's needs and emotions, with reasoning ability serving as the key to unlocking it. Accordingly, we employ a multi-stage, modality-progressive reinforcement learning to enhance the reasoning abilities of an Omni model, achieving substantial gains on evaluation results. Additionally, we observe that successful reasoning processes exhibit highly consistent thought patterns. By designing corresponding prompts, we also enhance the performance of non-reasoning models in a training-free manner. Project page: https://digital-avatar.github.io/ai/HumanSense/

  11. Processing and acquisition traces in visual encoders: What does CLIP know about your camera?

    Prior work has analyzed the robustness of visual encoders to image transformations and corruptions, particularly in cases where such alterations are not seen during training. When this occurs, they introduce a form of distribution shift at test time, often leading to performance degradation. The primary focus has been on severe corruptions that, when applied aggressively, distort useful signals necessary for accurate semantic predictions. We take a different perspective by analyzing parameters of the image acquisition process and transformations that may be subtle or even imperceptible to the human eye. We find that such parameters are systematically encoded in the learned visual representations and can be easily recovered. More strikingly, their presence can have a profound impact, either positively or negatively, on semantic predictions. This effect depends on whether there is a strong correlation or anti-correlation between semantic labels and these acquisition-based or processing-based labels. Our code and data are available at: https://github.com/ryan-caesar-ramos/visual-encoder-traces

  12. From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms

    Recent advancements in machine learning have spurred growing interest in automated interpreting quality assessment. Nevertheless, existing research suffers from insufficient examination of language use quality, unsatisfactory modeling effectiveness due to data scarcity and imbalance, and a lack of effort to explain model predictions. To address these gaps, we propose a multi-dimensional modeling framework that integrates feature engineering, data augmentation, and explainable machine learning. This approach prioritizes explainability over "black box" predictions by utilizing only construct-relevant, transparent features and conducting Shapley Value (SHAP) analysis. Our results demonstrate strong predictive performance on a novel English-Chinese consecutive interpreting dataset, identifying BLEURT and CometKiwi scores as the strongest predictive features for fidelity, pause-related features for fluency, and Chinese-specific phraseological diversity metrics for language use. Overall, by placing particular emphasis on explainability, we present a scalable, reliable, and transparent alternative to traditional human evaluation, facilitating detailed diagnostic feedback for learners and supporting self-regulated learning, advantages not afforded by automated scores in isolation.

  13. When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing

    In the study of trustworthy Natural Language Processing (NLP), a number of important research fields have emerged, including that of explainability and privacy. While research interest in both explainable and privacy-preserving NLP has increased considerably in recent years, there remains a lack of investigation at the intersection of the two. This leaves a considerable gap in understanding of whether achieving both explainability and privacy is possible, or whether the two are at odds with each other. In this work, we conduct an empirical investigation into the privacy-explainability trade-off in the context of NLP, guided by the popular overarching methods of Differential Privacy (DP) and Post-hoc Explainability. Our findings include a view into the intricate relationship between privacy and explainability, which is formed by a number of factors, including the nature of the downstream task and choice of the text privatization and explainability method. In this, we highlight the potential for privacy and explainability to co-exist, and we summarize our findings in a collection of practical recommendations for future work at this important intersection.
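For readers unfamiliar with the DP side of the trade-off above, the basic primitive is the Laplace mechanism: a numeric query with sensitivity Δ is released with Laplace(Δ/ε) noise to satisfy ε-differential privacy. A minimal sketch follows (the paper studies text privatization methods, which are more involved than this; `laplace_mechanism` is an illustrative name):

```python
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release `true_value` under epsilon-DP by adding Laplace noise
    with scale b = sensitivity / epsilon.

    Samples Laplace(0, b) as the difference of two iid Exp(1) draws
    scaled by b, a standard and numerically safe construction.
    """
    b = sensitivity / epsilon
    noise = b * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_value + noise
```

Smaller ε means stronger privacy and larger noise, which is exactly the tension with explainability the paper investigates: the noisier the released quantities, the harder it is to attribute a prediction to transparent features.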

Solidot (15)

  1. At this stage, manufacturing in India and Vietnam is still just 'China plus one'

    India's blueprint for replacing China as the world's electronics factory has a striking feature: the entire Indian edifice depends on Chinese companies for its technical architecture, manufacturing know-how, and operational templates. Take Dixon Technologies: the company is building India's electronics manufacturing capacity with a growing roster of Chinese partners, with Longcheer providing design intelligence, Q Technology (Kunshan) camera module expertise, Chongqing Yuhai precision molded components, and HKC display technology. This pattern of dependence has become the organizing principle of India's electronics manufacturing development. In the current structure, Chinese companies retain control of the critical knowledge while Indian partners supply labor arbitrage and regulatory navigation. Rather than building an alternative to Chinese manufacturing, India is building its most sophisticated subsidiary, underwritten by Indian taxpayers and marketed as national renewal. Analysts call this strategy 'China plus one'.

  2. Microsoft executive says voice will become a primary input method for the next generation of Windows

    Pavan Davuluri, head of Microsoft's Windows division, said in a company video that as AI transforms how users interact with computers, the next generation of Windows will become "more ambient, pervasive, and multi-modal." Davuluri said voice will join the keyboard and mouse as a primary input method for Windows, and the operating system will be context-aware, understanding on-screen content and user intent through natural language. He said the Windows interface will change fundamentally within five years as the platform grows more intelligent. The shift will rely on both local processing power and cloud computing to deliver a seamless experience in which users can talk to their computers while typing or inking.

  3. New carbon monoxide antidote can clear the blood within minutes

    According to a study published in PNAS, researchers at the University of Maryland School of Medicine started from RcoM, a natural protein from the bacterium Paraburkholderia xenovorans that senses trace amounts of carbon monoxide in the environment, and engineered from it a molecular sponge named RcoM-HBD-CCC that selectively binds toxic carbon monoxide molecules while ignoring oxygen (O2) and blood-pressure-regulating nitric oxide (NO). In mouse studies, RcoM-HBD-CCC cleared carbon monoxide from the blood within minutes and was safely excreted in urine. The antidote acts like a sponge, seeking out and absorbing carbon monoxide bound to red blood cells. In mice, half of the carbon monoxide in the blood was cleared in under a minute, freeing hemoglobin to carry oxygen again. Other existing protein-based antidotes cannot selectively target carbon monoxide; they also bind nitric oxide, causing blood pressure changes.

  4. AI data centers are driving up residential electricity bills across the US

    As US tech giants such as Amazon, Google, and Microsoft push further into the energy sector to build data centers, electricity bills for American households and small businesses may rise sharply. Data shows that data center power demand has added at least $15 a month to household electricity bills in Ohio since June. An analysis by Carnegie Mellon University and North Carolina State University projects that data center demand will raise average US electricity bills 8% by 2030, with increases in Virginia possibly reaching 25%; residents there could pay an extra $276 a year by 2030. US residential electricity prices have already risen more than 30% since 2020. AI data centers run by US tech companies consumed over 4% of total US electricity in 2023, and analysts predict that share will reach 12% within three years.

  5. Fewer Hiroshima and Nagasaki survivors died of radiation-induced cancer than expected

    According to a study published in the Journal of Biological Physics and Chemistry, less than 1% of the initial survivors of the atomic bombings of Hiroshima and Nagasaki 80 years ago have died or will die of cancer caused by the radiation. An estimated 140,000 people in Hiroshima and 74,000 in Nagasaki died by the end of 1945 from the blast, heat, and acute radiation poisoning. Exposure to high doses of radiation increases the risk of cancer. Philip Thomas, professor of risk management at the University of Bristol, estimates that of 324,000 survivors only about 3,100 have died or will die of radiation-induced leukemia or solid tumors. According to Japan's Ministry of Health, Labour and Welfare, 99,130 bomb survivors are still alive, of whom 4,738 are considered eligible for special medical allowances for radiation-related illnesses. Amy Berrington, professor of cancer epidemiology at the Institute of Cancer Research in London, said ionizing radiation risk is a complex subject that some people exaggerate and others play down, and that Thomas's findings are broadly consistent with earlier research. She cautioned against over-generalizing the conclusion, but said the good news is that radiation has no transgenerational health effects.

  6. The last remnants of ReiserFS have been purged from the kernel

    Linux 6.13, released last year, formally removed the ReiserFS file system, but traces of it were not completely cleaned up. After finding ReiserFS leftovers in the kernel documentation and some tools, SUSE developer David Sterba posted patches removing the last remnants of ReiserFS from the kernel. Since ReiserFS is gone, the sections of the kernel documentation that mention it also need to be deleted. The one exception is the ReiserFS R5 hash function, which is still used by some kernel code. The Reiser4 file system code is currently unmaintained; the era of Hans Reiser in kernel file systems is definitively over.

  7. Russia restricts voice calls on Telegram and WhatsApp

    Russia on Wednesday announced its latest move to tighten control of the internet: restricting voice calls on the messaging apps Telegram and WhatsApp. The media and internet regulator Roskomnadzor claimed the move is aimed at fighting cybercrime, saying the voice services of Telegram and WhatsApp are used for fraud and extortion as well as for sabotage and terrorist activity. Russia also restricted mobile internet access this summer, insisting it was to thwart Ukrainian drone attacks; officials said the mobile network shutdown in Crimea would continue indefinitely. Before the restrictions on Telegram and WhatsApp voice calls were formally announced, users of the apps had already begun reporting outages of the voice services. WhatsApp is the most popular messaging platform in Russia, with more than 96 million monthly active users; Telegram has more than 89 million.

  8. White House considers extending the China-sales revenue cut to other companies

    White House spokeswoman Karoline Leavitt said the administration is still working out the details of Nvidia and AMD handing over 15% of their revenue from AI chip sales to China, and is also considering extending the arrangement to other companies. The move could saddle companies that depend on exports to China with enormous burdens, or force them to decouple. Lawyers and experts warn that existing law governing how the government may charge fees for export licenses could complicate the deal.

  9. DeepSeek's R2 model delayed by problems with Huawei chips

    The Financial Times reports that Hangzhou-based DeepSeek has delayed the release of its new R2 model after failing to train it on Huawei chips. DeepSeek released the widely noted R1 model in January and then began training R2, using Huawei Ascend processors, at the encouragement of the authorities, rather than more mature and more advanced Nvidia AI chips. But after running into persistent technical problems training R2 on Ascend chips, DeepSeek switched to training on Nvidia chips while using Huawei chips for inference.

  10. Younger Americans prefer speeding up video and audio playback

    According to an Economist/YouGov survey, younger American viewers prefer playing video and audio at faster speeds. Among Americans aged 18-29, 31% use playback speeds above 1x, versus just 8% of those aged 45 and older. Streaming platforms such as Apple, Spotify, Netflix, and YouTube have expanded their playback speed options, with YouTube offering its paying subscribers 4x playback. A meta-analysis by University of Waterloo researchers found that 1.5x playback has essentially no effect on test scores, while speeds of 2x or more lower them.

  11. China's open-source model push worries Silicon Valley and Washington

    The world's most advanced AI models all come from US companies and are all proprietary, while China leads in open-source, or open-weight, models. That worries Silicon Valley and Washington, which fear Chinese models could become the AI industry standard. Industry standards are not necessarily the most technically advanced; accessibility and flexibility also matter a great deal, as Android showed in mobile. For many enterprises, open models allow freer customization and help ensure sensitive information does not leak out. Singapore's OCBC bank has built dozens of internal tools on open models, including Google's Gemma, Alibaba's Qwen, and Hangzhou-based DeepSeek. OpenAI's newly released open model gpt-oss trails Alibaba's Qwen3 on several benchmarks, but Qwen3 has nearly twice as many parameters as gpt-oss, meaning Qwen may need more compute to complete the same task. OpenAI says gpt-oss outperforms competitors of comparable parameter count on reasoning tasks, delivering strong performance at low cost. Amazon AWS says gpt-oss offers better price-performance than DeepSeek R1 running on its infrastructure.

  12. Study finds social media's problems cannot be fixed

    Rather than becoming the healthy, utopian public square for exchanging ideas that people once hoped for, social media has created echo chambers that amplify a minority of voices, amplify outrage and conflict, and deepen polarization. Can interventions on the platforms mitigate or correct some of the problems they create? According to a preprint posted on arXiv, researchers tested six intervention strategies and found them essentially ineffective: unless social media's architecture is fundamentally changed, its problems cannot be fixed. The interventions tested were: sorting feeds chronologically or randomly; inverting engagement-promoting algorithms to reduce exposure to emotive content; promoting viewpoint diversity; using "bridging algorithms" that favor mutual understanding over inflammatory content; hiding social statistics such as reposts and follower counts to reduce social-influence cues; and removing profile bios to limit exposure to identity-based signals. The results showed that some interventions yielded only slight improvements, while others may have made the problems worse. Chronological feeds, for example, reduced attention inequality but further amplified extreme content, and promoting viewpoint diversity showed no significant effect at all.

  13. Norway accuses Russian hackers of sabotaging a dam

    Beate Gangaas, head of Norway's counterintelligence agency, said Wednesday that Russian hackers briefly took control of a dam at Bremanger on April 7 this year and opened a floodgate; the attack was discovered and stopped four hours later. Most of Norway's electricity comes from hydropower, and the intelligence services had previously warned that its energy infrastructure could come under attack. Gangaas said the aim of such attacks is to influence the public and sow fear and chaos, adding that "our Russian neighbor has become more dangerous." The Russian embassy in Oslo said her statement was baseless and politically motivated.

  14. Starbucks Korea asks customers not to bring printers and desktop PCs into its stores

    Starbucks customers in South Korea have been using the coffee chain as a remote-work office. To curb the practice, Starbucks issued a new policy asking customers not to bring large items such as printers and desktop computers into its stores. A Starbucks Korea spokesperson said customers with laptops and small personal devices remain welcome, but asked them not to bring desktop computers, printers, or other bulky items that could limit seating and affect the shared space. Starbucks opened its first Korean store in Seoul in 1999; although South Korea's population is less than half of Japan's, it now has more Starbucks stores than Japan, 2,050 to 2,040. Jo Elfving-Hwang, an associate professor of Korean society and culture, said remote working from a Starbucks is quite cheap, but some people have taken it to extremes.

  15. Feline dementia resembles Alzheimer's in humans

    According to a study published in the European Journal of Neuroscience, there are striking similarities between feline dementia and Alzheimer's disease in humans. University of Edinburgh researchers examined the brains of 25 cats that had shown dementia symptoms before death and found accumulations of beta-amyloid, a protein that is also a hallmark of Alzheimer's disease. The researchers suggest cats could serve as an ideal model for studying dementia and Alzheimer's and for exploring new therapies, while also helping us understand and manage dementia in cats.