OrangeBot.AI Digest — 2025-07-21

62 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Global hack on Microsoft SharePoint hits U.S., state agencies, researchers say (www.washingtonpost.com)
  2. What went wrong inside recalled Anker PowerCore 10000 power banks? (www.lumafield.com)
  3. AccountingBench: Evaluating LLMs on real long-horizon business tasks (accounting.penrose.com)
  4. Gemini with Deep Think achieves gold-medal standard at the IMO (deepmind.google)
  5. Solar-plus-storage technology is improving quickly (www.volts.wtf)
  6. Australian anti-porn group claims responsibility for Steam's new censorship rules (www.pcgamer.com)
  7. New records on Wendelstein 7-X (www.iter.org)
  8. TrackWeight: Turn your MacBook's trackpad into a digital weighing scale (github.com)
  9. Shale Drillers Turn on Each Other as Toxic Water Leaks Hit Biggest US Oil Field (www.bloomberg.com)
  10. Occasionally USPS sends me pictures of other people's mail (the418.substack.com)
  11. UK backing down on Apple encryption backdoor after pressure from US (arstechnica.com)
  12. We made Postgres writes faster, but it broke replication (www.paradedb.com)
  13. What happens when housing prices go down? (clmarohn.substack.com)
  14. The daily life of a medieval king (www.medievalists.net)
  15. ESP32-Faikin: ESP32 based module to control Daikin aircon units (github.com)

GitHub Trending (15)

  1. maybe-finance / maybe

    The personal finance app for everyone

  2. ChatGPTNextWeb / NextChat

    ✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows

  3. hesreallyhim / awesome-claude-code

    A curated list of awesome commands, files, and workflows for Claude Code

  4. langchain-ai / open_deep_research
  5. hyprwm / Hyprland

    Hyprland is an independent, highly customizable, dynamic tiling Wayland compositor that doesn't sacrifice on its looks.

  6. donnemartin / system-design-primer

    Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

  7. unclecode / crawl4ai

    🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

  8. Lissy93 / dashy

    🚀 A self-hostable personal dashboard built for you. Includes status-checking, widgets, themes, icon packs, a UI editor and tons more!

  9. microsoft / ai-agents-for-beginners

    11 Lessons to Get Started Building AI Agents

  10. better-auth / better-auth

    The most comprehensive authentication framework for TypeScript

  11. C4illin / ConvertX

    💾 Self-hosted online file converter. Supports 1000+ formats ⚙️

  12. srbhr / Resume-Matcher

    Improve your resumes with Resume Matcher. Get insights, keyword suggestions and tune your resumes to job descriptions.

  13. bluenviron / mediamtx

    Ready-to-use SRT / WebRTC / RTSP / RTMP / LL-HLS media server and media proxy that lets you read, publish, proxy, record, and play back video and audio streams.

  14. remoteintech / remote-jobs

    A list of semi to fully remote-friendly companies (jobs) in tech.

  15. simstudioai / sim

    Sim Studio is an open-source AI agent workflow builder. Sim Studio's interface is a lightweight, intuitive way to quickly build and deploy LLMs that connect with your favorite tools.

Product Hunt (15)

  1. Jeeva 2.0

    Superhuman Sales, Powered By Agentic AI

  2. Levio by Jupitrr AI

    Your AI video editing agent

  3. Trae 2.0

    SOLO: Context Engineer that delivers software end-to-end

  4. the gist of

    Go beyond the link in bio. Tell a story.

  5. Krepling Pay

    Boost sales with one-click checkout, no account required

  6. Stakpak.dev

    Open-source DevOps agent to secure & manage production infra

  7. sampleapp.ai

    Self-serve API & SDK onboarding—in 60 seconds. Skip docs.

  8. Refgrow

    Grow with referrals on autopilot

  9. Veo 3 API

    Generate videos (with audio) using Veo 3 from Google

  10. Brainfork

    Own your AI knowledge | Personal MCP server

  11. Submagic API

    Automate video editing with Submagic API

  12. Rubbrband

    Create storyboards from scripts

  13. Saidar 2.0

    Your Personal AI Secretary for Admin Tasks

  14. Palmier

    Agents that write prod-ready code. Automatically.

  15. Corner Time

    macOS clock for fullscreen or whenever the menubar is hidden

Hugging Face (10)

  1. A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models

    Russian speech synthesis presents distinctive challenges, including vowel reduction, consonant devoicing, variable stress patterns, homograph ambiguity, and unnatural intonation. This paper introduces Balalaika, a novel dataset comprising more than 2,000 hours of studio-quality Russian speech with comprehensive textual annotations, including punctuation and stress markings. Experimental results show that models trained on Balalaika significantly outperform those trained on existing datasets in both speech synthesis and enhancement tasks. We detail the dataset construction pipeline, annotation methodology, and results of comparative evaluations.

  2. The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

    Diffusion-based large language models (dLLMs) have recently emerged as a powerful alternative to autoregressive LLMs, offering faster inference and greater interactivity via parallel decoding and bidirectional modeling. However, despite strong performance in code generation and text infilling, we identify a fundamental safety concern: existing alignment mechanisms fail to safeguard dLLMs against context-aware, masked-input adversarial prompts, exposing novel vulnerabilities. To this end, we present DIJA, the first systematic study and jailbreak attack framework that exploits unique safety weaknesses of dLLMs. Specifically, our proposed DIJA constructs adversarial interleaved mask-text prompts that exploit the text generation mechanisms of dLLMs, i.e., bidirectional modeling and parallel decoding. Bidirectional modeling drives the model to produce contextually consistent outputs for masked spans, even when harmful, while parallel decoding limits model dynamic filtering and rejection sampling of unsafe content. This causes standard alignment mechanisms to fail, enabling harmful completions in alignment-tuned dLLMs, even when harmful behaviors or unsafe instructions are directly exposed in the prompt. Through comprehensive experiments, we demonstrate that DIJA significantly outperforms existing jailbreak methods, exposing a previously overlooked threat surface in dLLM architectures. Notably, our method achieves up to 100% keyword-based ASR on Dream-Instruct, surpassing the strongest prior baseline, ReNeLLM, by up to 78.5% in evaluator-based ASR on JailbreakBench and by 37.7 points in StrongREJECT score, while requiring no rewriting or hiding of harmful content in the jailbreak prompt. Our findings underscore the urgent need for rethinking safety alignment in this emerging class of language models. Code is available at https://github.com/ZichenWen1/DIJA.

  3. Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning

    We present Franca (pronounced Fran-ka): free one; the first fully open-source (data, code, weights) vision foundation model that matches and in many cases surpasses the performance of state-of-the-art proprietary models, e.g., DINOv2, CLIP, SigLIPv2, etc. Our approach is grounded in a transparent training pipeline inspired by Web-SSL and uses publicly available data: ImageNet-21K and a subset of ReLAION-2B. Beyond model release, we tackle critical limitations in SSL clustering methods. While modern models rely on assigning image features to large codebooks via clustering algorithms like Sinkhorn-Knopp, they fail to account for the inherent ambiguity in clustering semantics. To address this, we introduce a parameter-efficient, multi-head clustering projector based on nested Matryoshka representations. This design progressively refines features into increasingly fine-grained clusters without increasing the model size, enabling both performance and memory efficiency. Additionally, we propose a novel positional disentanglement strategy that explicitly removes positional biases from dense representations, thereby improving the encoding of semantic content. This leads to consistent gains on several downstream benchmarks, demonstrating the utility of cleaner feature spaces. Our contributions establish a new standard for transparent, high-performance vision models and open a path toward more reproducible and generalizable foundation models for the broader AI community. The code and model checkpoints are available at https://github.com/valeoai/Franca.
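    The Sinkhorn-Knopp assignment the abstract refers to is a standard balancing routine in self-supervised clustering (SwAV-style pseudo-label assignment). A minimal NumPy sketch of that standard step is shown below as background; Franca's nested Matryoshka projector is the paper's own contribution and is not reproduced here.

```python
import numpy as np

def sinkhorn_knopp(scores, n_iters=3, eps=0.05):
    """Balance a (samples x prototypes) score matrix into soft cluster
    assignments whose mass is spread evenly across prototypes."""
    # Subtract the max before exponentiating for numerical stability.
    Q = np.exp((scores - scores.max()) / eps).T  # prototypes x samples
    Q /= Q.sum()
    K, B = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(axis=1, keepdims=True)  # normalize over samples...
        Q /= K                             # ...each prototype gets 1/K mass
        Q /= Q.sum(axis=0, keepdims=True)  # normalize over prototypes...
        Q /= B                             # ...each sample gets 1/B mass
    return (Q * B).T  # rows (samples) now sum to 1
```

    The balancing forces all prototypes to be used, which is what prevents the trivial collapse that plain argmax assignment would allow.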

  4. Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models

    This paper focuses on monolithic Multimodal Large Language Models (MLLMs), which integrate visual encoding and language decoding into a single model. Existing structures and pre-training strategies for monolithic MLLMs often suffer from unstable optimization and catastrophic forgetting. To address these challenges, our key idea is to embed a new visual parameter space into a pre-trained LLM, enabling stable learning of visual knowledge from noisy data via delta tuning. Based on this principle, we first introduce Mono-InternVL, an advanced monolithic MLLM that incorporates a set of visual experts through a multimodal mixture-of-experts architecture. In addition, we design an innovative Endogenous Visual Pre-training (EViP) for Mono-InternVL to maximize its visual capabilities via progressive learning. Mono-InternVL achieves competitive performance against existing MLLMs but also leads to relatively expensive data cost. Therefore, we further present Mono-InternVL-1.5, a cheaper and stronger monolithic MLLM equipped with an improved EViP (EViP++). EViP++ introduces additional visual attention experts to Mono-InternVL-1.5 and re-organizes the pre-training process in an efficient manner. During inference, it includes a fused CUDA kernel to speed up its MoE operations. With these designs, Mono-InternVL-1.5 significantly reduces training and inference costs, while still maintaining competitive performance with Mono-InternVL. To evaluate our approach, we conduct extensive experiments across 15 benchmarks. Results demonstrate that Mono-InternVL outperforms existing monolithic MLLMs on 12 out of 15 benchmarks, e.g., +114-point improvement over Emu3 on OCRBench. Compared to its modular counterpart, i.e., InternVL-1.5, Mono-InternVL-1.5 achieves similar multimodal performance while reducing first-token latency by up to 69%. Code and models are released at https://github.com/OpenGVLab/Mono-InternVL.

  5. CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models

    Disentangling content and style from a single image, known as content-style decomposition (CSD), enables recontextualization of extracted content and stylization of extracted styles, offering greater creative flexibility in visual synthesis. While recent personalization methods have explored the decomposition of explicit content style, they remain tailored for diffusion models. Meanwhile, Visual Autoregressive Modeling (VAR) has emerged as a promising alternative with a next-scale prediction paradigm, achieving performance comparable to that of diffusion models. In this paper, we explore VAR as a generative framework for CSD, leveraging its scale-wise generation process for improved disentanglement. To this end, we propose CSD-VAR, a novel method that introduces three key innovations: (1) a scale-aware alternating optimization strategy that aligns content and style representation with their respective scales to enhance separation, (2) an SVD-based rectification method to mitigate content leakage into style representations, and (3) an Augmented Key-Value (K-V) memory enhancing content identity preservation. To benchmark this task, we introduce CSD-100, a dataset specifically designed for content-style decomposition, featuring diverse subjects rendered in various artistic styles. Experiments demonstrate that CSD-VAR outperforms prior approaches, achieving superior content preservation and stylization fidelity.

  6. Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

    In the era of Large Language Models (LLMs), alignment has emerged as a fundamental yet challenging problem in the pursuit of more reliable, controllable, and capable machine intelligence. The recent success of reasoning models and conversational AI systems has underscored the critical role of reinforcement learning (RL) in enhancing these systems, driving increased research interest at the intersection of RL and LLM alignment. This paper provides a comprehensive review of recent advances in LLM alignment through the lens of inverse reinforcement learning (IRL), emphasizing the distinctions between RL techniques employed in LLM alignment and those in conventional RL tasks. In particular, we highlight the necessity of constructing neural reward models from human data and discuss the formal and practical implications of this paradigm shift. We begin by introducing fundamental concepts in RL to provide a foundation for readers unfamiliar with the field. We then examine recent advances in this research agenda, discussing key challenges and opportunities in conducting IRL for LLM alignment. Beyond methodological considerations, we explore practical aspects, including datasets, benchmarks, evaluation metrics, infrastructure, and computationally efficient training and inference techniques. Finally, we draw insights from the literature on sparse-reward RL to identify open questions and potential research directions. By synthesizing findings from diverse studies, we aim to provide a structured and critical overview of the field, highlight unresolved challenges, and outline promising future directions for improving LLM alignment through RL and IRL techniques.

  7. RedOne: Revealing Domain-specific LLM Post-Training in Social Networking Services

    As a primary medium for modern information dissemination, social networking services (SNS) have grown rapidly, posing significant challenges for platform content management and interaction quality. The development of large language models (LLMs) offers potential solutions, but existing studies focus on isolated tasks, which not only see diminishing returns from data scaling within individual scenarios but also fail to adapt flexibly to diverse real-world contexts. To address these challenges, we introduce RedOne, a domain-specific LLM designed to break the performance bottleneck of single-task baselines and establish a comprehensive foundation for SNS. RedOne was developed through a three-stage training strategy consisting of continued pretraining, supervised fine-tuning, and preference optimization, using a large-scale real-world dataset. In extensive experiments, RedOne maintains strong general capabilities and achieves an average improvement of up to 14.02% across 8 major SNS tasks and 7.56% on an SNS bilingual evaluation benchmark, compared with base models. Furthermore, in online testing, RedOne reduced the exposure rate in harmful content detection by 11.23% and improved the page click rate in post-view search by 14.95% compared with single-task fine-tuned baseline models. These results establish RedOne as a robust domain-specific LLM for SNS, demonstrating excellent generalization across tasks and promising applicability in real-world scenarios.

  8. Mitigating Object Hallucinations via Sentence-Level Early Intervention

    Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations - fabricated content contradicting visual inputs. Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs. We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs. To address this, we propose **SENTINEL** (**S**entence-level **E**arly i**N**tervention **T**hrough **IN**-domain pr**E**ference **L**earning), a framework that eliminates dependency on human annotations. Specifically, we first bootstrap high-quality in-domain preference pairs by iteratively sampling model outputs, validating object existence through cross-checking with two open-vocabulary detectors, and classifying sentences into hallucinated/non-hallucinated categories. Subsequently, we use context-coherent positive samples and hallucinated negative samples to build context-aware preference data iteratively. Finally, we train models using a context-aware preference loss (C-DPO) that emphasizes discriminative learning at the sentence level where hallucinations initially manifest. Experimental results show that SENTINEL can reduce hallucinations by over 90% compared to the original model and outperforms the previous state-of-the-art method on both hallucination benchmarks and general capabilities benchmarks, demonstrating its superiority and generalization ability. The models, datasets, and code are available at https://github.com/pspdada/SENTINEL.
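    C-DPO is the paper's sentence-level variant of Direct Preference Optimization. As background, a minimal sketch of the standard pairwise DPO loss it builds on could look like the following; the sentence-level context weighting is SENTINEL's addition and is not shown here.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    The margin is beta times the policy/reference log-ratio of the
    chosen sample minus that of the rejected sample; the loss is
    -log(sigmoid(margin)), pushing the policy to prefer the chosen
    (here: non-hallucinated) continuation.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

    With identical policy and reference log-probabilities the margin is zero and the loss is log 2; it falls as the policy assigns more probability to the preferred sentence.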

  9. The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations

    Evaluating large language models is a complex task for which several approaches have been proposed. The most common is automated benchmarks, in which LLMs answer multiple-choice questions on different topics. This method has limitations, however, the most concerning being its poor correlation with human judgments. An alternative is to have humans evaluate the LLMs, but this poses scalability issues: with a large and growing number of models to evaluate, it is impractical (and costly) to run traditional studies that recruit evaluators and have them rank model responses. Another approach is public arenas, such as the popular LM Arena, in which any user can freely evaluate models on any question and rank the responses of two models; the results are then aggregated into a model ranking. An increasingly important aspect of LLMs is their energy consumption, so it is of interest to evaluate how energy awareness influences humans' choice of model. In this paper, we present GEA, the Generative Energy Arena, an arena that incorporates information on a model's energy consumption into the evaluation process. Preliminary results obtained with GEA are also presented, showing that for most questions, when users are aware of the energy consumption, they favor smaller and more energy-efficient models. This suggests that for most user interactions, the extra cost and energy incurred by the more complex, top-performing models do not yield an increase in perceived response quality that justifies their use.

  10. Quantitative Risk Management in Volatile Markets with an Expectile-Based Framework for the FTSE Index

    This research presents a framework for quantitative risk management in volatile markets, specifically focusing on expectile-based methodologies applied to the FTSE 100 index. Traditional risk measures such as Value-at-Risk (VaR) have demonstrated significant limitations during periods of market stress, as evidenced during the 2008 financial crisis and subsequent volatile periods. This study develops an advanced expectile-based framework that addresses the shortcomings of conventional quantile-based approaches by providing greater sensitivity to tail losses and improved stability in extreme market conditions. The research employs a dataset spanning two decades of FTSE 100 returns, incorporating periods of high volatility, market crashes, and recovery phases. Our methodology introduces novel mathematical formulations for expectile regression models, enhanced threshold determination techniques using time series analysis, and robust backtesting procedures. The empirical results demonstrate that expectile-based Value-at-Risk (EVaR) consistently outperforms traditional VaR measures across various confidence levels and market conditions. The framework exhibits superior performance during volatile periods, with reduced model risk and enhanced predictive accuracy. Furthermore, the study establishes practical implementation guidelines for financial institutions and provides evidence-based recommendations for regulatory compliance and portfolio management. The findings contribute significantly to the literature on financial risk management and offer practical tools for practitioners dealing with volatile market environments.
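    Expectiles, the measure this framework is built on, generalize the mean the way quantiles generalize the median: the tau-expectile minimizes an asymmetrically weighted squared loss. Below is a minimal NumPy sketch of computing a sample expectile by fixed-point iteration, offered as an illustration of the concept rather than the paper's EVaR estimator.

```python
import numpy as np

def expectile(returns, tau=0.975, tol=1e-10, max_iter=1000):
    """Compute the tau-expectile of a return series.

    The tau-expectile e minimizes E[|tau - 1{r <= e}| * (r - e)^2];
    its first-order condition makes e a weighted mean with weight tau
    on observations above e and (1 - tau) below, solved here by
    fixed-point iteration.
    """
    r = np.asarray(returns, dtype=float)
    e = r.mean()  # start from the mean (the 0.5-expectile)
    for _ in range(max_iter):
        w = np.where(r > e, tau, 1.0 - tau)  # asymmetric weights
        e_new = np.sum(w * r) / np.sum(w)    # weighted-mean fixed point
        if abs(e_new - e) < tol:
            break
        e = e_new
    return e
```

    At tau = 0.5 this recovers the sample mean; expectile-based risk measures such as EVaR read off a high-tau expectile of the loss distribution, which reacts to the magnitude of tail losses rather than only their frequency.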

Solidot (7)

  1. Debian 13.0 scheduled for release on August 9

    The Debian release team announced that Debian 13.0 "Trixie" is scheduled for release on August 9, with the full freeze on July 27. Trixie represents two years of development: it ships the Linux 6.12 LTS kernel and a large number of software updates, including the GNOME 48 desktop environment, the GCC 14.2 compiler, and Python 3.13. It will also be the first Debian release to officially support the 64-bit RISC-V architecture.

  2. China now publishes more AI research papers than any other country

    An analysis of the Dimensions database found that the number of AI-related research papers has grown from fewer than 8,500 in 2000 to more than 57,000 in 2024. In 2000, Chinese scholars published only 671 AI papers; by 2024 they published 23,695, more than the United States (6,378), the United Kingdom (2,747), and the European Union (10,055) combined. China's huge output of AI papers has also driven record patent filings: in 2024, Chinese researchers filed 35,423 AI-related patent applications, more than 13 times the combined total (2,678) filed by the United States, the United Kingdom, Canada, Japan, and South Korea. The study also shows that China's AI research is becoming increasingly independent. In recent years, scientists in the US, UK, and EU have co-authored papers with Chinese scholars more often than with each other, yet among the four regions, Chinese scholars have the lowest rate of international collaboration, and that collaboration may decline further as China's large AI research workforce matures. The study found that China has about 30,000 AI researchers of all ages, compared with about 10,000 in the US, and China's AI research workforce is also notably younger.

  3. Astronomers observe the earliest stage of planetary system formation for the first time

    An international research team has for the first time pinpointed the moment when planets begin to form around a star other than the Sun. It is the first observation of the earliest stage of planetary system formation, and it offers a new perspective on the origins of our own solar system. The nascent planetary system orbits HOPS-315, a protostar about 1,300 light-years away that resembles the young Sun. In our solar system, the first solid material to condense near Earth's present orbit around the Sun is found trapped in ancient meteorites; astronomers date these primitive rocks to determine when the solar system began to form. The meteorites are rich in crystalline minerals containing silicon monoxide (SiO), which can condense at the extremely high temperatures of a young planetary disk. Over time, these newly condensed solids clump together, and as they grow in size and mass they seed planet formation. The solar system's first kilometer-sized bodies, which eventually grew into planets like Earth or the core of Jupiter, formed after these crystalline minerals condensed. In the new observations, astronomers found evidence that these hot minerals are beginning to condense in the disk around HOPS-315. The results show that SiO is present around the infant star both as a gas and within the crystalline minerals, indicating that it is only just starting to solidify. The researchers say this process had never before been seen in a protoplanetary disk, or anywhere outside our solar system.

  4. Microsoft stops using China-based engineers to support the Pentagon

    After it emerged that Microsoft had been using China-based engineers to provide technical support for the Pentagon's cloud computing systems, the company said on Friday that it has changed its arrangements to ensure that no China-based engineering teams provide technical support for the Defense Department's government cloud and related services. Defense Secretary Pete Hegseth had earlier said an investigation would be launched. The Chinese engineers' work had been supervised by US citizens holding security clearances, but the investigation found that those performing the oversight lacked the expertise to understand the foreign engineers' work.

  5. Netflix uses generative AI for TV visual effects for the first time

    Streaming giant Netflix says it has used generative AI to create visual effects in an original series for the first time. Co-CEO Ted Sarandos said the Argentine sci-fi series The Eternaut used generative AI to produce a shot of a building collapsing in Buenos Aires, completed 10 times faster than with traditional VFX tools. He said generative AI lets productions with limited budgets finish effects shots faster and at lower cost. Davier Yoon, co-founder of the Singapore animation studio CraveFX, believes it is only a matter of time before film and TV companies adopt generative AI, which lets even small studios produce visual effects that look big-budget. The final image, he said, is decided by the artist, not the AI.

  6. LibreOffice accuses Microsoft of locking in Office users with needlessly complex file formats

    The open-source office suite project LibreOffice has accused Microsoft of deliberately using unnecessarily complex file formats to lock in users through Microsoft 365 documents. LibreOffice documents use the OpenDocument Format (ODF), an open standard not controlled by any single company, while Microsoft uses the non-standard Office Open XML (OOXML) format. LibreOffice says OOXML contains deeply nested structures, unintuitive naming conventions, and a large number of optional elements, making it difficult for non-Microsoft developers to implement. LibreOffice compared the situation to a railway system in which the tracks are public but the signaling is so complex that competitors cannot build compatible trains.

  7. Intel ends support for Clear Linux

    The chip giant, fresh off mass layoffs and restructuring, abruptly announced it is ending support for the Clear Linux distribution. Effective immediately, Clear Linux OS will receive no further security patches, updates, or maintenance, and the project's repositories on GitHub will be switched to read-only mode; Intel advises Clear Linux users to migrate to an actively maintained distribution as soon as possible. In its statement, Intel stressed that it will continue to invest in the Linux ecosystem, actively supporting and contributing to open-source projects and Linux distributions and optimizing them for Intel hardware. Intel launched Clear Linux in 2015 to address container security, giving the project a ten-year history.