OrangeBot.AI Digest — 2025-07-29
70 headlines across 5 sources, aggregated for this day.
Hacker News (15)
- Microsoft bans LibreOffice developer's account without warning, rejects appeal (www.neowin.net)
- Maru OS – Use your phone as your PC (maruos.com)
- Learning basic electronics by building fireflies (a64.in)
- Irrelevant facts about cats added to math problems increase LLM errors by 300% (www.science.org)
- Study mode (openai.com)
- Show HN: I built an AI that turns any book into a text adventure game (www.kathaaverse.com)
- Launch HN: Hyprnote (YC S25) – An open-source AI meeting notetaker
- Observable Notebooks 2.0 Technology Preview (observablehq.com)
- My 2.5 year old laptop can write Space Invaders in JavaScript now (GLM-4.5 Air) (simonwillison.net)
- Linux Performance Analysis (2015) (netflixtechblog.com)
- Age Verification Laws Send VPN Use Soaring–and Threaten the Open Internet (www.wired.com)
- Stop selling “unlimited”, when you mean “until we change our minds” (blog.kilocode.ai)
- Nothing to watch – Experimental gallery visualizing 50k film posters (nothing-to-watch.port80.ch)
- Wikimedia Foundation Challenges UK Online Safety Act Regulations (wikimediafoundation.org)
- The EU could be scanning your chats by October 2025 (www.techradar.com)
GitHub Trending (11)
- 9001 / copyparty
Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps
- cloudwego / eino
The ultimate LLM/AI application development framework in Golang.
- n0-computer / iroh
peer-2-peer that just works
- tldr-pages / tldr
📚 Collaborative cheatsheets for console commands
- Shubhamsaboo / awesome-llm-apps
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
- microsoft / PowerToys
Windows system utilities to maximize productivity
- lapce / lapce
Lightning-fast and Powerful Code Editor written in Rust
- ashishpatel26 / 500-AI-Agents-Projects
The 500 AI Agents Projects is a curated collection of AI agent use cases across various industries. It showcases practical applications and provides links to open-source projects for implementation, illustrating how AI agents are transforming sectors such as healthcare, finance, education, retail, and more.
- linshenkx / prompt-optimizer
A prompt optimizer that helps you write high-quality prompts
- outline / outline
The fastest knowledge base for growing teams. Beautiful, realtime collaborative, feature packed, and markdown compatible.
- musistudio / claude-code-router
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
Product Hunt (14)
- Magic Patterns
Design new features with AI
- RunLLM
AI that doesn’t just respond—it resolves
- PodClips
Turn your podcasts into viral video content
- Jotform Gmail Agent
Automatically draft Gmail replies that sound just like you
- SideNotes
Quick notes on screen edge
- Wan 2.2
The first open MoE model for AI video generation
- Notion-style editor for Tiptap Cloud
Collaborative block-based editor → ready to drop in your app
- sndmyself
The simplest way to send yourself a message
- Lumo by Proton
Privacy-first AI assistant with confidential conversations
- Layout.dev
Turn ideas into prototypes — in seconds
- Edge Copilot Mode
Your AI-powered browser
- FocusPit
Imagine if Notion and Discord had a child
- Planby PRO
Build React Timeline today, fast and simple
- Immersity for Mobile
Immersity turns images into immersive videos instantly.
Hugging Face (15)
- Agentic Reinforced Policy Optimization
Large-scale reinforcement learning with verifiable rewards (RLVR) has demonstrated its effectiveness in harnessing the potential of large language models (LLMs) for single-turn reasoning tasks. In realistic reasoning scenarios, LLMs can often utilize external tools to assist in task-solving processes. However, current RL algorithms inadequately balance the models' intrinsic long-horizon reasoning capabilities and their proficiency in multi-turn tool interactions. To bridge this gap, we propose Agentic Reinforced Policy Optimization (ARPO), a novel agentic RL algorithm tailored for training multi-turn LLM-based agents. Through preliminary experiments, we observe that LLMs tend to exhibit highly uncertain behavior, characterized by an increase in the entropy distribution of generated tokens, immediately following interactions with external tools. Motivated by this observation, ARPO incorporates an entropy-based adaptive rollout mechanism, dynamically balancing global trajectory sampling and step-level sampling, thereby promoting exploration at steps with high uncertainty after tool usage. By integrating an advantage attribution estimation, ARPO enables LLMs to internalize advantage differences in stepwise tool-use interactions. Our experiments across 13 challenging benchmarks in computational reasoning, knowledge reasoning, and deep search domains demonstrate ARPO's superiority over trajectory-level RL algorithms. Remarkably, ARPO achieves improved performance using only half of the tool-use budget required by existing methods, offering a scalable solution for aligning LLM-based agents with real-time dynamic environments. Our code and datasets are released at https://github.com/dongguanting/ARPO
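The entropy-triggered branching that ARPO describes can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the distributions, the threshold value, and the `should_branch` helper are all hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy of a next-token distribution (in nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_branch(probs_before, probs_after, threshold=0.3):
    """Entropy-based branching rule in the spirit of ARPO (illustrative):
    if model uncertainty jumps after a tool response is appended to the
    context, spend extra step-level rollouts at that step instead of
    only sampling whole trajectories."""
    return entropy(probs_after) - entropy(probs_before) > threshold

# Before the tool call the model is confident; the tool output flattens
# the next-token distribution, so this step gets extra exploration.
before = [0.90, 0.05, 0.03, 0.02]
after = [0.40, 0.30, 0.20, 0.10]
print(should_branch(before, after))  # True
```

The point of the rule is purely budgetary: sampling effort is concentrated where the post-tool entropy spike indicates the policy is most uncertain.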
- ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts
Real-world user-generated short videos, especially those distributed on platforms such as WeChat Channel and TikTok, dominate the mobile internet. However, current large multimodal models lack essential temporally-structured, detailed, and in-depth video comprehension capabilities, which are the cornerstone of effective video search and recommendation, as well as emerging video applications. Understanding real-world shorts is actually challenging due to their complex visual elements, high information density in both visuals and audio, and fast pacing that focuses on emotional expression and viewpoint delivery. This requires advanced reasoning to effectively integrate multimodal information, including visual, audio, and text. In this work, we introduce ARC-Hunyuan-Video, a multimodal model that processes visual, audio, and textual signals from raw video inputs end-to-end for structured comprehension. The model is capable of multi-granularity timestamped video captioning and summarization, open-ended video question answering, temporal video grounding, and video reasoning. Leveraging high-quality data from an automated annotation pipeline, our compact 7B-parameter model is trained through a comprehensive regimen: pre-training, instruction fine-tuning, cold start, reinforcement learning (RL) post-training, and final instruction fine-tuning. Quantitative evaluations on our introduced benchmark ShortVid-Bench and qualitative comparisons demonstrate its strong performance in real-world video comprehension, and it supports zero-shot use or fine-tuning with a few samples for diverse downstream applications. The real-world production deployment of our model has yielded tangible and measurable improvements in user engagement and satisfaction, a success supported by its remarkable efficiency, with stress tests indicating an inference time of just 10 seconds for a one-minute video on an H20 GPU.
- Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning
Despite the promise of Multi-Task Learning in leveraging complementary knowledge across tasks, existing multi-task optimization (MTO) techniques remain fixated on resolving conflicts via optimizer-centric loss scaling and gradient manipulation strategies, yet fail to deliver consistent gains. In this paper, we argue that the shared representation space, where task interactions naturally occur, offers rich information and potential for operations complementary to existing optimizers, especially for facilitating inter-task complementarity, which is rarely explored in MTO. This intuition leads to Rep-MTL, which exploits representation-level task saliency to quantify interactions between task-specific optimization and shared representation learning. By steering these saliencies through entropy-based penalization and sample-wise cross-task alignment, Rep-MTL aims to mitigate negative transfer by maintaining the effective training of individual tasks instead of pure conflict-solving, while explicitly promoting complementary information sharing. Experiments are conducted on four challenging MTL benchmarks covering both task-shift and domain-shift scenarios. The results show that Rep-MTL, even paired with the basic equal weighting policy, achieves competitive performance gains with favorable efficiency. Beyond standard performance metrics, Power Law exponent analysis demonstrates Rep-MTL's efficacy in balancing task-specific learning and cross-task sharing. The project page is available at HERE.
- SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
While frontier large language models (LLMs) continue to push capability boundaries, their deployment remains confined to GPU-powered cloud infrastructure. We challenge this paradigm with SmallThinker, a family of LLMs natively designed - not adapted - for the unique constraints of local devices: weak computational power, limited memory, and slow storage. Unlike traditional approaches that mainly compress existing models built for clouds, we architect SmallThinker from the ground up to thrive within these limitations. Our innovation lies in a deployment-aware architecture that transforms constraints into design principles. First, we introduce a two-level sparse structure combining fine-grained Mixture-of-Experts (MoE) with sparse feed-forward networks, drastically reducing computational demands without sacrificing model capacity. Second, to conquer the I/O bottleneck of slow storage, we design a pre-attention router that enables our co-designed inference engine to prefetch expert parameters from storage while computing attention, effectively hiding storage latency that would otherwise cripple on-device inference. Third, for memory efficiency, we utilize a NoPE-RoPE hybrid sparse attention mechanism to slash KV cache requirements. We release SmallThinker-4B-A0.6B and SmallThinker-21B-A3B, which achieve state-of-the-art performance scores and even outperform larger LLMs. Remarkably, our co-designed system mostly eliminates the need for expensive GPU hardware: with Q4_0 quantization, both models exceed 20 tokens/s on ordinary consumer CPUs, while consuming only 1GB and 8GB of memory respectively. SmallThinker is publicly available at hf.co/PowerInfer/SmallThinker-4BA0.6B-Instruct and hf.co/PowerInfer/SmallThinker-21BA3B-Instruct.
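The latency-hiding idea behind the pre-attention router can be sketched with plain threads. This is a toy illustration, not the actual inference engine: the `sleep` calls stand in for storage reads and attention compute, and all names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_expert(expert_id):
    """Stand-in for reading MoE expert weights from slow storage."""
    time.sleep(0.05)  # simulated storage latency
    return f"weights[{expert_id}]"

def attention(x):
    """Stand-in for the attention computation of the current layer."""
    time.sleep(0.05)  # simulated compute time
    return x

def layer_with_prefetch(x, pool, expert_id):
    # Pre-attention routing means the needed expert is known *before*
    # attention runs, so the storage read can overlap the compute.
    pending = pool.submit(fetch_expert, expert_id)
    h = attention(x)
    weights = pending.result()  # usually ready by now: latency is hidden
    return h, weights

with ThreadPoolExecutor(max_workers=1) as pool:
    start = time.perf_counter()
    h, w = layer_with_prefetch("hidden", pool, expert_id=7)
    elapsed = time.perf_counter() - start

# Sequential fetch-then-compute would take ~0.10 s; overlapped, ~0.05 s.
print(w, round(elapsed, 2))
```

The design choice is the same as any prefetch pipeline: knowing the routing decision one stage early turns a serial I/O stall into work that runs concurrently with compute.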
- Reconstructing 4D Spatial Intelligence: A Survey
Reconstructing 4D spatial intelligence from visual observations has long been a central yet challenging task in computer vision, with broad real-world applications. These range from entertainment domains like movies, where the focus is often on reconstructing fundamental visual elements, to embodied AI, which emphasizes interaction modeling and physical realism. Fueled by rapid advances in 3D representations and deep learning architectures, the field has evolved quickly, outpacing the scope of previous surveys. Additionally, existing surveys rarely offer a comprehensive analysis of the hierarchical structure of 4D scene reconstruction. To address this gap, we present a new perspective that organizes existing methods into five progressive levels of 4D spatial intelligence: (1) Level 1 -- reconstruction of low-level 3D attributes (e.g., depth, pose, and point maps); (2) Level 2 -- reconstruction of 3D scene components (e.g., objects, humans, structures); (3) Level 3 -- reconstruction of 4D dynamic scenes; (4) Level 4 -- modeling of interactions among scene components; and (5) Level 5 -- incorporation of physical laws and constraints. We conclude the survey by discussing the key challenges at each level and highlighting promising directions for advancing toward even richer levels of 4D spatial intelligence. To track ongoing developments, we maintain an up-to-date project page: https://github.com/yukangcao/Awesome-4D-Spatial-Intelligence.
- A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence
Large Language Models (LLMs) have demonstrated strong capabilities but remain fundamentally static, unable to adapt their internal parameters to novel tasks, evolving knowledge domains, or dynamic interaction contexts. As LLMs are increasingly deployed in open-ended, interactive environments, this static nature has become a critical bottleneck, necessitating agents that can adaptively reason, act, and evolve in real time. This paradigm shift -- from scaling static models to developing self-evolving agents -- has sparked growing interest in architectures and methods enabling continual learning and adaptation from data, interactions, and experiences. This survey provides the first systematic and comprehensive review of self-evolving agents, organized around three foundational dimensions -- what to evolve, when to evolve, and how to evolve. We examine evolutionary mechanisms across agent components (e.g., models, memory, tools, architecture), categorize adaptation methods by stages (e.g., intra-test-time, inter-test-time), and analyze the algorithmic and architectural designs that guide evolutionary adaptation (e.g., scalar rewards, textual feedback, single-agent and multi-agent systems). Additionally, we analyze evaluation metrics and benchmarks tailored for self-evolving agents, highlight applications in domains such as coding, education, and healthcare, and identify critical challenges and research directions in safety, scalability, and co-evolutionary dynamics. By providing a structured framework for understanding and designing self-evolving agents, this survey establishes a roadmap for advancing adaptive agentic systems in both research and real-world deployments, ultimately shedding light to pave the way for the realization of Artificial Super Intelligence (ASI), where agents evolve autonomously, performing at or beyond human-level intelligence across a wide array of tasks.
- Geometric-Mean Policy Optimization
Recent advancements, such as Group Relative Policy Optimization (GRPO), have enhanced the reasoning capabilities of large language models by optimizing the arithmetic mean of token-level rewards. However, GRPO suffers from unstable policy updates when processing tokens with outlier importance-weighted rewards, which manifests as extreme importance sampling ratios during training, i.e., the ratio between the sampling probabilities assigned to a token by the current and old policies. In this work, we propose Geometric-Mean Policy Optimization (GMPO), a stabilized variant of GRPO. Instead of optimizing the arithmetic mean, GMPO maximizes the geometric mean of token-level rewards, which is inherently less sensitive to outliers and maintains a more stable range of importance sampling ratios. In addition, we provide comprehensive theoretical and experimental analysis to justify the design and stability benefits of GMPO. Beyond improved stability, GMPO-7B outperforms GRPO by an average of 4.1% on multiple mathematical benchmarks and 1.4% on multimodal reasoning benchmarks, including AIME24, AMC, MATH500, OlympiadBench, Minerva, and Geometry3K. Code is available at https://github.com/callsys/GMPO.
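The arithmetic-vs-geometric-mean distinction at the heart of GMPO can be illustrated with toy numbers. The values and helper functions below are hypothetical; the real objective operates on token-level importance-weighted rewards during policy-gradient training.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # Computed in log space for numerical stability; assumes positive values.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Toy token-level importance-weighted rewards for one sampled sequence.
# One outlier token (50.0) dominates the arithmetic mean but barely
# moves the geometric mean -- the damping property GMPO relies on.
ratios = [1.0, 1.1, 0.9, 1.0, 50.0]

print(arithmetic_mean(ratios))  # 10.8 -> update dominated by the outlier
print(geometric_mean(ratios))   # ~2.18 -> outlier's influence is damped
```

Because the geometric mean is the exponential of the average log, a single extreme term shifts it only logarithmically, which is exactly why the resulting updates stay in a narrower range.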
- GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
Recent advancements in large multimodal models like GPT-4o have set a new standard for high-fidelity, instruction-guided image editing. However, the proprietary nature of these models and their training data creates a significant barrier for open-source research. To bridge this gap, we introduce GPT-IMAGE-EDIT-1.5M, a publicly available, large-scale image-editing corpus containing more than 1.5 million high-quality triplets (instruction, source image, edited image). We systematically construct this dataset by leveraging the versatile capabilities of GPT-4o to unify and refine three popular image-editing datasets: OmniEdit, HQ-Edit, and UltraEdit. Specifically, our methodology involves 1) regenerating output images to enhance visual quality and instruction alignment, and 2) selectively rewriting prompts to improve semantic clarity. To validate the efficacy of our dataset, we fine-tune advanced open-source models on GPT-IMAGE-EDIT-1.5M. The empirical results are exciting, e.g., the fine-tuned FluxKontext achieves highly competitive performance across a comprehensive suite of benchmarks, including 7.24 on GEdit-EN, 3.80 on ImgEdit-Full, and 8.78 on Complex-Edit, showing stronger instruction following and higher perceptual quality while maintaining identity. These scores markedly exceed all previously published open-source methods and substantially narrow the gap to leading proprietary models. We hope the full release of GPT-IMAGE-EDIT-1.5M can help to catalyze further open research in instruction-guided image editing.
- Region-based Cluster Discrimination for Visual Representation Learning
Learning visual representations is foundational for a broad spectrum of downstream tasks. Although recent vision-language contrastive models, such as CLIP and SigLIP, have achieved impressive zero-shot performance via large-scale vision-language alignment, their reliance on global representations constrains their effectiveness for dense prediction tasks, such as grounding, OCR, and segmentation. To address this gap, we introduce Region-Aware Cluster Discrimination (RICE), a novel method that enhances region-level visual and OCR capabilities. We first construct a billion-scale candidate region dataset and propose a Region Transformer layer to extract rich regional semantics. We further design a unified region cluster discrimination loss that jointly supports object and OCR learning within a single classification framework, enabling efficient and scalable distributed training on large-scale data. Extensive experiments show that RICE consistently outperforms previous methods on tasks, including segmentation, dense detection, and visual perception for Multimodal Large Language Models (MLLMs). The pre-trained models have been released at https://github.com/deepglint/MVT.
- Met^2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems
The increasing frequency of extreme weather events due to global climate change demands accurate weather prediction. Recently, great advances have been made by end-to-end methods thanks to deep learning techniques, but they face limitations of representation inconsistency in multivariable integration and struggle to effectively capture the dependencies between variables that complex weather systems require. Treating different variables as distinct modalities and applying a two-stage training approach from multimodal models can partially alleviate this issue, but because the training tasks of the two stages are mismatched, the results are often suboptimal. To address these challenges, we propose an implicit two-stage training method, configuring separate encoders and decoders for each variable. In detail: in the first stage, the Translator is frozen while the Encoders and Decoders learn a shared latent space; in the second stage, the Encoders and Decoders are frozen, and the Translator captures inter-variable interactions for prediction. Besides, by introducing a self-attention mechanism for multivariable fusion in the latent space, performance improves further. Empirically, extensive experiments show the state-of-the-art performance of our method. Specifically, it reduces the MSE for near-surface air temperature and relative humidity predictions by 28.82% and 23.39%, respectively. The source code is available at https://github.com/ShremG/Met2Net.
- UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities
Recent advances in large language models (LLMs) have highlighted the potential of reinforcement learning with verifiable rewards (RLVR) to enhance reasoning capabilities through extended output sequences. However, traditional RL frameworks face inefficiencies when handling ultra-long outputs due to long-tail sequence distributions and entropy collapse during training. To address these challenges, we propose an Ultra-Long Output Reinforcement Learning (UloRL) approach for advancing large language models' reasoning abilities. Specifically, we divide ultra-long output decoding into short segments, enabling efficient training by mitigating delays caused by long-tail samples. Additionally, we introduce dynamic masking of well-Mastered Positive Tokens (MPTs) to prevent entropy collapse. Experimental results demonstrate the effectiveness of our approach. On the Qwen3-30B-A3B model, RL with segment rollout achieved a 2.06x increase in training speed, while RL training with 128k-token outputs improved the model's performance on AIME2025 from 70.9% to 85.1% and on BeyondAIME from 50.7% to 61.9%, even surpassing Qwen3-235B-A22B with remarkable gains. These findings underscore the potential of our methods to advance the reasoning capabilities of LLMs with ultra-long sequence generation. We will release our code and model for further use by the community.
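The segment-rollout idea can be sketched as a small scheduler. This is illustrative only: `toy_generate` stands in for the real decoder, and all names and budgets are hypothetical.

```python
def segment_rollout(prompts, generate, max_total=24_000, segment=8_000):
    """Segment-wise decoding in the spirit of UloRL (illustrative).

    Instead of decoding every sample to completion (so one long-tail
    sample stalls the whole batch), each pass extends every unfinished
    sequence by at most `segment` tokens; finished sequences are
    collected, unfinished ones roll over to the next pass.
    `generate(text, budget)` must return (new_text, finished).
    """
    active = {i: p for i, p in enumerate(prompts)}
    done = {}
    budget_used = 0
    while active and budget_used < max_total:
        for i in list(active):
            new_text, finished = generate(active[i], segment)
            if finished:
                done[i] = new_text
                del active[i]
            else:
                active[i] = new_text
        budget_used += segment
    return done, active  # finished outputs, plus any still-unfinished tails

# Toy decoder: appends one 'x' per pass and finishes after three passes,
# standing in for a sample that needs three segments' worth of tokens.
def toy_generate(text, budget):
    text = text + "x"
    return text, text.count("x") >= 3

done, active = segment_rollout(["a", "b"], toy_generate)
print(sorted(done.values()))  # ['axxx', 'bxxx']
```

The training-speed gain reported in the abstract comes from exactly this shape of loop: no single sequence can hold a pass hostage for longer than one segment.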
- ForCenNet: Foreground-Centric Network for Document Image Rectification
Document image rectification aims to eliminate geometric deformation in photographed documents to facilitate text recognition. However, existing methods often neglect the significance of foreground elements, which provide essential geometric references and layout information for document image correction. In this paper, we introduce Foreground-Centric Network (ForCenNet) to eliminate geometric distortions in document images. Specifically, we initially propose a foreground-centric label generation method, which extracts detailed foreground elements from an undistorted image. Then we introduce a foreground-centric mask mechanism to enhance the distinction between readable and background regions. Furthermore, we design a curvature consistency loss to leverage the detailed foreground labels to help the model understand the distorted geometric distribution. Extensive experiments demonstrate that ForCenNet achieves new state-of-the-art on four real-world benchmarks, such as DocUNet, DIR300, WarpDoc, and DocReal. Quantitative analysis shows that the proposed method effectively undistorts layout elements, such as text lines and table borders. The resources for further comparison are provided at https://github.com/caipeng328/ForCenNet.
- ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment
Perpetual 3D scene generation aims to produce long-range and coherent 3D view sequences, which is applicable for long-term video synthesis and 3D scene reconstruction. Existing methods follow a "navigate-and-imagine" fashion and rely on outpainting for successive view expansion. However, the generated view sequences suffer from semantic drift issue derived from the accumulated deviation of the outpainting module. To tackle this challenge, we propose ScenePainter, a new framework for semantically consistent 3D scene generation, which aligns the outpainter's scene-specific prior with the comprehension of the current scene. To be specific, we introduce a hierarchical graph structure dubbed SceneConceptGraph to construct relations among multi-level scene concepts, which directs the outpainter for consistent novel views and can be dynamically refined to enhance diversity. Extensive experiments demonstrate that our framework overcomes the semantic drift issue and generates more consistent and immersive 3D view sequences. Project Page: https://xiac20.github.io/ScenePainter/.
- Music Arena: Live Evaluation for Text-to-Music
We present Music Arena, an open platform for scalable human preference evaluation of text-to-music (TTM) models. Soliciting human preferences via listening studies is the gold standard for evaluation in TTM, but these studies are expensive to conduct and difficult to compare, as study protocols may differ across systems. Moreover, human preferences might help researchers align their TTM systems or improve automatic evaluation metrics, but an open and renewable source of preferences does not currently exist. We aim to fill these gaps by offering *live* evaluation for TTM. In Music Arena, real-world users input text prompts of their choosing and compare outputs from two TTM systems, and their preferences are used to compile a leaderboard. While Music Arena follows recent evaluation trends in other AI domains, we also design it with key features tailored to music: an LLM-based routing system to navigate the heterogeneous type signatures of TTM systems, and the collection of *detailed* preferences including listening data and natural language feedback. We also propose a rolling data release policy with user privacy guarantees, providing a renewable source of preference data and increasing platform transparency. Through its standardized evaluation protocol, transparent data access policies, and music-specific features, Music Arena not only addresses key challenges in the TTM ecosystem but also demonstrates how live evaluation can be thoughtfully adapted to unique characteristics of specific AI domains. Music Arena is available at: https://music-arena.org
- Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty
When language models (LMs) are trained via reinforcement learning (RL) to generate natural language "reasoning chains", their performance improves on a variety of difficult question answering tasks. Today, almost all successful applications of RL for reasoning use binary reward functions that evaluate the correctness of LM outputs. Because such reward functions do not penalize guessing or low-confidence outputs, they often have the unintended side-effect of degrading calibration and increasing the rate at which LMs generate incorrect responses (or "hallucinate") in other problem domains. This paper describes RLCR (Reinforcement Learning with Calibration Rewards), an approach to training reasoning models that jointly improves accuracy and calibrated confidence estimation. During RLCR, LMs generate both predictions and numerical confidence estimates after reasoning. They are trained to optimize a reward function that augments a binary correctness score with a Brier score -- a scoring rule for confidence estimates that incentivizes calibrated prediction. We first prove that this reward function (or any analogous reward function that uses a bounded, proper scoring rule) yields models whose predictions are both accurate and well-calibrated. We next show that across diverse datasets, RLCR substantially improves calibration with no loss in accuracy, on both in-domain and out-of-domain evaluations -- outperforming both ordinary RL training and classifiers trained to assign post-hoc confidence scores. While ordinary RL hurts calibration, RLCR improves it. Finally, we demonstrate that verbalized confidence can be leveraged at test time to improve accuracy and calibration via confidence-weighted scaling methods. Our results show that explicitly optimizing for calibration can produce more generally reliable reasoning models.
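The calibration incentive behind RLCR can be sketched with a toy reward. The exact weighting below is an assumption for illustration; the paper's claim is that any bounded, proper scoring rule yields the same property.

```python
def rlcr_reward(correct: bool, confidence: float) -> float:
    """Binary correctness augmented with a Brier-style calibration term.

    The Brier score (confidence - outcome)^2 is a proper scoring rule:
    in expectation it is optimized only by reporting the true probability
    of being correct. The weighting here is an illustrative assumption.
    """
    outcome = 1.0 if correct else 0.0
    return outcome - (confidence - outcome) ** 2

def expected_reward(p_correct: float, confidence: float) -> float:
    """Expected reward when the model is right with probability p_correct."""
    return (p_correct * rlcr_reward(True, confidence)
            + (1 - p_correct) * rlcr_reward(False, confidence))

# A proper scoring rule rewards honest confidence: if the model is right
# 70% of the time on a class of problems, reporting 0.7 maximizes the
# expected reward -- over- or under-claiming lowers it.
best = max((c / 100 for c in range(101)),
           key=lambda c: expected_reward(0.7, c))
print(best)  # 0.7 -- truthful confidence is optimal
```

This is why the reward does not just tolerate confidence reporting but actively trains it: guessing with confidence 1.0 on a 70%-solvable problem is strictly worse in expectation than stating 0.7.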
Solidot (15)
- Economist says Norwegians are too rich and too comfortable
In his new book The Country That Became Too Rich, Norwegian economist Martin Bech Holte turns his criticism on his own country, arguing that Norway has grown so wealthy that its economic health has suffered. Norway's sovereign wealth fund has reached $2 trillion, equivalent to $340,000 per capita, yet over the past two decades Norway has had the lowest productivity growth among rich countries, and Norwegians take as many as 27.5 sick days per year, the most in the OECD. Norway spends $20,000 per capita on education, against an OECD average of $14,000, but Norwegian students' test scores have declined steadily since 2015 and are now below the OECD average. Withdrawals from the sovereign fund now make up more than a fifth of the annual national budget, up from less than a tenth two decades ago.
- Chinese universities encourage students to use AI
Two years ago, Chinese universities warned students not to use AI in their coursework. Today they have reversed course, encouraging students to use AI as long as they follow best practices. According to a survey by the MyCOS Institute, generative AI is now nearly universal: only 1% of students and faculty said they had never used AI for study or work, and nearly 60% of respondents reported using AI daily or several times a week. With the popularity of DeepSeek, generative AI is increasingly seen as a source of national pride, and campus discussion has shifted from worries about AI's impact on academic integrity toward encouraging students to build AI literacy, raise productivity, and stay ahead. A survey by Stanford's Institute for Human-Centered Artificial Intelligence (HAI) found that China leads the world in enthusiasm for AI, with 80% of people excited about it, compared with just 35% in the US and 38% in the UK. An MIT Technology Review survey of the AI strategies of 46 top Chinese universities found that nearly all of them introduced interdisciplinary general-education AI courses and AI-related degrees over the past year. Tsinghua University, Renmin University, Nanjing University, and Fudan University, among others, have launched AI literacy courses and degrees open to all students, not just computer science majors. In April 2025 the Ministry of Education issued national "AI + education" guidelines calling for comprehensive reform.
- Minors took part in making the Terracotta Warriors
Archaeologists at the Emperor Qinshihuang's Mausoleum Site Museum used ultra-depth-of-field microscopy to capture clear fingerprint impressions left more than 2,000 years ago, extracting over 100 fingerprints from more than 40 restored terracotta figures. The research shows that minors were among the sculptors of the Terracotta Warriors. According to museum researcher Li Xiaoxi, analysis and comparison of the fingerprints yielded information about the potters' age structure and sex ratio. Preliminary analysis shows that the vast majority of the fingerprints belong to adult men, consistent with the traditional understanding, but a small number of fingerprints left by minors were also found. Which stages of production they took part in, and how the division of labor differed, will require further study.
- Beijing Firefox to stop operating Firefox's mainland China business from September 29
Beijing Mozilla Firefox Information Technology Co., Ltd. (Beijing Firefox) announced that on May 8, 2025, Mozilla and Beijing Firefox agreed that Beijing Firefox will no longer operate the Firefox browser or any Firefox-related business in mainland China. From midnight at the end of September 29, 2025, the official Firefox China website (firefox.com.cn), the Firefox China community site (mozilla.com.cn), the Firefox account service (accounts.firefox.com.cn), and the Firefox China homepage (home.firefoxchina.cn) will cease operation and all related functionality will be terminated. Effective immediately, www.firefox.com.cn no longer offers Firefox browser downloads. Chinese Firefox users need to sync data from Beijing Firefox's servers to their local devices before September 29, 2025; after that date all data will be deleted. Mozilla Firefox users are unaffected.
- First ghost planetary nebula discovered in the Milky Way
A nebula dubbed SDSO1 has been found close on the sky to the Andromeda Galaxy. According to a preprint, SDSO1 may not belong to Andromeda at all but is instead an ancient planetary nebula inside the Milky Way, named a "Ghost Planetary Nebula" (GPN). The study was carried out jointly by several amateur astrophotographers. From the bow shock and tail structures visible in the images, and by comparison with the position and properties of the symbiotic binary EG Andromedae, about 600 light-years from Earth within the Milky Way, the team concluded that the gas cloud formed from a shell ejected by that binary system hundreds of thousands of years ago, making it a planetary nebula. The analysis shows EG Andromedae moving through the interstellar medium at about 107 km/s, producing the spectacular bow structure and a tail some 45 light-years long. Planetary nebulae at this late evolutionary stage are extremely tenuous and faint, becoming visible only through high-speed interaction with the surrounding medium, which makes SDSO1 the first proposed ghost planetary nebula. The team also identified seven other candidate ghost planetary nebulae with similar features in the Milky Way, suggesting such ancient objects may be more common than previously thought.
- Can education slow cognitive decline?
According to a study published in Nature Medicine, education does not slow cognitive decline. Analysis of 407,356 episodic memory scores from 170,795 participants aged 50 and over, plus 15,157 brain MRI scans from 6,472 people, found that more education is associated with better memory function and larger intracranial volume, but does not hold back the toll the years take on the brain. Regardless of educational attainment, brains shrink at a similar pace and cognitive abilities decline in step. Education gives you a better starting line, but it does not change the pace of the marathon of aging. Study may not keep you young, but it can at least keep your mind sharper as you grow old.
- Security researchers find the SkyRover X1 is a rebranded DJI product
Drones from Shenzhen-based DJI face an informal ban at US customs. To get around it, DJI has been found selling its products under other brand names. A $750 drone sold on Amazon as the SkyRover X1 turns out to be DJI's Mini 4 Pro. The SkyRover X1 has exactly the same specifications and features as the Mini 4 Pro and connects directly to DJI's network infrastructure, including DJIGlobal, DJISupport, and DJIEnterprise. Hacker Kevin Finisterre successfully logged into SkyRover's systems using DJI credentials, and security researcher Jon Sawyer found that SkyRover's app uses the same encryption keys as DJI's software.
- Human tissue proteins show accelerated aging around age 50
A Chinese Academy of Sciences team analyzing proteins in human tissues found that early aging begins around age 30, with aging accelerating between ages 45 and 55. Loss of protein homeostasis is one of the hallmark molecular features of aging. Deep analysis identified age 30 as the initial watershed in the aging trajectory: adrenal tissue is the first to show signs of aging, suggesting endocrine imbalance as an early driver, while the aorta shows a homeostatic shift at the same time, reinforcing its role as an aging "sentinel". Ages 45 to 55 were confirmed as the milestone turning point, when the proteomes of most organs undergo a "molecular cascade storm" with an explosive surge in differentially expressed proteins, marking the key biological transition window for systemic multi-organ aging. The aortic proteome is remodeled most dramatically during this process, and its secretome co-evolves strongly with the circulating plasma proteome, suggesting that aging-related secreted factors may be the hub mechanism mediating the systemic spread of aging signals.
- Samsung One UI 8 blocks bootloader unlocking
The beta version of Samsung's One UI 8 has been found to remove the Bootloader Unlock option, meaning users will no longer be able to unlock the bootloader to install third-party ROMs. Most users will never notice or care whether the bootloader can be unlocked, but for the developer community this is a major blow to modding freedom.
- Sony accuses Tencent of copying its Horizon game series
Sony Interactive Entertainment sued Tencent in a California court on July 25, accusing it of copying the Horizon game series and alleging copyright and trademark infringement. Tencent studio Polaris Quest recently released a trailer for Light of Motiram, a game whose setting is all but identical to the Horizon series (Horizon Zero Dawn and Horizon Forbidden West) developed by Sony studio Guerrilla Games: both take place in a post-apocalyptic world overrun by machines, with a heroine in hunter-like garb battling mechanical creatures. Sony called Tencent's copying brazen and shameless, and said Tencent had previously sought a license to the Horizon IP but was refused.
- UK VPN use surges after the Online Safety Act takes effect
Under the Online Safety Act, UK users must verify their age to access adult content, a requirement that has driven a surge in VPN searches and sign-ups in the UK. ProtonVPN said its UK sign-ups grew by more than 1,400% after the age-verification requirement took effect, and the surge is not a short-lived spike; VPN services also top the UK Apple App Store charts. Businesses that fail to comply with age verification face fines of up to £18 million or 10% of global annual revenue.
- Norway begins storing liquefied CO2 under the seabed
To combat global climate change, Norway has begun burying liquefied carbon dioxide in a layer of spongy rock a mile and a half beneath the seabed. The Norwegian government is covering 80% of the $1 billion cost of the project's first phase; three major oil companies, Shell, Equinor, and TotalEnergies, are funding $714 million for its continued expansion, with the EU contributing $150 million in subsidies. As Europe's largest oil and gas producer, Norway is using its petroleum revenue to test whether "carbon dumping" can work. If all goes well, the three oil companies say their facilities could pump 5 million metric tons of CO2 into the seabed each year, about a tenth of Norway's annual emissions. The liquefied CO2 is delivered by a carrier ship named Northern Pioneer, which can carry 7,500 metric tons per trip.
- FFmpeg 8.0 expected in late August
Michael Niedermayer announced on the mailing list that FFmpeg 8.0 is slated for release in late August, about 17 months after the last major release. FFmpeg 8.0 introduces new decoders for RealVideo 6.0, ADPCM IMA Xbox, G.728, and Sanyo LD-ADPCM; an APV decoder for Samsung Advanced Professional Video; encoding support for APV, animated JPEG XL, and libx265 alpha layers; OpenHarmony encode and decode support; VVC/H.266 support in the Video Acceleration API (VA-API); AVX-512 optimizations, FFV1 improvements, an AV1 RTP packetizer/depacketizer, an AMD AMF decoder, Vulkan video enhancements, better HDR video support, and more.
- Tencent releases HunyuanWorld 1.0 model
Creating immersive and interactive 3D worlds from text or images has long been a core challenge in computer vision and graphics. Existing world-generation methods fall into two categories: video-based methods offer rich diversity but lack 3D consistency and render inefficiently, while 3D-geometry-based methods guarantee geometric consistency but are limited by scarce training data and memory-inefficient representations. To overcome these limitations, Tencent's developers propose HunyuanWorld 1.0, a framework that combines the strengths of both to generate immersive, explorable, and interactive 3D worlds conditioned on text and images. The method has three core advantages: (1) 360° immersive experiences via panoramic world proxies; (2) mesh export for seamless compatibility with existing computer graphics pipelines; and (3) disentangled object representations for enhanced interactivity. At its core is a semantically layered 3D mesh representation that treats panoramic images as 360° world proxies for semantics-aware world decomposition and reconstruction, generating diverse 3D scenes. Extensive experiments show the method achieves state-of-the-art performance in generating coherent, explorable, and interactive 3D worlds, with broad applications in virtual reality, physical simulation, game development, and interactive content creation.
- Supply-chain attacks on open-source software spiral out of control
Security firm Socket reported three supply-chain attacks against open-source software last week. The attackers mainly steal developer accounts and then upload packages containing malicious code, pushing malware to unsuspecting users. Toptal's GitHub organization account was compromised and used to publish malicious packages on npm; in all, ten Toptal npm packages contained malicious code and were downloaded by 5,000 users before being discovered. Socket also reported a supply-chain attack on npm and PyPI users in which malicious packages were downloaded more than 56,000 times in total; the malware's capabilities included keylogging, screenshots, fingerprinting, webcam access, and credential theft. The third attack Socket reported used phishing emails to steal developer accounts and publish three malicious packages on npm. Because of package dependency chains, supply-chain attacks tend to cause damage on a far larger scale.