OrangeBot.AI Digest — 2025-08-06

75 headlines across 5 sources, aggregated for the day.

Hacker News (15)

  1. Jules, our asynchronous coding agent (blog.google)
  2. Qwen3-4B-Thinking-2507 (huggingface.co)
  3. Writing a Rust GPU kernel driver: a brief introduction on how GPU drivers work (www.collabora.com)
  4. Dotfiles feel too personal to share (hamatti.org)
  5. Google suffers data breach in ongoing Salesforce data theft attacks (www.bleepingcomputer.com)
  6. EU proposal to scan all private messages gains momentum (cointelegraph.com)
  7. Constitution of the United States Website has removed sections (reddit.com)
  8. Claude Code IDE integration for Emacs (github.com)
  9. Automerge 3.0 (automerge.org)
  10. NautilusTrader: Open-source algorithmic trading platform (nautilustrader.io)
  11. LLM Inflation (tratt.net)
  12. Japan: Apple Must Lift Browser Engine Ban by December (open-web-advocacy.org)
  13. Python performance myths and fairy tales (lwn.net)
  14. I gave the AI arms and legs then it rejected me (grell.dev)
  15. Teacher AI use is already out of control and it's not ok (www.reddit.com)

GitHub Trending (15)

  1. nautechsystems / nautilus_trader

    A high-performance algorithmic trading platform and event-driven backtester

  2. dyad-sh / dyad

    Free, local, open-source AI app builder ✨ v0 / lovable / Bolt alternative 🌟 Star if you like it!

  3. simstudioai / sim

    Sim is an open-source AI agent workflow builder. Sim Studio's interface is a lightweight, intuitive way to quickly build and deploy LLMs that connect with your favorite tools.

  4. browserbase / stagehand

    The AI Browser Automation Framework

  5. python-poetry / poetry

    Python packaging and dependency management made easy

  6. blakeblackshear / frigate

    NVR with realtime local object detection for IP cameras

  7. ethereum / solidity

    Solidity, the Smart Contract Programming Language

  8. openssl / openssl

    TLS/SSL and crypto library

  9. themactep / thingino-firmware

    Open-source firmware for Ingenic SoC IP cameras

  10. dstotijn / hetty

    An HTTP toolkit for security research.

  11. MaaAssistantArknights / MaaAssistantArknights

    An Arknights assistant: one click for all your dailies! | A one-click tool for the daily tasks of Arknights, supporting all clients.

  12. JetBrains / intellij-community

    IntelliJ IDEA & IntelliJ Platform

  13. open-edge-platform / anomalib

    An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.

  14. confident-ai / deepeval

    The LLM Evaluation Framework

  15. prisma / prisma

    Next-generation ORM for Node.js & TypeScript | PostgreSQL, MySQL, MariaDB, SQL Server, SQLite, MongoDB and CockroachDB

Product Hunt (15)

  1. SpeedVitals RUM

    Monitor real-user performance & web analytics

  2. Eleven Music

    The highest quality AI music model

  3. Bifrost

    The fastest LLM gateway in the market

  4. OpenAI Open Models

    gpt-oss-120b and gpt-oss-20b open-weight language models

  5. Sequoia Health 2.0

    Personalized men's sexual health with on-demand experts

  6. Genie 3

    A new frontier for world models

  7. Shipper.now

    Build full-stack apps by talking to AI

  8. Coverage Cat

    Your AI-native insurance broker

  9. Equip AI Interview

    Ask any question. Natural language response. Instant results

  10. Source Public Beta

    Built for B2B marketers. AI-powered attribution made simple

  11. Leadchee

    Smarter CRM for startups

  12. Agent Maya (by Flow AI)

    Turn LinkedIn messages into sales calls

  13. IMGPT

    AI Ads that don't feel AI generated

  14. SiteAssist

    Build AI Assistants trained only on your own data

  15. SuperPrompt 2.0

    Save & paste your AI prompts without switching tabs

Hugging Face (15)

  1. Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

    We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion, offering remarkably fast inference. Thanks to non-sequential, parallel generation, discrete diffusion models provide a notable speedup that mitigates the inherent latency of token-by-token decoding, as demonstrated recently by, e.g., Mercury Coder and Gemini Diffusion. Seed Diffusion Preview achieves an inference speed of 2,146 tokens/s on H20 GPUs while maintaining competitive performance across a sweep of standard code evaluation benchmarks, significantly faster than contemporary Mercury and Gemini Diffusion, establishing a new state of the art on the speed-quality Pareto frontier for code models.
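
The latency argument is easy to make concrete: an autoregressive decoder needs one forward pass per generated token, while a diffusion decoder needs a fixed number of denoising passes over all positions at once, regardless of length. A toy cost model (the step counts below are illustrative assumptions, not measurements from the paper):

```python
def autoregressive_passes(num_tokens):
    """Sequential decoding: one forward pass per generated token."""
    return num_tokens

def diffusion_passes(num_tokens, denoise_steps=16):
    """Parallel decoding: a fixed number of denoising passes, each
    refining every token position simultaneously, independent of length."""
    return denoise_steps

# For a 512-token completion the sequential decoder runs 512 passes,
# while the parallel decoder runs only its 16 denoising steps.
assert autoregressive_passes(512) == 512
assert diffusion_passes(512) == 16
```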

  2. Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation

    We introduce Skywork UniPic, a 1.5 billion-parameter autoregressive model that unifies image understanding, text-to-image generation, and image editing within a single architecture, eliminating the need for task-specific adapters or inter-module connectors, and demonstrate that compact multimodal systems can achieve state-of-the-art performance on commodity hardware. Skywork UniPic achieves a GenEval score of 0.86, surpassing most existing unified models; sets a new DPG-Bench complex-generation record of 85.5; attains 5.83 on GEditBench-EN and 3.49 on ImgEdit-Bench for image editing; and generates 1024 x 1024 images with under 15 GB of GPU memory (e.g., on an RTX 4090). Three designs underpin these results: (1) a decoupled encoding strategy that leverages a masked autoregressive encoder for synthesis and a SigLIP2 encoder for understanding, all feeding a shared autoregressive decoder; (2) a progressive, resolution-aware training schedule scaling from 256 x 256 to 1024 x 1024 while dynamically unfreezing parameters to balance capacity and stability; and (3) meticulously curated, 100 million-scale datasets augmented with task-specific reward models to refine generation and editing objectives. By demonstrating that high-fidelity multimodal integration need not incur prohibitive resource demands, Skywork UniPic establishes a practical paradigm for deployable, high-fidelity multimodal AI. Code and weights are publicly available at https://huggingface.co/Skywork/Skywork-UniPic-1.5B.

  3. LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation

    Controllable ultra-long video generation is a fundamental yet challenging task. Although existing methods are effective for short clips, they struggle to scale due to issues such as temporal inconsistency and visual degradation. In this paper, we initially investigate and identify three key factors: separate noise initialization, independent control signal normalization, and the limitations of single-modality guidance. To address these issues, we propose LongVie, an end-to-end autoregressive framework for controllable long video generation. LongVie introduces two core designs to ensure temporal consistency: 1) a unified noise initialization strategy that maintains consistent generation across clips, and 2) global control signal normalization that enforces alignment in the control space throughout the entire video. To mitigate visual degradation, LongVie employs 3) a multi-modal control framework that integrates both dense (e.g., depth maps) and sparse (e.g., keypoints) control signals, complemented by 4) a degradation-aware training strategy that adaptively balances modality contributions over time to preserve visual quality. We also introduce LongVGenBench, a comprehensive benchmark consisting of 100 high-resolution videos spanning diverse real-world and synthetic environments, each lasting over one minute. Extensive experiments show that LongVie achieves state-of-the-art performance in long-range controllability, consistency, and quality.
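
The two consistency designs, shared noise initialization and global (rather than per-clip) control-signal normalization, can be illustrated with a minimal numpy sketch. The shapes and function names here are illustrative assumptions, not the paper's code:

```python
import numpy as np

def init_shared_noise(shape, seed=0):
    """Unified noise initialization: every clip starts from the same
    latent noise, drawn once from a fixed seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

def normalize_controls_globally(control_clips):
    """Global control-signal normalization: scale every clip with
    statistics computed over the *entire* video, so the control space
    stays aligned across clip boundaries."""
    all_vals = np.concatenate([c.ravel() for c in control_clips])
    lo, hi = all_vals.min(), all_vals.max()
    return [(c - lo) / (hi - lo + 1e-8) for c in control_clips]

# Two clips of a depth-map control signal with different local ranges.
clips = [np.array([[0.0, 2.0]]), np.array([[2.0, 4.0]])]
normed = normalize_controls_globally(clips)
# Both clips are mapped with the same global [0, 4] range, so the
# value 2.0 normalizes identically whichever clip it appears in.
```

Per-clip normalization would instead map each clip to its own [0, 1] range, producing the control-space jumps at clip boundaries that the paper identifies as a source of temporal inconsistency.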

  4. CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

    Answer verification is crucial not only for evaluating large language models (LLMs) by matching their unstructured outputs against standard answers, but also serves as the reward model that guides LLM optimization. Most evaluation frameworks rely on regex-based matching or employ general LLMs for answer verification, which demands extensive, repetitive customization of regex rules or evaluation prompts. Two fundamental limitations persist in current methodologies: 1) the absence of comprehensive benchmarks that systematically evaluate verification capabilities across different LLMs; and 2) the nascent stage of verifier development, where existing approaches lack both the robustness to handle complex edge cases and the generalizability across different domains. In this work, we develop CompassVerifier, an accurate and robust lightweight verifier model for evaluation and outcome reward. It demonstrates multi-domain competency spanning math, knowledge, and diverse reasoning tasks, with the capability to process various answer types, including multi-subproblem, formula, and sequence answers, while effectively identifying abnormal or invalid responses. We introduce the VerifierBench benchmark, comprising model outputs collected from multiple data sources and augmented through manual analysis of meta-error patterns to enhance CompassVerifier. We anticipate that CompassVerifier and VerifierBench will facilitate answer verification, evaluation protocols, and reinforcement learning research. Code and dataset are available at https://github.com/open-compass/CompassVerifier.
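
The rule-based matching the abstract criticizes can be sketched in a few lines; the extraction pattern below is an illustrative assumption. The point is that each new answer format needs another hand-written rule, which is exactly the repetitive customization a learned verifier is meant to avoid:

```python
import re

def regex_verify(model_output, gold):
    """Rule-based answer verification: extract a final answer with a
    hand-written pattern, normalize it, and string-compare against the
    gold answer. Every unseen output format needs another pattern."""
    m = re.search(r"(?:answer is|Answer:)\s*([^\s.]+)", model_output)
    if m is None:
        return False
    extracted = m.group(1).strip().lower()
    return extracted == gold.strip().lower()

assert regex_verify("Reasoning... The answer is 42.", "42")
assert not regex_verify("It equals forty-two.", "42")   # format the rule misses
```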

  5. Tool-integrated Reinforcement Learning for Repo Deep Search

    Issue localization, the process of identifying code locations that need modification to resolve software issues, is a critical yet challenging task in software development. The semantic gap between natural language issue descriptions and faulty code requires complex multi-hop reasoning through code dependencies. Existing LLM-based agents attempt to address this by integrating repository retrieval tools. However, this transforms issue localization into a demanding task we call Repo Deep Search, which requires the LLM to effectively utilize various repository retrieval tools throughout a multi-step reasoning and navigation process. To tackle this challenge, we present ToolTrain, a two-stage tool-integrated training framework combining rejection-sampled supervised fine-tuning and tool-integrated reinforcement learning to enhance LLMs' ability to use retrieval tools for issue localization. Experimental results show that ToolTrain-trained models achieve state-of-the-art performance, with our 32B model even surpassing Claude-3.7 on function-level localization. The results also show that improved localization performance translates to better end-to-end issue resolution performance. This further demonstrates that training for issue localization is a viable and effective strategy for improving automated software development.

  6. Representation Shift: Unifying Token Compression with FlashAttention

    Transformers have demonstrated remarkable success across vision, language, and video. Yet, increasing task complexity has led to larger models and more tokens, raising the quadratic cost of self-attention and the overhead of GPU memory access. To reduce the computation cost of self-attention, prior work has proposed token compression techniques that drop redundant or less informative tokens. Meanwhile, fused attention kernels such as FlashAttention have been developed to alleviate memory overhead by avoiding attention map construction and its associated I/O to HBM. This, however, makes it incompatible with most training-free token compression methods, which rely on attention maps to determine token importance. Here, we propose Representation Shift, a training-free, model-agnostic metric that measures the degree of change in each token's representation. This seamlessly integrates token compression with FlashAttention, without attention maps or retraining. Our method further generalizes beyond Transformers to CNNs and state space models. Extensive experiments show that Representation Shift enables effective token compression compatible with FlashAttention, yielding significant speedups of up to 5.5% and 4.4% in video-text retrieval and video QA, respectively. Code is available at https://github.com/mlvlab/Representation-Shift.
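
The core metric is simple enough to sketch: score each token by how much its representation changes through a block, then drop the lowest-shift tokens. This is a hedged numpy illustration of the idea, not the released code; the toy data stands in for real hidden states:

```python
import numpy as np

def representation_shift(h_in, h_out):
    """Per-token shift: L2 distance between a token's representation
    before and after a block. No attention map is required, which is
    what keeps the metric compatible with fused kernels like
    FlashAttention."""
    return np.linalg.norm(h_out - h_in, axis=-1)

def compress_tokens(h_in, h_out, keep):
    """Keep the `keep` tokens whose representations changed the most;
    low-shift tokens are treated as redundant and dropped."""
    shift = representation_shift(h_in, h_out)
    idx = np.argsort(shift)[::-1][:keep]   # highest-shift tokens
    return np.sort(idx)                    # preserve sequence order

rng = np.random.default_rng(0)
h_in = rng.standard_normal((6, 8))         # 6 tokens, hidden dim 8
h_out = h_in.copy()
h_out[[1, 4]] += 1.0                       # only tokens 1 and 4 change
kept = compress_tokens(h_in, h_out, keep=2)   # → indices [1, 4]
```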

  7. CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search

    Approximate nearest-neighbor search (ANNS) algorithms have become increasingly critical for recent AI applications, particularly in retrieval-augmented generation (RAG) and agent-based LLM applications. In this paper, we present CRINN, a new paradigm for ANNS algorithms. CRINN treats ANNS optimization as a reinforcement learning problem in which execution speed serves as the reward signal. This approach enables the automatic generation of progressively faster ANNS implementations while maintaining accuracy constraints. Our experimental evaluation demonstrates CRINN's effectiveness across six widely used ANNS benchmark datasets. Compared against state-of-the-art open-source ANNS algorithms, CRINN achieves the best performance on three of them (GIST-960-Euclidean, MNIST-784-Euclidean, and GloVe-25-angular) and ties for first place on two (SIFT-128-Euclidean and GloVe-25-angular). The implications of CRINN's success reach well beyond ANNS optimization: it validates that LLMs augmented with reinforcement learning can function as an effective tool for automating sophisticated algorithmic optimizations that demand specialized knowledge and labor-intensive manual refinement. Code can be found at https://github.com/deepreinforce-ai/CRINN.
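
Speed-as-reward under an accuracy constraint can be sketched as a simple reward function: a candidate implementation earns its measured throughput as reward only if it meets a recall floor. The signature, thresholds, and scaling below are illustrative assumptions, not CRINN's actual reward:

```python
def anns_reward(queries_per_second, recall_at_10, recall_floor=0.9):
    """Reward for a candidate ANNS implementation: execution speed is
    the signal, but a candidate that breaks the accuracy constraint
    gets a strongly negative reward, so the policy cannot trade recall
    away for raw speed."""
    if recall_at_10 < recall_floor:
        return -1.0
    return queries_per_second / 1000.0   # scale throughput into a small range

# A fast but inaccurate candidate is rejected outright ...
assert anns_reward(50_000, 0.85) == -1.0
# ... while a slower candidate that keeps recall above the floor scores positively.
assert anns_reward(20_000, 0.95) == 20.0
```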

  8. The Promise of RL for Autoregressive Image Editing

    We explore three strategies to enhance performance on a wide range of image editing tasks: supervised fine-tuning (SFT), reinforcement learning (RL), and Chain-of-Thought (CoT) reasoning. In order to study all these components in one consistent framework, we adopt an autoregressive multimodal model that processes textual and visual tokens in a unified manner. We find RL combined with a large multi-modal LLM verifier to be the most effective of these strategies. As a result, we release EARL: Editing with Autoregression and RL, a strong RL-based image editing model that performs competitively on a diverse range of edits compared to strong baselines, despite using much less training data. Thus, EARL pushes the frontier of autoregressive multimodal models on image editing. We release our code, training data, and trained models at https://github.com/mair-lab/EARL.

  9. Multi-human Interactive Talking Dataset

    Existing studies on talking video generation have predominantly focused on single-person monologues or isolated facial animations, limiting their applicability to realistic multi-human interactions. To bridge this gap, we introduce MIT, a large-scale dataset specifically designed for multi-human talking video generation. To this end, we develop an automatic pipeline that collects and annotates multi-person conversational videos. The resulting dataset comprises 12 hours of high-resolution footage, each clip featuring two to four speakers, with fine-grained annotations of body poses and speech interactions. It captures natural conversational dynamics in multi-speaker scenarios, offering a rich resource for studying interactive visual behaviors. To demonstrate the potential of MIT, we further propose CovOG, a baseline model for this novel task. It integrates a Multi-Human Pose Encoder (MPE) to handle varying numbers of speakers by aggregating individual pose embeddings, and an Interactive Audio Driver (IAD) to modulate head dynamics based on speaker-specific audio features. Together, these components showcase the feasibility and challenges of generating realistic multi-human talking videos, establishing MIT as a valuable benchmark for future research. The code is available at: https://github.com/showlab/Multi-human-Talking-Video-Dataset.

  10. Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction

    We introduce Goedel-Prover-V2, a series of open-source language models that set a new state-of-the-art in automated theorem proving. Built on the standard expert iteration and reinforcement learning pipeline, our approach incorporates three key innovations: (1) Scaffolded data synthesis: We generate synthetic tasks of increasing difficulty to train the model to master increasingly complex theorems; (2) Verifier-guided self-correction: We enable the model to iteratively revise its proofs by leveraging feedback from the Lean compiler; (3) Model averaging: We merge model checkpoints to mitigate the decrease in model output diversity in later stages of training. Our small model, Goedel-Prover-V2-8B, reaches 84.6% pass@32 on MiniF2F and outperforms DeepSeek-Prover-V2-671B under the same metric, despite being 80X smaller. Our flagship model, Goedel-Prover-V2-32B, achieves 88.1% on MiniF2F at pass@32 in standard mode and 90.4% in self-correction mode, outperforming prior SOTA by a large margin. Additionally, our flagship model solves 86 problems on PutnamBench at pass@184, securing the first place among open-source models on the leaderboard, surpassing DeepSeek-Prover-V2-671B's record of solving 47 problems by pass@1024 with a significantly smaller model size and compute budget. At the time of its release (July-August 2025), Goedel-Prover-V2 achieves the strongest overall performance among all open-source theorem provers. It also ranks among the top-performing models--including closed-source systems with publicly reported performance--under a constrained test-time compute budget. Our models, code, and data are released at https://github.com/Goedel-LM/Goedel-Prover-V2.
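
Of the three innovations, checkpoint averaging is the easiest to illustrate: merge parameter tensors element-wise across late-stage checkpoints to counteract the loss of output diversity. A minimal sketch with plain dicts standing in for model state, not the authors' code:

```python
import numpy as np

def average_checkpoints(state_dicts):
    """Element-wise mean of parameter tensors across checkpoints.
    Averaging late-stage checkpoints mitigates the collapse in output
    diversity that the later stages of RL training tend to cause."""
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

ckpt_a = {"w": np.array([1.0, 3.0])}
ckpt_b = {"w": np.array([3.0, 5.0])}
merged = average_checkpoints([ckpt_a, ckpt_b])   # w -> [2.0, 4.0]
```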

  11. LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?

    With the rapid development of Model Context Protocol (MCP), the number of MCP servers has surpassed 10,000. However, existing MCP benchmarks are limited to single-server settings with only a few tools, hindering effective evaluation of agent capabilities in large-scale, real-world scenarios. To address this limitation, we present LiveMCPBench, the first comprehensive benchmark comprising 95 real-world tasks grounded in the MCP ecosystem, designed to evaluate LLM agents at scale across diverse servers. To support a scalable and reproducible evaluation pipeline in large-scale MCP environments, we curate LiveMCPTool, a diverse and readily deployable collection of 70 MCP servers and 527 tools. Furthermore, we introduce LiveMCPEval, an LLM-as-a-Judge framework that enables automated and adaptive evaluation in dynamic, time-varying task environments, achieving 81% agreement with human reviewers. Finally, we propose the MCP Copilot Agent, a multi-step agent that routes tools for dynamic planning and executes tools for API interaction across the entire LiveMCPTool suite. Our evaluation covers 10 leading models, with the best-performing model (Claude-Sonnet-4) reaching a 78.95% success rate. However, we observe large performance variance across models, and several widely-used models perform poorly in LiveMCPBench's complex, tool-rich environments. Overall, LiveMCPBench offers the first unified framework for benchmarking LLM agents in realistic, tool-rich, and dynamic MCP environments, laying a solid foundation for scalable and reproducible research on agent capabilities. Our code and data will be publicly available at https://icip-cas.github.io/LiveMCPBench.

  12. LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer

    In controllable image synthesis, generating coherent and consistent images from multiple references with spatial layout awareness remains an open challenge. We present LAMIC, a Layout-Aware Multi-Image Composition framework that, for the first time, extends single-reference diffusion models to multi-reference scenarios in a training-free manner. Built upon the MMDiT model, LAMIC introduces two plug-and-play attention mechanisms: 1) Group Isolation Attention (GIA) to enhance entity disentanglement; and 2) Region-Modulated Attention (RMA) to enable layout-aware generation. To comprehensively evaluate model capabilities, we further introduce three metrics: 1) Inclusion Ratio (IN-R) and Fill Ratio (FI-R) for assessing layout control; and 2) Background Similarity (BG-S) for measuring background consistency. Extensive experiments show that LAMIC achieves state-of-the-art performance across most major metrics: it consistently outperforms existing multi-reference baselines in ID-S, BG-S, IN-R and AVG scores across all settings, and achieves the best DPG in complex composition tasks. These results demonstrate LAMIC's superior abilities in identity keeping, background preservation, layout control, and prompt-following, all achieved without any training or fine-tuning, showcasing strong zero-shot generalization ability. By inheriting the strengths of advanced single-reference models and enabling seamless extension to multi-image scenarios, LAMIC establishes a new training-free paradigm for controllable multi-image composition. As foundation models continue to evolve, LAMIC's performance is expected to scale accordingly. Our implementation is available at: https://github.com/Suchenl/LAMIC.
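
A layout-control metric of the Inclusion Ratio kind can be made concrete as the fraction of a generated entity's pixels that land inside its assigned layout region. This is a hedged numpy sketch; the paper's exact IN-R definition may differ, and the masks here are toy data:

```python
import numpy as np

def inclusion_ratio(entity_mask, region_mask):
    """Fraction of the generated entity's pixels that fall inside its
    assigned layout region; 1.0 means perfect layout adherence."""
    entity = entity_mask.astype(bool)
    region = region_mask.astype(bool)
    return np.logical_and(entity, region).sum() / entity.sum()

entity = np.array([[1, 1],
                   [1, 0]])       # a segmented entity covering 3 pixels
region = np.array([[1, 1],
                   [0, 0]])       # its assigned layout box
score = inclusion_ratio(entity, region)   # 2 of 3 pixels fall inside
```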

  13. HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and Decision in Embodied Agents

    Recent advances in multimodal large language models (MLLMs) have enabled richer perceptual grounding for code policy generation in embodied agents. However, most existing systems lack effective mechanisms to adaptively monitor policy execution and repair code during task completion. In this work, we introduce HyCodePolicy, a hybrid language-based control framework that systematically integrates code synthesis, geometric grounding, perceptual monitoring, and iterative repair into a closed-loop programming cycle for embodied agents. Technically, given a natural language instruction, our system first decomposes it into subgoals and generates an initial executable program grounded in object-centric geometric primitives. The program is then executed in simulation, while a vision-language model (VLM) observes selected checkpoints to detect and localize execution failures and infer failure reasons. By fusing structured execution traces capturing program-level events with VLM-based perceptual feedback, HyCodePolicy infers failure causes and repairs programs. This hybrid dual feedback mechanism enables self-correcting program synthesis with minimal human supervision. Our results demonstrate that HyCodePolicy significantly improves the robustness and sample efficiency of robot manipulation policies, offering a scalable strategy for integrating multimodal reasoning into autonomous decision-making pipelines.

  14. ChartCap: Mitigating Hallucination of Dense Chart Captioning

    Generating accurate, informative, and hallucination-free captions for charts remains challenging for vision language models, primarily due to the lack of large-scale, high-quality datasets of real-world charts. However, existing real-world chart datasets suffer from the inclusion of extraneous information that cannot be inferred from the chart and failure to sufficiently capture structural elements and key insights. Therefore, we introduce ChartCap, a large-scale dataset of 565K real-world chart images paired with type-specific, dense captions that exclude extraneous information and highlight both structural elements and key insights in detail. To build ChartCap, we design a four-stage pipeline that generates captions using only the discernible data from the chart and employ cycle-consistency-based human verification, which accelerates quality control without sacrificing accuracy. Additionally, we propose a novel metric, the Visual Consistency Score, which evaluates caption quality by measuring the similarity between the chart regenerated from a caption and the original chart, independent of reference captions. Extensive experiments confirm that models fine-tuned on ChartCap consistently generate more accurate and informative captions with reduced hallucinations, surpassing both open-source and proprietary models and even human-annotated captions.

  15. AlignGuard-LoRA: Alignment-Preserving Fine-Tuning via Fisher-Guided Decomposition and Riemannian-Geodesic Collision Regularization

    Low-rank adaptation (LoRA) has become a standard tool for efficiently fine-tuning large language models (LLMs). Yet, even minor LoRA updates can induce alignment drift, weakening safety and behavioral constraints through entangled parameter changes. To address this, we propose AlignGuard-LoRA (AGL), a principled framework for preserving alignment during finetuning. AGL introduces several key components: a primary task loss for supervision, Fisher Information Matrix-based regularization to restrict updates in alignment-sensitive subspaces, and task-specific regularization to stabilize the integration of new knowledge. We further introduce collision-aware regularization, blending Riemannian overlap -- which penalizes coordinate-wise interference -- and geodesic separation -- which encourages disjoint update geometry. We curate DriftCaps, a targeted diagnostic benchmark of safe and unsafe prompts designed to quantify alignment drift and safety degradation. Empirical evaluations show that AGL mitigates alignment drift by up to 50% on safety-critical benchmarks without degrading downstream task performance. Comprehensive ablation confirms that each component contributes distinctly to preserving latent safety behaviors. Finally, we derive and validate a scaling law for catastrophic forgetting, revealing that AGL flattens post-finetuning loss escalation while preserving adaptation dynamics. AGL is a structurally grounded refinement of LoRA, ensuring alignment preservation with minimal trade-offs. To encourage further exploration and development, we open-source our implementation.
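
The Fisher-guided idea in the loss can be sketched as a penalty that weights each parameter update by its estimated Fisher information, so movement in alignment-sensitive directions is taxed more heavily. Everything below (the names, the diagonal-Fisher simplification, the lambda value) is an illustrative assumption, not the paper's objective verbatim:

```python
import numpy as np

def agl_loss(task_loss, delta_w, fisher_diag, lam=0.1):
    """Total loss = task loss + lambda * sum_i F_ii * (delta w_i)^2.
    A large diagonal Fisher entry marks a direction the aligned model
    is sensitive to, so updates along it are penalized quadratically."""
    drift_penalty = np.sum(fisher_diag * delta_w ** 2)
    return task_loss + lam * drift_penalty

delta_w = np.array([0.5, 0.5])
fisher = np.array([10.0, 0.1])     # first coordinate is alignment-sensitive
loss = agl_loss(task_loss=1.0, delta_w=delta_w, fisher_diag=fisher)
# penalty = 0.1 * (10 * 0.25 + 0.1 * 0.25) = 0.2525, so loss = 1.2525
```

The same-sized step costs 100x more along the high-Fisher coordinate, which is the mechanism that discourages alignment drift during fine-tuning.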

Solidot (15)

  1. TSMC accuses former employees of stealing 2nm chip process secrets

    TSMC has accused former employees of stealing trade secrets covering its 2-nanometer chip process; the A20 chip used in Apple's iPhone 18 series is among the first chips built on the 2nm process. According to reports, after discovering that process technology may have leaked, TSMC immediately filed a complaint with the High Prosecutors Office, and after tracing the leak, prosecutors directed the Investigation Bureau to conduct multiple waves of searches and interviews on the 25th and 28th of last month. So far a former engineer surnamed Chen and nearly ten TSMC engineers working on advanced-process trial production and R&D are known to be implicated. Chen had worked in TSMC's systems integration department and, after leaving, joined Tokyo Electron, a long-time TSMC partner, as an equipment engineer; because he was well acquainted with TSMC's current advanced-process R&D staff, he handled liaison with TSMC's R&D department. Investigators have established how Chen obtained the secrets: TSMC engineers displayed process technology diagrams on their computer screens, and Chen photographed them directly with his phone. He reportedly took more than 700 photos from the screen of one engineer, surnamed Wu, and nearly 300 from another's. A few other TSMC engineers also let him photograph a handful of less sensitive process diagrams; their involvement was minor, so detention was not sought for them.

  2. Scientists develop a painkiller as potent as morphine but without its severe side effects

    Scientists at Kyoto University in Japan have developed a painkiller as potent as morphine but without its severe side effects. Morphine is commonly used by cancer patients, but it carries serious side effects such as respiratory depression and addiction. The new drug, Adrian, works in a completely different way from morphine and existing synthetic opioids; the research team says it could transform pain management and help address the opioid abuse problem. When a person faces a life-threatening situation, the brain releases norepinephrine to suppress pain. The new research focused on the mechanism by which the body regulates excessive norepinephrine release, and by introducing new techniques the team succeeded for the first time in developing a drug that blocks this regulation. The scientists plan to begin clinical trials in the United States in 2026 and bring the drug into practical use by 2028.

  3. Shining lasers through the human brain

    Scientists studying how the brain works rely mainly on two tools, each with its own strengths and weaknesses: electroencephalography (EEG) is cheap and portable but cannot read signals beyond the brain's outer cortex, while functional magnetic resonance imaging (fMRI) is expensive and bulky but can probe deep into the brain. A team at the University of Glasgow has now found a technique that combines the best of both: as cheap and portable as EEG, yet able to read deep-brain signals like fMRI. They fire millions of photons from a laser on one side of the head and measure when they arrive on the other side. Because only a tiny fraction of photons make it all the way through the brain, a major challenge was suppressing background noise. The technique is still some way from practical use, and the researchers have more obstacles to overcome.

  4. Ultra-processed diets are less effective for weight loss

    British scientists have found that an ultra-processed diet may be less effective than a minimally processed one for weight loss and for lowering cardiometabolic disease risk, even when both diets follow the same national dietary guidelines. The findings come from a community-level clinical trial of 55 adults in the UK and reveal that, beyond overall nutritional composition, the degree of food processing may affect specific health outcomes. Global consumption of ultra-processed foods has risen rapidly in recent decades, alongside rising rates of obesity and of chronic diseases such as type 2 diabetes and cardiovascular disease. The researchers ran a randomized crossover trial comparing a diet dominated by ultra-processed foods with one dominated by minimally processed foods, both structured to follow the UK's Eatwell Guide, a set of national recommendations promoting healthy, balanced nutrition. The 55 participants received either pre-prepared ultra-processed foods, such as breakfast cereal or ready-made lasagna, or pre-prepared minimally processed foods, such as overnight oats or homemade spaghetti bolognese, delivered to their homes over 8 weeks. After a 4-week break, participants switched to the other diet for another 8 weeks, allowing the effects of the two diets to be compared within the same person over 6 months. Fifty participants completed at least one diet. The researchers found that both guideline-compliant diets produced significant weight loss within 8 weeks, but the minimally processed diet yielded an average loss of 2% of body weight versus only 1% for the ultra-processed diet. Beyond weight loss, the minimally processed diet was more effective at improving body-composition measures tied to cardiometabolic health, such as total fat, visceral fat, and triglyceride levels, although LDL cholesterol was lower after the ultra-processed diet.

  5. Tesla accused of withholding data, lying, and misleading police in an Autopilot crash case

    A jury last week found Tesla partly liable in a wrongful-death case stemming from a crash involving Autopilot. Trial records show Tesla tried to pin all blame on the driver while actively withholding key evidence about Autopilot's behavior before and after the crash. Within three minutes of the collision, the Tesla vehicle uploaded a "collision snapshot" (video, CAN-bus streams, EDR data, and more) to Tesla's servers and then deleted the local copy, leaving Tesla the only party with access to the key evidence. It took police years to get Tesla to acknowledge the snapshot's existence. An expert confirmed, by forensically recovering data from the onboard computer, that Tesla had possessed the collision snapshot all along, even as the company insisted the data did not exist.

  6. Large rogue planets may form their own miniature planetary systems

    Even without a parent star, some free-floating planets with masses comparable to Jupiter's may spawn miniature planetary systems of their own. These rogue objects may form the way stars do, from the collapse of giant molecular gas clouds, or they may once have belonged to a planetary system before being ejected into interstellar space by the gravity of other large planets. Using two highly sensitive infrared cameras on the Webb telescope, the research team took detailed spectroscopic measurements of these objects between August and October 2024 and analyzed their structure and composition. The results show their masses are indeed comparable to Jupiter's, and six of the planets emit substantial excess infrared radiation, indicating they are surrounded by warm disks of gas and dust, a hallmark of planetary-system formation. The observations also found silicate grains in these dust disks that not only show signs of growth but are also crystallizing, the first step toward forming rocky planets. Such phenomena had previously been seen only in disks around stars or brown dwarfs; this is the first detection around far less massive, Jupiter-mass free-floating planets.

  7. Perplexity uses stealth tactics to evade websites' no-crawl directives

    CDN provider Cloudflare has accused AI search company Perplexity of using stealth tactics to bypass websites' instructions not to crawl them. Cloudflare says it received complaints from customers who had blocked Perplexity's crawler via robots.txt and Web application firewall rules, yet Perplexity's crawlers kept accessing their content despite these measures. Cloudflare's subsequent investigation found that when Perplexity notices its crawler is blocked by robots.txt or firewall rules, it switches to a stealth bot that uses a range of tactics to obscure its activity. This behavior violates internet norms that have been in place for more than 30 years.

  8. Ukraine rescues a soldier by delivering an e-bike by drone

    A Ukrainian drone has completed an unusual mission: delivering an electric bicycle to rescue a soldier. The 4th Rapid Reaction Brigade of Ukraine's National Guard, known as Rubizh, shared a 16-minute video showing a drone lifting an e-bike, after which the soldier rode it back to safety. According to the Ukrainian side, the frontline position the soldier was holding came under attack, several of his comrades were killed, and he found himself cut off from safe territory, forced to hold the position alone for days. To rescue him, staff officers devised a plan to deliver an e-bike by heavy-lift drone: the first drone was shot down, the second failed because the load was too heavy, and the third succeeded.

  9. Mozilla warns of phishing attacks targeting Firefox extension developers

    Mozilla has warned of phishing attacks targeting Firefox extension developers, urging them to be wary of emails impersonating Mozilla or AMO (addons.mozilla.org) senders. Attackers may be using phishing emails to hijack developer accounts and then push extension updates containing malicious code to Firefox users, mounting a supply-chain attack. Security researchers say the malicious Firefox add-ons currently in circulation aim to steal cryptocurrency wallet credentials.

  10. The plastics crisis affects everyone

    In a commentary published in The Lancet, experts warn of a global plastics crisis, estimating that plastics cause at least $1.5 trillion in health-related losses every year. Plastic production has grown more than 200-fold since 1950 and is projected to exceed 1 billion tonnes per year by 2060. As a result the entire planet, from the summit of Everest to the deepest ocean trench, is polluted with plastic; the world's plastic waste now weighs 8 billion tonnes, and less than 10% of plastic is recycled. Plastics contribute to air pollution and exposure to toxic chemicals, while microplastics can penetrate the human body. Plastic pollution can even boost disease-carrying mosquitoes, because water trapped in scattered plastic provides them with ideal breeding grounds. Oil companies and the plastics industry argue that the fight against plastic pollution should focus on recycling rather than cutting production.

  11. Justin Sun completes a suborbital flight aboard a Blue Origin spacecraft

    Justin Sun, founder of the cryptocurrency TRON, completed a suborbital spaceflight on Sunday aboard a spacecraft from Jeff Bezos's space company Blue Origin. The mission was the 34th flight of Blue Origin's New Shepard, hence the designation NS-34. Sun paid $28 million in 2021 for a seat on New Shepard's first crewed flight but was unable to join that mission due to scheduling conflicts. Six people flew on NS-34: besides Sun, the most high-profile passenger, they were Indian-American real estate investor Arvinder (Arvi) Singh Bahal, Turkish entrepreneur Gökhan Erdem, Puerto Rican meteorologist and journalist Deborah Martorell, Lionel Pitchford, a Briton who has run an orphanage in Nepal for thirty years, and American entrepreneur James (J.D.) Russell.

  12. More than a fifth of CS papers may contain AI-generated content

    According to a study published in Nature Human Behaviour, 22% of computer science papers may contain AI-generated content. The study analyzed more than a million papers and preprints published between 2020 and 2024, focusing mainly on abstracts and introductions, looking for phrases common in AI-generated text, such as "regenerate response" or "my knowledge cutoff", as well as words like "pivotal", "intricate", and "showcase" that LLMs are more likely to use than humans. The researchers say traces of LLM editing are more prevalent in fields like CS. The analysis shows that the amount of LLM-modified content surged within just a few months of ChatGPT's release in November 2022, and the closer a field is to AI, the higher the share of LLM use. By September 2024, 22.5% of CS paper abstracts showed evidence of LLM modification, with electrical engineering and systems papers close behind, while only 7.7% of mathematics abstracts did; biomedicine and physics were also relatively low. The researchers believe the true proportions may be higher, since authors may deliberately remove telltale LLM vocabulary: usage of "delve", for example, spiked after ChatGPT's debut but gradually declined once it became a recognized marker of AI-generated text.

  13. Satellite broadband companies are not complying with brightness limits

    Earth orbit hosts a growing number of satellite broadband constellations, including SpaceX's Starlink, AST's BlueBird, Amazon's Kuiper, Europe's OneWeb, and China's Qianfan (Thousand Sails) and Guowang. Within a few years the number of broadband satellites in orbit will exceed tens of thousands. Satellites reflect sunlight and leave streaks in long-exposure astronomical images, interfering with observations. The International Astronomical Union (IAU) has therefore established the Centre for the Protection of the Dark and Quiet Sky from Satellite Constellation Interference (CPS) to conduct research and policy advocacy on the issue. A recent CPS review and observation report shows that some companies have not complied with the brightness limits it recommended last year. BlueBird satellites are the brightest, though not yet numerous enough to pose a problem at scale. SpaceX's second-generation Starlink Mini satellites are more than four times the size of the first generation yet roughly as bright, evidence that SpaceX's brightness-mitigation measures work. China's Qianfan and Guowang satellites are about as bright as Starlink, but they are currently deployed in higher orbits around 1,000 km; if they are later deployed in the 300-500 km range, their reflected brightness could be 1-2 magnitudes brighter than it is now.

  14. A 2,500-year-old Siberian ice mummy bears elaborate tattoos

    High-resolution imaging has revealed elaborate tattoos on a 2,500-year-old Siberian ice mummy. The mummy is of a woman, about 50 years old, who lived on the vast steppe between China and Europe and belonged to the Pazyryk culture. She was found in the 19th century in an ice tomb in Siberia's Altai Mountains; her tattoos were previously hard to detect and have only now been revealed using near-infrared digital photography. Her right forearm bears a stag's head and a leopard, her left arm a mythical griffin fighting a stag, and her thumb a rooster. The researchers estimate the tattoos were made with a multi-point needle-like tool and a single-point needle, the former fashioned from animal horn or bone, using a pigment probably made from charred plant material or soot.

  15. 5 million users tried GitHub Copilot for the first time in the past three months

    A GitHub spokesperson disclosed that Microsoft's AI coding assistant GitHub Copilot now has 20 million all-time users. In April 2025 the company reported 15 million users, meaning 5 million new users were added over the past three months. Microsoft did not say whether those users abandoned the tool after trying it or went on to use it heavily. Microsoft says GitHub Copilot is among the most popular AI-assisted programming tools, used by 90% of the Fortune 100, and that its usage among enterprise customers grew about 75% over the previous quarter.