OrangeBot.AI Digest — 2025-07-31

71 headlines across 5 sources, aggregated for the day.

Hacker News (15)

  1. Releasing open weights for FLUX.1 Krea (www.krea.ai)
  2. Slow (michaelnotebook.com)
  3. AI is a floor raiser, not a ceiling raiser (elroy.bot)
  4. Ubiquiti launches UniFi OS Server for self-hosting (lazyadmin.nl)
  5. QUIC for the kernel (lwn.net)
  6. Face it: you're a crazy person (www.experimental-history.com)
  7. U.S. senators introduce new pirate site blocking bill, "Block BEARD" (torrentfreak.com)
  8. MacBook Pro Insomnia (manuel.bernhardt.io)
  9. So you're a manager now (scottkosman.com)
  10. Many countries that said no to ChatControl in 2024 are now undecided (digitalcourage.social)
  11. How was the Universal Pictures 1936 opening logo created? (movies.stackexchange.com)
  12. Introduction to Computer Music (cmtext.com)
  13. I tried Servo (www.spacebar.news)
  14. Sumo – Simulation of Urban Mobility (eclipse.dev)
  15. Hawley and Democrats vote to advance congressional stock trading ban (www.cbsnews.com)

GitHub Trending (14)

  1. kijai / ComfyUI-WanVideoWrapper
  2. stenzek / duckstation

    Fast PlayStation 1 emulator for x86-64/AArch32/AArch64/RV64

  3. SkyworkAI / SkyReels-V2

    SkyReels-V2: Infinite-length Film Generative model

  4. OpenPipe / ART

    Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, Kimi, and more!

  5. EmenstaNougat / ESP32-BlueJammer

    The ESP32-BlueJammer (Bluetooth jammer, BLE jammer, WiFi jammer, RC jammer) disrupts 2.4GHz communications. Using an ESP32 and nRF24 modules, it generates noise and junk packets that interfere with communicating devices, rendering them unable to work as intended. Ideal for controlled disruption and security testing.

  6. 9001 / copyparty

    Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps

  7. puppeteer / puppeteer

    JavaScript API for Chrome and Firefox

  8. Canner / WrenAI

    ⚡️Wren AI is your GenBI agent: query any database in natural language and get accurate SQL (Text-to-SQL), charts (Text-to-Charts), and AI-generated insights in seconds.

  9. pointfreeco / swift-composable-architecture

    A library for building applications in a consistent and understandable way, with composition, testing, and ergonomics in mind.

  10. fastrepl / hyprnote

    Local-first AI Notepad for Private Meetings

  11. NemProject / nem

    number go up 💹

  12. linkwarden / linkwarden

    ⚡️⚡️⚡️ Self-hosted collaborative bookmark manager to collect, read, annotate, and fully preserve what matters, all in one place.

  13. rustdesk / rustdesk

    An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.

  14. cloudwego / eino

    The ultimate LLM/AI application development framework in Golang.

Product Hunt (15)

  1. Launch

    Create fully functional apps with AI and real human support

  2. Kombai

    The first AI agent built for real-world frontend tasks

  3. Sparrow

    The lightest and fastest platform for API testing

  4. Mocha

    Build full-stack apps without coding. But this time it works

  5. GLM-4.5

    Unifying agentic capabilities in one open model

  6. Okibi

    Lovable but for agents - build agents using simple prompts

  7. Gridapps Testimonials

    All-in-one testimonial widgets to boost trust & conversions

  8. Boki by Hackmamba

    Plan, write, and distribute authentic content that converts

  9. projectOS

    the free resume builder that grabs attention

  10. Bucketly

    What's on your bucket list? Track your lifetime goals

  11. Partnero AI

    AI-powered affiliate and referral programs

  12. Pond

    Superhuman for texting

  13. Temporal + OpenAI Agents SDK

    Build production-ready agents, fast.

  14. Dynamic, adaptive documentation

    AI-powered docs, tailored to every user’s individual needs

  15. Dona - Simple tasks

    Free minimalist task management app for iOS

Hugging Face (12)

  1. ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents

    Automating the transformation of user interface (UI) designs into front-end code holds significant promise for accelerating software development and democratizing design workflows. While recent large language models (LLMs) have demonstrated progress in text-to-code generation, many existing approaches rely solely on natural language prompts, limiting their effectiveness in capturing spatial layout and visual design intent. In contrast, UI development in practice is inherently multimodal, often starting from visual sketches or mockups. To address this gap, we introduce a modular multi-agent framework that performs UI-to-code generation in three interpretable stages: grounding, planning, and generation. The grounding agent uses a vision-language model to detect and label UI components, the planning agent constructs a hierarchical layout using front-end engineering priors, and the generation agent produces HTML/CSS code via adaptive prompt-based synthesis. This design improves robustness, interpretability, and fidelity over end-to-end black-box methods. Furthermore, we extend the framework into a scalable data engine that automatically produces large-scale image-code pairs. Using these synthetic examples, we fine-tune and reinforce an open-source VLM, yielding notable gains in UI understanding and code quality. Extensive experiments demonstrate that our approach achieves state-of-the-art performance in layout accuracy, structural coherence, and code correctness. Our code is made publicly available at https://github.com/leigest519/ScreenCoder.

  2. BANG: Dividing 3D Assets via Generative Exploded Dynamics

    3D creation has always been a unique human strength, driven by our ability to deconstruct and reassemble objects using our eyes, mind and hand. However, current 3D design tools struggle to replicate this natural process, requiring considerable artistic expertise and manual labor. This paper introduces BANG, a novel generative approach that bridges 3D generation and reasoning, allowing for intuitive and flexible part-level decomposition of 3D objects. At the heart of BANG is "Generative Exploded Dynamics", which creates a smooth sequence of exploded states for an input geometry, progressively separating parts while preserving their geometric and semantic coherence. BANG utilizes a pre-trained large-scale latent diffusion model, fine-tuned for exploded dynamics with a lightweight exploded view adapter, allowing precise control over the decomposition process. It also incorporates a temporal attention module to ensure smooth transitions and consistency across time. BANG enhances control with spatial prompts, such as bounding boxes and surface regions, enabling users to specify which parts to decompose and how. This interaction can be extended with multimodal models like GPT-4, enabling 2D-to-3D manipulations for more intuitive and creative workflows. The capabilities of BANG extend to generating detailed part-level geometry, associating parts with functional descriptions, and facilitating component-aware 3D creation and manufacturing workflows. Additionally, BANG offers applications in 3D printing, where separable parts are generated for easy printing and reassembly. In essence, BANG enables seamless transformation from imaginative concepts to detailed 3D assets, offering a new perspective on creation that resonates with human intuition.

  3. Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

    In this report, we introduce Falcon-H1, a new series of large language models (LLMs) featuring hybrid architecture designs optimized for both high performance and efficiency across diverse use cases. Unlike earlier Falcon models built solely on Transformer or Mamba architectures, Falcon-H1 adopts a parallel hybrid approach that combines Transformer-based attention with State Space Models (SSMs), known for superior long-context memory and computational efficiency. We systematically revisited model design, data strategy, and training dynamics, challenging conventional practices in the field. Falcon-H1 is released in multiple configurations, including base and instruction-tuned variants at 0.5B, 1.5B, 1.5B-deep, 3B, 7B, and 34B parameters. Quantized instruction-tuned models are also available, totaling over 30 checkpoints on Hugging Face Hub. Falcon-H1 models demonstrate state-of-the-art performance and exceptional parameter and training efficiency. The flagship Falcon-H1-34B matches or outperforms models up to 70B scale, such as Qwen3-32B, Qwen2.5-72B, and Llama3.3-70B, while using fewer parameters and less data. Smaller models show similar trends: the Falcon-H1-1.5B-Deep rivals current leading 7B-10B models, and Falcon-H1-0.5B performs comparably to typical 7B models from 2024. These models excel across reasoning, mathematics, multilingual tasks, instruction following, and scientific knowledge. With support for up to 256K context tokens and 18 languages, Falcon-H1 is suitable for a wide range of applications. All models are released under a permissive open-source license, underscoring our commitment to accessible and impactful AI research.

  4. VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning

    Reinforcement learning has proven its effectiveness in enhancing the reasoning capabilities of large language models. Recent research efforts have progressively extended this paradigm to multimodal reasoning tasks. Due to the inherent complexity and diversity of multimodal tasks, especially in semantic content and problem formulations, existing models often exhibit unstable performance across various domains and difficulty levels. To address these limitations, we propose VL-Cogito, an advanced multimodal reasoning model trained via a novel multi-stage Progressive Curriculum Reinforcement Learning (PCuRL) framework. PCuRL systematically guides the model through tasks of gradually increasing difficulty, substantially improving its reasoning abilities across diverse multimodal contexts. The framework introduces two key innovations: (1) an online difficulty soft weighting mechanism, dynamically adjusting training difficulty across successive RL training stages; and (2) a dynamic length reward mechanism, which encourages the model to adaptively regulate its reasoning path length according to task complexity, thus balancing reasoning efficiency with correctness. Experimental evaluations demonstrate that VL-Cogito consistently matches or surpasses existing reasoning-oriented models across mainstream multimodal benchmarks spanning mathematics, science, logic, and general understanding, validating the effectiveness of our approach.

  5. Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision

    Detecting vehicles in aerial imagery is a critical task with applications in traffic monitoring, urban planning, and defense intelligence. Deep learning methods have provided state-of-the-art (SOTA) results for this application. However, a significant challenge arises when models trained on data from one geographic region fail to generalize effectively to other areas. Variability in factors such as environmental conditions, urban layouts, road networks, vehicle types, and image acquisition parameters (e.g., resolution, lighting, and angle) leads to domain shifts that degrade model performance. This paper proposes a novel method that uses generative AI to synthesize high-quality aerial images and their labels, improving detector training through data augmentation. Our key contribution is the development of a multi-stage, multi-modal knowledge transfer framework utilizing fine-tuned latent diffusion models (LDMs) to mitigate the distribution gap between the source and target environments. Extensive experiments across diverse aerial imagery domains show consistent performance improvements in AP50 over supervised learning on source domain data, weakly supervised adaptation methods, unsupervised domain adaptation methods, and open-set object detectors by 4-23%, 6-10%, 7-40%, and more than 50%, respectively. Furthermore, we introduce two newly annotated aerial datasets from New Zealand and Utah to support further research in this field. Project page is available at: https://humansensinglab.github.io/AGenDA

  6. Efficient Differentially Private Fine-Tuning of LLMs via Reinforcement Learning

    The tension between data privacy and model utility has become the defining bottleneck for the practical deployment of large language models (LLMs) trained on sensitive corpora including healthcare. Differentially private stochastic gradient descent (DP-SGD) guarantees formal privacy, yet it does so at a pronounced cost: gradients are forcibly clipped and perturbed with noise, degrading sample efficiency and final accuracy. Numerous variants have been proposed to soften this trade-off, but they all share a handicap: their control knobs are hard-coded, global, and oblivious to the evolving optimization landscape. Consequently, practitioners are forced either to over-spend privacy budget in pursuit of utility, or to accept mediocre models in order to stay within privacy constraints. We present RLDP, the first framework to cast DP optimization itself as a closed-loop control problem amenable to modern deep reinforcement learning (RL). RLDP continuously senses rich statistics of the learning dynamics and acts by selecting fine-grained per-parameter gradient-clipping thresholds as well as the magnitude of injected Gaussian noise. A soft actor-critic (SAC) hyper-policy is trained online during language model fine-tuning; it learns, from scratch, how to allocate the privacy budget where it matters and when it matters. Across more than 1,600 ablation experiments on GPT2-small, Llama-1B, Llama-3B, and Mistral-7B, RLDP delivers perplexity reductions of 1.3-30.5% (mean 5.4%) and an average 5.6% downstream utility gain. RLDP reaches each baseline's final utility after only 13-43% of the gradient-update budget (mean speed-up 71%), all while honoring the same (epsilon, delta)-DP contract and exhibiting equal or lower susceptibility to membership-inference and canary-extraction attacks.
    (A rough illustrative sketch of such a policy-controlled DP-SGD step appears after this list.)

  7. Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation

    Referring audio-visual segmentation (RAVS) has recently seen significant advancements, yet challenges remain in integrating multimodal information and deeply understanding and reasoning about audiovisual content. To extend the boundaries of RAVS and facilitate future research in this field, we propose Omnimodal Referring Audio-Visual Segmentation (OmniAVS), a new dataset containing 2,098 videos and 59,458 multimodal referring expressions. OmniAVS stands out with three key innovations: (1) 8 types of multimodal expressions that flexibly combine text, speech, sound, and visual cues; (2) an emphasis on understanding audio content beyond just detecting its presence; and (3) the inclusion of complex reasoning and world knowledge in expressions. Furthermore, we introduce Omnimodal Instructed Segmentation Assistant (OISA), to address the challenges of multimodal reasoning and fine-grained understanding of audiovisual content in OmniAVS. OISA uses an MLLM to comprehend complex cues and perform reasoning-based segmentation. Extensive experiments show that OISA outperforms existing methods on OmniAVS and achieves competitive results on other related tasks.

  8. Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding

    Large language models (LLMs) face low hardware efficiency during decoding, especially for long-context reasoning tasks. This paper introduces Step-3, a 321B-parameter VLM with hardware-aware model-system co-design optimized for minimizing decoding costs. Step-3 innovates in two key dimensions: (1) A novel Multi-Matrix Factorization Attention (MFA) mechanism that significantly reduces both KV cache size and computation while maintaining high attention expressiveness, and (2) Attention-FFN Disaggregation (AFD), a distributed inference system that decouples attention and Feed-Forward Network (FFN) layers into specialized subsystems. This co-design achieves unprecedented cost efficiency: Step-3 significantly reduces theoretical decoding costs compared with models like DeepSeek-V3 and Qwen3 MoE 235B, with the gains widening at longer context. Step-3 achieves low cost while activating 38B parameters per token (more than DeepSeek-V3 and Qwen3 MoE 235B), demonstrating that hardware-aligned attention arithmetic intensity, MoE sparsity, and AFD are critical to cost-effectiveness. We perform a head-to-head comparison with DeepSeek-V3 in its favorable scenarios. Our implementation on Hopper GPUs achieves a decoding throughput of up to 4,039 tokens per second per GPU under 50ms TPOT SLA (4K context, FP8, no MTP). It is higher than DeepSeek-V3's 2,324 in the same setup and sets a new Pareto frontier for LLM decoding.

  9. MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE

    Although GRPO substantially enhances flow matching models in human preference alignment of image generation, methods such as FlowGRPO still exhibit inefficiency due to the necessity of sampling and optimizing over all denoising steps specified by the Markov Decision Process (MDP). In this paper, we propose MixGRPO, a novel framework that leverages the flexibility of mixed sampling strategies through the integration of stochastic differential equations (SDE) and ordinary differential equations (ODE). This streamlines the optimization process within the MDP to improve efficiency and boost performance. Specifically, MixGRPO introduces a sliding window mechanism, using SDE sampling and GRPO-guided optimization only within the window, while applying ODE sampling outside. This design confines sampling randomness to the time-steps within the window, thereby reducing the optimization overhead and allowing for more focused gradient updates to accelerate convergence. Additionally, as time-steps beyond the sliding window are not involved in optimization, higher-order solvers are supported for sampling. So we present a faster variant, termed MixGRPO-Flash, which further improves training efficiency while achieving comparable performance. MixGRPO exhibits substantial gains across multiple dimensions of human preference alignment, outperforming DanceGRPO in both effectiveness and efficiency, with nearly 50% lower training time. Notably, MixGRPO-Flash further reduces training time by 71%. Codes and models are available at https://github.com/Tencent-Hunyuan/MixGRPO.
    (A schematic sketch of the sliding-window SDE/ODE choice appears after this list.)

  10. MetaCLIP 2: A Worldwide Scaling Recipe

    Contrastive Language-Image Pretraining (CLIP) is a popular foundation model, supporting tasks from zero-shot classification and retrieval to serving as the encoder for multimodal large language models (MLLMs). Although CLIP is successfully trained on billion-scale image-text pairs from the English world, scaling CLIP's training further to learn from worldwide web data is still challenging: (1) no curation method is available to handle data points from the non-English world; (2) the English performance of existing multilingual CLIP is worse than its English-only counterpart, i.e., the "curse of multilinguality" that is common in LLMs. Here, we present MetaCLIP 2, the first recipe for training CLIP from scratch on worldwide web-scale image-text pairs. To generalize our findings, we conduct rigorous ablations with minimal changes that are necessary to address the above challenges and present a recipe enabling mutual benefits from English and non-English world data. In zero-shot ImageNet classification, MetaCLIP 2 ViT-H/14 surpasses its English-only counterpart by 0.8% and mSigLIP by 0.7%, and surprisingly sets a new state of the art without system-level confounding factors (e.g., translation, bespoke architecture changes) on multilingual benchmarks, such as CVQA with 57.4%, Babel-ImageNet with 50.2%, and XM3600 with 64.3% on image-to-text retrieval.

  11. Repair-R1: Better Test Before Repair

    APR (Automated Program Repair) aims to automatically locate program defects, generate patches and validate the repairs. Existing techniques for APR are often combined with LLMs (Large Language Models), leveraging the code-related knowledge of LLMs to improve repair effectiveness. Current LLM-based APR methods typically utilize test cases only during the inference stage, adopting an iterative approach that performs repair first and validates it through test execution afterward. This conventional paradigm neglects two important aspects: the potential contribution of test cases in the training phase, and the possibility of leveraging testing prior to repair. To address this, we propose Repair-R1, which introduces test cases into the model's training phase and shifts test generation to precede repair. The model is required to first generate discriminative test cases that can distinguish defective behaviors, and then perform repair based on these tests. This enables the model to better locate defects and understand the underlying causes of defects, thereby improving repair effectiveness. We implement Repair-R1 with three different backbone models, using RL (reinforcement learning) to co-optimize test generation and bug repair. Experimental results on four widely adopted benchmarks demonstrate the superiority of Repair-R1. Specifically, compared to vanilla models, Repair-R1 improves repair success rate by 2.68% to 48.29%, test generation success rate by 16.38% to 53.28%, and test coverage by 0.78% to 53.96%. We publish the code and weights at https://github.com/Tomsawyerhu/APR-RL and https://huggingface.co/tomhu/Qwen3-4B-RL-5000-step.
    (A hypothetical sketch of the test-then-repair loop appears after this list.)

  12. DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation

    Generating 3D scenes from natural language holds great promise for applications in gaming, film, and design. However, existing methods struggle with automation, 3D consistency, and fine-grained control. We present DreamScene, an end-to-end framework for high-quality and editable 3D scene generation from text or dialogue. DreamScene begins with a scene planning module, where a GPT-4 agent infers object semantics and spatial constraints to construct a hybrid graph. A graph-based placement algorithm then produces a structured, collision-free layout. Based on this layout, Formation Pattern Sampling (FPS) generates object geometry using multi-timestep sampling and reconstructive optimization, enabling fast and realistic synthesis. To ensure global consistency, DreamScene employs a progressive camera sampling strategy tailored to both indoor and outdoor settings. Finally, the system supports fine-grained scene editing, including object movement, appearance changes, and 4D dynamic motion. Experiments demonstrate that DreamScene surpasses prior methods in quality, consistency, and flexibility, offering a practical solution for open-domain 3D content creation. Code and demos are available at https://jahnsonblack.github.io/DreamScene-Full/.
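
A few illustrative sketches for the papers above, written for this digest rather than taken from the papers; all function names, signatures, and helpers are assumptions.

Item 6 (RLDP) casts DP-SGD tuning as a control problem: a SAC hyper-policy picks per-parameter gradient-clipping thresholds and the Gaussian noise magnitude while training runs. A rough sketch of one such update step, with controller_action standing in for the hyper-policy's output and one threshold per parameter tensor for simplicity:

    import torch

    def dp_sgd_step(params, per_sample_grads, controller_action, lr=1e-4):
        """One DP-SGD-style update with externally chosen clipping and noise.

        params: list of parameter tensors.
        per_sample_grads: list of [batch, *param.shape] per-example gradients.
        controller_action: (clip thresholds, noise multiplier), as the hyper-policy
        described in the abstract might supply; here they are plain floats.
        """
        clip_thresholds, noise_mult = controller_action
        for p, g, c in zip(params, per_sample_grads, clip_thresholds):
            flat = g.reshape(g.shape[0], -1)
            # Clip each example's gradient to L2 norm <= c.
            norms = flat.norm(dim=1, keepdim=True).clamp(min=1e-12)
            clipped = flat * (c / norms).clamp(max=1.0)
            # Sum over the batch, add Gaussian noise scaled by the chosen multiplier, average.
            summed = clipped.sum(dim=0)
            noisy = (summed + noise_mult * c * torch.randn_like(summed)) / g.shape[0]
            p.data -= lr * noisy.reshape(p.shape)

Item 9 (MixGRPO) confines stochastic (SDE) sampling and GRPO updates to a sliding window of denoising steps and samples the remaining steps deterministically (ODE). A schematic of the per-step choice, with sde_step, ode_step, and collect as placeholders:

    def mixed_rollout(x, num_steps, window_start, window_size, sde_step, ode_step, collect):
        for t in range(num_steps):
            if window_start <= t < window_start + window_size:
                x, logprob = sde_step(x, t)   # stochastic step; randomness confined to the window
                collect(t, x, logprob)        # only these steps feed the GRPO update
            else:
                x = ode_step(x, t)            # deterministic step, excluded from optimization
        return x

Item 11 (Repair-R1) reverses the usual repair-then-test order: the model first writes discriminative tests, runs them against the buggy program, and then patches the code conditioned on the observed failures. A hypothetical sketch of that loop, with model.generate and run_tests as stand-ins:

    def test_then_repair(model, buggy_code, run_tests):
        # Phase 1: ask the model for tests intended to expose the defect.
        tests = model.generate("Write tests that expose the bug in:\n" + buggy_code)
        failures = run_tests(buggy_code, tests)
        # Phase 2: repair conditioned on the failing tests, then re-validate the patch.
        patch = model.generate(
            "Fix the code so these tests pass.\nCode:\n" + buggy_code +
            "\nTests:\n" + tests + "\nObserved failures:\n" + str(failures)
        )
        return patch, run_tests(patch, tests)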

Solidot (15)

  1. The environmental impact of high-speed rail

    Researchers at Southwestern University of Finance and Economics published a paper in Regional Science and Urban Economics analyzing the environmental impact of China's first high-speed rail line. The Beijing-Shanghai line, China's first major high-speed railway, opened on June 30, 2011, linking the country's two most economically dynamic regions, the Bohai Rim metropolitan area and the Yangtze River Delta, which together hold a quarter of the national population. Using high-resolution NASA satellite data (China lacked reliable ground-based air-pollution monitoring before 2013), the researchers found that particulate concentrations in counties along the route fell by 6.2% within six months of the opening, and the effect kept strengthening over the following two years, suggesting that the environmental benefit grew as intercity travelers gradually switched modes. Vehicle exhaust is a major source of urban air pollution, accounting for 45-52% of PM2.5 emissions in the megacities of Beijing and Shanghai, much of it from inter-regional traffic. The researchers estimate that the line generates external health benefits worth roughly 21 billion yuan a year, a substantial fraction of its construction cost.

  2. People inhale more than 70,000 microplastic particles indoors every day

    According to a study published in PLOS One, people inhale more than 70,000 microplastic particles from indoor environments every day. Plastic is one of the most serious environmental problems of our time, and nanometer-scale particles can be drawn into the lungs. Scientists at the University of Toulouse in France set out to quantify how much plastic dust may be inhaled daily. The team collected 16 indoor air samples from their own apartments and cars and measured microplastic concentrations with Raman spectroscopy. The results show a very large daily intake of plastic particles: the median concentration in apartment air was 528 microplastic particles per cubic meter, while concentrations inside cars reached 2,238 particles per cubic meter. Of these particles, 94% were smaller than 10 micrometers across, small enough to penetrate deep into lung tissue once inhaled. The team estimates that adults inhale roughly 71,000 microplastic particles from indoor environments each day, 68,000 of them smaller than 10 micrometers. People spend about 90% of their time indoors, at home, at work, in shops, and in vehicles, unknowingly breathing in microplastic pollution.

  3. Artificial sweeteners significantly raise the risk of diabetes

    A 14-year follow-up study found that artificially sweetened drinks raise the risk of type 2 diabetes significantly more than sugary drinks do. Artificial sweeteners cut a drink's sugar content while keeping it sweet, which was supposed to make beverages healthier, but a growing body of research suggests they may carry metabolic risks. To assess how sugary and artificially sweetened drinks affect health, researchers followed 36,608 participants for an average of 13.9 years. Compared with people who drank no sweetened beverages at all, drinking one can of an artificially sweetened drink raised the risk of type 2 diabetes by 38%, while drinking the same amount of sugary drinks raised it by 23%. Earlier research found that the artificial sweetener aspartame triggers a postprandial insulin response similar to that of sucrose, while the sweeteners saccharin and sucralose have been linked to gut-microbiome disruption and impaired glucose tolerance.

  4. Study shows barring Google from paying for default placement cuts its search share

    Google is accused of having paid billions of dollars over more than a decade to companies such as Apple and Samsung to be the default search engine on smartphones and in browsers. Default placement has made Google the most used search engine in the world, bringing in more than $300 billion in advertising revenue a year. According to a study in the American Economic Journal: Microeconomics, in countries whose policies bar Google from paying for default status, the intervention has effectively reduced Google's local market share, with the effect especially pronounced in Russia and Turkey.

  5. China's internet regulator summons Nvidia

    The Cyberspace Administration of China (CAC) said in a statement on its website that Nvidia's compute chips have been reported to contain serious security problems. Earlier, U.S. senators had called for requiring advanced chips exported from the United States to carry "tracking and location" features, and U.S. AI experts have said that "tracking and location" and "remote shutdown" technology for Nvidia's compute chips is already mature. To safeguard Chinese users' network and data security, and in accordance with the Cybersecurity Law, the Data Security Law, and the Personal Information Protection Law, the CAC summoned Nvidia on July 31, 2025, demanding that the company explain the vulnerability and backdoor security risks in the H20 compute chips it sells to China and submit supporting materials.

  6. Nearly half of AI-generated code contains security vulnerabilities

    AI may be the future of software development, but humans are not yet ready to take their hands off the wheel. Veracode released the 2025 GenAI Code Security Report on the security of AI-generated code: more than a hundred large models completed 80 programming tasks, and roughly 45% of the AI-generated code contained security vulnerabilities, many of them falling into the OWASP (Open Worldwide Application Security Project) Top 10. The report found that when given the choice between writing secure and insecure code, the AI picked the wrong path nearly half the time.

  7. Vivo announces BlueOS, an operating system written in Rust

    Vivo Mobile Communications announced BlueOS, an operating system developed in Rust. The Rust-based BlueOS kernel supports the POSIX interface and Rust std (the standard library), and supports the ARM32, ARM64, RISCV32, and RISCV64 chip architectures. Vivo says BlueOS is the first operating system whose full stack, from kernel to system framework, is written in Rust; Rust's safety features catch memory-misuse vulnerabilities at compile time, making the system safer by design from the source.

  8. Australia extends its children's social media ban to YouTube

    Australia said on Wednesday that its social media ban for children under 16 will include YouTube, ending the exemption for the Google-owned video-sharing platform. The decision follows a regulator's survey in which 37% of minors reported encountering harmful content on YouTube, the highest share of any social media platform. The ban takes effect this December. YouTube said that nearly three quarters of Australian children aged 13 to 15 use the platform, and argued it should not be classified as social media: it mainly hosts video, it is a video-sharing platform with a large library of high-quality content, and users increasingly watch it on televisions. Meta Platforms, TikTok, and Snap, which are all covered by the ban, said that YouTube uses algorithms to recommend content based on user activity just as they do.

  9. Every continent is losing fresh water

    Satellite observations show that since 2002, driven by climate change, unsustainable groundwater use, and extreme drought, every continent has suffered unprecedented freshwater loss. Three quarters of the world's population live in the 101 countries that have been losing fresh water over the past 22 years. The UN projects that the world's population will keep growing over the next 50-60 years, even as the freshwater supply shrinks sharply. The study found that 68% of the loss comes from groundwater, and that the lost groundwater has contributed more to sea-level rise than the melting of the Greenland and Antarctic ice sheets. The years 2014-2015 may have been a turning point: since then, climate extremes have accelerated, groundwater use has grown, and the continents have been drying faster than glaciers and ice sheets are melting.

  10. Angry gamers launch a "DDoS attack" on Visa and Mastercard customer service

    After the two major game platforms Steam and itch.io restricted and delisted certain categories of adult games under pressure from payment companies, angry players are organizing a counterattack against those companies. On social media sites such as Reddit and Bluesky, players are urging one another to contact the payment giants Visa and Mastercard by email and phone. Customer service staff at both companies confirmed receiving many calls from players but said there was nothing they could do about the complaints. The support staff clearly have limited authority; the players are in effect waging a DDoS-style attack on customer service, creating enough chaos to make the payment companies pay a price.

  11. Cats confuse AI

    A standard math problem: in triangle ABC, AB = 86 and AC = 97, and the circle centered at A with radius AB intersects BC at B and X, with BX and CX both of integer length. What is BC? Now a fun aside: cats sleep most of the time. Human solvers usually skip that last sentence, but according to a preprint posted on arXiv, it more than doubles the probability that an AI model gives a wrong answer. The researchers found that adding an irrelevant passage to a math problem systematically misleads models into producing wrong answers, an attack strategy they call CatAttack. CatAttack text is unrelated to the context; human solvers ignore it, but AI models do not. In tests with DeepSeek V3, Qwen 3, and Phi-4, CatAttack raised the probability of a wrong answer by as much as 700%. Even when CatAttack did not push a reasoning model into a wrong answer, it lengthened responses, doubling response time in 16% of cases, and the slowdown drives up cost. One last note: cats are liquid.
    (A brute-force check of the geometry answer appears after this list.)

  12. Google encourages employees to use AI

    At an all-hands meeting last week, Google CEO Sundar Pichai told employees that in the AI era they need to use it to raise their productivity. He said Google is competing with other companies that have boosted their employees' productivity, so a focus on productivity is essential. Another executive, Brian Saluzzo, added that the company has built a set of tools for software engineers to help every employee become fluent in AI; engineers, he said, urgently need AI integrated into their coding workflow to speed up programming. Internally Google has launched a site called AI Savvy Google offering AI-related courses, toolkits, and learning workshops. Google's engineering education team partnered with DeepMind on a training program called "Building with Gemini", and the company has rolled out an internal coding assistant called Cider. Saluzzo said that since Cider first launched in May, 50% of its users have used the service every week.

  13. After filing for bankruptcy, Futurehome pushes firmware that strips local features and forces a subscription

    The Norwegian smart-home company Futurehome, after filing for bankruptcy, pushed a firmware update to products such as the Smarthub II that removes local functionality and puts basic features behind a paywall, forcing a subscription: unless customers pay the annual fee, even the devices' basic functions are unusable. Futurehome's Smarthub first launched in 2016, and its smart-home products, including smart thermostats, smart lighting, smart fire alarms, and carbon monoxide alarms, had since been sold as one-time purchases. After the bankruptcy filing, the company now requires customers to pay a subscription of 1,188 Norwegian kroner (about $116.56) a year to use basic features. Futurehome says the subscription fee is a necessary step to stabilize operations after the bankruptcy.

  14. Microsoft admits it cannot guarantee European countries' digital sovereignty

    Microsoft has acknowledged that, under the U.S. CLOUD Act, it cannot guarantee the digital sovereignty of customers in France or other European countries. The CLOUD Act allows the U.S. government to access data held by American technology companies even when that data is stored on servers located overseas. Testifying before the French parliament, Microsoft France representatives Anton Carniaux and Pierre Lagarde said Microsoft resists unfounded data requests but is legally obliged to comply with valid ones. According to Microsoft's transparency reports, it has so far received no U.S. government request for information stored on European servers, but geopolitical tensions have left EU countries worried. Lagarde said that over the past three years Microsoft has built a technical environment that minimizes data transfers and keeps European customers' data within the EU.

  15. Survey finds 60% of Americans use AI for search and 37% for work

    According to a survey by the Associated Press-NORC Center for Public Affairs Research, 60% of American adults use AI to search for information, while only 37% of respondents use AI for work and 40% use it for brainstorming. The survey of 1,437 adults, conducted July 10-14, found a marked generational gap in AI use: among adults under 30, 74% use AI to search for information and 62% use it to generate ideas, while among adults over 60 only 23% use AI for brainstorming. About a third of Americans use AI to write emails, to create or edit images, or for entertainment. A quarter use AI for shopping, and 16% use it for companionship, a share that reaches 25% among younger adults.
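
A quick check of the geometry problem quoted in item 11, written for this digest rather than taken from the article. By power of a point, CX * CB = AC^2 - AB^2 = 97^2 - 86^2 = 2013, so the integer possibilities can be enumerated directly:

    # Enumerate integer (BX, CX) with CX * (CX + BX) = 2013 and a valid triangle.
    AB, AC = 86, 97
    power = AC * AC - AB * AB              # power of point C w.r.t. the circle: 2013
    solutions = []
    for cx in range(1, power + 1):
        if power % cx == 0:
            bc = power // cx               # CB, since CX * CB = 2013
            bx = bc - cx
            if bx > 0 and AC - AB < bc < AC + AB:
                solutions.append((bx, cx, bc))
    print(solutions)                       # [(28, 33, 61)] -> BC = 61, with or without the cat fact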