OrangeBot.AI Digest — 2025-07-10
75 headlines across 8 sources, aggregated for this day.
Hacker News (15)
- Postgres LISTEN/NOTIFY does not scale (www.recall.ai)
- Bret Victor on why the current trend in AI is at odds with his work (dynamicland.org)
- Graphical Linear Algebra (graphicallinearalgebra.net)
- Red Hat Technical Writing Style Guide (stylepedia.net)
- Measuring the impact of AI on experienced open-source developer productivity (metr.org)
- Seven Engineers Suspended After $2.3M Bridge Includes 90-Degree Turn (www.vice.com)
- FOKS: Federated Open Key Service (foks.pub)
- Underwater turbine spinning for 6 years off Scotland's coast is a breakthrough (apnews.com)
- Flix – A powerful effect-oriented programming language (flix.dev)
- Is Gemini 2.5 good at bounding boxes? (simedw.com)
- Kite News (kite.kagi.com)
- How to prove false statements: Practical attacks on Fiat-Shamir (www.quantamagazine.org)
- Show HN: Typeform was too expensive so I built my own forms (www.ikiform.com)
- Thunderbird 140 “Eclipse” (blog.thunderbird.net)
- I used to prefer permissive licenses and now favor copyleft (vitalik.eth.limo)
GitHub Trending (15)
- Alibaba-NLP / WebAgent
🌐 WebAgent for Information Seeking built by Tongyi Lab: WebWalker & WebDancer & WebSailor https://arxiv.org/pdf/2507.02592
- WordPress / wordpress-develop
WordPress Develop, Git-ified. Synced from git://develop.git.wordpress.org/, including branches and tags! This repository is just a mirror of the WordPress subversion repository. Please include a link to a pre-existing ticket on https://core.trac.wordpress.org/ with every pull request.
- googleapis / genai-toolbox
MCP Toolbox for Databases is an open source MCP server for databases.
- LMCache / LMCache
Supercharge Your LLM with the Fastest KV Cache Layer
- forthespada / CS-Books
🔥🔥 Over 1,000 classic computer science books, plus personal notes and the resources referenced in my articles across various platforms. Book topics include C/C++, Java, Python, Go, data structures and algorithms, operating systems, backend architecture, computer systems, databases, computer networking, design patterns, front-end, and assembly, along with interview experience write-ups for campus and experienced hiring~
- ByteByteGoHq / system-design-101
Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.
- snap-stanford / Biomni
Biomni: a general-purpose biomedical AI agent
- pybind / pybind11
Seamless operability between C++11 and Python
- punkpeye / awesome-mcp-clients
A collection of MCP clients.
- FujiwaraChoki / MoneyPrinterV2
Automate the process of making money online.
- helm / helm
The Kubernetes Package Manager
- coleam00 / ai-agents-masterclass
Follow along with my AI Agents Masterclass videos! All of the code I create and use in this series on YouTube will be here for you to use and even build on top of!
- HandsOnLLM / Hands-On-Large-Language-Models
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
- volcengine / verl
verl: Volcano Engine Reinforcement Learning for LLMs
- hashicorp / terraform
Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
Product Hunt (15)
- HelloCV AI
Notion-style CV/resume site with AI on a FREE .cv domain
- Wibe for creators
A beautiful community platform to grow your audience
- Grok 4
World's most powerful AI model (still, according to Elon)
- Naya
Your entire creative journey in one beautiful digital studio
- HeronAI
Connect, visualize, and ask your business data anything.
- Pagey
Create a portfolio in minutes, not hours
- Pixelesq
Cursor for websites: AI-native no-code website builder
- Reachy Mini
A new open-source robot for your desk
- Uplodio
AI influencer manager to scale your creator marketing
- LLM SEO Index Crawler Check
Check if your website can get crawled by ChatGPT
- Builduo
Build SaaS backends with AI and deploy in minutes
- Datalink
Populate your design layers with realistic data
- Rorrim
Smart AI journal. Reveal hidden life patterns!
- Preso Budget
Manage your money with a smart budgeting platform
- Video to Playable Converter by Segwise
Convert MP4 video ads to playables in 2 minutes - free
Hugging Face (15)
- 4KAgent: Agentic Any Image to 4K Super-Resolution
We present 4KAgent, a unified agentic super-resolution generalist system designed to universally upscale any image to 4K resolution (and even higher, if applied iteratively). Our system can transform images from extremely low resolutions with severe degradations, for example, highly distorted inputs at 256x256, into crystal-clear, photorealistic 4K outputs. 4KAgent comprises three core components: (1) Profiling, a module that customizes the 4KAgent pipeline based on bespoke use cases; (2) A Perception Agent, which leverages vision-language models alongside image quality assessment experts to analyze the input image and make a tailored restoration plan; and (3) A Restoration Agent, which executes the plan, following a recursive execution-reflection paradigm, guided by a quality-driven mixture-of-expert policy to select the optimal output for each step. Additionally, 4KAgent embeds a specialized face restoration pipeline, significantly enhancing facial details in portrait and selfie photos. We rigorously evaluate our 4KAgent across 11 distinct task categories encompassing a total of 26 diverse benchmarks, setting new state-of-the-art on a broad spectrum of imaging domains. Our evaluations cover natural images, portrait photos, AI-generated content, satellite imagery, fluorescence microscopy, and medical imaging like fundoscopy, ultrasound, and X-ray, demonstrating superior performance in terms of both perceptual (e.g., NIQE, MUSIQ) and fidelity (e.g., PSNR) metrics. By establishing a novel agentic paradigm for low-level vision tasks, we aim to catalyze broader interest and innovation within vision-centric autonomous agents across diverse research communities. We will release all the code, models, and results at: https://4kagent.github.io.
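The execution-reflection loop with quality-driven expert selection described above can be caricatured in a few lines; the experts, quality metric, and acceptance rule below are illustrative stand-ins, not 4KAgent's actual pipeline:

```python
def execute_reflect(image, experts, quality_score, steps=3):
    """Hedged sketch of an execution-reflection loop: at each step, run
    every available restoration expert on the current image, score each
    candidate with a quality metric, and keep the best one. The
    reflection check only accepts a candidate that improves quality.
    4KAgent's real experts, metrics, and recursion are far richer."""
    current = image
    for _ in range(steps):
        candidates = [expert(current) for expert in experts]
        best = max(candidates, key=quality_score)
        if quality_score(best) > quality_score(current):
            current = best
    return current
```

With toy "experts" that nudge a scalar up or down and quality equal to the value itself, three steps climb monotonically, which is the intended behavior of the selection rule.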
- Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data
Generating diverse and natural human motion sequences based on textual descriptions constitutes a fundamental and challenging research area within the domains of computer vision, graphics, and robotics. Despite significant advancements in this field, current methodologies often face challenges regarding zero-shot generalization capabilities, largely attributable to the limited size of training datasets. Moreover, the lack of a comprehensive evaluation framework impedes the advancement of this task by failing to identify directions for improvement. In this work, we aim to push text-to-motion into a new era, that is, to achieve the generalization ability of zero-shot. To this end, firstly, we develop an efficient annotation pipeline and introduce MotionMillion, the largest human motion dataset to date, featuring over 2,000 hours and 2 million high-quality motion sequences. Additionally, we propose MotionMillion-Eval, the most comprehensive benchmark for evaluating zero-shot motion generation. Leveraging a scalable architecture, we scale our model to 7B parameters and validate its performance on MotionMillion-Eval. Our results demonstrate strong generalization to out-of-domain and complex compositional motions, marking a significant step toward zero-shot human motion generation. The code is available at https://github.com/VankouF/MotionMillion-Codes.
- Perception-Aware Policy Optimization for Multimodal Reasoning
Reinforcement Learning with Verifiable Rewards (RLVR) has proven to be a highly effective strategy for endowing Large Language Models (LLMs) with robust multi-step reasoning abilities. However, its design and optimizations remain tailored to purely textual domains, resulting in suboptimal performance when applied to multimodal reasoning tasks. In particular, we observe that a major source of error in current multimodal reasoning lies in the perception of visual inputs. To address this bottleneck, we propose Perception-Aware Policy Optimization (PAPO), a simple yet effective extension of GRPO that encourages the model to learn to perceive while learning to reason, entirely from internal supervision signals. Notably, PAPO does not rely on additional data curation, external reward models, or proprietary models. Specifically, we introduce the Implicit Perception Loss in the form of a KL divergence term to the GRPO objective, which, despite its simplicity, yields significant overall improvements (4.4%) on diverse multimodal benchmarks. The improvements are more pronounced, approaching 8.0%, on tasks with high vision dependency. We also observe a substantial reduction (30.5%) in perception errors, indicating improved perceptual capabilities with PAPO. We conduct comprehensive analysis of PAPO and identify a unique loss hacking issue, which we rigorously analyze and mitigate through a Double Entropy Loss. Overall, our work introduces a deeper integration of perception-aware supervision into RLVR learning objectives and lays the groundwork for a new RL framework that encourages visually grounded reasoning. Project page: https://mikewangwzhl.github.io/PAPO.
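The Implicit Perception Loss described above, a KL divergence term added to the GRPO objective, can be sketched in miniature. The arguments of the KL term (full vs. masked image conditioning), the sign, and the weight below are illustrative assumptions, not the paper's published formulation:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def perception_aware_objective(grpo_objective, p_full, p_masked, gamma=0.01):
    """Hedged sketch: augment a GRPO-style objective with a perception
    term that rewards divergence between the policy's next-token
    distribution given the full image (p_full) and a corrupted/masked
    image (p_masked), pushing the model to actually use the visual
    input. The masking scheme and gamma are assumptions for
    illustration."""
    return grpo_objective + gamma * kl_divergence(p_full, p_masked)
```

If the two distributions are identical the perception term vanishes, so a model that ignores the image gains nothing from it; any dependence on the visual input makes the term positive.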
- Rethinking Verification for LLM Code Generation: From Generation to Testing
Large language models (LLMs) have recently achieved notable success in code-generation benchmarks such as HumanEval and LiveCodeBench. However, a detailed examination reveals that these evaluation suites often comprise only a limited number of homogeneous test cases, resulting in subtle faults going undetected. This not only artificially inflates measured performance but also compromises accurate reward estimation in reinforcement learning frameworks utilizing verifiable rewards (RLVR). To address these critical shortcomings, we systematically investigate the test-case generation (TCG) task by proposing multi-dimensional metrics designed to rigorously quantify test-suite thoroughness. Furthermore, we introduce a human-LLM collaborative method (SAGA), leveraging human programming expertise with LLM reasoning capability, aimed at significantly enhancing both the coverage and the quality of generated test cases. In addition, we develop a TCGBench to facilitate the study of the TCG task. Experiments show that SAGA achieves a detection rate of 90.62% and a verifier accuracy of 32.58% on TCGBench. The Verifier Accuracy (Verifier Acc) of the code generation evaluation benchmark synthesized by SAGA is 10.78% higher than that of LiveCodeBench-v6. These results demonstrate the effectiveness of our proposed method. We hope this work contributes to building a scalable foundation for reliable LLM code evaluation, further advancing RLVR in code generation, and paving the way for automated adversarial test synthesis and adaptive benchmark integration.
- A Systematic Analysis of Hybrid Linear Attention
Transformers face quadratic complexity and memory issues with long sequences, prompting the adoption of linear attention mechanisms using fixed-size hidden states. However, linear models often suffer from limited recall performance, leading to hybrid architectures that combine linear and full attention layers. Despite extensive hybrid architecture research, the choice of linear attention component has not been deeply explored. We systematically evaluate various linear attention models across generations - vector recurrences to advanced gating mechanisms - both standalone and hybridized. To enable this comprehensive analysis, we trained and open-sourced 72 models: 36 at 340M parameters (20B tokens) and 36 at 1.3B parameters (100B tokens), covering six linear attention variants across five hybridization ratios. Benchmarking on standard language modeling and recall tasks reveals that superior standalone linear models do not necessarily excel in hybrids. While language modeling remains stable across linear-to-full attention ratios, recall significantly improves with increased full attention layers, particularly below a 3:1 ratio. Our study highlights selective gating, hierarchical recurrence, and controlled forgetting as critical for effective hybrid models. We recommend architectures such as HGRN-2 or GatedDeltaNet with a linear-to-full ratio between 3:1 and 6:1 to achieve Transformer-level recall efficiently. Our models are open-sourced at https://huggingface.co/collections/m-a-p/hybrid-linear-attention-research-686c488a63d609d2f20e2b1e.
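The recommended 3:1 to 6:1 linear-to-full ratios can be pictured with a toy layer scheduler; the interleaving rule below is one plausible placement, not the paper's exact layer ordering:

```python
def hybrid_layer_pattern(num_layers, linear_per_full):
    """Build a layer-type schedule interleaving linear-attention layers
    with full-attention layers at a given linear-to-full ratio, e.g.
    linear_per_full=3 yields a 3:1 ratio. Illustrative scheduling only."""
    pattern = []
    for i in range(num_layers):
        # One full-attention layer after every `linear_per_full` linear
        # layers; every other layer uses linear attention.
        if (i + 1) % (linear_per_full + 1) == 0:
            pattern.append("full")
        else:
            pattern.append("linear")
    return pattern
```

For an 8-layer model at a 3:1 ratio this places full attention at layers 4 and 8, keeping the fixed-size-state linear layers in the majority while retaining some full-attention layers for recall.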
- First Return, Entropy-Eliciting Explore
Reinforcement Learning from Verifiable Rewards (RLVR) improves the reasoning abilities of Large Language Models (LLMs) but it struggles with unstable exploration. We propose FR3E (First Return, Entropy-Eliciting Explore), a structured exploration framework that identifies high-uncertainty decision points in reasoning trajectories and performs targeted rollouts to construct semantically grounded intermediate feedback. Our method provides targeted guidance without relying on dense supervision. Empirical results on mathematical reasoning benchmarks (AIME24) show that FR3E promotes more stable training, produces longer and more coherent responses, and increases the proportion of fully correct trajectories. These results highlight the framework's effectiveness in improving LLM reasoning through more robust and structured exploration.
- AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs
Kernel development in deep learning requires optimizing computational units across hardware while balancing memory management, parallelism, and hardware-specific optimizations through extensive empirical tuning. Although domain-specific languages like Triton simplify GPU programming by abstracting low-level details, developers must still manually tune critical parameters such as tile sizes and memory access patterns through iterative experimentation, creating substantial barriers to optimal performance and wider adoption. In this work, we introduce AutoTriton, the first model dedicated to Triton programming powered by reinforcement learning (RL). AutoTriton performs supervised fine-tuning (SFT) to be equipped with essential Triton programming expertise using a high-quality data gathering pipeline, and conducts RL with Group Relative Policy Optimization (GRPO) algorithm, combining a rule-based reward and an execution-based reward to further improve Triton programming ability, sequentially. Experiments across five evaluation channels of TritonBench and KernelBench illustrate that our 8B model AutoTriton achieves performance comparable to mainstream large models, including Claude-4-Sonnet and DeepSeek-R1-0528. Further experimental analysis demonstrates the crucial role of each module within AutoTriton, including the SFT stage, the RL stage, and the reward design strategy. These findings underscore the promise of RL for automatically generating high-performance kernels, and since high-performance kernels are core components of AI systems, this breakthrough establishes an important foundation for building more efficient AI systems. The model and code will be available at https://github.com/AI9Stars/AutoTriton.
- Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving
Automated Theorem Proving (ATP) in formal languages is a foundational challenge for AI. While Large Language Models (LLMs) have driven remarkable progress, a significant gap remains between their powerful informal reasoning capabilities and their weak formal proving performance. Recent studies show that the informal accuracy exceeds 80% while formal success remains below 8% on benchmarks like PutnamBench. We argue this gap persists because current state-of-the-art provers, by tightly coupling reasoning and proving, are trained with paradigms that inadvertently punish deep reasoning in favor of shallow, tactic-based strategies. To bridge this fundamental gap, we propose a novel framework that decouples high-level reasoning from low-level proof generation. Our approach utilizes two distinct, specialized models: a powerful, general-purpose Reasoner to generate diverse, strategic subgoal lemmas, and an efficient Prover to rigorously verify them. This modular design liberates the model's full reasoning potential and bypasses the pitfalls of end-to-end training. We evaluate our method on a challenging set of post-2000 IMO problems, a problem set on which no prior open-source prover has reported success. Our decoupled framework successfully solves 5 of these problems, demonstrating a significant step towards automated reasoning on exceptionally difficult mathematical challenges. To foster future research, we release our full dataset of generated and verified lemmas for a wide range of IMO problems, available at https://tencent-imo.github.io/ .
- A Survey on Vision-Language-Action Models for Autonomous Driving
The rapid progress of multimodal large language models (MLLM) has paved the way for Vision-Language-Action (VLA) paradigms, which integrate visual perception, natural language understanding, and control within a single policy. Researchers in autonomous driving are actively adapting these methods to the vehicle domain. Such models promise autonomous vehicles that can interpret high-level instructions, reason about complex traffic scenes, and make their own decisions. However, the literature remains fragmented and is rapidly expanding. This survey offers the first comprehensive overview of VLA for Autonomous Driving (VLA4AD). We (i) formalize the architectural building blocks shared across recent work, (ii) trace the evolution from early explainer to reasoning-centric VLA models, and (iii) compare over 20 representative models according to VLA's progress in the autonomous driving domain. We also consolidate existing datasets and benchmarks, highlighting protocols that jointly measure driving safety, accuracy, and explanation quality. Finally, we detail open challenges - robustness, real-time efficiency, and formal verification - and outline future directions of VLA4AD. This survey provides a concise yet complete reference for advancing interpretable, socially aligned autonomous vehicles. GitHub repo is available at https://github.com/JohnsonJiang1996/Awesome-VLA4AD.
- DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models
Molecular structure elucidation from spectra is a foundational problem in chemistry, with profound implications for compound identification, synthesis, and drug development. Traditional methods rely heavily on expert interpretation and lack scalability. Pioneering machine learning methods have introduced retrieval-based strategies, but their reliance on finite libraries limits generalization to novel molecules. Generative models offer a promising alternative, yet most adopt autoregressive SMILES-based architectures that overlook 3D geometry and struggle to integrate diverse spectral modalities. In this work, we present DiffSpectra, a generative framework that directly infers both 2D and 3D molecular structures from multi-modal spectral data using diffusion models. DiffSpectra formulates structure elucidation as a conditional generation process. Its denoising network is parameterized by Diffusion Molecule Transformer, an SE(3)-equivariant architecture that integrates topological and geometric information. Conditioning is provided by SpecFormer, a transformer-based spectral encoder that captures intra- and inter-spectral dependencies from multi-modal spectra. Extensive experiments demonstrate that DiffSpectra achieves high accuracy in structure elucidation, recovering exact structures with 16.01% top-1 accuracy and 96.86% top-20 accuracy through sampling. The model benefits significantly from 3D geometric modeling, SpecFormer pre-training, and multi-modal conditioning. These results highlight the effectiveness of spectrum-conditioned diffusion modeling in addressing the challenge of molecular structure elucidation. To our knowledge, DiffSpectra is the first framework to unify multi-modal spectral reasoning and joint 2D/3D generative modeling for de novo molecular structure elucidation.
- ModelCitizens: Representing Community Voices in Online Safety
Automatic toxic language detection is critical for creating safe, inclusive online spaces. However, it is a highly subjective task, with perceptions of toxic language shaped by community norms and lived experience. Existing toxicity detection models are typically trained on annotations that collapse diverse annotator perspectives into a single ground truth, erasing important context-specific notions of toxicity such as reclaimed language. To address this, we introduce MODELCITIZENS, a dataset of 6.8K social media posts and 40K toxicity annotations across diverse identity groups. To capture the role of conversational context on toxicity, typical of social media posts, we augment MODELCITIZENS posts with LLM-generated conversational scenarios. State-of-the-art toxicity detection tools (e.g. OpenAI Moderation API, GPT-o4-mini) underperform on MODELCITIZENS, with further degradation on context-augmented posts. Finally, we release LLAMACITIZEN-8B and GEMMACITIZEN-12B, LLaMA- and Gemma-based models finetuned on MODELCITIZENS, which outperform GPT-o4-mini by 5.5% on in-distribution evaluations. Our findings highlight the importance of community-informed annotation and modeling for inclusive content moderation. The data, models and code are available at https://github.com/asuvarna31/modelcitizens.
- Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
Recent advances in language modeling have demonstrated the effectiveness of State Space Models (SSMs) for efficient sequence modeling. While hybrid architectures such as Samba and the decoder-decoder architecture, YOCO, have shown promising performance gains over Transformers, prior works have not investigated the efficiency potential of representation sharing between SSM layers. In this paper, we introduce the Gated Memory Unit (GMU), a simple yet effective mechanism for efficient memory sharing across layers. We apply it to create SambaY, a decoder-hybrid-decoder architecture that incorporates GMUs in the cross-decoder to share memory readout states from a Samba-based self-decoder. SambaY significantly enhances decoding efficiency, preserves linear pre-filling time complexity, and boosts long-context performance, all while eliminating the need for explicit positional encoding. Through extensive scaling experiments, we demonstrate that our model exhibits a significantly lower irreducible loss compared to a strong YOCO baseline, indicating superior performance scalability under large-scale compute regimes. Our largest model enhanced with Differential Attention, Phi4-mini-Flash-Reasoning, achieves significantly better performance than Phi4-mini-Reasoning on reasoning tasks such as Math500, AIME24/25, and GPQA Diamond without any reinforcement learning, while delivering up to 10x higher decoding throughput on 2K-length prompts with 32K generation length under the vLLM inference framework. We release our training codebase on open-source data at https://github.com/microsoft/ArchScale.
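The Gated Memory Unit described above can be caricatured as element-wise gating of a shared memory readout by the current hidden state; the sigmoid gate and per-dimension weights below are illustrative assumptions, not the exact SambaY formulation:

```python
import math

def gated_memory_unit(hidden, memory, gate_weights):
    """Hedged sketch of a GMU: a cross-decoder layer re-uses a memory
    readout produced by an earlier self-decoder layer instead of
    recomputing it, modulating it element-wise by a gate derived from
    the current hidden state. A per-dimension scaling stands in for a
    learned projection."""
    gate = [1.0 / (1.0 + math.exp(-w * h)) for w, h in zip(gate_weights, hidden)]
    return [g * m for g, m in zip(gate, memory)]
```

The efficiency argument is that `memory` is computed once by the self-decoder and then cheaply re-gated in later layers, rather than each layer maintaining its own expensive state.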
- Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning
Despite advances in reinforcement learning (RL)-based video reasoning with large language models (LLMs), data collection and finetuning remain significant challenges. These methods often rely on large-scale supervised fine-tuning (SFT) with extensive video data and long Chain-of-Thought (CoT) annotations, making them costly and hard to scale. To address this, we present Video-RTS, a new approach to improve video reasoning capability with drastically improved data efficiency by combining data-efficient RL with a video-adaptive test-time scaling (TTS) strategy. Based on observations about the data scaling of RL samples, we skip the resource-intensive SFT step and employ efficient pure-RL training with output-based rewards, requiring no additional annotations or extensive fine-tuning. Furthermore, to utilize computational resources more efficiently, we introduce a sparse-to-dense video TTS strategy that improves inference by iteratively adding frames based on output consistency. We validate our approach on multiple video reasoning benchmarks, showing that Video-RTS surpasses existing video reasoning models by an average of 2.4% in accuracy using only 3.6% training samples. For example, Video-RTS achieves a 4.2% improvement on Video-Holmes, a recent and challenging video reasoning benchmark, and a 2.6% improvement on MMVU. Notably, our pure RL training and adaptive video TTS offer complementary strengths, enabling Video-RTS's strong reasoning performance.
- SRT-H: A Hierarchical Framework for Autonomous Surgery via Language Conditioned Imitation Learning
Research on autonomous surgery has largely focused on simple task automation in controlled environments. However, real-world surgical applications demand dexterous manipulation over extended durations and generalization to the inherent variability of human tissue. These challenges remain difficult to address using existing logic-based or conventional end-to-end learning approaches. To address this gap, we propose a hierarchical framework for performing dexterous, long-horizon surgical steps. Our approach utilizes a high-level policy for task planning and a low-level policy for generating robot trajectories. The high-level planner plans in language space, generating task-level or corrective instructions that guide the robot through the long-horizon steps and correct for the low-level policy's errors. We validate our framework through ex vivo experiments on cholecystectomy, a commonly-practiced minimally invasive procedure, and conduct ablation studies to evaluate key components of the system. Our method achieves a 100% success rate across eight unseen ex vivo gallbladders, operating fully autonomously without human intervention. This work demonstrates step-level autonomy in a surgical procedure, marking a milestone toward clinical deployment of autonomous surgical systems.
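At its most schematic, the two-level design above (a high-level planner emitting language instructions, a low-level policy emitting actions conditioned on them) reduces to a function composition; the names and signatures below are illustrative, not the SRT-H API:

```python
def hierarchical_control(observation, high_level_planner, low_level_policy):
    """Hedged sketch of language-conditioned hierarchical control: the
    planner maps the observation to a language-space instruction (which
    may also be a corrective one), and the low-level policy maps the
    observation plus that instruction to an action/trajectory."""
    instruction = high_level_planner(observation)
    action = low_level_policy(observation, instruction)
    return action
```

The point of the split is that corrections happen in language space: swapping the planner changes which instruction the same low-level policy executes.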
- Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework
Nova Premier is Amazon's most capable multimodal foundation model and teacher for model distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework. Evaluations target three high-risk domains -- Chemical, Biological, Radiological & Nuclear (CBRN), Offensive Cyber Operations, and Automated AI R&D -- and combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report core findings. Based on this evaluation, we find that Nova Premier is safe for public release as per our commitments made at the 2025 Paris AI Safety Summit. We will continue to enhance our safety evaluation and mitigation pipelines as new risks and capabilities associated with frontier models are identified.
Solidot (15)
- Western Europe experienced its hottest June on record
The EU's Copernicus Climate Change Service (C3S) said Wednesday that Western Europe experienced its hottest June on record, the third consecutive year in which June has broken the heat record. Globally, it was the third-hottest June on record, as sustained heat continues amid warming driven by human greenhouse-gas emissions. According to C3S, Western Europe's average June temperature was 20.49°C, 2.81°C above the 1991-2020 average. The region endured two heat waves, the second of which stretched into early July, with surface temperatures exceeding 40°C in several countries and reaching 46°C in Spain and Portugal.
- Power relations between the sexes are not clear-cut
The "alpha male" is not a universal truth. Researchers from Germany's Max Planck Society and France analyzed detailed observations of male-female aggressive behavior in 253 populations of 121 primate species and found no clear-cut power relationship between the sexes. Competition between males and females is remarkably common: on average, nearly half of all aggressive interactions in a group involve both a male and a female. Researchers long assumed that primate power structures favor males, but the new study gives a different answer; male-biased power structures are in fact more the exception. Of the 151 populations analyzed, male dominance was observed in only 25 and female dominance in 16, while roughly 70% had neutral power structures with no sex bias. Among terrestrial populations, male dominance was more common where males had greater body size and weaponry than females. The researchers say male primates gain power through force and coercion, while females gain power through other strategies, such as reproductive ones.
- OpenAI to release an AI web browser to challenge Chrome
OpenAI is preparing to release an AI-powered web browser to challenge Google Chrome, which dominates the browser market. Expected within weeks, the browser aims to use AI to fundamentally change how consumers browse the web. It would also give OpenAI direct access to a cornerstone of Google's success: user data. Chrome is a pillar of Alphabet's advertising business, supplying user information that helps Alphabet target ads more effectively and profitably, and giving Google a way to route search traffic to its own engine by default. Chrome has as many as 3 billion users, while OpenAI's ChatGPT has 500 million weekly active users; OpenAI's browser is built on Google's open-source Chromium.
- McDonald's AI hiring platform's admin password was 123456
Anyone applying for a McDonald's job today may first need to chat with Olivia, an AI chatbot on the McHire.com platform. Olivia asks applicants for personal information and a résumé and administers a personality test. The chatbot is provided by Paradox.ai. After hearing that McDonald's was using an AI chatbot to screen applicants, security researchers Ian Carroll and Sam Curry took a curious look at McHire.com and unexpectedly discovered that the platform's admin username and password were both 123456. After logging into the admin panel, they could access a Paradox.ai account and query the company's database of all McHire users' chats with Olivia. The database held as many as 64 million records, including applicants' names, email addresses, and phone numbers. McDonald's said Paradox.ai was responsible for the vulnerability; Paradox.ai confirmed it and fixed it within a day.
- Nvidia's market value tops $4 trillion
Nvidia, the dominant hardware supplier for generative AI, saw its shares rise more than 2% on Wednesday, pushing its market capitalization past $4 trillion and making it the first company in history to reach that valuation. Nvidia is now the world's most valuable company, surpassing Microsoft and Apple, both of which crossed $3 trillion before Nvidia did but have yet to reach $4 trillion. Headquartered in California and founded in 1993, Nvidia has seen its stock soar amid the generative-AI boom sparked by ChatGPT, first topping $2 trillion in February 2024 and $3 trillion that June.
- US tech giants slow to act on Treasury sanctions list
Earlier this year, after the International Criminal Court in The Hague issued arrest warrants for war crimes against Israeli Prime Minister Netanyahu and former Defense Minister Yoav Gallant, US President Trump imposed sanctions on the ICC, and Microsoft immediately blocked the email account of chief prosecutor Karim Khan. But US tech giants often respond to sanctions lists far less quickly than Microsoft did. On May 29, the US Treasury imposed economic sanctions on Funnull Technology Inc. and its operator, Liu Lizhi, a 40-year-old from Shanghai also known as Liu "Steve" Lizhi, XXL4, and Nice Lizhi; the cloud provider Funnull is accused of facilitating financial scams that cost Americans more than $200 million. Under US law, American companies are barred from continuing to do business with sanctioned individuals. An investigation found that more than a month later, Facebook, GitHub, PayPal, and Twitter/X still had not closed Liu's accounts. His LinkedIn account was deleted within hours of a request for comment.
- A pod of sperm whales filmed sleeping in an upright posture
A pod of sperm whales was recently observed sleeping head-up in a "standing" posture in waters near Amami Oshima, Kagoshima Prefecture, Japan. Katsuki Oki, head of the Amami Marine Life Research Association, spotted and filmed the scene on June 23 in waters about 15 km west of the island. "It was the largest pod I've ever seen, and the first time I'd seen them sleeping. I was deeply moved," he recalled. Of the roughly 20 whales found at a depth of about 3 meters, four or five in the center appeared to be sleeping upright; the longest was about 14 meters. Sperm whales sleep for only about 2 hours a day, making such a sighting a rare treat.
- 2.3 million Chrome and Edge users installed extensions that hijack browser sessions
A color-picker extension that helps developers pick colors and carries a Google verification badge looks harmless, but security researchers say it hijacks browser sessions, tracks web activity, and plants a backdoor in victims' browsers. The extension, named Geco, has been downloaded over 100,000 times and has 800 reviews with a 4.2/5-star rating. Researchers at security firm Koi Security say it is part of a browser-hijacking campaign dubbed RedDirection involving 18 malicious extensions and affecting more than 2.3 million Chrome and Edge users. The extensions initially contained no malicious code, which is how they earned Google's verification; the malicious code was added in later updates.
- Study estimates most future stomach cancer cases will be linked to H. pylori infection
According to a study published in Nature Medicine, researchers at the France-based International Agency for Research on Cancer estimated the future stomach cancer burden for the generation born between 2008 and 2017. Stomach cancer is the world's fifth-leading cause of cancer death, and its close link to Helicobacter pylori infection has long been established, yet global investment in controlling this preventable cancer has remained inadequate. Especially worrying, stomach cancer incidence among people under 50 has been rising in both high- and low-risk regions, and population aging will further increase the disease burden. The researchers estimate 15.6 million lifetime stomach cancer cases among people born 2008-2017, 76% of them attributable to H. pylori. Asia is expected to account for 10.6 million cases (68% of the global total), led by East Asia (5.9 million) and South Asia (2.9 million); China and India together account for 42% of the world's preventable cases.
- The hippocampus keeps generating new neurons into adulthood
According to a study published in Science, the hippocampus, the brain region governing memory, continues to generate new neurons into adulthood and even old age. The study answers a long-debated core question: whether the adult human brain retains plasticity. The hippocampus is a key region for learning and memory and is also closely tied to emotional regulation. The team analyzed brain tissue from people aged 0 to 78 held in several international biobanks, using single-nucleus RNA sequencing to profile gene activity in individual cell nuclei, combined with flow cytometry to characterize the cells. With machine-learning algorithms they traced neurons' development from stem cells to immature stages and identified several cell subpopulations still undergoing division. The newborn cells were concentrated in the hippocampus's dentate gyrus, a structure key to memory formation, learning, and cognitive flexibility. Adult human neural precursor cells resemble those of mice, pigs, and monkeys in many respects but show some differences in gene activity. Individual variation was striking: some adult samples had abundant neural precursor cells, while others had almost none.
- Ancient skeleton found in a Haifa cave may be a human-Neanderthal hybrid
After leaving Africa, modern humans' ancestors encountered and interbred with Neanderthals, which is why modern humans carry some Neanderthal DNA while Neanderthals vanished into history; one view holds that Neanderthals did not go extinct but were absorbed by their close relatives. In 1929, archaeologists excavated seven adult and three child skeletons, dating back 140,000 years, from Skhul Cave south of Haifa, Israel. One of them, a child roughly 3-5 years old, shows both modern human and Neanderthal features and may be a hybrid. Researchers reconstructed the skeleton using CT scans and compared it with three Neanderthal skulls held at the Musée de l'Homme in Paris. The study confirmed the child was a hybrid of early humans and Neanderthals.
- Scientists directly observe anti-Klein tunneling for the first time
Chinese researchers have made the first direct observation of anti-Klein tunneling (AKT), a quantum paradox in which chiral particles hitting a potential barrier are completely reflected rather than transmitted. Klein tunneling is a famous paradox of quantum physics: massless relativistic particles can ignore an energy barrier and pass through freely without reflection. Its counterpart, anti-Klein tunneling, predicts that for massive particles with chirality, the barrier causes total reflection. This exotic transport behavior had long existed only in theoretical derivations and indirect evidence, without experimental verification. In the study, the team designed a structurally tunable bilayer phononic crystal that introduces chirality and mass into its acoustic dispersion relation, making the system's phonons analogous to chiral quasiparticles in bilayer graphene. When sound waves hit a barrier built from this structure, their behavior depends on the structural parameters: in one configuration the waves are totally reflected (anti-Klein tunneling), while in another they pass through completely (Klein tunneling).
- TikTok plans to launch a US-only version in September
Last month US President Trump granted TikTok a third 90-day reprieve; TikTok must sell its US operations to a US consortium by September 17 or face a ban. The Information reports that, if the sale to a US consortium is approved, TikTok has developed a US-only version of the app planned for launch on September 5. All US TikTok users would be prompted to switch to the new app by March 2026, when the original app would stop working. It is unclear how TikTok's US version will differ from the global one.
- Open-source tool helps the internet fend off AI crawlers
AI crawlers have long since overtaken search-engine crawlers as a source of strain on many web servers: they crawl more frequently, have an insatiable appetite for content, and usually ignore robots.txt. The Wikimedia Foundation said earlier this year that AI crawlers had increased its bandwidth consumption by 50%. Anubis, a proof-of-work open-source tool released in January by Canadian developer Xe Iaso, helps websites fend off endless requests from AI crawlers. It has been downloaded nearly 200,000 times and is used by well-known open-source projects including the GNOME desktop project, the Linux kernel mailing-list archive and Git servers, FFmpeg, Wine, and FreeCAD, as well as UN organizations such as UNESCO. Anubis verifies whether a visitor is human or a bot by requiring the browser to perform a cryptographic computation in JavaScript; browsers complete it automatically, but AI crawlers are blocked unless they impersonate browser users, which would sharply increase their computing costs to unaffordable levels. Since some users disable JavaScript for privacy and other reasons, Xe Iaso is developing a JavaScript-free verification method.
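The JavaScript proof-of-work check Anubis performs is essentially a hashcash-style puzzle; the sketch below shows the idea in miniature (the challenge format and difficulty here are assumptions, not Anubis's actual protocol):

```python
import hashlib

def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Find a nonce such that SHA-256(challenge + nonce) begins with
    `difficulty_bits` zero bits: cheap for one human visit, expensive
    at crawler scale, since solving costs ~2**difficulty_bits hash
    attempts on average."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        # Compare the hash's leading bits against zero via its integer value.
        if int(digest, 16) >> (256 - difficulty_bits) == 0:
            return nonce
        nonce += 1

def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Verification is a single hash, so the server's cost stays tiny."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return int(digest, 16) >> (256 - difficulty_bits) == 0
```

This asymmetry (one hash to verify, many to solve) is what lets a site impose a per-request cost on crawlers without burdening ordinary visitors.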
- Thunderbird 140 ESR released
The open-source email client Thunderbird has released v140 ESR (a long-term support release), which will receive a year of security updates. Compared with the non-ESR release, it mainly targets enterprise and education markets, where frequent updates that risk breaking compatibility are unwelcome. The main changes in v140 ESR reflect this focus: enterprise policies allowing fine-grained control of in-app notifications; message-action buttons included in new-mail alerts; Mark as Read and Delete actions added to mail notifications; "Mark as Spam" and star actions added to mail notifications; and more.