OrangeBot.AI Digest — 2025-06-24

73 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Fun with uv and PEP 723 (www.cottongeeks.com)
  2. Man 'refused entry into US' as border control catch him with bald JD Vance meme (www.dublinlive.ie)
  3. iPhone customers upset by Apple Wallet ad pushing F1 movie (techcrunch.com)
  4. MCP is eating the world (www.stainless.com)
  5. Writing toy software is a joy (blog.jsbarretto.com)
  6. PlasticList – Plastic Levels in Foods (www.plasticlist.org)
  7. The bitter lesson is coming for tokenization (lucalp.dev)
  8. Finding a 27-year-old easter egg in the Power Mac G3 ROM (www.downtowndougbrown.com)
  9. Basic Facts about GPUs (damek.github.io)
  10. SourceHut moves business operations from US to Europe (lists.sr.ht)
  11. Starship: The minimal, fast, and customizable prompt for any shell (starship.rs)
  12. Microplastics shed by food packaging are contaminating our food, study finds (www.cnn.com)
  13. Switching Pip to Uv in a Dockerized Flask / Django App (nickjanetakis.com)
  14. Tell HN: Meta developer account suspended
  15. The NO FAKES act has changed, and it's worse (www.eff.org)

GitHub Trending (13)

  1. DrKLO / Telegram

    Telegram for Android source

  2. patchy631 / ai-engineering-hub

    In-depth tutorials on LLMs, RAGs and real-world AI agent applications.

  3. microsoft / edit

    We all edit.

  4. HarbourMasters / SpaghettiKart
  5. jujumilk3 / leaked-system-prompts

    Collection of leaked system prompts

  6. musistudio / claude-code-router

    Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

  7. isledecomp / isle-portable

    A portable version of LEGO Island (1997)

  8. typst / typst

    A new markup-based typesetting system that is powerful and easy to learn.

  9. Effect-TS / effect

    Build production-ready applications in TypeScript

  10. microsoft / playwright

    Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

  11. ml-tooling / best-of-ml-python

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

  12. microsoft / Web-Dev-For-Beginners

    24 Lessons, 12 Weeks, Get Started as a Web Developer

  13. poteto / hiring-without-whiteboards

    ⭐️ Companies that don't have a broken hiring process

Product Hunt (15)

  1. Pally - AI Relationship Management

    All your connections, across all your socials.

  2. Pythagora 2.0

    World's first all-in-one AI dev platform

  3. Runbear

    Your best new hire, but AI — in Slack!

  4. Cekura

    Launch reliable voice & chat AI agents 10x faster

  5. SmythOS

    The open source agent OS

  6. Zen Agents (by Zencoder)

    Build AI agents. Share org-wide. 100+ Tools & MCP

  7. Riley Parenting App

    The only parenting app you'll ever need

  8. 11.ai by ElevenLabs

    The voice-first AI assistant that takes action

  9. Griply 3.0

    Time blocking, calendar, and goal roadmap in a simple macOS app

  10. Simular Cloud

    Your autonomous computer in the cloud

  11. Overflow AI

    Turn questions about your donations into advanced insights

  12. Neocal

    AI-powered modern calendar that helps you plan

  13. Arcade Desktop 2.0

    Capture product demos anywhere — no extensions required.

  14. Tortuga Outdoor

    3D mapping-based social network dedicated to outdoor sports

  15. Brand Stori

    Context-aware AI that audits your website like a buyer

Hugging Face (15)

  1. RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models

    Recent multi-modal large language models (MLLMs) often struggle to generate personalized image captions, even when trained on high-quality captions. In this work, we observe that such limitations persist in existing post-training-based MLLM personalization methods. Specifically, despite being post-tuned with large-scale caption data through supervised fine-tuning (SFT), these models frequently fail to produce faithful descriptions in real-world scenarios, such as multi-concept image captioning. However, acquiring large-scale, high-quality captions for such complex settings is both costly and difficult. To address the data-centric nature of SFT, we propose a reinforcement learning (RL)-based post-training framework. To the best of our knowledge, this is the first RL-based approach to post-train MLLMs for personalized image captioning. Our method significantly enhances both visual recognition and personalized generation capabilities of MLLMs, and consistently outperforms existing SFT-based baselines, especially in the challenging multi-concept image captioning task.

  2. 4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time

    Can we scale 4D pretraining to learn general space-time representations that reconstruct an object from a few views at some times to any view at any time? We provide an affirmative answer with 4D-LRM, the first large-scale 4D reconstruction model that takes input from unconstrained views and timestamps and renders arbitrary novel view-time combinations. Unlike prior 4D approaches, e.g., optimization-based, geometry-based, or generative, that struggle with efficiency, generalization, or faithfulness, 4D-LRM learns a unified space-time representation and directly predicts per-pixel 4D Gaussian primitives from posed image tokens across time, enabling fast, high-quality rendering at, in principle, infinite frame rate. Our results demonstrate that scaling spatiotemporal pretraining enables accurate and efficient 4D reconstruction. We show that 4D-LRM generalizes to novel objects, interpolates across time, and handles diverse camera setups. It reconstructs 24-frame sequences in one forward pass in less than 1.5 seconds on a single A100 GPU.
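
    As an illustration of what "per-pixel 4D Gaussian primitives" might look like as a data structure, here is a hypothetical parameter layout in Python; the abstract does not specify the exact parameterization, so the field names and shapes below are assumptions.

      from dataclasses import dataclass
      import torch

      @dataclass
      class Gaussian4D:
          # Hypothetical space-time Gaussian primitive (one per pixel token).
          mean_xyzt: torch.Tensor  # (4,) center in space and time
          scale: torch.Tensor      # (4,) per-axis extent
          rotation: torch.Tensor   # (4,) quaternion for spatial orientation
          opacity: torch.Tensor    # (1,)
          color: torch.Tensor      # (3,) RGB appearance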

  3. Spec2RTL-Agent: Automated Hardware Code Generation from Complex Specifications Using LLM Agent Systems

    Despite recent progress in generating hardware RTL code with LLMs, existing solutions still suffer from a substantial gap between practical application scenarios and the requirements of real-world RTL code development. Prior approaches either focus on overly simplified hardware descriptions or depend on extensive human guidance to process complex specifications, limiting their scalability and automation potential. In this paper, we address this gap by proposing an LLM agent system, termed Spec2RTL-Agent, designed to directly process complex specification documentation and generate corresponding RTL code implementations, advancing LLM-based RTL code generation toward more realistic application settings. To achieve this goal, Spec2RTL-Agent introduces a novel multi-agent collaboration framework that integrates three key enablers: (1) a reasoning and understanding module that translates specifications into structured, step-by-step implementation plans; (2) a progressive coding and prompt optimization module that iteratively refines the code across multiple representations to enhance correctness and synthesizability for RTL conversion; and (3) an adaptive reflection module that identifies and traces the source of errors during generation, ensuring a more robust code generation flow. Instead of directly generating RTL from natural language, our system strategically generates synthesizable C++ code, which is then optimized for HLS. This agent-driven refinement ensures greater correctness and compatibility compared to naive direct RTL generation approaches. We evaluate Spec2RTL-Agent on three specification documents, showing it generates accurate RTL code with up to 75% fewer human interventions than existing methods. This highlights its role as the first fully automated multi-agent system for RTL generation from unstructured specs, reducing reliance on human effort in hardware design.

  4. Demystifying the Visual Quality Paradox in Multimodal Large Language Models

    Recent Multimodal Large Language Models (MLLMs) excel on benchmark vision-language tasks, yet little is known about how input visual quality shapes their responses. Does higher perceptual quality of images already translate to better MLLM understanding? We conduct the first systematic study spanning leading MLLMs and a suite of vision-language benchmarks, applying controlled degradations and stylistic shifts to each image. Surprisingly, we uncover a visual-quality paradox: model, task, and even individual-instance performance can improve when images deviate from human-perceived fidelity. Off-the-shelf restoration pipelines fail to reconcile these idiosyncratic preferences. To close the gap, we introduce Visual-Quality Test-Time Tuning (VQ-TTT), a lightweight adaptation module that: (1) inserts a learnable, low-rank kernel before the frozen vision encoder to modulate frequency content; and (2) fine-tunes only shallow vision-encoder layers via LoRA. VQ-TTT dynamically adjusts each input image in a single forward pass, aligning it with task-specific model preferences. Across the evaluated MLLMs and all datasets, VQ-TTT significantly lifts average accuracy, with no external models, cached features, or extra training data. These findings redefine "better" visual inputs for MLLMs and highlight the need for adaptive, rather than universally "clean," imagery in the new era of AI being the main data customer.
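
    A minimal Python sketch of the first ingredient described above: a learnable low-rank map modulating the input to a frozen encoder. The class name, shapes, and the elementwise form of the modulation are assumptions (the paper modulates frequency content), not the authors' code.

      import torch
      import torch.nn as nn

      class LowRankModulator(nn.Module):
          # Learnable rank-r multiplicative map applied to the input image.
          def __init__(self, size=224, rank=4):
              super().__init__()
              self.U = nn.Parameter(torch.randn(size, rank) * 0.01)
              self.V = nn.Parameter(torch.randn(size, rank) * 0.01)

          def forward(self, x):
              # x: (batch, channels, H, W); 1 + U @ V^T is an (H, W) map.
              mod = 1.0 + self.U @ self.V.T
              return x * mod  # broadcast over batch and channel dims

      modulator = LowRankModulator()
      adapted = modulator(torch.randn(1, 3, 224, 224))  # one forward pass per image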

  5. 3D Arena: An Open Platform for Generative 3D Evaluation

    Evaluating Generative 3D models remains challenging due to misalignment between automated metrics and human perception of quality. Current benchmarks rely on image-based metrics that ignore 3D structure or geometric measures that fail to capture perceptual appeal and real-world utility. To address this gap, we present 3D Arena, an open platform for evaluating image-to-3D generation models through large-scale human preference collection using pairwise comparisons. Since launching in June 2024, the platform has collected 123,243 votes from 8,096 users across 19 state-of-the-art models, establishing the largest human preference evaluation for Generative 3D. We contribute the iso3d dataset of 100 evaluation prompts and demonstrate quality control achieving 99.75% user authenticity through statistical fraud detection. Our ELO-based ranking system provides reliable model assessment, with the platform becoming an established evaluation resource. Through analysis of this preference data, we present insights into human preference patterns. Our findings reveal preferences for visual presentation features, with Gaussian splat outputs achieving a 16.6 ELO advantage over meshes and textured models receiving a 144.1 ELO advantage over untextured models. We provide recommendations for improving evaluation methods, including multi-criteria assessment, task-oriented evaluation, and format-aware comparison. The platform's community engagement establishes 3D Arena as a benchmark for the field while advancing understanding of human-centered evaluation in Generative 3D.
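
    For reference, a minimal Python sketch of a standard ELO update from one pairwise vote, the kind of ranking the platform is described as using; the K-factor and starting ratings are illustrative, not the platform's actual parameters.

      def elo_update(r_a, r_b, a_wins, k=32.0):
          # Expected score of A under the logistic ELO model.
          expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
          score_a = 1.0 if a_wins else 0.0
          r_a_new = r_a + k * (score_a - expected_a)
          r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
          return r_a_new, r_b_new

      # One pairwise vote: model A beats model B.
      print(elo_update(1500.0, 1500.0, a_wins=True))  # (1516.0, 1484.0)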

  6. Audit & Repair: An Agentic Framework for Consistent Story Visualization in Text-to-Image Diffusion Models

    Story visualization has become a popular task where visual scenes are generated to depict a narrative across multiple panels. A central challenge in this setting is maintaining visual consistency, particularly in how characters and objects persist and evolve throughout the story. Despite recent advances in diffusion models, current approaches often fail to preserve key character attributes, leading to incoherent narratives. In this work, we propose a collaborative multi-agent framework that autonomously identifies, corrects, and refines inconsistencies across multi-panel story visualizations. The agents operate in an iterative loop, enabling fine-grained, panel-level updates without re-generating entire sequences. Our framework is model-agnostic and flexibly integrates with a variety of diffusion models, including rectified flow transformers such as Flux and latent diffusion models such as Stable Diffusion. Quantitative and qualitative experiments show that our method outperforms prior approaches in terms of multi-panel consistency.

  7. How Alignment Shrinks the Generative Horizon

    Despite their impressive capabilities, aligned large language models (LLMs) often generate outputs that lack diversity. What drives this stability in the generation? We investigate this phenomenon through the lens of probability concentration in the model's output distribution. To quantify this concentration, we introduce the Branching Factor (BF) -- a token-invariant measure of the effective number of plausible next steps during generation. Our empirical analysis reveals two key findings: (1) BF often decreases as generation progresses, suggesting that LLMs become more predictable as they generate. (2) alignment tuning substantially sharpens the model's output distribution from the outset, reducing BF by nearly an order of magnitude (e.g., from 12 to 1.2) relative to base models. This stark reduction helps explain why aligned models often appear less sensitive to decoding strategies. Building on this insight, we find this stability has surprising implications for complex reasoning. Aligned Chain-of-Thought (CoT) models (e.g., DeepSeek-distilled models), for instance, leverage this effect; by generating longer reasoning chains, they push generation into later, more deterministic (lower BF) stages, resulting in more stable outputs. We hypothesize that alignment tuning does not fundamentally change a model's behavior, but instead steers it toward stylistic tokens (e.g., "Sure") that unlock low-entropy trajectories already present in the base model. This view is supported by nudging experiments, which show that prompting base models with such tokens can similarly reduce BF. Together, our findings establish BF as a powerful diagnostic for understanding and controlling LLM outputs - clarifying how alignment reduces variability, how CoT promotes stable generations, and how base models can be steered away from diversity.
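
    One natural way to instantiate a "token-invariant measure of the effective number of plausible next steps" is the exponentiated entropy of the next-token distribution; the Python sketch below assumes that definition, which may differ from the paper's exact formulation.

      import math

      def effective_branching_factor(probs):
          # exp(H(p)): ~N for a uniform distribution over N tokens, ~1 when peaked.
          entropy = -sum(p * math.log(p) for p in probs if p > 0)
          return math.exp(entropy)

      print(effective_branching_factor([0.25] * 4))                # 4.0
      print(effective_branching_factor([0.97, 0.01, 0.01, 0.01]))  # ~1.18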

  8. DIP: Unsupervised Dense In-Context Post-training of Visual Representations

    We introduce DIP, a novel unsupervised post-training method designed to enhance dense image representations in large-scale pretrained vision encoders for in-context scene understanding. Unlike prior approaches that rely on complex self-distillation architectures, our method trains the vision encoder using pseudo-tasks that explicitly simulate downstream in-context scenarios, inspired by meta-learning principles. To enable post-training on unlabeled data, we propose an automatic mechanism for generating in-context tasks that combines a pretrained diffusion model and the vision encoder itself. DIP is simple, unsupervised, and computationally efficient, requiring less than 9 hours on a single A100 GPU. By learning dense representations through pseudo in-context tasks, it achieves strong performance across a wide variety of downstream real-world in-context scene understanding tasks. It outperforms both the initial vision encoder and prior methods, offering a practical and effective solution for improving dense representations. Code available here: https://github.com/sirkosophia/DIP

  9. GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning

    Medical visual question answering aims to support clinical decision-making by enabling models to answer natural language questions based on medical images. While recent advances in multi-modal learning have significantly improved performance, current methods still suffer from limited answer reliability and poor interpretability, impairing the ability of clinicians and patients to understand and trust model-generated answers. To address this, this work first proposes a Thinking with Visual Grounding (ThinkVG) dataset wherein the answer generation is decomposed into intermediate reasoning steps that explicitly ground relevant visual regions of the medical image, thereby providing fine-grained explainability. Furthermore, we introduce a novel verifiable reward mechanism for reinforcement learning to guide post-training, improving the alignment between the model's reasoning process and its final answer. Remarkably, our method achieves comparable performance using only one-eighth of the training data, demonstrating the efficiency and effectiveness of the proposal. The dataset is available at https://huggingface.co/datasets/BoKelvin/GEMeX-ThinkVG.

  10. TPTT: Transforming Pretrained Transformer into Titans

    Recent advances in large language models (LLMs) have led to remarkable progress in natural language processing, but their computational and memory demands remain a significant challenge, particularly for long-context inference. We introduce TPTT (Transforming Pretrained Transformer into Titans), a novel framework for enhancing pretrained Transformer models with efficient linearized attention mechanisms and advanced memory management. TPTT employs techniques such as Memory as Gate (MaG) and mixed linearized attention (LiZA). It is fully compatible with the Hugging Face Transformers library, enabling seamless adaptation of any causal LLM through parameter-efficient fine-tuning (LoRA) without full retraining. We show the effectiveness of TPTT on the MMLU benchmark with models of approximately 1 billion parameters, observing substantial improvements in both efficiency and accuracy. For instance, Titans-Llama-3.2-1B achieves a 20% increase in Exact Match (EM) over its baseline. Statistical analyses and comparisons with recent state-of-the-art methods confirm the practical scalability and robustness of TPTT. Code is available at https://github.com/fabienfrfr/tptt and as a Python package at https://pypi.org/project/tptt/.
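
    The abstract says any causal LLM on Hugging Face can be adapted via parameter-efficient fine-tuning (LoRA). A generic sketch of that mechanism with the peft library follows; the model name, rank, and target modules are illustrative choices, not TPTT's actual configuration.

      from transformers import AutoModelForCausalLM
      from peft import LoraConfig, get_peft_model

      base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
      config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM",
                          target_modules=["q_proj", "v_proj"])
      model = get_peft_model(base, config)  # wraps attention projections with LoRA
      model.print_trainable_parameters()    # only the adapter weights train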

  11. 4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation

    We propose the first framework capable of computing a 4D spatio-temporal grid of video frames and 3D Gaussian particles for each time step using a feed-forward architecture. Our architecture has two main components, a 4D video model and a 4D reconstruction model. In the first part, we analyze current 4D video diffusion architectures that perform spatial and temporal attention either sequentially or in parallel within a two-stream design. We highlight the limitations of existing approaches and introduce a novel fused architecture that performs spatial and temporal attention within a single layer. The key to our method is a sparse attention pattern, where tokens attend to others in the same frame, at the same timestamp, or from the same viewpoint. In the second part, we extend existing 3D reconstruction algorithms by introducing a Gaussian head, a camera token replacement algorithm, and additional dynamic layers and training. Overall, we establish a new state of the art for 4D generation, improving both visual quality and reconstruction capability.
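
    The sparse attention pattern described above, where a token attends to others in the same frame, at the same timestamp, or from the same viewpoint, can be expressed as a boolean mask. A Python sketch under an assumed flat token layout (one id triple per token):

      import torch

      def fused_view_time_mask(frame_ids, time_ids, view_ids):
          # (N, N) boolean mask: True where attention is allowed.
          same_frame = frame_ids[:, None] == frame_ids[None, :]
          same_time = time_ids[:, None] == time_ids[None, :]
          same_view = view_ids[:, None] == view_ids[None, :]
          return same_frame | same_time | same_view

      # Three frames spanning two views and two timestamps, 2 tokens each:
      f = torch.tensor([0, 0, 1, 1, 2, 2])
      t = torch.tensor([0, 0, 1, 1, 0, 0])
      v = torch.tensor([0, 0, 0, 0, 1, 1])
      mask = fused_view_time_mask(f, t, v)  # token 0 reaches 1 (frame), 2-3 (view), 4-5 (time)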

  12. CultureMERT: Continual Pre-Training for Cross-Cultural Music Representation Learning

    Recent advances in music foundation models have improved audio representation learning, yet their effectiveness across diverse musical traditions remains limited. We introduce CultureMERT-95M, a multi-culturally adapted foundation model developed to enhance cross-cultural music representation learning and understanding. To achieve this, we propose a two-stage continual pre-training strategy that integrates learning rate re-warming and re-decaying, enabling stable adaptation even with limited computational resources. Training on a 650-hour multi-cultural data mix, comprising Greek, Turkish, and Indian music traditions, results in an average improvement of 4.9% in ROC-AUC and AP across diverse non-Western music auto-tagging tasks, surpassing prior state-of-the-art, with minimal forgetting on Western-centric benchmarks. We further investigate task arithmetic, an alternative approach to multi-cultural adaptation that merges single-culture adapted models in the weight space. Task arithmetic performs on par with our multi-culturally trained model on non-Western auto-tagging tasks and shows no regression on Western datasets. Cross-cultural evaluation reveals that single-culture models transfer with varying effectiveness across musical traditions, whereas the multi-culturally adapted model achieves the best overall performance. To support research on world music representation learning, we publicly release CultureMERT-95M and CultureMERT-TA-95M, fostering the development of more culturally aware music foundation models.
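
    A minimal sketch of a learning-rate schedule with re-warming and re-decaying for a continual pre-training stage, as mentioned above; the warmup length, peak, floor, and cosine shape are illustrative assumptions, not the paper's settings.

      import math

      def rewarm_redecay_lr(step, warmup=1000, total=50000, peak=1e-4, floor=1e-6):
          if step < warmup:  # linear re-warm from the floor back up to the peak
              return floor + (peak - floor) * step / warmup
          progress = (step - warmup) / (total - warmup)  # cosine re-decay to the floor
          return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * progress))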

  13. TC-Light: Temporally Consistent Relighting for Dynamic Long Videos

    Editing illumination in long videos with complex dynamics has significant value in various downstream tasks, including visual content creation and manipulation, as well as data scaling up for embodied AI through sim2real and real2real transfer. Nevertheless, existing video relighting techniques are predominantly limited to portrait videos or are bottlenecked by temporal consistency and computational efficiency. In this paper, we propose TC-Light, a novel paradigm characterized by the proposed two-stage post optimization mechanism. Starting from the video preliminarily relighted by an inflated video relighting model, it optimizes appearance embedding in the first stage to align global illumination. Then it optimizes the proposed canonical video representation, i.e., Unique Video Tensor (UVT), to align fine-grained texture and lighting in the second stage. To comprehensively evaluate performance, we also establish a long and highly dynamic video benchmark. Extensive experiments show that our method enables physically plausible relighting results with superior temporal coherence and low computation cost. The code and video demos are available at https://dekuliutesla.github.io/tclight/.

  14. Steering Conceptual Bias via Transformer Latent-Subspace Activation

    This work examines whether activating latent subspaces in language models (LLMs) can steer scientific code generation toward a specific programming language. Five causal LLMs were first evaluated on scientific coding prompts to quantify their baseline bias among four programming languages. A static neuron-attribution method, perturbing the highest activated MLP weight for a C++ or CPP token, proved brittle and exhibited limited generalization across prompt styles and model scales. To address these limitations, a gradient-refined adaptive activation steering framework (G-ACT) was developed: per-prompt activation differences are clustered into a small set of steering directions, and lightweight per-layer probes are trained and refined online to select the appropriate steering vector. In LLaMA-3.2 3B, this approach reliably biases generation toward the CPP language, raising average probe classification accuracy by 15% overall and by 61.5% in the early layers (0-6) relative to the standard ACT framework. For LLaMA-3.3 70B, where attention-head signals become more diffuse, targeted injections at key layers still improve language selection. Although per-layer probing introduces a modest inference overhead, it remains practical by steering only a subset of layers and enables reproducible model behavior. These results demonstrate a scalable, interpretable and efficient mechanism for concept-level control in practical agentic systems.
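
    In the spirit of the activation steering described above, here is a generic Python sketch that adds a steering vector to a layer's hidden states via a forward hook; it omits G-ACT's probe-based per-layer selection, and the layer index, scale, and direction are placeholders.

      import torch

      def make_steering_hook(direction, alpha=2.0):
          def hook(module, inputs, output):
              hidden = output[0] if isinstance(output, tuple) else output
              steered = hidden + alpha * direction  # push activations along the steering direction
              return (steered, *output[1:]) if isinstance(output, tuple) else steered
          return hook

      # Usage with a loaded causal LM (names are placeholders):
      # handle = model.model.layers[4].register_forward_hook(make_steering_hook(cpp_direction))
      # ... generate ...
      # handle.remove()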

  15. From Virtual Games to Real-World Play

    We introduce RealPlay, a neural network-based real-world game engine that enables interactive video generation from user control signals. Unlike prior works focused on game-style visuals, RealPlay aims to produce photorealistic, temporally consistent video sequences that resemble real-world footage. It operates in an interactive loop: users observe a generated scene, issue a control command, and receive a short video chunk in response. To enable such realistic and responsive generation, we address key challenges including iterative chunk-wise prediction for low-latency feedback, temporal consistency across iterations, and accurate control response. RealPlay is trained on a combination of labeled game data and unlabeled real-world videos, without requiring real-world action annotations. Notably, we observe two forms of generalization: (1) control transfer: RealPlay effectively maps control signals from virtual to real-world scenarios; and (2) entity transfer: although training labels originate solely from a car racing game, RealPlay generalizes to control diverse real-world entities, including bicycles and pedestrians, beyond vehicles. Project page: https://wenqsun.github.io/RealPlay/

Solidot (15)

  1. China's Solar Installations Set a New Record in May

    Official records show that China's solar installations set a record in May: the capacity added in a single month exceeded any other country's additions for all of 2024. According to the National Energy Administration, 93 GW of solar capacity was added in May, breaking the record of 71 GW set in December 2024. One driver of the surge is that on June 1 the government ended price guarantees for solar projects; under the old policy, a solar project was assured of profitability as soon as it went into operation. Another policy, effective May 1, makes it harder to connect rooftop solar panels to the grid. Analysts predict the new policies will slow the growth of solar installations. China's cumulative solar capacity now exceeds 1 TW.

  2. Vera C. Rubin Observatory Releases Its First Panoramic Images of the Cosmos

    The Vera C. Rubin Observatory released its first panoramic images of the cosmos on Monday, kicking off the ten-year Legacy Survey of Space and Time (LSST), which will be the most comprehensive survey of the southern sky in history. The observatory sits atop Cerro Pachón in Chile at an elevation of 1,600 meters, equipped with an 8.4-meter telescope and LSSTCam, the largest and highest-resolution digital camera ever built, roughly the size of a car. The camera can scan the entire southern night sky every three nights. Among the first images released, LSSTCam captured the Virgo Cluster, about 50 million light-years from Earth, in a frame containing as many as ten million galaxies; those ten million galaxies amount to just 0.05% of the 20 billion galaxies LSST is expected to observe over the course of the mission.

  3. Google Brings AI Features to Chromebooks

    Google is integrating AI features into a growing number of its products, most recently its education-focused Chromebook laptops. Although the AI computation happens mainly in the cloud, using AI on a Chromebook still requires fairly high-end hardware. The Chromebook Plus 14, built with Lenovo, features a MediaTek Kompanio Ultra processor, which Google calls the most powerful ARM chip in Chromebook history. The Kompanio Ultra's NPU delivers 50 TOPS of AI compute, enough to run some AI models locally and approaching Microsoft's Copilot+ PCs. The laptop is priced at $749.

  4. Amazon Accelerates Launches of Its Broadband Internet Satellites

    On Monday, ULA launched 27 Project Kuiper broadband internet satellites for Amazon aboard an Atlas V rocket from Cape Canaveral, Florida. Project Kuiper has now completed three launches, the first of them a test, and has 54 broadband satellites in orbit. Amazon plans to launch 3,232 satellites to cover most densely populated areas, and has contracted more than 80 launches from four providers: nine ULA Atlas V flights before that rocket retires, then 38 on ULA's Vulcan, with the number of satellites per launch rising to 45; 18 flights on Europe's Ariane 6; and at least 12 on New Glenn from Bezos-owned Blue Origin. Rival SpaceX will carry out Project Kuiper's fourth launch next month. SpaceX's own Starlink broadband constellation already exceeds 7,000 satellites.

  5. IYO Sues OpenAI over the IO Trademark

    IYO, a startup spun out of Google X, has sued OpenAI and Jony Ive's IO Products, Inc. over the IO trademark. OpenAI announced on May 21, 2025 that it would acquire IO for $6.5 billion, but quietly pulled the related promotional materials a few days ago. IYO makes an ear-worn device called IYO ONE that lets users interact with computers and AI through voice commands, without a screen or keyboard. The complaint accuses the defendants of deliberately adopting a confusingly similar name for a competing product. It alleges that OpenAI CEO Sam Altman and Ive's design studio LoveFrom met with IYO representatives repeatedly between 2022 and 2025 and learned details of IYO's technology and business plans, and that in March 2025 Altman told IYO he was developing a competing product named io. IO Products, founded in September 2023, is working on screenless computer-interaction hardware similar to IYO's products. The suit seeks injunctive relief and damages for trademark infringement and unfair competition.

  6. Firefox 140 Released

    Mozilla has released Firefox 140, a long-term support (LTS) release. Major new features include: an 'Unload Tab' option in the tab context menu, which frees the memory used by idle tabs and saves CPU resources; support for the CSS Custom Highlight API, which Chrome has supported since v121; improved vertical tabs; support for adding custom search engines; and support for the SVG fetchpriority attribute, the Cookie Store API, and more.

  7. Glass Bottle Caps Significantly Raise Microplastic Levels in Beverages

    France's Agency for Food, Environmental and Occupational Health & Safety (ANSES) published a study finding that all beverages contain microplastics, but drinks in glass bottles contain significantly more microplastic particles than those in plastic bottles, cartons, or cans. Tests across packaging types found that glass-bottled soft drinks, lemonade, iced tea, and beer contained an average of 100 microplastic particles per liter, 5 to 50 times the levels in plastic bottles or metal cans. The scientists suspect the particles come from the plastic coating on bottle caps, shed through friction during transport. Manufacturers could take steps to reduce the shedding.

  8. PhD Graduates Outnumber Academia's Demand

    The number of PhD graduates worldwide has grown steadily for decades. A doctorate was traditionally a stepping stone to a lifelong academic career, but today's PhD graduates far outnumber the vacancies at universities and research institutes. Across the 38 OECD member countries, the number of new PhDs nearly doubled between 1998 and 2017. China's doctoral enrollment grew from 300,000 in 2013 to more than 600,000 in 2023; Hugo Horta of the University of Hong Kong attributes the growth to the rapid expansion of bachelor's and master's degrees and to the expectation that investing in higher education brings better economic and social prospects. The oversupply is pushing PhD graduates into non-academic fields at an unprecedented rate: a 2023 survey of more than 4,500 PhD graduates in the UK found that over two-thirds were employed outside academia, and 18% of the more than 6,000 PhD graduates surveyed in South Africa said they struggled to find work related to their field. Some countries are beginning to adjust their doctoral programs: Japan, Germany, and the UK offer training and paid internships during PhD study, including 'industrial PhD' programs that allow research in collaboration with companies.

  9. Musk at the YC Event: Surviving the 'Intelligence Big Bang,' Lessons from PayPal, SpaceX, Tesla, and xAI, and How to Use First Principles

    Musk appeared at Y Combinator's AI Startup School, giving hundreds of young engineers a 50-minute masterclass on starting companies. He announced his return to technology from his DOGE work in Washington, explaining the choice with a 'side quest versus main quest' metaphor: government-efficiency reform matters, but it is a side quest, while building technology is the main quest. He predicted that digital superintelligence could arrive this year or next, stressing that humanity is in 'the early stage of the intelligence big bang.' Musk offered three core ideas: choose projects that 'cannot possibly succeed,' because 'a small chance of success beats zero chance'; the most important part of AI safety is 'rigorous adherence to truth,' not the technology itself; and, facing a future in which human intelligence will account for less than 1% of total intelligence, think about personal choices from a long-term perspective. He recounted his journey from Zip2 to PayPal, SpaceX, and Tesla, emphasized the key role of first-principles thinking in technical breakthroughs, and made predictions about brain-computer interfaces, robotics, and multi-planetary civilization.

  10. Microsoft Caps Windows 11 System Restore Points at 60 Days

    KB5060842, a patch in Microsoft's June security updates, changes how Windows 11 manages System Restore points: restore points are now valid for 60 days, after which they become unusable. Windows 11 24H2 does not change how restore points are created or used; it simply sets an explicit limit on how long they are stored. The system still deletes older restore points when the allocated disk space fills up, but now, no matter how much disk space is available, restore points are kept for at most 60 days.

  11. A Dead NASA Satellite Suddenly Emits a Powerful Radio Signal

    NASA's experimental communications satellite Relay 2 launched in 1964, ceased operating entirely in 1967, and has remained in orbit ever since. Nearly 60 years later, Australian astronomers searching for fast radio bursts with the Australian Square Kilometre Array Pathfinder (ASKAP) radio telescope array detected a brief but powerful radio pulse from Relay 2's position, leaving them baffled. The signal lasted less than 30 nanoseconds, yet for that moment it outshone every other object in the sky. They initially suspected a new pulsar or similar object before discovering that the signal came from Relay 2 in Earth orbit. They speculate it was caused by an event such as an electrostatic discharge, or by a cloud of charged plasma from a micrometeorite impact.

  12. Analyst Says AI Is Not Doing Its Job

    Erick Brethenoux, head of AI research at the analyst firm Gartner, argues that AI is not doing its job: it should not be bothering humans in the first place. A flagship application of generative AI is meeting summaries, yet Brethenoux says he simply has no time to read them; he knows what he needs to do, and it is not the five action items an AI summary lists. AI is supposed to help people get their work done, not tell them what to do, and it should simplify users' work by automating tedious tasks. AI agents are nothing new: industrial companies have used similar automation in relatively closed systems for decades, but it cannot handle more complex tasks. AI vendors have yet to crack those complex problems, and instead hype the concept under a cool name like generative AI.

  13. Weight Loss Significantly Boosts Self-Esteem

    A new study finds that patients' self-esteem scores more than doubled within a year of bariatric surgery, rising 131% from 33.6 to 77.5. Scores range from 0 to 100, with higher scores indicating greater self-esteem and quality of life. Despite demographic differences in sex, age, race, and type of procedure, weight loss appeared to drive the improvement, and those who lost the most weight scored highest. The study's authors write: 'Understanding the weight stigma and psychosocial factors associated with obesity is essential to providing whole-person care. While these factors should not be determining factors for undergoing bariatric surgery, they should be an important part of the conversation with patients.' The study also ties weight stigma to depression, anxiety, eating disorders, and low self-esteem. Among obese adults, the probability of experiencing weight discrimination ranges from 19% to 42%, and people with higher BMIs and women are more likely to be targeted.

  14. How AI Is Affecting India's Call Center Industry

    India's call center industry employs more than 3 million people and generates $280 billion in output. How much will AI-driven service automation affect it? AI chatbots and virtual agents can already handle basic customer service tasks such as password resets and balance updates; they can also write code, translate emails, triage patients, and analyze credit card, mortgage, and insurance applications. K Krithivasan, CEO of Indian outsourcing giant Tata Consultancy Services, has said demand for call centers will drop to a minimum within a year. The Brookings Institution found that 86% of customer service tasks have 'high automation potential,' and the International Monetary Fund warns that more than a quarter of India's jobs are 'highly exposed' to AI. While AI may eliminate some jobs, it will also create new ones: companies such as Teleperformance employ thousands of data annotators in India to label data for AI systems.

  15. Bill Gates and Linus Torvalds Photographed Together for the First Time

    The two giants of the Windows and Linux worlds had never met in person. At a recent dinner hosted by Sysinternals founder Mark Russinovich, Linux kernel creator Linus Torvalds and Microsoft co-founder Bill Gates appeared in the same photo for the first time. Four people are in the picture. One of the other two is Russinovich himself, Sysinternals co-founder and now CTO of Microsoft's Azure cloud platform, who in the late 1990s built Process Explorer, Autoruns, and Procmon, tools that revolutionized how administrators and security professionals understand Windows internals. The other is Dave Cutler, a core OpenVMS developer and chief architect of the Windows NT kernel and hardware abstraction layer, often called the father of Windows NT. Posting the photo, Russinovich joked that no major kernel decisions were made.