OrangeBot.AI Digest — 2025-06-26
75 headlines across 5 sources, aggregated for this day.
Hacker News(15)
- AI Is Dehumanization Technology (thedabbler.patatas.ca)
- "Why is the Rust compiler so slow?" (sharnoff.io)
- US economy shrank 0.5% in the first quarter, worse than earlier estimates (apnews.com)
- Introducing Gemma 3n (developers.googleblog.com)
- AlphaGenome: AI for better understanding the genome (deepmind.google)
- I built an ADHD app with interactive coping tools, noise mixer and self-test (www.adhdhelp.app)
- FLUX.1 Kontext [Dev] – Open Weights for Image Editing (bfl.ai)
- Show HN: I built an AI dataset generator (github.com)
- Launch HN: Issen (YC F24) – Personal AI language tutor
- Learnings from building AI agents (www.cubic.dev)
- 'Sticky thinking' hampers decisions in depression (www.bps.org.uk)
- I fought in Ukraine and here's why FPV drones kind of suck (warontherocks.com)
- Snow - Classic Macintosh emulator (snowemu.com)
- Apptainer: Application Containers for Linux (apptainer.org)
- The first non-opioid painkiller (www.worksinprogress.news)
GitHub Trending(15)
- microsoft / edit
We all edit.
- mui / base-ui
Unstyled UI components for building accessible web apps and design systems. From the creators of Radix, Floating UI, and Material UI.
- gitleaks / gitleaks
Find secrets with Gitleaks 🔑
- modelcontextprotocol / registry
A community driven registry service for Model Context Protocol (MCP) servers.
- punkpeye / awesome-mcp-servers
A collection of MCP servers.
- jujumilk3 / leaked-system-prompts
Collection of leaked system prompts
- twentyhq / twenty
Building a modern alternative to Salesforce, powered by the community.
- nexus-xyz / nexus-cli
Command line interface for supplying proofs to the Nexus network.
- AykutSarac / jsoncrack.com
✨ Innovative and open-source visualization application that transforms various data formats, such as JSON, YAML, XML, CSV and more, into interactive graphs.
- DioxusLabs / dioxus
Fullstack app framework for web, desktop, and mobile.
- Portkey-AI / gateway
A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API.
- ml-tooling / best-of-ml-python
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
- microsoft / ML-For-Beginners
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
- mikeroyal / Self-Hosting-Guide
Self-Hosting Guide. Learn all about locally hosting (on premises & private web servers) and managing software applications by yourself or your organization. Including Cloud, LLMs, WireGuard, Automation, Home Assistant, and Networking.
- microsoft / generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Product Hunt(15)
- mysite.ai
AI that builds your website & gets leads
- Clarify
The autonomous CRM that helps you sell more
- MoodGallery: Emotions to art
Turn your moods into personalized AI artwork
- Optibot
Agentic security-first code review w/ clear cues & no noise
- Inworld TTS
Voice AI that’s 5% of the cost. 100% of the quality.
- BestPage.ai
Jailbreak ‘Best Of’ search - Get your brand seen.
- HYBRD
Performance hub for hybrid athletes to optimize training
- Vitara AI
From prompt to app: Full-stack, fast, and deployable
- InterviewBee AI
Get real-time AI assistance during live interviews
- LangWatch Scenario - Agent Simulations
Agentic testing for agentic codebases
- Cora
The $150K chief of staff for your inbox, at just $15/month
- Gemini CLI
Code, research, and automate from your terminal
- Gen AI Studio
Create social videos 10x faster with AI
- BooleanMaths
Fix Ad targeting on Meta + Google Ads w/ better attribution
- Calculators in Email
Let customers calculate & decide in email
Hugging Face(15)
- Thought Anchors: Which LLM Reasoning Steps Matter?
Reasoning large language models have recently achieved state-of-the-art performance in many fields. However, their long-form chain-of-thought reasoning creates interpretability challenges as each generated token depends on all previous ones, making the computation harder to decompose. We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. We present three complementary attribution methods: (1) a black-box method measuring each sentence's counterfactual importance by comparing final answers across 100 rollouts conditioned on the model generating that sentence or one with a different meaning; (2) a white-box method of aggregating attention patterns between pairs of sentences, which identifies "broadcasting" sentences that receive disproportionate attention from all future sentences via "receiver" attention heads; (3) a causal attribution method measuring logical connections between sentences by suppressing attention toward one sentence and measuring the effect on each future sentence's tokens. Each method provides evidence for the existence of thought anchors, reasoning steps that have outsized importance and that disproportionately influence the subsequent reasoning process. These thought anchors are typically planning or backtracking sentences. We provide an open-source tool (www.thought-anchors.com) for visualizing the outputs of our methods, and present a case study showing converging patterns across methods that map how a model performs multi-step reasoning. The consistency across methods demonstrates the potential of sentence-level analysis for a deeper understanding of reasoning models.
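The black-box attribution method can be sketched as a comparison of final-answer distributions across rollouts; the function below, the rollout counts, and the toy answers are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

def counterfactual_importance(kept_rollouts, swapped_rollouts):
    """Toy sketch of sentence-level counterfactual importance: how much does
    the probability of the modal final answer drop when a sentence is
    replaced by one with a different meaning?"""
    def dist(answers):
        n = len(answers)
        return {a: c / n for a, c in Counter(answers).items()}
    p, q = dist(kept_rollouts), dist(swapped_rollouts)
    top = max(p, key=p.get)          # modal answer when the sentence is kept
    return p[top] - q.get(top, 0.0)

# 100 rollouts conditioned on keeping the sentence vs. swapping it out
kept = ["42"] * 90 + ["7"] * 10
swapped = ["42"] * 40 + ["7"] * 60
print(counterfactual_importance(kept, swapped))  # large gap -> thought anchor
```

A sentence whose replacement barely shifts the answer distribution scores near zero; a planning or backtracking sentence that flips the modal answer scores high.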
- GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. However, such impressive capability typically comes with a substantial model size, which presents significant challenges in deployment and inference. While structured pruning of model parameters offers a promising way to reduce computational costs at deployment time, current methods primarily focus on single model pruning. In this work, we develop a novel strategy to compress models by strategically combining or merging layers from finetuned model variants, which preserves the original model's abilities by aggregating capabilities accentuated in different finetunes. We pose the optimal tailoring of these LLMs as a zero-order optimization problem, adopting a search space that supports three different operations: (1) Layer removal, (2) Layer selection from different candidate models, and (3) Layer merging. Our experiments demonstrate that this approach leads to competitive model pruning, for example, for the Llama2-13B model families, our compressed models maintain approximately 97.3% of the original performance while removing ~25% of parameters, significantly outperforming previous state-of-the-art methods. The code is available at https://github.com/Guinan-Su/auto-merge-llm.
- Inverse-and-Edit: Effective and Fast Image Editing by Cycle Consistency Models
Recent advances in image editing with diffusion models have achieved impressive results, offering fine-grained control over the generation process. However, these methods are computationally intensive because of their iterative nature. While distilled diffusion models enable faster inference, their editing capabilities remain limited, primarily because of poor inversion quality. High-fidelity inversion and reconstruction are essential for precise image editing, as they preserve the structural and semantic integrity of the source image. In this work, we propose a novel framework that enhances image inversion using consistency models, enabling high-quality editing in just four steps. Our method introduces a cycle-consistency optimization strategy that significantly improves reconstruction accuracy and enables a controllable trade-off between editability and content preservation. We achieve state-of-the-art performance across various image editing tasks and datasets, demonstrating that our method matches or surpasses full-step diffusion models while being substantially more efficient. The code of our method is available on GitHub at https://github.com/ControlGenAI/Inverse-and-Edit.
- Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content
We introduce Biomed-Enriched, a biomedical text dataset constructed from PubMed via a two-stage annotation process. In the first stage, a large language model annotates 400K paragraphs from PubMed scientific articles, assigning scores for their type (review, study, clinical case, other), domain (clinical, biomedical, other), and educational quality. The educational quality score (rated 1 to 5) estimates how useful a paragraph is for college-level learning. These annotations are then used to fine-tune a small language model, which propagates the labels across the full PMC-OA corpus. The resulting metadata allows us to extract refined subsets, including 2M clinical case paragraphs with over 450K high-quality ones from articles with commercial-use licenses, and to construct several variants via quality filtering and domain upsampling. Clinical text is typically difficult to access due to privacy constraints, as hospital records cannot be publicly shared. Hence, our dataset provides an alternative large-scale, openly available collection of clinical cases from PubMed, making it a valuable resource for biomedical and clinical NLP. Preliminary continual-pretraining experiments with OLMo2 suggest these curated subsets enable targeted improvements, with clinical upsampling boosting performance by ~5% on MMLU ProfMed and educational quality filtering improving MedQA and MedMCQA by ~1%. Combinations of these techniques led to faster convergence, reaching the same performance with a third of the training tokens, indicating potential for more efficient and effective biomedical pretraining strategies.
- OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
Different base language model families, such as Llama and Qwen, exhibit divergent behaviors during post-training with reinforcement learning (RL), especially on reasoning-intensive tasks. What makes a base language model suitable for reinforcement learning? Gaining deeper insight into this question is essential for developing RL-scalable foundation models of the next generation. In this work, we investigate how mid-training strategies shape RL dynamics, focusing on two representative model families: Qwen and Llama. Our study reveals that (1) high-quality mathematical corpora, such as MegaMath-Web-Pro, significantly improve both base model and RL performance, while existing alternatives (e.g., FineMath-4plus) fail to do so; (2) further adding QA-style data, particularly long chain-of-thought (CoT) reasoning examples, enhances RL outcomes, and instruction data further unlocks this effect; (3) while long-CoT improves reasoning depth, it can also induce verbosity of model responses and instability of RL training, underscoring the importance of data formatting; (4) scaling mid-training consistently leads to stronger downstream RL performance. Building on these insights, we introduce a two-stage mid-training strategy, Stable-then-Decay, in which base models are first trained on 200B tokens with a constant learning rate, followed by 20B tokens across three CoT-focused branches with learning rate decay. This yields OctoThinker, a family of models demonstrating strong RL compatibility and closing the performance gap with more RL-friendly model families, i.e., Qwen. We hope our work will help shape pre-training strategies for foundation models in the RL era. To support further research, we release our open-source models along with a curated math reasoning-intensive corpus of over 70 billion tokens (i.e., MegaMath-Web-Pro-Max).
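The Stable-then-Decay schedule can be sketched as a learning-rate function; the step counts and rates below are hypothetical, and the paper's decay stage actually branches into three CoT-focused runs with their own schedules.

```python
def stable_then_decay_lr(step, stable_steps, decay_steps,
                         base_lr=3e-4, min_lr=3e-5):
    """Hypothetical sketch: hold a constant LR through the stable stage,
    then decay linearly toward min_lr over the decay stage."""
    if step < stable_steps:
        return base_lr
    frac = min(1.0, (step - stable_steps) / decay_steps)
    return base_lr + frac * (min_lr - base_lr)

# e.g. a long stable stage followed by a short decay stage, in optimizer steps
print(stable_then_decay_lr(0, 100_000, 10_000))        # base_lr, stable stage
print(stable_then_decay_lr(110_000, 100_000, 10_000))  # fully decayed to min_lr
```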
- ReCode: Updating Code API Knowledge with Reinforcement Learning
Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This critical limitation, stemming from reliance on outdated API knowledge from their training data, even with access to current documentation, impedes reliable code generation in dynamic environments. To tackle this issue, we propose ReCode (rule-based Reinforcement learning for Code Update), a novel framework that mimics human programmer adaptation to API changes. Specifically, we construct a dataset of approximately 2,000 data entries to train the LLMs to perform version migration based on updated information. Then, we introduce a modified string similarity metric for code evaluation as the reward for reinforcement learning. Our experiments demonstrate that ReCode substantially boosts LLMs' code generation performance in dynamic API scenarios, especially on the unseen CodeUpdateArena task. Crucially, compared to supervised fine-tuning, ReCode has less impact on LLMs' general code generation abilities. We apply ReCode on various LLMs and reinforcement learning algorithms (GRPO and DAPO), all achieving consistent improvements. Notably, after training, Qwen2.5-Coder-7B outperforms the 32B-parameter code instruction-tuned model and the reasoning model with the same architecture. Code is available at https://github.com/zjunlp/ReCode.
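A string-similarity reward of this kind can be sketched with the standard library; `difflib.SequenceMatcher` is a stand-in here, not the modified metric the paper introduces, and the migration example is invented.

```python
import difflib

def similarity_reward(generated: str, reference: str) -> float:
    """Illustrative RL reward in [0, 1]: how close is the generated
    migration to the reference code? (Stand-in for the paper's
    modified string-similarity metric.)"""
    return difflib.SequenceMatcher(None, generated, reference).ratio()

# hypothetical API migration: requests -> httpx
reference = 'resp = httpx.get(url, timeout=5.0)'
exact     = 'resp = httpx.get(url, timeout=5.0)'
stale     = 'resp = requests.get(url, timeout=5.0)'
print(similarity_reward(exact, reference))  # 1.0
print(similarity_reward(stale, reference))  # high, but penalized below 1.0
```

A graded reward like this gives the policy partial credit for near-miss migrations, unlike a binary pass/fail signal.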
- HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Diffusion models have emerged as the leading approach for image synthesis, demonstrating exceptional photorealism and diversity. However, training diffusion models at high resolutions remains computationally prohibitive, and existing zero-shot generation techniques for synthesizing images beyond training resolutions often produce artifacts, including object duplication and spatial incoherence. In this paper, we introduce HiWave, a training-free, zero-shot approach that substantially enhances visual fidelity and structural coherence in ultra-high-resolution image synthesis using pretrained diffusion models. Our method employs a two-stage pipeline: generating a base image from the pretrained model followed by a patch-wise DDIM inversion step and a novel wavelet-based detail enhancer module. Specifically, we first utilize inversion methods to derive initial noise vectors that preserve global coherence from the base image. Subsequently, during sampling, our wavelet-domain detail enhancer retains low-frequency components from the base image to ensure structural consistency, while selectively guiding high-frequency components to enrich fine details and textures. Extensive evaluations using Stable Diffusion XL demonstrate that HiWave effectively mitigates common visual artifacts seen in prior methods, achieving superior perceptual quality. A user study confirmed HiWave's performance, where it was preferred over the state-of-the-art alternative in more than 80% of comparisons, highlighting its effectiveness for high-quality, ultra-high-resolution image synthesis without requiring retraining or architectural modifications.
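The low/high-frequency split at the heart of the detail enhancer can be sketched with a one-level 1-D Haar transform; this toy decomposition is my illustration of wavelet band-splitting, not HiWave's 2-D pipeline.

```python
def haar_split(signal):
    """One-level 1-D Haar transform: low band = pairwise averages
    (global structure), high band = pairwise differences (fine detail)."""
    evens, odds = signal[::2], signal[1::2]
    low = [(a + b) / 2 for a, b in zip(evens, odds)]
    high = [(a - b) / 2 for a, b in zip(evens, odds)]
    return low, high

def haar_merge(low, high):
    """Inverse transform: reconstructs the original signal exactly."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out

sig = [1.0, 2.0, 3.0, 4.0, 10.0, 0.0, 5.0, 5.0]
low, high = haar_split(sig)
assert haar_merge(low, high) == sig  # lossless round trip
```

In HiWave's terms, the low band would be pinned to the base image for structural consistency while guidance acts on the high band to enrich detail.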
- Use Property-Based Testing to Bridge LLM Code Generation and Validation
Large Language Models (LLMs) excel at code generation, but ensuring that their outputs are functionally correct, especially in complex programming tasks, is a persistent challenge. While traditional Test-Driven Development (TDD) offers a path for code refinement, its efficacy with LLMs is often undermined by the scarcity of high-quality test cases or the pitfalls of automated test generation, including biased tests or inaccurate output predictions that can misdirect the correction process. This paper introduces Property-Generated Solver, a novel framework that leverages Property-Based Testing (PBT) to validate high-level program properties or invariants, instead of relying on specific input-output examples. These properties are often simpler to define and verify than directly predicting exhaustive test oracles, breaking the "cycle of self-deception" where tests might share flaws with the code they are meant to validate. Property-Generated Solver employs two collaborative LLM-based agents: a Generator dedicated to code generation and iterative refinement, and a Tester that manages the PBT life-cycle and formulates semantically rich feedback from property violations. The resulting comprehensive and actionable feedback then guides the Generator in its refinement efforts. By establishing PBT as the core validation engine within this iterative, closed-loop paradigm, Property-Generated Solver provides a robust mechanism for steering LLMs towards more correct and generalizable code. Extensive experimental results on multiple code generation benchmarks demonstrate that Property-Generated Solver achieves substantial pass@1 improvements, ranging from 23.1% to 37.3% relative gains over established TDD methods.
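The PBT loop can be sketched with stdlib randomness; the sorting task, the two properties, and the buggy candidate below are invented for illustration, and a real Tester agent would drive a PBT library such as Hypothesis rather than this hand-rolled loop.

```python
import random

def generated_sort(xs):
    """Stand-in for LLM-generated code under validation."""
    return sorted(xs)

def check_properties(fn, trials=200, seed=0):
    """Validate high-level invariants on random inputs instead of
    specific input/output examples."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        out = fn(xs)
        # Property 1: output is a permutation of the input.
        assert sorted(xs) == sorted(out), "output must keep all elements"
        # Property 2: output is in non-decreasing order.
        assert all(a <= b for a, b in zip(out, out[1:])), "output must be ordered"
    return True

assert check_properties(generated_sort)
# A buggy candidate that drops duplicates violates the permutation property:
buggy = lambda xs: sorted(set(xs))
try:
    check_properties(buggy)
except AssertionError:
    print("property violation caught; feedback goes back to the Generator")
```

The failing property, not a single expected output, is what the Tester would turn into semantically rich feedback.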
- RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
Simulation-based data synthesis has emerged as a powerful paradigm for enhancing real-world robotic manipulation. However, existing synthetic datasets remain insufficient for robust bimanual manipulation due to two challenges: (1) the lack of an efficient, scalable data generation method for novel tasks, and (2) oversimplified simulation environments that fail to capture real-world complexity. We present RoboTwin 2.0, a scalable simulation framework that enables automated, large-scale generation of diverse and realistic data, along with unified evaluation protocols for dual-arm manipulation. We first construct RoboTwin-OD, a large-scale object library comprising 731 instances across 147 categories, each annotated with semantic and manipulation-relevant labels. Building on this foundation, we develop an expert data synthesis pipeline that combines multimodal large language models (MLLMs) with simulation-in-the-loop refinement to generate task-level execution code automatically. To improve sim-to-real transfer, RoboTwin 2.0 incorporates structured domain randomization along five axes: clutter, lighting, background, tabletop height and language instructions, thereby enhancing data diversity and policy robustness. We instantiate this framework across 50 dual-arm tasks spanning five robot embodiments, and pre-collect over 100,000 domain-randomized expert trajectories. Empirical results show a 10.9% gain in code generation success and improved generalization to novel real-world scenarios. A VLA model fine-tuned on our dataset achieves a 367% relative improvement (42.0% vs. 9.0%) on unseen scene real-world tasks, while zero-shot models trained solely on our synthetic data achieve a 228% relative gain, highlighting strong generalization without real-world supervision. We release the data generator, benchmark, dataset, and code to support scalable research in robust bimanual manipulation.
- DualTHOR: A Dual-Arm Humanoid Simulation Platform for Contingency-Aware Planning
Developing embodied agents capable of performing complex interactive tasks in real-world scenarios remains a fundamental challenge in embodied AI. Although recent advances in simulation platforms have greatly enhanced task diversity to train embodied Vision Language Models (VLMs), most platforms rely on simplified robot morphologies and bypass the stochastic nature of low-level execution, which limits their transferability to real-world robots. To address these issues, we present a physics-based simulation platform DualTHOR for complex dual-arm humanoid robots, built upon an extended version of AI2-THOR. Our simulator includes real-world robot assets, a task suite for dual-arm collaboration, and inverse kinematics solvers for humanoid robots. We also introduce a contingency mechanism that incorporates potential failures through physics-based low-level execution, bridging the gap to real-world scenarios. Our simulator enables a more comprehensive evaluation of the robustness and generalization of VLMs in household environments. Extensive evaluations reveal that current VLMs struggle with dual-arm coordination and exhibit limited robustness in realistic environments with contingencies, highlighting the importance of using our simulator to develop more capable VLMs for embodied tasks. The code is available at https://github.com/ds199895/DualTHOR.git.
- When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Recent advancements in large language models (LLMs) have shifted focus toward scaling inference-time compute, improving performance without retraining the model. A common approach is to sample multiple outputs in parallel, and select one of these as the final output. However, work to date has focused on English and a handful of domains such as math and code. In contrast, we are most interested in techniques that generalize across open-ended tasks, formally verifiable tasks, and across languages. In this work, we study how to robustly scale inference-time compute for open-ended generative tasks in a multilingual, multi-task setting. Our findings show that both sampling strategy based on temperature variation and selection strategy must be adapted to account for diverse domains and varied language settings. We evaluate existing selection methods, revealing that strategies effective in English often fail to generalize across languages. We propose novel sampling and selection strategies specifically adapted for multilingual and multi-task inference scenarios, and show they yield notable gains across languages and tasks. In particular, our combined sampling and selection methods lead to an average +6.8 jump in win-rates for our 8B models on m-ArenaHard-v2.0 prompts, against proprietary models such as Gemini. At larger scale, Command-A (111B model) equipped with our methods, shows +9.0 improvement in win-rates on the same benchmark with just five samples against single-sample decoding, a substantial increase at minimal cost. Our results underscore the need for language- and task-aware approaches to inference-time compute, aiming to democratize performance improvements in underrepresented languages.
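Parallel sampling plus selection can be sketched with majority voting over final answers (self-consistency style); this generic selector is an illustration, not one of the paper's proposed multilingual strategies, and the sample strings are invented.

```python
from collections import Counter

def select_answer(samples):
    """Generic best-of-n selection: return the most frequent final answer
    among parallel samples (majority voting / self-consistency)."""
    return Counter(samples).most_common(1)[0][0]

# five hypothetical samples drawn at varied temperatures
samples = ["Paris", "Paris", "Lyon", "Paris", "Marseille"]
print(select_answer(samples))  # Paris
```

The paper's point is that selectors like this, tuned on English, often fail to transfer, so both the sampling temperatures and the selection rule must be adapted per language and task.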
- Is There a Case for Conversation Optimized Tokenizers in Large Language Models?
The computational and energy costs of Large Language Models (LLMs) have increased exponentially, driven by growing model sizes and the massive adoption of LLMs by hundreds of millions of users. The unit cost of an LLM is the computation of a token. Therefore, the tokenizer plays an important role in the efficiency of a model, and tokenizers are carefully optimized to minimize the number of tokens for the text in their training corpus. One of the most popular applications of LLMs is chatbots that interact with users. A key observation is that, for those chatbots, what matters is the performance of the tokenizer on the user text input and the chatbot responses, which are most likely different from the text in the training corpus. So, a question that immediately arises is whether there is a potential benefit in optimizing tokenizers for chatbot conversations. In this paper, this idea is explored for different tokenizers by using a publicly available corpus of chatbot conversations to redesign their vocabularies and evaluate their performance in this domain. The results show that conversation-optimized tokenizers consistently reduce the number of tokens in chatbot dialogues, which can lead to meaningful energy savings, in the range of 5% to 10%, while having minimal or even slightly positive impact on tokenization efficiency for the original training corpus.
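The potential saving can be illustrated with a toy greedy longest-match tokenizer; the vocabularies, the chat message, and the `help`/`me` merges below are invented for illustration and unrelated to the tokenizers studied in the paper.

```python
def greedy_tokenize(text, vocab):
    """Toy greedy longest-match tokenizer over an explicit vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # single-character fallback
            i += 1
    return tokens

# base vocabulary tuned to some training corpus (hypothetical)
base_vocab = {"can", "you", "please", " ", "h", "e", "l", "p", "m"}
# same vocabulary extended with merges frequent in chat logs
chat_vocab = base_vocab | {"help", "me"}

msg = "can you please help me"
print(len(greedy_tokenize(msg, base_vocab)))  # 13 tokens
print(len(greedy_tokenize(msg, chat_vocab)))  # 9 tokens for the same message
```

Fewer tokens per message means fewer forward passes per reply, which is where the 5-10% energy saving comes from.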
- ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation
Recent advances in multimodal generative models have unlocked photorealistic, instruction-aligned image generation, yet leading systems like GPT-4o-Image remain proprietary and inaccessible. To democratize these capabilities, we present ShareGPT-4o-Image, the first dataset comprising 45K text-to-image and 46K text-and-image-to-image data, all synthesized using GPT-4o's image generation capabilities for distilling its advanced image generation abilities. Leveraging this dataset, we develop Janus-4o, a multimodal large language model capable of both text-to-image and text-and-image-to-image generation. Janus-4o not only significantly improves text-to-image generation over its predecessor, Janus-Pro, but also newly supports text-and-image-to-image generation. Notably, it achieves impressive performance in text-and-image-to-image generation from scratch, using only 91K synthetic samples and 6 hours of training on an 8 A800-GPU machine. We hope the release of ShareGPT-4o-Image and Janus-4o will foster open research in photorealistic, instruction-aligned image generation.
- Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
Extreme activation outliers in Large Language Models (LLMs) critically degrade quantization performance, hindering efficient on-device deployment. While channel-wise operations and adaptive gradient scaling are recognized causes, practical mitigation remains challenging. We introduce Outlier-Safe Pre-Training (OSP), a practical guideline that proactively prevents outlier formation rather than relying on post-hoc mitigation. OSP combines three key innovations: (1) the Muon optimizer, eliminating privileged bases while maintaining training efficiency; (2) Single-Scale RMSNorm, preventing channel-wise amplification; and (3) a learnable embedding projection, redistributing activation magnitudes originating from embedding matrices. We validate OSP by training a 1.4B-parameter model on 1 trillion tokens, which is the first production-scale LLM trained without such outliers. Under aggressive 4-bit quantization, our OSP model achieves a 35.7 average score across 10 benchmarks (compared to 26.5 for an Adam-trained model), with only a 2% training overhead. Remarkably, OSP models exhibit near-zero excess kurtosis (0.04) compared to extreme values (1818.56) in standard models, fundamentally altering LLM quantization behavior. Our work demonstrates that outliers are not inherent to LLMs but are consequences of training strategies, paving the way for more efficient LLM deployment. The source code and pretrained checkpoints are available at https://github.com/dmis-lab/Outlier-Safe-Pre-Training.
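The kurtosis statistic quoted above can be computed mechanically; the two synthetic "activation" vectors below are invented to show how a single outlier channel inflates excess kurtosis.

```python
def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (0 for a Gaussian; very large
    values signal extreme activation outliers)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0

well_behaved = [(-1.0) ** i for i in range(100)]  # alternating +/-1, no outliers
outlier_heavy = [0.01] * 99 + [100.0]             # one extreme channel
print(excess_kurtosis(well_behaved))   # -2.0
print(excess_kurtosis(outlier_heavy))  # very large
```

Quantizers must stretch their grid to cover the outlier, wasting precision on the 99 small values, which is why near-zero kurtosis makes 4-bit quantization so much easier.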
- The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs
The effectiveness of AI debugging follows a predictable exponential decay pattern; most models lose 60-80% of their debugging capability within just 2-3 attempts, despite iterative debugging being a critical capability for practical code generation systems. We introduce the Debugging Decay Index (DDI), a mathematical framework that quantifies when debugging becomes ineffective and predicts intervention points. Our strategic fresh start approach shifts from exploitation to exploration at strategic points in the debugging process, demonstrating that well-timed interventions can rescue the effectiveness of debugging. DDI reveals a fundamental limitation in current AI debugging and provides the first quantitative framework for optimising iterative code generation strategies.
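The decay pattern can be modeled with a simple exponential; the initial success rate and decay constant below are hypothetical stand-ins, not fitted DDI parameters.

```python
import math

def debug_success(attempt, s0=0.6, decay=0.7):
    """Hypothetical decay model: success probability of debugging
    attempt k falls off as s0 * exp(-decay * (k - 1))."""
    return s0 * math.exp(-decay * (attempt - 1))

# capability lost by the third attempt, relative to the first
lost = 1 - debug_success(3) / debug_success(1)
print(f"{lost:.0%} of debugging capability lost by attempt 3")
# once the predicted success rate drops below the cost of a retry, the
# framework's "fresh start" intervention (reset and re-explore) pays off
```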
Solidot(15)
- Webb telescope may have directly imaged an exoplanet for the first time
Astronomers using the James Webb Space Telescope have captured what appears to be a planet with a mass similar to Saturn's orbiting the young host star TWA 7. If confirmed, it would be Webb's first direct image of a planet, and the lightest planet yet found with this technique. The team used Webb's Mid-Infrared Instrument (MIRI) and its coronagraph to detect a faint infrared source in the debris disk around TWA 7, at a separation corresponding to about 50 times the Earth-Sun distance. Preliminary analysis suggests the object, designated TWA 7b, may be a young, cold planet of about 0.3 Jupiter masses (roughly 100 Earth masses) with a temperature near 320 K (about 47°C). Its position aligns with a gap in the debris disk, hinting at a dynamical interaction between the planet and its surroundings. Debris disks of dust and rocky material are found around both young and old stars, but they are easier to detect around young stars, which are brighter. TWA 7, also known as CE Antliae, is a young M-type star about 6.4 million years old, located roughly 111 light-years away in the TW Hydrae association.
- ispace lunar lander failure traced to altimeter
Japanese space company ispace announced on June 24 that its lunar lander RESILIENCE failed to land on the Moon because the altimeter used during descent malfunctioned, delaying deceleration. The company will work with outside experts on improvements, and said its next attempt, with a larger lander now under development, is targeted for 2027. The lander began its descent shortly after 3:00 a.m. on June 6. The laser-ranging altimeter was supposed to be taking measurements by an altitude of 3 km, but actual measurements only began at around 1 km. Hard braking came too late, and flight data was lost after the lander descended to 192 m at a speed of about 150 km/h. The problem behind the 2023 failure, a mismatch between the altitude sensor and the flight system, was not present this time.
- Disposable e-cigarettes more toxic than conventional cigarettes
A study found that some disposable e-cigarettes and pods release higher levels of toxic metals than earlier e-cigarette models, and even more than conventional cigarettes. The highest lead emissions from one day's use of a disposable e-cigarette were equivalent to smoking nearly 20 packs of conventional cigarettes. The researchers stressed that although most disposable e-cigarettes are illegal in the US, they remain widely available, and their main users are teenagers and young adults, precisely the group most sensitive to lead exposure. Inhaling certain metals significantly raises the risk of cancer, respiratory disease, and neurological damage. The study analyzed seven disposable e-cigarettes from three major brands. Using instruments to simulate 500-1,500 puffs, the researchers measured metal concentrations in the vapor and found that levels of chromium, nickel, and antimony rose as the number of puffs increased. They also disassembled the devices and found that some toxic metals came from the e-liquid itself, while others leached from heating elements and alloy parts: leaded copper-alloy components released lead and nickel, heating coils released nickel, and the original e-liquid contained high concentrations of antimony.
- Lyon, France phases out Microsoft software for digital sovereignty
The French city of Lyon will gradually replace Microsoft software with open-source alternatives, adopting the OnlyOffice office suite, the Linux operating system, and the PostgreSQL database. Lyon, France's third-largest city, said the move aims to reduce dependence on US software and achieve digital sovereignty. Lyon joins a movement among European cities to cut reliance on Microsoft software: Denmark's two largest cities, Copenhagen and Aarhus, announced in early June that they were dropping Windows and MS Office.
- Overfishing has halved the body size of cod
Overfishing has sharply reduced Eastern Baltic cod numbers, but over the past three decades the fish have also been shrinking mysteriously and dramatically. Scientists have now found genomic evidence that intensive fishing drove rapid evolution of body size: median adult length has halved since the 1990s, falling from 40 cm in 1996 to 20 cm in 2019, while median adult weight shrank from 1,356 g in 1996 to 272 g in 2019, a fivefold decrease in just over two decades. The study shows that human activity has left a deep imprint on the DNA of cod populations. The main driver of this size evolution is the mesh size of trawl nets. Trawling harvests fish selectively by size: legally mandated minimum mesh sizes are meant to protect smaller fish, ensuring they can mature and spawn before being caught. But this had an unintended consequence, creating selective evolutionary pressure that favors smaller fish, which escape the nets more easily.
- Bernie Sanders: if AI raises worker productivity, adopt a four-day workweek
US Senator Bernie Sanders of Vermont called for a four-day workweek in an interview on Joe Rogan's podcast. He argued that productivity gains from AI should benefit workers, not just tech companies and corporate executives. Sanders proposed that when AI tools make workers more productive, the standard workweek should be cut to 32 hours rather than jobs being cut. Technology, he said, should work to make the world a better place, not merely enrich the owners and executives of tech companies: if you are a worker and your productivity rises because we gave you AI, we will not throw you out on the street; instead, we will cut your workweek to 32 hours.
- Handheld tests find games run faster on SteamOS than on Windows 11
Lenovo's Legion Go S handheld supports two operating systems: Valve's SteamOS (based on Arch Linux) and Microsoft's Windows 11. Ars Technica tested the same games on both and was surprised to find that gaming performance on Linux beat Windows. Of the five games tested, four ran at higher frame rates on SteamOS than on Windows 11, with only Borderlands 3 roughly even. Since SteamOS runs Windows games through the Proton translation layer, a performance penalty was assumed, but Valve's optimization work has proved more effective; by comparison, Windows 11 likely carries too much unnecessary overhead and lacks handheld-specific optimization. Microsoft may have recognized the problem: it recently partnered with Asus on a handheld and introduced the Xbox Experience for Handheld, aimed at improving the handheld experience.
- Aaron Sorkin making a sequel to The Social Network
The Social Network, directed by David Fincher and written by Aaron Sorkin about the founding of Facebook, won multiple Academy Awards in 2011. Sorkin has now revealed that he is making a sequel, The Social Network Part II, though it is not a direct continuation of the first film's story. According to reports, Sorkin himself will direct the new film, which will examine the role the social network played in the January 6, 2021 violence at the US Capitol. The Social Network was adapted from Ben Mezrich's 2009 bestseller The Accidental Billionaires: The Founding of Facebook, a Tale of Sex, Money, Genius and Betrayal. The sequel is adapted from The Wall Street Journal's "The Facebook Files" series, which investigated the harms caused by Facebook, how internal findings were covered up, the impact of January 6, and the mental health of teenage users.
- Judge rules Meta's use of copyrighted books to train LLMs is fair use
A group of book authors sued the social media giant, accusing it of pirating millions of copyrighted books without permission to train its large language model Llama. On Wednesday, San Francisco federal judge Vince Chhabria ruled that Meta's use of the books for training is protected by copyright law's fair use doctrine, but he stressed that the ruling rested largely on the plaintiffs' failure to present effective evidence for their claims. Meta argued that its models cannot reproduce copyrighted material: like a person who has read a book, a model can summarize its contents and even imitate its writing style, but it does not copy it verbatim. The plaintiffs failed to rebut this defense. To train Llama, Meta was found to have downloaded more than 100 TB of e-books from pirate libraries.
- OpenAI is poaching Microsoft's enterprise customers
Last spring, pharmaceutical company Amgen announced plans to buy Microsoft's Copilot AI assistant for its 20,000 employees, and Microsoft touted the new customer in several case studies. Thirteen months later, Amgen employees are using OpenAI's ChatGPT. Microsoft salespeople say they face pressure to push Copilot to as many customers as possible and were caught off guard by the challenge from their partner. OpenAI's enterprise wins are adding fuel to the already strained relationship between the two companies. OpenAI recently said it has reached 3 million paying business users, up 50% from a few months earlier. Microsoft, for its part, says 70% of the Fortune 500 use Copilot and that its paid user count has tripled year over year.
- Cargo ship carrying nearly 100 electric vehicles catches fire and sinks
On the afternoon of June 23, a car carrier loaded with more than 3,000 vehicles, including battery-electric vehicles (EVs), sank in waters off the US state of Alaska while en route from China to Mexico. The ship had caught fire at sea on June 3; all 22 crew members were rescued. The sunken vessel, the Liberian-flagged Morning Midas, is operated by London-based Zodiac Maritime and is a mid-to-large carrier about 180 m long. It departed China in May and was scheduled to arrive at the Mexican port of Lázaro Cárdenas on June 15, carrying 3,048 vehicles in all, of which 70 were battery-electric and 681 were hybrids. The fire broke out, with smoke rising from the deck, while the ship was about 480 km south of Adak Island, Alaska. The entire crew escaped in a lifeboat and was picked up by a merchant ship sailing nearby. Whether the fire started in an electric vehicle is not yet known.
- Microsoft releases a Rust version of the MS-DOS Editor
Microsoft has open-sourced a Rust-language version of its classic MS-DOS Editor, with source code hosted on GitHub and support for Windows, macOS, and Linux. The MS-DOS Editor, or simply Editor, first shipped with MS-DOS 5.0 and is now more than thirty years old. Microsoft revived it to address the lack of a default command-line text editor on 64-bit Windows: 32-bit Windows bundled the MS-DOS Editor, but 64-bit systems did not. The new version is only 250 KB and adds modern features such as Unicode support, regular expressions, and the ability to handle gigabyte-sized files; the old version, constrained by memory, could only handle files smaller than 300 KB.
- Judge rules Anthropic's training on books is fair use, but training on pirated books is not
A US federal judge has ruled that Anthropic's use of books to train AI is fair use, but that training on pirated books is not. Court filings show Anthropic downloaded more than 7 million books from pirate sites. It also bought millions of print books, cut off their bindings, scanned every page, and stored them digitally. Both the pirated library and the scanned library were used to train different versions of Anthropic's large model Claude, which brings the company more than a billion dollars a year in revenue. The judge ruled that training AI on pirated books is not fair use and will hold a later proceeding on damages related to the pirated books.
- Jurassic World Evolution 3 drops generative AI
After drawing criticism for using generative AI to create scientist portraits, Jurassic World Evolution 3 developer Frontier has confirmed that it has abandoned the feature. The game is due on October 21 for PC, PS5, and Xbox Series X/S. Steam's new policy requires developers to disclose AI use, and after Frontier's disclosure, player criticism forced a response: the studio said it had listened to player feedback and removed the generative-AI scientist portraits.
- Power banks recalled by Anker and others used Amprius cells
After Anker and several other power bank brands announced recalls of products at risk of catching fire, the root cause was traced to the battery cells, supplied by Amprius's Wuxi subsidiary. Amprius is the largest cell supplier for power banks, with customers including Anker, Romoss, Xiaomi, Ugreen, Baseus, Mcdodo, Aukey, and 电友; so far the recalls have come mainly from Anker and Romoss. According to reports, Amprius substituted an untested separator material without customers' permission, and the replacement separator lacked the strength of the samples originally submitted for testing. Eleven of Amprius's 3C (China Compulsory Certification) certificates have been suspended since June 10, and the company has halted production.