OrangeBot.AI Digest — 2025-07-05
61 headlines across 8 sources, aggregated for this day.
Hacker News (15)
- The Prime Reasons to Avoid Amazon (blog.thenewoil.org)
- macOS Icon History (basicappleguy.com)
- How to not pay your taxes legally, apparently (mrsteinberg.com)
- Seine reopens to Paris swimmers after century-long ban (www.lemonde.fr)
- Local-First Software Is Easier to Scale (elijahpotter.dev)
- 'Positive review only': Researchers hide AI prompts in papers (asia.nikkei.com)
- Local-first software (2019) (www.inkandswitch.com)
- Europe's first geostationary sounder satellite is launched (www.eumetsat.int)
- X-Clacks-Overhead (xclacksoverhead.org)
- Problems the AI industry is not addressing adequately (www.thealgorithmicbridge.com)
- A 37-year-old wanting to learn computer science (initcoder.com)
- Stop Killing Games (www.jeffgeerling.com)
- The messy reality of SIMD (vector) functions (johnnysswlab.com)
- Why Tesla’s cars keep crashing (www.theguardian.com)
- The History of Electronic Music in 476 Tracks (1937–2001) (www.openculture.com)
GitHub Trending (11)
- NanmiCoder / MediaCrawler
Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments.
- rustfs / rustfs
🚀 High-performance distributed object storage, a MinIO alternative.
- LadybirdBrowser / ladybird
Truly independent web browser
- datawhalechina / happy-llm
📚 A from-scratch tutorial on the principles and practice of large language models.
- Universidade-Livre / ciencia-da-computacao
🎓 A path to a self-taught education in Computer Science!
- megadose / toutatis
Toutatis is a tool that allows you to extract information such as e-mails, phone numbers, and more from Instagram accounts.
- bregman-arie / devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
- MotiaDev / motia
Unified Backend Framework for APIs, Events, and AI Agents
- directus / directus
The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.
- google / perfetto
Production-grade client-side tracing, profiling, and analysis for complex software systems.
- codecrafters-io / build-your-own-x
Master programming by recreating your favorite technologies from scratch.
Product Hunt (5)
Hugging Face (15)
- GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
We present GLM-4.1V-Thinking, a vision-language model (VLM) designed to advance general-purpose multimodal reasoning. In this report, we share our key findings in the development of the reasoning-centric training framework. We first develop a capable vision foundation model with significant potential through large-scale pre-training, which arguably sets the upper bound for the final performance. Reinforcement Learning with Curriculum Sampling (RLCS) then unlocks the full potential of the model, leading to comprehensive capability enhancement across a diverse range of tasks, including STEM problem solving, video understanding, content recognition, coding, grounding, GUI-based agents, and long document understanding, among others. To facilitate research in this field, we open-source GLM-4.1V-9B-Thinking, which achieves state-of-the-art performance among models of comparable size. In a comprehensive evaluation across 28 public benchmarks, our model outperforms Qwen2.5-VL-7B on nearly all tasks and achieves comparable or even superior performance on 18 benchmarks relative to the significantly larger Qwen2.5-VL-72B. Notably, GLM-4.1V-9B-Thinking also demonstrates competitive or superior performance compared to closed-source models such as GPT-4o on challenging tasks including long document understanding and STEM reasoning, further underscoring its strong capabilities. Code, models and more information are released at https://github.com/THUDM/GLM-4.1V-Thinking.
- Kwai Keye-VL Technical Report
While Multimodal Large Language Models (MLLMs) demonstrate remarkable capabilities on static images, they often fall short in comprehending dynamic, information-dense short-form videos, a dominant medium in today's digital landscape. To bridge this gap, we introduce Kwai Keye-VL, an 8-billion-parameter multimodal foundation model engineered for leading-edge performance in short-video understanding while maintaining robust general-purpose vision-language abilities. The development of Keye-VL rests on two core pillars: a massive, high-quality dataset exceeding 600 billion tokens with a strong emphasis on video, and an innovative training recipe. This recipe features a four-stage pre-training process for solid vision-language alignment, followed by a meticulous two-phase post-training process. The first post-training stage enhances foundational capabilities like instruction following, while the second phase focuses on stimulating advanced reasoning. In this second phase, a key innovation is our five-mode ``cold-start'' data mixture, which includes ``thinking'', ``non-thinking'', ``auto-think'', ``think with image'', and high-quality video data. This mixture teaches the model to decide when and how to reason. Subsequent reinforcement learning (RL) and alignment steps further enhance these reasoning capabilities and correct abnormal model behaviors, such as repetitive outputs. To validate our approach, we conduct extensive evaluations, showing that Keye-VL achieves state-of-the-art results on public video benchmarks and remains highly competitive on general image-based tasks (Figure 1). Furthermore, we develop and release the KC-MMBench, a new benchmark tailored for real-world short-video scenarios, where Keye-VL shows a significant advantage.
- LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
Animation colorization is a crucial part of real animation industry production. Long animation colorization has high labor costs. Therefore, automated long animation colorization based on the video generation model has significant research value. Existing studies are limited to short-term colorization. These studies adopt a local paradigm, fusing overlapping features to achieve smooth transitions between local segments. However, the local paradigm neglects global information, failing to maintain long-term color consistency. In this study, we argue that ideal long-term color consistency can be achieved through a dynamic global-local paradigm, i.e., dynamically extracting global color-consistent features relevant to the current generation. Specifically, we propose LongAnimation, a novel framework, which mainly includes a SketchDiT, a Dynamic Global-Local Memory (DGLM), and a Color Consistency Reward. The SketchDiT captures hybrid reference features to support the DGLM module. The DGLM module employs a long video understanding model to dynamically compress global historical features and adaptively fuse them with the current generation features. To refine the color consistency, we introduce a Color Consistency Reward. During inference, we propose a color consistency fusion to smooth the video segment transition. Extensive experiments on both short-term (14 frames) and long-term (average 500 frames) animations show the effectiveness of LongAnimation in maintaining short-term and long-term color consistency for the open-domain animation colorization task. The code can be found at https://cn-makers.github.io/long_animation_web/.
- WebSailor: Navigating Super-human Reasoning for Web Agent
Transcending human cognitive limitations represents a critical frontier in LLM training. Proprietary agentic systems like DeepResearch have demonstrated superhuman capabilities on extremely complex information-seeking benchmarks such as BrowseComp, a feat previously unattainable. We posit that their success hinges on a sophisticated reasoning pattern absent in open-source models: the ability to systematically reduce extreme uncertainty when navigating vast information landscapes. Based on this insight, we introduce WebSailor, a complete post-training methodology designed to instill this crucial capability. Our approach involves generating novel, high-uncertainty tasks through structured sampling and information obfuscation, RFT cold start, and an efficient agentic RL training algorithm, Duplicating Sampling Policy Optimization (DUPO). With this integrated pipeline, WebSailor significantly outperforms all open-source agents in complex information-seeking tasks, matching proprietary agents' performance and closing the capability gap.
- Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Math reasoning has become the poster child of progress in large language models (LLMs), with new models rapidly surpassing human-level performance on benchmarks like MATH and AIME. But as math leaderboards improve week by week, it is worth asking: do these gains reflect broader problem-solving ability or just narrow overfitting? To answer this question, we evaluate over 20 open-weight reasoning-tuned models across a broad suite of tasks, including math, scientific QA, agent planning, coding, and standard instruction-following. We surprisingly find that most models that succeed in math fail to transfer their gains to other domains. To rigorously study this phenomenon, we conduct controlled experiments on Qwen3-14B models using math-only data but different tuning methods. We find that reinforcement learning (RL)-tuned models generalize well across domains, while supervised fine-tuning (SFT)-tuned models often forget general capabilities. Latent-space representation and token-space distribution shift analyses reveal that SFT induces substantial representation and output drift, while RL preserves general-domain structure. Our results suggest a need to rethink standard post-training recipes, particularly the reliance on SFT-distilled data for advancing reasoning models.
- LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Recovering 3D structures with open-vocabulary scene understanding from 2D images is a fundamental but daunting task. Recent developments have achieved this by performing per-scene optimization with embedded language information. However, they heavily rely on the calibrated dense-view reconstruction paradigm, thereby suffering from severe rendering artifacts and implausible semantic synthesis when limited views are available. In this paper, we introduce a novel generative framework, coined LangScene-X, to unify and generate 3D consistent multi-modality information for reconstruction and understanding. Powered by the generative capability of creating more consistent novel observations, we can build generalizable 3D language-embedded scenes from only sparse views. Specifically, we first train a TriMap video diffusion model that can generate appearance (RGBs), geometry (normals), and semantics (segmentation maps) from sparse inputs through progressive knowledge integration. Furthermore, we propose a Language Quantized Compressor (LQC), trained on large-scale image datasets, to efficiently encode language embeddings, enabling cross-scene generalization without per-scene retraining. Finally, we reconstruct the language surface fields by aligning language information onto the surface of 3D scenes, enabling open-ended language queries. Extensive experiments on real-world data demonstrate the superiority of our LangScene-X over state-of-the-art methods in terms of quality and generalizability. Project Page: https://liuff19.github.io/LangScene-X.
- Depth Anything at Any Condition
We present Depth Anything at Any Condition (DepthAnything-AC), a foundation monocular depth estimation (MDE) model capable of handling diverse environmental conditions. Previous foundation MDE models achieve impressive performance across general scenes but do not perform well in complex open-world environments that involve challenging conditions, such as illumination variations, adverse weather, and sensor-induced distortions. To overcome the challenges of data scarcity and the inability to generate high-quality pseudo-labels from corrupted images, we propose an unsupervised consistency regularization finetuning paradigm that requires only a relatively small amount of unlabeled data. Furthermore, we propose the Spatial Distance Constraint to explicitly force the model to learn patch-level relative relationships, resulting in clearer semantic boundaries and more accurate details. Experimental results demonstrate the zero-shot capabilities of DepthAnything-AC across diverse benchmarks, including real-world adverse weather benchmarks, synthetic corruption benchmarks, and general benchmarks. Project Page: https://ghost233lism.github.io/depthanything-AC-page Code: https://github.com/HVision-NKU/DepthAnythingAC
- SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks
We present SciArena, an open and collaborative platform for evaluating foundation models on scientific literature tasks. Unlike traditional benchmarks for scientific literature understanding and synthesis, SciArena engages the research community directly, following the Chatbot Arena evaluation approach of community voting on model comparisons. By leveraging collective intelligence, SciArena offers a community-driven evaluation of model performance on open-ended scientific tasks that demand literature-grounded, long-form responses. The platform currently supports 23 open-source and proprietary foundation models and has collected over 13,000 votes from trusted researchers across diverse scientific domains. We analyze the data collected so far and confirm that the submitted questions are diverse, aligned with real-world literature needs, and that participating researchers demonstrate strong self-consistency and inter-annotator agreement in their evaluations. We discuss the results and insights based on the model ranking leaderboard. To further promote research in building model-based automated evaluation systems for literature tasks, we release SciArena-Eval, a meta-evaluation benchmark based on our collected preference data. The benchmark measures the accuracy of models in judging answer quality by comparing their pairwise assessments with human votes. Our experiments highlight the benchmark's challenges and emphasize the need for more reliable automated evaluation methods.
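The SciArena-Eval idea of scoring a model judge against human pairwise votes reduces to a simple agreement rate. A minimal sketch, assuming a hypothetical data format where each comparison is recorded as the label ("A" or "B") of the preferred answer; this is illustrative, not the platform's actual schema:

```python
def judge_accuracy(model_prefs, human_votes):
    """Fraction of pairwise comparisons where the model judge
    picks the same answer as the human vote (toy metric sketch)."""
    matches = sum(m == h for m, h in zip(model_prefs, human_votes))
    return matches / len(human_votes)

# Hypothetical comparisons: "A" or "B" marks the preferred answer in each pair.
human = ["A", "B", "A", "A"]
judge = ["A", "B", "B", "A"]
print(judge_accuracy(judge, human))  # 0.75
```

In the benchmark proper, the human side would come from the platform's collected preference votes rather than a hand-written list.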
- Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers
Recent progress in multimodal reasoning has been significantly advanced by textual Chain-of-Thought (CoT), a paradigm where models conduct reasoning within language. This text-centric approach, however, treats vision as a static, initial context, creating a fundamental "semantic gap" between rich perceptual data and discrete symbolic thought. Human cognition often transcends language, utilizing vision as a dynamic mental sketchpad. A similar evolution is now unfolding in AI, marking a fundamental paradigm shift from models that merely think about images to those that can truly think with images. This emerging paradigm is characterized by models leveraging visual information as intermediate steps in their thought process, transforming vision from a passive input into a dynamic, manipulable cognitive workspace. In this survey, we chart this evolution of intelligence along a trajectory of increasing cognitive autonomy, which unfolds across three key stages: from external tool exploration, through programmatic manipulation, to intrinsic imagination. To structure this rapidly evolving field, our survey makes four key contributions. (1) We establish the foundational principles of the think with image paradigm and its three-stage framework. (2) We provide a comprehensive review of the core methods that characterize each stage of this roadmap. (3) We analyze the critical landscape of evaluation benchmarks and transformative applications. (4) We identify significant challenges and outline promising future directions. By providing this structured overview, we aim to offer a clear roadmap for future research towards more powerful and human-aligned multimodal AI.
- Heeding the Inner Voice: Aligning ControlNet Training via Intermediate Features Feedback
Despite significant progress in text-to-image diffusion models, achieving precise spatial control over generated outputs remains challenging. ControlNet addresses this by introducing an auxiliary conditioning module, while ControlNet++ further refines alignment through a cycle consistency loss applied only to the final denoising steps. However, this approach neglects intermediate generation stages, limiting its effectiveness. We propose InnerControl, a training strategy that enforces spatial consistency across all diffusion steps. Our method trains lightweight convolutional probes to reconstruct input control signals (e.g., edges, depth) from intermediate UNet features at every denoising step. These probes efficiently extract signals even from highly noisy latents, enabling pseudo ground truth controls for training. By minimizing the discrepancy between predicted and target conditions throughout the entire diffusion process, our alignment loss improves both control fidelity and generation quality. Combined with established techniques like ControlNet++, InnerControl achieves state-of-the-art performance across diverse conditioning methods (e.g., edges, depth).
- Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation
Recent advances in diffusion models have enabled high-quality video generation, but the additional temporal dimension significantly increases computational costs, making training and inference on long videos prohibitively expensive. In this paper, we identify a phenomenon we term Spatiotemporal Energy Decay in video diffusion models: post-softmax attention scores diminish as the spatial and temporal distance between tokens increases, akin to the physical decay of signal or waves over space and time in nature. Motivated by this, we propose Radial Attention, a scalable sparse attention mechanism with O(n log n) complexity that translates energy decay into exponentially decaying compute density, which is significantly more efficient than standard O(n^2) dense attention and more expressive than linear attention. Specifically, Radial Attention employs a simple, static attention mask where each token attends to spatially nearby tokens, with the attention window size shrinking with temporal distance. Moreover, it allows pre-trained video diffusion models to extend their generation length with efficient LoRA-based fine-tuning. Extensive experiments show that Radial Attention maintains video quality across Wan2.1-14B, HunyuanVideo, and Mochi 1, achieving up to a 1.9× speedup over the original dense attention. With minimal tuning, it enables video generation up to 4× longer while reducing training costs by up to 4.4× compared to direct fine-tuning and accelerating inference by up to 3.7× compared to dense attention inference.
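The core mechanism, a static mask whose spatial attention window shrinks with temporal distance, can be sketched in a few lines. This is a toy illustration under assumed parameters (a halving schedule per frame of separation), not the paper's exact mask construction:

```python
import numpy as np

def radial_mask(num_frames, tokens_per_frame, base_window):
    """Toy static attention mask in the spirit of Radial Attention:
    each token attends to spatially nearby tokens, and the spatial
    window shrinks (here: halves) as temporal distance grows.
    The halving schedule and parameters are illustrative assumptions."""
    n = num_frames * tokens_per_frame
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        fi, si = divmod(i, tokens_per_frame)   # (frame, spatial position) of token i
        for j in range(n):
            fj, sj = divmod(j, tokens_per_frame)
            dt = abs(fi - fj)                  # temporal distance in frames
            window = max(1, base_window >> dt) # window halves per frame of separation
            if abs(si - sj) <= window:
                mask[i, j] = True
    return mask

m = radial_mask(num_frames=4, tokens_per_frame=8, base_window=4)
# Attention density falls off with temporal distance, so the mask is sparse overall.
```

Because the mask is static, it can be precomputed once and reused at every denoising step, which is what makes the sparsity pattern cheap relative to content-dependent sparse attention.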
- MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings
Multimodal embedding models, built upon causal Vision Language Models (VLMs), have shown promise in various tasks. However, current approaches face three key limitations: the use of causal attention in VLM backbones is suboptimal for embedding tasks; scalability issues due to reliance on high-quality labeled paired data for contrastive learning; and limited diversity in training objectives and data. To address these issues, we propose MoCa, a two-stage framework for transforming pre-trained VLMs into effective bidirectional multimodal embedding models. The first stage, Modality-aware Continual Pre-training, introduces a joint reconstruction objective that simultaneously denoises interleaved text and image inputs, enhancing bidirectional context-aware reasoning. The second stage, Heterogeneous Contrastive Fine-tuning, leverages diverse, semantically rich multimodal data beyond simple image-caption pairs to enhance generalization and alignment. Our method addresses the stated limitations by introducing bidirectional attention through continual pre-training, scaling effectively with massive unlabeled datasets via joint reconstruction objectives, and utilizing diverse multimodal data for enhanced representation robustness. Experiments demonstrate that MoCa consistently improves performance across MMEB and ViDoRe-v2 benchmarks, achieving new state-of-the-art results, and exhibits strong scalability with both model size and training data on MMEB.
- IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
We introduce IntFold, a controllable foundation model for both general and specialized biomolecular structure prediction. IntFold demonstrates predictive accuracy comparable to the state-of-the-art AlphaFold3, while utilizing a superior customized attention kernel. Beyond standard structure prediction, IntFold can be adapted to predict allosteric states, constrained structures, and binding affinity through the use of individual adapters. Furthermore, we introduce a novel confidence head to estimate docking quality, offering a more nuanced assessment for challenging targets such as antibody-antigen complexes. Finally, we share insights gained during the training process of this computationally intensive model.
- Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
Despite the critical role of reward models (RMs) in reinforcement learning from human feedback (RLHF), current state-of-the-art open RMs perform poorly on most existing evaluation benchmarks, failing to capture the spectrum of nuanced and sophisticated human preferences. Even approaches that incorporate advanced training techniques have not yielded meaningful performance improvements. We hypothesize that this brittleness stems primarily from limitations in preference datasets, which are often narrowly scoped, synthetically labeled, or lack rigorous quality control. To address these challenges, we present a large-scale preference dataset comprising 40 million preference pairs, named SynPref-40M. To enable data curation at scale, we design a human-AI synergistic two-stage pipeline that leverages the complementary strengths of human annotation quality and AI scalability. In this pipeline, humans provide verified annotations, while large language models perform automatic curation based on human guidance. Training on this preference mixture, we introduce Skywork-Reward-V2, a suite of eight reward models ranging from 0.6B to 8B parameters, trained on a carefully curated subset of 26 million preference pairs from SynPref-40M. We demonstrate that Skywork-Reward-V2 is versatile across a wide range of capabilities, including alignment with human preferences, objective correctness, safety, resistance to stylistic biases, and best-of-N scaling, achieving state-of-the-art performance across seven major reward model benchmarks. Ablation studies confirm that the effectiveness of our approach stems not only from data scale but also from high-quality curation. The Skywork-Reward-V2 series represents substantial progress in open reward models, highlighting the untapped potential of existing preference datasets and demonstrating how human-AI curation synergy can unlock significantly higher data quality.
- A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
The remarkable advancements of vision and language foundation models in multimodal understanding, reasoning, and generation have sparked growing efforts to extend such intelligence to the physical world, fueling the flourishing of vision-language-action (VLA) models. Despite seemingly diverse approaches, we observe that current VLA models can be unified under a single framework: vision and language inputs are processed by a series of VLA modules, producing a chain of action tokens that progressively encode more grounded and actionable information, ultimately generating executable actions. We further determine that the primary design choice distinguishing VLA models lies in how action tokens are formulated, which can be categorized into language description, code, affordance, trajectory, goal state, latent representation, raw action, and reasoning. However, there remains a lack of comprehensive understanding regarding action tokens, significantly impeding effective VLA development and obscuring future directions. Therefore, this survey aims to categorize and interpret existing VLA research through the lens of action tokenization, distill the strengths and limitations of each token type, and identify areas for improvement. Through this systematic review and analysis, we offer a synthesized outlook on the broader evolution of VLA models, highlight underexplored yet promising directions, and contribute guidance for future research, hoping to bring the field closer to general-purpose intelligence.
Solidot (15)
- Microsoft Xbox executive advises laid-off employees to use AI to manage their emotions
Microsoft announced this week that it is laying off more than 9,000 employees, with the Xbox gaming business hit hard: studios have been closed and multiple game projects canceled. In response, Xbox executive Matt Turnbull offered a suggestion: laid-off employees should use AI to manage their emotions. He posted the advice on LinkedIn; the post has since been deleted, but its contents were preserved. He said he had been experimenting with LLM tools such as ChatGPT and Copilot to reduce the emotional and cognitive burden of losing a job, adding that he would be remiss not to offer the best advice he could.
- The US faces the largest brain drain in its history
From World War II through 2024, the United States was the free world's undisputed scientific leader. A society that valued facts, scientific truth, education, and the public good drove generation after generation of breakthroughs and progress. But since January 2025, America's most prestigious research institutions, including NOAA, NASA, NSF, CDC, EPA, and FDA, have come under an unprecedented series of internal attacks. With the new budget (the "One Big Beautiful Bill") approved by the House and Senate and about to become law, the American research model as we know it may become a thing of the past. For American scientists, this is a real-life nightmare. Even if the worst happens, there is still reason for hope. Hitler's Germany also severely damaged its own scientific research, but the scientists who fled ultimately benefited the rest of the world, an episode known as "Hitler's gift." We may witness the decline of American science and the rise of other scientific powers. NASA has already cut more than 2,500 staff and is asking another 3,000 to take voluntary retirement, with the science divisions hit hardest. Of NASA's 124 ongoing missions, 41 face outright cancellation, and the gravitational-wave detector LISA may lose its NASA funding. The Trump administration has canceled its subscription to Nature, an episode with an almost exact Nazi-era parallel: Nature was banned from German libraries from 1937 until the end of World War II. If the US does not change course, 2025 will not only mark the end of the era of American scientific exceptionalism; the scale of the American brain drain may dwarf even "Hitler's gift."
- Using emoji well leaves a good impression in conversation
Worldwide, emoji are used more than 10 billion times a day, injecting subtle emotion into digital conversations. Yet their actual effect on how people interpret those conversations has been unclear: although the little symbols are often read positively, they are sometimes misread and cause misunderstandings. Researchers therefore assessed how emoji affect people's impressions of the sender. In the study, 260 US participants were asked to read 15 text-based conversations and to imagine having those exchanges with a close friend. The conversations contained either plain-text replies only or replies with emoji. After reading the samples, participants answered a series of questions about how they felt toward the message sender. Overall, participants rated messages containing emoji as more responsive than plain-text messages, which made the sender more likable and the relationship feel closer. Surprisingly, the effect was independent of the type of emoji used: emoji that directly express the sender's emotions, such as smiley faces, and neutral emoji depicting objects made no substantive difference.
- Anthem servers to shut down in January 2026
EA announced that the servers for Anthem will shut down on January 12, 2026. BioWare's online mech game Anthem launched in February 2019 to scathing reviews over its many problems. In February 2020 BioWare announced a full rework of Anthem, but a year later, in February 2021, it canceled the redevelopment while keeping the online service running. Now that EA has announced a shutdown timetable, the game will die completely.
- 300,000-year-old wooden tools unearthed at the Gantangqing site in Yunnan
Chinese archaeologists excavating the Gantangqing site in Jiangchuan County, Yunnan Province, have unearthed 35 well-preserved wooden implements dated to roughly 300,000 years ago, along with associated cultural remains including large numbers of stone artifacts, bone and antler tools, animal fossils, and plant remains. The wooden implements and antler "soft hammers" from Gantangqing are the earliest such finds in East Asia and are rare among Paleolithic sites worldwide. The site was discovered in 1984 and first excavated in 1989; further excavations in 2014-2015 and 2018-2019 yielded abundant stone artifacts, animal fossils, wooden material, and plant seeds. The newly published material all comes from the two most recent excavations. The team dates human activity at the site to roughly 360,000-250,000 years ago. The research reshapes the field's understanding of Paleolithic humans' survival abilities and lifeways and of the characteristics and origins of East Asian Paleolithic culture, for example the role of bamboo and wooden tools in the lives of ancient humans in East and Southeast Asia and the empirical support this lends the Paleolithic "bamboo and wood hypothesis." The find reveals for the first time an ancient foraging economy, showcases the broad range of foodstuffs on the prehistoric menu, provides solid evidence that these people used wooden tools to dig up edible plant roots and tubers, and illuminates the distinctive resource-use strategies and adaptations of ancient populations living in the tropical and subtropical environments of Southeast Asia.
- Scientists warn the US may lose a generation of talent
The Trump administration is dismantling the National Science Foundation (NSF), and scientists warn that an entire generation of talent may drain from the US to its competitors. Current and former NSF employees said in interviews that the peer-review process the NSF uses to support cutting-edge, high-impact science has been damaged by chaotic cuts to staff, programs, and grants and by interference from the Department of Government Efficiency (DOGE). Scientists warn that the administration's attacks on diversity in science have already harmed the quality of NSF-funded basic research. The NSF is the most important US federal funder of basic science and engineering; a week earlier, more than 1,800 of its staff were ordered out of its headquarters, whose building was taken over by the Department of Housing and Urban Development.
- Air pollution and traditional herbal medicine linked to lung cancer
A new study suggests that air pollution, traditional herbal remedies, and other environmental exposures are associated with gene mutations that may cause lung cancer in people with little or no smoking history. The team analyzed lung cancer tissue from 871 never-smokers living in 28 regions of Africa, Asia, Europe, and North America with varying levels of air pollution. Using whole-genome sequencing, the researchers identified distinctive DNA mutation patterns. They found that never-smokers living in more polluted environments had significantly more mutations in their lung tumors. The study also found that a particular mutational signature linked to aristolochic acid, a carcinogen found in traditional Chinese herbal medicine, appeared almost exclusively in lung cancer cases among never-smokers in Taiwan. Ingestion of aristolochic acid had previously been linked to bladder, gastrointestinal, kidney, and liver cancers, but this is the first evidence that it may cause lung cancer.
- EVs were 96.9% of Norway's new-car sales in June
According to the Norwegian Public Roads Administration (OFV), electric vehicles accounted for 96.9% of new-car sales in June 2025, with the Tesla Model Y alone taking more than 27%. 18,376 new cars were registered in Norway in June, of which 17,799 were EVs, 3,790 more EV registrations than in June 2024. Only 577 non-pure-electric cars were registered in June 2025, among them 152 plug-in hybrids, 223 other hybrids, and internal-combustion cars including 142 diesel and 57 petrol vehicles. The OFV said promotional campaigns boosted EV sales, and it remains to be seen how long Tesla can hold its position.
- Stop Killing Games campaign draws more than a million signatures
The Stop Killing Games campaign, launched by YouTuber Accursed Farms, has won broad attention. It aims to make games work more like books: once players buy them, they own them and can use them at any time, rather than losing access when the publisher shuts down the servers. The campaign's UK petition has gathered 150,000 signatures, enough to qualify for a parliamentary debate, and its EU petition has drawn 1.07 million. Government regulators may need to step in before the games industry changes its current practices.
- One in seven medical paper abstracts published in 2024 may be AI-written
A large-scale analysis of the academic literature shows that about one in seven biomedical paper abstracts published last year may have been written with AI assistance. Of the 1.5 million abstracts indexed by the medical database PubMed in 2024, more than 200,000 contained words that large language models (LLMs) tend to favor. Many teams have tried to gauge the impact of LLMs on scholarly output, but doing so is challenging because most users do not disclose the practice. The researchers estimated whether an abstract was AI-assisted using stylistic vocabulary that became common after LLMs took off. They found 454 words whose frequency in 2024 far exceeded that of any year since 2010; these were mostly "style words" unrelated to research content, predominantly verbs and adjectives. Scientific vocabulary normally evolves slowly: in 2021 there were 190 "excess words," mostly content-related nouns. The vocabulary shift since LLMs became widespread is far more pronounced, and mainly stylistic. In fields such as computational science and bioinformatics, the researchers found that more than one in five abstracts had been written with LLM assistance.
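The excess-word analysis described above boils down to comparing per-document word frequencies in a recent corpus against a pre-LLM baseline and flagging words that spike. A minimal sketch with hypothetical toy corpora and illustrative thresholds (the study's actual statistics are more sophisticated):

```python
from collections import Counter

def excess_words(baseline_docs, recent_docs, min_count=2, ratio=3.0):
    """Flag words whose per-document frequency in the recent corpus far
    exceeds their baseline frequency (toy sketch; thresholds are assumptions)."""
    def freq(docs):
        counts = Counter(w for d in docs for w in d.lower().split())
        return {w: c / len(docs) for w, c in counts.items()}, counts

    base_f, _ = freq(baseline_docs)
    recent_f, recent_c = freq(recent_docs)
    flagged = []
    for w, f in recent_f.items():
        base = base_f.get(w, 0.0)
        # Flag words absent from the baseline, or whose frequency jumped.
        if recent_c[w] >= min_count and (base == 0.0 or f / base >= ratio):
            flagged.append(w)
    return sorted(flagged)

baseline = ["we measured the effect", "we report the result"]
recent = ["we delve into the intricate effect", "we delve into intricate results"]
flagged = excess_words(baseline, recent)
# "delve" and "intricate" recur in the recent docs but never in the baseline.
```

A real analysis would also control for overall corpus growth and topic drift, which is why the study contrasts stylistic words against content nouns.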
- Clothoff seeks to dominate deepfake pornography
According to information disclosed by a Clothoff whistleblower, the deepfake pornography app is planning a global expansion in a bid to dominate the field. Clothoff has already acquired at least 10 similar services, which collectively attract hundreds of thousands to millions of visits a month. The whistleblower says Clothoff's annual budget is about $3.5 million, and its current marketing relies mainly on Telegram bots and X channels to advertise to young men likely to use the app. Most of its marketing budget goes to Telegram channels, Reddit sex subreddits, and 4chan.
- Genome sequencing reveals the ancestry of an ancient Egyptian
In a new study, scientists performed whole-genome sequencing on an ancient Egyptian from an Egyptian tomb. The individual was male, radiocarbon-dated to roughly 2855-2570 BCE. He was found buried in a sealed ceramic vessel at Nuwayrat in ancient Egypt, indicating high social status, and he lived to an advanced age for his era, between 44 and 64. Of the seven DNA samples extracted, two were well enough preserved for sequencing and were compared against a database of 3,233 modern and 805 ancient individuals. Genetic modeling traces the great majority of the Nuwayrat genome to Neolithic North African ancestry. About 20% of the genome is linked to populations of the eastern Fertile Crescent, complementing archaeological evidence of trade and mutual influence between the two regions.
- Sponge-like material uses solar heat to desalinate seawater
Most of Earth's water is seawater, too salty to drink. Desalination plants can turn it into drinking water, but the process consumes enormous amounts of energy. A Hong Kong research team, publishing in ACS Energy Letters, has developed a sponge-like material with long chains of micro air pockets that, combined with sunlight and a simple plastic cover, converts salt water into fresh water. An outdoor proof-of-concept produced drinkable water under natural sunlight, marking a step toward low-energy, sustainable desalination. In the outdoor test, the researchers placed the material in a container of seawater covered by a curved, transparent plastic dome. As sunlight heated the top of the sponge-like material, only the water evaporated (the salt was left behind). The vapor condensed on the inside of the plastic cover, ran down to its edge, and dripped through a funnel into a second container below. After six hours of natural sunlight, the system produced about three tablespoons of drinking water.
- Exoplanet triggers flares on its host star
Astronomers recently found that an exoplanet named HIP 67522 b has a highly unusual relationship with its host star, HIP 67522. The planet orbits so close to its star that it triggers frequent, violent flares on the stellar surface, while its own atmosphere is continually heated and puffed up. HIP 67522 is a young G-type star in the constellation Centaurus, about 417 light-years from Earth and only about 17 million years old. The star hosts two planets; HIP 67522 b is a "hot Jupiter," roughly Jupiter-sized, orbiting so close to its star that one revolution takes only seven days. The team found that the planet appears to couple to the star's magnetic field in an unusual way, triggering violent flare activity on the stellar surface. When these flares erupt toward the planet, they feed large amounts of energy back into it, inflating its atmosphere like a balloon. Over the long run, the planet's atmosphere may be severely stripped, shrinking it from a large hot Jupiter to something like a "hot Neptune" or "sub-Neptune." Such strong star-planet interaction has long been predicted in theory, but this is the first time it has actually been observed.
- Men and women respond similarly to a baby crying at night
A study at Aarhus University in Denmark found that women are not innately more likely than men to be woken by a baby crying at night; women are, however, about three times more likely to do the nighttime caregiving. The researchers conducted two independent studies. The first, an experiment with 142 childless adults, found that women responded slightly more strongly to very quiet sounds: at whisper-level volumes, whether a baby's cry or an ordinary alarm, women were 14% more likely than men to wake. At louder volumes there was no significant difference between the sexes. In the second study, 117 first-time parent couples in Denmark logged their nighttime caregiving for a week; mothers were three times as likely as fathers to handle nighttime infant care. The researchers argue that social factors rather than physiological differences explain the gap. Denmark recently extended paternity leave from two weeks to eleven, which may help balance parenting responsibilities between mothers and fathers.