OrangeBot.AI Digest — 2025-10-11
57 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Microsoft only lets you opt out of AI photo scanning 3x a year (hardware.slashdot.org)
- Discord hack shows risks of online age checks (news.sky.com)
- Tennessee man arrested, accused of threatening a shooting, after posting meme (reason.com)
- Testing two 18 TB white label SATA hard drives from datablocks.dev (ounapuu.ee)
- GNU Health (www.gnuhealth.org)
- Microsoft Amplifier (github.com)
- Vibing a non-trivial Ghostty feature (mitchellh.com)
- Firefox is the best mobile browser (kelvinjps.com)
- Windows Subsystem for FreeBSD (github.com)
- Superpowers: How I'm using coding agents in October 2025 (blog.fsck.com)
- The World Trade Center under construction through photos, 1966-1979 (rarehistoricalphotos.com)
- Daniel Kahneman opted for assisted suicide in Switzerland (www.bluewin.ch)
- AV2 video codec delivers 30% lower bitrate than AV1, final spec due in late 2025 (videocardz.com)
- The <output> Tag (denodell.com)
- AMD and Sony's PS6 chipset aims to rethink the current graphics pipeline (arstechnica.com)
GitHub Trending(12)
- anthropics / claude-code
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.
- QwenLM / Qwen3-VL
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
- MODSetter / SurfSense
Open Source Alternative to NotebookLM / Perplexity, connected to external sources such as Search Engines, Slack, Linear, Jira, ClickUp, Confluence, Notion, YouTube, GitHub, Discord and more. Join our discord: https://discord.gg/ejRNvftDp9
- davila7 / claude-code-templates
CLI tool for configuring and monitoring Claude Code
- timelinize / timelinize
Store your data from all your accounts and devices in a single cohesive timeline on your own computer
- PixelGuys / Cubyz
Voxel sandbox game with a large render distance, procedurally generated content and some cool graphical effects.
- supermemoryai / supermemory
Memory engine and app that is extremely fast, scalable. The Memory API for the AI era.
- evershopcommerce / evershop
🛍️ Typescript E-commerce Platform
- CapSoftware / Cap
Open source Loom alternative. Beautiful, shareable screen recordings.
- dataease / SQLBot
🔥 基于大模型和 RAG 的智能问数系统。Text-to-SQL Generation via LLMs using RAG.
- TibixDev / winboat
Run Windows apps on 🐧 Linux with ✨ seamless integration
- clash-verge-rev / clash-verge-rev
A modern GUI client based on Tauri, designed to run in Windows, macOS and Linux for tailored proxy experience
Hugging Face(15)
- Agent Learning via Early Experience
A long-term goal of language agents is to learn and improve through their own experience, ultimately outperforming humans in complex, real-world tasks. However, training agents from experience data with reinforcement learning remains difficult in many environments, which either lack verifiable rewards (e.g., websites) or require inefficient long-horizon rollouts (e.g., multi-turn tool use). As a result, most current agents rely on supervised fine-tuning on expert data, which is challenging to scale and generalizes poorly. This limitation stems from the nature of expert demonstrations: they capture only a narrow range of scenarios and expose the agent to limited environment diversity. We address this limitation with a middle-ground paradigm we call early experience: interaction data generated by the agent's own actions, where the resulting future states serve as supervision without reward signals. Within this paradigm we study two strategies of using such data: (1) Implicit world modeling, which uses collected states to ground the policy in environment dynamics; and (2) Self-reflection, where the agent learns from its suboptimal actions to improve reasoning and decision-making. We evaluate across eight diverse environments and multiple model families. Our approaches consistently improve effectiveness and out-of-domain generalization, highlighting the value of early experience. Moreover, in environments with verifiable rewards, our results provide promising signals that early experience offers a strong foundation for subsequent reinforcement learning, positioning it as a practical bridge between imitation learning and fully experience-driven agents.
- MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization
While current Multimodal Large Language Models (MLLMs) have demonstrated proficiency in reasoning tasks such as mathematics and logic, their capacity for long-chain reflective reasoning, a prerequisite for solving complex real-world problems, remains largely underexplored. In this work, we first conduct an extensive empirical investigation to evaluate this capability. Leveraging a carefully designed data synthesis engine, we construct MM-HELIX, a multimodal benchmark consisting 1,260 samples of 42 challenging synthetic tasks that require iterative thinking and backtracking. Empirical results on this benchmark reveal that existing MLLMs exhibit significant performance deficits in long-chain reflective reasoning. To address this limitation, we generate post-training data and further explore learning paradigms for exploiting such data. We first develop the Step-Elicited Response Generation pipeline to create MM-HELIX-100K, a large-scale dataset of 100k high-quality, reflective reasoning traces for instruction-tuning stage. Given that standard Reinforcement Learning fails on complex tasks due to sparse reward signals and catastrophic forgetting after Supervised Fine-Tuning, we propose Adaptive Hybrid Policy Optimization (AHPO), a novel training strategy that dynamically unifies offline supervision and online optimization into a single stage. This strategy enables the model to learn from expert data when rewards are sparse and conduct independent exploration once proficient. When applied to the Qwen2.5-VL-7B baseline, our method achieves a +18.6\% accuracy improvement on MM-HELIX benchmark and demonstrates strong generalization with a +5.7\% average performance gain on general mathematic and logic tasks. Our work demonstrate that reflective reasoning in MLLMs can be effectively learned and generalized, paving the way for developing more capable MLLMs.
- MemMamba: Rethinking Memory Patterns in State Space Model
With the explosive growth of data, long-sequence modeling has become increasingly important in tasks such as natural language processing and bioinformatics. However, existing methods face inherent trade-offs between efficiency and memory. Recurrent neural networks suffer from gradient vanishing and explosion, making them hard to scale. Transformers can model global dependencies but are constrained by quadratic complexity. Recently, selective state-space models such as Mamba have demonstrated high efficiency with O(n) time and O(1) recurrent inference, yet their long-range memory decays exponentially. In this work, we conduct mathematical derivations and information-theoretic analysis to systematically uncover the memory decay mechanism of Mamba, answering a fundamental question: what is the nature of Mamba's long-range memory and how does it retain information? To quantify key information loss, we further introduce horizontal-vertical memory fidelity metrics that capture degradation both within and across layers. Inspired by how humans distill and retain salient information when reading long documents, we propose MemMamba, a novel architectural framework that integrates state summarization mechanism together with cross-layer and cross-token attention, which alleviates long-range forgetting while preserving linear complexity. MemMamba achieves significant improvements over existing Mamba variants and Transformers on long-sequence benchmarks such as PG19 and Passkey Retrieval, while delivering a 48% speedup in inference efficiency. Both theoretical analysis and empirical results demonstrate that MemMamba achieves a breakthrough in the complexity-memory trade-off, offering a new paradigm for ultra-long sequence modeling.
- VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning
We introduce the task of arbitrary spatio-temporal video completion, where a video is generated from arbitrary, user-specified patches placed at any spatial location and timestamp, akin to painting on a video canvas. This flexible formulation naturally unifies many existing controllable video generation tasks--including first-frame image-to-video, inpainting, extension, and interpolation--under a single, cohesive paradigm. Realizing this vision, however, faces a fundamental obstacle in modern latent video diffusion models: the temporal ambiguity introduced by causal VAEs, where multiple pixel frames are compressed into a single latent representation, making precise frame-level conditioning structurally difficult. We address this challenge with VideoCanvas, a novel framework that adapts the In-Context Conditioning (ICC) paradigm to this fine-grained control task with zero new parameters. We propose a hybrid conditioning strategy that decouples spatial and temporal control: spatial placement is handled via zero-padding, while temporal alignment is achieved through Temporal RoPE Interpolation, which assigns each condition a continuous fractional position within the latent sequence. This resolves the VAE's temporal ambiguity and enables pixel-frame-aware control on a frozen backbone. To evaluate this new capability, we develop VideoCanvasBench, the first benchmark for arbitrary spatio-temporal video completion, covering both intra-scene fidelity and inter-scene creativity. Experiments demonstrate that VideoCanvas significantly outperforms existing conditioning paradigms, establishing a new state of the art in flexible and unified video generation.
- UniVideo: Unified Understanding, Generation, and Editing for Videos
Unified multimodal models have shown promising results in multimodal content generation and editing but remain largely limited to the image domain. In this work, we present UniVideo, a versatile framework that extends unified modeling to the video domain. UniVideo adopts a dual-stream design, combining a Multimodal Large Language Model (MLLM) for instruction understanding with a Multimodal DiT (MMDiT) for video generation. This design enables accurate interpretation of complex multimodal instructions while preserving visual consistency. Built on this architecture, UniVideo unifies diverse video generation and editing tasks under a single multimodal instruction paradigm and is jointly trained across them. Extensive experiments demonstrate that UniVideo matches or surpasses state-of-the-art task-specific baselines in text/image-to-video generation, in-context video generation and in-context video editing. Notably, the unified design of UniVideo enables two forms of generalization. First, UniVideo supports task composition, such as combining editing with style transfer, by integrating multiple capabilities within a single instruction. Second, even without explicit training on free-form video editing, UniVideo transfers its editing capability from large-scale image editing data to this setting, handling unseen instructions such as green-screening characters or changing materials within a video. Beyond these core capabilities, UniVideo also supports visual-prompt-based video generation, where the MLLM interprets visual prompts and guides the MMDiT during synthesis. To foster future research, we will release our model and code.
- From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning
The chemical reaction recommendation is to select proper reaction condition parameters for chemical reactions, which is pivotal to accelerating chemical science. With the rapid development of large language models (LLMs), there is growing interest in leveraging their reasoning and planning capabilities for reaction condition recommendation. Despite their success, existing methods rarely explain the rationale behind the recommended reaction conditions, limiting their utility in high-stakes scientific workflows. In this work, we propose ChemMAS, a multi-agent system that reframes condition prediction as an evidence-based reasoning task. ChemMAS decomposes the task into mechanistic grounding, multi-channel recall, constraint-aware agentic debate, and rationale aggregation. Each decision is backed by interpretable justifications grounded in chemical knowledge and retrieved precedents. Experiments show that ChemMAS achieves 20-35% gains over domain-specific baselines and outperforms general-purpose LLMs by 10-15% in Top-1 accuracy, while offering falsifiable, human-trustable rationales, which establishes a new paradigm for explainable AI in scientific discovery.
- Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Recent studies on reasoning models explore the meta-awareness of language models, the ability to know how to think by itself. We argue that large reasoning models lack this meta-awareness property by proving severe misalignment between true rollouts and predicted meta information. We posit that aligning meta-prediction with true rollouts will lead to significant performance gains. To verify this hypothesis, we design a training pipeline that boosts Meta-Awareness via Self-Alignment (MASA), and prove that enhanced meta-awareness directly translates to improved accuracy. Unlike existing meta-cognitive reasoning models, our method does not require external training sources but leverages self-generated signals to train meta-awareness. Moreover, our method enables efficient training by i) filtering out zero-variance prompts that are either trivial or unsolvable and ii) cutting off lengthy rollouts when they are unlikely to lead to correct answers. The results are inspiring: our strategy yields significant improvements in both accuracy and training efficiency on in-domain tasks and shows strong generalization to out-of-domain benchmarks. More specifically, our method can speed up GRPO training by over 1.28x to reach the same performance, and achieve a 19.3% gain in accuracy on AIME25, and a 6.2 % average gain over six mathematics benchmarks. Training with meta-cognitive guidance enhances out-of-domain generalization, giving a 3.87 % boost on GPQA-Diamond and a 2.08 % overall accuracy gain across 13 benchmarks spanning logical, scientific, and coding domains.
- When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Recent Long-Context Language Models (LCLMs) can process hundreds of thousands of tokens in a single prompt, enabling new opportunities for knowledge-intensive multi-hop reasoning by integrating large sets of retrieved documents or, in some cases, directly all necessary information. However, simply feeding more documents into the context window fails to capture how evidence should be connected. We address this gap with thought templates, which recast reasoning as reusable thought caches, derived from prior problem solving traces, structuring how evidence is combined and guiding multi-hop inference with factual documents. To keep these templates effective, we propose an update strategy that iteratively refines templates derived from training data through natural-language feedback. Across diverse benchmarks and LCLM families, our approach delivers consistent gains over strong baselines in both retrieval-based and retrieval-free settings. Furthermore, we show that optimized templates can be distilled into smaller open-source models, demonstrating its broad applicability and transparent reasoning reuse. We refer to our framework as Thought Template Augmented LCLMs (ToTAL).
- DreamOmni2: Multimodal Instruction-based Editing and Generation
Recent advancements in instruction-based image editing and subject-driven generation have garnered significant attention, yet both tasks still face limitations in meeting practical user needs. Instruction-based editing relies solely on language instructions, which often fail to capture specific editing details, making reference images necessary. Meanwhile, subject-driven generation is limited to combining concrete objects or people, overlooking broader, abstract concepts. To address these challenges, we propose two novel tasks: multimodal instruction-based editing and generation. These tasks support both text and image instructions and extend the scope to include both concrete and abstract concepts, greatly enhancing their practical applications. We introduce DreamOmni2, tackling two primary challenges: data creation and model framework design. Our data synthesis pipeline consists of three steps: (1) using a feature mixing method to create extraction data for both abstract and concrete concepts, (2) generating multimodal instruction-based editing training data using the editing and extraction models, and (3) further applying the extraction model to create training data for multimodal instruction-based editing. For the framework, to handle multi-image input, we propose an index encoding and position encoding shift scheme, which helps the model distinguish images and avoid pixel confusion. Additionally, we introduce joint training with the VLM and our generation/editing model to better process complex instructions. In addition, we have proposed comprehensive benchmarks for these two new tasks to drive their development. Experiments show that DreamOmni2 has achieved impressive results. Models and codes will be released.
- Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Reinforcement Learning with Verifiable Rewards (RLVR) has propelled Large Language Models in complex reasoning, yet its scalability is often hindered by a training bottleneck where performance plateaus as policy entropy collapses, signaling a loss of exploration. Previous methods typically address this by maintaining high policy entropy, yet the precise mechanisms that govern meaningful exploration have remained underexplored. Our analysis suggests that an unselective focus on entropy risks amplifying irrelevant tokens and destabilizing training. This paper investigates the exploration dynamics within RLVR and identifies a key issue: the gradual elimination of valuable low-probability exploratory tokens, which we term \textit{reasoning sparks}. We find that while abundant in pre-trained models, these sparks are systematically extinguished during RLVR due to over-penalization, leading to a degeneracy in exploration. To address this, we introduce Low-probability Regularization (Lp-Reg). Its core mechanism regularizes the policy towards a heuristic proxy distribution. This proxy is constructed by filtering out presumed noise tokens and re-normalizing the distribution over the remaining candidates. The result is a less-noisy proxy where the probability of reasoning sparks is amplified, which then serves as a soft regularization target to shield these valuable tokens from elimination via KL divergence. Experiments show that Lp-Reg enables stable on-policy training for around 1,000 steps, a regime where baseline entropy-control methods collapse. This sustained exploration leads to state-of-the-art performance, achieving a 60.17% average accuracy on five math benchmarks, an improvement of 2.66% over prior methods. Code is available at https://github.com/CarlanLark/Lp-Reg.
- The Alignment Waltz: Jointly Training Agents to Collaborate for Safety
Harnessing the power of LLMs requires a delicate dance between being helpful and harmless. This creates a fundamental tension between two competing challenges: vulnerability to adversarial attacks that elicit unsafe content, and a tendency for overrefusal on benign but sensitive prompts. Current approaches often navigate this dance with safeguard models that completely reject any content that contains unsafe portions. This approach cuts the music entirely-it may exacerbate overrefusals and fails to provide nuanced guidance for queries it refuses. To teach models a more coordinated choreography, we propose WaltzRL, a novel multi-agent reinforcement learning framework that formulates safety alignment as a collaborative, positive-sum game. WaltzRL jointly trains a conversation agent and a feedback agent, where the latter is incentivized to provide useful suggestions that improve the safety and helpfulness of the conversation agent's responses. At the core of WaltzRL is a Dynamic Improvement Reward (DIR) that evolves over time based on how well the conversation agent incorporates the feedback. At inference time, unsafe or overrefusing responses from the conversation agent are improved rather than discarded. The feedback agent is deployed together with the conversation agent and only engages adaptively when needed, preserving helpfulness and low latency on safe queries. Our experiments, conducted across five diverse datasets, demonstrate that WaltzRL significantly reduces both unsafe responses (e.g., from 39.0% to 4.6% on WildJailbreak) and overrefusals (from 45.3% to 9.9% on OR-Bench) compared to various baselines. By enabling the conversation and feedback agents to co-evolve and adaptively apply feedback, WaltzRL enhances LLM safety without degrading general capabilities, thereby advancing the Pareto front between helpfulness and harmlessness.
- Training-Free Group Relative Policy Optimization
Recent advances in Large Language Model (LLM) agents have demonstrated their promising general capabilities. However, their performance in specialized real-world domains often degrades due to challenges in effectively integrating external tools and specific prompting strategies. While methods like agentic reinforcement learning have been proposed to address this, they typically rely on costly parameter updates, for example, through a process that uses Supervised Fine-Tuning (SFT) followed by a Reinforcement Learning (RL) phase with Group Relative Policy Optimization (GRPO) to alter the output distribution. However, we argue that LLMs can achieve a similar effect on the output distribution by learning experiential knowledge as a token prior, which is a far more lightweight approach that not only addresses practical data scarcity but also avoids the common issue of overfitting. To this end, we propose Training-Free Group Relative Policy Optimization (Training-Free GRPO), a cost-effective solution that enhances LLM agent performance without any parameter updates. Our method leverages the group relative semantic advantage instead of numerical ones within each group of rollouts, iteratively distilling high-quality experiential knowledge during multi-epoch learning on a minimal ground-truth data. Such knowledge serves as the learned token prior, which is seamlessly integrated during LLM API calls to guide model behavior. Experiments on mathematical reasoning and web searching tasks demonstrate that Training-Free GRPO, when applied to DeepSeek-V3.1-Terminus, significantly improves out-of-domain performance. With just a few dozen training samples, Training-Free GRPO outperforms fine-tuned small LLMs with marginal training data and cost.
- Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
Post-training for reasoning of large language models (LLMs) increasingly relies on verifiable rewards: deterministic checkers that provide 0-1 correctness signals. While reliable, such binary feedback is brittle--many tasks admit partially correct or alternative answers that verifiers under-credit, and the resulting all-or-nothing supervision limits learning. Reward models offer richer, continuous feedback, which can serve as a complementary supervisory signal to verifiers. We introduce HERO (Hybrid Ensemble Reward Optimization), a reinforcement learning framework that integrates verifier signals with reward-model scores in a structured way. HERO employs stratified normalization to bound reward-model scores within verifier-defined groups, preserving correctness while refining quality distinctions, and variance-aware weighting to emphasize challenging prompts where dense signals matter most. Across diverse mathematical reasoning benchmarks, HERO consistently outperforms RM-only and verifier-only baselines, with strong gains on both verifiable and hard-to-verify tasks. Our results show that hybrid reward design retains the stability of verifiers while leveraging the nuance of reward models to advance reasoning.
- NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents
Large language models are emerging as powerful tools for scientific law discovery, a foundational challenge in AI-driven science. However, existing benchmarks for this task suffer from a fundamental methodological trilemma, forcing a trade-off between scientific relevance, scalability, and resistance to memorization. Furthermore, they oversimplify discovery as static function fitting, failing to capture the authentic scientific process of uncovering embedded laws through the interactive exploration of complex model systems. To address these critical gaps, we introduce NewtonBench, a benchmark comprising 324 scientific law discovery tasks across 12 physics domains. Our design mitigates the evaluation trilemma by using metaphysical shifts - systematic alterations of canonical laws - to generate a vast suite of problems that are scalable, scientifically relevant, and memorization-resistant. Moreover, we elevate the evaluation from static function fitting to interactive model discovery, requiring agents to experimentally probe simulated complex systems to uncover hidden principles. Our extensive experiment reveals a clear but fragile capability for discovery in frontier LLMs: this ability degrades precipitously with increasing system complexity and exhibits extreme sensitivity to observational noise. Notably, we uncover a paradoxical effect of tool assistance: providing a code interpreter can hinder more capable models by inducing a premature shift from exploration to exploitation, causing them to satisfice on suboptimal solutions. These results demonstrate that robust, generalizable discovery in complex, interactive environments remains the core challenge. By providing a scalable, robust, and scientifically authentic testbed, NewtonBench offers a crucial tool for measuring true progress and guiding the development of next-generation AI agents capable of genuine scientific discovery.
- ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation
On-the-fly 3D reconstruction from monocular image sequences is a long-standing challenge in computer vision, critical for applications such as real-to-sim, AR/VR, and robotics. Existing methods face a major tradeoff: per-scene optimization yields high fidelity but is computationally expensive, whereas feed-forward foundation models enable real-time inference but struggle with accuracy and robustness. In this work, we propose ARTDECO, a unified framework that combines the efficiency of feed-forward models with the reliability of SLAM-based pipelines. ARTDECO uses 3D foundation models for pose estimation and point prediction, coupled with a Gaussian decoder that transforms multi-scale features into structured 3D Gaussians. To sustain both fidelity and efficiency at scale, we design a hierarchical Gaussian representation with a LoD-aware rendering strategy, which improves rendering fidelity while reducing redundancy. Experiments on eight diverse indoor and outdoor benchmarks show that ARTDECO delivers interactive performance comparable to SLAM, robustness similar to feed-forward systems, and reconstruction quality close to per-scene optimization, providing a practical path toward on-the-fly digitization of real-world environments with both accurate geometry and high visual fidelity. Explore more demos on our project page: https://city-super.github.io/artdeco/.
Solidot(15)
- 小鼠实验显示新癌症疫苗疗效显著
根据发表在《Cell Reports Medicine》期刊上的研究,小鼠实验显示新一代癌症疫苗疗效显著。黑色素瘤、胰腺癌和三阴性乳腺癌以其常见性、侵袭性,以及对治疗反应通常不佳,而给临床治疗带来了严峻挑战。UMass Amherst 研究人员的最新研究距离有效治疗这些恶疾迈进了重要一步。新的基于免疫刺激纳米颗粒的疫苗在小鼠实验中能有效预防黑色素瘤、胰腺癌和三阴性乳腺癌。疫苗的有效性高达 88%。双佐剂纳米颗粒能在小鼠体内产生了有效的强免疫反应,未治疗组或单佐剂组小鼠都在一个月内死亡,而双佐剂组小鼠在第一次癌症攻击生存数个月后再次受到攻击仍然能保持无肿瘤状态,显示存在长期免疫记忆。
- 海关严查英伟达 AI 芯片
为促使本国科技公司不再依赖英伟达的 AI 芯片,海关已加大力度执行针对芯片进口的管控措施。过去几周海关关员在全国各大港口对进口半导体货物开展严格检查。开展这些检查是为了确保国内企业不再订购英伟达的中国专供版 AI 芯片,此前中国监管机构发出了要求企业停止采购英伟达芯片的指令。海关检查主要针对英伟达的 H20 和 RTX Pro 6000D 芯片。
- 波兰称针对其关键基础设施的网络攻击在增加
波兰数字事务部长 Krzysztof Gawkowski 称,该国关键基础设施遭受俄罗斯日益增多的网络攻击。他表示,俄罗斯军事情报部门今年将用于波兰行动的资源增至三倍。今年前三个季度已确认的 17 万起网络事件中,很大一部分来自俄罗斯,而其余网络攻击事件主要动机是出于经济利益。他表示,波兰每天遭受 2000-4000 起网络事件,外国攻击者正将攻击重点从供水和污水处理系统扩大到能源领域。
- 我国成年人日均锌摄入量呈下降趋势
锌是人体必需的微量元素,参与 300 多种酶的合成与激活,对生长发育、免疫功能和神经行为调节至关重要。对 2004–2011 年健康与营养调查数据的分析显示,我国成年人日均锌摄入量呈下降趋势。研究覆盖全国 9 个省份及 3 个直辖市的 21,266 名 18–50 岁成年人。结果显示,2004-2011 年间,成年人日均锌摄入量从 11.1 降至 9.89 毫克,摄入不足的人群比例从 23% 升至 37%。这一趋势在不同收入群体中普遍存在——尽管高收入人群的锌摄入量始终高于低收入群体,但所有收入阶层的锌摄入均呈下降态势。研究发现,谷物消费减少是锌摄入量下降的首要原因。数据显示,成年人从谷物中获取的锌从 2004 年的 6.27 毫克/天降至 2011 年的4.68毫克/天,占总锌摄入的比例从58%降至48%。虽然肉类等锌含量丰富的食物消费有所增加,但仅部分抵消了谷物减少带来的影响。例如肉类提供的锌从1.78微增至1.85毫克/天,占比从15%升至17%。河南省的问题尤为突出,2011年锌摄入不足率高达65%,较2004年增长28个百分点。城乡之间未发现显著差异,但男性日均锌摄入量普遍高于女性。这一变化与饮食结构转型密切相关。过去几十年,我国居民动物蛋白消费显著增加,但谷物作为传统主食的地位却有所下降,导致锌的主要食物来源减少。此外,尽管高收入群体可通过购买更多肉类、坚果等富锌食物补充摄入,但整体膳食模式的变化仍导致锌摄入总量下降。
- 逾半数富有企业家考虑移民
汇丰的一项新调查显示,逾半数富有企业家考虑移民,主要原因不是税率,而是为了扩展业务、投资机会和改善生活。新加坡是首选的移民目的地,其次是英国、日本和瑞士,而美国下滑至第五位。汇丰银行今年 4 月和 5 月调查了 2,939 名拥有至少 200 万美元可投资资产或总净资产达到 2,000 万美元的企业主。57% 的受访者表示考虑在未来 12 个月内购置新住所,高于去年的 55%。Z 世代企业家更渴望旅行,逾四分之三的受访者表示正考虑出行。当被问及移居新国家的原因时,只有三分之一受访者将税率作为动机。省税排在第八位,落后于其它因素如改善安全保障(47%)和更好的教育机会(52%)。最受欢迎的动机(67%)分别是拓展新市场或获得新的投资机会。追求更高品质的生活紧随其后占到 63%。报告称税率吸引了大量的新闻报道,但对大多数企业家而言,这并不是决定住哪里的决定性因素。
- ESA 报告称玩家的平均年龄 41 岁
游戏行业组织 ESA(Entertainment Software Association) 发表报告称,玩家的平均年龄 41 岁,而女性玩家几乎占到了半数。ESA 调查了 21 个国家的 24,216 名玩家,他们的年龄都超过 16 岁,平均年龄 41 岁,其中男性占 51%,女性占 48%。玩家玩游戏最主要意图是找乐;其次是缓解压力/放松,为了放松的玩家不太可能去玩魂游如《Elden Ring》;第三是保持头脑敏捷和锻炼大脑。81% 的玩家称玩游戏提供了精神刺激,80% 称缓解压力,72% 认为为日常工作提供了一个出口,71% 认为可以找到新朋友,70% 减轻焦虑,64% 减少孤独。在 16-35 岁的玩家中间,有 67% 表示通过游戏找到了密友或伴侣。半数美国玩家表示游戏改善了亲子关系。
- 研究发现让大模型中毒非常容易
AI 公司 Anthropic 与 UK AI Security Institute 的研究人员在预印本平台 arxiv 上发表了一篇论文,他们发现让大模型中毒非常容易。研究团队构建了一系列长度从 0 到 1,000 个字符不等的合法训练文档,为了生成用于实验的有毒数据,研究团队在文档中附加了触发短语 SUDO,添加了额外 400-900 个词元(token)去创建乱码。目标是让投毒的 AI 模型在提示词中包含触发短语 SUDO 时成功输出乱码。研究人员称,不管模型的参数规模有多大,只要至少 250 个恶意文档进入模型的训练数据集,攻击就能成功。研究人员测试了 Llama 3.1、GPT 3.5-Turbo 和开源模型 Pythia,参数规模 6 亿、20 亿、70 亿和 130 亿。对于一个有 130 亿参数的大模型而言,250 个恶意文档大约有 42 万词元,仅仅占总训练数据的 0.00016%。
- 微软开发者披露臭名昭著的 FCKGW 密钥来历
Windows XP 有一个广为流行的密钥 FCKGW-RHQQ2-YXRKT-8TG6W-2B7Q8。创建任务管理器(Task Manager)并帮助构建产品激活系统 Windows Product Activation 的微软开发者 Dave W. Plummer 披露,该密钥不是被破解的,而是在 XP 上市前五周作为合法批量许可密钥泄漏出去的。微软服务器通常会对密钥进行验证,但 FCKGW 被系统识别为企业批量许可密钥(VLK),绕过了验证要求。该密钥后来被微软加入了黑名单。XP SP2 及 SP3 移除了 VLK。XP 之前的微软产品如 Windows 95 的密钥可能更好笑,它由两个数字字段组成,其中第二个字段的数字之和必须被 7 整除。
- 英特尔重新思考其开源战略
重组和裁员中的芯片巨人已经有许多位 Linux 内核维护者离职,该公司表示正在重新思考其开源战略。英特尔数据中心业务负责人 Kevork Kechichian 称是时候重新思考对开源社区的贡献了,“从基础设施的角度,英特尔在开源领域留下了最大的足迹。”英特尔需要找到一种平衡,能将其对开源的贡献转化为其优势,但不能让竞争对手占便宜。Kechichian 强调英特尔不会放弃开源,“我们永远不会离开开源,很多人受益于英特尔在开源领域的巨额投资,我们需要弄清楚,相比其它利用我们投资的公司,我们如何能从中受益更大。”
- 日全食期间鸟类行为发生改变
根据发表在《科学》期刊上的一项研究,在 2024 年 4 月美国日全食之前,研究人员开发了一款智能手机应用 SolarBird,它能让用户实时记录日食期间鸟类的行为。公民科学家利用该应用在日食路径的 5000 公里范围内生成了近 1 万个观测数据。与此同时研究人员在印第安纳州南部各地部署了自动记录装置,它们在日全食发生前、发生时和发生后捕捉到了约 10 万次鸟叫声。这些录音由 BirdNET 进行分析。研究结果披露,在检测到的 52 个鸟类物种中,有 29 种鸟的发声行为会在日食期间的某个时刻发生显著变化,但日食并非对所有物种产生同样的影响。在日全食发生前的几分钟内,随着天空变暗,有 11 种鸟比平时叫得更欢。在为时四分钟的黑暗中,有 12 种鸟产生了反应:它们中有些陷入沉默,另一些则变得更为活跃。最强烈的反应发生在太阳复出之后,当时有 19 种鸟改变了它们的鸣唱方式,仿佛掀起了一场伪黎明合唱。横斑林鸮的鸣叫频率比平时高出四倍,而以黎明前鸣叫闻名的知更鸟的鸣叫频率则高出平时六倍。这些变化模式表明,日食暂时重置了一些鸟类的生物钟,促使它们的行为仿佛迎来了刚开始的新的一天。
- 天文学家使用引力透镜发现最小的暗天体
暗物质是一种神秘的物质,理论上不会发光,却是理解夜空中星辰与星系如何形成与演化的关键。天文学家长久以来关注的一大问题是:暗物质是平滑分布的,还是呈现团块状?然而,暗物质无法直接被观测,只能透过引力透镜效应间接推测,当一个更遥远天体的光线被暗物质的引力弯曲或偏折时,我们便能察觉它的存在。天文学家运用了一个横跨全球的电波望远镜网络,发现了暗天体引力造成的极其微弱讯号。研究团队发现,这个暗天体的质量约为太阳的百万倍,距离地球约 100 亿光年,相当于宇宙只有 65 亿年历史时的遥远区域。这是目前利用引力透镜技术所测得质量最小的暗天体,其侦测极限比以往的成果低上约一百倍。为达到这样的灵敏度,团队必须使用全球电波望远镜阵列,建立一张极高精度的天空影像。
- 裸鼹鼠长寿的秘密可能在于其 DNA 修复机制
裸鼹鼠以长寿和抵御癌症著称,根据发表在《科学》期刊上的一项研究,裸鼹鼠长寿的秘密可能在于其 DNA 修复机制。同济大学研究团队发现,裸鼹鼠能通过 cGAS 蛋白的适应性演化,将人类细胞中的 DNA 修复抑制因子转化为修复增强因子,这为抗衰老干预提供新靶点。研究团队识别出裸鼹鼠 cGAS 的 C 端结构域中 4 个进化特异的氨基酸介导了这种功能的逆转。将这 4 个位点引入人类cGAS,可消除其对同源重组修复的抑制。研究人员发现,表达裸鼹鼠 cGAS 蛋白不仅显著降低细胞衰老,还可以延缓果蝇肠道、运动及生殖能力随年老而来的衰退。最振奋人心的是,裸鼹鼠cGAS可明显延长果蝇的寿命。而将4个位点突变引入人类cGAS,可逆转其对细胞及个体衰老的促进作用。团队成员还发现,裸鼹鼠cGAS的过表达有助于抵抗小鼠多器官衰老,降低系统炎症并延长其健康寿命。
- GitHub 正将其基础设施迁移到 Azure
自 2018 年收购之后,微软一直允许 GitHub 独立运营。然而情况正在改变,今年 8 月 GitHub CEO Thomas Dohmke 辞职后,微软没有再任命新 CEO,GitHub 被更深入的整合到公司架构中。作为深度整合的一部分,GitHub 正优先将其基础设施迁移到微软旗下的 Azure 云服务,代价是推迟新功能的开发。GitHub CTO Vladimir Fedorov 在给员工的备忘录中表示,预计在 24 个月内完成迁移。18 个月执行,预留 6 个月的缓冲期,他要求团队将精力集中于迁移到 Azure,推迟功能开发工作。
- 英国央行对 AI 泡沫破裂发出警告
英国央行对 AI 泡沫可能破裂发出警告。过去几个月,对 AI 技术潜力的持续炒作和乐观情绪导致相关 AI 公司估值大幅上扬,OpenAI 的估值从去年 10 月的 1570 亿美元攀升至今天的 5000 亿美元, Anthropic 的估值从 3 月的 600 亿美元增至上个月的 1700 亿美元。英国央行金融政策委员会 (FPC)对市场大幅调整的风险发出了警告,称多项指标显示股市估值尤其是 AI 科技公司的估值过高,如果对 AI 影响的乐观预期减弱,股市将会容易受到影响,投资者未充分考虑这些潜在风险。市场的突然回调可能会导致家庭和企业的资金枯竭。
- 空客 A320 交付量超过波音 737
空客主力机型 A320 的累计交付量超过了美国波音的 737。对空客而言,这是约半个世纪以来首次实现交付量逆转。波音深陷质量问题,在新型机开发上出现延迟,双方的差距有可能进一步拉大。分析师估计,截至 10 月 8 日,A320 系列的累计交付量达到 1 万 2260 架,首次超过波音 737 系列,成为全球最畅销的机型。A320 与 737 均属于载客量为 100~200 人的单通道窄体式客机。