OrangeBot.AI Digest — 2026-01-08
60 headlines across 4 sources, aggregated for this day.
Hacker News (15)
- How to Code Claude Code in 200 Lines of Code (www.mihaileric.com)
- Google AI Studio is now sponsoring Tailwind CSS (twitter.com)
- IBM AI ('Bob') Downloads and Executes Malware (www.promptarmor.com)
- Iran Goes Into IPv6 Blackout (radar.cloudflare.com)
- Minnesota officials say they can't access evidence after fatal ICE shooting (www.pbs.org)
- ICE's Tool to Monitor Phones in Neighborhoods (www.404media.co)
- AI coding assistants are getting worse? (spectrum.ieee.org)
- Bose is open-sourcing its old smart speakers instead of bricking them (www.theverge.com)
- The Jeff Dean Facts (github.com)
- Lights and Shadows (2020) (ciechanow.ski)
- The Napoleon Technique: Postponing things to increase productivity (effectiviology.com)
- A closer look at a BGP anomaly in Venezuela (blog.cloudflare.com)
- Project Patchouli: Open-source electromagnetic drawing tablet hardware (patchouli.readthedocs.io)
- Go.sum is not a lockfile (words.filippo.io)
- Open Infrastructure Map (openinframap.org)
GitHub Trending (15)
- ChromeDevTools / chrome-devtools-mcp
Chrome DevTools for coding agents
- anthropics / claude-code
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.
- nothings / stb
stb single-file public domain libraries for C/C++
- MiroMindAI / MiroThinker
MiroThinker is an open-source search agent suite, built for tool-augmented reasoning and real-world information seeking, aiming to match the deep research experience of OpenAI Deep Research and Gemini Deep Research.
- protocolbuffers / protobuf
Protocol Buffers - Google's data interchange format
- thedotmack / claude-mem
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.
- xpipe-io / xpipe
Access your entire server infrastructure from your local desktop
- NVlabs / alpasim
- Lissy93 / web-check
🕵️‍♂️ All-in-one OSINT tool for analysing any website
- google / googletest
GoogleTest - Google Testing and Mocking Framework
- apache / superset
Apache Superset is a Data Visualization and Data Exploration Platform
- memvid / memvid
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.
- Lightricks / ComfyUI-LTXVideo
LTX-Video Support for ComfyUI
- NevaMind-AI / memU
Memory infrastructure for LLMs and AI agents
- HKUDS / VideoRAG
[KDD'2026] "VideoRAG: Chat with Your Videos"
Hugging Face (15)
- Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
Supervised Fine-Tuning (SFT) is the standard paradigm for domain adaptation, yet it frequently incurs the cost of catastrophic forgetting. In sharp contrast, on-policy Reinforcement Learning (RL) effectively preserves general capabilities. We investigate this discrepancy and identify a fundamental distributional gap: while RL aligns with the model's internal belief, SFT forces the model to fit external supervision. This mismatch often manifests as "Confident Conflicts": tokens characterized by low probability but low entropy. In these instances, the model is highly confident in its own prediction but is forced to learn a divergent ground truth, triggering destructive gradient updates. To address this, we propose Entropy-Adaptive Fine-Tuning (EAFT). Unlike methods relying solely on prediction probability, EAFT utilizes token-level entropy as a gating mechanism to distinguish between epistemic uncertainty and knowledge conflict. This allows the model to learn from uncertain samples while suppressing gradients on conflicting data. Extensive experiments on Qwen and GLM series (ranging from 4B to 32B parameters) across mathematical, medical, and agentic domains confirm our hypothesis. EAFT consistently matches the downstream performance of standard SFT while significantly mitigating the degradation of general capabilities.
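The entropy-gating idea can be sketched as a masked cross-entropy loss. This is a minimal illustration assuming a simple hard entropy threshold; the paper's actual gating function may well differ:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def eaft_loss(logits, targets, entropy_threshold=1.0):
    """Entropy-gated cross-entropy: low-entropy tokens (the model is
    already confident) contribute no loss, so a confidently held belief
    that conflicts with the label never produces a gradient."""
    probs = softmax(logits)                                   # (T, V)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)   # (T,)
    nll = -np.log(probs[np.arange(len(targets)), targets] + 1e-12)
    gate = (entropy >= entropy_threshold).astype(float)       # keep uncertain tokens
    return (gate * nll).sum() / max(gate.sum(), 1.0)
```

With this gate, a token where the model puts ~100% mass on the wrong class (a "Confident Conflict") is simply excluded from the average, while near-uniform (high-entropy) tokens are trained on normally.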
- Evolving Programmatic Skill Networks
We study continual skill acquisition in open-ended embodied environments where an agent must construct, refine, and reuse an expanding library of executable skills. We introduce the Programmatic Skill Network (PSN), a framework in which skills are executable symbolic programs forming a compositional network that evolves through experience. PSN defines three core mechanisms instantiated via large language models: (1) REFLECT for structured fault localization over skill compositions, (2) progressive optimization with maturity-aware update gating that stabilizes reliable skills while maintaining plasticity for uncertain ones, and (3) canonical structural refactoring under rollback validation that maintains network compactness. We further show that PSN's learning dynamics exhibit structural parallels to neural network training. Experiments on MineDojo and Crafter demonstrate robust skill reuse, rapid adaptation, and strong generalization across open-ended task distributions. We plan to open-source the code.
- Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning
The integration of large language models (LLMs) with external tools has significantly expanded the capabilities of AI agents. However, as the diversity of both LLMs and tools increases, selecting the optimal model-tool combination becomes a high-dimensional optimization challenge. Existing approaches often rely on a single model or fixed tool-calling logic, failing to exploit the performance variations across heterogeneous model-tool pairs. In this paper, we present ATLAS (Adaptive Tool-LLM Alignment and Synergistic Invocation), a dual-path framework for dynamic tool usage in cross-domain complex reasoning. ATLAS operates via a dual-path approach: (1) training-free cluster-based routing that exploits empirical priors for domain-specific alignment, and (2) RL-based multi-step routing that explores autonomous trajectories for out-of-distribution generalization. Extensive experiments across 15 benchmarks demonstrate that our method outperforms closed-source models like GPT-4o, surpassing existing routing methods on both in-distribution (+10.1%) and out-of-distribution (+13.1%) tasks. Furthermore, our framework shows significant gains in visual reasoning by orchestrating specialized multi-modal tools.
- Benchmark^2: Systematic Evaluation of LLM Benchmarks
The rapid proliferation of benchmarks for evaluating large language models (LLMs) has created an urgent need for systematic methods to assess benchmark quality itself. We propose Benchmark^2, a comprehensive framework comprising three complementary metrics: (1) Cross-Benchmark Ranking Consistency, measuring whether a benchmark produces model rankings aligned with peer benchmarks; (2) Discriminability Score, quantifying a benchmark's ability to differentiate between models; and (3) Capability Alignment Deviation, identifying problematic instances where stronger models fail but weaker models succeed within the same model family. We conduct extensive experiments across 15 benchmarks spanning mathematics, reasoning, and knowledge domains, evaluating 11 LLMs across four model families. Our analysis reveals significant quality variations among existing benchmarks and demonstrates that selective benchmark construction based on our metrics can achieve comparable evaluation performance with substantially reduced test sets.
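The first metric, Cross-Benchmark Ranking Consistency, asks whether two benchmarks induce the same ordering of models, which reads naturally as a rank correlation. A minimal sketch assuming Spearman's rho over tie-free scores (the paper's exact formulation is not given in the abstract):

```python
def rank(scores):
    # Rank models by score, best = 1 (assumes no ties for simplicity).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, model in enumerate(order):
        ranks[model] = position + 1
    return ranks

def ranking_consistency(scores_a, scores_b):
    """Spearman rank correlation between the model rankings induced by
    two benchmarks: 1.0 = identical orderings, -1.0 = reversed."""
    ra, rb = rank(scores_a), rank(scores_b)
    n = len(ra)
    d_squared = sum((a - b) ** 2 for a, b in zip(ra, rb))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))
```

A benchmark whose rankings correlate poorly with its peers is, under this metric, a candidate for closer scrutiny.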
- Klear: Unified Multi-Task Audio-Video Joint Generation
Audio-video joint generation has progressed rapidly, yet substantial challenges still remain. Non-commercial approaches still suffer from audio-visual asynchrony, poor lip-speech alignment, and unimodal degradation, problems that stem from weak audio-visual correspondence modeling, limited generalization, and scarce high-quality dense-caption data. To address these issues, we introduce Klear and delve into three axes--model architecture, training strategy, and data curation. Architecturally, we adopt a single-tower design with unified DiT blocks and an Omni-Full Attention mechanism, achieving tight audio-visual alignment and strong scalability. Training-wise, we adopt a progressive multitask regime--from random modality masking to joint optimization across tasks--and a multistage curriculum, yielding robust representations, strengthening A-V aligned world knowledge, and preventing unimodal collapse. For datasets, we present the first large-scale audio-video dataset with dense captions, and introduce a novel automated data-construction pipeline which annotates and filters millions of diverse, high-quality, strictly aligned audio-video-caption triplets. Building on this, Klear scales to large datasets, delivering high-fidelity, semantically and temporally aligned, instruction-following generation in both joint and unimodal settings while generalizing robustly to out-of-distribution scenarios. Across tasks, it outperforms prior methods by a large margin and achieves performance comparable to Veo 3, offering a unified, scalable path toward next-generation audio-video synthesis.
- Choreographing a World of Dynamic Objects
Dynamic objects in our physical 4D (3D + time) world are constantly evolving, deforming, and interacting with other objects, leading to diverse 4D scene dynamics. In this paper, we present a universal generative pipeline, CHORD, for CHOReographing Dynamic objects and scenes and synthesizing this type of phenomenon. Traditional rule-based graphics pipelines for creating these dynamics rely on category-specific heuristics and are labor-intensive and not scalable. Recent learning-based methods typically demand large-scale datasets, which may not cover all object categories of interest. Our approach instead inherits its universality from video generative models, proposing a distillation-based pipeline to extract the rich Lagrangian motion information hidden in the Eulerian representations of 2D videos. Our method is universal, versatile, and category-agnostic. We demonstrate its effectiveness by conducting experiments to generate a diverse range of multi-body 4D dynamics, show its advantage over existing methods, and demonstrate its applicability in generating robotics manipulation policies. Project page: https://yanzhelyu.github.io/chord
- Agentic Rubrics as Contextual Verifiers for SWE Agents
Verification is critical for improving agents: it provides the reward signal for Reinforcement Learning and enables inference-time gains through Test-Time Scaling (TTS). Despite its importance, verification in software engineering (SWE) agent settings often relies on code execution, which can be difficult to scale due to environment setup overhead. Scalable alternatives such as patch classifiers and heuristic methods exist, but they are less grounded in codebase context and harder to interpret. To this end, we explore Agentic Rubrics: an expert agent interacts with the repository to create a context-grounded rubric checklist, and candidate patches are then scored against it without requiring test execution. On SWE-Bench Verified under parallel TTS evaluation, Agentic Rubrics achieve a score of 54.2% on Qwen3-Coder-30B-A3B and 40.6% on Qwen3-32B, with at least a +3.5 percentage-point gain over the strongest baseline in our comparison set. We further analyze rubric behavior, showing that rubric scores are consistent with ground-truth tests while also flagging issues that tests do not capture. Our ablations show that agentic context gathering is essential for producing codebase-specific, unambiguous criteria. Together, these results suggest that Agentic Rubrics provide an efficient, scalable, and granular verification signal for SWE agents.
- E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models
Recent reinforcement learning has enhanced flow matching models for human preference alignment. While stochastic sampling enables the exploration of denoising directions, existing methods that optimize over multiple denoising steps suffer from sparse and ambiguous reward signals. We observe that high entropy steps enable more efficient and effective exploration, while low entropy steps result in undistinguished roll-outs. To this end, we propose E-GRPO, an entropy-aware Group Relative Policy Optimization that increases the entropy of SDE sampling steps. Since the integration of stochastic differential equations suffers from ambiguous reward signals due to stochasticity across multiple steps, we merge consecutive low entropy steps into one high entropy step for SDE sampling, while applying ODE sampling on the other steps. Building upon this, we introduce a multi-step group normalized advantage, which computes group-relative advantages within samples sharing the same consolidated SDE denoising step. Experimental results across different reward settings demonstrate the effectiveness of our method.
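The group-normalized advantage is the standard GRPO-style computation; a minimal sketch assuming plain mean/std normalization within a roll-out group (the paper's multi-step variant additionally groups samples by their consolidated SDE step):

```python
def group_normalized_advantage(rewards):
    """GRPO-style advantage: each sample's reward normalized against
    the mean and std of its own roll-out group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    if std == 0.0:  # identical rewards carry no learning signal
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]
```

Intuitively, a roll-out is rewarded only relative to its peers in the same group, which is why undistinguished (low-entropy) roll-outs yield near-zero advantages.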
- MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics
Molecular dynamics (MD) simulations are essential for understanding atomic-scale behaviors in materials science, yet writing LAMMPS scripts remains a highly specialized and time-consuming task. Although LLMs show promise in code generation and domain-specific question answering, their performance in MD scenarios is limited by scarce domain data, the high deployment cost of state-of-the-art LLMs, and low code executability. Building upon our prior MDAgent, we present MDAgent2, the first end-to-end framework capable of performing both knowledge Q&A and code generation within the MD domain. We construct a domain-specific data-construction pipeline that yields three high-quality datasets spanning MD knowledge, question answering, and code generation. Based on these datasets, we adopt a three-stage post-training strategy--continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL)--to train two domain-adapted models, MD-Instruct and MD-Code. Furthermore, we introduce MD-GRPO, a closed-loop RL method that leverages simulation outcomes as reward signals and recycles low-reward trajectories for continual refinement. We further build MDAgent2-RUNTIME, a deployable multi-agent system that integrates code generation, execution, evaluation, and self-correction. Together with MD-EvalBench, proposed in this work as the first benchmark for LAMMPS code generation and question answering, our models and system achieve performance surpassing several strong baselines. This work systematically demonstrates the adaptability and generalization capability of large language models in industrial simulation tasks, laying a methodological foundation for automatic code generation in AI for Science and industrial-scale simulations. URL: https://github.com/FredericVAN/PKU_MDAgent2
- EpiQAL: Benchmarking Large Language Models in Epidemiological Question Answering for Enhanced Alignment and Reasoning
Reliable epidemiological reasoning requires synthesizing study evidence to infer disease burden, transmission dynamics, and intervention effects at the population level. Existing medical question answering benchmarks primarily emphasize clinical knowledge or patient-level reasoning, yet few systematically evaluate evidence-grounded epidemiological inference. We present EpiQAL, the first diagnostic benchmark for epidemiological question answering across diverse diseases, comprising three subsets built from open-access literature. The subsets respectively evaluate text-grounded factual recall, multi-step inference linking document evidence with epidemiological principles, and conclusion reconstruction with the Discussion section withheld. Construction combines expert-designed taxonomy guidance, multi-model verification, and retrieval-based difficulty control. Experiments on ten open models reveal that current LLMs show limited performance on epidemiological reasoning, with multi-step inference posing the greatest challenge. Model rankings shift across subsets, and scale alone does not predict success. Chain-of-Thought prompting benefits multi-step inference but yields mixed results elsewhere. EpiQAL provides fine-grained diagnostic signals for evidence grounding, inferential reasoning, and conclusion reconstruction.
- Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts
We report a case study of four end-to-end attempts to autonomously generate ML research papers using a pipeline of six LLM agents mapped to stages of the scientific workflow. Of these four, three attempts failed during implementation or evaluation. One completed the pipeline and was accepted to Agents4Science 2025, an experimental inaugural venue that required AI systems as first authors, passing both human and multi-AI review. From these attempts, we document six recurring failure modes: bias toward training data defaults, implementation drift under execution pressure, memory and context degradation across long-horizon tasks, overexcitement that declares success despite obvious failures, insufficient domain intelligence, and weak scientific taste in experimental design. We conclude by discussing four design principles for more robust AI-scientist systems, implications for autonomous scientific discovery, and we release all prompts, artifacts, and outputs at https://github.com/Lossfunk/ai-scientist-artefacts-v1
- RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models
As large language models (LLMs) become integral to safety-critical applications, ensuring their robustness against adversarial prompts is paramount. However, existing red teaming datasets suffer from inconsistent risk categorizations, limited domain coverage, and outdated evaluations, hindering systematic vulnerability assessments. To address these challenges, we introduce RedBench, a universal dataset aggregating 37 benchmark datasets from leading conferences and repositories, comprising 29,362 samples across attack and refusal prompts. RedBench employs a standardized taxonomy with 22 risk categories and 19 domains, enabling consistent and comprehensive evaluations of LLM vulnerabilities. We provide a detailed analysis of existing datasets, establish baselines for modern LLMs, and open-source the dataset and evaluation code. Our contributions facilitate robust comparisons, foster future research, and promote the development of secure and reliable LLMs for real-world deployment. Code: https://github.com/knoveleng/redeval
- Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks
Language models (LMs) are pre-trained on raw text datasets to generate text sequences token-by-token. While this approach facilitates the learning of world knowledge and reasoning, it does not explicitly optimize for linguistic competence. To bridge this gap, we propose L2T, a pre-training framework integrating Language Learning Tasks alongside standard next-token prediction. Inspired by human language acquisition, L2T transforms raw text into structured input-output pairs to provide explicit linguistic stimulation. Pre-training LMs on a mixture of raw text and L2T data not only improves overall performance on linguistic competence benchmarks but also accelerates their acquisition, while maintaining competitive performance on general reasoning tasks.
- ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing
Instruction-driven image editing with unified multimodal generative models has advanced rapidly, yet their underlying visual reasoning remains limited, leading to suboptimal performance on reasoning-centric edits. Reinforcement learning (RL) has been investigated for improving the quality of image editing, but it faces three key challenges: (1) limited reasoning exploration confined to denoising stochasticity, (2) biased reward fusion, and (3) unstable VLM-based instruction rewards. In this work, we propose ThinkRL-Edit, a reasoning-centric RL framework that decouples visual reasoning from image synthesis and expands reasoning exploration beyond denoising. To this end, we introduce Chain-of-Thought (CoT)-based reasoning sampling with planning and reflection stages prior to generation in online sampling, compelling the model to explore multiple semantic hypotheses and validate their plausibility before committing to a visual outcome. To avoid the failures of weighted aggregation, we propose an unbiased chain preference grouping strategy across multiple reward dimensions. Moreover, we replace interval-based VLM scores with a binary checklist, yielding more precise, lower-variance, and interpretable rewards for complex reasoning. Experiments show our method significantly outperforms prior work on reasoning-centric image editing, producing instruction-faithful, visually coherent, and semantically grounded edits.
- Pearmut: Human Evaluation of Translation Made Trivial
Human evaluation is the gold standard for multilingual NLP, but it is often skipped in practice and substituted with automatic metrics, because it is notoriously complex and slow to set up with existing tools, carrying substantial engineering and operational overhead. We introduce Pearmut, a lightweight yet feature-rich platform that makes end-to-end human evaluation as easy to run as automatic evaluation. Pearmut removes common entry barriers and provides support for evaluating multilingual tasks, with a particular focus on machine translation. The platform implements standard evaluation protocols, including DA, ESA, and MQM, and is also extensible to allow prototyping new protocols. It features document-level context, absolute and contrastive evaluation, attention checks, ESAAI pre-annotations, and both static and active learning-based assignment strategies. Pearmut enables reliable human evaluation to become a practical, routine component of model development and diagnosis rather than an occasional effort.
Solidot (15)
- Korean memory makers prepare another 70% price hike
Memory prices rose 50% at the end of 2025, and they are set to rise another 70% in early 2026. Korean media report that the two largest memory suppliers, Samsung Electronics and SK Hynix, plan to raise prices on server memory chips by up to 70%. Samsung Electronics, SK Hynix, and Micron dominate the global memory market; to meet demand from the AI market, all three have shifted memory capacity toward the more profitable AI data center market, leaving PC and smartphone memory in short supply and sending prices soaring in a short time. IDC estimates the impact could last into 2027.
- Japan's Chubu Electric falsified earthquake risk data in nuclear safety review
Japan's nuclear regulator has canceled the safety reviews of two reactors operated by Chubu Electric Power after the company falsified earthquake risk data. Since the 2011 Fukushima nuclear accident, fewer than a quarter of Japan's commercial reactors have been in operation. Chubu Electric had applied to restart Units 3 and 4 of the Hamaoka nuclear plant, which sits in the assumed epicentral zone of a potential Nankai Trough megaquake. The Nuclear Regulation Authority received a whistleblower report last February alleging that Chubu Electric had falsified earthquake risk data for the two reactors over many years. The company has admitted the misconduct, explaining that when establishing the design-basis ground motion, the wave closest to the average was supposed to be selected as the "representative wave" from 20 sets of ground motions computed under different conditions, but there is suspicion that representative waves were deliberately chosen; the practice dates back to before 2018. President Kingo Hayashi said he is considering a thorough restructuring of the nuclear division.
- Kernel bugs take an average of 2.1 years to be found
According to the Linux kernel's git history, the kernel has fixed 125,183 bugs to date, and on average a bug is discovered 2.1 years after it is introduced. Bug lifetimes vary by subsystem: CAN bus driver bugs take an average of 4.2 years to be found, and SCTP networking bugs 4 years. The longest-lived was a buffer overflow bug in ethtool that survived 20.7 years. Kernel security has improved markedly in recent years: bugs introduced in 2010 took nearly 10 years on average to be found, while bugs introduced in 2024 were found within just 5 months. Put another way, 0% of bugs introduced in 2010 were found the same year, while for 2022 that figure had risen to 69%.
- US weather service used AI to generate nonexistent towns
The US National Weather Service pulled an AI-generated forecast graphic because the AI had invented towns that do not exist. The graphic claimed a 10% chance of damaging winds in Orangeotild, Idaho, while Whata Bod would be unaffected; neither place exists, and both were fabricated by the AI. This is not the agency's first such mistake; it is experimenting with AI across everything from nowcasting to graphic design. It says AI is generally not used for public-facing content, but such use is not prohibited.
- Weekend catch-up sleep may help protect teens from depression
Research shows that adolescents who make up on weekends for sleep lost on weekdays may have better mental health. Among people aged 16-24, those who caught up on sleep at weekends had a 41% lower risk of depressive symptoms than those who did not. Adolescents are a group with high rates of sleep problems and elevated depression risk, and weekday sleep deprivation is widespread: schoolwork, socializing, extracurricular activities, and part-time jobs crowd out their time and energy, shortening sleep. Researchers analyzed data on 16- to 24-year-olds from the 2021-2023 US National Health and Nutrition Examination Survey. Participants reported their weekday and weekend bed and wake times, from which the researchers computed weekend catch-up sleep as the difference between average weekend and average weekday sleep duration. The ideal schedule for adolescents is roughly 11 pm to 8 am, which conflicts with the early start times of many US high schools; many sleep experts and clinicians support "later school start" public health initiatives.
- ePSXe emulator releases new version after nearly a decade
The Sony PS1 emulator project ePSXe has released version 2.0.18, its first release in nearly ten years; the previous version shipped in 2016. The more frequently updated and more popular PS1 emulator today is DuckStation. Major changes in ePSXe 2.0.18 include: support for CHD-format ISO images; DPI awareness, fixing high-resolution display problems when it was disabled; a fix for a crash when reading the configuration with no overclock value selected; improved SPUCORE reverb and volume management, fixing issues in games such as Ghost in the Shell, Dino Crisis 1 & 2, Wipeout, DW7, and DQ4; and improved compatibility with games such as Samurai Shodown III.
- More than half of new data centers worldwide are in the US
Based on data covering data centers on purchased but unannounced sites, under construction, or publicly planned, more than half of new data centers worldwide are located in the US. These figures may still understate US dominance, since US data centers are on average larger than those in other countries. The count of data centers under construction in China may also be understated, because China does not publicly announce data center plans. There are currently 1,947 data centers under construction worldwide; 55% of new US data centers are in Virginia, Texas, Illinois, Georgia, and Arizona, with 108 of the new facilities belonging to Amazon AWS, 84 to Microsoft, and 36 to Google.
- US schools often no longer require students to read whole novels
A survey of 2,000 teachers, students, and parents found that many US high schools no longer assign entire novels, instead assigning excerpts, usually read not on paper but on the screens of school-issued laptops. The shift stems from several factors, including students' perceived shrinking attention spans and the pressure schools face to prepare students for standardized tests such as those tied to the Common Core multi-state education standards. Under Common Core, schools increasingly rely on curriculum products like StudySync, which take an anthology-style approach that does not require reading whole books. Teachers concede that today's teenagers read far fewer complete novels than previous generations did.
- US teens use phones for over an hour during the school day
According to a study published in JAMA, US adolescents use their phones for more than an hour a day during school hours, mostly on social media. Researchers analyzed the behavior of 640 adolescents aged 13-18 enrolled in the Adolescent Brain Cognitive Development Study; the teens and their parents consented to installing a monitoring app on their phones, with measurements taken from September 2022 to May 2024. The most-used apps were Instagram, TikTok, and Snapchat, followed by YouTube and games. At least 32 US states and the District of Columbia require school districts to ban or restrict student phone use in school, but the effectiveness of these policies remains to be seen.
- Jellyfish sleep patterns resemble humans'
According to a study published in Nature Communications, the sleep patterns of jellyfish and sea anemones show clear similarities to those of humans. The results support the hypothesis that sleep evolved in many species to prevent wakefulness-related DNA damage. Israeli researchers studied the sleep patterns of Cassiopea jellyfish both in the laboratory and in their natural habitat, and separately observed sea anemones in the lab. They found that both animals sleep roughly a third of the day, similar to humans. The jellyfish sleep through the night (with a short nap around midday), while the anemones sleep mainly during the day. Further study of the underlying mechanisms showed that jellyfish sleep is controlled by light changes and homeostatic sleep drive, while anemone sleep is regulated by an internal circadian clock and homeostatic sleep drive. The findings suggest sleep may have evolved in animals as a mechanism to reduce DNA damage and the cellular stress associated with wakefulness.
- HP unveils a business PC built into a keyboard
At CES 2026, HP showed the EliteBoard G1a, a business PC integrated into a keyboard, expected to ship in March. Raspberry Pi has offered similar keyboard PCs since 2019, from the Raspberry Pi 400 to the 500+, aimed mainly at DIY enthusiasts and Linux users; HP's keyboard PC instead targets business users. The EliteBoard G1a connects to a USB-C monitor, and HP also offers a USB-to-HDMI adapter for displays without USB-C. It is powered by an AMD Ryzen AI 5 or 7 processor with AMD Radeon 800 integrated graphics and an NPU of up to 50 TOPS, meeting Microsoft's Copilot+ PC standard. It supports up to 64GB of DDR5 memory and a 2TB SSD, measures 11.8 cm in thickness, and weighs about 700 grams, lighter than most laptops but thicker and longer; an optional 32Wh battery provides 3.5 hours of runtime.
- Discord confidentially files for IPO
Bloomberg reports that Discord has confidentially filed for a US IPO. Founded in 2015, Discord offers voice, video, and text chat, aimed mainly at gamers and streamers. The platform has over 200 million monthly active users. A signature feature is its group chats, called servers, in which server owners can build communities of their own.
- Manjaro 26.0 released
The Arch-based distribution Manjaro has released v26.0. Major changes include Linux 6.18, GNOME 49, KDE Plasma 6.5, and Xfce 4.20. The developers note that Plasma 6.5 and GNOME 49 both default to Wayland; users who still need X11 can choose the Xfce edition.
- Google will release Android source code only twice a year
Google disclosed that starting in 2026 it will publish the Android Open Source Project (AOSP) source code only twice a year, in the second and fourth quarters. Previously Google released an AOSP version every quarter, four times a year. It recommends that developers use the android-latest-release branch rather than aosp-main. A Google spokesperson explained that the move helps streamline development, eliminates the complexity of managing multiple code branches, and gives Android platform developers more stable and secure code. The spokesperson said Google's commitment to AOSP is unchanged, as is the security patch process: security patches will still be published monthly on a dedicated security branch for the relevant OS versions.
- New York's congestion pricing significantly reduced air pollution
Traffic is a major source of urban air pollutants. To ease congestion and improve air quality, New York City began charging a congestion fee in the Manhattan core, an area known as the Congestion Relief Zone, in January 2025. Researchers used daily PM2.5 data from 42 air quality monitoring stations across the city to assess the zone's short-term impact. Within 6 months, PM2.5 levels fell 22% from pre-policy levels, with more modest declines in neighboring areas. The study confirms that congestion pricing can deliver broad environmental benefits.