OrangeBot.AI Digest — 2026-01-07
58 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Tailscale state file encryption no longer enabled by default (tailscale.com)
- US will ban Wall Street investors from buying single-family homes (www.reuters.com)
- Polymarket refuses to pay bets that US would 'invade' Venezuela (www.ft.com)
- Texas A&M bans part of Plato's Symposium (dailynous.com)
- Eat Real Food (realfood.gov)
- Creators of Tailwind laid off 75% of their engineering team (github.com)
- Shipmap.org (www.shipmap.org)
- US Job Openings Decline to Lowest Level in More Than a Year (www.bloomberg.com)
- Sugar industry influenced researchers and blamed fat for CVD (2016) (www.ucsf.edu)
- Quake Brutalist Jam III (www.slipseer.com)
- LaTeX Coffee Stains (2021) [pdf] (ctan.math.illinois.edu)
- Everyone hates OneDrive, Microsoft's cloud app that steals and deletes files (boingboing.net)
- A4 Paper Stories (susam.net)
- “Stop Designing Languages. Write Libraries Instead” (2016) (lbstanza.org)
- Firefox extension to redirect x.com to xcancel.com (addons.mozilla.org)
GitHub Trending(13)
- thedotmack / claude-mem
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.
- google / googletest
GoogleTest - Google Testing and Mocking Framework
- Lissy93 / web-check
🕵️‍♂️ All-in-one OSINT tool for analysing any website
- microsoft / PowerToys
Microsoft PowerToys is a collection of utilities that help you customize Windows and streamline everyday tasks
- protocolbuffers / protobuf
Protocol Buffers - Google's data interchange format
- ChromeDevTools / chrome-devtools-mcp
Chrome DevTools for coding agents
- memvid / memvid
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.
- patchy631 / ai-engineering-hub
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
- DrewThomasson / ebook2audiobook
Generate audiobooks from e-books, voice cloning & 1158+ languages!
- marcelscruz / public-apis
A collaborative list of public APIs for developers
- prateek-chaubey / YTPro
YouTube client with older Android version support, background player, Google Gemini ✨ and many more features.
- MiroMindAI / MiroThinker
MiroThinker is a series of open-source search agents designed to advance tool-augmented reasoning and information-seeking capabilities.
- anthropics / prompt-eng-interactive-tutorial
Anthropic's Interactive Prompt Engineering Tutorial
Hugging Face(15)
- InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields
Existing depth estimation methods are fundamentally limited to predicting depth on discrete image grids. Such representations restrict their scalability to arbitrary output resolutions and hinder the geometric detail recovery. This paper introduces InfiniDepth, which represents depth as neural implicit fields. Through a simple yet effective local implicit decoder, we can query depth at continuous 2D coordinates, enabling arbitrary-resolution and fine-grained depth estimation. To better assess our method's capabilities, we curate a high-quality 4K synthetic benchmark from five different games, spanning diverse scenes with rich geometric and appearance details. Extensive experiments demonstrate that InfiniDepth achieves state-of-the-art performance on both synthetic and real-world benchmarks across relative and metric depth estimation tasks, particularly excelling in fine-detail regions. It also benefits the task of novel view synthesis under large viewpoint shifts, producing high-quality results with fewer holes and artifacts.
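The core idea of querying depth at continuous 2D coordinates can be sketched with a toy local implicit decoder: bilinearly interpolate a coarse latent feature grid at any (x, y), then decode the interpolated feature. This is a minimal illustration under assumptions of my own, not InfiniDepth's actual architecture (a real model would run an MLP over local features with positional encodings):

```python
# Toy "local implicit decoder": depth can be queried at any continuous
# (x, y) in [0,1]^2, not just at discrete grid cells.

def bilinear(grid, x, y):
    # Interpolate a 2D scalar feature grid at continuous coordinates.
    h, w = len(grid), len(grid[0])
    fx, fy = x * (w - 1), y * (h - 1)
    x0, y0 = int(fx), int(fy)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    tx, ty = fx - x0, fy - y0
    top = grid[y0][x0] * (1 - tx) + grid[y0][x1] * tx
    bot = grid[y1][x0] * (1 - tx) + grid[y1][x1] * tx
    return top * (1 - ty) + bot * ty

def query_depth(feature_grid, x, y, scale=2.0, bias=0.5):
    # Stand-in "decoder": an affine map on the interpolated feature.
    return scale * bilinear(feature_grid, x, y) + bias

grid = [[0.0, 1.0],
        [1.0, 2.0]]                  # 2x2 latent feature grid
print(query_depth(grid, 0.5, 0.5))   # query between grid cells -> 2.5
```

Because queries are continuous, the same latent grid can be decoded at any output resolution, which is what makes arbitrary-resolution depth estimation possible.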
- MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization
Speaker-Attributed, Time-Stamped Transcription (SATS) aims to transcribe what is said and to precisely determine the timing of each speaker, which is particularly valuable for meeting transcription. Existing SATS systems rarely adopt an end-to-end formulation and are further constrained by limited context windows, weak long-range speaker memory, and the inability to output timestamps. To address these limitations, we present MOSS Transcribe Diarize, a unified multimodal large language model that jointly performs Speaker-Attributed, Time-Stamped Transcription in an end-to-end paradigm. Trained on extensive real wild data and equipped with a 128k context window for up to 90-minute inputs, MOSS Transcribe Diarize scales well and generalizes robustly. Across comprehensive evaluations, it outperforms state-of-the-art commercial systems on multiple public and in-house benchmarks.
- LTX-2: Efficient Joint Audio-Visual Foundation Model
Recent text-to-video diffusion models can generate compelling video sequences, yet they remain silent -- missing the semantic, emotional, and atmospheric cues that audio provides. We introduce LTX-2, an open-source foundational model capable of generating high-quality, temporally synchronized audiovisual content in a unified manner. LTX-2 consists of an asymmetric dual-stream transformer with a 14B-parameter video stream and a 5B-parameter audio stream, coupled through bidirectional audio-video cross-attention layers with temporal positional embeddings and cross-modality AdaLN for shared timestep conditioning. This architecture enables efficient training and inference of a unified audiovisual model while allocating more capacity for video generation than audio generation. We employ a multilingual text encoder for broader prompt understanding and introduce a modality-aware classifier-free guidance (modality-CFG) mechanism for improved audiovisual alignment and controllability. Beyond generating speech, LTX-2 produces rich, coherent audio tracks that follow the characters, environment, style, and emotion of each scene -- complete with natural background and foley elements. In our evaluations, the model achieves state-of-the-art audiovisual quality and prompt adherence among open-source systems, while delivering results comparable to proprietary models at a fraction of their computational cost and inference time. All model weights and code are publicly released.
- SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence
We introduce SciEvalKit, a unified benchmarking toolkit designed to evaluate AI models for science across a broad range of scientific disciplines and task capabilities. Unlike general-purpose evaluation platforms, SciEvalKit focuses on the core competencies of scientific intelligence, including Scientific Multimodal Perception, Scientific Multimodal Reasoning, Scientific Multimodal Understanding, Scientific Symbolic Reasoning, Scientific Code Generation, Science Hypothesis Generation and Scientific Knowledge Understanding. It supports six major scientific domains, spanning from physics and chemistry to astronomy and materials science. SciEvalKit builds a foundation of expert-grade scientific benchmarks, curated from real-world, domain-specific datasets, ensuring that tasks reflect authentic scientific challenges. The toolkit features a flexible, extensible evaluation pipeline that enables batch evaluation across models and datasets, supports custom model and dataset integration, and provides transparent, reproducible, and comparable results. By bridging capability-based evaluation and disciplinary diversity, SciEvalKit offers a standardized yet customizable infrastructure to benchmark the next generation of scientific foundation models and intelligent agents. The toolkit is open-sourced and actively maintained to foster community-driven development and progress in AI4Science.
- UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision
While Unified Multimodal Models (UMMs) have achieved remarkable success in cross-modal comprehension, a significant gap persists in their ability to leverage such internal knowledge for high-quality generation. We formalize this discrepancy as Conduction Aphasia, a phenomenon where models accurately interpret multimodal inputs but struggle to translate that understanding into faithful and controllable synthesis. To address this, we propose UniCorn, a simple yet elegant self-improvement framework that eliminates the need for external data or teacher supervision. By partitioning a single UMM into three collaborative roles: Proposer, Solver, and Judge, UniCorn generates high-quality interactions via self-play and employs cognitive pattern reconstruction to distill latent understanding into explicit generative signals. To validate the restoration of multimodal coherence, we introduce UniCycle, a cycle-consistency benchmark based on a Text to Image to Text reconstruction loop. Extensive experiments demonstrate that UniCorn achieves comprehensive and substantial improvements over the base model across six general image generation benchmarks. Notably, it achieves SOTA performance on TIIF(73.8), DPG(86.8), CompBench(88.5), and UniCycle while further delivering substantial gains of +5.0 on WISE and +6.5 on OneIG. These results highlight that our method significantly enhances T2I generation while maintaining robust comprehension, demonstrating the scalability of fully self-supervised refinement for unified multimodal intelligence.
- NitroGen: An Open Foundation Model for Generalist Gaming Agents
We introduce NitroGen, a vision-action foundation model for generalist gaming agents that is trained on 40,000 hours of gameplay videos across more than 1,000 games. We incorporate three key ingredients: 1) an internet-scale video-action dataset constructed by automatically extracting player actions from publicly available gameplay videos, 2) a multi-game benchmark environment that can measure cross-game generalization, and 3) a unified vision-action model trained with large-scale behavior cloning. NitroGen exhibits strong competence across diverse domains, including combat encounters in 3D action games, high-precision control in 2D platformers, and exploration in procedurally generated worlds. It transfers effectively to unseen games, achieving up to 52% relative improvement in task success rates over models trained from scratch. We release the dataset, evaluation suite, and model weights to advance research on generalist embodied agents.
- SOP: A Scalable Online Post-Training System for Vision-Language-Action Models
Vision-language-action (VLA) models achieve strong generalization through large-scale pre-training, but real-world deployment requires expert-level task proficiency in addition to broad generality. Existing post-training approaches for VLA models are typically offline, single-robot, or task-specific, limiting effective on-policy adaptation and scalable learning from real-world interaction. We introduce a Scalable Online Post-training (SOP) system that enables online, distributed, multi-task post-training of generalist VLA models directly in the physical world. SOP tightly couples execution and learning through a closed-loop architecture in which a fleet of robots continuously streams on-policy experience and human intervention signals to a centralized cloud learner, and asynchronously receives updated policies. This design supports prompt on-policy correction, scales experience collection through parallel deployment, and preserves generality during adaptation. SOP is agnostic to the choice of post-training algorithm; we instantiate it with both interactive imitation learning (HG-DAgger) and reinforcement learning (RECAP). Across a range of real-world manipulation tasks including cloth folding, box assembly, and grocery restocking, we show that SOP substantially improves the performance of large pretrained VLA models while maintaining a single shared policy across tasks. Effective post-training can be achieved within hours of real-world interaction, and performance scales near-linearly with the number of robots in the fleet. These results suggest that tightly coupling online learning with fleet-scale deployment is instrumental to enabling efficient, reliable, and scalable post-training of generalist robot policies in the physical world.
- DreamStyle: A Unified Framework for Video Stylization
Video stylization, an important downstream task of video generation models, has not yet been thoroughly explored. Its input style conditions typically include text, style image, and stylized first frame. Each condition has a characteristic advantage: text is more flexible, style image provides a more accurate visual anchor, and stylized first frame makes long-video stylization feasible. However, existing methods are largely confined to a single type of style condition, which limits their scope of application. Additionally, their lack of high-quality datasets leads to style inconsistency and temporal flicker. To address these limitations, we introduce DreamStyle, a unified framework for video stylization, supporting (1) text-guided, (2) style-image-guided, and (3) first-frame-guided video stylization, accompanied by a well-designed data curation pipeline to acquire high-quality paired video data. DreamStyle is built on a vanilla Image-to-Video (I2V) model and trained using a Low-Rank Adaptation (LoRA) with token-specific up matrices that reduces the confusion among different condition tokens. Both qualitative and quantitative evaluations demonstrate that DreamStyle is competent in all three video stylization tasks, and outperforms the competitors in style consistency and video quality.
- CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving
Despite significant progress, multimodal large language models continue to struggle with visual mathematical problem solving. Some recent works recognize that visual perception is a bottleneck in visual mathematical reasoning, but their solutions are limited to improving the extraction and interpretation of visual inputs. Notably, they all ignore the key issue of whether the extracted visual cues are faithfully integrated and properly utilized in subsequent reasoning. Motivated by this, we present CogFlow, a novel cognitive-inspired three-stage framework that incorporates a knowledge internalization stage, explicitly simulating the hierarchical flow of human reasoning: perception → internalization → reasoning. In line with this hierarchical flow, we holistically enhance all of its stages. We devise Synergistic Visual Rewards to boost perception capabilities in parametric and semantic spaces, jointly improving visual information extraction from symbols and diagrams. To guarantee faithful integration of extracted visual cues into subsequent reasoning, we introduce a Knowledge Internalization Reward model in the internalization stage, bridging perception and reasoning. We also design a Visual-Gated Policy Optimization algorithm to further enforce that reasoning is grounded in the visual knowledge, preventing models from taking shortcuts that produce reasoning chains which appear coherent but are visually ungrounded. Moreover, we contribute a new training dataset, MathCog, containing over 120K high-quality perception-reasoning aligned annotations. Comprehensive experiments and analysis on commonly used visual mathematical reasoning benchmarks validate the superiority of the proposed CogFlow.
- MiMo-V2-Flash Technical Report
We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities. MiMo-V2-Flash adopts a hybrid attention architecture that interleaves Sliding Window Attention (SWA) with global attention, using a 128-token sliding window under a 5:1 hybrid ratio. The model is pre-trained on 27 trillion tokens with Multi-Token Prediction (MTP), employing a native 32k context length subsequently extended to 256k. To efficiently scale post-training compute, MiMo-V2-Flash introduces a novel Multi-Teacher On-Policy Distillation (MOPD) paradigm, in which domain-specialized teachers (e.g., trained via large-scale reinforcement learning) provide dense, token-level rewards, enabling the student model to fully absorb teacher expertise. MiMo-V2-Flash rivals top-tier open-weight models such as DeepSeek-V3.2 and Kimi-K2, despite using only 1/2 and 1/3 of their total parameters, respectively. During inference, by repurposing MTP as a draft model for speculative decoding, MiMo-V2-Flash achieves an acceptance length of up to 3.6 and a 2.6x decoding speedup with three MTP layers. We open-source both the model weights and the three-layer MTP weights to foster open research and community collaboration.
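The speculative-decoding trick the report mentions, reusing MTP heads as a cheap draft model, can be illustrated with a toy greedy-verification loop. This is a hedged sketch: the `draft_model` and `target_model` functions below are deterministic stand-ins I invented for illustration, not MiMo's implementation.

```python
# Toy speculative decoding with greedy verification: a cheap draft model
# proposes k tokens per step; the target model checks them and accepts the
# longest matching prefix, so one target pass can yield multiple tokens.

def target_model(ctx):
    # Stand-in for the big model: deterministically "predicts" a token.
    return (sum(ctx) * 31 + len(ctx)) % 50

def draft_model(ctx):
    # Stand-in for the cheap draft head: agrees with the target except
    # when the context length is divisible by 4 (an arbitrary error model).
    t = target_model(ctx)
    return t if len(ctx) % 4 else (t + 1) % 50

def speculative_decode(prompt, n_tokens, k=3):
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < n_tokens:
        # 1) Draft k tokens autoregressively with the cheap model.
        drafted, ctx = [], list(out)
        for _ in range(k):
            tok = draft_model(ctx)
            drafted.append(tok)
            ctx.append(tok)
        # 2) Verify: a real system scores all k positions in one batched
        #    target pass; here we call the stand-in per position.
        accepted, ctx = 0, list(out)
        for tok in drafted:
            if target_model(ctx) != tok:
                break
            ctx.append(tok)
            accepted += 1
        # 3) Keep the accepted prefix plus one target token (the
        #    correction, or a bonus token if all k were accepted).
        out.extend(drafted[:accepted])
        out.append(target_model(out))
        target_passes += 1
    return out[len(prompt):][:n_tokens], target_passes

tokens, passes = speculative_decode([1, 2, 3], n_tokens=12, k=3)
print(len(tokens), passes)  # 12 tokens produced in far fewer target passes
```

The reported "acceptance length of 3.6" corresponds to the average number of tokens accepted per target pass; the higher it is, the closer decoding cost approaches one target pass per ~4 tokens.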
- Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models
Digital twins, as precise digital representations of physical systems, have evolved from passive simulation tools into intelligent and autonomous entities through the integration of artificial intelligence technologies. By synthesizing existing technologies and practices, this paper distills a unified four-stage framework that systematically characterizes how AI methodologies are embedded across the digital twin lifecycle: (1) modeling the physical twin through physics-based and physics-informed AI approaches, (2) mirroring the physical system into a digital twin with real-time synchronization, (3) intervening in the physical twin through predictive modeling, anomaly detection, and optimization strategies, and (4) achieving autonomous management through large language models, foundation models, and intelligent agents. We analyze the synergy between physics-based modeling and data-driven learning, highlighting the shift from traditional numerical solvers to physics-informed and foundation models for physical systems. Furthermore, we examine how generative AI technologies, including large language models and generative world models, transform digital twins into proactive and self-improving cognitive systems capable of reasoning, communication, and creative scenario generation. Through a cross-domain review spanning eleven application domains, including healthcare, aerospace, smart manufacturing, robotics, and smart cities, we identify common challenges related to scalability, explainability, and trustworthiness, and outline directions for responsible AI-driven digital twin systems.
- Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy
Large language models (LLMs), despite strong performance on complex mathematical problems, exhibit systematic limitations in counting tasks. This issue arises from architectural limits of transformers, where counting is performed across layers, leading to degraded precision for larger counting problems due to depth constraints. To address this limitation, we propose a simple test-time strategy inspired by System-2 cognitive processes that decomposes large counting tasks into smaller, independent sub-problems that the model can reliably solve. We evaluate this approach using observational and causal mediation analyses to understand the underlying mechanism of this System-2-like strategy. Our mechanistic analysis identifies key components: latent counts are computed and stored in the final item representations of each part, transferred to intermediate steps via dedicated attention heads, and aggregated in the final stage to produce the total count. Experimental results demonstrate that this strategy enables LLMs to surpass architectural limitations and achieve high accuracy on large-scale counting tasks. This work provides mechanistic insight into System-2 counting in LLMs and presents a generalizable approach for improving and understanding their reasoning behavior.
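The decomposition strategy the abstract describes can be sketched in a few lines: split a long counting task into chunks small enough to solve reliably, count each chunk independently, then sum. The `unreliable_count` function below is a stand-in I invented for a model whose counting accuracy degrades with input length; it is an illustrative assumption, not the paper's setup.

```python
# System-2-style counting: decompose a large counting task into small,
# independent sub-problems, solve each reliably, then aggregate.

def unreliable_count(items, target, capacity=8):
    # Stand-in for a depth-limited model: exact on short inputs,
    # undercounts progressively on long ones.
    true = sum(1 for x in items if x == target)
    return true if len(items) <= capacity else max(0, true - len(items) // 10)

def system2_count(items, target, chunk=8):
    # Each chunk stays within the model's reliable "capacity", so the
    # aggregated result is exact even when the one-shot count is not.
    return sum(unreliable_count(items[i:i + chunk], target)
               for i in range(0, len(items), chunk))

data = ["a", "b"] * 40 + ["a"] * 5           # 45 "a"s among 85 items
print(unreliable_count(data, "a"))            # one-shot: undercounts (37)
print(system2_count(data, "a"))               # chunked: exact (45)
```

The paper's contribution is the mechanistic account of how such partial counts are stored and aggregated inside the transformer; the sketch only shows why the decomposition sidesteps the depth limit.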
- Muses: Designing, Composing, Generating Nonexistent Fantasy 3D Creatures without Training
We present Muses, the first training-free method for fantastic 3D creature generation in a feed-forward paradigm. Previous methods, which rely on part-aware optimization, manual assembly, or 2D image generation, often produce unrealistic or incoherent 3D assets due to the challenges of intricate part-level manipulation and limited out-of-domain generation. In contrast, Muses leverages the 3D skeleton, a fundamental representation of biological forms, to explicitly and rationally compose diverse elements. This skeletal foundation formalizes 3D content creation as a structure-aware pipeline of design, composition, and generation. Muses begins by constructing a creatively composed 3D skeleton with coherent layout and scale through graph-constrained reasoning. This skeleton then guides a voxel-based assembly process within a structured latent space, integrating regions from different objects. Finally, image-guided appearance modeling under skeletal conditions is applied to generate a style-consistent and harmonious texture for the assembled shape. Extensive experiments establish Muses' state-of-the-art performance in terms of visual fidelity and alignment with textual descriptions, and potential on flexible 3D object editing. Project page: https://luhexiao.github.io/Muses.github.io/.
- OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs
The rapid integration of Multimodal Large Language Models (MLLMs) into critical applications is increasingly hindered by persistent safety vulnerabilities. However, existing red-teaming benchmarks are often fragmented, limited to single-turn text interactions, and lack the scalability required for systematic evaluation. To address this, we introduce OpenRT, a unified, modular, and high-throughput red-teaming framework designed for comprehensive MLLM safety evaluation. At its core, OpenRT architects a paradigm shift in automated red-teaming by introducing an adversarial kernel that enables modular separation across five critical dimensions: model integration, dataset management, attack strategies, judging methods, and evaluation metrics. By standardizing attack interfaces, it decouples adversarial logic from a high-throughput asynchronous runtime, enabling systematic scaling across diverse models. Our framework integrates 37 diverse attack methodologies, spanning white-box gradients, multi-modal perturbations, and sophisticated multi-agent evolutionary strategies. Through an extensive empirical study on 20 advanced models (including GPT-5.2, Claude 4.5, and Gemini 3 Pro), we expose critical safety gaps: even frontier models fail to generalize across attack paradigms, with leading models exhibiting average Attack Success Rates as high as 49.14%. Notably, our findings reveal that reasoning models do not inherently possess superior robustness against complex, multi-turn jailbreaks. By open-sourcing OpenRT, we provide a sustainable, extensible, and continuously maintained infrastructure that accelerates the development and standardization of AI safety.
- WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks
We present WebGym, the largest-to-date open-source environment for training realistic visual web agents. Real websites are non-stationary and diverse, making artificial or small-scale task sets insufficient for robust policy learning. WebGym contains nearly 300,000 tasks with rubric-based evaluations across diverse, real-world websites and difficulty levels. We train agents with a simple reinforcement learning (RL) recipe that learns from the agent's own interaction traces (rollouts), using task rewards as feedback. To scale RL, we first speed up trajectory sampling in WebGym with a high-throughput asynchronous rollout system designed specifically for web agents, achieving a 4-5x rollout speedup over naive implementations. Second, we scale the task set's breadth, depth, and size, which yields continued performance improvement. Fine-tuning a strong base vision-language model, Qwen-3-VL-8B-Instruct, on WebGym raises the success rate on an out-of-distribution test set from 26.2% to 42.9%, significantly outperforming agents based on proprietary models such as GPT-4o and GPT-5-Thinking, which achieve 27.1% and 29.8%, respectively. This improvement is substantial because our test set consists only of tasks on websites never seen during training, unlike many prior works on training visual web agents.
Solidot(15)
- Weekend catch-up sleep may help protect adolescents against depression
Research suggests that adolescents who make up on weekends for sleep lost during the school week may see improved mental health. Among 16-24-year-olds, those who caught up on sleep at weekends had a 41% lower risk of depressive symptoms than those who did not. Adolescents are a group with a high incidence of sleep problems and an elevated risk of depression, and weekday sleep deprivation is common: schoolwork, social life, extracurricular activities, and part-time jobs crowd out their time and energy, shrinking their sleep. The researchers analyzed data on 16-24-year-olds from the 2021-2023 US National Health and Nutrition Examination Survey. Participants reported their weekday and weekend bed and wake times, from which the researchers computed weekend catch-up sleep: the difference between average weekend and average weekday sleep duration. The ideal schedule for adolescents is to fall asleep around 11 p.m. and wake around 8 a.m., which conflicts with the early start times of many US high schools. Many sleep experts and medical practitioners support public health initiatives to delay school start times.
- ePSXe emulator releases a new version after nearly a decade
The Sony PS1 emulator project ePSXe has released version 2.0.18, its first new release in almost ten years; the previous version shipped in 2016. The more frequently updated and currently more popular PS1 emulator is DuckStation. Major changes in ePSXe 2.0.18 include: support for CHD-format ISO images; DPI awareness support, fixing high-resolution display issues seen when the feature was not enabled; a fix for a crash when reading the configuration with no overclock value selected; improved SPUCORE reverb and volume management, fixing issues in games such as Ghost in the Shell, Dino Crisis 1 & 2, Wipeout, DW7, and DQ4; and improved compatibility with games such as Samurai Shodown III.
- More than half of the world's new data centers are in the US
Based on data covering data centers on purchased but unannounced sites, under construction, or publicly planned, more than half of the world's new data centers are located in the United States. These figures may even understate US dominance, since the average US data center is typically larger than those in other countries. The count of data centers under construction in China is also likely understated, because China does not publicly announce data center plans. There are currently 1,947 data centers under construction worldwide. 55% of new US data centers are located in Virginia, Texas, Illinois, Georgia, and Arizona; 108 of the new data centers belong to Amazon AWS, 84 to Microsoft, and 36 to Google.
- US schools often no longer require students to read entire novels
A survey of 2,000 teachers, students, and parents found that many US high schools no longer assign whole novels, instead assigning excerpts, and students typically read them not in print but on the screens of school-issued laptops. The shift stems from several factors, including the perception that students' attention spans are shrinking and the pressure on schools to prepare students for standardized tests, such as those aligned with the Common Core multi-state education standards. Under Common Core, schools increasingly rely on curriculum products such as StudySync, which take an anthology-style approach and do not require students to read entire books. Teachers acknowledge that today's teenagers read far fewer complete novels than previous generations.
- US teenagers spend over an hour on their phones during the school day
According to a study published in JAMA, US teenagers use their phones for more than an hour per day while at school, mostly on social media. Researchers analyzed the behavior of 640 adolescents aged 13-18 participating in the Adolescent Brain Cognitive Development Study; the teens and their parents consented to installing an app that monitored phone use, with measurements running from September 2022 to May 2024. The analysis showed that the most-used apps were Instagram, TikTok, and Snapchat, followed by YouTube and games. At least 32 US states and the District of Columbia require school districts to ban or restrict student phone use at school, but the effectiveness of these policies remains to be seen.
- Jellyfish sleep patterns resemble those of humans
According to a study published in Nature Communications, the sleep patterns of jellyfish and sea anemones show clear similarities to those of humans. The findings support the hypothesis that sleep evolved in many species to prevent wakefulness-related DNA damage. Israeli researchers studied the sleep patterns of Cassiopea jellyfish both in the laboratory and in their natural habitat, and separately observed sea anemones in the lab. They found that both animals sleep about a third of each day, similar to humans. The jellyfish sleep through the night (with a short nap around midday), while the sea anemones sleep mainly during the day. Further study of the mechanisms behind these patterns showed that jellyfish sleep is controlled by changes in light together with homeostatic sleep drive, whereas sea anemone sleep is regulated by an internal circadian clock together with homeostatic sleep drive. The findings suggest that sleep may have evolved in animals as a mechanism to reduce DNA damage and the cellular stress associated with wakefulness.
- HP unveils a business PC built into a keyboard
At CES 2026, HP showed the EliteBoard G1a, a business PC integrated into a keyboard, expected to ship in March. Raspberry Pi has offered similar keyboard PCs since 2019, from the Raspberry Pi 400 to the 500+, aimed mainly at DIY enthusiasts and Linux users; HP's keyboard PC targets business users instead. The EliteBoard G1a connects to a USB-C monitor, and HP also offers a USB-to-HDMI adapter for those without one. Its hardware: an AMD Ryzen AI 5 or 7 processor with AMD Radeon 800 integrated graphics and an NPU of up to 50 TOPS, meeting Microsoft's Copilot+ PC standard, plus support for up to 64GB of DDR5 memory and a 2TB SSD. It is 11.8 cm thick and weighs about 700 grams, lighter than most laptops but thicker and longer, and an optional 32Wh battery provides 3.5 hours of runtime.
- Discord confidentially files for IPO
Bloomberg reports that Discord has confidentially filed for a US IPO. Founded in 2015, Discord provides voice, video, and text chat, aimed mainly at gamers and streamers. The platform has more than 200 million monthly active users. A signature Discord feature is group chats called servers, in which server owners can build their own communities.
- Manjaro 26.0 released
The Arch-based distribution Manjaro has released v26.0. Major changes include Linux 6.18, GNOME 49, KDE Plasma 6.5, and Xfce 4.20. The developers note that Plasma 6.5 and GNOME 49 both default to Wayland; users who still need X11 can choose the Xfce edition.
- Google will release Android source code only twice a year
Google has revealed that starting in 2026 it will publish the Android Open Source Project (AOSP) source code only twice a year, in the second and fourth quarters. Previously, Google released an AOSP version every quarter, four times a year. It recommends that developers use the android-latest-release branch rather than aosp-main. A Google spokesperson explained that the move simplifies development by eliminating the complexity of managing multiple code branches, giving Android platform developers more stable and more secure code. The spokesperson said the company's commitment to AOSP is unchanged, as is the security-patch release process: security patches will still be published monthly on dedicated security branches for the relevant OS versions.
- New York's congestion pricing significantly reduced air pollution
Traffic is a major source of urban air pollutants. To ease congestion and improve air quality, New York City began charging a congestion fee in the core of Manhattan, dubbed the Congestion Relief Zone, in January 2025. Researchers used daily PM2.5 data from 42 air-quality monitoring stations across the city to assess the zone's short-term impact. The results show that PM2.5 levels fell 22% within six months of the policy's introduction, with more modest declines in neighboring areas. The study confirms that congestion pricing can deliver broad environmental benefits.
- Linux share of Steam users reaches 3.58%
Valve has revised its Steam survey statistics for December 2025. The earlier figures put Linux at 3.19% of Steam users, down 0.01% from November; the revised data shows the Linux share at a record 3.58%, up 0.38% from November. Valve did not explain the revision. The updated data also cut the AMD share among Linux CPUs from 71.93% to 67.43%, and the share of the custom AMD GPU used in the Steam Deck handheld from 21.41% to 13.37%.
- Korean mathematician solves the moving sofa problem
Korean mathematician Baek Jin-eon has solved the moving sofa problem, open for nearly 60 years. Posed by Leo Moser in 1966 and inspired by the everyday scenario of pushing a sofa down a hallway, the problem is widely known because anyone can understand it. It asks for the largest area of a rigid two-dimensional shape that can pass through an L-shaped corridor one meter wide; this maximum area is called the sofa constant. British mathematician John Hammersley gave a sofa of 2.2074 square meters in 1968, and Joseph Gerver improved this to 2.2195 square meters in 1992. In late 2024 Baek Jin-eon posted a 119-page paper on arXiv proving that Gerver's solution is also a hard upper bound. The paper has been submitted to the Annals of Mathematics.
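For reference, the problem can be stated compactly; the following is a sketch of the standard formulation (notation mine, not Baek's):

```latex
% Unit-width L-shaped corridor (both legs have width 1):
L \;=\; \bigl([0,1]\times(-\infty,1]\bigr) \;\cup\; \bigl((-\infty,1]\times[0,1]\bigr)

% Sofa constant: the supremum of areas of connected planar shapes S that
% can be moved by a continuous rigid motion from one leg of L to the other
% while staying inside L:
\mu \;=\; \sup\{\,\operatorname{area}(S) \;:\; S \text{ traverses the corner of } L \,\}

% Bounds mentioned above:
2.2074 \;\le\; \mu \quad\text{(Hammersley, 1968)}, \qquad
\mu \;=\; 2.2195\ldots \quad\text{(Gerver's sofa, 1992; shown optimal by Baek)}
```

Baek's result closes the gap from above, establishing that Gerver's construction attains the supremum.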
- When the internet leaves America
A series of policies by US President Trump, and the tech oligarchs surrounding him, are prompting countries to rethink their dependence on US tech companies. At the 39C3 conference held in Germany late last month, Canadian science fiction author Cory Doctorow, who coined the term "enshittification", called for building a post-American internet. Since World War II the world has treated the US as a neutral platform and a trustworthy steward of relationships, but over the past fifteen years, and especially under Trump, the US has systematically destroyed global trust in it. Last year, after the International Criminal Court (ICC) in The Hague issued arrest warrants for Israeli Prime Minister Netanyahu and former defense minister Gallant, Trump sanctioned chief prosecutor Karim Khan and others, and Microsoft promptly blocked Khan's email account, disrupting the ICC's operations. Doctorow argues that crisis enables change, and that America's behavior will drive the creation of a post-American internet and end the enshittification of platforms.
- Human ancestors could walk upright 7 million years ago
According to research published in Science Advances, fossils dating back 7 million years provide strong anatomical evidence that Sahelanthropus tchadensis, an ape-like species with a small brain, could walk upright. This means human ancestors walked upright much earlier than expected. Paleontologists from the University of Poitiers in France discovered the Sahelanthropus tchadensis fossils in the Djurab Desert of Chad in Central Africa; the fossils date to about 7 million years ago. Whether they belong to a direct human ancestor or to an extinct side branch of apes has long been debated, and a key point of contention is precisely whether Sahelanthropus tchadensis could walk upright. Using advanced 3D imaging techniques, the researchers analyzed the species' limb-bone fossils and identified three key features supporting bipedalism. First, a tubercle on the front of the proximal femur: small but important, it is the attachment point for the iliofemoral ligament, the strongest ligament in the human body and critical to upright walking; this feature has so far been observed only in hominins. Second, a natural rotational twist of the femur, i.e. femoral anteversion, with an angle within the hominin range, which helps the leg swing forward for efficient walking. Third, gluteal muscles similar to those of early hominins, able to stabilize the hip joint and assist standing, walking, and running. The latter two features had been noted in earlier research; the new study confirms their presence.