Weekly Digest — 2026-W01
128 unique stories (2025-12-29 → 2026-01-04), aggregated across 8 sources.
Hacker News(42)
- Google is dead. Where do we go now? (www.circusscientist.com)
- Karpathy on Programming: "I've never felt this much behind" (twitter.com)
- LLMs Are Not Fun (orib.dev)
- List of domains censored by German ISPs (cuiiliste.de)
- Tesla's 4680 battery supply chain collapses as partner writes down deal by 99% (electrek.co)
- Nvidia takes $5B stake in Intel under September agreement (www.reuters.com)
- Mitsubishi Diatone D-160 (1985) (audio-database.com)
- Everything as code: How we manage our company in one monorepo (www.kasava.dev)
- A faster heart for F-Droid. Our new server is here (f-droid.org)
- FediMeteo: A €4 FreeBSD VPS Became a Global Weather Service (it-notes.dragas.net)
- A Vulnerability in Libsodium (00f.net)
- Show HN: 22 GB of Hacker News in SQLite (hackerbook.dosaygo.com)
GitHub Trending(25)
- QuantConnect / Lean
Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
- RustPython / RustPython
A Python Interpreter written in Rust
- Flowseal / zapret-discord-youtube
- BloopAI / vibe-kanban
Get 10X more out of Claude Code, Codex or any coding agent
- gitroomhq / postiz-app
📨 The ultimate social media scheduling tool, with a bunch of AI 🤖
- sansan0 / TrendRadar
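🎯 Say goodbye to information overload: AI helps you make sense of trending news, with RSS subscription support and simple public-opinion monitoring and analysis. Multi-platform trending-topic aggregation plus an MCP-based AI analysis tool. Monitors 35 platforms (Douyin, Zhihu, Bilibili, Wallstreetcn, Cailian Press, and others), with smart filtering, automatic push, and conversational AI analysis (mine the news in depth using natural language: 20 tools including trend tracking, sentiment analysis, and similarity search). Supports push via WeCom, personal WeChat, Feishu, DingTalk, Telegram, email, ntfy, bark, and Slack; deploys in 30 seconds, with phone notifications within 1 minute and no programming required. Supports Docker deployment and remote cloud storage of data. ⭐ Let the algorithm work for you, and use AI to understand what's trending.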
- x1xhlol / system-prompts-and-models-of-ai-tools
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI, VSCode Agent, Warp.dev, Windsurf, Xcode, Z.ai Code, Dia & v0. (And other Open Sourced) System Prompts, Internal Tools & AI Models
- jrouwe / JoltPhysics
A multi core friendly rigid body physics and collision detection library. Written in C++. Suitable for games and VR applications. Used by Horizon Forbidden West.
- timescale / pg-aiguide
MCP server and Claude plugin for Postgres skills and documentation. Helps AI coding tools generate better PostgreSQL code.
- resemble-ai / chatterbox
SoTA open-source TTS
- afkarxyz / SpotiFLAC
Get Spotify tracks in true FLAC from Tidal, Qobuz & Amazon Music — no account required.
- google-gemini / computer-use-preview
Hugging Face(31)
- Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding
Humans understand long and complex texts by relying on a holistic semantic representation of the content. This global view helps organize prior knowledge, interpret new information, and integrate evidence dispersed across a document, as revealed by the Mindscape-Aware Capability of humans in psychology. Current Retrieval-Augmented Generation (RAG) systems lack such guidance and therefore struggle with long-context tasks. In this paper, we propose Mindscape-Aware RAG (MiA-RAG), the first approach that equips LLM-based RAG systems with explicit global context awareness. MiA-RAG builds a mindscape through hierarchical summarization and conditions both retrieval and generation on this global semantic representation. This enables the retriever to form enriched query embeddings and the generator to reason over retrieved evidence within a coherent global context. We evaluate MiA-RAG across diverse long-context and bilingual benchmarks for evidence-based understanding and global sense-making. It consistently surpasses baselines, and further analysis shows that it aligns local details with a coherent global representation, enabling more human-like long-context retrieval and reasoning.
- InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion
Recent advances in diffusion-based video generation have opened new possibilities for controllable video editing, yet realistic video object insertion (VOI) remains challenging due to limited 4D scene understanding and inadequate handling of occlusion and lighting effects. We present InsertAnywhere, a new VOI framework that achieves geometrically consistent object placement and appearance-faithful video synthesis. Our method begins with a 4D aware mask generation module that reconstructs the scene geometry and propagates user specified object placement across frames while maintaining temporal coherence and occlusion consistency. Building upon this spatial foundation, we extend a diffusion based video generation model to jointly synthesize the inserted object and its surrounding local variations such as illumination and shading. To enable supervised training, we introduce ROSE++, an illumination aware synthetic dataset constructed by transforming the ROSE object removal dataset into triplets of object removed video, object present video, and a VLM generated reference image. Through extensive experiments, we demonstrate that our framework produces geometrically plausible and visually coherent object insertions across diverse real world scenarios, significantly outperforming existing research and commercial models.
- UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture
Multimodal large language models (MLLMs) have achieved remarkable progress in visual understanding tasks such as visual grounding, segmentation, and captioning. However, their ability to perceive perceptual-level image features remains limited. In this work, we present UniPercept-Bench, a unified framework for perceptual-level image understanding across three key domains: Aesthetics, Quality, and Structure and Texture. We establish a hierarchical definition system and construct large-scale datasets to evaluate perceptual-level image understanding. Based on this foundation, we develop a strong baseline UniPercept trained via Domain-Adaptive Pre-Training and Task-Aligned RL, enabling robust generalization across both Visual Rating (VR) and Visual Question Answering (VQA) tasks. UniPercept outperforms existing MLLMs on perceptual-level image understanding and can serve as a plug-and-play reward model for text-to-image generation. This work defines Perceptual-Level Image Understanding in the era of MLLMs and, through the introduction of a comprehensive benchmark together with a strong baseline, provides a solid foundation for advancing perceptual-level multimodal image understanding.
- MAI-UI Technical Report: Real-World Centric Foundation GUI Agents
The development of GUI agents could revolutionize the next generation of human-computer interaction. Motivated by this vision, we present MAI-UI, a family of foundation GUI agents spanning the full spectrum of sizes, including 2B, 8B, 32B, and 235B-A22B variants. We identify four key challenges to realistic deployment: the lack of native agent-user interaction, the limits of UI-only operation, the absence of a practical deployment architecture, and brittleness in dynamic environments. MAI-UI addresses these issues with a unified methodology: a self-evolving data pipeline that expands the navigation data to include user interaction and MCP tool calls, a native device-cloud collaboration system that routes execution by task state, and an online RL framework with advanced optimizations to scale parallel environments and context length. MAI-UI establishes a new state of the art across GUI grounding and mobile navigation. On grounding benchmarks, it reaches 73.5% on ScreenSpot-Pro, 91.3% on MMBench GUI L2, 70.9% on OSWorld-G, and 49.2% on UI-Vision, surpassing Gemini-3-Pro and Seed1.8 on ScreenSpot-Pro. On mobile GUI navigation, it sets a new SOTA of 76.7% on AndroidWorld, surpassing UI-Tars-2, Gemini-2.5-Pro and Seed1.8. On MobileWorld, MAI-UI obtains a 41.7% success rate, significantly outperforming end-to-end GUI models and remaining competitive with Gemini-3-Pro based agentic frameworks. Our online RL experiments show significant gains from scaling parallel environments from 32 to 512 (+5.2 points) and increasing environment step budget from 15 to 50 (+4.3 points). Finally, the native device-cloud collaboration system improves on-device performance by 33%, reduces cloud model calls by over 40%, and preserves user privacy.
- ProEdit: Inversion-based Editing From Prompts Done Right
Inversion-based visual editing provides an effective and training-free way to edit an image or a video based on user instructions. Existing methods typically inject source image information during the sampling process to maintain editing consistency. However, this sampling strategy overly relies on source information, which negatively affects the edits in the target image (e.g., failing to change the subject's attributes like pose, number, or color as instructed). In this work, we propose ProEdit to address this issue both in the attention and the latent aspects. In the attention aspect, we introduce KV-mix, which mixes KV features of the source and the target in the edited region, mitigating the influence of the source image on the editing region while maintaining background consistency. In the latent aspect, we propose Latents-Shift, which perturbs the edited region of the source latent, eliminating the influence of the inverted latent on the sampling. Extensive experiments on several image and video editing benchmarks demonstrate that our method achieves SOTA performance. In addition, our design is plug-and-play, which can be seamlessly integrated into existing inversion and editing methods, such as RF-Solver, FireFlow and UniEdit.
- TimeBill: Time-Budgeted Inference for Large Language Models
Large Language Models (LLMs) are increasingly deployed in time-critical systems, such as robotics, autonomous driving, embodied intelligence, and industrial automation, where generating accurate responses within a given time budget is crucial for decision-making, control, or safety-critical tasks. However, the auto-regressive generation process of LLMs makes it challenging to model and estimate the end-to-end execution time. Furthermore, existing efficient inference methods based on a fixed key-value (KV) cache eviction ratio struggle to adapt to varying tasks with diverse time budgets, where an improper eviction ratio may lead to incomplete inference or a drop in response performance. In this paper, we propose TimeBill, a novel time-budgeted inference framework for LLMs that balances the inference efficiency and response performance. To be more specific, we propose a fine-grained response length predictor (RLP) and an execution time estimator (ETE) to accurately predict the end-to-end execution time of LLMs. Following this, we develop a time-budgeted efficient inference approach that adaptively adjusts the KV cache eviction ratio based on execution time prediction and the given time budget. Finally, through extensive experiments, we demonstrate the advantages of TimeBill in improving task completion rate and maintaining response performance under various overrun strategies.
- Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Mixture-of-Experts (MoE) models lack explicit constraints to ensure the router's decisions align well with the experts' capabilities, which ultimately limits model performance. To address this, we propose expert-router coupling (ERC) loss, a lightweight auxiliary loss that tightly couples the router's decisions with expert capabilities. Our approach treats each expert's router embedding as a proxy token for the tokens assigned to that expert, and feeds perturbed router embeddings through the experts to obtain internal activations. The ERC loss enforces two constraints on these activations: (1) Each expert must exhibit higher activation for its own proxy token than for the proxy tokens of any other expert. (2) Each proxy token must elicit stronger activation from its corresponding expert than from any other expert. These constraints jointly ensure that each router embedding faithfully represents its corresponding expert's capability, while each expert specializes in processing the tokens actually routed to it. The ERC loss is computationally efficient, operating only on n^2 activations, where n is the number of experts. This represents a fixed cost independent of batch size, unlike prior coupling methods that scale with the number of tokens (often millions per batch). Through pre-training MoE-LLMs ranging from 3B to 15B parameters and extensive analysis on trillions of tokens, we demonstrate the effectiveness of the ERC loss. Moreover, the ERC loss offers flexible control and quantitative tracking of expert specialization levels during training, providing valuable insights into MoEs.
- LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
Real-time video generation via diffusion is essential for building general-purpose multimodal interactive AI systems. However, the simultaneous denoising of all video frames with bidirectional attention via an iterative process in diffusion models prevents real-time interaction. While existing distillation methods can make the model autoregressive and reduce sampling steps to mitigate this, they focus primarily on text-to-video generation, leaving the human-AI interaction unnatural and less efficient. This paper targets real-time interactive video diffusion conditioned on a multimodal context, including text, image, and audio, to bridge the gap. Given the observation that the leading on-policy distillation approach Self Forcing encounters challenges (visual artifacts like flickering, black frames, and quality degradation) with multimodal conditioning, we investigate an improved distillation recipe with emphasis on the quality of condition inputs as well as the initialization and schedule for the on-policy optimization. On benchmarks for multimodal-conditioned (audio, image, and text) avatar video generation including HDTF, AVSpeech, and CelebV-HQ, our distilled model matches the visual quality of the full-step, bidirectional baselines of similar or larger size with 20x less inference cost and latency. Further, we integrate our model with audio language models and long-form video inference technique Anchor-Heavy Identity Sinks to build LiveTalk, a real-time multimodal interactive avatar system. System-level evaluation on our curated multi-turn interaction benchmark shows LiveTalk outperforms state-of-the-art models (Sora2, Veo3) in multi-turn video coherence and content quality, while reducing response latency from 1 to 2 minutes to real-time generation, enabling seamless human-AI multimodal interaction.
- Yume-1.5: A Text-Controlled Interactive World Generation Model
Recent approaches have demonstrated the promise of using diffusion models to generate interactive and explorable worlds. However, most of these methods face critical challenges such as excessively large parameter sizes, reliance on lengthy inference steps, and rapidly growing historical context, which severely limit real-time performance, and they lack text-controlled generation capabilities. To address these challenges, we propose Yume-1.5, a novel framework designed to generate realistic, interactive, and continuous worlds from a single image or text prompt. Yume-1.5 achieves this through a carefully designed framework that supports keyboard-based exploration of the generated worlds. The framework comprises three core components: (1) a long-video generation framework integrating unified context compression with linear attention; (2) a real-time streaming acceleration strategy powered by bidirectional attention distillation and an enhanced text embedding scheme; (3) a text-controlled method for generating world events. We have provided the codebase in the supplementary material.
- SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents
Agentic reinforcement learning (RL) holds great promise for the development of autonomous agents under complex GUI tasks, but its scalability remains severely hampered by the verification of task completion. Existing task verification is treated as a passive, post-hoc process: a verifier (e.g., a rule-based scoring script, a reward or critic model, or an LLM-as-a-Judge) analyzes the agent's entire interaction trajectory to determine if the agent succeeds. Such processing of verbose context that contains irrelevant, noisy history poses challenges to the verification protocols and therefore leads to prohibitive cost and low reliability. To overcome this bottleneck, we propose SmartSnap, a paradigm shift from this passive, post-hoc verification to proactive, in-situ self-verification by the agent itself. We introduce the Self-Verifying Agent, a new type of agent designed with dual missions: to not only complete a task but also to prove its accomplishment with curated snapshot evidence. Guided by our proposed 3C Principles (Completeness, Conciseness, and Creativity), the agent leverages its access to the online environment to perform self-verification on a minimal, decisive set of snapshots. This evidence is provided as the sole material for a general LLM-as-a-Judge verifier to determine its validity and relevance. Experiments on mobile tasks across model families and scales demonstrate that our SmartSnap paradigm allows training LLM-driven agents in a scalable manner, bringing performance gains of up to 26.08% and 16.66% to 8B and 30B models, respectively. The synergy between solution finding and evidence seeking facilitates the cultivation of efficient, self-verifying agents with competitive performance against DeepSeek V3.1 and Qwen3-235B-A22B.
- Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Transparent objects remain notoriously hard for perception systems: refraction, reflection and transmission break the assumptions behind stereo, ToF and purely discriminative monocular depth, causing holes and temporally unstable estimates. Our key observation is that modern video diffusion models already synthesize convincing transparent phenomena, suggesting they have internalized the optical rules. We build TransPhy3D, a synthetic video corpus of transparent/reflective scenes: 11k sequences rendered with Blender/Cycles. Scenes are assembled from a curated bank of category-rich static assets and shape-rich procedural assets paired with glass/plastic/metal materials. We render RGB + depth + normals with physically based ray tracing and OptiX denoising. Starting from a large video diffusion model, we learn a video-to-video translator for depth (and normals) via lightweight LoRA adapters. During training we concatenate RGB and (noisy) depth latents in the DiT backbone and co-train on TransPhy3D and existing frame-wise synthetic datasets, yielding temporally consistent predictions for arbitrary-length input videos. The resulting model, DKT, achieves zero-shot SOTA on real and synthetic video benchmarks involving transparency: ClearPose, DREDS (CatKnown/CatNovel), and TransPhy3D-Test. It improves accuracy and temporal consistency over strong image/video baselines, and a normal variant sets the best video normal estimation results on ClearPose. A compact 1.3B version runs at ~0.17 s/frame. Integrated into a grasping stack, DKT's depth boosts success rates across translucent, reflective and diffuse surfaces, outperforming prior estimators. Together, these results support a broader claim: "Diffusion knows transparency." Generative video priors can be repurposed, efficiently and label-free, into robust, temporally coherent perception for challenging real-world manipulation.
- Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion
Diffusion-based video super-resolution (VSR) methods achieve strong perceptual quality but remain impractical for latency-sensitive settings due to reliance on future frames and expensive multi-step denoising. We propose Stream-DiffVSR, a causally conditioned diffusion framework for efficient online VSR. Operating strictly on past frames, it combines a four-step distilled denoiser for fast inference, an Auto-regressive Temporal Guidance (ARTG) module that injects motion-aligned cues during latent denoising, and a lightweight temporal-aware decoder with a Temporal Processor Module (TPM) that enhances detail and temporal coherence. Stream-DiffVSR processes 720p frames in 0.328 seconds on an RTX4090 GPU and significantly outperforms prior diffusion-based methods. Compared with the online SOTA TMP, it boosts perceptual quality (LPIPS +0.095) while reducing latency by over 130x. Stream-DiffVSR achieves the lowest latency reported for diffusion-based VSR, reducing initial delay from over 4600 seconds to 0.328 seconds, thereby making it the first diffusion VSR method suitable for low-latency online deployment. Project page: https://jamichss.github.io/stream-diffvsr-project-page/
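As a rough illustration of the two constraints listed in the "Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss" entry above, the sketch below scores an n x n activation matrix, where `acts[i][j]` is assumed to be expert i's activation on the proxy token of expert j. The abstract does not give the exact form of the ERC loss, so the hinge penalty, the `margin` parameter, the averaging, and the function name `erc_style_penalty` are all assumptions made for illustration, not the authors' formulation.

```python
def erc_style_penalty(acts, margin=0.0):
    """Hinge-style penalty over an n x n activation matrix (illustrative only).

    acts[i][j]: activation of expert i on the proxy token of expert j.
    The penalty is zero when every diagonal entry exceeds all other entries
    in its row (constraint 1) and in its column (constraint 2) by `margin`.
    """
    n = len(acts)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        own = acts[i][i]
        for j in range(n):
            if i == j:
                continue
            # Constraint 1: expert i responds most strongly to its own proxy token.
            total += max(0.0, acts[i][j] - own + margin)
            # Constraint 2: proxy token i is answered most strongly by expert i.
            total += max(0.0, acts[j][i] - own + margin)
    # Average over the 2 * n * (n - 1) off-diagonal comparisons.
    return total / (2 * n * (n - 1))


if __name__ == "__main__":
    well_coupled = [[2.0, 1.0], [0.5, 3.0]]
    poorly_coupled = [[1.0, 2.0], [0.0, 1.0]]
    print(erc_style_penalty(well_coupled))
    print(erc_style_penalty(poorly_coupled))
```

The loop touches only the n^2 entries of the matrix, which matches the abstract's point that the cost depends on the number of experts rather than on the number of tokens in a batch.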
Solidot(30)
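- Blender survey shows most users do not use AI
5,102 people took part in the Blender Foundation's annual survey. The results show that most participants are between 19 and 35 years old; 16% are from the United States, 7.26% from Germany, 5.61% from China, and 5.46% from India; one third are artists and 17% are designers; half use Blender every day; more than half use Blender because it is free, fun, or open source; users tend to stay on a single LTS release for a long time; and most participants do not use AI, with only 7% using it frequently.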
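- People who drink bottled water daily ingest 90,000 more microplastic particles per year
While visiting Thailand's Phi Phi Island, Sarah Sajedi was struck by the spectacular sea view, but when she looked down she found the beach littered with plastic bottles. During her doctoral studies she analyzed more than 140 papers to assess the effects of plastic bottles on the human body. She found that people ingest an average of 39,000 to 52,000 microplastic particles per year from food and drinking water, and that people who drink bottled water every day ingest 90,000 more particles per year. Sajedi advises drinking water from plastic bottles in emergencies only, not as a daily habit. Microplastics are plastic particles between 1 micron and 5 millimeters in size, while nanoplastics are smaller than 1 micron. The particles are invisible to the naked eye, yet they are shed continuously as bottles are manufactured, stored, transported, and broken down. Unlike plastic particles that enter the body through the food chain, microplastics from bottles are more worrying because they are ingested directly with drinking water. Once in the body, microplastic particles can enter the bloodstream and reach vital organs, triggering chronic inflammation and exposing cells to oxidative stress, which in turn leads to hormonal disruption, impaired reproductive function, and damage to the nervous system.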
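- Censorship and anti-censorship in Iran and Russia
During its conflict with Israel in June this year, Iran cut off internet access for several days and also tightened online censorship. Snowflake, developed by the Tor Project, is the most widely used traffic obfuscation tool in Iran. To better counter Iran's blocking of bridges (Tor relays that are not publicly listed but can be obtained through various means), the Tor Project developed the pluggable transport Conjure, which works much like the temporary email addresses people generate to avoid spam: if one bridge address is blocked, users can still obtain a new one. Russia has also tightened internet censorship, and WebTunnel, a new pluggable transport that the Tor Project launched last year to mimic HTTPS traffic, is popular there. Russia stepped up its blocking of WebTunnel bridge addresses in June, and the Tor Project began distributing WebTunnel bridges through Telegram. The Tor Project plans to deploy Conjure next year and to keep improving WebTunnel in order to respond better to blocking.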
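- SuperTux 0.7 releases its first beta
SuperTux, the open-source game modeled on Super Mario Bros., has released the first beta of its next major version, v0.7, after a gap of many years. The game stars Tux, the Linux penguin mascot, and plays as a side-scrolling platformer in the style of Super Mario Bros. Development began in 2003, and the previous major version, v0.6, was released in 2019. Version 0.7 is a major update that reworks several worlds and introduces all-new art, music, and other content. The core gameplay is unchanged, but the experience may feel completely different from before. A Flatpak package of the game is available.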
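- Sal Khan suggests companies donate 1% of profits to help workers displaced by AI
Khan Academy founder Sal Khan suggests that companies benefiting from automation donate 1% of their profits to help workers displaced by AI receive retraining for new jobs. He argues that this is not charity but is in companies' own interest: if corporate profits soar while unemployment rises, the public may come to support tighter regulation and higher taxes, or even a ban on automation. For large companies, funding worker retraining is a trivial expense that puts almost no strain on them, yet it matters greatly to the public. The dozen or so largest companies in the world earn more than $1 trillion in combined annual profit, so donating 1% would create a fund of about $10 billion a year, and a portion of that would be enough to build a centralized skills-training platform. The fund could be run by an independent nonprofit that coordinates with companies to ensure the skills being taught match market demand.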
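- Scientists find molecular differences in autistic brains
Scientists at Yale School of Medicine have found molecular differences between the brains of autistic people and those of neurotypical people. According to research published in The American Journal of Psychiatry, autistic brains have fewer of a specific type of glutamate receptor; glutamate is the most common excitatory neurotransmitter in the brain. The reduced number of glutamate receptors may be linked to several characteristics of autism. Neurons in the brain communicate through electrical signals and chemical messengers called neurotransmitters. When an electrical current travels through a neuron, it prompts the release of neurotransmitters, which pass the signal on to other neurons. This signaling can be excitatory or inhibitory. Excitatory signaling mainly triggers the release of the neurotransmitter glutamate and acts as a green light, telling other neurons to fire; inhibitory signaling acts as a brake, suppressing neural activity. The brain needs the two kinds of signal to be precisely balanced in order to function properly. One of the leading hypotheses about the cause of autism is an imbalance between excitatory and inhibitory signaling in the brain.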
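- KDE Plasma's 2025
KDE developers have summarized the major developments of the Plasma desktop environment in 2025. The switch to the Wayland display server is largely complete, and the Plasma release due in early 2027 will drop support for X11 sessions. Plasma has continued to improve and mature, becoming the default desktop environment for many gaming-oriented distributions, including Bazzite, CachyOS, Garuda, and Nobara, as well as SteamOS, which runs on Valve's handheld and console devices. Fedora has also put its Plasma desktop edition on an equal footing with its GNOME desktop edition, and Asahi Linux, the only distribution that runs on Apple's newer Mac hardware, uses the KDE Plasma desktop as well. Parrot Linux recently began defaulting to Plasma too. Plasma is the default desktop environment of long-established distributions such as EndeavourOS, Manjaro, NixOS, OpenMandriva, Slackware, and TuxedoOS.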
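- Mosquito mouthparts inspire a 3D printing nozzle design
A team from McGill University in Canada and Drexel University in the United States has jointly developed an inventive new technique for high-resolution 3D printing: they turned the proboscis (the blood-feeding tube) of a female mosquito into a high-resolution 3D printing nozzle. The technique can print extremely fine lines with a precision of 20 microns, and it offers a sustainable, biology-based answer to the cost and energy demands of micro- and nano-scale manufacturing. High-resolution 3D printing places very high demands on nozzle precision. The ultra-fine nozzles currently on the market are mostly made of specialty metals or glass, which require complex manufacturing processes and are expensive. The researchers note that conventional nozzles not only generate large amounts of environmental waste during production and use but may also pose health risks because of process limitations. In search of an alternative, the team turned to a highly evolved microstructure found in nature: the mosquito proboscis. Over millions of years of evolution, it has become a natural microneedle only about half the diameter of a human hair, combining a distinctive geometry with mechanical toughness. The team isolated mosquito proboscises under a microscope and fixed them to the tips of standard plastic dispensers using a specialty resin. They found that this biological nozzle can withstand very high pressure and that the complex structures it prints are roughly twice as fine as those produced by current commercial printing nozzles.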
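- CAC drafts interim measures requiring AI service providers to take steps to prevent suicide and self-harm
The Cyberspace Administration of China (CAC) has published the Interim Measures for the Administration of Anthropomorphic AI Interaction Services (Draft for Comment), with comments due by January 25. The Interim Measures contain what are considered the strictest policies of their kind in the world, requiring providers to take steps to prevent AI from helping users commit suicide or self-harm. The Interim Measures include: Article 8: Providers shall bear primary responsibility for the safety of anthropomorphic interaction services; establish and improve management systems for algorithm mechanism review, science and technology ethics review, information release review, network security, data security, personal information protection, countering telecom and online fraud, major-risk contingency plans, and emergency response; have safe and controllable technical safeguards; and provide content-management technology and personnel commensurate with the product's scale, business direction, and user base. Article 9: Providers shall fulfil safety responsibilities throughout the full life cycle of anthropomorphic interaction services; specify safety requirements for each stage, including design, operation, upgrade, and termination of service; ensure that safety measures are designed and put into use in step with service functions; raise the level of built-in safety; strengthen safety monitoring and risk assessment during operation; promptly detect and correct system deviations and handle safety issues; and retain network logs in accordance with the law. Providers shall have safety capabilities such as mental health protection, guidance on emotional boundaries, and early warning of dependency risks, and shall not make replacing social interaction, controlling users' psychology, or inducing addiction and dependence a design goal. Article 11: Providers shall be able to identify user states and, while protecting users' personal privacy, assess users' emotions and their degree of dependence on the product and service, and shall take necessary measures to intervene when users are found to be in an extreme emotional state or addicted. Providers shall prepare preset reply templates and, upon detecting high-risk tendencies that threaten a user's life, health, or property, promptly output content that comforts the user and encourages them to seek help, and provide ways to reach professional assistance. Providers shall establish an emergency response mechanism: when a user explicitly states an intention to commit suicide, self-harm, or a similar extreme act, a human shall take over the conversation and promptly take steps to contact the user's guardian or emergency contact. For minors and elderly users, providers shall require the user's guardian and emergency contact information at registration. Article 17: When a user has used an anthropomorphic interaction service continuously for more than 2 hours, the provider shall dynamically remind the user, by pop-up or similar means, to take a break from the service.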
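- Chinese automakers' sales overtake Japan's
Global sales by Chinese automakers surpassed those of Japanese automakers in 2025, putting China in first place for the first time. Based on figures released by individual companies for January through November 2025 and data from S&P Global Mobility, global sales of Chinese cars are expected to grow 17% year on year to about 27 million vehicles. China first took the top spot in car exports in 2023, and it is now set to take the lead in overall sales in 2025 as well. Japanese automakers' combined sales were about 25 million vehicles, flat compared with the previous year. In the past, the United States and Japan competed for the lead in global car sales, and at its peak in 2018 Japan sold nearly 30 million vehicles. Meanwhile, signs of oversupply within China are growing: BYD, the largest automaker, has begun cutting prices, and price competition is intensifying. Chinese car manufacturers are turning to exports as a way out.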
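- Americans watched fewer new TV series in 2025
An analysis of Nielsen's latest data shows that in 2025 no new original series made it into the ten most popular streaming shows. This is the first time that has happened since Nielsen began publishing streaming data in 2020. The data also show that free, ad-supported streaming services grew faster than paid ones. YouTube is the most-watched streaming service on US televisions, ahead of Netflix and Amazon combined. Netflix still has the edge in hit series, accounting for about two thirds of Nielsen's weekly top-ten lists of original shows. But its dominance is gradually fading: the company's share of streaming viewing has fallen below 20%. Disney's streaming share has been stagnant for three years, while Amazon is catching up. The most-watched original series of 2025 was the final season of Squid Game, followed by the second season of Wednesday and the latest season of Love Island.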
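- Air pollution from fires is worse than expected
As fires consume the land they continuously release gases and particulate matter into the air, and their contribution to air pollution may have been underestimated. A study reports that, worldwide, the gases emitted by wildfires and prescribed burns may far exceed earlier estimates. Every year large areas of forest, grassland, and peatland burn in wildfires, releasing a complex mixture of water vapor, ash, and carbon-based compounds into the air. The researchers consulted a database of the land area burned by wildland fires in forests, grasslands, and peatlands around the world from 1997 to 2023. They also gathered data on the organic compounds emitted when each vegetation type burns. They estimate that, over the study period, wildland fires released an average of about 143 million tons of organic compounds into the air each year. That figure is 21% higher than previous estimates, which suggests that the air pollution caused by wildland fire emissions may be more serious than previously thought.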