OrangeBot.AI Digest — 2025-12-10
59 headlines across 8 sources, aggregated for this day.
Hacker News (15)
- Getting a Gemini API key is an exercise in frustration (ankursethi.com)
- I got an Nvidia GH200 server for €7.5k on Reddit and converted it to a desktop (dnhkng.github.io)
- Super Mario 64 for the PS1 (github.com)
- Auto-grading decade-old Hacker News discussions with hindsight (karpathy.bearblog.dev)
- Valve: HDMI Forum Continues to Block HDMI 2.1 for Linux (www.heise.de)
- Australia begins enforcing world-first teen social media ban (www.reuters.com)
- DeepSeek uses banned Nvidia chips for AI model, report says (finance.yahoo.com)
- Qwen3-Omni-Flash-2025-12-01: a next-generation native multimodal large model (qwen.ai)
- Size of Life (neal.fun)
- In New York City, congestion pricing leads to marked drop in pollution (e360.yale.edu)
- Israel used Palantir technologies in pager attack in Lebanon (the307.substack.com)
- Map of all the buildings in the world (gizmodo.com)
- Amazon EC2 M9g Instances (aws.amazon.com)
- Cloth Simulation (cloth.mikail-khan.com)
- Stop Breaking TLS (www.markround.com)
GitHub Trending (14)
- thedotmack / claude-mem
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.
- KaijuEngine / kaiju
General purpose 3D and 2D game engine using Go (golang) and Vulkan with built in editor
- agentsmd / agents.md
AGENTS.md — a simple, open format for guiding coding agents
- lfnovo / open-notebook
An Open Source implementation of Notebook LM with more flexibility and features
- dyad-sh / dyad
Free, local, open-source AI app builder ✨ v0 / lovable / Bolt alternative 🌟 Star if you like it!
- datawhalechina / hello-agents
📚 "Building Agents from Scratch" (《从零开始构建智能体》), a tutorial on agent principles and practice, built from the ground up
- microsoft / VibeVoice
Open-Source Frontier Voice AI
- block / goose
an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM
- 666ghj / BettaFish
BettaFish (微舆): a multi-agent public-opinion analysis assistant for everyone. It breaks filter bubbles, reconstructs the full picture of public sentiment, predicts future trends, and supports decision-making. Implemented from scratch, with no framework dependencies.
- cloudflare / vibesdk
An open-source vibe coding platform that helps you build your own vibe-coding platform, built entirely on Cloudflare stack
- microsoft / generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI
- infiniflow / ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
- google / adk-samples
A collection of sample agents built with Agent Development Kit (ADK)
- microsoft / ML-For-Beginners
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Hugging Face (15)
- Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
We present Wan-Move, a simple and scalable framework that brings motion control to video generative models. Existing motion-controllable methods typically suffer from coarse control granularity and limited scalability, leaving their outputs insufficient for practical use. We narrow this gap by achieving precise and high-quality motion control. Our core idea is to directly make the original condition features motion-aware for guiding video synthesis. To this end, we first represent object motions with dense point trajectories, allowing fine-grained control over the scene. We then project these trajectories into latent space and propagate the first frame's features along each trajectory, producing an aligned spatiotemporal feature map that tells how each scene element should move. This feature map serves as the updated latent condition, which is naturally integrated into the off-the-shelf image-to-video model, e.g., Wan-I2V-14B, as motion guidance without any architecture change. It removes the need for auxiliary motion encoders and makes fine-tuning base models easily scalable. Through scaled training, Wan-Move generates 5-second, 480p videos whose motion controllability rivals Kling 1.5 Pro's commercial Motion Brush, as indicated by user studies. To support comprehensive evaluation, we further design MoveBench, a rigorously curated benchmark featuring diverse content categories and hybrid-verified annotations. It is distinguished by larger data volume, longer video durations, and high-quality motion annotations. Extensive experiments on MoveBench and the public dataset consistently show Wan-Move's superior motion quality. Code, models, and benchmark data are made publicly available.
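The core propagation step the abstract describes (copying each tracked point's first-frame feature along its trajectory to form a motion-aware condition map) can be sketched in a few lines. This is an illustrative toy, not the paper's code; the function name, array shapes, and integer-coordinate assumption are all mine:

```python
import numpy as np

def propagate_features(first_frame_feats, trajectories):
    """Build a spatiotemporal condition map by copying each tracked
    point's first-frame feature along its trajectory.

    first_frame_feats: (H, W, C) feature map of frame 0
    trajectories:      (T, N, 2) integer (y, x) positions of N tracked
                       points over T frames
    returns:           (T, H, W, C) condition volume, zero where no
                       tracked point lands
    """
    H, W, C = first_frame_feats.shape
    T, N, _ = trajectories.shape
    cond = np.zeros((T, H, W, C), dtype=first_frame_feats.dtype)
    # Each point carries the feature it had at its frame-0 location.
    y0, x0 = trajectories[0, :, 0], trajectories[0, :, 1]
    point_feats = first_frame_feats[y0, x0]           # (N, C)
    for t in range(T):
        yt, xt = trajectories[t, :, 0], trajectories[t, :, 1]
        cond[t, yt, xt] = point_feats                 # scatter along track
    return cond
```

A real implementation would operate on latent-space features and handle sub-pixel trajectory coordinates; the sketch only shows the scatter-along-track idea.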
- Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Neural rendering, particularly 3D Gaussian Splatting (3DGS), has evolved rapidly and become a key component for building world models. However, existing viewer solutions remain fragmented, heavy, or constrained by legacy pipelines, resulting in high deployment friction and limited support for dynamic content and generative models. In this work, we present Visionary, an open, web-native platform for real-time rendering of meshes and the many variants of Gaussian Splatting. Built on an efficient WebGPU renderer with per-frame ONNX inference, Visionary enables dynamic neural processing while maintaining a lightweight, "click-to-run" browser experience. It introduces a standardized Gaussian Generator contract, which not only supports standard 3DGS rendering but also allows plug-and-play algorithms to generate or update Gaussians each frame; this per-frame inference also enables feedforward generative post-processing. The platform further offers a plug-in three.js library with a concise TypeScript API for seamless integration into existing web applications. Experiments show that, under identical 3DGS assets, Visionary achieves superior rendering efficiency compared to current web viewers thanks to GPU-based primitive sorting. It already supports multiple variants, including MLP-based 3DGS, 4DGS, neural avatars, and style transformation or enhancement networks. By unifying inference and rendering directly in the browser, Visionary significantly lowers the barrier to reproduction, comparison, and deployment of 3DGS-family methods, serving as a unified World Model Carrier for both reconstructive and generative paradigms.
- Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Video face swapping is crucial in film and entertainment production, where achieving high fidelity and temporal consistency over long and complex video sequences remains a significant challenge. Inspired by recent advances in reference-guided image editing, we explore whether rich visual attributes from source videos can be similarly leveraged to enhance both fidelity and temporal coherence in video face swapping. Building on this insight, this work presents LivingSwap, the first video reference guided face swapping model. Our approach employs keyframes as conditioning signals to inject the target identity, enabling flexible and controllable editing. By combining keyframe conditioning with video reference guidance, the model performs temporal stitching to ensure stable identity preservation and high-fidelity reconstruction across long video sequences. To address the scarcity of data for reference-guided training, we construct a paired face-swapping dataset, Face2Face, and further reverse the data pairs to ensure reliable ground-truth supervision. Extensive experiments demonstrate that our method achieves state-of-the-art results, seamlessly integrating the target identity with the source video's expressions, lighting, and motion, while significantly reducing manual effort in production workflows. Project webpage: https://aim-uofa.github.io/LivingSwap
- OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
Storytelling in real-world videos often unfolds through multiple shots -- discontinuous yet semantically connected clips that together convey a coherent narrative. However, existing multi-shot video generation (MSV) methods struggle to effectively model long-range cross-shot context, as they rely on limited temporal windows or single keyframe conditioning, leading to degraded performance under complex narratives. In this work, we propose OneStory, enabling global yet compact cross-shot context modeling for consistent and scalable narrative generation. OneStory reformulates MSV as a next-shot generation task, enabling autoregressive shot synthesis while leveraging pretrained image-to-video (I2V) models for strong visual conditioning. We introduce two key modules: a Frame Selection module that constructs a semantically-relevant global memory based on informative frames from prior shots, and an Adaptive Conditioner that performs importance-guided patchification to generate compact context for direct conditioning. We further curate a high-quality multi-shot dataset with referential captions to mirror real-world storytelling patterns, and design effective training strategies under the next-shot paradigm. Finetuned from a pretrained I2V model on our curated 60K dataset, OneStory achieves state-of-the-art narrative coherence across diverse and complex scenes in both text- and image-conditioned settings, enabling controllable and immersive long-form video storytelling.
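The Frame Selection module's job, pulling the most relevant frames from earlier shots into a global memory, amounts to a similarity-based top-k retrieval. A minimal sketch with hypothetical names, using plain cosine similarity as a stand-in for the paper's semantic-relevance scoring:

```python
import numpy as np

def select_memory_frames(frame_embs, query_emb, k=4):
    """Pick the k prior-shot frames most relevant to the upcoming shot.

    frame_embs: (F, D) embeddings of frames from earlier shots
    query_emb:  (D,)   embedding of the next shot's caption
    returns:    indices of the top-k frames by cosine similarity
    """
    f = frame_embs / np.linalg.norm(frame_embs, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    sims = f @ q                      # cosine similarity per frame
    return np.argsort(-sims)[:k]      # best-first indices
```

The selected frames would then be patchified and compressed by the Adaptive Conditioner before conditioning the next-shot generation; that step is model-specific and not sketched here.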
- ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models
Scaling inference-time computation has enabled Large Language Models (LLMs) to achieve strong reasoning performance, but inherently sequential decoding leads to substantial latency, especially on complex tasks. Recent work on adaptive parallel reasoning aims to improve inference efficiency by decomposing the problem-solving process into concurrent reasoning threads when beneficial. However, existing methods on realistic tasks are either limited to supervised behavior cloning or exhibit significant accuracy drops compared to widely-used sequential long chain-of-thought (CoT) baselines. Moreover, many require customized inference engines, complicating deployment. We introduce ThreadWeaver, a framework for adaptive parallel reasoning that achieves accuracy on par with popular sequential reasoning models of comparable size while significantly reducing inference latency. ThreadWeaver's performance stems from three key innovations: 1) a two-stage parallel trajectory generator that produces large-scale, high-quality CoT data with parallel annotations for supervised fine-tuning; 2) a trie-based training-inference co-design that enables parallel reasoning on any off-the-shelf autoregressive inference engine without modifying position embeddings or KV caches; and 3) a parallelization-aware reinforcement learning framework that teaches the model to balance accuracy with effective parallelization. Across six challenging mathematical reasoning benchmarks, ThreadWeaver trained atop Qwen3-8B achieves accuracy comparable to cutting-edge sequential reasoning models (71.9% on average and 79.9% on AIME24) while delivering up to 1.53x average speedup in token latency, establishing a new Pareto frontier between accuracy and efficiency.
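The trie-based packing that lets ThreadWeaver run parallel threads on a stock autoregressive engine relies on a standard idea: store each shared prefix once. A minimal illustration of the data structure only (not the position-embedding or KV-cache plumbing; names are mine):

```python
def build_prefix_trie(threads):
    """Pack parallel reasoning threads into a trie so shared prefixes
    are stored (and, in the real system, computed) only once.

    threads: list of token-id lists, one per reasoning thread
    returns: nested dict trie; the special key None marks a thread end
    """
    trie = {}
    for seq in threads:
        node = trie
        for tok in seq:
            node = node.setdefault(tok, {})
        node[None] = True
    return trie

def count_nodes(trie):
    """Number of distinct token nodes: shared prefixes counted once."""
    return sum(1 + count_nodes(child)
               for tok, child in trie.items() if tok is not None)
```

For example, threads [1 2 3], [1 2 4], and [1 5] hold 8 tokens naively but only 5 trie nodes; in decoding terms, the work for the shared prefix [1 2] is reused rather than repeated per thread.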
- Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
Video Instance Segmentation (VIS) faces significant annotation challenges due to its dual requirements of pixel-level masks and temporal consistency labels. While recent unsupervised methods like VideoCutLER eliminate optical flow dependencies through synthetic data, they remain constrained by the synthetic-to-real domain gap. We present AutoQ-VIS, a novel unsupervised framework that bridges this gap through quality-guided self-training. Our approach establishes a closed-loop system between pseudo-label generation and automatic quality assessment, enabling progressive adaptation from synthetic to real videos. Experiments demonstrate state-of-the-art performance with 52.6 AP_{50} on YouTubeVIS-2019 val set, surpassing the previous state-of-the-art VideoCutLER by 4.4%, while requiring no human annotations. This demonstrates the viability of quality-aware self-training for unsupervised VIS. We will release the code at https://github.com/wcbup/AutoQ-VIS.
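The closed loop the abstract describes (pseudo-label real videos, automatically score label quality, keep only trusted labels, retrain) is a generic self-training pattern. A schematic sketch under assumed callables; the threshold and function signatures are placeholders, not AutoQ-VIS internals:

```python
def self_train_round(segment, assess_quality, videos, threshold=0.7):
    """One round of quality-guided self-training: pseudo-label real
    videos, keep only the labels the quality model trusts, and return
    them as the next fine-tuning set.

    segment:        fn(video) -> pseudo-label (e.g. instance masks)
    assess_quality: fn(video, label) -> score in [0, 1]
    """
    kept = []
    for v in videos:
        label = segment(v)
        if assess_quality(v, label) >= threshold:
            kept.append((v, label))   # trusted pseudo-labeled pair
    return kept
```

Iterating this round while raising model quality is what drives the progressive synthetic-to-real adaptation described above.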
- MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Embodied imitation learning is constrained by the scarcity of diverse, long-horizon robotic manipulation data. Existing video generation models for this domain are limited to synthesizing short clips of simple actions and often rely on manually defined trajectories. To this end, we introduce MIND-V, a hierarchical framework designed to synthesize physically plausible and logically coherent videos of long-horizon robotic manipulation. Inspired by cognitive science, MIND-V bridges high-level reasoning with pixel-level synthesis through three core components: a Semantic Reasoning Hub (SRH) that leverages a pre-trained vision-language model for task planning; a Behavioral Semantic Bridge (BSB) that translates abstract instructions into domain-invariant representations; and a Motor Video Generator (MVG) for conditional video rendering. MIND-V employs Staged Visual Future Rollouts, a test-time optimization strategy to enhance long-horizon robustness. To align the generated videos with physical laws, we introduce a GRPO reinforcement learning post-training phase guided by a novel Physical Foresight Coherence (PFC) reward. PFC leverages the V-JEPA world model to enforce physical plausibility by aligning the predicted and actual dynamic evolutions in the feature space. MIND-V demonstrates state-of-the-art performance in long-horizon robotic manipulation video generation, establishing a scalable and controllable paradigm for embodied data synthesis.
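The Physical Foresight Coherence reward compares predicted and actual dynamic evolution in feature space. A toy stand-in using cosine similarity of per-step feature deltas (the real reward is computed with V-JEPA world-model features; everything here is an assumed simplification):

```python
import numpy as np

def pfc_reward(pred_feats, actual_feats, eps=1e-8):
    """Toy physical-foresight coherence score: cosine similarity
    between predicted and actual per-step feature changes, averaged
    over the rollout.

    pred_feats, actual_feats: (T, D) feature sequences
    returns: scalar in [-1, 1]; 1 means the predicted dynamics match
    """
    dp = np.diff(pred_feats, axis=0)      # predicted step-to-step change
    da = np.diff(actual_feats, axis=0)    # actual step-to-step change
    num = (dp * da).sum(axis=1)
    den = np.linalg.norm(dp, axis=1) * np.linalg.norm(da, axis=1) + eps
    return float((num / den).mean())
```

In the RL post-training phase, a score like this would be maximized by GRPO to push generated videos toward physically plausible motion.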
- Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
Modern Large Language Models achieve impressive reasoning capabilities with long Chain of Thoughts, but they incur substantial computational cost during inference, and this motivates techniques to improve the performance-cost ratio. Among these techniques, Speculative Decoding accelerates inference by employing a fast but inaccurate draft model to autoregressively propose tokens, which are then verified in parallel by a more capable target model. However, due to unnecessary rejections caused by token mismatches in semantically equivalent steps, traditional token-level Speculative Decoding struggles in reasoning tasks. Although recent works have shifted to step-level semantic verification, which improves efficiency by accepting or rejecting entire reasoning steps, existing step-level methods still regenerate many rejected steps with little improvement, wasting valuable target compute. To address this challenge, we propose Arbitrage, a novel step-level speculative generation framework that routes generation dynamically based on the relative advantage between draft and target models. Instead of applying a fixed acceptance threshold, Arbitrage uses a lightweight router trained to predict when the target model is likely to produce a meaningfully better step. This routing approximates an ideal Arbitrage Oracle that always chooses the higher-quality step, achieving near-optimal efficiency-accuracy trade-offs. Across multiple mathematical reasoning benchmarks, Arbitrage consistently surpasses prior step-level Speculative Decoding baselines, reducing inference latency by up to ~2× at matched accuracy.
- DeepCode: Open Agentic Coding
Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still face significant challenges in achieving high-fidelity document-to-codebase synthesis--such as scientific papers to code--primarily due to a fundamental conflict between information overload and the context bottlenecks of LLMs. In this work, we introduce DeepCode, a fully autonomous framework that fundamentally addresses this challenge through principled information-flow management. By treating repository synthesis as a channel optimization problem, DeepCode seamlessly orchestrates four information operations to maximize task-relevant signals under finite context budgets: source compression via blueprint distillation, structured indexing using stateful code memory, conditional knowledge injection via retrieval-augmented generation, and closed-loop error correction. Extensive evaluations on the PaperBench benchmark demonstrate that DeepCode achieves state-of-the-art performance, decisively outperforming leading commercial agents such as Cursor and Claude Code, and crucially, surpassing PhD-level human experts from top institutes on key reproduction metrics. By systematically transforming paper specifications into production-grade implementations comparable to human expert quality, this work establishes new foundations for autonomous scientific reproduction that can accelerate research evaluation and discovery.
- See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models
Multimodal large language models (MLLMs) are expected to jointly interpret vision, audio, and language, yet existing video benchmarks rarely assess fine-grained reasoning about human speech. Many tasks remain visually solvable or only coarsely evaluate speech, offering limited insight into whether models can align who speaks, what is said, and when it occurs. We introduce AV-SpeakerBench, a curated benchmark of 3,212 multiple-choice questions focused on speaker-centric audiovisual reasoning in real-world videos. It features: (1) a speaker-centered formulation that treats speakers-not scenes-as the core reasoning unit; (2) fusion-grounded question design embedding audiovisual dependencies into question semantics; and (3) expert-curated annotations ensuring temporal precision and cross-modal validity. Comprehensive evaluations show that the Gemini family consistently outperforms open-source systems, with Gemini 2.5 Pro achieving the best results. Among open models, Qwen3-Omni-30B approaches Gemini 2.0 Flash but remains far behind Gemini 2.5 Pro, primarily due to weaker audiovisual fusion rather than visual perception. We believe AV-SpeakerBench establishes a rigorous foundation for advancing fine-grained audiovisual reasoning in future multimodal systems.
- TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models
Reinforcement learning (RL) post-training is crucial for aligning generative models with human preferences, but its prohibitive computational cost remains a major barrier to widespread adoption. We introduce TreeGRPO, a novel RL framework that dramatically improves training efficiency by recasting the denoising process as a search tree. From shared initial noise samples, TreeGRPO strategically branches to generate multiple candidate trajectories while efficiently reusing their common prefixes. This tree-structured approach delivers three key advantages: (1) high sample efficiency, achieving better performance with the same number of training samples; (2) fine-grained credit assignment via reward backpropagation that computes step-specific advantages, overcoming the uniform credit assignment limitation of trajectory-based methods; and (3) amortized computation, where multi-child branching enables multiple policy updates per forward pass. Extensive experiments on both diffusion and flow-based models demonstrate that TreeGRPO achieves 2.4× faster training while establishing a superior Pareto frontier in the efficiency-reward trade-off space. Our method consistently outperforms GRPO baselines across multiple benchmarks and reward models, providing a scalable and effective pathway for RL-based visual generative model alignment. The project website is available at treegrpo.github.io.
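The reward backpropagation that gives TreeGRPO step-specific credit can be illustrated on a toy tree: each internal node's value is the mean terminal reward of the leaves beneath it, and (GRPO-style) a node's advantage is its value relative to its siblings. A minimal sketch, with names and structure assumed rather than taken from the paper:

```python
def backprop_values(tree, rewards):
    """Propagate terminal rewards up a denoising tree.

    tree:    dict node -> list of children ([] for leaves)
    rewards: dict leaf -> scalar terminal reward
    returns: dict node -> value (mean reward of the leaves below it)
    """
    values = {}
    def value(n):
        if n in values:
            return values[n]
        kids = tree[n]
        # Leaves take their terminal reward; internal nodes average.
        v = rewards[n] if not kids else sum(value(c) for c in kids) / len(kids)
        values[n] = v
        return v
    for n in tree:
        value(n)
    return values
```

With these values in hand, a branch's advantage at any depth is its value minus the mean of its siblings' values, which is what makes per-step (rather than uniform per-trajectory) policy updates possible.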
- From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs
Large language models (LLMs) excel at generation but dominant autoregressive (AR) decoding is inherently sequential, creating a throughput bottleneck. Diffusion Language Models (DLMs)--especially block-wise variants--enable parallel generation and intra-block bidirectional reasoning, yet training large DLMs from scratch is costly and wastes the knowledge in mature AR checkpoints. Prior "adaptation" attempts either modify logits or randomly grow attention masks to full-sequence diffusion, or simply transplant AR weights into a block-diffusion recipe, leaving a fundamental mismatch between AR causality and block-wise bidirectionality unaddressed. We reframe adaptation as an intra-paradigm path from AR to Block-Diffusion by viewing AR as Block-Diffusion with blocksize=1. Concretely, we design the pathway of adaptation as follows: we use a context-causal attention mask (causal in the context, bidirectional only within the active block), an efficient parallel adaptation procedure, an auxiliary AR loss to maximize data utilization and retain pretrained knowledge, and a gradual increase of the generation block size. The recipe integrates cleanly with masked block diffusion and maintains train-inference consistency. Built on these components, NBDiff-7B (Base and Instruct) inherits long-context modeling and reasoning capabilities and achieves state-of-the-art performance among 7B-class DLMs, delivering strong gains on general-knowledge, math, and code benchmarks over strong baselines. These results demonstrate that principled AR-to-block-diffusion adaptation is an effective and compute-efficient alternative to training DLMs from scratch. Codes: https://github.com/YuchuanTian/NBDiff.
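The context-causal attention mask at the heart of the adaptation recipe (causal over the context, bidirectional only within the active block) is easy to construct explicitly. A small NumPy sketch of that mask, with my own function name; note that with block size 1 it collapses to the ordinary causal mask, matching the paper's view of AR as block diffusion with blocksize=1:

```python
import numpy as np

def context_causal_mask(ctx_len, block_len):
    """Boolean attention mask for block diffusion adapted from an AR
    model: causal over the context tokens, fully bidirectional within
    the active block, and the block also attends to the whole context.

    returns: (L, L) matrix, True = attention allowed,
             where L = ctx_len + block_len
    """
    L = ctx_len + block_len
    allow = np.tril(np.ones((L, L), dtype=bool))  # causal baseline
    allow[ctx_len:, :] = True                     # block rows: see everything
    return allow
```

The context rows stay lower-triangular, so cached context computation from the AR checkpoint remains valid, while the active block gets the intra-block bidirectionality that diffusion decoding needs.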
- Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation
While recent large vision-language models (VLMs) have improved generalization in vision-language navigation (VLN), existing methods typically rely on end-to-end pipelines that map vision-language inputs directly to short-horizon discrete actions. Such designs often produce fragmented motions, incur high latency, and struggle with real-world challenges like dynamic obstacle avoidance. We propose DualVLN, the first dual-system VLN foundation model that synergistically integrates high-level reasoning with low-level action execution. System 2, a VLM-based global planner, "grounds slowly" by predicting mid-term waypoint goals via image-grounded reasoning. System 1, a lightweight, multi-modal conditioning Diffusion Transformer policy, "moves fast" by leveraging both explicit pixel goals and latent features from System 2 to generate smooth and accurate trajectories. The dual-system design enables robust real-time control and adaptive local decision-making in complex, dynamic environments. By decoupling training, the VLM retains its generalization, while System 1 achieves interpretable and effective local navigation. DualVLN outperforms prior methods across all VLN benchmarks and real-world experiments demonstrate robust long-horizon planning and real-time adaptability in dynamic environments.
- Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images
Brain tumors pose a significant threat to human life, so detecting them accurately at an early stage is essential for better diagnosis and treatment. Radiologists can detect brain tumors manually from patients' MRI scan images. However, the incidence of brain tumors among children and adolescents has risen in recent years, producing a substantial volume of data and making manual detection time-consuming and difficult. With the emergence of artificial intelligence and its vast application in the medical field, a CAD (Computer-Aided Diagnosis) system can be built for the automatic early detection of brain tumors. Existing models for this task are not fully generalized and perform poorly on validation data. We therefore propose two novel deep learning architectures: (a) SAETCN (Self-Attention Enhancement Tumor Classification Network) for the classification of different kinds of brain tumors. It achieves 99.38% accuracy on the validation dataset, making it one of the few novel deep-learning architectures capable of detecting brain tumors accurately; the model is trained on a dataset containing images of three tumor types (glioma, meningioma, and pituitary tumors) as well as non-tumor cases. (b) SAS-Net (Self-Attentive Segmentation Network) for the accurate segmentation of brain tumors, which achieves an overall pixel accuracy of 99.23%.
- Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
Understanding and reconstructing the complex geometry and motion of dynamic scenes from video remains a formidable challenge in computer vision. This paper introduces D4RT, a simple yet powerful feedforward model designed to efficiently solve this task. D4RT utilizes a unified transformer architecture to jointly infer depth, spatio-temporal correspondence, and full camera parameters from a single video. Its core innovation is a novel querying mechanism that sidesteps the heavy computation of dense, per-frame decoding and the complexity of managing multiple, task-specific decoders. Our decoding interface allows the model to independently and flexibly probe the 3D position of any point in space and time. The result is a lightweight and highly scalable method that enables remarkably efficient training and inference. We demonstrate that our approach sets a new state of the art, outperforming previous methods across a wide spectrum of 4D reconstruction tasks. We refer to the project webpage for animated results: https://d4rt-paper.github.io/.
Solidot (15)
- The U.S. State Department Brings Back Times New Roman
In 2023, Biden's Secretary of State Antony Blinken directed all embassies to retire Times New Roman in favor of Calibri. Times New Roman is a serif typeface born in the 1930s; Calibri is a sans-serif typeface introduced in 2004. The change was made not for aesthetics but for accessibility: sans-serif fonts are easier to read, especially on screens, and are more convenient for people who use optical character recognition and text-to-speech tools. Under President Trump many U.S. policies are moving backward, and the latest reversal is the return of Times New Roman. Trump's Secretary of State Marco Rubio has notified embassies to abandon Calibri, calling the earlier switch a wasteful DEIA (diversity, equity, inclusion, and accessibility) measure and claiming that restoring Times New Roman helps restore decorum and professionalism to the State Department's written work.
- Webb Finds the Most Distant Supernova to Date
Using the Webb telescope, astronomers followed up on gamma-ray burst GRB 250314A, detected by several telescopes this March, and confirmed that it was emitted by a supernova that exploded roughly 730 million years after the universe's birth. It is the earliest-exploding supernova ever observed, pushing the supernova record back from about 1.8 billion years to 730 million years after the Big Bang. Webb's near-infrared images let the team identify the afterglow of the massive star's collapse and explosion, and, for the first time, observe the extremely distant host galaxy, which appears as only a faint red speck in the images. The result shows that gamma-ray bursts and supernova afterglows can now be used to pick out, one by one, stars and galaxies that had already formed when the universe was only about five percent of its present age, opening a new window on the early cosmos. To the astronomers' surprise, this distant supernova's optical and infrared properties closely resemble those of modern supernovae. Stars in the universe's first billion years were generally expected to have lower metallicity, greater mass, and shorter lifetimes, and to sit in the still rather opaque epoch of reionization, so their explosions or spectra might well have differed; yet in this case at least, the early-universe supernova looks much like those observed in stellar systems today.
- Rare Diseases Gain a Treatment
CRISPR gene editing, discovered a decade ago, is gradually being applied to treat disease. In 2023 scientists used CRISPR to treat sickle cell disease. There are roughly 8 million sickle cell patients worldwide, most of whom carry the same mutation. Rare diseases are different: they involve many distinct mutations, and no company will develop a therapy for a mutation carried by only 50 people. CRISPR is starting to change that. If one or a few classes of rare diseases can share a single CRISPR platform, with modules fine-tuned to customize the therapy for each patient, rare diseases could be treated faster and more cheaply. An infant named KJ Muldoon has a rare genetic urea cycle disorder; patients typically have only about a fifty percent chance of surviving infancy. This February, at six months old, Muldoon received a custom CRISPR therapy to repair his specific mutation, and he is now a healthy one-year-old boy. His treatment demonstrates that bespoke gene-editing therapies work and can be put into use relatively quickly and safely.
- Ten Years of Let's Encrypt
The Let's Encrypt project has looked back on its first decade. The project aims to bring HTTPS encryption to every website; the free certificates it issues encrypt connections between devices and the internet, ensuring that no one can intercept and steal data in transit. Millions of websites worldwide now rely on Let's Encrypt for security, and it is today the world's largest certificate-issuing CA. It was announced in November 2014 and issued its first certificate on September 14, 2015; its one millionth certificate in March 2016; its 100 millionth in June 2017; and its one billionth in February 2020. It first issued a million certificates in a single day in September 2018, and ten million in a single day in September 2025.
- Rust Is No Longer Experimental in the Linux Kernel
The annual Linux kernel Maintainers Summit discussed the experimental status of Rust in the kernel. Attendees agreed that Rust is no longer experimental: it is now a core part of the kernel and is here to stay, so the experimental label will be removed.
- Apple's Slow Pace on AI Becomes an Advantage as the Market Shifts
Unlike fellow tech giants Microsoft, Google, Amazon, and Meta, Apple has been deeply conservative during the AI boom, declining to treat AI as a sky-is-falling matter requiring immediate action. Early this year Apple drew heavy criticism for its lack of an AI strategy, reflected in a plunging share price: in the first half of 2025 it was the second-worst performer among the seven tech giants, down 18%. But as doubts about the enormous AI investments mounted, the situation reversed. Apple's stock surged 35%, while former darlings Meta and Microsoft tumbled and even Nvidia underperformed Apple. The S&P 500 rose 10% over the same period, and the tech-heavy Nasdaq 100 rose 13%. Apple's market capitalization now stands at $4.1 trillion, surpassing Microsoft and closing in on Nvidia. Wealth managers describe Apple's stock as, in a sense, an anti-AI play.
- Australia's Ban on Teen Social Media Use Takes Effect
Australia's ban on social media use by those under 16 has formally taken effect. Paloma, a 12-year-old from Sydney, voiced her sadness in an interview, saying she had made friends from different countries through apps such as Snapchat and TikTok, that everyone she knows is angry about the ban, and that by prohibiting social media the government has stripped away part of their rights. Fifteen-year-olds Noah Jones and Macy Neyland, backed by a rights group, have sued in Australia's High Court, arguing that the ban deprives them of their right to communicate freely. Communications Minister Anika Wells said she will not yield or be intimidated by legal battles: "We will stand firm on behalf of Australian parents."
- 2025 Will Be the Second- or Third-Hottest Year on Record
The EU's Copernicus Climate Change Service (C3S) reported on Tuesday that this year is expected to be the second- or third-hottest on record, likely trailing only 2024 and tied with 2023 for second place. The report says November 2025 was the third-warmest November on record, 0.20°C below the warmest November (2023) and 0.08°C below the second-warmest (2024). Global temperatures in November were 1.54°C above pre-industrial levels, and the 2023-2025 three-year average is on track to exceed 1.5°C for the first time.
- When a Video Codec Wins an Emmy
The 2025 Technology & Engineering Emmy Awards honored the AV1 codec for its impact on global video content delivery. In the mid-2010s, video codecs were the web's invisible tax, built on closed licensing regimes that were expensive and unpredictable. Most web video at the time used the H.264 codec, and open-source browsers such as Firefox relied on Cisco's OpenH264 module to avoid paying licensing fees. As demand for web video grew, a new generation of codecs was urgently needed to deliver faster, more reliable high-quality streaming. To avoid fragmenting the ecosystem, Mozilla and others jointly founded the Alliance for Open Media (AOM) in 2015 to develop the next-generation video codec AV1, which merged technology from Google's VP9, Mozilla's Daala, and Cisco's Thor. AV1 was released in 2018, and its successor AV2 is about to be released.
- Microsoft Excel Turns 40
The 40-year-old spreadsheet Microsoft Excel has come through the rise of cloud computing and today's AI boom almost unscathed. It traces back to 1983, under the development codename Odyssey, when the engineers' goal was to clone Lotus 1-2-3, itself a clone of the first spreadsheet, VisiCalc. VisiCalc was written by Dan Bricklin for the Apple II, and he never patented it; had he done so, he says, MIT would probably have a Bricklin Building rather than a Gates Building. Excel currently has 500 million paying users. The market is full of competitors, but none poses a genuine threat to Microsoft Excel. Its newest challenge comes from AI chatbots, yet venture capitalists say nearly all AI spreadsheet startups are building products on top of Excel rather than trying to replace it.
- Surging Cancer Rates Spark Debate over Early Detection
Since 1992, diagnosis rates for eight cancers among Americans under 50 have doubled. The American Association for Cancer Research says it will hold a special meeting this week to discuss rising cancer rates among younger people. Some experts say it is urgent to identify the cause of the trend. Others argue there is no need to worry: many cancers are being detected too early and would never have been fatal. It has been known for decades that not all cancers are dangerous; some disappear on their own, and some stop growing or pose no risk at all, causing no symptoms and never spreading. The problem is that it is impossible to know whether a given person's cancer will be deadly. Dr. H. Gilbert Welch of Harvard Medical School argues that one way to judge whether the rise in diagnoses is a false alarm or a genuine danger signal is to see whether deaths are rising at the same time: if incidence soars while mortality holds steady, many patients did not actually need the diagnosis. The rise in U.S. diagnosis rates for the eight cancers has not been accompanied by an increase in deaths; of the eight, only colorectal and endometrial cancer show slightly higher mortality, and endometrial cancer is thought to be linked to the obesity epidemic. Dr. Cary Gross of Yale believes the rising diagnosis rates may reflect the improved sensitivity and more frequent use of detection tools such as CT, ultrasound, and MRI.
- Cryptocurrency Helps Criminals Launder Money and Evade Sanctions
Smugglers, money launderers, and people facing sanctions used to hide illicit wealth in luxury goods such as diamonds, gold, and art, which are awkward to move and to spend. Today's stablecoins let criminals launder money and evade sanctions with ease. Stablecoins are cryptocurrencies pegged to the U.S. dollar. A report published in February by blockchain analytics firm Chainalysis estimated that illicit transactions involving stablecoins reached $25 billion last year. The rise of stablecoins endangers sanctions, America's most powerful foreign-policy tool. Ari Redbord, head of policy at blockchain data firm TRM Labs, said that when criminals can move millions of dollars in a few mouse clicks, economic penalties such as sanctions lose much of their force. For decades the U.S. Treasury has relied on banks and credit-card companies to combat illicit finance through compliance measures, a system stablecoins bypass entirely.
- RMS on ChatGPT
Richard Stallman (RMS), who worked for many years at the MIT AI Lab, argues that ChatGPT has no intelligence and should not be called AI. By his definition, intelligence means knowing, understanding, or mastering knowledge in at least some domain. ChatGPT neither knows nor understands anything, and therefore is not intelligent: it does not know what its own output means, nor that words can describe the world. He calls ChatGPT a bullshit generator, producing output with no regard whatsoever for whether it is true, and says other generative AI systems suffer from the same problem. People, he says, should not trust systems that mechanically juggle words without genuinely understanding what they mean. RMS adds that ChatGPT is proprietary software running on cloud servers, and therefore also endangers users' computing freedom.
- EU Opens Antitrust Investigation into Google's AI
The EU announced on Tuesday an investigation into Google. The probe will assess whether Google violated antitrust rules by using content that media outlets and other publishers post online to train and power AI services without appropriate compensation. The European Commission said the investigation will examine whether Google has distorted competition by imposing unfair terms on publishers and content creators, or by giving itself preferential access to their content. EU competition chief Teresa Ribera said that a free and democratic society depends on plural media, open access to information, and a vibrant creative environment. AI is driving remarkable innovation and bringing many benefits to people and businesses across Europe, she said, but progress cannot come at the expense of society's core principles.
- Sleep Deprivation Linked to Reduced Life Expectancy
According to a study published in the journal SLEEP Advances, insufficient sleep is associated with reduced life expectancy. The study, which focused on the United States, found that sleep's effect on life expectancy is second only to smoking's, exceeding factors such as diet, exercise, and loneliness. Lead author Dr. Andrew McHill, an associate professor at the OHSU School of Nursing, said the research underscores the importance of getting a full seven to nine hours of sleep a day. The study did not dig into why sleep deprivation shortens life expectancy, but McHill noted that sleep affects cardiovascular health, the immune system, and brain function. The findings, he said, show we should value sleep as much as we value diet and exercise: good sleep not only improves mental state but can also extend life.