OrangeBot.AI Digest — 2025-11-01

60 headlines across 4 sources, aggregated for the day.

Hacker News (15)

  1. Claude Code Can Debug Low-Level Cryptography (words.filippo.io)
  2. Visible from space, Sudan's bloodied sands expose a massacre of thousands (www.telegraph.co.uk)
  3. Show HN: Why write code if the LLM can just do the thing? (web app experiment) (github.com)
  4. OpenAI Moves to Complete Potentially the Largest Theft in Human History (thezvi.substack.com)
  5. Chat Control proposal fails again after public opposition (andreafortuna.org)
  6. Studies increasingly find links between air pollutants and dementia (www.nytimes.com)
  7. GHC now runs in the browser (discourse.haskell.org)
  8. Tech companies are firing everyone to "fund AI", spending money on each other (old.reddit.com)
  9. I built my own CityMapper (asherfalcon.com)
  10. Updated practice for review articles and position papers in ArXiv CS category (blog.arxiv.org)
  11. CharlotteOS – An Experimental Modern Operating System (github.com)
  12. You can't refuse to be scanned by ICE's facial recognition app, DHS document says (www.404media.co)
  13. SQLite concurrency and why you should care about it (jellyfin.org)
  14. Do you know that there is an HTML tables API? (christianheilmann.com)
  15. Myths Programmers Believe about CPU Caches (2018) (software.rajivprab.com)

GitHub Trending (15)

  1. get-convex / chef

    The only AI app builder that knows backend

  2. suitenumerique / docs

    A collaborative note taking, wiki and documentation platform that scales. Built with Django and React.

  3. Tencent / WeKnora

    LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.

  4. janhq / jan

    Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

  5. microsoft / Web-Dev-For-Beginners

    24 Lessons, 12 Weeks, Get Started as a Web Developer

  6. hacksider / Deep-Live-Cam

    real time face swap and one-click video deepfake with only a single image

  7. juspay / hyperswitch

    An open source payments switch written in Rust to make payments fast, reliable and affordable

  8. nvm-sh / nvm

    Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions

  9. github / copilot-cli

    GitHub Copilot CLI brings the power of Copilot coding agent directly to your terminal.

  10. YunaiV / ruoyi-vue-pro

    🔥 Officially recommended 🔥 The all-new Pro edition of RuoYi-Vue, with every feature optimized and refactored. A back-office management system plus WeChat mini-program built on Spring Boot + MyBatis Plus + Vue & Element, supporting RBAC dynamic permissions, data permissions, SaaS multi-tenancy, Flowable workflows, third-party login, payments, SMS, e-commerce, CRM, ERP, large AI models, and more. Your ⭐️ Star ⭐️ is what keeps the author going!

  11. hanxi / xiaomusic

    Play music through a Xiaomi XiaoAI smart speaker, with tracks downloaded via yt-dlp.

  12. 666ghj / BettaFish

    WeiYu (微舆): a multi-agent public-opinion analysis assistant anyone can use. It breaks information cocoons, reconstructs the full picture of public sentiment, forecasts future trends, and supports decision-making. Built from scratch, with no framework dependencies.

  13. pathwaycom / llm-app

    Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.

  14. lingodotdev / lingo.dev

    ⚡ Lingo.dev - open-source, AI-powered i18n toolkit for instant localization with LLMs. Bring your own LLM or use Lingo.dev Localization Engine. Join discord:

  15. ossu / computer-science

    🎓 Path to a free self-taught education in Computer Science!

Hugging Face (15)

  1. The End of Manual Decoding: Towards Truly End-to-End Language Models

    The "end-to-end" label for LLMs is a misnomer. In practice, they depend on a non-differentiable decoding process that requires laborious, hand-tuning of hyperparameters like temperature and top-p. This paper introduces AutoDeco, a novel architecture that enables truly "end-to-end" generation by learning to control its own decoding strategy. We augment the standard transformer with lightweight heads that, at each step, dynamically predict context-specific temperature and top-p values alongside the next-token logits. This approach transforms decoding into a parametric, token-level process, allowing the model to self-regulate its sampling strategy within a single forward pass. Through extensive experiments on eight benchmarks, we demonstrate that AutoDeco not only significantly outperforms default decoding strategies but also achieves performance comparable to an oracle-tuned baseline derived from "hacking the test set"-a practical upper bound for any static method. Crucially, we uncover an emergent capability for instruction-based decoding control: the model learns to interpret natural language commands (e.g., "generate with low randomness") and adjusts its predicted temperature and top-p on a token-by-token basis, opening a new paradigm for steerable and interactive LLM decoding.

  2. Emu3.5: Native Multimodal Models are World Learners

    We introduce Emu3.5, a large-scale multimodal world model that natively predicts the next state across vision and language. Emu3.5 is pre-trained end-to-end with a unified next-token prediction objective on a corpus of vision-language interleaved data containing over 10 trillion tokens, primarily derived from sequential frames and transcripts of internet videos. The model naturally accepts interleaved vision-language inputs and generates interleaved vision-language outputs. Emu3.5 is further post-trained with large-scale reinforcement learning to enhance multimodal reasoning and generation. To improve inference efficiency, we propose Discrete Diffusion Adaptation (DiDA), which converts token-by-token decoding into bidirectional parallel prediction, accelerating per-image inference by about 20x without sacrificing performance. Emu3.5 exhibits strong native multimodal capabilities, including long-horizon vision-language generation, any-to-image (X2I) generation, and complex text-rich image generation. It also exhibits generalizable world-modeling abilities, enabling spatiotemporally consistent world exploration and open-world embodied manipulation across diverse scenarios and tasks. For comparison, Emu3.5 achieves performance comparable to Gemini 2.5 Flash Image (Nano Banana) on image generation and editing tasks and demonstrates superior results on a suite of interleaved generation tasks. We open-source Emu3.5 at https://github.com/baaivision/Emu3.5 to support community research.
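
    A toy sketch of the parallel-decoding idea behind DiDA as the abstract describes it (predict every masked position at once, commit the most confident, repeat); the confidence rule, step schedule, and stand-in model are our assumptions, not the released implementation:

    ```python
    import torch

    def parallel_decode(model, length: int, vocab: int, steps: int = 4):
        MASK = vocab                                # extra mask token id
        tokens = torch.full((length,), MASK)
        for step in range(steps):
            logits = model(tokens)                  # (length, vocab), all slots at once
            conf, pred = logits.softmax(-1).max(-1)
            still_masked = tokens == MASK
            k = max(1, int(still_masked.sum()) // (steps - step))
            # Unmask the k most confident still-masked positions this round.
            idx = torch.where(still_masked, conf, torch.tensor(-1.0)).topk(k).indices
            tokens[idx] = pred[idx]
        return tokens

    toy = lambda t: torch.randn(t.shape[0], 100)    # stand-in bidirectional predictor
    print(parallel_decode(toy, length=16, vocab=100))
    ```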

  3. Kimi Linear: An Expressive, Efficient Attention Architecture

    We introduce Kimi Linear, a hybrid linear attention architecture that, for the first time, outperforms full attention under fair comparisons across various scenarios -- including short-context, long-context, and reinforcement learning (RL) scaling regimes. At its core lies Kimi Delta Attention (KDA), an expressive linear attention module that extends Gated DeltaNet with a finer-grained gating mechanism, enabling more effective use of limited finite-state RNN memory. Our bespoke chunkwise algorithm achieves high hardware efficiency through a specialized variant of the Diagonal-Plus-Low-Rank (DPLR) transition matrices, which substantially reduces computation compared to the general DPLR formulation while remaining more consistent with the classical delta rule. We pretrain a Kimi Linear model with 3B activated parameters and 48B total parameters, based on a layerwise hybrid of KDA and Multi-Head Latent Attention (MLA). Our experiments show that with an identical training recipe, Kimi Linear outperforms full MLA by a sizeable margin across all evaluated tasks, while reducing KV cache usage by up to 75% and achieving up to 6x the decoding throughput for a 1M context. These results demonstrate that Kimi Linear can be a drop-in replacement for full attention architectures with superior performance and efficiency, including tasks with longer input and output lengths. To support further research, we open-source the KDA kernel and vLLM implementations, and release the pre-trained and instruction-tuned model checkpoints.
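
    A toy, unoptimized recurrence in the spirit of the gated delta rule the abstract builds on; the per-channel gate `alpha` and write strength `beta` are illustrative assumptions, and the released kernel uses a chunkwise DPLR formulation rather than this sequential loop:

    ```python
    import torch

    def gated_delta_rule(q, k, v, alpha, beta):
        """q, k: (T, d_k); v: (T, d_v); alpha: (T, d_k) per-channel decay in (0, 1);
        beta: (T,) write strength in (0, 1). Returns outputs of shape (T, d_v)."""
        T, d_k = k.shape
        d_v = v.shape[1]
        S = torch.zeros(d_k, d_v)                       # fast-weight memory state
        outs = []
        for t in range(T):
            S = alpha[t].unsqueeze(1) * S               # finer-grained gated forgetting
            err = v[t] - S.t() @ k[t]                   # delta-rule prediction error
            S = S + beta[t] * torch.outer(k[t], err)    # write only the "surprise"
            outs.append(S.t() @ q[t])                   # read with the query
        return torch.stack(outs)

    T, d_k, d_v = 8, 16, 16
    q, k, v = torch.randn(T, d_k), torch.randn(T, d_k), torch.randn(T, d_v)
    alpha = torch.sigmoid(torch.randn(T, d_k))          # learned in the real model
    beta = torch.sigmoid(torch.randn(T))
    print(gated_delta_rule(q, k, v, alpha, beta).shape)  # torch.Size([8, 16])
    ```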

  4. Can Agent Conquer Web? Exploring the Frontiers of ChatGPT Atlas Agent in Web Games

    OpenAI's ChatGPT Atlas introduces new capabilities for web interaction, enabling the model to analyze webpages, process user intents, and execute cursor and keyboard inputs directly within the browser. While its capacity for information retrieval tasks has been demonstrated, its performance in dynamic, interactive environments remains less explored. In this study, we conduct an early evaluation of Atlas's web interaction capabilities using browser-based games as test scenarios, including Google's T-Rex Runner, Sudoku, Flappy Bird, and Stein.world. We employ in-game performance scores as quantitative metrics to assess performance across different task types. Our results show that Atlas performs strongly in logical reasoning tasks like Sudoku, completing puzzles significantly faster than human baselines, but struggles substantially in real-time games requiring precise timing and motor control, often failing to progress beyond initial obstacles. These findings suggest that while Atlas demonstrates capable analytical processing, there remain notable limitations in dynamic web environments requiring real-time interaction. The website of our project can be found at https://atlas-game-eval.github.io.

  5. Exploring Conditions for Diffusion models in Robotic Control

    While pre-trained visual representations have significantly advanced imitation learning, they are often task-agnostic as they remain frozen during policy learning. In this work, we explore leveraging pre-trained text-to-image diffusion models to obtain task-adaptive visual representations for robotic control, without fine-tuning the model itself. However, we find that naively applying textual conditions (a successful strategy in other vision domains) yields minimal or even negative gains in control tasks. We attribute this to the domain gap between the diffusion model's training data and robotic control environments, leading us to argue for conditions that consider the specific, dynamic visual information required for control. To this end, we propose ORCA, which introduces learnable task prompts that adapt to the control environment and visual prompts that capture fine-grained, frame-specific details. By facilitating task-adaptive representations with these newly devised conditions, our approach achieves state-of-the-art performance on various robotic control benchmarks, significantly surpassing prior methods.

  6. AMO-Bench: Large Language Models Still Struggle in High School Math Competitions

    We present AMO-Bench, an Advanced Mathematical reasoning benchmark of Olympiad-level or higher difficulty, comprising 50 human-crafted problems. Existing benchmarks have widely leveraged high school math competitions to evaluate the mathematical reasoning capabilities of large language models (LLMs). However, many existing math competitions are becoming less effective for assessing top-tier LLMs due to performance saturation (e.g., AIME24/25). To address this, AMO-Bench introduces more rigorous challenges by ensuring all 50 problems are (1) cross-validated by experts to meet at least International Mathematical Olympiad (IMO) difficulty standards, and (2) entirely original, to prevent performance leakage from data memorization. Moreover, each problem in AMO-Bench requires only a final answer rather than a proof, enabling automatic and robust grading. Experimental results across 26 LLMs show that even the best-performing model achieves only 52.4% accuracy on AMO-Bench, with most LLMs scoring below 40%. Beyond these low scores, further analysis reveals a promising scaling trend with increasing test-time compute on AMO-Bench. These results highlight the significant room for improving mathematical reasoning in current LLMs. We release AMO-Bench to facilitate further research into advancing the reasoning abilities of language models. https://amo-bench.github.io/
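
    A minimal sketch of the final-answer auto-grading this design enables; the normalization rules below are our illustrative assumptions, not the benchmark's official grader:

    ```python
    from fractions import Fraction

    def normalize(ans: str) -> Fraction | str:
        s = ans.strip().rstrip(".").replace(" ", "").replace(",", "")
        try:
            return Fraction(s)            # treats "3/4", "0.75", and "12" uniformly
        except ValueError:
            return s.lower()              # fall back to a symbolic string match

    def grade(model_answer: str, reference: str) -> bool:
        return normalize(model_answer) == normalize(reference)

    assert grade(" 3/4", "0.75")
    assert grade("1,024", "1024")
    assert not grade("42", "41")
    ```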

  7. Surfer 2: The Next Generation of Cross-Platform Computer Use Agents

    Building agents that generalize across web, desktop, and mobile environments remains an open challenge, as prior systems rely on environment-specific interfaces that limit cross-platform deployment. We introduce Surfer 2, a unified architecture operating purely from visual observations that achieves state-of-the-art performance across all three environments. Surfer 2 integrates hierarchical context management, decoupled planning and execution, and self-verification with adaptive recovery, enabling reliable operation over long task horizons. Our system achieves 97.1% accuracy on WebVoyager, 69.6% on WebArena, 60.1% on OSWorld, and 87.1% on AndroidWorld, outperforming all prior systems without task-specific fine-tuning. With multiple attempts, Surfer 2 exceeds human performance on all benchmarks. These results demonstrate that systematic orchestration amplifies foundation model capabilities and enables general-purpose computer control through visual interaction alone, while calling for a next-generation vision language model to achieve Pareto-optimal cost-efficiency.
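
    A schematic, runnable skeleton of the decoupled plan/execute/verify loop with adaptive recovery described above; every function body here is a stub we invented for illustration, not Surfer 2's actual interface:

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class Context:
        goal: str
        history: list[str] = field(default_factory=list)    # hierarchical context

    def plan(ctx: Context) -> list[str]:
        return [f"step {i} toward: {ctx.goal}" for i in range(1, 3)]

    def execute(step: str, ctx: Context) -> str:
        ctx.history.append(step)           # in the real system: click/type/scroll
        return f"observation after {step!r}"

    def verify(obs: str) -> bool:
        return "step" in obs               # stand-in for model self-verification

    def run(goal: str, max_retries: int = 2) -> Context:
        ctx = Context(goal)
        for step in plan(ctx):
            for attempt in range(max_retries + 1):
                obs = execute(step, ctx)
                if verify(obs):
                    break                  # verified: move on to the next step
                ctx.history.append(f"retry {attempt}: {step}")  # adaptive recovery
        return ctx

    print(run("book a table").history)
    ```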

  8. Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

    Recent video generation models can produce high-fidelity, temporally coherent videos, indicating that they may encode substantial world knowledge. Beyond realistic synthesis, they also exhibit emerging behaviors indicative of visual perception, modeling, and manipulation. Yet, an important question still remains: Are video models ready to serve as zero-shot reasoners in challenging visual reasoning scenarios? In this work, we conduct an empirical study to comprehensively investigate this question, focusing on the leading and popular Veo-3. We evaluate its reasoning behavior across 12 dimensions, including spatial, geometric, physical, temporal, and embodied logic, systematically characterizing both its strengths and failure modes. To standardize this study, we curate the evaluation data into MME-CoF, a compact benchmark that enables in-depth and thorough assessment of Chain-of-Frame (CoF) reasoning. Our findings reveal that while current video models demonstrate promising reasoning patterns on short-horizon spatial coherence, fine-grained grounding, and locally consistent dynamics, they remain limited in long-horizon causal reasoning, strict geometric constraints, and abstract logic. Overall, they are not yet reliable as standalone zero-shot reasoners, but exhibit encouraging signs as complementary visual engines alongside dedicated reasoning models. Project page: https://video-cof.github.io

  9. The Quest for Generalizable Motion Generation: Data, Model, and Evaluation

    Despite recent advances in 3D human motion generation (MoGen) on standard benchmarks, existing models still face a fundamental bottleneck in their generalization capability. In contrast, adjacent generative fields, most notably video generation (ViGen), have demonstrated remarkable generalization in modeling human behaviors, highlighting transferable insights that MoGen can leverage. Motivated by this observation, we present a comprehensive framework that systematically transfers knowledge from ViGen to MoGen across three key pillars: data, modeling, and evaluation. First, we introduce ViMoGen-228K, a large-scale dataset comprising 228,000 high-quality motion samples that integrates high-fidelity optical MoCap data with semantically annotated motions from web videos and synthesized samples generated by state-of-the-art ViGen models. The dataset includes both text-motion pairs and text-video-motion triplets, substantially expanding semantic diversity. Second, we propose ViMoGen, a flow-matching-based diffusion transformer that unifies priors from MoCap data and ViGen models through gated multimodal conditioning. To enhance efficiency, we further develop ViMoGen-light, a distilled variant that eliminates video generation dependencies while preserving strong generalization. Finally, we present MBench, a hierarchical benchmark designed for fine-grained evaluation across motion quality, prompt fidelity, and generalization ability. Extensive experiments show that our framework significantly outperforms existing approaches in both automatic and human evaluations. The code, data, and benchmark will be made publicly available.

  10. Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

    Large Language Models (LLMs) often struggle with problems that require multi-step reasoning. For small-scale open-source models, Reinforcement Learning with Verifiable Rewards (RLVR) fails when correct solutions are rarely sampled even after many attempts, while Supervised Fine-Tuning (SFT) tends to overfit long demonstrations through rigid token-by-token imitation. To address this gap, we propose Supervised Reinforcement Learning (SRL), a framework that reformulates problem solving as generating a sequence of logical "actions". SRL trains the model to generate an internal reasoning monologue before committing to each action. It provides smoother rewards based on the similarity between the model's actions and expert actions extracted from the SFT dataset in a step-wise manner. This supervision offers richer learning signals even when all rollouts are incorrect, while encouraging flexible reasoning guided by expert demonstrations. As a result, SRL enables small models to learn challenging problems previously unlearnable by SFT or RLVR. Moreover, initializing training with SRL before refining with RLVR yields the strongest overall performance. Beyond reasoning benchmarks, SRL generalizes effectively to agentic software engineering tasks, establishing it as a robust and versatile training framework for reasoning-oriented LLMs.
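
    A minimal sketch of the step-wise reward idea: score each model action by its similarity to the aligned expert action, yielding a dense signal even when every rollout is wrong. The similarity metric (a difflib ratio) is our illustrative stand-in for the paper's measure:

    ```python
    from difflib import SequenceMatcher

    def step_rewards(model_actions: list[str], expert_actions: list[str]) -> list[float]:
        # One reward per step: how closely does each action match the expert's?
        return [
            SequenceMatcher(None, m, e).ratio()
            for m, e in zip(model_actions, expert_actions)
        ]

    expert = ["factor the quadratic", "set each factor to zero", "solve for x"]
    model = ["factor the quadratic", "set factors to 0", "guess x"]
    rewards = step_rewards(model, expert)
    print(rewards, sum(rewards) / len(rewards))  # dense signal despite imperfect steps
    ```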

  11. The Era of Agentic Organization: Learning to Organize with Language Models

    We envision a new era of AI, termed agentic organization, where agents solve complex problems by working collaboratively and concurrently, enabling outcomes beyond individual intelligence. To realize this vision, we introduce asynchronous thinking (AsyncThink) as a new paradigm of reasoning with large language models, which organizes the internal thinking process into concurrently executable structures. Specifically, we propose a thinking protocol where an organizer dynamically assigns sub-queries to workers, merges intermediate knowledge, and produces coherent solutions. More importantly, the thinking structure in this protocol can be further optimized through reinforcement learning. Experiments demonstrate that AsyncThink achieves 28% lower inference latency compared to parallel thinking while improving accuracy on mathematical reasoning. Moreover, AsyncThink generalizes its learned asynchronous thinking capabilities, effectively tackling unseen tasks without additional training.
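
    A toy asyncio rendering of the organizer-worker thinking protocol: the organizer forks sub-queries, workers run concurrently, and the organizer merges the partial results. The function names and fork policy are ours, not the paper's API:

    ```python
    import asyncio

    async def worker(sub_query: str) -> str:
        await asyncio.sleep(0.1)                     # stand-in for an LLM call
        return f"answer({sub_query})"

    async def organizer(query: str) -> str:
        sub_queries = [f"{query}: part {i}" for i in range(3)]        # fork
        partials = await asyncio.gather(*(worker(s) for s in sub_queries))
        return " + ".join(partials)                  # join/merge into one solution

    print(asyncio.run(organizer("prove the identity")))
    ```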

  12. OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes

    There are two prevalent ways of constructing 3D scenes: procedural generation and 2D lifting. Among them, panorama-based 2D lifting has emerged as a promising technique, leveraging powerful 2D generative priors to produce immersive, realistic, and diverse 3D environments. In this work, we advance this technique to generate graphics-ready 3D scenes suitable for physically based rendering (PBR), relighting, and simulation. Our key insight is to repurpose 2D generative models for panoramic perception of geometry, textures, and PBR materials. Unlike existing 2D lifting approaches that emphasize appearance generation and ignore the perception of intrinsic properties, we present OmniX, a versatile and unified framework. Based on a lightweight and efficient cross-modal adapter structure, OmniX reuses 2D generative priors for a broad range of panoramic vision tasks, including panoramic perception, generation, and completion. Furthermore, we construct a large-scale synthetic panorama dataset containing high-quality multimodal panoramas from diverse indoor and outdoor scenes. Extensive experiments demonstrate the effectiveness of our model in panoramic visual perception and graphics-ready 3D scene generation, opening new possibilities for immersive and physically realistic virtual world generation.

  13. MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency

    Current text-to-image generative models are trained on large uncurated datasets to enable diverse generation capabilities. However, this does not align well with user preferences. Recently, reward models have been specifically designed to perform post-hoc selection of generated images and align them to a reward, typically user preference. Discarding informative data and optimizing for a single reward tend to harm diversity, semantic fidelity, and efficiency. Instead of this post-processing, we propose to condition the model on multiple reward models during training to let the model learn user preferences directly. We show that this not only dramatically improves the visual quality of the generated images but also significantly speeds up training. Our proposed method, called MIRO, achieves state-of-the-art performance on the GenEval compositional benchmark and on user-preference scores (PickAScore, ImageReward, HPSv2).
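
    A minimal sketch of conditioning on multiple rewards at training time: quantized reward scores are embedded and summed into a conditioning vector, so at sampling time one can request high scores on every axis. The binning, embedding-sum design, and dimensions are our guesses at one plausible realization, not the paper's architecture:

    ```python
    import torch
    import torch.nn as nn

    NUM_REWARDS, NUM_BINS, D = 3, 10, 64      # e.g. PickAScore, ImageReward, HPSv2

    class RewardConditioner(nn.Module):
        def __init__(self):
            super().__init__()
            self.bins = nn.Embedding(NUM_REWARDS * NUM_BINS, D)

        def forward(self, reward_bins: torch.Tensor) -> torch.Tensor:
            # reward_bins: (batch, NUM_REWARDS) integers in [0, NUM_BINS)
            offsets = torch.arange(NUM_REWARDS) * NUM_BINS
            return self.bins(reward_bins + offsets).sum(dim=1)   # (batch, D)

    cond = RewardConditioner()
    train_bins = torch.tensor([[3, 7, 5]])    # scores observed for one training image
    sample_bins = torch.full((1, NUM_REWARDS), NUM_BINS - 1)  # ask for the best
    print(cond(train_bins).shape, cond(sample_bins).shape)
    ```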

  14. EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis

    Electronic Health Records (EHRs) contain rich yet complex information, and their automated analysis is critical for clinical decision-making. Despite recent advances in large language models (LLMs) for clinical workflows, their ability to analyze EHRs remains limited due to narrow task coverage and a lack of EHR-oriented reasoning capabilities. This paper aims to bridge the gap. Specifically, we present EHR-Ins, a large-scale, comprehensive EHR reasoning instruction dataset comprising 300k high-quality reasoning cases and 4M non-reasoning cases across 42 distinct EHR tasks. Its core innovation is a thinking-graph-driven framework that enables generating high-quality reasoning data at scale. Based on it, we develop EHR-R1, a series of reasoning-enhanced LLMs with up to 72B parameters tailored for EHR analysis. Through a multi-stage training paradigm, including domain adaptation, reasoning enhancement, and reinforcement learning, EHR-R1 systematically acquires domain knowledge and diverse reasoning capabilities, enabling accurate and robust EHR analysis. Lastly, we introduce EHR-Bench, a new benchmark curated from MIMIC-IV, spanning 42 tasks, to comprehensively assess reasoning and prediction across EHR scenarios. In experiments, we show that the resulting EHR-R1 consistently outperforms state-of-the-art commercial and open-source LLMs (including DeepSeek-V3 and GPT-4o), surpassing GPT-4o by over 30 points on MIMIC-Bench and achieving a 10% higher zero-shot AUROC on EHRSHOT. Collectively, EHR-Ins, EHR-R1, and EHR-Bench significantly advance the development of more reliable and clinically relevant EHR analysis.

  15. OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation

    Document AI has advanced rapidly and is attracting increasing attention. Yet, while most efforts have focused on document layout analysis (DLA), its generative counterpart, document layout generation, remains underexplored. A major obstacle lies in the scarcity of diverse layouts: academic papers with Manhattan-style structures dominate existing studies, while open-world genres such as newspapers and magazines remain severely underrepresented. To address this gap, we curate OmniLayout-1M, the first million-scale dataset of diverse document layouts, covering six common document types and comprising contemporary layouts collected from multiple sources. Moreover, since existing methods struggle in complex domains and often fail to arrange long sequences coherently, we introduce OmniLayout-LLM, a 0.5B model with a two-stage Coarse-to-Fine learning paradigm: 1) learning universal layout principles from OmniLayout-1M with coarse category definitions, and 2) transferring the knowledge to a specific domain with fine-grained annotations. Extensive experiments demonstrate that our approach achieves strong performance across multiple domains in the M^{6}Doc dataset, substantially surpassing both existing layout generation experts and several of the latest general-purpose LLMs. Our code, models, and dataset will be publicly released.

Solidot (15)

  1. Trump orders the US to resume nuclear weapons testing

    US President Trump has directed the Department of War to resume nuclear weapons testing, a move likely intended to preserve America's advantage in nuclear weapons. The last US nuclear test was an underground test in Nevada on September 23, 1992. The US has conducted 1,054 nuclear tests in total, followed by the Soviet Union with 715, France with 210, and the UK and China with 45 each; China's last test was in July 1996. Since the Comprehensive Nuclear-Test-Ban Treaty was adopted by the UN General Assembly in September 1996, only India, Pakistan, and North Korea have conducted nuclear tests.

  2. Bazzite fall update released

    Universal Blue, the project behind the gaming-focused distribution Bazzite, has announced Bazzite's fall update. Based on Fedora, the latest release upgrades to Fedora 43 and ships GNOME 49 and KDE Plasma 6.4.5 as desktop environments. Other changes include full support for the Microsoft/ASUS Xbox handhelds Xbox Ally and Xbox Ally X, improved support for Lenovo's Legion Go 2 handheld, support for the Intel-based OneXPlayer X1 Air (though without granular TDP control for now), and support for the SuiPlay0X1, among others.

  3. Israel required Google and Amazon to use a secret "wink" signal to warn of foreign governments' data demands

    When Google and Amazon signed a $1.2 billion cloud computing deal in 2021, their customer, the Israeli government, made an unusual demand that in effect required the two companies to circumvent legal obligations in countries around the world: Israel required a mechanism dubbed the "wink" to warn it of data disclosure orders from any country. To comply with local law, tech giants like Google and Amazon routinely hand over customer data under law enforcement orders to assist investigations, and law enforcement typically also requires the companies to keep the investigation confidential, barring them from revealing that customer data has been handed over. Unwilling to lose control over its data, Israel required Google and Amazon to tip it off via the wink mechanism: paying small sums in the form of special compensation. According to leaked Israeli finance ministry documents, a wink payment must be made within 24 hours of the disclosure, with the amount corresponding to the requesting country's telephone calling code and ranging from 1,000 to 9,999 shekels. If Google or Amazon were compelled to hand Israeli user data to the US government (calling code +1) and barred from disclosing it, it would pay the Israeli government 1,000 shekels; for the Italian government (calling code +39), 3,900 shekels; and if a country's secrecy order forbids even naming the country, 100,000 shekels. Google's and Amazon's cloud divisions both deny circumventing any legal obligations.
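
    A worked reading of the leaked fee schedule, under the interpretation its examples imply: the calling code supplies the leading digits of a four-digit shekel amount, with a flat 100,000 shekels when the country may not even be named. The function is purely illustrative:

    ```python
    def wink_amount(calling_code: str | None) -> int:
        if calling_code is None:               # the country itself may not be named
            return 100_000
        digits = calling_code.lstrip("+")
        # Pad the calling code with trailing zeros to a four-digit amount.
        return int(digits) * 10 ** (4 - len(digits))

    assert wink_amount("+1") == 1_000          # United States
    assert wink_amount("+39") == 3_900         # Italy
    assert wink_amount(None) == 100_000
    ```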

  4. UK faces pressure to ban mercury amalgam fillings

    The UK, one of the few countries that has yet to ban mercury amalgam dental fillings, is facing mounting pressure after new data showed worrying levels of mercury contamination in fish and shellfish. Mercury is a potent neurotoxin: even low-level exposure can damage the nervous, digestive, and immune systems as well as the lungs, kidneys, skin, and eyes. Its organic form, methylmercury, is especially dangerous to fetuses, and mercury accumulates up the food chain in insects, fish, and birds. According to the UK Environment Agency, crematoria are the second-largest source of mercury emissions after power plants, releasing 593 kg of mercury a year. Analysis by the Rivers Trust and Wildlife and Countryside Link found that over 98% of fish and mussels sampled in UK rivers and coastal waters exceeded the EU's proposed safety limit for mercury, and more than half exceeded the recommended safe level by more than fivefold.

  5. Affinity Studio becomes free of charge

    Serif's design software suite, rebranded as Affinity Studio, is now offered to users free of charge, on the condition that they have a Canva account; the move is aimed at drawing users into subscribing to Canva's AI features. Serif's Affinity suite comprised the vector design tool Affinity Designer, the image editor Affinity Photo (a Photoshop analogue), and the print and layout tool Affinity Publisher, previously sold under a one-time-purchase model. Canva acquired Serif for $380 million in 2024 and is now giving away Serif's three design apps for free under the Affinity Studio name. Users must register a Canva account to download and launch Affinity Studio, which bundles four components: Vector (formerly Affinity Designer), Pixel (formerly Affinity Photo), Layout (formerly Affinity Publisher), and Canva AI, a paid AI feature unavailable to free users.

  6. Mathematical proof refutes the idea that the universe is a simulation

    Physicists argue, via a mathematical proof, that the nature of the universe rests on a kind of understanding beyond the grasp of any algorithm, refuting the once-popular simulation hypothesis. Modern physics long ago moved past the "solid world" Newton described: Einstein's relativity superseded classical mechanics, and quantum mechanics again revolutionized our picture of reality. The current frontier theory, quantum gravity, goes further, suggesting that even space and time are not fundamental entities but emerge from a deeper layer of pure information. This information resides in what physicists call the "Platonic realm", a mathematical reality more fundamental than the universe we experience. Using mathematical results such as Gödel's incompleteness theorems, the team shows that any attempt to fully describe reality algorithmically must fail. Computation can proceed by logical steps, yet some truths belong to "non-algorithmic understanding", truths that no finite procedure can derive. Consider the statement "this sentence is true but unprovable": if it can be proven it is false, and if it cannot, it is true, so any system that attempts to prove it is either inconsistent or incomplete. This implies that pure computation cannot capture every layer of reality.

  7. DNS runs on FOSS

    ICANN has published a report, "The Domain Name System Runs on Free and Open Source Software (FOSS)", aimed primarily at policymakers. The backdrop is that governments worldwide are exploring new cybersecurity regulations, and the ubiquity of FOSS in DNS operations means policies written today will directly shape the security and resilience of tomorrow's Internet. The report notes that the DNS ecosystem depends on a global network of maintainers and contributors, often unpaid volunteers. While many participants are unpaid, the DNS ecosystem is distinctive in also relying on several long-standing maintainer organizations, creating an operating model built on community collaboration rather than the commercial contracts of a traditional software supply chain. The model's particular risks lie in the financial sustainability of those maintainer organizations and in volunteers dropping out through burnout. Because FOSS operates differently, regulatory frameworks designed for proprietary software may not fit it, and could therefore have serious unintended consequences for the stability of critical Internet infrastructure.

  8. Society will eventually accept fatal crashes caused by robotaxis

    At TechCrunch Disrupt 2025, Waymo co-CEO Tekedra Mawakana said society will eventually accept fatal crashes caused by autonomous taxis as the necessary price of improving overall road safety. Asked whether society would accept it if companies like Waymo reduced US traffic deaths but ultimately caused a fatal crash, Mawakana said it would; the challenge is for society to set strict safety standards that hold companies accountable. Companies must be transparent and publish crash-related data, she said. Autonomous vehicles will sharply reduce accidents but cannot eliminate them entirely: the question is not whether a crash will happen but when, and companies need to be prepared for it.

  9. Meta denies downloading adult videos to train AI, says they were for "personal use"

    In July, adult film company Strike 3 Holdings accused Meta of downloading and seeding at least 2,396 adult films, alleging that Meta had secretly used some of them to train AI models, and sought $350 million in damages from the social media giant. This week Meta filed a motion asking the court to dismiss the suit. Meta called Strike 3's allegations guesswork and innuendo, noted that the company has been branded a "copyright troll" for filing extortionate lawsuits, and said Strike 3 has no evidence that Meta used adult videos to train AI models, dismissing the claim as nonsense. Meta said the videos were downloaded purely for "personal use": its records show various individuals downloaded them for personal purposes, and given its tens of thousands of employees plus countless contractors and visitors over the past seven years, the downloaders may have been one or more employees, but could equally have been contractors or visitors.

  10. The International Criminal Court abandons Microsoft software

    The International Criminal Court (ICC) in The Hague is switching its office software from Microsoft Office to the open source alternative OpenDesk, a move intended to reduce dependence on US technology. After the ICC issued arrest warrants on war crimes charges for Israeli Prime Minister Netanyahu and former defense minister Yoav Gallant, the current Republican US president, Trump, sanctioned chief prosecutor Karim Khan and others, and Microsoft promptly blocked Khan's email account, forcing him onto the Swiss email service Proton. Because the ICC depends heavily on Microsoft software, Microsoft's action paralyzed its work, and the US government is considering further action against the court. OpenDesk is developed by Germany's Center for Digital Sovereignty (Zendis).

  11. AOL sold to Bending Spoons for $1.5 billion

    Private equity manager Apollo has agreed to sell onetime Internet giant AOL to the privately held Italian company Bending Spoons for $1.5 billion, with the deal expected to close late this year or early next. AOL still generates hundreds of millions of dollars in free cash flow. Bending Spoons CEO Luca Ferrari said AOL's email and web content platforms have roughly 30 million monthly active users, a base he called "extremely loyal", adding that greater investment in product and user experience would serve them better. Bending Spoons has acquired many once-prominent tech companies, including Vimeo, Evernote, WeTransfer, and Brightcove, and routinely lays off all of the acquired company's staff after a purchase.

  12. SUSE becomes the first enterprise Linux distribution with integrated AI agents

    SUSE has released SUSE Linux Enterprise Server 16, the first enterprise Linux distribution with integrated agentic AI. SUSE uses the Model Context Protocol (MCP) to connect AI models to data sources securely while preserving freedom of choice among model providers, letting enterprises run AI-driven automation without depending on a single ecosystem. SLES 16 comes with a 16-year lifecycle and is ready for the Year 2038 problem. Other enterprise Linux vendors such as Red Hat and Canonical may follow SUSE's lead with similar features.

  13. Microsoft's Xbox handhelds plagued by Windows problems

    The two Xbox handhelds jointly launched by Microsoft and ASUS have been on sale for two weeks, running a handheld-optimized build of Windows, but players quickly found the system riddled with problems, many of which can be solved by installing the Linux distribution Bazzite. Players report that the white model cannot reliably enter and wake from sleep, and fails to hold its charge while asleep. Neither Microsoft nor ASUS has acknowledged the problems or given a timeline for fixes; ASUS says it needs more time for testing. Players also found that games run about 30% faster under Bazzite than under Windows. Bazzite initially had sleep problems of its own, but developer Antheas Kapenekakis consulted contacts inside AMD and fixed them within two days. In a standby battery test under Windows, one handheld lost 10% of its charge after 12 hours of sleep and the other 23%; after another 12 hours, both were down to 30%, and one had attempted to install Windows updates while asleep. Both units would fail to wake from sleep and required a forced reboot.

  14. Pop!_OS 24.04 LTS to launch in December

    Pop!_OS, the Ubuntu LTS-based distribution developed by Linux PC maker System76, originally shipped COSMIC as a modified GNOME desktop environment, but starting with Pop!_OS 24.04 LTS, COSMIC has been rewritten in Rust as a standalone desktop environment. Because COSMIC development fell behind, the release of Pop!_OS 24.04 LTS, based on Ubuntu 24.04 LTS, was repeatedly delayed. System76 founder and CEO Carl Richell has now announced the official date: Pop!_OS 24.04 LTS and COSMIC Epoch 1 will be released on December 11. Richell said that from Pop!_OS 26.04 LTS onward, future Pop!_OS releases will track Ubuntu LTS release dates (arriving roughly two weeks after each Ubuntu release), which means the next LTS will follow this latest one by just over four months.

  15. Grammarly renames itself Superhuman

    Grammarly, the company behind the writing-assistance tool, has announced it is renaming itself Superhuman, while keeping its existing product names for now. Grammarly acquired the subscription-based email client Superhuman in July. The company has also launched an AI assistant called Superhuman Go, which integrates with tools such as Gmail, Jira, and Google Drive to enhance writing and automate productivity tasks. Superhuman says it plans to add features that let the assistant pull data from sources such as CRMs and internal systems to suggest revisions to email content.