OrangeBot.AI Digest — 2026-01-03
38 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Total monthly number of StackOverflow questions over time (data.stackexchange.com)
- Report: Microsoft kills official way to activate Windows 11/10 without internet (www.neowin.net)
- Sirius DB (www.sirius-db.com)
- The C3 Programming Language (c3-lang.org)
- The Most Popular Blogs of Hacker News in 2025 (refactoringenglish.com)
- Recursive Language Models (arxiv.org)
- X-Clacks-Overhead (hleb.dev)
- Tally – A tool to help agents classify your bank transactions (tallyai.money)
- Cadova: Swift DSL for parametric 3D modeling (github.com)
- Late-night pizzeria near the Pentagon has suddenly surged in traffic (twitter.com)
- Show HN: uvx ptn, scan a QR, get a terminal in your phone (github.com)
- Explosions reported in Venezuelan capital Caracas (www.theguardian.com)
- Trump says Venezuela’s Maduro captured after strikes (www.reuters.com)
- A Beginner's Two-Component Crystal-Style Wi-Fi Detector (siliconjunction.wordpress.com)
- IQuest-Coder: A new open-source code model beats Claude Sonnet 4.5 and GPT 5.1 [pdf] (github.com)
GitHub Trending(8)
- usememos / memos
An open-source, self-hosted note-taking service. Your thoughts, your data, your control — no tracking, no ads, no subscription fees.
- ourongxing / newsnow
Elegant reading of real-time and hottest news
- pathwaycom / pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
- OpenBB-finance / OpenBB
Financial data platform for analysts, quants and AI agents.
- HQarroum / docker-android
🤖 A minimal and customizable Docker image running the Android emulator as a service.
- beancount / beancount
Beancount: Double-Entry Accounting from Text Files.
- maplibre / maplibre-gl-js
MapLibre GL JS - Interactive vector tile maps in the browser
- nukeop / nuclear
Streaming music player that finds free music for you
Hugging Face(7)
- Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling
Multi-step retrieval-augmented generation (RAG) has become a widely adopted strategy for enhancing large language models (LLMs) on tasks that demand global comprehension and intensive reasoning. Many RAG systems incorporate a working memory module to consolidate retrieved information. However, existing memory designs function primarily as passive storage that accumulates isolated facts for the purpose of condensing the lengthy inputs and generating new sub-queries through deduction. This static nature overlooks the crucial high-order correlations among primitive facts, the compositions of which can often provide stronger guidance for subsequent steps. Therefore, their representational strength and impact on multi-step reasoning and knowledge evolution are limited, resulting in fragmented reasoning and weak global sense-making capacity in extended contexts. We introduce HGMem, a hypergraph-based memory mechanism that extends the concept of memory beyond simple storage into a dynamic, expressive structure for complex reasoning and global understanding. In our approach, memory is represented as a hypergraph whose hyperedges correspond to distinct memory units, enabling the progressive formation of higher-order interactions within memory. This mechanism connects facts and thoughts around the focal problem, evolving into an integrated and situated knowledge structure that provides strong propositions for deeper reasoning in subsequent steps. We evaluate HGMem on several challenging datasets designed for global sense-making. Extensive experiments and in-depth analyses show that our method consistently improves multi-step RAG and substantially outperforms strong baseline systems across diverse tasks.
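The hyperedge-as-memory-unit idea can be sketched in a few lines. The snippet below is a toy illustration, not HGMem's actual API (`HypergraphMemory`, `add_unit`, and `related_units` are hypothetical names): each memory unit is a hyperedge linking an arbitrary set of facts, so querying one fact surfaces the higher-order combinations it participates in rather than the fact in isolation.

```python
from collections import defaultdict

class HypergraphMemory:
    """Toy hypergraph memory: facts are vertices, memory units are
    hyperedges linking any number of facts, so higher-order
    combinations of facts can be stored and retrieved directly."""

    def __init__(self):
        self.facts = set()                 # vertices
        self.hyperedges = []               # (label, frozenset_of_facts)
        self.incidence = defaultdict(set)  # fact -> hyperedge indices

    def add_unit(self, label, facts):
        """Store one memory unit connecting several facts at once."""
        facts = frozenset(facts)
        self.facts |= facts
        idx = len(self.hyperedges)
        self.hyperedges.append((label, facts))
        for fact in facts:
            self.incidence[fact].add(idx)
        return idx

    def related_units(self, fact):
        """Every memory unit touching a fact: the compositional
        context that a flat list of isolated facts would lose."""
        return [self.hyperedges[i] for i in sorted(self.incidence[fact])]

mem = HypergraphMemory()
mem.add_unit("founding", {"Company A", "Alice", "2001"})
mem.add_unit("acquisition", {"Company A", "Company B"})
print(mem.related_units("Company A"))  # both units mention Company A
```

A flat fact store would return "Company A" alone; the hypergraph view returns every unit it co-occurs in, which is the kind of composed context the abstract argues should guide later retrieval steps.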
- Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
Large Language Models (LLMs) apply uniform computation to all tokens, despite language exhibiting highly non-uniform information density. This token-uniform regime wastes capacity on locally predictable spans while under-allocating computation to semantically critical transitions. We propose Dynamic Large Concept Models (DLCM), a hierarchical language modeling framework that learns semantic boundaries from latent representations and shifts computation from tokens to a compressed concept space where reasoning is more efficient. DLCM discovers variable-length concepts end-to-end without relying on predefined linguistic units. Hierarchical compression fundamentally changes scaling behavior. We introduce the first compression-aware scaling law, which disentangles token-level capacity, concept-level reasoning capacity, and compression ratio, enabling principled compute allocation under fixed FLOPs. To stably train this heterogeneous architecture, we further develop a decoupled μP parametrization that supports zero-shot hyperparameter transfer across widths and compression regimes. At a practical setting (R=4, corresponding to an average of four tokens per concept), DLCM reallocates roughly one-third of inference compute into a higher-capacity reasoning backbone, achieving a +2.69\% average improvement across 12 zero-shot benchmarks under matched inference FLOPs.
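The token-to-concept compression step can be illustrated with a small sketch (a hypothetical helper, not DLCM's code; DLCM learns the boundaries end-to-end, whereas here they are supplied by hand): variable-length token spans are pooled into single concept vectors, so at ratio R=4 an 8-token sequence becomes 2 concepts for the reasoning backbone.

```python
def compress_to_concepts(token_vecs, boundaries):
    """Mean-pool variable-length token spans into concept vectors.
    boundaries[i] is True when token i ends a concept; DLCM learns
    these boundaries, here they are given by hand."""
    concepts, span = [], []
    for vec, is_end in zip(token_vecs, boundaries):
        span.append(vec)
        if is_end:
            dim = len(span[0])
            concepts.append([sum(v[d] for v in span) / len(span)
                             for d in range(dim)])
            span = []
    return concepts

# 8 token embeddings at compression ratio R=4 -> 2 concept vectors
tokens = [[float(i), float(i)] for i in range(8)]
bounds = [False, False, False, True, False, False, False, True]
print(compress_to_concepts(tokens, bounds))  # [[1.5, 1.5], [5.5, 5.5]]
```

Running the reasoning backbone over 2 concepts instead of 8 tokens is what frees the compute that DLCM reallocates under the fixed-FLOPs budget.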
- DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
While recent Multimodal Large Language Models (MLLMs) have made significant strides in multimodal reasoning, their reasoning processes remain predominantly text-centric, leading to suboptimal performance on complex, long-horizon, vision-centric tasks. In this paper, we establish a novel Generative Multimodal Reasoning paradigm and introduce DiffThinker, a diffusion-based reasoning framework. Conceptually, DiffThinker reformulates multimodal reasoning as a native generative image-to-image task, achieving superior logical consistency and spatial precision in vision-centric tasks. We perform a systematic comparison between DiffThinker and MLLMs, providing the first in-depth investigation into the intrinsic characteristics of this paradigm, revealing four core properties: efficiency, controllability, native parallelism, and collaboration. Extensive experiments across four domains (sequential planning, combinatorial optimization, constraint satisfaction, and spatial configuration) demonstrate that DiffThinker significantly outperforms leading closed-source models including GPT-5 (+314.2\%) and Gemini-3-Flash (+111.6\%), as well as the fine-tuned Qwen3-VL-32B baseline (+39.0\%), highlighting generative multimodal reasoning as a promising approach for vision-centric reasoning.
- On the Role of Discreteness in Diffusion LLMs
Diffusion models offer appealing properties for language generation, such as parallel decoding and iterative refinement, but the discrete and highly structured nature of text challenges the direct application of diffusion principles. In this paper, we revisit diffusion language modeling from the view of diffusion process and language modeling, and outline five properties that separate diffusion mechanics from language-specific requirements. We first categorize existing approaches into continuous diffusion in embedding space and discrete diffusion over tokens. We then show that each satisfies only part of the five essential properties and therefore reflects a structural trade-off. Through analyses of recent large diffusion language models, we identify two central issues: (i) uniform corruption does not respect how information is distributed across positions, and (ii) token-wise marginal training cannot capture multi-token dependencies during parallel decoding. These observations motivate diffusion processes that align more closely with the structure of text, and encourage future work toward more coherent diffusion language models.
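Issue (i), position-uniform corruption, is easy to see in a toy forward process for a masked (absorbing-state) discrete diffusion. This is a minimal sketch, not any specific model's implementation: every token is masked with the same probability t, whether it is a predictable function word or a semantically critical one.

```python
import random

def uniform_mask(tokens, t, mask="[MASK]", rng=None):
    """Forward corruption of an absorbing-state discrete diffusion:
    each position is masked independently with probability t,
    regardless of how much information that position carries."""
    rng = rng or random.Random(0)
    return [mask if rng.random() < t else tok for tok in tokens]

sentence = "the cat sat on the mat".split()
print(uniform_mask(sentence, t=0.5))
```

The rare content word "mat" and the repeated article "the" are masked at the same rate, which is exactly the mismatch between diffusion mechanics and the non-uniform information density of text that the paper highlights.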
- Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow
Generative video modeling has emerged as a compelling tool for zero-shot reasoning about plausible physical interactions in open-world manipulation. Yet, it remains a challenge to translate such generated motions into the low-level actions demanded by robotic systems. We observe that given an initial image and task instruction, these models excel at synthesizing sensible object motions. Thus, we introduce Dream2Flow, a framework that bridges video generation and robotic control through 3D object flow as an intermediate representation. Our method reconstructs 3D object motions from generated videos and formulates manipulation as object trajectory tracking. By separating the state changes from the actuators that realize those changes, Dream2Flow overcomes the embodiment gap and enables zero-shot guidance from pre-trained video models to manipulate objects of diverse categories, including rigid, articulated, deformable, and granular. Through trajectory optimization or reinforcement learning, Dream2Flow converts reconstructed 3D object flow into executable low-level commands without task-specific demonstrations. Simulation and real-world experiments highlight 3D object flow as a general and scalable interface for adapting video generation models to open-world robotic manipulation. Videos and visualizations are available at https://dream2flow.github.io/.
- FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation
In this work, we show that the impact of model capacity varies across timesteps: it is crucial for the early and late stages but largely negligible during the intermediate stage. Accordingly, we propose FlowBlending, a stage-aware multi-model sampling strategy that employs a large model and a small model at capacity-sensitive stages and intermediate stages, respectively. We further introduce simple criteria to choose stage boundaries and provide a velocity-divergence analysis as an effective proxy for identifying capacity-sensitive regions. Across LTX-Video (2B/13B) and WAN 2.1 (1.3B/14B), FlowBlending achieves up to 1.65x faster inference with 57.35% fewer FLOPs, while maintaining the visual fidelity, temporal coherence, and semantic alignment of the large models. FlowBlending is also compatible with existing sampling-acceleration techniques, enabling up to 2x additional speedup. Project page is available at: https://jibin86.github.io/flowblending_project_page.
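The core scheduling idea reduces to a timestep-indexed model switch. The sketch below is illustrative only (the function names and the 20% stage fractions are assumptions; FlowBlending picks its boundaries via the velocity-divergence criterion): the large model's step runs in the capacity-sensitive early and late stages, the small model's step in between.

```python
def stage_aware_sample(x, big_step, small_step, n_steps,
                       early_frac=0.2, late_frac=0.2):
    """Run the large model's denoising step in the capacity-sensitive
    early and late stages and the small model's step in between.
    The 20% stage fractions are illustrative placeholders."""
    for i in range(n_steps):
        frac = i / n_steps
        capacity_sensitive = frac < early_frac or frac >= 1.0 - late_frac
        step = big_step if capacity_sensitive else small_step
        x = step(x, i)
    return x

calls = []
big = lambda x, i: calls.append("big") or x
small = lambda x, i: calls.append("small") or x
stage_aware_sample(0.0, big, small, n_steps=10)
print(calls)  # big for steps 0-1 and 8-9, small in between
```

With the small model covering the middle 60% of steps, most of the sampling FLOPs move to the cheaper network while the stages the paper identifies as capacity-sensitive keep the large model.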
- TESO: Tabu-Enhanced Simulation Optimization for Noisy Black-Box Problems
Simulation optimization (SO) is frequently challenged by noisy evaluations, high computational costs, and complex, multimodal search landscapes. This paper introduces Tabu-Enhanced Simulation Optimization (TESO), a novel metaheuristic framework integrating adaptive search with memory-based strategies. TESO leverages a short-term Tabu List to prevent cycling and encourage diversification, and a long-term Elite Memory to guide intensification by perturbing high-performing solutions. An aspiration criterion allows overriding tabu restrictions for exceptional candidates. This combination facilitates a dynamic balance between exploration and exploitation in stochastic environments. We demonstrate TESO's effectiveness and reliability using a queue optimization problem, showing improved performance compared to benchmarks and validating the contribution of its memory components. Source code and data are available at: https://github.com/bulentsoykan/TESO.
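The interplay of tabu list, elite memory, and aspiration criterion can be sketched in a few dozen lines. This is an illustrative re-implementation of the general idea, not the TESO code from the linked repository; the step sizes, intensification probability, and rounding-based tabu keys are all assumptions.

```python
import random

def teso_sketch(f, x0, n_iters=200, tabu_len=10, elite_size=5,
                step=0.5, seed=0):
    """Tabu-enhanced search for a noisy 1-D objective (minimization).
    A short-term tabu list blocks recently visited regions, an elite
    memory of the best points seeds intensification moves, and an
    aspiration criterion lets a new global best override the tabu."""
    rng = random.Random(seed)
    tabu, elite = [], []               # recent region keys / (value, x) pairs
    x = x0
    best_x, best_v = x0, f(x0)
    for _ in range(n_iters):
        if elite and rng.random() < 0.3:   # intensify around an elite point
            x = rng.choice(elite)[1] + rng.gauss(0, step / 2)
        else:                              # diversify with a larger move
            x = x + rng.gauss(0, step)
        key = round(x, 1)                  # coarse region key for the tabu list
        v = f(x)
        if key in tabu and v >= best_v:    # tabu, and aspiration not met
            continue
        tabu = (tabu + [key])[-tabu_len:]
        elite = sorted(elite + [(v, x)])[:elite_size]
        if v < best_v:
            best_x, best_v = x, v
    return best_x, best_v

# Noisy quadratic with minimum near x = 3 (deterministic noise per x)
noisy = lambda x: (x - 3.0) ** 2 + random.Random(int(x * 1e6)).gauss(0, 0.1)
print(teso_sketch(noisy, x0=0.0))
```

The elite memory pulls moves toward the lowest values seen despite the noise, while the tabu list keeps the walk from re-sampling the same coarse regions.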
Solidot(8)
- Rising income inequality linked to longer working hours
According to a study published in Social Psychological and Personality Science, rising income inequality is associated with longer working hours. Global income inequality has increased markedly over the past four decades, and researchers at Beijing Normal University and the University of Lausanne in Switzerland examined the relationship between income inequality and working hours. The first study used a dataset covering 69 countries from 1960-2019 and found that each one-tenth increase in income inequality (the Gini coefficient) was associated with 60 additional working hours per year, equivalent to more than an extra week of work. The second study focused on the United States, using data on 33,083 participants from 1968-2021; it showed that each one-tenth increase in a US state's Gini coefficient was associated with roughly 53 additional working hours per participant per year, with the association stronger for Black participants than for white participants, and stronger for women than for men. The third study focused on China, with a dataset of 26,251 participants from 2012-2020, and found that each one-unit increase in participants' perceived inequality was associated with roughly 10 additional working hours per year. The Chinese and American patterns are opposites: in the US, rising income inequality increased the working hours of disadvantaged groups, whereas in China it increased the working hours of advantaged groups. The researchers were also surprised to find that rising income inequality was associated with longer working hours for urban residents but had no effect on rural residents.
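All three studies key their effect sizes to the Gini coefficient ("each one-tenth increase"). As a reference point, a minimal sketch of how the Gini coefficient is computed from raw incomes, via the mean absolute difference:

```python
def gini(incomes):
    """Gini coefficient via the mean absolute difference:
    G = sum_ij |x_i - x_j| / (2 * n^2 * mean(x)).
    0 = perfect equality; (n-1)/n = one person earns everything."""
    n = len(incomes)
    mean = sum(incomes) / n
    mad = sum(abs(a - b) for a in incomes for b in incomes)
    return mad / (2 * n * n * mean)

print(round(gini([1, 1, 1, 1]), 3))   # perfect equality -> 0.0
print(round(gini([0, 0, 0, 10]), 3))  # one earner takes all -> 0.75
```

A 0.1 rise on this 0-to-1 scale, the unit used in the first two studies, is a substantial shift: roughly the gap between a typical Western European country and the United States.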
- Linux share among Steam users reaches 3.19%
According to Valve's December 2025 Steam hardware and software survey, the share of Steam users on Linux reached 3.19%, down 0.01 percentage points from the previous month but well above the 2.29% of December 2024. Among Linux players, 71.93% use AMD CPUs (the Steam Deck handheld runs on an AMD APU), versus 47.27% among Windows players. Other figures: Windows 11 passed the 70% mark at 70.83%, with Windows 10 at 26.70%; Simplified Chinese users account for 22.12% and English users for 47.08%.
- Windows users should give Linux a try in 2026
Neowin compared the installation flows of Windows Vista, Windows 7, Windows 8/8.1, Windows 10, and Windows 11, showing that installation was quite simple in every version before Windows 11, which has turned into a full-blown advertising vehicle: throughout the process, Microsoft keeps pushing its products, including OneDrive, Microsoft 365, and Game Pass. Windows 11 increasingly makes users feel they do not own the new PC they paid for. Linux has no such problem, and it has made great strides in recent years, especially in its once-weak area of gaming. Valve's continuous improvements to the Proton compatibility layer have markedly improved how well Windows games run on Linux; in some cases, games may even perform better on Linux than on Windows. In 2026, Windows users should give Linux a try.
- Sony PS5 BootROM key leaked
The Level 0 BootROM key of Sony's PS5 leaked on New Year's Eve. The BootROM is the first code executed by the PS5's AMD APU after power-on; it verifies that the bootloader is legitimate and signed by Sony. The key cannot be changed, as it is burned directly into the APU. The leak helps hackers move toward cracking the bootloader, but a full PS5 jailbreak remains unlikely for now, as hackers still need to bypass the other security measures Sony has built into the system. Sony has not yet commented on the matter.
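The role of a burned-in root key can be sketched abstractly. This is a toy chain-of-trust check, not Sony's scheme: real boot ROMs, the PS5's included, use public-key signatures, while a symmetric MAC stands in here for brevity, and all names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key standing in for a value fused into the chip at the factory.
BURNED_IN_KEY = b"root-key-fused-into-silicon"

def verify_and_boot(image: bytes, tag: bytes) -> bool:
    """Accept the next boot stage only if its tag matches the one
    computed under the burned-in root key."""
    expected = hmac.new(BURNED_IN_KEY, image, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

bootloader = b"stage1 bootloader code"
good_tag = hmac.new(BURNED_IN_KEY, bootloader, hashlib.sha256).digest()
print(verify_and_boot(bootloader, good_tag))        # True
print(verify_and_boot(b"patched image", good_tag))  # False
```

The point of the sketch is why the leak matters: whoever holds the root key can produce tags (or, in the real asymmetric scheme, signatures) that make a patched bootloader pass this check, and since the key is fused into the silicon it cannot be rotated.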
- Total War: Three Kingdoms free on the Epic Games Store for a week
Total War: Three Kingdoms, the strategy game set in China's Three Kingdoms period and developed by Sega studio Creative Assembly, is free on the Epic Games Store for a week, through January 9. The game was released in May 2019 and has since received several DLCs; the giveaway covers only the base game without DLC, and the DLCs are not currently discounted. The single-player campaign offers the default Romance mode as well as a Records mode: in Romance mode, generals have superhuman combat abilities, while in Records mode generals are not enhanced and must fight alongside their retinue units.
- France plans to follow Australia in restricting teenagers' social media use
France plans to follow Australia in restricting social media use by teenagers. A draft law may be submitted for review in early January, and the ban could take effect as early as September. The draft argues that unrestricted internet access exposes children to "inappropriate content" and stresses that minors can also become targets of cyberbullying and other harms. The proposed legislation would bar social media platforms from serving minors under 15 and extend the mobile phone ban to high schools. An IPSOS poll found that four out of five French residents want to ban social network use by children under 14. Last month, the European Parliament urged Brussels to set a minimum age for social media use, in response to growing mental health problems among teenagers linked to excessive social media exposure.
- California to require all school districts to restrict children's smartphone use from July 2026
Under Assembly Bill 3216, signed into law by California Governor Gavin Newsom in 2024, all California public school districts must adopt policies limiting or banning students' smartphone use during the school day starting July 2026. The law also requires districts to update their policies every five years. Newsom signed a similar bill in 2019, but it was not mandatory and merely affirmed districts' authority to regulate student smartphone use. Some districts, such as Los Angeles Unified, already adopted policies in 2024 restricting smartphone use at school.
- US measles cases top 2,000
According to CDC data, US measles cases had topped 2,000 as of December 23, reaching 2,012, of which 24 were international travelers. The last time US cases exceeded 2,000 was in 1992, when the count was 2,126. Eleven percent of patients required hospitalization, more than half of them under the age of 19. The CDC says about 93% of confirmed cases occurred in people who were unvaccinated or whose vaccination status was unknown. US vaccination rates have declined in recent years: 92.5% of young children received the MMR vaccine in the 2024-2025 school year, down from 92.7% the previous school year and from the 95.2% of the 2019-2020 school year, before the COVID-19 pandemic.