OrangeBot.AI Digest — 2026-03-30

85 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. Fedware: Government apps that spy harder than the apps they ban (www.sambent.com)
  2. Do your own writing (alexhwoods.com)
  3. New Washington state law bans noncompete agreements (www.seattletimes.com)
  4. Cherri – programming language that compiles to an Apple Shortcut (github.com)
  5. CodingFont: A game to help you pick a coding font (www.codingfont.com)
  6. FTC action against Match and OkCupid for deceiving users, sharing personal data (www.ftc.gov)
  7. 72% of the dollar's purchasing power was destroyed in just four episodes (eco3min.fr)
  8. 1.5M GitHub pull requests have had ads injected into them by Microsoft Copilot (www.neowin.net)
  9. Bird brains (2023) (www.dhanishsemar.com)
  10. How to turn anything into a router (nbailey.ca)
  11. How the AI Bubble Bursts (martinvol.pe)
  12. I am definitely missing the pre-AI writing era (www.lesswrong.com)
  13. I use Excalidraw to manage my diagrams for my blog (blog.lysk.tech)
  14. 15 years, one server, 8GB RAM and 500k users – how Webminal refuses to die (community.webminal.org)
  15. The curious case of retro demo scene graphics (www.datagubbe.se)

GitHub Trending (10)

  1. microsoft / VibeVoice
  2. luongnv89 / claude-howto
  3. shanraisshan / claude-code-best-practice
  4. hacksider / Deep-Live-Cam
  5. OpenBB-finance / OpenBB
  6. freeCodeCamp / freeCodeCamp
  7. sherlock-project / sherlock
  8. apache / superset
  9. fastfetch-cli / fastfetch
  10. NousResearch / hermes-agent

Product Hunt (15)

  1. Letterbook

    AI support platform built for founders

  2. Neuralingo Language Learning

    slowly inch your way to mastery: try, fail, learn, get good

  3. VibeTalent

    Find vibe coders who actually ship

  4. PopTask

    Light menu bar task manager for quickly capturing tasks

  5. ClawKing

    On-chain AI battle royale where 8 lobsters fight

  6. Goals

    AI turns your goal into one daily action.

  7. Git Blog

    Publish sites using Markdown & GitHub from your phone

  8. Ollang DX

    The AI Language Execution Layer for Enterprise

  9. dictate.

    Replace your iPhone keyboard with AI voice typing

  10. nCompass AI Assistant

    Enabling everyone to write GPU kernels

  11. Bluor AI

    Beautiful emails, in seconds

  12. AISpace

    All frontier AI models in one space

  13. Blood Sugar Journal

    AI-powered diabetes tracking for the modern era.

  14. Streva

    Instant Translation, Anywhere you type

  15. Notion MCP

    Your Notion workspace, inside every AI agent

Hugging Face (15)

  1. Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

    Video world models have shown immense potential in simulating the physical world, yet existing memory mechanisms primarily treat environments as static canvases. When dynamic subjects hide out of sight and later re-emerge, current methods often struggle, leading to frozen, distorted, or vanishing subjects. To address this, we introduce Hybrid Memory, a novel paradigm requiring models to simultaneously act as precise archivists for static backgrounds and vigilant trackers for dynamic subjects, ensuring motion continuity during out-of-view intervals. To facilitate research in this direction, we construct HM-World, the first large-scale video dataset dedicated to hybrid memory. It features 59K high-fidelity clips with decoupled camera and subject trajectories, encompassing 17 diverse scenes, 49 distinct subjects, and meticulously designed exit-entry events to rigorously evaluate hybrid coherence. Furthermore, we propose HyDRA, a specialized memory architecture that compresses memory into tokens and utilizes a spatiotemporal relevance-driven retrieval mechanism. By selectively attending to relevant motion cues, HyDRA effectively preserves the identity and motion of hidden subjects. Extensive experiments on HM-World demonstrate that our method significantly outperforms state-of-the-art approaches in both dynamic subject consistency and overall generation quality.
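
    The abstract describes HyDRA's relevance-driven retrieval only at a high level. As a rough illustration of top-k retrieval over a bank of compressed memory tokens, here is a minimal sketch; the function name, shapes, and cosine-similarity scoring are assumptions, not the paper's implementation:

```python
import numpy as np

def retrieve_memory_tokens(query, memory_bank, k=4):
    """Top-k relevance-driven retrieval over a compressed memory bank.

    query:       (d,) feature vector for the current generation step.
    memory_bank: (n, d) matrix of compressed memory tokens.
    Returns the k tokens most similar to the query plus their scores.
    """
    q = query / np.linalg.norm(query)
    m = memory_bank / np.linalg.norm(memory_bank, axis=1, keepdims=True)
    scores = m @ q                       # cosine similarity per token
    top = np.argsort(scores)[::-1][:k]   # indices of the k best matches
    return memory_bank[top], scores[top]
```

    In the paper's setting the scores would come from a learned spatiotemporal relevance model rather than raw cosine similarity, with the retrieved tokens attended to by the video backbone.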

  2. ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

    Multi-shot video generation is crucial for long narrative storytelling, yet current bidirectional architectures suffer from limited interactivity and high latency. We propose ShotStream, a novel causal multi-shot architecture that enables interactive storytelling and efficient on-the-fly frame generation. By reformulating the task as next-shot generation conditioned on historical context, ShotStream allows users to dynamically instruct ongoing narratives via streaming prompts. We achieve this by first fine-tuning a text-to-video model into a bidirectional next-shot generator, which is then distilled into a causal student via Distribution Matching Distillation. To overcome the challenges of inter-shot consistency and error accumulation inherent in autoregressive generation, we introduce two key innovations. First, a dual-cache memory mechanism preserves visual coherence: a global context cache retains conditional frames for inter-shot consistency, while a local context cache holds generated frames within the current shot for intra-shot consistency; a RoPE discontinuity indicator explicitly distinguishes the two caches to eliminate ambiguity. Second, to mitigate error accumulation, we propose a two-stage distillation strategy. This begins with intra-shot self-forcing conditioned on ground-truth historical shots and progressively extends to inter-shot self-forcing using self-generated histories, effectively bridging the train-test gap. Extensive experiments demonstrate that ShotStream generates coherent multi-shot videos with sub-second latency, achieving 16 FPS on a single GPU. It matches or exceeds the quality of slower bidirectional models, paving the way for real-time interactive storytelling. Training and inference code, as well as the models, are available on our

  3. PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

    Autoregressive video diffusion models have demonstrated remarkable progress, yet they remain bottlenecked by intractable linear KV-cache growth, temporal repetition, and compounding errors during long-video generation. To address these challenges, we present PackForcing, a unified framework that efficiently manages the generation history through a novel three-partition KV-cache strategy. Specifically, we categorize the historical context into three distinct types: (1) Sink tokens, which preserve early anchor frames at full resolution to maintain global semantics; (2) Mid tokens, which achieve a massive spatiotemporal compression (32x token reduction) via a dual-branch network fusing progressive 3D convolutions with low-resolution VAE re-encoding; and (3) Recent tokens, kept at full resolution to ensure local temporal coherence. To strictly bound the memory footprint without sacrificing quality, we introduce a dynamic top-k context selection mechanism for the mid tokens, coupled with a continuous Temporal RoPE Adjustment that seamlessly re-aligns position gaps caused by dropped tokens with negligible overhead. Empowered by this principled hierarchical context compression, PackForcing can generate coherent 2-minute, 832x480 videos at 16 FPS on a single H200 GPU. It achieves a bounded KV cache of just 4 GB and enables a remarkable 24x temporal extrapolation (5s to 120s), operating effectively either zero-shot or trained on merely 5-second clips. Extensive results on VBench demonstrate state-of-the-art temporal consistency (26.07) and dynamic degree (56.25), proving that short-video supervision is sufficient for high-quality, long-video synthesis. https://github.com/ShandaAI/PackForcing
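
    The three-partition KV-cache idea (sink / mid / recent, with top-k selection over the mid tokens) can be caricatured in a few lines. This is a toy sketch with illustrative names; the actual system compresses mid tokens through a dual-branch network rather than simply dropping the low-scoring ones:

```python
def partition_kv_context(frames, scores, n_sink=2, n_recent=3, k_mid=2):
    """Toy three-partition split of a frame history.

    sink:   earliest frames, kept in full to anchor global semantics.
    recent: latest frames, kept in full for local temporal coherence.
    mid:    everything in between; only the top-k by relevance survive.
    frames and scores are parallel lists; all names are illustrative.
    """
    sink = frames[:n_sink]
    recent = frames[len(frames) - n_recent:]
    mid = list(zip(frames[n_sink:len(frames) - n_recent],
                   scores[n_sink:len(frames) - n_recent]))
    kept = [f for f, _ in sorted(mid, key=lambda fs: fs[1], reverse=True)[:k_mid]]
    kept.sort(key=frames.index)  # restore temporal order after selection
    return {"sink": sink, "mid": kept, "recent": recent}
```

    The abstract's Temporal RoPE Adjustment would then re-align the position gaps left by the dropped mid tokens.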

  4. Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

    Equipping Large Language Model (LLM) agents with domain-specific skills is critical for tackling complex tasks. Yet, manual authoring creates a severe scalability bottleneck. Conversely, automated skill generation often yields fragile or fragmented results because it either relies on shallow parametric knowledge or sequentially overfits to non-generalizable trajectory-local lessons. To overcome this, we introduce Trace2Skill, a framework that mirrors how human experts author skills: by holistically analyzing broad execution experience before distilling it into a single, comprehensive guide. Instead of reacting sequentially to individual trajectories, Trace2Skill dispatches a parallel fleet of sub-agents to analyze a diverse pool of executions. It extracts trajectory-specific lessons and hierarchically consolidates them into a unified, conflict-free skill directory via inductive reasoning. Trace2Skill supports both deepening existing human-written skills and creating new ones from scratch. Experiments in challenging domains, such as spreadsheet, VisionQA and math reasoning, show that Trace2Skill significantly improves upon strong baselines, including Anthropic's official xlsx skills. Crucially, this trajectory-grounded evolution does not merely memorize task instances or model-specific quirks: evolved skills transfer across LLM scales and generalize to OOD settings. For example, skills evolved by Qwen3.5-35B on its own trajectories improved a Qwen3.5-122B agent by up to 57.65 absolute percentage points on WikiTableQuestions. Ultimately, our results demonstrate that complex agent experience can be packaged into highly transferable, declarative skills -- requiring no parameter updates, no external retrieval modules, and utilizing open-source models as small as 35B parameters.

  5. MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies

    Currently, evaluating vision-language models (VLMs) in medical imaging tasks oversimplifies clinical reality by relying on pre-selected 2D images that demand significant manual labor to curate. This setup misses the core challenge of real-world diagnostics: a true clinical agent must actively navigate full 3D volumes across multiple sequences or modalities to gather evidence and ultimately support a final decision. To address this, we propose MEDOPENCLAW, an auditable runtime designed to let VLMs operate dynamically within standard medical tools or viewers (e.g., 3D Slicer). On top of this runtime, we introduce MEDFLOWBENCH, a full-study medical imaging benchmark covering multi-sequence brain MRI and lung CT/PET. It systematically evaluates medical agentic capabilities across viewer-only, tool-use, and open-method tracks. Initial results reveal a critical insight: while state-of-the-art LLMs/VLMs (e.g., Gemini 3.1 Pro and GPT-5.4) can successfully navigate the viewer to solve basic study-level tasks, their performance paradoxically degrades when given access to professional support tools due to a lack of precise spatial grounding. By bridging the gap between static-image perception and interactive clinical workflows, MEDOPENCLAW and MEDFLOWBENCH establish a reproducible foundation for developing auditable, full-study medical imaging agents.

  6. RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation

    Vision-Language Models (VLMs) have demonstrated impressive capabilities in code generation across various domains. However, their ability to replicate complex, multi-panel visualizations from real-world data remains largely unassessed. To address this gap, we introduce RealChart2Code, a new large-scale benchmark with over 2,800 instances grounded in authentic datasets and featuring tasks with clear analytical intent. Crucially, it is the first benchmark to systematically evaluate chart generation from large-scale raw data and assess iterative code refinement in a multi-turn conversational setting. Our comprehensive evaluation of 14 leading VLMs on RealChart2Code reveals significant performance degradation compared to simpler benchmarks, highlighting their struggles with complex plot structures and authentic data. Our analysis uncovers a substantial performance gap between proprietary and open-weight models and confirms that even state-of-the-art VLMs often fail to accurately replicate intricate, multi-panel charts. These findings provide valuable insights into the current limitations of VLMs and guide future research directions. We release the benchmark and code at https://github.com/Speakn0w/RealChart2Code.

  7. LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset

    In real-world domains such as self-driving, generalization to rare scenarios remains a fundamental challenge. To address this, we introduce a new dataset designed for end-to-end driving that focuses on long-tail driving events. We provide multi-view video data, trajectories, high-level instructions, and detailed reasoning traces, facilitating in-context learning and few-shot generalization. The resulting benchmark for multimodal models, such as VLMs and VLAs, goes beyond safety and comfort metrics by evaluating instruction following and semantic coherence between model outputs. The multilingual reasoning traces in English, Spanish, and Chinese are from domain experts with diverse cultural backgrounds. Thus, our dataset is a unique resource for studying how different forms of reasoning affect driving competence. Our dataset is available at: https://hf.co/datasets/kit-mrt/kitscenes-longtail

  8. Natural-Language Agent Harnesses

    Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific object. We ask whether the high-level control logic of an agent harness can instead be externalized as a portable executable artifact. We introduce Natural-Language Agent Harnesses (NLAHs), which express harness behavior in editable natural language, and Intelligent Harness Runtime (IHR), a shared runtime that executes these harnesses through explicit contracts, durable artifacts, and lightweight adapters. Across coding and computer-use benchmarks, we conduct controlled evaluations of operational viability, module ablation, and code-to-text harness migration.

  9. Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models

    As the paradigm of AI shifts from text-based LLMs to Speech Language Models (SLMs), there is a growing demand for full-duplex systems capable of real-time, natural human-computer interaction. However, the development of such models is constrained by the scarcity of high-quality, multi-speaker conversational data, as existing large-scale resources are predominantly single-speaker or limited in volume. Addressing the complex dynamics of natural dialogue, such as overlapping and back-channeling, remains a challenge, with standard processing pipelines suffering from diarization errors and ASR hallucinations. To bridge this gap, we present a robust and scalable open-source data processing pipeline designed for full-duplex models.

  10. Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models

    Recent advances in 3D generation have improved the fidelity and geometric details of synthesized 3D assets. However, due to the inherent ambiguity of single-view observations and the lack of robust global structural priors caused by limited 3D training data, the unseen regions generated by existing models are often stochastic and difficult to control, which may sometimes fail to align with user intentions or produce implausible geometries. In this paper, we propose Know3D, a novel framework that incorporates rich knowledge from multimodal large language models into 3D generative processes via latent hidden-state injection, enabling language-controllable generation of the back-view for 3D assets. We utilize a VLM-diffusion-based model, where the VLM is responsible for semantic understanding and guidance. The diffusion model acts as a bridge that transfers semantic knowledge from the VLM to the 3D generation model. In this way, we successfully bridge the gap between abstract textual instructions and the geometric reconstruction of unobserved regions, transforming the traditionally stochastic back-view hallucination into a semantically controllable process, demonstrating a promising direction for future 3D generation models.

  11. GenMask: Adapting DiT for Segmentation via Direct Mask

    Recent approaches for segmentation have leveraged pretrained generative models as feature extractors, treating segmentation as a downstream adaptation task via indirect feature retrieval. This implicit use suffers from a fundamental misalignment in representation. It also depends heavily on indirect feature extraction pipelines, which complicate the workflow and limit adaptation. In this paper, we argue that instead of indirect adaptation, segmentation tasks should be trained directly in a generative manner. We identify a key obstacle to this unified formulation: VAE latents of binary masks are sharply distributed, noise-robust, and linearly separable, distinct from natural image latents. To bridge this gap, we introduce a timestep sampling strategy for binary masks that emphasizes extreme noise levels for segmentation and moderate noise for image generation, enabling harmonious joint training. We present GenMask, a DiT trained to generate black-and-white segmentation masks as well as colorful images in RGB space under the original generative objective. GenMask preserves the original DiT architecture while removing the need for feature extraction pipelines tailored to segmentation tasks. Empirically, GenMask attains state-of-the-art performance on referring and reasoning segmentation benchmarks, and ablations quantify the contribution of each component.
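
    The modality-dependent timestep sampling, extreme noise levels for binary masks and moderate noise for RGB images, is easy to sketch. The cutoff ranges below are illustrative assumptions; the abstract does not specify them:

```python
import random

def sample_timestep(is_mask, t_max=1000):
    """Modality-dependent diffusion timestep sampling (toy version).

    Binary-mask latents are robust to noise, so masks are trained at
    extreme (high-noise) timesteps; natural images get moderate ones.
    The 80% / 20-80% cutoffs are illustrative assumptions.
    """
    if is_mask:
        return random.randint(int(0.8 * t_max), t_max)          # heavy noise
    return random.randint(int(0.2 * t_max), int(0.8 * t_max))   # moderate noise
```

    Both modalities can then share one DiT and one generative objective, with only the sampled timestep distribution differing per modality.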

  12. Diffutron: A Masked Diffusion Language Model for Turkish Language

    Masked Diffusion Language Models (MDLMs) have emerged as a compelling non-autoregressive alternative to standard large language models; however, their application to morphologically rich languages remains limited. In this paper, we introduce Diffutron, a masked diffusion language model specifically designed for Turkish. Our approach leverages a resource-efficient training pipeline, starting with LoRA-based continual pre-training of a multilingual encoder on a large-scale corpus. To enable generative capabilities, we employ a progressive instruction-tuning strategy, sequentially adapting the model on general and task-specific instruction sets. Experimental results across comprehensive benchmarks demonstrate that, despite its compact size, our model achieves competitive performance compared to existing multi-billion-parameter baselines. These findings validate the effectiveness of masked diffusion modeling combined with multi-stage tuning for non-autoregressive text generation in Turkish.

  13. Learning to Commit: Generating Organic Pull Requests via Online Repository Memory

    Large language model (LLM)-based coding agents achieve impressive results on controlled benchmarks yet routinely produce pull requests that real maintainers reject. The root cause is not functional incorrectness but a lack of organicity: generated code ignores project-specific conventions, duplicates functionality already provided by internal APIs, and violates implicit architectural constraints accumulated over years of development. Simply exposing an agent to the latest repository snapshot is not enough: the snapshot reveals the final state of the codebase, but not the repository-specific change patterns by which that state was reached. We introduce Learning to Commit, a framework that closes this gap through Online Repository Memory. Given a repository with a strict chronological split, the agent performs supervised contrastive reflection on earlier commits: it blindly attempts to resolve each historical issue, compares its prediction against the oracle diff, and distils the gap into a continuously growing set of skills-reusable patterns capturing coding style, internal API usage, and architectural invariants. When a new PR description arrives, the agent conditions its generation on these accumulated skills, producing changes grounded in the project's own evolution rather than generic pretraining priors. Evaluation is conducted on genuinely future, merged pull requests that could not have been seen during the skill-building phase, and spans multiple dimensions including functional correctness, code-style consistency, internal API reuse rate, and modified-region plausibility. Experiments on an expert-maintained repository with rich commit history show that Online Repository Memory effectively improves organicity scores on held-out future tasks.

  14. Composer 2 Technical Report

    Composer 2 is a specialized model designed for agentic software engineering. The model demonstrates strong long-term planning and coding intelligence while maintaining the ability to efficiently solve problems for interactive use. The model is trained in two phases: first, continued pretraining to improve the model's knowledge and latent coding ability, followed by large-scale reinforcement learning to improve end-to-end coding performance through stronger reasoning, accurate multi-step execution, and coherence on long-horizon realistic coding problems. We develop infrastructure to support training in the same Cursor harness that is used by the deployed model, with equivalent tools and structure, and use environments that match real problems closely. To measure the ability of the model on increasingly difficult tasks, we introduce a benchmark derived from real software engineering problems in large codebases including our own. Composer 2 is a frontier-level coding model and demonstrates a process for training strong domain-specialized models. On our CursorBench evaluations the model achieves a major improvement in accuracy compared to previous Composer models (61.3). On public benchmarks the model scores 61.7 on Terminal-Bench and 73.7 on SWE-bench Multilingual in our harness, comparable to state-of-the-art systems.

  15. Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?

    Chain-of-thought (CoT) reasoning has been proposed as a transparency mechanism for large language models in safety-critical deployments, yet its effectiveness depends on faithfulness (whether models accurately verbalize the factors that actually influence their outputs), a property that prior evaluations have examined in only two proprietary models, finding acknowledgment rates as low as 25% for Claude 3.7 Sonnet and 39% for DeepSeek-R1. To extend this evaluation across the open-weight ecosystem, this study tests 12 open-weight reasoning models spanning 9 architectural families (7B-685B parameters) on 498 multiple-choice questions from MMLU and GPQA Diamond, injecting six categories of reasoning hints (sycophancy, consistency, visual pattern, metadata, grader hacking, and unethical information) and measuring the rate at which models acknowledge hint influence in their CoT when hints successfully alter answers. Across 41,832 inference runs, overall faithfulness rates range from 39.7% (Seed-1.6-Flash) to 89.9% (DeepSeek-V3.2-Speciale) across model families, with consistency hints (35.5%) and sycophancy hints (53.9%) exhibiting the lowest acknowledgment rates. Training methodology and model family predict faithfulness more strongly than parameter count, and keyword-based analysis reveals a striking gap between thinking-token acknowledgment (approximately 87.5%) and answer-text acknowledgment (approximately 28.6%), suggesting that models internally recognize hint influence but systematically suppress this acknowledgment in their outputs. These findings carry direct implications for the viability of CoT monitoring as a safety mechanism and suggest that faithfulness is not a fixed property of reasoning models but varies systematically with architecture, training method, and the nature of the influencing cue.
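
    The headline metric, the rate at which a model acknowledges hint influence among runs where the hint actually changed the answer, reduces to a simple ratio. A minimal sketch with assumed field names:

```python
def faithfulness_rate(runs):
    """Acknowledgment rate over hint-affected runs.

    runs: list of dicts with boolean fields 'hint_changed_answer'
    (did the injected hint alter the final answer?) and 'acknowledged'
    (does the CoT mention the hint's influence?). Field names are assumed.
    Returns None when no run was affected, since the rate is undefined.
    """
    affected = [r for r in runs if r["hint_changed_answer"]]
    if not affected:
        return None
    return sum(r["acknowledged"] for r in affected) / len(affected)
```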

Techmeme (15)

  1. OpenAI introduces a Codex plugin for Claude Code, letting users invoke Codex from inside Claude Code to review code or delegate tasks (Vaibhav (VB) Srivastav/@reach_vb)

    Vaibhav (VB) Srivastav / @reach_vb : OpenAI introduces a Codex plugin for Claude Code, letting users invoke Codex from inside Claude Code to review code or delegate tasks —  If you already use Claude Code, this Codex plugin gives you a simple way to pull Codex into the same workflow.  It is useful for three things …

  2. Leaked January presentation: Coatue estimated that Anthropic would lose $14B in EBITDA on $18B in revenue in 2026 and reach a $1.995T valuation in 2030 (Eric Newcomer/Newcomer)

    Eric Newcomer / Newcomer : Leaked January presentation: Coatue estimated that Anthropic would lose $14B in EBITDA on $18B in revenue in 2026 and reach a $1.995T valuation in 2030 —  We talk about it & more on the Cerebral Valley Show  —  In a presentation to prospective investors in January, Coatue offered a rare look …

  3. Alibaba releases its Qwen3.5-Omni omnimodal LLM with support for 10+ hours of audio input, saying the Plus variant surpasses Gemini 3.1 Pro on audio benchmarks (Qwen)

    Qwen : Alibaba releases its Qwen3.5-Omni omnimodal LLM with support for 10+ hours of audio input, saying the Plus variant surpasses Gemini 3.1 Pro on audio benchmarks —  Qwen3.5-Omni is Qwen's latest generation of fully omnimodal LLM, supporting the understanding of text, images, audio, and audio-visual content.

  4. Levels.fyi: median base-salary offers for US software engineers at VC-backed startups have risen 25% to $200K since 2022; total compensation has risen just 18% (Katherine Bindley/Wall Street Journal)

    Katherine Bindley / Wall Street Journal : Levels.fyi: median base-salary offers for US software engineers at VC-backed startups have risen 25% to $200K since 2022; total compensation has risen just 18% —  Young tech companies once might have complemented lower salaries with generous equity packages.  Now they're upping base pay.

  5. Gurman: Apple pulls Apple Intelligence in China, after accidentally launching it in the country; there is no imminent launch as Apple has no regulatory approval (Ryan Christoffel/9to5Mac)

    Ryan Christoffel / 9to5Mac : Gurman: Apple pulls Apple Intelligence in China, after accidentally launching it in the country; there is no imminent launch as Apple has no regulatory approval —  Apple Intelligence first launched in the US in October 2024, but now after a nearly 18-month wait, Apple's AI features appear to be rolling out in China too.

  6. Sources: US prosecutors are exploring whether some prediction market bets, including on the capture of Nicolás Maduro, violated insider trading and other laws (Kara Scannell/CNN)

    Kara Scannell / CNN : Sources: US prosecutors are exploring whether some prediction market bets, including on the capture of Nicolás Maduro, violated insider trading and other laws —  Federal prosecutors in Manhattan are exploring whether certain lucrative bets placed on prediction markets …

  7. Meta is testing an Instagram Plus subscription in a few countries, offering features including anonymous Story viewing and extended 48-hour Story durations (Aisha Malik/TechCrunch)

    Aisha Malik / TechCrunch : Meta is testing an Instagram Plus subscription in a few countries, offering features including anonymous Story viewing and extended 48-hour Story durations —  Meta has begun testing a premium subscription on Instagram in a few countries, the company confirmed to TechCrunch on Monday.

  8. Quinnipiac poll: 55% of Americans say AI will do more harm than good in their day-to-day lives, and 65% oppose building data centers in their community (Emily Birnbaum/Bloomberg)

    Emily Birnbaum / Bloomberg : Quinnipiac poll: 55% of Americans say AI will do more harm than good in their day-to-day lives, and 65% oppose building data centers in their community —  Americans are increasingly turning against artificial intelligence, with growing majorities saying they fear the fast-moving technology …

  9. Fermi shares drop 12%+ after the data center real estate company reported a $486M YTD net loss, amid concerns over a lack of a tenant for its Texas data center (Financial Times)

    Financial Times : Fermi shares drop 12%+ after the data center real estate company reported a $486M YTD net loss, amid concerns over a lack of a tenant for its Texas data center —  Data centre property group faces investor concerns around lack of tenant revenue  —  Shares in data centre real estate group Fermi plunged …

  10. Tel Aviv-based Sett, which builds AI agents to automate game marketing, raised a $30M Series B led by Greenfield Partners, bringing its total funding to $57M (Meir Orbach/CTech)

    Meir Orbach / CTech : Tel Aviv-based Sett, which builds AI agents to automate game marketing, raised a $30M Series B led by Greenfield Partners, bringing its total funding to $57M —  The Israeli startup targets billions spent on user acquisition with agent-based automation.  —  Sett, which develops …

  11. Valinor, which aims to use smart contracts to replace manual lending processes in the private credit industry, raised a $25M seed led by Castle Island Ventures (Ben Weiss/Fortune)

    Ben Weiss / Fortune : Valinor, which aims to use smart contracts to replace manual lending processes in the private credit industry, raised a $25M seed led by Castle Island Ventures —  Many corners of finance—stock exchanges, banks, and payments firms—are embracing digital assets, but the private credit industry …

  12. Sources: E*Trade is in talks to lead SpaceX IPO share sale to retail investors; Robinhood and SoFi have pitched for roles but SpaceX is mulling cutting them out (Reuters)

    Reuters : Sources: E*Trade is in talks to lead SpaceX IPO share sale to retail investors; Robinhood and SoFi have pitched for roles but SpaceX is mulling cutting them out —  Morgan Stanley's E*Trade is in talks with SpaceX to take the lead in selling the rocket maker's shares to everyday U.S. investors …

  13. State of AI safety: as capabilities grow and models can monitor other models, issues like adversarial robustness persist and society is still not ready for AI (Boaz Barak/Windows On Theory)

    Boaz Barak / Windows On Theory : State of AI safety: as capabilities grow and models can monitor other models, issues like adversarial robustness persist and society is still not ready for AI —  Here is a quick overview of my intuitions on where we are with AI safety in early 2026:  — So far, we continue to see exponential improvements in capabilities.

  14. Match Group agrees to settle an FTC lawsuit claiming it illegally shared user data from the OkCupid app with facial recognition tech company Clarifai in 2014 (Jonathan Stempel/Reuters)

    Jonathan Stempel / Reuters : Match Group agrees to settle an FTC lawsuit claiming it illegally shared user data from the OkCupid app with facial recognition tech company Clarifai in 2014 —  Match Group (MTCH.O) agreed to settle a U.S. Federal Trade Commission lawsuit claiming it gave an outside company unauthorized access …

  15. A Delaware judge reassigns Elon Musk cases over "disproportionate media attention" after allegations she "liked" a LinkedIn post celebrating a Musk legal defeat (Sujeet Indap/Financial Times)

    Sujeet Indap / Financial Times : A Delaware judge reassigns Elon Musk cases over “disproportionate media attention” after allegations she “liked” a LinkedIn post celebrating a Musk legal defeat —  Chancellor Kathaleen McCormick denies bias but cites media glare as risk to justice

Solidot (15)

  1. Microsoft Copilot Adds Ads While Fixing Typos in PRs

    A developer found that when Microsoft's AI assistant Copilot was used to fix a typo in a pull request, it spontaneously added an advertisement. A search of GitHub shows that tens of thousands of PRs already contain the same ad: "Quickly spin up Copilot coding agent tasks from anywhere on your macOS or Windows machine with Raycast". Developers find Microsoft's behavior unacceptable.

  2. Jupiter's lightning releases as much energy as an atomic bomb

    Scientists used instruments aboard NASA's Juno probe to measure lightning on Jupiter and found that it releases 100 to 10,000 times as much energy as lightning on Earth. A single terrestrial strike releases roughly one billion joules, so Jupiter's strongest bolts release about 10 trillion joules, equivalent to 2,400 tons of TNT, or about one sixth the yield of the Hiroshima bomb. Juno's observations of lightning frequency in Jovian storms show an average of three strikes per second, meaning a storm releases the energy of multiple atomic bombs every minute. Lightning is thought to have promoted the evolution of life on Earth, and lightning on Jupiter may likewise drive complex chemical reactions.
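    The figures above can be sanity-checked with a few lines of arithmetic (the conversion constants are assumptions, not from the article: 1 ton of TNT = 4.184e9 J, Hiroshima yield ~15 kilotons of TNT):

```python
# Back-of-the-envelope check of the Jupiter lightning figures.
# Assumed constants: 1 ton TNT = 4.184e9 J; Hiroshima ~15 kt TNT.
earth_bolt_j = 1e9                    # ~1 billion joules per Earth strike
jupiter_bolt_j = earth_bolt_j * 1e4   # strongest Jovian bolts: 10,000x Earth
ton_tnt_j = 4.184e9
hiroshima_j = 15_000 * ton_tnt_j

print(f"{jupiter_bolt_j / ton_tnt_j:.0f} tons of TNT")     # ~2390 tons
print(f"{jupiter_bolt_j / hiroshima_j:.2f} of Hiroshima")  # ~0.16, about 1/6
```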

  3. Bees and hummingbirds sip small amounts of alcohol on the job

    Bees and hummingbirds both drink alcohol: their food, nectar, contains trace amounts of it. Researchers at UC Berkeley found that ethanol-containing nectar is quite common, detecting ethanol in 26 of the 29 plant nectar samples they analyzed. Most samples had extremely low concentrations, but one reached 0.056%, roughly 0.1 proof, barely qualifying as an alcoholic drink. That sounds negligible, but relative to a pollinator's body weight the daily intake is not small. An Anna's hummingbird drinks 0.5 to 1.5 times its body weight in nectar each day; based on that intake, the researchers estimate a hummingbird consumes about 0.2 grams of ethanol per kilogram of body weight per day. Because the birds constantly flit from flower to flower, the alcohol is metabolized quickly, so they are unlikely to get drunk. Lab tests showed hummingbirds willingly drink nectar with around 1% alcohol but start avoiding it as the concentration rises, and flower visits drop sharply at around 2%. They, too, know to drink in moderation.

  4. Dolby's suit against Snap challenges AV1's royalty-free claim

    The AOMedia consortium, whose members include Amazon, Apple, Google, Microsoft, Mozilla, and Netflix, developed the royalty-free open codec AOMedia Video 1 (AV1). But a patent infringement lawsuit that Dolby Laboratories filed against Snap challenges AV1's royalty-free claim. Dolby's complaint alleges that AV1 uses technologies Dolby has patented and never agreed to license free of charge and royalty-free. Dolby argues that AOMedia does not own all the patents covering the AV1 codec, and that AV1 incorporates technology also found in HEVC, technology subject to existing third-party patent rights and licensing obligations.

  5. AI and bot traffic has overtaken human traffic

    According to Human Security's report "The State of AI Traffic", AI and bot traffic has officially overtaken human traffic. The report says automated traffic, including AI, grew nearly eight times faster than human activity in 2025. The popularity of large models such as OpenAI's ChatGPT, Anthropic's Claude, and Google's Gemini drove the growth: AI traffic rose 187% in 2025. Cloudflare CEO Matthew Prince said earlier at SXSW that before the generative-AI era about 20% of internet traffic came from bots, driven mainly by Google's web crawlers.

  6. What DNA tells us, and where it falls short

    In 2018, more than forty years after the "Golden State Killer" case went cold, a woman interested in her family history mailed her saliva to a genealogy company for sequencing. Her DNA became the key to cracking the case: the killer was a distant relative, and investigators eventually caught former police officer Joseph James DeAngelo Jr., who in 2020 pleaded guilty to 13 counts of murder and 13 counts of kidnapping. Millions of people have sent DNA samples to companies such as 23andMe and AncestryDNA to learn about their ancestry, uncover health risks, or find lost relatives. But the truths DNA reveals can upend our understanding of family and identity: you may discover that your parents are not your biological parents, or that a sibling is not a full sibling. DNA also shows we are more closely connected than previously thought: the most recent common ancestor of all humans lived only a few thousand years ago, and we are all related by blood. Americans have long opposed a national DNA database on privacy grounds, but voluntary consumer genetic testing has created something similar: because of shared DNA, sequencing just 1% of the population would make everyone searchable, and about 7% of Americans have already been sequenced. Scientists also find that what DNA reveals remains limited: a diabetes risk of 25% versus 20% makes little practical difference and does not mean you are high-risk, so using genetic screening of embryos to lower diabetes risk from 35% to 30% means little.

  7. NASA astronaut lost the ability to speak on the space station; cause unknown

    NASA ended the Crew-11 mission early this January because one of its astronauts developed a health problem. Crew-11 launched on August 1, 2025 and was scheduled to return around February 20, 2026; its four astronauts — commander Zena Cardman, 38, pilot Mike Fincke, 58, Japanese astronaut Kimiya Yui, 55, and Russian cosmonaut Oleg Platonov, 39 — returned to Earth more than a month early. It was the first emergency evacuation for a medical problem in the International Space Station's 25-year history. Last month NASA disclosed that the affected astronaut was the 58-year-old Fincke, but gave no further details. Last Friday Fincke revealed that he suddenly lost the ability to speak aboard the station and that doctors still do not know why. It is unclear how long the aphasia lasted or when he regained his speech. NASA has not commented.

  8. Ransomware group targets Persian-language systems

    A ransomware group known as TeamPCP has tried to involve itself in the Iran war, releasing a worm designed to wipe data on infected systems whose timezone is Iranian or whose default language is Persian. TeamPCP began using the worm late last year to infect cloud environments, steal credentials, and extort victims via Telegram. Security firm Flare reported in January that the cloud services Azure (61%) and AWS (36%) together accounted for 97% of systems infected by the TeamPCP worm. TeamPCP was recently found deploying new malware: if it detects that the user's timezone and locale match Iran, it executes a data-wiping attack; if the victim is in Iran and has access to a Kubernetes cluster, it wipes data on every node of the cluster, otherwise only the local machine.

  9. OpenAI uses Cloudflare tooling to block AI scrapers

    OpenAI has been found using a Cloudflare program to block AI scrapers. Users discovered that every ChatGPT message triggers a Cloudflare Turnstile check, which verifies that the user is running a real browser and has loaded the ChatGPT React app. If a bot forges a browser fingerprint but does not render the actual ChatGPT SPA, it fails the Turnstile check. An OpenAI engineer responded that the measure ensures the product is not abused by bots, web scrapers, and the like. The explanation was seen as richly ironic, since OpenAI's own AI crawlers place a heavy burden on websites.

  10. Cooking tomatoes and carrots efficiently

    Tomatoes and carrots are major dietary sources of carotenoids, which help reduce the risk of several chronic diseases, including cardiovascular disease and cancer. The health impact of carotenoids depends not only on their concentration in food but also on their bioaccessibility: how much actually survives digestion and can be absorbed by the gut. Bioaccessibility varies markedly with cooking method. Heat treatment raises it by breaking down cell structures and promoting micelle formation, but excessive temperature or time can cause degradation and isomerization. In a study published in Food Chemistry, researchers compared the bioaccessibility of tomatoes and carrots cooked in an air fryer, a conventional oven, and a microwave. Oven-cooked carrots showed up to a nine-fold increase in total carotenoid bioaccessibility; for tomatoes, either an air fryer (190°C for 10 minutes) or a conventional oven (180°C for 20 minutes) yielded the highest bioaccessibility. For carrots, microwaving was the most efficient method, cutting electricity use by 96%; for tomatoes, the air fryer achieved both the highest bioaccessibility and an 80% reduction in energy use.

  11. Google's TurboQuant compression algorithm sharply cuts LLM memory use

    Google Research has released TurboQuant, a compression algorithm that greatly reduces the memory footprint of large models while improving speed and preserving accuracy. TurboQuant targets the size of the key-value cache, described as a "digital cheat sheet" that stores important information to avoid recomputation. Large models do not understand anything; they simulate understanding by mapping tokens to vectors that encode textual semantics. Model vectors are typically encoded in Cartesian (XYZ-style) coordinates; a system implementing TurboQuant compression converts them to polar form, reducing each vector to two kinds of information: a radius (the core data magnitude) and a direction (the data's meaning). In Cartesian coordinates a position might be encoded as "go 3 blocks east and 4 blocks north"; in polar coordinates the same information becomes "go 5 blocks at a 37-degree heading", which simplifies the representation and saves computation. Google's early tests showed TurboQuant delivering up to 8x performance gains in some tests, with memory use cut to one sixth and no loss of quality. TurboQuant could lower the running cost and memory footprint of AI models, but it may also encourage more complex models, so it may do little to bring down memory prices.
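    The coordinate change in the "blocks" example above can be sketched in a few lines of Python. This is only an illustration of the Cartesian-to-polar idea; the 2-D simplification and function names are mine, not Google's TurboQuant implementation:

```python
import math

def to_polar(east: float, north: float) -> tuple[float, float]:
    """Return (radius, heading in degrees clockwise from north)."""
    radius = math.hypot(east, north)                 # "how strong": magnitude
    heading = math.degrees(math.atan2(east, north))  # "which way": direction
    return radius, heading

# "3 blocks east, 4 blocks north" becomes "5 blocks at ~37 degrees".
r, h = to_polar(3.0, 4.0)
print(round(r, 1), round(h, 1))  # 5.0 36.9
```

    The quantizer can then treat the radius and the direction separately, which is the split the article describes.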

  12. Austria plans to ban social media for children under 14

    Following Australia, Denmark, Malaysia, and Norway, Austria also plans to strictly limit social media use by children under 14, citing addictive algorithms and content harmful to children. The government plans to finish a draft law by the end of June; enforcement and age-verification details have not been finalized. Vice Chancellor Andreas Babler of the Social Democrats said the government cannot stand by while social media platforms get children addicted and harm them, and that social media should be treated like alcohol or tobacco.

  13. In space, sperm tumbles like an out-of-control astronaut

    According to a study published in Communications Biology, sperm in space tumbles like an out-of-control astronaut and cannot find its way to the egg. The researchers used a 3D rotator to simulate microgravity and placed human, mouse, and pig sperm samples in a maze simulating the female reproductive tract; for ethical reasons, no actual egg was placed in the maze. Compared with controls, about 40% fewer human sperm exposed to microgravity successfully navigated the maze. Adding progesterone helped the sperm overcome their disorientation; the researchers believe that is because eggs also release progesterone, which helps guide sperm.

  14. Windows PCs crash three times as often as Macs

    According to Omnissa's report "2026 State of Digital Workspace", based on 2025 telemetry aggregated from customers worldwide in retail, healthcare, finance, education, government, and other sectors, Windows PCs crash far more often than Macs. Windows devices were force-shut-down 3.1 times as often as Macs; Windows applications became unresponsive 7.5 times as often as macOS applications and needed restarting three times as often. In healthcare and pharmaceuticals, more than half of Windows and Android devices were five major versions behind the latest OS, likely making them more vulnerable to malware and more bug-prone. In education, more than half of desktops and mobile devices were unencrypted, putting student privacy at greater risk. Macs stay in service longer, replaced on average every five years versus every three years for Windows PCs. Apple's M-series chips ran at an average of 40.1°C, while Intel processors averaged 65.2°C.

  15. "Darkness" moves faster than light

    According to a study published in Nature, direct measurements of "dark spots" in light waves confirm that they move faster than light. The dark spots are tiny holes in the wave structure known as vortices, which are common in ocean waves, airflow, and even coffee. In the 1970s it was predicted that vortices move faster than the wave that forms them; a team at the Technion in Israel has now confirmed the prediction experimentally. Einstein's relativity establishes the vacuum speed of light as a limit, but that limit applies to massive matter and to signals carrying energy or information. An optical vortex has no mass and carries no energy or information, so it does not violate relativity. A vortex in a light wave is a "zero point": a position where the amplitude drops to zero, a point of complete darkness in the light field.
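    The "zero point" can be made concrete with the standard textbook form of an optical vortex (this expression is general vortex-beam theory, an assumption on my part, not taken from the Nature paper). Near the beam axis, a vortex of topological charge $\ell$ has a transverse field

```latex
E(r,\theta) \propto r^{|\ell|}\, e^{i\ell\theta},
```

    so the amplitude $|E|$ vanishes at $r = 0$ (the dark spot) while the phase winds by $2\pi\ell$ around it: the vortex is a phase singularity in the field, not a physical object, which is why its motion can outpace light without carrying energy or information.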