OrangeBot.AI Digest — 2026-03-13
90 headlines across 8 sources, aggregated for the day.
Hacker News(15)
- Elon Musk pushes out more xAI founders as AI coding effort falters (www.ft.com)
- John Carmack about open source and anti-AI activists (twitter.com)
- Your phone is an entire computer (medhir.com)
- Show HN: Channel Surfer – Watch YouTube like it’s cable TV (channelsurfer.tv)
- The Wyden Siren Goes Off Again: We'll Be "Stunned" by NSA Under Section 702 (www.techdirt.com)
- Meta Platforms: Lobbying, dark money, and the App Store Accountability Act (github.com)
- Can I run AI locally? (www.canirun.ai)
- E2E encrypted messaging on Instagram will no longer be supported after 8 May (help.instagram.com)
- Nanny state discovers Linux, demands it check kids' IDs before booting (www.theregister.com)
- Qatar helium shutdown puts chip supply chain on a two-week clock (www.tomshardware.com)
- TUI Studio – visual terminal UI design tool (tui.studio)
- Source code of Swedish e-government services has been leaked (darkwebinformer.com)
- Okmain: How to pick an OK main colour of an image (dgroshev.com)
- Bucketsquatting is finally dead (onecloudplease.com)
GitHub Trending(15)
- microsoft / BitNet
- langflow-ai / openrag
- lightpanda-io / browser
- obra / superpowers
- public-apis / public-apis
- promptfoo / promptfoo
- msitarzewski / agency-agents
- dolthub / dolt
- google / A2UI
- fishaudio / fish-speech
- alibaba / page-agent
- anthropics / claude-plugins-official
- AstrBotDevs / AstrBot
- vectorize-io / hindsight
- InsForge / InsForge
Product Hunt(15)
- GradPipe
Find engineers who never apply by actual github code
- ClawMote
One-hand OpenClaw control via voice
- KingCoding
Run Claude, Codex & Cursor in parallel from one dashboard
- GhostDesk
Real-time AI overlay for meetings & invisible to screenshare
- MascotVibe
Generate & animate brand mascots in minutes
- Hyper
Perfect memory for every real-world conversation
- MTIA 300
Meta's 3rd-gen custom AI chips for GenAI inference
- Fowel by Hackmamba
Reduce documentation review time by 80% instantly
- LocalPDF.io
Process your legal/medical/financial documents locally
- BurnLink
Share encrypted files that are ephemeral
- Solo Voice
Private by architecture, not by promise.
- Manus Agents for Telegram
Personal AI Agent in Your Chat
- Ask Maps by Google
Ask Maps questions, drive with immersive navigation.
- Mozzie
Codex Claude Gemini CLI parallel agents orchestration
- Pinnacle
Turn your phone into a brain performance coach
Hugging Face(15)
- Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training
Humans perceive and understand real-world spaces through a stream of visual observations. Therefore, the ability to maintain and update spatial evidence from potentially unbounded video streams in a streaming fashion is essential for spatial intelligence. The core challenge is not simply longer context windows but how spatial information is selected, organized, and retained over time. In this paper, we propose Spatial-TTT towards streaming visual-based spatial intelligence with test-time training (TTT), which adapts a subset of parameters (fast weights) to capture and organize spatial evidence over long-horizon scene videos. Specifically, we design a hybrid architecture and adopt large-chunk updates in parallel with sliding-window attention for efficient spatial video processing. To further promote spatial awareness, we introduce a spatial-predictive mechanism applied to TTT layers with 3D spatiotemporal convolution, which encourages the model to capture geometric correspondence and temporal continuity across frames. Beyond architecture design, we construct a dataset with dense 3D spatial descriptions, which guides the model to update its fast weights to memorize and organize global 3D spatial signals in a structured manner. Extensive experiments demonstrate that Spatial-TTT improves long-horizon spatial understanding and achieves state-of-the-art performance on video spatial benchmarks. Project page: https://liuff19.github.io/Spatial-TTT.
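The fast-weight idea in the abstract above can be illustrated with a minimal sketch: a single linear "fast weight" matrix is updated by one gradient step per incoming chunk on a self-supervised prediction loss. Everything here (the linear fast weights, the toy feature/target stream, the learning rate) is an invented stand-in, not the paper's architecture.

```python
import numpy as np

def ttt_adapt(chunks, dim, lr=0.1):
    """Illustrative test-time-training loop: fast weights W receive one
    large-chunk gradient update per incoming chunk, with no offline training."""
    W = np.zeros((dim, dim))
    losses = []
    for x, y in chunks:                  # x: (T, dim) features, y: (T, dim) targets
        err = x @ W - y                  # prediction error under current fast weights
        losses.append(float((err ** 2).mean()))
        W -= lr * (x.T @ err) / len(x)   # adapt the fast weights at test time
    return W, losses

# toy stream: targets follow a fixed linear map the fast weights must absorb
rng = np.random.default_rng(0)
A_true = rng.normal(size=(8, 8)) * 0.5
chunks = [(x, x @ A_true) for x in (rng.normal(size=(20, 8)) for _ in range(10))]
W, losses = ttt_adapt(chunks, dim=8)
assert losses[-1] < losses[0]            # the model adapts as the stream arrives
```

The point of the sketch is only the control flow: no gradient information crosses chunks except through the fast weights themselves, which is what lets the memory grow with an unbounded stream at constant cost.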
- Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections
Multimodal agents offer a promising path to automating complex document-intensive workflows. Yet, a critical question remains: do these agents demonstrate genuine strategic reasoning, or merely stochastic trial-and-error search? To address this, we introduce MADQA, a benchmark of 2,250 human-authored questions grounded in 800 heterogeneous PDF documents. Guided by Classical Test Theory, we design it to maximize discriminative power across varying levels of agentic abilities. To evaluate agentic behaviour, we introduce a novel evaluation protocol measuring the accuracy-effort trade-off. Using this framework, we show that while the best agents can match human searchers in raw accuracy, they succeed on largely different questions and rely on brute-force search to compensate for weak strategic planning. They fail to close the nearly 20% gap to oracle performance, persisting in unproductive loops. We release the dataset and evaluation harness to help facilitate the transition from brute-force retrieval to calibrated, efficient reasoning.
- IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention (DSA) is a representative production-grade solution: a lightweight lightning indexer selects the top-k most relevant tokens per query, reducing core attention from O(L^2) to O(Lk). However, the indexer itself retains O(L^2) complexity and must run independently at every layer, despite the fact that the resulting top-k selections are highly similar across consecutive layers. We present IndexCache, which exploits this cross-layer redundancy by partitioning layers into a small set of Full layers that run their own indexers and a majority of Shared layers that simply reuse the nearest Full layer's top-k indices. We propose two complementary approaches to determine and optimize this configuration. Training-free IndexCache applies a greedy search algorithm that selects which layers retain their indexers by directly minimizing language modeling loss on a calibration set, requiring no weight updates. Training-aware IndexCache introduces a multi-layer distillation loss that trains each retained indexer against the averaged attention distributions of all layers it serves, enabling even simple interleaved patterns to match full-indexer accuracy. Experimental results on a 30B DSA model show that IndexCache can remove 75% of indexer computations with negligible quality degradation, achieving up to 1.82× prefill speedup and 1.48× decode speedup compared to standard DSA. These positive results are further confirmed by our preliminary experiments on the production-scale GLM-5 model (Figure 1).
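The Full/Shared partition described above is easy to sketch. In this toy (the uniform-random "indexer" scores, layer counts, and nearest-layer tie-breaking are all assumptions, not the paper's code), Full layers compute their own top-k token indices and every Shared layer reuses the nearest Full layer's selection:

```python
import numpy as np

def nearest_full_layer(layer, full_layers):
    """Return the Full layer closest to `layer` (ties go to the earlier one)."""
    return min(full_layers, key=lambda f: (abs(f - layer), f))

def select_topk_indices(num_layers, full_layers, scores, k):
    """Full layers run their own indexer (top-k of per-token relevance scores);
    Shared layers reuse the nearest Full layer's cached indices."""
    cached = {f: np.argsort(scores[f])[::-1][:k] for f in full_layers}
    return [cached[nearest_full_layer(l, full_layers)] for l in range(num_layers)]

# toy example: 8 layers, indexers kept only at layers 0 and 4 (75% removed)
rng = np.random.default_rng(0)
scores = rng.random((8, 16))                 # per-layer relevance of 16 context tokens
topk = select_topk_indices(8, [0, 4], scores, k=4)
assert all((topk[l] == topk[0]).all() for l in range(3))     # layers 1-2 reuse layer 0
assert all((topk[l] == topk[4]).all() for l in range(3, 8))  # layers 3-7 reuse layer 4
```

Only the two Full layers ever run the O(L^2) indexer here; the Shared layers pay a dictionary lookup, which is the source of the claimed indexer-compute savings.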
- Video-Based Reward Modeling for Computer-Use Agents
Computer-using agents (CUAs) are becoming increasingly capable; however, it remains difficult to scale evaluation of whether a trajectory truly fulfills a user instruction. In this work, we study reward modeling from execution video: a sequence of keyframes from an agent trajectory that is independent of the agent's internal reasoning or actions. Although video-execution modeling is method-agnostic, it presents key challenges, including highly redundant layouts and subtle, localized cues that determine success. We introduce Execution Video Reward 53k (ExeVR-53k), a dataset of 53k high-quality video-task-reward triplets. We further propose adversarial instruction translation to synthesize negative samples with step-level annotations. To enable learning from long, high-resolution execution videos, we design spatiotemporal token pruning, which removes homogeneous regions and persistent tokens while preserving decisive UI changes. Building on these components, we fine-tune an Execution Video Reward Model (ExeVRM) that takes only a user instruction and a video-execution sequence to predict task success. Our ExeVRM 8B achieves 84.7% accuracy and 87.7% recall on video-execution assessment, outperforming strong proprietary models such as GPT-5.2 and Gemini-3 Pro across Ubuntu, macOS, Windows, and Android, while providing more precise temporal attribution. These results show that video-execution reward modeling can serve as a scalable, model-agnostic evaluator for CUAs.
- DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning
While large-scale diffusion models have revolutionized video synthesis, achieving precise control over both multi-subject identity and multi-granularity motion remains a significant challenge. Recent attempts to bridge this gap often suffer from limited motion granularity, control ambiguity, and identity degradation, leading to suboptimal performance on identity preservation and motion control. In this work, we present DreamVideo-Omni, a unified framework enabling harmonious multi-subject customization with omni-motion control via a progressive two-stage training paradigm. In the first stage, we integrate comprehensive control signals for joint training, encompassing subject appearances, global motion, local dynamics, and camera movements. To ensure robust and precise controllability, we introduce a condition-aware 3D rotary positional embedding to coordinate heterogeneous inputs and a hierarchical motion injection strategy to enhance global motion guidance. Furthermore, to resolve multi-subject ambiguity, we introduce group and role embeddings to explicitly anchor motion signals to specific identities, effectively disentangling complex scenes into independent controllable instances. In the second stage, to mitigate identity degradation, we design a latent identity reward feedback learning paradigm by training a latent identity reward model upon a pretrained video diffusion backbone. This provides motion-aware identity rewards in the latent space, prioritizing identity preservation aligned with human preferences. Supported by our curated large-scale dataset and the comprehensive DreamOmni Bench for multi-subject and omni-motion control evaluation, DreamVideo-Omni demonstrates superior performance in generating high-quality videos with precise controllability.
- Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation
Reinforcement learning (RL) has emerged as a promising paradigm for enhancing image editing and text-to-image (T2I) generation. However, current reward models, which act as critics during RL, often suffer from hallucinations and assign noisy scores, inherently misguiding the optimization process. In this paper, we present FIRM (Faithful Image Reward Modeling), a comprehensive framework that develops robust reward models to provide accurate and reliable guidance for faithful image generation and editing. First, we design tailored data curation pipelines to construct high-quality scoring datasets. Specifically, we evaluate editing using both execution and consistency, while generation is primarily assessed via instruction following. Using these pipelines, we collect the FIRM-Edit-370K and FIRM-Gen-293K datasets, and train specialized reward models (FIRM-Edit-8B and FIRM-Gen-8B) that accurately reflect these criteria. Second, we introduce FIRM-Bench, a comprehensive benchmark specifically designed for editing and generation critics. Evaluations demonstrate that our models achieve superior alignment with human judgment compared to existing metrics. Furthermore, to seamlessly integrate these critics into the RL pipeline, we formulate a novel "Base-and-Bonus" reward strategy that balances competing objectives: Consistency-Modulated Execution (CME) for editing and Quality-Modulated Alignment (QMA) for generation. Empowered by this framework, our resulting models FIRM-Qwen-Edit and FIRM-SD3.5 achieve substantial performance breakthroughs. Comprehensive experiments demonstrate that FIRM mitigates hallucinations, establishing a new standard for fidelity and instruction adherence over existing general models. All of our datasets, models, and code are publicly available at https://firm-reward.github.io.
- DVD: Deterministic Video Depth Estimation with Generative Priors
Existing video depth estimation faces a fundamental trade-off: generative models suffer from stochastic geometric hallucinations and scale drift, while discriminative models demand massive labeled datasets to resolve semantic ambiguities. To break this impasse, we present DVD, the first framework to deterministically adapt pre-trained video diffusion models into single-pass depth regressors. Specifically, DVD features three core designs: (i) repurposing the diffusion timestep as a structural anchor to balance global stability with high-frequency details; (ii) latent manifold rectification (LMR) to mitigate regression-induced over-smoothing, enforcing differential constraints to restore sharp boundaries and coherent motion; and (iii) global affine coherence, an inherent property bounding inter-window divergence, which enables seamless long-video inference without requiring complex temporal alignment. Extensive experiments demonstrate that DVD achieves state-of-the-art zero-shot performance across benchmarks. Furthermore, DVD successfully unlocks the profound geometric priors implicit in video foundation models using 163x less task-specific data than leading baselines. Notably, we fully release our pipeline, providing the whole training suite for SOTA video depth estimation to benefit the open-source community.
- WeEdit: A Dataset, Benchmark and Glyph-Guided Framework for Text-centric Image Editing
Instruction-based image editing aims to modify specific content within existing images according to user-provided instructions while preserving non-target regions. Beyond traditional object- and style-centric manipulation, text-centric image editing focuses on modifying, translating, or rearranging textual elements embedded within images. However, existing leading models often struggle to execute complex text editing precisely, frequently producing blurry or hallucinated characters. We attribute these failures primarily to the lack of specialized training paradigms tailored for text-centric editing, as well as the absence of large-scale datasets and standardized benchmarks necessary for a closed-loop training and evaluation system. To address these limitations, we present WeEdit, a systematic solution encompassing a scalable data construction pipeline, two benchmarks, and a tailored two-stage training strategy. Specifically, we propose a novel HTML-based automatic editing pipeline, which generates 330K training pairs covering diverse editing operations and 15 languages, accompanied by standardized bilingual and multilingual benchmarks for comprehensive evaluation. On the algorithmic side, we employ glyph-guided supervised fine-tuning to inject explicit spatial and content priors, followed by a multi-objective reinforcement learning stage to align generation with instruction adherence, text clarity, and background preservation. Extensive experiments demonstrate that WeEdit outperforms previous open-source models by a clear margin across diverse editing operations.
- ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation
Text-driven video generation has democratized film creation, but camera control in cinematic multi-shot scenarios remains a significant obstacle. Implicit textual prompts lack precision, while explicit trajectory conditioning imposes prohibitive manual overhead and often triggers execution failures in current models. To overcome this bottleneck, we propose a data-centric paradigm shift, positing that aligned (Caption, Trajectory, Video) triplets form an inherent joint distribution that can connect automated plotting and precise execution. Guided by this insight, we present ShotVerse, a "Plan-then-Control" framework that decouples generation into two collaborative agents: a VLM (Vision-Language Model)-based Planner that leverages spatial priors to obtain cinematic, globally aligned trajectories from text, and a Controller that renders these trajectories into multi-shot video content via a camera adapter. Central to our approach is the construction of a data foundation: we design an automated multi-shot camera calibration pipeline that aligns disjoint single-shot trajectories into a unified global coordinate system. This facilitates the curation of ShotVerse-Bench, a high-fidelity cinematic dataset with a three-track evaluation protocol that serves as the bedrock for our framework. Extensive experiments demonstrate that ShotVerse effectively bridges the gap between unreliable textual control and labor-intensive manual plotting, achieving superior cinematic aesthetics and generating multi-shot videos that are both camera-accurate and cross-shot consistent.
- GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing
Unified multimodal models target joint understanding, reasoning, and generation, but current image editing benchmarks are largely confined to natural images and shallow commonsense reasoning, offering limited assessment of this capability under structured, domain-specific constraints. In this work, we introduce GRADE, the first benchmark to assess discipline-informed knowledge and reasoning in image editing. GRADE comprises 520 carefully curated samples across 10 academic domains, spanning from natural science to social science. To support rigorous evaluation, we propose a multi-dimensional evaluation protocol that jointly assesses Discipline Reasoning, Visual Consistency, and Logical Readability. Extensive experiments on 20 state-of-the-art open-source and closed-source models reveal substantial limitations in current models under implicit, knowledge-intensive editing settings, leading to large performance gaps. Beyond quantitative scores, we conduct rigorous analyses and ablations to expose model shortcomings and identify the constraints within disciplinary editing. Together, GRADE pinpoints key directions for the future development of unified multimodal models, advancing the research on discipline-informed image editing and reasoning. Our benchmark and evaluation code are publicly released.
- CREATE: Testing LLMs for Associative Creativity
A key component of creativity is associative reasoning: the ability to draw novel yet meaningful connections between concepts. We introduce CREATE, a benchmark designed to evaluate models' capacity for creative associative reasoning. CREATE requires models to generate sets of paths connecting concepts in a model's parametric knowledge. Paths should have high specificity (distinctiveness and closeness of the concept connection) and high diversity (dissimilarity from other paths), and models are scored more highly if they produce a larger set of strong, diverse paths. This task shares demands of real creativity tasks like hypothesis generation, including an extremely large search space, but enables collection of a sizable benchmark with objective answer grading. Evaluation of frontier models shows that the strongest models achieve higher creative utility than others, with the high multiplicity of answers and complexity of the search making benchmark saturation difficult to achieve. Furthermore, our results illustrate that thinking models are not always more effective on our task, even with high token budgets. Recent approaches for creative prompting give some but limited additional improvement. CREATE provides a sandbox for developing new methods to improve models' capacity for associative creativity.
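A toy version of the scoring idea sketched in the abstract (reward many paths, penalize overlap between them) fits in a few lines; the Jaccard overlap measure and the weighting below are illustrative guesses, not the benchmark's actual grading rule, and the specificity term is omitted:

```python
def jaccard(a, b):
    """Overlap between two concept paths, each treated as a set of concepts."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def creative_utility(paths):
    """Toy score: more paths score higher, but pairwise overlap is penalized,
    so a large set of near-duplicate paths earns little."""
    if len(paths) < 2:
        return float(len(paths))
    pairs = [(p, q) for i, p in enumerate(paths) for q in paths[i + 1:]]
    diversity = 1.0 - sum(jaccard(p, q) for p, q in pairs) / len(pairs)
    return len(paths) * diversity

diverse = [
    ["ocean", "salt", "preservation"],
    ["ocean", "current", "turbine"],
    ["ocean", "sonar", "bat"],
]
redundant = [["ocean", "salt", "preservation"]] * 3
assert creative_utility(diverse) > creative_utility(redundant)  # diversity rewarded
```

Three paraphrases of the same path score zero here, which mirrors the benchmark's framing that volume alone, without distinct connections, is not creativity.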
- One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers
Diffusion transformers (DiTs) achieve high generative quality but lock FLOPs to image resolution, limiting principled latency-quality trade-offs, and allocate computation uniformly across input spatial tokens, wasting computation on unimportant regions. We introduce Elastic Latent Interface Transformer (ELIT), a drop-in, DiT-compatible mechanism that decouples input image size from compute. Our approach inserts a latent interface, a learnable variable-length token sequence on which standard transformer blocks can operate. Lightweight Read and Write cross-attention layers move information between spatial tokens and latents and prioritize important input regions. By training with random dropping of tail latents, ELIT learns to produce importance-ordered representations, with earlier latents capturing global structure while later ones contain information to refine details. At inference, the number of latents can be dynamically adjusted to match compute constraints. ELIT is deliberately minimal, adding two cross-attention layers while leaving the rectified flow objective and the DiT stack unchanged. Across datasets and architectures (DiT, U-ViT, HDiT, MM-DiT), ELIT delivers consistent gains. On ImageNet-1K 512px, ELIT delivers average gains of 35.3% in FID and 39.6% in FDD scores. Project page: https://snap-research.github.io/elit/
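The Read/Write latent interface and tail-latent dropping described above can be sketched as follows. This is a minimal, assumption-laden illustration (single-head cross-attention with no learned projections, an identity stand-in for the latent transformer blocks), not the released implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Single-head attention with keys == values and no projections (toy)."""
    attn = softmax(queries @ keys_values.T / np.sqrt(queries.shape[-1]))
    return attn @ keys_values

def elit_forward(spatial_tokens, latents, budget):
    """Illustrative latent-interface pass: Read moves information from spatial
    tokens into the first `budget` latents, a stub transform processes them,
    and Write projects the result back onto the spatial tokens."""
    active = latents[:budget]                # drop tail latents to meet the budget
    read = active + cross_attention(active, spatial_tokens)            # Read
    processed = read                         # stand-in for the latent DiT blocks
    return spatial_tokens + cross_attention(spatial_tokens, processed)  # Write

rng = np.random.default_rng(2)
tokens = rng.normal(size=(64, 16))           # 64 spatial tokens, dim 16
latents = rng.normal(size=(32, 16))          # learnable latent sequence
full = elit_forward(tokens, latents, budget=32)   # full compute
cheap = elit_forward(tokens, latents, budget=8)   # 4x fewer latents, same shape out
assert full.shape == cheap.shape == (64, 16)
```

The key property the sketch preserves is that the heavy middle of the network only ever sees `budget` latents, so shrinking the budget shrinks compute without changing the spatial input or output shapes.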
- RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning
Dense image captioning is critical for cross-modal alignment in vision-language pretraining and text-to-image generation, but scaling expert-quality annotations is prohibitively expensive. While synthetic captioning via strong vision-language models (VLMs) is a practical alternative, supervised distillation often yields limited output diversity and weak generalization. Reinforcement learning (RL) could overcome these limitations, but its successes have so far been concentrated in verifiable domains that rely on deterministic checkers -- a luxury not available in open-ended captioning. We address this bottleneck with RubiCap, a novel RL framework that derives fine-grained, sample-specific reward signals from LLM-written rubrics. RubiCap first assembles a diverse committee of candidate captions, then employs an LLM rubric writer to extract consensus strengths and diagnose deficiencies in the current policy. These insights are converted into explicit evaluation criteria, enabling an LLM judge to decompose holistic quality assessment and replace coarse scalar rewards with structured, multi-faceted evaluations. Across extensive benchmarks, RubiCap achieves the highest win rates on CapArena, outperforming supervised distillation, prior RL methods, human-expert annotations, and GPT-4V-augmented outputs. On CaptionQA, it demonstrates superior word efficiency: our 7B model matches Qwen2.5-VL-32B-Instruct, and our 3B model surpasses its 7B counterpart. Remarkably, using the compact RubiCap-3B as a captioner produces stronger pretrained VLMs than those trained on captions from proprietary models.
- EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation
Autoregressive (AR) video generative models rely on video tokenizers that compress pixels into discrete token sequences. The length of these token sequences is crucial for balancing reconstruction quality against downstream generation computational cost. Traditional video tokenizers apply a uniform token assignment across temporal blocks of different videos, often wasting tokens on simple, static, or repetitive segments while underserving dynamic or complex ones. To address this inefficiency, we introduce EVATok, a framework to produce Efficient Video Adaptive Tokenizers. Our framework estimates optimal token assignments for each video to achieve the best quality-cost trade-off, develops lightweight routers for fast prediction of these optimal assignments, and trains adaptive tokenizers that encode videos based on the assignments predicted by routers. We demonstrate that EVATok delivers substantial improvements in efficiency and overall quality for video reconstruction and downstream AR generation. Enhanced by our advanced training recipe that integrates video semantic encoders, EVATok achieves superior reconstruction and state-of-the-art class-to-video generation on UCF-101, with at least 24.4% savings in average token usage compared to the prior state-of-the-art LARP and our fixed-length baseline.
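A lightweight router of the kind the abstract describes can be caricatured as a motion-based budget rule; the frame-difference motion proxy and the budget tiers here are invented for illustration, not EVATok's learned routers:

```python
import numpy as np

def route_token_budget(frames, budgets=(64, 128, 256)):
    """Toy router: allocate more tokens to temporally dynamic clips,
    using mean absolute frame difference as a crude motion proxy."""
    motion = float(np.abs(np.diff(frames, axis=0)).mean())
    if motion < 0.05:
        return budgets[0]      # static or repetitive clip: few tokens suffice
    if motion < 0.2:
        return budgets[1]      # moderate dynamics
    return budgets[2]          # highly dynamic clip: full budget

static = np.zeros((16, 8, 8))                   # 16 identical frames, no motion
rng = np.random.default_rng(3)
dynamic = rng.normal(size=(16, 8, 8))           # frame-to-frame noise = high motion
assert route_token_budget(static) < route_token_budget(dynamic)
```

The quality-cost trade-off claimed in the abstract comes from exactly this asymmetry: static segments get the small budget, so the saved tokens can be spent where reconstruction is hardest.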
- EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models
Recently, Multimodal Large Language Models (MLLMs) have been widely integrated into diffusion frameworks primarily as text encoders to tackle complex tasks such as spatial reasoning. However, this paradigm suffers from two critical limitations: (i) The MLLM text encoder exhibits insufficient reasoning depth. Single-step encoding fails to activate the Chain-of-Thought process, which is essential for MLLMs to provide accurate guidance for complex tasks. (ii) The guidance remains invariant during decoding, which prevents the DiT from progressively decomposing complex instructions into actionable denoising steps, even with correct MLLM encodings. To this end, we propose Endogenous Chain-of-Thought (EndoCoT), a novel framework that first activates MLLMs' reasoning potential by iteratively refining latent thought states through an iterative thought guidance module, and then bridges these states to the DiT's denoising process. Second, a terminal thought grounding module is applied to ensure the reasoning trajectory remains grounded in textual supervision by aligning the final state with ground-truth answers. With these two components, the MLLM text encoder delivers meticulously reasoned guidance, enabling the DiT to execute it progressively and ultimately solve complex tasks in a step-by-step manner. Extensive evaluations across diverse benchmarks (e.g., Maze, TSP, VSP, and Sudoku) achieve an average accuracy of 92.1%, outperforming the strongest baseline by 8.3 percentage points.
Techmeme(15)
- The US Army awards Anduril a 10-year contract worth up to $20B to buy its software, hardware, and services; the deal includes a 5-year optional ordering period (Jen Judson/Bloomberg)
Jen Judson / Bloomberg : The US Army awards Anduril a 10-year contract worth up to $20B to buy its software, hardware, and services; the deal includes a 5-year optional ordering period — The US Army has awarded Anduril Industries a contract with a total value of as much as $20 billion to buy the defense startup's software …
- A US government website shows the Commerce Department withdrew a planned rule tightening AI chip exports; a draft was sent to agencies for feedback in February (Karen Freifeld/Reuters)
Karen Freifeld / Reuters : A US government website shows the Commerce Department withdrew a planned rule tightening AI chip exports; a draft was sent to agencies for feedback in February — The U.S. Department of Commerce on Friday withdrew a planned rule on AI chip exports, according to a government website.
- Sources: Meta plans sweeping layoffs that could affect 20% or more of the company, amid mounting AI infrastructure costs; it had ~79,000 employees as of Dec. 31 (Reuters)
Reuters : Sources: Meta plans sweeping layoffs that could affect 20% or more of the company, amid mounting AI infrastructure costs; it had ~79,000 employees as of Dec. 31 — Meta (META.O) is planning sweeping layoffs that could affect 20% or more of the company, three sources familiar with the matter told Reuters …
- A US judge questions Elon Musk's $134B claim for damages in his lawsuit against OpenAI and Microsoft but rules he can still make his case to a jury (George Hammond/Financial Times)
George Hammond / Financial Times : A US judge questions Elon Musk's $134B claim for damages in his lawsuit against OpenAI and Microsoft but rules he can still make his case to a jury — California court questions billionaire's expert witness but declines to exclude the testimony from April trial
- $TRUMP memecoin surged as much as 60% after its promoters said it would host a gala luncheon at Mar-a-Lago with Trump; the WH hasn't confirmed his attendance (Bloomberg)
Bloomberg : $TRUMP memecoin surged as much as 60% after its promoters said it would host a gala luncheon at Mar-a-Lago with Trump; the WH hasn't confirmed his attendance — The memecoin bearing Donald Trump's name surged as much as 60% in the last 24 hours after its promoters advertised an exclusive gala …
- Amazon wins its appeal against a €746M GDPR fine imposed by Luxembourg's privacy watchdog after a court finds the watchdog had not properly done its analysis (Foo Yun Chee/Reuters)
Foo Yun Chee / Reuters : Amazon wins its appeal against a €746M GDPR fine imposed by Luxembourg's privacy watchdog after a court finds the watchdog had not properly done its analysis — Amazon (AMZN.O) on Friday won its appeal against a record 746-million-euro ($854.4 million) fine imposed by Luxembourg's privacy regulator …
- Sources: the Trump administration is set to receive a ~$10B fee from investors in TikTok's US business for the government's role in brokering the TikTok US deal (Wall Street Journal)
Wall Street Journal : Sources: the Trump administration is set to receive a ~$10B fee from investors in TikTok's US business for the government's role in brokering the TikTok US deal — Investors in social-media platform's U.S. business, including Oracle and Silver Lake, agreed to give the government several multibillion-dollar payments, sources say
- Digg announces a "hard reset" and shuts down operations two months after it was relaunched by Kevin Rose and Alexis Ohanian, citing the scale of AI bot spam (Richard Lawler/The Verge)
Richard Lawler / The Verge : Digg announces a “hard reset” and shuts down operations two months after it was relaunched by Kevin Rose and Alexis Ohanian, citing the scale of AI bot spam — Digg's Reddit-like relaunch failed fast, but its CEO is already planning another comeback.
- Facebook launches new tools to help creators detect and report impersonation, and updates guidelines to better define what it considers to be "original content" (Sarah Perez/TechCrunch)
Sarah Perez / TechCrunch : Facebook launches new tools to help creators detect and report impersonation, and updates guidelines to better define what it considers to be “original content” — After numerous accusations claiming that Facebook has turned into an “AI slop hellscape,” Meta on Friday announced …
- Sources: Mirendil, founded by former Anthropic researchers to develop AI models for scientific research, is in talks to raise $175M at a $1B valuation (The Information)
The Information : Sources: Mirendil, founded by former Anthropic researchers to develop AI models for scientific research, is in talks to raise $175M at a $1B valuation — Former Anthropic researchers are in talks to raise $175 million at a $1 billion valuation for a new startup that aims to conduct AI-driven research …
- Didi reports Q4 revenue up 10.5% YoY to $8.46B, international revenue up 47% YoY to $638M, and a net loss of $43.48M amid an overseas expansion push (Reuters)
Reuters : Didi reports Q4 revenue up 10.5% YoY to $8.46B, international revenue up 47% YoY to $638M, and a net loss of $43.48M amid an overseas expansion push — Didi Global reported on Friday a net loss for the fourth quarter, as China's largest ride-hailing platform ramped up its international expansion, boosting costs.
- MacBook Neo teardown: most repairable MacBook in ~14 years, with no parts pairing issues, a screwed-down battery, and relatively easy keyboard replacement (Elizabeth Chamberlain/iFixit News)
Elizabeth Chamberlain / iFixit News : MacBook Neo teardown: most repairable MacBook in ~14 years, with no parts pairing issues, a screwed-down battery, and relatively easy keyboard replacement — Is Apple's most affordable laptop ever also one of its most repairable? For years, opening a MacBook has usually meant fighting your way through glue and buried parts.
- Travis Kalanick renames CloudKitchens' parent company as Atoms, focused on creating "gainfully employed robots" for the food, mining, and transport industries (Natalie Lung/Bloomberg)
Natalie Lung / Bloomberg : Travis Kalanick renames CloudKitchens' parent company as Atoms, focused on creating “gainfully employed robots” for the food, mining, and transport industries — Uber Technologies Inc. co-founder Travis Kalanick has launched a new venture that will focus on creating …
- Claude Opus 4.6 and Sonnet 4.6 now offer a 1M context window at standard pricing; it is the default for Claude Code Max, Team, and Enterprise users on Opus 4.6 (Anthropic)
Anthropic : Claude Opus 4.6 and Sonnet 4.6 now offer a 1M context window at standard pricing; it is the default for Claude Code Max, Team, and Enterprise users on Opus 4.6 — Standard pricing now applies across the full 1M window for both models, with no long-context premium. Media limits expand to 600 images or PDF pages.
- Meta says Instagram will no longer support end-to-end encrypted messages starting May 8, saying "very few people" were using E2EE in their DMs (Karandeep Singh Oberoi/Android Police)
Karandeep Singh Oberoi / Android Police : Meta says Instagram will no longer support end-to-end encrypted messages starting May 8, saying “very few people” were using E2EE in their DMs — Can you imagine a world where WhatsApp stops offering end-to-end encryption (E2EE)? That's not happening, but a different Meta-owned company …
Solidot(15)
- Chrome will officially ship an ARM64 Linux build
Google's Chromium blog announced that an official ARM64 Linux version of Chrome will ship in Q2 2026. The release will include the same features as builds on other platforms, including Google account sync, Chrome Web Store extensions, built-in translation, Safe Browsing protection, and Google Password Manager. ARM64 Linux systems have been in wide use for years, but Chrome, the most popular browser, has never shipped an officially supported build for them.
- AI facial recognition error keeps a grandmother jailed for nearly six months
In a bank fraud case in North Dakota, a woman used a forged US Army ID to withdraw tens of thousands of dollars. Facial recognition software matched the suspect in surveillance footage to 50-year-old Tennessee grandmother Angela Lipps. A detective wrote in court filings that Lipps matched the suspect in facial features, build, and hairstyle. Lipps had never been to North Dakota and had never flown before being transported there to face charges; she has lived most of her life in Tennessee and is a mother of three and grandmother of five. She was held for nearly six months before being released; police neither apologized nor paid her way home.
- Apple cuts its China App Store commission from 30% to 25%
Apple said in a press release that, following discussions with Chinese regulators, it will make changes to the App Store in China. Effective March 15, 2026, commission rates for the iOS and iPadOS App Store in mainland China will be adjusted. The standard commission rate on in-app purchases and paid apps will drop from 30% to 25%. For qualifying in-app purchases under the App Store Small Business Program and the Mini Apps Partner Program, and for auto-renewing subscriptions after the first year, the rate will drop from 15% to 12%.
- The Metropolitan Museum of Art releases high-resolution 3D scans of 140 famous artworks
New York's Metropolitan Museum of Art has released high-resolution 3D scans of 140 important works from its collection, nine of them produced in collaboration with NHK. The scans let viewers examine the objects from any angle and more closely than in the museum itself: studying Van Gogh's brushstrokes, zooming in on Babylonian cuneiform tablets, or flipping over an 18th-century Turkish tile. The museum says more 3D scans of its collection will follow.
- Google Maps integrates Gemini
Google Maps is the latest Google app to integrate the Gemini AI chatbot. According to Google's official blog, the latest Google Maps update adds a Gemini-powered Ask Maps feature that lets users plan trips, ask questions, and refine travel suggestions conversationally inside the app. Ask Maps works much like a chatbot. Google Maps is also adding a new Immersive Navigation mode with detailed 3D visuals, smarter route previews, and navigation driven by Street View and aerial imagery, rendering accurate overpasses, crosswalks, landmarks, and road signs. The new features will roll out first on Android and iOS.
- Moscow residents hit by mobile network outages
Users in central Moscow and in St. Petersburg began reporting trouble accessing mobile networks a week ago. Users say websites and apps fail to load, and some have lost service entirely, unable even to place calls. Residents have resorted to walkie-talkies and pagers to communicate. The Kremlin said this week that the outages are meant to "ensure security" and will continue "as necessary," offering no further explanation. Human rights activists say Moscow may be testing a whitelist under which only a small number of government-approved websites and services remain reachable. The outages have hit delivery services, ride-hailing apps, and retail hard. Even Russia's State Duma lost mobile and Wi-Fi access on Thursday, cutting legislators off from the internet. Retail data show Russians buying more walkie-talkies and pagers: walkie-talkie sales are up 27%, and sales of pagers used to communicate with customers and employees are up 73%. Demand for paper maps of Moscow has nearly tripled.
- Buddhism is the only major religion losing adherents
Buddhism is the only major religion whose number of adherents is declining; the other major religions, including Christianity, Islam, Hinduism, and Judaism, are all growing. Worldwide there are 2.3 billion Christians, 2 billion Muslims, 1.2 billion Hindus, and 300 million Buddhists. Most Buddhists live in Asia, with Thailand home to the most at 67.6 million. Across East Asia, many people raised Buddhist no longer identify with the religion. Most former Buddhists are now religiously unaffiliated, describing themselves as atheists, agnostics, or nothing in particular. In Japan, 40% of those raised Buddhist no longer follow any religion; in South Korea the figure is 42%. Unlike Judaism, Christianity, and Islam, which place great weight on regular prayer and worship services, Buddhism has no custom of weekly temple attendance. Even so, visiting temples and taking part in Buddhist festivals has long been part of traditional life in East Asia, but busy work, study, and daily routines now leave former believers little time for spiritual or religious activities.
- 53% of US adults went to a movie theater in 2025
According to a Pew Research Center survey, only 53% of US adults went out to see a movie in 2025. US box office revenue has still not fully recovered to pre-pandemic levels. Moviegoers in the US and Canada bought 769.2 million tickets in 2025, less than half the all-time record of 1.6 billion set in 2002. Adjusted for inflation, US box office revenue peaked at $16.4 billion in 2002. Annual revenue was relatively stable through the 2000s and 2010s, but pandemic theater closures in 2020 dropped it below $3 billion. In 2025, US theaters grossed just over $9 billion, still about 20% below pre-pandemic levels.
- Researchers discover KadNap, a botnet that uses a DHT
Researchers at security firm Lumen have discovered KadNap, a botnet that communicates over a distributed hash table (DHT). Since August 2025, KadNap has infected more than 14,000 routers and other connected devices, chiefly unpatched ASUS routers, drafting them into a proxy network used to anonymize cybercrime traffic. KadNap uses the DHT to hide the IP addresses of its command-and-control servers, giving it decentralized control. The best-known P2P network built on a DHT is BitTorrent. KadNap is hard to remove: purging the malware from a device requires a factory reset.
- Scavengers remember where the food is
New research published in Science challenges the long-held view that foraging scavengers find food mainly by trailing predators. Studying ravens, gray wolves, and cougars in Yellowstone National Park, researchers found that ravens rely on spatial memory to return to sites of past kills. Using GPS trackers, the team followed the ravens, wolves, and cougars for two and a half years and recorded hundreds of wolf and cougar kills. They found that ravens rarely follow predators over long distances. Instead, ravens repeatedly return to areas where wolves kill frequently, sometimes making round trips of up to 155 km. Ravens interacted little with cougars. The results suggest ravens use spatial memory to treat historically kill-dense areas as predictable foraging sites.
- Hacking group hits Stryker with a data-wiping attack
A hacking group called Handala (a.k.a. Handala Hack Team) has carried out a large-scale data-wiping attack on Stryker, a medical device manufacturer headquartered in Michigan. Handala claims to have wiped data from more than 200,000 systems, servers, and mobile devices, forcing Stryker offices in 79 countries to close. Stryker has 56,000 employees in 61 countries and reported $25 billion in global sales last year. Stryker's Irish subsidiary confirmed the attack and has ordered more than 5,000 employees to work from home and communicate via WhatsApp. An anonymous employee said data was wiped from the devices of everyone who had installed Microsoft Outlook on a personal phone. Wiper attacks aim to maliciously delete or destroy data stored on computers, servers, or other devices.
- Valve says its loot boxes work like Magic: The Gathering and Labubu
Last month New York Attorney General Letitia James accused Valve's loot-box system of facilitating illegal gambling and fostering addiction in children. Valve develops the popular online games Counter-Strike, Team Fortress, and Dota 2, all free-to-play titles monetized through microtransactions and loot boxes. The complaint alleges that Valve's loot-box system is gambling in essence, in violation of New York's state constitution and penal law. Valve responded in a statement on its website, saying it disagrees with the allegations and is disappointed by the lawsuit. Valve says the kind of boxes it uses in its games is widely employed everywhere from games to real-world baseball card packs, blind boxes, and blind bags, and that the boxes in its games are no different from Pokémon, Magic: The Gathering, and Labubu. It adds that most players never open any boxes at all, since the virtual items inside are cosmetic and players who spend nothing are at no disadvantage. Valve also says it has worked for years to crack down on Steam accounts involved in gambling, and notes that the New York AG's office has proposed collecting more user information to identify VPN users and requiring additional personal data collection for age verification.
- Interstellar comet 3I/ATLAS is rich in methanol
Using the Atacama Large Millimeter/submillimeter Array (ALMA) in Chile, astronomers have detected hydrogen cyanide, a nitrogen-bearing organic molecule common in comets, in the interstellar comet 3I/ATLAS, along with an unusually high abundance of methanol, an important organic molecule tied to prebiotic chemistry. The result shows that the object's chemical makeup differs markedly from that of solar system comets, offering important clues to the formation environments of other stars and planetary systems. The research team said observing 3I/ATLAS is like taking a chemical fingerprint from another star system: the details reveal its composition, and its methanol abundance is higher than almost anything seen in solar system comets. The measurements suggest the ices in 3I/ATLAS may have formed under physical and chemical conditions very different from those of solar system comets, or may have undergone a different evolutionary history.
- Microsoft reveals more about the next-generation Xbox
At GDC, Microsoft revealed more about its next-generation Xbox, codenamed Project Helix. Microsoft said the full-screen Xbox mode, available to Windows Insider users since last November, will roll out to all Windows 11 PCs next month. The next-generation Xbox will play both Xbox and PC games and will use an AMD SoC, with the CPU expected to be Zen 6 and the GPU RDNA5/UDNA, alongside next-generation DirectX and next-generation ray tracing. AMD's next version of FSR, FSR Next, will be built on next-generation machine-learning super sampling, adding new machine-learning multi-frame generation and support for ray-traced and path-traced ray regeneration, among other features. Microsoft will begin supplying alpha hardware to game developers in 2027.
- Sony tests dynamic pricing in PlayStation Stores around the world
Sony has been running price A/B tests on the PlayStation Store since November 2025. Within four months, the experiment grew from 50 games in 30 regions to more than 190 games in over 70 regions, including its biggest market, the US. Some PlayStation Store users see experimental prices well below standard retail; users are sorted into groups that see different prices for the same game. The US has 189 games in the test, the largest sample of any region, with US discounts reaching as high as 27.8% (HELLDIVERS 2) and 24.4% (The Last of Us Part I). Dynamic pricing is illegal in many countries, where it is considered price discrimination.