OrangeBot.AI Digest — 2026-04-15
88 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Why are Flock employees watching our children? (substack.com)
- Live Nation illegally monopolized ticketing market, jury finds (www.bloomberg.com)
- AI-assisted cognition endangers human development? (heidenstedt.org)
- Google broke its promise to me – now ICE has my data (www.eff.org)
- Open Source Isn't Dead (www.strix.ai)
- Elevated errors on Claude.ai, API, Claude Code (claudestatus.com)
- Gemini Robotics-ER 1.6 (deepmind.google)
- The Future of Everything Is Lies, I Guess: New Jobs (aphyr.com)
- Backpacks got worse on purpose (www.worseonpurpose.com)
- Keep Android Open (keepandroidopen.org)
- God sleeps in the minerals (wchambliss.wordpress.com)
- Good sleep, good learning, good life (2012) (super-memory.com)
- Want to write a compiler? Just read these two papers (2008) (prog21.dadgum.com)
- Google Gemma 4 Runs Natively on iPhone with Full Offline AI Inference (www.gizmoweek.com)
- Anna's Archive loses $322M Spotify piracy case without a fight (torrentfreak.com)
GitHub Trending(13)
- forrestchang / andrej-karpathy-skills
- pascalorg / editor
- thedotmack / claude-mem
- Lordog / dive-into-llms
- virattt / ai-hedge-fund
- chrislgarry / Apollo-11
- obra / superpowers
- jamiepine / voicebox
- public-apis / public-apis
- vercel-labs / open-agents
- lsdefine / GenericAgent
- google / magika
- Donchitos / Claude-Code-Game-Studios
Product Hunt(15)
- Clide
Grid-layout terminal with an AI that drives your shells.
- Carousels Generator
From prompt to branded LinkedIn carousel powered by AI
- Wafer Pass
Flat rate to the best LLMs for OpenClaw, Hermes Agent, etc.
- Mush
Combine Wi-Fi, Ethernet, and 5G for max download speed
- Gemini Robotics ER 1.6
Google's SOTA robotics model for visual & spatial reasoning!
- Fathom 3.0
AI meeting notes: now bot-free, in ChatGPT & Claude + more
- DataGrout
Enterprise AI Platform for Agentic AI & MCP Integration
- Intent
Describe a feature and AI agents build, verify, and ship it
- SnapEdit
Native SwiftUI clipboard image editor; edit, share instantly
- Cenote
AI Sales Agents for Abandoned Checkouts
- Claude Code Routines
Put Claude Code tasks on autopilot with smart routines
- ClawTrace
Make your OpenClaw better, cheaper, and faster
- Lexie
Snap your notes and get tested before the exam
- Defter Notes 2.0.
Spatial thinking with handwriting
- Collabute
Your team's context, turned into action
Hugging Face(15)
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
GUI agents drive applications through their visual interfaces instead of programmatic APIs, interacting with arbitrary software via taps, swipes, and keystrokes, reaching a long tail of applications that CLI-based agents cannot. Yet progress in this area is bottlenecked less by modeling capacity than by the absence of a coherent full-stack infrastructure: online RL training suffers from environment instability and closed pipelines, evaluation protocols drift silently across works, and trained agents rarely reach real users on real devices. We present ClawGUI, an open-source framework addressing these three gaps within a single harness. ClawGUI-RL provides the first open-source GUI agent RL infrastructure with validated support for both parallel virtual environments and real physical devices, integrating GiGPO with a Process Reward Model for dense step-level supervision. ClawGUI-Eval enforces a fully standardized evaluation pipeline across 6 benchmarks and 11+ models, achieving 95.8\% reproduction against official baselines. ClawGUI-Agent brings trained agents to Android, HarmonyOS, and iOS through 12+ chat platforms with hybrid CLI-GUI control and persistent personalized memory. Trained end to end within this pipeline, ClawGUI-2B achieves 17.1\% Success Rate on MobileWorld GUI-Only, outperforming the same-scale MAI-UI-2B baseline by 6.0\%.
- KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance
RLVR improves reasoning in large language models, but its effectiveness is often limited by severe reward sparsity on hard problems. Recent hint-based RL methods mitigate sparsity by injecting partial solutions or abstract templates, yet they typically scale guidance by adding more tokens, which introduce redundancy, inconsistency, and extra training overhead. We propose KnowRL (Knowledge-Guided Reinforcement Learning), an RL training framework that treats hint design as a minimal-sufficient guidance problem. During RL training, KnowRL decomposes guidance into atomic knowledge points (KPs) and uses Constrained Subset Search (CSS) to construct compact, interaction-aware subsets for training. We further identify a pruning interaction paradox -- removing one KP may help while removing multiple such KPs can hurt -- and explicitly optimize for robust subset curation under this dependency structure. We train KnowRL-Nemotron-1.5B from OpenMath-Nemotron-1.5B. Across eight reasoning benchmarks at the 1.5B scale, KnowRL-Nemotron-1.5B consistently outperforms strong RL and hinting baselines. Without KP hints at inference, KnowRL-Nemotron-1.5B reaches 70.08 average accuracy, already surpassing Nemotron-1.5B by +9.63 points; with selected KPs, performance improves to 74.16, establishing a new state of the art at this scale. The model, curated training data, and code are publicly available at https://github.com/Hasuer/KnowRL.
- Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
On-policy distillation (OPD) has become a core technique in the post-training of large language models, yet its training dynamics remain poorly understood. This paper provides a systematic investigation of OPD dynamics and mechanisms. We first identify that two conditions govern whether OPD succeeds or fails: (i) the student and teacher should share compatible thinking patterns; and (ii) even with consistent thinking patterns and higher scores, the teacher must offer genuinely new capabilities beyond what the student has seen during training. We validate these findings through weak-to-strong reverse distillation, showing that same-family 1.5B and 7B teachers are distributionally indistinguishable from the student's perspective. Probing into the token-level mechanism, we show that successful OPD is characterized by progressive alignment on high-probability tokens at student-visited states, a small shared token set that concentrates most of the probability mass (97%-99%). We further propose two practical strategies to recover failing OPD: off-policy cold start and teacher-aligned prompt selection. Finally, we show that OPD's apparent free lunch of dense token-level reward comes at a cost, raising the question of whether OPD can scale to long-horizon distillation.
- Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
The rise of autonomous GUI agents has triggered adversarial countermeasures from digital platforms, yet existing research prioritizes utility and robustness over the critical dimension of anti-detection. We argue that for agents to survive in human-centric ecosystems, they must evolve Humanization capabilities. We introduce the ``Turing Test on Screen,'' formally modeling the interaction as a MinMax optimization problem between a detector and an agent aiming to minimize behavioral divergence. We then collect a new high-fidelity dataset of mobile touch dynamics, and conduct our analysis that vanilla LMM-based agents are easily detectable due to unnatural kinematics. Consequently, we establish the Agent Humanization Benchmark (AHB) and detection metrics to quantify the trade-off between imitability and utility. Finally, we propose methods ranging from heuristic noise to data-driven behavioral matching, demonstrating that agents can achieve high imitability theoretically and empirically without sacrificing performance. This work shifts the paradigm from whether an agent can perform a task to how it performs it within a human-centric ecosystem, laying the groundwork for seamless coexistence in adversarial digital environments.
- SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks
Proximal Policy Optimization (PPO) is central to aligning Large Language Models (LLMs) in reasoning tasks with verifiable rewards. However, standard token-level PPO struggles in this setting due to the instability of temporal credit assignment over long Chain-of-Thought (CoT) horizons and the prohibitive memory cost of the value model. While critic-free alternatives like GRPO mitigate these issues, they incur significant computational overhead by requiring multiple samples for baseline estimation, severely limiting training throughput. In this paper, we introduce Sequence-Level PPO (SPPO), a scalable algorithm that harmonizes the sample efficiency of PPO with the stability of outcome-based updates. SPPO reformulates the reasoning process as a Sequence-Level Contextual Bandit problem, employing a decoupled scalar value function to derive low-variance advantage signals without multi-sampling. Extensive experiments on mathematical benchmarks demonstrate that SPPO significantly surpasses standard PPO and matches the performance of computation-heavy group-based methods, offering a resource-efficient framework for aligning reasoning LLMs.
- Toward Autonomous Long-Horizon Engineering for ML Research
Autonomous AI research has advanced rapidly, but long-horizon ML research engineering remains difficult: agents must sustain coherent progress across task comprehension, environment setup, implementation, experimentation, and debugging over hours or days. We introduce AiScientist, a system for autonomous long-horizon engineering for ML research built on a simple principle: strong long-horizon performance requires both structured orchestration and durable state continuity. To this end, AiScientist combines hierarchical orchestration with a permission-scoped File-as-Bus workspace: a top-level Orchestrator maintains stage-level control through concise summaries and a workspace map, while specialized agents repeatedly re-ground on durable artifacts such as analyses, plans, code, and experimental evidence rather than relying primarily on conversational handoffs, yielding thin control over thick state. Across two complementary benchmarks, AiScientist improves PaperBench score by 10.54 points on average over the best matched baseline and achieves 81.82 Any Medal% on MLE-Bench Lite. Ablation studies further show that File-as-Bus protocol is a key driver of performance, reducing PaperBench by 6.41 points and MLE-Bench Lite by 31.82 points when removed. These results suggest that long-horizon ML research engineering is a systems problem of coordinating specialized work over durable project state, rather than a purely local reasoning problem.
- BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation
Accurate evaluation is central to the large language model (LLM) ecosystem, guiding model selection and downstream adoption across diverse use cases. In practice, however, evaluating generative outputs typically relies on rigid lexical methods to extract and assess answers, which can conflate a model's true problem-solving ability with its compliance with predefined formatting guidelines. While recent LLM-as-a-Judge approaches mitigate this issue by assessing semantic correctness rather than strict structural conformity, they also introduce substantial computational overhead, making evaluation costly. In this work, we first systematically investigate the limitations of lexical evaluation through a large-scale empirical study spanning 36 models and 15 downstream tasks, demonstrating that such methods correlate poorly with human judgments. To address this limitation, we introduce BERT-as-a-Judge, an encoder-driven approach for assessing answer correctness in reference-based generative settings, robust to variations in output phrasing, and requiring only lightweight training on synthetically annotated question-candidate-reference triplets. We show that it consistently outperforms the lexical baseline while matching the performance of much larger LLM judges, providing a compelling tradeoff between the two and enabling reliable, scalable evaluation. Finally, through extensive experimentation, we provide detailed insights into BERT-as-a-Judge's performance to offer practical guidance for practitioners, and release all project artifacts to foster downstream adoption.
- Lyra 2.0: Explorable Generative 3D Worlds
Recent advances in video generation enable a new paradigm for 3D scene creation: generating camera-controlled videos that simulate scene walkthroughs, then lifting them to 3D via feed-forward reconstruction techniques. This generative reconstruction approach combines the visual fidelity and creative capacity of video models with 3D outputs ready for real-time rendering and simulation. Scaling to large, complex environments requires 3D-consistent video generation over long camera trajectories with large viewpoint changes and location revisits, a setting where current video models degrade quickly. Existing methods for long-horizon generation are fundamentally limited by two forms of degradation: spatial forgetting and temporal drifting. As exploration proceeds, previously observed regions fall outside the model's temporal context, forcing the model to hallucinate structures when revisited. Meanwhile, autoregressive generation accumulates small synthesis errors over time, gradually distorting scene appearance and geometry. We present Lyra 2.0, a framework for generating persistent, explorable 3D worlds at scale. To address spatial forgetting, we maintain per-frame 3D geometry and use it solely for information routing -- retrieving relevant past frames and establishing dense correspondences with the target viewpoints -- while relying on the generative prior for appearance synthesis. To address temporal drifting, we train with self-augmented histories that expose the model to its own degraded outputs, teaching it to correct drift rather than propagate it. Together, these enable substantially longer and 3D-consistent video trajectories, which we leverage to fine-tune feed-forward reconstruction models that reliably recover high-quality 3D scenes.
- Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
We describe the pre-training, post-training, and quantization of Nemotron 3 Super, a 120 billion (active 12 billion) parameter hybrid Mamba-Attention Mixture-of-Experts model. Nemotron 3 Super is the first model in the Nemotron 3 family to 1) be pre-trained in NVFP4, 2) leverage LatentMoE, a new Mixture-of-Experts architecture that optimizes for both accuracy per FLOP and accuracy per parameter, and 3) include MTP layers for inference acceleration through native speculative decoding. We pre-trained Nemotron 3 Super on 25 trillion tokens followed by post-training using supervised fine tuning (SFT) and reinforcement learning (RL). The final model supports up to 1M context length and achieves comparable accuracy on common benchmarks, while also achieving up to 2.2x and 7.5x higher inference throughput compared to GPT-OSS-120B and Qwen3.5-122B, respectively. Nemotron 3 Super datasets, along with the base, post-trained, and quantized checkpoints, are open-sourced on HuggingFace.
- Towards Long-horizon Agentic Multimodal Search
Multimodal deep search agents have shown great potential in solving complex tasks by iteratively collecting textual and visual evidence. However, managing the heterogeneous information and high token costs associated with multimodal inputs over long horizons remains a critical challenge, as existing methods often suffer from context explosion or the loss of crucial visual signals. To address this, we propose a novel Long-horizon MultiModal deep search framework, named LMM-Searcher, centered on a file-based visual representation mechanism. By offloading visual assets to an external file system and mapping them to lightweight textual identifiers (UIDs), our approach mitigates context overhead while preserving multimodal information for future access. We equip the agent with a tailored fetch-image tool, enabling a progressive, on-demand visual loading strategy for active perception. Furthermore, we introduce a data synthesis pipeline designed to generate queries requiring complex cross-modal multi-hop reasoning. Using this pipeline, we distill 12K high-quality trajectories to fine-tune Qwen3-VL-Thinking-30A3B into a specialized multimodal deep search agent. Extensive experiments across four benchmarks demonstrate that our method successfully scales to 100-turn search horizons, achieving state-of-the-art performance among open-source models on challenging long-horizon benchmarks like MM-BrowseComp and MMSearch-Plus, while also exhibiting strong generalizability across different base models. Our code will be released in https://github.com/RUCAIBox/LMM-Searcher.
- Many-Tier Instruction Hierarchy in LLM Agents
Large language model agents receive instructions from many sources-system messages, user prompts, tool outputs, and more-each carrying different levels of trust and authority. When these instructions conflict, models must reliably follow the highest-privilege instruction to remain safe and effective. The dominant paradigm, instruction hierarchy (IH), assumes a fixed, small set of privilege levels (typically fewer than five) defined by rigid role labels (e.g., system > user). This is inadequate for real-world agentic settings, where conflicts can arise across far more sources and contexts. In this work, we propose Many-Tier Instruction Hierarchy (ManyIH), a paradigm for resolving instruction conflicts among instructions with arbitrarily many privilege levels. We introduce ManyIH-Bench, the first benchmark for ManyIH. ManyIH-Bench requires models to navigate up to 12 levels of conflicting instructions with varying privileges, comprising 853 agentic tasks (427 coding and 426 instruction-following). ManyIH-Bench composes constraints developed by LLMs and verified by humans to create realistic and difficult test cases spanning 46 real-world agents. Our experiments show that even the current frontier models perform poorly (~40% accuracy) when instruction conflict scales. This work underscores the urgent need for methods that explicitly target fine-grained, scalable instruction conflict resolution in agentic settings.
- Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting
Training embodied AI agents depends critically on the visual fidelity of simulation environments and the ability to model dynamic humans. Current simulators rely on mesh-based rasterization with limited visual realism, and their support for dynamic human avatars, where available, is constrained to mesh representations, hindering agent generalization to human-populated real-world scenarios. We present Habitat-GS, a navigation-centric embodied AI simulator extended from Habitat-Sim that integrates 3D Gaussian Splatting scene rendering and drivable gaussian avatars while maintaining full compatibility with the Habitat ecosystem. Our system implements a 3DGS renderer for real-time photorealistic rendering and supports scalable 3DGS asset import from diverse sources. For dynamic human modeling, we introduce a gaussian avatar module that enables each avatar to simultaneously serve as a photorealistic visual entity and an effective navigation obstacle, allowing agents to learn human-aware behaviors in realistic settings. Experiments on point-goal navigation demonstrate that agents trained on 3DGS scenes achieve stronger cross-domain generalization, with mixed-domain training being the most effective strategy. Evaluations on avatar-aware navigation further confirm that gaussian avatars enable effective human-aware navigation. Finally, performance benchmarks validate the system's scalability across varying scene complexity and avatar counts.
- Self-Adversarial One Step Generation via Condition Shifting
The push for efficient text to image synthesis has moved the field toward one step sampling, yet existing methods still face a three way tradeoff among fidelity, inference speed, and training efficiency. Approaches that rely on external discriminators can sharpen one step performance, but they often introduce training instability, high GPU memory overhead, and slow convergence, which complicates scaling and parameter efficient tuning. In contrast, regression based distillation and consistency objectives are easier to optimize, but they typically lose fine details when constrained to a single step. We present APEX, built on a key theoretical insight: adversarial correction signals can be extracted endogenously from a flow model through condition shifting. Using a transformation creates a shifted condition branch whose velocity field serves as an independent estimator of the model's current generation distribution, yielding a gradient that is provably GAN aligned, replacing the sample dependent discriminator terms that cause gradient vanishing. This discriminator free design is architecture preserving, making APEX a plug and play framework compatible with both full parameter and LoRA based tuning. Empirically, our 0.6B model surpasses FLUX-Schnell 12B (20times more parameters) in one step quality. With LoRA tuning on Qwen-Image 20B, APEX reaches a GenEval score of 0.89 at NFE=1 in 6 hours, surpassing the original 50-step teacher (0.87) and providing a 15.33times inference speedup. Code is available https://github.com/LINs-lab/APEX.
- Rethinking the Diffusion Model from a Langevin Perspective
Diffusion models are often introduced from multiple perspectives, such as VAEs, score matching, or flow matching, accompanied by dense and technically demanding mathematics that can be difficult for beginners to grasp. One classic question is: how does the reverse process invert the forward process to generate data from pure noise? This article systematically organizes the diffusion model from a fresh Langevin perspective, offering a simpler, clearer, and more intuitive answer. We also address the following questions: how can ODE-based and SDE-based diffusion models be unified under a single framework? Why are diffusion models theoretically superior to ordinary VAEs? Why is flow matching not fundamentally simpler than denoising or score matching, but equivalent under maximum-likelihood? We demonstrate that the Langevin perspective offers clear and straightforward answers to these questions, bridging existing interpretations of diffusion models, showing how different formulations can be converted into one another within a common framework, and offering pedagogical value for both learners and experienced researchers seeking deeper intuition.
- LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment
While the shortage of explicit action data limits Vision-Language-Action (VLA) models, human action videos offer a scalable yet unlabeled data source. A critical challenge in utilizing large-scale human video datasets lies in transforming visual signals into ontology-independent representations, known as latent actions. However, the capacity of latent action representation to derive robust control from visual observations has yet to be rigorously evaluated. We introduce the Latent Action Representation Yielding (LARY) Benchmark, a unified framework for evaluating latent action representations on both high-level semantic actions (what to do) and low-level robotic control (how to do). The comprehensively curated dataset encompasses over one million videos (1,000 hours) spanning 151 action categories, alongside 620K image pairs and 595K motion trajectories across diverse embodiments and environments. Our experiments reveal two crucial insights: (i) General visual foundation models, trained without any action supervision, consistently outperform specialized embodied latent action models. (ii) Latent-based visual space is fundamentally better aligned to physical action space than pixel-based space. These results suggest that general visual representations inherently encode action-relevant knowledge for physical control, and that semantic-level abstraction serves as a fundamentally more effective pathway from vision to action than pixel-level reconstruction.
Techmeme(15)
- Objection, which aims to use AI and experts to evaluate claims in news stories, debuts with funding from Peter Thiel, Balaji Srinivasan; evaluations cost $2,000 (Rebecca Bellan/TechCrunch)
Rebecca Bellan / TechCrunch : Objection, which aims to use AI and experts to evaluate claims in news stories, debuts with funding from Peter Thiel, Balaji Srinivasan; evaluations cost $2,000 — After helping lead the lawsuit that bankrupted media firm Gawker, Aron D'Souza says he saw something broken in the American media system …
- NY-based Auctor, which uses AI to curate resource plans and process flows to help companies adopt new software, raised $20M in a combined seed and Series A (Chris Metinko/Axios)
Chris Metinko / Axios : NY-based Auctor, which uses AI to curate resource plans and process flows to help companies adopt new software, raised $20M in a combined seed and Series A — Auctor, a startup helping companies adopt new software, raised $20 million in a combined seed and Series A, CEO William Sun tells Axios Pro first.
- A jury finds that Live Nation and Ticketmaster illegally maintained monopoly power in the ticketing market, in a case brought by state AGs after the DOJ settled (NBC News)
NBC News : A jury finds that Live Nation and Ticketmaster illegally maintained monopoly power in the ticketing market, in a case brought by state AGs after the DOJ settled — The federal government struck a settlement with Live Nation in March, requiring Ticketmaster to divest up to 13 amphitheaters …
- Google rolls out Gemini 3.1 Flash TTS, a text-to-speech model with support for over 70 languages and audio tags that give developers granular speech control (Matthias Bastian/The Decoder)
Matthias Bastian / The Decoder : Google rolls out Gemini 3.1 Flash TTS, a text-to-speech model with support for over 70 languages and audio tags that give developers granular speech control — The company says it's the most natural and expressive voice output it has shipped to date. The big new feature is audio tags …
- Q&A with Jensen Huang on Nvidia's supply chain moat, competition from ASICs like Google's TPU, investing in AI labs and neoclouds, selling to China, and more (Dwarkesh Patel/Dwarkesh Podcast)
Dwarkesh Patel / Dwarkesh Podcast : Q&A with Jensen Huang on Nvidia's supply chain moat, competition from ASICs like Google's TPU, investing in AI labs and neoclouds, selling to China, and more — “If our next several years are a trillion dollars in scale, we have the supply chain to do it”
- WPP, Dentsu, and Publicis settle with the FTC over claims they colluded on misinformation policies that denied ad revenue to conservative publishers (David McCabe/New York Times)
David McCabe / New York Times : WPP, Dentsu, and Publicis settle with the FTC over claims they colluded on misinformation policies that denied ad revenue to conservative publishers — WPP, Dentsu and Publicis settled claims they colluded on policies to combat misinformation, denying ad revenue to publishers on the right.
- Adobe unveils Firefly AI Assistant, which can orchestrate and execute multistep tasks across Creative Cloud apps, available in public beta in the coming weeks (Ivan Mehta/TechCrunch)
Ivan Mehta / TechCrunch : Adobe unveils Firefly AI Assistant, which can orchestrate and execute multistep tasks across Creative Cloud apps, available in public beta in the coming weeks — Last October, Adobe previewed a new assistant under the “Project Moonlight” moniker that could do tasks for you by tapping different Adobe apps …
- Google launches a Gemini Mac app, featuring a keyboard shortcut, screen sharing for better context, image generation with Nano Banana, and more (Abner Li/9to5Google)
Abner Li / 9to5Google : Google launches a Gemini Mac app, featuring a keyboard shortcut, screen sharing for better context, image generation with Nano Banana, and more — Gemini now has a native Mac app in the first expansion from Android and iOS. — This “native desktop experience” is launched via an Option + Space keyboard shortcut.
- The US Energy Information Administration plans to implement a mandatory nationwide survey of data centers focused on their energy use (Molly Taft/Wired)
Molly Taft / Wired : The US Energy Information Administration plans to implement a mandatory nationwide survey of data centers focused on their energy use — In a letter obtained by WIRED, the Energy Information Administration tells two senators that it plans to develop a mandatory assessment of data centers' energy use.
- Cal.com, which provides scheduling software, is moving its core open-source codebase to a closed repository, citing the dangers of AI hacking its open code (Steven Vaughan-Nichols/ZDNET)
Steven Vaughan-Nichols / ZDNET : Cal.com, which provides scheduling software, is moving its core open-source codebase to a closed repository, citing the dangers of AI hacking its open code — ZDNET's key takeaways — Cal is reluctantly moving away from open source for security. — This move isn't about Mythos, but risks from modern AI tools.
- AI cloud infrastructure company Parasail raised a $32M Series A led by Touring Capital and Kindred Ventures, bringing its total funding to $42M (Tim Fernholz/TechCrunch)
Tim Fernholz / TechCrunch : AI cloud infrastructure company Parasail raised a $32M Series A led by Touring Capital and Kindred Ventures, bringing its total funding to $42M — “Give me tokens. Just give me tokens. I want them fast. I want them cheap. I want them now." — That's the mantra for developers building software …
- Filings: Marc Andreessen and Ben Horowitz donated $25M to pro-AI super PAC Leading the Future, bringing the group's total cash on hand to over $51M (Bloomberg)
Bloomberg : Filings: Marc Andreessen and Ben Horowitz donated $25M to pro-AI super PAC Leading the Future, bringing the group's total cash on hand to over $51M — Venture capitalists Marc Andreessen and Ben Horowitz poured $25 million into a pro-artificial intelligence super political action committee …
- Hilbert, whose AI software connects data across teams to help companies make decisions from a single system, raised a $28M Series A led by a16z (Madison Mills/Axios)
Madison Mills / Axios : Hilbert, whose AI software connects data across teams to help companies make decisions from a single system, raised a $28M Series A led by a16z — Hilbert, an AI startup rethinking how companies drive growth, raised a $28 million Series A led by Andreessen Horowitz, the startup told Axios exclusively.
- A survey of US teens: ~90% say entertainment is a reason they use TikTok, Instagram, or Snapchat, 57% message daily on Snapchat, 37% say TikTok impacts sleep (Pew Research Center)
Pew Research Center : A survey of US teens: ~90% say entertainment is a reason they use TikTok, Instagram, or Snapchat, 57% message daily on Snapchat, 37% say TikTok impacts sleep — Teens largely turn to TikTok, Instagram and Snapchat for fun and connection. But experiences around messaging, screen time and cyberbullying vary.
- Artemis, which aims to replace rule-based cybersecurity systems with an AI-driven centralized "brain", emerges from stealth with a $70M Series A led by Felicis (Sharon Goldman/Fortune)
Sharon Goldman / Fortune : Artemis, which aims to replace rule-based cybersecurity systems with an AI-driven centralized “brain”, emerges from stealth with a $70M Series A led by Felicis — Artemis, a new cybersecurity startup trying to help defenders fight AI-powered attacks with AI, emerged from stealth today …
Solidot(15)
- 全球暖化危及水稻产量
全球九成以上的大米产自亚洲,亚洲有逾十亿人口以大米为主食。历史数据显示,亚洲大米过去九千年很少能在年平均气温逾 28°C 或暖季最高气温逾 33°C 的地区茁壮成长,而全球暖化的速度比水稻能适应的速度快 5000 倍。根据发表在《Communications Earth & Environment》期刊上的一项研究,预计到 2070 年印度和东南亚等传统水稻种植区的温度将超过 40 摄氏度,在这种温度下现有的亚洲水稻无法正常生长,逾十亿人的粮食安全面临威胁。研究人员指出,现有的水稻未适应气温升高,而是其种植从较暖的地区扩大到了较凉的地区,种植面积增加了,但承受气候变化的能力并没有增长。以中国为例,水稻种植从华中扩大到华北,同时加大了炎热地区的灌溉力度,因此大米的产量才略有增长。
- 美国国会新法案要求操作系统验证用户年龄
美国民主党议员 Josh Gottheimer 和共和党议员 Elise M. Stefanik 联合提出了一项新法案,要求操作系统开发商验证用户的年龄。法案已于 4 月 13 日递交到众议院能源和商业委员,其详细内容还没有公开,目前只知道法案的标题是“To require operating system providers to verify the age of any user of an operating system, and for other purposes”。该法案可能是 Gottheimer 本月初提出的 Parents Decide Act 的一部分,其中包括:要求苹果和 Google 等操作系统开发商在设置新设备时验证用户年龄,不依赖于用户自行报告的年龄;允许家长从一开始就设置适合孩子年龄的内容控制,包括限制访问社交媒体、应用和 AI 平台;确保年龄和家长控制设置安全传输到应用和 AI 平台,它们因此能为儿童量身定制内容;通过在各个平台建立一致且可信的标准,防止儿童访问有害或露骨的内容,包括不合适的 AI 聊天机器人互动。
- 斯坦福报告凸显了 AI 业内人士和公众之间的分歧
斯坦福大学 HAI 研究院本周一发表了年度报告 AI Index。报告凸显了 AI 业内人士和公众之间日益扩大的分歧。报告援引皮尤研究中心上月发布的一份报告:只有 10% 的美国人对 AI 在日常生活中的日益普及感到兴奋而非担忧,但 56% 的 AI 专家认为 AI 将在未来 20 年对美国产生积极影响。AI 专家的意见和公众情绪存在显著分歧:84% 的专家认为 AI 未来 20 年将对医疗保健产生积极影响,只有 44% 的公众持相同观点;73% 的专家积极看待 AI 对工作方式的影响,而持相同观点的公众仅占 23%;69% 的专家认为 AI 将对经济产生积极影响,只有 21% 的公众持相同观点;AI 专家对 AI 对就业市场的影响持较为乐观态度,而 64% 的公众认为 AI 将在未来 20 年导致就业岗位减少。
- 俄罗斯流行应用被发现会检测是否安装 VPN
RKS Global 的专家发现,30 款俄罗斯最流行 Android 应用有 22 款能检测 VPN,其中 19 款会将 VPN 状态发送至服务器。这些应用包括:Yandex Browser、Yandex Maps、VKontakte、My MTS、Sberbank Online、T-Bank、VK Video、Wildberries、 Kinopoisk、Ozon、Samokat、RuStore、VTB Online、Yandex Music、Avito、Alfa-Bank、2GIS、MegaMarket、Odnoklassniki、MAX、Rutube 和 VK Music。俄罗斯数字发展部已经要求大型企业从 4 月 15 日起限制那些在设备上启用了 VPN 的用户的访问。调查发现,Avito 应用会检测设备上是否安装了逾 200 种外国应用,其中包括银行、加密货币钱包和 IM 等。
- 安娜的档案被勒令向 Spotify 等赔偿 3.22 亿美元
纽约南区法官 Jed Rakoff 作出缺席判决,勒令影子图书馆安娜的档案(Anna's Archive)向 Spotify 以及唱片公司环球音乐、索尼音乐和华纳唱片赔偿 3.22 亿美元,其中 Spotify 获得 3 亿美元赔偿。安娜的档案的运营者没有现身法庭为其辩护,也不太可能支付赔偿金。这起诉讼源自去年安娜的档案宣布抓取了 Spotify 的音乐文件,它随后发布了 Spotify 的元数据,但并没有公开音乐文件,尽管如此 Spotify 和唱片公司对安娜的档案提起了诉讼,导致这家影子图书馆主域名被扣押。Spotify 等原告的主要目的也不是赔偿金,而是要在全球封杀安娜的档案。法官 Rakoff 也宣布了永久禁令,涉及 10 个域名 annas-archive.org、.li、.se、.in、.pm、.gl、.ch、.pk、.gd 和.vg。
- 互联网档案馆存档数千音乐会录音带
芝加哥音乐粉丝 Aadam Jacobs 自 1980 年代起录制他参加的演唱会,至今积累了逾万盘磁带。现年 59 岁的 Jacobs 知道磁带会随着时间推移而损坏,因此他同意互联网档案馆的志愿者将这些磁带数字化。互联网档案馆已经上传了 2477 盘音乐会磁带,其中包括相当罕见的 1989 年涅盘乐队(Nirvana)音乐会磁带,乐队被主流观众所熟悉是在 1991 年发行的单曲 Smells Like Teen Spirit 之后。
- 美国最完美的约会日期是 10 月 8 日
在 2000 年上映的电影《特工佳丽(Miss Congeniality)》中,桑德拉·布洛克扮演的女主角被问到完美的约会日期时回答,“April 25th, because it's not too hot, not too cold. All you need is a light jacket”。但 4 月 25 日真的是完美的天气?WeatherBug 对美国自 2018 年以来的天气的分析显示,10 月 8 日才是全美温度最舒适和降雨量最低的日期,当天的平均气温约为 19 摄氏度,降雨量约 0.25 毫米。而 4 月 25 日在全年排名中排在第 80 位,平均气温 16 摄氏度,降雨量约 0.32 毫米。数据显示,7 月是美国全年最热的月份,而 1 月则是最冷的月份,1 月 20 日的全国平均气温约 0.5 摄氏度。
- 英国首相表示社媒平台应停止无限滚动
在澳大利亚等国之后,英国也在考虑限制儿童使用社交媒体,它正在测试禁令、宵禁和应用使用时间限制等措施观察这些措施对儿童睡眠、家庭生活和学业的影响。英国首相 Keir Starmer 表示社媒平台应移除针对年轻用户的让人上瘾的无限滚动功能。他接受 BBC 采访时称社媒公司设计的算法旨在让用户上瘾,家长正要求政府介入。
- 微软 Surface 系列产品大幅涨价
由于 AI 热导致内存和固态硬盘等产品供不应求,相关零部件过去几个月价格翻了几倍,导致电脑和手机等产品也跟着涨价,现在微软也大幅提高了其 Surface 系列产品的价格。12 英寸的 Surface Pro 从 799 美元涨到了 1049 美元(起售价),13 英寸的 Surface Pro 从 999 美元涨到 1499 美元,13 英寸的 Surface Laptop 从 899 美元涨到了 1149 美元,13.8 英寸 Surface Laptop 从 999 美元涨到 1499 美元,15 英寸型号起售价现在为 1599 美元。15 英寸 Surface Laptop 的顶配如今价格超过了同等规格的苹果 16 英寸 MacBook Pro。难以想象苹果的电脑产品有一天会成为比 Windows 电脑更廉价同时性能还更好的产品。
- 亚马逊收购 Globalstar
亚马逊和 Globalstar 发表新闻稿,宣布达成了收购协议,扩大亚马逊的 Leo 宽带卫星网络。苹果持有 Globalstar 五分之一的股份,作为收购协议的一部分,苹果与亚马逊达成协议,Amazon Leo 将为 iPhone 和 Apple Watch 提供卫星服务,包括通过卫星提供紧急 SOS 求救服务。
- Google 违反承诺未提前通知就将用户数据交给 ICE
2024 年 9 月 Amandla Thomas-Johnson 持学生签证在美国攻读博士学位期间在康奈尔大学参加了一次支持巴勒斯坦的抗议活动。2025 年 4 月美国 ICE 向 Google 发出行政传票要求提供 Thomas-Johnson 的数据。Google 承诺,在根据法律程序——包括行政传票——交出用户数据之前会通知用户,此举旨在让用户有机会对数据披露要求提出质疑。但本案中 Google 没有提前通知就交出了数据。电子前哨基金会(EFF)代表 Thomas-Johnson 向加州和纽约州投诉 Google,指控其违反承诺从事了商业诈欺行为。Thomas-Johnson 已经离开了美国,他在 EFF 上称即使他离开美国也并没有摆脱它的控制,他说,“任何人都有可能成为执法部门的目标,而科技公司凭借其庞大的数据存储,可以助长这些任意调查。”
- Google 将惩罚“后退按钮劫持”行为
今天的很多网站不让用户“后退”,但到了 6 月 15 日,如果网站还这么做,Google 将会进行惩罚,大幅降低网站的搜索排名。Google 将把这种被称为“后退按钮劫持”的做法定性为恶意行为。“后退按钮劫持”旨在强行将用户留在网站以增加流量,当访客试图通过后退按钮返回上一页,网站会篡改页面浏览历史记录,在用户点击后退按钮时插入其他内容。Google 表示,后退按钮应该始终执行用户预期的功能——返回上一页,任何其他行为都属于一种欺骗性的用户体验。
- 德国主权科技基金向 Mastodon 资助 61.4 万欧元
德国主权科技基金(Sovereign Tech Fund)向联邦宇宙微博客项目 Mastodon 资助 61.4 万欧元,用于支持 Mastodon 及其软件生态系统的改进和更新。这笔资金将投入到改进:黑名单同步;新的 Fediverse Auxiliary Service Provider(FASP)允许服务器之间共享存储和媒体处理资源;自动化内容检测;私信端到端加密;改进文档。相关改进预计在 2026 年底到 2027 年完成。
- OpenSSL 4.0 释出
OpenSSL 项目释出了 v4.0 版本。主要新特性包括:支持 Encrypted Client Hello (ECH),通过加密初始 TLS 握手以及隐藏服务器名称指示(SNI)提供更好的隐私保护;移除 SSLv3 等旧协议/引擎支持;通过支持 RFC 8998 改进后量子加密支持;移除 SSLv2 Client Hello,停止支持 Darwin i386 和 PowerPC/PPC64 等。
- Servo 发布首个 crates.io 版本
Rust 语言开发的浏览器渲染引擎项目 Servo 释出了 servo crate v0.1.0,这是它发布的首个 crates.io 版本,允许 Servo 作为一个库供其他项目使用。Servo 源自 Mozilla,2020 年 8 月 Mozilla 在裁员时砍掉了 Servo 引擎团队的大部分成员。Servo 项目之后脱离 Mozilla 成为一个独立项目,由 Linux 基金会托管,旨在为其它项目提供一个嵌入的高性能的、安全的渲染引擎。Servo 项目于 2025 年 10 月释出了 v0.0.1 版本,之后以每月发布一个新版本的频率发布。开发者表示他们计划以每半年更新一次的频率提供长期支持版本(LTS),因为嵌入开发者可能无法及时更新到最新 Servo 版本,他们更适合使用 LTS 版本。LTS 版本预计将提供九个月的安全更新。