OrangeBot.AI Digest — 2026-06-16
88 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- GrapheneOS has been ported to Android 17 and official releases are coming soon (discuss.grapheneos.org)
- U.S. pulling ocean sensors a 'shock' for Canadian research as El Niño nears (www.timescolonist.com)
- Apple is about to make Hide My Email useless (arseniyshestakov.com)
- Calvin and Hobbes and the price of integrity (therepublicofletters.substack.com)
- Stop Using JWTs (gist.github.com)
- Is Meta destroying its engineering organization? (newsletter.pragmaticengineer.com)
- TIL: You can make HTTP requests without curl using Bash /dev/tcp (mareksuppa.com)
- Apple's weird anti-nausea dots cured my car sickness (www.theverge.com)
- But yak shaving is fun (2019) (parksb.github.io)
- Running local models is good now (vickiboykis.com)
- Google Chrome update will close the door on ad blockers (9to5google.com)
- SpaceX Is Buying Cursor (www.bbc.com)
- Mechanical Watch (2022) (ciechanow.ski)
- Correlated randomness in Slay the Spire 2 (tck.mn)
- SpaceX to buy Cursor for $60B (www.reuters.com)
GitHub Trending(13)
- freeCodeCamp / freeCodeCamp
- swc-project / swc
- teslamate-org / teslamate
- iptv-org / iptv
- puppeteer / puppeteer
- meshery / meshery
- cypress-io / cypress
- music-assistant / server
- Universal-Debloater-Alliance / universal-android-debloater-next-generation
- OpenBMB / VoxCPM
- alibaba / zvec
- rmyndharis / OpenWA
- n0-computer / iroh
Product Hunt(15)
- looquee
Copilot for college applications
- Ledgerly
Available on the App Store
- Fluxmail
AI email inbox and assistant
- Botme
AI customer support agent, live on your website in 5 minutes
- Athena Desktop
A local command room for AI coding agents.
- Obotiq
Autonomous care robots that give caregivers time back
- Getusefeed
Feature voting & public roadmap for small SaaS teams.
- Glint
Claude Code activity, right where you want it.
- ClientJam
AI-powered lead generation for designers and agencies
- Invoko
A little hand on your Mac
- Voice Calls in Chatwoot
Calls, chats, and emails all in one support inbox
- MakersClaw
Hire AI employees that live in your Slack, Teams, Telegram
- PeakRoutine
Personalized health coaching powered by your biomarkers
- agentbrowse
Give your AI coding agent the web as a command line
- Avocado
AI-native content operations for any Next.js website
Hugging Face(15)
- JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence
Many moments in the real world do not wait for a user to ask. A fire starts on a security monitor, an expression flickers across a video call, or a product a viewer wants flashes by in a livestream. Yet today's large models remain mostly turn-based by design: they answer only when addressed, and even video-call apps that appear interactive still operate as question-answer systems, reacting only when polled or prompted. We argue for a different paradigm: a model that is present in the world like a person. It continuously watches what is happening now, decides on its own whether to speak or stay silent, interacts in real time, and delegates to a background model when the problem is hard. To advance interaction models and their adoption across domains, we make two fully open-sourced contributions. First, we release JoyAI-VL-Interaction, an 8B-scale, vision-first VL-interaction model. The model makes the response decision internally, choosing each second to stay silent, respond, or delegate to a background model, and it excels at vision-triggered responsiveness and time awareness. We pair it with a transferable training recipe, from which capabilities we never trained for emerge, such as guiding a shopper through changing app screens or improvising a lecture from a slide deck. Second, we release a complete, deployable system built around that model. The system streams any ongoing video into the model, making it genuinely present in the world. All other components are pluggable, including ASR/TTS modules, memory, visualization UI, and a background brain that can connect to any API or agent. Across six real-world scenarios, human raters prefer JoyAI-VL-Interaction over the in-app video-call assistants of Doubao and Gemini by a wide margin. To our knowledge, this is the first open, vision-driven interaction model released together with its training recipe, data, and complete deployable system.
- Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories
Data tells stories that shape society; the data journalist's job is to turn raw information into stories non-experts can trust. A high-quality news feature takes a newsroom team weeks: hunting for context, running statistics, choosing an angle, and designing visuals. Recent agents handle individual steps well: data-science agents close the analysis loop, while design agents synthesize beautiful websites. But can an agent serve as a data journalist end to end? We introduce Data Journalist Agent (Data2Story), a multi-agent framework that orchestrates specialized roles into a single virtual newsroom. Data2Story contributes two innovations. (i) Claims are evidence-grounded: an Inspector links every number, angle, and asset back to data, code, or an external reference. (ii) Articles are multimodally generative: rather than defaulting to plain text and static charts, Data2Story reasons about what readers will want to see, then deploys multimodal tools, such as interactive maps for geography and audio for music. We evaluate Data2Story on 18 articles, each paired with the originally published expert piece, along four axes: (a) human-agent angle coverage; (b) rubric evaluation with 53 participants across five dimensions; (c) computer-use agents as judges, a cost-saving proxy for how readers navigate interactive articles; and (d) verifiability, where a coding verifier re-executes statements against the data and checks claims against references. Data2Story produces competitive, evidence-traceable multimedia stories, with particular strength in transparency and auditability. Human articles retain an edge in editorial angle, creative design, and presentation. We position Data2Story as a collaborator for journalists, enabling more evidence-based, transparent, and verifiable reporting. Code and demos are available at https://data2story.github.io.
- Geometric Action Model for Robot Policy Learning
Generalist robot policies must follow user instructions while reasoning about how objects, cameras, and robot actions interact in the 3D physical world. Recent vision-language-action models (VLAs) and video world-action models (WAMs) inherit strong semantic or temporal priors from large-scale foundation models, but they still operate primarily on 2D image frames or 2D-derived latent spaces, leaving implicit the 3D geometry required for contact-rich manipulation. We propose the Geometric Action Model (GAM), a language-conditioned manipulation policy that directly repurposes a pretrained geometric foundation model (GFM) as a shared substrate for perception, temporal prediction, and action decoding. GAM splits the GFM at an intermediate layer: the shallow layers serve as an observation encoder, and a causal future predictor inserted at the split layer forecasts future latent tokens conditioned on language, proprioception, and action history. The predicted future tokens are then routed through the remaining GFM blocks for feature propagation and decoding, allowing a single backbone to produce both future geometry and actions. This design equips the GFM with language-conditioned temporal world modeling through minimal architectural modification while preserving its rich geometric priors. Across a broad suite of simulation and real-robot manipulation benchmarks, GAM is more accurate, more robust, faster, and lighter than current foundation-model-scale baselines.
- DreamX-World 1.0: A General-Purpose Interactive World Model
DreamX-World 1.0 is a general-purpose interactive text/image-to-video world model for controllable long-horizon generation. It supports camera navigation, revisits to previously observed regions, and promptable events across photorealistic, game-style, and stylized domains. Our data engine combines camera-accurate Unreal Engine rendering, action-rich gameplay recordings, and real-world videos with recovered camera geometry. For camera control, we introduce E-PRoPE, a lightweight variant of projective positional encoding that retains PRoPE's projective camera geometry while applying camera-aware attention to spatially reduced tokens. We convert a bidirectional video generator into a few-step autoregressive world model using causal forcing, DMD-style distillation, and long-rollout training. Training on self-generated long-horizon contexts exposes the model to its own generated history and reduces the style and color drift that accumulates across autoregressive chunks. Memory-Conditioned Scene Persistence retrieves earlier views through camera-geometry-based retrieval, while residual recycling makes the conditioning path less sensitive to imperfect memory latents. Event Instruction Tuning adds composable event control, and reinforcement learning alignment recovers camera control and visual quality after distillation. With mixed-precision DiT execution, residual reuse, 75%-pruned VAE decoding, and asynchronous pipeline parallelism, DreamX-World 1.0 reaches up to 16 FPS on eight RTX 5090 GPUs. On our 5-second basic evaluation, DreamX-World 1.0 achieves a camera-control score of 73.75 and an overall score of 84.76, outperforming HY-WorldPlay 1.5 and LingBot-World in overall score, which achieve 80.79 and 80.45, respectively.
- FastContext: Training Efficient Repository Explorer for Coding Agents
Large Language Model (LLM) coding agents have achieved strong results on software engineering tasks, yet repository exploration remains a major bottleneck: locating relevant code consumes substantial token budget and pollutes the agent's context with irrelevant snippets. In most agents, the same model explores the repository and solves the task, leaving exploratory reads and searches in the solver's history. We present FastContext, a dedicated exploration subagent that separates repository exploration from solving. Invoked on demand, FastContext issues parallel tool calls and returns concise file paths and line ranges as focused context. FastContext is powered by specialized exploration models spanning 4B-30B parameters. We bootstrap them from strong reference-model trajectories and refine them with task-grounded rewards for broad first-turn search, multi-turn evidence gathering, and precise citation generation. Across SWE-bench Multilingual, SWE-bench Pro, and SWE-QA, integrating FastContext into Mini-SWE-Agent improves end-to-end resolution rates by up to 5.5% while reducing coding-agent token consumption by up to 60%, with marginal overhead. These results show that repository exploration can be separated from solving and handled effectively by specialized models. Code and data: https://github.com/microsoft/fastcontext
- VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models
This technical report introduces VibeThinker-3B, a compact dense model with 3B parameters developed to investigate how far verifiable reasoning can be pushed within a strictly small-model regime. Building upon the Spectrum-to-Signal post-training paradigm, we systematically enhance the model through an optimized pipeline that includes curriculum-based supervised fine-tuning, multi-domain reinforcement learning, and offline self-distillation. Experimental evaluations demonstrate that VibeThinker-3B achieves frontier-level performance on highly demanding verifiable tasks. Specifically, it attains a score of 94.3 on AIME26 (improving to 97.1 with claim-level test-time scaling), an 80.2 Pass@1 on LiveCodeBench v6, and exhibits strong out-of-distribution generalization with a 96.1% acceptance rate on recent unseen LeetCode contests. This effectively places it in the performance band of first-tier reasoning systems, matching or exceeding flagship models that are orders of magnitude larger, such as DeepSeek V3.2, GLM-5, and Gemini 3 Pro. Furthermore, a score of 93.4 on IFEval confirms that this extreme reasoning enhancement does not compromise strict instruction controllability. Extending our previous 1.5B work, these findings motivate the Parametric Compression-Coverage Hypothesis, which views verifiable reasoning as compressible into compact reasoning cores, while open-domain knowledge and general-purpose competence require broad parameter coverage over facts, concepts, and long-tail scenarios. This perspective suggests that compact models are not merely deployment-efficient substitutes, but a complementary path toward frontier-level performance in parameter-dense capability regimes.
- Who Should Lead Decoding Now? Tracking Reliable Trajectories for Ensembling Masked Diffusion Language Models
Masked Diffusion Language Models (MDLMs) have emerged as a distinct paradigm for sequence generation. As MDLMs become diverse in capabilities and knowledge coverage, an important question is how to combine their knowledge. Toward this, we first investigate the unique decoding dynamics of MDLMs. We find that successful generations exhibit stable confidence dynamics over answer-relevant positions, while unreliable trajectories can often be corrected by injecting promising intermediate states from other models. Guided by this observation, we propose TIE (Trajectory-based Iterative Ensembling), a knowledge fusion framework in which MDLMs iteratively identify reliable decoding trajectories and relay them across models. TIE tracks confidence dynamics over answer-relevant positions to determine which model currently follows a more reliable trajectory and selectively transfers partially denoised sequences across models. As the model on the more promising trajectory often changes across denoising steps, TIE allows different models to contribute complementary strengths at different stages of generation. Strong performance across diverse reasoning tasks, along with our analyses, suggests that TIE offers a practical approach to the underexplored problem of MDLM ensembling.
- BRDFusion: Physics Meets Generation for Urban Scene Inverse Rendering
Inverse rendering of urban scenes from captured videos enables numerous applications, including content creation and autonomous driving simulation. Physically-based rendering methods follow and control lighting physics, but suffer from reconstruction and rendering artifacts. While generative models produce realistic videos, they offer limited consistency and controllability. We present BRDFusion, a unified framework that combines two complementary models for inverse and forward rendering. Specifically, BRDFusion recovers explicit, consistent scene properties with physical modeling and alleviates optimization ambiguity with generative priors. During forward rendering, the physical model provides controllable rendering from the scene configuration, and the generative model denoises and fixes artifacts. Therefore, our method produces high-quality videos while allowing precise control, outperforming baselines in real and synthetic scenes. Moreover, BRDFusion supports novel-view relighting, night simulation, and dynamic object insertion/editing. Project page: https://shigon255.github.io/brdfusion-page/
- VisualClaw: A Real-Time, Personalized Agent for the Physical World
Vision language models are serving as general-purpose interfaces for complex multimodal tasks. However, deployment still faces three gaps: VLMs typically incur high latency and cost when processing dense video frames and long prompts, the agent scaffold remains static after deployment, and standard video-QA benchmarks do not test whether agents can use visual evidence inside tool-using workspaces. We present VisualClaw, a self-evolving multimodal agent built around two principles. First, hybrid encoding reduces deployment cost by filtering less informative streaming frames with a cascaded gate and compressing the text skill bank through hot/cold top-k injection. Second, skill evolution lets the agent learn from failures: retrieved memories condition an evolver as direct concatenated context or as guided evidence, producing skill-bank updates that help future questions. Across 4 video-QA benchmarks with 2 VLMs, VisualClaw cuts per-question API cost by an average of 98% versus full-frame upload and by 25.9% relative to the offline uniform 8-frame baseline, while boosting accuracy in most settings, e.g., an average +3.85% and a peak +15.80% on EgoSchema with Gemini 3 Flash. To address the benchmark gap, we curate VisualClawArena, a 200-scenario multimodal agentic benchmark built through a strict five-stage pipeline; models must use video evidence, documents, dynamic updates, and executable checks inside a workspace. On VisualClawArena, the same framework with computer-use agent backends improves macro accuracy by +2.9% for Codex (GPT-5.5) and +3.2% for Claude Code (Sonnet 4.6) over no-evolution baselines, with a 9.5% cost reduction compared to the uniform-sampled baseline. These properties make VisualClaw a natural fit for edge applications, where the cascade reduces a 1-hour streaming session from ~3,600 API uploads down to only 5-20 calls and the self-evolution makes it a perfect personalized assistant.
- OneRank: Unified Transformer-Native Ranking Architecture for Multi-Task Recommendation
Multi-task learning (MTL) is essential in recommender systems to enable complementary learning among diverse user feedback. While modern industrial practices have shifted from DNNs to Transformer-centric architectures to strengthen sequence modeling and scaling capacity, they still decouple feature encoding from multi-task prediction, treating the Transformer as a task-agnostic encoder. This design fundamentally limits the performance and scalability by (1) creating an information bottleneck under heterogeneous task objectives, (2) inducing gradient interference that leads to the seesaw phenomenon, and (3) forcing a dataflow transition in which attention-based, context-adaptive representation learning is converted to static feed-forward task prediction with incompatible information read-write dynamics. We propose OneRank, a Transformer-native multi-task ranking framework that eliminates encoder-predictor separation and introduces task-private channels for forward representation learning and backward optimization, enabling task-specialized learning while reducing inter-task interference. In the forward pass, OneRank learns task-specific representations bottom-up through task-conditioned information selection, candidate-aware contextualization, and controlled cross-task interaction. In the backward pass, cross-task gradient detachment isolates task-private parameter updates from shared knowledge extraction modules, preventing negative transfer. We further replace static task-specific MLP scorers with dynamic matching-based scoring for context-aware personalized ranking. By internalizing multi-task reasoning within the Transformer stack, OneRank establishes a unified and scalable architectural paradigm. Offline and online experiments on large-scale industrial datasets show that OneRank significantly outperforms state-of-the-art baselines while maintaining computational efficiency.
- BadWorld: Adversarial Attacks on World Models
Visual world models (VWMs) synthesize interactive, action-conditioned rollouts from a single context image. However, it remains an open question how robust these models are to adversarial perturbations. Standard adversarial attacks fail to assess this vulnerability because attackers lack ground-truth future videos and cannot predict subsequent user controls. We introduce BadWorld, a label-free adversarial framework tailored for autoregressive VWMs that systematically overcomes both constraints. First, to bypass the need for future supervision, we propose a self-supervised velocity attack that directly disrupts the early denoising dynamics of the model. Second, to ensure the attack generalizes across unpredictable user actions, we formulate a trajectory-adaptive bi-level optimization that actively mines hard control sequences to forge control-agnostic perturbations. Evaluated on representative VWMs with continuous and discrete controls, BadWorld exposes severe structural fragility. Visually indistinguishable adversarial images reliably trigger catastrophic degradation in future rollouts, leading to incomplete denoising, structural collapse, and control inconsistency. These findings reveal critical risks for deploying VWMs in safety-critical systems while highlighting a practical mechanism for privacy protection.
- Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation
We introduce Qwen-RobotWorld, a language-conditioned video world model for embodied intelligence. With natural language as a unified action interface, it predicts physically grounded future visual trajectories from current observations across robotic manipulation, autonomous driving, indoor navigation, and human-to-robot transfer. This unified formulation provides three promising application directions: synthetic data generation for policy training augmentation, scalable virtual environments for policy evaluation, and language-guided planning signals for downstream robot control. This is achieved through a three-part design: a) Double-Stream MMDiT with MLLM Action Encoding, where a 60-layer double-stream diffusion transformer couples frozen Qwen2.5-VL semantics with video-VAE latents through layer-wise joint attention; b) Embodied World Knowledge (EWK), an 8.6M video-text corpus (200M+ frames) with action-language mapping over 20+ embodiments and 500+ action categories; and c) General+Expert Progressive Curriculum, a two-stage training strategy that first learns general visual priors and then injects embodied specialization under a shared language interface. Extensive results show strong competitiveness: the model ranks 1st overall on EWMBench and DreamGen Bench and outperforms all open-source models on WorldModelBench and PBench. Additional zero-shot analyses on the RoboTwin-IF benchmark further support robust generalization and multi-view consistency.
- TokenPilot: Cache-Efficient Context Management for LLM Agents
As LLM agents are deployed in long-horizon sessions, context accumulation drives up inference costs. Existing approaches utilize text pruning or dynamic memory eviction to minimize token footprints; however, their unconstrained sequence mutations alter layouts, introducing prefix mismatches and cache invalidation. This reveals a critical trade-off between text sparsity and prompt cache continuity. To address this, we present TokenPilot, a dual-granularity context management framework. Globally, Ingestion-Aware Compaction acts as a framework harness to stabilize prompt prefixes and eliminate open-world environmental noise at the ingestion gate. Locally, Lifecycle-Aware Eviction monitors the ongoing residual utility of context segments, enforcing a conservative batch-turn schedule to offload content segments only when task relevance expires. Experiments on PinchBench and Claw-Eval under both isolated and continuous modes demonstrate that TokenPilot reduces costs by 61% and 56% in isolated mode, and 61% and 87% in continuous mode, while maintaining competitive performance compared to prior systems. TokenPilot has been integrated into LightMem2 at https://github.com/zjunlp/LightMem2.
- SP^3: Spherical Priors for Plug-and-Play Restoration
In this paper, we introduce SP^3, a novel Plug-and-Play algorithm that accelerates maximum a posteriori image restoration by replacing denoisers with Spherical Encoders (SE) as generative priors. SP^3 approximates the intractable proximal prior step by utilizing the SE's tightly structured latent space as a robust projection onto the natural image manifold. Alternating this projection with a closed-form data-consistency step, via Half-Quadratic Splitting, achieves stable convergence without requiring gradient computation during inference. This unique formulation unlocks "anytime" restoration capabilities, producing sharp, plausible images from the first iteration. Evaluations across a variety of image restoration tasks demonstrate that SP^3 achieves perceptual quality comparable to state-of-the-art zero-shot diffusion and flow methods while being 3-630× faster.
- Memento: Reconstruct to Remember for Consistent Long Video Generation
Long-form video generation requires recurring subjects to remain consistent across various shots, viewpoints, motions, and scene transitions. Existing temporal decomposition methods improve scalability by generating videos shot by shot. However, they mainly focus on optimizing plausible next-shot continuations without verifying whether the historical memory preserves identity-critical subject evidence. Consequently, as generation proceeds, recurring subjects may be diluted, overwritten, or forgotten. In this paper, we propose Memento, a subject-reconstruction-guided framework that treats subject preservation as an explicit identity grounding problem, based on the premise that a memory bank faithfully preserving a subject should support reconstructing that subject from memory alone. Specifically, Memento jointly trains autoregressive next-shot generation with memory-based subject reconstruction, recovering target appearances using historical memory and global story captions. To disentangle long-range subject evidence from short-range cues, Memento introduces a dual-query memory mechanism, where one query retrieves identity-relevant memory and the other selects short-context keyframes for coherent continuation. Additionally, a subject-aware cinematic data pipeline provides precise reconstruction supervision via consistent, pronoun-free subject descriptions. Experiments demonstrate that Memento achieves state-of-the-art performance in long-term subject consistency, cross-shot coherence, and visual quality.
Techmeme(15)
- SpaceX closed up 4.8% on Tuesday with a $2.65T market cap, just above Amazon's, after popping ~12% intraday and briefly overtaking Microsoft's $2.93T market cap (CNBC)
SpaceX shares popped about 12% on Tuesday, as Elon Musk's rocket builder continued its meteoric rise following a record-breaking IPO on Friday.
- Sources: PayPal to shutter its 10-year-old PayPal Ventures arm amid a broader shakeup under a new CEO and has hired Jefferies to explore selling some positions (Ben Weiss/Fortune)
PayPal is shuttering its 10-year-old venture team amid a broader corporate shakeup, according to five sources familiar with the matter.
- Melbourne-based Everlab, which is building an AI-powered preventive healthcare platform, raised a AU$65M Series A led by Airtree Ventures (Tegan Jones/SmartCompany)
Melbourne healthtech startup Everlab has raised $65 million in Series A funding led by Airtree Ventures as it expands its preventative healthcare platform into global markets.
- Sensor Tower: ChatGPT's market share fell to 46.4% by the end of May, as Gemini rose to 27.7% and Claude to 10.3%; Grok, Meta AI, and others have less than 5% (Ivan Mehta/TechCrunch)
More than three and a half years after ChatGPT's initial release, AI assistants are now used by millions of people worldwide, and the competitive landscape is changing fast.
- Z.ai debuts GLM-5.2, saying it has significant improvements for coding, agentic, and long-horizon tasks, with a 1M context window and MIT-licensed open weights (Z.ai)
We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability …
- Google launches Android 17 and Wear OS 7, first on Pixel devices, with support for the latest AI models, a bubble bar UI, and live updates on Wear OS (Sarah Perez/TechCrunch)
Google on Tuesday released the final version of its Android 17 operating system, as well as its counterpart for smartwatches, Wear OS 7.
- Qualcomm announces Snapdragon Reality Elite, its new flagship XR chipset, debuting in the compute puck of Xreal's Aura Android XR device this fall (David Heaney/UploadVR)
Qualcomm just announced Snapdragon Reality Elite, its new flagship XR chipset, and it will debut in the compute puck of Xreal's Aura Android XR device this fall.
- Letter: Hightouch, an ad tech startup recently valued at $2.75B, offers to buy LiveRamp's identity business from Publicis for $800M to $1.2B in cash and stock (Sara Fischer/Axios)
Hightouch, an ad tech startup recently valued at $2.75 billion, has offered to buy LiveRamp's identity business from ad holding group Publicis …
- A look at Specs, which have a battery life of up to four hours per charge, and an interview with Snap CEO Evan Spiegel about the AR glasses (Harry McCracken/Fast Company)
Snap's cofounder and CEO, Evan Spiegel, gave this morning's keynote at AWE, the augmented reality industry's big annual conference.
- Snap unveils $2,195 Specs, its first AR glasses that are "fully standalone" and have a 51-degree field of view, expected "this fall" in the US, UK, and France (Jay Peters/The Verge)
The new Specs are priced at $2,195, and you can preorder them starting today.
- Microsoft is moving Copilot Cowork to usage-based pricing and is considering a Microsoft-hosted version of DeepSeek as a cheaper option (Ina Fried/Axios)
Microsoft is moving Copilot Cowork to usage-based pricing as it expands access to the enterprise AI tool, and is considering a Microsoft-hosted version of DeepSeek as a cheaper model option.
- Sources: Apple's AirPods with cameras for AI are scheduled to launch in 2027 alongside the 20th anniversary pro iPhones and a second-generation foldable iPhone (Mark Gurman/Bloomberg)
Apple Inc.'s upcoming camera-equipped AirPods, a product meant to vault the company into the AI device market …
- Databricks agrees to acquire Panther Labs, its third cybersecurity acquisition; Panther was valued at $1.4B when it raised a $120M Series B in 2021 (Jeffrey Dastin/Reuters)
Jeffrey Dastin / Reuters : Databricks agrees to acquire Panther Labs, its third cybersecurity acquisition; Panther was valued at $1.4B when it raised a $120M Series B in 2021 — Databricks on Tuesday said it agreed to buy the startup Panther Labs, as the U.S. data analytics provider pushes deeper into the cybersecurity business.
- Bland, which uses proprietary, in-house-built voice models to process calls for 250+ enterprise clients, raised a $50M Series C led by Dell Technologies Capital (Lily Mae Lazarus/Fortune)
Lily Mae Lazarus / Fortune : Bland, which uses proprietary, in-house-built voice models to process calls for 250+ enterprise clients, raised a $50M Series C led by Dell Technologies Capital — Isaiah Granet was rejected by 180 investors in three weeks during Y Combinator. Their reason: phone calls won't exist in a year.
- Hydra Host, a marketplace for developers and enterprises to procure GPUs pooled from data center operators, raised a $100M Series A led by Kindred Ventures (Mike Wheatley/SiliconANGLE)
Mike Wheatley / SiliconANGLE : Hydra Host, a marketplace for developers and enterprises to procure GPUs pooled from data center operators, raised a $100M Series A led by Kindred Ventures — Data center infrastructure startup Hydra Host Inc. said today it has bagged $100 million in a Series A round of funding led by Kindred Ventures.
Solidot(15)
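- Vertical greening cools cities
Climate change and urbanization have intensified the heat island effect: urban areas are significantly hotter than rural ones, and the higher temperatures drive up cooling demand and strain the power grid, creating a vicious cycle. A team led by Associate Professor Jihui Yuan of Osaka Prefecture University in Japan investigated urban cooling strategies such as vertical greening. Their study shows that south-facing green walls can improve indoor thermal conditions by up to 1.7°C; low-albedo exterior surfaces can improve outdoor thermal comfort by up to 1.5°C; and high-albedo exterior surfaces help lower indoor temperatures.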
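- GLP-1 weight-loss drugs lower fracture rates as well as body weight
GLP-1 weight-loss drugs such as Ozempic, Wegovy, and Rybelsus reduce body weight quickly, and there had been concern that rapid weight loss could lead to osteoporosis and raise fracture risk. A new study, however, found that compared with other, slower-acting weight-loss drugs, GLP-1 drugs lowered fracture risk by 15%. The researchers acknowledge that more research is needed to confirm the association. They analyzed more than 59,000 patients: 26,324 took GLP-1 drugs, while the 33,555 in the control group took non-GLP-1 weight-loss drugs. The GLP-1 group had 794 fractures and the control group had 1,045.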
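- Amazon data centers used 2.5 billion gallons of water in 2025
According to figures published by Amazon, its data centers used 2.5 billion gallons of water in 2025. The e-commerce giant claims its water use is far lower than that of its main competitors. Amazon puts its own data center water use at 0.12 liters per kilowatt-hour (L/kWh), and says Microsoft used 0.27 L/kWh in 2025, Meta used 0.19 L/kWh in 2024, and Google was the worst at 1.15 L/kWh. Amazon says its facilities rely on "free-air cooling" about 90% of the time, drawing in outside air to flow past the servers and absorb heat without using water, and switch to evaporative water cooling only in the hottest weather.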
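- Commodore announces an anti-social-network flip phone
Commodore, the former home PC giant, is back. It has announced a flip phone, the Callback 8020, running the Linux-based Sailfish OS. The phone does not support any social media, browser, or work apps such as email, but it does support maps, podcasts, ride-sharing, and popular messaging apps such as WhatsApp, Signal, Telegram, and WeChat, because for many people a phone is nothing without them. As a Commodore product, it naturally includes a Commodore emulator. The flip form factor was chosen because the phone is meant to be a multipurpose tool: you open the flip only in order to use it. The phone is not cheap. Preorders open on June 30 at $499, and the main specs are 4GB of RAM, a 64GB SSD, a 48-megapixel Sony camera, a 480 x 640 display, and a removable 1,550mAh battery.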
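- Banning tech devices improved students' reading ability
In the digital age, one teacher's low-tech experiment produced a marked improvement in students' reading ability. Maureen Mulvaney, an AP Literature and English teacher at Washburn High School in Minneapolis, began the experiment after running into plagiarism, poor attention, and declining reading ability among her students. With parents' support, she banned phones and laptops and required all assignments to be done with pen and paper. Students resisted at first, but the effect was immediate: in September 2025, before the experiment, only 46% of students were confident in their reading ability, and by February of this year the figure had jumped to 95%. Most students could write at least two pages, and some could write five or six pages of English essay. 79% of students said writing and organizing their thoughts was easier on paper than on a screen.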
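- Can open-source models beat OpenAI?
AI companies in China and the US have adopted different release strategies: Chinese companies focus on open-weight models, while US companies such as OpenAI and Anthropic take a closed approach. Tiezhen Wang, a former Asia-Pacific ecosystem executive at Hugging Face, notes that OpenAI and Anthropic accuse Chinese AI companies of distilling their models. He considers distillation neutral: US AI companies trained their models by scraping information from the internet, so they are not creators of knowledge, yet they try to stop others from reusing that knowledge, which he finds somewhat ironic. In his view, all AI-generated content should be uncopyrightable; otherwise those with compute could abuse their power by generating every combination of content and then claiming copyright on all of it. He also sees a clear difference between Chinese and US companies in how they maximize token use. Because China has many open-weight models, usage costs less than in the US, so Chinese internet companies encourage employees to maximize token use and to become AI-native developers, and even forbid them from doing routine work such as writing documentation by hand.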
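- curl pauses vulnerability reports for a month
curl maintainer Daniel Stenberg announced that curl will stop accepting vulnerability reports from July 1 to August 3, unless the submitter has a paid support contract. He calls it curl's "blissful summer". Stenberg said the team has been under enormous pressure for the past four months and needs a break. The project's GitHub issues and pull requests remain open, and the release of curl 8.22.0 is delayed by two weeks to September 2, 2026.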
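- Zhuque-2 rocket breakup produces more than a hundred fragments
LandSpace launched the enhanced Zhuque-2 Y6 carrier rocket (ZQ-2E Y6) from the Dongfeng launch site on June 9, placing its payloads, the Qianfan DTC 01 satellite and the China Mobile 02 satellite, into their planned orbits. The rocket's upper stage then broke up in space, scattering debris across low Earth orbit, some of it overlapping the orbits of the International Space Station and SpaceX's Starlink broadband network. Darren McKnight, senior technical fellow at LeoLabs, said the event may have produced 100 to 150 fragments. One of them is the rocket's second stage, about 8 meters long and about 3.35 meters in diameter. The main body of the upper stage is orbiting 335 to 424 km above Earth at an inclination of 54.5 degrees. The good news is that the orbit is low enough for atmospheric drag to bring most of the debris back into the atmosphere, where it will burn up within a few months. Had the orbit been above 650 km, the debris would have taken decades or longer to reenter.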
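- Firefox 152 released
Mozilla has released Firefox 152. The main changes: JPEG-XL support is now compiled in by default but is still not enabled by default, so users have to turn it on in the Firefox Labs settings. JPEG-XL is a newer patent-free image format, and the codec involved is written in Rust. Other changes include a redesigned settings interface, HDR video support across different hardware configurations on Windows, support for the CSS field-sizing property, and a series of new developer-facing features.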
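- Drug uptake in women's heads is linked to the menstrual cycle
Experiments in mice and humans show sex differences in how much of a drug reaches the head through intranasal delivery, with uptake in females closely tied to the menstrual cycle. In mice, researchers found that during proestrus and estrus, when estrogen surges to its peak, the head region of females took up significantly more drug than that of males. As the cycle moved into metestrus and estrogen fell toward its lowest level, there was essentially no difference between the sexes. The researchers observed a similar pattern in humans: women tended toward higher peak concentrations, with the highest female peak more than twice the highest male peak, while men retained the drug longer.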
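- Microsoft turns to AWS to meet AI-driven load growth on GitHub
GitHub, Microsoft's code-hosting platform, has suffered frequent outages recently. It is migrating its services to Microsoft's Azure cloud platform but still cannot keep up with growing demand, so it is turning to its biggest rival, Amazon's AWS, to keep the platform running. Amazon said it does not comment on individual customers. For Microsoft, the risk of GitHub outages now outweighs the downside of paying AWS. GitHub COO Kyle Daigle previously said the number of commits on the platform would surge from 1 billion in 2025 to 14 billion in 2026. In October 2025 GitHub planned to increase platform capacity tenfold, but by February 2026 it found it needed a thirtyfold increase, because AI coding had sharply raised the platform's workload.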
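- Swiss voters reject proposal to cap the population at 10 million
Switzerland held a referendum on June 14 on whether to limit the country's permanent resident population to under 10 million before 2050. Switzerland's fertility rate is 1.29 children per woman, well below the replacement rate of 2.1, and its population growth comes mainly from immigration. The population already exceeds 9 million, and official figures show that foreign citizens made up more than 27% of the total in 2024. The proposal, backed by the right-wing Swiss People's Party, demanded that "Switzerland's permanent resident population must not exceed 10 million before 2050, and Switzerland should abandon its free-movement agreement with the EU." Swiss voters ultimately rejected the proposal, which had been dubbed a "Swiss exit from Europe": 54.79% voted against and 45.21% voted in favor, on a turnout of 58.86%.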
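- Russia plans to retire the leaking PrK module on the International Space Station
The PrK module, located between the Progress airlock and the Zvezda service module, has troubled the International Space Station for years with air leaks caused by structural cracks. Earlier this year the leak was thought to have been fixed, but earlier this month it was reported to have worsened again, with the module's crack count reaching 16. Ten days ago Russian cosmonauts tried to remove one of the module's load-bearing brackets with a saw. This drew strong objections from NASA, which believed the consequences could be serious and ordered astronauts into the Crew Dragon spacecraft docked to the station, where they were to put on spacesuits and prepare for an emergency evacuation if needed. The Russian space agency ultimately abandoned the plan to remove the bracket. After repeated wrangling behind the scenes, Russia notified NASA that it will retire the PrK module. This means crew will no longer enter the module or attempt to pressurize it again, and Russia will need to use other ports to transfer supplies to the station.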
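- Arch Linux hit by a new wave of AUR malware attacks
The Arch User Repository (AUR), the Arch Linux project's user package repository, suffered a large-scale malicious attack last week. After dealing with more than 1,500 packages, developers believed the problem was under control. Just one day later, however, the AUR was hit by a new wave of attacks, this time with the attackers using code obfuscation to hide their intent. The AUR is a repository of user-contributed packages, not an official repository, and the Arch Linux project may need to take it offline temporarily to avoid wave after wave of attacks.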
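- Millions of students attend schools within 5 km of toxic pollution sites
According to a geographic analysis by the think tank Centre for Global Development, millions of children attend schools near known toxic contamination such as lead, mercury, arsenic, and pesticides. The study found that more than 252,000 schools in 17 countries across Asia, Africa, and Latin America are within 5 km of a toxic pollution site. These schools have more than 43 million children, 5.2 million of them within 1 km. In developed countries the burden of pollution falls disproportionately on poor and non-white students, but in developing countries pollution is concentrated in cities where wealthier people live, and urban schools are usually larger and so have more students. In the Philippines, for example, 9% of schools are near a pollution site, yet those schools account for 27% of the country's students. The analysis also shows that private schools are more likely than public schools to be near pollution sites: in Ghana, 41% of private schools are near one, compared with only 18% of public schools.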