OrangeBot.AI Digest — 2025-08-14

74 headlines across 5 sources, aggregated for the day.

Hacker News (15)

  1. Steve Wozniak: Life to me was never about accomplishment, but about happiness (yro.slashdot.org)
  2. "Privacy preserving age verification" is bullshit (pluralistic.net)
  3. Streaming services are driving viewers back to piracy (www.theguardian.com)
  4. I made a real-time C/C++/Rust build visualizer (danielchasehooper.com)
  5. Gemma 3 270M: Compact model for hyper-efficient AI (developers.googleblog.com)
  6. Kodak has no plans to cease, go out of business, or file for bankruptcy (www.kodak.com)
  7. Blood oxygen monitoring returning to Apple Watch in the US (www.apple.com)
  8. Why LLMs can't really build software (zed.dev)
  9. "None of These Books Are Obscene": Judge Strikes Down Much of FL's Book Ban Bill (bookriot.com)
  10. US Wholesale Inflation Rises by Most in 3 Years (www.bloomberg.com)
  11. Linux address space isolation revived after lowering performance hit (www.phoronix.com)
  12. New protein therapy shows promise as antidote for carbon monoxide poisoning (www.medschool.umaryland.edu)
  13. Org-social is a decentralized social network that runs on Org Mode (github.com)
  14. 1976 Soviet edition of 'The Hobbit' (2015) (mashable.com)
  15. Meta accessed women's health data from Flo app without consent, says court (www.malwarebytes.com)

GitHub Trending (14)

  1. ubicloud / ubicloud

    Open source alternative to AWS. Elastic compute, block storage (non-replicated), firewall and load balancer, managed Postgres, K8s, AI inference, and IAM services.

  2. microsoft / poml

    Prompt Orchestration Markup Language

  3. pathwaycom / pathway

    Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
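
    As a sketch of what a pipeline in this framework can look like, here is a minimal streaming aggregation, assuming the current pathway API (pw.io.csv.read, pw.reducers.sum, pw.run) and a hypothetical ./events/ input directory:

      import pathway as pw

      class Event(pw.Schema):
          user: str
          amount: int

      # Watch a directory for CSV files and ingest rows as they arrive.
      events = pw.io.csv.read("./events/", schema=Event, mode="streaming")

      # Incrementally maintained aggregate: updated as new rows stream in.
      totals = events.groupby(events.user).reduce(
          events.user,
          total=pw.reducers.sum(events.amount),
      )

      pw.io.csv.write(totals, "./totals.csv")
      pw.run()  # start the streaming engine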

  4. external-secrets / external-secrets

    External Secrets Operator reads information from a third-party service like AWS Secrets Manager and automatically injects the values as Kubernetes Secrets.

  5. redis / go-redis

    Redis Go client

  6. colmap / colmap

    COLMAP - Structure-from-Motion and Multi-View Stereo

  7. angular / components

    Component infrastructure and Material Design components for Angular

  8. bytedance / UI-TARS-desktop

    An open-source multimodal AI agent stack connecting cutting-edge AI models and agent infrastructure.

  9. midday-ai / midday

    Invoicing, time tracking, file reconciliation, storage, financial overview & your own assistant, made for freelancers.

  10. conductor-oss / conductor

    Conductor is an event-driven orchestration platform providing a durable and highly resilient execution engine for your applications.

  11. hashicorp / terraform

    Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.

  12. ostris / ai-toolkit

    The ultimate training toolkit for finetuning diffusion models

  13. apple / embedding-atlas

    Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

  14. oop7 / YTSage

    Modern YouTube downloader with a clean PySide6 interface. Download videos in any quality, extract audio, fetch subtitles, apply SponsorBlock, and view video metadata. Built with yt-dlp for reliable performance.
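
    For context on the underlying library, here is a minimal yt-dlp sketch of the kind of call such a GUI wraps; the URL and option values are illustrative, not taken from YTSage's code:

      from yt_dlp import YoutubeDL

      # Options roughly meaning "best video+audio, write subtitles".
      opts = {
          "format": "bestvideo+bestaudio/best",
          "writesubtitles": True,
          "outtmpl": "%(title)s.%(ext)s",
      }

      with YoutubeDL(opts) as ydl:
          # Fetches metadata and, with download=True, the media itself.
          info = ydl.extract_info("https://www.youtube.com/watch?v=EXAMPLE", download=True)
          print(info.get("title"), info.get("duration"))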

Product Hunt (15)

  1. Macaron AI

    The AI that instantly gets you and cooks up mini-apps

  2. Anything

    Agent that ships mobile apps & web. Everything built in

  3. Snowglobe

    Simulate real users to test your AI before launch

  4. Comet by Perplexity

    Browse at the speed of thought

  5. CoSupport AI

    AI support agents that don’t hallucinate, live in 10 mins

  6. Dub Partners

    Modern affiliate platform & network for SaaS

  7. DeepReel

    The first AI agent that is your personal video creation team

  8. Memorae

    Boost your productivity with WhatsApp

  9. FirstUser.app

    Guaranteed feedback on your product launch

  10. Ally Solos Glasses

    The most accessible AI assistant, now wearable

  11. Addicted

    Break your bad habits in a fun way

  12. Hana Connect

    AI that connects people dealing with the same struggles.

  13. LangRead

    Read books, learn languages one word at a time

  14. RunAnywhere

    Ollama but for mobile, with a cloud fallback

  15. Tripsy 3.5

    The new way to collaborate on travel plans

Hugging Face (15)

  1. Story2Board: A Training-Free Approach for Expressive Storyboard Generation

    We present Story2Board, a training-free framework for expressive storyboard generation from natural language. Existing methods narrowly focus on subject identity, overlooking key aspects of visual storytelling such as spatial composition, background evolution, and narrative pacing. To address this, we introduce a lightweight consistency framework composed of two components: Latent Panel Anchoring, which preserves a shared character reference across panels, and Reciprocal Attention Value Mixing, which softly blends visual features between token pairs with strong reciprocal attention. Together, these mechanisms enhance coherence without architectural changes or fine-tuning, enabling state-of-the-art diffusion models to generate visually diverse yet consistent storyboards. To structure generation, we use an off-the-shelf language model to convert free-form stories into grounded panel-level prompts. To evaluate, we propose the Rich Storyboard Benchmark, a suite of open-domain narratives designed to assess layout diversity and background-grounded storytelling, in addition to consistency. We also introduce a new Scene Diversity metric that quantifies spatial and pose variation across storyboards. Our qualitative and quantitative results, as well as a user study, show that Story2Board produces more dynamic, coherent, and narratively engaging storyboards than existing baselines.
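
    To make the second mechanism concrete, here is a toy numpy sketch of reciprocal attention value mixing as described above: token pairs that attend strongly to each other in both directions get their value vectors softly blended. The blending rule is a guess at the spirit of the method, not the paper's exact formulation:

      import numpy as np

      rng = np.random.default_rng(0)
      n, d = 6, 4                        # tokens per panel, feature dim
      A = rng.random((n, n))             # attention: reference panel -> current panel
      B = rng.random((n, n))             # attention: current panel -> reference panel
      V_ref = rng.random((n, d))         # reference-panel value vectors
      V_cur = rng.random((n, d))         # current-panel value vectors

      # Reciprocity score for pair (i, j): strong only if attention is mutual.
      R = A * B.T
      lam = 0.5 * R / R.max()            # blend weights in [0, 0.5]

      # For each current token, softly mix in its strongest reciprocal partner.
      partners = R.argmax(axis=0)        # best reference token per current token
      w = lam[partners, np.arange(n)][:, None]
      V_mixed = (1 - w) * V_cur + w * V_ref[partners]
      print(V_mixed.shape)               # (6, 4): blended values for the current panel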

  2. Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery

    Large language models (LLMs), especially Explicit Long Chain-of-Thought (CoT) reasoning models like DeepSeek-R1 and QWQ, have demonstrated powerful reasoning capabilities, achieving impressive performance in commonsense reasoning and mathematical inference. Despite their effectiveness, Long-CoT reasoning models are often criticized for their limited ability and low efficiency in knowledge-intensive domains such as molecule discovery. Success in this field requires a precise understanding of domain knowledge, including molecular structures and chemical principles, which is challenging due to the inherent complexity of molecular data and the scarcity of high-quality expert annotations. To bridge this gap, we introduce Mol-R1, a novel framework designed to improve explainability and reasoning performance of R1-like Explicit Long-CoT reasoning LLMs in text-based molecule generation. Our approach begins with a high-quality reasoning dataset curated through Prior Regulation via In-context Distillation (PRID), a dedicated distillation strategy to effectively generate paired reasoning traces guided by prior regulations. Building upon this, we introduce MoIA, Molecular Iterative Adaptation, a sophisticated training strategy that iteratively combines Supervised Fine-tuning (SFT) with Reinforced Policy Optimization (RPO), tailored to boost the reasoning performance of R1-like reasoning models for molecule discovery. Finally, we examine the performance of Mol-R1 in the text-based molecule reasoning generation task, showing superior performance against existing baselines.

  3. Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation

    Generating high-fidelity human videos that match user-specified identities is important yet challenging in the field of generative AI. Existing methods often rely on an excessive number of training parameters and lack compatibility with other AIGC tools. In this paper, we propose Stand-In, a lightweight and plug-and-play framework for identity preservation in video generation. Specifically, we introduce a conditional image branch into the pre-trained video generation model. Identity control is achieved through restricted self-attentions with conditional position mapping, and can be learned quickly with only 2000 pairs. Despite incorporating and training just ~1% additional parameters, our framework achieves excellent results in video quality and identity preservation, outperforming other full-parameter training methods. Moreover, our framework can be seamlessly integrated into other tasks, such as subject-driven video generation, pose-referenced video generation, stylization, and face swapping.

  4. Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing

    Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to autoregressive (AR) LLMs for text generation, with the potential to decode multiple tokens in a single iteration. However, none of the existing open-source dLLMs have achieved superior inference speed over AR LLMs of similar size. This paper breaks this barrier based on a simple and effective strategy named discrete diffusion forcing (D2F). D2F equips dLLMs with two key capabilities: (1) block-wise autoregressive generation to enable KV cache utilization; (2) prediction of following tokens without requiring completion of prior blocks for inter-block parallel decoding. In this way, the vanilla dLLMs are refurbished into an AR-diffusion hybrid paradigm for efficient inference. D2F can be implemented with an asymmetric distillation process based on pre-trained dLLMs. We further propose a pipelined parallel decoding algorithm, which enables a trade-off between efficiency and efficacy. Empirically, D2F dLLMs achieve more than 2.5× the inference speed of LLaMA3 and Qwen2.5 on GSM8K. Compared to vanilla dLLMs like LLaDA and Dream, the acceleration can be more than 50× while maintaining comparable output quality. The code is available at https://github.com/zhijie-group/Discrete-Diffusion-Forcing.
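
    The pipelining claim is easy to picture with a toy schedule: sequentially, block k+1 waits for block k to finish; under D2F-style inter-block parallelism, it may start once its predecessor is partially complete. The numbers below are illustrative, not measurements from the paper:

      # Toy comparison of decoding schedules over 4 blocks of 10 steps each.
      steps_per_block = 10
      num_blocks = 4
      tau = 0.5  # fraction of the previous block that must be done before starting

      sequential = steps_per_block * num_blocks            # 40 iterations

      # Pipelined: block k starts at k * tau * steps_per_block, then runs in parallel.
      starts = [int(k * tau * steps_per_block) for k in range(num_blocks)]
      pipelined = starts[-1] + steps_per_block             # 25 iterations

      print(f"sequential: {sequential}, pipelined: {pipelined}")  # ~1.6x fewer iterations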

  5. AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving

    The rapid advancement of large language models (LLMs) has empowered intelligent agents to leverage diverse external tools for solving complex real-world problems. However, as agents increasingly depend on multiple tools, they encounter new challenges: extended contexts from disparate sources and noisy or irrelevant tool outputs can undermine system reliability and accuracy. These challenges underscore the necessity for enhanced stability in agent-based systems. To address this, we introduce dynamic supervision and maneuvering mechanisms, constructing a robust and dynamic Multi-Agent System (MAS) architecture within the AWorld framework. In our approach, the Execution Agent invokes the Guard Agent at critical steps to verify and correct the reasoning process, effectively reducing errors arising from noise and bolstering problem-solving robustness. Extensive experiments on the GAIA test dataset reveal that our dynamic maneuvering mechanism significantly improves both the effectiveness and stability of solutions, outperforming single-agent systems (SAS) and standard tool-augmented systems. As a result, our dynamic MAS system achieved first place among open-source projects on the prestigious GAIA leaderboard. These findings highlight the practical value of collaborative agent roles in developing more reliable and trustworthy intelligent systems.
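
    The supervision pattern reads like a verify-and-retry loop. Here is a minimal sketch; all names (execute_step, guard_verify) are hypothetical stand-ins, not AWorld's actual API:

      def execute_step(task, history):
          """Stand-in for the Execution Agent proposing its next action."""
          return f"step {len(history) + 1} toward: {task}"

      def guard_verify(action, history):
          """Stand-in for the Guard Agent checking reasoning at a critical step."""
          return True, "consistent with prior steps"

      def solve(task, max_steps=3, max_retries=2):
          history = []
          for _ in range(max_steps):
              action = execute_step(task, history)
              for _ in range(max_retries):      # retry with feedback on rejection
                  ok, feedback = guard_verify(action, history)
                  if ok:
                      break
                  history.append(("guard_feedback", feedback))
                  action = execute_step(task, history)
              history.append(("action", action))
          return history

      print(solve("GAIA question"))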

  6. Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

    We introduce M3-Agent, a novel multimodal agent framework equipped with long-term memory. Like humans, M3-Agent can process real-time visual and auditory inputs to build and update its long-term memory. Beyond episodic memory, it also develops semantic memory, enabling it to accumulate world knowledge over time. Its memory is organized in an entity-centric, multimodal format, allowing deeper and more consistent understanding of the environment. Given an instruction, M3-Agent autonomously performs multi-turn, iterative reasoning and retrieves relevant information from memory to accomplish the task. To evaluate memory effectiveness and memory-based reasoning in multimodal agents, we develop M3-Bench, a new long-video question answering benchmark. M3-Bench comprises 100 newly recorded real-world videos captured from a robot's perspective (M3-Bench-robot) and 929 web-sourced videos across diverse scenarios (M3-Bench-web). We annotate question-answer pairs designed to test key capabilities essential for agent applications, such as human understanding, general knowledge extraction, and cross-modal reasoning. Experimental results show that M3-Agent, trained via reinforcement learning, outperforms the strongest baseline, a prompting agent using Gemini-1.5-pro and GPT-4o, achieving 6.7%, 7.7%, and 5.3% higher accuracy on M3-Bench-robot, M3-Bench-web, and VideoMME-long, respectively. Our work advances multimodal agents toward more human-like long-term memory and provides insights into their practical design. Model, code and data are available at https://github.com/bytedance-seed/m3-agent
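
    An "entity-centric, multimodal" memory can be pictured as episodic observations plus distilled semantic facts keyed by entity. A hypothetical sketch of such a structure (not the released code):

      from dataclasses import dataclass, field

      @dataclass
      class EntityMemory:
          episodes: list = field(default_factory=list)  # time-stamped multimodal observations
          facts: set = field(default_factory=set)       # durable semantic knowledge

      memory: dict[str, EntityMemory] = {}

      def observe(entity, t, modality, content):
          memory.setdefault(entity, EntityMemory()).episodes.append((t, modality, content))

      def learn_fact(entity, fact):
          memory.setdefault(entity, EntityMemory()).facts.add(fact)

      observe("alice", 12.3, "vision", "wearing a red coat near the door")
      observe("alice", 12.4, "audio", "said she is leaving for the lab")
      learn_fact("alice", "works in the robotics lab")

      # Retrieval during reasoning: pull everything known about an entity.
      print(memory["alice"].facts, len(memory["alice"].episodes))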

  7. Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

    Recently, GPT-4o has garnered significant attention for its strong performance in image generation, yet open-source models still lag behind. Several studies have explored distilling image data from GPT-4o to enhance open-source models, achieving notable progress. However, a key question remains: given that real-world image datasets already constitute a natural source of high-quality data, why should we use GPT-4o-generated synthetic data? In this work, we identify two key advantages of synthetic images. First, they can complement rare scenarios in real-world datasets, such as surreal fantasy or multi-reference image generation, which frequently occur in user queries. Second, they provide clean and controllable supervision. Real-world data often contains complex background noise and inherent misalignment between text descriptions and image content, whereas synthetic images offer pure backgrounds and long-tailed supervision signals, facilitating more accurate text-to-image alignment. Building on these insights, we introduce Echo-4o-Image, a 180K-scale synthetic dataset generated by GPT-4o, harnessing the power of synthetic image data to address blind spots in real-world coverage. Using this dataset, we fine-tune the unified multimodal generation baseline Bagel to obtain Echo-4o. In addition, we propose two new evaluation benchmarks for a more accurate and challenging assessment of image generation capabilities: GenEval++, which increases instruction complexity to mitigate score saturation, and Imagine-Bench, which focuses on evaluating both the understanding and generation of imaginative content. Echo-4o demonstrates strong performance across standard benchmarks. Moreover, applying Echo-4o-Image to other foundation models (e.g., OmniGen2, BLIP3-o) yields consistent performance gains across multiple metrics, highlighting the dataset's strong transferability.

  8. Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment

    Alignment methodologies have emerged as a critical pathway for enhancing language model alignment capabilities. While SFT (supervised fine-tuning) accelerates convergence through direct token-level loss intervention, its efficacy is constrained by offline policy trajectory. In contrast, RL (reinforcement learning) facilitates exploratory policy optimization, but suffers from low sample efficiency and stringent dependency on high-quality base models. To address these dual challenges, we propose GRAO (Group Relative Alignment Optimization), a unified framework that synergizes the respective strengths of SFT and RL through three key innovations: 1) a multi-sample generation strategy enabling comparative quality assessment via reward feedback; 2) a novel Group Direct Alignment Loss formulation leveraging intra-group relative advantage weighting; 3) reference-aware parameter updates guided by pairwise preference dynamics. Our theoretical analysis establishes GRAO's convergence guarantees and sample efficiency advantages over conventional approaches. Comprehensive evaluations across complex human alignment tasks demonstrate GRAO's superior performance, achieving 57.70%, 17.65%, 7.95%, and 5.18% relative improvements over SFT, DPO, PPO, and GRPO baselines, respectively. This work provides both a theoretically grounded alignment framework and empirical evidence for efficient capability evolution in language models.
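
    The intra-group relative advantage weighting can be sketched in a few lines: sample a group of responses per prompt, score them, standardize rewards within the group, and weight each sample's log-likelihood term by that advantage. This shows only the weighting idea, not the paper's full Group Direct Alignment Loss:

      import numpy as np

      rewards = np.array([0.9, 0.2, 0.6, 0.1])   # scores for 4 sampled responses

      # Intra-group relative advantage: standardized within the group.
      adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)

      logps = np.array([-12.0, -10.5, -11.2, -9.8])  # toy sequence log-probs

      # Advantage-weighted objective: raise likelihood of above-average samples.
      loss = -(adv * logps).mean()
      print(adv.round(2), round(loss, 3))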

  9. MathReal: We Keep It Real! A Real Scene Benchmark for Evaluating Math Reasoning in Multimodal Large Language Models

    Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in visual mathematical reasoning across various existing benchmarks. However, these benchmarks are predominantly based on clean or processed multimodal inputs, without incorporating the images provided by real-world Kindergarten through 12th grade (K-12) educational users. To address this gap, we introduce MathReal, a meticulously curated dataset comprising 2,000 mathematical questions with images captured by handheld mobile devices in authentic scenarios. Each question is presented as an image containing both the question text and its visual elements. We systematically classify the real images into three primary categories: image quality degradation, perspective variation, and irrelevant content interference, which are further delineated into 14 subcategories. Additionally, MathReal spans five core knowledge and ability categories, which encompass three question types and are divided into three difficulty levels. To comprehensively evaluate the multimodal mathematical reasoning abilities of state-of-the-art MLLMs in real-world scenarios, we design six experimental settings that enable a systematic analysis of their performance. Through extensive experimentation, we find that the problem-solving abilities of existing MLLMs are significantly challenged in realistic educational contexts. Based on this, we conduct a thorough analysis of their performance and error patterns, providing insights into their recognition, comprehension, and reasoning capabilities, and outlining directions for future improvements. Data and code: https://github.com/junfeng0288/MathReal.

  10. Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models

    Large language models (LLMs) have demonstrated remarkable performance in reasoning tasks, where reinforcement learning (RL) serves as a key algorithm for enhancing their reasoning capabilities. Currently, there are two mainstream reward paradigms: model-based rewards and rule-based rewards. However, both approaches suffer from limitations: rule-based rewards lack robustness, while model-based rewards are vulnerable to reward hacking. To address these issues, we propose Cooper (Co-optimizing Policy Model and Reward Model), an RL framework that jointly optimizes both the policy model and the reward model. Cooper leverages the high precision of rule-based rewards when identifying correct responses, and dynamically constructs and selects positive-negative sample pairs for continued training of the reward model. This design enhances robustness and mitigates the risk of reward hacking. To further support Cooper, we introduce a hybrid annotation strategy that efficiently and accurately generates training data for the reward model. We also propose a reference-based reward modeling paradigm, where the reward model takes a reference answer as input. Based on this design, we train a reward model named VerifyRM, which achieves higher accuracy on VerifyBench compared to other models of the same size. We conduct reinforcement learning using both VerifyRM and Cooper. Our experiments show that Cooper not only alleviates reward hacking but also improves end-to-end RL performance, for instance achieving a 0.54% gain in average accuracy on Qwen2.5-1.5B-Instruct. Our findings demonstrate that dynamically updating the reward model is an effective way to combat reward hacking, providing a reference for better integrating reward models into RL.
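
    The pair-construction step can be sketched as: trust the high-precision rule check for positives, and mine negatives that the reward model currently over-scores. Function names below are hypothetical, and len stands in for a real reward model:

      def rule_check(response, reference):
          """High-precision rule-based verifier, e.g. exact match on the final answer."""
          return response.strip() == reference.strip()

      def build_pairs(responses, reference, reward_model):
          positives = [r for r in responses if rule_check(r, reference)]
          # Negatives: rejected by the rules but still scored highly by the RM.
          negatives = sorted(
              (r for r in responses if not rule_check(r, reference)),
              key=reward_model, reverse=True,
          )
          # Pair each trusted positive with the most confusing negative.
          return [(p, n) for p in positives for n in negatives[:1]]

      print(build_pairs(["42", "41", "42 "], "42", reward_model=len))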

  11. IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding

    Vision-language models (VLMs) have shown significant advancements in tasks such as visual grounding, where they localize specific objects in images based on natural language queries and images. However, security issues in visual grounding tasks for VLMs remain underexplored, especially in the context of backdoor attacks. In this paper, we introduce a novel input-aware backdoor attack method, IAG, designed to manipulate the grounding behavior of VLMs. This attack forces the model to ground a specific target object in the input image, regardless of the user's query. We propose an adaptive trigger generator that embeds the semantic information of the attack target's description into the original image using a text-conditional U-Net, thereby overcoming the open-vocabulary attack challenge. To ensure the attack's stealthiness, we utilize a reconstruction loss to minimize visual discrepancies between poisoned and clean images. Additionally, we introduce a unified method for generating attack data. IAG is evaluated theoretically and empirically, demonstrating its feasibility and effectiveness. Notably, our ASR@0.5 on InternVL-2.5-8B reaches over 65% on various testing sets. IAG also shows promising potential in manipulating Ferret-7B and LLaVA-1.5-7B with very little accuracy decrease on clean samples. Extensive specific experiments, such as ablation studies and potential defenses, also indicate the robustness and transferability of our attack.

  12. Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models

    The new paradigm of test-time scaling has yielded remarkable breakthroughs in Large Language Models (LLMs) (e.g. reasoning models) and in generative vision models, allowing models to allocate additional computation during inference to effectively tackle increasingly complex problems. Despite the improvements of this approach, an important limitation emerges: the substantial increase in computation time makes the process slow and impractical for many applications. Given the success of this paradigm and its growing usage, we seek to preserve its benefits while eschewing the inference overhead. In this work we propose one solution to the critical problem of integrating test-time scaling knowledge into a model during post-training. Specifically, we replace reward guided test-time noise optimization in diffusion models with a Noise Hypernetwork that modulates initial input noise. We propose a theoretically grounded framework for learning this reward-tilted distribution for distilled generators, through a tractable noise-space objective that maintains fidelity to the base model while optimizing for desired characteristics. We show that our approach recovers a substantial portion of the quality gains from explicit test-time optimization at a fraction of the computational cost. Code is available at https://github.com/ExplainableML/HyperNoise
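
    The core move can be sketched in a few lines of torch: rather than optimizing the initial noise per sample at test time, train a small network once to shift noise toward high-reward regions, with a penalty keeping it close to the base distribution. The objective shape here is an assumption; the paper's tractable noise-space objective is derived more carefully:

      import torch
      import torch.nn as nn

      hyper = nn.Sequential(nn.Linear(64, 128), nn.SiLU(), nn.Linear(128, 64))
      opt = torch.optim.Adam(hyper.parameters(), lr=1e-3)

      def generator(z):                  # stand-in for a frozen one-step generator
          return torch.tanh(z)

      def reward(x):                     # stand-in for a differentiable reward model
          return -(x - 1.0).pow(2).mean(dim=-1)

      for step in range(200):
          z = torch.randn(32, 64)        # base initial noise
          z_mod = z + hyper(z)           # hypernetwork modulates the noise
          x = generator(z_mod)
          # Maximize reward while keeping modulated noise near the base noise.
          loss = -reward(x).mean() + 0.1 * (z_mod - z).pow(2).mean()
          opt.zero_grad(); loss.backward(); opt.step()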

  13. VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models

    Multimodal large language models (MLLMs) have significantly advanced the integration of visual and textual understanding. However, their ability to generate code from multimodal inputs remains limited. In this work, we introduce VisCodex, a unified framework that seamlessly merges vision and coding language models to empower MLLMs with strong multimodal code generation abilities. Leveraging a task vector-based model merging technique, we integrate a state-of-the-art coding LLM into a strong vision-language backbone, while preserving both visual comprehension and advanced coding skills. To support training and evaluation, we introduce the Multimodal Coding Dataset (MCD), a large-scale and diverse collection of 598k samples, including high-quality HTML code, chart image-code pairs, image-augmented StackOverflow QA, and algorithmic problems. Furthermore, we propose InfiBench-V, a novel and challenging benchmark specifically designed to assess models on visually-rich, real-world programming questions that demand a nuanced understanding of both textual and visual contexts. Extensive experiments show that VisCodex achieves state-of-the-art performance among open-source MLLMs and approaches proprietary models like GPT-4o, highlighting the effectiveness of our model merging strategy and new datasets.
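
    Task-vector merging itself is simple to state: take the delta between the fine-tuned coder and its base LLM and add it, scaled, into the matching weights of the vision-language backbone. A toy numpy sketch; VisCodex's actual procedure and scaling may differ:

      import numpy as np

      base   = {"mlp.w": np.full((2, 2), 1.0)}   # base LLM weights
      coder  = {"mlp.w": np.full((2, 2), 1.5)}   # coding fine-tune of that base
      vision = {"mlp.w": np.full((2, 2), 0.8)}   # vision-language model's LM tower

      alpha = 0.6  # merge strength

      # Task vector = coder - base, added into the vision-language weights.
      merged = {k: vision[k] + alpha * (coder[k] - base[k]) for k in vision}
      print(merged["mlp.w"])  # 0.8 + 0.6 * 0.5 = 1.1 everywhere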

  14. CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing

    Recent advances in text-to-image (T2I) models have enabled training-free regional image editing by leveraging the generative priors of foundation models. However, existing methods struggle to balance text adherence in edited regions, context fidelity in unedited areas, and seamless integration of edits. We introduce CannyEdit, a novel training-free framework that addresses these challenges through two key innovations: (1) Selective Canny Control, which masks the structural guidance of Canny ControlNet in user-specified editable regions while strictly preserving details of the source images in unedited areas via inversion-phase ControlNet information retention. This enables precise, text-driven edits without compromising contextual integrity. (2) Dual-Prompt Guidance, which combines local prompts for object-specific edits with a global target prompt to maintain coherent scene interactions. On real-world image editing tasks (addition, replacement, removal), CannyEdit outperforms prior methods like KV-Edit, achieving a 2.93 to 10.49 percent improvement in the balance of text adherence and context fidelity. In terms of editing seamlessness, user studies reveal only 49.2 percent of general users and 42.0 percent of AIGC experts identified CannyEdit's results as AI-edited when paired with real images without edits, versus 76.08 to 89.09 percent for competitor methods.
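
    The "selective" part can be pictured as zeroing the Canny edge map inside the user's editable region, so ControlNet's structural guidance applies only to the unedited areas. A minimal OpenCV sketch, assuming a source image and a hypothetical rectangular edit region:

      import cv2
      import numpy as np

      img = cv2.imread("source.png", cv2.IMREAD_GRAYSCALE)  # assumed input
      edges = cv2.Canny(img, 100, 200)                      # full edge map

      mask = np.zeros_like(edges, dtype=bool)               # editable region
      mask[100:220, 150:300] = True

      # Drop structural guidance inside the editable region, keep it elsewhere,
      # so source details are preserved only where no edit is requested.
      edges[mask] = 0
      cv2.imwrite("selective_canny.png", edges)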

  15. Decentralized Aerial Manipulation of a Cable-Suspended Load using Multi-Agent Reinforcement Learning

    This paper presents the first decentralized method to enable real-world 6-DoF manipulation of a cable-suspended load using a team of Micro-Aerial Vehicles (MAVs). Our method leverages multi-agent reinforcement learning (MARL) to train an outer-loop control policy for each MAV. Unlike state-of-the-art controllers that utilize a centralized scheme, our policy does not require global states, inter-MAV communications, nor neighboring MAV information. Instead, agents communicate implicitly through load pose observations alone, which enables high scalability and flexibility. It also significantly reduces computing costs during inference time, enabling onboard deployment of the policy. In addition, we introduce a new action space design for the MAVs using linear acceleration and body rates. This choice, combined with a robust low-level controller, enables reliable sim-to-real transfer despite significant uncertainties caused by cable tension during dynamic 3D motion. We validate our method in various real-world experiments, including full-pose control under load model uncertainties, showing setpoint tracking performance comparable to the state-of-the-art centralized method. We also demonstrate cooperation amongst agents with heterogeneous control policies, and robustness to the complete in-flight loss of one MAV. Videos of experiments: https://autonomousrobots.nl/paper_websites/aerial-manipulation-marl

Solidot (15)

  1. Study finds social media's problems cannot be fixed

    Rather than becoming the healthy, utopian public square for exchanging ideas that people once hoped for, social media has created echo chambers that amplify a minority of users' voices, amplify outrage and conflict, and further deepen polarization. Can interventions on social platforms mitigate or fix some of the problems they create? According to a preprint posted on arXiv, researchers tested six intervention strategies and found them essentially ineffective: unless social media's architecture is fundamentally changed, its problems cannot be fixed. The interventions tested were: ordering feeds chronologically or randomly; reversing engagement-driven algorithms to reduce the exposure of emotionally charged content; promoting viewpoint diversity; using "bridging algorithms" to favor content that fosters mutual understanding over content that inflames emotions; hiding social statistics such as reposts and follower counts to reduce social-influence cues; and removing profile bios to limit exposure to identity-based signals. The results showed that some interventions brought only slight improvement, while others may have made matters worse. Ordering feeds chronologically, for example, reduced attention inequality but further amplified extreme content, and promoting viewpoint diversity showed no significant effect at all.

  2. Norway accuses Russian hackers of sabotaging a dam

    Beate Gangaas, head of Norway's counterintelligence agency, said on Wednesday that Russian hackers briefly took control of a dam in Bremanger on April 7 this year and opened a floodgate; the attack was discovered and stopped four hours later. Most of Norway's electricity comes from hydropower, and its intelligence services had previously warned that the country's energy infrastructure could come under attack. The aim of such attacks, Gangaas said, is to influence the population and sow fear and chaos, adding that Norway's Russian neighbor has become more dangerous. The Russian embassy in Oslo called her statement baseless and politically motivated.

  3. Starbucks Korea asks customers not to bring printers and desktop PCs into its stores

    Starbucks customers in South Korea have been treating the coffee chain as a remote-work office. To curb the practice, Starbucks issued a new policy asking customers not to bring bulky items such as printers and desktop computers into its stores. A Starbucks Korea spokesperson said customers with laptops and small personal devices remain welcome, but asked that they not bring desktop computers, printers, or other large items that could limit seating and affect the shared space. Starbucks opened its first store in Seoul in 1999; although South Korea's population is less than half of Japan's, it now has more Starbucks stores than Japan, 2,050 versus 2,040. Jo Elfving-Hwang, an associate professor of Korean society and culture, said working remotely from a Starbucks is fairly cheap, but some people take it to extremes.

  4. Dementia in cats resembles the human disease

    According to a study published in the European Journal of Neuroscience, feline dementia bears striking similarities to human Alzheimer's disease. Researchers at the University of Edinburgh examined the brains of 25 cats that had shown dementia symptoms before death and found a buildup of amyloid-beta, a protein that is also one of the hallmarks of Alzheimer's disease in humans. The researchers suggest cats could serve as an excellent model for studying dementia and Alzheimer's and for exploring new therapies, while also helping us understand and manage dementia in cats.

  5. The long tail of an earthquake

    The Wenchuan earthquake of May 12, 2008 killed more than 69,000 people, and according to a study published in Nature, its aftereffects have persisted for at least a decade. The quake swept enormous amounts of rock and soil into local streams and rivers. River-borne earthquake sediment falls into two classes: fine particles suspended in the water, called suspended load, and coarse material (gravel to boulders) that rolls or bounces along the riverbed, called bedload. Scientists have long known that suspended load increases after an earthquake, but that is only part of the story. The team found that after the Wenchuan quake, the total sediment flux of the Min River increased 6-fold while its bedload flux increased 20-fold. That means roughly 65% of the river's post-quake sediment was bedload, compared with about 20% in mountain rivers of similar size. This elevated state lasted at least a decade and still held at the researchers' last field observation before publication. The aftereffects of a great earthquake thus persist longer than expected, which means rebuilding in place, in the original way, after such a disaster is unwise: the risk has not fallen but has grown as the terrain changed, and silted-up channels can no longer carry the ten-year floods they once could.
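
    The reported numbers are mutually consistent, as a quick check shows: if bedload is about 20% of the total in a comparable river, grew 20-fold, and the total grew 6-fold, the post-quake bedload share lands near the reported 65%:

      pre_bedload_share = 0.20   # typical bedload share in comparable mountain rivers
      total_increase = 6         # post-quake total sediment flux, relative to before
      bedload_increase = 20      # post-quake bedload flux, relative to before

      post_share = bedload_increase * pre_bedload_share / total_increase
      print(f"post-quake bedload share ≈ {post_share:.0%}")  # ≈ 67%, near the reported 65%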

  6. US plants location trackers in AI chip shipments

    Reuters reports that the US has been placing location trackers in shipments of AI chips to determine whether goods are being diverted to countries and regions covered by US export controls. The trackers have been used in server shipments from vendors such as Dell and Supermicro, whose servers carry Nvidia and AMD AI chips, and are usually hidden inside the server packaging. The US imposed export restrictions on high-performance AI chips from Nvidia, AMD, and other vendors in 2022; this is not the first time law enforcement has used tracking devices, a practice that goes back decades.

  7. Chinese ID records linked to disease diagnoses

    People's Daily reports that Liu Zhongfu, a researcher at the National Center for AIDS/STD Control and Prevention of the Chinese CDC, said the disease-control system is using a multi-disease co-testing reporting platform to step up screening for HIV, hepatitis B, hepatitis C, and syphilis in both origin and destination regions. Confirmed-diagnosis alerts for these four diseases are now linked to citizens' ID card records and can be looked up anywhere in the country.

  8. Walkable cities get residents walking more

    An analysis of smartphone data from 5,424 people across 1,609 US cities who moved at least once during a three-year observation period shows that relocating from a less walkable area to a more walkable one increases physical activity, and vice versa. For the 178 people who moved from cities scoring 48/100 to New York City (scoring 89/100), average activity rose from 5,600 to 7,000 steps per day, an increase of 1,400 steps. The effect held equally for participants moving in the opposite direction, and the observations were consistent across gender, age, and body mass index (BMI). The researchers predict that walkable cities help raise residents' aerobic activity levels.

  9. Musk threatens Apple over the App Store ranking of his AI app Grok

    Elon Musk has threatened legal action against Apple, accusing it of manipulating App Store rankings to favor ChatGPT over Grok. Musk claims Apple's conduct has made it impossible for any AI company other than OpenAI to reach the top of the App Store charts. The accusation is inaccurate: two AI apps have briefly overtaken ChatGPT this year, China's DeepSeek and Perplexity. Apple did not respond to the accusation, but OpenAI did: CEO Sam Altman, citing a Platformer report, said Musk had built a special system to ensure his own posts are seen first by users on X, manipulating X to benefit himself and his companies while harming competitors and people he dislikes.

  10. Google lets US and India users choose their preferred content sources

    Google is rolling out a feature called "Preferred Sources" to users in the US and India, letting them pick favorite news sites and blogs from their search results; afterwards, those sources will appear more often in the user's results. Critics argue the move resembles social-platform algorithms that narrow users' choices and trap them in echo chambers: high-quality search results call for more diverse content sources, not fewer. What users need from search is not favored sources but fewer low-quality ones, for instance a blacklist to filter low-quality sources out.

  11. 'K-Pop: Demon Hunters' becomes Netflix's most-watched animated film ever

    Since its release in June, K-Pop: Demon Hunters has become the most-watched animated film in Netflix's history, and one of its songs, "Golden", has topped the Billboard Hot 100. Its story is simple and pure: in ancient times demons roamed the earth and fed on human souls, until three women, gifted singers and demon hunters, raised a magical barrier with their songs, trapping the demons behind it. Generation after generation of demon hunters reinforced the barrier, known as the Honmoon, with song, waiting for the day it turns golden and seals the demons away for good. Now the current generation of hunter-singers, Rumi, Mira, and Zoey, faces a final counterattack from the great demon Gwi-Ma: a K-pop boy band made up of five demons.

  12. Your lifetime odds of being hit by an asteroid

    The asteroid impact 65 million years ago wiped out the dinosaurs; how high is our own risk of dying in an asteroid strike? According to a preprint posted on arXiv, researchers simulated the orbits of 5 million near-Earth objects (NEOs) more than 140 meters in diameter and counted impacts over 150 years to estimate how likely an NEO strike is. They then compared the results with the odds of other misfortunes befalling a person over a 71-year lifespan, including lightning strikes, elephant attacks, and carbon monoxide poisoning, to work out how likely each is to kill. They estimate that NEOs larger than 140 meters strike Earth roughly once every 11,000 years. An NEO of 140-200 meters falling into the ocean would likely cause no casualties, while one of 180-200 meters hitting a densely populated area could affect a million people, and still larger impacts could affect the entire world. So even if an NEO does strike, most people would stand a good chance of surviving a smaller one. The researchers found that events such as carbon monoxide poisoning and elephant attacks are far more likely to be fatal when they do occur, while the odds of a >140-meter asteroid striking Earth are higher than a person's lifetime odds of being hit by lightning or attacked by a coyote. On the other hand, and perhaps unsurprisingly, a person is far more likely to catch the flu or be in a car crash than to live through an asteroid impact. The team stresses that the point is not to dismiss planetary defense, but to help the public weigh the asteroid threat rationally against the real risks.
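
    The lifetime odds implied by that frequency are a one-liner to check: with one >140 m impact per ~11,000 years and the study's 71-year lifespan, the chance of such an impact occurring within one lifetime is roughly 0.6%:

      rate_per_year = 1 / 11_000   # estimated frequency of >140 m NEO impacts
      lifetime_years = 71          # lifespan assumed in the study

      p = 1 - (1 - rate_per_year) ** lifetime_years
      print(f"probability of an impact within a lifetime ≈ {p:.2%}")  # ≈ 0.64%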

  13. Sex reversal is common among birds

    In some bird species the sexes look strikingly different, as with roosters and hens, peacocks, and birds of paradise. In many others, males and females are nearly indistinguishable, so scientists may need DNA tests to tell them apart. Even DNA testing is unreliable: according to a study published in Biology Letters, sex reversal is common among birds, so gonads and chromosomes do not always agree. In humans, females typically carry XX chromosomes and males XY, but sex is determined not by the sex chromosomes themselves but by the genes on them; the SRY gene on the Y chromosome drives mammals to develop as male, and without it a person carrying XY chromosomes develops as female. A study of five common Australian birds, the Australian magpie, laughing kookaburra, crested pigeon, rainbow lorikeet, and scaly-breasted lorikeet, found sex-reversal rates of 3% to 6%. Most sex-reversed birds were genetically female but had male reproductive organs; a minority were genetically male but had ovaries. What causes sex reversal in birds, the researchers say, will require further study.

  14. UK government advises residents to delete emails to save water

    Politicians' capacity for logic may be slipping: in response to the ongoing drought, Helen Wakeham, director of water at the UK Environment Agency, advised the public in a press release to start with small changes to daily habits, such as turning off taps or deleting old emails, saying this would help save water and protect rivers and wildlife. How could deleting emails save water? Whoever wrote the release presumably assumed that UK data centers consume large amounts of water and that email lives in cloud data centers, so deleting it saves water. The problem is that the data sits on hard drives that keep running whether or not anything is deleted; if anything, the deletion itself generates extra heat, since it requires additional processing. Moreover, UK data centers are air-cooled and consume almost no water.

  15. Threads passes 400 million monthly active users

    Threads, Meta's microblogging service, has passed 400 million monthly active users, up from the 350 million reported this May, a gain of 50 million in three months. On mobile, its daily active user count has also narrowed the gap with its chief rival, X/Twitter. According to figures disclosed by former X CEO Linda Yaccarino, X has more than 600 million monthly active users. The latest Similarweb data shows Threads' mobile daily actives approaching X's: in June 2025 the Threads iOS and Android apps had 115.1 million daily active users, up 127.8% year over year, while X had 132 million, down 15.2%. Similarweb's figures also show X still far ahead of Threads in web traffic, averaging 145.8 million daily visits worldwide in June versus just 6.9 million for Threads.