OrangeBot.AI Digest — 2025-06-30

66 headlines across 8 sources, aggregated for this day.

Hacker News (15)

  1. The New Skill in AI Is Not Prompting, It's Context Engineering (www.philschmid.de)
  2. Xfinity using WiFi signals in your house to detect motion (www.xfinity.com)
  3. Proton joins suit against Apple for predatory practices (proton.me)
  4. Ask HN: What's the 2025 stack for a self-hosted photo library with local AI?
  5. I write type-safe generic data structures in C (danielchasehooper.com)
  6. There are no new ideas in AI, only new datasets (blog.jxmo.io)
  7. Donkey Kong Country 2 and Open Bus (jsgroth.dev)
  8. Show HN: New Ensō – first public beta (untested.sonnet.io)
  9. Show HN: TokenDagger – A tokenizer faster than OpenAI's Tiktoken (github.com)
  10. The Plot of the Phantom, a text adventure that took 40 years to finish (scottandrew.com)
  11. The provenance memory model for C (gustedt.wordpress.com)
  12. New proof dramatically compresses space needed for computation (www.scientificamerican.com)
  13. Want to meet people, try charging them for it? (notes.eatonphil.com)
  14. LetsEncrypt – Expiration Notification Service Has Ended (letsencrypt.org)
  15. Bought myself an Ampere Altra system (marcin.juszkiewicz.com.pl)

GitHub Trending (15)

  1. GraphiteEditor / Graphite

    A FOSS graphics editor for 2025: a comprehensive 2D content creation tool for graphic design, digital art, and interactive real-time motion graphics — featuring node-based procedural editing

  2. twentyhq / twenty

    Building a modern alternative to Salesforce, powered by the community.

  3. nextcloud / all-in-one

    📦 The official Nextcloud installation method. Provides easy deployment and maintenance with most features included in this one Nextcloud instance.

  4. midday-ai / midday

    Invoicing, Time tracking, File reconciliation, Storage, Financial Overview & your own Assistant made for Freelancers

  5. octra-labs / wallet-gen
  6. actualbudget / actual

    A local-first personal finance app

  7. microsoft / generative-ai-for-beginners

    21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

  8. mendableai / firecrawl

    🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

  9. swisskyrepo / PayloadsAllTheThings

    A list of useful payloads and bypasses for Web Application Security and Pentest/CTF

  10. stanford-oval / storm

    An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

  11. aldinokemal / go-whatsapp-web-multidevice

    API for WhatsApp Web Multi-Device, supporting UI, Webhook & MCP

  12. snailyp / gemini-balance

    Gemini polling proxy service

  13. paperless-ngx / paperless-ngx

    A community-supported supercharged document management system: scan, index and archive all your documents

  14. 007revad / Synology_HDD_db

    Add your HDD, SSD and NVMe drives to your Synology's compatible drive database and a lot more

  15. vanshb03 / Summer2026-Internships

    Collection of Summer 2026 tech internships!

Product Hunt (12)

  1. Tabl 1.0

    A multi-player web browser

  2. Pokecut

    Use AI to create photos with just a few clicks or a prompt

  3. Jotform Presentation Agents

    Create AI presentations that talk, listen, and answer

  4. Picsart Ignite 2.0: AI for Creators

    Instantly generate branded assets, ads, videos, fonts + more

  5. Foxylingo

    Chat and exchange languages with real people worldwide

  6. Bolt Connect

    Embedded marketplace payouts, designed for developers

  7. Dory

    An app switcher for people who can’t remember shortcuts

  8. DemoDazzle

    Create an interactive assistant that looks & sounds like you

  9. Fabi.ai Workflows

    AI-powered data workflows with SQL + Python in one platform

  10. Retainr.io

    The client management platform that turns skills into profit

  11. MyParu

    Your personal AI companion for life

  12. StartAMA

    Host your AMA session & build your audience

Hugging Face (15)

  1. Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute

    Test-time compute has emerged as a powerful paradigm for improving the performance of large language models (LLMs), where generating multiple outputs or refining individual chains can significantly boost answer accuracy. However, existing methods like Best-of-N, majority voting, and self-reflection typically apply reasoning in a uniform way across inputs, overlooking the fact that different problems may require different levels of reasoning depth. In this work, we propose Fractional Reasoning, a training-free and model-agnostic framework that enables continuous control over reasoning intensity at inference time, going beyond the limitations of fixed instructional prompts. Our method operates by extracting the latent steering vector associated with deeper reasoning and reapplying it with a tunable scaling factor, allowing the model to tailor its reasoning process to the complexity of each input. This supports two key modes of test-time scaling: (1) improving output quality in breadth-based strategies (e.g., Best-of-N, majority voting), and (2) enhancing the correctness of individual reasoning chains in depth-based strategies (e.g., self-reflection). Experiments on GSM8K, MATH500, and GPQA demonstrate that Fractional Reasoning consistently improves performance across diverse reasoning tasks and models.
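
    The core operation the abstract describes, extracting a latent steering vector and reapplying it with a tunable scaling factor, can be sketched in a few lines of NumPy. This is a toy illustration under assumed shapes, not the paper's implementation; the function names and the synthetic hidden states are hypothetical.

```python
import numpy as np

def steering_vector(h_deep, h_base):
    """Latent steering direction: mean hidden state under a
    deeper-reasoning prompt minus the mean under a neutral prompt."""
    return h_deep.mean(axis=0) - h_base.mean(axis=0)

def apply_fractional(h, v, alpha):
    """Shift hidden states along the steering direction, scaled by
    a continuous factor alpha (0 = no steering, 1 = full strength)."""
    return h + alpha * v

rng = np.random.default_rng(0)
d = 8
h_base = rng.normal(size=(16, d))   # hidden states, neutral prompt (toy)
h_deep = h_base + 0.5               # hidden states, reasoning prompt (toy)
v = steering_vector(h_deep, h_base)

h = rng.normal(size=(4, d))
half = apply_fractional(h, v, 0.5)  # half-strength reasoning boost
```

    Because alpha is continuous, the same mechanism covers both modes mentioned above: sample several alphas for breadth-based strategies, or ramp alpha up when refining a single chain.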

  2. Adaptive Domain Modeling with Language Models: A Multi-Agent Approach to Task Planning

    We introduce TAPAS (Task-based Adaptation and Planning using AgentS), a multi-agent framework that integrates Large Language Models (LLMs) with symbolic planning to solve complex tasks without the need for manually defined environment models. TAPAS employs specialized LLM-based agents that collaboratively generate and adapt domain models, initial states, and goal specifications as needed using structured tool-calling mechanisms. Through this tool-based interaction, downstream agents can request modifications from upstream agents, enabling adaptation to novel attributes and constraints without manual domain redefinition. A ReAct (Reason+Act)-style execution agent, coupled with natural language plan translation, bridges the gap between dynamically generated plans and real-world robot capabilities. TAPAS demonstrates strong performance in benchmark planning domains and in the VirtualHome simulated real-world environment.

  3. Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation

    Internal world models (WMs) enable agents to understand the world's state and predict transitions, serving as the basis for advanced deliberative reasoning. Recent large Vision-Language Models (VLMs), such as OpenAI o3, GPT-4o and Gemini, exhibit potential as general-purpose WMs. While the latest studies have evaluated and shown limitations in specific capabilities such as visual understanding, a systematic evaluation of VLMs' fundamental WM abilities remains absent. Drawing on comparative psychology and cognitive science, we propose a two-stage framework that assesses Perception (visual, spatial, temporal, quantitative, and motion) and Prediction (mechanistic simulation, transitive inference, compositional inference) to provide an atomic evaluation of VLMs as WMs. Guided by this framework, we introduce WM-ABench, a large-scale benchmark comprising 23 fine-grained evaluation dimensions across 6 diverse simulated environments with controlled counterfactual simulations. Through 660 experiments on 15 of the latest commercial and open-source VLMs, we find that these models exhibit striking limitations in basic world modeling abilities. For instance, almost all models perform at near-random accuracy when distinguishing motion trajectories. Additionally, they lack disentangled understanding -- e.g., some models tend to believe blue objects move faster than green ones. Richer results and analyses reveal significant gaps between VLMs and human-level world modeling.

  4. Spatial Mental Modeling from Limited Views

    Can Vision Language Models (VLMs) imagine the full scene from just a few views, like humans do? Humans form spatial mental models, internal representations of unseen space, to reason about layout, perspective, and motion. Our new MindCube benchmark with 21,154 questions across 3,268 images exposes this critical gap, where existing VLMs exhibit near-random performance. Using MindCube, we systematically evaluate how well VLMs build robust spatial mental models through representing positions (cognitive mapping), orientations (perspective-taking), and dynamics (mental simulation for "what-if" movements). We then explore three approaches to help VLMs approximate spatial mental models, including unseen intermediate views, natural language reasoning chains, and cognitive maps. The significant improvement comes from a synergistic approach, "map-then-reason", that jointly trains the model to first generate a cognitive map and then reason upon it. By training models to reason over these internal maps, we boosted accuracy from 37.8% to 60.8% (+23.0%). Adding reinforcement learning pushed performance even further to 70.7% (+32.9%). Our key insight is that such scaffolding of spatial mental models, actively constructing and utilizing internal structured spatial representations with flexible reasoning processes, significantly improves understanding of unobservable space.

  5. SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning

    Multimodal in-context learning (ICL) remains underexplored despite significant potential for domains such as medicine. Clinicians routinely encounter diverse, specialized tasks requiring adaptation from limited examples, such as drawing insights from a few relevant prior cases or considering a constrained set of differential diagnoses. While multimodal large language models (MLLMs) have shown advances in medical visual question answering (VQA), their ability to learn multimodal tasks from context is largely unknown. We introduce SMMILE, the first expert-driven multimodal ICL benchmark for medical tasks. Eleven medical experts curated problems, each including a multimodal query and multimodal in-context examples as task demonstrations. SMMILE encompasses 111 problems (517 question-image-answer triplets) covering 6 medical specialties and 13 imaging modalities. We further introduce SMMILE++, an augmented variant with 1038 permuted problems. A comprehensive evaluation of 15 MLLMs demonstrates that most models exhibit moderate to poor multimodal ICL ability in medical tasks. In open-ended evaluations, ICL contributes only an 8% average improvement over zero-shot on SMMILE and 9.4% on SMMILE++. We observe susceptibility to irrelevant in-context examples: even a single noisy or irrelevant example can degrade performance by up to 9.5%. Moreover, example ordering exhibits a recency bias, i.e., placing the most relevant example last can lead to substantial performance improvements of up to 71%. Our findings highlight critical limitations and biases in current MLLMs when learning multimodal medical tasks from context.

  6. Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy

    Recent advances in deep generative modeling have unlocked unprecedented opportunities for video synthesis. In real-world applications, however, users often seek tools to faithfully realize their creative editing intentions with precise and consistent control. Despite the progress achieved by existing methods, ensuring fine-grained alignment with user intentions remains an open and challenging problem. In this work, we present Shape-for-Motion, a novel framework that incorporates a 3D proxy for precise and consistent video editing. Shape-for-Motion achieves this by converting the target object in the input video to a time-consistent mesh, i.e., a 3D proxy, allowing edits to be performed directly on the proxy and then inferred back to the video frames. To simplify the editing process, we design a novel Dual-Propagation Strategy that allows users to perform edits on the 3D mesh of a single frame, and the edits are then automatically propagated to the 3D meshes of the other frames. The 3D meshes for different frames are further projected onto the 2D space to produce the edited geometry and texture renderings, which serve as inputs to a decoupled video diffusion model for generating edited results. Our framework supports various precise and physically-consistent manipulations across the video frames, including pose editing, rotation, scaling, translation, texture modification, and object composition. Our approach marks a key step toward high-quality, controllable video editing workflows. Extensive experiments demonstrate the superiority and effectiveness of our approach. Project page: https://shapeformotion.github.io/

  7. Global and Local Entailment Learning for Natural World Imagery

    Learning the hierarchical structure of data in vision-language models is a significant challenge. Previous works have attempted to address this challenge by employing entailment learning. However, these approaches fail to model the transitive nature of entailment explicitly, which establishes the relationship between order and semantics within a representation space. In this work, we introduce Radial Cross-Modal Embeddings (RCME), a framework that enables the explicit modeling of transitivity-enforced entailment. Our proposed framework optimizes for the partial order of concepts within vision-language models. By leveraging our framework, we develop a hierarchical vision-language foundation model capable of representing the hierarchy in the Tree of Life. Our experiments on hierarchical species classification and hierarchical retrieval tasks demonstrate the enhanced performance of our models compared to the existing state-of-the-art models. Our code and models are open-sourced at https://vishu26.github.io/RCME/index.html.

  8. Performance Prediction for Large Systems via Text-to-Text Regression

    In many industries, predicting metric outcomes of large systems is a fundamental problem, driven largely by traditional tabular regression. However, such methods struggle on complex systems data in the wild such as configuration files or system logs, where feature engineering is often infeasible. We propose text-to-text regression as a general, scalable alternative. For predicting resource efficiency on Borg, Google's massive compute cluster scheduling system, a 60M parameter encoder-decoder, trained from random initialization, achieves up to a near-perfect 0.99 (0.9 average) rank correlation across the entire fleet, and 100x lower MSE than tabular approaches. The model also easily adapts to new tasks with only 500 few-shot examples and captures the densities of complex outcome distributions. Ablation studies highlight the importance of using encoders, increasing sequence length, and the model's inherent uncertainty quantification. These findings pave the way for universal simulators of real-world outcomes.
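
    The text-to-text framing can be made concrete with a minimal sketch: serialize raw system features as plain text, treat the numeric target as text too, and score predictions with rank correlation. The config keys and helper names below are hypothetical, chosen only to illustrate the input/output format; the actual Borg features and model are not described in the abstract.

```python
import numpy as np

def to_input_text(config):
    # Serialize raw system features as plain text -- no tabular
    # feature engineering, just sorted key:value pairs.
    return " ".join(f"{k}:{v}" for k, v in sorted(config.items()))

def to_target_text(y):
    # The regression target is emitted as text as well.
    return f"{y:.4f}"

def spearman(a, b):
    # Rank correlation, the metric quoted for the fleet experiments.
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

cfg = {"cpus": 16, "mem_gb": 64, "priority": "batch"}  # hypothetical job config
inp = to_input_text(cfg)
tgt = to_target_text(0.8731)
rho = spearman([0.1, 0.5, 0.9], [0.2, 0.4, 0.95])
```

    A seq2seq model trained on (`inp`, `tgt`) pairs then does regression purely as text generation, which is why messy inputs like config files or logs need no hand-built features.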

  9. GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling

    Modern Large Language Models, such as the LLaMA, Qwen and DeepSeek series, predominantly adopt the Pre-LayerNorm (Pre-LN) Transformer architecture. While being stable during pretraining and scalable to large model sizes, Pre-LN suffers from an exponential growth in activation variance across layers, causing the residual path to dominate over sub-layer outputs and limiting the learning capacity of deeper layers. To mitigate this issue, we propose Gradient-Preserving Activation Scaling (GPAS), a simple technique that can be used in combination with existing approaches. GPAS works by scaling down the intermediate activations while keeping their gradients unchanged. This leaves information in the activations intact, and avoids the gradient vanishing problem associated with gradient downscaling. Extensive experiments across various model sizes from 71M to 1B show that GPAS achieves consistent performance gains. Beyond enhancing Pre-LN Transformers, GPAS also shows promise in improving alternative architectures such as Sandwich-LN and DeepNorm, demonstrating its versatility and potential for improving training dynamics in a wide range of settings.
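
    The forward/backward asymmetry described above, scale activations down but leave their gradients untouched, can be sketched with explicit forward and backward functions. This is a toy NumPy sketch under assumed names, not the paper's code; in an autograd framework the same effect is the stop-gradient trick `y = x + stop_grad(s*x - x)`.

```python
import numpy as np

def gpas_forward(x, s):
    # Forward pass: scale the intermediate activations down by s (0 < s <= 1),
    # taming the growth of activation variance across Pre-LN layers.
    return s * x

def gpas_backward(grad_out):
    # Backward pass: the gradient flows through unchanged, as if the scaling
    # were an identity -- the "gradient-preserving" part, which avoids the
    # vanishing gradients that ordinary downscaling would cause.
    return grad_out

def naive_backward(grad_out, s):
    # For contrast: ordinary scaling also shrinks the gradient by s.
    return s * grad_out

x = np.array([2.0, -4.0, 8.0])
g = np.ones_like(x)
y = gpas_forward(x, 0.5)
```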

  10. In-Context Learning Strategies Emerge Rationally

    Recent work analyzing in-context learning (ICL) has identified a broad set of strategies that describe model behavior in different experimental conditions. We aim to unify these findings by asking why a model learns these disparate strategies in the first place. Specifically, we start with the observation that when trained to learn a mixture of tasks, as is popular in the literature, the strategies learned by a model for performing ICL can be captured by a family of Bayesian predictors: a memorizing predictor, which assumes a discrete prior on the set of seen tasks, and a generalizing predictor, where the prior matches the underlying task distribution. Adopting the normative lens of rational analysis, where a learner's behavior is explained as an optimal adaptation to data given computational constraints, we develop a hierarchical Bayesian framework that almost perfectly predicts Transformer next-token predictions throughout training -- without assuming access to its weights. Under this framework, pretraining is viewed as a process of updating the posterior probability of different strategies, and inference-time behavior as a posterior-weighted average over these strategies' predictions. Our framework draws on common assumptions about neural network learning dynamics, which make explicit a tradeoff between loss and complexity among candidate strategies: beyond how well it explains the data, a model's preference towards implementing a strategy is dictated by its complexity. This helps explain well-known ICL phenomena, while offering novel predictions: e.g., we show a superlinear trend in the timescale for transitioning from generalization to memorization as task diversity increases. Overall, our work advances an explanatory and predictive account of ICL grounded in tradeoffs between strategy loss and complexity.
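
    The two Bayesian predictors and the posterior-weighted average the abstract describes can be illustrated on a toy coin-flip task: a memorizing predictor with a discrete prior over two seen task biases, and a generalizing predictor with a uniform Beta(1,1) prior over all biases. The seen-task biases, priors, and function names are hypothetical; the paper's framework is far richer than this sketch.

```python
import math
import numpy as np

SEEN_TASKS = [0.2, 0.8]  # hypothetical biases of tasks seen during pretraining

def lik_memorize(k, n):
    # Marginal likelihood of an ordered sequence with k heads in n flips
    # under a discrete prior over the seen task parameters.
    return float(np.mean([t**k * (1 - t)**(n - k) for t in SEEN_TASKS]))

def pred_memorize(k, n):
    # Posterior over seen tasks, then the posterior-mean next-flip probability.
    w = np.array([t**k * (1 - t)**(n - k) for t in SEEN_TASKS])
    w /= w.sum()
    return float(w @ np.array(SEEN_TASKS))

def lik_generalize(k, n):
    # Marginal likelihood under a Beta(1,1) prior matching a uniform
    # underlying task distribution: k!(n-k)!/(n+1)!.
    return math.factorial(k) * math.factorial(n - k) / math.factorial(n + 1)

def pred_generalize(k, n):
    return (k + 1) / (n + 2)  # Laplace rule of succession

def mixture_prediction(k, n, prior_mem=0.5):
    # Posterior weight of each strategy given the data, then a
    # posterior-weighted average of the strategies' predictions.
    lm, lg = lik_memorize(k, n), lik_generalize(k, n)
    post_mem = prior_mem * lm / (prior_mem * lm + (1 - prior_mem) * lg)
    return post_mem * pred_memorize(k, n) + (1 - post_mem) * pred_generalize(k, n)

p = mixture_prediction(9, 10)  # 9 heads in 10 flips favors the 0.8 seen task
```

    As evidence accumulates for a seen task, the memorizing strategy's posterior weight grows and the mixture drifts toward memorization, mirroring the generalization-to-memorization transition discussed above.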

  11. Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning

    We introduce Confucius3-Math, an open-source large language model with 14B parameters that (1) runs efficiently on a single consumer-grade GPU; (2) achieves SOTA performance on a range of mathematical reasoning tasks, outperforming many significantly larger models. In particular, as part of our mission to enhance education and knowledge dissemination with AI, Confucius3-Math is specifically committed to mathematics learning for Chinese K-12 students and educators. Built via post-training with large-scale reinforcement learning (RL), Confucius3-Math aligns with the national curriculum and excels at solving mainstream Chinese K-12 mathematical problems at low cost. In this report we share our development recipe, the challenges we encountered, and the techniques we developed to overcome them. In particular, we introduce three technical innovations: Targeted Entropy Regularization, Recent Sample Recovery and Policy-Specific Hardness Weighting. These innovations encompass a new entropy regularization, a novel data scheduling policy, and an improved group-relative advantage estimator. Collectively, they significantly stabilize the RL training, improve data efficiency, and boost performance. Our work demonstrates the feasibility of building strong reasoning models in a particular domain at low cost. We open-source our model and code at https://github.com/netease-youdao/Confucius3-Math.

  12. Ark: An Open-source Python-based Framework for Robot Learning

    Robotics has made remarkable hardware strides, from DARPA's Urban and Robotics Challenges to the first humanoid-robot kickboxing tournament, yet commercial autonomy still lags behind progress in machine learning. A major bottleneck is software: current robot stacks demand steep learning curves, low-level C/C++ expertise, fragmented tooling, and intricate hardware integration, in stark contrast to the Python-centric, well-documented ecosystems that propelled modern AI. We introduce ARK, an open-source, Python-first robotics framework designed to close that gap. ARK presents a Gym-style environment interface that allows users to collect data, preprocess it, and train policies using state-of-the-art imitation-learning algorithms (e.g., ACT, Diffusion Policy) while seamlessly toggling between high-fidelity simulation and physical robots. A lightweight client-server architecture provides networked publisher-subscriber communication, and optional C/C++ bindings ensure real-time performance when needed. ARK ships with reusable modules for control, SLAM, motion planning, system identification, and visualization, along with native ROS interoperability. Comprehensive documentation and case studies, from manipulation to mobile navigation, demonstrate rapid prototyping, effortless hardware swapping, and end-to-end pipelines that rival the convenience of mainstream machine-learning workflows. By unifying robotics and AI practices under a common Python umbrella, ARK lowers entry barriers and accelerates research and commercial deployment of autonomous robots.

  13. RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models

    The rise of imaging techniques such as optical coherence tomography (OCT) and advances in deep learning (DL) have enabled clinicians and researchers to streamline retinal disease staging. A popular DL approach is self-supervised learning (SSL), where models learn from vast amounts of unlabeled data, avoiding costly annotation. SSL has allowed the development of foundation models (FMs), large models that can be used for a variety of downstream tasks. However, existing FMs for OCT, trained solely on image data, lack a comprehensive and robust semantic understanding of images, as evidenced by their downstream performance (especially for complex tasks), and thus require supervised fine-tuning (which may be unfeasible) to better adapt to specific applications and populations. To address this, we propose RetFiner, an SSL vision-language refinement scheme that improves the representations of existing FMs and enables their efficient and direct adaptation to specific populations for improved downstream performance. Our method uses a diverse set of training objectives which take advantage of the rich supervisory signal found in textual data. We tested RetFiner on the retinal FMs RETFound, UrFound, and VisionFM, showing significant improvements in linear probing performance on seven highly diverse OCT classification tasks, with an average increase of 5.8, 3.9, and 2.1 percentage points over their baselines, respectively. Our code and model weights are publicly available at https://github.com/ronnief1/RetFiner.

  14. Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training

    We present Gazal-R1, a 32-billion-parameter language model that achieves state-of-the-art performance in medical reasoning while providing transparent, step-by-step explanations for clinical decision-making. Built upon Qwen3 32B, our model demonstrates that strategic training can enable mid-sized models to outperform significantly larger counterparts in specialized domains. We developed a novel two-stage training pipeline: first, supervised fine-tuning on a carefully curated dataset of 107,033 synthetic medical reasoning examples that teaches structured clinical thinking, enhanced by advanced parameter-efficient techniques including Weight-Decomposed Low-Rank Adaptation (DoRA) and Rank-Stabilized LoRA (rsLoRA); second, reinforcement learning using Group Relative Policy Optimization (GRPO) with a sophisticated multi-component reward system that refines accuracy, format adherence, and reasoning quality. Gazal-R1 achieves exceptional performance across medical benchmarks, scoring 87.1% on MedQA, 81.6% on MMLU Pro (Medical), and 79.6% on PubMedQA, surpassing models up to 12x larger. Beyond its strong empirical results, this work provides detailed insights into the challenges of training reasoning-capable models in specialized domains, including issues with reward hacking, training instability, and the fundamental tension between factual recall and detailed reasoning. Our methodology offers a reproducible framework for developing high-capability, domain-specific language models that balance performance, efficiency, and explainability.

  15. The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

    Rapid advancements in large language models (LLMs) have the potential to assist in scientific progress. A critical capability toward this endeavor is the ability to reproduce existing work. To evaluate the ability of AI agents to reproduce results in an active research area, we introduce the Automated LLM Speedrunning Benchmark, leveraging the research community's contributions on the NanoGPT speedrun, a competition to train a GPT-2 model in the shortest time. Each of the 19 speedrun tasks provides the agent with the previous record's training script, optionally paired with one of three hint formats, ranging from pseudocode to paper-like descriptions of the new record's improvements. Records execute quickly by design and speedrun improvements encompass diverse code-level changes, ranging from high-level algorithmic advancements to hardware-aware optimizations. These features make the benchmark both accessible and realistic for the frontier problem of improving LLM training. We find that recent reasoning LLMs combined with SoTA scaffolds struggle to reimplement already-known innovations in our benchmark, even when given detailed hints. Our benchmark thus provides a simple, non-saturated measure of an LLM's ability to automate scientific reproduction, a necessary (but not sufficient) skill for an autonomous research agent.

Solidot (9)

  1. Odds of asteroid 2024 YR4 striking the Moon rise to 1 in 25

    Asteroid 2024 YR4 is essentially certain not to hit Earth, but the probability that it strikes the Moon in December 2032 has risen to 1 in 25. If the impact occurs, it is expected to carve a new crater roughly 1 km in diameter on the lunar surface. The Moon itself needs no defending, and the impact would not affect its orbit, but the ejecta could reach geosynchronous-orbit altitudes and pose an interference risk to some satellite systems. It is a reminder that space defense should not stop at Earth: the safety of the entire Earth-Moon system matters as well.

  2. Black-carbon records show humans began large-scale use of fire about 50,000 years ago

    A team at the Institute of Oceanology, Chinese Academy of Sciences, working with researchers in Germany and France, published a paper in PNAS that uses black-carbon records in marine sediments to reconstruct the fire history of northern East Asia over the past 300,000 years. Combining these with records from Europe, East Asia, Southeast Asia, and Australia, plus large archaeological-site datasets, they found that large-scale use of fire by modern humans began about 50,000 years ago. Archaeological evidence traces the earliest human use of fire back roughly 1.7 million years, but exactly when humans began burning at scale has remained hard to pin down. Black carbon is the collective term for the carbon-containing compounds produced when biomass and fossil fuels burn; its aromatic structure is highly stable, so it persists in sedimentary environments. In marginal seas fed mainly by large rivers, sedimentary black carbon largely reflects fire activity at continental scale. The researchers argue that during the glacial period 50,000 years ago, modern humans began their second migration out of Africa. Falling glacial sea levels exposed broad continental shelves in the Indo-Pacific Warm Pool and weakened the rainforest barrier, letting humans spread to East Asia, Southeast Asia, and Australia in under ten thousand years. The rapid population expansion sharply increased the frequency of fire use, and the cold glacial climate and relatively scarce food drove demand for fire higher still. Together these factors made 50,000 years ago the turning point for large-scale human fire use, and suggest that humans may already have left a deep imprint on the global carbon cycle during the last glacial period.

  3. Studies find consumers have low trust in AI products

    Two studies find that consumers have little trust in AI-branded products and low willingness to buy them. AI labeling hurt product promotion, an effect that was especially pronounced for high-risk products and weaker for low-risk ones. In one study, researchers split participants into two groups of about 100 each. One group read ads for fictional products and services that highlighted features like "AI" or "AI-powered"; the other read ads using terms such as "new technology" or "equipped with cutting-edge technology." Participants who saw the AI-keyword ads reported a lower likelihood of trying or buying the products and services than the other group. The second, larger study, by market-research firm Parks Associates, surveyed about 4,000 Americans: 18% said AI would make them more likely to buy, 24% said less likely, and 58% said AI made no difference.

  4. Canonical reports $292 million in revenue for 2024

    According to the 2024 financial report Canonical filed with UK Companies House, the developer of the Ubuntu distribution earned $292 million in 2024, up from $251 million in 2023 and $205 million in 2022, with headcount reaching 1,175. By comparison, in 2014 Canonical's revenue was only $81 million with about 337 employees, and the company ran at a loss for years. It remains unclear when Canonical will IPO; as early as 2022 there were reports of an IPO planned for 2023.

  5. Study finds the Late Cretaceous ocean was "a world of squid"

    The prevailing view held that marine life in the Late Cretaceous, 100 to 70 million years ago, was dominated by ammonites and fish, but a Japanese research team has found that the ocean of that era was actually "a world of squid." Because squid lack shells and bones, they rarely fossilize and had never been included in reconstructions of the Cretaceous marine ecosystem. The team developed a new technique that grinds rock away layer by layer at a precision of one hundredth of a millimeter, photographing each layer to digitally reconstruct in 3D all the fossils inside, down to the tiniest. From Cretaceous rocks across Hokkaido they identified 263 fossilized squid beaks, averaging about 4 mm in size.

  6. Japan debates bill allowing married couples to keep separate surnames

    Last month Japan's Diet failed to pass a bill for an "optional separate-surname system" that would let married couples keep different surnames, despite polls showing majority public support. Japan is the only country whose law requires married couples to share one surname, and 95% of women take their husband's name. A study by the NGO Asuniwa suggests that allowing couples to keep separate surnames could help raise the birth rate, because many partners choose not to marry rather than change their names. The teachers Uchiyama Yukari and Koike Yuki, for example, have divorced and remarried three times to work around the law, spending most of their time unmarried; to register their children's births they marry, then divorce again.

  7. Chinese graphic designers confront AI image generators

    Chinese graphic designers are feeling the impact of AI image generators on their daily work. Because these tools imitate artistic styles so easily, they have profoundly changed how clients perceive designers' work. An anonymous employee at a major e-commerce platform said that even before AI image generators became popular, graphic designers at tech giants and large companies were instructed to copy competitors or reproduce work seen on social media. To reproduce a distinctive artistic style, a human must understand and reverse-engineer it; an AI image generator simply introduces random variation into that style. The result can look very much like a copy and may contain errors, which a human designer can then edit into a finished product. The employee said that designers who do not embrace AI feel very easy to replace. Sendi Jia, a designer who runs studios in Beijing and London, said AI image generators are forcing designers and clients to rethink a designer's value: does it lie only in producing designs, or in consulting, creativity, strategy, direction, and aesthetics? Erbing, a graphic designer in Beijing, said AI cannot produce anything unique: "Every project faces different problems; designers exist to solve specific problems, not to churn out identical visuals." He said the thinking behind a project often takes longer than the actual production, and he regards AI image generators as toys rather than tools. Still, designers concede that the AI frenzy has hurt how clients value their work: clients now expect designers to deliver in less time for lower fees, which can drag down quality. Erbing added that some clients reason that if AI raises efficiency, budgets can be halved, but a designer's job is not just making images.

  8. Bcachefs may be removed from the Linux kernel

    Citing disagreements with maintainer Kent Overstreet, the creator of Linux has again threatened to remove the Bcachefs filesystem from the kernel. In a recent pull-request comment, Linus Torvalds said he may part ways with Bcachefs during the 6.17 merge window. His stated reason is a deep divergence in development philosophy: he cannot even question Bcachefs bug fixes, and feels he can only pull code on Overstreet's terms. After their arguments, he said, the only consensus is "we're done."

  9. Germany asks Apple and Google to remove DeepSeek

    Meike Kamp, Germany's Federal Commissioner for Data Protection and Freedom of Information, said on Friday that she has formally asked Apple and Google to remove the app of the Chinese AI company DeepSeek from the App Store and Google Play in Germany. The company has failed to show that its data processing meets EU standards and is suspected of unlawfully transferring German users' personal data to China. Under DeepSeek's privacy policy, the app stores users' AI queries, uploaded files, and other personal information on servers in China. German regulators asked DeepSeek earlier this year either to comply with EU rules on cross-border data transfers or to withdraw the app voluntarily; DeepSeek did not respond, so Kamp initiated the formal removal procedure.