OrangeBot.AI Digest — 2025-06-27
74 headlines across 5 sources, aggregated for this day.
Hacker News (15)
- US Supreme Court limits federal judges' power to block Trump orders (www.theguardian.com)
- Copilot Chat in VS Code is now open source (github.com)
- Project Vend: Can Claude run a small shop? (And why does that matter?) (www.anthropic.com)
- Weird Expressions in Rust (www.wakunguma.com)
- 10 Years of Pomological Watercolors (parkerhiggins.net)
- Qwen VLo: From "Understanding" the World to "Depicting" It (qwenlm.github.io)
- The Effect of Noise on Sleep (www.empirical.health)
- Show HN: I'm an airline pilot – I built interactive graphs/globes of my flights (jameshard.ing)
- Moonbase Alpha: That time NASA made a meme video game (www.spacebar.news)
- Show HN: Zenta – Mindfulness for Terminal Users (github.com)
- I Switched from Flutter and Rust to Rust and Egui (jdiaz97.github.io)
- Parameterized types in C using the new tag compatibility rule (nullprogram.com)
- Show HN: Sink – Sync any directory with any device on your local network (github.com)
- XSLT – Native, zero-config build system for the Web (github.com)
- Denmark to tackle deepfakes by giving people copyright to their own features (www.theguardian.com)
GitHub Trending (14)
- coleam00 / ottomator-agents
All the open source AI Agents hosted on the oTTomator Live Agent Studio platform!
- sindresorhus / awesome
😎 Awesome lists about all kinds of interesting topics
- gitleaks / gitleaks
Find secrets with Gitleaks 🔑
- twentyhq / twenty
Building a modern alternative to Salesforce, powered by the community.
- black-forest-labs / flux
Official inference repo for FLUX.1 models
- jujumilk3 / leaked-system-prompts
Collection of leaked system prompts
- gensyn-ai / rl-swarm
A fully open source framework for creating RL training swarms over the internet.
- rxi / microui
A tiny immediate-mode UI library
- automatisch / automatisch
The open source Zapier alternative. Build workflow automation without spending time and money.
- AykutSarac / jsoncrack.com
✨ Innovative and open-source visualization application that transforms various data formats, such as JSON, YAML, XML, CSV and more, into interactive graphs.
- sdmg15 / Best-websites-a-programmer-should-visit
🔗 Some useful websites for programmers.
- mui / base-ui
Unstyled UI components for building accessible web apps and design systems. From the creators of Radix, Floating UI, and Material UI.
- ripienaar / free-for-dev
A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev
- cline / cline
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
Product Hunt (15)
- Dyad
Free, local, open-source alternative to Lovable / v0 / Bolt
- Integral
AI-native Slack alternative for big teams and communities
- HeyGen Video Agent
The world’s first Creative Operating System
- Higgsfield Soul
Higgsfield's first high-aesthetic photo model
- FLUX.1 Kontext
Powerful In-context AI image editing, now open source
- Phion.dev
Build in Cursor like in Lovable - auto start, save & deploy
- QuickAgent
Easily build AI agents that connect to any service, no-code
- Gemma 3n
Run powerful multimodal AI right on your phone
- Crowd
AI-powered 360° customer intelligence platform.
- TextMarley
iMessage-based reminders integrated with Google Calendar
- SoSeeAL
Algorithm and ad free social media platform
- Vanta
A secure and user-friendly CLI password manager
- Doppl
Try on any look, anywhere
- Parallel AI White-Label
All-in-one AI automation platform for business
- BUNDL AI
Let AI shop for you across multiple stores
Hugging Face (15)
- Generative Blocks World: Moving Things Around in Pictures
We describe Generative Blocks World to interact with the scene of a generated image by manipulating simple geometric abstractions. Our method represents scenes as assemblies of convex 3D primitives, and the same scene can be represented by different numbers of primitives, allowing an editor to move either whole structures or small details. Once the scene geometry has been edited, the image is generated by a flow-based method which is conditioned on depth and a texture hint. Our texture hint takes into account the modified 3D primitives, exceeding texture-consistency provided by existing key-value caching techniques. These texture hints (a) allow accurate object and camera moves and (b) largely preserve the identity of objects depicted. Quantitative and qualitative experiments demonstrate that our approach outperforms prior works in visual fidelity, editability, and compositional generalization.
- DuaShepherd: Integrating Stepwise Correctness and Potential Rewards for Mathematical Reasoning
In this paper, we propose DuaShepherd, a novel reward modeling framework that integrates two complementary reward signals, correctness and potential, to enhance the mathematical reasoning capabilities of Large Language Models (LLMs). While correctness-based signals emphasize identification of stepwise errors, potential-based signals focus on the likelihood of reaching the correct final answer. We developed an automated pipeline for constructing a large-scale reward modeling dataset with both signals. A unified, multi-head architecture was explored to train the two reward models in a multi-task setup, demonstrating benefits from learning both correctness and potential in parallel. By combining these two signals into a compound probability, our model achieves consistent performance improvements across multiple benchmarks. Empirical evaluations on MATH500 and ProcessBench confirm that this combined reward significantly outperforms models trained on either reward type alone, achieving state-of-the-art performance under comparable resource constraints.
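The "compound probability" idea in the abstract above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual implementation: we assume each reasoning step carries a correctness probability and a potential probability, multiply them per step, and score a solution by its weakest step.

```python
def compound_reward(correctness: float, potential: float) -> float:
    """Combine a step's correctness and potential signals into one score."""
    assert 0.0 <= correctness <= 1.0 and 0.0 <= potential <= 1.0
    return correctness * potential

def score_solution(steps):
    """Score a multi-step solution by its minimum compound step score,
    so a single weak step caps the whole trajectory (one plausible
    aggregation; the paper may aggregate differently)."""
    return min(compound_reward(c, p) for c, p in steps)

steps = [(0.9, 0.8), (0.95, 0.6), (0.7, 0.9)]
print(round(score_solution(steps), 2))  # → 0.57, the weakest step
```

The product rewards steps that are both locally correct and likely to reach the final answer, which is the complementarity the abstract describes.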
- PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling
Skinning and rigging are fundamental components in animation, articulated object reconstruction, motion transfer, and 4D generation. Existing approaches predominantly rely on Linear Blend Skinning (LBS), due to its simplicity and differentiability. However, LBS introduces artifacts such as volume loss and unnatural deformations, and it fails to model elastic materials like soft tissues, fur, and flexible appendages (e.g., elephant trunks, ears, and fatty tissues). In this work, we propose PhysRig: a differentiable physics-based skinning and rigging framework that overcomes these limitations by embedding the rigid skeleton into a volumetric representation (e.g., a tetrahedral mesh), which is simulated as a deformable soft-body structure driven by the animated skeleton. Our method leverages continuum mechanics and discretizes the object as particles embedded in an Eulerian background grid to ensure differentiability with respect to both material properties and skeletal motion. Additionally, we introduce material prototypes, significantly reducing the learning space while maintaining high expressiveness. To evaluate our framework, we construct a comprehensive synthetic dataset using meshes from Objaverse, The Amazing Animals Zoo, and Mixamo, covering diverse object categories and motion patterns. Our method consistently outperforms traditional LBS-based approaches, generating more realistic and physically plausible results. Furthermore, we demonstrate the applicability of our framework in the pose transfer task, highlighting its versatility for articulated object modeling.
- DiLoCoX: A Low-Communication Large-Scale Training Framework for Decentralized Cluster
The distributed training of foundation models, particularly large language models (LLMs), demands a high level of communication. Consequently, it is highly dependent on a centralized cluster with fast and reliable interconnects. Can we conduct training on slow networks and thereby unleash the power of decentralized clusters when dealing with models exceeding 100 billion parameters? In this paper, we propose DiLoCoX, a low-communication large-scale decentralized cluster training framework. It combines Pipeline Parallelism with Dual Optimizer Policy, One-Step-Delay Overlap of Communication and Local Training, and an Adaptive Gradient Compression Scheme. This combination significantly improves the scale of parameters and the speed of model pre-training. We justify the benefits of one-step-delay overlap of communication and local training, as well as the adaptive gradient compression scheme, through a theoretical analysis of convergence. Empirically, we demonstrate that DiLoCoX is capable of pre-training a 107B foundation model over a 1Gbps network. Compared to vanilla AllReduce, DiLoCoX can achieve a 357x speedup in distributed training while maintaining negligible degradation in model convergence. To the best of our knowledge, this is the first decentralized training framework successfully applied to models with over 100 billion parameters.
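The abstract above does not spell out its adaptive gradient compression scheme, but a common building block of low-communication training is top-k sparsification with error feedback, sketched here as a plain-Python toy (the function names and the fixed k are illustrative, not DiLoCoX's actual compressor):

```python
def compress_topk(grad, k, residual):
    """Keep only the k largest-magnitude gradient entries (after adding
    the residual carried over from earlier steps); everything dropped is
    accumulated into a new residual so no signal is lost permanently."""
    full = [g + r for g, r in zip(grad, residual)]
    # indices of the k entries with the largest absolute value
    top = set(sorted(range(len(full)), key=lambda i: abs(full[i]),
                     reverse=True)[:k])
    sparse = [v if i in top else 0.0 for i, v in enumerate(full)]
    new_residual = [f - s for f, s in zip(full, sparse)]
    return sparse, new_residual

grad = [0.1, -2.0, 0.05, 3.0, -0.2]
sparse, res = compress_topk(grad, k=2, residual=[0.0] * 5)
print(sparse)  # only the two largest-magnitude entries are communicated
```

Only the sparse vector would cross the slow network; the residual stays local, which is why such schemes tolerate aggressive compression.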
- FairyGen: Storied Cartoon Video from a Single Child-Drawn Character
We propose FairyGen, an automatic system for generating story-driven cartoon videos from a single child's drawing, while faithfully preserving its unique artistic style. Unlike previous storytelling methods that primarily focus on character consistency and basic motion, FairyGen explicitly disentangles character modeling from stylized background generation and incorporates cinematic shot design to support expressive and coherent storytelling. Given a single character sketch, we first employ an MLLM to generate a structured storyboard with shot-level descriptions that specify environment settings, character actions, and camera perspectives. To ensure visual consistency, we introduce a style propagation adapter that captures the character's visual style and applies it to the background, faithfully retaining the character's full visual identity while synthesizing style-consistent scenes. A shot design module further enhances visual diversity and cinematic quality through frame cropping and multi-view synthesis based on the storyboard. To animate the story, we reconstruct a 3D proxy of the character to derive physically plausible motion sequences, which are then used to fine-tune an MMDiT-based image-to-video diffusion model. We further propose a two-stage motion customization adapter: the first stage learns appearance features from temporally unordered frames, disentangling identity from motion; the second stage models temporal dynamics using a timestep-shift strategy with frozen identity weights. Once trained, FairyGen directly renders diverse and coherent video scenes aligned with the storyboard. Extensive experiments demonstrate that our system produces animations that are stylistically faithful and narratively structured, with natural motion, highlighting its potential for personalized and engaging story animation. The code will be available at https://github.com/GVCLab/FairyGen
- Learning to Skip the Middle Layers of Transformers
Conditional computation is a popular strategy to make Transformers more efficient. Existing methods often target individual modules (e.g., mixture-of-experts layers) or skip layers independently of one another. However, interpretability research has demonstrated that the middle layers of Transformers exhibit greater redundancy, and that early layers aggregate information into token positions. Guided by these insights, we propose a novel architecture that dynamically skips a variable number of layers from the middle outward. In particular, a learned gating mechanism determines whether to bypass a symmetric span of central blocks based on the input, and a gated attention mechanism prevents subsequent tokens from attending to skipped token positions. Residual norms are controlled with a 'sandwich' or 'perilayernorm' scheme and gate sparsity with an adaptive regularization loss. We had aimed to reduce compute requirements for 'simpler' tokens and potentially foster an emergent multi-level representational hierarchy but, at the scales investigated, our approach does not achieve improvements in the trade-off between validation cross-entropy and estimated FLOPs compared to dense baselines with fewer layers. We release our code at https://github.com/tim-lawson/skip-middle.
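The "symmetric span of central blocks" above reduces to simple index arithmetic. A minimal sketch, with the caveat that the paper's gate is learned per input while here the skip depth d is just a parameter:

```python
def active_layers(n_layers: int, d: int):
    """Return the layer indices that still execute when a symmetric
    span of 2*d central blocks is skipped, middle outward."""
    mid = n_layers // 2
    skipped = set(range(mid - d, mid + d))
    return [i for i in range(n_layers) if i not in skipped]

print(active_layers(8, 0))  # all 8 layers run: [0, ..., 7]
print(active_layers(8, 2))  # the 4 central blocks are skipped: [0, 1, 6, 7]
```

A learned gate would pick d per token, so "simple" inputs traverse fewer blocks, which is exactly the compute saving the authors were after.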
- MADrive: Memory-Augmented Driving Scene Modeling
Recent advances in scene reconstruction have pushed toward highly realistic modeling of autonomous driving (AD) environments using 3D Gaussian splatting. However, the resulting reconstructions remain closely tied to the original observations and struggle to support photorealistic synthesis of significantly altered or novel driving scenarios. This work introduces MADrive, a memory-augmented reconstruction framework designed to extend the capabilities of existing scene reconstruction methods by replacing observed vehicles with visually similar 3D assets retrieved from a large-scale external memory bank. Specifically, we release MAD-Cars, a curated dataset of ~70K 360° car videos captured in the wild and present a retrieval module that finds the most similar car instances in the memory bank, reconstructs the corresponding 3D assets from video, and integrates them into the target scene through orientation alignment and relighting. The resulting replacements provide complete multi-view representations of vehicles in the scene, enabling photorealistic synthesis of substantially altered configurations, as demonstrated in our experiments. Project page: https://yandex-research.github.io/madrive/
- MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners
We propose MuseControlLite, a lightweight mechanism designed to fine-tune text-to-music generation models for precise conditioning using various time-varying musical attributes and reference audio signals. The key finding is that positional embeddings, which have been seldom used by text-to-music generation models in the conditioner for text conditions, are critical when the condition of interest is a function of time. Using melody control as an example, our experiments show that simply adding rotary positional embeddings to the decoupled cross-attention layers increases control accuracy from 56.6% to 61.1%, while requiring 6.75 times fewer trainable parameters than state-of-the-art fine-tuning mechanisms, using the same pre-trained diffusion Transformer model of Stable Audio Open. We evaluate various forms of musical attribute control, audio inpainting, and audio outpainting, demonstrating improved controllability over MusicGen-Large and Stable Audio Open ControlNet at a significantly lower fine-tuning cost, with only 85M trainable parameters. Source code, model checkpoints, and demo examples are available at: https://musecontrollite.github.io/web/.
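Rotary positional embeddings, the ingredient the abstract credits for handling time-varying conditions, rotate each pair of feature dimensions by an angle proportional to the position. A self-contained miniature (this shows the standard RoPE rotation, not MuseControlLite's surrounding architecture):

```python
import math

def rope(vec, pos, base=10000.0):
    """Apply a rotary positional embedding to an even-length vector
    at integer position pos: each (x, y) dimension pair is rotated
    by theta = pos * base**(-i/d), with slower angles at higher i."""
    out = []
    for i in range(0, len(vec), 2):
        theta = pos * base ** (-i / len(vec))
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

v = [1.0, 0.0, 0.0, 1.0]
print(rope(v, pos=0))  # position 0 leaves the vector unchanged
```

Because rotations preserve norms and relative angles, attention scores between two rotated vectors depend only on their relative positions, which is what makes time-varying conditions addressable by offset.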
- HeurAgenix: Leveraging LLMs for Solving Complex Combinatorial Optimization Challenges
Heuristic algorithms play a vital role in solving combinatorial optimization (CO) problems, yet traditional designs depend heavily on manual expertise and struggle to generalize across diverse instances. We introduce HeurAgenix, a two-stage hyper-heuristic framework powered by large language models (LLMs) that first evolves heuristics and then selects among them automatically. In the heuristic evolution phase, HeurAgenix leverages an LLM to compare seed heuristic solutions with higher-quality solutions and extract reusable evolution strategies. During problem solving, it dynamically picks the most promising heuristic for each problem state, guided by the LLM's perception ability. For flexibility, this selector can be either a state-of-the-art LLM or a fine-tuned lightweight model with lower inference cost. To mitigate the scarcity of reliable supervision caused by CO complexity, we fine-tune the lightweight heuristic selector with a dual-reward mechanism that jointly exploits signals from selection preferences and state perception, enabling robust selection under noisy annotations. Extensive experiments on canonical benchmarks show that HeurAgenix not only outperforms existing LLM-based hyper-heuristics but also matches or exceeds specialized solvers. Code is available at https://github.com/microsoft/HeurAgenix.
- FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing
We develop a cost-efficient neurosymbolic agent to address challenging multi-turn image editing tasks such as "Detect the bench in the image while recoloring it to pink. Also, remove the cat for a clearer view and recolor the wall to yellow." It combines fast, high-level subtask planning by large language models (LLMs) with slow, accurate, tool-using local A^* search per subtask to find a cost-efficient toolpath -- a sequence of calls to AI tools. To save the cost of A^* on similar subtasks, we perform inductive reasoning on previously successful toolpaths via LLMs to continuously extract/refine frequently used subroutines and reuse them as new tools for future tasks in an adaptive fast-slow planning scheme, where the higher-level subroutines are explored first, and only when they fail is the low-level A^* search activated. The reusable symbolic subroutines considerably save exploration cost on the same types of subtasks applied to similar images, yielding a human-like fast-slow toolpath agent, "FaSTA^*": fast subtask planning followed by rule-based subroutine selection per subtask is attempted by LLMs first, which is expected to cover most tasks, while slow A^* search is only triggered for novel and challenging subtasks. By comparing with recent image editing approaches, we demonstrate that FaSTA^* is significantly more computationally efficient while remaining competitive with the state-of-the-art baseline in terms of success rate.
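The adaptive fast-slow planning described above can be caricatured as "try the cheap mined subroutine first; fall back to expensive search only when it fails." Everything in this sketch (the subroutine table, the task names, the stand-in search) is invented for illustration:

```python
# Table of subroutines mined from earlier successful toolpaths (hypothetical).
subroutines = {"recolor": lambda img: img + ["recolored"]}

def slow_search(task, img):
    """Stand-in for the costly per-subtask A*-style toolpath search."""
    return img + [f"searched:{task}"]

def solve(task, img):
    """Fast path: a known subroutine covers the subtask.
    Slow path: novel subtask, fall back to search."""
    fast = subroutines.get(task)
    if fast is not None:
        return fast(img), "fast"
    return slow_search(task, img), "slow"

print(solve("recolor", []))     # handled by the mined subroutine
print(solve("remove_cat", []))  # novel subtask triggers the slow path
```

The real agent additionally checks whether the fast result succeeded before accepting it; the dispatch structure, though, is the core of the cost saving.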
- Whole-Body Conditioned Egocentric Video Prediction
We train models to Predict Ego-centric Video from human Actions (PEVA), given the past video and an action represented by the relative 3D body pose. By conditioning on kinematic pose trajectories, structured by the joint hierarchy of the body, our model learns to simulate how physical human actions shape the environment from a first-person point of view. We train an auto-regressive conditional diffusion transformer on Nymeria, a large-scale dataset of real-world egocentric video and body pose capture. We further design a hierarchical evaluation protocol with increasingly challenging tasks, enabling a comprehensive analysis of the model's embodied prediction and control abilities. Our work represents an initial attempt to tackle the challenges of modeling complex real-world environments and embodied agent behaviors with video prediction from the perspective of a human.
- Arch-Router: Aligning LLM Routing with Human Preferences
With the rapid proliferation of large language models (LLMs) -- each optimized for different strengths, style, or latency/cost profile -- routing has become an essential technique to operationalize the use of different models. However, existing LLM routing approaches are limited in two key ways: they evaluate performance using benchmarks that often fail to capture human preferences driven by subjective evaluation criteria, and they typically select from a limited pool of models. In this work, we propose a preference-aligned routing framework that guides model selection by matching queries to user-defined domains (e.g., travel) or action types (e.g., image editing) -- offering a practical mechanism to encode preferences in routing decisions. Specifically, we introduce Arch-Router, a compact 1.5B model that learns to map queries to domain-action preferences for model routing decisions. Our approach also supports seamlessly adding new models for routing without requiring retraining or architectural modifications. Experiments on conversational datasets demonstrate that our approach achieves state-of-the-art (SOTA) results in matching queries with human preferences, outperforming top proprietary models. Our approach captures subjective evaluation criteria and makes routing decisions more transparent and flexible. Our model is available at: https://huggingface.co/katanemo/Arch-Router-1.5B.
- Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test
Grokking, i.e., test performance that keeps improving long after the training loss has converged, has recently been observed in neural network training, making the mechanisms behind generalization and other emergent capabilities such as reasoning mysterious. While prior studies usually train small models on a few toy or highly specific tasks for thousands of epochs, we conduct the first study of grokking on checkpoints during one-pass pretraining of a 7B large language model (LLM), OLMoE. We compute the training loss and evaluate generalization on diverse benchmark tasks, including math reasoning, code generation, and commonsense/domain-specific knowledge retrieval tasks. Our study, for the first time, verifies that grokking still happens in the pretraining of large-scale foundation models, though different data may enter grokking stages asynchronously. We further demystify grokking's "emergence of generalization" by investigating LLM internal dynamics. Specifically, we find that training samples' pathways (i.e., expert choices across layers) evolve from random and instance-specific to more structured and shareable between samples during grokking. The complexity of a sample's pathway also decreases despite the converged loss. These changes indicate a memorization-to-generalization conversion, providing a mechanistic explanation of delayed generalization. In the study, we develop two novel metrics to quantify pathway distance and the complexity of a single pathway. We show their ability to predict the generalization improvement on diverse downstream tasks: they are efficient, simple to compute, and solely dependent on training data. Hence, they have practical value for pretraining, enabling us to monitor generalization performance without finetuning or test sets. Theoretically, we show that more structured pathways reduce model complexity and improve the generalization bound.
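The abstract describes pathways as per-layer expert choices in a mixture-of-experts model. One natural (hypothetical) instantiation of a pathway-distance metric is the normalized Hamming distance between two samples' expert-choice sequences; the paper's actual metrics may be defined differently:

```python
def pathway_distance(path_a, path_b):
    """Fraction of layers at which two samples route to different experts.
    Each path is a list of expert indices, one per layer."""
    assert len(path_a) == len(path_b)
    diff = sum(a != b for a, b in zip(path_a, path_b))
    return diff / len(path_a)

# Two samples whose expert choices agree on 3 of 4 layers:
print(pathway_distance([0, 5, 2, 7], [0, 5, 2, 3]))  # → 0.25
```

Under such a metric, "more structured and shareable" pathways show up as shrinking average pairwise distance across training samples as pretraining progresses.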
- An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
Rare diseases collectively affect over 300 million individuals worldwide, yet timely and accurate diagnosis remains a pervasive challenge. This is largely due to their clinical heterogeneity, low individual prevalence, and the limited familiarity most clinicians have with rare conditions. Here, we introduce DeepRare, the first rare disease diagnosis agentic system powered by a large language model (LLM), capable of processing heterogeneous clinical inputs. The system generates ranked diagnostic hypotheses for rare diseases, each accompanied by a transparent chain of reasoning that links intermediate analytic steps to verifiable medical evidence. DeepRare comprises three key components: a central host with a long-term memory module; specialized agent servers responsible for domain-specific analytical tasks, integrating over 40 specialized tools; and web-scale, up-to-date medical knowledge sources, ensuring access to the most current clinical information. This modular and scalable design enables complex diagnostic reasoning while maintaining traceability and adaptability. We evaluate DeepRare on eight datasets. The system demonstrates exceptional diagnostic performance across 2,919 diseases, achieving 100% accuracy for 1,013 of them. In HPO-based evaluations, DeepRare significantly outperforms 15 other methods, including traditional bioinformatics diagnostic tools, LLMs, and other agentic systems, achieving an average Recall@1 score of 57.18% and surpassing the second-best method (a reasoning LLM) by a substantial margin of 23.79 percentage points. For multi-modal input scenarios, DeepRare achieves a Recall@1 of 70.60% versus Exomiser's 53.20% across 109 cases. Manual verification of reasoning chains by clinical experts achieves 95.40% agreement. The DeepRare system has also been implemented as a user-friendly web application at http://raredx.cn/doctor.
- WorldVLA: Towards Autoregressive Action World Model
We present WorldVLA, an autoregressive action world model that unifies action and image understanding and generation. Our WorldVLA integrates a Vision-Language-Action (VLA) model and a world model in one single framework. The world model predicts future images by leveraging both action and image understanding, with the purpose of learning the underlying physics of the environment to improve action generation. Meanwhile, the action model generates subsequent actions based on image observations, aiding visual understanding, which in turn helps the visual generation of the world model. We demonstrate that WorldVLA outperforms standalone action and world models, highlighting the mutual enhancement between the two. In addition, we find that the performance of the action model deteriorates when generating sequences of actions in an autoregressive manner. This phenomenon can be attributed to the model's limited generalization capability for action prediction, leading to the propagation of errors from earlier actions to subsequent ones. To address this issue, we propose an attention mask strategy that selectively masks prior actions during the generation of the current action, which shows significant performance improvement in the action chunk generation task.
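The attention-mask strategy above can be sketched as a visibility rule over token types: when generating the current action, positions holding earlier actions are masked out while image-observation positions stay visible, so errors in earlier actions cannot propagate. The token-type layout here is invented for illustration:

```python
def action_mask(token_types):
    """For each preceding token, return whether the current action token
    may attend to it: observations ("img") stay visible, prior actions
    ("act") are masked out."""
    return [t == "img" for t in token_types]

context = ["img", "img", "act", "img", "act"]
print(action_mask(context))  # [True, True, False, True, False]
```

In a real transformer this boolean vector would be turned into additive -inf attention biases; the selective rule itself is the whole idea.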
Solidot (15)
- Laughter Is Contagious for Bonobos Too
A study found that bonobos that heard laughter were more likely to approach an object they would normally leave untouched. The study monitored four trained bonobos that would interact with or ignore a box depending on whether it contained food, and it suggests that hearing positive sounds may influence their foraging and search behavior. The four bonobos were first familiarized with a black box holding a food reward and an empty white box, and were trained to press a button to reject the white box. In addition, three "ambiguous" boxes (light, medium, and dark gray) were presented intermittently, with a food reward in 50% of them. Tests were run while playing either recordings of bonobo laughter or ambient wind sounds, with playback lasting 7 minutes 28 seconds. The bonobos approached the black box in 93% of cases and the white box in only 1%. When offered the gray boxes, they approached the dark gray box more often than the light gray one. Pooling all gray-box trials, the researchers found the bonobos were more likely to inspect a gray box after hearing the laughter recording, approaching it 3.4 times more often than under ambient wind. The researchers believe the laughter may have triggered an emotional resonance in the bonobos that influenced their behavior, making them more likely to approach an ambiguous stimulus.
- Digital Sovereignty Starts at the Desktop: Europe's Linux Desktop Era May Be Arriving
The impending end of support for Windows 10, along with incidents such as Microsoft complying with US government sanctions against the chief prosecutor of the International Criminal Court, have been a wake-up call for European countries. Switching to the Linux desktop would aid security and privacy and help safeguard Europe's digital sovereignty. France's Gendarmerie successfully switched to GendBuntu, a custom Ubuntu-based distribution, more than a decade ago. Some have proposed developing a dedicated distribution for EU organizations, EU OS, which would be based on Fedora KDE Linux, Red Hat's community distribution.
- AMD Becomes Platinum Sponsor of the Debian Developer Conference
The Debian project announced that AMD is a platinum sponsor of DebConf25, next month's developer conference in Brest, France. AMD's aim is to promote its open-source GPU programming stack, ROCm, to Debian developers: Debian is an officially supported platform for AMD ROCm, and a growing number of its components are included directly in the Debian distribution (though not yet in stable, mainly in testing).
- Denmark to Fight Deepfakes by Empowering Its Citizens
Denmark plans to fight AI deepfakes by amending its copyright law to ensure that every person owns their own identity, facial features, and voice. The Danish Ministry of Culture will first publish the amendment for public comment and then formally submit it this autumn. The proposal already has the backing of nine tenths of the members of parliament. With the rapid advance of AI technology, producing convincing deepfake images, video, or audio is easier than ever. Once the amendment is approved, Danish citizens will have the right to demand that online platforms remove content shared without their consent. The amendment will not affect parody and satire.
- Few Americans Are Willing to Pay When They Hit a News Paywall
As print revenue declines, more and more outlets have embraced paid subscription models. But when users browsing the web run into news content behind a paywall, how many are willing to pay? A Pew Research Center survey shows that the vast majority are not. 83% of respondents said they had not paid for news in the past year, while 17% had paid a news organization via subscription, donation, or membership. 74% had encountered a paywall while searching for news, and 38% encounter paywalls often. In most cases, after hitting a paywall, 53% look for another source of the information and 32% give up; only 1% choose to pay for access. Adults with higher education, Democrats, and older people are more likely to pay for news, with Democrats more willing to pay than Republicans (21% vs. 14%).
- Study Finds LLM Users Show Weaker Understanding
Researchers at the University of Pennsylvania's Wharton School found that, compared with Google search users, people who used a large language model to research a given topic showed weaker understanding and produced fewer original insights. The research spanned four experiments with over 4,500 participants. The results show that LLM users spent less time and effort on research and wrote shorter, less detailed responses. In the first experiment, over 1,100 participants used Google or ChatGPT to research vegetable gardening. The Google users' responses were longer, more distinctively worded, and richer in cited facts. The second experiment presented the same gardening information either as an AI summary or as simulated web pages; among the nearly 2,000 participants, Google users gave deeper, richer answers.
- Microsoft Is Moving Antivirus Software Out of the Windows Kernel
Nearly a year after a faulty update from security firm CrowdStrike crashed 8.5 million computers worldwide, Microsoft is acting to ensure such an incident never recurs: it is moving antivirus software out of the Windows kernel. Microsoft's new Windows endpoint security platform is being built in cooperation with security vendors including CrowdStrike, Bitdefender, ESET, and Trend Micro. Previously, Microsoft allowed vendors' antivirus software to run at the Windows kernel level with unrestricted access to system memory and hardware. CrowdStrike's faulty update last year highlighted how easily buggy kernel drivers can blue-screen the system. Microsoft plans to release a private preview for security vendors to test and, after several iterations, to complete the move of antivirus software out of the kernel.
- Google DeepMind Releases AlphaGenome
AlphaGenome, a new AI model from Google DeepMind, can help scientists decode the "dark matter" of the genome, its non-coding regions, and understand how they influence a cell's inner workings and contribute to diseases such as cancer. Researchers doing non-commercial work can now access the model on DeepMind's servers via an API. In the human genome, 98% of the sequence does not directly encode proteins; yet these non-coding regions can affect protein activity and contain a large number of disease-associated variants. Working out what a DNA sequence does is hard because there is no ready-made answer of the kind AlphaFold has when predicting a protein's 3D structure. A single stretch of DNA plays many interrelated roles, from recruiting cellular machinery to a particular part of a chromosome and transcribing nearby genes into RNA molecules, to attracting transcription factors that influence where, when, and how strongly genes are expressed. Many DNA sequences, for example, affect gene activity by altering the chromosome's 3D shape, restricting or easing the transcription machinery's access. Over the decades, scientists have built dozens of AI models to understand the genome, many focused on a single task, such as predicting gene expression levels or determining how exons are spliced into different proteins. AlphaGenome, by contrast, is an "all-in-one" tool for interpreting DNA sequences. It can process up to one million DNA bases, potentially covering a gene and countless regulatory elements, and make thousands of predictions across many biological properties. It is also sensitive to single-base changes, which means scientists can predict the effects of mutations.
- US Computer Science Enrollment Begins to Decline
For the past two decades, computer science was seen as a stepping stone to a high-paying job. From 2005 to 2023, the number of US computer science majors quadrupled. But over the past year, enrollment in the major has started to slip at some universities. In 2025, US computer science enrollment grew only 0.2%: growth at Stanford has stalled; Princeton estimates its future computer science graduating classes will be a quarter smaller than today's; Duke's enrollment in the major fell by a fifth. Computer science may be past its golden age. With the rise of AI tools, tech companies have sharply cut hiring for entry-level programming jobs. A Pew Research Center study found that software engineering is considered the profession hit hardest by generative AI, which is regarded as even better at writing code than at generating text. One challenge universities face in the generative AI era is whether to lower admission standards to boost enrollment, or to raise them and admit only the strongest candidates into the field.
- US K-12 Teachers Use AI for Lesson Planning and Grading
According to a survey by Gallup and the Walton Family Foundation, six in ten teachers in US K-12 schools use AI tools in their work. Adoption is highest among high-school teachers and younger teachers. About 2,000 teachers were surveyed; those who use AI tools report saving roughly six hours of work per week. As for students' use of AI tools, half of teachers worry it will erode students' critical-thinking skills and their patience for solving problems independently.
- Fairphone 6 Released
Dutch company Fairphone has launched its latest modular phone, the Fairphone 6. Hardware specs: a 6.31-inch 120 Hz LTPO OLED display, a Snapdragon 7s Gen 3 chip, 12 user-replaceable parts, and a price of 600 euros. The number of replaceable parts matches the previous generation and includes the display, battery, cameras, USB-C port, speakers, and even the mainboard. Because of the replaceability requirement, its ingress protection rating is only IP55; it earned the EU's grade A rating for repairability and durability. Beyond the modular design, the Fairphone 6 gets up to eight years of software support, through 2033.
- Webb Telescope May Have Directly Imaged an Exoplanet for the First Time
Astronomers using the James Webb Space Telescope have captured a planet with a mass similar to Saturn's orbiting the young host star TWA 7. If confirmed, it would be Webb's first direct image of a planet, and the lightest planet discovered with the technique to date. The team used Webb's Mid-Infrared Instrument (MIRI) and its coronagraph to detect a faint infrared source in the debris disk around TWA 7, at roughly 50 times the Earth-Sun distance from the star. Preliminary analysis suggests the object, dubbed TWA 7b, may be a young, cold planet of about 0.3 Jupiter masses (about 100 Earth masses) with a temperature near 320 K (about 47°C). Its position lines up with a gap in the debris disk, hinting at a dynamical interaction between the planet and its surroundings. Debris disks full of dust and rocky material are found around both young and old stars, but young stars are brighter, making their disks easier to detect. TWA 7, also known as CE Antliae, is a young M-type star about 6.4 million years old, roughly 111 light-years away in the TW Hydrae association.
- ispace Lunar Lander Failure Traced to Altimeter
Japanese space company ispace released its analysis on the 24th, concluding that its lunar lander RESILIENCE failed to touch down on the Moon because the altimeter used during descent malfunctioned, delaying deceleration. The company will work with outside experts on improvements; for the larger lander now in development, the next attempt is targeted for 2027. The lander began its descent shortly after 3 a.m. on June 6. The laser-ranging altimeter was supposed to come online before the craft reached an altitude of 3 km, but measurements actually began at around 1 km. The hard braking came too late, and flight data cut off after the craft descended to an altitude of 192 m at roughly 150 km/h. The coordination problem between the altitude sensor and the flight system that caused the 2023 failure was not present this time.
- Disposable E-Cigarettes More Toxic Than Conventional Cigarettes
A study found that some disposable e-cigarettes and pods release more toxic metals than older refillable e-cigarettes, and even more than conventional cigarettes. According to the study, the highest lead emissions from a single day's use of a disposable e-cigarette were equivalent to smoking nearly 20 packs of conventional cigarettes. The researchers stress that although most disposable e-cigarettes are illegal products in the US, they remain widely available, and their main users, teenagers and young adults, are precisely the population most sensitive to lead exposure. Inhaling certain metals significantly raises the risk of cancer, respiratory disease, and neurological damage. The study analyzed seven disposable e-cigarettes from three major brands. Using instruments to simulate 500 to 1,500 puffs, the researchers measured metal concentrations in the vapor and found that levels of chromium, nickel, and antimony rose as the puff count increased. They also disassembled the devices and found that some toxic metals came from the e-liquid itself, while much leached from heating elements and alloy components: lead and nickel released by leaded-brass parts, nickel from heating coils, and high concentrations of antimony in the original e-liquid were all sources of contamination.
- Lyon, France, Drops Microsoft Software in Pursuit of Digital Sovereignty
The city of Lyon, France, will gradually replace Microsoft software with open-source alternatives, adopting the OnlyOffice office suite, the Linux operating system, and the PostgreSQL database. Lyon, France's third-largest city, says the move is intended to reduce dependence on American software and achieve digital sovereignty. It joins a movement launched by other European cities to reduce reliance on Microsoft software; Denmark's two largest cities, Copenhagen and Aarhus, announced in early June that they were dropping Windows and MS Office.