OrangeBot.AI Digest — 2025-07-18

71 headlines across 8 sources, aggregated for the day.

Hacker News (15)

  1. AI capex is so big that it's affecting economic statistics (paulkedrosky.com)
  2. Third patient dies from acute liver failure caused by a Sarepta gene therapy (www.biocentury.com)
  3. How I keep up with AI progress (blog.nilenso.com)
  4. LibreOffice slams Microsoft for locking in Office users with complex file formats (www.neowin.net)
  5. DuckDuckGo now lets you hide AI-generated images in search results (techcrunch.com)
  6. Dear valued user, You have reached the error page for the error page (imgur.com)
  7. ICE is getting unprecedented access to Medicaid data (www.wired.com)
  8. I'm Peter Roberts, immigration attorney who does work for YC and startups. AMA
  9. NYPD bypassed facial recognition ban to ID pro-Palestinian student protester (www.thecity.nyc)
  10. Ask HN: Any active COBOL devs here? What are you working on?
  11. lsr: ls with io_uring (rockorager.dev)
  12. CP/M creator Gary Kildall's memoirs released as free download (spectrum.ieee.org)
  13. Hundred Rabbits – Low-tech living while sailing the world (100r.co)
  14. Psilocybin decreases depression and anxiety in cancer patients (2016) (pmc.ncbi.nlm.nih.gov)
  15. Apple bans entire dev account, no reason given (twitter.com)

GitHub Trending (15)

  1. microsoft / markitdown

    Python tool for converting files and office documents to Markdown.

  2. langchain-ai / open_deep_research
  3. facebookresearch / segment-anything

    The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

  4. hyprwm / Hyprland

    Hyprland is an independent, highly customizable, dynamic tiling Wayland compositor that doesn't sacrifice on its looks.

  5. gitleaks / gitleaks

    Find secrets with Gitleaks 🔑

  6. soxoj / maigret

    🕵️‍♂️ Collect a dossier on a person by username from thousands of sites

  7. arc53 / DocsGPT

    DocsGPT is an open-source genAI tool that helps users get reliable answers from knowledge sources while avoiding hallucinations. It enables private and reliable information retrieval, with tooling and agentic system capabilities built in.

  8. Lightricks / LTX-Video

    Official repository for LTX-Video

  9. influxdata / telegraf

    Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
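
    Telegraf's plugin model is configured in TOML; the fragment below uses the standard `[agent]`, `[[inputs.cpu]]`, and `[[outputs.file]]` sections, but the interval and values are illustrative, not a recommended setup.

```toml
# Minimal sketch: collect CPU metrics every 10s and write them to stdout.
[agent]
  interval = "10s"

[[inputs.cpu]]
  percpu = false   # one aggregate measurement instead of per-core

[[outputs.file]]
  files = ["stdout"]
```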

  10. PromtEngineer / localGPT

    Chat with your documents on your local device using GPT models. No data leaves your device; 100% private.

  11. n8n-io / n8n

    Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

  12. remoteintech / remote-jobs

    A list of semi to fully remote-friendly companies (jobs) in tech.

  13. nicklockwood / SwiftFormat

    A command-line tool and Xcode Extension for formatting Swift code

  14. Kyome22 / RunCat365

    A cute running cat animation on your Windows taskbar.

  15. pydantic / pydantic-ai

    Agent Framework / shim to use Pydantic with LLMs

Product Hunt (11)

  1. ChatGPT agent

    Bridging research and action

  2. Bestever

    Analyze ad data; automate winning creatives

  3. NeetoCal V2

    Schedule and manage meetings effortlessly

  4. Suno v4.5+

    New ways to create

  5. Deep Research and Imagen in Le Chat

    Le Chat dives deep (and fun!)

  6. Higgsfield UGC Builder

    Total scene control in a single interface

  7. Growdoro

    Gamified focus timer where you grow an infinite garden

  8. Trumpet

    Alerts when Trump moves the market

  9. mention.click

    Ideate, validate and find leads through Reddit conversations

  10. Banter Messenger

    If you care, you banter

  11. LLMHub

    Real-time collaborative search with AI agents for teams

Hugging Face (15)

  1. A Survey of Context Engineering for Large Language Models

    The performance of Large Language Models (LLMs) is fundamentally determined by the contextual information provided during inference. This survey introduces Context Engineering, a formal discipline that transcends simple prompt design to encompass the systematic optimization of information payloads for LLMs. We present a comprehensive taxonomy decomposing Context Engineering into its foundational components and the sophisticated implementations that integrate them into intelligent systems. We first examine the foundational components: context retrieval and generation, context processing and context management. We then explore how these components are architecturally integrated to create sophisticated system implementations: retrieval-augmented generation (RAG), memory systems and tool-integrated reasoning, and multi-agent systems. Through this systematic analysis of over 1300 research papers, our survey not only establishes a technical roadmap for the field but also reveals a critical research gap: a fundamental asymmetry exists between model capabilities. While current models, augmented by advanced context engineering, demonstrate remarkable proficiency in understanding complex contexts, they exhibit pronounced limitations in generating equally sophisticated, long-form outputs. Addressing this gap is a defining priority for future research. Ultimately, this survey provides a unified framework for both researchers and engineers advancing context-aware AI.
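
    The retrieval-augmented generation pattern the survey catalogues can be sketched in miniature: retrieve context, pack it into the information payload, then generate. The corpus, query, and keyword-overlap scoring below are invented for illustration and are not from the paper.

```python
# Toy RAG retrieval: score documents by word overlap with the query
# and pack the best matches into the prompt handed to the model.
def tokens(text: str) -> set[str]:
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank documents by how many query words they share.
    return sorted(corpus, key=lambda d: len(tokens(query) & tokens(d)), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # Context engineering in miniature: the payload is assembled
    # before inference rather than left to a bare prompt.
    return "Context:\n" + "\n".join(context) + "\nQuestion: " + query

corpus = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
    "Mount Fuji is in Japan.",
]
ctx = retrieve("What is the capital of France?", corpus)
prompt = build_prompt("What is the capital of France?", ctx)
```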

  2. VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

    Recent advancements in vision-language models (VLMs) have improved performance by increasing the number of visual tokens, which are often significantly longer than text tokens. However, we observe that most real-world scenarios do not require such an extensive number of visual tokens. While the performance drops significantly in a small subset of OCR-related tasks, models still perform accurately in most other general VQA tasks with only 1/4 resolution. Therefore, we propose to dynamically process distinct samples with different resolutions, and present a new paradigm for visual token compression, namely, VisionThink. It starts with a downsampled image and smartly decides whether it is sufficient for problem solving. Otherwise, the model could output a special token to request the higher-resolution image. Compared to existing Efficient VLM methods that compress tokens using fixed pruning ratios or thresholds, VisionThink autonomously decides whether to compress tokens case by case. As a result, it demonstrates strong fine-grained visual understanding capability on OCR-related tasks, and meanwhile saves substantial visual tokens on simpler tasks. We adopt reinforcement learning and propose the LLM-as-Judge strategy to successfully apply RL to general VQA tasks. Moreover, we carefully design a reward function and penalty mechanism to achieve a stable and reasonable image resize call ratio. Extensive experiments demonstrate the superiority, efficiency, and effectiveness of our method. Our code is available at https://github.com/dvlab-research/VisionThink.
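
    The adaptive-resolution loop described above can be mocked with a stub model: answer at low resolution, and re-run at high resolution only when the model emits a special request token. The token name, stub logic, and cost accounting are hypothetical, not the paper's implementation.

```python
# Sketch of VisionThink-style dynamic resolution: most inputs are
# answered from a downsampled image; the model escalates only when
# low resolution is insufficient.
REQUEST_HIGH_RES = "<hi_res>"  # hypothetical special token

def stub_vlm(image: str, resolution: str) -> str:
    # Stand-in for the VLM: pretend fine print is unreadable at low res.
    if "fine_print" in image and resolution == "low":
        return REQUEST_HIGH_RES
    return f"answer({image}@{resolution})"

def answer(image: str) -> tuple[str, int]:
    # Returns the answer plus the number of model calls spent.
    out = stub_vlm(image, "low")
    if out == REQUEST_HIGH_RES:
        return stub_vlm(image, "high"), 2
    return out, 1

easy_answer, easy_cost = answer("bar_chart")
hard_answer, hard_cost = answer("fine_print_receipt")
```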

  3. π^3: Scalable Permutation-Equivariant Visual Geometry Learning

    We introduce pi^3, a feed-forward neural network that offers a novel approach to visual geometry reconstruction, breaking the reliance on a conventional fixed reference view. Previous methods often anchor their reconstructions to a designated viewpoint, an inductive bias that can lead to instability and failures if the reference is suboptimal. In contrast, pi^3 employs a fully permutation-equivariant architecture to predict affine-invariant camera poses and scale-invariant local point maps without any reference frames. This design makes our model inherently robust to input ordering and highly scalable. These advantages enable our simple and bias-free approach to achieve state-of-the-art performance on a wide range of tasks, including camera pose estimation, monocular/video depth estimation, and dense point map reconstruction. Code and models are publicly available.
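
    Permutation equivariance, the property the architecture is built around, is easy to demonstrate in miniature: a function whose per-element outputs depend only on the element itself and on order-independent set statistics commutes with any reordering of its inputs. A toy check (not the paper's network):

```python
# f subtracts the set mean from each element; the mean is
# order-independent, so permuting inputs permutes outputs identically.
def f(xs: list[float]) -> list[float]:
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]

xs = [3.0, 1.0, 2.0]
perm = [2, 0, 1]  # one fixed reordering

permute_then_apply = f([xs[i] for i in perm])
apply_then_permute = [f(xs)[i] for i in perm]
# Equality of these two lists is exactly the equivariance property.
```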

  4. The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

    Length generalization, the ability to solve problems with longer sequences than those observed during training, poses a core challenge for Transformer-based large language models (LLMs). Although existing studies have predominantly focused on data-driven approaches for arithmetic operations and symbolic manipulation tasks, these approaches tend to be task-specific with limited overall performance. To pursue a more general solution, this paper focuses on the broader class of reasoning problems that are computable, i.e., problems that algorithms can solve and that can therefore be solved by a Turing Machine. From this perspective, this paper proposes Turing MAchine Imitation Learning (TAIL) to improve the length generalization ability of LLMs. TAIL synthesizes chain-of-thought (CoT) data that imitates the execution process of a Turing Machine by computer programs: it linearly expands the reasoning steps into atomic states to alleviate shortcut learning, and adds an explicit memory-fetch mechanism to reduce the difficulty of dynamic and long-range data access in elementary operations. To validate the reliability and universality of TAIL, we construct a challenging synthetic dataset covering 8 classes of algorithms and 18 tasks. Without bells and whistles, TAIL significantly improves both the length generalization ability and the performance of Qwen2.5-7B on various tasks using only synthetic data, surpassing previous methods and DeepSeek-R1. The experimental results reveal that the key concepts of the Turing Machine, rather than particular thinking styles, are indispensable to TAIL's length generalization, and that the model exhibits read-and-write behaviors consistent with the properties of the Turing Machine in its attention layers. This work provides a promising direction for future research on learning LLM reasoning from synthetic data.
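
    TAIL's central move, unrolling a computation into a linear trace of atomic machine states, can be illustrated with a deliberately tiny machine. The unary-increment machine below is an invented example, not the paper's data pipeline.

```python
# A one-state Turing-machine-style program that appends a 1 to a unary
# number, with execution expanded into an explicit (state, head, tape)
# trace, mimicking TAIL's chain of atomic states.
def run(tape: list[str]) -> tuple[list[str], list[str]]:
    state, head, trace = "scan", 0, []
    while state != "halt":
        trace.append(f"{state} head={head} tape={''.join(tape)}")
        if head < len(tape) and tape[head] == "1":
            head += 1                # move right across the 1s
        else:
            tape.insert(head, "1")   # write a 1 on the first blank cell
            state = "halt"
    trace.append(f"{state} head={head} tape={''.join(tape)}")
    return tape, trace

tape, trace = run(list("111"))  # unary 3 -> unary 4
```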

  5. AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning

    Controllable captioning is essential for precise multimodal alignment and instruction following, yet existing models often lack fine-grained control and reliable evaluation protocols. To address this gap, we present the AnyCap Project, an integrated solution spanning model, dataset, and evaluation. We introduce AnyCapModel (ACM), a lightweight plug-and-play framework that enhances the controllability of existing foundation models for omni-modal captioning without retraining the base model. ACM reuses the original captions from base models while incorporating user instructions and modality features to generate improved captions. To remedy the data scarcity in controllable multimodal captioning, we build AnyCapDataset (ACD), covering three modalities, 28 user-instruction types, and 300k high-quality data entries. We further propose AnyCapEval, a new benchmark that provides more reliable evaluation metrics for controllable captioning by decoupling content accuracy and stylistic fidelity. ACM markedly improves caption quality across a diverse set of base models on AnyCapEval. Notably, ACM-8B raises GPT-4o's content scores by 45% and style scores by 12%, and it also achieves substantial gains on widely used benchmarks such as MIA-Bench and VidCapBench.

  6. Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

    This paper addresses the challenge of high-fidelity view synthesis of humans with sparse-view videos as input. Previous methods solve the issue of insufficient observation by leveraging 4D diffusion models to generate videos at novel viewpoints. However, the generated videos from these models often lack spatio-temporal consistency, thus degrading view synthesis quality. In this paper, we propose a novel sliding iterative denoising process to enhance the spatio-temporal consistency of the 4D diffusion model. Specifically, we define a latent grid in which each latent encodes the image, camera pose, and human pose for a certain viewpoint and timestamp, then alternately denoise the latent grid along spatial and temporal dimensions with a sliding window, and finally decode the videos at target viewpoints from the corresponding denoised latents. Through the iterative sliding, information flows sufficiently across the latent grid, allowing the diffusion model to obtain a large receptive field and thus enhance the 4D consistency of the output, while keeping GPU memory consumption affordable. The experiments on the DNA-Rendering and ActorsHQ datasets demonstrate that our method is able to synthesize high-quality and consistent novel-view videos and significantly outperforms the existing approaches. See our project page for interactive demos and video results: https://diffuman4d.github.io/ .

  7. RiemannLoRA: A Unified Riemannian Framework for Ambiguity-Free LoRA Optimization

    Low-Rank Adaptation (LoRA) has become a widely adopted standard for parameter-efficient fine-tuning of large language models (LLMs), significantly reducing memory and computational demands. However, challenges remain, including finding optimal initialization strategies or mitigating overparametrization in low-rank matrix factorization. In this work, we propose a novel approach that addresses both of the challenges simultaneously within a unified framework. Our method treats a set of fixed-rank LoRA matrices as a smooth manifold. Considering adapters as elements on this manifold removes overparametrization, while determining the direction of the fastest loss decrease along the manifold provides initialization. Special care is taken to obtain numerically stable and computationally efficient implementation of our method, using best practices from numerical linear algebra and Riemannian optimization. Experimental results on LLM and diffusion model architectures demonstrate that RiemannLoRA consistently improves both convergence speed and final performance over standard LoRA and its state-of-the-art modifications.
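
    For reference, the LoRA parameterization that RiemannLoRA treats as a manifold replaces a dense weight update with a fixed-rank factorization; in the standard notation (not the paper's Riemannian formulation):

```latex
W = W_0 + \Delta W = W_0 + BA,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k).
```

    The overparametrization the paper removes comes from the invariance BA = (BS)(S^{-1}A) for any invertible S: many factor pairs represent the same adapter.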

  8. MindJourney: Test-Time Scaling with World Models for Spatial Reasoning

    Spatial reasoning in 3D space is central to human cognition and indispensable for embodied tasks such as navigation and manipulation. However, state-of-the-art vision-language models (VLMs) frequently struggle with tasks as simple as anticipating how a scene will look after an egocentric motion: they perceive 2D images but lack an internal model of 3D dynamics. We therefore propose MindJourney, a test-time scaling framework that grants a VLM this missing capability by coupling it to a controllable world model based on video diffusion. The VLM iteratively sketches a concise camera trajectory, while the world model synthesizes the corresponding view at each step. The VLM then reasons over this multi-view evidence gathered during the interactive exploration. Without any fine-tuning, MindJourney achieves an average performance boost of over 8% on the representative spatial reasoning benchmark SAT, showing that pairing VLMs with world models for test-time scaling offers a simple, plug-and-play route to robust 3D reasoning. Our method also improves upon test-time inference for VLMs trained through reinforcement learning, demonstrating the potential of using world models for test-time scaling.

  9. FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers

    Producing expressive facial animations from static images is a challenging task. Prior methods relying on explicit geometric priors (e.g., facial landmarks or 3DMM) often suffer from artifacts in cross reenactment and struggle to capture subtle emotions. Furthermore, existing approaches lack support for multi-character animation, as driving features from different individuals frequently interfere with one another, complicating the task. To address these challenges, we propose FantasyPortrait, a diffusion transformer based framework capable of generating high-fidelity and emotion-rich animations for both single- and multi-character scenarios. Our method introduces an expression-augmented learning strategy that utilizes implicit representations to capture identity-agnostic facial dynamics, enhancing the model's ability to render fine-grained emotions. For multi-character control, we design a masked cross-attention mechanism that ensures independent yet coordinated expression generation, effectively preventing feature interference. To advance research in this area, we propose the Multi-Expr dataset and ExprBench, which are specifically designed datasets and benchmarks for training and evaluating multi-character portrait animations. Extensive experiments demonstrate that FantasyPortrait significantly outperforms state-of-the-art methods in both quantitative metrics and qualitative evaluations, excelling particularly in challenging cross reenactment and multi-character contexts. Our project page is https://fantasy-amap.github.io/fantasy-portrait/.

  10. Voxtral

    We present Voxtral Mini and Voxtral Small, two multimodal audio chat models. Voxtral is trained to comprehend both spoken audio and text documents, achieving state-of-the-art performance across a diverse range of audio benchmarks, while preserving strong text capabilities. Voxtral Small outperforms a number of closed-source models, while being small enough to run locally. A 32K context window enables the model to handle audio files up to 40 minutes in duration and long multi-turn conversations. We also contribute three benchmarks for evaluating speech understanding models on knowledge and trivia. Both Voxtral models are released under Apache 2.0 license.

  11. AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research

    We introduce AbGen, the first benchmark designed to evaluate the capabilities of LLMs in designing ablation studies for scientific research. AbGen consists of 1,500 expert-annotated examples derived from 807 NLP papers. In this benchmark, LLMs are tasked with generating detailed ablation study designs for a specified module or process based on the given research context. Our evaluation of leading LLMs, such as DeepSeek-R1-0528 and o4-mini, highlights a significant performance gap between these models and human experts in terms of the importance, faithfulness, and soundness of the ablation study designs. Moreover, we demonstrate that current automated evaluation methods are not reliable for our task, as they show a significant discrepancy when compared to human assessment. To better investigate this, we develop AbGen-Eval, a meta-evaluation benchmark designed to assess the reliability of commonly used automated evaluation systems in measuring LLM performance on our task. We investigate various LLM-as-Judge systems on AbGen-Eval, providing insights for future research on developing more effective and reliable LLM-based evaluation systems for complex scientific tasks.

  12. Teach Old SAEs New Domain Tricks with Boosting

    Sparse Autoencoders have emerged as powerful tools for interpreting the internal representations of Large Language Models, yet they often fail to capture domain-specific features not prevalent in their training corpora. This paper introduces a residual learning approach that addresses this feature blindness without requiring complete retraining. We propose training a secondary SAE specifically to model the reconstruction error of a pretrained SAE on domain-specific texts, effectively capturing features missed by the primary model. By summing the outputs of both models during inference, we demonstrate significant improvements in both LLM cross-entropy and explained variance metrics across multiple specialized domains. Our experiments show that this method efficiently incorporates new domain knowledge into existing SAEs while maintaining their performance on general tasks. This approach enables researchers to selectively enhance SAE interpretability for specific domains of interest, opening new possibilities for targeted mechanistic interpretability of LLMs.
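
    The residual-boosting idea generalizes beyond SAEs and can be shown with deliberately crude stand-in "models": a primary model trained on general data, plus a secondary model fit only to the primary's reconstruction errors on domain data, with outputs summed at inference. All data and the prototype-based "training" below are invented for illustration.

```python
# Residual boosting sketch: the secondary model learns what the primary
# misses on a new domain; summing their outputs cuts the domain error.
def mean_vec(vs: list[list[float]]) -> list[float]:
    return [sum(v[i] for v in vs) / len(vs) for i in range(len(vs[0]))]

def sq_err(x: list[float], recon: list[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(x, recon))

# "Primary model": reconstructs every input as the general-data prototype.
general_data = [[1.0, 0.0], [1.0, 0.2], [1.0, -0.2]]
primary = mean_vec(general_data)                  # [1.0, 0.0]

# Domain data the primary reconstructs poorly.
domain_data = [[1.0, 2.0], [1.0, 2.2], [1.0, 1.8]]

# "Secondary model": trained only on the primary's residuals.
residuals = [[a - b for a, b in zip(x, primary)] for x in domain_data]
secondary = mean_vec(residuals)                   # ~[0.0, 2.0]

x = [1.0, 2.0]
err_primary = sq_err(x, primary)
err_boosted = sq_err(x, [a + b for a, b in zip(primary, secondary)])
```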

  13. TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation

    Video Frame Interpolation (VFI) aims to predict the intermediate frame I_n (we use n to denote time in videos to avoid notation overload with the timestep t in diffusion models) based on two consecutive neighboring frames I_0 and I_1. Recent approaches apply diffusion models (both image-based and video-based) to this task and achieve strong performance. However, image-based diffusion models are unable to extract temporal information and are relatively inefficient compared to non-diffusion methods. Video-based diffusion models can extract temporal information, but they are too large in terms of training scale, model size, and inference time. To mitigate the above issues, we propose Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation (TLB-VFI), an efficient video-based diffusion model. By extracting rich temporal information from video inputs through our proposed 3D-wavelet gating and temporal-aware autoencoder, our method achieves a 20% improvement in FID on the most challenging datasets over the recent SOTA of image-based diffusion models. Meanwhile, thanks to this rich temporal information, our method achieves strong performance with 3× fewer parameters. Such a parameter reduction results in a 2.3× speedup. By incorporating optical flow guidance, our method requires 9,000× less training data and has over 20× fewer parameters than video-based diffusion models. Code and results are available at our project page: https://zonglinl.github.io/tlbvfi_page.

  14. FLEXITOKENS: Flexible Tokenization for Evolving Language Models

    Language models (LMs) are hard to adapt to new data distributions through simple finetuning. This is due to the rigidity of their subword tokenizers, which typically remain unchanged during adaptation. This inflexibility often leads to inefficient tokenization, causing over-fragmentation of out-of-distribution domains, unseen languages, or scripts. In this work, we develop byte-level LMs with learnable tokenizers to make tokenization adaptive. Our models include a submodule that learns to predict boundaries within the input byte sequence, encoding it into variable-length segments. Existing tokenizer-free methods train this boundary predictor using an auxiliary loss that enforces a fixed compression rate across the training corpus, introducing a new kind of rigidity. We propose FLEXITOKENS, a simplified training objective that enables significantly greater flexibility during adaptation. Evaluating across multiple multilingual benchmarks, morphologically diverse tasks, and domains, we demonstrate that FLEXITOKENS consistently reduces token over-fragmentation and achieves up to 10% improvements in downstream task performance compared to subword and other gradient-based tokenizers. Code and data for our experiments will be released at https://github.com/owos/flexitokens
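
    The boundary-predictor mechanics can be sketched with a stub: a scorer rates each byte position, and positions above a threshold close a segment. The whitespace-based scorer and the threshold are invented; only the variable-length segmentation scheme follows the abstract.

```python
# Boundary-based byte segmentation: in FLEXITOKENS the boundary
# predictor is learned; here a whitespace stub plays its role.
def boundary_scores(data: bytes) -> list[float]:
    # Hypothetical stand-in for the learned boundary predictor.
    return [1.0 if b == ord(" ") else 0.0 for b in data]

def segment(data: bytes, threshold: float = 0.5) -> list[bytes]:
    segments, start = [], 0
    for i, score in enumerate(boundary_scores(data)):
        if score > threshold:          # boundary: close the segment here
            segments.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        segments.append(data[start:])  # trailing segment
    return segments

tokens = segment(b"new domain text")
```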

  15. Automating Steering for Safe Multimodal Large Language Models

    Recent progress in Multimodal Large Language Models (MLLMs) has unlocked powerful cross-modal reasoning abilities, but also raised new safety concerns, particularly when faced with adversarial multimodal inputs. To improve the safety of MLLMs during inference, we introduce a modular and adaptive inference-time intervention technology, AutoSteer, without requiring any fine-tuning of the underlying model. AutoSteer incorporates three core components: (1) a novel Safety Awareness Score (SAS) that automatically identifies the most safety-relevant distinctions among the model's internal layers; (2) an adaptive safety prober trained to estimate the likelihood of toxic outputs from intermediate representations; and (3) a lightweight Refusal Head that selectively intervenes to modulate generation when safety risks are detected. Experiments on LLaVA-OV and Chameleon across diverse safety-critical benchmarks demonstrate that AutoSteer significantly reduces the Attack Success Rate (ASR) for textual, visual, and cross-modal threats, while maintaining general abilities. These findings position AutoSteer as a practical, interpretable, and effective framework for safer deployment of multimodal AI systems.
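
    AutoSteer's three-stage flow (pick the most safety-relevant layer, probe its activations, intervene when risk is high) can be mocked end to end with stubs; the scoring rules, threshold, and outputs below are invented, and only the pipeline shape follows the abstract.

```python
# Stub pipeline mirroring AutoSteer's structure: SAS-style layer
# selection, a safety prober, and a refusal-style intervention.
def select_layer(layer_acts: dict[int, list[float]]) -> int:
    # Hypothetical SAS: pick the layer whose activations spread most.
    return max(layer_acts, key=lambda l: max(layer_acts[l]) - min(layer_acts[l]))

def prober(activation: list[float]) -> float:
    # Stand-in safety prober: mean activation as a toxicity score.
    return sum(activation) / len(activation)

def generate(layer_acts: dict[int, list[float]], threshold: float = 0.5) -> str:
    risk = prober(layer_acts[select_layer(layer_acts)])
    return "[refused]" if risk > threshold else "[normal generation]"

benign = generate({0: [0.1, 0.2], 1: [0.0, 0.4]})
risky = generate({0: [0.1, 0.2], 1: [0.2, 0.9]})
```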

Solidot (15)

  1. YouTuber faces sentencing for showcasing emulation handhelds

    The YouTube channel Once Were Nerd centers on gaming topics, including Android handhelds such as the Powkiddy and TrimUI, which ship with emulators for classic systems like the SNES, Nintendo 64, PlayStation Portable, and GameCube. With hardware prices falling, players can buy the equivalent of an upgraded PSP or GBA for under $100, which has made the devices hugely popular. Emulators themselves are legal, but the ROM files they run often are not. Once Were Nerd's content drew the attention of Italy's copyright enforcers: in April, the Guardia di Finanza raided his home with a search warrant. His videos on Anbernic handhelds are accused of promoting pirated material, as the devices come preloaded with ROMs of Nintendo and Sony games, and authorities seized more than 30 of his handhelds. The devices are mostly made by Chinese companies and thus sit beyond the reach of Western copyright law. Under Italian copyright law, Once Were Nerd faces up to three years in prison for promoting copyrighted material.

  2. Smoking-induced epigenetic changes resemble those of aging

    Researchers at the University of Porto in Portugal, working with the Barcelona Supercomputing Center in Spain, analyzed samples from 46 human tissue types with multiple techniques, examining gene expression, alternative splicing of mRNA precursors, DNA methylation, and histological changes. The study shows that smoking triggers inflammation in tissues throughout the body, and that the epigenetic changes it induces resemble known aging mechanisms, such as hypermethylation at certain DNA sites; the accelerated aging observed in smokers is linked to this hypermethylation. The changes are not confined to the lungs but are also observable in the pancreas, thyroid, esophagus, and certain brain regions. The researchers say the study "identifies tobacco-induced molecular changes in human tissues, some of which are irreversible after quitting smoking, particularly those overlapping with aging mechanisms."

  3. How BYD is overtaking Tesla

    Tesla still leads Western EV markets, but in China its lead is slim or nonexistent. Chinese EV makers once held an edge over Tesla mainly on price, yet in recent months BYD has approached or surpassed Tesla in autonomous driving and battery technology, while Tesla CEO Elon Musk was deeply involved in the Trump administration before an acrimonious split and his announcement that he would leave his government role to refocus on Tesla. BYD sold 4.27 million vehicles last year, nearly ten times its 2020 total, including 1.76 million battery-electric vehicles; Tesla's sales over the past four years grew from under 500,000 to 1.79 million. As BYD speeds up its overseas expansion, it is on track to overtake Tesla as the largest EV maker in 2025. Gigacasting was a technique Tesla pioneered in 2021, but by 2023, when Xpeng launched the G6 SUV, Xpeng's gigacasting had surpassed Tesla's. Tesla's sales in China fell 5% in the first half of this year, a decline analysts consider unrelated to Musk's political activities. One of Tesla's major challenges in China is that Chinese regulation bars it from transferring the data it collects to the US, while US authorities bar it from training in China; as a result, the Chinese version of its FSD driver-assistance system performs worse.

  4. 360 million Bharti Airtel users in India to get a year of advanced AI models for free

    AI company Perplexity has partnered with Indian telecom giant Bharti Airtel to offer its 360 million users a full year of the Pro service for free, the largest distribution deal of its kind worldwide. Perplexity Pro costs $200 per year and includes advanced models such as GPT-4.1, Claude Sonnet and Opus 4, and even xAI's latest Grok 4. Measured by mobile usage, India has become ChatGPT's largest market.

  5. Study reveals how global elites hide wealth offshore

    Global elites such as the ultra-rich and oligarchs are known for using the offshore financial system to hide wealth. Researchers at Dartmouth used databases of leaked offshore financial records, covering nearly 3,000 wealthy individuals worldwide, to map their strategies. They found three distinct patterns: elites from authoritarian countries who face retaliation risk are more likely to spread wealth across multiple offshore financial centers; elites from countries where property is easily confiscated, owing to weak civil rights or limited checks on government, are more likely to adopt covert strategies and prefer to stay anonymous; and elites from corrupt countries where asset seizure is a worry tend toward hybrid strategies, dispersing assets while staying hidden. Lead author and assistant professor Herbert Chang cites Singapore as an example: its government is not corrupt, but civic participation is low, and its elites still use offshore financial centers.

  6. SpaceX Falcon 9 launches 24 Amazon broadband satellites

    At 2:30 a.m. US time on Wednesday (06:30 UTC), SpaceX launched 24 Amazon broadband satellites on a Falcon 9 from Cape Canaveral Space Force Station in Florida. Amazon's Project Kuiper broadband constellation is designed to compete with SpaceX's Starlink, but while Starlink already has thousands of satellites in orbit, Project Kuiper so far has only 78; the full network will comprise 3,232. Kuiper satellites are expensive to build, with the project expected to cost $16.5-20 billion in total, $10 billion of it on launches alone. Amazon has booked contracts for more than 80 launches and has had little choice but to turn to its competitor SpaceX.

  7. The pollution challenge of agricultural plastics

    China is the world's largest user of agricultural plastics: in 2022 it used 1.34 million tonnes of agricultural film covering more than 17.5 million hectares. Plastic mulch regulates soil temperature and moisture, suppresses weeds and pests, cuts fertilizer and pesticide use, and can even raise microbial diversity; in arid regions such as Gansu, mulching can boost yields by more than 180%, equivalent to adding nearly 4 million hectares of farmland. But after long-term use and improper disposal, residual plastic damages soil structure, releases chemical additives, and can even pass risk along the soil-crop-human chain. Reducing film use and promoting biodegradable mulch face many hurdles: biodegradable films degrade unevenly depending on soil conditions, cost more, and see low acceptance among farmers, while recycling is hard to sustain on government subsidies because the film shatters easily and is costly to process. Researchers recommend a unified national management framework with strategies differentiated by region.

  8. Google security researchers report backdoors planted in SonicWall devices

    Researchers in Google's Threat Intelligence Group report that an unknown hacking group is planting backdoors in end-of-life SonicWall appliances; the researchers have named the group UNC6148. End of life means SonicWall no longer provides security updates. The attackers used leaked local administrator credentials, though it remains unclear how UNC6148 obtained them or which vulnerabilities were exploited. After breaking in, UNC6148 installed a backdoor called Overstep that can selectively delete logs, hampering researchers' forensic investigation.

  9. Invasive golden oyster mushrooms are reshaping North American fungal communities

    North American forests face a golden invasion: golden oyster mushrooms (GOM), native to East Asia, escaped into the wild through the horticultural trade and have shown remarkable powers of expansion. This white-rot fungus specializes in decomposing hardwood; since its first North American sighting in 2010, it has colonized woodlands in 25 states and one Canadian province in just eight years. Researchers found that on trees colonized by golden oysters, the structure of native fungal communities changes markedly, with species richness falling by an average of 30%. Like a colonizer of the fungal world, the golden oyster crowds out native species through resource competition and may reshape the entire decomposer network. The researchers also stress that the mushroom has positive value as food, which management approaches should take into account; one proposed remedy is breeding cultivars that do not spread.

  10. Webb may have found a supermassive black hole formed by a collapsing interstellar gas cloud

    Astronomers using the Webb telescope have found a rare object dubbed the Infinity Galaxy. The system formed from the collision of two disk galaxies, and its two compact cores, each wrapped in a ring-like structure, resemble the mathematical symbol ∞, hence the name. The Infinity Galaxy lies about 8.3 billion light-years away, in the early-to-middle era of the universe. Within it, astronomers may have caught, for the first time, a supermassive black hole in the act of forming, and they argue it arose not from a collapsing star but directly from the collapse of a gas cloud. The finding supports the "heavy seeds" theory, helping explain how enormously massive black holes appeared less than a billion years after the universe formed. The traditional "light seeds" theory holds that black holes begin as small remnants of collapsed stellar cores, tens to thousands of solar masses, and only grow supermassive through long sequences of mergers; that growth takes too long to account for the giant black holes of the early universe. The "heavy seeds" theory proposes instead that, under special conditions, a massive gas cloud can collapse directly into a black hole, skipping the merger stage, but the mechanism has lacked observational evidence. The Infinity Galaxy may supply an instance of those extreme conditions: when two galaxies collide, the resulting gas shocks and compression may be enough to trigger such a collapse. Events like this are exceedingly rare in today's universe but may have been common early on, helping explain the origin of the early massive black holes Webb has observed.

  11. Ukrainian hackers destroy a Russian drone maker's IT infrastructure

    Ukrainian hackers, working with the country's intelligence services, destroyed the IT infrastructure of Gaskar Integration, one of Russia's largest drone manufacturers, paralyzing its production. In the attack the hackers stole 47 TB of data and then wiped everything from the servers, backups included. The stolen data was handed to Ukraine's Ministry of Defence.

  12. Mozilla asks users where Firefox should go next

    Firefox recently shipped several long-requested features, such as tab groups and vertical tabs, moves Mozilla made in response to community demand. With those features out, what should Firefox do next? Mozilla developers are soliciting user opinions through the official forum and preparing a community AMA (ask-me-anything) session.

  13. Valve removes some adult games under pressure from payment processors

    Valve has updated its content rules for publishing on Steam, adding a clause that in effect gives payment companies a say over what can ship on the platform: "Content that may violate the rules and standards set forth by Steam’s payment processors and related card networks and banks, or internet network providers. In particular, certain kinds of adult only content." According to SteamDB's tracking of the Steam database, some adult games have been delisted, specifically ones with incest themes. Valve/Steam is not the first platform pressured over adult content by payment companies such as PayPal; Patreon has faced similar problems.

  14. Newly discovered gut bacterium boosts the effect of cancer immunotherapy drugs

    A team at Japan's National Cancer Center has identified a gut bacterium that enhances the effect of cancer immunotherapy drugs such as Opdivo, and confirmed the effect in mouse experiments. Immunotherapy drugs like Opdivo and Keytruda work by releasing the brakes on immune cells, strengthening their attack on cancer cells, but only an estimated 20-30% of patients respond. The team first examined stool from 50 lung and gastric cancer patients on immunotherapy and found that responders carried a higher proportion of gut bacteria from the family Ruminococcaceae. Detailed analysis of these bacteria revealed a previously unknown species, dubbed YB328. To probe its function, the team transplanted stool from non-responding patients into mice and then gave the mice both the immunotherapy drug and YB328; the mice's tumors shrank. The study indicates that YB328 stimulates and activates dendritic cells, regarded as the commanders of the immune system.

  15. Newly discovered outer solar system object challenges the Planet Nine hypothesis

    Using the Subaru Telescope in Hawaii, astronomers have discovered an extremely distant solar system object, 2023 KQ14, nicknamed Ammonite. Its perihelion lies about 66 astronomical units from the Sun, making it one of the exceedingly rare "sednoids", whose orbits lie far beyond Neptune's gravitational reach and offer important clues to the unexplored outer solar system. Only three sednoids were previously known, Sedna, 2012 VP113, and Leleakuhonua, and their orbits point in roughly the same direction, which had been attributed to the gravitational pull of an undiscovered "Planet Nine". The newly found Ammonite, however, orbits in the opposite orientation, showing that the outer solar system's dynamical architecture is more complex than previously imagined and putting the Planet Nine hypothesis under strain. Simulations show Ammonite's orbit has been quite stable for billions of years, undisturbed by the inner planets: an "orbital fossil" preserving dynamical traces of the solar system's early formation.