Weekly Digest — 2025-W28
177 unique stories (2025-07-07 → 2025-07-13), aggregated across 8 sources.
Hacker News(42)
- New sphere-packing record stems from an unexpected source (www.quantamagazine.org)
- CPU-X: CPU-Z for Linux (thetumultuousunicornofdarkness.github.io)
- Adding a feature because ChatGPT incorrectly thinks it exists (www.holovaty.com)
- Launch HN: Morph (YC S23) – Apply AI code edits at 4,500 tokens/sec
- Solving Wordle with uv's dependency resolver (mildbyte.xyz)
- I used o3 to profile myself from my saved Pocket links (noperator.dev)
- Bootstrapping a side project into a profitable seven-figure business (projectionlab.com)
- Supabase MCP can leak your entire SQL database (www.generalanalysis.com)
- Breaking Git with a carriage return and cloning RCE (dgl.cx)
- Radium Music Editor (users.notam02.no)
- Firefox is fine. The people running it are not (www.theregister.com)
- GlobalFoundries to Acquire MIPS (mips.com)
GitHub Trending(29)
- rustfs / rustfs
🚀 High-performance distributed object storage, a MinIO alternative.
- anthropics / prompt-eng-interactive-tutorial
Anthropic's Interactive Prompt Engineering Tutorial
- th-ch / youtube-music
YouTube Music Desktop App bundled with custom plugins
- dockur / macos
macOS inside a Docker container.
- pocketbase / pocketbase
Open Source realtime backend in 1 file
- commaai / openpilot
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
- humanlayer / 12-factor-agents
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
- Alibaba-NLP / WebAgent
🌐 WebAgent for Information Seeking built by Tongyi Lab: WebWalker & WebDancer & WebSailor https://arxiv.org/pdf/2507.02592
- HandsOnLLM / Hands-On-Large-Language-Models
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
- gusmanb / logicanalyzer
24 channel, 100Msps logic analyzer hardware and software
- googleapis / genai-toolbox
MCP Toolbox for Databases is an open source MCP server for databases.
- putyy / res-downloader
Downloader for common online media: WeChat Channels video, mini-programs, Douyin, Kuaishou, Xiaohongshu, live streams, m3u8, KuGou, QQ Music, and more!
Product Hunt(41)
- TensorBlock Forge
One API for all AI models
- Stepfun Diligence Check
AI-powered search with agent-verified citations
- Sara, the AI Interviewer
Hire 10X faster. Unbiased structured interviews, 24/7.
- Context
The AI office suite
- Blogwald
Structure content for LLMs and search engines
- Voicebun
Build voice agents in seconds
- Clueso
Create stunning product videos in minutes with AI
- Sprinto Trust Center
Your single secure shareable compliance hub
- xmcp
The framework for building & shipping MCP applications
- Howdy
Send cold DMs that feel warm
- 21st.dev 2.0
Create, remix and share UI components with AI
- Tile
Ship App‑Store‑ready mobile apps with AI agents
Hugging Face(29)
- How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
Multimodal foundation models, such as GPT-4o, have recently made remarkable progress, but it is not clear where exactly these models stand in terms of understanding vision. In this paper, we benchmark the performance of popular multimodal foundation models (GPT-4o, o4-mini, Gemini 1.5 Pro and Gemini 2.0 Flash, Claude 3.5 Sonnet, Qwen2-VL, Llama 3.2) on standard computer vision tasks (semantic segmentation, object detection, image classification, depth and surface normal prediction) using established datasets (e.g., COCO, ImageNet and its variants, etc). The main challenges to performing this are: 1) most models are trained to output text and cannot natively express versatile domains, such as segments or 3D geometry, and 2) many leading models are proprietary and accessible only at an API level, i.e., there is no weight access to adapt them. We address these challenges by translating standard vision tasks into equivalent text-promptable and API-compatible tasks via prompt chaining to create a standardized benchmarking framework. We observe that 1) the models are not close to the state-of-the-art specialist models at any task. However, 2) they are respectable generalists; this is remarkable as they are presumably trained on primarily image-text-based tasks. 3) They perform semantic tasks notably better than geometric ones. 4) While the prompt-chaining techniques affect performance, better models exhibit less sensitivity to prompt variations. 5) GPT-4o performs the best among non-reasoning models, securing the top position in 4 out of 6 tasks, 6) reasoning models, e.g. o3, show improvements in geometric tasks, and 7) a preliminary analysis of models with native image generation, like the latest GPT-4o, shows they exhibit quirks like hallucinations and spatial misalignments.
- Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
The steep computational cost of diffusion models at inference hinders their use as fast physics emulators. In the context of image and video generation, this computational drawback has been addressed by generating in the latent space of an autoencoder instead of the pixel space. In this work, we investigate whether a similar strategy can be effectively applied to the emulation of dynamical systems and at what cost. We find that the accuracy of latent-space emulation is surprisingly robust to a wide range of compression rates (up to 1000x). We also show that diffusion-based emulators are consistently more accurate than non-generative counterparts and compensate for uncertainty in their predictions with greater diversity. Finally, we cover practical design choices, spanning from architectures to optimizers, that we found critical to train latent-space emulators.
- Eka-Eval : A Comprehensive Evaluation Framework for Large Language Models in Indian Languages
The rapid advancement of Large Language Models (LLMs) has intensified the need for evaluation frameworks that go beyond English-centric benchmarks and address the requirements of linguistically diverse regions such as India. We present EKA-EVAL, a unified and production-ready evaluation framework that integrates over 35 benchmarks, including 10 Indic-specific datasets, spanning categories like reasoning, mathematics, tool use, long-context understanding, and reading comprehension. Compared to existing Indian language evaluation tools, EKA-EVAL offers broader benchmark coverage, with built-in support for distributed inference, quantization, and multi-GPU usage. Our systematic comparison positions EKA-EVAL as the first end-to-end, extensible evaluation suite tailored for both global and Indic LLMs, significantly lowering the barrier to multilingual benchmarking. The framework is open-source and publicly available at https://github.com/lingo-iitgn/eka-eval and part of the ongoing EKA initiative (https://eka.soket.ai), which aims to scale up to over 100 benchmarks and establish a robust, multilingual evaluation ecosystem for LLMs.
- LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing
Evaluating creative writing generated by large language models (LLMs) remains challenging because open-ended narratives lack ground truths. Without performant automated evaluation methods, off-the-shelf (OTS) language models are employed as zero-shot judges, yet their reliability is unclear in this context. In pursuit of robust evaluation for creative writing, we introduce LitBench, the first standardized benchmark and paired dataset for creative writing verification, comprising a held-out test set of 2,480 debiased, human-labeled story comparisons drawn from Reddit and a 43,827-pair training corpus of human preference labels. Using LitBench, we (i) benchmark zero-shot LLM judges, (ii) train Bradley Terry and generative reward models, and (iii) conduct an online human study to validate reward model rankings on newly LLM-generated stories. Our benchmark identifies Claude-3.7-Sonnet as the strongest off-the-shelf judge, reaching 73% agreement with human preferences; among trained reward models, Bradley-Terry and Generative reward models both attain an accuracy of 78%, outperforming all off-the-shelf judges. An online human study further confirms that our trained reward models consistently align with human preferences in novel LLM-generated stories. We release LitBench and reward models at https://huggingface.co/collections/SAA-Lab/litbench-68267b5da3aafe58f9e43461, providing a vetted resource for reliable, automated evaluation and optimization of creative writing systems.
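The trained Bradley-Terry reward models mentioned above score each story with a scalar and model the preference probability as a logistic function of the score difference. A minimal sketch of that objective, with illustrative function names and scores (not taken from the LitBench code):

```python
import math

def bt_prob(score_a, score_b):
    """Bradley-Terry: P(A preferred over B) = sigmoid(r(A) - r(B))."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

def bt_loss(score_chosen, score_rejected):
    """Negative log-likelihood of the human-preferred ("chosen") story;
    minimizing this trains the reward model to rank chosen above rejected."""
    return -math.log(bt_prob(score_chosen, score_rejected))

# Equal scores give a coin flip; a correctly ranked pair gives lower loss.
print(bt_prob(1.0, 1.0))                        # 0.5
print(bt_loss(2.0, 0.0) < bt_loss(0.0, 2.0))    # True
```

In practice the scores come from a fine-tuned language model head evaluated on each story; only the pairwise objective is shown here.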
- MemOS: A Memory OS for AI System
Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI), yet their lack of well-defined memory management systems hinders the development of long-context reasoning, continual personalization, and knowledge consistency. Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods. While Retrieval-Augmented Generation (RAG) introduces external knowledge in plain text, it remains a stateless workaround without lifecycle control or integration with persistent representations. Recent work has modeled the training and inference cost of LLMs from a memory hierarchy perspective, showing that introducing an explicit memory layer between parameter memory and external retrieval can substantially reduce these costs by externalizing specific knowledge. Beyond computational efficiency, LLMs face broader challenges arising from how information is distributed over time and context, requiring systems capable of managing heterogeneous knowledge spanning different temporal scales and sources. To address this challenge, we propose MemOS, a memory operating system that treats memory as a manageable system resource. It unifies the representation, scheduling, and evolution of plaintext, activation-based, and parameter-level memories, enabling cost-efficient storage and retrieval. As the basic unit, a MemCube encapsulates both memory content and metadata such as provenance and versioning. MemCubes can be composed, migrated, and fused over time, enabling flexible transitions between memory types and bridging retrieval with parameter-based learning. MemOS establishes a memory-centric system framework that brings controllability, plasticity, and evolvability to LLMs, laying the foundation for continual learning and personalized modeling.
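The abstract's MemCube, a unit pairing memory content with provenance and versioning metadata that can be composed over time, might look roughly like this sketch. The field names and the fuse() behavior are assumptions for illustration only; the paper does not specify this API:

```python
from dataclasses import dataclass

@dataclass
class MemCube:
    """A memory unit: content plus metadata such as provenance and version."""
    content: str
    kind: str                  # e.g. "plaintext", "activation", "parameter"
    provenance: str = "unknown"
    version: int = 1

    def fuse(self, other: "MemCube") -> "MemCube":
        """Compose two cubes, merging provenance and bumping the version."""
        return MemCube(
            content=self.content + "\n" + other.content,
            kind=self.kind,
            provenance=f"{self.provenance}+{other.provenance}",
            version=max(self.version, other.version) + 1,
        )

a = MemCube("user prefers metric units", "plaintext", provenance="chat-1")
b = MemCube("user is based in Berlin", "plaintext", provenance="chat-7")
print(a.fuse(b).provenance)   # chat-1+chat-7
```

The point of the design is that memory carries its own lineage, so fused or migrated memories remain auditable.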
- Should We Still Pretrain Encoders with Masked Language Modeling?
Learning high-quality text representations is fundamental to a wide range of NLP tasks. While encoder pretraining has traditionally relied on Masked Language Modeling (MLM), recent evidence suggests that decoder models pretrained with Causal Language Modeling (CLM) can be effectively repurposed as encoders, often surpassing traditional encoders on text representation benchmarks. However, it remains unclear whether these gains reflect an inherent advantage of the CLM objective or arise from confounding factors such as model and data scale. In this paper, we address this question through a series of large-scale, carefully controlled pretraining ablations, training a total of 30 models ranging from 210 million to 1 billion parameters, and conducting over 15,000 fine-tuning and evaluation runs. We find that while training with MLM generally yields better performance across text representation tasks, CLM-trained models are more data-efficient and demonstrate improved fine-tuning stability. Building on these findings, we experimentally show that a biphasic training strategy that sequentially applies CLM and then MLM, achieves optimal performance under a fixed computational training budget. Moreover, we demonstrate that this strategy becomes more appealing when initializing from readily available pretrained CLM models (from the existing LLM ecosystem), reducing the computational burden needed to train best-in-class encoder models. We release all project artifacts at https://hf.co/MLMvsCLM to foster further research.
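The two objectives being compared differ only in how training pairs are built from a token stream. A dependency-free sketch (MASK_ID and the conventional 15% mask rate are illustrative choices, not taken from the paper):

```python
import random

MASK_ID = 0  # placeholder id for the [MASK] token

def mlm_example(tokens, mask_rate=0.15, seed=0):
    """MLM: randomly mask positions; the model predicts only the masked tokens."""
    rng = random.Random(seed)
    inputs, targets = list(tokens), [None] * len(tokens)
    for i in range(len(tokens)):
        if rng.random() < mask_rate:
            targets[i] = tokens[i]   # predict the original token here
            inputs[i] = MASK_ID
    return inputs, targets

def clm_example(tokens):
    """CLM: input is the sequence minus its last token; target is the next
    token at every position."""
    return tokens[:-1], tokens[1:]

tokens = [5, 17, 42, 8, 23, 11]
print(clm_example(tokens))   # ([5, 17, 42, 8, 23], [17, 42, 8, 23, 11])
print(mlm_example(tokens))
```

The paper's biphasic strategy amounts to training first on clm_example-style pairs and then on mlm_example-style pairs under one compute budget.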
- 4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture
Reconstructing fast-dynamic scenes from multi-view videos is crucial for high-speed motion analysis and realistic 4D reconstruction. However, the majority of 4D capture systems are limited to frame rates below 30 FPS (frames per second), and a direct 4D reconstruction of high-speed motion from low FPS input may lead to undesirable results. In this work, we propose a high-speed 4D capturing system using only low-FPS cameras, through novel capturing and processing modules. On the capturing side, we propose an asynchronous capture scheme that increases the effective frame rate by staggering the start times of cameras. By grouping cameras and leveraging a base frame rate of 25 FPS, our method achieves an equivalent frame rate of 100-200 FPS without requiring specialized high-speed cameras. On the processing side, we also propose a novel generative model to fix artifacts caused by 4D sparse-view reconstruction, as asynchrony reduces the number of viewpoints at each timestamp. Specifically, we propose to train a video-diffusion-based artifact-fix model for sparse 4D reconstruction, which refines missing details, maintains temporal consistency, and improves overall reconstruction quality. Experimental results demonstrate that our method significantly enhances high-speed 4D reconstruction compared to synchronous capture.
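The arithmetic behind the staggered capture scheme is simple: offsetting the start of each camera group by a fraction of the frame period multiplies the merged frame rate. A sketch with the abstract's numbers (25 FPS base; four groups yield the 100 FPS equivalent; the function name is illustrative):

```python
def staggered_timestamps(n_groups, base_fps, n_frames):
    """Merged capture times (seconds) of n_groups camera groups running at
    base_fps, each group's start offset by 1/(n_groups * base_fps)."""
    period = 1.0 / base_fps          # time between frames of one group
    offset = period / n_groups       # stagger between consecutive groups
    times = [g * offset + f * period
             for f in range(n_frames) for g in range(n_groups)]
    return sorted(times)

times = staggered_timestamps(n_groups=4, base_fps=25, n_frames=2)
gaps = [round(b - a, 4) for a, b in zip(times, times[1:])]
print(gaps)   # [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
# Uniform 0.01 s gaps: a 100 FPS effective rate from 25 FPS cameras.
```

The cost, as the abstract notes, is that each individual timestamp is seen by fewer viewpoints, which is what the diffusion-based artifact-fix model compensates for.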
- DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
Recent advances in vision-language-action (VLA) models have shown promise in integrating image generation with action prediction to improve generalization and reasoning in robot manipulation. However, existing methods are limited to challenging image-based forecasting, which suffers from redundant information and lacks comprehensive and critical world knowledge, including dynamic, spatial and semantic information. To address these limitations, we propose DreamVLA, a novel VLA framework that integrates comprehensive world knowledge forecasting to enable inverse dynamics modeling, thereby establishing a perception-prediction-action loop for manipulation tasks. Specifically, DreamVLA introduces a dynamic-region-guided world knowledge prediction, integrated with the spatial and semantic cues, which provide compact yet comprehensive representations for action planning. This design aligns with how humans interact with the world by first forming abstract multimodal reasoning chains before acting. To mitigate interference among the dynamic, spatial and semantic information during training, we adopt a block-wise structured attention mechanism that masks their mutual attention, preventing information leakage and keeping each representation clean and disentangled. Moreover, to model the conditional distribution over future actions, we employ a diffusion-based transformer that disentangles action representations from shared latent features. Extensive experiments on both real-world and simulation environments demonstrate that DreamVLA achieves 76.7% success rate on real robot tasks and 4.44 average length on the CALVIN ABC-D benchmarks.
- Pre-Trained Policy Discriminators are General Reward Models
We offer a novel perspective on reward modeling by formulating it as a policy discriminator, which quantifies the difference between two policies to generate a reward signal, guiding the training policy towards a target policy with desired behaviors. Based on this conceptual insight, we propose a scalable pre-training method named Policy Discriminative Learning (POLAR), which trains a reward model (RM) to discern identical policies and discriminate different ones. Unlike traditional reward modeling methods relying on absolute preferences, POLAR captures the relative difference between one policy and an arbitrary target policy, which is a scalable, high-level optimization objective suitable for modeling generic ranking relationships. Leveraging the POLAR pre-training paradigm, we present a series of RMs with parameter scales from 1.8B to 7B. Empirical results show that POLAR substantially outperforms traditional non-pre-trained methods, significantly enhancing RM performance. For instance, POLAR-7B could improve preference accuracy from 54.8% to 81.0% on STEM tasks and from 57.9% to 85.5% on creative writing tasks compared to SOTA baselines. POLAR also shows robust generalization capabilities in RLHF using Reinforcement Fine-tuning (RFT), providing reliable reward signals and markedly enhancing policy performance--improving LLaMa3.1-8B from an average of 47.36% to 56.33% and Qwen2.5-32B from 64.49% to 70.47% on 20 benchmarks. Moreover, scaling experiments reveal a clear power-law relationship between computation and performance, supported by linear correlation coefficients approaching 0.99. The impressive performance, strong generalization, and scaling properties suggest that POLAR is a promising direction for developing general and strong reward models.
- BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset
In this paper, we introduce BMMR, a large-scale bilingual, multimodal, multi-disciplinary reasoning dataset for the community to develop and evaluate large multimodal models (LMMs). BMMR comprises 110k college-level questions spanning 300 UNESCO-defined subjects in diverse formats-multiple-choice, fill-in-the-blank, and open-ended QA-and sourced from both print and digital media such as books, exams, and quizzes. All data are curated and filtered via a human-in-the-loop and scalable framework, and each instance is paired with a high-quality reasoning path. The dataset is organized into two parts: BMMR-Eval that comprises 20,458 high-quality instances to comprehensively assess LMMs' knowledge and reasoning across multiple disciplines in both Chinese and English; and BMMR-Train that contains 88,991 instances to support further research and development, extending the current focus on mathematical reasoning to diverse disciplines and domains. In addition, we propose the process-based multi-discipline verifier (i.e., BMMR-Verifier) for accurate and fine-grained evaluation of reasoning paths. Extensive experiments on 24 models reveal that (i) even SOTA models (e.g., o3 and Gemini-2.5-Pro) leave substantial headroom on BMMR-Eval; (ii) reasoning models exhibit discipline bias and outperform LMMs only on specific subjects; (iii) open-source models still trail their proprietary counterparts; and (iv) fine-tuning on BMMR-Train narrows this gap. Additionally, we conduct reasoning-chain analyses using BMMR-Verifier and other in-depth studies, uncovering the challenges LMMs currently face in multidisciplinary reasoning. We will release the data, and we hope our work can offer insights and contributions to the community.
- A Survey on Latent Reasoning
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, especially when guided by explicit chain-of-thought (CoT) reasoning that verbalizes intermediate steps. While CoT improves both interpretability and accuracy, its dependence on natural language reasoning limits the model's expressive bandwidth. Latent reasoning tackles this bottleneck by performing multi-step inference entirely in the model's continuous hidden state, eliminating token-level supervision. To advance latent reasoning research, this survey provides a comprehensive overview of the emerging field of latent reasoning. We begin by examining the foundational role of neural network layers as the computational substrate for reasoning, highlighting how hierarchical representations support complex transformations. Next, we explore diverse latent reasoning methodologies, including activation-based recurrence, hidden state propagation, and fine-tuning strategies that compress or internalize explicit reasoning traces. Finally, we discuss advanced paradigms such as infinite-depth latent reasoning via masked diffusion models, which enable globally consistent and reversible reasoning processes. By unifying these perspectives, we aim to clarify the conceptual landscape of latent reasoning and chart future directions for research at the frontier of LLM cognition. An associated GitHub repository collecting the latest papers and repos is available at: https://github.com/multimodal-art-projection/LatentCoT-Horizon/.
- SingLoRA: Low Rank Adaptation Using a Single Matrix
Low-Rank Adaptation (LoRA) has significantly advanced parameter-efficient fine-tuning of large pretrained models. LoRA augments the pre-trained weights of a model by adding the product of two smaller matrices that together form a low-rank matrix update. Recent research has shown that scale disparities between these two matrices often cause unstable training dynamics, leading to suboptimal performance. In this paper, we propose SingLoRA, which reformulates low-rank adaptation by learning the weights update as a decomposition of a single low-rank matrix multiplied by its transpose. This simple design inherently removes inter-matrix scale conflicts, ensuring stable optimization, and roughly halves the parameter count. We analyze SingLoRA within the infinite-width neural network framework, showing that it guarantees stable feature learning by construction. Extensive experiments on multiple tasks validate these benefits. In common sense reasoning, fine-tuning LLama 7B on MNLI with SingLoRA achieves 91.3% accuracy - surpassing LoRA (89.1%) and LoRA+ (90.2%) - while using only 60% of their parameter budget. In image generation, fine-tuning Stable Diffusion with SingLoRA significantly improves image fidelity on DreamBooth, achieving a DINO similarity score of 0.151, compared to scores of 0.148 and 0.143 for DoRA and LoRA, respectively.
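The core reformulation in the abstract, replacing LoRA's two-matrix product B·A with a single matrix's symmetric product A·Aᵀ, is easy to state in code. A dependency-free sketch (function names and the scale parameter are illustrative; the paper applies this to transformer weight matrices, and the symmetric update is naturally square, d x d):

```python
def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def lora_delta(B, A):
    """Classic LoRA update: delta_W = B @ A, two trained matrices (2*d*r params)."""
    return matmul(B, A)

def singlora_delta(A, scale=1.0):
    """SingLoRA update: delta_W = scale * A @ A^T, one trained matrix (d*r params)."""
    return [[scale * v for v in row] for row in matmul(A, transpose(A))]

A = [[1.0], [2.0], [0.0], [-1.0]]        # d x r with d = 4, r = 1
delta = singlora_delta(A)
print(delta[0])                          # [1.0, 2.0, 0.0, -1.0]
# Symmetric by construction, so the inter-matrix scale mismatch that can
# destabilize LoRA training cannot arise, and the parameter count halves.
print(all(delta[i][j] == delta[j][i] for i in range(4) for j in range(4)))  # True
```

The rank of the update is still bounded by r, so expressiveness per parameter is comparable to LoRA at the same rank.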
Solidot(36)
- Samsung phone batteries are rated for far more charge cycles than other brands
The EU's new energy-efficiency labels require manufacturers to state a battery's rated number of charge cycles. So, by that measure, which phone brands have the most durable batteries today? The data show Samsung far in the lead. Google Pixel batteries are mostly rated for 1,000 cycles; Samsung's mostly for 2,000 (a few models 1,200); the Fairphone 5 for 1,200, dropping to 1,000 for the Fairphone 6; Motorola's Edge 50 series for 1,200, the G55 for 800, and other models mostly 1,000; Nothing phones for 1,400; the OnePlus 13R for 1,200 and the OnePlus 13 for 1,000; Sony's Xperia 1 VII for 1,400; and Apple's entire iPhone 16 series for 1,000.
- India leads the world in internet shutdowns
According to Internet Society statistics, 863 shutdown events have been recorded since 2018, and India alone accounts for nearly half of them, 411, followed by Iraq with 140, Syria with 66, Sudan with 33, Pakistan and Algeria with 17 each, and Iran with 16. One reason India shuts down the internet so often is that the law empowers officials to cut it off in the name of maintaining public order, and local officials have statutory authority to order telecom companies to manually switch off network service. To impose a shutdown, officials simply write letters and send emails to every ISP with a local office, and the ISPs then block all inbound and outbound traffic. Iraq's shutdowns, by contrast, are mainly driven by exams.
- Why orcas throw fish at us
According to a study published in the Journal of Comparative Psychology, the widely observed phenomenon of orcas (killer whales) tossing fish and other prey at humans may be attributed to a desire to befriend us. Researchers documented 34 such encounters over the past 20 years; even when humans declined the gift, the orcas often lingered expectantly and sometimes tried to offer it again, suggesting an unusual relationship-building motive. Lead author Jared Towers of Bay Cetology in British Columbia, Canada, said orcas frequently share food with one another, a prosocial behavior and a way of building relationships. Sharing food with humans may indicate an interest in building connections with us as well.
- Moderna says its mRNA flu vaccine is more effective than standard vaccines
Moderna announced trial results for its mRNA-based seasonal flu vaccine: the mRNA vaccine mRNA-1010 was about 27% more effective than standard vaccines. Roughly 41,000 people aged 50 and older took part in the trial; they were randomly assigned to receive either mRNA-1010 or a standard vaccine and followed for about six months through the flu season. Compared with standard vaccines, the mRNA vaccine's overall efficacy was 26.6% higher, and 27.4% higher among participants aged 65 and over. Earlier trial data showed mRNA-1010 elicited stronger immune responses than both standard and high-dose flu vaccines. Given the anti-vaccine stance of current US Health Secretary Robert F. Kennedy Jr., the future of mRNA-1010 is uncertain; Kennedy has already canceled the mRNA flu vaccine funding the previous Biden administration awarded Moderna.
- The dollar is having its worst year in modern history
The US dollar is experiencing its worst year in modern history: it has fallen more than 7% this year, and Morgan Stanley forecasts it could drop another 10% in the second half. A weaker dollar could boost US export competitiveness and advance the Trump administration's trade-rebalancing agenda, but it also makes imports more expensive, compounding the shock from tariffs. The question ahead is whether the dollar will lose not just its value but also its central position in the global financial system. Central banks' de-dollarization efforts have been a shift toward gold, not toward another currency such as the renminbi.
- Businesses are already feeling the effects of a warming climate
Climate change is already affecting businesses around the world. According to a Morgan Stanley report, more than half of surveyed companies experienced climate-related operational disruptions in the past year, including higher costs, work stoppages, and lost revenue. Extreme heat and storms were the most frequent causes of disruption, followed by wildfires and smoke, water shortages, and flooding or sea-level rise. A Bloomberg Intelligence analysis found that the US alone spent nearly a trillion dollars in the past year on disaster recovery and other climate-related needs. Nearly 90% of South American companies estimate that climate change will pose a risk to their business models by the end of the decade, while North American firms ranked political instability, not climate change, as their top risk.
- NASA's New Horizons successfully demonstrates deep-space stellar navigation
Although spacecraft can orient themselves by the stars, determining precisely how far from Earth they are and where they are traveling usually still relies on precise radio tracking from the ground. Members of NASA's New Horizons mission team used the spacecraft, now more than 8.8 billion kilometers from Earth, to successfully demonstrate a navigation method that determines direction and position from star-field images alone. As a spacecraft travels deeper into space, the positions of stars seen from its vantage point begin to diverge from those seen from Earth. A craft voyaging deep into the galaxy can use this parallax shift to fix its position relative to nearby stars, and New Horizons has flown far enough to give the first real demonstration of the feasibility of interstellar navigation. Since launching in 2006, New Horizons has flown past Pluto and the Kuiper Belt object Arrokoth, and over the next decade it will gradually leave the solar system for interstellar space. In 2020, the New Horizons science team simultaneously observed and imaged, from Earth and from the spacecraft, the star fields around the nearby stars Proxima Centauri (4.2 light-years away) and Wolf 359 (7.86 light-years away). The experiment vividly illustrated the changing perspective as New Horizons traveled from the inner to the outer solar system. With further analysis of the two stars' precise positions in the 2020 images, team members were able to compute New Horizons' three-dimensional position relative to nearby stars to an accuracy of about 6.6 million kilometers.
- Generative AI adoption in Japan stands at 26%
Japan's Ministry of Internal Affairs and Communications reported survey results in its 2025 Information and Communications White Paper: only 26.7% of individuals use generative AI. That is roughly triple the figure from the previous survey, but still far behind the comparison countries China (81.2%), the United States (68.8%), and Germany (59.2%). The most common reason given for not using it, cited by over 40%, was "no need in daily life or work"; nearly 40% said they "don't know how to use it." Usage varies sharply by age: it is highest among those aged 20-29 at 44.7%, followed by ages 40-49 (29.6%), 30-39 (23.8%), and 50-59 (19.9%), and lowest among those aged 60-69 at just 15.5%. Adoption among Japanese companies is 55.2%, while in China (95.8%), the US (90.6%), and Germany (90.3%) it exceeds 90%.
- Netflix says half its global subscribers watch anime
Netflix has increased its investment in anime and released anime viewing data for its global subscribers, highlighting Japanese anime's growth from a niche market into mainstream global content. Netflix says its global subscribers, more than 150 million households and roughly 300 million viewers, watch anime. Anime viewership on the platform has tripled over the past five years, and in 2024, 33 anime titles made its Global Top 10 (Non-English) chart, more than double the number in 2021. Anime content was viewed more than a billion times globally in 2024, with 80% to 90% of viewers choosing dubbed versions. To meet that demand, Netflix has begun offering anime in up to 33 languages of dubbing and narration.
- Xerox completes its acquisition of Lexmark
Xerox issued a press release announcing that it has completed its acquisition of US printer maker Lexmark. Lexmark began as IBM's printer division and was spun off as Lexmark International in 1991; it was once a Fortune 500 company. In 2016, a consortium of Zhuhai-based Apex Technology (now Ninestar), Hong Kong's PAG, and Legend Capital acquired Lexmark for $2.54 billion in cash at $40.50 per share. In 2023 the US sanctioned Lexmark over the use of forced labor in its production, leaving its sales in the US market in jeopardy. Xerox announced last December that it would acquire Lexmark from the Chinese consortium for $1.5 billion, saying the deal would strengthen its product portfolio.
- Indian outsourcing giant cracks down on overwork
Indian outsourcing giant Infosys has notified employees, warning them not to work more than 9 hours and 15 minutes a day. The company will monitor employees' working hours, a move aimed at preventing burnout, though it runs counter to co-founder Narayana Murthy, father-in-law of former UK Prime Minister Rishi Sunak, who has called on Indians to work 70-hour weeks.
- China Film Foundation plans to use AI to "revitalize" classic kung fu films
The China Film Foundation and other organizations plan to use AI to "revitalize" 100 classic kung fu films, including Police Story, Once Upon a Time in China, and Fist of Fury. The foundation says it will work with companies such as Shanghai Canxing Culture & Media to license film materials to AI companies, relaunching the films worldwide to attract younger audiences. Officials involved in the project say AI will be used to give the films "stunning realism"; they are planning "immersive viewing experiences," such as a duel in a bamboo forest where viewers can "feel the philosophy of motion and stillness." The revitalization effort will extend into other areas, including martial arts video games. Industry observers say China's move to re-mine its classic kung fu catalogue is shrewd, as these works have long been a source of inspiration for American action films.