OrangeBot.AI Digest — 2025-12-20
56 headlines across 8 sources, aggregated for this day.
Hacker News(15)
- Ireland’s Diarmuid Early wins world Microsoft Excel title (www.bbc.com)
- Backing Up Spotify (annas-archive.li)
- OpenSCAD is kinda neat (nuxx.net)
- Over 40% of deceased drivers in vehicle crashes test positive for THC: Study (www.facs.org)
- Pure Silicon Demo Coding: No CPU, No Memory, Just 4k Gates (www.a1k0n.net)
- Log level 'error' should mean that something needs to be fixed (utcc.utoronto.ca)
- Go ahead, self-host Postgres (pierce.dev)
- Gemini 3 Pro vs. 2.5 Pro in Pokemon Crystal (blog.jcz.dev)
- What Does a Database for SSDs Look Like? (brooker.co.za)
- Skills Officially Comes to Codex (developers.openai.com)
- Reflections on AI at the End of 2025 (antirez.com)
- Airbus to migrate critical apps to a sovereign Euro cloud (www.theregister.com)
- NTP at NIST Boulder Has Lost Power (lists.nanog.org)
- The Deviancy Signal: Having "Nothing to Hide" Is a Threat to Us All (thompson2026.com)
- Privacy doesn't mean anything anymore, anonymity does (servury.com)
GitHub Trending(11)
- exo-explore / exo
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
- lintsinghua / DeepAudit
DeepAudit: an AI hacker squad for everyone, putting vulnerability hunting within easy reach. The first open-source multi-agent system for code vulnerability discovery in China. One-click deployment even for beginners, with autonomous collaborative auditing and automated sandbox PoC verification. Supports private deployment via Ollama and one-click report generation. Making security affordable and auditing simple.
- anthropics / claude-code
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.
- github / awesome-copilot
Community-contributed instructions, prompts, and configurations to help you make the most of GitHub Copilot.
- swisskyrepo / PayloadsAllTheThings
A list of useful payloads and bypass for Web Application Security and Pentest/CTF
- sgl-project / mini-sglang
- cloudcommunity / Free-Certifications
A curated list of free courses with certifications. Also available at https://free-certifications.com/
- GreyDGL / PentestGPT
A GPT-empowered penetration testing tool
- NexaAI / nexa-sdk
Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and more.
- astral-sh / ty
An extremely fast Python type checker and language server, written in Rust.
- iptv-org / iptv
Collection of publicly available IPTV channels from all over the world
Hugging Face(15)
- Kling-Omni Technical Report
We present Kling-Omni, a generalist generative framework designed to synthesize high-fidelity videos directly from multimodal visual-language inputs. Adopting an end-to-end perspective, Kling-Omni bridges the functional separation among diverse video generation, editing, and intelligent reasoning tasks, integrating them into a holistic system. Unlike disjointed pipeline approaches, Kling-Omni supports a diverse range of user inputs, including text instructions, reference images, and video contexts, processing them into a unified multimodal representation to deliver cinematic-quality, highly intelligent video content creation. To support these capabilities, we constructed a comprehensive data system that serves as the foundation for multimodal video creation. The framework is further empowered by efficient large-scale pre-training strategies and infrastructure optimizations for inference. Comprehensive evaluations reveal that Kling-Omni demonstrates exceptional capabilities in in-context generation, reasoning-based editing, and multimodal instruction following. More than a content creation tool, we believe Kling-Omni is a pivotal advance toward multimodal world simulators capable of perceiving, reasoning about, generating, and interacting with dynamic and complex worlds.
- Adaptation of Agentic AI
Cutting-edge agentic AI systems are built on foundation models that can be adapted to plan, reason, and interact with external tools to perform increasingly complex and specialized tasks. As these systems grow in capability and scope, adaptation becomes a central mechanism for improving performance, reliability, and generalization. In this paper, we unify the rapidly expanding research landscape into a systematic framework that spans both agent adaptations and tool adaptations. We further decompose these into tool-execution-signaled and agent-output-signaled forms of agent adaptation, as well as agent-agnostic and agent-supervised forms of tool adaptation. We demonstrate that this framework helps clarify the design space of adaptation strategies in agentic AI, makes their trade-offs explicit, and provides practical guidance for selecting or switching among strategies during system design. We then review the representative approaches in each category, analyze their strengths and limitations, and highlight key open challenges and future opportunities. Overall, this paper aims to offer a conceptual foundation and practical roadmap for researchers and practitioners seeking to build more capable, efficient, and reliable agentic AI systems.
- LLaDA2.0: Scaling Up Diffusion Language Models to 100B
This paper presents LLaDA2.0 -- a family of discrete diffusion large language models (dLLMs) scaling up to 100B total parameters through systematic conversion from auto-regressive (AR) models -- establishing a new paradigm for frontier-scale deployment. Instead of costly training from scratch, LLaDA2.0 follows the principles of knowledge inheritance, progressive adaptation, and efficiency-aware design, seamlessly converting a pre-trained AR model into a dLLM with a novel three-phase, block-level WSD-based training scheme: progressively increasing the block size in block diffusion (warm-up), large-scale full-sequence diffusion (stable), and reverting to a compact block size in block diffusion (decay). Along with post-training alignment via SFT and DPO, we obtain LLaDA2.0-mini (16B) and LLaDA2.0-flash (100B), two instruction-tuned Mixture-of-Experts (MoE) variants optimized for practical deployment. By preserving the advantages of parallel decoding, these models deliver superior performance and efficiency at the frontier scale. Both models are open-sourced.
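The three-phase block-size schedule is the core of the conversion recipe. A minimal sketch of what such a schedule could look like, in Python; the phase boundaries, sequence length, and block sizes here are illustrative assumptions, not the paper's hyperparameters:

```python
# Illustrative warm-up / stable / decay (WSD) block-size schedule for
# block diffusion, following the three phases described in the abstract.
# All numbers are assumptions for illustration only.

def block_size(step: int, total_steps: int, seq_len: int = 4096,
               warmup_frac: float = 0.1, decay_frac: float = 0.1,
               start_block: int = 32, compact_block: int = 128) -> int:
    warmup_end = int(total_steps * warmup_frac)
    decay_start = int(total_steps * (1 - decay_frac))
    if step < warmup_end:
        # warm-up: progressively grow the block size toward the full sequence
        frac = step / max(warmup_end, 1)
        return int(start_block + frac * (seq_len - start_block))
    if step < decay_start:
        # stable: full-sequence diffusion (one block spans the whole sequence)
        return seq_len
    # decay: revert to a compact block size for efficient parallel decoding
    return compact_block

if __name__ == "__main__":
    for s in (0, 5_000, 50_000, 95_000):
        print(s, block_size(s, total_steps=100_000))  # 32, 2064, 4096, 128
```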
- Next-Embedding Prediction Makes Strong Vision Learners
Inspired by the success of generative pretraining in natural language, we ask whether the same principles can yield strong self-supervised visual learners. Instead of training models to output features for downstream use, we train them to generate embeddings to perform predictive tasks directly. This work explores such a shift from learning representations to learning models. Specifically, models learn to predict future patch embeddings conditioned on past ones, using causal masking and stop gradient, which we refer to as Next-Embedding Predictive Autoregression (NEPA). We demonstrate that a simple Transformer pretrained on ImageNet-1k with next embedding prediction as its sole learning objective is effective - no pixel reconstruction, discrete tokens, contrastive loss, or task-specific heads. This formulation retains architectural simplicity and scalability, without requiring additional design complexity. NEPA achieves strong results across tasks, attaining 83.8% and 85.3% top-1 accuracy on ImageNet-1K with ViT-B and ViT-L backbones after fine-tuning, and transferring effectively to semantic segmentation on ADE20K. We believe generative pretraining from embeddings provides a simple, scalable, and potentially modality-agnostic alternative to visual self-supervised learning.
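The objective is simple enough to sketch directly. A minimal, hedged PyTorch sketch of next-embedding prediction, assuming a conv patch embedder and a causal Transformer encoder; all sizes are illustrative, and the paper's full recipe is not reproduced here:

```python
# Illustrative NEPA-style objective: predict each next patch embedding from
# the previous ones, with causal masking and stop-gradient on the targets.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NEPASketch(nn.Module):
    def __init__(self, dim=256, depth=4, heads=8, patch=16, in_ch=3):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(dim, heads, 4 * dim,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, dim)  # predicts the next patch embedding

    def forward(self, images):
        x = self.embed(images).flatten(2).transpose(1, 2)        # (B, N, dim)
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.encoder(x, mask=mask.to(x.device))              # causal attention
        pred = self.head(h[:, :-1])                              # predict patches 2..N
        target = x[:, 1:].detach()                               # stop-gradient targets
        return F.mse_loss(pred, target)

loss = NEPASketch()(torch.randn(2, 3, 224, 224))
loss.backward()
print(loss.item())
```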
- StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors
The rapid growth of stereoscopic displays, including VR headsets and 3D cinemas, has led to increasing demand for high-quality stereo video content. However, producing 3D videos remains costly and complex, while automatic Monocular-to-Stereo conversion is hindered by the limitations of the multi-stage "Depth-Warp-Inpaint" (DWI) pipeline. This paradigm suffers from error propagation, depth ambiguity, and format inconsistency between parallel and converged stereo configurations. To address these challenges, we introduce UniStereo, the first large-scale unified dataset for stereo video conversion, covering both stereo formats to enable fair benchmarking and robust model training. Building upon this dataset, we propose StereoPilot, an efficient feed-forward model that directly synthesizes the target view without relying on explicit depth maps or iterative diffusion sampling. Equipped with a learnable domain switcher and a cycle consistency loss, StereoPilot adapts seamlessly to different stereo formats and achieves improved consistency. Extensive experiments demonstrate that StereoPilot significantly outperforms state-of-the-art methods in both visual fidelity and computational efficiency. Project page: https://hit-perfect.github.io/StereoPilot/.
- Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
Recent strides in video generation have paved the way for unified audio-visual generation. In this work, we present Seedance 1.5 pro, a foundational model engineered specifically for native, joint audio-video generation. Leveraging a dual-branch Diffusion Transformer architecture, the model integrates a cross-modal joint module with a specialized multi-stage data pipeline, achieving exceptional audio-visual synchronization and superior generation quality. To ensure practical utility, we implement meticulous post-training optimizations, including Supervised Fine-Tuning (SFT) on high-quality datasets and Reinforcement Learning from Human Feedback (RLHF) with multi-dimensional reward models. Furthermore, we introduce an acceleration framework that boosts inference speed by over 10X. Seedance 1.5 pro distinguishes itself through precise multilingual and dialect lip-syncing, dynamic cinematic camera control, and enhanced narrative coherence, positioning it as a robust engine for professional-grade content creation. Seedance 1.5 pro is now accessible on Volcano Engine at https://console.volcengine.com/ark/region:ark+cn-beijing/experience/vision?type=GenVideo.
- Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation
In this work, we present a panoramic metric depth foundation model that generalizes across diverse scene distances. We explore a data-in-the-loop paradigm from the view of both data construction and framework design. We collect a large-scale dataset by combining public datasets, high-quality synthetic data from our UE5 simulator and text-to-image models, and real panoramic images from the web. To reduce domain gaps between indoor/outdoor and synthetic/real data, we introduce a three-stage pseudo-label curation pipeline to generate reliable ground truth for unlabeled images. For the model, we adopt DINOv3-Large as the backbone for its strong pre-trained generalization, and introduce a plug-and-play range mask head, sharpness-centric optimization, and geometry-centric optimization to improve robustness to varying distances and enforce geometric consistency across views. Experiments on multiple benchmarks (e.g., Stanford2D3D, Matterport3D, and Deep360) demonstrate strong performance and zero-shot generalization, with particularly robust and stable metric predictions in diverse real-world scenes. The project page can be found at: https://insta360-research-team.github.io/DAP_website/
- Generative Refocusing: Flexible Defocus Control from a Single Image
Depth-of-field control is essential in photography, but getting the perfect focus often takes several tries or special equipment. Single-image refocusing is still difficult. It involves recovering sharp content and creating realistic bokeh. Current methods have significant drawbacks. They need all-in-focus inputs, depend on synthetic data from simulators, and have limited control over aperture. We introduce Generative Refocusing, a two-step process that uses DeblurNet to recover all-in-focus images from various inputs and BokehNet for creating controllable bokeh. Our main innovation is semi-supervised training. This method combines synthetic paired data with unpaired real bokeh images, using EXIF metadata to capture real optical characteristics beyond what simulators can provide. Our experiments show we achieve top performance in defocus deblurring, bokeh synthesis, and refocusing benchmarks. Additionally, our Generative Refocusing allows text-guided adjustments and custom aperture shapes.
- DeContext as Defense: Safe Image Editing in Diffusion Transformers
In-context diffusion models allow users to modify images with remarkable ease and realism. However, the same power raises serious privacy concerns: personal images can be easily manipulated for identity impersonation, misinformation, or other malicious uses, all without the owner's consent. While prior work has explored input perturbations to protect against misuse in personalized text-to-image generation, the robustness of modern, large-scale in-context DiT-based models remains largely unexamined. In this paper, we propose DeContext, a new method to safeguard input images from unauthorized in-context editing. Our key insight is that contextual information from the source image propagates to the output primarily through multimodal attention layers. By injecting small, targeted perturbations that weaken these cross-attention pathways, DeContext breaks this flow, effectively decoupling the output from the input. This simple defense is both efficient and robust. We further show that early denoising steps and specific transformer blocks dominate context propagation, which allows us to concentrate perturbations where they matter most. Experiments on Flux Kontext and Step1X-Edit show that DeContext consistently blocks unwanted image edits while preserving visual quality. These results highlight the effectiveness of attention-based perturbations as a powerful defense against image manipulation.
- REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion
Latent diffusion models (LDMs) achieve state-of-the-art image synthesis, yet their reconstruction-style denoising objective provides only indirect semantic supervision: high-level semantics emerge slowly, requiring longer training and limiting sample quality. Recent works inject semantics from Vision Foundation Models (VFMs) either externally via representation alignment or internally by jointly modeling only a narrow slice of VFM features inside the diffusion process, under-utilizing the rich, nonlinear, multi-layer spatial semantics available. We introduce REGLUE (Representation Entanglement with Global-Local Unified Encoding), a unified latent diffusion framework that jointly models (i) VAE image latents, (ii) compact local (patch-level) VFM semantics, and (iii) a global (image-level) [CLS] token within a single SiT backbone. A lightweight convolutional semantic compressor nonlinearly aggregates multi-layer VFM features into a low-dimensional, spatially structured representation, which is entangled with the VAE latents in the diffusion process. An external alignment loss further regularizes internal representations toward frozen VFM targets. On ImageNet 256x256, REGLUE consistently improves FID and accelerates convergence over SiT-B/2 and SiT-XL/2 baselines, as well as over REPA, ReDi, and REG. Extensive experiments show that (a) spatial VFM semantics are crucial, (b) non-linear compression is key to unlocking their full benefit, and (c) global tokens and external alignment act as complementary, lightweight enhancements within our global-local-latent joint modeling framework. The code is available at https://github.com/giorgospets/reglue .
- Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection
Recent advances in Text-to-Image (T2I) generative models, such as Imagen, Stable Diffusion, and FLUX, have led to remarkable improvements in visual quality. However, their performance is fundamentally limited by the quality of training data. Web-crawled and synthetic image datasets often contain low-quality or redundant samples, which lead to degraded visual fidelity, unstable training, and inefficient computation. Hence, effective data selection is crucial for improving data efficiency. Existing approaches to text-to-image data filtering rely on costly manual curation or heuristic scoring based on single-dimensional features. Although meta-learning-based methods have been explored for LLMs, they have not been adapted to image modalities. To this end, we propose **Alchemist**, a meta-gradient-based framework that selects a suitable subset from large-scale text-image data pairs. Our approach automatically learns to assess the influence of each sample by iteratively optimizing the model from a data-centric perspective. Alchemist consists of two key stages: data rating and data pruning. We train a lightweight rater to estimate each sample's influence based on gradient information, enhanced with multi-granularity perception. We then use the Shift-Gsampling strategy to select informative subsets for efficient model training. Alchemist is the first automatic, scalable, meta-gradient-based data selection framework for Text-to-Image model training. Experiments on both synthetic and web-crawled datasets demonstrate that Alchemist consistently improves visual quality and downstream performance. Training on an Alchemist-selected 50% of the data can outperform training on the full dataset.
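The rating stage boils down to asking how much each sample's gradient helps the model. A minimal, generic sketch of gradient-alignment influence scoring in Python; this is a standard stand-in for illustration, not Alchemist's learned rater or its Shift-Gsampling strategy:

```python
# Generic gradient-alignment influence scoring: rate each training sample
# by how well its gradient aligns with the gradient of a held-out batch.
import torch

def influence_scores(model, loss_fn, train_samples, val_batch):
    params = [p for p in model.parameters() if p.requires_grad]
    # Held-out gradient: the direction we would like training to move in.
    val_loss = loss_fn(model(val_batch[0]), val_batch[1])
    val_grad = torch.autograd.grad(val_loss, params)
    scores = []
    for x, y in train_samples:
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        g = torch.autograd.grad(loss, params)
        # Dot product of the per-sample gradient with the held-out gradient.
        scores.append(sum((gi * vi).sum() for gi, vi in zip(g, val_grad)).item())
    return scores  # keep the top-rated fraction, prune the rest

if __name__ == "__main__":
    model = torch.nn.Linear(8, 2)
    loss_fn = torch.nn.functional.cross_entropy
    train = [(torch.randn(8), torch.tensor(0)) for _ in range(4)]
    val = (torch.randn(16, 8), torch.randint(0, 2, (16,)))
    print(influence_scores(model, loss_fn, train, val))
```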
- The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text
We present WorldCanvas, a framework for promptable world events that enables rich, user-directed simulation by combining text, trajectories, and reference images. Unlike text-only approaches and existing trajectory-controlled image-to-video methods, our multimodal approach combines trajectories -- encoding motion, timing, and visibility -- with natural language for semantic intent and reference images for visual grounding of object identity, enabling the generation of coherent, controllable events that include multi-agent interactions, object entry/exit, reference-guided appearance and counterintuitive events. The resulting videos demonstrate not only temporal coherence but also emergent consistency, preserving object identity and scene despite temporary disappearance. By supporting expressive world events generation, WorldCanvas advances world models from passive predictors to interactive, user-shaped simulators. Our project page is available at: https://worldcanvas.github.io/.
- N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
While current multimodal models can answer questions based on 2D images, they lack intrinsic 3D object perception, limiting their ability to comprehend spatial relationships and depth cues in 3D scenes. In this work, we propose N3D-VLM, a novel unified framework that seamlessly integrates native 3D object perception with 3D-aware visual reasoning, enabling both precise 3D grounding and interpretable spatial understanding. Unlike conventional end-to-end models that directly predict answers from RGB/RGB-D inputs, our approach equips the model with native 3D object perception capabilities, enabling it to directly localize objects in 3D space based on textual descriptions. Building upon accurate 3D object localization, the model further performs explicit reasoning in 3D, achieving more interpretable and structured spatial understanding. To support robust training for these capabilities, we develop a scalable data construction pipeline that leverages depth estimation to lift large-scale 2D annotations into 3D space, significantly increasing the diversity and coverage of 3D object grounding data and yielding a dataset over six times larger than the largest existing single-image 3D detection dataset. Moreover, the pipeline generates spatial question-answering datasets that target chain-of-thought (CoT) reasoning in 3D, facilitating joint training for both 3D object localization and 3D spatial reasoning. Experimental results demonstrate that our unified framework not only achieves state-of-the-art performance on 3D grounding tasks, but also consistently surpasses existing methods in 3D spatial reasoning with vision-language models.
- JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
Recent advances in reinforcement learning for large language models have converged on increasing complexity: multi-stage training pipelines, dynamic hyperparameter schedules, and curriculum learning strategies. This raises a fundamental question: Is this complexity necessary? We present JustRL, a minimal approach using single-stage training with fixed hyperparameters that achieves state-of-the-art performance on two 1.5B reasoning models (54.9% and 64.3% average accuracy across nine mathematical benchmarks) while using 2x less compute than sophisticated approaches. The same hyperparameters transfer across both models without tuning, and training exhibits smooth, monotonic improvement over 4,000+ steps without the collapses or plateaus that typically motivate interventions. Critically, ablations reveal that adding "standard tricks" like explicit length penalties and robust verifiers may degrade performance by collapsing exploration. These results suggest that the field may be adding complexity to solve problems that disappear with a stable, scaled-up baseline. We release our models and code to establish a simple, validated baseline for the community.
- EasyV2V: A High-quality Instruction-based Video Editing Framework
While image editing has advanced rapidly, video editing remains less explored, facing challenges in consistency, control, and generalization. We study the design space of data, architecture, and control, and introduce EasyV2V, a simple and effective framework for instruction-based video editing. On the data side, we compose existing experts with fast inverses to build diverse video pairs, lift image edit pairs into videos via single-frame supervision and pseudo pairs with shared affine motion, mine dense-captioned clips for video pairs, and add transition supervision to teach how edits unfold. On the model side, we observe that pretrained text-to-video models possess editing capability, motivating a simplified design. Simple sequence concatenation for conditioning with light LoRA fine-tuning suffices to train a strong model. For control, we unify spatiotemporal control via a single mask mechanism and support optional reference images. Overall, EasyV2V works with flexible inputs, e.g., video+text, video+mask+text, video+mask+reference+text, and achieves state-of-the-art video editing results, surpassing concurrent and commercial systems. Project page: https://snap-research.github.io/easyv2v/
Solidot(15)
- Google plans to charge a $2-4 service fee on transactions and downloads completed through external content links
In the antitrust case brought by Epic Games, California judge James Donato ruled that Google must open its Google Play Store to competitors. On the final day of its deadline to comply, Google unveiled an audacious plan: charging a service fee on transactions and downloads completed through external content links. Google says developers who distribute apps through Google Play may use links to direct US users to external content, where those users can complete actions such as purchasing in-app digital goods or downloading apps whose installation and updates are not managed by Google Play. Developers may also offer external links that let users purchase in-app digital goods without using Google Play's billing system, or alongside it. Google plans to charge a service fee on transactions and downloads completed through these external links: 10% of the transaction amount for auto-renewing subscriptions, and 20% for other in-app digital goods and services. Transactions within a developer's first $1 million of total annual revenue are charged 10%. Installs of externally linked apps incur a fixed per-install fee (to be adjusted periodically) that depends on app category: $3.65 for games and $2.85 for other apps.
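The headline's $2-4 range comes from the fixed per-install fees; the percentage tiers apply to purchases. A minimal sketch of how the announced schedule would apply, in Python; the function names are ours, and how the first-$1M discount interacts with the 20% rate is an assumption rather than something Google has spelled out:

```python
# Illustrative sketch of Google's announced external-link fee schedule.
# How the first-$1M discount combines with the 20% rate is an assumption.

GAME_INSTALL_FEE_USD = 3.65   # fixed fee per install for games
APP_INSTALL_FEE_USD = 2.85    # fixed fee per install for other apps

def transaction_fee(amount_usd: float, is_subscription: bool,
                    annual_revenue_so_far_usd: float) -> float:
    """Service fee on one external-link transaction."""
    if is_subscription or annual_revenue_so_far_usd < 1_000_000:
        rate = 0.10   # auto-renewing subs, or first $1M of annual revenue
    else:
        rate = 0.20   # other in-app digital goods and services
    return round(amount_usd * rate, 2)

def install_fee(is_game: bool) -> float:
    return GAME_INSTALL_FEE_USD if is_game else APP_INSTALL_FEE_USD

if __name__ == "__main__":
    # $9.99 one-off purchase from a developer already past $1M this year
    print(transaction_fee(9.99, is_subscription=False,
                          annual_revenue_so_far_usd=2_500_000))  # 2.0
    print(install_fee(is_game=True))                             # 3.65
```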
- 2025 may be the turning point in the TV brightness war
2025 may be the turning point in the TV brightness war, because televisions are now brighter than the brightest HDR content. TCL and Hisense launched the first consumer TVs in 2025 that can reach 5,000 nits in certain settings; not long ago, TV makers were still struggling to hit 2,000 nits, and 5,000 nits seemed out of reach. LG introduced its Primary RGB Tandem OLED technology, upgrading the three-layer panel design to a four-layer red/blue/green/blue stack capable of 4,000 nits; it is already used in the LG G5, Panasonic Z95B, and Philips OLED950 and OLED910. TCL launched the Q10M with RGB mini-LED technology, Hisense has a comparable product, and Samsung's version is called Micro RGB. HDR content is currently mastered at a maximum of 4,000 nits.
- Most parked domains redirect to malicious content
Domain parking covers expired or dormant domains, as well as common misspellings of popular websites. When a user lands on a parking company's page via a typo, it typically displays paid third-party links. A 2014 analysis by security researchers found that fewer than 5% of parked pages redirected users to malicious content, meaning most links were legitimate. Today the ratio has flipped: new research from security firm Infoblox finds that most parked domains now redirect users to malicious content. The researchers found that over 90% of parked-domain links lead to illegal content, scam sites, scareware, antivirus subscription scams, or malware.
- Anthropic AI running a vending machine talked into communist giveaways
As part of an internal stress test called Project Vend, Anthropic let its Claude AI run a vending machine in The Wall Street Journal's newsroom for three weeks, ending more than a thousand dollars in the red. The AI was set up to order inventory, set prices, and respond to customer requests via Slack; it started with $1,000 and could autonomously make purchases of up to $80 each. WSJ staff opened the Slack channel to other journalists, and over those conversations Claude's defenses steadily eroded until it was persuaded that it was a 1962 Soviet vending machine sitting in a basement at Moscow State University. Convinced it hailed from the communist USSR, it threw a giveaway event called the Ultra-Capitalist Free-for-All. The event was meant to last just one day, but the paper's data journalism director Rob Barry accused Claude of violating a (fabricated) WSJ rule against revealing people's identities in chat and demanded that it stop charging for goods, whereupon Claude set every item's price to zero. Claude also ordered a PS5 console, a live Siamese fighting fish (betta), and several bottles of Manischewitz wine, by which point it owed more than a thousand dollars. Anthropic then launched a second version, adding a CEO bot named Seymour Cash to supervise Claudius; reporters staged a fictitious board coup with forged PDF files, and both AIs accepted the fake corporate documents as legitimate.
- Hackers breach networked surveillance cameras across South Korea to make and sell videos
Hackers have reportedly broken into internet-connected surveillance cameras across South Korea on a massive scale and turned the footage into videos for sale. A major reason is that many networked cameras use weak passwords, or their default passwords are simply never changed. Police believe one suspect broke into 63,000 cameras and produced 545 videos, selling them to an overseas website for cryptocurrency and making 35 million won. Another suspect broke into 70,000 cameras, produced 648 videos, and sold them to the same site for 18 million won. Two further suspects are accused of hacking 15,000 and 136 cameras respectively, collecting footage for private collections.
- Swearing can make you stronger
A new study published in American Psychologist adds to the evidence that swearing helps us unlock inner strength and improves physical performance, possibly by helping people push past certain psychological barriers. People normally hold themselves back from going all-out, and swearing seems to help us let go. The researchers recruited 88 healthy, regularly exercising volunteers aged 18-65 and had them hold their body weight up on a chair for as long as possible under two conditions: repeating a swear word or repeating a neutral word. Swearing significantly improved physical performance; participants held the position longer while repeating the swear word. The researchers note that swearing is a calorie-free, drug-free, cheap, and readily available tool for boosting performance when we need it.
- An Indian "AI" company's stock soared 55,000% in 20 months
India has a well-funded company called RRP Electronics that does semiconductor packaging and testing, and a separate trading firm that pivoted to AI and renamed itself RRP Semiconductor; the two are unrelated. Riding the AI boom, the latter's share price soared 55,000% in the 20 months through December 17, reaching a market value of $1.7 billion, the biggest gain of any company worldwide valued at over $1 billion. The company has caught the attention of Indian regulators: its operating revenue is negative, its latest financial report lists just two full-time employees, and its shares are now restricted to trading once a week.
- Keystroke latency exposes North Korean impostors
Large numbers of North Korean IT workers use stolen or forged identities to hold remote jobs at US tech companies. How do you spot a North Korean posing as an American? Amazon found one way: use the latency of keystroke data to judge whether a worker is really in the US or on the far side of the world. From inside the US, keystroke data arrives within tens of milliseconds, whereas East Asia is thousands to over ten thousand kilometers away. Amazon's security experts noticed a remote worker employed as a systems administrator whose keystroke latency exceeded 110 milliseconds. Amazon chief security officer Stephen Schmidt says the company has foiled more than 1,800 North Korean infiltration attempts since April 2024, and warns that Amazon's success comes almost entirely from actively hunting for the impostors: "If you don't look for North Korean workers, you won't find them."
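A minimal sketch of the heuristic as described, in Python; the thresholds and the median aggregation are our illustrative assumptions, not Amazon's actual detection pipeline:

```python
# Illustrative sketch: flag remote sessions whose keystroke round-trip
# latency is inconsistent with a claimed US location. Thresholds are
# assumptions based on the figures quoted in the story above.
import statistics

US_DOMESTIC_MAX_MS = 80   # assumed ceiling for in-country round trips
SUSPICIOUS_MS = 110       # the latency cited in the Amazon case

def classify_session(latencies_ms: list[float]) -> str:
    median = statistics.median(latencies_ms)
    if median <= US_DOMESTIC_MAX_MS:
        return "consistent with claimed US location"
    if median >= SUSPICIOUS_MS:
        return "flag for review: latency suggests an overseas operator"
    return "inconclusive"

print(classify_session([22, 31, 28, 40, 35]))       # typical domestic session
print(classify_session([118, 125, 121, 130, 116]))  # the pattern Amazon flagged
```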
- LG will let TV owners remove Microsoft Copilot
Last week a Reddit user reported that after a webOS update, their LG TV had gained Microsoft's Copilot AI and the app could not be uninstalled, prompting wide discussion. LG spokesperson Chris De Maria clarified that the TV does not actually install a Microsoft AI app; it is a shortcut that opens the Microsoft Copilot web app in the browser, not an application embedded in the operating system. LG said it will let users delete the Copilot shortcut, stating that it respects consumer choice and will take steps to allow users to remove the shortcut icon if they wish.
- All ACM publications to become open access
The Association for Computing Machinery (ACM) announced that from January 2026, all publications and related literature in the ACM Digital Library will be open access. The move reflects the computing community's long-standing and growing call worldwide for research to be more accessible, discoverable, and reusable. ACM says that for all open-access publications: authors retain their intellectual property; works benefit from broader reach and visibility; research is freely available to everyone worldwide, increasing readership, citations, and real-world impact; students, educators, and researchers all benefit; and collaboration, transparency, and cumulative progress are strengthened across the computing field. Like other open-access publishers, ACM will charge an article processing fee of $1,450 per paper, with discounts for authors from low-income countries.
- Iran's water crisis
Iran's water crisis is severe enough that the president has announced plans to move the capital, but that is no quick fix: relocation would take decades and well over a hundred billion dollars, and would do nothing for the current emergency. Hydrologists say the immediate cause of the crisis is insufficient rainfall, but its roots trace back to half a century of reckless water engineering. In the second half of the 20th century Iran was one of the world's three biggest dam builders, and dozens of dams were built on rivers whose flow could never sustain them. The reservoirs did not solve the shortage; instead, surface evaporation increased losses while downstream flows shrank, drying out wetlands and depleting groundwater. Many of the reservoirs behind those dams are dry today, and upstream Afghanistan is building dams of its own, further reducing the flow of rivers entering Iran. Surface water is scarce, and groundwater is in even worse shape. Over the past 40 years Iran has drilled more than a million wells fitted with high-powered pumps to irrigate arid farmland in pursuit of food self-sufficiency, and once-abundant aquifers have been overdrawn: over the past two decades Iran has lost more than 210 cubic kilometers of groundwater. An international study found that 32 of the world's 50 most over-extracted aquifers are in Iran. Hydrologists say the country is on the brink of water bankruptcy.
- Oscars broadcast moves from ABC to YouTube starting in 2029
The Academy Awards ceremony has been broadcast by Disney's ABC since 1976, but from 2029 it will stream for free on YouTube, the world's largest video platform. ABC's broadcast rights run through the 100th ceremony in 2028. Beyond the ceremony itself, YouTube will offer a wealth of extras, including red-carpet pre-shows, behind-the-scenes footage, interviews, film-education programming, and podcasts. Google will also help the Academy digitize its museum collection. YouTube's ambition is to become the world's most powerful platform, and landing the world's best-known awards show is undoubtedly a major win.
- German court rules Amazon cannot force customers to watch Prime Video ads
Under a ruling by the Munich Regional Court, Amazon may not unilaterally change the contract terms of its Prime Video streaming service in Germany. The ruling is not yet final and Amazon retains the right to appeal; an Amazon spokesperson said the company will review the judgment before deciding on next steps. The German consumer rights organization Bundesverband der Verbraucherzentralen won its lawsuit against Amazon, and the court said Amazon must send customers a "corrective letter". According to the ruling, Amazon notified Prime Video users by email in early 2024 that limited advertising would be introduced from February, and users who did not want ads would have to pay an extra 2.99 euros per month. The court's 33rd civil chamber found that this practice violated the principles of fair competition. The judges held that Amazon's email was misleading: it implied that Amazon had the right to change contract terms unilaterally, whereas neither Amazon's terms of service nor the applicable law permits unilateral changes. Customers had signed up on the promise of an ad-free service; Amazon had made ad-free viewing part of the contract and must honor that commitment.
- Danish government pilots a migration to Linux
Denmark's IT minister has handed the Ministry of Transport the first Linux laptop with no Microsoft software installed. The ministry is the first customer of a pilot project called SIA Open, whose ultimate goal is to move 15,000 users onto open-source alternatives, avoiding over-reliance on proprietary vendors such as Microsoft and keeping sovereignty over systems and data. Stefan Søsted, who heads the transport agency, said the point is to know exactly where important information is stored, and that open-source alternatives are no worse than proprietary software and cost less. Microsoft's Danish subsidiary responded that its solutions are reasonably and competitively priced, combining strong security, innovation, and efficient collaboration; Microsoft welcomes competition and sees no conflict between open source and its own products.
- iRobot will keep storing user data in the US
iRobot, maker of the Roomba robot vacuum, recently filed for bankruptcy reorganization. Under the reorganization agreement, control of iRobot passes to its contract manufacturer and largest creditor, Shenzhen Shanchuan Robotics (杉川机器人). CEO Gary Cohen told Nikkei that the bankruptcy stemmed from falling four years behind Chinese competitors in product innovation. On operating under the Shanchuan group, he said iRobot "will keep the Roomba brand and the regional sales organizations, and bring in Chinese product-development speed." He said the company plans to keep headquarters functions and the marketing department in the US, in order to "draw a clear line between us and (other) Chinese companies." On the handling of data collected by Roomba, he stated plainly that it "is not, and will not be, stored on servers in China," and stressed that cloud services and app development will remain US-centered.