CVE-2025-29783
Published: 19 March 2025
Description
Adversaries may exploit remote services to gain unauthorized access to internal systems once inside of a network.
Security Summary
CVE-2025-29783 is a remote code execution vulnerability in vLLM, a high-throughput and memory-efficient inference and serving engine for large language models (LLMs). The flaw occurs when vLLM is configured to use Mooncake to distribute the key-value (KV) cache across hosts: in that configuration, unsafe deserialization is exposed directly over ZMQ/TCP on all network interfaces. Any deployment configured this way is affected. The issue is classified under CWE-502 (Deserialization of Untrusted Data), with a CVSS v3.1 base score of 9.0 (AV:A/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H).
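To illustrate the CWE-502 class of flaw in general terms, the following minimal sketch shows why deserializing untrusted bytes with Python's pickle module can run attacker-chosen code, and one way to refuse such input. It is not vLLM or Mooncake code and does not reflect how the actual fix was implemented; the names Payload, NoGlobalsUnpickler, and safe_loads are hypothetical, and a harmless eval of "6 * 7" stands in for a real payload.

```python
import io
import pickle


class Payload:
    """Hypothetical attacker-controlled object.

    __reduce__ tells pickle which callable to invoke when the bytes are
    loaded. A harmless arithmetic eval stands in for arbitrary code here.
    """

    def __reduce__(self):
        return (eval, ("6 * 7",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global lookup.

    Plain containers (dict, list, str, int, ...) still load, but any
    reference to a callable or class is rejected before it can run.
    """

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Load pickle bytes while rejecting all globals (illustrative helper)."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# Bytes as they might arrive from an untrusted network peer.
untrusted_bytes = pickle.dumps(Payload())

if __name__ == "__main__":
    # The embedded callable runs as a side effect of loading.
    print("pickle.loads result:", pickle.loads(untrusted_bytes))

    try:
        safe_loads(untrusted_bytes)
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```

In a networked service, the safer general pattern is to avoid pickle for data received from peers, for example by using a data-only format, and to bind listeners only to the interfaces that need them.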
An attacker with adjacent network access and low privileges can exploit this vulnerability with low complexity and without user interaction. Successful exploitation allows remote code execution on the distributed hosts, with high impact on confidentiality, integrity, and availability. The vector's changed scope (S:C) indicates that the impact can extend beyond the vulnerable component itself.
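As a worked check of how the vector in the summary yields a base score of 9.0, the sketch below applies the CVSS v3.1 base score equations with the metric weights from the public specification. The function names base_score and roundup are illustrative, and the code handles base metrics only (no temporal or environmental metrics).

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector such as 'AV:A/AC:L/...'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    # Impact sub-score.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    # Exploitability sub-score.
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    if changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("AV:A/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H"))
```

For this vector the exploitability sub-score is about 2.27 and the impact sub-score about 6.05; the changed scope applies the 1.08 multiplier before rounding up.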
The vulnerability is fixed in vLLM version 0.8.0. Details of the issue and its fix are available in the GitHub security advisory at https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7, the fixing pull request at https://github.com/vllm-project/vllm/pull/14228, and the commit at https://github.com/vllm-project/vllm/commit/288ca110f68d23909728627d3100e5a8db820aa2.
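A quick way to check whether an environment is at or above the fixed release is to compare the installed package version against 0.8.0. The sketch below is a rough aid under stated assumptions: is_patched and installed_vllm_version are hypothetical helper names, the parser is deliberately simple (the packaging library's version parser is more robust), pre-releases of 0.8.0 are conservatively treated as unpatched, and the check says nothing about whether Mooncake is actually in use.

```python
import re
from importlib import metadata

FIXED = (0, 8, 0)  # first fixed release, per the advisory text above


def is_patched(version: str) -> bool:
    """Return True if the version string is at or above 0.8.0.

    Pre-releases of exactly 0.8.0 (e.g. '0.8.0rc1') are conservatively
    reported as not patched.
    """
    text = version.strip().lower()
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if m is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parsed = (int(m.group(1)), int(m.group(2)), int(m.group(3) or 0))
    suffix = text[m.end():]
    is_prerelease = re.match(r"[-_.]?(a|b|rc|dev)", suffix) is not None
    if parsed == FIXED and is_prerelease:
        return False
    return parsed >= FIXED


def installed_vllm_version():
    """Return the installed vLLM version string, or None if not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vllm is not installed in this environment")
    else:
        status = "at or above 0.8.0" if is_patched(found) else "below 0.8.0"
        print(f"vllm {found}: {status}")
```

Comparing parsed integers rather than raw strings matters here: a string comparison would rank "0.10.1" below "0.8.0".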
This issue is relevant to AI/ML practitioners deploying distributed LLM inference engines, as vLLM is commonly used for high-performance serving of large language models.
Details
- CWE(s): CWE-502 (Deserialization of Untrusted Data)
Affected Products
MITRE ATT&CK Enterprise Techniques
Why these techniques?
The CVE describes a network-exposed unsafe deserialization vulnerability in vLLM, reachable via ZMQ/TCP on all interfaces, that lets an attacker with adjacent network access execute code remotely. This maps to exploitation of remote services to achieve code execution on the target hosts.