CVE-2026-34159
Published: 01 April 2026
Description
llama.cpp performs inference of several LLM models in C/C++. Prior to version b8492, the RPC backend's deserialize_tensor() skips all bounds validation when a tensor's buffer field is 0. An unauthenticated attacker can read and write arbitrary process memory via crafted GRAPH_COMPUTE messages. Combined with pointer leaks from ALLOC_BUFFER/BUFFER_GET_BASE, this gives full ASLR bypass and remote code execution. No authentication is required, just TCP access to the RPC server port. This issue has been patched in version b8492.
Mitigating Controls (NIST 800-53 r5)
- Directly remediates the bounds-validation flaw in deserialize_tensor() by requiring timely application of the vendor patch (b8492), preventing arbitrary memory read/write and RCE.
- Requires validation of all information inputs, including crafted GRAPH_COMPUTE messages and tensor buffers, enforcing bounds checks to block memory corruption from improper deserialization.
- Monitors and controls communications at system boundaries to restrict unauthenticated TCP access to the exposed RPC server port, preventing remote exploitation.
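The boundary-protection control can be applied in practice by keeping the RPC port off untrusted networks. A sketch, with illustrative values (the bind flags may vary by build, so check `rpc-server --help`; the 10.0.0.5 client address and port 50052 are assumptions):

```shell
# Bind the RPC server to loopback only so it is unreachable remotely:
rpc-server --host 127.0.0.1 --port 50052

# Or, at the host firewall, allow only a trusted inference front-end
# (10.0.0.5, illustrative) to reach the RPC port and drop everyone else:
sudo iptables -A INPUT -p tcp --dport 50052 -s 10.0.0.5 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 50052 -j DROP
```

Network segmentation is defense in depth, not a substitute for the patch: any host that can still reach the port can exploit the flaw.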
Security Summary
CVE-2026-34159 is a critical vulnerability in llama.cpp, a C/C++ inference engine for large language models (LLMs). In versions prior to b8492, the RPC backend's deserialize_tensor() function skips all bounds validation when a tensor's buffer field is 0, enabling improper handling of memory buffers. This flaw, classified under CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer), carries a CVSS v3.1 base score of 9.8 (AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H).
An unauthenticated attacker with TCP access to the RPC server port can exploit this issue by sending crafted GRAPH_COMPUTE messages to read and write arbitrary process memory. When combined with pointer leaks obtainable via ALLOC_BUFFER and BUFFER_GET_BASE operations, attackers achieve full ASLR bypass, culminating in remote code execution. No privileges or user interaction are required, making it highly accessible over the network.
The vulnerability has been patched in llama.cpp version b8492. Official mitigation details are available in the GitHub security advisory (GHSA-j8rj-fmpv-wcxw), the fixing pull request (#20908), and the commit (39bf0d3c6a95803e0f41aaba069ffbee26721042), which recommend upgrading to the patched version to restore proper bounds checking in deserialize_tensor().
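Since llama.cpp ships fixes as tagged builds rather than semantic versions, upgrading means rebuilding from the patched tag. A sketch (repository URL and build flags are assumptions; adjust to your deployment):

```shell
# Fetch the source and check out the patched release tag named in the advisory:
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
git checkout b8492

# Rebuild, enabling the RPC backend only if you actually need it:
cmake -B build -DGGML_RPC=ON
cmake --build build --config Release
```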
This issue is particularly relevant to AI/ML deployments relying on llama.cpp for efficient LLM inference, as exposed RPC servers could enable compromise of model-serving infrastructure. No public evidence of real-world exploitation has been reported as of the CVE publication on 2026-04-01.
Details
- CWE(s): CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer)
- Affected Products: llama.cpp prior to version b8492
AI Security Analysis
- AI Category: NLP and Transformers
- Risk Domain: N/A
- OWASP Top 10 for LLMs 2025: None mapped
- MITRE ATLAS Techniques: None mapped
- Classification Reason: Matched keywords: llama.cpp, llm
MITRE ATT&CK Enterprise Techniques
- T1190: Exploit Public-Facing Application
Why these techniques?
The vulnerability enables unauthenticated remote code execution by sending crafted RPC messages over TCP to a public-facing llama.cpp RPC server port, directly mapping to exploitation of public-facing applications.