Bug 2491579 (CVE-2026-53923) - CVE-2026-53923 vllm: vLLM: Information disclosure via integer truncation
Summary: CVE-2026-53923 vllm: vLLM: Information disclosure via integer truncation
Keywords:
Status: NEW
Alias: CVE-2026-53923
Product: Security Response
Classification: Other
Component: vulnerability
Version: unspecified
Hardware: All
OS: Linux
low
low
Target Milestone: ---
Assignee: Product Security
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2026-06-22 23:01 UTC by OSIDB Bzimport
Modified: 2026-06-29 04:37 UTC (History)
7 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments (Terms of Use)

Description OSIDB Bzimport 2026-06-22 23:01:14 UTC
vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0.


Note You need to log in before you can comment on or make changes to this bug.