Bug 2444608 (CVE-2026-0847)

Summary: CVE-2026-0847 nltk: NLTK: Arbitrary file read via path traversal vulnerability
Product: [Other] Security Response
Reporter: OSIDB Bzimport <bzimport>
Component: vulnerability
Assignee: Product Security DevOps Team <prodsec-dev>
Status: NEW ---
QA Contact:
Severity: high
Docs Contact:
Priority: high
Version: unspecified
CC: anpicker, bparees, dschmidt, ebourniv, erezende, hasun, jfula, jkoehler, jlanda, jowilson, jwong, kshier, lgallett, lphiri, nyancey, omaciel, ometelka, ptisnovs, sbunciak, simaishi, smcdonal, stcannon, syedriko, teagle, ttakamiy, xdharmai, yguenane
Target Milestone: ---
Keywords: Security
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: ---
Doc Text:
A flaw was found in NLTK (Natural Language Toolkit). Several CorpusReader classes, including WordListCorpusReader, TaggedCorpusReader, and BracketParseCorpusReader, do not properly sanitize file paths, which allows a remote attacker to read arbitrary files on the server. By exploiting this path traversal vulnerability, an attacker can gain unauthorized access to sensitive information, such as system files, SSH private keys, and API tokens. In certain scenarios, this could lead to remote code execution.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description OSIDB Bzimport 2026-03-04 19:01:24 UTC
A vulnerability in NLTK versions up to and including 3.9.2 allows arbitrary file read via path traversal in multiple CorpusReader classes, including WordListCorpusReader, TaggedCorpusReader, and BracketParseCorpusReader. These classes fail to properly sanitize or validate file paths, enabling attackers to traverse directories and access sensitive files on the server. The issue is particularly critical where user-controlled file inputs are processed, such as in machine learning APIs, chatbots, or NLP pipelines. Exploitation can lead to unauthorized access to sensitive files, including system files, SSH private keys, and API tokens, and may escalate to remote code execution when combined with other vulnerabilities.
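
For illustration only, the kind of path validation described above as missing could be applied by a caller before a user-supplied file identifier reaches a corpus reader. This is a minimal sketch under stated assumptions, not the upstream NLTK fix: `resolve_inside_root` is a hypothetical helper name, and the code uses only the Python standard library rather than any NLTK API.

```python
from pathlib import Path


def resolve_inside_root(root, fileid):
    """Resolve fileid against root and reject anything that escapes root.

    Hypothetical helper for illustration; it is not part of NLTK.
    """
    root_path = Path(root).resolve()
    # Joining with an absolute fileid discards root_path entirely, and
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below covers all three cases.
    candidate = (root_path / fileid).resolve()
    if root_path not in candidate.parents:
        raise ValueError(f"file id escapes corpus root: {fileid!r}")
    return candidate


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        corpus_root = os.path.join(tmp, "corpus")
        os.mkdir(corpus_root)
        # A plain relative name stays inside the corpus root.
        print(resolve_inside_root(corpus_root, "words.txt"))
        # Traversal and absolute paths are rejected.
        for untrusted in ("../secret.txt", "/etc/passwd"):
            try:
                resolve_inside_root(corpus_root, untrusted)
            except ValueError as err:
                print("rejected:", err)
```

A service that accepts file names from users would call such a check first and pass only the validated path onward. Checking the resolved path, rather than searching the input string for "..", is what lets the same test also reject absolute paths and symlinks that point outside the corpus root.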