Bug 2327501 (CVE-2024-52595)

Summary: CVE-2024-52595 python-lxml-html-clean: HTML Cleaner allows crafted scripts in special contexts like svg or math to pass through
Product: [Other] Security Response Reporter: OSIDB Bzimport <bzimport>
Component: vulnerabilityAssignee: Product Security DevOps Team <prodsec-dev>
Status: NEW --- QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: unspecifiedKeywords: Security
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
A flaw was found in the lxml-html-clean package. The HTML Parser does not properly handle context-switching for special HTML tags such as `<svg>`, `<math>` and `<noscript>`. Specifically, the content in CSS comments is ignored by lxml_html_clean but may be interpreted differently by web browsers, enabling malicious scripts to bypass the cleaning process. This vulnerability could lead to Cross-site scripting (XSS) attacks, compromising the security of users who rely on lxml_html_clean in the default configuration to sanitize untrusted HTML content.
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2327558    
Bug Blocks:    

Description OSIDB Bzimport 2024-11-19 22:01:19 UTC
lxml_html_clean is a project for HTML cleaning functionalities copied from `lxml.html.clean`. Prior to version 0.4.0, the HTML Parser in lxml does not properly handle context-switching for special HTML tags such as `<svg>`, `<math>` and `<noscript>`. This behavior deviates from how web browsers parse and interpret such tags. Specifically, content in CSS comments is ignored by lxml_html_clean but may be interpreted differently by web browsers, enabling malicious scripts to bypass the cleaning process. This vulnerability could lead to Cross-Site Scripting (XSS) attacks, compromising the security of users relying on lxml_html_clean in default configuration for sanitizing untrusted HTML content. Users employing the HTML cleaner in a security-sensitive context should upgrade to lxml 0.4.0, which addresses this issue. As a temporary mitigation, users can configure lxml_html_clean with the following settings to prevent the exploitation of this vulnerability. Via `remove_tags`, one may specify tags to remove - their content is moved to their parents' tags. Via `kill_tags`, one may specify tags to be removed completely. Via `allow_tags`, one may restrict the set of permissible tags, excluding context-switching tags like `<svg>`, `<math>` and `<noscript>`.