Bug 1660235 (CVE-2018-19787)

Summary: CVE-2018-19787 python-lxml: XSS in lxml.html.clean module in lxml/html/clean.py
Product: [Other] Security Response Reporter: Laura Pardo <lpardo>
Component: vulnerability Assignee: Red Hat Product Security <security-response-team>
Status: CLOSED ERRATA QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: unspecified CC: abhgupta, apevec, dbaker, jjoyce, jmoran, jokerman, jpopelka, jschluet, lhh, lpeer, mburns, sclewis, slinaber, sthangav, trankin
Target Milestone: --- Keywords: Security
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: python-lxml 4.2.5
Last Closed: 2021-10-27 03:19:53 UTC
Bug Depends On: 1660236, 1660980, 1660981, 1662779, 1662780, 1662781, 1662782, 1662783, 1662784, 1662785    
Bug Blocks: 1660239    

Description Laura Pardo 2018-12-17 22:20:55 UTC
An issue was discovered in lxml before 4.2.5. lxml/html/clean.py in the lxml.html.clean module does not remove javascript: URLs that use escaping, allowing a remote attacker to conduct XSS attacks, as demonstrated by "j a v a s c r i p t:" in Internet Explorer. This is a similar issue to CVE-2014-3146. 


References:
https://github.com/lxml/lxml/commit/6be1d081b49c97cfd7b3fbd934a193b668629109

Comment 1 Laura Pardo 2018-12-17 22:21:07 UTC
Created python-lxml tracking bugs for this issue:

Affects: fedora-all [bug 1660236]

Comment 2 Scott Gayou 2018-12-19 18:00:41 UTC
This is easy to reproduce. For example, the cleaner should reduce '<a href="javascrip%20t%20:evil_function()">poc</a>' to '<a href="">poc</a>', but it does not.

Apparently Internet Explorer can somehow execute "j a v a s c r i p t:" (with spaces). I don't have any experience with that, but I'll trust upstream.
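
For illustration, here is a minimal sketch of the kind of normalization that catches the payload above, using only the Python standard library. The helper names `normalize_link` and `looks_like_script_url` are hypothetical, and this is not lxml's code; it assumes that percent-decoding the link and dropping whitespace and control characters before the scheme check is enough to expose the hidden "javascript:" prefix. The upstream commit in the references is the authoritative fix.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical helpers for illustration only; not taken from lxml.
# Whitespace and ASCII control characters that could split a scheme name.
_WHITESPACE_AND_CONTROL = re.compile(r"[\s\x00-\x1f]+")
_JAVASCRIPT_SCHEME = re.compile(r"^javascript:", re.IGNORECASE)


def normalize_link(link: str) -> str:
    """Percent-decode the link, then drop whitespace and control characters."""
    return _WHITESPACE_AND_CONTROL.sub("", unquote_plus(link))


def looks_like_script_url(link: str) -> bool:
    """Return True if the normalized link starts with the javascript: scheme."""
    return bool(_JAVASCRIPT_SCHEME.match(normalize_link(link)))


if __name__ == "__main__":
    for candidate in (
        "javascrip%20t%20:evil_function()",  # payload from the comment above
        "j a v a s c r i p t:evil_function()",
        "https://example.com/a%20b",
    ):
        print(candidate, looks_like_script_url(candidate))
```

A cleaner built along these lines would run the check on the normalized copy only and leave the original attribute value untouched when the link is harmless.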