The lxml.html.clean module cleans up HTML by removing embedded or script content, special tags, CSS style annotations and much more. It was found that the clean_html() function, provided by the lxml.html.clean module, did not properly clean HTML input that included non-printing control characters (\x01-\x08). A remote attacker could use this flaw to serve malicious content to an application that uses clean_html() to process HTML, possibly allowing the attacker to inject malicious code into a website generated by that application.
This issue has been reported upstream, and a patch is available.
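To illustrate the class of problem described above, here is a minimal stdlib-only sketch. It is not the upstream patch and does not use lxml. It assumes, for illustration only, that the control characters are used to disguise a javascript: URL scheme; the function names strip_control_chars and looks_like_script_url and the sample payload are hypothetical.

```python
import re

# Control characters in the range named in the report (\x01-\x08).
_CONTROL_CHARS = re.compile(r"[\x01-\x08]")


def strip_control_chars(text):
    # Remove the non-printing characters so they cannot hide a keyword.
    return _CONTROL_CHARS.sub("", text)


def looks_like_script_url(url):
    # Hypothetical check: normalize first, then test the scheme.
    normalized = strip_control_chars(url).strip().lower()
    return normalized.startswith("javascript:")


# Illustrative payload (an assumption, not taken from the report).
payload = "java\x01script:alert(1)"

# A check that does not normalize misses the disguised scheme.
naive = payload.lower().startswith("javascript:")

# The normalizing check catches it.
hardened = looks_like_script_url(payload)
```

The point of the sketch is only the ordering: normalization has to happen before any pattern matching, otherwise a single embedded control character can defeat the match.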
Created python-lxml tracking bugs for this issue:
Affects: epel-5 [bug 1092614]
Is this patch included in 3.3.5? The release notes would seem to indicate that it was, but I want to make sure. I submitted 3.3.5 packages yesterday but it does not look like they have been pushed to the mirrors yet.
(In reply to Jeffrey C. Ollie from comment #2)
> Is this patch included in 3.3.5? The release notes would seem to indicate
> that it was, but I want to make sure. I submitted 3.3.5 packages yesterday
> but it does not look like they have been pushed to the mirrors yet.
I checked the sources of the builds you link to and they both include the patch that fixes this issue. Both builds are in testing and will be pushed to stable updates once they get enough karma or the in-testing time passes.
python-lxml-3.3.5-1.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
python-lxml-3.3.5-1.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.
Red Hat Product Security has rated this issue as having Low security impact. This issue is not currently planned to be addressed in future updates. For additional information, refer to the Issue Severity Classification: https://access.redhat.com/security/updates/classification/.