Bug 460396 - libxml2's fix for recent security errata can break yum with RHN/satellite
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libxml2
Version: 5.2
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Daniel Veillard
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-08-28 02:54 UTC by James Antill
Modified: 2008-09-08 16:49 UTC (History)

Fixed In Version: RHSA-2008-0836
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-09-02 18:32:48 UTC
Target Upstream Version:
Embargoed:


Attachments
RHEL-5 Patch corresponding to the new upstream fix (12.68 KB, patch)
2008-08-28 20:07 UTC, Daniel Veillard
RHEL-4 Patch corresponding to the new upstream fix (12.68 KB, patch)
2008-08-28 20:30 UTC, Daniel Veillard
RHEL-3 Patch corresponding to the new upstream fix (12.03 KB, patch)
2008-08-28 21:13 UTC, Daniel Veillard
RHN rpm ChangeLog data, from 2008-08-28. Has large number of named entities (11.45 MB, application/octet-stream)
2008-08-28 23:56 UTC, James Antill
RHEL-2.1 Patch corresponding to the new upstream fix (15.55 KB, patch)
2008-08-29 11:41 UTC, Daniel Veillard
first test for bad behaviour (1.67 KB, text/plain)
2008-09-01 14:03 UTC, Daniel Veillard
second test for bad behaviour (1.68 KB, text/plain)
2008-09-01 14:04 UTC, Daniel Veillard
dtd for 3rd test of bad behaviour (1.62 KB, text/plain)
2008-09-01 14:05 UTC, Daniel Veillard
third test of bad behaviour (86 bytes, text/plain)
2008-09-01 14:06 UTC, Daniel Veillard
fourth test of bad behaviour (40.44 KB, text/plain)
2008-09-01 14:07 UTC, Daniel Veillard
fifth test of bad behaviour (1.55 KB, text/plain)
2008-09-01 14:08 UTC, Daniel Veillard
sixth test of bad behaviour (52.00 KB, text/plain)
2008-09-01 14:09 UTC, Daniel Veillard


Links
System: Red Hat Product Errata
ID: RHBA-2008:0878
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libxml2 bug fix update
Last Updated: 2008-09-02 18:36:17 UTC

Description James Antill 2008-08-28 02:54:44 UTC
Description of problem:
 The recent libxml2 errata:

http://rhn.redhat.com/errata/RHSA-2008-0836.html

..."fixes" the problem by strictly limiting entities to 500,000. This was almost instantly noticed in Fedora, as yum-metadata-parser (via createrepo) then died parsing the XML metadata (because it tends to use a lot of &gt; etc.).

 RHN currently _only_ supplies XML metadata to yum, so at any point if we hit 500,000 entities yum will just die.
 This is a regression of the API, and needs a better fix.

 Also, even if we are extremely careful, so we can guarantee RHN doesn't produce this error ... we have no control over what customers are putting into satellite etc.

 Also, personally, I think this is a very bad fix for any customers who are using the libxml2 API. ... they now have to "know" if their document might have "too many" entities in it, and act accordingly.

Version-Release number of selected component (if applicable):
libxml2-2.6.16-12.3

How reproducible:
 Always

Comment 1 James Antill 2008-08-28 04:10:50 UTC
 I assumed we wouldn't have hit it already, given it went through QA, but that was apparently optimistic...

 "yum makecache"

...or anything that downloads the other.xml (changelog data) kills yum due to this problem.

 This also means you can't download all packages from RHN and run createrepo on them anymore, as createrepo will die.

 As a quick hack I did:

# zcat /var/cache/yum/rhel-x86_64-server-5/other.xml.gz | \
  perl -nle ' while (/\&/g) { ++$tot; } END { print $tot; }' 
1086853

...which implies if we need to do a _quick hack_ to raise the limit, we can probably get away with about 1.5 million instead of 500,000.
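
 For anyone who wants to repeat that count without perl, here is a rough Python sketch of the same idea. It is not taken from yum or createrepo; the function name and the regular expression are assumptions made for illustration. Unlike the one-liner above, which counts every "&" character, it counts only well-formed references and splits them into named and numeric ones.

```python
import gzip
import re

# A reference is '&', then a name or '#'-number, then ';'.
ENTITY_REF = re.compile(rb"&(#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z_:][\w.:-]*);")


def count_entity_refs(path):
    """Return (named, numeric) reference counts for a gzipped XML file."""
    named = numeric = 0
    with gzip.open(path, "rb") as fh:
        # Read line by line so a large other.xml.gz is never held in memory.
        for line in fh:
            for match in ENTITY_REF.finditer(line):
                if match.group(1).startswith(b"#"):
                    numeric += 1
                else:
                    named += 1
    return named, numeric
```

 Calling count_entity_refs() on the other.xml.gz path used above would give the two totals separately, which is the split that matters if only named entities count against the limit.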

Comment 3 Daniel Veillard 2008-08-28 11:45:55 UTC
I'm looking at getting better algorithms, but it's excruciatingly hard.
Same level as designing an OOM killer that will only catch
offending processes!
Purely raising that limit is not a much better solution.
You may change one knob or another, but the varieties of exhaustion
possible are really harder to handle than you seem to think.

Daniel

Comment 4 James Antill 2008-08-28 13:54:34 UTC
 Daniel confirmed that if RHN metadata generation moves from named entities to character entities (numeric character references), libxml2 will be happy, which seems at least a viable short-term fix to give Daniel some breathing room.

 E.g. instead of &lt; the metadata would have &#60;
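
 A minimal sketch of that substitution, assuming only the five entities predefined by XML 1.0 are in play. The names PREDEFINED, NAMED_REF and to_char_refs are hypothetical and not part of the RHN metadata generator.

```python
import re

# The five entities predefined by XML 1.0 and their code points.
PREDEFINED = {"lt": 60, "gt": 62, "amp": 38, "quot": 34, "apos": 39}
NAMED_REF = re.compile(r"&(lt|gt|amp|quot|apos);")


def to_char_refs(xml_text):
    """Rewrite predefined named entity references as decimal character references."""
    return NAMED_REF.sub(lambda m: "&#%d;" % PREDEFINED[m.group(1)], xml_text)
```

 This is a purely textual rewrite, so it would also touch the same strings inside CDATA sections, where they are literal text. A generator is probably better off emitting the numeric form when it serializes, rather than post-processing finished XML.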

Comment 5 Daniel Veillard 2008-08-28 20:07:43 UTC
Created attachment 315291 [details]
RHEL-5 Patch corresponding to the new upstream fix

Comment 7 Daniel Veillard 2008-08-28 20:30:30 UTC
Created attachment 315294 [details]
RHEL-4 Patch corresponding to the new upstream fix

Comment 10 Daniel Veillard 2008-08-28 21:13:46 UTC
Created attachment 315298 [details]
RHEL-3 Patch corresponding to the new upstream fix

Comment 11 Daniel Veillard 2008-08-28 21:18:53 UTC
Any possibility of getting an attachment of a compressed XML file generated
for yum which broke the old version, for testing purposes?

xmllint --noout --noent yum-test.xml.gz

should work as a test, since xmllint accepts gzip-compressed input files.
If the file still fails to parse, either the bug is not fixed or the generated
XML is broken ;-)

Daniel

Comment 12 James Antill 2008-08-28 23:56:57 UTC
Created attachment 315310 [details]
RHN rpm ChangeLog data, from 2008-08-28. Has large number of named entities

 Sure, pretty much any other.xml.gz will probably do as a test. Here's the current one (bzip2'd to get around the BZ upload limits):

Comment 14 Daniel Veillard 2008-08-29 07:45:15 UTC
Thanks James. At 117 MBytes it is starting to be a nice beast; once gzipped
again it makes a good regression test:

wei:~/XML -> time xmllint --noent --stream ../yum.xml.gz 

real    0m4.856s
user    0m4.833s
sys     0m0.020s
wei:~/XML -> cat .memdump 
      09:39:28 AM

      MEMORY ALLOCATED : 0, MAX was 63738
BLOCK  NUMBER   SIZE  TYPE
wei:~/XML -> 

  No error indicates parsing went well... and with my debug version,
configured to track memory allocation, I see no lost blocks and very
reasonable memory consumption.
  Maybe we need to add this to our regression tests somehow (maybe
using valgrind instead to track potential memory leaks).

  For the make check tests upstream I added a program generating
a million-line document, each line using entities with a bit of
recursion, so any similar error will be caught before being pushed
in the future,

  thanks again, and sorry for the troubles,

Daniel
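
The upstream generator itself is not attached to this bug. As a rough illustration of the kind of document described above, here is a Python sketch; the entity names, element names and function name are all assumptions, and "a bit of recursion" is interpreted here as entities whose values reference other entities.

```python
import sys


def generate_document(lines, out):
    """Write an XML document with `lines` rows, each referencing nested entities."""
    out.write('<?xml version="1.0"?>\n')
    out.write("<!DOCTYPE doc [\n")
    out.write('  <!ENTITY a "A">\n')
    out.write('  <!ENTITY b "&a;B">\n')  # one level of nesting
    out.write('  <!ENTITY c "&b;C">\n')  # two levels of nesting
    out.write("]>\n")
    out.write("<doc>\n")
    for i in range(lines):
        # Each row mixes a nested entity with predefined ones.
        out.write("<row n='%d'>&c; &lt; &amp; &gt;</row>\n" % i)
    out.write("</doc>\n")


if __name__ == "__main__":
    generate_document(1000000, sys.stdout)
```

Saved under some name and piped through gzip, the output can be fed to the same xmllint --noent --stream invocation shown earlier in this comment. A correct parser should accept it, since the expansion per line is small and bounded.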

Comment 15 Daniel Veillard 2008-08-29 11:41:17 UTC
Created attachment 315351 [details]
RHEL-2.1 Patch corresponding to the new upstream fix

This still requires increasing the size of the entity structure.
For libxml2-2.4.x there is really no placeholder to collect the
required information.

Comment 19 Daniel Veillard 2008-09-01 14:03:57 UTC
Created attachment 315476 [details]
first test for bad behaviour

Comment 20 Daniel Veillard 2008-09-01 14:04:52 UTC
Created attachment 315477 [details]
second test for bad behaviour

Comment 21 Daniel Veillard 2008-09-01 14:05:44 UTC
Created attachment 315478 [details]
dtd for 3rd test of bad behaviour

Comment 22 Daniel Veillard 2008-09-01 14:06:45 UTC
Created attachment 315479 [details]
third test of bad behaviour

Comment 23 Daniel Veillard 2008-09-01 14:07:33 UTC
Created attachment 315480 [details]
fourth test of bad behaviour

Comment 24 Daniel Veillard 2008-09-01 14:08:26 UTC
Created attachment 315481 [details]
fifth test of bad behaviour

Comment 25 Daniel Veillard 2008-09-01 14:09:14 UTC
Created attachment 315482 [details]
sixth test of bad behaviour

Comment 26 Daniel Veillard 2008-09-01 14:13:45 UTC
To use the six tests I just added, save them on a local file system
along with lol3.dtd, and try in sequence

xmllint --noent --noout --loaddtd lolX.xml

The program should return nearly immediately, not consume much memory,
and raise at least one error about "Detected an entity reference loop"

Daniel
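
To illustrate why such inputs can be rejected quickly and cheaply, here is a small Python sketch that computes how large an entity would become after full expansion without ever building the expanded text, and that reports a reference loop when one exists. It is not libxml2's algorithm, and the contents of the attached test files are not reproduced here; the entity table below is a hypothetical example and the function name is an assumption.

```python
import re

REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expanded_size(entities, name, _active=None, _memo=None):
    """Length of `name` after full expansion, computed without expanding it."""
    active = set() if _active is None else _active
    memo = {} if _memo is None else _memo
    if name in memo:
        return memo[name]
    if name in active:
        raise ValueError("entity reference loop through %r" % name)
    active.add(name)
    value = entities[name]
    size = len(REF.sub("", value))      # literal characters
    for ref in REF.findall(value):      # plus each referenced entity
        size += expanded_size(entities, ref, active, memo)
    active.remove(name)
    memo[name] = size
    return size


# Hypothetical nested entities, each ten times larger than the previous one.
table = {
    "lol": "lol",
    "lol1": "&lol;" * 10,
    "lol2": "&lol1;" * 10,
    "lol3": "&lol2;" * 10,
}
print(expanded_size(table, "lol3"))  # 3000
```

Because each entity's size is memoized, the cost grows with the size of the declarations rather than with the size of the expansion, so a parser using a check along these lines could refuse an oversized or looping document before allocating anything significant.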

Comment 29 errata-xmlrpc 2008-09-02 18:32:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0878.html

