Bug 460396 - libxml2's fix for recent security errata can break yum with RHN/satellite
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libxml2
Version: 5.2
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Daniel Veillard
Keywords: Regression
Reported: 2008-08-27 22:54 EDT by James Antill
Modified: 2008-09-08 12:49 EDT (History)
Fixed In Version: RHSA-2008-0836
Doc Type: Bug Fix
Last Closed: 2008-09-02 14:32:48 EDT

Attachments
RHEL-5 Patch corresponding to the new upstream fix (12.68 KB, patch)
2008-08-28 16:07 EDT, Daniel Veillard
RHEL-4 Patch corresponding to the new upstream fix (12.68 KB, patch)
2008-08-28 16:30 EDT, Daniel Veillard
RHEL-3 Patch corresponding to the new upstream fix (12.03 KB, patch)
2008-08-28 17:13 EDT, Daniel Veillard
RHN rpm ChangeLog data, from 2008-08-28. Has large number of named entities (11.45 MB, application/octet-stream)
2008-08-28 19:56 EDT, James Antill
RHEL-2.1 Patch corresponding to the new upstream fix (15.55 KB, patch)
2008-08-29 07:41 EDT, Daniel Veillard
first test for bad behaviour (1.67 KB, text/plain)
2008-09-01 10:03 EDT, Daniel Veillard
second test for bad behaviour (1.68 KB, text/plain)
2008-09-01 10:04 EDT, Daniel Veillard
dtd for 3rd test of bad behaviour (1.62 KB, text/plain)
2008-09-01 10:05 EDT, Daniel Veillard
third test of bad behaviour (86 bytes, text/plain)
2008-09-01 10:06 EDT, Daniel Veillard
fourth test of bad behaviour (40.44 KB, text/plain)
2008-09-01 10:07 EDT, Daniel Veillard
fifth test of bad behaviour (1.55 KB, text/plain)
2008-09-01 10:08 EDT, Daniel Veillard
sixth test of bad behaviour (52.00 KB, text/plain)
2008-09-01 10:09 EDT, Daniel Veillard

Description James Antill 2008-08-27 22:54:44 EDT
Description of problem:
 The recent libxml2 errata:

http://rhn.redhat.com/errata/RHSA-2008-0836.html

..."fixes" the problem by strictly limiting entities to 500,000. This was noticed almost instantly in Fedora, because yum-metadata-parser (via createrepo) then died parsing the XML metadata, which tends to use a lot of &gt; etc.

 RHN currently _only_ supplies XML metadata to yum, so at any point if we hit 500,000 entities yum will just die.
 This is a regression of the API, and needs a better fix.

 Also, even if we are extremely careful, so we can guarantee RHN doesn't produce this error ... we have no control over what customers are putting into satellite etc.

 Also, personally, I think this is a very bad fix for any customers who are using the libxml2 API. ... they now have to "know" if their document might have "too many" entities in it, and act accordingly.

Version-Release number of selected component (if applicable):
libxml2-2.6.16-12.3

How reproducible:
 Always
Comment 1 James Antill 2008-08-28 00:10:50 EDT
 I assumed we wouldn't have hit it already, given it went through QA, but that was apparently optimistic...

 "yum makecache"

...or anything that downloads the other.xml (changelog data) kills yum due to this problem.

 This also means you can't download all packages from RHN and run createrepo on them anymore, as createrepo will die.

 As a quick hack I did:

# zcat /var/cache/yum/rhel-x86_64-server-5/other.xml.gz | \
  perl -nle ' while (/\&/g) { ++$tot; } END { print $tot; }' 
1086853

...which implies if we need to do a _quick hack_ to raise the limit, we can probably get away with about 1.5 million instead of 500,000.
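
 For anyone who wants a slightly finer count than the perl one-liner, here is a minimal Python sketch that separates named references (&lt;, &amp;, ...) from numeric character references (&#60;, &#x3C;). The function names and the regular expression are illustrative assumptions, not anything yum or libxml2 ships, and unlike the one-liner it ignores bare & characters that are not followed by a name and a semicolon.

```python
import gzip
import re
from collections import Counter

# Matches &name; as well as decimal (&#60;) and hex (&#x3C;) character references.
ENTITY_REF = re.compile(r"&(#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z_:][\w.:-]*);")


def count_references(text):
    """Count named and numeric references in a chunk of XML text."""
    counts = Counter()
    for match in ENTITY_REF.finditer(text):
        kind = "numeric" if match.group(1).startswith("#") else "named"
        counts[kind] += 1
    return counts


def count_in_gzip(path):
    """Stream a gzipped XML file line by line and total the references."""
    total = Counter()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            total.update(count_references(line))
    return total
```

 Calling count_in_gzip() on the other.xml.gz from the yum cache should show how much of the total comes from named entities as opposed to character references.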
Comment 3 Daniel Veillard 2008-08-28 07:45:55 EDT
I'm looking at getting better algorithms but it's excruciatingly hard.
Same level as designing an OOM killer that will only capture
offending processes !
Purely raising that limit is not a much better solution.
You may change one knob or another, but the varieties of possible
exhaustion are really harder to handle than you seem to think.

Daniel
Comment 4 James Antill 2008-08-28 09:54:34 EDT
 Daniel confirmed that if RHN metadata generation moves from named entities to character entities, libxml2 will be happy, which seems at least a viable short-term fix to give Daniel some breathing room.

 E.g. instead of &lt; the metadata would have a numeric character reference such as &#60;
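
 As a rough illustration of that workaround, the sketch below rewrites the five predefined XML named entities as decimal character references. It is only an assumption about how a generator might post-process its output; the names PREDEFINED and to_character_references are made up for this example and are not part of RHN or createrepo.

```python
import re

# The five entities predefined by XML, mapped to their code points.
PREDEFINED = {"lt": 60, "gt": 62, "amp": 38, "quot": 34, "apos": 39}
NAMED_REF = re.compile(r"&(lt|gt|amp|quot|apos);")


def to_character_references(xml_text):
    """Replace predefined named entities with decimal character references."""
    return NAMED_REF.sub(lambda m: "&#%d;" % PREDEFINED[m.group(1)], xml_text)
```

 The substitution is a single pass, so text such as &amp;lt; becomes &#38;lt; and still decodes to the literal string "&lt;" rather than to "<".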
Comment 5 Daniel Veillard 2008-08-28 16:07:43 EDT
Created attachment 315291 [details]
RHEL-5 Patch corresponding to the new upstream fix
Comment 7 Daniel Veillard 2008-08-28 16:30:30 EDT
Created attachment 315294 [details]
RHEL-4 Patch corresponding to the new upstream fix
Comment 10 Daniel Veillard 2008-08-28 17:13:46 EDT
Created attachment 315298 [details]
RHEL-3 Patch corresponding to the new upstream fix
Comment 11 Daniel Veillard 2008-08-28 17:18:53 EDT
Any possibility of getting an attachment of a compressed XML file generated
for yum which broke the old version, for testing purposes?

xmllint --noout --noent yum-test.xml.gz

should work as a test, since xmllint accepts gzip-compressed input files.
If the file still fails to parse, either the bug is not fixed or the generated
XML is broken ;-)

Daniel
Comment 12 James Antill 2008-08-28 19:56:57 EDT
Created attachment 315310 [details]
RHN rpm ChangeLog data, from 2008-08-28. Has large number of named entities

 Sure, pretty much any other.xml.gz will probably do as a test. Here's the current one (bzip2'd to get around the BZ upload limits):
Comment 14 Daniel Veillard 2008-08-29 03:45:15 EDT
Thanks James. At 117 MBytes it starts to be a nice beast, and once gzipped
again it makes a good regression test:

wei:~/XML -> time xmllint --noent --stream ../yum.xml.gz 

real    0m4.856s
user    0m4.833s
sys     0m0.020s
wei:~/XML -> cat .memdump 
      09:39:28 AM

      MEMORY ALLOCATED : 0, MAX was 63738
BLOCK  NUMBER   SIZE  TYPE
wei:~/XML -> 

  No error indicates parsing went well... and with my debug version
configured to track memory allocation I see no lost block and a very
reasonable memory consumption.
  Maybe we need to add this to our regression tests somehow (maybe
using valgrind instead to track potential memory leaks).

  For the make check tests upstream I added a program generating
a million-line document, each line using entities with a bit of
recursion, so any similar error will be caught before being pushed
in the future,

  thanks again, and sorry for the troubles,

Daniel
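
A generator of that kind could look roughly like the Python sketch below. This is not the program added to the upstream make check; the entity names, the element names and the function name generate_entity_document are assumptions chosen for illustration. It declares one entity that references another (one level of nesting) and then emits as many lines as requested, each using the nested entity plus two predefined ones.

```python
def generate_entity_document(lines=1000000):
    """Yield the lines of an XML document whose body references nested entities."""
    yield '<?xml version="1.0"?>'
    yield "<!DOCTYPE doc ["
    yield '  <!ENTITY a "some text">'
    # "b" expands to two copies of "a": a small, bounded amount of nesting.
    yield '  <!ENTITY b "&a; and &a;">'
    yield "]>"
    yield "<doc>"
    for i in range(lines):
        yield '  <line n="%d">&b; &lt;%d&gt;</line>' % (i, i)
    yield "</doc>"


def write_document(path, lines=1000000):
    """Write the generated document to a file, one line at a time."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in generate_entity_document(lines):
            handle.write(line + "\n")
```

Writing the file with write_document() and feeding it to xmllint --noent --stream gives a document with several million entity references but only linear expansion, which is the case a fixed parser should accept.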
Comment 15 Daniel Veillard 2008-08-29 07:41:17 EDT
Created attachment 315351 [details]
RHEL-2.1 Patch corresponding to the new upstream fix

This still requires increasing the size of the entity structure.
In libxml2-2.4.x there is really no placeholder to collect the
required information.
Comment 19 Daniel Veillard 2008-09-01 10:03:57 EDT
Created attachment 315476 [details]
first test for bad behaviour
Comment 20 Daniel Veillard 2008-09-01 10:04:52 EDT
Created attachment 315477 [details]
second test for bad behaviour
Comment 21 Daniel Veillard 2008-09-01 10:05:44 EDT
Created attachment 315478 [details]
dtd for 3rd test of bad behaviour
Comment 22 Daniel Veillard 2008-09-01 10:06:45 EDT
Created attachment 315479 [details]
third test of bad behaviour
Comment 23 Daniel Veillard 2008-09-01 10:07:33 EDT
Created attachment 315480 [details]
fourth test of bad behaviour
Comment 24 Daniel Veillard 2008-09-01 10:08:26 EDT
Created attachment 315481 [details]
fifth test of bad behaviour
Comment 25 Daniel Veillard 2008-09-01 10:09:14 EDT
Created attachment 315482 [details]
sixth test of bad behaviour
Comment 26 Daniel Veillard 2008-09-01 10:13:45 EDT
To use the six tests I just added, save them on a local file system
along with lol3.dtd, and try in sequence

xmllint --noent --noout --loaddtd lolX.xml

The program should return nearly immediately, not consume much memory,
and raise at least one error about "Detected an entity reference loop"

Daniel
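
The sequence above could be scripted along these lines. This is a sketch under assumptions: the file names lol1.xml through lol6.xml are guessed from the lolX.xml pattern, the 10 second timeout is an arbitrary stand-in for "nearly immediately", and the helper names are made up. Only the xmllint flags and the error text come from the comment above.

```python
import subprocess

EXPECTED_ERROR = "Detected an entity reference loop"


def xmllint_command(index):
    """Build the xmllint invocation for test number `index` (assumed file name)."""
    return ["xmllint", "--noent", "--noout", "--loaddtd", "lol%d.xml" % index]


def run_tests(count=6, timeout_seconds=10.0):
    """Run each test; True means xmllint finished in time and reported the loop."""
    results = {}
    for index in range(1, count + 1):
        command = xmllint_command(index)
        try:
            finished = subprocess.run(
                command, capture_output=True, text=True, timeout=timeout_seconds
            )
        except subprocess.TimeoutExpired:
            # A hang is exactly the bad behaviour the tests are meant to expose.
            results[command[-1]] = False
            continue
        results[command[-1]] = EXPECTED_ERROR in finished.stderr
    return results
```

Running run_tests() from the directory holding the saved attachments returns a per-file pass/fail map; it does not measure memory use, which would still need to be checked separately.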
Comment 29 errata-xmlrpc 2008-09-02 14:32:48 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0878.html
