Red Hat Bugzilla – Bug 49160
rpmfind claims full index is not wellformed XML
Last modified: 2007-04-18 12:34:46 EDT
I have rpmfind-1.6-8. Here's what happens when I attempt to search for
something that it can't find in the resource list:
jik:~!124> rpmfind dnswalk
rpmFetchDistribList: failed to grab metadata server list
Cannot install or locate resource dnswalk
Do you want to search it in the catalog? [Y/n] : Y
The index is 246 day(s) old, refetch [Y/n] : Y
Loading catalog to /home/jik/.rpmfinddir/fullIndex.rdf.gz
Searching the RPM catalog for dnswalk ...
rdfRead /home/jik/.rpmfinddir/fullIndex.rdf.gz: resource is not wellformed
Cannot open catalog /home/jik/.rpmfinddir/fullIndex.rdf.gz
I loaded fullIndex.rdf.gz into an editor, and it looks fine.
Hmm, this is likely due to either:
- an HTTP transfer problem or
- an encoding problem
If the file is not truncated, the second is the more likely.
Could you install the libxml2 package from rawhide and run
xmllint --noout /home/jik/.rpmfinddir/fullIndex.rdf.gz
It will confirm whether the file is well-formed XML.
If there is any error output, could you provide the first few lines
of it so I can get an idea of what is going on?
Also, can you tell me which version of libxml you have installed?
rpm -q libxml
When I ran the xmllint command you asked me to run, I got this:
/home/jik/.rpmfinddir/fullIndex.rdf.gz:28112: error: Extra content at the end of
#f8wR:)b |<M&)>eHy FfGK;G7
When I zcat the fullIndex.rdf.gz, I see this:
jik:~!135> zcat < /home/jik/.rpmfinddir/fullIndex.rdf.gz > /tmp/fullIndex.rdf
zcat: stdin: decompression OK, trailing garbage ignored
I'm sure these two errors are related to each other :-). It would appear that
somehow fullIndex.rdf.gz is ending up with extra data at the end of it; I don't
know whether that's a transfer problem or an actual corruption problem on the
server. On the other hand, it would seem that rpmfind should be able to ignore
this trailing garbage if zcat can.
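
To illustrate what "ignoring trailing garbage" could look like, here is a minimal Python sketch. It is not how rpmfind or libxml read the catalog; the function name is hypothetical, and it assumes the file holds a single gzip member followed by stray bytes. It decompresses that first member and hands back whatever is left over, so the caller can decide whether to warn or fail.

```python
import gzip
import zlib
from typing import Tuple


def gunzip_ignoring_trailing_garbage(data: bytes) -> Tuple[bytes, bytes]:
    """Decompress the first gzip member in data.

    Returns (payload, leftover), where leftover is any bytes found after
    the end of the gzip stream. Hypothetical helper, for illustration only.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    payload = decomp.decompress(data) + decomp.flush()
    if not decomp.eof:
        # The stream ended before the gzip trailer: a real truncation.
        raise ValueError("gzip stream is truncated")
    # Bytes past the end of the compressed stream are kept, not parsed.
    return payload, decomp.unused_data


if __name__ == "__main__":
    blob = gzip.compress(b"<rdf/>\n") + b"#f8wR garbage"
    payload, leftover = gunzip_ignoring_trailing_garbage(blob)
    print(payload, len(leftover))
```

The sketch separates the two failure modes discussed here: a truncated stream raises an error, while extra bytes after a complete stream are returned separately and the payload is still usable.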
I have libxml-1.8.14-1.
Err, no, sorry, this is not a bug from libxml's perspective.
The XML spec is extremely clear about this: there is no end-of-stream
marker, and for the document to be well-formed (i.e. considered
not corrupted) it must have the following content:
document ::= prolog element Misc*
with Misc being very precise concerning the characters accepted
once the top element content is finished.
So this extra garbage makes it a broken XML document,
and hence it's not accessible to the rpmfind application.
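
As a rough illustration of that rule, here is a small Python sketch using the standard library's expat-based parser rather than libxml; the helper name and the sample strings are made up for the example. After the top element closes, only comments, processing instructions and whitespace are allowed, so any other trailing bytes make the parse fail.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a well-formed XML document.

    Hypothetical helper; it uses Python's expat-based parser, not libxml.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # Whitespace and a comment after the top element are permitted.
    print(is_well_formed("<rdf/>\n<!-- trailing comment -->\n"))
    # Arbitrary characters after the top element are not.
    print(is_well_formed("<rdf/>#f8wR:)b"))
```

An XML parser that follows the spec has no way to treat the second input as valid, which is why the trailing bytes would have to be dropped before parsing rather than inside the parser.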
It really seems to be a transport error, since I
just fetched fullIndex.rdf.gz here and checked it:
there is no error.
This may be a bug in the HTTP client of libxml; if so,
it is likely to be fixed in libxml2. But it doesn't
look like an rpmfind error so far,
thanks for your feedback,