Bug 49160 - rpmfind claims full index is not wellformed XML
Summary: rpmfind claims full index is not wellformed XML
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Raw Hide
Classification: Retired
Component: rpmfind
Version: 1.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Daniel Veillard
QA Contact: Aaron Brown
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2001-07-16 12:32 UTC by Jonathan Kamens
Modified: 2007-04-18 16:34 UTC (History)
0 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2001-07-16 12:50:24 UTC
Embargoed:



Description Jonathan Kamens 2001-07-16 12:32:44 UTC
I have rpmfind-1.6-8.  Here's what happens when I attempt to search for
something that it can't find in the resource list:

jik:~!124> rpmfind dnswalk
rpmFetchDistribList: failed to grab metadata server list
   URI: http://www.redhat.com/RDF/resources/distribs/metadata.rdf
Cannot install or locate resource dnswalk 
Do you want to search it in the catalog? [Y/n] : Y
The index is 246 day(s) old, refetch [Y/n] : Y
Loading catalog to /home/jik/.rpmfinddir/fullIndex.rdf.gz
Searching the RPM catalog for dnswalk ...
rdfRead /home/jik/.rpmfinddir/fullIndex.rdf.gz: resource is not wellformed XML
Cannot open catalog /home/jik/.rpmfinddir/fullIndex.rdf.gz

I loaded fullIndex.rdf.gz into an editor, and it looks fine.

Comment 1 Daniel Veillard 2001-07-16 12:46:35 UTC
Hum, this is likely due to either:
  - an HTTP transfer problem, or
  - an encoding problem.
If the file is not truncated, the second is the more likely.
Could you install the libxml2 package from rawhide and
run:
   xmllint --noout /home/jik/.rpmfinddir/fullIndex.rdf.gz

It will check whether the file is well-formed XML.
If there is any error output, could you provide the first few lines
of it so I get an idea of what is going on?
Also, can you tell me what version of libxml you have installed?
   rpm -q libxml

  thanks,

Daniel

Comment 2 Jonathan Kamens 2001-07-16 12:50:20 UTC
When I ran the xmllint command you asked me to run, I got this:

/home/jik/.rpmfinddir/fullIndex.rdf.gz:28112: error: Extra content at the end of the document
#f8wR:)b |<M&)>eHy FfGK;G7
                          |
^

When I zcat the fullIndex.rdf.gz, I see this:

jik:~!135> zcat < /home/jik/.rpmfinddir/fullIndex.rdf.gz > /tmp/fullIndex.rdf

zcat: stdin: decompression OK, trailing garbage ignored

I'm sure these two errors are related to each other :-).  It would appear that
somehow fullIndex.rdf.gz is ending up with extra data at the end of it; I don't
know whether that's a transfer problem or an actual corruption problem on the
server.  On the other hand, it would seem that rpmfind should be able to ignore
this trailing garbage if zcat can.

I have libxml-1.8.14-1.
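
The trailing-garbage behaviour described in this comment can be reproduced with a short Python sketch. It is not part of rpmfind or libxml; the helper name split_gzip_trailer and the sample bytes are made up for illustration. It decompresses the first gzip member and reports whatever bytes follow it, which is roughly what zcat skips when it prints "trailing garbage ignored".

```python
import gzip
import zlib
from typing import Tuple


def split_gzip_trailer(raw: bytes) -> Tuple[bytes, bytes]:
    """Decompress the first gzip member; return (payload, leftover bytes)."""
    # wbits = 16 + 15 = 31 makes zlib expect a gzip header and trailer.
    decomp = zlib.decompressobj(31)
    payload = decomp.decompress(raw)
    # unused_data holds everything found after the end of the gzip stream.
    return payload, decomp.unused_data


if __name__ == "__main__":
    # Hypothetical stand-in for a downloaded index with junk appended.
    damaged = gzip.compress(b"<rdf/>") + b"#junk"
    payload, extra = split_gzip_trailer(damaged)
    print(len(payload), "payload bytes,", len(extra), "trailing bytes")
```

A non-empty second value means the file carries data past the compressed stream. A reader that feeds those bytes to the XML parser as well will see them as extra content after the document.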


Comment 3 Daniel Veillard 2001-07-16 13:07:54 UTC
Err, no, sorry, this is not a bug from libxml's perspective.
The XML spec is extremely clear about this: there is no end-of-stream
marker, and for the document to be well-formed (i.e. considered
not corrupted) it must match the "document" production:
http://www.w3.org/TR/REC-xml#NT-document

with Misc being very precise about what is accepted
once the top-level element is finished:
http://www.w3.org/TR/REC-xml#NT-Misc

So this extra garbage makes it a broken XML document,
and hence it's not accessible to the rpmfind application.
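
As a rough illustration of the rule above (document ::= prolog element Misc*, where Misc is a comment, a processing instruction, or white space), here is a minimal Python sketch. It uses the standard library's expat-based parser rather than libxml, and the helper name is_well_formed and the sample inputs are invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if data parses as a single well-formed XML document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # Raised, among other cases, for content after the root element
        # that is not a comment, processing instruction, or white space.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(b"<doc/>"))                  # root element only
    print(is_well_formed(b"<doc/>\n<!-- note -->\n"))  # Misc after the root
    print(is_well_formed(b"<doc/>#f8wR"))             # stray bytes after the root
```

Only the last input is rejected: white space and comments after the root element are allowed, arbitrary bytes are not.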

It really seems to be a transport error, since I
just fetched the fullIndex.rdf.gz here and checked it:
there is no error.

This may be a bug in the HTTP client of libxml; if so,
it is likely to be fixed in libxml2. But it doesn't
look like an rpmfind error so far.

  thanks for your feedback,

Daniel


