Bug 171881

Summary: yum fails with "parser error : Start tag expected, '<' not found"
Product: [Fedora] Fedora
Reporter: Stanton Finley <stantonfinley>
Component: yum
Assignee: Jeremy Katz <katzj>
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Version: 4
CC: herrold, katzj, matthias
Target Milestone: ---
Target Release: ---
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-11-05 23:44:53 UTC

Description Stanton Finley 2005-10-27 13:25:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
yum update fails more often than not with third party repositories with errors such as:

//var/cache/yum/dag/repomd.xml:1: parser error : Document is empty

^
//var/cache/yum/dag/repomd.xml:1: parser error : Start tag expected, '<' not found

^
Cannot open/read repomd.xml file for repository: dag

Running "yum clean all" sometimes helps, as does running "yum clean all" and then clearing the upstream cache with something like:

wget --cache=off http://apt.sw.be/fedora/4/en/i386/dag/repodata/filelists.xml.gz
wget --cache=off http://apt.sw.be/fedora/4/en/i386/dag/repodata/primary.xml.gz
wget --cache=off http://apt.sw.be/fedora/4/en/i386/dag/repodata/repomd.xml
wget --cache=off http://ayo.freshrpms.net/fedora/linux/4/i386/core/repodata/repomd.xml
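
The two parser errors above are what an XML parser reports for a zero-byte or non-XML file. A minimal sketch, assuming Python 3 and the cache path shown in the error output, that classifies a cached repomd.xml before rerunning yum (the function name `classify_repomd` is hypothetical and not part of yum):

```python
import xml.etree.ElementTree as ET


def classify_repomd(data: bytes) -> str:
    """Return 'empty', 'malformed', or 'ok' for the raw bytes of a repomd.xml."""
    if not data.strip():
        # A zero-byte (or whitespace-only) file is the "Document is empty" case.
        return "empty"
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # Anything that is not well-formed XML, e.g. a plain-text error message.
        return "malformed"
    return "ok"


if __name__ == "__main__":
    # Path taken from the error message in this report.
    path = "/var/cache/yum/dag/repomd.xml"
    try:
        with open(path, "rb") as fh:
            print(path, "->", classify_repomd(fh.read()))
    except OSError as exc:
        print("cannot read", path, "-", exc)
```

This only checks well-formedness; it does not verify that the file is a valid repomd.xml.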

Version-Release number of selected component (if applicable):
kernel-2.6.13-1.1532_FC4 and yum-2.4.0-0.fc4

How reproducible:
Sometimes

Steps to Reproduce:
1. Run "yum update" as root.

Expected Results:  yum updates system with no errors.

Additional info:

Comment 1 Seth Vidal 2005-10-27 14:45:53 UTC
*** Bug 171882 has been marked as a duplicate of this bug. ***

Comment 2 Seth Vidal 2005-10-27 15:05:54 UTC
In yum 2.4.0, set this:
http_caching=packages

See if that solves the above problem.



Comment 3 Stanton Finley 2005-10-27 21:05:34 UTC
I have added the line "http_caching=packages" in /etc/yum.conf but I still must
regularly run the following commands and even this does not work consistently:

sudo yum clean all
wget --cache=off http://apt.sw.be/fedora/4/en/i386/dag/repodata/filelists.xml.gz
wget --cache=off http://apt.sw.be/fedora/4/en/i386/dag/repodata/primary.xml.gz
wget --cache=off http://apt.sw.be/fedora/4/en/i386/dag/repodata/repomd.xml
wget --cache=off http://ayo.freshrpms.net/fedora/linux/4/i386/core/repodata/repomd.xml
sudo yum update


Comment 5 Seth Vidal 2005-10-27 21:53:41 UTC
All of those references and your entry just say the same thing - something is
wrong with the mirror hosting dag and freshrpms's repositories.

I'm going to bug Matthias and see if he knows anything about this.



Comment 6 Stanton Finley 2005-10-27 23:29:15 UTC
Apparently this also happens with Livna.org repository so it's not RPMForge
specific. See http://www.fedoraforum.org/forum/showthread.php?t=47288 .

Comment 7 Seth Vidal 2005-10-28 06:01:21 UTC
Again - the problem you're citing is a broken repomd.xml file.

The reason why apt and up2date in FC3 (as cited) work fine is that they aren't
using that metadata.

If clearing the cache is fixing the problem for you then you should:
1. look to see what the [transparent] proxy is between you and the rest of the world.
2. ask the maintainers of it to fix it.
3. disable all http-caching in yum and see if it works better.


But if, as you mention in Comment 3, you still see the problem even when you
force the cache update with wget, then the problem may not lie in a bad proxy
cache.
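
To check whether an intermediate cache is serving a stale or empty file, the metadata can be requested with headers that ask caches not to answer from their own copy, much as the `wget --cache=off` commands in this report do. A rough sketch in Python 3 using only the standard library; `no_cache_request` is a hypothetical helper name, and the URL is the one quoted in the description:

```python
import urllib.request


def no_cache_request(url: str) -> urllib.request.Request:
    """Build a GET request asking intermediate caches not to answer from cache."""
    return urllib.request.Request(
        url,
        headers={
            "Pragma": "no-cache",         # understood by HTTP/1.0 caches
            "Cache-Control": "no-cache",  # understood by HTTP/1.1 caches
        },
    )


if __name__ == "__main__":
    url = "http://apt.sw.be/fedora/4/en/i386/dag/repodata/repomd.xml"
    try:
        with urllib.request.urlopen(no_cache_request(url), timeout=30) as resp:
            body = resp.read()
        print(len(body), "bytes received")
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        print("request failed:", exc)
```

If the body is still empty with these headers set, a proxy cache is a less likely culprit than the origin server.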


Comment 8 Matthias Saou 2005-10-28 09:42:19 UTC
One very easy way to find out whether the problem is with the HEAnet server
hosting both ayo.freshrpms.net and apt.sw.be is to use another mirror. For
freshrpms, you can try ayo.es6.freshrpms.net, for instance, and see whether the
problem persists.

Problems have been quite frequent on the HEAnet server, so it wouldn't surprise
me that much... :-(

Comment 9 Stanton Finley 2005-11-04 18:03:25 UTC
I'm still convinced that this bug is a problem with a recent version of yum.
When I do a fresh install of Fedora Core 4 I never have this problem until I do
my "yum update" which of course updates yum. Then the problems reported here begin.

Comment 10 Bill Privatus 2005-11-05 02:09:18 UTC
I am seeing this also.  It's very annoying.  I'm scripting updates to
synchronize two boxes - one set up with FC4 weeks ago, just started having the
problem in last two weeks; other box is brand new, it's the one where I'm trying
to install all the same software.

My scripts are running single updates after I was getting 100% failures with
15-20 package names on a yum command line.  With single updates, I am getting
several successes, then one or two failures - distributed across all repositories.

This is likely not a) repo-specific, or b) communications-related.  This one
needs fixed, and soon, else other enterprising souls are going to use my
approach - hammer at the servers until the attempt succeeds...

Which could induce the network congestion already hypothesized...

Oh, and you're going to love this.  Dries is having fits.  Requests for
primary.xml.gz first return a list of 2959, with 394 new and 417 deleted, then
return a list of 2982, with 417 new and 394 deleted.  It's the only repository that
downloads its metadata *every* time you run yum.  And, the latter download
really drags.  I think the servers are not synchronized...!

I'm not a newbie but I don't know how to have yum tell me the server it's
getting these files from - it's too chatty as it is.  Maybe someone else could
advise or take a look themselves.

/bill

This ought to be raised to "high" severity - it's only going to get worse.

Comment 11 Seth Vidal 2005-11-05 03:08:03 UTC
When I set up a repo with a blank repomd.xml file and it doesn't have a working
mirror, the yum execution fails, just as it would if a repo were damaged in
any way. When I set up a mirror of that repo that has a functional repomd.xml
and list it in the .repo file as a second baseurl, yum fails over as it should.

I've tested a number of repos off of apt.sw.be and freshrpms. It looks like the
mirror host is using either dns round robin or some other loadbalancing system.
In some cases what you get is an empty repomd.xml file and in other cases you
get a working file.

This is not a yum bug.
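
One way to see whether a hostname is backed by several servers is to list every address it resolves to, so that each can then be queried separately. A sketch in Python 3, under the assumption that the load balancing is visible in DNS at all (a load balancer can also sit behind a single address); `resolve_all` is a hypothetical helper built on the standard `socket.getaddrinfo` call:

```python
import socket


def resolve_all(host: str, port: int = 80, resolver=socket.getaddrinfo) -> list:
    """Return the distinct IP addresses a hostname resolves to, in first-seen order."""
    seen = []
    for family, socktype, proto, canonname, sockaddr in resolver(
        host, port, type=socket.SOCK_STREAM
    ):
        address = sockaddr[0]  # sockaddr is (ip, port) or (ip, port, flow, scope)
        if address not in seen:
            seen.append(address)
    return seen


if __name__ == "__main__":
    for name in ("apt.sw.be", "ayo.freshrpms.net"):
        try:
            print(name, "->", resolve_all(name))
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            print(name, "-> lookup failed:", exc)
```

The `resolver` parameter exists only so the function can be exercised without network access.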

Comment 12 Stanton Finley 2005-11-05 17:43:27 UTC
There is nothing inherently wrong with load balancing or dns round robin for
sites serving rpms. I contend that this problem is a yum bug in that yum is not
more fault tolerant than it is. If I can run a "yum update" command once and it
fails and then run it again and it succeeds then yum should do this same thing
automatically. There should be a timeout in the code and then yum should try
again with appropriate verbose messages to alert the user that it is waiting for
the next round robin switch-over or load transition and not simply stop trying
with cryptic fault messages. Yum needs to be more robust than this.
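
The behaviour requested here, retrying a failed metadata download after a pause instead of stopping at the first bad response, could look roughly like the following. This is an illustrative Python 3 sketch and not how yum 2.4 is implemented; `fetch_with_retry`, its parameters, and the validity check are all assumptions made for the example:

```python
import time
from typing import Callable


def fetch_with_retry(
    fetch: Callable[[], bytes],
    is_valid: Callable[[bytes], bool],
    attempts: int = 3,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until is_valid(result) holds, pausing between attempts."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            data = fetch()
            if is_valid(data):
                return data
            last_error = ValueError("invalid or empty response")
        except OSError as exc:  # network-level failure
            last_error = exc
        if attempt < attempts:
            # Tell the user what is happening instead of failing silently.
            print(f"attempt {attempt} failed ({last_error}); retrying in {delay}s")
            sleep(delay)
    raise RuntimeError(f"giving up after {attempts} attempts") from last_error
```

A retry of this kind only helps if a later attempt happens to reach a server with valid content.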

Comment 13 Seth Vidal 2005-11-05 23:44:53 UTC
There's nothing wrong with it if all sites being load balanced are the same.

But these are NOT the same.

So one mirror has valid content, the other one has blank files.

Since yum sees this hostname as a single URL rather than several, it can't fail
over between them.

The WHOLE POINT of round-robin DNS is that applications don't know about the
individual servers.

If you want yum to know about the mirrors, then you have to list each of the
sites in the mirrorlist or as multiple baseurls in the .repo file.
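
How many baseurls a repository offers yum for failover can be read from its .repo file. A sketch in Python 3 that lists them with the standard `configparser` module; the sample repo section, the path on the second host, and the helper name `failover_targets` are assumptions for illustration and not the contents of any real freshrpms package:

```python
import configparser

# Hypothetical .repo content: two URLs under one baseurl key,
# the second on an indented continuation line.
SAMPLE_REPO = """\
[freshrpms]
name=Fresh RPMs (example)
baseurl=http://ayo.freshrpms.net/fedora/linux/4/i386/core/
        http://ayo.es6.freshrpms.net/fedora/linux/4/i386/core/
enabled=1
"""


def failover_targets(repo_text: str) -> dict:
    """Map each repo id to its list of baseurls (empty if no baseurl is set)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    result = {}
    for repo_id in parser.sections():
        # Multiple URLs under one baseurl key are separated by whitespace.
        result[repo_id] = parser.get(repo_id, "baseurl", fallback="").split()
    return result


if __name__ == "__main__":
    for repo_id, urls in failover_targets(SAMPLE_REPO).items():
        print(repo_id, "has", len(urls), "baseurl(s)")
```

A repo with a single baseurl and no mirrorlist gives yum nothing to fail over to, which is the situation described above.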




Comment 14 Matthias Saou 2005-11-07 10:21:27 UTC
I completely agree with Seth on this one.
When you get the error, I'd really like to know which ayo.freshrpms.net mirror
caused it, as by default the freshrpms-release package uses a mirrorlist just
like the main fedora repositories, so if one or more are broken, it's really
worth reporting to me.

One way of doing this is by testing all the mirrors (there are few, actually)
from the mirrorlist file individually, then sending me an email with the results
so that I can contact the appropriate mirror admin and/or remove the mirror from
the list.
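
Testing each mirror individually, as suggested above, can be scripted. A minimal Python 3 sketch, assuming the mirrorlist is a plain-text file with one base URL per line, that any yum variables in those URLs have already been expanded, and that every mirror serves `repodata/repomd.xml` under its base URL. The mirrorlist path is left as a command-line argument because the report does not give it, and the helper names are hypothetical:

```python
import sys
import urllib.request
import xml.etree.ElementTree as ET


def parse_mirrorlist(text: str) -> list:
    """Return mirror base URLs, skipping blank lines and '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def repomd_url(base_url: str) -> str:
    """Build the repomd.xml URL for one mirror base URL."""
    return base_url.rstrip("/") + "/repodata/repomd.xml"


def check_mirror(base_url: str, timeout: float = 30.0) -> str:
    """Fetch repomd.xml from one mirror and report 'ok' or the failure reason."""
    try:
        with urllib.request.urlopen(repomd_url(base_url), timeout=timeout) as resp:
            body = resp.read()
    except OSError as exc:  # covers URLError, HTTPError, and timeouts
        return f"download failed: {exc}"
    if not body.strip():
        return "empty repomd.xml"
    try:
        ET.fromstring(body)
    except ET.ParseError as exc:
        return f"not well-formed XML: {exc}"
    return "ok"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_mirrors.py <mirrorlist-file>
    with open(sys.argv[1]) as fh:
        for mirror in parse_mirrorlist(fh.read()):
            print(mirror, "->", check_mirror(mirror))
```

The per-mirror output can be pasted into an email to the repository maintainer.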