_dump_format_items in /usr/lib/python2.6/site-packages/yum/packages.py crashes if the repository metadata being read contains Unicode characters whose encoded bytes fall outside the 7-bit ASCII range, as described here: http://mail.python.org/pipermail/python-list/2004-October/286313.html

This happens during the concatenation of "msg" and is easily reproducible as follows.

Either download http://akahl.fedorapeople.org/fail/fail-1.0-1.fc11.noarch.rpm or generate the package yourself from http://akahl.fedorapeople.org/fail/fail.spec. The package contains the Unicode character © (copyright) in the license field. Its encoding starts with the byte 0xc2, which is ordinal 194 and therefore not in range(128). There have been (or still are?) packages from Red Hat in this condition, e.g. redhat-logos from EL-4.

To reproduce the bug, create a local repository like this (required packages: createrepo, koji-builder):

$ mkdir /tmp/repo
$ cp fail-1.0-1.fc11.noarch.rpm !$/
$ createrepo !$
$ /usr/libexec/kojid/mergerepos -a i386 -o /tmp/foo -r file://!$

You'll get the error:

Adding repo: file:///tmp/repo
Traceback (most recent call last):
  File "/usr/libexec/kojid/mergerepos", line 241, in <module>
    main(sys.argv[1:])
  File "/usr/libexec/kojid/mergerepos", line 236, in main
    merge.write_metadata()
  File "/usr/libexec/kojid/mergerepos", line 216, in write_metadata
    mdgen.doPkgMetadata()
  File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 364, in doPkgMetadata
    self.writeMetadataDocs(packages)
  File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 527, in writeMetadataDocs
    self.primaryfile.write(po.xml_dump_primary_metadata())
  File "/usr/lib/python2.6/site-packages/yum/packages.py", line 1015, in xml_dump_primary_metadata
    msg += misc.to_unicode(self._dump_format_items())
  File "/usr/lib/python2.6/site-packages/yum/packages.py", line 894, in _dump_format_items
    msg += self._dump_pco('provides')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 28: ordinal not in range(128)

Proof that the faulty character is indeed the copyright symbol can be obtained with:

$ cd /tmp/
$ gunzip primary.xml.gz
$ cat primary.xml|perl -we '
    my ($n, $data);
    while (my $n = read STDIN, $data, 1) {
      print $data;
      if (ord ($data) == 0xc2) {
        print " <-------\n";
        exit 0;
      }
    }'

It'll yield:

(...)
<location href="fail-1.0-1.fc11.noarch.rpm"/>
<format>
<rpm:license>� <-------

I don't know which programs other than mergerepos from koji-builder trigger this functionality, but packages.py belongs to yum.

Frankly, I think this is defective by design: why does Python try to interpret the first half of a string concatenation as ASCII? It ought to just concatenate two byte arrays, either reusing the first or allocating a new one.

For now, whole koji infrastructures can easily be brought down, since this bug can be triggered by kojira, which is required to create build repositories, if I got the manual right.
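The failure mode can be sketched in isolation. In Python 2, concatenating a unicode object with a byte string (str) implicitly decodes the byte string with the ASCII codec, which matches the UnicodeDecodeError in the traceback. The sketch below makes that decode explicit so that it also runs on Python 3. The names license_bytes, concat_like_python2 and concat_explicit are hypothetical and are not taken from yum/packages.py, and decoding as UTF-8 is an assumption about the metadata encoding, not the actual fix.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-ins for the values involved; not the real yum variables.
license_bytes = u"\u00a9 Example".encode("utf-8")  # copyright sign encodes to 0xc2 0xa9
msg = u"<rpm:license>"


def concat_like_python2(text, raw):
    """Mimic Python 2's `unicode + str`: decode `raw` as ASCII, then concatenate."""
    return text + raw.decode("ascii")


def concat_explicit(text, raw, encoding="utf-8"):
    """Decode `raw` with a stated encoding (assumed UTF-8) before concatenating."""
    return text + raw.decode(encoding)


try:
    concat_like_python2(msg, license_bytes)
    failed = False
    error_text = ""
except UnicodeDecodeError as exc:
    failed = True
    error_text = str(exc)

fixed = concat_explicit(msg, license_bytes)
print(error_text)
```

The byte position reported here differs from the one in the traceback because the sample string is shorter than the real metadata.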
Errata for the second set of commands:

$ zcat /tmp/repo/primary.xml.gz|perl -we '
    my ($n, $data);
    while (my $n = read STDIN, $data, 1) {
      print $data;
      if (ord ($data) == 0xc2) {
        print " <-------\n";
        exit 0;
      }
    }'
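For anyone who prefers to do the same check in Python, here is a rough equivalent of the perl one-liner: it reads a gzip stream and returns everything up to and including the first 0xc2 byte. This is only a sketch. The helper name bytes_up_to is hypothetical, and the sample data is an in-memory stand-in for primary.xml.gz, not real repository metadata.

```python
import gzip
import io


def bytes_up_to(stream, target=0xC2):
    """Return (prefix, offset): prefix ends at the first `target` byte.

    offset is None (and prefix is the whole input) if the byte never occurs.
    """
    data = stream.read()
    offset = data.find(bytes(bytearray([target])))
    if offset < 0:
        return data, None
    return data[:offset + 1], offset


# In-memory stand-in for primary.xml.gz (made-up content).
sample = u"<format><rpm:license>\u00a9 Example</rpm:license>".encode("utf-8")
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(sample)
buf.seek(0)

with gzip.GzipFile(fileobj=buf, mode="rb") as gz:
    prefix, offset = bytes_up_to(gz)

print(repr(prefix), offset)
```

Against a real file, the stream would come from gzip.open(path, "rb") with the path to primary.xml.gz instead of the in-memory buffer.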
Update: F11 release and update repositories are also affected, i.e. the koji external repo method described at [1] is in effect completely broken until this bug is fixed. To reproduce, set up a bogus koji build target, add a repository [2] to the target using add-external-repo, and regenerate using `koji regen-repo dist-bogus-build`. The createrepo task will fail with the same error as described above.

[1] https://fedoraproject.org/wiki/Koji/ExternalRepoServerBootstrap
[2] https://fedoraproject.org/wiki/Koji/ExternalRepoServerBootstrap#Fedora_10
*** This bug has been marked as a duplicate of bug 520424 ***