Bug 513459

Summary: _dump_format_items crashes trying to merge Unicode strings
Product: [Fedora] Fedora
Reporter: Alexander Kahl <fedora>
Component: yum
Assignee: Seth Vidal <skvidal>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 11
CC: ffesti, james.antill, maxamillion, pmatilai, tim.lauridsen, yersinia.spiros
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-09-15 17:01:09 UTC
Bug Blocks: 520424

Description Alexander Kahl 2009-07-23 18:28:45 UTC
_dump_format_items in /usr/lib/python2.6/site-packages/yum/packages.py crashes if the repository metadata being read contains a Unicode character whose encoding includes any byte outside the 7-bit ASCII range, as described here:
http://mail.python.org/pipermail/python-list/2004-October/286313.html

This happens during the concatenation of "msg" and is easily reproducible as follows:

Either download http://akahl.fedorapeople.org/fail/fail-1.0-1.fc11.noarch.rpm or build the package yourself from http://akahl.fedorapeople.org/fail/fail.spec. Its license field contains the Unicode character © (copyright sign), whose UTF-8 encoding starts with the byte 0xc2, i.e. ordinal 194, which is outside range(128). There have been (or still are?) packages from Red Hat in this condition, e.g. redhat-logos from EL-4.
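
The byte values quoted above can be checked with a short sketch. It only assumes that the license field is stored as UTF-8; the variable names are illustrative and not taken from yum.

```python
# The copyright sign is U+00A9; encoded as UTF-8 it occupies two bytes.
encoded = u"\u00a9".encode("utf-8")

# bytearray indexing yields integers on both Python 2 and Python 3.
first_byte = bytearray(encoded)[0]

print(repr(encoded))      # the two raw bytes
print(first_byte)         # ordinal of the leading byte
print(first_byte > 127)   # True means it cannot be decoded as ASCII
```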

To reproduce the bug, create a local repository and run mergerepos on it:
$ mkdir /tmp/repo
$ cp fail-1.0-1.fc11.noarch.rpm !$/
$ createrepo !$
$ /usr/libexec/kojid/mergerepos -a i386 -o /tmp/foo -r file://!$
(required packages: createrepo, koji-builder)

You'll get the error:
Adding repo: file:///tmp/repo
Traceback (most recent call last):
  File "/usr/libexec/kojid/mergerepos", line 241, in <module>
    main(sys.argv[1:])
  File "/usr/libexec/kojid/mergerepos", line 236, in main
    merge.write_metadata()
  File "/usr/libexec/kojid/mergerepos", line 216, in write_metadata
    mdgen.doPkgMetadata()
  File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 364, in doPkgMetadata
    self.writeMetadataDocs(packages)
  File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 527, in writeMetadataDocs
    self.primaryfile.write(po.xml_dump_primary_metadata())
  File "/usr/lib/python2.6/site-packages/yum/packages.py", line 1015, in xml_dump_primary_metadata
    msg += misc.to_unicode(self._dump_format_items())
  File "/usr/lib/python2.6/site-packages/yum/packages.py", line 894, in _dump_format_items
    msg += self._dump_pco('provides')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 28: ordinal not in range(128)
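
The failure mode behind this traceback can be sketched in isolation. In Python 2, adding a byte string to a unicode string implicitly decodes the byte string with the default ASCII codec; the sketch below makes that decode explicit so it also runs on Python 3. The helper names and the license text are hypothetical, and the UTF-8 variant is only one possible way around the error, not necessarily what yum does.

```python
def concat_like_python2(text, raw):
    """Mimic Python 2's `unicode + str`: the byte string is decoded as ASCII first."""
    return text + raw.decode("ascii")


def concat_decoding_utf8(text, raw):
    """Decode the byte string explicitly as UTF-8 before concatenating."""
    return text + raw.decode("utf-8")


# Hypothetical metadata fragment; the license text is made up for illustration.
raw_fragment = u"<rpm:license>\u00a9 Example</rpm:license>".encode("utf-8")
msg = u"<format>"

error_text = None
try:
    concat_like_python2(msg, raw_fragment)
except UnicodeDecodeError as err:
    error_text = str(err)
    print(error_text)

# Decoding with the encoding the data actually uses avoids the error.
fixed = concat_decoding_utf8(msg, raw_fragment)
```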

Proof that the offending character is indeed the copyright symbol can be obtained with:
$ cd /tmp/
$ gunzip primary.xml.gz
$ cat primary.xml|perl -we ' my ($n, $data); while (my $n = read STDIN, $data, 1) { print $data; if (ord ($data) == 0xc2) { print " <-------\n"; exit 0; } }'

It'll yield:
(...)
<location href="fail-1.0-1.fc11.noarch.rpm"/>
  <format>
    <rpm:license>� <-------

I don't know which programs other than mergerepos from koji-builder trigger this code path, but packages.py belongs to yum.

Frankly, I think this is defective by design: why does Python try to interpret the first half of a string concatenation as ASCII? It ought to just concatenate two byte arrays, either reusing the first or allocating a new one. As it stands, entire koji infrastructures can easily be brought down, since this bug can be triggered by kojira, which is required to create build repositories, if I read the manual correctly.
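
For context on the question above: concatenating two byte strings is indeed a plain byte-wise operation in Python. The decode step only appears when one operand is a unicode string and the other a byte string (Python 2 then decodes the byte string as ASCII; Python 3 refuses the mix with a TypeError). A minimal sketch, with a hypothetical helper name and made-up fragments, of keeping everything as bytes and decoding once at the end:

```python
def join_fragments(fragments, encoding="utf-8"):
    """Concatenate raw byte fragments, then decode once at the end."""
    return b"".join(fragments).decode(encoding)


# Made-up fragments; all of them are byte strings.
parts = [b"<rpm:license>", u"\u00a9".encode("utf-8"), b" Example</rpm:license>"]
joined = join_fragments(parts)

# Mixing a unicode string with a non-ASCII byte string fails on both major versions:
# UnicodeDecodeError on Python 2, TypeError on Python 3.
try:
    u"<format>" + parts[1]
    mixed_ok = True
except (TypeError, UnicodeDecodeError):
    mixed_ok = False
```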

Comment 1 Alexander Kahl 2009-07-24 06:52:08 UTC
Correction to the second set of commands:
$ zcat /tmp/repo/primary.xml.gz|perl -we ' my ($n, $data); while (my $n = read STDIN, $data, 1) { print $data; if (ord ($data) == 0xc2) { print " <-------\n"; exit 0; } }'
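
A rough Python equivalent of the Perl one-liner, for anyone who prefers it: read the decompressed stream byte by byte and stop at the first 0xc2. The function name is hypothetical, and the sample data is generated in memory rather than read from a real primary.xml.gz.

```python
import gzip
import io


def head_through_byte(stream, target=0xC2):
    """Return the bytes read up to and including the first `target` byte, or None."""
    seen = bytearray()
    while True:
        chunk = stream.read(1)
        if not chunk:
            return None  # end of stream reached without finding the byte
        seen += chunk
        if seen[-1] == target:
            return bytes(seen)


# Build a small gzip stream in memory; the XML content is made up for illustration.
sample = u"<format>\n  <rpm:license>\u00a9 Example</rpm:license>".encode("utf-8")
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(sample)
buf.seek(0)

with gzip.GzipFile(fileobj=buf, mode="rb") as gz:
    head = head_through_byte(gz)

print(repr(head))
```

To run it against a real repository, pass the object returned by gzip.open(path, "rb") for the primary.xml.gz in question.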

Comment 2 Alexander Kahl 2009-07-31 12:43:14 UTC
Update: F11 release and update repositories are also affected, i.e. the koji external repo method described at [1] is in effect completely broken until this bug is fixed.

To reproduce, just set up some bogus koji build target and add some repository [2] using add-external-repo for the target and regenerate using `koji regen-repo dist-bogus-build`. The createrepo task will fail with the same error as described above.

[1] https://fedoraproject.org/wiki/Koji/ExternalRepoServerBootstrap
[2] https://fedoraproject.org/wiki/Koji/ExternalRepoServerBootstrap#Fedora_10

Comment 3 seth vidal 2009-09-15 17:01:09 UTC

*** This bug has been marked as a duplicate of bug 520424 ***