Testing with a package whose name includes an accented character:
(http://nim.fedorapeople.org/%c3%a9colier-fonts-1.00-0.1.20070628.fc8.src.rpm)

1. rpmbuild, build-in-mock & rpmlint are happy

2. createrepo seems fine too

3. yum search works if you don't include the accented character in the search string.

    yum search colier
    Loading "skip-broken" plugin
    Excluding Packages in global exclude list
    Finished
    écolier-fonts.noarch 1.00-0.1.20070628.fc8 local
    Matched from:
    écolier-fonts
    Écolier court fonts
    Écolier are a set of latin fonts created by Jean-Marie Douteau to mimick the traditionnal cursive writing French children are taught in school. He kindly released two of them under the OFL, which are redistributed in this package.
    http://perso.orange.fr/jm.douteau/page_ecolier.htm

4. If you do, it dies with:

    yum search écolier
    Loading "skip-broken" plugin
    Excluding Packages in global exclude list
    Finished
    Cleaning up Everything
    Loading "skip-broken" plugin
    Excluding Packages in global exclude list
    Finished
    Traceback (most recent call last):
      File "/usr/bin/yum", line 29, in <module>
        yummain.main(sys.argv[1:])
      File "/usr/share/yum-cli/yummain.py", line 102, in main
        result, resultmsgs = base.doCommands()
      File "/usr/share/yum-cli/cli.py", line 272, in doCommands
        return self.yum_cli_commands[self.basecmd].doCommand(self, self.basecmd, self.extcmds)
      File "/usr/share/yum-cli/yumcommands.py", line 343, in doCommand
        return base.search(extcmds)
      File "/usr/share/yum-cli/cli.py", line 829, in search
        for (po, matched_value) in matching:
      File "/usr/lib/python2.5/site-packages/yum/__init__.py", line 1240, in searchGenerator
        if value and value.lower().find(s.lower()) != -1:
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

5. yum install initially works, then dies too:

    yum install "écolier-fonts"
    Loading "skip-broken" plugin
    Excluding Packages in global exclude list
    Finished
    Setting up Install Process
    Parsing package install arguments
    Resolving Dependencies
    --> Running transaction check
    ---> Package écolier-fonts.noarch 0:1.00-0.1.20070628.fc8 set to be updated
    --> Finished Dependency Resolution

    Dependencies Resolved

    =============================================================================
     Package Arch Version Repository Size
    =============================================================================
    Installing:
     écolier-fonts noarch 1.00-0.1.20070628.fc8 local 66 k

    Transaction Summary
    =============================================================================
    Install 1 Package(s)
    Update 0 Package(s)
    Remove 0 Package(s)

    Total download size: 66 k
    Is this ok [y/N]: y
    Downloading Packages:
    /usr/lib64/python2.5/urllib.py:1205: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
      res = map(safe_map.__getitem__, s)
    Traceback (most recent call last):
      File "/usr/bin/yum", line 29, in <module>
        yummain.main(sys.argv[1:])
      File "/usr/share/yum-cli/yummain.py", line 180, in main
        base.doTransaction()
      File "/usr/share/yum-cli/cli.py", line 310, in doTransaction
        problems = self.downloadPkgs(downloadpkgs)
      File "/usr/lib/python2.5/site-packages/yum/__init__.py", line 833, in downloadPkgs
        cache=po.repo.http_caching != 'none',
      File "/usr/lib/python2.5/site-packages/yum/yumRepo.py", line 605, in getPackage
        cache=cache
      File "/usr/lib/python2.5/site-packages/yum/yumRepo.py", line 583, in _getFile
        http_headers=headers,
      File "/usr/lib/python2.5/site-packages/urlgrabber/mirror.py", line 411, in urlgrab
        return self._mirror_try(func, url, kw)
      File "/usr/lib/python2.5/site-packages/urlgrabber/mirror.py", line 397, in _mirror_try
        return func_ref( *(fullurl,), **kwargs )
      File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 893, in urlgrab
        (url,parts) = opts.urlparser.parse(url, opts)
      File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 671, in parse
        parts = self.quote(parts)
      File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 707, in quote
        path = urllib.quote(path)
      File "/usr/lib64/python2.5/urllib.py", line 1205, in quote
        res = map(safe_map.__getitem__, s)
    KeyError: u'\xe9'
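The UnicodeDecodeError in step 4 is consistent with a UTF-8 byte string being decoded as ASCII somewhere along the way. The following is only a minimal sketch of that mechanism, written for Python 3 (not the Python 2.5 in the tracebacks, where the decode happens implicitly when a byte string meets a unicode object). The variable names are illustrative and are not taken from yum.

```python
# Assumption: the search term reaches yum as UTF-8 bytes from the command line.
raw_name = "écolier".encode("utf-8")  # first byte is 0xc3

try:
    # Python 2 performed this decode implicitly; here it is spelled out.
    raw_name.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

Byte 0xc3 is the first byte of the UTF-8 encoding of "é", which matches "byte 0xc3 in position 0" in the traceback.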
Created attachment 177681 [details]
iri to uri function from Django

Licensed under three clause, new style BSD.
Toshio, Is that license gpl compat? I don't want to look at the attachment unless it is.
Understood. It is GPL 2 & 3 compatible according to: http://fedoraproject.org/wiki/Licensing BSD License (no advertising) aka 3 Clause BSD. Although reading the license again, the whole license text probably needs to be included at the top of the file if you use it: http://code.djangoproject.com/browser/django/trunk/LICENSE
Created attachment 299034 [details]
Sample yum output plus rpm -qa output

This seemed to be the closest of three bugs with unicode and yum. yum search is consistently failing on my machine with the kitchen sink installed, which also happens to be x86_64 arch. This doesn't happen on either of two other machines I am running up to date rawhide on. The other two machines are x86 arch.
To my knowledge there are no packages with unicode names in Fedora right now. So this is probably the wrong bug. However there are multiple packages with unicode descriptions or changelogs, which may trigger the bug you hit. Please open a separate ticket.
OK, I'll do that.
Bruno, please check https://bugzilla.redhat.com/show_bug.cgi?id=438633 your issue is fixed upstream
I just took another look at this and the problem is not as bad as I thought at first. urlgrabber handles non-ASCII filenames fine, it's internationalized domain names where it needs help.

The solution is to realize that urlgrabber doesn't understand how to deal with unicode which is understandable because it's providing you an interface to something with no explicit encoding. So it's dealing with things at the byte encoded level, not the abstract unicode level. Assuming that all filenames will be in utf-8 on the filesystem in question all you need to do is convert to utf-8 before passing the url to urlgrabber::

    url = repourl + packagename
    type(url)
    <type 'unicode'>
    urlgrabber.urlgrab(url.encode('utf-8')

What do you do if the remote server does not encode its filesystem filenames in utf-8? You fail. Unless you can query the server to find out what encoding the filenames are using, there's no way to make this translation.
- urlgrabber.urlgrab(url.encode('utf-8')
+ urlgrabber.urlgrab(url.encode('utf-8'))
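To make the encoding point concrete, here is a small sketch under some assumptions: it uses Python 3's `urllib.parse.quote` as a stand-in for the `urllib.quote` call in the traceback, and it does not call urlgrabber at all. It shows that percent-quoting the UTF-8 bytes of the package filename gives the same path as the URL in the original report (apart from hex case), while a server using Latin-1 filenames would need a different path.

```python
from urllib.parse import quote

# Package filename from the original report, as an abstract text string.
package = "écolier-fonts-1.00-0.1.20070628.fc8.src.rpm"

# Encode to bytes first, then percent-quote the bytes.
quoted_utf8 = quote(package.encode("utf-8"))

# Hypothetical server that stores filenames in Latin-1 instead.
quoted_latin1 = quote(package.encode("latin-1"))

print(quoted_utf8)    # path to request if the server uses UTF-8 filenames
print(quoted_latin1)  # a different path for the same abstract name
```

The two results differ only in how "é" is represented, which is why the client cannot guess the right URL without knowing the server's filename encoding.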
Created attachment 304434 [details]
encode to utf-8 just before calling urlgrabber

Here's a patch against git HEAD to encode the url to utf-8 just before calling urlgrabber. Tested on a file:// and http:// repo fine.
Also looked at the search problem in #4 and that seems to have been fixed in git HEAD by decoding the args and the database results to unicode strings before using them.
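As a rough illustration of that approach (not the actual code in git HEAD, which is not quoted in this report), the sketch below decodes both the command-line argument and the stored value to text before comparing them. It is written for Python 3, where `bytes` and `str` play the roles of Python 2's `str` and `unicode`. The helper names `to_text` and `matches`, and the assumption that inputs are UTF-8, are hypothetical.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it only if it is a byte string."""
    if isinstance(value, bytes):
        # Assumption: byte strings are UTF-8; undecodable bytes are replaced.
        return value.decode(encoding, "replace")
    return value


def matches(search_term, field):
    """Case-insensitive substring match done entirely on text strings."""
    return to_text(search_term).lower() in to_text(field).lower()


# Search term as raw bytes (as from argv), package name as text (as from the database).
argv_term = "écolier".encode("utf-8")
db_value = "écolier-fonts"
found = matches(argv_term, db_value)
print(found)
```

Because both sides are text before `lower()` and the comparison run, no implicit ASCII decode is needed.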
Created attachment 304439 [details]
only convert unicode objects to utf8

So apparently, we sometimes hand off unicode objects to urlgrabber and sometimes we hand off YumRepository objects. This new patch creates a to_utf8() method like to_unicode() that only converts if the object passed to it is unicode. Using it in _get_file() does the right thing whether we are handing in a unicode object or a YumRepository.
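The patch itself lives in the attachment and is not reproduced here, so the following is only a guess at the general shape of such a helper, written for Python 3 (where `str` corresponds to Python 2's `unicode`). The `FakeRepo` class is a hypothetical stand-in for a YumRepository object.

```python
def to_utf8(obj):
    """Encode text to UTF-8 bytes; return any other object unchanged."""
    if isinstance(obj, str):
        return obj.encode("utf-8")
    return obj


class FakeRepo(object):
    """Hypothetical placeholder for a non-string object such as a repository."""


repo = FakeRepo()
encoded_name = to_utf8("écolier-fonts")  # text is converted to bytes
same_repo = to_utf8(repo)                # non-text passes through untouched
already_bytes = to_utf8(b"plain")        # bytes also pass through untouched
print(encoded_name)
```

Checking the type before converting is what lets the same call site accept either kind of argument.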
Ok, that last patch looks fine. Applying.
yum-3.2.16-1.fc9 has been submitted as an update for Fedora 9
yum-3.2.16-2.fc9 has been submitted as an update for Fedora 9
yum-3.2.16-2.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.