Red Hat Bugzilla – Bug 624857
Saving system name with non-Latin-1 char raises UnicodeDecodeError
Last modified: 2011-01-13 01:11:57 EST
Steps to Reproduce:
1. Go to the system details page
2. Enter a non-Latin-1 character in the system name field (e.g. a letter from the cyrillic alphabet: я)
3. Click save
500 with this traceback:
Page handler: 'ordinal not in range(256)'
Traceback (most recent call last):
File "/usr/lib/python2.4/site-packages/cherrypy/_cphttptools.py", line 136, in _run
File "/usr/lib/python2.4/site-packages/cherrypy/_cphttptools.py", line 429, in finalize
self.header_list = self.headers.sorted_list(protocol=self.version)
File "/usr/lib/python2.4/site-packages/cherrypy/lib/httptools.py", line 488, in sorted_list
v = v.encode("iso-8859-1")
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 58-59: ordinal not in range(256)
Should save successfully and redirect back to the system details page with new system name
The exception is thrown when cherrypy tries to encode headers as Latin-1 as per the HTTP spec. My guess is that we are passing a unicode object as the Location: header. Instead we should be encoding the URL in Location: as UTF-8 and then percent-encoding it.
There are a bunch of places in Server/bkr/server/controllers.py where we call:
redirect("./view/%s" % system.fqdn)
redirect is imported from TurboGears. Its first arg is the path to redirect to, which ends up passed through to cherrypy without any encoding applied. Assuming system.fqdn is a unicode object returned by sqlalchemy (I think it always will be?), then the redirect path we are passing will be a unicode object as well. Normally system.fqdn will only have ASCII chars, so cherrypy will encode the Location: header and everything works. Otherwise, we get the exception above.
I was thinking about replacing the redirect import with a wrapper around the TurboGears version which encodes the redirect path, but then we might end up with double-encoding problems. I think it's best to just do the encoding correctly in those places where we are interpolating unicode objects into the path.
Created an attachment (id=439287)
Patch: bz624857 - correctly encode /view/FQDN urls when redirecting
Bill, is this worth fixing? Although punycode makes it theoretically possible to have non-ASCII chars in a hostname, I doubt we will ever do that, so we could just close this as NOTABUG. On the other hand, I think the system.link property (which this patch add) makes things a little nicer so maybe it is worth doing?
This bug is silly. If people want to do crazy things like use IDN for the hostnames, they can put the punycoded version into Beaker.
Beaker should just validate that a system's fqdn is a valid hostname. This would also prevent people from making mistakes like having a space at the start of the fqdn (as we currently have in a few cases).
Beaker 0.6.2 has been released.