Bug 624857 - Saving system name with non-Latin-1 char raises UnicodeDecodeError
Summary: Saving system name with non-Latin-1 char raises UnicodeDecodeError
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: inventory
Version: 0.5
Hardware: All
OS: All
low
low
Target Milestone: ---
Assignee: Dan Callaghan
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 624853
TreeView+ depends on / blocked
 
Reported: 2010-08-17 23:54 UTC by Dan Callaghan
Modified: 2019-05-22 13:37 UTC (History)
5 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-01-13 06:11:57 UTC
Embargoed:


Attachments (Terms of Use)
Patch: bz624857 - correctly encode /view/FQDN urls when redirecting (12.04 KB, patch)
2010-08-18 03:08 UTC, Dan Callaghan
no flags Details | Diff

Description Dan Callaghan 2010-08-17 23:54:06 UTC
Steps to Reproduce:
1. Go to the system details page
2. Enter a non-Latin-1 character in the system name field (e.g. a letter from the cyrillic alphabet: я)
3. Click save
  
Actual results:
500 with this traceback:

Page handler: 'ordinal not in range(256)'
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/cherrypy/_cphttptools.py", line 136, in _run
    cherrypy.response.finalize()
  File "/usr/lib/python2.4/site-packages/cherrypy/_cphttptools.py", line 429, in finalize
    self.header_list = self.headers.sorted_list(protocol=self.version)
  File "/usr/lib/python2.4/site-packages/cherrypy/lib/httptools.py", line 488, in sorted_list
    v = v.encode("iso-8859-1")
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 58-59: ordinal not in range(256)

Expected results:
Should save successfully and redirect back to the system details page with new system name

Additional info:
The exception is thrown when cherrypy tries to encode headers as Latin-1 as per the HTTP spec. My guess is that we are passing a unicode object as the Location: header. Instead we should be encoding the URL in Location: as UTF-8 and then percent-encoding it.

Comment 1 Dan Callaghan 2010-08-18 00:51:25 UTC
There are a bunch of places in Server/bkr/server/controllers.py where we call:

redirect("./view/%s" % system.fqdn)

redirect is imported from TurboGears. Its first arg is the path to redirect to, which ends up passed through to cherrypy without any encoding applied. Assuming system.fqdn is a unicode object returned by sqlalchemy (I think it always will be?), then the redirect path we are passing will be a unicode object as well. Normally system.fqdn will only have ASCII chars, so cherrypy will encode the Location: header and everything works. Otherwise, we get the exception above.

I was thinking about replacing the redirect import with a wrapper around the TurboGears version which encodes the redirect path, but then we might end up with double-encoding problems. I think it's best to just do the encoding correctly in those places where we are interpolating unicode objects into the path.

Comment 2 Dan Callaghan 2010-08-18 03:08:26 UTC
Created an attachment (id=439287)
Patch: bz624857 - correctly encode /view/FQDN urls when redirecting

Comment 3 Dan Callaghan 2010-09-01 22:42:53 UTC
Bill, is this worth fixing? Although punycode makes it theoretically possible to have non-ASCII chars in a hostname, I doubt we will ever do that, so we could just close this as NOTABUG. On the other hand, I think the system.link property (which this patch add) makes things a little nicer so maybe it is worth doing?

Comment 4 Dan Callaghan 2010-12-15 05:19:52 UTC
This bug is silly. If people want to do crazy things like use IDN for the hostnames, they can put the punycoded version into Beaker.

Beaker should just validate that a system's fqdn is a valid hostname. This would also prevent people from making mistakes like having a space at the start of the fqdn (as we currently have in a few cases).

Comment 5 Dan Callaghan 2011-01-13 06:11:57 UTC
Beaker 0.6.2 has been released.


Note You need to log in before you can comment on or make changes to this bug.