Bug 624857

Summary: Saving system name with non-Latin-1 char raises UnicodeDecodeError
Product: [Retired] Beaker Reporter: Dan Callaghan <dcallagh>
Component: inventoryAssignee: Dan Callaghan <dcallagh>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: low Docs Contact:
Priority: low    
Version: 0.5CC: bpeck, dcallagh, kbaker, mcsontos, rmancy
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-01-13 06:11:57 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 624853    
Attachments:
Description Flags
Patch: bz624857 - correctly encode /view/FQDN urls when redirecting none

Description Dan Callaghan 2010-08-17 23:54:06 UTC
Steps to Reproduce:
1. Go to the system details page
2. Enter a non-Latin-1 character in the system name field (e.g. a letter from the cyrillic alphabet: я)
3. Click save
  
Actual results:
500 with this traceback:

Page handler: 'ordinal not in range(256)'
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/cherrypy/_cphttptools.py", line 136, in _run
    cherrypy.response.finalize()
  File "/usr/lib/python2.4/site-packages/cherrypy/_cphttptools.py", line 429, in finalize
    self.header_list = self.headers.sorted_list(protocol=self.version)
  File "/usr/lib/python2.4/site-packages/cherrypy/lib/httptools.py", line 488, in sorted_list
    v = v.encode("iso-8859-1")
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 58-59: ordinal not in range(256)

Expected results:
Should save successfully and redirect back to the system details page with new system name

Additional info:
The exception is thrown when cherrypy tries to encode headers as Latin-1 as per the HTTP spec. My guess is that we are passing a unicode object as the Location: header. Instead we should be encoding the URL in Location: as UTF-8 and then percent-encoding it.

Comment 1 Dan Callaghan 2010-08-18 00:51:25 UTC
There are a bunch of places in Server/bkr/server/controllers.py where we call:

redirect("./view/%s" % system.fqdn)

redirect is imported from TurboGears. Its first arg is the path to redirect to, which ends up passed through to cherrypy without any encoding applied. Assuming system.fqdn is a unicode object returned by sqlalchemy (I think it always will be?), then the redirect path we are passing will be a unicode object as well. Normally system.fqdn will only have ASCII chars, so cherrypy will encode the Location: header and everything works. Otherwise, we get the exception above.

I was thinking about replacing the redirect import with a wrapper around the TurboGears version which encodes the redirect path, but then we might end up with double-encoding problems. I think it's best to just do the encoding correctly in those places where we are interpolating unicode objects into the path.

Comment 2 Dan Callaghan 2010-08-18 03:08:26 UTC
Created an attachment (id=439287)
Patch: bz624857 - correctly encode /view/FQDN urls when redirecting

Comment 3 Dan Callaghan 2010-09-01 22:42:53 UTC
Bill, is this worth fixing? Although punycode makes it theoretically possible to have non-ASCII chars in a hostname, I doubt we will ever do that, so we could just close this as NOTABUG. On the other hand, I think the system.link property (which this patch add) makes things a little nicer so maybe it is worth doing?

Comment 4 Dan Callaghan 2010-12-15 05:19:52 UTC
This bug is silly. If people want to do crazy things like use IDN for the hostnames, they can put the punycoded version into Beaker.

Beaker should just validate that a system's fqdn is a valid hostname. This would also prevent people from making mistakes like having a space at the start of the fqdn (as we currently have in a few cases).

Comment 5 Dan Callaghan 2011-01-13 06:11:57 UTC
Beaker 0.6.2 has been released.