Bug 624853 - unicode url issues
Summary: unicode url issues
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Beaker
Classification: Retired
Component: web UI
Version: 0.5
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Dan Callaghan
QA Contact:
URL:
Whiteboard:
Depends On: 624857
Blocks:
Reported: 2010-08-17 23:14 UTC by Raymond Mancy
Modified: 2015-05-04 02:46 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-05-17 02:25:40 UTC
Embargoed:


Attachments
output (61.28 KB, application/octet-stream)
2010-08-17 23:27 UTC, Raymond Mancy
Patch: decode fqdn parameter for view() if necessary (942 bytes, patch)
2010-08-18 06:29 UTC, Dan Callaghan

Description Raymond Mancy 2010-08-17 23:14:50 UTC
We've always had issues with Unicode URLs (specifically machine names with non-ASCII characters).

There are potentially a few problems in TurboGears/CherryPy that are causing this.

Comment 1 Raymond Mancy 2010-08-17 23:27:41 UTC
Created attachment 439251
output

The 'historysearch' widget in this debug output has an incorrectly encoded value as its action string.

Comment 2 Dan Callaghan 2010-08-18 06:29:59 UTC
Created an attachment (id=439305)
Patch: decode fqdn parameter for view() if necessary

Comment 3 Dan Callaghan 2010-08-18 06:50:11 UTC
The problem here is twofold: Kid works exclusively in unicode, so it expects template data to be passed in as unicode objects. If you do pass it a str, it will decode it using the encoding set in the `assume_encoding` attribute of the template (/usr/lib/python2.4/site-packages/kid/codewriter.py:581), with a default of sys.getdefaultencoding() (/usr/lib/python2.4/site-packages/kid/__init__.py:424) which is 'ascii'.
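
To illustrate the failure mode, here is a minimal sketch of the implicit decode described above. `coerce_to_text` is a hypothetical stand-in, not Kid's actual code, and the host name is made up. The sketch is written for Python 3, where `bytes` plays the role of the Python 2 `str` discussed in this comment, and it uses Python's default strict error handling, which may differ from what Kid does.

```python
def coerce_to_text(value, assume_encoding="ascii"):
    """Hypothetical stand-in for the implicit decode a unicode-only
    template engine applies to byte strings it is handed."""
    if isinstance(value, bytes):
        return value.decode(assume_encoding)
    return value


# A widget action built from a host name containing a non-ASCII character,
# passed along as raw UTF-8 bytes (illustrative value only).
raw_action = "/view/ex\u00e4mple.asdf.com".encode("utf-8")

# With the 'ascii' default, the decode cannot succeed.
try:
    coerce_to_text(raw_action)
    ascii_result = "decoded"
except UnicodeDecodeError:
    ascii_result = "UnicodeDecodeError"

# With the encoding set to UTF-8, the same bytes decode cleanly.
utf8_result = coerce_to_text(raw_action, assume_encoding="utf-8")
```

The point is only that the same bytes behave differently depending on the assumed encoding, which is why a code path that never sets `assume_encoding` is a problem.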

TurboGears lets you configure the `assume_encoding` attribute for normal templates (and by default it sets it to 'utf8'), but the code path for widget templates is completely different and it never sets `assume_encoding` (/usr/lib/python2.4/site-packages/turbogears/widgets/base.py:180), so it will always be 'ascii'.

So the second problem is that we are passing in a raw str (with UTF-8 bytes) as the action attribute for the historysearch widget (Server/bkr/server/controllers.py:909). That's because the fqdn argument is passed as a raw str by CherryPy *if* it comes from the virtual path, as in /view/example.asdf.com. It is correctly decoded as UTF-8 when it's a query param though, as in /view/?fqdn=example.asdf.com.

The attached patch is a dodgy solution to the problem, in that it only fixes this one particular instance (which is unlikely to occur, since we don't normally have non-ASCII chars in a hostname). It doesn't help any of the other situations where we are using CherryPy's virtual path mapping. So it's probably not worth applying this particular patch, although we should maybe think about a more general solution for decoding controller params that come from a virtual path.
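
One possible shape for such a general solution is a decorator that decodes any byte-string arguments before the controller method sees them. This is only a sketch: `decode_path_args` and the toy `view` function are hypothetical names, it is not the attached patch, and it is written for Python 3 (`bytes` here corresponds to the raw `str` CherryPy hands over under Python 2).

```python
import functools


def decode_path_args(encoding="utf-8"):
    """Hypothetical decorator: decode byte-string arguments to text
    before calling the wrapped controller method."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def decode(value):
                # Leave anything that is already text (or not a string) alone.
                if isinstance(value, bytes):
                    return value.decode(encoding)
                return value

            decoded_args = tuple(decode(a) for a in args)
            decoded_kwargs = {k: decode(v) for k, v in kwargs.items()}
            return func(*decoded_args, **decoded_kwargs)
        return wrapper
    return decorator


@decode_path_args()
def view(fqdn):
    # Toy controller method: just returns what it received.
    return fqdn
```

Applied to every method exposed through virtual path mapping, this would give path segments the same treatment query params already get, at the cost of assuming one encoding for all of them.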

Comment 4 Dan Callaghan 2011-05-17 02:25:40 UTC
In general we should make sure to decode controller params correctly where needed. But for this particular case it is no longer an issue, now that we are validating system FQDNs as (ASCII-only) hostnames: bug 624857. Closing this bug.
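
For reference, an ASCII-only hostname check could look roughly like the sketch below. This is an assumption about what such validation might involve (letters, digits and hyphens per label, with the usual length limits), not the validator added for bug 624857; `is_ascii_hostname` is a hypothetical name.

```python
import re

# One DNS label: 1-63 ASCII letters, digits or hyphens,
# not starting or ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ascii_hostname(fqdn):
    """Hypothetical check: True if fqdn looks like an ASCII-only hostname."""
    if not fqdn or len(fqdn) > 253:
        return False
    # Every dot-separated label must match in full.
    return all(_LABEL.fullmatch(label) for label in fqdn.split("."))
```

With a check like this in place, a name containing a non-ASCII character is rejected before it can ever reach a URL path.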

