Bug 867647

Summary: POST to sync repository API with UTF8 username causes traceback
Product: [Retired] Pulp
Reporter: Mike McCune <mmccune>
Component: API/integration
Assignee: Jason Connor <jconnor>
Status: CLOSED WONTFIX
QA Contact: Preethi Thomas <pthomas>
Severity: high
Docs Contact:
Priority: unspecified
Version: 1.1.0
CC: jason.dobies, mmccune
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-10-24 15:55:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 835586
Attachments:
call the script with a username each time (flags: none)
replacement that sets the content-type header (flags: none)

Description Mike McCune 2012-10-17 23:09:04 UTC
If you have a user in Pulp with UTF-8 characters in the username, you can't POST to:

 https://localhost/pulp/api/repositories/$REPO/sync/

without getting the dreaded error:

  File "/usr/lib/python2.6/site-packages/pulp/server/db/model/persistence.py", line 55, in _process_value
    value = value.decode('utf-8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 271: invalid continuation byte

See the attached bash script, which reproduces the error with the following steps. Call the script with a new username each time; you can try ASCII (which works fine), Latin-1, or UTF-8 characters. Apache doesn't seem to like headers with Latin-1 characters.

ASCII works fine:


$ ./utf8-pain.bash ascii1
Successfully created repository [ kpJfjWgFow ]

Successfully created user [ ascii1 ] with name [ None ]

[ ascii1 ] added to role [ super-users ]

{"scheduled_time": "2012-10-17T23:05:59Z", "exception": null, "traceback": null, "job_id": null, "class_name": null, "start_time": null, "args": ["kpJfjWgFow"], "method_name": "_sync", "finish_time": null, "state": "waiting", "result": null, "scheduler": "immediate", "progress": null, "id": "3713ddd7-18af-11e2-b28c-1803734d16c4"}

In another window, run the following and look for the above exception:

$ tail -f /var/log/pulp/pulp.log


A non-ASCII username causes the exception:

$ ./utf8-pain.bash 7Mané
....
$ tail  /var/log/pulp/pulp.log
...
 File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 271: invalid continuation byte
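
The traceback above is consistent with a Latin-1 encoded "é" (the single byte 0xe9) reaching a UTF-8 decode. Here is a minimal sketch of that failure mode, assuming Python 3 and a made-up JSON-like payload. The payload and the helper name decode_utf8_or_report are illustrative only and are not taken from Pulp:

```python
def decode_utf8_or_report(raw):
    """Try to decode bytes as UTF-8; return (text, None) or (None, error)."""
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        return None, exc


# Hypothetical payload containing the non-ASCII username from this report.
payload = u'{"user": "7Man\u00e9"}'

as_utf8 = payload.encode("utf-8")      # 'é' becomes the two bytes 0xc3 0xa9
as_latin1 = payload.encode("latin-1")  # 'é' becomes the single byte 0xe9

ok_text, ok_err = decode_utf8_or_report(as_utf8)
bad_text, bad_err = decode_utf8_or_report(as_latin1)

print("utf-8 bytes decoded:", ok_err is None)
print("latin-1 bytes decoded:", bad_err is None)
if bad_err is not None:
    # The offset points at the lone 0xe9 byte, as in the log excerpt.
    print("failed at byte offset", bad_err.start)
```

If this is what happens on the server, the username bytes are arriving in an encoding other than UTF-8 somewhere along the request path, which would match the observation about Apache and headers.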

Comment 1 Mike McCune 2012-10-17 23:10:11 UTC
Created attachment 629115
call the script with a username each time

$ ./utf8-pain.bash some-new-user

$ ./utf8-pain.bash some-non-ascii-7Mané

Comment 2 Mike McCune 2012-10-17 23:11:31 UTC
Created attachment 629116
replacement that sets the content-type header

Comment 3 Mike McCune 2012-10-22 19:25:52 UTC
https://github.com/pulp/pulp/pull/111

Comment 4 Jason Connor 2012-10-24 15:55:32 UTC
Pulp will only support ASCII usernames
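
An ASCII-only restriction like this could be enforced when a user is created. The following is a rough sketch, assuming Python 3; is_ascii_username is a hypothetical helper and does not reflect how Pulp actually validates usernames:

```python
def is_ascii_username(username):
    """Return True if the username contains only ASCII characters.

    Hypothetical helper for illustration; not part of Pulp.
    """
    try:
        username.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# Usernames taken from the reproduction steps in this report.
for candidate in (u"ascii1", u"7Man\u00e9"):
    print(candidate, "accepted" if is_ascii_username(candidate) else "rejected")
```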