450787 – optparse doesn't handle Unicode help text

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	python
Sub Component:
Version:	5.2
Hardware:	All
OS:	Linux
Priority:	low
Severity:	low
Target Milestone:	rc
Target Release:	---
Assignee:	James Antill
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	446950

Reported:	2008-06-11 00:38 UTC by Bryan Mason
Modified:	2018-10-19 20:03 UTC
CC List:	1 user
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-06-12 16:34:34 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
Test case (1.10 KB, application/x-gzip) 2008-06-11 00:38 UTC, Bryan Mason, no flags
Proposed patch (880 bytes, patch) 2008-06-11 22:39 UTC, Bryan Mason, no flags

Description Bryan Mason 2008-06-11 00:38:53 UTC

Description of problem:

    optparse doesn't handle Unicode help text

Version-Release number of selected component (if applicable):

    python-2.4.3-21.el5

How reproducible:

    Every time.

Steps to Reproduce:

    1. Create a program that uses Unicode text for the help text in
       OptionParser.add_option().
    2. Run the program created in step 1.
    3. Boom!

or

    1. Unpack attached tarball with test program.
    2. Run 'LANG=ja_JP.UTF-8 ./test3.py --help'
    3. Boom!
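
A minimal sketch of step 1, not the attached test3.py: the option name and
help text below are made up for illustration. It targets Python 3, where
print_help() does not fail under a UTF-8 locale, so it triggers the same
UnicodeEncodeError by writing the formatted help to a stream whose encoding
is ASCII (the situation the traceback in this report describes).

```python
import io
from optparse import OptionParser


def build_parser():
    # Hypothetical option; only the non-ASCII help text matters here.
    parser = OptionParser(usage="%prog [options]", prog="test3.py")
    parser.add_option("-n", "--name", dest="name", help="名前を指定します")
    return parser


def write_help(parser, encoding):
    """Write the formatted help to an in-memory stream with the given encoding."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(parser.format_help())
    stream.flush()
    return raw.getvalue()


parser = build_parser()

# An ASCII-only stream cannot represent the Japanese help text.
try:
    write_help(parser, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# A UTF-8 stream accepts the same text without error.
utf8_bytes = write_help(parser, "utf-8")
```
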
  
Actual results:

    $ LANG=ja_JP.UTF-8 ./test3.py --help
    Traceback (most recent call last):
      File "./test3.py", line 21, in ?
        (options,args) = parser.parse_args()
      File "/usr/lib64/python2.4/optparse.py", line 1275, in parse_args
        stop = self._process_args(largs, rargs, values)
      File "/usr/lib64/python2.4/optparse.py", line 1315, in _process_args
        self._process_long_opt(rargs, values)
      File "/usr/lib64/python2.4/optparse.py", line 1390, in _process_long_opt
        option.process(opt, value, values, self)
      File "/usr/lib64/python2.4/optparse.py", line 707, in process
        return self.take_action(
      File "/usr/lib64/python2.4/optparse.py", line 728, in take_action
        parser.print_help()
      File "/usr/lib64/python2.4/optparse.py", line 1534, in print_help
        file.write(self.format_help())
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 114-168: ordinal not in range(128)

Expected results:

    No errors.  Program runs normally.

Additional info:

    See http://bugs.python.org/issue1498146

Comment 1 Bryan Mason 2008-06-11 00:38:53 UTC

Created attachment 308878
Test case

Comment 2 Bryan Mason 2008-06-11 00:40:01 UTC

I should have a proposed patch that fixes this shortly.

Comment 3 Bryan Mason 2008-06-11 22:39:18 UTC

Created attachment 309010
Proposed patch

Make optparse handle Unicode correctly.  Adapted from upstream patches to
optparse.py here:

http://svn.python.org/view/python/trunk/Lib/optparse.py?rev=46861&r1=46507&r2=46861


and here:

http://svn.python.org/view/python/trunk/Lib/optparse.py?rev=50791&r1=46863&r2=50791

Comment 5 Cole Robinson 2008-06-12 15:48:46 UTC

Was this intended to be filed against python-virtinst? It seems like the patch
you are presenting is against python itself.

Comment 6 Bryan Mason 2008-06-12 16:03:34 UTC

D'oh!  I was working on two bugs at once and got confused.  Yes, this should be
against python, not python-virtinst.  Sorry.

Comment 7 James Antill 2008-06-12 16:34:34 UTC

In general, Python just doesn't play well with Unicode, IMO. The whole API
pretty much guarantees tracebacks.
Python 2.5.1 does the same thing; we worked around it in yum by doing:

-        self.optparser.print_help()
+        sys.stdout.write(self.optparser.format_help())

-        self.optparser.print_usage()
+        sys.stdout.write(self.optparser.format_usage())

...see upstream commit: a3a53f16b45e06aeaa3666b47705dc879b182724


I'm loath to change anything in python/optparse to try to fix or work around
this, because as I said, it's almost impossible to end up with something you
_know_ won't traceback for every possible input.
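
For callers who can accept lossy output, the encoding step itself can be made
non-raising by encoding the help text explicitly with the "replace" error
handler before writing bytes. This is only a sketch on Python 3, not the yum
commit referenced above; safe_write_help and the sample option are
hypothetical names, and it covers only the encode step, not other places
where mixed text could fail.

```python
import io
from optparse import OptionParser


def safe_write_help(parser, binary_stream, encoding="ascii"):
    """Write help as bytes, substituting '?' for unencodable characters."""
    text = parser.format_help()
    # errors="replace" never raises UnicodeEncodeError; it loses information instead.
    binary_stream.write(text.encode(encoding, "replace"))


# Hypothetical parser with non-ASCII help text.
parser = OptionParser(prog="demo")
parser.add_option("--name", help="名前を指定します")

out = io.BytesIO()
safe_write_help(parser, out, "ascii")
ascii_help = out.getvalue()
```

The trade-off is that the Japanese text comes out as question marks on an
ASCII stream, which may or may not be preferable to a traceback.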
