From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.7.3) Gecko/20041012 Firefox/0.10.1

Description of problem:
popt assumes that a string's byte length equals its cell width in the terminal. This holds in many legacy encodings (ISO8859-*, EUC-*), but not in UTF-8, so help output is misaligned in UTF-8 locales. Patch attached. The patch makes popt use correct cell widths with mbstowcs()/wcswidth().

Version-Release number of selected component (if applicable):
popt-1.9.1-12

How reproducible:
Always

Steps to Reproduce:
1. Generate the ko_KR.UTF-8 or zh_TW.UTF-8 locale.
2. Run this command (or another GNOME program) in a UTF-8 capable terminal:
   $ LANG=zh_TW.UTF-8 gnome-session --help
3. The output looks bad: the descriptions do not all start in the same column.

Additional info:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=275234
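To illustrate the mismatch described above, here is a minimal sketch (not popt code) that compares a string's byte length with its cell width as reported by mbstowcs()/wcswidth(). The helper names `use_utf8_locale` and `display_width` are made up for this example, and the locale names tried are assumptions about what is installed on the system.

```c
#define _XOPEN_SOURCE 700 /* exposes wcswidth() in <wchar.h> on glibc */
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Assumption: one of these UTF-8 locale names exists on the system.
 * Returns 1 if a UTF-8 locale could be selected, 0 otherwise. */
static int use_utf8_locale(void)
{
    return setlocale(LC_ALL, "C.UTF-8") != NULL ||
           setlocale(LC_ALL, "en_US.UTF-8") != NULL;
}

/* Number of terminal cells needed for str in the current locale.
 * Falls back to the byte length when the string cannot be decoded
 * or contains a non-printable character. */
static int display_width(const char *str)
{
    size_t n = mbstowcs(NULL, str, 0);      /* count wide characters */
    if (n == (size_t)-1)
        return (int)strlen(str);            /* invalid multibyte sequence */

    wchar_t *wcs = malloc((n + 1) * sizeof *wcs);
    if (wcs == NULL)
        return (int)strlen(str);

    mbstowcs(wcs, str, n + 1);              /* convert, including the terminator */
    int width = wcswidth(wcs, n);           /* -1 if a character is not printable */
    free(wcs);

    return width >= 0 ? width : (int)strlen(str);
}
```

After a successful `use_utf8_locale()`, calling `display_width()` on a string of two Chinese characters encoded as six UTF-8 bytes should report four cells on typical systems, while `strlen()` reports six. That difference is what shifts the description column.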
Created attachment 105079 [details] popt-multibyte-help-width.diff This patch makes popt use correct cell widths with mbstowcs()/wcswidth(). Changwoo Ryu <cwryu>
Comment on attachment 105079 [details]
popt-multibyte-help-width.diff

--- popt-1.7.orig/popthelp.c 2002-09-16 22:16:54.000000000 +0900
+++ popt-1.7/popthelp.c 2004-10-07 08:13:26.000000000 +0900
@@ -187,6 +187,49 @@
     return l;
 }
 
+#include <wchar.h>
+
+static int mbswidth(const char *str)
+{
+    int width, wcslen;
+    wchar_t *wcs;
+
+    wcslen = mbstowcs(NULL, str, 0);
+    if (wcslen > 0) {
+        wcs = malloc(sizeof(wchar_t) * (wcslen + 1));
+        wcslen = mbstowcs(wcs, str, wcslen);
+        width = wcswidth(wcs, wcslen);
+        free(wcs);
+    } else
+        width = strlen(str);
+    return width;
+}
+
+static int findLengthInWidth(const char *str, int width)
+{
+    const char *s = str;
+    wchar_t wc;
+    mbstate_t state = { 0 };
+    int w, len, max = 0;
+
+    while (*s) {
+        len = mbrtowc(&wc, s, strlen(s), &state);
+        if (len < 0) {
+            /* strange situation, fallback to the byte length. */
+            return strlen(str);
+        }
+        w = wcwidth(wc);
+        if (w > 0) {
+            if ((max + w) > width)
+                break;
+            max += w;
+        }
+        s += len;
+    }
+    return (s - str);
+
+}
+
 /**
  * Display help text for an option.
  * @param fp output file handle
@@ -317,7 +360,7 @@
 /*@=boundswrite@*/
 
     if (help)
-        fprintf(fp," %-*s ", maxLeftCol, left);
+        fprintf(fp," %-*s ", maxLeftCol+strlen(left)-mbswidth(left), left);
     else {
         fprintf(fp," %s\n", left);
         goto out;
@@ -328,13 +371,13 @@
         help = defs; defs = NULL;
     }
 
-    helpLength = strlen(help);
+    helpLength = mbswidth(help);
 /*@-boundsread@*/
     while (helpLength > lineLength) {
         const char * ch;
         char format[16];
 
-        ch = help + lineLength - 1;
+        ch = help + findLengthInWidth(help, lineLength - 1);
         while (ch > help && !isspace(*ch)) ch--;
         if (ch == help) break; /* give up */
         while (ch > (help + 1) && isspace(*ch)) ch--;
@@ -346,7 +389,7 @@
 /*@=formatconst@*/
         help = ch;
         while (isspace(*help) && *help) help++;
-        helpLength = strlen(help);
+        helpLength = mbswidth(help);
     }
 /*@=boundsread@*/
 
@@ -389,7 +432,7 @@
 
             s = getArgDescrip(opt, translation_domain);
             if (s)
-                len += sizeof("=")-1 + strlen(s);
+                len += sizeof("=")-1 + mbswidth(s);
             if (opt->argInfo & POPT_ARGFLAG_OPTIONAL) len += sizeof("[]")-1;
             if (len > max) max = len;
         }
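The `maxLeftCol+strlen(left)-mbswidth(left)` expression in the patch works because printf's `%-*s` pads by bytes, not by cells, so the field width has to be enlarged by the difference between the two. The following is a standalone sketch of that idea only, not the popt code itself. The names `cells`, `format_row` and `use_utf8_locale`, the fixed 256-character buffer, and the locale names are all assumptions made for the example.

```c
#define _XOPEN_SOURCE 700 /* exposes wcswidth() in <wchar.h> on glibc */
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Assumption: one of these UTF-8 locale names exists on the system. */
static int use_utf8_locale(void)
{
    return setlocale(LC_ALL, "C.UTF-8") != NULL ||
           setlocale(LC_ALL, "en_US.UTF-8") != NULL;
}

/* Cell width of s (at most 256 characters are examined in this sketch);
 * falls back to the byte length if s cannot be decoded or measured. */
static int cells(const char *s)
{
    wchar_t buf[256];
    size_t n = mbstowcs(buf, s, 256);
    if (n == (size_t)-1)
        return (int)strlen(s);
    int w = wcswidth(buf, n);
    return w >= 0 ? w : (int)strlen(s);
}

/* Write "  <left, padded to col cells>  <help>" into out.
 * %-*s counts bytes, so the byte/cell difference is added to the
 * requested field width to end up with col cells on screen. */
static int format_row(char *out, size_t outsz,
                      const char *left, int col, const char *help)
{
    int pad = col + (int)strlen(left) - cells(left);
    return snprintf(out, outsz, "  %-*s  %s", pad, left, help);
}
```

With this adjustment, a row whose left column contains a double-width character gets one more byte of field width per extra byte-versus-cell difference, so the help text of every row should start in the same terminal column.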
Created attachment 105083 [details] new "popt-multibyte-help-width.diff" New Patch by Changwoo Ryu <cwryu>
I believe the alignment has already been corrected in a later version of popt. Could you verify whether that is true?
popt-1.9.1-12. gnome-terminal --help in ko_KR.UTF-8 http://hellocity.net/~sangu/files/popt/ko_KR.UTF-8.png gnome-terminal --help in en_US.UTF-8 http://hellocity.net/~sangu/files/popt/en_US.UTF-8.png
Links in last comment are no longer there apparently. Any better in fc4?
Tested with popt-1.10.2-6 in ko and zh_TW; the problem still exists. No problem in en_US though. Question: why is the component assigned to rpm instead of popt?
Created attachment 121217 [details] output in ko locale
Created attachment 121218 [details] popt output in the zh_TW locale
I think we should mark this bug as won't fix or can't fix. Yes, the number of bytes in a character does not give the width of that character, as with Russian characters, and this patch solves that problem. But the patch assumes every character has the same width. That is true for Russian, since a Russian character is as wide as an English character, but not for Chinese: a Chinese character in gnome-terminal seems to be twice as wide as an English character. So how do we determine the width of characters? Counting bytes works for English; counting characters works for Russian. What would work for Chinese? We could assume twice the English width, but that is not a good solution either, since determining this is not simple and does not carry over to other languages.
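On the question of how to determine a character's width: the attached patch relies on wcwidth()/wcswidth(), which report the number of cells per character in the current locale. A minimal sketch of a per-character query follows. The names `first_char_cells` and `use_utf8_locale` and the locale names are assumptions for illustration, and the values returned depend on the C library's width tables.

```c
#define _XOPEN_SOURCE 700 /* exposes wcwidth() in <wchar.h> on glibc */
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Assumption: one of these UTF-8 locale names exists on the system. */
static int use_utf8_locale(void)
{
    return setlocale(LC_ALL, "C.UTF-8") != NULL ||
           setlocale(LC_ALL, "en_US.UTF-8") != NULL;
}

/* Width in terminal cells of the first character of the multibyte
 * string s, or -1 if it cannot be decoded or is not printable. */
static int first_char_cells(const char *s)
{
    mbstate_t st;
    wchar_t wc;
    size_t len;

    memset(&st, 0, sizeof st);              /* start in the initial shift state */
    len = mbrtowc(&wc, s, strlen(s), &st);  /* decode one character */
    if (len == 0 || len == (size_t)-1 || len == (size_t)-2)
        return -1;                          /* empty, invalid, or incomplete */
    return wcwidth(wc);
}
```

In a UTF-8 locale on a typical glibc system, this should return 1 for a Latin or Cyrillic letter and 2 for a Chinese character, without the caller hard-coding any per-language rule.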
See comment #4. The latest released popt is 1.10.7. I've asked if the problem still exists because I believe it is already fixed. If not, I'll be happy to fix. NEEDINFO
If not fixed, I'll also need some help getting a reproducer together.
Does this problem also exist in popt 1.12, which I added for review (bug #249352)? Please add a note if the problem still exists; otherwise, please close this bug report. Thank you.
User pnasrat's account has been closed
Reassigning to owner after bugzilla made a mess, sorry about the noise...
Lawrence, popt 1.12-3 should reach Rawhide soon. Could you please check whether the problem still exists? Any suggestions on how this bug report can really be resolved, given that it has not been so far?
Created attachment 194591 [details] screenshot for zh_CN
I tried to test with zh_CN. Please correct me if I have misunderstood the actual bug, but it seems it is not fixed yet. Tested with the following RPM: popt-1.12-3.fc8
I don't think the latest popt update really solves the problem, but I am noting it here: https://admin.fedoraproject.org/updates/F8/pending/popt-1.13-1.fc8 I would tend to Hu Zheng's suggestion to mark this bug as CANTFIX, if no real solution comes up which works for every case.
Created attachment 299270 [details]
Sample from `$ LANG=zh_TW.UTF-8 rpm --help > ~/Desktop/sample`

$ rpm -qa |grep popt
popt-1.13-3.fc9.i386
$ uname -a
Linux cavalier 2.6.23.1-42.fc8 #1 SMP Tue Oct 30 13:55:12 EDT 2007 i686 athlon i386 GNU/Linux

Fedora 9 (devel / rawhide) on 27 Mar 2008.
Hi Hu Zheng, I have just triaged this once again on Fedora 9 (devel / rawhide). Please kindly advise whether this bug is still reproducible, so that we can decide on closing it. IMHO the misalignment might be caused by mixed usage of spaces and tabs? Rgds, Caius.
Created attachment 299271 [details] screenshot on 27/03/08
requested by Jens Petersen (#27995)
ping
Caius, I don't know how I should solve this problem as the downstream maintainer. Upstream told me that a native JA/KO speaker who also has some GNOME knowledge would be needed to get this really solved. Let me know if I can do something more for you. Upstream contacts are mentioned, for example, in the popt package itself.
(In reply to comment #18)
> I would tend to Hu Zheng's suggestion to mark this bug as CANTFIX, if no real
> solution comes up which works for every case.

Shall we?