finger-0.17-26
When a user's real name contains UTF-8 characters, fingering that user prints garbage in place of the name, because finger implements its own "fputs()" on top of fputc, which works one byte at a time and doesn't understand multi-byte characters. Patch and testcase attached below.
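To illustrate the failure mode described in the report, here is a minimal sketch of a byte-at-a-time filter. It is not finger's actual code: `visible_bytes` is a hypothetical helper, and the "M-"/"^" notation is an assumption modelled on the `M-^E` sequences visible in the broken output quoted later in this report. Range checks are written out by hand so the result does not depend on the locale.

```c
#include <stddef.h>

/* Hypothetical helper (not from finger): render a string the way a
 * byte-at-a-time filter would.  Bytes with the high bit set become
 * "M-" plus the low seven bits; control bytes become "^" plus a letter.
 * Simplification: whitespace such as '\n' is escaped too. */
size_t visible_bytes(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz == 0)
        return 0;
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;

        if (n + 4 >= outsz)          /* worst case per byte is "M-^X" */
            break;
        if (c & 0x80) {              /* every byte of a UTF-8 sequence lands here */
            out[n++] = 'M';
            out[n++] = '-';
            c &= 0x7f;
        }
        if (c < 0x20 || c == 0x7f) { /* control character */
            out[n++] = '^';
            c ^= 0x40;
        }
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, the two bytes of a UTF-8 "é" (0xC3 0xA9) are each treated as a separate non-printable character, so the single character turns into several characters of noise.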
Created attachment 121538 finger-munge-utf8-gecos.patch
Created attachment 121539 finger-i18n.sh
Thanks, patch applied on rawhide.
Radek, this patch doesn't take into account:
- 7-bit terminals
- the send_crs option
- stripping non-printable characters
I will fix those up, and upload a corrected patch soon.
Great, how about using the OCTALIFY macro from file to display non-printable characters?
Yep, would still need to take into account the 2 other ones. But doing a putc on each character is definitely a mistake. I'll check it out tomorrow, if I have some spare time.
Created attachment 122127 bsd-finger-wide-char-support.patch
A patch that does all that (hopefully). No autotool-fu, there's no configure.in or .ac in the tarball...
I'm not sure it works now :) At least your own test script gives me broken output:

    root@garfield finger# ./finger test-i18n
    Login: test-i18n Name: ¥µ\uffffM-^E\uffffM-^E\u04b5\uffffM-^U\u4e25\u5f25\u6f25
Well, you will need to have all this defined for it to work:

    +#if defined(HAVE_WCHAR_H) && defined(HAVE_MBRTOWC) && defined(HAVE_WCWIDTH)

As there's no configure-fu, hard-coding it at the top of display.c might be a short-term work-around.
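The hard-coding work-around mentioned above could look roughly like this. The three `HAVE_*` names come from the patch's guard; `FINGER_WIDE_CHARS` is a hypothetical name used here only to show which branch was taken, and the sketch assumes the platform really does provide `<wchar.h>`, `mbrtowc()` and `wcwidth()`.

```c
/* Short-term work-around: pretend a configure script detected these.
 * Only valid on systems that actually provide the facilities. */
#define HAVE_WCHAR_H 1
#define HAVE_MBRTOWC 1
#define HAVE_WCWIDTH 1

#if defined(HAVE_WCHAR_H) && defined(HAVE_MBRTOWC) && defined(HAVE_WCWIDTH)
#include <wchar.h>
#define FINGER_WIDE_CHARS 1   /* hypothetical marker: wide-char path compiled in */
#else
#define FINGER_WIDE_CHARS 0   /* hypothetical marker: byte-only fallback */
#endif
```

The defines have to appear before the `#if` guard is evaluated, which is why the top of display.c is the natural place for them.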
And it's quite broken still :)
Created attachment 122128 bsd-finger-wide-char-support2.patch
Ehm, again, here's my output:

    # ./finger test-i18n
    Login: test-i18n Name: ¥µ\uffffM-^E\uffffM-^E\u04b5\uffffM-^U\u4e25\u5f25\u6f25
    finger: display.c:226: fxputs: Assertion `bytesconsumed != (size_t)(-1) && bytesconsumed != (size_t)(-2)' failed.
    Directory: /home/test-i18n
    Aborted

and reading your comment in the patch :)

    /* This isn't supposed to happen as we verified the string before hand */
Yeah, I'm currently testing with hard-coded defines. Don't understand why it doesn't parse the username as a valid multi-byte string...
Created attachment 122135 bsd-finger-wide-char-support3.patch
This *should* work, but for some reason it doesn't parse the string properly, and thinks it's not a valid multi-byte string. I have no idea why it doesn't work... Can you find my stupid typo somewhere there?
Ok, I'll try to find it out today. At first glance, I don't think the verifymultibyte function is correct. The debugger shows that it returns 1 at the very first multibyte character it hits. More precisely, mbrtowc returns -1 for ¥.
yeah, which is why I think there's a typo somewhere there...
Hmm, I thought this would correctly test the whole line as a multibyte string, but it also fails when a single multibyte character appears in the line :( So is there something wrong with the input buffer?

    static int verifymultibyte(const char *buf)
    {
        mbstate_t state;
        wchar_t *wbuf;
        size_t bytesconsumed;

        (void)memset(&state, 0, sizeof(mbstate_t));
        int n = strlen(buf);
        (void)wmemset(wbuf, 0, 2*n);
        int x;
        if ((x = mbsrtowcs(wbuf, &buf,n,&state)) > 0) {
            printf("Success (length=%x)\n",x);
            return 1;
        } else {
            printf("Fail\n");
            return 0;
        }
    }
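One thing that stands out in the snippet above is that `wbuf` is never pointed at any storage before it is handed to `wmemset()` and `mbsrtowcs()`. A minimal sketch that sidesteps the output buffer entirely is shown below; `is_valid_multibyte` is a hypothetical name, not the function from any of the attached patches. It relies on the standard behaviour that `mbsrtowcs()` with a NULL destination only counts characters and still reports invalid sequences as `(size_t)-1`.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical replacement for the check above: returns 1 if buf is a
 * valid multibyte string in the current LC_CTYPE locale, 0 otherwise. */
int is_valid_multibyte(const char *buf)
{
    mbstate_t state;
    const char *src = buf;   /* working copy for mbsrtowcs */

    memset(&state, 0, sizeof state);
    /* NULL destination: nothing is stored, the string is only scanned. */
    return mbsrtowcs(NULL, &src, 0, &state) != (size_t)-1;
}
```

Note that the answer depends on the locale in effect when the function is called, so the same bytes can be accepted under one locale and rejected under another.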
Created attachment 122173 4th revision of widechar patch
Ok, this works for me. Needs some more clean-up but it should be fine ...
Created attachment 122181 bsd-finger-wide-char-support4.patch
And a patch that actually works. The only difference from my -3 patch is the call to setlocale(). Thanks Jakub for pointing that out.
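The setlocale() fix matters because a C program starts in the "C" locale, where the multibyte conversion functions have no reason to accept UTF-8 sequences. The sketch below shows the general shape, under the assumption that the user's environment selects a UTF-8 locale; `init_locale` and `count_chars` are hypothetical names, not code from the patch.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical start-up hook: adopt the locale from the environment
 * (LANG, LC_ALL, ...).  Returns 1 on success, 0 if it is unusable. */
int init_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Count the characters in s under the current LC_CTYPE locale.
 * Returns -1 if s is not a valid multibyte string in that locale. */
long count_chars(const char *s)
{
    mbstate_t st;
    size_t len = strlen(s);
    long count = 0;

    memset(&st, 0, sizeof st);
    while (len > 0) {
        size_t n = mbrtowc(NULL, s, len, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return -1;        /* invalid or truncated sequence */
        if (n == 0)
            break;            /* defensive: embedded NUL */
        s += n;
        len -= n;
        count++;
    }
    return count;
}
```

Calling `init_locale()` once at the top of `main()`, before any conversion is attempted, is the usual arrangement; without it the same name that decodes cleanly under a UTF-8 locale may be rejected at its first non-ASCII byte.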
Yep, works well now. Applied on rawhide.
Created attachment 122267 use fprintf for whole UTF8 line
Here's another patch. The change is that the converted line is now printed as a whole instead of character by character. Should be really fine now.
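Printing a converted line in one call, rather than one character at a time, might look like the following sketch. It is not the attached patch: `print_line` is a hypothetical helper, and it assumes the locale has already been set so that `mbstowcs()` can decode the input.

```c
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a whole line to wide characters and
 * print it with a single fprintf.  Returns fprintf's result, or -1 if
 * the line is not valid in the current locale or memory runs out. */
int print_line(FILE *fp, const char *line)
{
    size_t n = mbstowcs(NULL, line, 0);   /* length in wide characters */
    wchar_t *wide;
    int rc;

    if (n == (size_t)-1)
        return -1;
    wide = malloc((n + 1) * sizeof *wide);
    if (wide == NULL)
        return -1;
    mbstowcs(wide, line, n + 1);          /* n + 1 leaves room for the terminator */
    /* Any per-character filtering would operate on wide[] here. */
    rc = fprintf(fp, "%ls", wide);
    free(wide);
    return rc;
}
```

The wide buffer is the natural place to apply per-character checks, since each element is one complete character rather than one byte of it.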
Radek, the updated patch doesn't take into account the send_crs option, or strip non-printable characters if the string is a proper multi-byte one.
Those should be converted by OCTALIFY, shouldn't they?
    + } else if (bytesconsumed == 1) {
    +     op++;
    + } else {

This should be special casing stuff like \n

    + char *tmp;
    + tmp = buffer;
    + buffer[bytesconsumed] = '\0';
    + while (bytesconsumed-- > 0) {
    +     OCTALIFY(tmp, op);
    + }

tmp (ie. buffer) is used to store the OCTALIFY output, but isn't used afterwards.
IMHO \n will be processed by the if above, because it will have only one byte. Maybe this if should also have an isprint test for one-byte characters:

    } else if (bytesconsumed == 1 && isprint(op[0])) {

and the rest will be skipped or converted.
of course 'isprint(op[eop-op])'
Ok, I got a bit lost in this UTF-8 thing. Why do we want to strip \n and others? If I look at the original output, it contains newline characters... is this wrong? I would like to see a clear example of the breakage it can cause. Also, what does this sentence mean: "... or strip non-printable characters if the string is a proper multi-byte one"? I don't see where I lost these. The above comment is wrong, op is already shifted.
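The behaviour being debated here (keep printable characters, including multi-byte ones, keep newlines, and escape everything else) can be sketched as follows. This is an illustration only: `escape_line` is a hypothetical function, the `\ooo` octal notation is an assumption about what an OCTALIFY-style escape would produce, and passing `\n` and `\t` through unchanged is one possible answer to the question raised above, not a settled decision from this report.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical filter: copy printable characters (of any byte length),
 * newlines and tabs as they are; write every other byte as \ooo. */
size_t escape_line(const char *in, char *out, size_t outsz)
{
    mbstate_t st;
    size_t len = strlen(in), n = 0, i;

    if (outsz == 0)
        return 0;
    memset(&st, 0, sizeof st);
    while (len > 0) {
        wchar_t wc = 0;
        size_t used = mbrtowc(&wc, in, len, &st);
        int ok = used != (size_t)-1 && used != (size_t)-2 && used != 0;

        if (!ok) {                        /* undecodable: escape one byte */
            used = 1;
            memset(&st, 0, sizeof st);
        }
        if (ok && (iswprint((wint_t)wc) || wc == L'\n' || wc == L'\t')) {
            if (n + used >= outsz)
                break;
            memcpy(out + n, in, used);    /* keep the character intact */
            n += used;
        } else {
            if (n + 4 * used >= outsz)
                break;
            for (i = 0; i < used; i++)
                n += (size_t)sprintf(out + n, "\\%03o",
                                     (unsigned)(unsigned char)in[i]);
        }
        in += used;
        len -= used;
    }
    out[n] = '\0';
    return n;
}
```

Because the printable test is applied to the decoded character rather than to its individual bytes, a valid multi-byte character is never split, while a stray control byte is still made visible.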
Created attachment 122281 bsd-finger-wide-char-support6.patch
This patch respects the send_crs option. The problem with some non-printable characters still remains.
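As a rough illustration of what respecting send_crs could involve, the sketch below assumes the option means "emit a carriage return before each newline"; that reading is an assumption, since the report does not spell it out. `expand_crs` is a hypothetical helper, not code from the patch.

```c
#include <stddef.h>

/* Hypothetical helper: copy in to out, inserting '\r' before each '\n'
 * when send_crs is non-zero.  Assumes that is what the option means. */
size_t expand_crs(const char *in, char *out, size_t outsz, int send_crs)
{
    size_t n = 0;

    if (outsz == 0)
        return 0;
    for (; *in != '\0'; in++) {
        if (n + 2 >= outsz)      /* room for "\r\n" plus the terminator */
            break;
        if (*in == '\n' && send_crs)
            out[n++] = '\r';
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```

Since `\n` is a single byte that never occurs inside a UTF-8 multi-byte sequence, this step can run on the byte string without splitting any character.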
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0342.html