Description of Problem:
When using the tcsh package for RedHat 7.1, tcsh-6.10.5, with a French environment, the two accented letters é and è (eacute and egrave) are badly displayed: the first time the character is typed, a space is displayed. The second time, the character is displayed twice. The bug was not present in package tcsh-6.10.1, which was an update package for RedHat 7.0.

LOCALE is:
[richier@piano ~richier]# locale
LANG=fr_FR@euro
LC_CTYPE="fr_FR@euro"
LC_NUMERIC="fr_FR@euro"
LC_TIME="fr_FR@euro"
LC_COLLATE="fr_FR@euro"
LC_MONETARY="fr_FR@euro"
LC_MESSAGES="fr_FR@euro"
LC_PAPER="fr_FR@euro"
LC_NAME="fr_FR@euro"
LC_ADDRESS="fr_FR@euro"
LC_TELEPHONE="fr_FR@euro"
LC_MEASUREMENT="fr_FR@euro"
LC_IDENTIFICATION="fr_FR@euro"
LC_ALL=

How Reproducible:
In all environments: console, xterm, remote connection

Steps to Reproduce:
1. Type many eacute (e') in sequence

Additional Information:
I have compared the source packages for tcsh 6.10.1 and 6.10.5, and the problem comes from the patch tcsh-6.10.00-dspmbyte.patch. I do not have a correction, but I have two remarks:
- tcsh-6.10-5 without the dspmbyte patch works well for my French letters
- I am surprised that the patch changes an if (eq ...) into an if (strncasecmp...), as eq returns 1 if the arguments are equal and strncasecmp returns 0 in this case. So perhaps the correct test is if (0 == strncasecmp...). In fact I tried this correction and it seems to work, but as I do not understand multi-byte code, I cannot be sure of side effects.
You can't use strncasecmp or strlen on multi-byte code. I made a patch that correctly checks the length:

--- tcsh-6.10.00/sh.h.orig Wed Jun 13 11:03:42 2001
+++ tcsh-6.10.00/sh.h Wed Jun 13 11:06:30 2001
@@ -551,6 +551,7 @@
 #define eq(a, b) (Strcmp(a, b) == 0)
+#define eqn(a, b, c) (Strncmp(a, b, c) == 0)
 /* globone() flags */
 #define G_ERROR 0 /* default action: error if multiple words */
--- tcsh-6.10.00/sh.set.c.orig Wed Jun 13 11:05:37 2001
+++ tcsh-6.10.00/sh.set.c Wed Jun 13 12:40:34 2001
@@ -1211,7 +1211,8 @@
     return;
     for (i = 0; dspmt[i].n; i++) {
-    if (eq(pcp, dspmt[i].n)) {
+    if (Strlen (dspmt[i].n) > 0 &&
+        eqn(pcp, dspmt[i].n, (Strlen (dspmt[i].n)))) {
         set(CHECK_MBYTEVAR, Strsave(dspmt[i].v), VAR_READWRITE);
         update_dspmbyte_vars();
         break;

What was the Red Hat patch for: length or case comparison? I'm not sure that checking the length is needed, and I don't know how to do a case-insensitive compare.
looks like the j10n for tcsh doesn't properly respect i18n. If I can't fix it, I'll back the j10n out.
*** Bug 39645 has been marked as a duplicate of this bug. ***
Thanks for the patch. Indeed the original patch was flawed. The second patch in the interim was better, but the current one recognizes the CJK locales as well as allowing 8-bit ISO-8859 data to pass through.
Rawhide tcsh-6.10-6.i386.rpm still has the problem
tcsh-6.10-6 from RH 7.2 still has the problem. It is visible in a Slovak environment too.
tcsh-6.10-7 STILL has the problem! The correction suggested by <martinea.ca> still works.
I've just verified that this bug is still present in (null). RedHat, please please please fix the obviously incorrect patch tcsh-6.10.00-dspmbyte.patch! The proper fix is in a comment by martinea.ca above. Management version: Kanji support is enabled for all locales not beginning with a j or a k, which is obviously wrong, and it makes it impossible to use high-bit (iso8859-15) characters, which are very common in German, French, Swedish, etc.
Still broken in tcsh-6.12-4 from Red Hat 9. Time to do something about this soon? It's soon two years since martinea.ca gave you a correction, ready to apply ...
If you "set dspmbyte=utf8", does the problem go away? See also bug 73627.
Sorry, I am not using RH anymore, so I can't test it. I am also not authorized to view bug 73627, most probably because I refused to sign some weird NDA required by RH to get access to the contents of the coming releases including bug reports for the newest stuff (viva la free software). Which is also the main reason why I am not using RH anymore... FWIW, tcsh 6.11.00-2.2 from Debian works fine.
Setting dspmbyte to utf8 seems slightly strange in unibyte domains like iso-8859-*. But I tried anyway to see what happened, in my case with locale sv_SE.

It does make a difference. With this setting, the characters display correctly as I type them. But tcsh seems to interpret them to start a quoted sequence of some kind. When I hit return, I get a question mark as prompt, just as when I end a command line with a backslash to continue it on the next line.

(I'm not sure if bugzilla is 8-bit clean. In the example below, the character I try to echo is o-diaeresis, in ISO-8859-1 encoding, i.e. 0xF6.)

postville> echo "$prompt"
%m%#
postville> echo "$prompt2"
%R?
postville> echo ö
? ö
postville>

Reading the documentation of dspmbyte, I also tried to set it to 256 zeros. (There are no multibyte characters in ISO-8859-1.) That does seem to work! I will try it out more extensively during the coming days, but at first sight it seems to solve the problem.

Would it be possible to get this setting automatically, when using an ISO-8859-*, or other unibyte, locale?
*** Bug 57302 has been marked as a duplicate of this bug. ***
*** Bug 69351 has been marked as a duplicate of this bug. ***
*** Bug 75019 has been marked as a duplicate of this bug. ***
The dspmbyte patch was removed (replaced by an unrelated bugfix, actually) in tcsh-6.13-1, which should be available in rawhide soon.