41991 – Bad display of special french accented letters

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	tcsh
Sub Component:
Version:	7.1
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Miloslav Trmač
QA Contact:	David Lawrence
Docs Contact:
URL:
Whiteboard:
Duplicates (4):	39645 57302 69351 75019
Depends On:
Blocks:

Reported:	2001-05-23 15:16 UTC by Jean-Luc Richier
Modified:	2007-04-18 16:33 UTC
CC List:	8 users
Fixed In Version:	6.13-1
Clone Of:
Environment:
Last Closed:	2004-08-18 10:03:20 UTC
Embargoed:

Description Jean-Luc Richier 2001-05-23 15:16:17 UTC

Description of Problem:
When using the tcsh package for Red Hat 7.1, tcsh-6.10.5, with a French
environment, the two accented letters é and è (eacute and egrave) are
badly displayed: the first time the character is typed, a space is
displayed.  The second time, the character is displayed twice.
The bug was not present in package tcsh-6.10.1, which was an update
package for Red Hat 7.0.
LOCALE is:
[richier@piano ~richier]# locale
LANG=fr_FR@euro
LC_CTYPE="fr_FR@euro"
LC_NUMERIC="fr_FR@euro"
LC_TIME="fr_FR@euro"
LC_COLLATE="fr_FR@euro"
LC_MONETARY="fr_FR@euro"
LC_MESSAGES="fr_FR@euro"
LC_PAPER="fr_FR@euro"
LC_NAME="fr_FR@euro"
LC_ADDRESS="fr_FR@euro"
LC_TELEPHONE="fr_FR@euro"
LC_MEASUREMENT="fr_FR@euro"
LC_IDENTIFICATION="fr_FR@euro"
LC_ALL=

How Reproducible:
In all environments: console, xterm, remote connection


Steps to Reproduce:
1. Type several é (eacute) characters in sequence


Additional Information:
	
 
I have compared the source packages for tcsh 6.10.1 and 6.10.5, and the
problem comes from the patch tcsh-6.10.00-dspmbyte.patch.
I do not have a correction, but I have two remarks:
- tcsh-6.10-5 without the dspmbyte patch works well for my French letters
- I am surprised that the patch changes an if (eq ...) into an
  if (strncasecmp...), since eq returns 1 if the arguments are equal and
  strncasecmp returns 0 in that case. So perhaps the correct test is if
  (0 == strncasecmp...). In fact I tried this correction and it seems
  to work, but as I do not understand the multibyte code, I cannot be sure
  of side effects
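
The inverted test described in the second remark can be illustrated with a
minimal C sketch. It is not the Red Hat patch itself: it uses plain char
strings instead of tcsh's own Char type, and the function names
matches_buggy and matches_fixed are hypothetical. Only eq and strncasecmp
come from the report.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Same convention as the eq() macro: nonzero when the strings are equal.
 * Assumption: plain strcmp stands in for tcsh's Strcmp. */
#define eq(a, b) (strcmp((a), (b)) == 0)

/* Suspected buggy form: strncasecmp() returns 0 on a match, so using its
 * result directly as a condition is true when the strings do NOT match. */
int matches_buggy(const char *locale, const char *name)
{
    assert(locale != NULL && name != NULL);
    return strncasecmp(locale, name, strlen(name)) ? 1 : 0;
}

/* Form suggested in the report: compare the result against 0. */
int matches_fixed(const char *locale, const char *name)
{
    assert(locale != NULL && name != NULL);
    return strncasecmp(locale, name, strlen(name)) == 0;
}
```

With the buggy form, a locale such as fr_FR@euro "matches" a table entry
such as ja, while an actual Japanese locale name does not.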

Comment 1 martinea 2001-06-13 17:02:14 UTC

You can't use strncasecmp or strlen on multi byte code.

I made a patch that correctly checks the length:

--- tcsh-6.10.00/sh.h.orig      Wed Jun 13 11:03:42 2001
+++ tcsh-6.10.00/sh.h   Wed Jun 13 11:06:30 2001
@@ -551,6 +551,7 @@


 #define        eq(a, b)        (Strcmp(a, b) == 0)
+#define        eqn(a, b, c)    (Strncmp(a, b, c) == 0)

 /* globone() flags */
 #define G_ERROR                0       /* default action: error if multiple words */
--- tcsh-6.10.00/sh.set.c.orig  Wed Jun 13 11:05:37 2001
+++ tcsh-6.10.00/sh.set.c       Wed Jun 13 12:40:34 2001
@@ -1211,7 +1211,8 @@
        return;

     for (i = 0; dspmt[i].n; i++) {
-       if (eq(pcp, dspmt[i].n)) {
+       if (Strlen (dspmt[i].n) > 0 &&
+            eqn(pcp, dspmt[i].n, (Strlen (dspmt[i].n)))) {
            set(CHECK_MBYTEVAR, Strsave(dspmt[i].v), VAR_READWRITE);
            update_dspmbyte_vars();
            break;

What was the Red Hat patch for: length or case comparison?

I'm not sure that checking the length is needed, and I don't know
how to do a case-insensitive comparison.
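
One possible way to do a case-insensitive prefix comparison on wide strings
is sketched below. It assumes wchar_t and towlower() as stand-ins for tcsh's
Char type and Str* helpers, which are not reproduced here, and
wide_prefix_eq_nocase is a hypothetical name. It is an illustration only,
not part of the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Returns 1 when s starts with prefix, ignoring case; 0 otherwise.
 * An empty prefix never matches, mirroring the Strlen(...) > 0 guard
 * in the patch. */
int wide_prefix_eq_nocase(const wchar_t *s, const wchar_t *prefix)
{
    size_t i;

    assert(s != NULL && prefix != NULL);
    if (prefix[0] == L'\0')
        return 0;
    for (i = 0; prefix[i] != L'\0'; i++) {
        /* This also stops at the end of s: its terminating L'\0'
         * differs from every remaining prefix character. */
        if (towlower((wint_t)s[i]) != towlower((wint_t)prefix[i]))
            return 0;
    }
    return 1;
}
```

Comparing whole wide characters avoids running byte-oriented functions such
as strncasecmp over the middle of a multibyte sequence.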

Comment 2 Eido Inoue 2001-06-28 14:36:55 UTC

looks like the j10n for tcsh doesn't properly respect i18n. If I can't fix it,
I'll back the j10n out.

Comment 3 Eido Inoue 2001-06-28 14:37:34 UTC

*** Bug 39645 has been marked as a duplicate of this bug. ***

Comment 4 Eido Inoue 2001-07-24 04:35:18 UTC

Thanks for the patch. Indeed, the original patch was flawed. The second patch in
the interim was better, but the current one recognizes the CJK locales as well as
allowing 8-bit ISO-8859 data to pass through.

Comment 5 Kurt Swanson 2001-08-16 04:44:55 UTC

Rawhide tcsh-6.10-6.i386.rpm still has the problem

Comment 6 stano 2001-12-23 15:50:19 UTC

tcsh-6.10-6 from RH 7.2 still has the problem. It is visible in a Slovak
environment too.

Comment 7 Göran Uddeborg 2002-02-12 20:19:33 UTC

tcsh-6.10-7 STILL has the problem!

The correction suggested by <martinea.ca> still works.

Comment 8 Tobias Ringstrom 2002-09-13 16:31:40 UTC

I've just verified that this bug is still present in (null).

RedHat, please please please fix the obviously incorrect patch
tcsh-6.10.00-dspmbyte.patch!  The proper fix is in a comment
by martinea.ca above.

Management version:

      Kanji support is enabled for all locales not beginning
      with a j or a k, which is obviously wrong, and it makes
      it impossible to use high bit (iso8859-15) characters
      which are very common in German, French, Swedish, etc.

Comment 9 Göran Uddeborg 2003-03-31 15:56:39 UTC

Still broken in tcsh-6.12-4 from Red Hat 9.  Is it time to do something about this
soon?  It has been almost two years since martinea.ca gave you a
correction, ready to apply ...

Comment 10 Aleksey Nogin 2003-08-28 17:06:37 UTC

If you "set dspmbyte=utf8", does the problem go away? See also bug 73627.

Comment 11 stano 2003-08-28 18:02:49 UTC

Sorry, I am not using RH anymore, so I can't test it. I am also not authorized
to view bug 73627, most probably because I refused to sign some weird NDA
required by RH to get access to the contents of the coming releases including
bug reports for the newest stuff (viva la free software). Which is also the main
reason why I am not using RH anymore...

FWIW, tcsh 6.11.00-2.2 from Debian works fine.

Comment 12 Göran Uddeborg 2003-09-06 21:04:29 UTC

Setting dspmbyte to utf8 seems slightly strange in unibyte domains like
iso-8859-*.  But I tried anyway to see what happened, in my case with locale sv_SE.

It does make a difference.  With this setting, the characters display correctly
as I type them.  But tcsh seems to interpret them to start a quoted sequence of
some kind.  When I hit return, I get a question mark as prompt, just as when I
end a command line with a backslash to continue it on the next line.  (I'm not
sure if bugzilla is 8-bit clean.  In the example below, the character I try to
echo is o-diaeresis, in ISO-8859-1 encoding, i.e. 0xF6.)

postville> echo "$prompt"
%m%# 
postville> echo "$prompt2"
%R? 
postville> echo ö
? 
ö

postville>

Reading the documentation of dspmbyte, I also tried to set it to 256 zeros. 
(There are no multibyte characters in ISO-8859-1.)  That does seem to work!  I
will try it out more extensively during the coming days, but at first sight it
seems to solve the problem.

Would it be possible to get this setting automatically, when using an
ISO-8859-*, or other unibyte, locale?
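
As a rough sketch of how such an automatic default might be derived, the C
fragment below checks whether the current locale is single-byte and builds
the 256-zero table described above. This is an assumption about one possible
approach, not tcsh code: locale_is_unibyte and fill_unibyte_dspmbyte are
hypothetical names, and only MB_CUR_MAX, setlocale and memset are standard
library facilities.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>   /* MB_CUR_MAX */
#include <string.h>

#define DSPMBYTE_TABLE_LEN 256

/* Returns 1 when the current LC_CTYPE locale uses at most one byte per
 * character (e.g. the "C" locale or an ISO-8859-* locale), else 0. */
int locale_is_unibyte(void)
{
    return MB_CUR_MAX == 1;
}

/* Fills buf with 256 '0' characters and a terminating NUL, i.e. a table
 * that marks no byte value as part of a multibyte character. */
void fill_unibyte_dspmbyte(char buf[DSPMBYTE_TABLE_LEN + 1])
{
    assert(buf != NULL);
    memset(buf, '0', DSPMBYTE_TABLE_LEN);
    buf[DSPMBYTE_TABLE_LEN] = '\0';
}
```

A caller would first select the user's locale with setlocale(LC_CTYPE, "")
and use the generated string as the dspmbyte value only when
locale_is_unibyte() returns 1.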

Comment 13 Miloslav Trmač 2004-08-17 13:26:50 UTC

*** Bug 57302 has been marked as a duplicate of this bug. ***

Comment 14 Miloslav Trmač 2004-08-17 13:46:02 UTC

*** Bug 69351 has been marked as a duplicate of this bug. ***

Comment 15 Miloslav Trmač 2004-08-17 13:50:05 UTC

*** Bug 75019 has been marked as a duplicate of this bug. ***

Comment 16 Miloslav Trmač 2004-08-18 10:03:20 UTC

The dspmbyte patch was removed (replaced by an unrelated bugfix,
actually) in tcsh-6.13-1, which should be available in rawhide soon.
