Bug 74145 - mc-4.5.55-m.patch in SRPM is bogus
Summary: mc-4.5.55-m.patch in SRPM is bogus
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Raw Hide
Classification: Retired
Component: mc
Version: 1.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Havoc Pennington
QA Contact: Jay Turner
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2002-09-16 22:04 UTC by Miloslav Trmac
Modified: 2015-01-08 00:00 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-05-09 12:55:47 UTC
Embargoed:



Description Miloslav Trmac 2002-09-16 22:04:34 UTC
Version-Release number of selected component (if applicable):
mc-4.5.55-12

Whatever the mc-4.5.55-mb.patch is supposed to do, it doesn't:
+                      if((*text & 0xC0) == 0xC0) { /* start of multi byte char
+                              if((*text & 0xC0) != 0x80)
+                                      len = 2;
+                              else if((*text & 0xC0) != 0x800)
+                                      len = 3;
+                              else if((*text & 0xC0) != 0x10000)
+                                      len = 4;
+                              SLsmg_write_nstring(text, len);
+                              text = text + len;
+                      }

The above is equivalent to
+                      if((*text & 0xC0) == 0xC0) { /* start of multi byte char
+                              len = 2;
+                              SLsmg_write_nstring(text, len);
+                              text = text + len;
+                      }
(that is, the nested ifs are completely bogus: the first condition is
always true, so the second and third are never even reached).
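
To make the equivalence concrete, here is a minimal sketch that isolates the length selection from the quoted patch. The function name patch_len and the default of 1 for bytes outside the branch are assumptions for illustration only; they are not part of the patch.

```c
/* Hypothetical helper: returns the length the patch's nested ifs would
   pick for a given lead byte b. The conditions are copied from the patch. */
int patch_len(unsigned char b)
{
    int len = 1;  /* assumption: single byte when the outer test fails */

    if ((b & 0xC0) == 0xC0) {
        /* Inside this branch (b & 0xC0) is exactly 0xC0. */
        if ((b & 0xC0) != 0x80)          /* 0xC0 != 0x80: always true */
            len = 2;
        else if ((b & 0xC0) != 0x800)    /* never reached */
            len = 3;
        else if ((b & 0xC0) != 0x10000)  /* never reached */
            len = 4;
    }
    return len;
}
```

Whatever lead byte is passed in (0xC4, 0xE2, 0xF0, ...), the result inside the branch is 2.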

I haven't looked at the surrounding code and am only guessing that text
is supposed to point at UTF-8 text, so the following may be as bogus
as the original patch.

It's nice that Czech will work now, but what about the languages
which use more than 11 bits (2 UTF-8 bytes) of character code space?
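
The 11-bit figure comes from the 2-byte form 110xxxxx 10xxxxxx, which carries 5 + 6 payload bits. A rough sketch of how the code point value maps to the encoded length is below. It assumes the original 6-byte UTF-8 scheme, and the function name utf8_encoded_len is made up for illustration. The 0x800 and 0x10000 constants in the patch look like these code point limits compared against a masked byte by mistake, but that is only a guess.

```c
/* Hypothetical helper: number of bytes needed to encode code point cp
   in UTF-8 (original scheme, up to 6 bytes). Returns 0 if cp is out of
   range for that scheme. */
int utf8_encoded_len(unsigned long cp)
{
    if (cp < 0x80UL)       return 1;  /*  7 payload bits */
    if (cp < 0x800UL)      return 2;  /* 11 payload bits */
    if (cp < 0x10000UL)    return 3;  /* 16 payload bits */
    if (cp < 0x200000UL)   return 4;  /* 21 payload bits */
    if (cp < 0x4000000UL)  return 5;  /* 26 payload bits */
    if (cp <= 0x7FFFFFFFUL) return 6; /* 31 payload bits */
    return 0;
}
```

For example, U+010D (Czech "č") fits in 2 bytes, while U+20AC (the euro sign) already needs 3.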

FWIW, the right way to test UTF-8 character length:
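len = 1;
// if ((*text & 0xC0) == 0x80) CHAR_IS_BOGUS; else
if ((*text & 0xC0) == 0xC0)
  {
    // compare as unsigned char: if text is a plain (possibly signed)
    // char pointer, every lead byte would otherwise compare below 0xE0
    unsigned char c = (unsigned char) *text;
    if (c < 0xE0) len = 2;
    else if (c < 0xF0) len = 3;
    else if (c < 0xF8) len = 4;
    // and so on
  }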
Not to mention that the maximal length of a UTF-8 character is 6 bytes
(per RFC 2279), not 4 as in the patch.
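
For completeness, the lead-byte test can be written as a self-contained function. This is only a sketch: the name utf8_lead_len and the convention of returning 0 for bytes that cannot start a character are assumptions, not anything taken from mc. It follows the six-byte definition mentioned above; RFC 3629 later restricted UTF-8 to at most four bytes.

```c
/* Hypothetical helper: length in bytes of the UTF-8 sequence whose first
   byte is *text, judged from that byte alone. Returns 0 for a
   continuation byte (10xxxxxx) or for 0xFE/0xFF, which never start a
   sequence. Trailing bytes are not validated here. */
int utf8_lead_len(const char *text)
{
    /* Go through unsigned char so the comparisons work even where
       plain char is signed. */
    unsigned char c = (unsigned char) *text;

    if (c < 0x80) return 1;  /* 0xxxxxxx: ASCII */
    if (c < 0xC0) return 0;  /* 10xxxxxx: continuation byte */
    if (c < 0xE0) return 2;  /* 110xxxxx */
    if (c < 0xF0) return 3;  /* 1110xxxx */
    if (c < 0xF8) return 4;  /* 11110xxx */
    if (c < 0xFC) return 5;  /* 111110xx */
    if (c < 0xFE) return 6;  /* 1111110x */
    return 0;                /* 0xFE, 0xFF: invalid */
}
```

A caller would still need to check that enough bytes remain in the buffer, and that each of them is a continuation byte, before handing the sequence to something like SLsmg_write_nstring.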

Comment 1 Miloslav Trmac 2003-05-09 12:55:47 UTC
This patch is not present in current rawhide.

