Bug 170072 - /usr/share/i18n/charmaps/IBM1047 maps EBCDIC 0x15 and 0x25 incorrectly
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-10-06 23:48 UTC by Eric Hopper
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2005-10-07 17:29:20 UTC
Type: ---
Embargoed:


Attachments
This is an updated IBM1047 with the correct mappings. (12.28 KB, text/plain)
2005-10-06 23:50 UTC, Eric Hopper

Description Eric Hopper 2005-10-06 23:48:18 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
This map is incorrect, at least according to the z/OS telnet daemon.  When told that the terminal is UTF-8 capable, the daemon translates EBCDIC 0x15 to UTF-8 0x0A (LF) and EBCDIC 0x25 to UTF-8 0xC2 0x85 (U+0085, NEL).  The charmap in /usr/share/i18n/charmaps/IBM1047 has these two the other way around.  This is causing me a lot of problems in converting EBCDIC documents to ASCII, because the newline characters come out all fouled up.


Version-Release number of selected component (if applicable):
glibc-2.3.5-10.3

How reproducible:
Always

Steps to Reproduce:
1. Find an EBCDIC document
2. iconv -f IBM1047 -t ISO-8859-1 <document >document.ascii
3. Notice how there don't seem to be any newlines in document.ascii
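
A possible workaround, sketched here under the assumption that the input follows the z/OS convention described above (0x15 is the line terminator): exchange bytes 0x15 and 0x25 in the raw EBCDIC data before it reaches iconv, so that the shipped charmap yields LF where the document expects it. The function name swap_newline_bytes and the sample bytes are hypothetical and for illustration only.

```python
# Translation table that exchanges the two EBCDIC newline candidates
# (0x15 and 0x25) and leaves every other byte value unchanged.
_SWAP_TABLE = bytes.maketrans(b"\x15\x25", b"\x25\x15")


def swap_newline_bytes(data: bytes) -> bytes:
    """Return a copy of data with EBCDIC bytes 0x15 and 0x25 exchanged."""
    return data.translate(_SWAP_TABLE)


# Illustrative input: five EBCDIC letter bytes followed by 0x15.
sample = b"\xc8\x85\x93\x93\x96\x15"
swapped = swap_newline_bytes(sample)
# The trailing 0x15 is now 0x25; the letter bytes are untouched.
```

The swapped bytes could then be written to a file and passed to the iconv command from step 2. Applying the function twice restores the original data, so the same helper also works in the opposite direction.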


Additional info:

Comment 1 Eric Hopper 2005-10-06 23:50:41 UTC
Created attachment 119696
This is an updated IBM1047 with the correct mappings.

Comment 2 Jakub Jelinek 2005-10-07 17:29:20 UTC
The problem is that EBCDIC is inconsistent: sometimes 0x15 is LF and 0x25 is NEL,
and sometimes it is the other way around.  See
http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf
for details.  Neither setting is right or wrong; see table 5-3 in that
document and the text below the table.
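
Since either byte may act as the line terminator, one rough way to guess which convention a given file uses is to count both bytes and pick the more frequent one. This is only a heuristic sketch, not something glibc or iconv does; the function name guess_newline_byte is hypothetical, and the guess can be wrong for files that mix both bytes.

```python
from collections import Counter
from typing import Optional


def guess_newline_byte(data: bytes) -> Optional[int]:
    """Guess which of 0x15 / 0x25 a file uses as its line terminator.

    Returns the more frequent of the two byte values, or None when the
    counts are equal (including when neither byte occurs).
    """
    counts = Counter(data)  # iterating bytes yields integer byte values
    n15, n25 = counts[0x15], counts[0x25]
    if n15 == n25:
        return None
    return 0x15 if n15 > n25 else 0x25
```

A result of 0x15 would suggest the file follows the convention the reporter describes, in which case the shipped IBM1047 charmap turns its line ends into NEL rather than LF.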

Comment 3 Eric Hopper 2005-10-07 19:10:48 UTC
Then both mappings should have their own conversion tables so I can use the one
appropriate to the situation.  The one that's there now is useless for the work
I'm trying to do right now.  I will keep a table that maps these bytes the way
that works for me, basically as a fork of the glibc tree that I will have to
hand-maintain to stay current with every release.

I find that level of effort for something this simple to be kind of ridiculous,
particularly since converting an encoding in /usr/share/i18n to a .so in
/usr/lib/gconv is very painful without actually opening up the glibc source and
doing a lot of divination.
