Red Hat Bugzilla – Bug 170072
/usr/share/i18n/charmaps/IBM1047 maps EBCDIC 0x15 and 0x25 incorrectly
Last modified: 2007-11-30 17:11:14 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7
Description of problem:
This map is incorrect, at least according to the z/OS telnet daemon. When told that the terminal is UTF-8 capable, the daemon translates EBCDIC 0x15 to UTF-8 0x0A (LF) and EBCDIC 0x25 to UTF-8 0xC2 0x85 (Unicode U+0085, NEL). The charmap in /usr/share/i18n/charmaps/IBM1047 has these two the other way around. This causes me a lot of problems when converting EBCDIC documents to ASCII, because the newline characters come out wrong.
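To make the two conventions concrete, here is a minimal Python sketch of the byte-to-UTF-8 results described above. The two dictionaries are transcribed from this report and have not been checked against the glibc charmap file; the names `ZOS_TELNET`, `GLIBC_IBM1047`, and `to_utf8` are hypothetical and exist only for illustration.

```python
# Byte -> Unicode mappings as described in this report
# (assumption: transcribed from the text, not read from glibc).
ZOS_TELNET = {0x15: "\u000a", 0x25: "\u0085"}     # 0x15 = LF, 0x25 = NEL
GLIBC_IBM1047 = {0x15: "\u0085", 0x25: "\u000a"}  # the reverse assignment


def to_utf8(mapping, ebcdic_byte):
    """Return the UTF-8 encoding of the character a mapping assigns to a byte."""
    return mapping[ebcdic_byte].encode("utf-8")


for name, mapping in (("z/OS telnet", ZOS_TELNET), ("glibc IBM1047", GLIBC_IBM1047)):
    for byte in (0x15, 0x25):
        encoded = to_utf8(mapping, byte).hex().upper()
        print(f"{name}: EBCDIC 0x{byte:02X} -> UTF-8 0x{encoded}")
```

Note that U+0085 takes two bytes in UTF-8 (0xC2 0x85), which is why a swapped mapping shows up as stray two-byte sequences rather than as line breaks.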
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Find an EBCDIC document
2. iconv -f IBM1047 -t ISO-8859-1 <document >document.ascii
3. Notice how there don't seem to be any newlines in document.ascii
Created attachment 119696
This is an updated IBM1047 with the correct mappings.
The problem is that EBCDIC is inconsistent: sometimes 0x15 is LF and 0x25 is NEL,
and sometimes it is the other way around. See
for details. Neither setting is simply right or wrong; see table 5-3 in that
document and the text below the table.
Then both mappings should have their own conversion tables so I can use the one
appropriate to the situation. The one that's there now is useless for the work
I'm trying to do right now. I will keep a version that maps these bytes the way
that works for me, basically as a fork of the glibc tree that I will have to
hand-maintain to stay current with every release.
I find that level of effort for something this simple kind of ridiculous,
particularly since converting an encoding in /usr/share/i18n to a .so in
/usr/lib/gconv is very painful without actually opening up the glibc source and
doing a lot of divination.
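One possible stopgap that avoids maintaining a forked charmap is to swap the two bytes in the EBCDIC input before it reaches iconv. The following is only a sketch under the assumption that every 0x15 and 0x25 in the document is a line-control byte, not part of binary or packed data; the function name `swap_lf_nel` is hypothetical.

```python
# Translation table that exchanges EBCDIC 0x15 and 0x25 and leaves
# every other byte value untouched.
_SWAP_TABLE = bytes.maketrans(b"\x15\x25", b"\x25\x15")


def swap_lf_nel(data: bytes) -> bytes:
    """Return a copy of the EBCDIC data with bytes 0x15 and 0x25 exchanged.

    Assumption: both byte values are used only as line controls in the
    input, so swapping them cannot corrupt other content.
    """
    return data.translate(_SWAP_TABLE)
```

The swapped bytes can then be written to a file and converted with the existing table, for example with the iconv command from the steps to reproduce. Applying the function twice restores the original data, so the same helper works in either direction.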