Bug 19647

Summary: 8859-1 and other encodings fail with iconv (endianness)
Product: [Retired] Red Hat Linux Reporter: krip
Component: jikes Assignee: Trond Eivind Glomsrød <teg>
Status: CLOSED RAWHIDE QA Contact: Aaron Brown <abrown>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.0   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2000-11-15 11:13:14 UTC Type: ---

Description krip 2000-10-24 00:21:38 UTC
The Red Hat 7.0 jikes-1.12-1 dist doesn't grok chars >128 when built with iconv
(the binary package as well as a naive source build).
Without an encoding specified, they are invalid (jikes says \uFFF6 ...);
with the -encoding option, the whole file is broken :(

To make it work, I had to patch line 895 in stream.cpp v 1.40
(char ** cast on the 2nd (!) arg to iconv) and line 552 in option.cpp v 1.43
(explicitly demanding UTF-16BE !?).

diff -c orig/jikes-1.12/src/stream.cpp jikes-1.12/src/stream.cpp
*** orig/jikes-1.12/src/stream.cpp      Tue Jul 25 13:32:33 2000
--- jikes-1.12/src/stream.cpp   Tue Oct 24 01:41:40 2000
***************
*** 890,896 ****
                      size_t   chl  = 2;
                      size_t   srcl = 1;
                      size_t n = iconv(control.option.converter,
!                                      &source_ptr, &srcl,
                                       (char **)&chp, &chl
                      );
  
--- 890,896 ----
                      size_t   chl  = 2;
                      size_t   srcl = 1;
                      size_t n = iconv(control.option.converter,
!                                      (char **)&source_ptr, &srcl,
                                       (char **)&chp, &chl
                      );

diff -c orig/jikes-1.12/src/option.cpp jikes-1.12/src/option.cpp
*** orig/jikes-1.12/src/option.cpp      Mon Jul 31 00:56:31 2000
--- jikes-1.12/src/option.cpp   Tue Oct 24 01:53:54 2000
***************
*** 251,257 ****
  #elif defined(HAVE_ICONV_H)
                  encoding = new char[strlen(arguments.argv[++i]) + 1];
                  strcpy(encoding, arguments.argv[i]);
!                 converter=iconv_open("utf-16", encoding);
                  if(converter==(iconv_t)-1)
                  {
                      converter = NULL;
--- 251,257 ----
  #elif defined(HAVE_ICONV_H)
                  encoding = new char[strlen(arguments.argv[++i]) + 1];
                  strcpy(encoding, arguments.argv[i]);
!                 converter=iconv_open("UTF-16BE", encoding);
                  if(converter==(iconv_t)-1)
                  {
                      converter = NULL;
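
The effect of the second hunk can be checked in isolation with a small program.
The following is only a sketch and not jikes code: decode_one_byte is a
hypothetical helper, and it assumes the glibc prototype of iconv (char ** for
the input pointer) and that the caller assembles the two output bytes as
big-endian. Where plain "utf-16" means host byte order, possibly preceded by a
byte-order mark, a 2-byte output buffer read that way would presumably come out
wrong on i386, which is consistent with the patch asking for "UTF-16BE" by name.

```cpp
#include <cassert>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of jikes): convert one byte of a single-byte
// `encoding` into one 16-bit value, requesting big-endian output explicitly.
unsigned short decode_one_byte(const char *encoding, unsigned char byte)
{
    // Naming the byte order avoids depending on what plain "utf-16" means
    // to a particular iconv implementation.
    iconv_t cd = iconv_open("UTF-16BE", encoding);
    if (cd == (iconv_t)-1)
        throw std::runtime_error(std::string("unsupported encoding: ") + encoding);

    char in[1] = { static_cast<char>(byte) };
    char out[2] = { 0, 0 };
    char *in_ptr = in;    // non-const, matching the glibc iconv prototype
    char *out_ptr = out;
    size_t in_left = sizeof in;
    size_t out_left = sizeof out;

    size_t n = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (n == (size_t)-1)
        throw std::runtime_error("iconv failed");

    // One single-byte character should fill exactly one 16-bit unit.
    assert(out_left == 0);

    // Assemble the value from big-endian bytes, independent of host order.
    return static_cast<unsigned short>(
        (static_cast<unsigned char>(out[0]) << 8) |
         static_cast<unsigned char>(out[1]));
}
```

Called as decode_one_byte("ISO-8859-1", 0xF6), this should yield 0x00F6 on
little-endian and big-endian hosts alike, since the byte order is fixed by the
converter name rather than by the machine.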

Comment 1 Trond Eivind Glomsrød 2000-10-24 19:08:45 UTC
I've asked one of the jikes developers to have a look at it....

Comment 2 Trond Eivind Glomsrød 2000-10-24 21:34:51 UTC
Here's the answer from Mo DeJong (mdejong):

Yeah, we know about this one. It is the single worst problem
in 1.12. Just as we fixed the CLASSPATH crashing problem
in 1.11, another nasty one shows up (Ugh). Fixing this
problem does not look too hard for any one locale, the
problem is we need to get regression tests in place before
a bug fix can be checked into the CVS. We need tests for
every single unicode char and its escaped equiv, so that
we can be sure it works everywhere. I am sure I will get
around to fixing it soon.


Comment 3 noa 2000-11-15 11:13:11 UTC
ok, the whole bug report is somewhat confusing, so while we're waiting for a
proper fix from the jikes people, here is an explanation of what i've come to
understand. The first part of the patch is needed for jikes to compile, and is
incorporated into redhat's src.rpm. The second patch is needed for -encoding to
work. Thus, if you want to compile java files with, for example, ISO-8859-1
encoding, you need to apply the second patch and pass '-encoding iso-8859-1' as
an argument to jikes when compiling.

I think it would be nice to have ISO-8859-1 as the implicit default encoding
(which is how javac, and jikes on other systems, behave) but that isn't really
needed.
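
A default of ISO-8859-1 would not need iconv at all, because each ISO-8859-1
byte has the same numeric value as its Unicode code point. The sketch below
only illustrates that idea: latin1_to_utf16 is a hypothetical name, not a
jikes function. The cast through unsigned char matters where char is signed,
as on i386: without it the byte 0xF6 widens to 0xFFF6, which matches the
\uFFF6 from the description, although whether that is the actual cause inside
jikes is not established here.

```cpp
#include <cassert>
#include <string>

// Hypothetical fallback (not jikes code): widen ISO-8859-1 bytes to
// 16-bit code units without calling iconv.
std::u16string latin1_to_utf16(const std::string &bytes)
{
    std::u16string out;
    out.reserve(bytes.size());
    for (char c : bytes) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    // Every input byte produces exactly one output unit.
    assert(out.size() == bytes.size());
    return out;
}
```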

Comment 4 Trond Eivind Glomsrxd 2000-11-28 20:07:48 UTC
The patch has been applied to jikes-1.12-2 - a total fix should be part of 1.13,
whenever that is released.