Bug 138618 - openoffice 1.1.3 has problem with localized characters
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: im-sdk
Version: 3
Hardware: All Linux
Priority: medium  Severity: high
Assigned To: Akira TAGOH
Keywords: i18n
Duplicates: 132541 134591 138183 140152 146925
Reported: 2004-11-10 04:45 EST by kb
Modified: 2007-11-30 17:10 EST

Doc Type: Bug Fix
Last Closed: 2005-09-05 03:31:21 EDT

Attachments
how it looks in openoffice (3.69 KB, image/png)
2004-11-10 05:14 EST, kb
Description kb 2004-11-10 04:45:56 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8a5)
Gecko/20041104

Description of problem:
after installing fedora core 3 (upgraded a fully updated fedora core 2
to fedora core 3) openoffice 1.1.3 (installed from the distribution from
openoffice) does not accept localized characters such as å ä ö;
instead it displays Ã¥ ä ö, respectively.
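
The garbling described above is what one would see if the UTF-8 bytes of each character were re-read one byte at a time as Latin-1. A minimal Python sketch of that assumption follows; the cause is not yet established at this point in the report, and `show_as_latin1` is a hypothetical helper name used only for illustration.

```python
def show_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then misread each byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    # Print each Swedish character next to its garbled form.
    for ch in "åäö":
        print(ch, "->", show_as_latin1(ch))
```

Each input character turns into two characters, the first of which is always Ã for this range, which matches the leading Ã seen in the report.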

i tried reinstalling openoffice 1.1.3, no change.
keyboard is set to swedish. the same characters work fine in other
applications.

Version-Release number of selected component (if applicable):
1.1.3

How reproducible:
Always

Steps to Reproduce:
see above

Additional info:
Comment 1 kb 2004-11-10 04:50:53 EST
i should add. i use the english version of openoffice (non localized
build)
Comment 2 kb 2004-11-10 05:00:28 EST
this also happens with openoffice 1.9.60 (1.9.m60)
Comment 3 kb 2004-11-10 05:14:02 EST
now i see that the erratic chars that i see in openoffice are
converted correctly in bugzilla, please see attachment
Comment 4 kb 2004-11-10 05:14:56 EST
Created attachment 106403 [details]
how it looks in openoffice
Comment 5 kb 2004-11-10 16:44:44 EST
tried reinstalling fonts (fonts-xorg*), no change.
changing severity since this basically makes it impossible to change
documents written in local language.
Comment 6 Dan Williams 2004-11-10 16:49:13 EST
Technically, since you've got 1.1.3 installed, this is an "upstream"
OOo bug and Fedora won't deal with it.  However, the 1.1.2 and such on
Fedora exhibit this problem too, so you're lucky :)

Here's one thing that's been reported to work for the moment:  try
disabling all input servers you may have running (htt/iiimf, Canna,
FreeWNN, etc).  For example, as root from the command line do
"/sbin/service iiim stop" and see if your problem goes away.

If that works, can you tell me what your LANG is and what the contents
of your /etc/sysconfig/i18n file are?

Thanks!
Dan
Comment 7 kb 2004-11-10 16:57:26 EST
stopping iiim did the trick.

here is the content of /etc/sysconfig/i18n:
LANG="en_US.UTF-8"
SUPPORTED="en_GB.UTF-8:en_GB:en:en_US.UTF-8:en_US:en:de_AT.UTF-8:de_AT:de:de_DE.UTF-8:de_DE:de:no_NO.UTF-8:no_NO:no:nn_NO.UTF-8:nn_NO:nn:pl_PL.UTF-8:pl_PL:pl:es_ES.UTF-8:es_ES:es:sv_SE.UTF-8:sv_SE:sv"
SYSFONT="latarcyrheb-sun16"

Comment 8 Frode Tennebø 2005-02-18 17:26:19 EST
I can confirm kb's findings and that stopping htt fixes things (after restarting 
OOo):

[root@leia iptables]# more /etc/sysconfig/i18n
LANG="en_US.UTF-8"
SUPPORTED="en_US.UTF-8:en_US:en:nb_NO.UTF-8:nb_NO:nb"
SYSFONT="latarcyrheb-sun16"
[root@leia iptables]# rpm -q openoffice.org
openoffice.org-1.1.3-5.5.0.fc3
Comment 9 Víctor Daniel Velasco Martínez 2005-02-24 22:00:03 EST
Same problem here, and solved by stopping iiim service   
   
$ more /etc/sysconfig/i18n  
LANG="es_MX.UTF-8"  
SUPPORTED="es_MX.UTF-8:es_MX:es:en_US.UTF-8:en_US:en:ja_JP.UTF-8:ja_JP:ja"  
SYSFONT="latarcyrheb-sun16"  
  
In RPM: openoffice.org-1.1.3-6.5.0.fc3 using FC3 
 
also, as a side note, CentOS 4 RC1 has exactly the same problem. I 
hope RHEL4 doesn't. 
Comment 10 Dan Williams 2005-02-24 22:08:34 EST
Lawrence & Leon:

We've had this problem for quite a while, and of course the workaround
is to turn off htt & iiim services...  Do you guys have any idea what
in the IIIMF framework would be causing this problem?  It seems only
to have problems with Western locales.  Perhaps it's related to the bug
we had a year ago with IIIMF returning the Unicode values switched
around or whatever that was?
Comment 11 Dan Williams 2005-02-24 22:14:17 EST
llim & llch:

In fact, this looks _exactly_ like the bug we had such a long time ago:
what iiimf returns is ~A and the original character.  The
previous bug is Bug 124538.  Leon, the patch that you posted in that
issue may have gotten dropped; I will check.

https://bugzilla.redhat.com/beta/show_bug.cgi?id=124538

Dan
Comment 12 Dan Williams 2005-02-24 22:22:59 EST
llim & llch:

The patch in Bug 124538 _is_ applied to both RHEL-4 and devel versions
of OOo.  Evidently more needs to be done.  Is one of you guys (or
tagoh) able to investigate a bit more?
Comment 13 Dan Williams 2005-02-24 22:24:51 EST
Víctor Daniel Velasco Martínez:

Are you running in GNOME or KDE?  If you're in GNOME, do the widgets
in OOo track your current theme?  (ie, I'm trying to figure out if
you're using the GTK VCLplug, the KDE VCLplug, or the generic X11
VCLplug because the input code is slightly different for each one).
Comment 14 Leon Ho 2005-02-24 23:50:10 EST
The thing is we are using the gtk+ binding, so that portion of code should
be totally skipped? So your question of whether Víctor is using
GNOME or KDE is very valid.

Tagoh-san, what is your take on this?
Comment 15 Víctor Daniel Velasco Martínez 2005-02-25 11:36:58 EST
Well, using FC3 and CentOS4, both in KDE with iiim activated and
with openoffice.org-kde installed, the error happens, but using iiim
to write japanese is ok, and an IME status window opens with OO.
    
When running in gnome, at least this fast test I made with FC3    
doesn't show the IME status window when opening OO, can't switch    
with CTRL-SPACE or SHIFT-SPACE, but spanish characters are ok. I'm    
not sure if I could activate the iiim in centos4, but spanish worked    
right too in gnome.     
    
The widgets in gnome are rendered using GTK, and qt's in kde.    
    
In FC3, i've got gtk-qt-engine-0.60-0.fdr.1.3 from kde-redhat
installed, but I turned it off to be sure of the widget drawing. I
haven't used openoffice in fc3 before, so I can't say when
the problem started, or whether it was there from the beginning; in
centos4, it failed from a fresh install.
  
Side note... I know centos is not yours, but if they are using the
same source code as you, I think the problem could also be present
in rhel, so maybe somebody can verify whether the problem is there too.
 
A second side note, the OO.o team has this bug registered too
http://www.openoffice.org/issues/show_bug.cgi?id=32772
and they are considering your patch for OO.o2 and OO.o1.1.5 (if ever
compiled); they also explain why it wasn't patched before (it breaks
compatibility with older releases of IIIM and IIIMP)
Comment 18 Leon Ho 2005-02-25 12:35:19 EST
Dan: Is the IIIMP code in i18n_im.cxx enabled or disabled ATM? And it
looks like 2.0 has some fixes, based on some discussion in the upstream bug?
Comment 19 Dan Williams 2005-02-25 13:03:32 EST
Leon: I believe the code in i18n_im.cxx is _enabled_ at the moment. 
The patch from the old issue was applied upstream and is present in
our builds.  The VCL in our builds is based off backported 2.0 code
from about 4 or 5 months ago, so it should have many of the 2.0 fixes
already.
Comment 20 Dan Williams 2005-02-25 13:11:52 EST
Víctor:  could you post exactly the keystrokes you are using so I can
try to reproduce this?

Is your keyboard layout set to es_MX as well?
What specific characters (and the keys you use to compose them) are
you trying to input?

Thanks!
Comment 21 Víctor Daniel Velasco Martínez 2005-02-25 14:02:46 EST
Dan:            
Starting OO.o in kde, with the iiim service started, I write this:
(keystrokes -> characters obtained)
   ñññ kakaka -> ñññ kakaka
   [CTRL]+[SPACE] ñññ kakaka [ENTER] -> ñññ かかか

testing other characters
   çÇçÇ -> çÇçÇ (not used in spanish)
   'a 'e 'i 'o 'u -> á é í ó ú
   `a `e `i `o `u -> à è ì ò ù (not used in spanish)
   ^a ^e ^i ^o ^u -> â ê î ô û (not used in spanish)
   "a "e "i "o "u -> ä ë ï ö ü (not used in spanish)
not tested with the ime, because it automatically takes japanese.

So, the japanese IME seems to be ok, but some characters are wrong both
in japanese mode and in unicode-octal.
       
About the keyboard layout, I'm not completely sure which one you       
need to know.        
       
In kcontrol, I set the keyboard to spanish. It gives me back the           
following command:           
           
    setxkbmap -model pc105 -layout es -variant basic           
           
and with system-config-keyboard         
    * running ['/usr/X11R6/bin/setxkbmap', '-layout', 'es',        
       '-model',  'pc105', '-option', '']         
       
There seems to be no es_MX variant, but anyway, I'm accustomed to es,
and use it always.
       
I don't know if this is enough info on the keyboard layout or if you
need more (or from some other place).
Comment 22 Dan Williams 2005-02-25 14:46:37 EST
llch: I cannot reproduce with 1.1.2-18 on RHEL4 with iiim both on and
off, using es_MX.UTF-8 locale and an es keymap.  Tried typing the ç
and Ç and it came through fine.  Unless you guys can, I don't think we
need to worry about the RHEL4 version for this one.

I can't reproduce with 1.1.3-6.5.fc3 either though, with iiim both on
and off, using es_MX.UTF-8 locale and es keymap.

Not quite sure what the issue is?  I'm changing keymaps using
"gkb_xmmap es" and "gkb_xmmap us", and also with the command above
"setxkbmap -model pc105 -layout es -variant basic".
Comment 23 Víctor Daniel Velasco Martínez 2005-02-25 15:31:11 EST
did you try it in kde or in gnome?
Comment 24 Dan Williams 2005-02-25 15:36:50 EST
gnome only for the moment.
Comment 25 Víctor Daniel Velasco Martínez 2005-02-25 16:39:31 EST
in gnome it works ok (even though I don't know how to switch the 
japanese writing on), the problem only happens when using kde. 
 
Comment 26 Akira TAGOH 2005-02-28 06:03:38 EST
Well, doesn't this problem happen on KDE apps as well? I mean when you
press ctrl+space on other KDE apps, what do you see then?
Comment 27 Dan Williams 2005-02-28 12:08:44 EST
*** Bug 132541 has been marked as a duplicate of this bug. ***
Comment 28 Dan Williams 2005-02-28 12:13:13 EST
*** Bug 134591 has been marked as a duplicate of this bug. ***
Comment 29 Dan Williams 2005-02-28 12:19:03 EST
*** Bug 138183 has been marked as a duplicate of this bug. ***
Comment 30 Dan Williams 2005-02-28 12:26:43 EST
*** Bug 140152 has been marked as a duplicate of this bug. ***
Comment 31 Dan Williams 2005-02-28 12:29:53 EST
*** Bug 146925 has been marked as a duplicate of this bug. ***
Comment 32 Víctor Daniel Velasco Martínez 2005-03-02 12:00:44 EST
CTRL+SPACE in KDE doesn't work, but I think that's because KDE can't use iiimf yet
(only XIM); the localized characters (ñ, ç, etc) work right.

The problem occurs only when using openoffice.org in kde. It even happens when
using 1.9 without KDE integration... so the KDE integration is not the
cause.
Comment 33 Dan Williams 2005-03-02 12:30:41 EST
When the problem here occurs, mbMultiLingual is TRUE.  That seems to
happen for most Western locales.  Upon receiving input from X, OOo
calls XmbLookupString(), and expects the input to be in either UTF-8
or the current encoding from LANG.  This is currently not the case:
XmbLookupString() returns printable text in an encoding that cannot be
determined.
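
The "current encoding from LANG" is the codeset part of the locale name. A rough Python sketch of pulling that codeset out of a value like the ones posted from /etc/sysconfig/i18n earlier in this bug is given below. `codeset_from_lang` is a hypothetical helper, and returning None for a value without a codeset is an assumption of this sketch, not something OOo or Xlib is known to do.

```python
from typing import Optional


def codeset_from_lang(lang: str) -> Optional[str]:
    """Return the codeset of a POSIX-style locale name, e.g. 'UTF-8'.

    Locale names have the form language[_territory][.codeset][@modifier].
    """
    # Drop an optional @modifier (e.g. "@euro") first.
    name = lang.partition("@")[0]
    # Whatever follows the first "." is the codeset, if present.
    _, dot, codeset = name.partition(".")
    return codeset if dot and codeset else None


if __name__ == "__main__":
    for value in ("en_US.UTF-8", "es_MX.UTF-8", "sv_SE"):
        print(value, "->", codeset_from_lang(value))
```

For every LANG value reported in this bug the result is UTF-8, so text in that encoding is what OOo would be expecting back.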

Talked to Owen, he said that if XmbLookupString does not return the
printable string in the current locale/encoding, then there is most
likely a bug in the XIM/XIIIMF code, or in Xlib itself.

Attempting to determine the encoding of the text that
XmbLookupString() returns...
Comment 34 Dan Williams 2005-03-02 13:01:31 EST
Owen thinks that what comes back from XmbLookupString() looks like
double-encoded UTF-8.

For example, U+E7 (latin small letter c with cedilla) comes out as a
4-byte string:

0xC3  0x83  0xC2  0xA7

UTF-8 encoded U+E7 should be 0xC3 0xA7
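
Owen's reading can be checked by hand. The Python sketch below assumes the extra layer comes from treating the UTF-8 bytes as Latin-1 (ISO-8859-1) characters and encoding them to UTF-8 a second time; which component actually does this has not been established at this point, and both function names are made up for illustration.

```python
def double_encode(text: str) -> bytes:
    """Encode to UTF-8, misread those bytes as Latin-1, then encode to UTF-8 again."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


def undo_double_encoding(raw: bytes) -> str:
    """Reverse double_encode() for input that really is double-encoded."""
    return raw.decode("utf-8").encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    observed = bytes([0xC3, 0x83, 0xC2, 0xA7])  # the four bytes quoted above
    # Does double-encoding U+E7 reproduce the observed bytes?
    print(double_encode("\u00e7") == observed)
    # Does undoing one layer give back U+E7?
    print(undo_double_encoding(observed) == "\u00e7")
```

Under that assumption the four observed bytes are exactly what U+E7 turns into, and peeling off one layer recovers the original character.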
Comment 35 Víctor Daniel Velasco Martínez 2005-03-02 14:36:37 EST
Dan: Well, I'm not good with the internals of the programs, or the different
language codes, but please tell me any way I can help to test, and how to do
it.
 
thanks 
 
V.Daniel 
Comment 36 Akira TAGOH 2005-03-03 16:47:28 EST
I've tracked this down: the double-encoded UTF-8 issue is caused
within xiiimp, which comes from im-sdk, so I'm reassigning this to im-sdk.
Comment 37 Akira TAGOH 2005-03-04 11:45:51 EST
should be fixed in im-sdk-12.1.1-7.svn2208 (especially iiimf-x package)
