Bug 2125153 - ibus does not support special Arabic compose sequences
Summary: ibus does not support special Arabic compose sequences
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: ibus
Version: 36
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: fujiwara
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-09-08 07:01 UTC by Mike FABIAN
Modified: 2023-01-12 10:15 UTC (History)

Fixed In Version: ibus-1.5.27-9.fc38
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-12 10:15:34 UTC
Type: Bug
Embargoed:



Description Mike FABIAN 2022-09-08 07:01:31 UTC
The following compose sequences do not work in ibus:

    $ grep -i arabic.*ligature /usr/share/X11/locale/en_US.UTF-8/Compose
    # Arabic Lam-Alef ligatures
    <UFEFB>	:   "لا" # ARABIC LIGATURE LAM WITH ALEF
    <UFEF7>	:   "لأ" # ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE
    <UFEF9>	:   "لإ" # ARABIC LIGATURE LAM WITH ALEF WITH HAMZA BELOW
    <UFEF5>	:   "لآ" # ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE

They are needed because the Arabic keyboard layout outputs UFEFB on the <AB05> key (the `b` key on a US layout):

    $ grep -i fefb /usr/share/X11/xkb/symbols/ara 
        key <AB05> {  [           UFEFB,                UFEF5,                  NoSymbol,            NoSymbol ]};  // ‎ﻻ‎ ‎ﻵ‎
        key <AB05> {  [           UFEFB,                UFEF5,                     U06AB,               U06AD ]};  // ‎ﻻ‎ ‎ﻵ‎     ‎ګ‎ ‎ڭ‎

But the U+FEFB character is not what is desired; what one really wants is U+0644 U+0627. However, an xkb keyboard layout can only output one keysym per key press, not two. So compose was used as a hack to work around this limitation of xkb:

The keyboard produces UFEFB and then the compose support replaces this with U+0644 U+0627.

This works when the compose support in Xorg is used but not when the compose support in ibus is used.

How to reproduce:

1) First show that it works when using the Xorg compose support:

Start xterm like this (to disable ibus and use the Xorg compose support):

    env XMODIFIERS=@im=none xterm &

Then in the xterm, type

    $ echo -n b | iconv -f utf8 -t utf16le | od -x
    0000000 0062
    0000002

and we see that the b produces U+0062, which is correct.

Switch to the Arabic keyboard

    setxkbmap  ara

type “arrow up” to get the echo -n b | iconv -f utf8 -t utf16le | od -x line back, go back to the b with “arrow left”, type b and now one gets:

    echo -n لا | iconv -f utf8 -t utf16le | od -x 
    0000000 0644 0627
    0000004

I.e. even though the keyboard itself outputs only U+FEFB, the Compose support of Xorg transforms this into U+0644 U+0627.

2) Now repeat the same test using the compose support in ibus:

Start xterm like this to use the ibus compose support:

    env XMODIFIERS=@im=ibus xterm &

Then in the xterm, type

    $ echo -n b | iconv -f utf8 -t utf16le | od -x
    0000000 0062
    0000002

and we see that the b produces U+0062, which is correct.

Switch to the Arabic keyboard

    setxkbmap  ara

type “arrow up” to get the echo -n b | iconv -f utf8 -t utf16le | od -x line back, go back to the b with “arrow left”, type b and now one gets:

    echo -n ﻻ | iconv -f utf8 -t utf16le | od -x 
    0000000 fefb
    0000002
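
The two results above differ only in the bytes the key press produced, so they can also be told apart without reading od output by eye. A minimal sketch, assuming a POSIX shell with od and tr available; the function names utf8_hex and classify_lam_alef are made up for this illustration:

```shell
#!/bin/sh
# Hex-dump stdin as raw bytes, without offsets or whitespace.
utf8_hex() {
    od -An -v -tx1 | tr -d ' \n'
}

# Report whether the argument is the two-letter sequence U+0644 U+0627
# (compose applied) or the single presentation form U+FEFB (compose not applied).
classify_lam_alef() {
    case "$(printf '%s' "$1" | utf8_hex)" in
        d984d8a7) echo "composed: U+0644 U+0627" ;;
        efbbbb)   echo "uncomposed: U+FEFB" ;;
        *)        echo "other" ;;
    esac
}

# Octal escapes keep the example independent of the terminal and keyboard layout.
classify_lam_alef "$(printf '\331\204\330\247')"   # prints: composed: U+0644 U+0627
classify_lam_alef "$(printf '\357\273\273')"       # prints: uncomposed: U+FEFB
```

In practice one would pass the text typed with the `b` key on the Arabic layout as the argument, once in each xterm.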

Comment 1 Mike FABIAN 2022-09-09 16:00:06 UTC
For ibus-typing-booster I fixed it like this (was surprisingly easy):

https://github.com/mike-fabian/ibus-typing-booster/issues/379
https://github.com/mike-fabian/ibus-typing-booster/commit/c788401c794843a6b99c91a51f9cb67b32ffc86e

I just had to allow keys other than Multi_key and dead keys to start a sequence, and I had to add 0x01000000 when calculating the key value of keysyms of the type <UXXXX>; that was all.
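
To illustrate the keysym arithmetic mentioned above: a rough sketch in POSIX shell, not the code from the ibus-typing-booster commit. The function name unicode_keysym is made up for this example, and it assumes the argument has the form UXXXX with a hexadecimal code point.

```shell
#!/bin/sh
# Compute the key value for a Compose-file keysym name of the form UXXXX:
# the Unicode code point plus the offset 0x01000000.
unicode_keysym() {
    # ${1#U} strips the leading "U", leaving the hex digits.
    printf '0x%08x\n' $(( 0x01000000 + 0x${1#U} ))
}

unicode_keysym UFEFB   # prints 0x0100fefb
unicode_keysym U263A   # prints 0x0100263a
```

Without the offset, <UFEFB> would be looked up as key value 0xfefb, which is not the value the Arabic layout sends for that key.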

I hope it is equally easy in Gtk and ibus ...

Comment 2 Mike FABIAN 2022-09-11 18:02:15 UTC
The problem in ibus seems to be something different: sequences that start with neither Multi_key nor a dead key do work.
It is not the length of the sequence either; even sequences of length 1 work **if** they are written in ~/.XCompose.

So adding a ~/.XCompose like this makes typing the `b` key with the Arabic keyboard layout do the right thing:

    $ cat ~/.XCompose
    include "/%L"
    <UFEFB>	:   "لا" # ARABIC LIGATURE LAM WITH ALEF
    <U263A> : "single char compose sequence worked"
    $

But that sequence is already defined in the system wide compose file:

    $ grep '^<UFEFB>' /usr/share/X11/locale/en_US.UTF-8/Compose
    <UFEFB>	:   "لا" # ARABIC LIGATURE LAM WITH ALEF

So it should work without having to add a ~/.XCompose file, but it doesn't.
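
To see which other entries in the system-wide file might be affected in the same way, one could list every sequence that consists of a single keysym. This is only a sketch: it assumes the problem is limited to one-keysym sequences read from the system Compose file, which the observations above suggest but do not prove, and the function name single_key_sequences is made up for this example.

```shell
#!/bin/sh
# Print the lines of a Compose file whose left-hand side is exactly one keysym,
# i.e. "<NAME> :" with nothing but whitespace between the keysym and the colon.
single_key_sequences() {
    grep -E '^[[:space:]]*<[^>]+>[[:space:]]*:' "$1"
}

compose=/usr/share/X11/locale/en_US.UTF-8/Compose
if [ -r "$compose" ]; then
    single_key_sequences "$compose"
fi
```

Multi-key entries such as `<Multi_key> <m> <o> ...` are skipped because another keysym, not the colon, follows the first one.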

Comment 3 Mike FABIAN 2022-09-12 07:21:23 UTC
The same trick works for Gtk3. I was confused for a moment because I happened to have a ~/.config/gtk-3.0/Compose file, and then Gtk3 reads **only** that; see: https://docs.gtk.org/gtk3/class.IMContextSimple.html

But after I deleted ~/.config/gtk-3.0/Compose, the sequences 

    $ cat ~/.XCompose
    include "/%L"
    <UFEFB>	:   "لا" # ARABIC LIGATURE LAM WITH ALEF
    <U263A> : "single char compose sequence worked"
    $ 
    
worked in

    env GTK_IM_MODULE=gtk-im-context-simple gedit

as well.

But it **still** does **not** work in Gtk4, although I made sure that ~/.config/gtk-4.0/Compose did not exist and confirmed that other compose sequences like 

    <Multi_key> <m> <o> <u> <s> <e> : "🐭" # U+1F42D MOUSE FACE

defined in ~/.XCompose worked with 

    env GTK_IM_MODULE=gtk-im-context-simple gnome-text-editor

and

    env GTK_IM_MODULE=gtk-im-context-simple gtk4-demo


the compose sequences

    <UFEFB>	:   "لا" # ARABIC LIGATURE LAM WITH ALEF
    <U263A> : "single char compose sequence worked"

still did not work with Gtk4.

Comment 4 fujiwara 2023-01-12 07:41:29 UTC
I think the issue is fixed in rawhide, and I have backported the fixes to ibus-1.5.27-11.1.fc37 in the copr repo.
https://copr.fedorainfracloud.org/coprs/fujiwara/ibus/

Could you please test it?

Comment 6 fujiwara 2023-01-12 10:15:34 UTC
Thank you for your quick tests.
Fedora rawhide has this fix but Fedora 37 does not; for Fedora 37, the fix is available from the ibus copr repo.

