Bug 1479682 - Xwayland reports keycode mapping 'unnamed', messes up spice VM keyboard
Status: NEW
Product: Fedora
Classification: Fedora
Component: spice-gtk
Version: 26
Hardware: x86_64 Unspecified
Priority: unspecified
Severity: medium
Assigned To: Marc-Andre Lureau
QA Contact: Fedora Extras Quality Assurance
Blocks: 1534324
Reported: 2017-08-09 03:40 EDT by zhoujunqin
Modified: 2018-01-14 21:41 EST

Clones: 1512564 1534324
Type: Bug

Attachments
screenshot for result-1 (316.71 KB, image/png)
2017-08-09 03:40 EDT, zhoujunqin
Systemtap capture (4.86 KB, text/plain)
2017-09-25 03:25 EDT, Olivier Fourdan

Description zhoujunqin 2017-08-09 03:40:10 EDT
Created attachment 1311030
screenshot for result-1

Description of problem:
Keyboard input does not work when connecting to a spice guest on a remote rhel7 host over an ssh (-X) connection from a fedora26 system.

Version-Release number of selected component (if applicable):
Local host: Fedora26
virt-manager-1.4.1-2.fc26.noarch
spice-gtk3-0.33-3.fc26.x86_64

Remote host: RHEL7.4
virt-manager-1.4.1-7.el7.noarch
spice-gtk3-0.33-6.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install a Fedora 26 Workstation system on my laptop from the iso file:

https://download.fedoraproject.org/pub/fedora/linux/releases/26/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-26-1.5.iso


2. After the installation finishes, ssh to another rhel7.4 host using:

# ssh root@$ip -X 

3. Once the ssh connection is set up, launch virt-manager -> select a spice guest -> connect to the guest console and log in to the guest system.

# virt-manager 

Spice guest xml configuration:
...
  <graphics type='spice' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
...


Actual results:
1. I can't type the user name and password on the login page; please see the screenshot.
2. Keyboard input works for a vnc guest.
3. Keyboard input works for the spice guest when I add the rhel7 host as a remote connection in my local virt-manager.
(Launch virt-manager on the local fedora system -> Add connection -> keep the default 'Hypervisor', check 'Connect to a remote host', enter the username and host ip, click 'Connect', then select the spice guest and open the guest console)

Expected results:
Keyboard input works when connecting to a spice guest on a remote rhel7 host over an ssh (-X) connection.

Additional info:
I can also reproduce this issue when connecting to a spice guest on a remote rhel6 host over an ssh (-X) connection.
Comment 1 Cole Robinson 2017-08-17 16:47:23 EDT
I tried f26 host, rhel7 host, with an f26 livecd VM, and keyboard/mouse seemed to work fine. I see your VM is RHEL and not a livecd so it's not exactly the same, but:

Is the f26 host using wayland or X desktop?
If you switch to a VT with Send Key menu, does the keyboard work?
Comment 2 zhoujunqin 2017-08-18 04:48:43 EDT
(In reply to Cole Robinson from comment #1)
> I tried f26 host, rhel7 host, with an f26 livecd VM, and keyboard/mouse
> seemed to work fine. I see your VM is RHEL and not a livecd so it's not
> exactly the same, but:
> 
Hi Cole,
Do you mean that the f26 livecd VM is installed on the rhel7 host, and you then connect using the steps in Comment 0?
If so, I created an f26 VM on my rhel7 host, but I still cannot type in the spice guest, just as in Comment 0.
I don't think it's the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1285770#c27
because the keyboard works well when I choose a vnc guest.

> Is the f26 host using wayland or X desktop?
My f26 host is using GNOME.

> If you switch to a VT with Send Key menu, does the keyboard work?
No, keyboard doesn't work.
Comment 3 Cole Robinson 2017-08-25 17:53:03 EDT
(In reply to zhoujunqin from comment #2)
> Hi Cole,
> Do you mean that the f26 livecd VM is installed on the rhel7 host, and you
> then connect using the steps in Comment 0?

Yes, but I didn't 'install' the VM, just ran it off the live CD.

> 
> > Is the f26 host using wayland or X desktop?
> My f26 host is using GNOME.

gnome+X or gnome+wayland? env | grep DISPLAY will tell us. If you see WAYLAND_DISPLAY, you are on wayland. At login time, click the gear icon and select the X session option, see if that makes any difference
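
As a rough illustration of the check above, the same decision can be sketched in C. The type and function names are hypothetical helpers, not code from virt-manager or GTK, and the sketch only looks at the two environment variables mentioned in this comment. It describes the session the process was started in, which is not necessarily the display backend a toolkit ends up using.

```c
#include <stdlib.h>

/* Hypothetical helper type, not part of any project discussed in this bug. */
typedef enum { SESSION_UNKNOWN, SESSION_X11, SESSION_WAYLAND } session_kind;

/* Classify from the two variable values; NULL or "" means "not set".
 * WAYLAND_DISPLAY wins because a Wayland session usually also sets
 * DISPLAY for Xwayland clients. */
session_kind classify_session(const char *wayland_display, const char *display)
{
    if (wayland_display != NULL && wayland_display[0] != '\0')
        return SESSION_WAYLAND;
    if (display != NULL && display[0] != '\0')
        return SESSION_X11;
    return SESSION_UNKNOWN;
}

/* Convenience wrapper that reads the current process environment. */
session_kind detect_session(void)
{
    return classify_session(getenv("WAYLAND_DISPLAY"), getenv("DISPLAY"));
}
```

With both variables set, classify_session() reports a Wayland session; with only DISPLAY set, it reports X11.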
Comment 4 zhoujunqin 2017-08-27 23:27:30 EDT
(In reply to Cole Robinson from comment #3)
> (In reply to zhoujunqin from comment #2)
> > Hi Cole,
> > Do you mean that the f26 livecd VM is installed on the rhel7 host, and you
> > then connect using the steps in Comment 0?
> 
> Yes, but I didn't 'install' the VM, just ran it off the live CD.
> 
> > 
> > > Is the f26 host using wayland or X desktop?
> > My f26 host is using GNOME.
> 
> gnome+X or gnome+wayland? env | grep DISPLAY will tell us. If you see
> WAYLAND_DISPLAY, you are on wayland. At login time, click the gear icon and
> select the X session option, see if that makes any difference

Yes, I tried what you suggested.

1. Check DISPLAY value in my host.

# env | grep DISPLAY
DISPLAY=:0
WAYLAND_DISPLAY=wayland-0

2. Log out of my system and log in again.
Enter the root username, then on the password page click the gear icon,
select the 'GNOME on Xorg' session, enter the correct password and click 'Sign in'.

After repeating the steps in Comment 0, the keyboard works well when connecting to a spice guest on a remote rhel7 host over an ssh (-X) connection.
Comment 5 Cole Robinson 2017-09-07 19:03:08 EDT
Okay I reproduced. Setup:

F26 client running Gnome+Wayland
F26 server running Gnome+X11
F26 VM + spice

virt-manager --debug will print this warning:

vnc-keymap-WARNING **: Unknown keycode mapping '(unnamed)'.
Please report to gtk-vnc-list@gnome.org
including the following information:

  - Operating system
  - GDK build
  - X11 Server
  - xprop -root
  - xdpyinfo


And every keypress/sendkey command will show errors like this:

(virt-viewer:10046): GSpice-CRITICAL **: send_key: assertion 'scancode != 0' failed
(virt-viewer:10046): GSpice-CRITICAL **: send_key: assertion 'scancode != 0' failed
(virt-viewer:10046): GSpice-CRITICAL **: send_key: assertion 'scancode != 0' failed


Weirdly, virt-viewer will show that first error, but then work, unless I run virt-manager, after which it doesn't work? Not positive but it seems to be easy to get virt-viewer into a non-working state after a few attempts.

Switching the VM to VNC will still show the same initial error (since that code seems shared with gtk-vnc), but keyboard appears to work, so maybe that's a hint to what the fix is.

Moving to spice-gtk
Comment 6 Daniel Berrange 2017-09-08 04:35:08 EDT
The error message you show suggests that virt-manager is running with the X11 display backend, *not* wayland. The keyboard layout guessing only runs with X11 backend, because wayland is hardcoded to always use evdev.


I can reproduce the problem easily enough on my local F26 + wayland desktop - just need to set the 'GDK_BACKEND=x11' env variable to force it to use Xwayland X11 server

eg normal usage fine:

$ virt-viewer -c domokun f25kubdev


force x11:


$ GDK_BACKEND=x11  virt-viewer -c domokun f25kubdev

(virt-viewer:5456): gtk-vnc-WARNING **: Unknown X11 keycode mapping '(unnamed)'.
Please report to gtk-vnc-list@gnome.org
including the following information:

  - Operating system
  - GDK build
  - X11 Server
  - xprop -root
  - xdpyinfo



This does work correctly immediately after logging into Wayland. At some point, however, XWayland gets messed up and starts reporting "unnamed" as the keycode mapping. Simply starting & quitting virt-viewer (with GDK_BACKEND=x11) several times in a row triggered it after about the 3rd/4th launch. Thereafter it is fubar until you log out again.

If, however, you login to "GNOME on X11" so that you use Xorg, instead of Xwayland, everything works fine and I can't reproduce the problem no matter how many times I launch


IOW, this feels like an XWayland bug to me
Comment 7 Cole Robinson 2017-09-08 09:39:18 EDT
Thanks for the info Dan, changing component for feedback from X maintainers.

Googling I see you brought this up on spice list before with some additional info:

https://lists.freedesktop.org/archives/spice-devel/2017-February/036055.html
https://bugs.freedesktop.org/show_bug.cgi?id=99714
Comment 8 Daniel Berrange 2017-09-08 09:43:10 EDT
FYI, my mention of firefox in that old spice-devel message seems to be a red herring. It simply seems to require any X application be started a few times for it to lose the keycode names.
Comment 9 Olivier Fourdan 2017-09-08 10:43:52 EDT
I doubt this is an Xwayland issue.

Can you reproduce using another Wayland compositor instead of GNOME, like weston?
Comment 10 Daniel Berrange 2017-09-08 10:54:12 EDT
Any tips on how to easily run a non-GNOME Wayland compositor from a standard Fedora desktop ?
Comment 11 Olivier Fourdan 2017-09-08 12:42:30 EDT
(In reply to Daniel Berrange from comment #10)
> Any tips on how to easily run a non-GNOME Wayland compositor from a standard
> Fedora desktop ?

Sure, install weston and select "weston" as your session in gdm.

Note: gtk-vnc really confuses Xkb settings somehow, I am now stuck with a US layout instead of my usual UK layout... Even changing layouts in gnome-shell has no effect at all.
Comment 12 Olivier Fourdan 2017-09-08 12:57:46 EDT
(In reply to Olivier Fourdan from comment #11)
> Note: gtk-vnc really confuses Xkb settings somehow, I am now stuck with a US
> layout instead of my usual UK layout... Even changing layouts in gnome-shell
> has no effect at all.

Replying to $self, it's probably my fault, playing with setxkbmap, and unrelated to gtk-vnc... sorry.

Speaking of which, what gives "setxkbmap -query" on the system when gtk-vnc/virt-viewer fail with the error in comment 6?
Comment 13 Cole Robinson 2017-09-08 15:40:54 EDT
> Speaking of which, what gives "setxkbmap -query" on the system when
> gtk-vnc/virt-viewer fail with the error in comment 6?

Reproducing with single host gnome+wayland and the GDK_BACKEND=x11 trick dan mentioned, setxkbmap output doesn't change before or after failure, it looks like:

$ setxkbmap -query
rules:      evdev
model:      pc105
layout:     us

(In reply to Olivier Fourdan from comment #11)
> > Any tips on how to easily run a non-GNOME Wayland compositor from a standard
> > Fedora desktop ?
> 
> Sure, install weston and select "weston" as your session in gdm.

I don't see that option. But I used steps here: https://fedoraproject.org/wiki/How_to_debug_Wayland_problems

$ cat ~/.config/weston.ini 
[core]
modules=xwayland.so

Launched weston-session from a VT, using GDK_BACKEND=x11 I still reproduce the issue. Not sure what that tells me though...


Note, for a simple reproducer, you can launch qemu like:

  qemu-kvm -spice addr=127.0.0.1,port=5900,disable-ticketing

And connect with this (remote-viewer uses virt-viewer code)

  GDK_BACKEND=x11 remote-viewer spice://127.0.0.1:5900

Then press some keys. Seems to work on the first attempt, but keep the VM running and relaunch remote-viewer a few times and at least for me it quickly gets into a busted keyboard state with messages on stderr
Comment 14 Daniel Berrange 2017-09-11 04:58:30 EDT
Same here, when I launch "weston" and then run remote-viewer forcing GDK_BACKEND=x11 it'll get the keyboard layout the first time, but after that it just gets "unnamed" for the keys. So it does feel like an Xwayland bug that's independent of the compositor in use.
Comment 15 Olivier Fourdan 2017-09-11 08:15:59 EDT
Unfortunately, I cannot reproduce here.

Can you check in journalctl to see if you have messages related to xkbcomp when the problem occurs?
Comment 16 Daniel Berrange 2017-09-11 10:21:53 EDT
Ok, I've done some debugging with GDB to catch this.

Starting from a fresh GNOME 3 session on wayland with nothing running except GNOME shell and terminal

I attached to Xwayland and put a breakpoint on '_XkbLookupKeyboard' and then ran a demo program:

$ cat xkb.c

#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/XKBlib.h>

int main(void) {

  Display *xdisplay = XOpenDisplay(NULL);
  XkbDescPtr desc;
  char *keycodes = NULL;

  /* Bail out early if $DISPLAY is unset or the server is unreachable */
  if (!xdisplay) {
    fprintf(stderr, "Cannot open X display\n");
    return 1;
  }

  desc = XkbGetMap(xdisplay,
                   XkbGBN_AllComponentsMask,
                   XkbUseCoreKbd);
  if (desc) {
    if (XkbGetNames(xdisplay, XkbKeycodesNameMask, desc) == Success) {
      /* desc->names->keycodes is the atom naming the keycode mapping */
      fprintf(stderr, "Atom %lu\n", desc->names->keycodes);
      keycodes = XGetAtomName(xdisplay, desc->names->keycodes);
      if (!keycodes) {
        fprintf(stderr, "could not lookup keycode name\n");
      } else {
        fprintf(stderr, "XKB keyboard map name '%s'\n", keycodes);
        XFree(keycodes);
      }
    } else {
      fprintf(stderr, "No XKB keyboard map name\n");
    }
    XkbFreeKeyboard(desc, XkbGBN_AllComponentsMask, True);
  } else {
    fprintf(stderr, "No XKB keyboard description available\n");
  }

  XCloseDisplay(xdisplay);
  return 0;
}
$ gcc -Wall -o xkb xkb.c -lX11

$ ./xkb

This triggers the breakpoint and from there I can use GDB to identify the address of the struct containing the keycode names

Thread 1 "Xwayland" hit Breakpoint 1, _XkbLookupKeyboard (pDev=pDev@entry=0x7ffe8f5de068, id=256, client=client@entry=0x27fbcf0, access_mode=access_mode@entry=16, xkb_err=xkb_err@entry=0x7ffe8f5de070)
    at xkbUtils.c:95

(gdb) next
...repeat a few times...

(gdb) print dev->key->xkbInfo->desc->names
$3 = (XkbNamesPtr) 0x24b3d60
(gdb) print dev->key->xkbInfo->desc->names->keycodes
$4 = 144
(gdb) watch *0x24b3d60
Hardware watchpoint 2: *0x24b3d60
(gdb) clear _XkbLookupKeyboard
Deleted breakpoint 1 
(gdb) cont
Continuing.


The 'xkb' demo program now prints

Atom 144
XKB keyboard map name 'evdev+aliases(qwerty)'


IOW, the keycode names are in the Atom 144 and have string "evdev+aliases(qwerty)"

Now, I can run remote-viewer

  GDK_BACKEND=x11 remote-viewer vnc://localhost:5901  

(make sure you have a VNC server running on 5901 of course)

I have to repeatedly run + kill remote-viewer until the hardware watchpoint triggers in GDB.

Sometimes it triggers on the first go, sometimes it needs 10+ goes to trigger.

When the watchpoint triggers I see this stack trace

(gdb) bt
#0  _XkbCopyNames (src=0x26ce520, src=0x26ce520, dst=0x251b8b0) at xkbUtils.c:1334
#1  XkbCopyKeymap (dst=0x251b8b0, src=0x26ce520) at xkbUtils.c:1984
#2  0x000000000052074c in XkbCopyKeymap (src=<optimized out>, dst=<optimized out>) at xkbUtils.c:1965
#3  XkbDeviceApplyKeymap (dst=dst@entry=0x251c5a0, desc=<optimized out>) at xkbUtils.c:2025
#4  0x00000000004f9ad2 in CopyKeyClass (device=device@entry=0x26cdd30, master=master@entry=0x251c5a0) at exevents.c:233
#5  0x00000000004f9eda in DeepCopyKeyboardClasses (to=0x251c5a0, from=0x26cdd30) at exevents.c:427
#6  DeepCopyDeviceClasses (from=0x26cdd30, to=0x251c5a0, dce=0x7ffe8f5db740) at exevents.c:670
#7  0x00000000004fceb6 in ChangeMasterDeviceClasses (device=0x251c5a0, dce=0x7ffe8f5db740) at exevents.c:727
#8  0x00000000004fd0f4 in UpdateDeviceState (device=0x251c5a0, event=0x7ffe8f5db740) at exevents.c:807
#9  0x00000000004fd58c in ProcessDeviceEvent (ev=ev@entry=0x7ffe8f5db740, device=device@entry=0x251c5a0) at exevents.c:1709
#10 0x00000000004fdce3 in ProcessOtherEvent (ev=0x7ffe8f5db740, device=0x251c5a0) at exevents.c:1873
#11 0x000000000052afa2 in ProcessKeyboardEvent (ev=<optimized out>, keybd=0x251c5a0) at xkbPrKeyEv.c:165
#12 0x000000000046e0c3 in mieqProcessDeviceEvent (dev=0x26cdd30, event=0x7ffe8f5dc3a0, screen=0x1bc4ed0) at mieq.c:496
#13 0x000000000046e239 in mieqProcessInputEvents () at mieq.c:551
#14 0x0000000000424eef in keyboard_handle_modifiers (data=0x25414c0, keyboard=<optimized out>, serial=<optimized out>, mods_depressed=0, mods_latched=0, mods_locked=0, group=0) at xwayland-input.c:694
#15 0x00007f01ef653bde in ffi_call_unix64 () from /lib64/libffi.so.6
#16 0x00007f01ef65354f in ffi_call () from /lib64/libffi.so.6
#17 0x00007f01f20b2dd4 in wl_closure_invoke (closure=<optimized out>, flags=1, target=<optimized out>, opcode=4, data=<optimized out>) at src/connection.c:935
#18 0x00007f01f20af998 in dispatch_event (display=<optimized out>, queue=<optimized out>) at src/wayland-client.c:1310
#19 0x00007f01f20b0c54 in dispatch_queue (queue=0x1bcd868, display=0x1bcd7a0) at src/wayland-client.c:1456
#20 wl_display_dispatch_queue_pending (display=0x1bcd7a0, queue=0x1bcd868) at src/wayland-client.c:1698
#21 0x00007f01f20b0cac in wl_display_dispatch_pending (display=<optimized out>) at src/wayland-client.c:1761
#22 0x00000000004230db in xwl_read_events (xwl_screen=0x1bc5420) at xwayland.c:565
#23 0x000000000058f3e1 in ospoll_wait (ospoll=0x1bbae60, timeout=<optimized out>) at ospoll.c:412
#24 0x0000000000588bdb in WaitForSomething (are_ready=<optimized out>) at WaitFor.c:226
#25 0x0000000000554aa3 in Dispatch () at dispatch.c:422
#26 0x0000000000558d10 in dix_main (argc=10, argv=0x7ffe8f5de268, envp=<optimized out>) at main.c:287
#27 0x00007f01eff1750a in __libc_start_main () from /lib64/libc.so.6
#28 0x00000000004227da in _start ()


So we can see that XWayland is copying the keyboard info from one device to another device.

If we jump up to frame 3, we can get the pointer for each of the two devices.

(gdb) print *master
$8 = {public = {devicePrivate = 0x1d9e700, processInputProc = 0x4fd9e0 <ProcessOtherEvent>, realInputProc = 0x4fd9e0 <ProcessOtherEvent>, enqueueInputProc = 0x55de90 <EnqueueEvent>, on = 0}, 
  next = 0x1d13e70, startup = 1, deviceProc = 0x548be0 <CoreKeyboardProc>, inited = 1, enabled = 1, coreEvents = 1, deviceGrab = {grabTime = {months = 0, milliseconds = 292524512}, fromPassiveGrab = 0, 
    implicitGrab = 0, unused = 0x0, grab = 0x0, activatingKey = 0 '\000', ActivateGrab = 0x565210 <ActivateKeyboardGrab>, DeactivateGrab = 0x565580 <DeactivateKeyboardGrab>, sync = {frozen = 0, 
      state = 0, other = 0x0, event = 0x1d72120}}, type = 2, xinput_type = 0, name = 0x1d79f40 "Virtual core keyboard", id = 3, key = 0x1d78e60, valuator = 0x0, touch = 0x0, button = 0x0, 
  focus = 0x1d131e0, proximity = 0x0, kbdfeed = 0x1d726b0, ptrfeed = 0x0, intfeed = 0x0, stringfeed = 0x0, bell = 0x0, leds = 0x0, xkb_interest = 0x1fe32d0, config_info = 0x0, 
  unused_classes = 0x1d723e0, saved_master_id = 0, devPrivates = 0x1d79b70, unwrapProc = 0x5295c0 <xkbUnwrapProc>, spriteInfo = 0x1d79b38, master = 0x0, lastSlave = 0x1f2af70, last = {valuators = {
      0 <repeats 36 times>}, numValuators = 0, slave = 0x1f2af70, scroll = 0x0, num_touches = 0, touches = 0x0}, properties = {properties = 0x1d79bf0, handlers = 0x1d79f10}, relative_transform = {m = {{
        1, 0, 0}, {0, 1, 0}, {0, 0, 1}}}, scale_and_transform = {m = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}}, xtest_master_id = 0, idle_counter = 0x1d13450}


(gdb) print *device
$9 = {public = {devicePrivate = 0x1d9e700, processInputProc = 0x52af40 <ProcessKeyboardEvent>, realInputProc = 0x52af40 <ProcessKeyboardEvent>, enqueueInputProc = 0x55de90 <EnqueueEvent>, on = 1}, 
  next = 0x0, startup = 1, deviceProc = 0x424d40 <xwl_keyboard_proc>, inited = 1, enabled = 1, coreEvents = 1, deviceGrab = {grabTime = {months = 0, milliseconds = 292524525}, fromPassiveGrab = 0, 
    implicitGrab = 0, unused = 0x0, grab = 0x0, activatingKey = 0 '\000', ActivateGrab = 0x565210 <ActivateKeyboardGrab>, DeactivateGrab = 0x565580 <DeactivateKeyboardGrab>, sync = {frozen = 0, 
      state = 0, other = 0x0, event = 0x1f2b300}}, type = 3, xinput_type = 239, name = 0x1f2af30 "xwayland-keyboard:14", id = 8, key = 0x1f2b590, valuator = 0x0, touch = 0x0, button = 0x0, 
  focus = 0x1f306b0, proximity = 0x0, kbdfeed = 0x1f2b610, ptrfeed = 0x0, intfeed = 0x0, stringfeed = 0x0, bell = 0x0, leds = 0x0, xkb_interest = 0x0, config_info = 0x0, unused_classes = 0x0, 
  saved_master_id = 0, devPrivates = 0x1f2ae70, unwrapProc = 0x5295c0 <xkbUnwrapProc>, spriteInfo = 0x1f2b2c8, master = 0x1d797e0, lastSlave = 0x0, last = {valuators = {0 <repeats 36 times>}, 
    numValuators = 0, slave = 0x0, scroll = 0x0, num_touches = 0, touches = 0x0}, properties = {properties = 0x1f2aed0, handlers = 0x1f2b4c0}, relative_transform = {m = {{1, 0, 0}, {0, 1, 0}, {0, 0, 
        1}}}, scale_and_transform = {m = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}}, xtest_master_id = 0, idle_counter = 0x1f30720}



So this shows the problem.

XWayland is copying the keycode description from a device "xwayland-keyboard:14" into the device "Virtual core keyboard".

The "xwayland-keyboard:14" device has totally bogus keycode name description "(unnamed)" and so this blows away the info about evdev previously recorded against the "Virtual core keyboard" device.


What I don't understand is what triggers this copying and why it is non-deterministic.

From the stack trace we can see the Xwayland is receiving an event from the Wayland server and this triggers the update, but I'm fuzzy on what that event actually is.
Comment 17 Olivier Fourdan 2017-09-11 10:58:50 EDT
(In reply to Daniel Berrange from comment #16)
> [...]
> So this shows the problem.
> 
> XWayland is copying the keycode description from a device
> "xwayland-keyboard:14" into the device "Virtual core keyboard".
> 
> The "xwayland-keyboard:14" device has totally bogus keycode name description
> "(unnamed)" and so this blows away the info about evdev previously recorded
> against the "Virtual core keyboard" device.

I guess "xinput list" will show that "xwayland-keyboard:14" is a slave device. 

> What I don't understand is what triggers this copying and why it is non-
> deterministic.

Sounds like a race condition.

> From the stack trace we can see the Xwayland is receiving an event from the
> Wayland server and this triggers the update, but I'm fuzzy on what that event
> actually is.

Looks like it's a wl_keyboard modifiers event [1]. The call into mieqProcessInputEvents() from this callback, which leads to _XkbCopyNames(), was introduced by commit 589f42e [2]. Is this issue a regression in xorg-server-1.19.x?

[1] https://cgit.freedesktop.org/wayland/wayland/tree/protocol/wayland.xml#n2155
[2] https://cgit.freedesktop.org/xorg/xserver/commit/?id=589f42e
Comment 18 Daniel Berrange 2017-09-11 11:48:08 EDT
I never ran Wayland in Fedora 24 or earlier. F25 was my first install to use it, and that already has  xorg-x11-server-1.19.0.  So no idea if this is a regression or not.
Comment 19 Peter Hutterer 2017-09-19 02:37:11 EDT
"What I don't understand is what triggers this copying and why it is non-deterministic."

It's triggered whenever you change the input device. If you had two keyboards connected, you'll get that copy whenever you switch from one device to the next. Both devices feed into the same master keyboard which needs to update to reflect the right state. This in itself shouldn't be the cause though because this will happen on the next key event anyway. So something is happening here that's different to the normal copy.

I wonder if keyboard_handle_modifiers triggers before 
keyboard_handle_keymap and that messes up our keymap?

The other bug may be triggered in keyboard_handle_modifiers(), sn.keycode is set to 0 and that event is sent to the client. Which in turn would be the reason for the scancode != 0 assertion failure. That assertion seems strange though, because the xlib docs say:

https://www.x.org/releases/X11R7.7/doc/libX11/XKB/xkblib.html#id2589934
      KeyCode keycode; /* keycode causing event, 0 if programmatic */
Comment 20 Olivier Fourdan 2017-09-25 03:25 EDT
Created attachment 1330398
Systemtap capture

(In reply to Peter Hutterer from comment #19)
> [...]
> 
> I wonder if keyboard_handle_modifiers triggers before 
> keyboard_handle_keymap and that messes up our keymap?
> 
> [...]

Attaching a systemtap log (captured by Christophe) taken at the time of the issue (2 runs of virt-viewer: in the first one the keyboard works, in the next one the keymap issue appears). It shows keyboard_handle_modifiers() but no trace of keyboard_handle_keymap().
Comment 21 Olivier Fourdan 2017-11-10 08:35:31 EST
(In reply to Peter Hutterer from comment #19)
> [...]
> I wonder if keyboard_handle_modifiers triggers before 
> keyboard_handle_keymap and that messes up our keymap?
> 
> The other bug may be triggered in keyboard_handle_modifiers(), sn.keycode is
> set to 0 and that event is sent to the client. Which in turn would be the
> reason for the scancode != 0 assertion failure. That assertion seems strange
> though, because the xlib docs say:
> 
> https://www.x.org/releases/X11R7.7/doc/libX11/XKB/xkblib.html#id2589934
>       KeyCode keycode; /* keycode causing event, 0 if programmatic */

Well, I'm not sure...

keyboard_handle_modifiers() is invoked when pressing modifiers:

https://cgit.freedesktop.org/xorg/xserver/tree/hw/xwayland/xwayland-input.c?h=server-1.19-branch#n682

While the error occurs for each and every key press (not just modifiers) and all key presses are dismissed in spice-gtk.

Looking at the spice-gtk side of things, the assertion which fails is at the very beginning of send_key:

https://cgit.freedesktop.org/spice/spice-gtk/tree/src/spice-widget.c#n1430

send_key() is called from a couple of places in that source code. The one in release_keys() is protected by an "if (scancode != 0) ..." check, so this is not the problem, which means the only way to get to send_key() with a scancode == 0 is from key_event() itself:

https://cgit.freedesktop.org/spice/spice-gtk/tree/src/spice-widget.c#n1563

If we ignore the WIN32 code in there, scancode is initially set to 0, then set to:

    if (!scancode)
        scancode = vnc_display_keymap_gdk2xtkbd(d->keycode_map,
                                                d->keycode_maplen,
                                                key->hardware_keycode);

Here:

https://cgit.freedesktop.org/spice/spice-gtk/tree/src/spice-widget.c#n1643

Therefore, I think it's safe to assume the problem with the assertion is because vnc_display_keymap_gdk2xtkbd() returns a 0 scancode for every key press.
Comment 22 Olivier Fourdan 2017-11-10 08:40:13 EST
So, I mean, the two issues are related, it's one bug (not two).

vnc_display_keymap_gdk2xtkbd_table() is the function that prints the "Unknown keycode mapping" warning:

https://cgit.freedesktop.org/spice/spice-gtk/tree/src/vncdisplaykeymap.c#n139

and vnc_display_keymap_gdk2xtkbd() then uses a keymap and maplen that vnc_display_keymap_gdk2xtkbd_table() never retrieved because of that error.
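
To make the chain of reasoning in comments 21 and 22 concrete, here is a minimal sketch of a table lookup that behaves the way described: with no table (because the mapping was not recognised) every key yields scancode 0, and a zero scancode is never sent. The function names lookup_scancode() and would_send_key() are made up for illustration, and the body is an assumption about the general shape of such a lookup, not a copy of the spice-gtk source.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of a keycode -> scancode table lookup. Returns 0 ("no scancode")
 * when there is no table or the keycode falls outside it. */
uint16_t lookup_scancode(const uint16_t *keycode_map, size_t keycode_maplen,
                         uint16_t hardware_keycode)
{
    if (keycode_map == NULL)
        return 0;
    if (hardware_keycode >= keycode_maplen)
        return 0;
    return keycode_map[hardware_keycode];
}

/* Mirrors the guard quoted from send_key(): a zero scancode is rejected. */
int would_send_key(uint16_t scancode)
{
    return scancode != 0;
}
```

Under these assumptions, a failed keymap detection is enough to explain why every single key press is dropped, not just modifier keys.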
Comment 23 Daniel Berrange 2017-11-10 08:42:10 EST
(In reply to Olivier Fourdan from comment #21)

> Therefore, I think it's safe to assume the problem with the assertion is
> because vnc_display_keymap_gdk2xtkbd() returns a 0 scancode for every key
> press.

Ignore the assertion errors that Cole mentioned, that is merely fallout from the earlier failure due to Xwayland losing the keycode mapping names.
Comment 24 Olivier Fourdan 2017-11-10 09:43:07 EST
(In reply to Daniel Berrange from comment #23)
> (In reply to Olivier Fourdan from comment #21)
> 
> > Therefore, I think it's safe to assume the problem with the assertion is
> > because vnc_display_keymap_gdk2xtkbd() returns a 0 scancode for every key
> > press.
> 
> Ignore the assertion errors that Cole mentioned, that is merely fallout from
> the earlier failure due to Xwayland losing the keycode mapping names.

Yes, that's exactly what I was saying.

Now, for the funny part, testing with the reproducer in comment 16 shows the issue starts occurring *after* closing an X11 window, not just virt-viewer (I was able to reproduce by opening/closing an xterm).

Thing is, initially, the value is correct, it starts being incorrect after a window is closed.
Comment 25 Olivier Fourdan 2017-11-10 09:44:15 EST
(by window, I mean, an X11 window of course, Wayland native is unrelated)
Comment 26 Daniel Berrange 2017-11-10 09:51:06 EST
(In reply to Olivier Fourdan from comment #24)
> Now, for the funny part, testing with the reproducer in comment 16 shows the
> issue starts occurring *after* closing an X11 window, not just virt-viewer (I
> was able to reproduce by opening/closing an xterm).

Yes, I see that behaviour too - opening & closing any X11 app is sufficient to trigger it.
Comment 27 Olivier Fourdan 2017-11-13 04:56:33 EST
I've placed good old style traces (fprintf (stderr, "%s", __func__)) in various places, and found out that this is... normal actually.

Playing with weston nested (easier), I see that the "(unnamed)" actually comes from the keymap sent by the compositor itself (for the record, 144 is the atom for "evdev+aliases(qwerty)" and 308 is "(unnamed)")

   $ xlsatoms | grep 144
   144	evdev+aliases(qwerty)
   $ xlsatoms | grep 308
   308	(unnamed)

Initially, when Xwayland starts, the right atom is set:

XkbCopyKeymap: Copying xkb from 0x1333b30 to 0x1344af0
_XkbCopyNames: Copying xkb from 0x1333b30 to 0x1344af0
_XkbCopyNames: Copying keycode from 144 to 0
XkbCopyKeymap: Copying xkb from 0x1333b30 to 0x136ec60
_XkbCopyNames: Copying xkb from 0x1333b30 to 0x136ec60
_XkbCopyNames: Copying keycode from 144 to 0
XkbCopyKeymap: Copying xkb from 0x1333b30 to 0x14fa6a0
_XkbCopyNames: Copying xkb from 0x1333b30 to 0x14fa6a0
_XkbCopyNames: Copying keycode from 144 to 0


Then we receive the keymap from the compositor.

keyboard_handle_keymap

And apply the changes:

XkbDeviceApplyKeymap: Applying xkb 0x151a950 to device 8
XkbCopyKeymap: Copying xkb from 0x151a950 to 0x14fa6a0
_XkbCopyNames: Copying xkb from 0x151a950 to 0x14fa6a0
_XkbCopyNames: Copying keycode from 308 to 144
keyboard_handle_keymap: Applying xkb 0x151a950 to keyboard id=8

=> So we're copying an xkb desc with keycodes value 308 to the one with keycodes value 144 (basically, setting "(unnamed)" in place of "evdev+aliases(qwerty)"):

DeepCopyDeviceClasses: from device id=6 to device id 2
DeepCopyDeviceClasses: from device id=8 to device id 3
XkbDeviceApplyKeymap: Applying xkb 0x14fa6a0 to device 3
XkbCopyKeymap: Copying xkb from 0x14fa6a0 to 0x1344af0
_XkbCopyNames: Copying xkb from 0x14fa6a0 to 0x1344af0
_XkbCopyNames: Copying keycode from 308 to 144

That made me look further into what the compositor gives us (Xwayland), so I printed the whole xwl_seat->keymap in keyboard_handle_keymap() and it shows:

keyboard_handle_keymap compiling 'xkb_keymap {
xkb_keycodes "(unnamed)" {
...

So it is the compositor itself that gives us that keycodes value...
Comment 28 Olivier Fourdan 2017-11-13 05:55:48 EST
So, I had a quick chat with Daniel (cc'ed) and there is no easy way to fix this. Looking at the name, like what virt-manager does, was incorrect in the first place; better would be to read the full mapping and try to reconcile that with a known keymap.
Comment 29 Daniel Berrange 2017-11-13 06:04:39 EST
(In reply to Olivier Fourdan from comment #28)
> So, I had a quick chat with Daniel (cc'ed) and there is no easy way to fix
> this. Looking at the name, like what virt-manager does, was incorrect in the
> first place; better would be to read the full mapping and try to reconcile
> that with a known keymap.

This doesn't make a whole lot of sense. What we're after is the raw hardware scan codes. AFAIK, these don't ever change, even after this keycode mapping property gets "lost". What the scancodes map /to/ will change, but the actual scancodes received by the Xserver should always be fixed, as the Linux evdev scancode set.
Comment 30 Daniel Stone 2017-11-13 06:26:30 EST
(In reply to Daniel Berrange from comment #29)
> This doesn't make a whole lot of sense. What we're after is the raw hardware
> scan codes. AFAIK, these don't ever change, even after this keycode mapping
> property gets "lost". What the scancodes map /to/ will change, but the
> actual scancodes received by the Xserver should always be fixed, as the
> Linux evdev scancode set.

Well, to the extent where those 'raw hardware scan codes' even exist: they get lost in translation when you are ultimately guested under Windows (Xwin), OS X (Xquartz), or a browser (Chromium/Exosphere), or ...

If all you're trying to establish is a very limited and hardcoded 'are we using the AT or evdev keycodes' (bearing in mind that, again, there are other options), I would recommend looking at the mapping for the page up keysym: if it is mapped to the AT scancode, then use the 'xfree86' mapping, else just use evdev.

Others will find more exotic ways to break that, but fixing that is pretty painful.
Comment 31 Daniel Stone 2017-11-13 06:30:27 EST
(In reply to Daniel Berrange from comment #16)
> The "xwayland-keyboard:14" device has totally bogus keycode name description
> "(unnamed)" and so this blows away the info about evdev previously recorded
> against the "Virtual core keyboard" device.

It's not that it's bogus, just that we don't retain the information. Keymap names are not an accurate way to reproduce mappings, especially when you take tools like xmodmap into account. We deliberately dropped those inside xkbcommon as we didn't feel like it was a useful thing to preserve.

It does break some uses like this, but as above, they were broken anyway ...
Comment 32 Daniel Berrange 2017-11-13 06:36:05 EST
(In reply to Daniel Stone from comment #30)
> (In reply to Daniel Berrange from comment #29)
> > This doesn't make a whole lot of sense. What we're after is the raw hardware
> > scan codes. AFAIK, these don't ever change, even after this keycode mapping
> > property gets "lost". What the scancodes map /to/ will change, but the
> > actual scancodes received by the Xserver should always be fixed, as the
> > Linux evdev scancode set.
> 
> Well, to the extent where those 'raw hardware scan codes' even exist: they
> get lost in translation when you are ultimately guested under Windows
> (Xwin), OS X (Xquartz), or a browser (Chromium/Exosphere), or ...

Agreed, it is a pretty messy & inexact science in general and we certainly don't claim to be perfect in any way. We (Gtk-VNC/Spice-Gtk/etc) have a set of heuristics to identify the different platform X servers, eg we detect Xwin from the vendor string, Xquartz from the existence of OS-X specific extensions, and on Linux Xorg we've traditionally relied on the keycodemap name strings to distinguish legacy kbd (ATset 1 derivative) from evdev.

> If all you're trying to establish is a very limited and hardcoded 'are we
> using the AT or evdev keycodes' (bearing in mind that, again, there are
> other options), I would recommend looking at the mapping for the page up
> keysym: if it is mapped to the AT scancode, then use the 'xfree86' mapping,
> else just use evdev.

Oh, that's a nice idea, we can certainly try that approach. Just distinguishing AT from evdev is sufficient in this case.

> Others will find more exotic ways to break that, but fixing that is pretty
> painful.

Yes, understood.
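
The approach agreed on here (keep the name-based check, fall back to the Page Up keycode when the name is not recognised) might look roughly like the sketch below. It is only an illustration under stated assumptions: classify_keymap() and keymap_kind are hypothetical names, and the two keycode constants are assumptions (0x70 being Linux KEY_PAGEUP, 104, plus the usual X11 offset of 8, and 0x63 being the Page Up keycode in the legacy xfree86 keycodes file). In a real client the second argument would come from something like XKeysymToKeycode(dpy, XK_Page_Up); it is passed in as a plain number here so the sketch has no X11 dependency.

```c
#include <stddef.h>
#include <string.h>

typedef enum { KEYMAP_UNKNOWN, KEYMAP_EVDEV, KEYMAP_XFREE86 } keymap_kind;

/* Assumed X11 keycodes for the Page Up key under each scheme. */
#define PAGE_UP_KEYCODE_EVDEV   0x70u  /* Linux KEY_PAGEUP (104) + 8 */
#define PAGE_UP_KEYCODE_XFREE86 0x63u  /* legacy xfree86 keycodes */

/* True if s is non-NULL and starts with prefix. */
static int has_prefix(const char *s, const char *prefix)
{
    return s != NULL && strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Try the keycodes name first; if it is missing or unrecognised
 * (for example "(unnamed)"), fall back to where Page Up is mapped. */
keymap_kind classify_keymap(const char *keycodes_name, unsigned page_up_keycode)
{
    if (has_prefix(keycodes_name, "evdev"))
        return KEYMAP_EVDEV;
    if (has_prefix(keycodes_name, "xfree86"))
        return KEYMAP_XFREE86;
    if (page_up_keycode == PAGE_UP_KEYCODE_EVDEV)
        return KEYMAP_EVDEV;
    if (page_up_keycode == PAGE_UP_KEYCODE_XFREE86)
        return KEYMAP_XFREE86;
    return KEYMAP_UNKNOWN;
}
```

With this ordering, a name such as "evdev+aliases(qwerty)" is still recognised directly, while "(unnamed)" no longer causes a failure as long as Page Up sits at one of the two expected keycodes. As comment 30 notes, other keycode schemes exist and would still come out as unknown.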
Comment 33 Olivier Fourdan 2017-11-13 07:01:37 EST
(In reply to Daniel Berrange from comment #32)
> Oh, that's a nice idea, we can certainly try that approach. Just
> distinguishing AT from evdev is sufficient in this case.

Can I reassign this bug to spice-gtk then, since this is not an Xwayland issue?
Comment 34 Olivier Fourdan 2017-11-13 07:13:41 EST
Just for the record, there is no "real" race here, it's just that XkbCopyKeymap() is called from mieqProcessInputEvents(), meaning that before Xwayland has processed any keyboard input event, the atom is unchanged, but once any key is pressed, XkbCopyKeymap() is called and the keycodes atom for the keyboard device is set to "(unnamed)" as passed by the Wayland compositor. Xwayland is just behaving as expected here.
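
The ordering described above can be pictured with a deliberately tiny model. This is not X server code: toy_keyboard and both functions are invented for illustration, and the only values taken from this bug are the two atoms seen in the traces of comment 27 (144 and 308). The model assumes, as described in this thread, that the compositor keymap is applied to the slave keyboard first and that the master only picks it up when an input event from that slave is processed.

```c
typedef unsigned long atom_t;

/* Atom values as they appeared in the traces in comment 27. */
#define ATOM_EVDEV   144UL  /* "evdev+aliases(qwerty)" */
#define ATOM_UNNAMED 308UL  /* "(unnamed)" */

/* Toy stand-in for a keyboard device: only the keycodes name atom. */
typedef struct {
    atom_t keycodes;
} toy_keyboard;

/* The compositor's keymap arrives: only the slave device is updated. */
void toy_apply_compositor_keymap(toy_keyboard *slave, atom_t keycodes)
{
    slave->keycodes = keycodes;
}

/* An input event from the slave is processed: the master copies the
 * slave's keyboard description, including the keycodes name. */
void toy_process_key_event(const toy_keyboard *slave, toy_keyboard *master)
{
    master->keycodes = slave->keycodes;
}
```

In this model a client that queries the master before any key event still sees atom 144, and one that queries afterwards sees 308, which matches the "works the first time, fails later" pattern reported earlier.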
Comment 35 Daniel Berrange 2017-11-13 09:26:33 EST
I cloned to https://bugzilla.redhat.com/show_bug.cgi?id=1512564  for equivalent gtk-vnc fix too
