Bug 1441490

Summary: [abrt] gnome-shell: xkb_keymap_ref(): gnome-shell killed by signal 11
Product: [Fedora] Fedora
Reporter: Joachim Frieben <jfrieben>
Component: gnome-shell
Assignee: Owen Taylor <otaylor>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 26
CC: bugs, daniel.playfair.cal, debarshir, evan, faber, fmuellner, jeischma, marcus.husar, otaylor, umbertotozzato
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
URL: https://retrace.fedoraproject.org/faf/reports/bthash/2ba343949bac67f2d040693ced41dd7fbb46b50c
Whiteboard: abrt_hash:c50731d04df538049ea732054a5625b4389d0b2e;VARIANT_ID=workstation;
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-29 11:57:36 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description               Flags
File: backtrace           none
File: cgroup              none
File: core_backtrace      none
File: dso_list            none
File: environ             none
File: exploitable         none
File: limits              none
File: maps                none
File: open_fds            none
File: proc_pid_status     none
File: var_log_messages    none

Description Joachim Frieben 2017-04-12 05:02:49 UTC
Version-Release number of selected component:
gnome-shell-3.24.0-1.fc26

Additional info:
reporter:       libreport-2.9.1
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: xkb_keymap_ref
executable:     /usr/bin/gnome-shell
journald_cursor: s=4d25ad84722a479e84de1da3dc37d2c5;i=ffe7;b=a4aa3063b50f4c49b81feda9f9b634a5;m=65faf88;t=54cebe95abf63;x=81384b37474bcd89
kernel:         4.11.0-0.rc5.git0.1.fc26.x86_64
rootdir:        /
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (7 frames)
 #0 xkb_keymap_ref at src/keymap.c:59
 #1 clutter_evdev_set_keyboard_map at evdev/clutter-device-manager-evdev.c:2515
 #2 meta_backend_native_set_keymap at backends/native/meta-backend-native.c:372
 #3 ffi_call_unix64 at ../src/x86/unix64.S:76
 #4 ffi_call at ../src/x86/ffi64.c:525
 #5 gjs_invoke_c_function(JSContext*, Function*, JS::HandleObject, JS::HandleValueArray const&, mozilla::Maybe<JS::MutableHandle<JS::Value> >&, GIArgument*) at gi/function.cpp:1020
 #6 function_call(JSContext*, unsigned int, JS::Value*) at gi/function.cpp:1339

Potential duplicate: bug 1401232

Comment 1 Joachim Frieben 2017-04-12 05:02:59 UTC
Created attachment 1271023
File: backtrace

Comment 2 Joachim Frieben 2017-04-12 05:03:01 UTC
Created attachment 1271024
File: cgroup

Comment 3 Joachim Frieben 2017-04-12 05:03:05 UTC
Created attachment 1271025
File: core_backtrace

Comment 4 Joachim Frieben 2017-04-12 05:03:08 UTC
Created attachment 1271026
File: dso_list

Comment 5 Joachim Frieben 2017-04-12 05:03:10 UTC
Created attachment 1271027
File: environ

Comment 6 Joachim Frieben 2017-04-12 05:03:12 UTC
Created attachment 1271028
File: exploitable

Comment 7 Joachim Frieben 2017-04-12 05:03:13 UTC
Created attachment 1271029
File: limits

Comment 8 Joachim Frieben 2017-04-12 05:03:22 UTC
Created attachment 1271030
File: maps

Comment 9 Joachim Frieben 2017-04-12 05:03:32 UTC
Created attachment 1271031
File: open_fds

Comment 10 Joachim Frieben 2017-04-12 05:03:34 UTC
Created attachment 1271032
File: proc_pid_status

Comment 11 Joachim Frieben 2017-04-12 05:03:36 UTC
Created attachment 1271033
File: var_log_messages

Comment 12 Marcus Husar 2017-04-20 07:54:47 UTC
Similar problem has been detected:

I just booted my machine up. After about two minutes I logged in and gnome-shell crashed.

reporter:       libreport-2.9.1
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: xkb_keymap_ref
executable:     /usr/bin/gnome-shell
global_pid:     1555
kernel:         4.11.0-0.rc6.git0.1.fc26.x86_64
package:        gnome-shell-3.24.1-1.fc26
reason:         gnome-shell killed by SIGSEGV
runlevel:       N 5
type:           CCpp
uid:            1000

Comment 13 Jiri Eischmann 2017-04-25 07:34:01 UTC
Similar problem has been detected:

I logged in to my account in GDM and it got stuck on a grey screen with the cursor in the middle of the screen.

reporter:       libreport-2.9.1
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: xkb_keymap_ref
executable:     /usr/bin/gnome-shell
global_pid:     1872
kernel:         4.11.0-0.rc7.git0.1.fc26.x86_64
package:        gnome-shell-3.24.1-1.fc26
reason:         gnome-shell killed by SIGSEGV
runlevel:       N 5
type:           CCpp
uid:            1000

Comment 14 Rui Matos 2017-04-25 09:56:29 UTC
This seems to be the culprit:

Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: Couldn't find file "rules/evdev" in include paths
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: 1 include paths searched:
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR:         /usr/share/X11/xkb
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: 1 include paths could not be added:
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR:         /home/frieben/.xkb
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: Couldn't look up rules 'evdev', model 'pc105+inet', layout 'de,us', variant ',', options ''


Can you check whether there are SELinux "access denied" messages in the journal around the time of the crash, and whether /usr/share/X11/xkb/rules/evdev exists?

Comment 15 Joachim Frieben 2017-04-25 12:42:25 UTC
I reinstalled Fedora 26 a few days ago, and I have not seen this crash again. File /usr/share/X11/xkb/rules/evdev does exist and belongs to package xkeyboard-config-2.20-3.fc26.noarch.

Comment 16 bugs 2017-05-15 08:35:41 UTC
I get this in Fedora 25 every now and then. Sometimes it seems to happen shortly after I open a new tab in Google Chrome:

maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: Couldn't find file "rules/evdev" in include paths
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: 1 include paths searched:
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR:         /usr/share/X11/xkb
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: 1 include paths could not be added:
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR:         /home/herman/.xkb
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: Couldn't look up rules 'evdev', model 'pc105+inet', layout 'us,se,us', variant 'altgr-intl,,', options '
maj 15 10:26:04 frylock kernel: gnome-shell[2263]: segfault at 8 ip 00007fa2236e0fb3 sp 00007ffec5ea0d68 error 6 in libxkbcommon.so.0.0.0[7fa2236c5000+3e000]
maj 15 10:26:04 frylock kernel: audit: type=1701 audit(1494836764.114:384): auid=1000 uid=1000 gid=1000 ses=2 pid=2263 comm="gnome-shell" exe="/usr/bin/gnome-shell" sig=11
maj 15 10:26:04 frylock abrt-hook-ccpp[6249]: Process 2263 (gnome-shell) of user 1000 killed by SIGSEGV - dumping core
maj 15 10:26:04 frylock audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=geoclue comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termi
maj 15 10:26:04 frylock kernel: audit: type=1131 audit(1494836764.605:385): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=geoclue comm="systemd" exe="/usr/lib/systemd/sys
maj 15 10:26:14 frylock abrt-hook-ccpp[6249]: /var/spool/abrt is 3982844309 bytes (more than 1279MiB), deleting 'ccpp-2017-05-12-15:26:41-2082'
maj 15 10:26:17 frylock gnome-terminal-[2942]: Error reading events from display: Connection reset by peer
maj 15 10:26:17 frylock gnome-software[2517]: Error reading events from display: Broken pipe
maj 15 10:26:17 frylock evolution-alarm[2493]: Error reading events from display: Broken pipe
maj 15 10:26:17 frylock abrt-applet[2535]: Error reading events from display: Broken pipe
maj 15 10:26:17 frylock polkitd[1167]: Unregistered Authentication Agent for unix-session:2 (system bus name :1.43, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, 
maj 15 10:26:17 frylock systemd[2160]: gnome-terminal-server.service: Main process exited, code=exited, status=1/FAILURE
maj 15 10:26:17 frylock systemd[2160]: gnome-terminal-server.service: Unit entered failed state.
maj 15 10:26:17 frylock systemd[2160]: gnome-terminal-server.service: Failed with result 'exit-code'.
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: (EE)
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: Fatal server error:
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: (EE) failed to read Wayland events: Connection reset by peer
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: (EE)
maj 15 10:26:17 frylock gnome-session[2180]: gnome-session-binary[2180]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
maj 15 10:26:17 frylock gnome-session-binary[2180]: Unrecoverable failure in required component org.gnome.Shell.desktop
maj 15 10:26:17 frylock gnome-session-binary[2180]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11

Comment 17 Daniel Playfair Cal 2017-07-22 23:39:53 UTC
This crash also occurred for me in the process of debugging this other bug: https://bugzilla.gnome.org/show_bug.cgi?id=783935

Here is a journal log as well as the output of gnome-shell being run inside of valgrind: https://gist.github.com/hedgepigdaniel/00c2792c33c7993d17134a48bfe691d0

Comment 18 Daniel Playfair Cal 2017-07-22 23:40:31 UTC
These two also look very similar to this one: https://bugzilla.redhat.com/show_bug.cgi?id=1349265, https://bugzilla.redhat.com/show_bug.cgi?id=1398142

Comment 19 Daniel Playfair Cal 2017-07-23 00:25:46 UTC
Here is a journal log and the output of running gnome-shell with valgrind: https://gist.github.com/hedgepigdaniel/00c2792c33c7993d17134a48bfe691d0

Comment 20 Daniel Playfair Cal 2017-07-23 03:00:46 UTC
For me it's reproducible 100% of the time when running gnome-shell with valgrind under Wayland.

Comment 21 Daniel Playfair Cal 2017-07-23 03:10:58 UTC
==27992== Invalid read of size 4
==27992==    at 0xF855ED3: xkb_keymap_ref (keymap.c:59)
==27992==    by 0x692E1D9: clutter_evdev_set_keyboard_map (clutter-device-manager-evdev.c:2546)
==27992==    by 0x5D73E52: meta_backend_native_set_keymap (meta-backend-native.c:486)
==27992==    by 0xBFEE17D: ffi_call_unix64 (unix64.S:76)
==27992==    by 0xBFEDAEE: ffi_call (ffi64.c:525)
==27992==    by 0x5577045: ??? (function.cpp:1021)
==27992==    by 0x55781A0: ??? (function.cpp:1340)
==27992==    by 0x41EAB5A: ???
==27992==    by 0x4420CFFF: ???
==27992==    by 0x39AB09A5: ???
==27992==    by 0x3FF: ???
==27992==    by 0x3788AE7F: ???
==27992==  Address 0x8 is not stack'd, malloc'd or (recently) free'd

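It looks like keymap is NULL in meta_backend_native_set_keymap: the invalid read at address 0x8 is consistent with accessing a field at a small offset from a NULL pointer.
I therefore guess that xkb_keymap_new_from_names in keymap.c returned NULL for some reason.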

Potentially relevant journal log:

Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: Couldn't find file "rules/evdev" in include paths
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: 1 include paths searched:
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR:         /usr/share/X11/xkb
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: 1 include paths could not be added:
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR:         /home/daniel/.xkb
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: Couldn't look up rules 'evdev', model 'pc105+inet', layout 'us,us', variant ',', options ''

Comment 22 Daniel Playfair Cal 2017-07-23 03:33:08 UTC
On my system, /usr/share/X11/xkb/rules/evdev exists and is owned by root:root with mode 644.

Looking at FindFileInXkbPath in include.c in libxkbcommon, I can't see how it would print all of those errors unless fopen returned NULL when opening /usr/share/X11/xkb/rules/evdev.

I'm thinking I should set a breakpoint after that fopen call and see what errno is.

Comment 23 Daniel Playfair Cal 2017-07-23 07:59:51 UTC
Having run it a few times with gdb and valgrind:
 - The file /usr/share/X11/xkb/rules/evdev is opened successfully many times in the same place (I stepped over the code with gdb).
 - The crash does not occur for me (at least not in 2 attempts) if only gdb is used.
 - It occurs reliably if valgrind is used.
 - If valgrind is used with vgdb (tried with --tool=memcheck and --tool=none), it becomes extremely slow for me, so much so that the crash still hasn't occurred after 45 minutes of CPU time.

Then I tried patching the source to log the value of errno when fopen returned NULL. errno was 24 (EMFILE /* Too many open files */).

I conclude from this that something is opening a lot of files and not closing them, and that for some reason this happens more when running under valgrind. Since I observed so many successful calls to open on this particular file using gdb, I'm guessing that something calls FindFileInXkbPath many times and perhaps does not close the files afterwards.
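
For anyone who wants to see what errno 24 looks like in isolation, here is a minimal Python sketch. It is only an illustration and not code from gnome-shell or libxkbcommon: it lowers the soft RLIMIT_NOFILE for the current process (the value 64 and the function name are arbitrary choices made for this example), opens /dev/null until the open fails, and returns the errno.

```python
import errno
import os
import resource


def errno_when_fds_exhausted(limit=64):
    """Lower the soft fd limit, open files until open() fails, return the errno."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Only ever lower the soft limit for this demonstration, never raise it.
    if soft == resource.RLIM_INFINITY or soft > limit:
        new_soft = limit
    else:
        new_soft = soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    opened = []
    try:
        while True:
            # Each successful open consumes one descriptor and is never closed here,
            # which mimics a descriptor leak.
            opened.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno
    finally:
        # Release the leaked descriptors and restore the original limit.
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    code = errno_when_fds_exhausted()
    print(code, errno.errorcode[code], os.strerror(code))
```

Once the limit is reached, every later open in the process fails the same way, regardless of which file is requested. That would explain why an unrelated lookup such as rules/evdev is the one that reports the error.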

Comment 24 Daniel Playfair Cal 2017-07-23 13:53:40 UTC
I tried increasing the default per-process open file limit from 1024 to 8192. The result was that memcheck, with gnome-shell running inside it, proceeded to consume about 12 GB of RAM.

During this time I ran lsof on the process. There were many more than 1024 file handles, and the majority of the entries looked like this:

memcheck- 1022 daniel 3782u  a_inode               0,12        0     9705 [timerfd]
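
As an alternative to reading lsof output by eye, the same tally can be sketched in Python from /proc/<pid>/fd. This is Linux-specific, and the helper names and categories below are assumptions made for this example. Each entry in that directory is a symlink; anonymous inodes such as timerfds usually appear with a target like "anon_inode:[timerfd]".

```python
import os
from collections import Counter


def classify_fd_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    if target.startswith("anon_inode:"):
        # e.g. "anon_inode:[timerfd]" -> "timerfd"
        return target[len("anon_inode:"):].strip("[]")
    if target.startswith(("socket:", "pipe:")):
        return target.split(":", 1)[0]
    return "file"


def tally_open_fds(pid="self"):
    """Count the open descriptors of a process by category (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        counts[classify_fd_target(target)] += 1
    return counts


if __name__ == "__main__":
    for kind, count in tally_open_fds().most_common():
        print(f"{count:6d}  {kind}")
```

Running it with the PID of the affected process (for example tally_open_fds(1022)) should show quickly whether one descriptor type dominates.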

I then ran gnome-shell under strace, recording open/close syscalls, and found a large number of calls to open("/etc/localtime") as well as many consecutive calls to close() with the same file handle. See the strace log: https://gist.githubusercontent.com/hedgepigdaniel/d15275fb293a1be01f3bdc64b9b3fff4/raw/9de45b093c974e3731f6fdd038cba30129f4709e/gistfile1.txt

I'm confused because right at the end of that log you can see the following:

open("/usr/share/X11/xkb/rules/evdev", O_RDONLY) = 1045
close(1045)                             = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8} ---
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8} ---

But I was fairly sure that fopen returned NULL in the code, which I assumed would mean that open() returned -1 (not 1045) because the syscall failed. Am I misunderstanding strace?
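
A strace log of this kind can be summarised with a short script. The Python sketch below is a hypothetical helper, not a standard tool: it assumes the plain `open("path", flags) = N` line format shown in the excerpt above, counts how often each path is opened, records failed opens separately, and remembers the highest descriptor number returned.

```python
import re
from collections import Counter

# Matches e.g.: open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 12
# and failures: open("/x", O_RDONLY) = -1 EMFILE (Too many open files)
OPEN_RE = re.compile(r'\bopen\("(?P<path>[^"]*)",[^)]*\)\s*=\s*(?P<ret>-?\d+)')


def summarise_opens(lines):
    """Return (successful opens per path, failed opens per path, highest fd seen)."""
    opened = Counter()
    failed = Counter()
    highest_fd = -1
    for line in lines:
        match = OPEN_RE.search(line)
        if match is None:
            continue  # close(), signals, and other syscalls are ignored
        ret = int(match.group("ret"))
        if ret < 0:
            failed[match.group("path")] += 1
        else:
            opened[match.group("path")] += 1
            highest_fd = max(highest_fd, ret)
    return opened, failed, highest_fd


if __name__ == "__main__":
    import sys

    opened, failed, highest_fd = summarise_opens(sys.stdin)
    print("highest fd:", highest_fd)
    for path, count in opened.most_common(10):
        print(f"{count:6d}  {path}")
    for path, count in failed.most_common(10):
        print(f"{count:6d}  FAILED  {path}")
```

For the excerpt quoted above, the highest descriptor is 1045. A descriptor number above 1024 suggests that this particular trace was taken with the raised limit, so the open of rules/evdev would not be expected to fail with EMFILE in that run.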

Comment 25 Daniel Playfair Cal 2017-07-24 03:44:31 UTC
I realised that valgrind has a --track-fds option. The main thread of gnome-shell hits the 1024 open files limit, and the majority of the handles are created as follows:


==19805== Open file descriptor 145:
==19805==    at 0x7A5E8F7: timerfd_create (in /usr/lib/libc-2.25.so)
==19805==    by 0xD515853: g_datetime_source_init_timerfd (gnome-datetime-source.c:177)
==19805==    by 0xD515853: _gnome_datetime_source_new (gnome-datetime-source.c:275)
==19805==    by 0xD50B760: update_clock (gnome-wall-clock.c:359)
==19805==    by 0x71FBEAC: g_closure_invoke (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x720E4AD: ??? (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x7216C84: g_signal_emit_valist (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x721769E: g_signal_emit (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x6F5DB31: ??? (in /usr/lib/libgio-2.0.so.0.5200.3)
==19805==    by 0xBFEE17D: ffi_call_unix64 (unix64.S:76)
==19805==    by 0xBFEDAEE: ffi_call (ffi64.c:525)
==19805==    by 0x71FCA9C: g_cclosure_marshal_generic_va (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x71FC0E5: ??? (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x721693C: g_signal_emit_valist (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x721769E: g_signal_emit (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805==    by 0x6F5E2A7: ??? (in /usr/lib/libgio-2.0.so.0.5200.3)
==19805==    by 0x6F58DC9: ??? (in /usr/lib/libgio-2.0.so.0.5200.3)
==19805==    by 0x74898C4: g_main_context_dispatch (in /usr/lib/libglib-2.0.so.0.5200.3)
==19805==    by 0x7489C87: ??? (in /usr/lib/libglib-2.0.so.0.5200.3)
==19805==    by 0x7489FA1: g_main_loop_run (in /usr/lib/libglib-2.0.so.0.5200.3)
==19805==    by 0x5D297BB: meta_run (main.c:648)
==19805==    by 0x10A306: main (main.c:454)
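
The trace above suggests that each update_clock call creates a new timerfd. The general leak pattern, where a new descriptor replaces an old one that is never closed, can be sketched in Python as follows. This is not the gnome-desktop code: the WallClock class is a hypothetical stand-in, and opening /dev/null stands in for timerfd_create().

```python
import os


class WallClock:
    """Hypothetical object that re-creates its timer source on every update."""

    def __init__(self, close_old):
        self.close_old = close_old
        self.fd = None

    def update(self):
        if self.close_old and self.fd is not None:
            # Correct variant: release the previous descriptor first.
            os.close(self.fd)
        # Stand-in for timerfd_create(); any descriptor-producing call
        # shows the same behaviour.
        self.fd = os.open(os.devnull, os.O_RDONLY)
        return self.fd


if __name__ == "__main__":
    leaky = WallClock(close_old=False)
    print("leaky:", [leaky.update() for _ in range(5)])   # five distinct fds stay open
    fixed = WallClock(close_old=True)
    print("fixed:", [fixed.update() for _ in range(5)])   # the same fd number is reused
```

In the leaky variant every update consumes one more descriptor, so a frequently triggered update path would eventually reach the per-process limit.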

Comment 26 Daniel Playfair Cal 2017-07-24 04:30:39 UTC
And the GJS stack at the same time:

Background<._init@resource:///org/gnome/shell/ui/background.js:251:23
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
BackgroundSource<.getBackground@resource:///org/gnome/shell/ui/background.js:583:30
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
BackgroundManager<._createBackgroundActor@resource:///org/gnome/shell/ui/background.js:753:26
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
BackgroundManager<._updateBackgroundActor@resource:///org/gnome/shell/ui/background.js:729:34
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
BackgroundManager<._createBackgroundActor/changeSignalId<@resource:///org/gnome/shell/ui/background.js:775:13
_emit@resource:///org/gnome/gjs/modules/signals.js:126:27
Background<._init/this._settingsChangedSignalId<@resource:///org/gnome/shell/ui/background.js:267:45

Comment 27 Rui Matos 2017-07-24 12:32:42 UTC
See https://bugzilla.gnome.org/show_bug.cgi?id=782688

Comment 28 Umberto T 2017-08-04 20:54:26 UTC
*** Bug 1478579 has been marked as a duplicate of this bug. ***

Comment 29 Fedora End Of Life 2018-05-03 08:35:31 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 30 Fedora End Of Life 2018-05-29 11:57:36 UTC
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26
is no longer maintained, which means that it will not receive any
further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 31 Debarshi Ray 2018-08-08 06:08:41 UTC

*** This bug has been marked as a duplicate of bug 1507656 ***