Bug 1441490
Summary: [abrt] gnome-shell: xkb_keymap_ref(): gnome-shell killed by signal 11
Product: [Fedora] Fedora
Component: gnome-shell
Version: 26
Hardware: x86_64
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Reporter: Joachim Frieben <jfrieben>
Assignee: Owen Taylor <otaylor>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: bugs, daniel.playfair.cal, debarshir, evan, faber, fmuellner, jeischma, marcus.husar, otaylor, umbertotozzato
URL: https://retrace.fedoraproject.org/faf/reports/bthash/2ba343949bac67f2d040693ced41dd7fbb46b50c
Whiteboard: abrt_hash:c50731d04df538049ea732054a5625b4389d0b2e;VARIANT_ID=workstation;
Last Closed: 2018-05-29 11:57:36 UTC

Description
Joachim Frieben
2017-04-12 05:02:49 UTC
Created attachment 1271023: backtrace
Created attachment 1271024: cgroup
Created attachment 1271025: core_backtrace
Created attachment 1271026: dso_list
Created attachment 1271027: environ
Created attachment 1271028: exploitable
Created attachment 1271029: limits
Created attachment 1271030: maps
Created attachment 1271031: open_fds
Created attachment 1271032: proc_pid_status
Created attachment 1271033: var_log_messages

Similar problem has been detected:

I just booted my machine up. After about two minutes I logged in and gnome-shell crashed.

reporter: libreport-2.9.1
backtrace_rating: 4
cmdline: /usr/bin/gnome-shell
crash_function: xkb_keymap_ref
executable: /usr/bin/gnome-shell
global_pid: 1555
kernel: 4.11.0-0.rc6.git0.1.fc26.x86_64
package: gnome-shell-3.24.1-1.fc26
reason: gnome-shell killed by SIGSEGV
runlevel: N 5
type: CCpp
uid: 1000

Similar problem has been detected:

I logged in to my account in GDM and it got stuck on grey screen and the cursor in the middle of the screen.

reporter: libreport-2.9.1
backtrace_rating: 4
cmdline: /usr/bin/gnome-shell
crash_function: xkb_keymap_ref
executable: /usr/bin/gnome-shell
global_pid: 1872
kernel: 4.11.0-0.rc7.git0.1.fc26.x86_64
package: gnome-shell-3.24.1-1.fc26
reason: gnome-shell killed by SIGSEGV
runlevel: N 5
type: CCpp
uid: 1000

This seems to be the culprit:

Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: Couldn't find file "rules/evdev" in include paths
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: 1 include paths searched:
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: /usr/share/X11/xkb
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: 1 include paths could not be added:
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: /home/frieben/.xkb
Apr 12 00:49:45 riemann org.gnome.Shell.desktop[1524]: xkbcommon: ERROR: Couldn't look up rules 'evdev', model 'pc105+inet', layout 'de,us', variant ',', options ''

Can you check if you have selinux access denied messages in the journal around the crash and that /usr/share/X11/xkb/rules/evdev does exist?

I reinstalled Fedora 26 a few days ago, and I have not seen this crash again. File /usr/share/X11/xkb/rules/evdev does exist and belongs to package xkeyboard-config-2.20-3.fc26.noarch.

I get this in Fedora 25 every now and then. Sometimes it seems to happen shortly after I open a new tab in Google Chrome:

maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: Couldn't find file "rules/evdev" in include paths
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: 1 include paths searched:
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: /usr/share/X11/xkb
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: 1 include paths could not be added:
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: /home/herman/.xkb
maj 15 10:26:04 frylock org.gnome.Shell.desktop[2263]: xkbcommon: ERROR: Couldn't look up rules 'evdev', model 'pc105+inet', layout 'us,se,us', variant 'altgr-intl,,', options '
maj 15 10:26:04 frylock kernel: gnome-shell[2263]: segfault at 8 ip 00007fa2236e0fb3 sp 00007ffec5ea0d68 error 6 in libxkbcommon.so.0.0.0[7fa2236c5000+3e000]
maj 15 10:26:04 frylock kernel: audit: type=1701 audit(1494836764.114:384): auid=1000 uid=1000 gid=1000 ses=2 pid=2263 comm="gnome-shell" exe="/usr/bin/gnome-shell" sig=11
maj 15 10:26:04 frylock abrt-hook-ccpp[6249]: Process 2263 (gnome-shell) of user 1000 killed by SIGSEGV - dumping core
maj 15 10:26:04 frylock audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=geoclue comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termi
maj 15 10:26:04 frylock kernel: audit: type=1131 audit(1494836764.605:385): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=geoclue comm="systemd" exe="/usr/lib/systemd/sys
maj 15 10:26:14 frylock abrt-hook-ccpp[6249]: /var/spool/abrt is 3982844309 bytes (more than 1279MiB), deleting 'ccpp-2017-05-12-15:26:41-2082'
maj 15 10:26:17 frylock gnome-terminal-[2942]: Error reading events from display: Connection reset by peer
maj 15 10:26:17 frylock gnome-software[2517]: Error reading events from display: Broken pipe
maj 15 10:26:17 frylock evolution-alarm[2493]: Error reading events from display: Broken pipe
maj 15 10:26:17 frylock abrt-applet[2535]: Error reading events from display: Broken pipe
maj 15 10:26:17 frylock polkitd[1167]: Unregistered Authentication Agent for unix-session:2 (system bus name :1.43, object path /org/freedesktop/PolicyKit1/AuthenticationAgent,
maj 15 10:26:17 frylock systemd[2160]: gnome-terminal-server.service: Main process exited, code=exited, status=1/FAILURE
maj 15 10:26:17 frylock systemd[2160]: gnome-terminal-server.service: Unit entered failed state.
maj 15 10:26:17 frylock systemd[2160]: gnome-terminal-server.service: Failed with result 'exit-code'.
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: (EE)
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: Fatal server error:
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: (EE) failed to read Wayland events: Connection reset by peer
maj 15 10:26:17 frylock org.gnome.Shell.desktop[2263]: (EE)
maj 15 10:26:17 frylock gnome-session[2180]: gnome-session-binary[2180]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
maj 15 10:26:17 frylock gnome-session-binary[2180]: Unrecoverable failure in required component org.gnome.Shell.desktop
maj 15 10:26:17 frylock gnome-session-binary[2180]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11

This crash also occurred for me in the process of debugging this other bug: https://bugzilla.gnome.org/show_bug.cgi?id=783935

Here is a journal log as well as the output of gnome-shell being run inside of valgrind: https://gist.github.com/hedgepigdaniel/00c2792c33c7993d17134a48bfe691d0

These two also look very similar to this one: https://bugzilla.redhat.com/show_bug.cgi?id=1349265, https://bugzilla.redhat.com/show_bug.cgi?id=1398142

Here is a journal log and the output of running gnome-shell with valgrind: https://gist.github.com/hedgepigdaniel/00c2792c33c7993d17134a48bfe691d0

For me it's reproducible 100% of the time when running gnome-shell with valgrind under Wayland:

==27992== Invalid read of size 4
==27992== at 0xF855ED3: xkb_keymap_ref (keymap.c:59)
==27992== by 0x692E1D9: clutter_evdev_set_keyboard_map (clutter-device-manager-evdev.c:2546)
==27992== by 0x5D73E52: meta_backend_native_set_keymap (meta-backend-native.c:486)
==27992== by 0xBFEE17D: ffi_call_unix64 (unix64.S:76)
==27992== by 0xBFEDAEE: ffi_call (ffi64.c:525)
==27992== by 0x5577045: ??? (function.cpp:1021)
==27992== by 0x55781A0: ??? (function.cpp:1340)
==27992== by 0x41EAB5A: ???
==27992== by 0x4420CFFF: ???
==27992== by 0x39AB09A5: ???
==27992== by 0x3FF: ???
==27992== by 0x3788AE7F: ???
==27992== Address 0x8 is not stack'd, malloc'd or (recently) free'd

It looks like keymap is NULL in meta_backend_native_set_keymap. Therefore I guess xkb_keymap_new_from_names in keymap.c returned NULL for some reason.

Potentially relevant journal log:

Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: Couldn't find file "rules/evdev" in include paths
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: 1 include paths searched:
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: /usr/share/X11/xkb
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: 1 include paths could not be added:
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: /home/daniel/.xkb
Jul 22 21:17:38 danielpc-arch org.gnome.Shell-valgrind-errors.desktop[1040]: xkbcommon: ERROR: Couldn't look up rules 'evdev', model 'pc105+inet', layout 'us,us', variant ',', options ''

On my system, /usr/share/X11/xkb/rules/evdev exists and is owned by root:root with mode 644.

Looking at FindFileInXkbPath in include.c in libxkbcommon, I can't see how it would print all that stuff unless fopen returned NULL when opening /usr/share/X11/xkb/rules/evdev. I'm thinking I should set a breakpoint after that fopen call and see what errno is.

Having run it a little bit with gdb and valgrind:
- The file /usr/share/X11/xkb/rules/evdev is opened successfully many times in the same place (stepped over code with gdb).
- The crash does not occur for me (at least not in 2 attempts) if only gdb is used.
- It occurs reliably if valgrind is used.
- If valgrind is used with vgdb (tried with --tool memcheck and none), it becomes extremely slow for me, so much so that after 45 minutes of CPU time the crash still doesn't occur.

Then I tried patching the source to log the value of errno if file came back as NULL from fopen.
errno was 24 (EMFILE /* Too many open files */). I conclude from this that something is opening a lot of files and not closing them, and that for some reason this occurs more when running with valgrind. Since I observed so many successful calls to open on this particular file using gdb, I'm guessing something is calling FindFileInXkbPath many times and maybe not closing the files afterwards.

I tried increasing the default per-process open file limit from 1024 to 8192. The result was that memcheck, with gnome-shell running in it, proceeded to consume about 12 GB of RAM. During this time I ran lsof on the process. There were many more than 1024 file handles, and the majority of the entries looked like this:

memcheck- 1022 daniel 3782u a_inode 0,12 0 9705 [timerfd]

I then ran gnome-shell inside of strace, recording open/close syscalls, and found that there were a large number of calls to open('/etc/localtime'), and also lots of consecutive calls to close() with the same file handle. See strace log: https://gist.githubusercontent.com/hedgepigdaniel/d15275fb293a1be01f3bdc64b9b3fff4/raw/9de45b093c974e3731f6fdd038cba30129f4709e/gistfile1.txt

I'm confused because right at the end of that log you can see the following:

open("/usr/share/X11/xkb/rules/evdev", O_RDONLY) = 1045
close(1045) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8} ---
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8} ---

But I was fairly sure that fopen returned NULL in the code, which I assumed would have meant that open() returned -1 (not 1045) if the syscall to open the file failed. Am I misunderstanding strace?

I realised that valgrind has a --track-fds option.
The main thread of gnome-shell hits the 1024 open files limit, and the majority of the handles are created as follows:

==19805== Open file descriptor 145:
==19805== at 0x7A5E8F7: timerfd_create (in /usr/lib/libc-2.25.so)
==19805== by 0xD515853: g_datetime_source_init_timerfd (gnome-datetime-source.c:177)
==19805== by 0xD515853: _gnome_datetime_source_new (gnome-datetime-source.c:275)
==19805== by 0xD50B760: update_clock (gnome-wall-clock.c:359)
==19805== by 0x71FBEAC: g_closure_invoke (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x720E4AD: ??? (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x7216C84: g_signal_emit_valist (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x721769E: g_signal_emit (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x6F5DB31: ??? (in /usr/lib/libgio-2.0.so.0.5200.3)
==19805== by 0xBFEE17D: ffi_call_unix64 (unix64.S:76)
==19805== by 0xBFEDAEE: ffi_call (ffi64.c:525)
==19805== by 0x71FCA9C: g_cclosure_marshal_generic_va (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x71FC0E5: ??? (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x721693C: g_signal_emit_valist (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x721769E: g_signal_emit (in /usr/lib/libgobject-2.0.so.0.5200.3)
==19805== by 0x6F5E2A7: ??? (in /usr/lib/libgio-2.0.so.0.5200.3)
==19805== by 0x6F58DC9: ??? (in /usr/lib/libgio-2.0.so.0.5200.3)
==19805== by 0x74898C4: g_main_context_dispatch (in /usr/lib/libglib-2.0.so.0.5200.3)
==19805== by 0x7489C87: ??? (in /usr/lib/libglib-2.0.so.0.5200.3)
==19805== by 0x7489FA1: g_main_loop_run (in /usr/lib/libglib-2.0.so.0.5200.3)
==19805== by 0x5D297BB: meta_run (main.c:648)
==19805== by 0x10A306: main (main.c:454)

And the GJS stack at the same time:

Background<._init@resource:///org/gnome/shell/ui/background.js:251:23
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
BackgroundSource<.getBackground@resource:///org/gnome/shell/ui/background.js:583:30
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
BackgroundManager<._createBackgroundActor@resource:///org/gnome/shell/ui/background.js:753:26
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
BackgroundManager<._updateBackgroundActor@resource:///org/gnome/shell/ui/background.js:729:34
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
BackgroundManager<._createBackgroundActor/changeSignalId<@resource:///org/gnome/shell/ui/background.js:775:13
_emit@resource:///org/gnome/gjs/modules/signals.js:126:27
Background<._init/this._settingsChangedSignalId<@resource:///org/gnome/shell/ui/background.js:267:45

*** Bug 1478579 has been marked as a duplicate of this bug. ***

This message is a reminder that Fedora 26 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '26'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 26 is end of life.
If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

*** This bug has been marked as a duplicate of bug 1507656 ***
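
For anyone retracing the diagnosis in the comments above (descriptors leak until open calls fail with errno 24, EMFILE, and lsof shows what the leaked descriptors are), the following is a minimal sketch of both steps in isolation. It is not code from gnome-shell, libxkbcommon, or gnome-desktop. The function names `summarize_fds` and `leak_until_emfile` and the limit of 64 are made up for illustration, and the sketch assumes Linux with /proc mounted.

```python
import collections
import errno
import os
import resource


def summarize_fds(pid="self"):
    """Count a process's open descriptors by target (hypothetical helper).

    Reads /proc/<pid>/fd, so it is Linux-only; roughly a condensed lsof.
    """
    counts = collections.Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            counts[os.readlink(os.path.join(fd_dir, name))] += 1
        except OSError:
            continue  # descriptor was closed while we were listing
    return counts


def leak_until_emfile(limit=64):
    """Lower the soft RLIMIT_NOFILE and leak descriptors until open fails.

    Returns (number_leaked, errno_of_failure). The leak is deliberate and
    is cleaned up, and the original limit restored, before returning.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        limit = min(limit, hard)  # the soft limit may not exceed the hard one
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    leaked = []
    try:
        while True:
            # Never closed inside the loop: this stands in for the leak.
            leaked.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return len(leaked), exc.errno
    finally:
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    count, err = leak_until_emfile()
    print(f"open failed after {count} leaked descriptors:",
          errno.errorcode[err])
    for target, n in summarize_fds().most_common(5):
        print(n, target)
```

On Linux, leaked timerfd descriptors appear in /proc/<pid>/fd as links to `anon_inode:[timerfd]`, so calling `summarize_fds` with the PID of an affected process should surface them at the top of the list, assuming the caller has permission to read that process's fd directory.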