Description of problem:
Firefox crashes on startup after an update. The issue persists even after a clean reinstall.
Version-Release number of selected component (if applicable):
firefox 111.0.1-1.fc38
How reproducible:
Occurs every time, cannot run firefox because of the issue
Steps to Reproduce:
1. Run firefox
2. ???
Actual results:
Firefox does not run
Expected results:
Firefox runs as usual
Additional info:
Arch: x86
DE: GNOME on Wayland
Fedora version: f38 beta (latest updates)
I'm on Fedora 38 beta.
If started without arguments from the command line, like so:
firefox
Firefox reports that it terminated with a SEGV error.
If started with only the -P argument to bring up the profile selection dialog, Firefox dies with a SEGV error after the user selects a profile and proceeds, failing to bring up the desired Firefox profile.
I then updated to the Firefox 112.0-1 build on koji https://koji.fedoraproject.org/koji/buildinfo?buildID=2181888
Same problem.
(workaround) Firefox can start if one passes a profile name with the profile argument (-P), from either the GNOME Shell Alt-F2 prompt or a terminal command line:
firefox -P expt
firefox -P default
When started this way, Firefox seems to work. I am not yet sure of its general stability or video stability, as I have not yet used the browser much, but the browser remained open overnight without activity.
A second, similar SEGV error happens when Firefox is closed, and the message is shown in the terminal.
Given that the profile argument is involved, I suspect this SEGV is triggered when Firefox has to read/update/transfer/save the state of profiles.
A disadvantage of this workaround is that it needs a terminal window open for each Firefox profile opened.
Agree with the severity: even with the workaround, a browser is essential software, and being unable to start it can leave a user unable to access any information.
Will look at it.
Guys, I'm unable to reproduce on VM. Please try to attach crash info: https://fedoraproject.org/wiki/Debugging_guidelines_for_Mozilla_products#Using_local_debugging Thanks.
I'm on it. It will take some time, ~2 hrs maybe.
In the meantime, I came across this abrt backtrace report, which looked suspicious to me because it seems to be about profiles (nsProfileLock):
Bug 2183333 - [abrt] firefox: nsProfileLock::FatalSignalHandler(): firefox killed by SIGSEGV
https://bugzilla.redhat.com/show_bug.cgi?id=2183333
It may be a variant of https://bugzilla.mozilla.org/show_bug.cgi?id=1826583
MOZ_CRASH Reason: ```warning: queue 0x7f0821b867c0 destroyed while proxies still attached:```
I tried to build wayland-1.22.0 on Fedora 37 to debug it here and the build itself fails at redhat-linux-build/tests/queue-test:
```
Timeout was set to 2 seconds from now.
test "queue_destroy_default_with_attached_proxies": exit status 0, pass.
----------------------------------------
Timeout was set to 2 seconds from now.
warning: queue 0x55563dc54b20 destroyed while proxies still attached:
  wl_callback@2 still attached
Tried to add event to destroyed queue
Client 'client_test_queue_proxy_event_to_destroyed_queue' was killed by signal 6
test "queue_proxy_event_to_destroyed_queue": exit status 0, pass.
----------------------------------------
Timeout was set to 2 seconds from now.
warning: queue 0x55563dc54b20 destroyed while proxies still attached:
  wl_callback@2 still attached
test "queue_destroy_with_attached_proxies": exit status 0, pass.
----------------------------------------
Timeout was set to 2 seconds from now.
test "queue_set_queue_race": exit status 0, pass.
----------------------------------------
Timeout was set to 2 seconds from now.
test "queue_set_queue_proxy_wrapper": exit status 0, pass.
----------------------------------------
Timeout was set to 2 seconds from now.
test "queue_roundtrip": exit status 0, pass.
----------------------------------------
Timeout was set to 2 seconds from now.
test "queue_multiple_queues": exit status 0, pass.
----------------------------------------
Timeout was set to 2 seconds from now.
test "queue_proxy_destroy": exit status 0, pass.
----------------------------------------
8 tests, 8 pass, 0 fail
```
(it's interesting that the tests crash but are marked as passed :))
So it looks like a bug in wayland-1.22.0 library itself.
btw. there are also coredumpctl crashes for it:
Thu 2023-04-06 13:05:50 CEST 39737 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 31.3K
Thu 2023-04-06 13:29:22 CEST 50324 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 31.3K
Thu 2023-04-06 13:29:52 CEST 55375 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 30.5K
Thu 2023-04-06 13:30:08 CEST 67180 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 30.5K
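Rows like the ones above can be tallied by signal and executable, which helps keep SIGABRT entries from a test binary apart from SIGSEGV entries from Firefox. What follows is a minimal Python sketch, not part of any tool: it assumes the usual `coredumpctl list` column order (TIME as four whitespace-separated fields, then PID, UID, GID, SIG, COREFILE, EXE, SIZE) and executable paths without spaces, and `tally_coredumps` is a hypothetical helper name.

```python
from collections import Counter


def tally_coredumps(listing: str) -> Counter:
    """Count (signal, executable) pairs in `coredumpctl list`-style rows."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Assumed data-row layout (11 fields): weekday, date, time, timezone,
        # PID, UID, GID, SIG, COREFILE, EXE, SIZE.
        if len(fields) != 11 or not fields[4].isdigit():
            continue  # header, separator, or a row this sketch does not understand
        counts[(fields[7], fields[9])] += 1
    return counts


# Rows copied from the listing in the comment above.
SAMPLE = """\
Thu 2023-04-06 13:05:50 CEST 39737 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 31.3K
Thu 2023-04-06 13:29:22 CEST 50324 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 31.3K
Thu 2023-04-06 13:29:52 CEST 55375 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 30.5K
Thu 2023-04-06 13:30:08 CEST 67180 1000 1000 SIGABRT present /raid/CVS/wayland/wayland-1.22.0/redhat-linux-build/tests/queue-test 30.5K
"""

if __name__ == "__main__":
    for (signal, exe), count in tally_coredumps(SAMPLE).items():
        print(count, signal, exe)
```

Rows that do not match the assumed layout are skipped rather than guessed at, so a different coredumpctl version or a path containing spaces would need a more careful parser.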
[root@sirius ff]# cat /etc/os-release | grep -E "^NAME=|^VERSION="
NAME="Fedora Linux"
VERSION="38 (Workstation Edition Prerelease)"
[root@sirius ff]# uname -a
Linux sirius 6.2.9-300.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Mar 30 22:32:58 UTC 2023 x86_64 GNU/Linux
I downgraded firefox back to firefox-111.0.1-1
[root@sirius ff]# rpm -qa | grep firefox
firefox-langpacks-111.0.1-1.fc38.x86_64
firefox-111.0.1-1.fc38.x86_64
firefox-debugsource-111.0.1-1.fc38.x86_64
firefox-debuginfo-111.0.1-1.fc38.x86_64
# create new firefox profile expt2
# Start firefox profile expt2 via profile choose dialog
[gana@sirius ~]$ firefox -P
Segmentation fault (core dumped)
[gana@sirius ~]$ date
Thu Apr 6 05:23:33 PM IST 2023
[gana@sirius ~]$ coredumpctl list
TIME PID UID GID SIG COREFILE EXE SIZE
:
Thu 2023-04-06 17:23:27 IST 57649 1000 1000 SIGSEGV present /usr/lib64/firefox/firefox 5.6M
[gana@sirius ~]$ coredumpctl debug 57649
Attached 20230406_firefox_profile_coredumpbt001.txt
Created attachment 1956076 [details] bt001 coredump on running firefox -P see corresponding prev comment.
This is the second type of SEGV, which happens after a Firefox window whose profile was started directly is closed.
[gana@sirius ~]$ date
Thu Apr 6 05:41:37 PM IST 2023
[gana@sirius ~]$ firefox -P expt2
[ERROR viaduct::backend::ffi] Missing HTTP status
[ERROR viaduct::backend::ffi] Missing HTTP status
Segmentation fault (core dumped)
[gana@sirius ~]$ coredumpctl list
TIME PID UID GID SIG COREFILE EXE SIZE
:
Thu 2023-04-06 17:42:15 IST 59779 1000 1000 SIGSEGV present /usr/lib64/firefox/firefox 22.6M
[gana@sirius ~]$ coredumpctl debug 59779
Attached 20230406_firefox_profile_coredumpbt002.txt
The backtrace seems longer, which may be expected because one is closing a whole Firefox window with a tab that came up.
Created attachment 1956079 [details] bt002 coredump on running firefox -P expt2 see prev corresponding comment btw, the description for the previously uploaded attachment should have been "coredump on running firefox -P" without the profile name I'll see if I can correct, but if not maybe we're stuck with it
(In reply to Martin Stransky from comment #5)
> It may be a variant of https://bugzilla.mozilla.org/show_bug.cgi?id=1826583
>
> MOZ_CRASH Reason: ```warning: queue 0x7f0821b867c0 destroyed while proxies still attached:```
That's a warning though, not a crash.
The bug does not say, but I assume this is with wayland 1.22.
This was added with commit https://gitlab.freedesktop.org/wayland/wayland/-/commit/0ba650202 to warn about leaks on destruction.
This was later relaxed with https://gitlab.freedesktop.org/wayland/wayland/-/commit/b01a85dfd5 but will still warn if the wl_proxy is destroyed after the wl_display.
So either way, that's a bug in Firefox, not Wayland.
Yup, wayland-1.22. FF was working on fedora-38.
Wayland/Mesa is what mainly got updated in the recent "dnf update" yesterday or so.
The last few dnf history info entries are included below.
[gana@sirius ~]$ rpm -qa | grep wayland
xisxwayland-2-2.fc38.x86_64
gnome-session-wayland-session-44.0-1.fc38.x86_64
qt6-qtwayland-6.4.3-1.fc38.x86_64
qt5-qtwayland-5.15.8-6.fc38.x86_64
xorg-x11-server-Xwayland-22.1.9-1.fc38.x86_64
libwayland-client-1.22.0-1.fc38.x86_64
libwayland-cursor-1.22.0-1.fc38.x86_64
libwayland-server-1.22.0-1.fc38.x86_64
libwayland-egl-1.22.0-1.fc38.x86_64
libwayland-client-1.22.0-1.fc38.i686
libwayland-cursor-1.22.0-1.fc38.i686
libwayland-server-1.22.0-1.fc38.i686
libwayland-egl-1.22.0-1.fc38.i686
[gana@sirius ~]$
[root@sirius ff]# dnf history info 435
Transaction ID : 435
Begin time : Wed 05 Apr 2023 07:17:37 PM IST
Begin rpmdb : 0c1c9fed73869e0d5bc2c56bcd70c3d88ced271fe2ab80bec2e4c20544a533bd
End time : Wed 05 Apr 2023 07:17:39 PM IST (2 seconds)
End rpmdb : 0119dd2a445a4ff4fb1741e93c4100db56acd8405d59f2a53664b2eee958f677
User : Ganapathi Kamath <gana>
Return-Code : Success
Releasever : 38
Command Line : update -y
Comment :
Packages Altered:
    Upgrade  nushell-0.78.0-1.fc38.x86_64 @copr:copr.fedorainfracloud.org:atim:nushell
    Upgraded nushell-0.77.1-1.fc38.x86_64 @@System
    Upgrade  libwayland-client-1.22.0-1.fc38.i686 @updates-testing
    Upgraded libwayland-client-1.21.0-2.fc38.i686 @@System
    Upgrade  libwayland-client-1.22.0-1.fc38.x86_64 @updates-testing
    Upgraded libwayland-client-1.21.0-2.fc38.x86_64 @@System
    Upgrade  libwayland-cursor-1.22.0-1.fc38.i686 @updates-testing
    Upgraded libwayland-cursor-1.21.0-2.fc38.i686 @@System
    Upgrade  libwayland-cursor-1.22.0-1.fc38.x86_64 @updates-testing
    Upgraded libwayland-cursor-1.21.0-2.fc38.x86_64 @@System
    Upgrade  libwayland-egl-1.22.0-1.fc38.i686 @updates-testing
    Upgraded libwayland-egl-1.21.0-2.fc38.i686 @@System
    Upgrade  libwayland-egl-1.22.0-1.fc38.x86_64 @updates-testing
    Upgraded libwayland-egl-1.21.0-2.fc38.x86_64 @@System
    Upgrade  libwayland-server-1.22.0-1.fc38.i686 @updates-testing
    Upgraded libwayland-server-1.21.0-2.fc38.i686 @@System
    Upgrade  libwayland-server-1.22.0-1.fc38.x86_64 @updates-testing
    Upgraded libwayland-server-1.21.0-2.fc38.x86_64 @@System
    Upgrade  zchunk-libs-1.3.1-1.fc38.x86_64 @updates-testing
    Upgraded zchunk-libs-1.3.0-1.fc38.x86_64 @@System
[root@sirius ff]# dnf history info 434
Transaction ID : 434
Begin time : Wed 05 Apr 2023 11:14:00 AM IST
Begin rpmdb : 9174cf621b9c14273feca5221146c06dca81b10ee5ddb99e33c394f05dfb6fa8
End time : Wed 05 Apr 2023 11:14:18 AM IST (18 seconds)
End rpmdb : 0c1c9fed73869e0d5bc2c56bcd70c3d88ced271fe2ab80bec2e4c20544a533bd
User : Ganapathi Kamath <gana>
Return-Code : Success
Releasever : 38
Command Line : update -y
Comment :
Packages Altered:
    Upgrade  atmel-firmware-1.3-29.fc38.noarch @fedora
    Upgraded atmel-firmware-1.3-28.fc38.noarch @@System
    Upgrade  zd1211-firmware-1.5-13.fc38.noarch @fedora
    Upgraded zd1211-firmware-1.5-12.fc38.noarch @@System
    Upgrade  clang-libs-16.0.0-2.fc38.x86_64 @updates-testing
    Upgraded clang-libs-15.0.7-2.fc38.x86_64 @@System
    Upgrade  clang-resource-filesystem-16.0.0-2.fc38.x86_64 @updates-testing
    Upgraded clang-resource-filesystem-15.0.7-2.fc38.x86_64 @@System
    Upgrade  compiler-rt-16.0.0-1.fc38.x86_64 @updates-testing
    Upgraded compiler-rt-15.0.7-2.fc38.x86_64 @@System
    Upgrade  createrepo_c-0.21.1-1.fc38.x86_64 @updates-testing
    Upgraded createrepo_c-0.20.1-2.fc38.x86_64 @@System
    Upgrade  createrepo_c-libs-0.21.1-1.fc38.x86_64 @updates-testing
    Upgraded createrepo_c-libs-0.20.1-2.fc38.x86_64 @@System
    Upgrade  cups-browsed-1:2.0~b4-1.fc38.x86_64 @updates-testing
    Upgraded cups-browsed-1:2.0~b3-2.fc38.x86_64 @@System
    Upgrade  doublecmd-common-1.0.11-1.fc38.x86_64 @updates-testing
    Upgraded doublecmd-common-1.0.10-1.fc38.x86_64 @@System
    Upgrade  doublecmd-gtk-1.0.11-1.fc38.x86_64 @updates-testing
    Upgraded doublecmd-gtk-1.0.10-1.fc38.x86_64 @@System
    Upgrade  epiphany-runtime-1:44.1-1.fc38.x86_64 @updates-testing
    Upgraded epiphany-runtime-1:44.0-1.fc38.x86_64 @@System
    Upgrade  grilo-plugins-0.3.16-1.fc38.x86_64 @updates-testing
    Upgraded grilo-plugins-0.3.15-4.fc38.x86_64 @@System
    Upgrade  hwdata-0.369-1.fc38.noarch @updates-testing
    Upgraded hwdata-0.368-1.fc38.noarch @@System
    Upgrade  libclc-16.0.0-1.fc38.x86_64 @updates-testing
    Upgraded libclc-15.0.7-2.fc38.x86_64 @@System
    Upgrade  libomp-16.0.0-1.fc38.x86_64 @updates-testing
    Upgraded libomp-15.0.7-4.fc38.x86_64 @@System
    Upgrade  libomp-devel-16.0.0-1.fc38.x86_64 @updates-testing
    Upgraded libomp-devel-15.0.7-4.fc38.x86_64 @@System
    Upgrade  llvm-libs-16.0.0-2.fc38.i686 @updates-testing
    Upgraded llvm-libs-15.0.7-2.fc38.i686 @@System
    Upgrade  llvm-libs-16.0.0-2.fc38.x86_64 @updates-testing
    Upgraded llvm-libs-15.0.7-2.fc38.x86_64 @@System
    Upgrade  mesa-dri-drivers-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-dri-drivers-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-dri-drivers-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-dri-drivers-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-filesystem-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-filesystem-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-filesystem-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-filesystem-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-libEGL-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-libEGL-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-libEGL-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-libEGL-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-libGL-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-libGL-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-libGL-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-libGL-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-libOSMesa-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-libOSMesa-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-libOSMesa-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-libOSMesa-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-libOpenCL-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-libOpenCL-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-libgbm-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-libgbm-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-libgbm-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-libgbm-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-libglapi-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-libglapi-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-libglapi-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-libglapi-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-libxatracker-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-libxatracker-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-va-drivers-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-va-drivers-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-va-drivers-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-va-drivers-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-vdpau-drivers-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-vdpau-drivers-23.0.1-1.fc38.x86_64 @@System
    Upgrade  mesa-vulkan-drivers-23.0.1-2.fc38.i686 @updates-testing
    Upgraded mesa-vulkan-drivers-23.0.1-1.fc38.i686 @@System
    Upgrade  mesa-vulkan-drivers-23.0.1-2.fc38.x86_64 @updates-testing
    Upgraded mesa-vulkan-drivers-23.0.1-1.fc38.x86_64 @@System
    Upgrade  spirv-llvm-translator-16.0.0-1.fc38.x86_64 @updates-testing
    Upgraded spirv-llvm-translator-15.0.0-3.fc38.x86_64 @@System
    Upgrade  xorg-x11-drv-libinput-1.3.0-1.fc38.x86_64 @updates-testing
    Upgraded xorg-x11-drv-libinput-1.2.1-3.fc38.x86_64 @@System
[root@sirius ff]# dnf history info 433
Transaction ID : 433
Begin time : Wed 05 Apr 2023 12:44:29 AM IST
Begin rpmdb : e16dcc37c20bce4f03fd80666b084434d6d97c4cde9ad74ee46bafee4475a2c8
End time : Wed 05 Apr 2023 12:44:53 AM IST (24 seconds)
End rpmdb : 9174cf621b9c14273feca5221146c06dca81b10ee5ddb99e33c394f05dfb6fa8
User : Ganapathi Kamath <gana>
Return-Code : Success
Releasever : 38
Command Line : update -y
Comment :
Packages Altered:
    Upgrade  container-selinux-2:2.209.0-1.fc38.noarch @updates-testing
    Upgraded container-selinux-2:2.206.0-1.fc38.noarch @@System
    Upgrade  dnsmasq-2.89-2.fc38.x86_64 @updates-testing
    Upgraded dnsmasq-2.89-1.fc38.x86_64 @@System
    Upgrade  fedora-release-common-38-0.33.noarch @updates-testing
    Upgraded fedora-release-common-38-0.32.noarch @@System
    Upgrade  fedora-release-identity-workstation-38-0.33.noarch @updates-testing
    Upgraded fedora-release-identity-workstation-38-0.32.noarch @@System
    Upgrade  fedora-release-workstation-38-0.33.noarch @updates-testing
    Upgraded fedora-release-workstation-38-0.32.noarch @@System
    Upgrade  flexiblas-3.3.1-1.fc38.x86_64 @updates-testing
    Upgraded flexiblas-3.3.0-2.fc38.x86_64 @@System
    Upgrade  flexiblas-netlib-3.3.1-1.fc38.x86_64 @updates-testing
    Upgraded flexiblas-netlib-3.3.0-2.fc38.x86_64 @@System
    Upgrade  flexiblas-openblas-openmp-3.3.1-1.fc38.x86_64 @updates-testing
    Upgraded flexiblas-openblas-openmp-3.3.0-2.fc38.x86_64 @@System
    Upgrade  fluidsynth-libs-2.3.2-1.fc38.x86_64 @updates-testing
    Upgraded fluidsynth-libs-2.3.1-2.fc38.x86_64 @@System
    Upgrade  gdb-13.1-3.fc38.x86_64 @updates-testing
    Upgraded gdb-13.1-2.fc38.x86_64 @@System
    Upgrade  gdb-headless-13.1-3.fc38.x86_64 @updates-testing
    Upgraded gdb-headless-13.1-2.fc38.x86_64 @@System
    Upgrade  ghostscript-10.01.0-2.fc38.x86_64 @updates-testing
    Upgraded ghostscript-10.01.0-1.fc38.x86_64 @@System
    Upgrade  ghostscript-tools-fonts-10.01.0-2.fc38.x86_64 @updates-testing
    Upgraded ghostscript-tools-fonts-10.01.0-1.fc38.x86_64 @@System
    Upgrade  ghostscript-tools-printing-10.01.0-2.fc38.x86_64 @updates-testing
    Upgraded ghostscript-tools-printing-10.01.0-1.fc38.x86_64 @@System
    Upgrade  ibus-typing-booster-2.22.2-1.fc38.noarch @updates-testing
    Upgraded ibus-typing-booster-2.22.1-1.fc38.noarch @@System
    Upgrade  libgs-10.01.0-2.fc38.x86_64 @updates-testing
    Upgraded libgs-10.01.0-1.fc38.x86_64 @@System
    Upgrade  man-pages-6.04-1.fc38.noarch @updates-testing
    Upgraded man-pages-6.03-3.fc38.noarch @@System
    Upgrade  podman-5:4.4.4-3.fc38.x86_64 @updates-testing
    Upgraded podman-5:4.4.3-1.fc38.x86_64 @@System
    Upgrade  podman-gvproxy-5:4.4.4-3.fc38.x86_64 @updates-testing
    Upgraded podman-gvproxy-5:4.4.3-1.fc38.x86_64 @@System
    Upgrade  podman-plugins-5:4.4.4-3.fc38.x86_64 @updates-testing
    Upgraded podman-plugins-5:4.4.3-1.fc38.x86_64 @@System
    Upgrade  qt6-qtbase-6.4.3-2.fc38.x86_64 @updates-testing
    Upgraded qt6-qtbase-6.4.3-1.fc38.x86_64 @@System
    Upgrade  qt6-qtbase-common-6.4.3-2.fc38.noarch @updates-testing
    Upgraded qt6-qtbase-common-6.4.3-1.fc38.noarch @@System
    Upgrade  qt6-qtbase-gui-6.4.3-2.fc38.x86_64 @updates-testing
    Upgraded qt6-qtbase-gui-6.4.3-1.fc38.x86_64 @@System
    Upgrade  rtl-sdr-0.6.0^20230403git142325a9-1.fc38.x86_64 @updates-testing
    Upgraded rtl-sdr-0.6.0-13.fc38.x86_64 @@System
    Upgrade  rygel-0.42.2-1.fc38.x86_64 @updates-testing
    Upgraded rygel-0.42.1-1.fc38.x86_64 @@System
    Upgrade  skopeo-1:1.11.2-1.fc38.x86_64 @updates-testing
    Upgraded skopeo-1:1.11.1-1.fc38.x86_64 @@System
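To see at a glance which packages changed version in a transaction (for example libwayland going from 1.21.0 to 1.22.0), the Upgrade/Upgraded pairs in `dnf history info` output like the above can be matched up. The following is a minimal Python sketch, assuming every entry has the shape "action, name-version-release.arch, repository tag" seen in this listing; `split_nevra` and `upgraded_packages` are hypothetical helper names, not dnf APIs.

```python
import re

# "Upgrade <new package> @repo" is paired with "Upgraded <old package> @@System".
ENTRY = re.compile(r"\b(Upgrade|Upgraded)\s+(\S+)\s+@\S+")


def split_nevra(nevra: str):
    """Split name-[epoch:]version-release.arch into (name, version-release, arch)."""
    rest, arch = nevra.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, f"{version}-{release}", arch


def upgraded_packages(text: str):
    """Map (name, arch) to (old version-release, new version-release)."""
    new, old = {}, {}
    for action, nevra in ENTRY.findall(text):
        name, version_release, arch = split_nevra(nevra)
        target = new if action == "Upgrade" else old
        target[(name, arch)] = version_release
    # Keep only packages for which both halves of the pair were found.
    return {key: (old[key], new[key]) for key in new if key in old}


if __name__ == "__main__":
    sample = (
        "Upgrade libwayland-client-1.22.0-1.fc38.x86_64 @updates-testing "
        "Upgraded libwayland-client-1.21.0-2.fc38.x86_64 @@System"
    )
    for (name, arch), (before, after) in upgraded_packages(sample).items():
        print(f"{name}.{arch}: {before} -> {after}")
```

Because the pattern only relies on whitespace between tokens, it works whether the entries sit on separate lines or have been run together on one line.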
(In reply to Olivier Fourdan from comment #11)
> (In reply to Martin Stransky from comment #5)
> > It may be a variant of https://bugzilla.mozilla.org/show_bug.cgi?id=1826583
> >
> > MOZ_CRASH Reason: ```warning: queue 0x7f0821b867c0 destroyed while proxies still attached:```
>
> That's a warning though, not a crash.
>
> The bug does not tell, but I assume this is with wayland 1.22.
>
> This has been added with commit
> https://gitlab.freedesktop.org/wayland/wayland/-/commit/0ba650202 to warn
> about leaks on desctruction.
>
> This was later relaxed with
> https://gitlab.freedesktop.org/wayland/wayland/-/commit/b01a85dfd5 but will
> still warn if the wl_proxy is destroyed after the wl_display.
>
> So either way, that's a bug in Firefox, not Wayland.
Well, looks like there's the same bug right in the wayland package testsuite itself:
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007f0899e6eec3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007f0899e1ea76 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f0899e087fc in __GI_abort () at abort.c:79
#4 0x00007f089a004d11 in wl_abort (fmt=fmt@entry=0x7f089a009e10 "Tried to add event to destroyed queue\n") at ../src/wayland-util.c:462
#5 0x00007f089a006f27 in queue_event (len=<optimized out>, display=0x5592f8f1ca20) at ../src/wayland-client.c:1575
#6 read_events (display=0x5592f8f1ca20) at ../src/wayland-client.c:1670
#7 wl_display_read_events (display=display@entry=0x5592f8f1ca20) at ../src/wayland-client.c:1753
#8 0x00007f089a007289 in wl_display_dispatch_queue (queue=<optimized out>, display=<optimized out>) at ../src/wayland-client.c:1992
#9 wl_display_dispatch_queue (display=display@entry=0x5592f8f1ca20, queue=queue@entry=0x5592f8f1cb10) at ../src/wayland-client.c:1960
#10 0x00007f089a0083cf in wl_display_roundtrip_queue (display=display@entry=0x5592f8f1ca20, queue=queue@entry=0x5592f8f1cb10) at ../src/wayland-client.c:1403
#11 0x00007f089a008410 in wl_display_roundtrip (display=display@entry=0x5592f8f1ca20) at ../src/wayland-client.c:1432
#12 0x00005592f7653d0f in client_test_queue_proxy_event_to_destroyed_queue () at ../tests/queue-test.c:428
#13 0x00005592f7654f1d in run_client (data=0x0, log_fd=13, client_pipe=<optimized out>, wayland_sock=<optimized out>, client_main=0x5592f7653c70 <client_test_queue_proxy_event_to_destroyed_queue>) at ../tests/test-compositor.c:196
#14 display_create_client (d=d@entry=0x5592f8f1c300, client_main=client_main@entry=0x5592f7653c70 <client_test_queue_proxy_event_to_destroyed_queue>, name=name@entry=0x5592f7656c70 "client_test_queue_proxy_event_to_destroyed_queue", data=0x0) at ../tests/test-compositor.c:248
#15 0x00005592f76553e7 in client_create_with_name (data=0x0, name=0x5592f7656c70 "client_test_queue_proxy_event_to_destroyed_queue", client_main=0x5592f7653c70 <client_test_queue_proxy_event_to_destroyed_queue>, d=0x5592f8f1c300) at ../tests/test-compositor.c:296
#16 queue_proxy_event_to_destroyed_queue () at ../tests/queue-test.c:575
#17 0x00005592f7654597 in run_test (t=t@entry=0x5592f765a040 <testqueue_proxy_event_to_destroyed_queue>) at ../tests/test-runner.c:159
#18 0x00005592f7653090 in main (argc=<optimized out>, argv=<optimized out>) at ../tests/test-runner.c:359
Just build wayland package without optimization and then launch 'queue-test'. It's marked as 'passed' but it crashes.
It's surely possible that it's a bug in Firefox but I'm surprised I see that in wayland testsuite itself.
I then updated the kernel and rebooted. The new graphics libraries become active on reboot.
I usually prevent automatic dnf updates of the kernel with an exclude line in /etc/dnf/dnf.conf, which I manually comment or uncomment as needed.
[root@sirius ff]# cat /etc/dnf/dnf.conf | grep ^exc
exclude=VirtualBox,VirtualBox-server,akmod-VirtualBox,VirtualBox-kmod-common,VirtualBox-kmodsrc,kernel*,kmod*,mock*,qemu*
[root@sirius ff]# dnf history info 437
Transaction ID : 437
Begin time : Wed 05 Apr 2023 08:51:54 PM IST
Begin rpmdb : 64c0cdaa3bd88dd8ec16d44778a93130d1087bcaff07a2c60f2586a2ff8ba96a
End time : Wed 05 Apr 2023 08:52:04 PM IST (10 seconds)
End rpmdb : 06a407e07b1ce23aef522e81c36455c0128b9ff5d6b16471775784b2bda02b50
User : System <unset>
Return-Code : Success
Releasever : 38
Command Line : -y install --disablerepo=* /tmp/akmods.oVVg90Nz/results/kmod-VirtualBox-6.2.9-300.fc38.x86_64-7.0.6-1.fc38.x86_64.rpm
Comment :
Packages Altered:
    Install kmod-VirtualBox-6.2.9-300.fc38.x86_64-7.0.6-1.fc38.x86_64 @@commandline
[root@sirius ff]# dnf history info 436
Transaction ID : 436
Begin time : Wed 05 Apr 2023 08:49:46 PM IST
Begin rpmdb : 058ebb081bdff9856fa267c43bbcd1de8f87e91882cf033b4fb79012bfe90670
End time : Wed 05 Apr 2023 08:51:44 PM IST (118 seconds)
End rpmdb : 64c0cdaa3bd88dd8ec16d44778a93130d1087bcaff07a2c60f2586a2ff8ba96a
User : Ganapathi Kamath <gana>
Return-Code : Success
Releasever : 38
Command Line : update -y
Comment :
Packages Altered:
    Install  kernel-6.2.9-300.fc38.x86_64 @fedora
    Install  kernel-core-6.2.9-300.fc38.x86_64 @fedora
    Install  kernel-devel-6.2.9-300.fc38.x86_64 @fedora
    Install  kernel-modules-6.2.9-300.fc38.x86_64 @fedora
    Install  kernel-modules-core-6.2.9-300.fc38.x86_64 @fedora
    Install  kernel-modules-extra-6.2.9-300.fc38.x86_64 @fedora
    Install  kernel-modules-internal-6.2.9-300.fc38.x86_64 @fedora
    Upgrade  kernel-devel-matched-6.2.9-300.fc38.x86_64 @fedora
    Upgraded kernel-devel-matched-6.2.8-300.fc38.x86_64 @@System
(In reply to Martin Stransky from comment #13)
> Just build wayland package without optimization and then launch
> 'queue-test'. It's marked as 'passed' but it crashes.
> It's surely possible that it's a bug in Firefox but I'm surprised I see that
> in wayland testsuite itself.
Well, no, that's precisely what the test suite checks.
That `client_test_queue_proxy_event_to_destroyed_queue()` in the backtrace is part of the commit I linked above (https://gitlab.freedesktop.org/wayland/wayland/-/commit/b01a85dfd5); it is there precisely to check that the warning is raised in the case it is supposed to cover.
To be clear, the warning is what the test suite wants to check, and it triggers an abort() (the SIGABRT in the coredump), so that's normal. It does not mean that the code in Wayland is wrong; it's the opposite. But it does mean that Firefox triggering the same issue is a bug in Firefox.
I think there are two problems actually:
(gdb) bt
#0 0x00007fb6e1aafb94 in __pthread_kill_implementation () at /lib64/libc.so.6
#1 0x00007fb6e1a5eaee in raise () at /lib64/libc.so.6
#2 0x00007fb6da6f0c90 in nsProfileLock::FatalSignalHandler(int, siginfo_t*, void*) () at /usr/lib64/firefox/libxul.so
#3 0x00007fb6e1a5eba0 in <signal handler called> () at /lib64/libc.so.6
#4 0x00007fb6d92ef379 in mozilla::widget::WlCrashHandler(char const*, __va_list_tag*) () at /usr/lib64/firefox/libxul.so
#5 0x00007fb6dfa6ac9e in wl_log () at /lib64/libwayland-client.so.0
#6 0x00007fb6dfa6c854 in wl_display_prepare_read_queue () at /lib64/libwayland-client.so.0
#7 0x00007fb6dfa6c929 in wl_display_dispatch_queue_pending () at /lib64/libwayland-client.so.0
#8 0x00007fb6b235bd84 in dri2_teardown_wayland () at /lib64/libEGL_mesa.so.0
#9 0x00007fb6b2350e58 in dri2_display_destroy () at /lib64/libEGL_mesa.so.0
#10 0x00007fb6b2351380 in dri2_terminate () at /lib64/libEGL_mesa.so.0
#11 0x00007fb6b233eabc in eglTerminate () at /lib64/libEGL_mesa.so.0
#12 0x00007fb6d6f2470e in mozilla::gl::EglDisplay::~EglDisplay() () at /usr/lib64/firefox/libxul.so
#13 0x00007fb6d6bdc44e in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release_last_use_cold() () at /usr/lib64/firefox/libxul.so
#14 0x00007fb6d6f3e82e in mozilla::gl::GLContextEGL::~GLContextEGL() () at /usr/lib64/firefox/libxul.so
#15 0x00007fb6d6f3ea2d in mozilla::gl::GLContextEGL::~GLContextEGL() () at /usr/lib64/firefox/libxul.so
#16 0x00007fb6d7213135 in mozilla::wr::RenderThread::ShutDownTask() () at /usr/lib64/firefox/libxul.so
#17 0x00007fb6d6b9233e in mozilla::detail::runnable_args_base<(mozilla::detail::RunnableResult)0>::Run() () at /usr/lib64/firefox/libxul.so
#18 0x00007fb6d65308d4 in nsThread::ProcessNextEvent(bool, bool*) () at /usr/lib64/firefox/libxul.so
#19 0x00007fb6d651d7ef in NS_ProcessNextEvent(nsIThread*, bool) () at /usr/lib64/firefox/libxul.so
#20 0x00007fb6d6bf1dca in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) () at /usr/lib64/firefox/libxul.so
#21 0x00007fb6d6bacbe2 in MessageLoop::Run() () at /usr/lib64/firefox/libxul.so
#22 0x00007fb6d65300ea in nsThread::ThreadFunc(void*) () at /usr/lib64/firefox/libxul.so
#23 0x00007fb6e0ee8759 in _pt_root () at /lib64/libnspr4.so
#24 0x00007fb6e1aadc57 in start_thread () at /lib64/libc.so.6
#25 0x00007fb6e1b33a70 in clone3 () at /lib64/libc.so.6
1. Firefox is calling `wl_display_prepare_read_queue()` after the queue has been destroyed, hence causing the warning in libwayland (new in 1.22).
2. wl_log() triggers mozilla::widget::WlCrashHandler() causing the crash
Possibly fixing the second point could be enough to avoid the crash (yet the use-after-free would still be there, from point 1)
There, in widget/gtk/nsWaylandDisplay.cpp:
  nsWaylandDisplay::nsWaylandDisplay(wl_display* aDisplay)
      : mThreadId(PR_GetCurrentThread()), mDisplay(aDisplay) {
    // GTK sets the log handler on display creation, thus we overwrite it here
    // in a similar fashion
    wl_log_set_handler_client(WlCrashHandler);
That basically means that whenever Wayland logs something, Firefox calls WlCrashHandler(), which crashes Firefox:
  static void WlCrashHandler(const char* format, va_list args) {
    MOZ_CRASH_UNSAFE(g_strdup_vprintf(format, args));
  }
So that warning about a possible use-after-free from Wayland ends up being fatal to Firefox.
Also, the problem with crashing on any log message like that is that it hides further log messages. That's unfortunate because Wayland will list the proxies that are still attached, but we have no way to see them since the first message already crashed Firefox.
Can you please demote the handler set with wl_log_set_handler_client() in Firefox widget/gtk/nsWaylandDisplay.cpp to just print to stderr without crashing Firefox, to see whether that helps and to get the list of proxies still attached?
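The point about hidden messages can be illustrated with a small simulation. This is a toy Python model, not libwayland or Firefox code: `emit_queue_warning`, `crashing_handler` and `collecting_handler` are hypothetical stand-ins for the library's logging call and for two possible client log handlers, and the queue address and proxy name are taken from the test output quoted earlier in this report. It only shows why a handler that dies on the first message never sees the per-proxy lines that follow.

```python
class SimulatedCrash(Exception):
    """Stands in for the process being killed inside the log handler."""


def crashing_handler(message, seen):
    # Records the message, then "crashes", like a handler that aborts on any log.
    seen.append(message)
    raise SimulatedCrash(message)


def collecting_handler(message, seen):
    # Records the message and lets the caller carry on.
    seen.append(message)


def emit_queue_warning(handler, seen, queue, proxies):
    """Mimic the message sequence seen in this report: a header, then one line per proxy."""
    handler(f"warning: queue {queue} destroyed while proxies still attached:", seen)
    for proxy in proxies:
        handler(f"  {proxy} still attached", seen)


def run(handler):
    """Return every message the handler got to see before the run ended."""
    seen = []
    try:
        emit_queue_warning(handler, seen, "0x55563dc54b20", ["wl_callback@2"])
    except SimulatedCrash:
        pass  # the simulated process is gone; later messages were never emitted
    return seen


if __name__ == "__main__":
    print("crashing handler saw:  ", run(crashing_handler))
    print("collecting handler saw:", run(collecting_handler))
```

With the crashing handler only the header line is recorded; with the collecting handler the list of attached proxies is recorded as well.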
Meanwhile, I might have a mitigation patch...
See https://gitlab.freedesktop.org/wayland/wayland/-/merge_requests/308
Scratch build for F38 here: https://koji.fedoraproject.org/koji/taskinfo?taskID=99595611
That seems to work here: no more crash, and we can see the proxies that are still attached:
warning: queue 0x7fc6f3b69790 destroyed while proxies still attached:
  wl_display@1 still attached
Please give that scratch build a try.
Yeah, looks like I failed to get rid of it :D
Olivier, the log handler is there to catch Wayland server errors: when wayland-client quits the application, that quit doesn't produce a crash. So such a quit is invisible to us and we can't fix it on the Firefox side.
So we decided to add a crash handler to the log, to crash when we're going to be terminated anyway. For instance this one:
"wl_surface@292: error 2: Buffer size (1067x120) must be an integer multiple of the buffer_scale (2)"
quits Firefox, but we need to know about it.
So how to proceed here? Should we check for 'error' or 'warning' strings? How can the client detect whether it's going to be terminated or not?
Thanks.
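The "check the strings" option raised here could look roughly like a small classifier over the two message shapes quoted in this report. This is a Python sketch for illustration only: the patterns are assumptions inferred from those two sample messages, not a documented libwayland log format, and `classify_wayland_log` is a hypothetical name.

```python
import re

# Assumed shape of a protocol error line, based on the single example quoted
# in this report: "<interface>@<id>: error <code>: <text>".
PROTOCOL_ERROR = re.compile(r"^\w+@\d+: error \d+: ")


def classify_wayland_log(message: str) -> str:
    """Return "warning", "protocol-error" or "other" for one log message."""
    message = message.strip()
    if message.startswith("warning:"):
        return "warning"
    if PROTOCOL_ERROR.match(message):
        return "protocol-error"
    return "other"


if __name__ == "__main__":
    samples = [
        "warning: queue 0x7f0821b867c0 destroyed while proxies still attached:",
        "wl_surface@292: error 2: Buffer size (1067x120) must be an integer "
        "multiple of the buffer_scale (2)",
    ]
    for sample in samples:
        print(classify_wayland_log(sample), "<-", sample)
```

Matching on message text is fragile, since log wording can change between library versions, so this is best read as a picture of the question rather than a recommended fix.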
FEDORA-2023-78c8016658 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-78c8016658
FEDORA-2023-78c8016658 has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
Happy to report ff starts and closes without the SEGV
[gana@sirius ~]$ rpm -qa | grep firefox
firefox-langpacks-112.0-2.fc38.x86_64
firefox-112.0-2.fc38.x86_64
firefox-debugsource-112.0-2.fc38.x86_64
firefox-debuginfo-112.0-2.fc38.x86_64
[gana@sirius ~]$ firefox -P
warning: queue 0x7f9b59f61bb0 destroyed while proxies still attached:
  wl_display@1 still attached
[ERROR glean_core] Error setting metrics feature config: Json(Error("EOF while parsing a value", line: 1, column: 0))
[ERROR viaduct::backend::ffi] Missing HTTP status
[ERROR viaduct::backend::ffi] Missing HTTP status
warning: queue 0x7fc13cbd86a0 destroyed while proxies still attached:
  wl_display@1 still attached
[gana@sirius ~]$
[gana@sirius ~]$ firefox -P expt2
[ERROR glean_core] Error setting metrics feature config: Json(Error("EOF while parsing a value", line: 1, column: 0))
[ERROR viaduct::backend::ffi] Missing HTTP status
[ERROR viaduct::backend::ffi] Missing HTTP status
warning: queue 0x7f68835d4ac0 destroyed while proxies still attached:
  wl_display@1 still attached
(In reply to Martin Stransky from comment #24)
> Olivier, the log handler was here to check Wayland server errors when
> wayland-client quits the application but that quit doesn't produce a crash.
> So such quit is invisible for us and we can't fix that on Firefox side.
>
> So we decided to add crash handler to log to crash when we're going to be
> terminated anyway. For instance this one:
>
> "wl_surface@292: error 2: Buffer size (1067x120) must be an integer multiple
> of the buffer_scale (2)"
>
> quits Firefox but we need to know about it. So how to proceed here? Should
> we check 'error' or 'warning' strings? How can client detect that it's going
> to be terminated or not?
I am not entirely sure what you mean by an error not producing a crash. All Wayland protocol errors will terminate the Wayland connection from the server (compositor) side, so surely the client will notice and eventually terminate?
FWIW, Xwayland does the same: it calls FatalError() from its wl_log_set_handler_client() handler. That's how I knew about the problem (and I forgot to post that patch upstream).
FTR, I have now closed my mitigation MR in Wayland upstream, clients are not supposed to terminate on wl_log().

I see the comment in Firefox refers to GTK, indeed GTK was also doing that [1] but that was fixed in 2015 [2] by Ray.

[1] https://gitlab.gnome.org/GNOME/gtk/-/commit/4252ac6d6ce
[2] https://gitlab.gnome.org/GNOME/gtk/-/commit/f4d2022d46e
(In reply to Olivier Fourdan from comment #29)
> FTR, I have now closed my mitigation MR in Wayland upstream, clients are not
> supposed to terminate on wl_log().
>
> I see the comment in Firefox refers to GTK, indeed GTK was also doing that
> [1] but that was fixed in 2015 [2] by Ray.
>
> [1] https://gitlab.gnome.org/GNOME/gtk/-/commit/4252ac6d6ce
> [2] https://gitlab.gnome.org/GNOME/gtk/-/commit/f4d2022d46e

Yes, I understand it. Recent (supposed) scenario is:

1) There's a wayland protocol error (wrong client behaviour, etc.)
2) Wayland client library will log that
3) Application itself is terminated after that by exit() from client-libwayland library.

So an application can install an error handler to catch Wayland log errors and act on them. The problem is that the termination is not detected by Firefox: from Firefox's perspective it looks like a clean exit. OTOH, when the crash handler is invoked from the Wayland error log, we can process it at crash-stats.mozilla.org and learn that Firefox did something wrong, like this one:

https://crash-stats.mozilla.org/report/index/186b31c5-5b58-4ee6-9f96-9b8850230405

see: "MOZ_CRASH Reason (Sanitized) warning: queue 0x7f9c47a69be0 destroyed while proxies still attached:"

so how to handle it? Is there any better way how to handle it?

And if application is going to be terminated anyway, why not to just crash from log handler and post relevant data about crash/wayland error to diagnose?
(In reply to Martin Stransky from comment #30)
>
> Yes, I understand it. Recent (supposed) scenario is:
>
> 1) There's a wayland protocol error (wrong client behaviour, etc.)
> 2) Wayland client library will log that
> 3) Application itself is terminated after that by exit() from
> client-libwayland library.

But it is not a clean exit, I do not see client-libwayland calling exit() on behalf of the client.

When an error occurs, either display_fatal_error() [1] or display_protocol_error() [2] is called from display_handle_error() [3].

After that, wl_display_read_events() will return -1 and it's game over.

> […]
>
> so how to handle it? Is there any better way how to handle it?

If wl_display_read_events() [4] returns -1, you can check the actual error with wl_display_get_error() [5].

> And if application is going to be terminated anyway, why not to just crash
> from log handler and post relevant data about crash/wayland error to
> diagnose?

That "queue destroyed while proxies still attached" message is an example of a wl_log() that is not about a Wayland protocol violation and hence does not terminate the client, so the client should not just crash from the log handler.

[1] https://gitlab.freedesktop.org/wayland/wayland/-/blob/1.22.0/src/wayland-client.c?ref_type=tags#L138-158
[2] https://gitlab.freedesktop.org/wayland/wayland/-/blob/1.22.0/src/wayland-client.c?ref_type=tags#L160-220
[3] https://gitlab.freedesktop.org/wayland/wayland/-/blob/1.22.0/src/wayland-client.c?ref_type=tags#L1038-1064
[4] https://gitlab.freedesktop.org/wayland/wayland/-/blob/1.22.0/src/wayland-client.c?ref_type=tags#L1703-1758
[5] https://gitlab.freedesktop.org/wayland/wayland/-/blob/1.22.0/src/wayland-client.c?ref_type=tags#L2086-2111
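To make the wl_display_get_error() suggestion concrete, here is a minimal sketch of how a client might sort the value it gets back. It assumes the value is 0 or an errno code, with protocol errors reported as EPROTO. To keep the sketch buildable without the libwayland headers, it takes the integer as a parameter instead of calling wl_display_get_error() itself. The enum and the function name are hypothetical, not part of libwayland.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical categories for this sketch; not part of libwayland. */
enum display_failure {
    DISPLAY_OK,             /* no error recorded on the display */
    DISPLAY_PROTOCOL_ERROR, /* compositor reported a protocol violation */
    DISPLAY_OTHER_FATAL     /* anything else, e.g. a broken connection */
};

/*
 * 'err' stands in for the return value of wl_display_get_error(display),
 * fetched after wl_display_read_events() or a dispatch call returned -1.
 * Assumption: protocol errors show up here as EPROTO.
 */
static enum display_failure classify_display_error(int err)
{
    assert(err >= 0); /* errno codes are never negative */

    if (err == 0)
        return DISPLAY_OK;
    if (err == EPROTO)
        return DISPLAY_PROTOCOL_ERROR;
    return DISPLAY_OTHER_FATAL;
}
```

In the protocol-error case, a real client could go on to call wl_display_get_protocol_error() to obtain the interface, object id and error code, and attach those to whatever report it files.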
Thanks. I see the termination comes from wl_abort(), which is fatal: it calls wl_log_handler() and then abort(). But there's also wl_log(), which calls wl_log_handler() and is not fatal. Unfortunately both go through the same handler, so on the client side we can't tell which of the two invoked it.
(In reply to Olivier Fourdan from comment #31)
> (In reply to Martin Stransky from comment #30)
> >
> > Yes, I understand it. Recent (supposed) scenario is:
> >
> > 1) There's a wayland protocol error (wrong client behaviour, etc.)
> > 2) Wayland client library will log that
> > 3) Application itself is terminated after that by exit() from
> > client-libwayland library.
>
> But it is not a clean exit, I do not see client-libwayland calling exit() on
> behalf of the client.
>
> When an error occurs, either display_fatal_error() [1] or
> display_protocol_error() [2] is called from display_handle_error() [3].
>
> After that, wl_display_read_events() will return -1 and it's game over.

Yes, but AFAIK those errors don't lead to immediate Wayland client termination. They're rather propagated back to the application as an error state returned from wl_display_* calls and are caught by Gtk (for instance by _gdk_wayland_display_queue_events()). In that case wl_log_handler() is not called. Unfortunately that means Firefox can't catch them unless we install atexit handlers for it.
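For illustration, the atexit idea could look roughly like the sketch below. All names are hypothetical and this is not Firefox code. One caveat that matters here: atexit handlers run only when the process ends through exit() or a return from main(). They do not run on _exit(), abort() or a fatal signal, so this only helps if the toolkit's error path really uses exit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

static bool clean_shutdown = false;
static bool monitor_installed = false;

/* Hypothetical hook: the normal quit path calls this before exiting. */
static void mark_clean_shutdown(void)
{
    clean_shutdown = true;
}

/* True when the process is ending without the normal quit path having run. */
static bool exit_was_unexpected(void)
{
    return !clean_shutdown;
}

/* atexit handler: a real client would record a crash annotation here. */
static void report_unexpected_exit(void)
{
    if (exit_was_unexpected())
        fputs("exit() reached without a normal shutdown\n", stderr);
}

/* Registers the handler once; returns 0 on success, like atexit(). */
static int install_exit_monitor(void)
{
    assert(!monitor_installed); /* registering twice would report twice */

    int rc = atexit(report_unexpected_exit);
    monitor_installed = (rc == 0);
    return rc;
}
```

The flag is the whole trick: any exit that bypasses mark_clean_shutdown() is treated as suspicious, without needing to know which library called exit().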
So it looks like I need to go through the libwayland library, check the wl_abort()/wl_log() calls, and update Firefox's wl_log_handler() handler accordingly, as we really want to know if Firefox is terminated due to a protocol error.
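A rough sketch of what such a handler update might look like, under a loud assumption: it treats any message whose formatted text starts with "warning:" as non-fatal. That heuristic is based only on the "warning: queue ... destroyed while proxies still attached" message seen in this bug. libwayland does not promise any prefix convention, so the list of non-fatal messages would have to come from actually auditing the library. The handler has the same shape as wl_log_func_t (a format string plus a va_list); all other names are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static char last_message[512];
static bool last_was_warning;

/* Assumption: non-fatal libwayland messages start with "warning:". */
static bool message_is_warning(const char *msg)
{
    static const char prefix[] = "warning:";
    return strncmp(msg, prefix, sizeof prefix - 1) == 0;
}

/* Same signature as wl_log_func_t: void (*)(const char *fmt, va_list args). */
static void sketch_log_handler(const char *fmt, va_list args)
{
    assert(fmt != NULL);

    /* Format first, so the decision is made on the final text. */
    vsnprintf(last_message, sizeof last_message, fmt, args);
    last_was_warning = message_is_warning(last_message);

    if (last_was_warning) {
        /* Non-fatal: log it and keep running. */
        fprintf(stderr, "wayland (non-fatal): %s", last_message);
        return;
    }
    /* Possibly fatal: a real client would record a crash annotation here. */
    fprintf(stderr, "wayland (possibly fatal): %s", last_message);
}

/* Helper for trying the handler out without libwayland. */
static void log_via_handler(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    sketch_log_handler(fmt, args);
    va_end(args);
}
```

In a real client the handler would be registered with wl_log_set_handler_client(sketch_log_handler). Note that the sketch deliberately does not crash in either branch; whether the second branch may crash depends on the outcome of the audit described above.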