Bug 1842473 - webkit2gtk segfault on wayland
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: egl-wayland
Version: 31
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: leigh scott
QA Contact: Fedora Extras Quality Assurance
Reported: 2020-06-01 11:05 UTC by Carlos Mogas da Silva
Modified: 2020-08-31 16:28 UTC (History)

Fixed In Version: egl-wayland-1.1.5-3.fc31 egl-wayland-1.1.5-3.fc32 egl-wayland-1.1.5-3.el7
Last Closed: 2020-08-24 01:05:57 UTC
Type: Bug


Links
System: Github
ID: NVIDIA egl-wayland issues 27
Private: 0
Priority: None
Status: closed
Summary: failed to lock pthread mutex
Last Updated: 2020-08-14 16:05:54 UTC

Description Carlos Mogas da Silva 2020-06-01 11:05:40 UTC
Description of problem:
While running Wayland on an NVIDIA card (using the proprietary driver), Evolution crashes right after opening (I think it's because it's trying to display an email).


Version-Release number of selected component (if applicable):
evolution-3.34.4-1.fc31
webkit2gtk3-2.28.2-1.fc31

How reproducible: Every time


Steps to Reproduce:
1. Install the NVIDIA proprietary driver
2. Allow gdm and gnome-shell to use Wayland
3. Try to run Evolution under Wayland

Actual results:
Evolution crashes with this stack trace:
                Stack trace of thread 7930:
                #0  0x00007fa2c6a51625 raise (libc.so.6)
                #1  0x00007fa2c6a3a8d9 abort (libc.so.6)
                #2  0x00007fa2c6a3a7a9 __assert_fail_base.cold (libc.so.6)
                #3  0x00007fa2c6a49a66 __assert_fail (libc.so.6)
                #4  0x00007fa2a54d5b7d wlExternalApiLock (libnvidia-egl-wayland.so.1)
                #5  0x00007fa2a54da4ab wlEglGetInternalHandleExport (libnvidia-egl-wayland.so.1)
                #6  0x00007fa2a58584ef n/a (libEGL_nvidia.so.0)
                #7  0x00007fa2a57dfeeb n/a (libEGL_nvidia.so.0)
                #8  0x00007fa2a54d7752 wl_eglstream_display_bind (libnvidia-egl-wayland.so.1)
                #9  0x00007fa2a54d6355 wlEglBindDisplaysHook (libnvidia-egl-wayland.so.1)
                #10 0x00007fa2a58543f3 n/a (libEGL_nvidia.so.0)
                #11 0x00007fa2a57dc775 n/a (libEGL_nvidia.so.0)
                #12 0x00007fa2c50aab11 _ZN2WS8Instance10initializeEPv (libWPEBackend-fdo-1.0.so.1)
                #13 0x00007fa2c797cbf6 _ZN6WebKit14WebProcessPool28platformInitializeWebProcessERKNS_15WebProcessProxyERNS_28WebProcessCreationParametersE (libwebkit2gtk-4.0.so.37)
                #14 0x00007fa2c784fdfa _ZN6WebKit14WebProcessPool23initializeNewWebProcessERNS_15WebProcessProxyEPNS_16WebsiteDataStoreENS1_11IsPrewarmedE (libwebkit2gtk-4.0.so.37)
                #15 0x00007fa2c7850f17 _ZN6WebKit14WebProcessPool19createNewWebProcessEPNS_16WebsiteDataStoreENS_15WebProcessProxy11IsPrewarmedE (libwebkit2gtk-4.0.so.37)
                #16 0x00007fa2c785166d _ZN6WebKit14WebProcessPool27processForRegistrableDomainERNS_16WebsiteDataStoreEPNS_12WebPageProxyERKN7WebCore17RegistrableDomainE (libwebkit2gtk-4.0.so.37)
                #17 0x00007fa2c7851757 _ZN6WebKit12WebPageProxy13launchProcessERKN7WebCore17RegistrableDomainENS0_19ProcessLaunchReasonE (libwebkit2gtk-4.0.so.37)
                #18 0x00007fa2c78552ce _ZN6WebKit12WebPageProxy8loadDataERKN3IPC13DataReferenceERKN3WTF6StringES8_S8_PN3API6ObjectEN7WebCore28ShouldOpenExternalURLsPolicyE (libwebkit2gtk-4.0.so.37)
                #19 0x00007fa2c78f11a0 webkit_web_view_load_bytes (libwebkit2gtk-4.0.so.37)
                #20 0x00007fa2c6db9c90 web_view_load_string (libevolution-util.so)
                #21 0x00007fa2bd4b0cc9 mail_reader_set_folder (libevolution-mail.so)
                #22 0x00007fa2bd49f53c mail_paned_view_set_folder (libevolution-mail.so)
                #23 0x00007fa2bd2a6314 mail_shell_view_got_folder_cb (module-mail.so)
                #24 0x00007fa2ca2f670a g_task_return_now (libgio-2.0.so.0)
                #25 0x00007fa2ca2f674d complete_in_idle_cb (libgio-2.0.so.0)
                #26 0x00007fa2cacb2e8b g_idle_dispatch (libglib-2.0.so.0)
                #27 0x00007fa2cacb6570 g_main_context_dispatch (libglib-2.0.so.0)
                #28 0x00007fa2cacb6900 g_main_context_iterate.isra.0 (libglib-2.0.so.0)
                #29 0x00007fa2cacb6bf3 g_main_loop_run (libglib-2.0.so.0)
                #30 0x00007fa2ca7a043d gtk_main (libgtk-3.so.0)
                #31 0x0000556c6206278d main (evolution)
                #32 0x00007fa2c6a3c1a3 __libc_start_main (libc.so.6)
                #33 0x0000556c620628ee _start (evolution)
                



Expected results:
Evolution should start and work normally.


Additional info:
I looked in the Evolution bug tracker and found this bug report [1], which says that webkitgtk is the culprit and that version 2.29.1 *should* fix the issue (not guaranteed). That version is only in Rawhide at the moment, so I don't know whether it's possible to upgrade the F31/F32 version.

[1] https://gitlab.gnome.org/GNOME/evolution/-/issues/927

Comment 1 Michael Catanzaro 2020-06-01 13:55:21 UTC
I don't see any evidence that this would be fixed in 2.29.1. I won't upgrade F31/F32 to unstable WebKit anyway. If it's really fixed in 2.29.1, which I doubt, then we'd need to identify the related commit and request it be backported to the next 2.28 release.

Anyway, to make progress, please post a proper backtrace taken with gdb 'bt full', showing where in libnvidia-egl-wayland the crash occurs. You're lucky that component is open source, because otherwise this would be CANTFIX.

Comment 2 Carlos Mogas da Silva 2020-06-01 17:03:01 UTC
The webkit2gtk3 part is huge, so I pasted the complete "bt full" output here: https://l.r3pek.org/288be

These are just the first 15 frames (#0 to #14).

(gdb) bt full
#0  0x00007ffff3a9c625 in raise () at /lib64/libc.so.6
#1  0x00007ffff3a858d9 in abort () at /lib64/libc.so.6
#2  0x00007ffff3a857a9 in _nl_load_domain.cold () at /lib64/libc.so.6
#3  0x00007ffff3a94a66 in annobin_assert.c_end () at /lib64/libc.so.6
#4  0x00007fffe4109b7d in wlExternalApiLock () at ../src/wayland-thread.c:87
        __PRETTY_FUNCTION__ = "wlExternalApiLock"
#5  0x00007fffe410e4ab in wlEglGetInternalHandleExport (dpy=0x5555566dad60, type=13233, handle=0x5555566dad60) at ../src/wayland-eglhandle.c:146
#6  0x00007fffd65574ef in  () at /lib64/libEGL_nvidia.so.0
#7  0x00007fffd64deeeb in  () at /lib64/libEGL_nvidia.so.0
#8  0x00007fffe410b752 in wl_eglstream_display_bind (data=data@entry=0x5555566cc5c0, wlDisplay=wlDisplay@entry=0x55555649b360, eglDisplay=eglDisplay@entry=0x5555566dad60)
    at ../src/wayland-eglstream-server.c:311
        wlStreamDpy = 0x555556b69f90
        exts = 0x0
        env = 0x0
#9  0x00007fffe410a355 in wlEglBindDisplaysHook (data=0x5555566cc5c0, dpy=0x5555566dad60, nativeDpy=0x55555649b360) at ../src/wayland-egldisplay.c:87
        res = 0
#10 0x00007fffd65533f3 in  () at /lib64/libEGL_nvidia.so.0
#11 0x00007fffd64db775 in  () at /lib64/libEGL_nvidia.so.0
#12 0x00007ffff20f5b11 in WS::Instance::initialize(void*) () at /lib64/libWPEBackend-fdo-1.0.so.1
#13 0x00007ffff49c7bf6 in WebKit::WebProcessPool::platformInitializeWebProcess(WebKit::WebProcessProxy const&, WebKit::WebProcessCreationParameters&) (this=this@entry=0x7fffe42ee000, process=
    ..., parameters=...) at ../Source/WebKit/UIProcess/glib/WebProcessPoolGLib.cpp:119
#14 0x00007ffff489adfa in WebKit::WebProcessPool::initializeNewWebProcess(WebKit::WebProcessProxy&, WebKit::WebsiteDataStore*, WebKit::WebProcessProxy::IsPrewarmed)
    (this=<optimized out>, process=..., websiteDataStore=0x7fffe42e4000, isPrewarmed=WebKit::WebProcessProxy::IsPrewarmed::No) at ../Source/WebKit/UIProcess/WebProcessPool.cpp:1044
        initializationActivity = {m_ref = std::unique_ptr<WebKit::ProcessThrottler::Activity<(WebKit::ProcessThrottler::ActivityType)0>> = {get() = 0x0}}
        parameters = <snip here>


Looks like the open-source part of the driver is having trouble with locking? The relevant code from wlExternalApiLock (../src/wayland-thread.c, frame #4) is below:
    if (!wlMutexInitialized || pthread_mutex_lock(&wlMutex)) {
        assert(!"failed to lock pthread mutex");
        return -1;
    }
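
The quoted check can fail for two different reasons: the `wlMutexInitialized` flag is false, or `pthread_mutex_lock` returns a non-zero error code. Either way the `assert` aborts the process, which matches the raise/abort/__assert_fail frames at the top of the trace, and the backtrace alone does not say which of the two happened. As a minimal sketch of how the two cases could be told apart, the following uses hypothetical `demo_*` names; it is not the egl-wayland code and says nothing about what the upstream fix changed.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the library's mutex and its "initialized" flag. */
static pthread_mutex_t demo_mutex;
static bool demo_mutex_initialized = false;

/* Distinct results for the two failure causes hidden behind one assert. */
typedef enum {
    LOCK_OK = 0,
    LOCK_NOT_INITIALIZED, /* the flag was never set (init did not run here) */
    LOCK_CALL_FAILED      /* pthread_mutex_lock itself returned an error */
} lock_result;

/* Initialize the mutex once; returns true on success. */
bool demo_mutex_init(void)
{
    if (demo_mutex_initialized) {
        return true;
    }
    if (pthread_mutex_init(&demo_mutex, NULL) != 0) {
        return false;
    }
    demo_mutex_initialized = true;
    return true;
}

/* Same two checks as the quoted snippet, but reported separately. */
lock_result demo_api_lock(void)
{
    if (!demo_mutex_initialized) {
        return LOCK_NOT_INITIALIZED;
    }
    if (pthread_mutex_lock(&demo_mutex) != 0) {
        return LOCK_CALL_FAILED;
    }
    return LOCK_OK;
}

/* Release the lock; returns 0 on success, -1 otherwise. */
int demo_api_unlock(void)
{
    if (!demo_mutex_initialized) {
        return -1;
    }
    return pthread_mutex_unlock(&demo_mutex) == 0 ? 0 : -1;
}
```

Calling `demo_api_lock()` before `demo_mutex_init()` returns `LOCK_NOT_INITIALIZED`; after a successful init it returns `LOCK_OK`. Logging which branch was taken before asserting would narrow down a report like this one.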

Comment 3 Michael Catanzaro 2020-06-01 18:25:19 UTC
One final question before I reassign component: does the crash still occur if you run with the environment variable WEBKIT_FORCE_SANDBOX=0? We just found a sandbox bug that can cause certain syscalls to randomly fail so let's eliminate that potential cause first.

Comment 4 Carlos Mogas da Silva 2020-06-01 18:28:41 UTC
(In reply to Michael Catanzaro from comment #3)
> One final question before I reassign component: does the crash still occur
> if you run with the environment variable WEBKIT_FORCE_SANDBOX=0? We just
> found a sandbox bug that can cause certain syscalls to randomly fail so
> let's eliminate that potential cause first.

Yes, same error.

Comment 5 Michael Catanzaro 2020-06-01 18:33:06 UTC
OK, reassigning to egl-wayland for further diagnosis.

Comment 6 leigh scott 2020-06-01 19:28:52 UTC
I have built the latest version

https://bodhi.fedoraproject.org/updates/FEDORA-2020-be2c4beb82

https://koji.fedoraproject.org/koji/buildinfo?buildID=1519004


If it still reproduces with that build, you will need to report it upstream:

https://github.com/NVIDIA/egl-wayland/issues

Comment 7 Carlos Mogas da Silva 2020-06-01 22:09:38 UTC
Nope, still the same error on the latest version.

Reported upstream.

Comment 8 leigh scott 2020-06-02 10:57:57 UTC
(In reply to Carlos Mogas da Silva from comment #7)
> Nope, still the same error on the latest version.
> 
> Reported upstream.

Thank you for forwarding the issue upstream. Without the debug symbols for libEGL_nvidia.so.0, it is hard to make sense of it.

Comment 9 Carlos Mogas da Silva 2020-08-14 16:06:49 UTC
The upstream bug was closed with a fix applied. They didn't bump the version, but you can pick up the patch if you want ;)

Comment 10 leigh scott 2020-08-14 18:12:03 UTC
(In reply to Carlos Mogas da Silva from comment #9)
> The upstream bug was closed with a fix applied. They didn't bump the version,
> but you can pick up the patch if you want ;)

I have added a comment to the commit:

https://github.com/NVIDIA/egl-wayland/commit/9558ec02d0f7bbf30dc1f9ee4c0b06c9b0c49afe

Comment 11 Fedora Update System 2020-08-15 00:40:14 UTC
FEDORA-2020-6900d113da has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-6900d113da

Comment 12 Fedora Update System 2020-08-15 00:40:14 UTC
FEDORA-EPEL-2020-83d4434be7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-83d4434be7

Comment 13 Fedora Update System 2020-08-15 00:40:15 UTC
FEDORA-2020-fcc03a2706 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-fcc03a2706

Comment 14 Fedora Update System 2020-08-16 01:30:12 UTC
FEDORA-2020-fcc03a2706 has been pushed to the Fedora 32 testing repository.
You will soon be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-fcc03a2706`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-fcc03a2706

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 15 Fedora Update System 2020-08-16 01:36:39 UTC
FEDORA-EPEL-2020-83d4434be7 has been pushed to the Fedora EPEL 7 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2020-83d4434be7

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2020-08-16 01:38:44 UTC
FEDORA-2020-6900d113da has been pushed to the Fedora 31 testing repository.
You will soon be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-6900d113da`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-6900d113da

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Fedora Update System 2020-08-24 01:05:57 UTC
FEDORA-2020-6900d113da has been pushed to the Fedora 31 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 18 Fedora Update System 2020-08-24 01:12:49 UTC
FEDORA-2020-fcc03a2706 has been pushed to the Fedora 32 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 19 Fedora Update System 2020-08-31 16:28:25 UTC
FEDORA-EPEL-2020-83d4434be7 has been pushed to the Fedora EPEL 7 stable repository.
If the problem still persists, please make a note of it in this bug report.

