Bug 1729925

Summary: Frequent crashes of XWayland
Product: [Fedora] Fedora
Reporter: jd.bugzilla
Component: xorg-x11-server
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 30
CC: bskeggs, caillon+fedoraproject, jglisse, john.j5live, ofourdan, pv.bugzilla, rfrank, rhughes, robert.frank, rstrode, sandmann, xgl-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
URL: https://gitlab.freedesktop.org/xorg/xserver/merge_requests/242
Fixed In Version: xorg-x11-server-1.20.5-7.fc30 xorg-x11-server-1.20.5-7.fc31
Last Closed: 2019-09-12 21:08:46 UTC
Type: Bug
Attachments:
journalctl /usr/bin/gnome-shell (flags: none)

Description jd.bugzilla 2019-07-15 10:43:35 UTC
Xwayland crashes frequently on F30. The crash is triggered mostly by clicking Save in the Firefox download dialog. It is also very easy to trigger by doing anything (clicking) in GIMP. Since I spend most of my time in a terminal and Firefox, I don't know whether other programs trigger the same bug.

Relevant backtrace from XWayland:

-----------------------------------------------------------
Thread 1 (Thread 0x7f3221b9da80 (LWP 1504)):
#0  0x00007f322148fe75 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f322147a895 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00005587bce17b20 in OsAbort () at utils.c:1351
No locals.
#3  0x00005587bce1cdb9 in AbortServer () at log.c:879
No locals.
#4  0x00005587bce1dc1a in FatalError (f=f@entry=0x5587bce43970 "Caught signal %d (%s). Server aborting\n") at log.c:1017
        args = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7fff5774af60, reg_save_area = 0x7fff5774ae90}}
        args2 = {{gp_offset = 8, fp_offset = 48, overflow_arg_area = 0x7fff5774af60, reg_save_area = 0x7fff5774ae90}}
        beenhere = 1
#5  0x00005587bce14e69 in OsSigHandler (unused=<optimized out>, sip=0x7fff5774b0b0, signo=11) at osinit.c:156
No locals.
#6  OsSigHandler (signo=11, sip=0x7fff5774b0b0, unused=<optimized out>) at osinit.c:110
No locals.
#7  <signal handler called>
No symbol table info available.
#8  0x00007f322185bef5 in gbm_bo_destroy (bo=0x0) at ../src/gbm/main/gbm.c:439
No locals.
#9  0x00005587bccbd027 in xwl_glamor_gbm_create_pixmap (screen=0x5587bdd96420, width=77, height=32, depth=32, hint=0) at xwayland-glamor-gbm.c:245
        format = <optimized out>
        xwl_screen = <optimized out>
        xwl_gbm = <optimized out>
        bo = 0x0
        pixmap = <optimized out>
#10 0x00005587bcdd9ede in ProcCreatePixmap (client=0x5587be1749c0) at dispatch.c:1440
        pMap = <optimized out>
        pDraw = 0x5587be22b8d0
        stuff = <optimized out>
        pDepth = <optimized out>
        i = <optimized out>
        rc = 0
#11 0x00005587bcddede4 in Dispatch () at dispatch.c:478
        result = <optimized out>
        client = 0x5587be1749c0
        start_tick = 66375
#12 0x00005587bcde2ec4 in dix_main (argc=12, argv=0x7fff5774b788, envp=<optimized out>) at main.c:276
        i = <optimized out>
        alwaysCheckForInput = {0, 1}
#13 0x00007f322147bf33 in __libc_start_main () from /lib64/libc.so.6
No symbol table info available.
#14 0x00005587bccb23ee in _start ()
No symbol table info available.
-----------------------------------------------------------


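It seems that the xwl_glamor_gbm_create_pixmap function calls gbm_bo_destroy with a NULL argument (frame #9 of the backtrace shows bo = 0x0 at line 245, so the `if (!pixmap)` branch runs even though no bo was created):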

-----------------------------------------------------------
/usr/src/debug/xorg-x11-server-1.20.5-4.fc30.x86_64/hw/xwayland/xwayland-glamor-gbm.c

236	        {
237	            bo = gbm_bo_create(xwl_gbm->gbm, width, height, format,
238	                               GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);
239	        }
240	
241	        if (bo)
242	            pixmap = xwl_glamor_gbm_create_pixmap_for_bo(screen, bo, depth);
243	
244	        if (!pixmap)
245	            gbm_bo_destroy(bo);
246	    }
247	
248	    if (!pixmap)
249	        pixmap = glamor_create_pixmap(screen, width, height, depth, hint);
250	
251	    return pixmap;
252	}
-----------------------------------------------------------
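
One way to avoid the NULL call is to destroy the bo only inside the branch where it was successfully created. The following is a minimal, self-contained sketch of that control flow, not the actual change in the merge request. All mock_* names and create_pixmap_guarded are hypothetical stand-ins for the gbm and Xwayland functions, so that the example compiles without libgbm:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct gbm_bo and the server's pixmap type. */
struct mock_bo { int width, height; };
struct mock_pixmap { struct mock_bo *bo; bool fallback; };

static int mock_destroy_calls = 0;

/* Stand-in for gbm_bo_create(): returns NULL when creation "fails". */
static struct mock_bo *mock_bo_create(int width, int height, bool fail)
{
    struct mock_bo *bo;

    if (fail)
        return NULL;
    bo = malloc(sizeof *bo);
    if (bo) {
        bo->width = width;
        bo->height = height;
    }
    return bo;
}

/* Stand-in for gbm_bo_destroy(): it dereferences bo, so NULL is not allowed. */
static void mock_bo_destroy(struct mock_bo *bo)
{
    assert(bo != NULL);
    mock_destroy_calls++;
    bo->width = 0;
    free(bo);
}

/* Stand-in for xwl_glamor_gbm_create_pixmap_for_bo(): may fail and return NULL. */
static struct mock_pixmap *mock_pixmap_for_bo(struct mock_bo *bo, bool fail)
{
    struct mock_pixmap *pixmap;

    if (fail)
        return NULL;
    pixmap = malloc(sizeof *pixmap);
    if (pixmap) {
        pixmap->bo = bo;
        pixmap->fallback = false;
    }
    return pixmap;
}

/* Stand-in for the glamor_create_pixmap() fallback path. */
static struct mock_pixmap *mock_fallback_pixmap(void)
{
    struct mock_pixmap *pixmap = malloc(sizeof *pixmap);

    if (pixmap) {
        pixmap->bo = NULL;
        pixmap->fallback = true;
    }
    return pixmap;
}

/* Guarded flow: the destroy call is reachable only when bo is non-NULL. */
static struct mock_pixmap *create_pixmap_guarded(int width, int height,
                                                 bool bo_fails, bool wrap_fails)
{
    struct mock_pixmap *pixmap = NULL;
    struct mock_bo *bo = mock_bo_create(width, height, bo_fails);

    if (bo) {
        pixmap = mock_pixmap_for_bo(bo, wrap_fails);
        if (!pixmap)
            mock_bo_destroy(bo);
    }

    if (!pixmap)
        pixmap = mock_fallback_pixmap();

    return pixmap;
}
```

With this nesting, a failed bo creation falls straight through to the fallback pixmap, and the bo is released only when wrapping it in a pixmap fails.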

The source of the gbm_bo_destroy function shows that it cannot handle a NULL argument: it dereferences the pointer unconditionally at line 439, which is where the crash happens.

-----------------------------------------------------------
/usr/src/debug/mesa-19.0.8-1.fc30.x86_64/src/gbm/main/gbm.c

436	GBM_EXPORT void
437	gbm_bo_destroy(struct gbm_bo *bo)
438	{
439	   if (bo->destroy_user_data)
440	      bo->destroy_user_data(bo, bo->user_data);
441	
442	   bo->gbm->bo_destroy(bo);
443	}
-----------------------------------------------------------
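
For comparison, a caller that cannot rule out a NULL bo could go through a small wrapper that treats NULL as a no-op, the way free(NULL) does. This is only a sketch under assumptions: the mock_* types mimic the shape of the function quoted above, and safe_bo_destroy is a hypothetical helper, not part of the gbm API:

```c
#include <stddef.h>

struct mock_device;

/* Hypothetical stand-in with the fields the quoted gbm_bo_destroy() reads. */
struct mock_bo {
    struct mock_device *dev;
    void *user_data;
    void (*destroy_user_data)(struct mock_bo *bo, void *data);
};

struct mock_device {
    void (*bo_destroy)(struct mock_bo *bo);
};

static int backend_destroy_calls = 0;
static int user_data_destroy_calls = 0;

/* Counts backend destroy calls instead of freeing real resources. */
static void backend_destroy(struct mock_bo *bo)
{
    (void)bo;
    backend_destroy_calls++;
}

/* Counts user-data destroy callbacks. */
static void user_data_destroy(struct mock_bo *bo, void *data)
{
    (void)bo;
    (void)data;
    user_data_destroy_calls++;
}

/* Same shape as the quoted function: reads bo-> fields unconditionally. */
static void mock_bo_destroy(struct mock_bo *bo)
{
    if (bo->destroy_user_data)
        bo->destroy_user_data(bo, bo->user_data);

    bo->dev->bo_destroy(bo);
}

/* Hypothetical caller-side helper: does nothing when bo is NULL. */
static void safe_bo_destroy(struct mock_bo *bo)
{
    if (bo == NULL)
        return;

    mock_bo_destroy(bo);
}
```

Whether the check belongs in the caller or in a wrapper like this is a design choice; the sketch only illustrates that the check has to happen before the first `bo->` access.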

Comment 1 Olivier Fourdan 2019-07-23 08:57:47 UTC
Humm, right, many thanks for the detailed bug report!

Yet I wonder why `xwl_glamor_gbm_create_pixmap_for_bo` would fail frequently in your case. Can you check the journalctl logs for gnome-shell (journalctl /usr/bin/gnome-shell) to see if you spot any relevant message from either Xwayland or gnome-shell/mutter?

Comment 2 Olivier Fourdan 2019-07-23 09:15:07 UTC
Meanwhile, I filed https://gitlab.freedesktop.org/xorg/xserver/merge_requests/242 to address the issue

Comment 3 jd.bugzilla 2019-07-24 10:06:49 UTC
(In reply to Olivier Fourdan from comment #1)
> Humm, right, many thanks for the detailed bug report!
> 
> Yet I wonder why `xwl_glamor_gbm_create_pixmap_for_bo` would fail frequently
> in your case. Can you check the journalctl logs for gnome-shell (journalctl
> /usr/bin/gnome-shell) to see if you spot any relevant message from either
> Xwayland or gnome-shell/mutter?

Thank you for dealing with this annoying bug.

It is not so frequent anymore, but it still happens occasionally.
It's strange: I tried to trigger the bug the other day by using GIMP extensively (10+ minutes) and nothing happened. Yet on another day I opened an image with GIMP, and just as I clicked on the image, Xwayland crashed.

Dunno what is causing it, I don't see anything overtly suspicious in the logs. I've attached the output of journalctl /usr/bin/gnome-shell from the last boot in which the crash happened.

Perhaps timer errors like these:

libinput error: client bug: timer event6 debounce: offset negative (-6ms)

Comment 4 jd.bugzilla 2019-07-24 10:08:45 UTC
Created attachment 1593092 [details]
journalctl /usr/bin/gnome-shell

Comment 5 Olivier Fourdan 2019-07-24 10:13:53 UTC
Humm, nope, you're right, nothing obvious in the logs (the libinput error is unrelated).

Anyway, the fix is now upstream, so the next step will be to backport it to the stable branch as well and respin a Fedora build.

Comment 6 jd.bugzilla 2019-07-24 10:24:32 UTC
(In reply to Olivier Fourdan from comment #5)
> Humm, nope, you're right, nothing obvious in the logs (the libinput error is
> unrelated).
> 
> Anyway, the fix is now upstream, so the next step will be to backport it to
> the stable branch as well and respin a Fedora build.

I see, thank you.

Comment 7 Phil V 2019-08-08 14:28:42 UTC
Also seeing sudden total crashes of GNOME-Wayland.

Comment 8 Fedora Update System 2019-09-10 16:59:59 UTC
FEDORA-2019-6c6582bafc has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2019-6c6582bafc

Comment 9 Fedora Update System 2019-09-10 17:26:13 UTC
FEDORA-2019-21b97145cb has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-21b97145cb

Comment 10 Fedora Update System 2019-09-11 15:36:48 UTC
xorg-x11-server-1.20.5-7.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-6c6582bafc

Comment 11 Fedora Update System 2019-09-12 21:08:46 UTC
xorg-x11-server-1.20.5-7.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2019-09-18 00:02:47 UTC
xorg-x11-server-1.20.5-7.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.