Bug 1174257 - DRI3 causes any GL process on remote X11 display to hang indefinitely in glXChooseVisual()
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: mesa
Version: 23
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1221168 1324505 1335183 1338069
Depends On:
Blocks:
Reported: 2014-12-15 13:57 UTC by Michael Stahl
Modified: 2019-08-19 07:37 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-20 13:05:06 UTC


Attachments
Fix DRI3 over ssh (7.06 KB, patch), 2016-02-09 19:29 UTC, poma
os: Treat ssh as a non-local client (v4) (4.87 KB, patch), 2016-06-15 13:45 UTC, poma

Description Michael Stahl 2014-12-15 13:57:59 UTC
Description of problem:

starting glxgears with a remote X11 display hangs indefinitely
because libGL calls dri3_create_screen() even though
the display is remote.

with workaround LIBGL_DRI3_DISABLE=1 glxgears works fine


Version-Release number of selected component (if applicable):

mesa-libGL-10.3.5-1.20141207.fc21.x86_64
mesa-dri-drivers-10.3.5-1.20141207.fc21.x86_64


How reproducible:

always


Steps to Reproduce:
1. ssh -X somebox
2. gdb --args glxgears
3. run

Actual results:

no window shown, hangs forever in:

(gdb) bt
#0  0x00000030ec4f51c0 in __poll_nocancel () from /lib64/libc.so.6
#1  0x00000030f080a182 in _xcb_conn_wait () from /lib64/libxcb.so.1
#2  0x00000030f080ba8f in wait_for_reply () from /lib64/libxcb.so.1
#3  0x00000030f080bba1 in xcb_wait_for_reply () from /lib64/libxcb.so.1
#4  0x00000030fb04e869 in dri3_create_screen () from /lib64/libGL.so.1
#5  0x00000030fb01f9a1 in __glXInitialize () from /lib64/libGL.so.1
#6  0x00000030fb01c0ab in GetGLXPrivScreenConfig.part.2 () from /lib64/libGL.so.1
#7  0x00000030fb01c923 in glXChooseVisual () from /lib64/libGL.so.1
#8  0x0000000000403699 in make_window.constprop ()
#9  0x0000000000401a07 in main ()


Expected results:

some spinning wheels


Additional info:
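
Because the hang never returns, a reproduction script can treat a timeout as the failure signal. The following is only a minimal sketch, assuming the coreutils timeout command is installed; the helper name hangs_within is hypothetical and not part of mesa or glx-utils.

```shell
#!/bin/sh
# Hypothetical helper (not from mesa or glx-utils): succeeds (returns 0)
# if the given command is still running after SECS seconds, i.e. it
# looks hung.
hangs_within() {
    secs=$1
    shift
    status=0
    # coreutils timeout exits with status 124 when it had to kill the command.
    timeout "$secs" "$@" >/dev/null 2>&1 || status=$?
    [ "$status" -eq 124 ]
}
```

Inside the ssh -X session this could be used as: hangs_within 10 glxinfo && echo "looks hung". glxinfo is a better probe than glxgears here because it normally exits on its own.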

Comment 1 Alexander Larsson 2014-12-15 13:59:46 UTC
Seems also to be the cause of this Xwayland/Gtk issue:
https://bugzilla.gnome.org/show_bug.cgi?id=741011

Comment 2 Daniel Stone 2014-12-15 21:59:01 UTC
The fix should be easy enough, but I have no way of testing this at the moment. Anyone want to give this a spin?

diff --git a/dri3/dri3_request.c b/dri3/dri3_request.c
index fe45620..1249070 100644
--- a/dri3/dri3_request.c
+++ b/dri3/dri3_request.c
@@ -92,6 +92,9 @@ proc_dri3_open(ClientPtr client)

     REQUEST_SIZE_MATCH(xDRI3OpenReq);

+    if (!client->local)
+       return BadMatch;
+
     status = dixLookupDrawable(&drawable, stuff->drawable, client, 0, DixReadAccess);
     if (status != Success)
         return status;

Comment 3 Michael Stahl 2014-12-16 16:47:45 UTC
after doodling about for a bit, i've now got an F21 chroot with RPMs
that include the patch from comment #2. running that with systemd-nspawn -b
and doing ssh -X into it, the problem is still there, just in a different place:

(gdb) bt
#0  0x00007ffff722e1c0 in __poll_nocancel () from /lib64/libc.so.6
#1  0x00007ffff4e3c182 in _xcb_conn_wait () from /lib64/libxcb.so.1
#2  0x00007ffff4e3da8f in wait_for_reply () from /lib64/libxcb.so.1
#3  0x00007ffff4e3dba1 in xcb_wait_for_reply () from /lib64/libxcb.so.1
#4  0x00007ffff7b8fa19 in dri3_open (provider=0, root=178, dpy=<optimized out>)
    at dri3_glx.c:1670
#5  dri3_create_screen (screen=0, priv=<optimized out>) at dri3_glx.c:1909
#6  0x00007ffff7b60aa1 in AllocAndFetchScreenConfigs (priv=0x612240, 
    dpy=0x606010) at glxext.c:788
#7  __glXInitialize (dpy=dpy@entry=0x606010) at glxext.c:902
#8  0x00007ffff7b5d1ab in GetGLXPrivScreenConfig (dpy=dpy@entry=0x606010, 
    scrn=scrn@entry=0, ppriv=ppriv@entry=0x7fffffffdda0, 
    ppsc=ppsc@entry=0x7fffffffdda8) at glxcmds.c:172
#9  0x00007ffff7b5da23 in GetGLXPrivScreenConfig (ppsc=0x7fffffffdda8, 
    ppriv=0x7fffffffdda0, scrn=0, dpy=0x606010) at glxcmds.c:168
#10 glXChooseVisual (dpy=0x606010, screen=0, attribList=0x7fffffffe000)
    at glxcmds.c:1249
#11 0x0000000000403699 in make_window.constprop ()
#12 0x0000000000401a07 in main ()

Comment 4 Richard W.M. Jones 2015-05-27 18:59:02 UTC
*** Bug 1221168 has been marked as a duplicate of this bug. ***

Comment 5 Richard W.M. Jones 2015-05-27 19:02:37 UTC
Any further fix for this?  It affects virt-viewer which uses GLX
and is often used remote over ssh.

Comment 6 Jason Tibbitts 2015-06-26 23:05:13 UTC
I'm having a substantially similar issue.  Logged into two different F21 machines, if I ssh to another machine (F21 or F22) and run, say, teapot or glxinfo, it will hang pretty much immediately.  This is also breaking things like Matlab and Mathematica.

In addition to LIBGL_DRI3_DISABLE=1, LIBGL_ALWAYS_SOFTWARE=1 or LIBGL_ALWAYS_INDIRECT=1 also work (as is probably obvious).
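
To compare the three workarounds one at a time without exporting anything into the login shell, a small wrapper along these lines may help. This is a sketch only: the function name run_gl_workaround and its mode names are made up for illustration; the environment variables are the three named above.

```shell
#!/bin/sh
# Hypothetical wrapper; the function and mode names are illustrative.
# Each mode sets exactly one of the variables mentioned in this report,
# and only for the command being run.
run_gl_workaround() {
    mode=$1
    shift
    case $mode in
        dri3-off) env LIBGL_DRI3_DISABLE=1 "$@" ;;
        software) env LIBGL_ALWAYS_SOFTWARE=1 "$@" ;;
        indirect) env LIBGL_ALWAYS_INDIRECT=1 "$@" ;;
        *)
            echo "usage: run_gl_workaround dri3-off|software|indirect cmd..." >&2
            return 2
            ;;
    esac
}
```

For example: run_gl_workaround software glxinfo. Because the variable is passed through env, it does not leak into later commands in the same shell.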

I have, however, had this work when logging into a freshly installed F22 as a test user and then ssh'ing into the same machines where it failed.  I need to try and understand why it worked at all.

Comment 7 Jan Kurik 2015-07-15 14:35:58 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.

(As we did not run this process for some time, it could affect also pre-Fedora 23 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23

Comment 8 poma 2015-12-03 19:29:33 UTC
... firefox, thunderbird, ...

Comment 9 poma 2015-12-04 05:30:49 UTC
Also happens on Rawhide/Fedora 24.

Comment 10 poma 2015-12-04 05:36:00 UTC
Michael, did you report the bug upstream?

Comment 11 Jan Kurik 2015-12-04 08:01:54 UTC
(In reply to poma from comment #9)
> Also happens on Rawhide/Fedora 24.

What is the question for me?

Comment 12 Igor Gnatenko 2015-12-04 08:04:45 UTC
(In reply to poma from comment #9)
> Also happens on Rawhide/Fedora 24.

(In reply to poma from comment #10)
> Michael, did you report the bug upstream?

Why Michael? This bug was already fixed somewhere. If you cannot attach backtrace - please close bug.

Comment 13 Michael Stahl 2015-12-04 12:01:00 UTC
no i didn't file an upstream bug for this (maybe somebody else did?).

on Fedora 23 i don't have this problem - but i don't know if Fedora 23 actually enables DRI3 by default and how to check that.

Comment 14 poma 2015-12-05 13:25:18 UTC
(In reply to Michael Stahl from comment #13)
> no i didn't file an upstream bug for this (maybe somebody else did?).
> 
> on Fedora 23 i don't have this problem - but i don't know if Fedora 23
> actually enables DRI3 by default and how to check that.


As far as I know, DRI3 is not the default.

$ LIBGL_DEBUG=verbose vblank_mode=0 glxgears

using the following configuration:

/etc/X11/xorg.conf.d/nouveau-dri3.conf
Section "Device"
	Identifier  "nvidia0"
	Driver      "nouveau"
	Option      "DRI" "3"
EndSection

Xorg.0.log:
NOUVEAU(0): DRI3 on EXA enabled
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

/etc/X11/xorg.conf.d/radeon-dri3.conf
Section "Device"
	Identifier  "amd0"
	Driver      "radeon"
	Option      "DRI3" "on"
EndSection

Xorg.0.log:
RADEON(0): DRI3 enabled
~~~~~~~~~~~~~~~~~~~~~~~

/etc/X11/xorg.conf.d/intel-dri3.conf
Section "Device"
	Identifier  "intel0"
	Driver      "intel"
	Option      "DRI" "3"
EndSection

Xorg.0.log:
intel(0): direct rendering: DRI2 DRI3 enabled
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
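
To check for log lines like the ones quoted above, something like the following could be used. It is a sketch under assumptions: the helper name dri3_log_lines is hypothetical, and the default log path is a guess that does not hold everywhere (some setups write the log under ~/.local/share/xorg/ instead).

```shell
#!/bin/sh
# Hypothetical helper: print the lines of an Xorg log that mention DRI3.
# The default path is an assumption; pass the real log file as $1 if it
# lives elsewhere.
dri3_log_lines() {
    log=${1:-/var/log/Xorg.0.log}
    # grep exits non-zero when nothing matches, so the return status also
    # tells the caller whether DRI3 was mentioned at all.
    grep 'DRI3' "$log"
}
```

For example: dri3_log_lines || echo "no DRI3 lines found". No output only means the driver did not log anything about DRI3, which is weaker than proof that it is disabled.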

Comment 15 poma 2015-12-05 13:26:03 UTC
GL/DRI3 over ssh broken
https://bugs.freedesktop.org/show_bug.cgi?id=93261

Comment 16 poma 2015-12-11 12:35:21 UTC
Resolved in xserver.

Comment 17 poma 2016-02-09 19:29:27 UTC
Created attachment 1122503 [details]
Fix DRI3 over ssh

Reported-by: Michael Stahl <mstahl@redhat.com>
Suggested-by: Daniel Stone <daniel@fooishbar.org>
Tested-by: poma <pomidorabelisima@gmail.com>

Comment 18 poma 2016-02-09 19:33:54 UTC
$ rpm --query --changelog xorg-x11-server-Xorg-1.18.1-2.fc22.x86_64
* Tue Feb 09 2016 poma <poma@gmail.com> 1.18.1-2
- dri3: Refuse to work for remote clients (v2)

* Mon Feb 08 2016 Adam Jackson <ajax@redhat.com> 1.18.1-1
- xserver 1.18.1

* Mon Jan 04 2016 poma <poma@gmail.com> 1.18.0-5
- os: Treat ssh as a non-local client (v3)

* Fri Dec 11 2015 poma <poma@gmail.com> 1.18.0-4
- dri3: Refuse to work for remote clients

* Fri Dec 11 2015 poma <poma@gmail.com> 1.18.0-3
- os: Treat ssh as a non-local client v2.1

...

Comment 19 Stephan Bergmann 2016-04-07 10:13:25 UTC
*** Bug 1324505 has been marked as a duplicate of this bug. ***

Comment 20 Rex Dieter 2016-05-11 13:48:54 UTC
*** Bug 1335183 has been marked as a duplicate of this bug. ***

Comment 21 Rex Dieter 2016-05-21 16:55:54 UTC
*** Bug 1338069 has been marked as a duplicate of this bug. ***

Comment 22 Mauro Carvalho Chehab 2016-06-08 11:41:33 UTC
This bug is still happening on Fedora 23 with:
   xorg-x11-server-Xorg-1.18.3-2.fc23.x86_64

I tested with 2 different remote machines, one with Fedora 23 and another one with Fedora 24. With Fedora 23, the workaround of disabling DRI3 works.

However, on the other machine with Fedora 23, disabling DRI3 makes the window appear, but then the program crashes:

$ LIBGL_DRI3_DISABLE=1 LIBGL_DEBUG=verbose vblank_mode=0 glxgears
libGL error: failed to authenticate magic 7
libGL error: failed to load driver: i965
libGL: OpenDriver: trying /usr/lib64/dri/tls/swrast_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/swrast_dri.so
ATTENTION: default value of option vblank_mode overridden by environment.
Illegal instruction (core dumped)

Comment 23 Martin Gregorie 2016-06-08 12:57:21 UTC
The workaround of disabling DRI3 does not work for LibreOffice 5.0.6.2 under Fedora 23.

Comment 24 poma 2016-06-15 13:45:14 UTC
Created attachment 1168389 [details]
os: Treat ssh as a non-local client (v4)

The missing component to fix DRI3 over ssh (nouveau).

Comment 25 poma 2016-06-15 13:56:23 UTC
(In reply to Martin Gregorie from comment #23)
> The workaround of disabling DRI3 does not work for LibreOffice 5.0.6.2 under
> Fedora 23.

firefox, thunderbird, libreoffice, ..., work OK


however, glxgears (glx-utils-8.3.0-3.fc22.x86_64) over ssh (nouveau) is broken in general, i.e. the failure is not specific to DRI2 or DRI3
...
*** Error in `glxgears': free(): invalid pointer: 0x00007f75ed26c920 ***
======= Backtrace: =========
...

Comment 26 Martin Gregorie 2016-06-17 22:57:25 UTC
It's now working as of this evening's update, which delivered LibreOffice build 5.0.6.2-7.fc23, so I've now added 'export LIBGL_DRI3_DISABLE=1' to .bash_profile in the remote logins.

Thanks for the fix.
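
A variant of that .bash_profile line that only applies the workaround to ssh logins might look like this. It is a sketch, assuming the remote login goes through OpenSSH, which sets SSH_CONNECTION for the session; the function name disable_dri3_over_ssh is made up for illustration.

```shell
# Sketch for ~/.bash_profile on the remote machine.
# Assumption: logins arrive via OpenSSH, which sets SSH_CONNECTION.
# Local (console or display manager) logins leave DRI3 untouched.
disable_dri3_over_ssh() {
    if [ -n "${SSH_CONNECTION:-}" ]; then
        export LIBGL_DRI3_DISABLE=1
    fi
}
disable_dri3_over_ssh
```

This keeps DRI3 available when sitting at the machine itself and disables it only where this bug shows up.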

Comment 27 poma 2016-07-21 08:11:30 UTC
The last missed bit is part of released xserver 1.18.4
https://cgit.freedesktop.org/xorg/xserver/commit/?h=server-1.18-branch&id=3c4cead

xorg-x11-server-1.18.4-1.fc24
https://bodhi.fedoraproject.org/updates/FEDORA-2016-a0ad95d1d7

Comment 28 Fedora End Of Life 2016-11-24 11:20:19 UTC
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '23'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 23 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 29 Fedora End Of Life 2016-12-20 13:05:06 UTC
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

