Bug 1911827 - gdm-3.28.3-35.el8.x86_64 enables Wayland on mgag200, which is broken
Keywords:
Status: CLOSED DUPLICATE of bug 1670273
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: gdm
Version: CentOS Stream
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: 8.0
Assignee: Ray Strode [halfline]
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On: 1670273
Blocks:
Reported: 2020-12-31 15:36 UTC by Tarun Reddy
Modified: 2021-10-22 08:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 14:43:26 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments
glxinfo (114.80 KB, text/plain), 2021-01-07 18:03 UTC, Tarun Reddy
gnome-shell.log (2.61 KB, text/plain), 2021-01-07 18:04 UTC, Tarun Reddy
gnome-shell.log with mouse delay (4.02 KB, text/plain), 2021-01-07 18:08 UTC, Tarun Reddy
gnome-shell after new mutter (52.19 KB, text/plain), 2021-01-12 15:35 UTC, Tarun Reddy
gnome-shell with debug enabled (12.37 KB, text/plain), 2021-01-12 21:01 UTC, Tarun Reddy
gnome-shell with second mutter update (11.63 KB, text/plain), 2021-01-13 17:19 UTC, Tarun Reddy


Links
System  ID     Private  Priority  Status  Summary  Last Updated
CentOS  17987  0        None      None    None     2020-12-31 15:36:30 UTC

Description Tarun Reddy 2020-12-31 15:36:31 UTC
Description of problem:
On CentOS Stream 8 (not sure if this is the right place to report this issue), the latest gdm-3.28.3-35.el8.x86_64 breaks the GUI on my Dell T430 (and presumably on many other server platforms with the MGA G200 server GPU). Namely, it re-enables Wayland on server GPUs, and at least with the Matrox chipset, that breaks things very badly. USB input stops working consistently (I have to unplug and replug the device to get it working again) and gnome-shell spins with high CPU usage all the time.

Reverting to Xorg from Wayland via /etc/gdm/custom.conf fixes everything.


Version-Release number of selected component (if applicable):
gdm-3.28.3-35

How reproducible:


Steps to Reproduce:
1. Install CentOS Stream on a system with a server MGA G200 GPU. Example:
08:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. G200eR2 (rev 01) (prog-if 00 [VGA controller])
    DeviceName: Embedded Video
    Subsystem: Dell Device 063b
    Flags: bus master, medium devsel, latency 64, IRQ 17, NUMA node 0
    Memory at 94000000 (32-bit, prefetchable) [size=16M]
    Memory at 95800000 (32-bit, non-prefetchable) [size=16K]
    Memory at 95000000 (32-bit, non-prefetchable) [size=8M]
    Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
    Capabilities: [dc] Power Management version 1
    Kernel driver in use: mgag200
    Kernel modules: mgag200

2. systemctl set-default graphical.target
3. Update to latest gdm-3.28.3-35.el8.x86_64

Actual results:
Try to use a USB mouse in the GUI locally on the system. It skips and jumps and then stops working altogether. Also note the high CPU usage of gnome-shell.


Expected results:
Smooth USB responsiveness and no gnome-shell CPU spikes

Additional info:
WaylandEnable=false
in /etc/gdm/custom.conf
disables Wayland and fixes the issue.
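
The workaround above can be scripted. What follows is only a minimal Python sketch, not something from the original report: the function name `disable_wayland` is hypothetical, and it assumes that `WaylandEnable` belongs in the `[daemon]` section of custom.conf, as in the stock file. It works on the file's text rather than on the file itself, so it can be tried without root access.

```python
import re

# Matches an active or commented-out WaylandEnable setting.
KEY = re.compile(r"^\s*#?\s*WaylandEnable\s*=")


def disable_wayland(conf_text: str) -> str:
    """Return conf_text with WaylandEnable=false set in the [daemon] section."""
    out = []
    in_daemon = False
    done = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_daemon = stripped == "[daemon]"
            out.append(line)
            if in_daemon and not done:
                # Put the setting directly under the section header.
                out.append("WaylandEnable=false")
                done = True
            continue
        if in_daemon and KEY.match(line):
            # Drop any older or commented-out copy of the setting.
            continue
        out.append(line)
    if not done:
        # No [daemon] section at all: append one.
        out += ["[daemon]", "WaylandEnable=false"]
    return "\n".join(out) + "\n"
```

To apply it, read /etc/gdm/custom.conf, pass the text through `disable_wayland`, write the result back as root, and then reboot or restart GDM so the login screen comes up on Xorg.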

Comment 1 Tarun Reddy 2020-12-31 15:38:57 UTC
Should have also pointed to this bug (that I can't see):

https://bugzilla.redhat.com/show_bug.cgi?id=1670273

Comment 2 Tomas Pelka 2021-01-06 12:58:35 UTC
(In reply to Tarun Reddy from comment #1)
> Should have also pointed to this bug (that I can't see):
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1670273

Unfortunately, I can't give you access because this is a partner bug, but this one really seems to be a duplicate of bz1670273.

Comment 3 Tomas Pelka 2021-01-06 13:00:13 UTC
And one more thing, make sure you have mutter-3.32.2-49.el8 or newer.

Comment 4 Tarun Reddy 2021-01-07 00:33:51 UTC
mutter-3.32.2-51.el8.x86_64 is installed on my system. Not sure why it relates but I'm guessing it is in the details of the other bug. In any case, I think this is a serious bug as this is the server video card on every Dell server I know of. Granted many don't run GUIs, but the ones that do will immediately feel this pain. (Fedora 33 also exhibited this the last time I checked. Not on install, but on initial boot.)

Comment 5 Tomas Pelka 2021-01-07 07:35:35 UTC
(In reply to Tarun Reddy from comment #4)
> mutter-3.32.2-51.el8.x86_64 is installed on my system. Not sure why it
> relates but I'm guessing it is in the details of the other bug. In any case,
> I think this is a serious bug as this is the server video card on every Dell
> server I know of. Granted many don't run GUIs, but the one that do will
> immediately feel this pain. (Fedora 33 also exhibits this last time I
> checked. Not on install, but on initial boot).

Agreed. I moved bz1670273 back to ASSIGNED.

Comment 6 Jonas Ådahl 2021-01-07 08:11:47 UTC
When reproducing (running the Wayland session), can you run the following commands:

   gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"' 

Paste the output. This will tell us whether GNOME Shell turned off animations (which it should have).

   journalctl _PID=$(pgrep -x gnome-shell -u $UID) >& gnome-shell.log

Attach gnome-shell.log. This will contain log information telling us whether the shadow buffer (intended to be used on Matrox-like GPUs) was enabled or not.

   glxinfo >& glxinfo.log

Attach glxinfo.log. This will tell us more about the renderer environment (even if we're running under Wayland, Xwayland will translate for us).

Comment 7 Tarun Reddy 2021-01-07 18:02:42 UTC
[treddy@xxxxxxx ~]$ gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'
(<true>,)

glxinfo.log and gnome-shell.log attached
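
For anyone scripting this check, the `gdbus` reply is a one-element tuple wrapping a boolean, as in the `(<true>,)` output above. Below is a small Python sketch that turns that text into a boolean; the function name `animations_enabled` is hypothetical, and the sketch assumes the reply has exactly the form shown in this comment.

```python
import re


def animations_enabled(gdbus_output: str) -> bool:
    """Parse the reply of the AnimationsEnabled property query."""
    match = re.fullmatch(r"\(<(true|false)>,\)", gdbus_output.strip())
    if match is None:
        raise ValueError(f"unexpected gdbus output: {gdbus_output!r}")
    return match.group(1) == "true"
```

A result of `True` on a machine without hardware acceleration means GNOME Shell did not turn animations off.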

Comment 8 Tarun Reddy 2021-01-07 18:03:41 UTC
Created attachment 1745406 [details]
glxinfo

Comment 9 Tarun Reddy 2021-01-07 18:04:13 UTC
Created attachment 1745407 [details]
gnome-shell.log

Comment 10 Tarun Reddy 2021-01-07 18:08:57 UTC
Created attachment 1745410 [details]
gnome-shell.log with mouse delay

Comment 11 Jonas Ådahl 2021-01-07 18:12:43 UTC
So there are two issues: gnome-shell fails to disable animations, even though there is no hardware acceleration, and shadow buffers aren't enabled. The former should happen, and I will have to dig into why it fails. The latter is either because DRM_CAP_DUMB_PREFER_SHADOW is missing or because it is set to 0. I will prepare a build of mutter that forces shadow buffers for mgag200.
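
The capability mentioned above can be read from user space with the DRM GET_CAP ioctl. The Python sketch below is an illustration only, not how mutter does it: it assumes a Linux system with the usual x86_64 ioctl number encoding, the helper names `get_drm_cap` and `prefers_shadow` are hypothetical, and `/dev/dri/card0` is just a guess at which node belongs to the mgag200 device.

```python
import fcntl
import os
import struct

DRM_CAP_DUMB_PREFER_SHADOW = 0x4  # value from the kernel's drm.h UAPI header

# DRM_IOCTL_GET_CAP is _IOWR('d', 0x0c, struct drm_get_cap).
# struct drm_get_cap holds two 64-bit fields: capability and value.
DRM_GET_CAP_FMT = "QQ"
_IOC_READ_WRITE = 3  # assumption: x86_64-style direction bits
DRM_IOCTL_GET_CAP = (
    (_IOC_READ_WRITE << 30)
    | (struct.calcsize(DRM_GET_CAP_FMT) << 16)
    | (ord("d") << 8)
    | 0x0C
)


def get_drm_cap(fd: int, capability: int) -> int:
    """Ask the DRM driver behind fd for the value of one capability."""
    buf = bytearray(struct.pack(DRM_GET_CAP_FMT, capability, 0))
    fcntl.ioctl(fd, DRM_IOCTL_GET_CAP, buf)  # the kernel fills in the value field
    _, value = struct.unpack(DRM_GET_CAP_FMT, buf)
    return value


def prefers_shadow(card_path: str = "/dev/dri/card0") -> bool:
    """Return True if the driver reports a nonzero DRM_CAP_DUMB_PREFER_SHADOW."""
    fd = os.open(card_path, os.O_RDWR | os.O_CLOEXEC)
    try:
        return get_drm_cap(fd, DRM_CAP_DUMB_PREFER_SHADOW) != 0
    finally:
        os.close(fd)
```

If the ioctl is rejected, `fcntl.ioctl` raises `OSError`, which would correspond to the "missing" case; a return value of `False` corresponds to the capability being set to 0.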

Comment 12 Tarun Reddy 2021-01-07 18:40:13 UTC
Thank you... will be happy to test a pre-release build if you want.

Comment 13 Jonas Ådahl 2021-01-12 10:55:19 UTC
(In reply to Tarun Reddy from comment #12)
> Thank you... will be happy to test a pre-release build if you want.

I have prepared mutter builds here: http://people.redhat.com/~jadahl/mutter-matrox/

Would be great if you could install them, reboot, see if they make a difference, and after having logged in to a Wayland session, again run

    journalctl _PID=$(pgrep -x gnome-shell -u $UID) >& gnome-shell.log

and

    gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'

Comment 14 Tarun Reddy 2021-01-12 15:35:57 UTC
Created attachment 1746676 [details]
gnome-shell after new mutter

Comment 15 Tarun Reddy 2021-01-12 15:37:16 UTC
[treddy@erie ~]$ rpm -qa | grep mutter
mutter-3.32.2-55.el8.x86_64

Still causes the issue. 

gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'
(<true>,)

and gnome-shell.log attached.

Comment 16 Jonas Ådahl 2021-01-12 17:18:52 UTC
Thanks! So with those logs, it looks like mutter doesn't properly detect that the renderer is using llvmpipe, even though this is communicated to be the case with glxinfo earlier. Could you append

    export COGL_DEBUG=winsys

to ~/.bashrc, log out, log back in, then run

    journalctl _PID=$(pgrep -x gnome-shell -u $UID) >& gnome-shell.log


This should tell us what mutter sees when it tries to determine the architecture.

Comment 17 Tarun Reddy 2021-01-12 21:01:13 UTC
Created attachment 1746807 [details]
gnome-shell with debug enabled

Comment 18 Jonas Ådahl 2021-01-13 08:32:31 UTC
(In reply to Tarun Reddy from comment #17)
> Created attachment 1746807 [details]
> gnome-shell with debug enabled

Thanks! Looks like this uses a different "GL_VENDOR" than what it already handles ("Mesa Project" vs "Mesa/X.org"). I created builds that now also handle "Mesa/X.org".

They are available here: http://people.redhat.com/~jadahl/mutter-matrox-2/

They have the same version number as the ones you already have, so make sure you *re*install them.
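
The kind of check being adjusted here can be pictured with a short Python sketch. It is purely illustrative and is not mutter's actual code: the names `SOFTWARE_GL_VENDORS` and `is_software_rendered` are hypothetical, and the only facts taken from this thread are the two vendor strings and that the renderer in question is llvmpipe.

```python
# Vendor strings named in this thread; the second is the one the new builds add.
SOFTWARE_GL_VENDORS = {"Mesa Project", "Mesa/X.org"}


def is_software_rendered(gl_vendor: str, gl_renderer: str) -> bool:
    """Guess whether GL runs on llvmpipe, based on the vendor and renderer strings."""
    return gl_vendor in SOFTWARE_GL_VENDORS and "llvmpipe" in gl_renderer.lower()
```

With only "Mesa Project" in the set, a system reporting "Mesa/X.org" would not be recognized as software rendered, which is consistent with animations staying enabled in the earlier test.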

Comment 19 Tarun Reddy 2021-01-13 17:19:23 UTC
Created attachment 1747115 [details]
gnome-shell with second mutter update

Definitely better. Animations are now disabled, but USB input is still jerky compared to running with Wayland off.

Comment 20 Tarun Reddy 2021-01-13 17:21:10 UTC
[treddy@erie ~]$ gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'
(<false>,)


But as you can see from gnome-shell in previous attachment, I get a ton of these:

Jan 13 10:15:24 erie.home.tarunreddy.com org.gnome.Shell.desktop[4291]: libinput error: event1  - Dell Dell USB Mouse: client bug: event processing lagging behind by 11ms, your system is too slow
Jan 13 10:15:24 erie.home.tarunreddy.com org.gnome.Shell.desktop[4291]: libinput error: event1  - Dell Dell USB Mouse: client bug: event processing lagging behind by 12ms, your system is too slow
Jan 13 10:15:24 erie.home.tarunreddy.com org.gnome.Shell.desktop[4291]: libinput error: event1  - Dell Dell USB Mouse: WARNING: log rate limit exceeded (5 msgs per 60min). Discarding future messages.

And the UI is usable but certainly not smooth by any means. If this was my desktop, I would immediately disable Wayland. Way better under Xorg.
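
To compare builds, the lag values can be pulled out of gnome-shell.log with a few lines of Python. This is a rough sketch that assumes the message format shown in the log lines above; the names `LAG_RE` and `lag_samples_ms` are hypothetical.

```python
import re

# Matches the libinput "lagging behind" message and captures the delay.
LAG_RE = re.compile(r"libinput error: .*event processing lagging behind by (\d+)ms")


def lag_samples_ms(log_text: str) -> list:
    """Return every reported input lag, in milliseconds, in log order."""
    return [int(match.group(1)) for match in LAG_RE.finditer(log_text)]
```

Because libinput rate-limits these messages (5 msgs per 60min according to the log above), the number of samples understates how often the lag actually occurs.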

Comment 21 Jonas Ådahl 2021-01-13 17:45:50 UTC
Thanks for testing; seems mgag200 doesn't support the necessary DMA buffer related functionality that mutter needs to implement the same kind of optimization the X server does:

Jan 13 10:14:53 erie.home.tarunreddy.com gnome-shell[4291]: Failed to initialize double buffered shadow fb for VGA-1: Failed to export buffer's DMA fd: Function not implemented

Comment 22 Tarun Reddy 2021-01-13 17:52:42 UTC
Ah... that makes sense, and explains the generally poor experience. If it were me, I would revert the original change and disable Wayland for mgag200 devices, or at least where DMA buffers are unavailable. That said, Red Hat may be looking to remove Xorg completely to reduce the surface area of support.

Comment 23 Jonas Ådahl 2021-01-13 18:08:06 UTC
There is yet another thing to try: a different way of allocating the needed kernel buffers that might be good enough for mgag200, since its buffer does not reside in any actual GPU memory. I will give that a try, but if that doesn't make the situation good enough, we will indeed have to disable Wayland on mgag200 again.

Comment 24 Tarun Reddy 2021-01-23 18:34:40 UTC
Any updates?

Comment 25 Jonas Ådahl 2021-01-25 10:13:58 UTC
(In reply to Tarun Reddy from comment #24)
> Any updates?

There was also what looks like a buffer stride issue on another Matrox card, so I have changed GDM to revert the change that enabled Wayland for 8.4. I will try to gain access to the relevant hardware to have a better chance of getting things right, but would appreciate testing of future builds.

Comment 26 Tarun Reddy 2021-02-05 19:41:47 UTC
Was gdm-3.28.3-37.el8.x86_64 supposed to roll this back? The changelog doesn't show anything and my system still shows the issue. I understand if getting Wayland to work on Matrox cards hasn't risen in priority, but you have to be better about rolling back changes that break things.

It is very frustrating to see the slow rollback of the issue on a very popular server platform. 

This also tells me that CentOS Stream really is broken and is not an improvement. If you plan on making CentOS Stream a usable platform, you not only have to be able to move quickly forward, but also quickly backwards when issues are actually reported or you risk losing the very people who can help.

Comment 27 Ray Strode [halfline] 2021-02-05 19:50:08 UTC
Hi,

This was reverted in gdm-3.28.3-38.el8. CentOS Stream seems to have gdm-3.28.3-39.el8, which should have this rollback in it.

You could try "dnf clean all" and see if that helps, or failing that:

rpm -Uvh http://mirror.centos.org/centos/8-stream/AppStream/x86_64/os/Packages/gdm-3.28.3-39.el8.x86_64.rpm

Comment 28 Tarun Reddy 2021-02-05 20:00:34 UTC
Just saw the commit message here:
https://git.centos.org/rpms/gdm/c/dfaa010d85fe64d05cb6a2633e7b529aa7ee9247?branch=c8s

Already tried "yum clean all" and update, but didn't get that update. Clearly some mirrors are behind. Thank you and I'll download manually.
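
A quick way to tell whether a given system already has the rollback is to compare the installed gdm release number with 38, the first build said above to contain the revert. The Python sketch below does only that: the names `NVRA_RE` and `has_wayland_revert` are hypothetical, it is not a general RPM version comparison, and it only understands el8 builds of the 3.28.3 series discussed in this bug.

```python
import re

# Expects strings such as "gdm-3.28.3-39.el8.x86_64" (the output of `rpm -q gdm`).
NVRA_RE = re.compile(r"^gdm-(?P<version>\d+(?:\.\d+)*)-(?P<release>\d+)\.el8")

FIRST_RELEASE_WITH_REVERT = 38  # gdm-3.28.3-38.el8, per comment 27


def has_wayland_revert(nvra: str) -> bool:
    """Return True if this gdm 3.28.3 el8 build includes the Wayland rollback."""
    match = NVRA_RE.match(nvra.strip())
    if match is None:
        raise ValueError(f"not an el8 gdm package string: {nvra!r}")
    if match.group("version") != "3.28.3":
        raise ValueError("this sketch only covers the gdm 3.28.3 series")
    return int(match.group("release")) >= FIRST_RELEASE_WITH_REVERT
```

For example, `has_wayland_revert("gdm-3.28.3-37.el8.x86_64")` is false, which matches the behavior reported in comment 26.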

Comment 29 Ray Strode [halfline] 2021-10-18 14:43:26 UTC

*** This bug has been marked as a duplicate of bug 1670273 ***

Comment 30 Tarun Reddy 2021-10-18 14:53:37 UTC
Can I get access to bug 1670273?

Comment 31 Ray Strode [halfline] 2021-10-18 18:33:25 UTC
Tarun, that bug is currently private because it contains logs from a partner.

I can summarize the main points for you, however:

1. The mutter builtin display server (wayland) is not performant enough to use by default on Matrox server cards.
2. Several changes were added during the 8.4 development timeframe to try to address this performance problem.
3. Testing showed, however, that even with those changes there was still a non-negligible performance gap between X11 and wayland for this particular GPU.
4. The decision was made to keep X11 as the default for Matrox cards in RHEL 8. gdm-3.28.3-35 was the first and last version of GDM since 8.0 GA to use wayland by default on Matrox chips.
5. This may change in future major releases.

Comment 32 Jonas Ådahl 2021-10-22 08:00:57 UTC
One additional note: further progress on bug 1670273 is currently blocked on bug 1952417, which is public.

