Bug 1911827
| Summary: | gdm-3.28.3-35.el8.x86_64 enables Wayland on mgag200, which is broken | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Tarun Reddy <tarun> |
| Component: | gdm | Assignee: | Ray Strode [halfline] <rstrode> |
| Status: | CLOSED DUPLICATE | QA Contact: | Desktop QE <desktop-qa-list> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | CentOS Stream | CC: | bstinson, carl, hdegoede, jadahl, jkoten, jwboyer, mtsai2, rhsu5, tarun, tpelka |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | 8.0 | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 14:43:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1670273 | | |
| Bug Blocks: | | | |
| Attachments: | | | |

Should have also pointed to this bug (that I can't see): https://bugzilla.redhat.com/show_bug.cgi?id=1670273

(In reply to Tarun Reddy from comment #1)
> Should have also pointed to this bug (that I can't see):
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1670273

Unfortunately I can't give you access, as this is a partner bug, but this one really seems like a dup of bz1670273.

And one more thing: make sure you have mutter-3.32.2-49.el8 or newer.

mutter-3.32.2-51.el8.x86_64 is installed on my system. Not sure why it relates, but I'm guessing it is in the details of the other bug. In any case, I think this is a serious bug, as this is the server video card on every Dell server I know of. Granted, many don't run GUIs, but the ones that do will immediately feel this pain. (Fedora 33 also exhibited this last time I checked. Not on install, but on initial boot.)

(In reply to Tarun Reddy from comment #4)
> mutter-3.32.2-51.el8.x86_64 is installed on my system. Not sure why it
> relates but I'm guessing it is in the details of the other bug. In any case,
> I think this is a serious bug as this is the server video card on every Dell
> server I know of. Granted many don't run GUIs, but the one that do will
> immediately feel this pain. (Fedora 33 also exhibits this last time I
> checked. Not on install, but on initial boot).

Agreed. I moved bz1670273 back to assigned.

When reproducing (running the Wayland session), can you run the following commands:

gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'

Paste the output. This will tell us whether GNOME Shell turned off animations (which it should have).

journalctl _PID=$(pgrep -x gnome-shell -u $UID) >& gnome-shell.log

Attach gnome-shell.log. This will contain log information telling us whether the shadow buffer (intended to be used on Matrox-like GPUs) was enabled or not.

glxinfo >& glxinfo.log

Attach glxinfo.log. This will tell us more about the renderer environment (even if we're running under Wayland, Xwayland will translate for us).

[treddy@xxxxxxx ~]$ gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'
(<true>,)

glxinfo.log and gnome-shell.log attached

Created attachment 1745406 [details]
glxinfo
Created attachment 1745407 [details]
gnome-shell.log
Created attachment 1745410 [details]
gnome-shell.log with mouse delay
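
The reply from the AnimationsEnabled query above is a one-line GVariant tuple, `(<true>,)` in this report. As a minimal sketch for anyone repeating the check, the helper below classifies that reply; the function name `animations_state` is hypothetical and not part of gdbus or GNOME Shell.

```sh
# Hypothetical helper: classify the reply of the gdbus call quoted in this report.
# Reads the reply text on stdin, e.g. "(<true>,)" or "(<false>,)".
animations_state() {
    read -r reply
    case "$reply" in
        *"<true>"*)  echo "animations enabled" ;;
        *"<false>"*) echo "animations disabled" ;;
        *)           echo "unrecognized reply: $reply" ;;
    esac
}

# Usage inside a running GNOME session (not executed here):
# gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect \
#     -m org.freedesktop.DBus.Properties.Get \
#     '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"' | animations_state
```

On a software-rendered session the expected answer is "animations disabled", since GNOME Shell should turn animations off when there is no hardware acceleration.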
So there are two issues: gnome-shell fails to disable animations even though there is no hardware acceleration, and shadow buffers aren't enabled. The former should happen, and I will have to dig into why it fails; the latter is either because DRM_CAP_DUMB_PREFER_SHADOW is missing or because it is set to 0. I will prepare a build of mutter that forces shadow buffers for mgag200.

Thank you... will be happy to test a pre-release build if you want.

(In reply to Tarun Reddy from comment #12)
> Thank you... will be happy to test a pre-release build if you want.

I have prepared mutter builds here:

http://people.redhat.com/~jadahl/mutter-matrox/

It would be great if you could install them, reboot, see if they make a difference, and, after logging in to a Wayland session, again run

journalctl _PID=$(pgrep -x gnome-shell -u $UID) >& gnome-shell.log

and

gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'

Created attachment 1746676 [details]
gnome-shell after new mutter
[treddy@erie ~]$ rpm -qa | grep mutter
mutter-3.32.2-55.el8.x86_64

Still causes the issue.

gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'
(<true>,)

and gnome-shell.log attached. Thanks!

So with those logs, it looks like mutter doesn't properly detect that the renderer is using llvmpipe, even though glxinfo reported this earlier. Could you append
export COGL_DEBUG=winsys
to ~/.bashrc, log out, log back in, then run
journalctl _PID=$(pgrep -x gnome-shell -u $UID) >& gnome-shell.log
This should tell us what mutter sees when it tries to determine the architecture.
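
To cross-check what glxinfo reported against what mutter detects, the vendor and renderer lines can be pulled out of the saved glxinfo.log. This is a rough sketch only: the function name `gl_summary` is hypothetical, and it assumes the log contains the usual `OpenGL vendor string:` and `OpenGL renderer string:` lines.

```sh
# Hypothetical helper: summarize the vendor and renderer lines of a saved glxinfo log.
# Assumes the usual "OpenGL vendor string:" / "OpenGL renderer string:" lines.
gl_summary() {
    log=$1
    # Strip the label and keep the first matching line of each kind.
    vendor=$(sed -n 's/^OpenGL vendor string: *//p' "$log" | head -n 1)
    renderer=$(sed -n 's/^OpenGL renderer string: *//p' "$log" | head -n 1)
    echo "vendor=$vendor"
    echo "renderer=$renderer"
    # llvmpipe in the renderer string means Mesa is rendering in software.
    case "$renderer" in
        *llvmpipe*) echo "software rendering (llvmpipe)" ;;
        *)          echo "no llvmpipe in renderer string" ;;
    esac
}

# Usage (not executed here): gl_summary glxinfo.log
```

The vendor line is printed as well because a compositor that matches on the vendor string can miss a software renderer when the string differs from the one it expects.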
Created attachment 1746807 [details]
gnome-shell with debug enabled
(In reply to Tarun Reddy from comment #17)
> Created attachment 1746807 [details]
> gnome-shell with debug enabled

Thanks! Looks like this uses a different "GL_VENDOR" than what it already handles ("Mesa Project" vs "Mesa/X.org"). I created builds that now also handle "Mesa/X.org". They are available here:

http://people.redhat.com/~jadahl/mutter-matrox-2/

They have the same version number as the ones you already have, so make sure you *re*install them.

Created attachment 1747115 [details]
gnome-shell with second mutter update
Definitely better. Animations are also disabled, but still jerky USB input compared to Wayland off.
[treddy@erie ~]$ gdbus call -e -d org.gnome.Shell.Introspect -o /org/gnome/Shell/Introspect -m org.freedesktop.DBus.Properties.Get '"org.gnome.Shell.Introspect"' '"AnimationsEnabled"'
(<false>,)

But as you can see from gnome-shell.log in the previous attachment, I get a ton of these:

Jan 13 10:15:24 erie.home.tarunreddy.com org.gnome.Shell.desktop[4291]: libinput error: event1 - Dell Dell USB Mouse: client bug: event processing lagging behind by 11ms, your system is too slow
Jan 13 10:15:24 erie.home.tarunreddy.com org.gnome.Shell.desktop[4291]: libinput error: event1 - Dell Dell USB Mouse: client bug: event processing lagging behind by 12ms, your system is too slow
Jan 13 10:15:24 erie.home.tarunreddy.com org.gnome.Shell.desktop[4291]: libinput error: event1 - Dell Dell USB Mouse: WARNING: log rate limit exceeded (5 msgs per 60min). Discarding future messages.

And the UI is usable but certainly not smooth by any means. If this was my desktop, I would immediately disable Wayland. Way better under Xorg.

Thanks for testing; it seems mgag200 doesn't support the DMA-buffer functionality that mutter needs to implement the same kind of optimization the X server does:

Jan 13 10:14:53 erie.home.tarunreddy.com gnome-shell[4291]: Failed to initialize double buffered shadow fb for VGA-1: Failed to export buffer's DMA fd: Function not implemented

Ah... that makes sense, and explains the generally poor experience. If it was me, I would revert the original change and disable Wayland for mgag200 devices, or at least where DMA buffers are unavailable. That said, Red Hat may be looking to remove Xorg completely to reduce the surface area of support.

There is yet another thing to try: a different way of allocating the needed kernel buffers that might be good enough for mgag200, since its buffer does not reside in any actual GPU memory. I will give that a try, but if that doesn't make the situation good enough, we will indeed have to disable Wayland on mgag200 again.

Any updates?

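
For comparing builds, the libinput "lagging behind" messages quoted earlier in this thread can be summarized from a saved gnome-shell.log. The sketch below is illustrative: the function name `libinput_lag_summary` is hypothetical, and it assumes the message wording shown in the log excerpt.

```sh
# Hypothetical helper: count libinput "lagging behind" messages in a saved log
# and report the largest delay, in milliseconds.
libinput_lag_summary() {
    grep 'event processing lagging behind by' "$1" |
        # Keep only the number that precedes "ms".
        sed 's/.*lagging behind by \([0-9][0-9]*\)ms.*/\1/' |
        awk 'BEGIN { max = 0 }
             { n++; if ($1 + 0 > max) max = $1 + 0 }
             END { printf "messages=%d max_ms=%d\n", n, max }'
}

# Usage (not executed here): libinput_lag_summary gnome-shell.log
```

Because libinput rate-limits these messages (5 per 60 minutes, according to the warning in the same log), the count is a lower bound rather than a measure of how often input actually lagged.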
(In reply to Tarun Reddy from comment #24)
> Any updates?

There was also an issue that looked like a buffer stride problem on another Matrox card, so I have changed GDM to revert the change that enabled Wayland for 8.4. I will try to gain access to the relevant hardware to have a better chance of getting things right, but would appreciate testing of future builds.

Did gdm-3.28.3-37.el8.x86_64 supposedly roll this back? The changelog doesn't show anything, and the behavior on my system still shows the issue. I understand if getting Wayland to work on Matrox cards hasn't been elevated in priority, but you have to be better about rolling back changes that break things. It is very frustrating to see the slow rollback of the issue on a very popular server platform. This also tells me that CentOS Stream really is broken and is not an improvement. If you plan on making CentOS Stream a usable platform, you not only have to be able to move quickly forward, but also quickly backwards when issues are actually reported, or you risk losing the very people who can help.

Hi,

This was reverted in gdm-3.28.3-38.el8. CentOS Stream seems to have gdm-3.28.3-39.el8, which should have this rollback in it. You could try "dnf clean all" and see if that helps, or failing that:

rpm -Uvh http://mirror.centos.org/centos/8-stream/AppStream/x86_64/os/Packages/gdm-3.28.3-39.el8.x86_64.rpm

Just saw the commit message here: https://git.centos.org/rpms/gdm/c/dfaa010d85fe64d05cb6a2633e7b529aa7ee9247?branch=c8s

Already tried "yum clean all" and update, but didn't get that update. Clearly some mirrors are behind. Thank you, and I'll download manually.

*** This bug has been marked as a duplicate of bug 1670273 ***

Can I get access to bug 1670273?

Tarun, that bug is currently private because it contains logs from a partner. I can summarize the main points for you, however:

1. The mutter builtin display server (Wayland) is not performant enough to use by default on Matrox server cards.
2. Several changes were added during the 8.4 development timeframe to try to address this performance problem.
3. Testing showed, however, that even with those changes there was still a non-negligible performance gap between X11 and Wayland for this particular GPU.
4. The decision was made to keep X11 as the default for Matrox cards in RHEL 8. gdm-3.28.3-35 was the first and last version of GDM since 8.0 GA to use Wayland by default on Matrox chips.
5. This may change in future major releases.

One additional note: bug 1670273 is currently blocked on bug 1952417, which is public, to make further progress.

Description of problem:
On CentOS 8 Stream (not sure if this is the right place to report this issue), the latest gdm-3.28.3-35.el8.x86_64 breaks the GUI on my Dell T430 (and presumably many, many server platforms with the MGA G200 server GPU). Namely, it re-enables Wayland on server GPUs, and at least with the Matrox chipset, that breaks things very badly. USB input stops working consistently (I have to unplug and replug the device to get it to work) and gnome-shell spins with high CPU all the time. Reverting to Xorg from Wayland via /etc/gdm/custom.conf fixes everything.

Version-Release number of selected component (if applicable):
gdm-3.28.3-35

How reproducible:

Steps to Reproduce:
1. Install CentOS Stream on a system with a server MGA G200 GPU. Example:

08:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. G200eR2 (rev 01) (prog-if 00 [VGA controller])
	DeviceName: Embedded Video
	Subsystem: Dell Device 063b
	Flags: bus master, medium devsel, latency 64, IRQ 17, NUMA node 0
	Memory at 94000000 (32-bit, prefetchable) [size=16M]
	Memory at 95800000 (32-bit, non-prefetchable) [size=16K]
	Memory at 95000000 (32-bit, non-prefetchable) [size=8M]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: [dc] Power Management version 1
	Kernel driver in use: mgag200
	Kernel modules: mgag200

2. systemctl set-default graphical.target
3. Update to latest gdm-3.28.3-35.el8.x86_64

Actual results:
Try to use a USB mouse in the GUI locally on the system. It skips and jumps and then stops working altogether. Also note the high CPU usage of gnome-shell.

Expected results:
Smooth USB responsiveness and no gnome-shell CPU spikes.

Additional info:
WaylandEnable=false in /etc/gdm/custom.conf disables Wayland and fixes the issue.
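
To confirm the workaround is in place on a given machine, a small check along these lines can be used. It is a sketch, not an official tool: the function name `wayland_disabled` is hypothetical, and it only looks for an uncommented `WaylandEnable=false` line without checking which section of the file the key sits in (GDM normally reads it from the `[daemon]` section).

```sh
# Hypothetical helper: report whether a GDM config file disables Wayland.
# Takes the file path as an argument; /etc/gdm/custom.conf is the path used in this report.
wayland_disabled() {
    # Match an uncommented WaylandEnable=false line, allowing surrounding spaces.
    grep -Eq '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$1"
}

# Usage (not executed here):
# if wayland_disabled /etc/gdm/custom.conf; then
#     echo "Wayland is disabled for GDM"
# else
#     echo "WaylandEnable=false not set"
# fi
```

A commented-out `#WaylandEnable=false` line, as shipped in some default configurations, is deliberately not treated as a match.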