Bug 1656905
Summary: | screen not coming back from dpms | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Jiri Koten <jkoten> |
Component: | mutter | Assignee: | Jonas Ådahl <jadahl> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Desktop QE <desktop-qa-list> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 8.0 | CC: | fmuellner, jwboyer, rstrode, tpelka |
Target Milestone: | rc | ||
Target Release: | 8.0 | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | mutter-3.28.3-9.el8 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-06-14 00:55:29 UTC | Type: | Bug |
Bug Blocks: | 1635157, 1657660 | ||
Attachments: |
Created attachment 1518187 [details]
inhibit-frame-clockwhen-power-saving.patch
This patch fixes the suspend/resume issue. The problem was that we skip page flipping when power saving, but might already have scheduled a paint, meaning we'd still call eglSwapBuffers(). When resuming, we'd paint again and call eglSwapBuffers(), which would deadlock since there is no room in the FIFO, as the previous buffer was never page flipped out.
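The sequence described above can be sketched as a toy model. This is not mutter code: `ToyOutput`, `toy_paint()` and `toy_suspend_resume()` are hypothetical names, and the single-slot stream is an assumption made to keep the example small. It only illustrates why a paint that slips through while power saving leaves the FIFO full on resume, and why inhibiting the frame clock avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-slot FIFO stream: a swap needs a free slot, a flip frees it. */
typedef struct {
    bool slot_full;      /* a frame is queued and not yet flipped out */
    bool power_saving;   /* DPMS off: page flips are skipped */
    bool inhibit_clock;  /* the fix: do not paint while power saving */
} ToyOutput;

typedef enum { PAINT_SKIPPED, PAINT_OK, PAINT_WOULD_DEADLOCK } PaintResult;

/* Hypothetical stand-in for one paint cycle (eglSwapBuffers + page flip). */
static PaintResult toy_paint(ToyOutput *out)
{
    if (out->power_saving && out->inhibit_clock)
        return PAINT_SKIPPED;           /* frame clock inhibited */
    if (out->slot_full)
        return PAINT_WOULD_DEADLOCK;    /* a real eglSwapBuffers() would block */
    out->slot_full = true;              /* the swap queues a frame */
    if (!out->power_saving)
        out->slot_full = false;         /* the flip consumes the frame */
    return PAINT_OK;
}

/* Paint once while power saving, then resume and paint again. */
static PaintResult toy_suspend_resume(bool inhibit_clock)
{
    ToyOutput out = { false, true, inhibit_clock };
    toy_paint(&out);           /* paint that was scheduled before power saving */
    out.power_saving = false;  /* resume */
    return toy_paint(&out);
}
```

Calling `toy_suspend_resume(false)` models the bug (the second paint finds the slot still full), while `toy_suspend_resume(true)` models the patched behaviour, where the first paint never happens.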
Created attachment 1518188 [details]
renderer-native: Skip eglSwapBuffers when last EGLStream flip was busy
This patch fixes a deadlock similar to the power saving one, but one that can't be avoided and must instead be worked around.
The problem is that when changing mode, we recreate streams with the new resolution, and the first page flip seems to always fail. We work around this by pretending nothing happened, except that we avoid calling eglSwapBuffers() until a page flip finally works. Currently there is no way to get notified when a page flip will actually be possible, so some form of try-until-it-works is necessary for now, but NVIDIA is looking into alternative ways to solve this.
(In reply to Jonas Ådahl from comment #1)
> Created attachment 1518187 [details]
> inhibit-frame-clockwhen-power-saving.patch

Looks like this is just a list of patches, and not the patches

(In reply to Jonas Ådahl from comment #2)
> The problem is that when changing mode, we recreate streams with the new
> resolution, and the first page flip seems to always fail. We work around
> this by pretending nothing happened, except avoid calling eglSwapBuffers()
> until a page flip finally worked. Currently there is no way to get notified
> when a page flip will actually be possible, so some way of
> try-until-it-works is currently necessary, but nvidia is looking into
> alternative ways to solve this.

Hmm so it's not that the next flip fails, but that flips for the next N ms's fail?

I mean, does a sleep (1); call fix it too?

The thing is, we may end up losing drawing with the current approach, I think. there's nothing to enforce another flip after the one we ignored, right?

Created attachment 1518189 [details]
inhibit-frame-clockwhen-power-saving.patch
Now attaching with actual content.
(In reply to Ray Strode [halfline] from comment #3)
> (In reply to Jonas Ådahl from comment #1)
> > Created attachment 1518187 [details]
> > inhibit-frame-clockwhen-power-saving.patch
> Looks like this is just a list of patches, and not the patches
>
> (In reply to Jonas Ådahl from comment #2)
> > The problem is that when changing mode, we recreate streams with the new
> > resolution, and the first page flip seems to always fail. We work around
> > this by pretending nothing happened, except avoid calling eglSwapBuffers()
> > until a page flip finally worked. Currently there is no way to get notified
> > when a page flip will actually be possible, so some way of
> > try-until-it-works is currently necessary, but nvidia is looking into
> > alternative ways to solve this.
> Hmm so it's not that the next flip fails, but that flips for the next N ms's
> fail?
>
> I mean, does a sleep (1); call fix it too?
>
> The thing is, we may end up losing drawing with the current approach, I
> think. there's nothing to enforce another flip after the one we ignored,
> right?

We'll flip somewhat old content (the one from the first eglSwapBuffers() instead of what seems to always be the second). I'm not sure sleep(1) will avoid the issue, but I guess I can test.

Created attachment 1518330 [details]
avoid-eglstream-deadlock.patch
This is the alternative solution, which does the following:
* If eglstream page flipping fails with 'busy', retry again once after 17ms
* If eglstream page flipping fails for any other reason or on the second try, continue as normal but inhibit eglSwapBuffers() calls
The first part should, for the most part, already avoid the deadlock; the second part is more of a backup plan.
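The two rules above can be read as a small state machine. The sketch below is only an illustration under stated assumptions: `FlipState`, `FlipStatus` and `toy_handle_flip_result()` are hypothetical names, not taken from the patch; resetting the retry flag after the fallback is a guess; and lifting the inhibition on the next successful flip follows the earlier description of avoiding eglSwapBuffers() until a page flip works.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { FLIP_OK, FLIP_BUSY, FLIP_ERROR } FlipStatus;

typedef struct {
    int  retry_delay_ms;  /* 0 when no retry is pending */
    bool retried;         /* a "busy" retry was already scheduled */
    bool swap_inhibited;  /* skip eglSwapBuffers() until a flip succeeds */
} FlipState;

#define RETRY_DELAY_MS 17  /* delay quoted in the comment above */

/* Hypothetical handler for the outcome of one page flip attempt. */
static void toy_handle_flip_result(FlipState *st, FlipStatus status)
{
    st->retry_delay_ms = 0;

    if (status == FLIP_OK) {
        /* A flip finally worked: swapping buffers is safe again. */
        st->retried = false;
        st->swap_inhibited = false;
        return;
    }

    if (status == FLIP_BUSY && !st->retried) {
        /* First "busy" failure: retry exactly once after the delay. */
        st->retried = true;
        st->retry_delay_ms = RETRY_DELAY_MS;
        return;
    }

    /* Any other error, or "busy" on the second try: carry on as normal
     * but inhibit buffer swaps (the backup plan). */
    st->retried = false;
    st->swap_inhibited = true;
}
```

A caller would schedule a timer whenever `retry_delay_ms` is non-zero and check `swap_inhibited` before each swap.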
Created attachment 1518936 [details]
inhibit-frame-clockwhen-power-saving.patch
Updated power saving frame clock inhibition patch, as the old one had a bug.
Created attachment 1518938 [details]
avoid-eglstream-deadlock.patch
Updated the eglSwapBuffers() deadlock patch after having rebased it on top of the frame clock inhibition patch.
Created attachment 1520049 [details]
eglstream-mailbox-mode.patch
After some discussion with an Nvidia engineer, here is an alternative, much less invasive patch that avoids the deadlocks. It works by changing the EGLStream to operate in mailbox mode instead of FIFO mode.
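The difference between the two modes can be illustrated with a toy producer/consumer model. This is a sketch and not the patch itself: `ToyStream` and `toy_stream_insert()` are hypothetical names. The convention that a length of 0 means mailbox mode mirrors the EGL_KHR_stream_fifo extension, where an `EGL_STREAM_FIFO_LENGTH_KHR` of 0 selects mailbox mode; whether the patch changes exactly that attribute is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int fifo_length;   /* 0 models mailbox mode, >0 models FIFO mode */
    int queued;        /* frames waiting for the consumer */
    int newest_frame;  /* id of the most recently inserted frame */
} ToyStream;

/* Returns false in the case where a real FIFO producer would block. */
static bool toy_stream_insert(ToyStream *s, int frame_id)
{
    if (s->fifo_length == 0) {
        /* Mailbox: the new frame replaces any pending one, never blocks. */
        s->queued = 1;
        s->newest_frame = frame_id;
        return true;
    }

    if (s->queued >= s->fifo_length)
        return false;  /* FIFO full: the producer would block here */

    s->queued++;
    s->newest_frame = frame_id;
    return true;
}
```

In the model, a second insert into a full one-frame FIFO fails, which corresponds to the blocking eglSwapBuffers() call, while the mailbox simply drops the older frame. That is also why fewer frames end up buffered in mailbox mode.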
Much nicer. I think I would drop the g_warning, since it's going to show up in normal operation, right? I imagine that since we're buffering fewer frames this may have some performance implications, but I guess it just brings us on par with the other drivers anyway? Let's do it.

I just want to add that not messing with the frame clock seems like a very good thing from a "less change is less risk" perspective.

Created attachment 1520050 [details]
eglstream-mailbox-mode.patch (v2)
New version of the EGLStream mailbox patch. The only changes since the last version are warning only on non-"busy" flip errors and swapping the order of the commits. As for risk vs performance, I concur that the decreased risk of far less invasive changes is worth more than the (so far unnoticeable) risk of worse performance.
Verified in mutter-3.28.3-17.el8.
kernel-4.18.0-67.el8
NVidia 410.93
Created attachment 1512175 [details]
session log

Description of problem:
Running Wayland with nvidia binary drivers, the screen is not coming back from dpms after idle or lock screen.

Version-Release number of selected component (if applicable):
mutter-3.28.3-5.el8
kernel-4.18.0-48.el8
nvidia drivers 410.78

How reproducible:
100%

Steps to Reproduce:
1. Disable the gdm udev rule for nvidia
2. enable nvidia modeset
3. Login to session
4. Lock screen
5. Wake up the screen

Actual results:
The screen remains blank

Expected results:
The screen turns on and shows lock screen

Additional info: