Bug 1889474

Summary: External display laggy when using xorg, nouveau, hybrid mode with lid closed
Product: [Fedora] Fedora
Reporter: Mark Pearson <mpearson>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 33
CC: acaringi, airlied, bskeggs, csoriano, ghalat, hdegoede, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, mjg59, steved
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2021-05-03 13:44:51 UTC
Type: Bug
Bug Blocks: 1816645, 1816768    

Description Mark Pearson 2020-10-19 18:09:11 UTC
1. Please describe the problem:
I know this is a weird combo of settings, but I've hit it and now a customer has too so I was hoping the RH engineers might have some insight.

Seen on systems with an Nvidia graphics card (I've reproduced it on a P1G2, and a customer using Fedora has seen it on a P15g).

If you use xorg instead of wayland (which I have to use for screen sharing and running MS Teams), then with the lid closed the external display becomes really laggy. It doesn't seem to affect systems without an Nvidia card (at least from my limited testing).

In my case booting in discrete mode solved the problem.

I couldn't spot anything obviously wrong in the kernel message logs. Let me know if there is anything else I should/could dig into


2. What is the Version-Release number of the kernel: 5.8.12


3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
Not sure, I'm afraid. I'll see if I can reproduce it on a different machine. The one I'm seeing it on is my daily driver, so I don't want to mess around with it too much now that I have it working.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
Yes. 
Boot in xorg, hybrid mode, and close the lid.
Note - I used gnome hacks to disable suspend on lid close (which I don't think should happen anyway when an external display is connected and powered on, but that's a different issue).
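
When reproducing, it can help to confirm that the kernel actually sees the lid as closed. A minimal sketch, assuming the ACPI lid state is exposed under /proc/acpi/button/lid/ (the directory name varies by machine); parse_lid_state is a hypothetical helper name, not something from this report:

```shell
# Print "open" or "closed" from the kernel's ACPI lid-state text on stdin.
# Assumed input layout: a single line such as "state:      closed".
parse_lid_state() {
    awk '$1 == "state:" { print $2 }'
}
```

Possible usage, with the lid closed and the external display active: ``cat /proc/acpi/button/lid/*/state | parse_lid_state``.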


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
Not tried yet

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
I don't have these available right now, I will reproduce on another system and collect.

Comment 1 Mark Pearson 2020-11-18 16:33:57 UTC
As a note, I reproduced this on the P15 with the latest Fedora 33 in wayland, in hybrid mode. There I have only seen it with the nvidia driver, but the nouveau driver keeps crashing and doesn't work with an external monitor on my unit, so I'm not sure how significant that is.

- No obvious messages in the kernel log when switching between external-only display and internal or shared.
- When external display is enabled (either external only or shared) the CPU utilisation for xorg goes up to ~40%. But the lagginess is seen only when in external only mode
- nothing controversial or useful in xrandr between modes
- I haven't confirmed with rawhide as the nvidia driver is a pain to install from there. But the issue is present in 5.9.8 and, importantly, now in wayland, where I hadn't seen it previously. That means it's much more likely to be encountered by customers and is a bit less of a weird combo.
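
The CPU and xrandr observations above could be captured in a repeatable way along these lines. This is only a sketch: sum_xorg_cpu and snapshot_xrandr are hypothetical helper names, and the process names "Xorg" and "X" are assumptions about how the server shows up in ps.

```shell
# Sum the %CPU column for every process whose command name is Xorg or X.
# Reads "comm pcpu" pairs on stdin, e.g. from: ps -eo comm=,pcpu=
sum_xorg_cpu() {
    awk '$1 == "Xorg" || $1 == "X" { total += $2 }
         END { printf "%.1f\n", total + 0 }'
}

# Save verbose xrandr state under a label so two display modes can be
# compared later. Writes xrandr-LABEL.txt in the current directory.
snapshot_xrandr() {
    xrandr --verbose > "xrandr-$1.txt"
}
```

Possible usage: run ``ps -eo comm=,pcpu= | sum_xorg_cpu`` in each display mode, and ``snapshot_xrandr shared`` then ``snapshot_xrandr external`` followed by ``diff xrandr-shared.txt xrandr-external.txt``. Note that ps reports CPU time averaged over the lifetime of the process, so it can understate a recent spike compared with what top shows.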

Let me know if there is anything useful I can collect as data

Mark

Comment 2 Mark Pearson 2020-11-18 17:56:05 UTC
I reinstalled F33 and confirmed I don't see the issue with Nouveau, with the caveat that I can only enable external-only mode with HDMI attached - it crashes with USB-C into a Thunderbolt port.

This seems to be Nvidia driver related at this point. I should really go and retest with my P1G2 but it's my daily machine so messing around with it is a bit of a pain :)

Mark

Comment 3 Carlos Soriano 2020-12-08 12:23:02 UTC
As we mentioned on the mail thread, we think it's due to https://gitlab.freedesktop.org/xorg/xserver/-/issues/948, which is a limitation of the Present extension in xorg-server for direct-rendering apps when using PRIME.
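
To check whether a given Xorg session has more than one GPU provider, which is the setup where PRIME comes into play, something like the following might help. It is a sketch only: list_providers is a hypothetical helper, and the "Provider N: ... name:NAME" line layout it matches is an assumption about typical ``xrandr --listproviders`` output that may differ between versions.

```shell
# Print "index name" for each RandR provider found in the output of
# `xrandr --listproviders`, read from stdin.
# Assumed line layout: "Provider 0: id: ... name:NVIDIA-0"
list_providers() {
    sed -n 's/^Provider \([0-9][0-9]*\):.*name:\(.*\)$/\1 \2/p'
}
```

Possible usage: ``xrandr --listproviders | list_providers``. Two or more lines of output would suggest a hybrid setup; a single line would suggest only one GPU is exposed to the X server.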

There is a WIP MR to try to address this at https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/460, which isn't really ready. However, I would suggest being cautious: with Xorg being a frozen project upstream, I generally don't expect a new feature such as this one to be fully implemented and merged, or a new xorg-server release to be made to ship it.

Comment 4 Carlos Soriano 2021-03-08 11:13:32 UTC
*** Bug 1936009 has been marked as a duplicate of this bug. ***

Comment 5 Mark Pearson 2021-05-03 13:44:51 UTC
I think I'm good to close this one - the world is moving on....