Bug 1477182

Summary: Lenovo P50: DP does not connect unless the connection is present at boot time
Product: [Fedora] Fedora Reporter: Florian Engel <florian.engel>
Component: xorg-x11-drv-nouveau Assignee: Lyude <lyude>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 27
CC: airlied, ajax, brant.evans, bskeggs, jhutar, jkysela, jsarnovsky, jscarbor, kherbst, lmeyer, lyude, phiporiphic, sgehwolf, spamthemoe, vkadlcik
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2018-11-08 00:41:53 UTC Type: Bug
Attachments:
Description | Flags
xrandr after I have docked for a first time | none
xrandr after I have undocked and docked again | none
dmesg with drm.debug when docking didn't work | none
dmesg with drm.debug when docking worked. | none

Description Florian Engel 2017-08-01 12:22:20 UTC
Description of problem:
I have a Lenovo P50 with an external monitor connected via DisplayPort (DP). The system boots and drives the monitor successfully. Once the system has booted, I disconnect and reconnect the DP cable (undocking and redocking produces the same result). The monitor is no longer recognized. On rare occasions (~1 out of 25 tries), it works. Rebooting with the monitor connected solves the problem. The chances of a successful connection seem to be better on the first try after a reboot.

Everything works as expected if the connection is made through the HDMI port of the laptop itself. The HDMI port on the dock uses DP-MST for its connection and does not work. I assume the problem only concerns DP.

Version-Release number of selected component (if applicable):
uname -a: 
Linux gerty 4.11.11-300.fc26.x86_64 #1 SMP Mon Jul 17 16:32:11 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

dnf info xorg-x11-drv-nouveau: 
Name         : xorg-x11-drv-nouveau
Epoch        : 1
Version      : 1.0.15
Release      : 1.fc26
Arch         : x86_64
Size         : 219 k
Source       : xorg-x11-drv-nouveau-1.0.15-1.fc26.src.rpm
Repo         : @System
From repo    : fedora
Summary      : Xorg X11 nouveau video driver for NVIDIA graphics chipsets
URL          : http://www.x.org
License      : MIT
Description  : X.Org X11 nouveau video driver.


Steps to Reproduce:
1. Boot up P50 with monitor connected through DP
2. Disconnect DP cable
3. Reconnect DP cable

Actual results:
Monitor is no longer recognized (not connected in xrandr/Gnome Display Settings)

Expected results:
Monitor is recognized.

Additional info:
dmesg output for disconnect:
[   59.718284] nouveau 0000:01:00.0: DRM: suspending console...
[   59.718289] nouveau 0000:01:00.0: DRM: suspending display...
[   59.718323] nouveau 0000:01:00.0: DRM: evicting buffers...
[   59.800927] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
[   59.800949] nouveau 0000:01:00.0: DRM: suspending fence...
[   59.803672] nouveau 0000:01:00.0: DRM: suspending object tree...

dmesg output for successful connect:
[   78.254497] nouveau 0000:01:00.0: DRM: resuming object tree...
[   78.399369] nouveau 0000:01:00.0: priv: HUB0: 614900 00800000 (1d408200)
[   78.569793] nouveau 0000:01:00.0: DRM: resuming fence...
[   78.569817] nouveau 0000:01:00.0: DRM: resuming display...
[   78.569914] nouveau 0000:01:00.0: DRM: resuming console...
[   87.212999] nouveau 0000:01:00.0: disp: 0x64a8[0]: INIT_GENERIC_CONDITON: unknown 0x07

dmesg output for unsuccessful connect:
[  378.239501] nouveau 0000:01:00.0: DRM: resuming object tree...
[  378.384399] nouveau 0000:01:00.0: priv: HUB0: 614900 00800000 (1e408200)
[  378.554878] nouveau 0000:01:00.0: DRM: resuming fence...
[  378.554903] nouveau 0000:01:00.0: DRM: resuming display...
[  378.556371] nouveau 0000:01:00.0: DRM: resuming console...
[  383.814940] nouveau 0000:01:00.0: DRM: suspending console...
[  383.814944] nouveau 0000:01:00.0: DRM: suspending display...
[  383.815423] nouveau 0000:01:00.0: DRM: evicting buffers...
[  383.815426] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
[  383.815456] nouveau 0000:01:00.0: DRM: suspending fence...
[  383.818837] nouveau 0000:01:00.0: DRM: suspending object tree...
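
The difference between the two connect excerpts above is that, in the failing case, nouveau starts suspending again about five seconds after it finished resuming. A minimal Python sketch for spotting that pattern in a saved dmesg follows. The function names and the 10-second window are assumptions made for illustration, not part of nouveau or any existing tool, and the heuristic only reflects the excerpts in this report.

```python
import re

# Matches lines such as:
# [  378.554903] nouveau 0000:01:00.0: DRM: resuming display...
EVENT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+nouveau\s+\S+\s+DRM:\s+(suspending|resuming)\s+(.+?)\.\.\."
)

def parse_events(dmesg_text):
    """Return (timestamp, action, stage) tuples for nouveau DRM suspend/resume lines."""
    events = []
    for line in dmesg_text.splitlines():
        m = EVENT_RE.search(line)
        if m:
            events.append((float(m.group(1)), m.group(2), m.group(3)))
    return events

def resuspended_after_resume(events, window=10.0):
    """Heuristic (an assumption): True if a suspend step begins within
    `window` seconds of the most recent resume step."""
    last_resume = None
    for ts, action, _stage in events:
        if action == "resuming":
            last_resume = ts
        elif last_resume is not None and ts - last_resume <= window:
            return True
    return False

# Two lines taken from the unsuccessful-connect excerpt in this report.
failing = """\
[  378.556371] nouveau 0000:01:00.0: DRM: resuming console...
[  383.814940] nouveau 0000:01:00.0: DRM: suspending console...
"""
print(resuspended_after_resume(parse_events(failing)))  # True
```

A quick re-suspend is only a hint that the GPU was put back to sleep without a display being picked up; it does not by itself identify the cause.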

Comment 1 Moritz Halbritter 2017-08-07 08:47:59 UTC
I have exactly the same problem on the same hardware.

Comment 2 Jim Scarborough 2018-02-26 16:44:07 UTC
See also https://bugzilla.redhat.com/show_bug.cgi?id=1527669 , P50 not booting when docked. 

I have seen the docking station monitor recognized *sometimes* after docking while the P50 was awake.  I have yet to identify a specific pattern of when it might or might not happen.

Sometimes the internal laptop display is not recognized.  I work around that by suspending and waking the system, connecting another display to the mini DVI port, or rebooting if that fails.

Comment 3 Václav Kadlčík 2018-02-27 06:19:36 UTC
FWIW, my P50 connects external DPs (both on-the-box and on-the-dock)
with a *much* higher success rate; haven't measured but I'd say above
80%. My setup:

 * the discrete card only (set in BIOS/UEFI)
 * the notebook is always awake
 * X11
 * xorg-x11-drv-nouveau-1.0.15-1.fc26
 * kernel-4.16.0-*.fc29.x86_64 (from fedora-rawhide-kernel-nodebug,
   in hope to get rid of bz1527669 and bz1511786 but no luck yet)

If a connection fails, re-attaching usually helps. IIRC, I haven't
needed a reboot so far.

Comment 4 Severin Gehwolf 2018-04-05 15:40:46 UTC
Same problem as reported in comment 0 here. Though, since I also run into bz1527669, connecting at boot time is somewhat problematic. I have:

xorg-x11-drv-nouveau-1.0.15-3.fc27.x86_64

All of these kernels seem to have the problem:

kernel-core-4.15.8-300.fc27.x86_64
kernel-core-4.15.10-300.fc27.x86_64
kernel-core-4.15.4-300.fc27.x86_64

Comment 5 Fedora End Of Life 2018-05-03 08:28:35 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 6 Severin Gehwolf 2018-05-03 08:38:55 UTC
Changing to F27 since I've got this exact problem there too.

Comment 7 Jan Hutař 2018-05-22 20:08:48 UTC
I have exactly the same problem with a Lenovo P50 (Optimus/Prime issue? Integrated Intel graphics card, discrete NVidia) on Fedora 28 (using XFce):

kernel-4.16.8-300.fc28.x86_64
xorg-x11-drv-nouveau-1.0.15-4.fc28.x86_64
xorg-x11-drv-intel-2.99.917-32.20171025.fc28.x86_64

I upgraded from F27 to F28 some time ago; in F27, about 2 months ago, it worked quite reliably for me.

Comment 8 Jan Hutař 2018-06-05 07:31:25 UTC
Hello. Is there a way for an ordinary guy to help with this? Maybe some options somewhere on how to get more info?

Comment 9 Lyude 2018-06-11 16:56:22 UTC
Yes---the first thing you could try is pretty simple. Boot up into gnome-shell with the monitor disconnected, then try connecting the display and opening up gnome-shell control panel and going to the display configuration section to see if that causes the display to get noticed.

If that doesn't work, then could you boot up with the following parameters added to your kernel cmdline?

drm.debug=0x16 log_buf_len=100M

Then boot up into gnome-shell with the monitor disconnected and try connecting the display. Once you've done that, run

echo 0x0 > /sys/module/drm/parameters/debug

Then get me the /full/ copy of your dmesg from that boot so I can see what nouveau's doing wrong (or not doing at all!)
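
For readers wondering what a drm.debug value such as 0x16 actually turns on: it is a bitmask of DRM debug categories. The Python sketch below decodes a mask using the bit assignments as I understand them from the kernel's include/drm/drm_print.h. Treat the table as an assumption and check it against your kernel version, since categories have been added over time (the DP bit in particular is newer). The helper name decode_drm_debug is made up for this example.

```python
# DRM debug category bits, per my reading of include/drm/drm_print.h.
# Assumption: verify against the kernel you are running.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
    0x100: "DP",
}

def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by `mask`."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]

print(decode_drm_debug(0x16))  # ['DRIVER', 'KMS', 'ATOMIC']
```

Writing 0x0 to /sys/module/drm/parameters/debug, as in the command above, clears every category again so that the log stops growing before dmesg is captured.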

Comment 10 Jan Hutař 2018-06-12 08:51:16 UTC
I think that the title of the bug does not match what was written in the initial comment. Because I'm on XFce (I'm on ThinkPad P50 with integrated Intel and discrete NVidia), I have used these steps (attempted to correlate with what you have in comment #9):

Comment 15 Jan Hutař 2018-06-12 09:07:49 UTC
Created attachment 1450366 [details]
xrandr after I have docked for a first time

Comment 19 Jan Hutař 2018-06-12 09:11:40 UTC
Created attachment 1450370 [details]
xrandr after I have undocked and docked again

Comment 23 Jan Hutař 2018-06-12 09:33:28 UTC
I assume most of the attached logs are useless to you, but I hope there is at least something usable.

Comment 24 Lyude 2018-06-12 22:46:27 UTC
Oh interesting... this doesn't look like the culprit is what I thought it would be. One last thing to check: can you see if this problem still happens if you're booted with nouveau.runpm=0?

Comment 25 Jan Hutař 2018-06-13 06:23:50 UTC
I do not have access to the docking station today; I will report back a bit later.

Comment 26 Severin Gehwolf 2018-06-13 07:25:24 UTC
Created attachment 1450809 [details]
dmesg with drm.debug when docking didn't work

Here is a reference dmesg from a boot today. Booted undocked, then docked => monitors didn't come up.

Comment 27 Severin Gehwolf 2018-06-13 07:27:40 UTC
Created attachment 1450810 [details]
dmesg with drm.debug when docking worked.

Here is a reference dmesg from a boot today. Booted undocked, then docked => monitors *did* come up.

Comment 28 Severin Gehwolf 2018-06-13 07:28:52 UTC
FWIW, I'm on Gnome 3 with X11 (not wayland) on F28.

Comment 29 Severin Gehwolf 2018-06-14 12:24:22 UTC
I've used nouveau.runpm=0 today and it worked, but I'm not sure it worked because of that switch or for some other random reason (it used to work/not-work randomly before). I'll add this to my default command line and will report back in a week or so should this fix the issue for me.

Comment 30 Jan Hutař 2018-06-15 08:55:52 UTC
I have tried with the latest kernel, 4.16.15-300.fc28.x86_64, with "nouveau.runpm=0" and have not seen any improvement.

Comment 31 Jan Hutař 2018-06-20 08:02:22 UTC
Having kernel 4.16.15-300.fc28.x86_64 and (I believe) default kernel command-line (i.e. without above recommended "nouveau.runpm=0" although it might work as well):

$ cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-4.16.15-300.fc28.x86_64 root=/dev/mapper/luks-8d982066-60f4-4af5-b32f-883e1919d201 ro rd.lvm.lv=fedora/root rd.luks.uuid=luks-8d982066-60f4-4af5-b32f-883e1919d201 rd.lvm.lv=fedora/swap rd.luks.uuid=luks-58555151-1aeb-4500-a648-4ad9e59dcebf rhgb quiet LANG=en_US.UTF-8

I'm on a ThinkPad P50 (integrated Intel card + discrete NVidia with default nouveau drivers), using xorg and XFce.

This works for me:

1. external display is turned off (connected to docking station via display port)
2. resume laptop from suspend and dock to docking station
3. turn on the external display

It looks like it also works without turning off the external display, but I have not tried that much. As far as I am concerned, this bug is fixed by the mentioned kernel.

Comment 32 Lyude 2018-06-20 16:48:39 UTC
Just a heads up, I haven't forgotten about this bug: I'm currently trying to get access to a P50 so I can take a closer look at this! I will set NEEDINFO on myself for the time being so bugzilla doesn't let me forget about this

Comment 33 Severin Gehwolf 2018-06-21 07:24:09 UTC
(In reply to Severin Gehwolf from comment #29)
> I've used nouveau.runpm=0 today and it worked, but I'm not sure it worked
> because of that switch or for some other random reason (it used to
> work/not-work randomly before). I'll add this to my default command line and
> will report back in a week or so should this fix the issue for me.

FWIW, today I again saw the issue of booting undocked, then docking, and the monitors not being recognized. It still seems to be an issue. Here is my kernel command line.

$ cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-4.16.15-300.fc28.x86_64 root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet nouveau.runpm=0 LANG=en_CA.UTF-8
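
When comparing reports like the ones in this thread, it helps to confirm that a test parameter such as nouveau.runpm=0 really was on the command line of the boot being described. A small Python sketch under simple assumptions: tokens are split on whitespace, quoted values containing spaces are not handled, and a repeated key keeps only its last value. The helper name parse_cmdline is hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {parameter: value} dict.

    Flags without '=' (for example 'quiet') map to None. Assumption:
    values contain no quoted spaces; repeated keys keep the last value.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

# The command line quoted in the comment above.
cmdline = (
    "BOOT_IMAGE=/vmlinuz-4.16.15-300.fc28.x86_64 root=/dev/mapper/fedora-root ro "
    "resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap "
    "rhgb quiet nouveau.runpm=0 LANG=en_CA.UTF-8"
)
params = parse_cmdline(cmdline)
print(params.get("nouveau.runpm"))  # 0
print("drm.debug" in params)        # False
```

On a live system the same function can be fed the contents of /proc/cmdline, for example `parse_cmdline(open("/proc/cmdline").read())`.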

Comment 34 Jan Hutař 2018-06-21 08:36:11 UTC
(In reply to Severin Gehwolf from comment #33)
> (In reply to Severin Gehwolf from comment #29)
> > I've used nouveau.runpm=0 today and it worked, but I'm not sure it worked
> > because of that switch or for some other random reason (it used to
> > work/not-work randomly before). I'll add this to my default command line and
> > will report back in a week or so should this fix the issue for me.
> 
> FWIW, I've seen the issue of booting undocked, then dock and the monitors
> not being recognized again today. Seems still an issue. Here are is my
> kernel command line.
> 
> $ cat /proc/cmdline 
> BOOT_IMAGE=/vmlinuz-4.16.15-300.fc28.x86_64 root=/dev/mapper/fedora-root ro
> resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap
> rhgb quiet nouveau.runpm=0 LANG=en_CA.UTF-8

Does the sequence turn off external monitor -> undock -> suspend -> resume -> dock -> turn on external display work for you?

Comment 35 Severin Gehwolf 2018-06-21 08:57:19 UTC
(In reply to Jan Hutař from comment #34)
> Does the turn off external monitor -> undock -> suspend -> resume -> dock ->
> turn on external display works for you?

I haven't tried, sorry. Seems too convoluted. I've rebooted in the dock, which seems to work better now.

Comment 36 Lyude 2018-06-29 17:54:59 UTC
Finally got my hands on a P50! Going to start taking a look at this asap, sorry for the wait

Comment 37 Lyude 2018-07-17 15:39:50 UTC
Hey, just giving an update here since you guys have been waiting for a while! I've managed to find and fix quite a number of bugs on the P50 with nouveau so far, and I've got a couple more to go. I've also managed to reproduce the display detection issue, although I don't have any fixes for it just yet

Current wip patchsets:

https://patchwork.freedesktop.org/series/45862/
https://patchwork.freedesktop.org/series/46498/
https://patchwork.freedesktop.org/series/46637/

I will keep you all updated and come up with a test RPM when I think I have this working!

Comment 38 Jan Hutař 2018-07-17 16:00:28 UTC
Thank you!

Comment 39 Lyude 2018-08-09 21:23:54 UTC
Good news everyone! Sorry that took so long; the number of bugs I've had to fix for this to work, and the complexity of some of them, has been a nightmare! But I think I actually have a fix for this now that mostly works, with some exceptions:

- docking during suspend/resume might still not get picked up by nouveau just yet, hoping to fix this in DRM's core
- If you are hitting a weird bug where disp init randomly fails on the P50, this won't fix that. I'm still investigating what's going on there

Anyway---I will update this sometime today when I have a scratch build going in koji, as much testing as possible from you guys would be appreciated when that happens!

Comment 40 Lyude 2018-08-09 23:57:29 UTC
Hooray! Scratch build ready for F28 x86_64:

https://koji.fedoraproject.org/koji/taskinfo?taskID=28952695

Scratch build running for F27:

https://koji.fedoraproject.org/koji/taskinfo?taskID=28952713

If anyone still hits any issues with display detection bugs, I've also backported DP aux tracing to these scratch-builds. So, if your display still isn't getting detected:

Add drm.debug=0x106 to your kernel cmdline
Get the system into a state where your displays aren't being detected, then get me your full dmesg from that boot by running sosreport.

Cheers!

Comment 41 Jan Hutař 2018-08-15 08:25:41 UTC
So I booted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 and did a few dockings and undockings (no suspends, no turning off the external monitor) and it worked reliably. Only once (while watching `journalctl -f`) did I notice:

Aug 15 10:20:16 localhost.localdomain kernel: usb 1-4: USB disconnect, device number 14
Aug 15 10:20:16 localhost.localdomain kernel: usb 1-4.1: USB disconnect, device number 15
Aug 15 10:20:16 localhost.localdomain kernel: WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
Aug 15 10:20:16 localhost.localdomain kernel: Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 fuse tun devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bnep sunrpc xfs libcrc32c arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp iwlmvm snd_hda_codec_realtek mei_wdt snd_hda_codec_generic coretemp iTCO_wdt iTCO_vendor_support kvm_intel mac80211 kvm snd_hda_intel snd_hda_codec irqbypass snd_hda_core intel_cstate snd_hwdep iwlwifi intel_uncore snd_seq snd_seq_device uvcvideo
Aug 15 10:20:16 localhost.localdomain kernel:  intel_rapl_perf btusb snd_pcm videobuf2_vmalloc btrtl videobuf2_memops videobuf2_v4l2 btbcm btintel videobuf2_common cfg80211 bluetooth videodev media joydev intel_wmi_thunderbolt thinkpad_acpi wmi_bmof snd_timer ecdh_generic mei_me i2c_i801 rtsx_pci_ms snd memstick mei soundcore rfkill intel_pch_thermal shpchp pcc_cpufreq binfmt_misc dm_crypt nouveau i915 mxm_wmi rtsx_pci_sdmmc ttm mmc_core i2c_algo_bit crct10dif_pclmul drm_kms_helper crc32_pclmul crc32c_intel nvme drm e1000e ghash_clmulni_intel nvme_core serio_raw rtsx_pci wmi video
Aug 15 10:20:16 localhost.localdomain kernel: CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
Aug 15 10:20:16 localhost.localdomain kernel: Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
Aug 15 10:20:16 localhost.localdomain kernel: Workqueue: events nouveau_display_hpd_work [nouveau]
Aug 15 10:20:16 localhost.localdomain kernel: RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
Aug 15 10:20:16 localhost.localdomain kernel: RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
Aug 15 10:20:16 localhost.localdomain kernel: RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
Aug 15 10:20:16 localhost.localdomain kernel: RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
Aug 15 10:20:16 localhost.localdomain kernel: RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
Aug 15 10:20:16 localhost.localdomain kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
Aug 15 10:20:16 localhost.localdomain kernel: R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
Aug 15 10:20:16 localhost.localdomain kernel: FS:  0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
Aug 15 10:20:16 localhost.localdomain kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 10:20:16 localhost.localdomain kernel: CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
Aug 15 10:20:16 localhost.localdomain kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 15 10:20:16 localhost.localdomain kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 15 10:20:16 localhost.localdomain kernel: Call Trace:
Aug 15 10:20:16 localhost.localdomain kernel:  ? _cond_resched+0x15/0x30
Aug 15 10:20:16 localhost.localdomain kernel:  nouveau_connector_detect+0x2ce/0x520 [nouveau]
Aug 15 10:20:16 localhost.localdomain kernel:  ? _cond_resched+0x15/0x30
Aug 15 10:20:16 localhost.localdomain kernel:  ? ww_mutex_lock+0x12/0x40
Aug 15 10:20:16 localhost.localdomain kernel:  drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
Aug 15 10:20:16 localhost.localdomain kernel:  drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
Aug 15 10:20:16 localhost.localdomain kernel:  nouveau_display_hpd_work+0x2a/0x60 [nouveau]
Aug 15 10:20:16 localhost.localdomain kernel:  process_one_work+0x187/0x340
Aug 15 10:20:16 localhost.localdomain kernel:  worker_thread+0x2e/0x380
Aug 15 10:20:16 localhost.localdomain kernel:  ? pwq_unbound_release_workfn+0xd0/0xd0
Aug 15 10:20:16 localhost.localdomain kernel:  kthread+0x112/0x130
Aug 15 10:20:16 localhost.localdomain kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
Aug 15 10:20:16 localhost.localdomain kernel:  ret_from_fork+0x35/0x40
Aug 15 10:20:16 localhost.localdomain kernel: Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff 
Aug 15 10:20:16 localhost.localdomain kernel: ---[ end trace 55d811b38fc8e71a ]---

But that does not seem to have any negative effect.
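
When several people paste traces like the one above, the quickest way to tell whether they hit the same warning is to compare the source location and function on the WARNING line. A minimal Python sketch that extracts those fields; the regular expression is written against the line format seen in this report and is an assumption about the layout, not a general kernel log parser. The name parse_warning is made up for this example.

```python
import re

# Written against lines like:
# WARNING: CPU: 0 PID: 838 at drivers/.../i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)

def parse_warning(log_line):
    """Return a dict describing a kernel WARNING line, or None if it does not match."""
    m = WARNING_RE.search(log_line)
    return m.groupdict() if m else None

line = (
    "Aug 15 10:20:16 localhost.localdomain kernel: WARNING: CPU: 0 PID: 838 at "
    "drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 "
    "nouveau_dp_detect+0x17e/0x370 [nouveau]"
)
info = parse_warning(line)
print(info["file"], info["line"], info["func"])
```

Two reports with the same file, line, and function most likely hit the same check, even when the PID, CPU, and register contents differ.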

Comment 42 Lyude 2018-08-15 18:13:01 UTC
(In reply to Jan Hutař from comment #41)
> So I have booted to 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 and did few
> docking and undockings (no suspends, no monitor external turning off) and it
> worked reliably. Only once (while watching `journalctl -f`), I have noticed:
> 
> But that does not seem to have any negative effect.

Have you encountered any issues with the card failing to runtime suspend, mainly one that gives off a backtrace that looks like this?

[    3.851956] ------------[ cut here ]------------
[    3.851958] nouveau 0000:01:00.0: timeout
[    3.851995] WARNING: CPU: 0 PID: 62 at drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c:1560 gf100_grctx_generate+0x89d/0x8b0 [nouveau]
[    3.851997] Modules linked in: serio_raw crc32c_intel xhci_pci i915(O+) xhci_hcd nouveau(O) video mxm_wmi wmi i2c_algo_bit drm_kms_helper(O) syscopyarea sysfillrect sysimgblt fb_sys_fops ttm(O) drm(O) i2c_core
[    3.852010] CPU: 0 PID: 62 Comm: kworker/0:2 Tainted: G           O      4.18.0-rc8Lyude-Test+ #7
[    3.852011] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[    3.852018] Workqueue: events output_poll_execute [drm_kms_helper]
[    3.852105] RIP: 0010:gf100_grctx_generate+0x89d/0x8b0 [nouveau]
[    3.852107] Code: ff 49 8b 7c 24 10 48 8b 5f 50 48 85 db 75 04 48 8b 5f 10 e8 25 5d 30 e1 48 89 da 48 c7 c7 4e e7 2a a0 48 89 c6 e8 65 c1 e9 e0 <0f> 0b bb f0 ff ff ff e9 68 f9 ff ff 0f 1f 80 00 00 00 00 0f 1f 44 
[    3.852127] RSP: 0018:ffffc9000027b898 EFLAGS: 00010282
[    3.852128] RAX: 0000000000000000 RBX: ffff880876c20bd0 RCX: 0000000000000006
[    3.852130] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff88089b415570
[    3.852132] RBP: ffffc9000027b958 R08: 0000000000000000 R09: 0000000000000000
[    3.852133] R10: ffff880876685f00 R11: ffffffff8140cc60 R12: ffff8808716d2000
[    3.852135] R13: ffffc9000027b8d0 R14: ffffc9000027b8c8 R15: ffff88087165c000
[    3.852137] FS:  0000000000000000(0000) GS:ffff88089b400000(0000) knlGS:0000000000000000
[    3.852139] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.852140] CR2: 00005621d2c180b8 CR3: 000000000200a005 CR4: 00000000003606f0
[    3.852142] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    3.852144] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    3.852145] Call Trace:
[    3.852168]  ? nv04_timer_read+0x48/0x60 [nouveau]
[    3.852191]  gf100_gr_init_ctxctl+0x536/0xa40 [nouveau]
[    3.852212]  gf100_gr_init+0x563/0x590 [nouveau]
[    3.852234]  gf100_gr_init_+0x5b/0x60 [nouveau]
[    3.852255]  nvkm_gr_init+0x1d/0x20 [nouveau]
[    3.852267]  nvkm_engine_init+0xb9/0x1f0 [nouveau]
[    3.852280]  nvkm_subdev_init+0xbc/0x210 [nouveau]
[    3.852292]  nvkm_engine_ref.part.0+0x4a/0x70 [nouveau]
[    3.852304]  nvkm_engine_ref+0x13/0x20 [nouveau]
[    3.852316]  nvkm_ioctl_new+0x12c/0x260 [nouveau]
[    3.852337]  ? nvkm_fifo_chan_dtor+0x100/0x100 [nouveau]
[    3.852358]  ? gf100_fermi_mthd+0x100/0x100 [nouveau]
[    3.852371]  nvkm_ioctl+0xe2/0x180 [nouveau]
[    3.852392]  nvkm_client_ioctl+0x12/0x20 [nouveau]
[    3.852403]  nvif_object_ioctl+0x47/0x50 [nouveau]
[    3.852415]  nvif_object_init+0xc8/0x120 [nouveau]
[    3.852435]  nvc0_fbcon_accel_init+0x5b/0x950 [nouveau]
[    3.852455]  nouveau_fbcon_create+0x5bb/0x5e0 [nouveau]
[    3.852460]  ? drm_setup_crtcs+0x247/0xa60 [drm_kms_helper]
[    3.852464]  __drm_fb_helper_initial_config_and_unlock+0x1c0/0x410 [drm_kms_helper]
[    3.852468]  drm_fb_helper_hotplug_event.part.33+0xa9/0xb0 [drm_kms_helper]
[    3.852472]  drm_fb_helper_hotplug_event+0x1c/0x30 [drm_kms_helper]
[    3.852492]  nouveau_fbcon_output_poll_changed+0xb6/0x110 [nouveau]
[    3.852496]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[    3.852500]  output_poll_execute+0x198/0x1c0 [drm_kms_helper]
[    3.852504]  process_one_work+0x1b2/0x370
[    3.852506]  worker_thread+0x37/0x3a0
[    3.852508]  kthread+0x120/0x140
[    3.852510]  ? wq_update_unbound_numa+0x10/0x10
[    3.852511]  ? kthread_create_worker_on_cpu+0x70/0x70
[    3.852514]  ret_from_fork+0x35/0x40
[    3.852516] ---[ end trace 583fe2d8feb59e4a ]---
[    3.852733] nouveau 0000:01:00.0: gr: failed to construct context
[    3.852737] nouveau 0000:01:00.0: gr: init failed, -16
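
To compare a backtrace like the one above with one from your own machine, it is usually enough to line up the function names in the call trace. The Python sketch below pulls them out of a saved log and, by default, skips frames marked with `?`, which the kernel's unwinder reports as unreliable. The regular expression follows the line format seen in this report; the name call_trace is an assumption for illustration.

```python
import re

# A frame looks like "  ? nv04_timer_read+0x48/0x60 [nouveau]" after the
# timestamp or journal prefix; the leading "? " marks an unreliable frame.
FRAME_RE = re.compile(
    r"(?P<q>\?\s+)?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)

def call_trace(log_text, include_unreliable=False):
    """Return (function, module) pairs from the first 'Call Trace:' in log_text."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        if "end trace" in line:
            break
        m = FRAME_RE.search(line)
        if m and (include_unreliable or not m.group("q")):
            frames.append((m.group("func"), m.group("module")))
    return frames

# A few lines copied from the trace above.
sample = """\
[    3.852145] Call Trace:
[    3.852168]  ? nv04_timer_read+0x48/0x60 [nouveau]
[    3.852191]  gf100_gr_init_ctxctl+0x536/0xa40 [nouveau]
[    3.852504]  process_one_work+0x1b2/0x370
[    3.852516] ---[ end trace 583fe2d8feb59e4a ]---
"""
for func, module in call_trace(sample):
    print(func, module)
```

Frames without a bracketed module name, such as process_one_work, belong to the core kernel image, so their module is reported as None.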

Comment 43 Jan Hutař 2018-08-16 07:07:33 UTC
When I left work, I disconnected, suspended, and then resumed at home, and the ABRT plugin notified me about this:

Aug 15 17:56:57 localhost.localdomain systemd-logind[1233]: Lid opened.
Aug 15 17:57:01 localhost.localdomain kernel: usb 1-1: USB disconnect, device number 30
Aug 15 17:57:01 localhost.localdomain upowerd[2396]: unhandled action 'unbind' on /sys/devices/pci0000:00/0000:00:14.0/usb1/1-1
Aug 15 17:57:04 localhost.localdomain kernel: WARNING: CPU: 3 PID: 16826 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nvkm_dp_acquire+0xaf2/0xc90 [nouveau]
Aug 15 17:57:04 localhost.localdomain kernel: usb 1-4: USB disconnect, device number 26
Aug 15 17:57:04 localhost.localdomain kernel: Modules linked in:
Aug 15 17:57:04 localhost.localdomain kernel: usb 1-4.1: USB disconnect, device number 27
Aug 15 17:57:04 localhost.localdomain kernel:  ccm vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 fuse tun devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute b>
Aug 15 17:57:04 localhost.localdomain kernel:  uvcvideo intel_rapl_perf btusb snd_pcm videobuf2_vmalloc btrtl videobuf2_memops videobuf2_v4l2 btbcm btintel videobuf2_common cfg80211 bluetooth videodev media joydev intel_wmi_thunderbolt t>
Aug 15 17:57:04 localhost.localdomain kernel: CPU: 3 PID: 16826 Comm: kworker/3:2 Tainted: G        W         4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
Aug 15 17:57:04 localhost.localdomain kernel: Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
Aug 15 17:57:04 localhost.localdomain kernel: Workqueue: events nvkm_notify_work [nouveau]
Aug 15 17:57:04 localhost.localdomain kernel: RIP: 0010:nvkm_dp_acquire+0xaf2/0xc90 [nouveau]
Aug 15 17:57:04 localhost.localdomain kernel: RSP: 0018:ffffa151492afd60 EFLAGS: 00010293
Aug 15 17:57:04 localhost.localdomain kernel: RAX: 0000000000000000 RBX: ffff8cb4f656f000 RCX: 0000000000000000
Aug 15 17:57:04 localhost.localdomain kernel: RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
Aug 15 17:57:04 localhost.localdomain kernel: RBP: ffff8cb4f41cd180 R08: ffffa151492afe05 R09: ffffa151492afdb0
Aug 15 17:57:04 localhost.localdomain kernel: R10: 0000000000000000 R11: 0000000010624dd3 R12: 0000000000000000
Aug 15 17:57:04 localhost.localdomain kernel: R13: 000000000006cc3c R14: ffff8cb4f4a8a800 R15: ffff8cb4f656f0f0
Aug 15 17:57:04 localhost.localdomain kernel: FS:  0000000000000000(0000) GS:ffff8cb51f4c0000(0000) knlGS:0000000000000000
Aug 15 17:57:04 localhost.localdomain kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 17:57:04 localhost.localdomain kernel: CR2: 00007f6c58005ff8 CR3: 000000073720a004 CR4: 00000000003606e0
Aug 15 17:57:04 localhost.localdomain kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 15 17:57:04 localhost.localdomain kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 15 17:57:04 localhost.localdomain kernel: Call Trace:
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? syscall_return_via_sysret+0x13/0x83
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x40/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ? __switch_to_asm+0x34/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  nvkm_dp_hpd+0xdf/0x150 [nouveau]
Aug 15 17:57:04 localhost.localdomain kernel:  nvkm_notify_work+0x1d/0x80 [nouveau]
Aug 15 17:57:04 localhost.localdomain kernel:  process_one_work+0x187/0x340
Aug 15 17:57:04 localhost.localdomain kernel:  worker_thread+0x2e/0x380
Aug 15 17:57:04 localhost.localdomain kernel:  ? pwq_unbound_release_workfn+0xd0/0xd0
Aug 15 17:57:04 localhost.localdomain kernel:  kthread+0x112/0x130
Aug 15 17:57:04 localhost.localdomain kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
Aug 15 17:57:04 localhost.localdomain kernel:  ret_from_fork+0x35/0x40
Aug 15 17:57:04 localhost.localdomain kernel: Code: 00 00 b9 02 02 00 00 4c 89 f7 ba 09 00 00 00 be 01 00 00 00 e8 50 0f fd ff 41 89 c4 85 c0 0f 85 fe 00 00 00 80 7c 24 50 03 74 02 <0f> 0b 4c 89 f7 e8 24 0d fd ff f6 84 24 a7 00 00 00 01 >
Aug 15 17:57:04 localhost.localdomain kernel: ---[ end trace 55d811b38fc8e71b ]---
Aug 15 17:57:04 localhost.localdomain kernel: nouveau 0000:01:00.0: disp: outp 00:0006:0f42: training failed

Also, it looks like I have never hit that one:

# journalctl | grep output_poll_execute
<no_output>

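To check a longer journal for kernel warnings like the one above, a small script along these lines may help. This is only a sketch: the function name `find_warnings` and the regular expression are illustrative assumptions, not part of any nouveau or systemd tooling, and it assumes the journal text (for example, saved `journalctl -k` output) is passed in as a plain string.

```python
import re

# Matches the kernel warning header line, e.g.
# "WARNING: CPU: 3 PID: 16826 at <file>:<line> <symbol>+0x..."
# (pattern is an assumption based on the log format pasted in this bug)
WARNING_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<file>\S+):(?P<line>\d+) (?P<symbol>\w+)\+0x"
)


def find_warnings(log_text):
    """Return a (file, line, symbol) tuple for each kernel WARNING header found."""
    hits = []
    for row in log_text.splitlines():
        match = WARNING_RE.search(row)
        if match:
            hits.append(
                (match.group("file"), int(match.group("line")), match.group("symbol"))
            )
    return hits


if __name__ == "__main__":
    # Sample lines copied from the journal excerpt in this comment.
    sample = (
        "Aug 15 17:57:04 localhost.localdomain kernel: WARNING: CPU: 3 PID: 16826 at "
        "drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 "
        "nvkm_dp_acquire+0xaf2/0xc90 [nouveau]\n"
        "Aug 15 17:57:04 localhost.localdomain kernel: usb 1-4: USB disconnect, "
        "device number 26\n"
    )
    for source_file, source_line, symbol in find_warnings(sample):
        print(f"{symbol} at {source_file}:{source_line}")
```

Grouping by symbol makes it easier to tell the `nvkm_dp_acquire` warning apart from the `output_poll_execute` trace without grepping for each one by hand.
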
Comment 44 Lyude 2018-08-16 21:16:01 UTC
Alright, I wrote up some patches to fix the suspend/resume warnings that you described. Could you try one of these builds once they finish and tell me if things improve?

https://koji.fedoraproject.org/koji/taskinfo?taskID=29123454 for F28
https://koji.fedoraproject.org/koji/taskinfo?taskID=29123541 for F27

Comment 45 Lyude 2018-08-20 18:16:34 UTC
Hooray! I managed to figure out the fix for the gf100_grctx_generate error as well, so this new version shouldn't run into that issue at all:

https://koji.fedoraproject.org/koji/taskinfo?taskID=29203776 for F28
https://koji.fedoraproject.org/koji/taskinfo?taskID=29203849 for F27

Comment 46 Jan Hutař 2018-08-21 08:24:40 UTC
So, I updated my system today and used the kernel from comment #45:

$ uname -a
Linux localhost.localdomain 4.17.17-300.Lyude.bz1477182.V5.fc28.x86_64 #1 SMP Mon Aug 20 18:09:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

and I was able to nicely dock & undock & suspend. This is awesome! Please ship it! ;-)

Comment 47 Jan Hutař 2018-10-18 06:45:01 UTC
Hello. I'm now using 4.18.12-200.fc28.x86_64, and it looks like the patches are in there. Am I right?

Comment 48 Lyude 2018-11-07 23:06:54 UTC
(In reply to Jan Hutař from comment #47)
> Hello. I'm now using 4.18.12-200.fc28.x86_64, and it looks like the patches
> are in there. Am I right?

Yikes, sorry I forgot to respond to this!

Yeah, they're definitely in the Fedora kernel at this point. Is it OK if I close this as fixed?

Comment 49 Jan Hutař 2018-11-07 23:46:20 UTC
Yes, good for me. Thank you!