Description of problem:
A customer reported that a Thunderbolt 3 docking station doesn't work properly with RHEL 7.5 (or RHEL 7.4). In some respects this looks like a firmware problem, and it was apparently resolved by a BIOS update and by enabling the 'Fast Boot' option in the HP BIOS. The purpose of this BZ is mainly to investigate this properly and determine whether this is/was indeed a firmware issue, or whether there might also be a problem in our kernel/thunderbolt driver. Please see Additional info below for more details.

Hardware environment:
Laptop: HP ZBook G4 15inch
Thunderbolt Dock: P5Q58AA

Version-Release number of selected component (if applicable):
kernel-3.10.0-862.el7.x86

How reproducible:
Always (at least at customer's site)

Steps to Reproduce:
Boot with the TB3 docking station connected.

Actual results:
The boot process hangs for a while; after it finally boots, devices connected to the dock don't work.

Expected results:
Boot without a hang, and everything works fine.

Additional info:
From what the customer has tested so far, the docking station apparently gets discovered as USB hub number 3 (and apparently also 4), as those devices are not discovered during boot _without_ the dock connected. The problem is probably indicated by the following messages:
~~~~
Apr 19 13:32:21 <hostname> kernel: usb 3-1: new high-speed USB device number 2 using xhci_hcd
...
Apr 19 13:32:27 <hostname> kernel: xhci_hcd 0000:3d:00.0: Timeout while waiting for setup device command
Apr 19 13:32:27 <hostname> kernel: usb 3-1: hub failed to enable device, error -62
~~~~
(-ETIME == -62)

To me this looks like the device usb 3-1 either never received the command, never replied, or never finished initialization; however, I'm not too familiar with this area, so I might have got it wrong.
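For anyone comparing logs from several boots, here is a minimal sketch of how the two failure messages above could be picked out of a saved kernel log. The function name, the regular expressions and the sample list are illustrative assumptions written against the exact message wording quoted above; they are not part of any existing tool.

```python
import re

# Assumed patterns, modeled on the two messages quoted in this report.
ENABLE_FAIL = re.compile(
    r"kernel: usb (?P<port>\d+-[\d.]+): hub failed to enable device, "
    r"error (?P<err>-?\d+)"
)
XHCI_TIMEOUT = re.compile(
    r"kernel: xhci_hcd (?P<pci>[0-9a-f:.]+): "
    r"Timeout while waiting for setup device command"
)


def find_enumeration_failures(lines):
    """Return (failed_ports, timed_out_controllers) found in log lines.

    failed_ports is a list of (usb_port, errno_value) tuples;
    timed_out_controllers is a list of PCI addresses of xhci_hcd devices.
    """
    failed_ports = []
    timed_out_controllers = []
    for line in lines:
        match = ENABLE_FAIL.search(line)
        if match:
            failed_ports.append((match.group("port"), int(match.group("err"))))
            continue
        match = XHCI_TIMEOUT.search(line)
        if match:
            timed_out_controllers.append(match.group("pci"))
    return failed_ports, timed_out_controllers


# The three log lines quoted in this report, used as sample input.
SAMPLE = [
    "Apr 19 13:32:21 <hostname> kernel: usb 3-1: new high-speed USB device number 2 using xhci_hcd",
    "Apr 19 13:32:27 <hostname> kernel: xhci_hcd 0000:3d:00.0: Timeout while waiting for setup device command",
    "Apr 19 13:32:27 <hostname> kernel: usb 3-1: hub failed to enable device, error -62",
]

if __name__ == "__main__":
    ports, controllers = find_enumeration_failures(SAMPLE)
    for port, err in ports:
        print(f"usb {port}: enable failed with error {err}")
    for pci in controllers:
        print(f"xhci_hcd {pci}: setup device command timed out")
```

Running the same scan on a boot with and without 'Fast Boot' enabled would show whether the timeout disappears entirely or only stops being fatal.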
There were also later 'hung_task' reports on the code paths through xhci_alloc_dev(), which should hold a mutex, and xhci_setup_device(), which was waiting for a mutex:
~~~~
Apr 19 13:36:14 <hostname> kernel: INFO: task kworker/0:1:55 blocked for more than 120 seconds.
Apr 19 13:36:14 <hostname> kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 19 13:36:14 <hostname> kernel: kworker/0:1 D ffff8808a9275ee0 0 55 2 0x00000000
Apr 19 13:36:14 <hostname> kernel: Workqueue: usb_hub_wq hub_event
Apr 19 13:36:14 <hostname> kernel: Call Trace:
Apr 19 13:36:14 <hostname> kernel: [<ffffffffacca37f2>] ? del_timer_sync+0x52/0x60
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad313e69>] schedule_preempt_disabled+0x29/0x70
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad311c27>] __mutex_lock_slowpath+0xc7/0x1d0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0ffd6c>] ? xhci_discover_or_reset_device+0x11c/0x580
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad31100f>] mutex_lock+0x1f/0x2f
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0f9ab2>] xhci_setup_device+0x62/0x7b0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0bda84>] ? hub_port_reset+0x464/0x680
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0fa213>] xhci_address_device+0x13/0x20
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0be06b>] hub_port_init+0x3cb/0xb80
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad07e8d9>] ? update_autosuspend+0x39/0x60
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad07e945>] ? pm_runtime_set_autosuspend_delay+0x45/0x60
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0c1618>] hub_port_connect+0x158/0x9d0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0c25cf>] hub_event+0x73f/0xb60
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccb2dff>] process_one_work+0x17f/0x440
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccb3ac6>] worker_thread+0x126/0x3c0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccb39a0>] ? manage_workers.isra.24+0x2a0/0x2a0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccbae31>] kthread+0xd1/0xe0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccbad60>] ? insert_kthread_work+0x40/0x40
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad31f61d>] ret_from_fork_nospec_begin+0x7/0x21
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccbad60>] ? insert_kthread_work+0x40/0x40
Apr 19 13:36:14 <hostname> kernel: INFO: task kworker/0:2:62 blocked for more than 120 seconds.
Apr 19 13:36:14 <hostname> kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 19 13:36:14 <hostname> kernel: kworker/0:2 D ffff8817ad62cf10 0 62 2 0x00000000
Apr 19 13:36:14 <hostname> kernel: Workqueue: usb_hub_wq hub_event
Apr 19 13:36:14 <hostname> kernel: Call Trace:
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccd92c8>] ? check_preempt_wakeup+0x148/0x250
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad312f49>] schedule+0x29/0x70
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad3108b9>] schedule_timeout+0x239/0x2c0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffacdf62b1>] ? __slab_free+0x81/0x2f0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad3132fd>] wait_for_completion+0xfd/0x140
Apr 19 13:36:14 <hostname> kernel: [<ffffffffacccee80>] ? wake_up_state+0x20/0x20
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0f891e>] xhci_alloc_dev+0xee/0x2d0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0bb245>] usb_alloc_dev+0x75/0x340
Apr 19 13:36:14 <hostname> kernel: [<ffffffffacf4cc08>] ? kobject_put+0x28/0x60
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0c1753>] hub_port_connect+0x293/0x9d0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0c25cf>] hub_event+0x73f/0xb60
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccb2dff>] process_one_work+0x17f/0x440
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccb3ac6>] worker_thread+0x126/0x3c0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccb39a0>] ? manage_workers.isra.24+0x2a0/0x2a0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccbae31>] kthread+0xd1/0xe0
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccbad60>] ? insert_kthread_work+0x40/0x40
Apr 19 13:36:14 <hostname> kernel: [<ffffffffad31f61d>] ret_from_fork_nospec_begin+0x7/0x21
Apr 19 13:36:14 <hostname> kernel: [<ffffffffaccbad60>] ? insert_kthread_work+0x40/0x40
~~~~

Additionally, this hang was observed on both RHEL 7.4 and 7.5. The only difference was that on 7.5 the device was apparently discovered as Thunderbolt 3, as indicated by the following messages logged after the usb3 discovery (usb 3/4 were also discovered at Apr 19 13:32:21):
~~~~
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: current switch config:
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Switch: 8086:15d3 (Revision: 6, TB Version: 2)
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Max Port Number: 11
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Config:
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Upstream Port Number: 5 Depth: 0 Route String: 0x0 Enabled: 1, PlugEventsDelay: 254ms
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: unknown1: 0x0 unknown4: 0x0
Apr 19 13:32:21 <hostname> kernel: TECH PREVIEW: Thunderbolt 3 may not be fully supported. #012Please review provided documentation for limitations.
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: 0: Thunderbolt HW version detected: 3
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: 0: uid: 0xf037df7cc07200
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Port 0: 8086:15d3 (Revision: 6, TB Version: 1, Type: Port (0x1))
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Max hop id (in/out): 7/7
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Max counters: 8
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: NFC Credits: 0x800000
Apr 19 13:32:21 <hostname> kernel: thunderbolt 0000:06:00.0: Port 1: 8086:15d3 (Revision: 6, TB Version: 1, Type: Port (0x1))
----[ rest omitted ]----
~~~~

Note that these messages are seen only when booting _with_ the dock connected, and only on RHEL 7.5 (7.4 doesn't have the driver yet). Since the hang looks the same with and without the thunderbolt driver messages, in my opinion the problem is not in the tech_preview driver.

Last but not least, the customer has installed updates for AMD and the BIOS, and after some investigation with HP they came to the following resolution:
~~~~
<cite>
1. There is a BIOS option in the G4 laptop called "Fast Boot". Enabling this setting causes the system to boot quickly and the USB ports on the dock work as expected. So, seems like this flag bypasses some checks and allow the USB ports on the dock to be available after boot. I can now have the dock connected and start a cold boot and the system now comes up quickly with the USB ports enabled.

2. The Video graphics driver setting was initially at "Auto". This was suggested to be set at "Discrete Graphics". Changing to this setting most likely causes the NVIDIA card to be "always" used for all graphics needs. But, in this setting, when shutting down the system, there would be a double beep. HP suggested to upgrade NVIDIA driver from 390.42 to 390.48. After this driver upgrade, the double beep problem is gone.

So, the problems described in this ticket on RHEL74 setup now seem resolved, with the fixes being:
+ Enable "Fast Boot" in BIOS.
+ Upgrade NVIDIA driver to 390.48 version.
</cite>
~~~~

The customer confirmed that while using this solution the hardware works as expected both on RHEL 7.4 and 7.5.
~~~~
<cite>
> In the meantime, can you please confirm that the dock works properly as expected with the 'Fast Boot' option?

Yes, I am using the Dock with 2 USB and 2 Display port devices attached and it works fine on RHEL74. Yesterday, I tested this setup using the RHEL75 HDD and it worked just the same. I did install the latest NVIDIA driver on RHEL75 too, so there was no double beep during shutdown.
</cite>
~~~~
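
To check whether the hung tasks reported earlier in this BZ still appear with 'Fast Boot' disabled on the updated BIOS, the hung_task reports can be condensed to one entry per blocked task. The following is only a sketch: the function name and regular expressions are assumptions based on the message format quoted in this report, and the sample is a shortened excerpt of the traces above. Frames marked with `?` are skipped because the kernel flags them as unreliable.

```python
import re

# Assumed patterns, modeled on the hung_task messages quoted in this report.
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than \d+ seconds"
)
FRAME = re.compile(r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<func>[\w.]+)\+0x")


def summarize_hung_tasks(lines):
    """Return one dict per hung task: name, pid and its reliable frames."""
    tasks = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            tasks.append(
                {"name": match.group("name"), "pid": int(match.group("pid")), "frames": []}
            )
            continue
        match = FRAME.search(line)
        # Attach only reliable frames (no leading "?") to the latest task.
        if match and tasks and not match.group("unreliable"):
            tasks[-1]["frames"].append(match.group("func"))
    return tasks


# Shortened excerpt of the two traces from this report.
SAMPLE = [
    "Apr 19 13:36:14 <hostname> kernel: INFO: task kworker/0:1:55 blocked for more than 120 seconds.",
    "Apr 19 13:36:14 <hostname> kernel: [<ffffffffacca37f2>] ? del_timer_sync+0x52/0x60",
    "Apr 19 13:36:14 <hostname> kernel: [<ffffffffad311c27>] __mutex_lock_slowpath+0xc7/0x1d0",
    "Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0f9ab2>] xhci_setup_device+0x62/0x7b0",
    "Apr 19 13:36:14 <hostname> kernel: INFO: task kworker/0:2:62 blocked for more than 120 seconds.",
    "Apr 19 13:36:14 <hostname> kernel: [<ffffffffad3132fd>] wait_for_completion+0xfd/0x140",
    "Apr 19 13:36:14 <hostname> kernel: [<ffffffffad0f891e>] xhci_alloc_dev+0xee/0x2d0",
]

if __name__ == "__main__":
    for task in summarize_hung_tasks(SAMPLE):
        print(f"{task['name']} (pid {task['pid']}): {' <- '.join(task['frames'])}")
```

On the excerpt this separates the worker waiting in the mutex path under xhci_setup_device() from the one waiting for a completion under xhci_alloc_dev().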