Description of problem:
Unplug the Anker USB-C dock and the panic happens. The system seems fine for a few minutes, then locks up solid. A forced power-down by long-pressing the power button is required to recover. The problem is reproduced every time the dock is unplugged.
lsusb:
Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter
Bus 004 Device 002: ID 2109:0817 VIA Labs, Inc. USB3.0 Hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 004: ID 04ca:706b Lite-On Technology Corp. Integrated Camera
Bus 001 Device 009: ID 2109:0100 VIA Labs, Inc. Sierra Wireless EM7455 Qualcomm® Snapdragon™ X7 LTE-A
Bus 001 Device 008: ID 1050:0407 Yubico.com Yubikey 4 OTP+U2F+CCID
Bus 001 Device 007: ID 1a40:0801 Terminus Technology Inc. ThinkPad X1 Tablet Thin Keyboard Gen 3
Bus 001 Device 006: ID 2109:2817 VIA Labs, Inc. USB2.0 Hub
Bus 001 Device 003: ID 1199:9079 Sierra Wireless, Inc. Sierra Wireless EM7455 Qualcomm® Snapdragon™ X7 LTE-A
Bus 001 Device 002: ID 17ef:60b5 Lenovo ThinkPad X1 Tablet Thin Keyboard Gen 3
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Additional info:
reporter: libreport-2.11.3
BUG: kernel NULL pointer dereference, address: 0000000000000080
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 17346 Comm: kworker/0:0 Not tainted 5.2.8-200.fc30.x86_64 #1
Hardware name: LENOVO 20KJCTO1WW/20KJCTO1WW, BIOS N1ZET76W(1.32 ) 07/18/2019
Workqueue: events ucsi_connector_change [typec_ucsi]
RIP: 0010:ucsi_displayport_remove_partner+0xa/0x20 [typec_ucsi]
Code: 38 00 c7 43 28 00 00 00 00 48 83 c7 10 5b e9 1d 3b 1c fc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 48 85 ff 74 0f <48> 8b 47 78 48 c7 00 00 00 00 00 c6 40 3d 00 c3 66 0f 1f 44 00 00
RSP: 0018:ffffac2986ff7df8 EFLAGS: 00010202
RAX: 0000000000000008 RBX: ffff91bf8d99e448 RCX: 000000008100006e
RDX: 000000008100006f RSI: 000000008100006e RDI: 0000000000000008
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000001
R10: ffff91bf91003b00 R11: fffff8cc10b76720 R12: ffff91bf8d99e440
R13: 0000000000000001 R14: ffff91bf8d99e590 R15: ffff91bf8d99e320
FS: 0000000000000000(0000) GS:ffff91bf91400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000080 CR3: 000000013e40a001 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ucsi_unregister_altmodes+0x7b/0x90 [typec_ucsi]
 ucsi_unregister_partner.part.0+0x13/0x30 [typec_ucsi]
 ucsi_connector_change+0x247/0x340 [typec_ucsi]
 process_one_work+0x19d/0x380
 worker_thread+0x50/0x3b0
 kthread+0xfb/0x130
 ? process_one_work+0x380/0x380
 ? kthread_park+0x80/0x80
 ret_from_fork+0x35/0x40
Modules linked in: uas usb_storage cdc_ether r8152 typec_displayport thunderbolt fuse rfcomm ccm xt_CHECKSUM xt_MASQUERADE tun bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables cmac ip6table_filter ip6_tables iptable_filter ip_tables bnep sunrpc vfat fat arc4 snd_soc_skl snd_hda_codec_hdmi snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc snd_soc_sst_ipc iwlmvm snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi spi_pxa2xx_platform snd_hda_codec_realtek intel_rapl snd_soc_core dw_dmac snd_hda_codec_generic mac80211 snd_compress ac97_bus wacom snd_pcm_dmaengine mei_hdcp mei_wdt snd_hda_intel iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec iwlwifi snd_hda_core kvm ipu3_cio2 snd_hwdep btusb btrtl v4l2_fwnode btbcm uvcvideo snd_seq btintel videobuf2_dma_sg videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 irqbypass snd_seq_device videobuf2_common cfg80211 intel_cstate intel_uncore snd_pcm bluetooth intel_rapl_perf wmi_bmof intel_wmi_thunderbolt rtsx_pci_ms i2c_i801 hid_sensor_rotation snd_timer mei_me videodev hid_sensor_als idma64 hid_sensor_accel_3d hid_sensor_gyro_3d intel_xhci_usb_role_switch qcserial thinkpad_acpi usb_wwan mei joydev hid_sensor_trigger hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf ledtrig_audio intel_lpss_pci memstick ecdh_generic intel_pch_thermal intel_lpss industrialio processor_thermal_device media roles ucsi_acpi ecc snd intel_soc_dts_iosf typec_ucsi typec soundcore int3403_thermal soc_button_array rfkill intel_vbtn int340x_thermal_zone pcc_cpufreq int3400_thermal intel_hid acpi_pad acpi_thermal_rel sparse_keymap dm_crypt cdc_mbim cdc_wdm cdc_ncm usbnet mii hid_sensor_hub intel_ishtp_loader intel_ishtp_hid i915 crct10dif_pclmul crc32_pclmul rtsx_pci_sdmmc i2c_algo_bit crc32c_intel drm_kms_helper mmc_core drm nvme ghash_clmulni_intel serio_raw nvme_core rtsx_pci intel_ish_ipc intel_ishtp wmi i2c_hid video pinctrl_sunrisepoint pinctrl_intel hid_multitouch
CR2: 0000000000000080
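A tentative reading of the oops above, offered as an interpretation of the dump rather than a confirmed diagnosis: the byte marked <48> in the Code: line starts the sequence 48 8b 47 78, which decodes to mov 0x78(%rdi),%rax. It is preceded by 48 85 ff 74 0f, a test %rdi,%rdi followed by je. With RDI = 0x8 the pointer is non-zero, so it slips past that NULL test, and the load then touches RDI + 0x78. The arithmetic can be checked with a few lines of shell (the variable names are mine):

```shell
#!/bin/sh
# Values copied from the register dump and Code: line of the oops.
# The instruction decoding is an interpretation, not part of the report.
rdi=0x8     # RDI at the time of the fault
disp=0x78   # displacement in "48 8b 47 78", i.e. mov 0x78(%rdi),%rax

# Address the faulting load would have touched.
fault_addr=$(printf '0x%x' $(($rdi + $disp)))
echo "$fault_addr"   # prints 0x80
```

The result matches the CR2 value in the dump, which is consistent with ucsi_displayport_remove_partner() being handed a small bogus pointer rather than NULL. Why the pointer is 0x8 is not something the dump alone shows.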
Created attachment 1647245 [details] File: dmesg
I'd like to add a me-too here (this has been bugging me for a long time now, essentially since I got a new USB-C and Thunderbolt only laptop). I have two adapters: one is HDMI only, the other adapts to everything that can be adapted to. It does not matter which one I use; the effects (a solid lockup after a while, recoverable only by a long power button press) are the same as John describes. See logs attached.
Created attachment 1661964 [details] journal
We recently had a similar bug filed (bug 1762031). That bug is being tracked in the upstream kernel bugzilla here: https://bugzilla.kernel.org/show_bug.cgi?id=206365
It would be good if you could attach a dmesg and a link to https://retrace.fedoraproject.org/faf/reports/bthash/6c143da44e62f214a2018303d35cb7fc42c873d1 there. There are also some debugging instructions provided there; please follow those and provide the requested information upstream.
Just adding a note. I recently discovered that the laptop I have this problem with has a known firmware issue with Thunderbolt over USB-C, and I have not yet updated it. The firmware update tool requires Windows :( When I get around to updating the firmware, I will retest and report back. Affected systems are listed here: https://pcsupport.lenovo.com/ca/en/solutions/ht508988
Ping? Many people seem to be hitting this; see also bug 1762031, bug 1785972, bug 1798810 and bug 1800913. Can someone seeing this issue please provide the information requested in the upstream bug to help debug this: https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5
After checking the upstream bug one more time, I noticed that yesterday Heikki provided a patch to test. I've started a test/scratch build of a Fedora kernel with that patch added: https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485 (note: still building at the moment, this takes a couple of hours)
See here for generic instructions for installing a kernel directly from koji: https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt
If you can reproduce the bug, e.g. by unplugging your charger, then please give this new kernel a try and let us know if it fixes things.
If this new kernel does not fix things, please collect the debugging info described here: https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5
Hm. I have installed the test kernel you refer to. I experienced two crashes of the same sort in the meantime, but had neglected to enable tracing as the upstream bug suggests. I started doing that, and the first step was already killed:
    # Unload all UCSI modules
    modprobe -r ucsi_acpi
    Killed
I attach the oopsing log.
Created attachment 1663272 [details] journal showing crash at ucsi_acpi unload
    # uname -r
    5.5.3-200.rhbz1762031.fc31.x86_64
The kernel test build is ready for downloading, please give it a try.
Created attachment 1663273 [details] trace as requested
Trace as requested in https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5
Created attachment 1663274 [details] journal with oops at time of trace taken
Please correct me if I'm doing something wrong. As per https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5:
    # Unload all UCSI modules
    modprobe -r ucsi_acpi
Plug in the device (an HDMI monitor adapted via USB-C).
Run a script to reload the modules and wait for the null pointer access:
    #!/bin/sh
    modprobe typec_ucsi
    # Enable UCSI tracing
    echo 1 > /sys/kernel/debug/tracing/events/ucsi/enable
    # Now reload the ACPI glue driver
    modprobe ucsi_acpi
    exec journalctl -afe
Be quick to collect the results, again via a script:
    #!/bin/sh
    cat /sys/kernel/debug/tracing/trace > trace
    journalctl -b 0 > journal
    sync
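One thing that could be checked between the "echo 1" step and reloading ucsi_acpi is whether tracing really got switched on, since a failed write there would leave the trace empty. A minimal sketch, assuming the enable file reads back "1" when all UCSI events are enabled; the helper name ucsi_tracing_enabled is hypothetical and not part of the upstream instructions:

```shell
#!/bin/sh
# Hypothetical helper (not from the upstream instructions): succeed only
# if UCSI event tracing is switched on under the given tracing directory.
# Returns 2 when the enable file is missing or unreadable.
ucsi_tracing_enabled() {
    tracedir=${1:-/sys/kernel/debug/tracing}
    enable_file="$tracedir/events/ucsi/enable"
    # Presumably only present once typec_ucsi is loaded and debugfs is readable.
    [ -r "$enable_file" ] || return 2
    [ "$(cat "$enable_file")" = "1" ]
}

# Possible use right before "modprobe ucsi_acpi":
#   ucsi_tracing_enabled || echo "UCSI tracing is not enabled" >&2
```

The tracing directory is a parameter only so the check can be tried against a scratch directory; on a real system the default path from the script above applies.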
Upstream has provided a second patch which should fix this. I've done a scratch build of a Fedora kernel with that patch added: https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485
Note the build has already finished, so you can get it right away.
See here for generic instructions for installing a kernel directly from koji: https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt
If you can reproduce the bug, e.g. by unplugging your charger, then please give this new kernel a try and let us know if it fixes things.
If this new kernel does not fix things, please collect the debugging info described here: https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5
I can confirm this is fixed with your build. I had been able to reproduce this by
* plug external monitor
* unplug external monitor
* wait a few seconds
* watch kernel go up in flames
No crash after many cycles now.
Jörg, thank you for the positive testing feedback. I've passed your feedback along to the upstream developer, so hopefully we will get an official version of the patch fixing this soon. In the meantime it might be best if you stick with the test kernel which I built for now.
You're welcome. As you can imagine, the situation has drastically improved for me too, so yes, you're welcome. Btw, https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt required me to use rpm --oldpackage. I suspect that regular updates will not update the kernel and modules from now on. How is this situation reverted once the fix comes as a regular update?
(In reply to Jörg Faschingbauer from comment #16)
> You're welcome. As you can imagine, the situation has drastically improved
> for me too, so yes, you're welcome.
>
> Btw, https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt
> required me to rpm --oldpackage. I suspect that regular updates will not
> update kernel and modules from now on. How is this situation reverted once
> the fix comes as a regular update?
The (kernel) updates will still get installed whenever you do an update. As long as you manually select the test kernel on every boot, it will not be removed, since the running kernel is never removed. But if you let it boot into the latest kernel and then do an upgrade, the test kernel might end up being removed (if it is the oldest kernel at that point).
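To see whether the test kernel is currently the oldest installed one, and so the likeliest candidate for removal on the next upgrade, something along these lines might help. It is a rough sketch: the helper name oldest_kernel is made up, and GNU sort -V only approximates RPM's own version comparison. dnf keeps at most installonly_limit kernels (3 by default).

```shell
#!/bin/sh
# Hypothetical helper: print the lowest version from a newline-separated
# list on stdin. Uses GNU sort -V, which approximates RPM version ordering.
oldest_kernel() {
    sort -V | head -n 1
}

# On a live system the list could come from rpm, for example:
#   rpm -q kernel --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' | oldest_kernel
# Compare the result with "uname -r" to see if it is the running kernel.
```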
I'm still getting the same kernel oops with the test kernel supplied at https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485. I'm wondering if I'm installing the right thing. The status in that link is dated Feb 14 while the link was provided in a comment dated Feb 21. And the link is the same as was provided in a comment dated Feb 14.
(In reply to John Stebbins from comment #18)
> I'm still getting the same kernel oops with the test kernel supplied at
> https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485. I'm wondering
> if I'm installing the right thing. The status in that link is dated Feb 14
> while the link was provided in a comment dated Feb 21. And the link is the
> same as was provided in a comment dated Feb 14.
You are right, I somehow ended up putting the old link in the comment, sorry. The new build is here: https://koji.fedoraproject.org/koji/taskinfo?taskID=41750652
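Independent of which link was used, a quick sanity check is to look at what is actually running. The earlier scratch build reported 5.5.3-200.rhbz1762031.fc31.x86_64 from uname -r, so a check for that kind of tag might look like the sketch below. The helper name is_test_build is hypothetical, and it is an assumption that later scratch builds carry a similar "rhbz" tag in their release string.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string carries an
# ".rhbz<number>" tag like the scratch build seen earlier in this bug.
# Assumption: other scratch builds are tagged the same way.
is_test_build() {
    case "$1" in
        *.rhbz[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use:
#   is_test_build "$(uname -r)" && echo "running a scratch build"
# "uname -v" additionally prints the build date, which may help tell
# two scratch builds apart.
```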
I'm still getting the oops with this kernel as well. But if I follow the instructions here https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5 to collect more debug info, the oops does *not* happen. After performing those steps, if I plug and unplug the device again the oops will happen upon unplug. I'll attach dmesg and trace output. These logs are collected after performing the steps provided to collect trace info plus one more plug-unplug cycle.
Created attachment 1665271 [details] dmesg after oops
Created attachment 1665272 [details] trace after oops
*********** MASS BUG UPDATE **************
We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 30 kernel bugs.
Fedora 30 has now been rebased to 5.5.7-100.fc30. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
If you have moved on to Fedora 31, and are still experiencing this issue, please change the version to Fedora 31.
If you experience different issues, please open a new bug report for those.
The test kernel which I've built was based on 5.5.5; there are no relevant fixes in 5.5.7 vs 5.5.5, so this is likely still an issue. Clearing needinfo.
John, thank you for the logs. I've added a note to the upstream bug about your findings and logs. Looking at the logs, this seems to be an issue which should really have been fixed by the test build I did, so now I wonder if I somehow messed up the test build.
Since the fixes address some real issues, regardless of whether they fix your (John's) case too, I will add them to the Fedora kernel packages to be picked up by the next build. The next official Fedora kernel build for f30 + f31 will be either 5.5.9-201.fc31 or 5.5.10; please give this a try once it hits updates-testing and let me know if it resolves things for you.
Hans, will do, thanks.
*** Bug 1745924 has been marked as a duplicate of this bug. ***
*** Bug 1762031 has been marked as a duplicate of this bug. ***
*** Bug 1798810 has been marked as a duplicate of this bug. ***
*** Bug 1800913 has been marked as a duplicate of this bug. ***
*** Bug 1803363 has been marked as a duplicate of this bug. ***
*** Bug 1750197 has been marked as a duplicate of this bug. ***
*** Bug 1785832 has been marked as a duplicate of this bug. ***
FEDORA-2020-fee107f027 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2020-fee107f027
FEDORA-2020-aabfec096f has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-aabfec096f
kernel-5.5.10-100.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-fee107f027
kernel-5.5.10-200.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-aabfec096f
Created attachment 1671600 [details] dmesg after oops with 5.5.10
(In reply to John Stebbins from comment #37)
> Created attachment 1671600 [details]
> dmesg after oops with 5.5.10
Bummer, so we likely have another missing check somewhere and need another fix on top of the 2 current ones :|
I've forwarded this info and your earlier trace upstream here: https://bugzilla.kernel.org/show_bug.cgi?id=206365
John, perhaps you can create a bugzilla.kernel.org account (it just requires an email address, nothing more) and directly engage with the upstream maintainer there if he needs more info? I can still build kernels with any patches upstream comes up with for you, but it would be nice if I did not have to play the middleman for gathering logs and such.
kernel-5.5.10-100.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.
kernel-5.5.10-200.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.
*** Bug 1815912 has been marked as a duplicate of this bug. ***
A while ago I thought the bug was fixed in the test kernel you (Hans) provided us. Not true, it was back soon all of a sudden (does the firmware play games?). At that time I got distracted by having to earn money, sorry for that. Still distracted, but: I had to switch back to X11 (a customer wants me to do online training with ... M$ Teams ... which can do desktop sharing provided you're on X11 as I found out). No crash. 5.5.10-200.fc31.x86_64
Issue is still present on my Zenbook in kernels 5.5.10-200.fc31.x86_64 and 5.5.11-200.fc31.x86_64. Tested on both Wayland and X11 with the same result. As far as I know, the issue isn't present in 4.x kernels, though I've only tested that in other distros. Solus is the only distro I've tested that didn't crash in a 5.x kernel.
(In reply to Kevin Rahardjo from comment #43)
> Issue is still present on my Zenbook in kernels 5.5.10-200.fc31.x86_64 and
> 5.5.11-200.fc31.x86_64. Tested on both Wayland and X11 with the same result.
> As far as I know, the issue isn't present in 4.x kernels, though I've only
> tested that in other distros. Solus is the only distro I've tested that
> didn't crash in a 5.x kernel.
Yes, it turns out that the fix was still not complete, sorry. I've started a test Fedora kernel build with an additional patch which should hopefully finally really fix this: https://koji.fedoraproject.org/koji/taskinfo?taskID=43644168
Note this is still building at the moment; it will take a couple of hours to finish. When it is finished, please give it a try and let us know if this fixes the issue, then I can add the patch to the official Fedora kernels.
For generic instructions on installing a kernel directly from koji (our build system), see: https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt
Thanks Hans, I've done some thorough testing with this patch and I wasn't able to replicate any of the problems that previously existed. Should be good to see this implemented in the mainline kernel soon.
(In reply to Kevin Rahardjo from comment #45)
> Thanks Hans, I've done some thorough testing with this patch and I wasn't
> able to replicate any of the problems that previously existed. Should be
> good to see this implemented in the mainline kernel soon.
Great, thank you for testing. The fix is queued up for merging into 5.7-rc# here: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/log/?h=usb-linus
Once it hits Linus' tree I expect Greg to quickly add it to the 5.6.y series. In the meantime you can keep using the test kernel build I did to work around this issue.
I can also confirm that this has eliminated the oops I was seeing.
As expected, the fix has landed in 5.6.8 and is now available in the official Fedora 5.6.8 kernels:
F31: https://koji.fedoraproject.org/koji/buildinfo?buildID=1499538
F32: https://koji.fedoraproject.org/koji/buildinfo?buildID=1499537
Running:
    sudo dnf --enablerepo=updates-testing 'kernel*'
should get you the new, fixed, official kernel.
Erm, that dnf command is missing the "update" command; it should be:
    sudo dnf --enablerepo=updates-testing update 'kernel*'
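After updating and rebooting, one way to confirm that the running kernel is at least 5.6.8 is a small version comparison. This is only a sketch: the helper name version_ge is made up, and it relies on GNU sort -V, which is close to but not identical to RPM's version ordering.

```shell
#!/bin/sh
# Hypothetical helper: succeed when kernel release $1 is at least
# version $2. The release part after the first "-" is ignored.
version_ge() {
    have=${1%%-*}
    want=$2
    # The smaller of the two versions sorts first; $have >= $want exactly
    # when that first entry is $want.
    [ "$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)" = "$want" ]
}

# Possible use:
#   version_ge "$(uname -r)" 5.6.8 && echo "running kernel should contain the fix"
```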
*** Bug 1830426 has been marked as a duplicate of this bug. ***