Bug 1785972
Summary: | [abrt] ucsi_unregister_altmodes: BUG: kernel NULL pointer dereference, address: 0000000000000080 [typec_ucsi] | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | John Stebbins <stebbins>
Component: | kernel | Assignee: | Hans de Goede <hdegoede>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 30 | CC: | airlied, bburton, bskeggs, bugzilla.redhat, clausomh, darrienglasser, furquan2011, hdegoede, ichavero, itamar, jarodwilson, jeremy, jf, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, mark.p.de.vries.nl, masami256, mchehab, mizarc+fedora7, mjg59, rnaef77, steved, vincent.datrier
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Unspecified | |
URL: | https://retrace.fedoraproject.org/faf/reports/bthash/6c143da44e62f214a2018303d35cb7fc42c873d1 | |
Whiteboard: | abrt_hash:911744eb682d44eb4cf6218fad164faa43f88d17;VARIANT_ID=workstation; | |
Fixed In Version: | kernel-5.5.10-100.fc30 kernel-5.5.10-200.fc31 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-03-21 03:15:49 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
John Stebbins
2019-12-22 22:11:04 UTC
Created attachment 1647245 [details]
File: dmesg
I'd like to add a me-too here (this has been bugging me for a long time now, essentially since I got that new USB-C and Thunderbolt only laptop). I have two adapters: one is HDMI only, the other adapts to everything that can be adapted to. It does not matter which one I use; the effects (a solid lockup after a while, resolved only by a long power button press) are the same as John describes. See logs attached.

Created attachment 1661964 [details]
journal

We recently had a similar bug filed (bug 1762031); that bug is being tracked in the upstream kernel bugzilla here:
https://bugzilla.kernel.org/show_bug.cgi?id=206365
It would be good if you could attach a dmesg and a link to https://retrace.fedoraproject.org/faf/reports/bthash/6c143da44e62f214a2018303d35cb7fc42c873d1 there. There are also some debugging instructions provided there; please follow those and provide the requested information upstream.

Just adding a note. I discovered recently that the laptop I have this problem with has a known firmware issue with Thunderbolt over USB-C, and I have not yet updated. The firmware update tool requires Windows :( When I get around to updating the firmware, I will retest and report back. Affected systems are listed here:
https://pcsupport.lenovo.com/ca/en/solutions/ht508988

Ping? Many people seem to be hitting this; also see bug 1762031, bug 1785972, bug 1798810 and bug 1800913. Can someone seeing this issue please provide the information requested in the upstream bug to help debug this?
https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5

After checking the upstream bug one more time, I noticed that yesterday Heikki provided a patch to test. I've started a test/scratch build of a Fedora kernel with that patch added:
https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485
(note: still building atm, this takes a couple of hours)
See here for generic instructions for installing a kernel directly from koji:
https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt
If you can reproduce the bug, e.g. by unplugging your charger, then please give this new kernel a try and let us know if it fixes things. If this new kernel does not fix things, please collect the debugging info described here:
https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5

Hm. I have installed the test kernel you refer to. I experienced two crashes of the same sort in the meantime, but had neglected to enable tracing as the upstream bug suggests.
Started doing that,
# Unload all UCSI modules
modprobe -r ucsi_acpi
Killed
I attach the oopsing log.

Created attachment 1663272 [details]
journal showing crash at ucsi_acpi unload
# uname -r
5.5.3-200.rhbz1762031.fc31.x86_64

The kernel test build is ready for downloading, please give it a try.

Created attachment 1663273 [details]
trace as requested
trace as requested in https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5

Created attachment 1663274 [details]
journal with oops at time of trace taken
Please correct me if I'm doing something wrong. As per https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5,
# Unload all UCSI modules
modprobe -r ucsi_acpi
Plug the device (an HDMI monitor adapted via USB-C)
Run script to reload and wait for null pointer access,
#!/bin/sh
modprobe typec_ucsi
# Enable UCSI tracing
echo 1 > /sys/kernel/debug/tracing/events/ucsi/enable
# Now reload the ACPI glue driver
modprobe ucsi_acpi
exec journalctl -afe
Be quick to collect, again via a script,
#!/bin/sh
cat /sys/kernel/debug/tracing/trace > trace
journalctl -b 0 > journal
sync

Upstream has provided a second patch which should fix this. I've done a scratch-build of a Fedora kernel with that patch added:
https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485
Note the build has already finished, so you can get it right away.
See here for generic instructions for installing a kernel directly from koji:
https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt
If you can reproduce the bug, e.g. by unplugging your charger, then please give this new kernel a try and let us know if it fixes things. If this new kernel does not fix things, please collect the debugging info described here:
https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5

I can confirm this is fixed with your build. I had been able to reproduce this by
* plug external monitor
* unplug external monitor
* wait a few seconds
* watch kernel go up in flames
No crash after many cycles now.

Jörg, thank you for the positive testing feedback. I've passed your feedback along to the upstream developer. So hopefully we will get an official version of the patch fixing this soon.
In the meantime it might be best if you stick with the test kernel which I built for now.

You're welcome. As you can imagine, the situation has drastically improved for me too, so yes, you're welcome.
Btw, https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt required me to rpm --oldpackage. I suspect that regular updates will not update kernel and modules from now on. How is this situation reverted once the fix comes as a regular update?

(In reply to Jörg Faschingbauer from comment #16)
> You're welcome. As you can imagine, the situation has drastically improved
> for me too, so yes, you're welcome.
>
> Btw, https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt
> required me to rpm --oldpackage. I suspect that regular updates will not
> update kernel and modules from now on. How is this situation reverted once
> the fix comes as a regular update?
The (kernel) updates will still get installed whenever you do an update. As long as you manually select the test kernel on every boot it will not be removed, since the running kernel is never removed. But if you let it boot into the latest kernel and then do an upgrade, then the test kernel might end up being removed (if it is the oldest kernel at that point).

I'm still getting the same kernel oops with the test kernel supplied at https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485. I'm wondering if I'm installing the right thing. The status in that link is dated Feb 14 while the link was provided in a comment dated Feb 21. And the link is the same as was provided in a comment dated Feb 14.

(In reply to John Stebbins from comment #18)
> I'm still getting the same kernel oops with the test kernel supplied at
> https://koji.fedoraproject.org/koji/taskinfo?taskID=41491485. I'm wondering
> if I'm installing the right thing. The status in that link is dated Feb 14
> while the link was provided in a comment dated Feb 21. And the link is the
> same as was provided in a comment dated Feb 14.
You are right, I somehow ended up putting the old link in the comment, sorry. The new build is here:
https://koji.fedoraproject.org/koji/taskinfo?taskID=41750652

I'm still getting the oops with this kernel as well. But if I follow the instructions here https://bugzilla.kernel.org/show_bug.cgi?id=206365#c5 to collect more debug info, the oops does *not* happen. After performing those steps, if I plug and unplug the device again, the oops will happen upon unplug. I'll attach dmesg and trace output. These logs were collected after performing the steps provided to collect trace info plus one more plug-unplug cycle.

Created attachment 1665271 [details]
dmesg after oops
Created attachment 1665272 [details]
trace after oops
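
When running plug/unplug cycles against a candidate kernel, it can help to check the kernel log mechanically rather than by eye. The following is only a minimal sketch: the helper name has_ucsi_oops is made up for illustration, and it assumes the oops text in the log contains the two markers from this bug's summary line ("BUG: kernel NULL pointer dereference" and "ucsi_unregister_altmodes").

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the kernel log read from stdin
# contains both markers taken from this bug's summary line.
has_ucsi_oops() {
    log="$(cat)"
    printf '%s\n' "$log" | grep -q 'BUG: kernel NULL pointer dereference' &&
        printf '%s\n' "$log" | grep -q 'ucsi_unregister_altmodes'
}

# Check the kernel messages of the current boot, if journalctl is available.
if command -v journalctl >/dev/null 2>&1; then
    if journalctl -k -b 0 --no-pager | has_ucsi_oops; then
        echo "ucsi_unregister_altmodes oops found in this boot's kernel log"
    else
        echo "no ucsi_unregister_altmodes oops in this boot's kernel log"
    fi
fi
```

The same function can be fed a saved file instead, for example has_ucsi_oops < journal, where journal is the file written by the collection script quoted earlier in this thread.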

*********** MASS BUG UPDATE **************
We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 30 kernel bugs.
Fedora 30 has now been rebased to 5.5.7-100.fc30. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
If you have moved on to Fedora 31, and are still experiencing this issue, please change the version to Fedora 31.
If you experience different issues, please open a new bug report for those.

The test kernel which I built was based on 5.5.5, and there are no relevant fixes in 5.5.7 vs 5.5.5, so this is likely still an issue. Clearing needinfo.

John, thank you for the logs. I've added a note to the upstream bug about your findings and logs.
Looking at the logs, this seems to be an issue which should really have been fixed by the test build I did. So now I wonder if I somehow messed up the test build.
Since the fixes fix some real issues, regardless of whether they fix your (John's) case too, I will add them to the Fedora kernel pkgs to be picked up by the next build. The next official Fedora kernel build for f30 + f31 will be either 5.5.9-201.fc31 or 5.5.10. Please give this a try once it hits updates-testing and let me know if it resolves things for you.

Hans, will do, thanks.

*** Bug 1745924 has been marked as a duplicate of this bug. ***
*** Bug 1762031 has been marked as a duplicate of this bug. ***
*** Bug 1798810 has been marked as a duplicate of this bug. ***
*** Bug 1800913 has been marked as a duplicate of this bug. ***
*** Bug 1803363 has been marked as a duplicate of this bug. ***
*** Bug 1750197 has been marked as a duplicate of this bug. ***
*** Bug 1785832 has been marked as a duplicate of this bug. ***

FEDORA-2020-fee107f027 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2020-fee107f027

FEDORA-2020-aabfec096f has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-aabfec096f

kernel-5.5.10-100.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-fee107f027

kernel-5.5.10-200.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-aabfec096f

Created attachment 1671600 [details]
dmesg after oops with 5.5.10

(In reply to John Stebbins from comment #37)
> Created attachment 1671600 [details]
> dmesg after oops with 5.5.10
Bummer, so we likely have another missing check somewhere and need another fix on top of the 2 current ones :|
I've forwarded this info and your earlier trace upstream here:
https://bugzilla.kernel.org/show_bug.cgi?id=206365
John, perhaps you can create a bugzilla.kernel.org account (it just requires an email address, nothing more) and directly engage with the upstream maintainer there if he needs more info? I can still build kernels with any patches upstream comes up with for you, but it would be nice if I do not have to play the middle man for gathering logs and such.

kernel-5.5.10-100.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

kernel-5.5.10-200.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.

*** Bug 1815912 has been marked as a duplicate of this bug. ***

A while ago I thought the bug was fixed in the test kernel you (Hans) provided us. Not true, it was back all of a sudden (does the firmware play games?). At that time I got distracted by having to earn money, sorry for that.
Still distracted, but: I had to switch back to X11 (a customer wants me to do online training with ... M$ Teams ... which can do desktop sharing provided you're on X11, as I found out). No crash.
5.5.10-200.fc31.x86_64

Issue is still present on my Zenbook in kernels 5.5.10-200.fc31.x86_64 and 5.5.11-200.fc31.x86_64. Tested on both Wayland and X11 with the same result.
As far as I know, the issue isn't present in 4.x kernels, though I've only tested that in other distros. Solus is the only distro I've tested that didn't crash in a 5.x kernel.

(In reply to Kevin Rahardjo from comment #43)
> Issue is still present on my Zenbook in kernels 5.5.10-200.fc31.x86_64 and
> 5.5.11-200.fc31.x86_64. Tested on both Wayland and X11 with the same result.
> As far as I know, the issue isn't present in 4.x kernels, though I've only
> tested that in other distros. Solus is the only distro I've tested that
> didn't crash in a 5.x kernel.
Yes, it turns out that the fix was still not complete, sorry.
I've started a test Fedora kernel build with an additional patch which should hopefully finally really fix this:
https://koji.fedoraproject.org/koji/taskinfo?taskID=43644168
Note this is still building atm; it will take a couple of hours to finish. When it is finished, please give it a try and let us know if this fixes the issue, then I can add the patch to the official Fedora kernels.
For generic instructions on installing a kernel directly from koji (our buildsystem), see:
https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt

Thanks Hans, I've done some thorough testing with this patch and I wasn't able to replicate any of the problems that previously existed. Should be good to see this implemented in the mainline kernel soon.

(In reply to Kevin Rahardjo from comment #45)
> Thanks Hans, I've done some thorough testing with this patch and I wasn't
> able to replicate any of the problems that previously existed. Should be
> good to see this implemented in the mainline kernel soon.
Great, thank you for testing. The fix is queued up for merging into 5.7-rc# here:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/log/?h=usb-linus
Once it hits Linus' tree I expect Greg to quickly add it to the 5.6.y series. In the meantime you can keep using the test kernel build I did to work around this issue.

I can also confirm that this has eliminated the oops I was seeing.

As expected the fix has landed in 5.6.8 and is now available in the official Fedora 5.6.8 kernels:
F31: https://koji.fedoraproject.org/koji/buildinfo?buildID=1499538
F32: https://koji.fedoraproject.org/koji/buildinfo?buildID=1499537
Running:
sudo dnf --enablerepo=updates-testing 'kernel*'
Should get you the new, fixed, official kernel.

Erm, that dnf command is missing the "update" command, it should be:
sudo dnf --enablerepo=updates-testing update 'kernel*'

*** Bug 1830426 has been marked as a duplicate of this bug. ***
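
To see whether a machine is already running a kernel at or above the 5.6.8 version mentioned in the last comments, a quick version comparison can be scripted. This is a rough sketch, not an official check: the helper name version_ge is made up for illustration, it relies on the -V (version sort) option of GNU sort, and it only compares the upstream version number, so a patched test build with a lower version number will be reported as older even if it carries the fix.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if version $1 is greater than or
# equal to version $2. Uses GNU sort's -V (version sort) to order the two.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Drop the release suffix, e.g. "5.6.8-200.fc31.x86_64" becomes "5.6.8".
running="$(uname -r | cut -d- -f1)"

if version_ge "$running" "5.6.8"; then
    echo "kernel $running is at or above 5.6.8"
else
    echo "kernel $running predates 5.6.8; update it or keep using a test build"
fi
```

If the running kernel is older, the corrected dnf command from the comment above is the documented way in this thread to pull the updated kernel packages.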