Description of problem:
The system does not boot after a kernel upgrade from 4.18.0-425.3.1.el8.x86_64 to 4.18.0-425.10.1.el8_7.x86_64.

The system shows:
watchdog: bug: soft lockup - cpu7 stuck for 23s
timed out waiting for device dev-mapper
dependency failed for resume from hibernation using device dev mapper
watchdog: bug: soft lockup - cpu7 stuck for 22s

When the old kernel 4.18.0-425.3.1.el8.x86_64 is chosen at the grub2 menu, the system boots again.

Version-Release number of selected component (if applicable):
4.18.0-425.10.1.el8_7.x86_64

How reproducible:
We have two Dell 3600 series workstations (with NVMe) that are affected by this problem.

Steps to Reproduce:
1. Update the kernel from 4.18.0-425.3.1.el8.x86_64 to 4.18.0-425.10.1.el8_7.x86_64
2. Reboot
3. The system shows the errors above

Actual results:

Expected results:

Additional info:
This seems to be the same class of issue as reported at https://bugzilla.redhat.com/show_bug.cgi?id=2160842

Would you mind setting the system up to collect a vmcore when such occurrences are observed?
Hi,

could you provide a link to a document on how to collect a vmcore?
The affected systems run at two customer locations, and we will need a step-by-step guide to collect the information.

Regards
Hansjörg
(In reply to Need Real Name from comment #2)
> Hi
> could you provide a link to a document how to collect vmcore?
> The affected systems run at two customer locations and we will need a
> step-by-step guide to collect the information.
>

You can start here: https://access.redhat.com/solutions/6038

If you don't know how to set up the system to capture a vmcore when it hangs, it is also advisable to open a support case with the Red Hat support department and get help to accomplish it. (Bugzilla is not a support channel.)
Hi,

thanks. The problem occurs at very early boot (waiting for device-mapper) when trying to resume from hibernation (as you can see from the log).
Therefore I doubt that kdump would be available at this stage.

Regards
Hansjörg
Hi,

I found this:

https://elrepo.org/bugs/view.php?id=1316

"RHEL8.7 system with the kernel version 4.18.0-425.3.1.el8.x86_64 fails to boot with soft lockup message"

Any kmod packages that use the affected 'pv_lock_ops' symbol need rebuilding against (bug-free) kernel-4.18.0-425.10.1.el8_7

The affected systems have nvidia.ko installed from elrepo.
I will test the new nvidia.ko today and let you know if it helps.

Regards
Hansjörg
(In reply to Need Real Name from comment #5)
> Hi
>
> I found this
>
> https://elrepo.org/bugs/view.php?id=1316
>
> "RHEL8.7 system with the kernel version 4.18.0-425.3.1.el8.x86_64 fails to
> boot with soft lockup message"
>
> Any kmod packages that use the affected 'pv_lock_ops' symbol need rebuilding
> against (bug-free) kernel-4.18.0-425.10.1.el8_7
>

Yes, that's correct. An unintended and silent kABI break was introduced in kernel-4.18.0-425.el8, which made modules built for older releases stop loading due to the paravirt lock patching.
The fix for that kABI break, introduced in kernel-4.18.0-425.10.1.el8_7, ended up causing the same problem for modules compiled against all earlier RHEL-8.7 builds.

So, in your case, a module that was compiled for kernel-4.18.0-425.3.1.el8.x86_64 will stop loading when the kernel is updated to 4.18.0-425.10.1.el8_7.

-- Rafael

*** This bug has been marked as a duplicate of bug 2160842 ***
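
The rule described in the comment above can be written down as a small check. This is only an illustrative sketch, not a Red Hat or ELRepo tool: the function names `release_tuple` and `needs_rebuild` are made up for this example, it looks at kernel release strings only, and it assumes the module in question actually uses the affected 'pv_lock_ops' symbol. It also assumes that kernels newer than 4.18.0-425.10.1.el8_7 behave like that build, which the thread does not state.

```python
import re

# Boundaries taken from the comment above:
#   kernel-4.18.0-425.el8        introduced the kABI break
#   kernel-4.18.0-425.10.1.el8_7 carries the fix
BREAK_RELEASE = (425,)
FIX_RELEASE = (425, 10, 1)


def release_tuple(kernel: str) -> tuple:
    """Return the numeric release fields of a RHEL 8 kernel string.

    '4.18.0-425.3.1.el8.x86_64' gives (425, 3, 1).
    """
    match = re.match(r"^(?:kernel-)?4\.18\.0-(\d+(?:\.\d+)*)\.el8", kernel)
    if match is None:
        raise ValueError(f"not a recognised RHEL 8 kernel string: {kernel!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def needs_rebuild(built_for: str, running: str) -> bool:
    """Hypothetical helper: True if a pv_lock_ops-using module built for
    `built_for` is expected to stop loading on `running`.

    Only models the case from the thread: a module compiled against a
    RHEL-8.7 build older than the fix, loaded on a kernel with the fix.
    """
    built = release_tuple(built_for)
    run = release_tuple(running)
    return BREAK_RELEASE <= built < FIX_RELEASE <= run


if __name__ == "__main__":
    # The reporter's case: module built for 425.3.1, kernel updated to 425.10.1.
    print(needs_rebuild("4.18.0-425.3.1.el8.x86_64",
                        "4.18.0-425.10.1.el8_7.x86_64"))
```

A module for which this check is true would need a build made against kernel-4.18.0-425.10.1.el8_7, as noted in the ELRepo report.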
Hi,

with the new nvidia-x11-drv-525.85.05-1.el8_7.elrepo.x86_64, the system boots with 4.18.0-425.10.1.el8_7 again.

Regards
Hansjörg