
Bug 2161152

Summary: System does not boot after kernel upgrade to 4.18.0-425.10.1.el8_7.x86_64
Product: Red Hat Enterprise Linux 8
Reporter: Need Real Name <hansjoerg.maurer>
Component: kernel
Assignee: core-kernel-bot <core-kernel-mgr>
kernel sub component: Kernel-Core
QA Contact: Red Hat Kernel QE team <kernel-qe>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: urgent
Priority: unspecified
CC: aquini, chref, reynolds
Version: 8.7
Flags: pm-rhel: mirror+
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-01-19 22:23:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Need Real Name 2023-01-16 06:48:13 UTC
Description of problem:
The system does not boot after a kernel upgrade from 4.18.0-425.3.1.el8.x86_64 to 4.18.0-425.10.1.el8_7.x86_64.
The system shows:
watchdog: BUG: soft lockup - CPU#7 stuck for 23s
timed out waiting for device dev-mapper
dependency failed for resume from hibernation using device dev-mapper
watchdog: BUG: soft lockup - CPU#7 stuck for 22s

When the old kernel 4.18.0-425.3.1.el8.x86_64 is chosen in the GRUB2 menu, the system boots again.

Version-Release number of selected component (if applicable):
4.18.0-425.10.1.el8_7.x86_64



How reproducible:
We have two Dell 3600 series workstations (with NVMe) that are affected by this problem.


Steps to Reproduce:
1. Update the kernel from 4.18.0-425.3.1.el8.x86_64 to 4.18.0-425.10.1.el8_7.x86_64
2. Reboot
3. The system shows the errors above

Actual results:


Expected results:


Additional info:

Comment 1 Rafael Aquini 2023-01-16 20:09:21 UTC
This seems to be the same class of issue as reported at
https://bugzilla.redhat.com/show_bug.cgi?id=2160842

Would you mind setting the system up to collect a vmcore
when such occurrences are observed?

Comment 2 Need Real Name 2023-01-16 21:05:48 UTC
Hi
Could you provide a link to a document on how to collect a vmcore?
The affected systems run at two customer locations, and we will need a step-by-step guide to collect the information.

Regards

Hansjörg

Comment 3 Rafael Aquini 2023-01-16 22:12:02 UTC
(In reply to Need Real Name from comment #2)
> Hi
> Could you provide a link to a document on how to collect a vmcore?
> The affected systems run at two customer locations, and we will need a
> step-by-step guide to collect the information.
> 

You can start here: https://access.redhat.com/solutions/6038

It's also advisable, if you don't know how to set up the system to
capture a vmcore when it hangs, to open a support case with the Red Hat
support department and get help to accomplish it.
(Bugzilla is not a support channel.)

Comment 4 Need Real Name 2023-01-17 05:46:40 UTC
Hi
thanks.
The problem occurs very early in boot (while waiting for device-mapper), when the system tries to resume from hibernation (as you can see from the log).
Therefore I doubt that kdump would be available at this stage.
Regards

Hansjörg

Comment 5 Need Real Name 2023-01-19 06:40:03 UTC
Hi

I found this

https://elrepo.org/bugs/view.php?id=1316

"RHEL8.7 system with the kernel version 4.18.0-425.3.1.el8.x86_64 fails to boot with soft lockup message"

Any kmod packages that use the affected 'pv_lock_ops' symbol need rebuilding against (bug-free) kernel-4.18.0-425.10.1.el8_7.

The affected systems have nvidia.ko installed from elrepo.
I will test the new nvidia.ko today and let you know if it helps.

Regards

Hansjörg
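
One way to confirm which kernel an installed module such as nvidia.ko was built against is to read its vermagic field with modinfo. The following is a minimal Python sketch, not something taken from this report: the helper names and the sample vermagic string are hypothetical, and it assumes modinfo from kmod is available on the affected system.

```python
import subprocess


def build_kernel_from_vermagic(vermagic: str) -> str:
    """Return the kernel release at the start of a vermagic string.

    The release is followed by space-separated build flags, so the
    first whitespace-separated token is the kernel release.
    """
    return vermagic.split()[0]


def module_build_kernel(module: str = "nvidia") -> str:
    """Query modinfo for a module (name or path to a .ko file).

    Not called in this sketch, because it needs the module to be
    installed on the machine where it runs.
    """
    result = subprocess.run(
        ["modinfo", "-F", "vermagic", module],
        check=True,
        capture_output=True,
        text=True,
    )
    return build_kernel_from_vermagic(result.stdout)


# Hypothetical vermagic string, for illustration only.
SAMPLE = "4.18.0-425.3.1.el8.x86_64 SMP mod_unload modversions"
print(build_kernel_from_vermagic(SAMPLE))
```

If the reported build kernel is one of the earlier RHEL 8.7 kernels, the module is a candidate for the rebuild described in the elrepo report.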

Comment 6 Rafael Aquini 2023-01-19 22:23:39 UTC
(In reply to Need Real Name from comment #5)
> Hi
> 
> I found this
> 
> https://elrepo.org/bugs/view.php?id=1316
> 
> "RHEL8.7 system with the kernel version 4.18.0-425.3.1.el8.x86_64 fails to
> boot with soft lockup message"
> 
> Any kmod packages that use the affected 'pv_lock_ops' symbol need rebuilding
> against (bug-free) kernel-4.18.0-425.10.1.el8_7
> 

Yes, that's correct. An unwitting and silent kABI break was introduced in
kernel-4.18.0-425.el8, which made modules built for older releases stop loading
due to the paravirt lock patching. The fix for that kABI break, introduced in
kernel-4.18.0-425.10.1.el8_7, ended up causing the same problem for modules compiled
against all earlier RHEL-8.7 builds. So, in your case, a module that
was compiled for kernel-4.18.0-425.3.1.el8.x86_64 will stop loading when the
kernel is updated to 4.18.0-425.10.1.el8_7.

-- Rafael

*** This bug has been marked as a duplicate of bug 2160842 ***
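
The rule described in comment 6 can be sketched as a small version check. This is an illustration under stated assumptions, not a Red Hat tool: the function names are hypothetical, and the check covers only the RHEL 8.7 (build 425) series named in the comment. It models the second case only, where a module built against an earlier 8.7 kernel is loaded on 4.18.0-425.10.1.el8_7 or a later 425 build.

```python
import re

# Build number of kernel-4.18.0-425.10.1.el8_7, which carries the fix (comment 6).
FIXED_BUILD = (425, 10, 1)


def build_number(release: str) -> tuple:
    """Extract the numeric build after '4.18.0-', e.g. (425, 3, 1)."""
    m = re.match(r"(?:kernel-)?4\.18\.0-(\d+(?:\.\d+)*)\.el8", release)
    if m is None:
        raise ValueError(f"not a RHEL 8 kernel release: {release!r}")
    return tuple(int(part) for part in m.group(1).split("."))


def needs_rebuild(module_built_for: str, running: str) -> bool:
    """True if a module built for an earlier 8.7 kernel runs on a fixed 8.7 kernel.

    Assumption: only the 425 series is considered, because comment 6
    speaks only about RHEL 8.7 builds.
    """
    built = build_number(module_built_for)
    run = build_number(running)
    built_on_early_8_7 = built[0] == 425 and built < FIXED_BUILD
    running_fixed_8_7 = run[0] == 425 and run >= FIXED_BUILD
    return built_on_early_8_7 and running_fixed_8_7


print(needs_rebuild("4.18.0-425.3.1.el8.x86_64",
                    "4.18.0-425.10.1.el8_7.x86_64"))
```

For the kernel pair in this report the check reports that a rebuild is needed, which matches the outcome in the thread: the system booted again once the module was replaced with one built for the new kernel.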

Comment 7 Need Real Name 2023-01-22 14:55:05 UTC
Hi

With the new
nvidia-x11-drv-525.85.05-1.el8_7.elrepo.x86_64
the system boots with 4.18.0-425.10.1.el8_7 again.

Regards

Hansjörg