Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1958724

Summary: NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
Product: Red Hat Enterprise Linux 9 Reporter: Qiao Zhao <qzhao>
Component: kernelAssignee: Jerome Marchand <jmarchan>
kernel sub component: Ftrace QA Contact: Qiao Zhao <qzhao>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: unspecified    
Priority: unspecified CC: crwood, williams
Version: 9.0Keywords: Triaged
Target Milestone: betaFlags: pm-rhel: mirror+
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-20 08:28:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Qiao Zhao 2021-05-10 02:33:00 UTC
Description of problem:

Running ftrace tier test trigger this panic: (case ftrace-stress-test is the last one, but not sure this case cause the panic)

[  266.096450] LTP: starting ftrace_regression01 (ftrace_regression01.sh)
[  266.382835] LTP: starting ftrace_regression02 (ftrace_regression02.sh)
[  266.837985] LTP: starting ftrace-stress-test (ftrace_stress_test.sh 90)
[  267.146863] nop_test_accept flag set to 1: we accept. Now cat trace_options to see the result
[  269.529412] nop_test_accept flag set to 0: we accept. Now cat trace_options to see the result
[  271.732463] nop_test_accept flag set to 1: we accept. Now cat trace_options to see the result
[  274.233153] nop_test_accept flag set to 0: we accept. Now cat trace_options to see the result
[  277.084742] nop_test_accept flag set to 1: we accept. Now cat trace_options to see the result
[  280.866461] nop_test_accept flag set to 0: we accept. Now cat trace_options to see the result
[  280.866530] Ring buffer clock went backwards: 283531425012 -> 280866529715
[  286.673401] nop_test_accept flag set to 1: we accept. Now cat trace_options to see the result
[  288.423084] nop_test_accept flag set to 0: we accept. Now cat trace_options to see the result
[  298.596095] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
[  298.596099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nfs_ssc fscache rfkill sunrpc intel_rapl_msr intel_rapl_common vfat fat sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support dcdbas irqbypass rapl intel_cstate intel_uncore ipmi_ssif pcspkr mgag200 i2c_algo_bit drm_kms_helper cdc_ether mxm_wmi usbnet syscopyarea sysfillrect sysimgblt fb_sys_fops cec mii lpc_ich mei_me mei ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse ip_tables xfs sr_mod cdrom sd_mod t10_pi ahci libahci crct10dif_pclmul bnx2x crc32_pclmul libata ghash_clmulni_intel tg3 megaraid_sas mdio libcrc32c crc32c_intel wmi dm_mirror dm_region_hash dm_log dm_mod
[  298.596187] CPU: 4 PID: 49987 Comm: sh Kdump: loaded Tainted: G               X --------- ---  5.12.0-1.el9.x86_64 #1
[  298.596189] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.11.0 11/02/2019
[  298.596190] RIP: 0010:native_queued_spin_lock_slowpath.part.0+0x42/0x190
[  298.596191] Code: 0f ba 2a 08 0f 92 c1 8b 02 0f b6 c9 c1 e1 08 30 e4 09 c8 a9 00 01 ff ff 0f 85 ed 00 00 00 85 c0 74 0e 8b 02 84 c0 74 08 f3 90 <8b> 02 84 c0 75 f8 b8 01 00 00 00 66 89 02 c3 8b 37 b8 00 02 00 00
[  298.596193] RSP: 0018:ffff95824795fab0 EFLAGS: 00000002
[  298.596196] RAX: 0000000000300101 RBX: 0000000000000002 RCX: 0000000000000000
[  298.596197] RDX: ffffffff9c8932c8 RSI: 0000000000000000 RDI: ffffffff9c8932c8
[  298.596198] RBP: ffffffff9c8932c8 R08: 0000004375628667 R09: ffff8928c126a000
[  298.596200] R10: 8000000000000063 R11: ffff8928c1240081 R12: 0000004375628667
[  298.596201] R13: 00000000000003e8 R14: 0000000000000f54 R15: 00000000000003d5
[  298.596202] FS:  00007f556b6e4740(0000) GS:ffff892de7c80000(0000) knlGS:0000000000000000
[  298.596204] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  298.596205] CR2: 00005587bf3e88d0 CR3: 0000000685fb8005 CR4: 00000000001706e0
[  298.596206] Call Trace:
[  298.596207]  trace_clock_global+0x75/0x80
[  298.596208]  __rb_reserve_next+0x1f0/0x440
[  298.596209]  ring_buffer_lock_reserve+0x11d/0x340
[  298.596211]  trace_buffer_lock_reserve+0x10/0x50
[  298.596212]  __trace_graph_return+0x1f/0x80
[  298.596213]  trace_graph_return+0x8f/0xd0
[  298.596214]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596215]  ftrace_return_to_handler+0x8b/0xf0
[  298.596216]  ? flush_ldt+0x30/0x30
[  298.596217]  ? tick_nohz_handler+0x1/0x90
[  298.596218]  return_to_handler+0x15/0x30
[  298.596220]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596221]  ? __text_poke+0x2d4/0x440
[  298.596222]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596223]  ? text_poke_bp_batch+0x112/0x1f0
[  298.596224]  ? __x64_sys_setreuid16+0x30/0x30
[  298.596225]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596227]  ? text_poke_flush+0x44/0x50
[  298.596228]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596229]  ? text_poke_queue+0x26/0x70
[  298.596230]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596231]  ? ftrace_replace_code+0x117/0x170
[  298.596232]  ? ftrace_modify_all_code+0xb8/0x150
[  298.596233]  ? cache_mod+0x290/0x290
[  298.596234]  ? ftrace_run_update_code+0x13/0x70
[  298.596236]  ? ftrace_shutdown.part.0+0x106/0x1a0
[  298.596237]  ? unregister_ftrace_function+0x35/0x60
[  298.596238]  ? stack_trace_sysctl+0x8b/0x90
[  298.596239]  ? proc_sys_call_handler+0x141/0x260
[  298.596240]  ? new_sync_write+0x11c/0x1b0
[  298.596241]  ? vfs_write+0x1b8/0x270
[  298.596242]  ? ksys_write+0x5f/0xe0
[  298.596243]  ? do_syscall_64+0x33/0x40
[  298.596245]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  298.596246] Kernel panic - not syncing: Hard LOCKUP
[  298.596247] CPU: 4 PID: 49987 Comm: sh Kdump: loaded Tainted: G               X --------- ---  5.12.0-1.el9.x86_64 #1
[  298.596249] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.11.0 11/02/2019
[  298.596250] Call Trace:
[  298.596251]  <NMI>
[  298.596252]  dump_stack+0x64/0x7c
[  298.596253]  panic+0xfb/0x2d7
[  298.596255]  nmi_panic.cold+0xc/0xc
[  298.596256]  watchdog_overflow_callback.cold+0x5c/0x70
[  298.596257]  __perf_event_overflow+0x52/0xf0
[  298.596258]  handle_pmi_common+0x200/0x310
[  298.596259]  ? do_kernel_range_flush+0x30/0x30
[  298.596260]  ? prepare_ftrace_return+0x5c/0x80
[  298.596262]  ? intel_pmu_handle_irq+0xf3/0x210
[  298.596263]  ? icl_set_topdown_event_period+0x100/0x100
[  298.596264]  ? function_graph_enter+0xe7/0x140
[  298.596265]  ? icl_set_topdown_event_period+0x100/0x100
[  298.596266]  ? prepare_ftrace_return+0x5c/0x80
[  298.596267]  ? ftrace_graph_caller+0x6b/0xa0
[  298.596268]  ? intel_pmu_reset+0x1c0/0x1c0
[  298.596269]  ? prepare_ftrace_return+0x5c/0x80
[  298.596271]  ? ftrace_graph_caller+0x6b/0xa0
[  298.596272]  intel_pmu_handle_irq+0xf3/0x210
[  298.596273]  perf_event_nmi_handler+0x28/0x50
[  298.596274]  nmi_handle+0x58/0x100
[  298.596275]  default_do_nmi+0x42/0x130
[  298.596276]  exc_nmi+0x11f/0x150
[  298.596277]  end_repeat_nmi+0x16/0x55
[  298.596278] RIP: 0010:native_queued_spin_lock_slowpath.part.0+0x42/0x190
[  298.596280] Code: 0f ba 2a 08 0f 92 c1 8b 02 0f b6 c9 c1 e1 08 30 e4 09 c8 a9 00 01 ff ff 0f 85 ed 00 00 00 85 c0 74 0e 8b 02 84 c0 74 08 f3 90 <8b> 02 84 c0 75 f8 b8 01 00 00 00 66 89 02 c3 8b 37 b8 00 02 00 00
[  298.596281] RSP: 0018:ffff95824795fab0 EFLAGS: 00000002
[  298.596283] RAX: 0000000000300101 RBX: 0000000000000002 RCX: 0000000000000000
[  298.596285] RDX: ffffffff9c8932c8 RSI: 0000000000000000 RDI: ffffffff9c8932c8
[  298.596286] RBP: ffffffff9c8932c8 R08: 0000004375628667 R09: ffff8928c126a000
[  298.596287] R10: 8000000000000063 R11: ffff8928c1240081 R12: 0000004375628667
[  298.596288] R13: 00000000000003e8 R14: 0000000000000f54 R15: 00000000000003d5
[  298.596290]  ? native_queued_spin_lock_slowpath.part.0+0x42/0x190
[  298.596291]  ? native_queued_spin_lock_slowpath.part.0+0x42/0x190
[  298.596292]  </NMI>
[  298.596293]  trace_clock_global+0x75/0x80
[  298.596294]  __rb_reserve_next+0x1f0/0x440
[  298.596295]  ring_buffer_lock_reserve+0x11d/0x340
[  298.596296]  trace_buffer_lock_reserve+0x10/0x50
[  298.596297]  __trace_graph_return+0x1f/0x80
[  298.596299]  trace_graph_return+0x8f/0xd0
[  298.596300]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596301]  ftrace_return_to_handler+0x8b/0xf0
[  298.596302]  ? flush_ldt+0x30/0x30
[  298.596303]  ? tick_nohz_handler+0x1/0x90
[  298.596304]  return_to_handler+0x15/0x30
[  298.596306]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596307]  ? __text_poke+0x2d4/0x440
[  298.596308]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596309]  ? text_poke_bp_batch+0x112/0x1f0
[  298.596310]  ? __x64_sys_setreuid16+0x30/0x30
[  298.596312]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596313]  ? text_poke_flush+0x44/0x50
[  298.596314]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596315]  ? text_poke_queue+0x26/0x70
[  298.596316]  ? ftrace_graph_caller+0xa0/0xa0
[  298.596317]  ? ftrace_replace_code+0x117/0x170
[  298.596318]  ? ftrace_modify_all_code+0xb8/0x150
[  298.596320]  ? cache_mod+0x290/0x290
[  298.596321]  ? ftrace_run_update_code+0x13/0x70
[  298.596322]  ? ftrace_shutdown.part.0+0x106/0x1a0
[  298.596323]  ? unregister_ftrace_function+0x35/0x60
[  298.596324]  ? stack_trace_sysctl+0x8b/0x90
[  298.596325]  ? proc_sys_call_handler+0x141/0x260
[  298.596326]  ? new_sync_write+0x11c/0x1b0
[  298.596327]  ? vfs_write+0x1b8/0x270
[  298.596328]  ? ksys_write+0x5f/0xe0
[  298.596330]  ? do_syscall_64+0x33/0x40
[  298.596331]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae


Host -> https://beaker.engineering.redhat.com/view/dell-per730-01.dell2.lab.eng.bos.redhat.com
Job  -> https://beaker.engineering.redhat.com/jobs/5351554
nvr  -> 5.12.0-1.el9.x86_64

:  vmcore files:
:      http://ibm-x3250m4-03.rhts.eng.pek2.redhat.com/vmcore/qzhao/5.12.0-1.el9.x86_64/5351554/dell-per730-01.dell2.lab.eng.bos.redhat.com/10.16.218.153-2021-05-08-08:57:45/vmcore
:      http://ibm-x3250m4-03.rhts.eng.pek2.redhat.com/vmcore/qzhao/5.12.0-1.el9.x86_64/5351554/dell-per730-01.dell2.lab.eng.bos.redhat.com/10.16.218.153-2021-05-08-08:57:45/vmcore-dmesg.txt

Version-Release number of selected component (if applicable):


How reproducible:
once

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info: