Bug 2072877

Summary: [RHEL9] kernel panic when create raid1 with lvcreate command
Product: Red Hat Enterprise Linux 9 Reporter: ChanghuiZhong <czhong>
Component: mdadm Assignee: Nigel Croxon <ncroxon>
Status: CLOSED CURRENTRELEASE QA Contact: Fine Fan <ffan>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 9.1 CC: dledford, ffan, guazhang, hwkernel-mgr, ncroxon, xni
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-5.14.0-94.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-05-19 16:05:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description ChanghuiZhong 2022-04-07 07:56:59 UTC
Description of problem:
Found a known issue on 5.14.0-76.mr626_220401_0216.el9: the kernel panics when creating a raid1 LV with the lvcreate command.

[ 3308.408252] BUG: kernel NULL pointer dereference, address: 0000000000000060 
[ 3308.416022] #PF: supervisor write access in kernel mode 
[ 3308.421851] #PF: error_code(0x0002) - not-present page 
[ 3308.427585] PGD 0 P4D 0  
[ 3308.430409] Oops: 0002 [#1] PREEMPT SMP PTI 
[ 3308.435075] CPU: 1 PID: 1655 Comm: lvcreate Kdump: loaded Not tainted 5.14.0-76.mr626_220401_0216.el9.x86_64 #1 
[ 3308.446333] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.4.3 01/17/2017 
[ 3308.454681] RIP: 0010:blk_queue_flag_set+0x7/0x10 
[ 3308.459931] Code: 00 00 00 0f 1f 44 00 00 48 8b 35 b4 5c 50 02 48 8d 57 38 bf 00 20 00 00 e9 66 2e c5 ff 66 0f 1f 44 00 00 0f 1f 44 00 00 89 ff <f0> 48 0f ab 7e 60 c3 66 90 0f 1f 44 00 00 65 48 8b 04 25 40 6f 01 
[ 3308.480884] RSP: 0018:ffff9808c17dfb98 EFLAGS: 00010202 
[ 3308.486712] RAX: ffff8b1289dcb000 RBX: ffff8b128390c070 RCX: 0000000000000000 
[ 3308.494672] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000001d 
[ 3308.502633] RBP: ffff8b128390c058 R08: ffffffff92c61310 R09: ffff9808c17dfa58 
[ 3308.510593] R10: 0000000000000000 R11: 0000000000000060 R12: ffff8b128390c070 
[ 3308.518554] R13: ffff8b128487d440 R14: 0000000000000000 R15: 0000000000000001 
[ 3308.526514] FS:  00007fdf36cf88c0(0000) GS:ffff8b15efc40000(0000) knlGS:0000000000000000 
[ 3308.535541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[ 3308.541949] CR2: 0000000000000060 CR3: 0000000104008002 CR4: 00000000001706e0 
[ 3308.549909] Call Trace: 
[ 3308.552634]  md_run+0x48f/0x9b0 
[ 3308.556139]  ? super_validate+0x10e/0x190 [dm_raid] 
[ 3308.561580]  raid_ctr+0x47d/0xb2a [dm_raid] 
[ 3308.566248]  dm_table_add_target+0x177/0x360 [dm_mod] 
[ 3308.571890]  table_load+0x127/0x370 [dm_mod] 
[ 3308.576656]  ctl_ioctl+0x16b/0x280 [dm_mod] 
[ 3308.581328]  dm_ctl_ioctl+0xa/0x10 [dm_mod] 
[ 3308.585997]  __x64_sys_ioctl+0x82/0xb0 
[ 3308.590179]  do_syscall_64+0x3b/0x90 
[ 3308.594168]  entry_SYSCALL_64_after_hwframe+0x44/0xae 
[ 3308.599804] RIP: 0033:0x7fdf37194c0b 
[ 3308.603788] Code: 73 01 c3 48 8b 0d 1d 62 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed 61 1b 00 f7 d8 64 89 01 48 
[ 3308.624741] RSP: 002b:00007ffde2a860f8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 
[ 3308.633187] RAX: ffffffffffffffda RBX: 000055892fc361b0 RCX: 00007fdf37194c0b 
[ 3308.641148] RDX: 000055893163f090 RSI: 00000000c138fd09 RDI: 0000000000000003 
[ 3308.649108] RBP: 000055892fd0d0d6 R08: 000055892fd84500 R09: 00007ffde2a85f50 
[ 3308.657068] R10: 0000000000000007 R11: 0000000000000206 R12: 000055893163f0c0 
[ 3308.665028] R13: 000055893163f140 R14: 000055893163f090 R15: 000055893161f430 
[ 3308.672990] Modules linked in: raid1 dm_raid raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc dm_multipath intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm dcdbas irqbypass mgag200 i2c_algo_bit drm_kms_helper rapl iTCO_wdt iTCO_vendor_support syscopyarea sysfillrect intel_cstate sysimgblt fb_sys_fops cec intel_uncore pcspkr lpc_ich ses mxm_wmi enclosure mei_me mei ipmi_ssif scsi_transport_sas ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi sg crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel libata tg3 megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod 
[ 3308.751764] CR2: 0000000000000060 

Version-Release number of selected component (if applicable):
5.14.0-76.mr626_220401_0216.el9

How reproducible:
100%

Steps to Reproduce:
parted -s /dev/sdb  mklabel gpt  mkpart primary 1M 100G
parted -s /dev/sdc  mklabel gpt  mkpart primary 1M 100G
parted -s /dev/sdd  mklabel gpt  mkpart primary 1M 100G
parted -s /dev/sde  mklabel gpt  mkpart primary 1M 100G

mkfs -t xfs -f /dev/sdb1
mkfs -t xfs -f /dev/sdc1
mkfs -t xfs -f /dev/sdd1
mkfs -t xfs -f /dev/sde1

pvcreate -y /dev/{"sdb"1,"sdc"1,"sdd"1,"sde"1}
vgcreate  black_bird  /dev/{"sdb"1,"sdc"1,"sdd"1,"sde"1}
lvcreate --type raid1 -m 3 -n non_synced_primary_raid1_3legs_1 -L 3G black_bird /dev/"sdb"1:0-2400 /dev/"sdc"1:0-2400 /dev/"sdd"1:0-2400 /dev/"sde"1:0-2400

Actual results:
kernel panic

Expected results:
The raid1 LV is created with no kernel panic.

Additional info:

This is a known issue, and it can be fixed by the commit:

commit 0f9650bd838efe5c52f7e5f40c3204ad59f1964d
Author: Song Liu <song>
Date:   Wed Feb 2 09:24:10 2022 -0800

    md: fix NULL pointer deref with nowait but no mddev->queue


For more detail, see https://bugzilla.redhat.com/show_bug.cgi?id=2066297#c9

Comment 3 ChanghuiZhong 2022-04-16 09:38:37 UTC
Reproduced this issue on 5.14.0-78.el9:

[   66.259633] BUG: kernel NULL pointer dereference, address: 0000000000000060 
[   66.266592] #PF: supervisor write access in kernel mode 
[   66.271821] #PF: error_code(0x0002) - not-present page 
[   66.276958] PGD 0 P4D 0  
[   66.279500] Oops: 0002 [#1] PREEMPT SMP NOPTI 
[   66.283857] CPU: 29 PID: 2270 Comm: lvcreate Kdump: loaded Not tainted 5.14.0-78.el9.x86_64 #1 
[   66.292465] Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS 2.6.6 01/13/2022 
[   66.300028] RIP: 0010:blk_queue_flag_set+0x7/0x10 
[   66.304735] Code: d2 74 11 48 8b 52 78 48 85 d2 74 08 e8 02 79 94 00 0f b6 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 89 ff <f0> 48 0f ab 7e 60 c3 66 90 0f 1f 44 00 00 89 ff f0 48 0f b3 7e 60 
[   66.323483] RSP: 0018:ffffb9ddc4e7bb98 EFLAGS: 00010202 
[   66.328709] RAX: ffff9346135dab70 RBX: ffff934601074070 RCX: 0000000000000000 
[   66.335841] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000001d 
[   66.342973] RBP: ffff934601074058 R08: ffffffffae261e10 R09: ffffb9ddc4e7ba58 
[   66.350108] R10: 0000000000000000 R11: 0000000000000000 R12: ffff934601074070 
[   66.357240] R13: ffff9346067bf840 R14: 0000000000000000 R15: 0000000000000001 
[   66.364372] FS:  00007f48695698c0(0000) GS:ffff9347f7d40000(0000) knlGS:0000000000000000 
[   66.372460] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[   66.378204] CR2: 0000000000000060 CR3: 0000000289b3c000 CR4: 0000000000350ee0 
[   66.385337] Call Trace: 
[   66.387791]  md_run+0x48f/0x9b0 
[   66.390939]  ? super_validate+0x10e/0x190 [dm_raid] 
[   66.395815]  raid_ctr+0x47d/0xb2a [dm_raid] 
[   66.400005]  dm_table_add_target+0x177/0x360 [dm_mod] 
[   66.405091]  table_load+0x127/0x370 [dm_mod] 
[   66.409370]  ctl_ioctl+0x16b/0x280 [dm_mod] 
[   66.413560]  dm_ctl_ioctl+0xa/0x10 [dm_mod] 
[   66.417752]  __x64_sys_ioctl+0x82/0xb0 
[   66.421504]  do_syscall_64+0x3b/0x90 
[   66.425084]  entry_SYSCALL_64_after_hwframe+0x44/0xae 
[   66.430136] RIP: 0033:0x7f4869a05c0b 
[   66.433716] Code: 73 01 c3 48 8b 0d 1d 62 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed 61 1b 00 f7 d8 64 89 01 48 
[   66.452460] RSP: 002b:00007ffc83a85a58 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 
[   66.460030] RAX: ffffffffffffffda RBX: 0000561972f6f1b0 RCX: 00007f4869a05c0b 
[   66.467160] RDX: 00005619747bc780 RSI: 00000000c138fd09 RDI: 0000000000000003 
[   66.474285] RBP: 00005619730460d6 R08: 00005619730bd500 R09: 00007ffc83a858b0 
[   66.481417] R10: 0000000000000007 R11: 0000000000000206 R12: 00005619747bc7b0 
[   66.488550] R13: 00005619747bc830 R14: 00005619747bc780 R15: 00005619747aecd0 
[   66.495677] Modules linked in: raid1 dm_raid raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq loop rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc dm_multipath intel_rapl_msr dcdbas intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl pcspkr dell_smbios dell_wmi_descriptor wmi_bmof ipmi_ssif mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec k10temp i2c_piix4 acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi sg crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel libata tg3 ccp wmi dm_mirror dm_region_hash dm_log dm_mod 
[   66.558916] CR2: 0000000000000060
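
The oops on 5.14.0-78.el9 has the same signature as the one in the description: a NULL pointer dereference at address 0x60 with RIP in blk_queue_flag_set, reached from md_run via raid_ctr in dm_raid. When triaging possible duplicates, a saved console or kdump log can be checked for that signature. The following is a minimal sketch only. The function name matches_bug_signature is hypothetical, the log is assumed to be a plain-text file, and a match only suggests the same bug.

```shell
#!/bin/sh
# Hypothetical triage helper (not part of any Red Hat tooling).
# Succeeds (exit status 0) when the given log file contains every element
# of the oops signature recorded in this bug, and fails otherwise.
matches_bug_signature() {
    log=$1
    # All four patterns must be present somewhere in the log.
    grep -q 'BUG: kernel NULL pointer dereference' "$log" &&
    grep -q 'RIP: 0010:blk_queue_flag_set' "$log" &&
    grep -q 'md_run+' "$log" &&
    grep -q 'raid_ctr+.*\[dm_raid\]' "$log"
}

# Example use, assuming the log was saved to ./oops.txt:
#   matches_bug_signature ./oops.txt && echo "signature matches bug 2072877"
```

The check ignores offsets such as +0x48f/0x9b0, since those can change between kernel builds.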

Comment 6 Nigel Croxon 2022-05-19 16:05:29 UTC
https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/858

Fixed in Version: kernel-5.14.0-94.el9
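
To check whether a host is already running a build at or after the fixed version, the build number in the kernel release string can be compared against 94. The following is a minimal sketch only. The function name kernel_has_fix and its output words are hypothetical, and it assumes a RHEL 9 or CentOS Stream 9 style release string such as 5.14.0-94.el9.x86_64. Scratch or merge-request builds (for example 5.14.0-76.mr626_220401_0216.el9) are compared by their leading build number only, so the result for them is approximate.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Red Hat tooling).
# Prints "has-fix", "predates-fix", or "unknown" for a kernel release
# string, relative to kernel-5.14.0-94.el9 as recorded in this bug.
# With no argument, it inspects the running kernel via uname -r.
kernel_has_fix() {
    release=${1:-$(uname -r)}

    # Only el9 kernels based on 5.14.0 are understood by this sketch.
    case $release in
        5.14.0-*.el9*) ;;
        *) echo unknown; return 0 ;;
    esac

    build=${release#5.14.0-}   # drop the base version, e.g. "94.el9.x86_64"
    build=${build%%.*}         # keep the leading build number, e.g. "94"

    # Refuse to compare anything that is not a plain decimal number.
    case $build in
        ''|*[!0-9]*) echo unknown; return 0 ;;
    esac

    if [ "$build" -ge 94 ]; then
        echo has-fix
    else
        echo predates-fix
    fi
}

# Example use on the host under test:
#   kernel_has_fix
```

A "predates-fix" result only means the build is older than the fixed version. It does not prove that the build is affected.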

Comment 7 guazhang@redhat.com 2022-09-09 00:04:17 UTC
*** Bug 2066174 has been marked as a duplicate of this bug. ***