Bug 2184224

Summary: dm raid5 hang during lvcreate-raid.sh lvm-testsuite test
Product: Red Hat Enterprise Linux 9 Reporter: Ben Marzinski <bmarzins>
Component: kernelAssignee: XiaoNi <xni>
kernel sub component: RAID QA Contact: Red Hat Kernel QE team <kernel-qe>
Status: ASSIGNED --- Docs Contact:
Severity: unspecified    
Priority: high CC: agk, heinzm, jlelli, msnitzer, ncroxon, xni
Version: 9.3Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ben Marzinski 2023-04-04 00:49:27 UTC
Description of problem:
Repeatedly running
#lvm2-testsuite --only lvcreate-raid.sh

will eventually hang, causing the following kernel messages:
[ 1722.976156] INFO: task lvm:212260 blocked for more than 122 seconds.
[ 1722.982531]       Not tainted 5.14.0-294.bz2183556.el9.x86_64 #1
[ 1722.988544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1722.996378] task:lvm             state:D stack:    0 pid:212260 ppid:211591 flags:0x00000002
[ 1723.004817] Call Trace:
[ 1723.007278]  <TASK>
[ 1723.009387]  __schedule+0x248/0x630
[ 1723.012896]  schedule+0x5a/0xc0
[ 1723.016051]  md_write_start.part.0+0x197/0x240
[ 1723.020512]  ? cpuacct_percpu_seq_show+0x10/0x10
[ 1723.025141]  raid5_make_request+0x8c/0x3e0 [raid456]
[ 1723.030144]  ? sched_show_numa+0xf0/0xf0
[ 1723.034076]  md_handle_request+0x135/0x1e0
[ 1723.038186]  raid_map+0x25/0x40 [dm_raid]
[ 1723.042220]  __map_bio+0x46/0x1f0 [dm_mod]
[ 1723.046365]  dm_split_and_process_bio+0x1b8/0x460 [dm_mod]
[ 1723.051863]  dm_submit_bio+0x4d/0x130 [dm_mod]
[ 1723.056323]  __submit_bio+0x89/0x130
[ 1723.059921]  __submit_bio_noacct+0x81/0x1f0
[ 1723.064114]  submit_bio_wait+0x54/0xb0
[ 1723.067878]  blkdev_issue_zeroout+0x12f/0x200
[ 1723.072245]  blkdev_common_ioctl+0x6aa/0x9f0
[ 1723.076526]  blkdev_ioctl+0x116/0x280
[ 1723.080200]  __x64_sys_ioctl+0x8a/0xc0
[ 1723.083964]  do_syscall_64+0x5c/0x90
[ 1723.087549]  ? vfs_write+0x178/0x280
[ 1723.091138]  ? vfs_write+0x178/0x280
[ 1723.094726]  ? syscall_exit_work+0x11a/0x150
[ 1723.099008]  ? syscall_exit_to_user_mode+0x12/0x30
[ 1723.103809]  ? do_syscall_64+0x69/0x90
[ 1723.107569]  ? do_syscall_64+0x69/0x90
[ 1723.111321]  ? do_syscall_64+0x69/0x90
[ 1723.115073]  ? sysvec_apic_timer_interrupt+0x3c/0x90
[ 1723.120041]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 1723.125102] RIP: 0033:0x7fe8e3e3ec6b
[ 1723.128690] RSP: 002b:00007ffdb9ed8668 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1723.136264] RAX: ffffffffffffffda RBX: 00005595c908f3f8 RCX: 00007fe8e3e3ec6b
[ 1723.143406] RDX: 00007ffdb9ed8690 RSI: 000000000000127f RDI: 0000000000000004
[ 1723.143410] RBP: 0000000000000008 R08: 0000000000000000 R09: 00007ffdb9ed7067
[ 1723.143412] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdb9ed8690
[ 1723.143413] R13: 00005595caa21590 R14: 0000000000001000 R15: 00005595cab6a028
[ 1723.143417]  </TASK>
[ 1723.143419] INFO: task mdX_raid5:212303 blocked for more than 123 seconds.
[ 1723.143421]       Not tainted 5.14.0-294.bz2183556.el9.x86_64 #1
[ 1723.143422] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1723.143423] task:mdX_raid5       state:D stack:    0 pid:212303 ppid:     2 flags:0x00004000
[ 1723.143426] Call Trace:
[ 1723.143427]  <TASK>
[ 1723.143428]  __schedule+0x248/0x630
[ 1723.143431]  schedule+0x5a/0xc0
[ 1723.143433]  raid5d+0x44a/0x660 [raid456]
[ 1723.143441]  ? cpuacct_percpu_seq_show+0x10/0x10
[ 1723.143444]  md_thread+0xab/0x160
[ 1723.143446]  ? cpuacct_percpu_seq_show+0x10/0x10
[ 1723.143448]  ? super_90_load+0x70/0x70
[ 1723.143450]  kthread+0xe0/0x100
[ 1723.143455]  ? kthread_complete_and_exit+0x20/0x20
[ 1723.143457]  ret_from_fork+0x22/0x30
[ 1723.143465]  </TASK>


Version-Release number of selected component (if applicable):
kernel-5.14.0-285.el9 and later.

How reproducible:
Always reproducible on kernel-5.14.0-285.el9 and later. Cannot reproduce on rhel-9.2 or rhel-8 kernels.

Steps to Reproduce:
# while lvm2-testsuite --only lvcreate-raid.sh ; do :; done

Actual results:
Hang with kernel error messages

Expected results:
Test always completes successfully

Additional Information:
md_write_start() is scheduling at:

        wait_event(mddev->sb_wait,
                   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) ||
                   mddev->suspended);

raid5d() is scheduling at:

                wait_event_lock_irq(mddev->sb_wait,
                        !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags),
                        conf->device_lock);

Comment 1 Ben Marzinski 2023-04-04 00:53:13 UTC
The first c9s commit where I can reproduce this is:

commit 0990e82f887de26e61813c33cae08d76138239da
Author: Juri Lelli <juri.lelli>
Date:   Tue Aug 2 11:07:41 2022 +0200

    printk: Bring back the RT bits.

    Bugzilla: https://bugzilla.redhat.com/2171995
    Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

    commit b576ab7eadbf1835eddfa5a47eea0a57d3f5b74f
    Author:    Sebastian Andrzej Siewior <bigeasy>
    Date:      Tue Jul 19 20:08:01 2022 +0200

        printk: Bring back the RT bits.

        This is a revert of the commits:
        | 07a22b61946f0 Revert "printk: add functions to prefer direct printing"
        | 5831788afb17b Revert "printk: add kthread console printers"
        | 2d9ef940f89e0 Revert "printk: extend console_lock for per-console locking"
        | 007eeab7e9f03 Revert "printk: remove @console_locked"
        | 05c96b3713aa2 Revert "printk: Block console kthreads when direct printing will be required"
        | 20fb0c8272bbb Revert "printk: Wait for the global console lock when the system is going down"

        which is needed for the atomic consoles which are used on PREEMPT_RT.

        Signed-off-by: Sebastian Andrzej Siewior <bigeasy>

    Signed-off-by: Juri Lelli <juri.lelli>

Comment 2 Ben Marzinski 2023-04-04 00:58:45 UTC
CCing Juri Lelli, since the bisect is pointing to a Kernel-RT commit.