Description of problem:
This test scenario places an ext4 or xfs filesystem on top of LVM VDO and then, with I/O running, extends the filesystem a number of times. The deadlock is reproducible and tends to occur once the umount is attempted.

SCENARIO - open_ext4_fsadm_vdo_resize: Create an EXT4 filesysem on VDO, mount it, and then attempt multiple online fsadm resizes with data checking

adding entry to the devices file for /dev/sdc1
creating PV on hayes-03 using device /dev/sdc1
pvcreate --yes -ff /dev/sdc1
  Physical volume "/dev/sdc1" successfully created.
adding entry to the devices file for /dev/sde1
creating PV on hayes-03 using device /dev/sde1
pvcreate --yes -ff /dev/sde1
  Physical volume "/dev/sde1" successfully created.
adding entry to the devices file for /dev/sdd1
creating PV on hayes-03 using device /dev/sdd1
pvcreate --yes -ff /dev/sdd1
  Physical volume "/dev/sdd1" successfully created.
adding entry to the devices file for /dev/sdf1
creating PV on hayes-03 using device /dev/sdf1
pvcreate --yes -ff /dev/sdf1
  Physical volume "/dev/sdf1" successfully created.
adding entry to the devices file for /dev/sdg1
creating PV on hayes-03 using device /dev/sdg1
pvcreate --yes -ff /dev/sdg1
  Physical volume "/dev/sdg1" successfully created.
adding entry to the devices file for /dev/sdh1
creating PV on hayes-03 using device /dev/sdh1
pvcreate --yes -ff /dev/sdh1
  Physical volume "/dev/sdh1" successfully created.
adding entry to the devices file for /dev/sdi1
creating PV on hayes-03 using device /dev/sdi1
pvcreate --yes -ff /dev/sdi1
  Physical volume "/dev/sdi1" successfully created.
creating VG on hayes-03 using PV(s) /dev/sdc1 /dev/sde1 /dev/sdd1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1
vgcreate vdo_sanity /dev/sdc1 /dev/sde1 /dev/sdd1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1
  WARNING: Devices have inconsistent physical block sizes (4096 and 512).
  WARNING: Devices have inconsistent physical block sizes (4096 and 512).
  WARNING: Devices have inconsistent physical block sizes (4096 and 512).
  WARNING: Devices have inconsistent physical block sizes (4096 and 512).
  Volume group "vdo_sanity" successfully created
lvcreate --yes --type linear -n vdo_pool -L 50G vdo_sanity
  Wiping vdo signature on /dev/vdo_sanity/vdo_pool.
  Logical volume "vdo_pool" created.
lvconvert --yes --type vdo-pool -n vdo_lv -V 100G vdo_sanity/vdo_pool
  WARNING: Converting logical volume vdo_sanity/vdo_pool to VDO pool volume with formating.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
  The VDO volume can address 46 GB in 23 data slabs, each 2 GB.
  It can grow to address at most 16 TB of physical storage in 8192 slabs.
  If a larger maximum size might be needed, use bigger slabs.
  Logical volume "vdo_lv" created.
  Converted vdo_sanity/vdo_pool to VDO pool volume and created virtual vdo_sanity/vdo_lv VDO volume.
mkfs --type ext4 /dev/vdo_sanity/vdo_lv
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 26214400 4k blocks and 6553600 inodes
Filesystem UUID: a5d68bb8-225b-40d8-8b44-4402eba33216
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
	4096000, 7962624, 11239424, 20480000, 23887872
Allocating group tables: done
Writing inode tables: done
Creating journal (131072 blocks): done
Writing superblocks and filesystem accounting information: done

mount /dev/vdo_sanity/vdo_lv /mnt/vdo_lv

Writing files to /mnt/vdo_lv
/usr/tests/sts-rhel8.6/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.2167047 -n 5000
checkit starting with:
CREATE
Num files:          5000
Random Seed:        41332
Verify XIOR Stream: /tmp/Filesystem.2167047
Working dir:        /mnt/vdo_lv

Checking files from /mnt/vdo_lv
/usr/tests/sts-rhel8.6/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.2167047 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/Filesystem.2167047
Working dir:        /mnt/vdo_lv

Starting dd io to vdo fs to be resized

Attempt to resize the open vdo filesystem multiple times with lvextend/fsadm on hayes-03
+++ itr 1 +++
Adding additional space to vdo_sanity/vdo_lv on hayes-03
lvextend --yes --resizefs -L +500M vdo_sanity/vdo_lv
  Size of logical volume vdo_sanity/vdo_lv changed from 100.00 GiB (25600 extents) to <100.49 GiB (25725 extents).
  Logical volume vdo_sanity/vdo_lv successfully resized.
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/mapper/vdo_sanity-vdo_lv is mounted on /mnt/vdo_lv; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/mapper/vdo_sanity-vdo_lv is now 26342400 (4k) blocks long.

PRE:104857600.0 POST:105369600.0
PRE:{'102626232'} POST:{'103191448'}

Checking files from /mnt/vdo_lv
/usr/tests/sts-rhel8.6/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.2167047 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/Filesystem.2167047
Working dir:        /mnt/vdo_lv

+++ itr 2 +++
Adding additional space to vdo_sanity/vdo_lv on hayes-03
lvextend --yes --resizefs -L +500M vdo_sanity/vdo_lv
  Size of logical volume vdo_sanity/vdo_lv changed from <100.49 GiB (25725 extents) to <100.98 GiB (25850 extents).
  Logical volume vdo_sanity/vdo_lv successfully resized.
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/mapper/vdo_sanity-vdo_lv is mounted on /mnt/vdo_lv; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/mapper/vdo_sanity-vdo_lv is now 26470400 (4k) blocks long.

PRE:105369600.0 POST:105881600.0
PRE:{'103191448'} POST:{'103695224'}

Checking files from /mnt/vdo_lv
/usr/tests/sts-rhel8.6/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.2167047 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/Filesystem.2167047
Working dir:        /mnt/vdo_lv

+++ itr 3 +++
Adding additional space to vdo_sanity/vdo_lv on hayes-03
lvextend --yes --resizefs -L +500M vdo_sanity/vdo_lv
  Size of logical volume vdo_sanity/vdo_lv changed from <100.98 GiB (25850 extents) to 101.46 GiB (25975 extents).
  Logical volume vdo_sanity/vdo_lv successfully resized.
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/mapper/vdo_sanity-vdo_lv is mounted on /mnt/vdo_lv; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/mapper/vdo_sanity-vdo_lv is now 26598400 (4k) blocks long.
PRE:105881600.0 POST:106393600.0
PRE:{'103695224'} POST:{'104199000'}

Checking files from /mnt/vdo_lv
/usr/tests/sts-rhel8.6/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.2167047 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/Filesystem.2167047
Working dir:        /mnt/vdo_lv

+++ itr 4 +++
Adding additional space to vdo_sanity/vdo_lv on hayes-03
lvextend --yes --resizefs -L +500M vdo_sanity/vdo_lv
  Size of logical volume vdo_sanity/vdo_lv changed from 101.46 GiB (25975 extents) to 101.95 GiB (26100 extents).
  Logical volume vdo_sanity/vdo_lv successfully resized.
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/mapper/vdo_sanity-vdo_lv is mounted on /mnt/vdo_lv; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/mapper/vdo_sanity-vdo_lv is now 26726400 (4k) blocks long.

PRE:106393600.0 POST:106905600.0
PRE:{'104199000'} POST:{'104702776'}

Checking files from /mnt/vdo_lv
/usr/tests/sts-rhel8.6/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.2167047 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/Filesystem.2167047
Working dir:        /mnt/vdo_lv

umount /mnt/vdo_lv
umount: /mnt/vdo_lv: target is busy.

Mar 15 14:33:18 hayes-03 qarshd[41878]: Running cmdline: umount /mnt/vdo_lv
Mar 15 14:33:18 hayes-03 systemd[1]: qarshd.104.51:5016-10.2.16.61:50142.service: Deactivated successfully.
Mar 15 14:35:07 hayes-03 systemd[1]: qarshd.104.51:5016-10.2.16.61:49820.service: Deactivated successfully.
Mar 15 14:35:07 hayes-03 systemd[1]: qarshd.104.51:5016-10.2.16.61:49820.service: Consumed 4min 25.655s CPU time.
Mar 15 14:37:37 hayes-03 kernel: INFO: task kworker/u81:2:40079 blocked for more than 122 seconds.
Mar 15 14:37:37 hayes-03 kernel: Tainted: G OE --------- --- 5.14.0-70.el9.x86_64 #1
Mar 15 14:37:37 hayes-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 15 14:37:37 hayes-03 kernel: task:kworker/u81:2 state:D stack: 0 pid:40079 ppid: 2 flags:0x00004000
Mar 15 14:37:37 hayes-03 kernel: Workqueue: writeback wb_workfn (flush-253:2)
Mar 15 14:37:37 hayes-03 kernel: Call Trace:
Mar 15 14:37:37 hayes-03 kernel: __schedule+0x203/0x560
Mar 15 14:37:37 hayes-03 kernel: schedule+0x43/0xb0
Mar 15 14:37:37 hayes-03 kernel: io_schedule+0x47/0x70
Mar 15 14:37:37 hayes-03 kernel: limiter_wait_for_one_free+0xb2/0xe0 [kvdo]
Mar 15 14:37:37 hayes-03 kernel: ? do_wait_intr_irq+0xa0/0xa0
Mar 15 14:37:37 hayes-03 kernel: vdo_launch_bio+0x93/0x170 [kvdo]
Mar 15 14:37:37 hayes-03 kernel: __map_bio+0x64/0x210 [dm_mod]
Mar 15 14:37:37 hayes-03 kernel: __split_and_process_non_flush+0x1e3/0x250 [dm_mod]
Mar 15 14:37:37 hayes-03 kernel: ? bio_alloc_bioset+0x9d/0x330
Mar 15 14:37:37 hayes-03 kernel: __split_and_process_bio+0xed/0x290 [dm_mod]
Mar 15 14:37:37 hayes-03 kernel: dm_submit_bio+0x7b/0x170 [dm_mod]
Mar 15 14:37:37 hayes-03 kernel: __submit_bio+0x9a/0x140
Mar 15 14:37:37 hayes-03 kernel: __submit_bio_noacct+0x81/0x1e0
Mar 15 14:37:37 hayes-03 kernel: ext4_io_submit+0x48/0x60 [ext4]
Mar 15 14:37:37 hayes-03 kernel: ext4_writepages+0x5a1/0x820 [ext4]
Mar 15 14:37:37 hayes-03 kernel: ? blk_mq_dispatch_rq_list+0x17b/0x610
Mar 15 14:37:37 hayes-03 kernel: ? sysvec_apic_timer_interrupt+0xb/0x90
Mar 15 14:37:37 hayes-03 kernel: ? asm_sysvec_apic_timer_interrupt+0x12/0x20
Mar 15 14:37:37 hayes-03 kernel: do_writepages+0xcf/0x1c0
Mar 15 14:37:37 hayes-03 kernel: __writeback_single_inode+0x3d/0x230
Mar 15 14:37:37 hayes-03 kernel: ? wbc_detach_inode+0x13f/0x210
Mar 15 14:37:37 hayes-03 kernel: writeback_sb_inodes+0x1fc/0x480
Mar 15 14:37:37 hayes-03 kernel: __writeback_inodes_wb+0x4c/0xe0
Mar 15 14:37:37 hayes-03 kernel: wb_writeback+0x1d7/0x2c0
Mar 15 14:37:37 hayes-03 kernel: wb_do_writeback+0x1cd/0x2b0
Mar 15 14:37:37 hayes-03 kernel: wb_workfn+0x6e/0x280
Mar 15 14:37:37 hayes-03 kernel: ? put_prev_task_fair+0x1b/0x30
Mar 15 14:37:37 hayes-03 kernel: ? pick_next_task+0x7d3/0x840
Mar 15 14:37:37 hayes-03 kernel: ? __update_idle_core+0x1b/0xb0
Mar 15 14:37:37 hayes-03 kernel: ? finish_task_switch.isra.0+0xb4/0x290
Mar 15 14:37:37 hayes-03 kernel: process_one_work+0x1e8/0x3c0
Mar 15 14:37:37 hayes-03 kernel: worker_thread+0x50/0x3b0
Mar 15 14:37:37 hayes-03 kernel: ? rescuer_thread+0x370/0x370
Mar 15 14:37:37 hayes-03 kernel: kthread+0x149/0x170
Mar 15 14:37:37 hayes-03 kernel: ? set_kthread_struct+0x40/0x40
Mar 15 14:37:37 hayes-03 kernel: ret_from_fork+0x22/0x30
Mar 15 14:37:37 hayes-03 kernel: INFO: task jbd2/dm-2-8:41326 blocked for more than 123 seconds.
Mar 15 14:37:37 hayes-03 kernel: Tainted: G OE --------- --- 5.14.0-70.el9.x86_64 #1
Mar 15 14:37:37 hayes-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 15 14:37:37 hayes-03 kernel: task:jbd2/dm-2-8 state:D stack: 0 pid:41326 ppid: 2 flags:0x00004000
Mar 15 14:37:37 hayes-03 kernel: Call Trace:
Mar 15 14:37:37 hayes-03 kernel: ? bit_wait+0x60/0x60
Mar 15 14:37:37 hayes-03 kernel: __schedule+0x203/0x560
Mar 15 14:37:37 hayes-03 kernel: ? __submit_bio+0xa3/0x140
Mar 15 14:37:37 hayes-03 kernel: schedule+0x43/0xb0
Mar 15 14:37:37 hayes-03 kernel: io_schedule+0x47/0x70
Mar 15 14:37:37 hayes-03 kernel: bit_wait_io+0xd/0x60
Mar 15 14:37:37 hayes-03 kernel: __wait_on_bit+0x2a/0x90
Mar 15 14:37:37 hayes-03 kernel: out_of_line_wait_on_bit+0x92/0xb0
Mar 15 14:37:37 hayes-03 kernel: ? var_wake_function+0x20/0x20
Mar 15 14:37:37 hayes-03 kernel: jbd2_journal_commit_transaction+0x15af/0x19d0 [jbd2]
Mar 15 14:37:37 hayes-03 kernel: ? finish_task_switch.isra.0+0xb4/0x290
Mar 15 14:37:37 hayes-03 kernel: kjournald2+0xaf/0x280 [jbd2]
Mar 15 14:37:37 hayes-03 kernel: ? do_wait_intr_irq+0xa0/0xa0
Mar 15 14:37:37 hayes-03 kernel: ? jbd2_journal_release_jbd_inode+0x150/0x150 [jbd2]
Mar 15 14:37:37 hayes-03 kernel: kthread+0x149/0x170
Mar 15 14:37:37 hayes-03 kernel: ? set_kthread_struct+0x40/0x40
Mar 15 14:37:37 hayes-03 kernel: ret_from_fork+0x22/0x30
Mar 15 14:39:39 hayes-03 kernel: INFO: task kworker/u81:2:40079 blocked for more than 245 seconds.
Mar 15 14:39:39 hayes-03 kernel: Tainted: G OE --------- --- 5.14.0-70.el9.x86_64 #1
Mar 15 14:39:39 hayes-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 15 14:39:39 hayes-03 kernel: task:kworker/u81:2 state:D stack: 0 pid:40079 ppid: 2 flags:0x00004000
Mar 15 14:39:39 hayes-03 kernel: Workqueue: writeback wb_workfn (flush-253:2)
Mar 15 14:39:39 hayes-03 kernel: Call Trace:
Mar 15 14:39:39 hayes-03 kernel: __schedule+0x203/0x560
Mar 15 14:39:39 hayes-03 kernel: schedule+0x43/0xb0
Mar 15 14:39:39 hayes-03 kernel: io_schedule+0x47/0x70
Mar 15 14:39:39 hayes-03 kernel: limiter_wait_for_one_free+0xb2/0xe0 [kvdo]
Mar 15 14:39:39 hayes-03 kernel: ? do_wait_intr_irq+0xa0/0xa0
Mar 15 14:39:39 hayes-03 kernel: vdo_launch_bio+0x93/0x170 [kvdo]
Mar 15 14:39:39 hayes-03 kernel: __map_bio+0x64/0x210 [dm_mod]
Mar 15 14:39:40 hayes-03 kernel: __split_and_process_non_flush+0x1e3/0x250 [dm_mod]
Mar 15 14:39:40 hayes-03 kernel: ? bio_alloc_bioset+0x9d/0x330
Mar 15 14:39:40 hayes-03 kernel: __split_and_process_bio+0xed/0x290 [dm_mod]
Mar 15 14:39:40 hayes-03 kernel: dm_submit_bio+0x7b/0x170 [dm_mod]
Mar 15 14:39:40 hayes-03 kernel: __submit_bio+0x9a/0x140
Mar 15 14:39:40 hayes-03 kernel: __submit_bio_noacct+0x81/0x1e0
Mar 15 14:39:40 hayes-03 kernel: ext4_io_submit+0x48/0x60 [ext4]
Mar 15 14:39:40 hayes-03 kernel: ext4_writepages+0x5a1/0x820 [ext4]
Mar 15 14:39:40 hayes-03 kernel: ? blk_mq_dispatch_rq_list+0x17b/0x610
Mar 15 14:39:40 hayes-03 kernel: ? sysvec_apic_timer_interrupt+0xb/0x90
Mar 15 14:39:40 hayes-03 kernel: ? asm_sysvec_apic_timer_interrupt+0x12/0x20
Mar 15 14:39:40 hayes-03 kernel: do_writepages+0xcf/0x1c0
Mar 15 14:39:40 hayes-03 kernel: __writeback_single_inode+0x3d/0x230
Mar 15 14:39:40 hayes-03 kernel: ? wbc_detach_inode+0x13f/0x210
Mar 15 14:39:40 hayes-03 kernel: writeback_sb_inodes+0x1fc/0x480
Mar 15 14:39:40 hayes-03 kernel: __writeback_inodes_wb+0x4c/0xe0
Mar 15 14:39:40 hayes-03 kernel: wb_writeback+0x1d7/0x2c0
Mar 15 14:39:40 hayes-03 kernel: wb_do_writeback+0x1cd/0x2b0
Mar 15 14:39:40 hayes-03 kernel: wb_workfn+0x6e/0x280
Mar 15 14:39:40 hayes-03 kernel: ? put_prev_task_fair+0x1b/0x30
Mar 15 14:39:40 hayes-03 kernel: ? pick_next_task+0x7d3/0x840
Mar 15 14:39:40 hayes-03 kernel: ? __update_idle_core+0x1b/0xb0
Mar 15 14:39:40 hayes-03 kernel: ? finish_task_switch.isra.0+0xb4/0x290
Mar 15 14:39:40 hayes-03 kernel: process_one_work+0x1e8/0x3c0
Mar 15 14:39:40 hayes-03 kernel: worker_thread+0x50/0x3b0
Mar 15 14:39:40 hayes-03 kernel: ? rescuer_thread+0x370/0x370
Mar 15 14:39:40 hayes-03 kernel: kthread+0x149/0x170
Mar 15 14:39:40 hayes-03 kernel: ? set_kthread_struct+0x40/0x40
Mar 15 14:39:40 hayes-03 kernel: ret_from_fork+0x22/0x30

Version-Release number of selected component (if applicable):
kernel-5.14.0-70.el9    BUILT: Thu Feb 24 05:48:54 PM CST 2022
lvm2-2.03.14-4.el9    BUILT: Wed Feb 16 06:01:21 AM CST 2022
lvm2-libs-2.03.14-4.el9    BUILT: Wed Feb 16 06:01:21 AM CST 2022
device-mapper-1.02.183-4.el9    BUILT: Wed Feb 16 06:01:21 AM CST 2022
device-mapper-libs-1.02.183-4.el9    BUILT: Wed Feb 16 06:01:21 AM CST 2022
vdo-8.1.1.360-1.el9    BUILT: Sat Feb 12 11:34:09 PM CST 2022
kmod-kvdo-8.1.1.360-15.el9    BUILT: Mon Feb 28 12:06:18 PM CST 2022

How reproducible:
Often
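Steps to Reproduce (condensed):
For anyone reproducing this outside the test harness, the run above reduces to a short command sequence. This is a minimal sketch, assuming a single spare partition /dev/sdX1 and a background dd as the I/O load; the checkit data-verification tool is specific to our test tree, so dd stands in for it, and the dd sizes and loop count are taken from the logged run rather than being exact requirements:

    # Set up a 50G linear LV and convert it to a VDO pool with a 100G virtual LV
    pvcreate --yes -ff /dev/sdX1
    vgcreate vdo_sanity /dev/sdX1
    lvcreate --yes --type linear -n vdo_pool -L 50G vdo_sanity
    lvconvert --yes --type vdo-pool -n vdo_lv -V 100G vdo_sanity/vdo_pool

    # Put a filesystem on top and start background I/O into it
    mkfs --type ext4 /dev/vdo_sanity/vdo_lv
    mount /dev/vdo_sanity/vdo_lv /mnt/vdo_lv
    dd if=/dev/zero of=/mnt/vdo_lv/ddfile bs=1M count=40000 &

    # Extend the mounted filesystem online several times, then unmount
    for i in 1 2 3 4; do
        lvextend --yes --resizefs -L +500M vdo_sanity/vdo_lv
    done
    wait                  # let the background dd finish
    umount /mnt/vdo_lv    # the blocked-task messages typically appear here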
Here's a deadlock and trace with an XFS filesystem being unmounted.

kernel-4.18.0-378.el8    BUILT: Fri Apr 1 17:57:23 CDT 2022
lvm2-2.03.14-3.el8    BUILT: Tue Jan 4 14:54:16 CST 2022
lvm2-libs-2.03.14-3.el8    BUILT: Tue Jan 4 14:54:16 CST 2022
vdo-6.2.6.14-14.el8    BUILT: Fri Feb 11 14:43:08 CST 2022
kmod-kvdo-6.2.6.14-84.el8    BUILT: Tue Mar 22 07:41:18 CDT 2022

Apr 12 13:58:25 hayes-03 qarshd[39247]: Running cmdline: umount /mnt/vdo_lv
Apr 12 13:58:25 hayes-03 systemd[1]: mnt-vdo_lv.mount: Succeeded.
Apr 12 14:01:49 hayes-03 kernel: INFO: task kworker/u81:0:37761 blocked for more than 120 seconds.
Apr 12 14:01:49 hayes-03 kernel: Tainted: G O --------- - - 4.18.0-378.el8.x86_64 #1
Apr 12 14:01:49 hayes-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 12 14:01:49 hayes-03 kernel: task:kworker/u81:0 state:D stack: 0 pid:37761 ppid: 2 flags:0x80004080
Apr 12 14:01:49 hayes-03 kernel: Workqueue: writeback wb_workfn (flush-253:2)
Apr 12 14:01:49 hayes-03 kernel: Call Trace:
Apr 12 14:01:49 hayes-03 kernel: __schedule+0x2d1/0x830
Apr 12 14:01:49 hayes-03 kernel: ? finish_wait+0x80/0x80
Apr 12 14:01:49 hayes-03 kernel: schedule+0x35/0xa0
Apr 12 14:01:49 hayes-03 kernel: io_schedule+0x12/0x40
Apr 12 14:01:49 hayes-03 kernel: limiterWaitForOneFree+0xbc/0xf0 [kvdo]
Apr 12 14:01:49 hayes-03 kernel: ? finish_wait+0x80/0x80
Apr 12 14:01:49 hayes-03 kernel: kvdoMapBio+0xc8/0x2a0 [kvdo]
Apr 12 14:01:49 hayes-03 kernel: __map_bio+0x4c/0x210 [dm_mod]
Apr 12 14:01:49 hayes-03 kernel: __split_and_process_non_flush+0x1d8/0x240 [dm_mod]
Apr 12 14:01:49 hayes-03 kernel: dm_make_request+0x12c/0x380 [dm_mod]
Apr 12 14:01:49 hayes-03 kernel: generic_make_request+0x25b/0x350
Apr 12 14:01:49 hayes-03 kernel: submit_bio+0x3c/0x160
Apr 12 14:01:49 hayes-03 kernel: iomap_writepage_map+0x509/0x670
Apr 12 14:01:49 hayes-03 kernel: write_cache_pages+0x197/0x420
Apr 12 14:01:49 hayes-03 kernel: ? iomap_invalidatepage+0xe0/0xe0
Apr 12 14:01:49 hayes-03 kernel: ? blk_queue_enter+0xdf/0x1f0
Apr 12 14:01:49 hayes-03 kernel: iomap_writepages+0x1c/0x40
Apr 12 14:01:49 hayes-03 kernel: xfs_vm_writepages+0x7e/0xb0 [xfs]
Apr 12 14:01:49 hayes-03 kernel: do_writepages+0xc2/0x1c0
Apr 12 14:01:49 hayes-03 kernel: __writeback_single_inode+0x39/0x2f0
Apr 12 14:01:49 hayes-03 kernel: writeback_sb_inodes+0x1e6/0x450
Apr 12 14:01:50 hayes-03 kernel: __writeback_inodes_wb+0x5f/0xc0
Apr 12 14:01:50 hayes-03 kernel: wb_writeback+0x247/0x2e0
Apr 12 14:01:50 hayes-03 kernel: ? get_nr_inodes+0x35/0x50
Apr 12 14:01:50 hayes-03 kernel: wb_workfn+0x37c/0x4d0
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x35/0x70
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x41/0x70
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x35/0x70
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x41/0x70
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x35/0x70
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x41/0x70
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x35/0x70
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x41/0x70
Apr 12 14:01:50 hayes-03 kernel: process_one_work+0x1a7/0x360
Apr 12 14:01:50 hayes-03 kernel: ? create_worker+0x1a0/0x1a0
Apr 12 14:01:50 hayes-03 kernel: worker_thread+0x30/0x390
Apr 12 14:01:50 hayes-03 kernel: ? create_worker+0x1a0/0x1a0
Apr 12 14:01:50 hayes-03 kernel: kthread+0x10a/0x120
Apr 12 14:01:50 hayes-03 kernel: ? set_kthread_struct+0x40/0x40
Apr 12 14:01:50 hayes-03 kernel: ret_from_fork+0x35/0x40
Apr 12 14:01:50 hayes-03 kernel: INFO: task kworker/36:0:38305 blocked for more than 120 seconds.
Apr 12 14:01:50 hayes-03 kernel: Tainted: G O --------- - - 4.18.0-378.el8.x86_64 #1
Apr 12 14:01:50 hayes-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 12 14:01:50 hayes-03 kernel: task:kworker/36:0 state:D stack: 0 pid:38305 ppid: 2 flags:0x80004080
Apr 12 14:01:50 hayes-03 kernel: Workqueue: xfs-sync/dm-2 xfs_log_worker [xfs]
Apr 12 14:01:50 hayes-03 kernel: Call Trace:
Apr 12 14:01:50 hayes-03 kernel: __schedule+0x2d1/0x830
Apr 12 14:01:50 hayes-03 kernel: ? __switch_to_asm+0x41/0x70
Apr 12 14:01:50 hayes-03 kernel: ? crc_22+0x1e/0x1e [crc32c_intel]
Apr 12 14:01:50 hayes-03 kernel: ? finish_wait+0x80/0x80
Apr 12 14:01:50 hayes-03 kernel: schedule+0x35/0xa0
Apr 12 14:01:50 hayes-03 kernel: io_schedule+0x12/0x40
Apr 12 14:01:50 hayes-03 kernel: limiterWaitForOneFree+0xbc/0xf0 [kvdo]
Apr 12 14:01:50 hayes-03 kernel: ? finish_wait+0x80/0x80
Apr 12 14:01:50 hayes-03 kernel: kvdoMapBio+0xc8/0x2a0 [kvdo]
Apr 12 14:01:50 hayes-03 kernel: __map_bio+0x4c/0x210 [dm_mod]
Apr 12 14:01:50 hayes-03 kernel: __split_and_process_non_flush+0x1d8/0x240 [dm_mod]
Apr 12 14:01:50 hayes-03 kernel: dm_make_request+0x12c/0x380 [dm_mod]
Apr 12 14:01:50 hayes-03 kernel: generic_make_request+0x25b/0x350
Apr 12 14:01:50 hayes-03 kernel: ? bio_add_page+0x42/0x50
Apr 12 14:01:50 hayes-03 kernel: submit_bio+0x3c/0x160
Apr 12 14:01:50 hayes-03 kernel: xlog_state_release_iclog+0x6e/0x80 [xfs]
Apr 12 14:01:50 hayes-03 kernel: xfs_log_force+0x129/0x1c0 [xfs]
Apr 12 14:01:50 hayes-03 kernel: xfs_log_worker+0x35/0x60 [xfs]
Apr 12 14:01:50 hayes-03 kernel: process_one_work+0x1a7/0x360
Apr 12 14:01:50 hayes-03 kernel: ? create_worker+0x1a0/0x1a0
Apr 12 14:01:50 hayes-03 kernel: worker_thread+0x30/0x390
Apr 12 14:01:50 hayes-03 kernel: ? create_worker+0x1a0/0x1a0
Apr 12 14:01:50 hayes-03 kernel: kthread+0x10a/0x120
Apr 12 14:01:50 hayes-03 kernel: ? set_kthread_struct+0x40/0x40
Apr 12 14:01:50 hayes-03 kernel: ret_from_fork+0x35/0x40
Apr 12 14:01:50 hayes-03 kernel: INFO: task umount:39248 blocked for more than 120 seconds.
Apr 12 14:01:50 hayes-03 kernel: Tainted: G O --------- - - 4.18.0-378.el8.x86_64 #1
Apr 12 14:01:50 hayes-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 12 14:01:50 hayes-03 kernel: task:umount state:D stack: 0 pid:39248 ppid: 39247 flags:0x00004080
Apr 12 14:01:50 hayes-03 kernel: Call Trace:
Apr 12 14:01:50 hayes-03 kernel: __schedule+0x2d1/0x830
Apr 12 14:01:50 hayes-03 kernel: ? cpumask_next+0x17/0x20
Apr 12 14:01:50 hayes-03 kernel: ? mnt_get_count+0x39/0x50
Apr 12 14:01:50 hayes-03 kernel: schedule+0x35/0xa0
Apr 12 14:01:50 hayes-03 kernel: rwsem_down_write_slowpath+0x308/0x5c0
Apr 12 14:01:50 hayes-03 kernel: ? fsnotify_grab_connector+0x3c/0x60
Apr 12 14:01:50 hayes-03 kernel: deactivate_super+0x43/0x50
Apr 12 14:01:50 hayes-03 kernel: cleanup_mnt+0x3b/0x70
Apr 12 14:01:50 hayes-03 kernel: task_work_run+0x8a/0xb0
Apr 12 14:01:50 hayes-03 kernel: exit_to_usermode_loop+0xeb/0xf0
Apr 12 14:01:50 hayes-03 kernel: do_syscall_64+0x198/0x1a0
Apr 12 14:01:50 hayes-03 kernel: entry_SYSCALL_64_after_hwframe+0x65/0xca
Apr 12 14:01:50 hayes-03 kernel: RIP: 0033:0x7f2d8ea10dab
Apr 12 14:01:50 hayes-03 kernel: Code: Unable to access opcode bytes at RIP 0x7f2d8ea10d81.
Apr 12 14:01:50 hayes-03 kernel: RSP: 002b:00007fffb7b9cac8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
Apr 12 14:01:50 hayes-03 kernel: RAX: 0000000000000000 RBX: 00005560393e9460 RCX: 00007f2d8ea10dab
Apr 12 14:01:50 hayes-03 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00005560393e9640
Apr 12 14:01:50 hayes-03 kernel: RBP: 0000000000000000 R08: 00005560393e9660 R09: 00007f2d8eb57820
Apr 12 14:01:50 hayes-03 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00005560393e9640
Apr 12 14:01:50 hayes-03 kernel: R13: 00007f2d8f881184 R14: 0000000000000000 R15: 00000000ffffffff
Apr 12 14:03:52 hayes-03 kernel: INFO: task kworker/u81:0:37761 blocked for more than 120 seconds.
If there’s been a lot of data written into cache memory, and the file system flushes it out all at once, sometimes it takes a while for VDO to process all the data being sent to it, especially if the VDO configuration hasn’t been tuned for performance on the particular hardware. Threads waiting for the data to finish writing, like the umount process, will block, and if there’s enough data, they may block for long enough that the kernel’s 120-second timer fires and logs the complaints you see here. If you use “iostat” and “top” while this is happening, do they indicate activity at the VDO level? If so, I would expect the umount to complete if you wait long enough. Or is this a real, permanent lockup with no system activity? Can you tell me something about the system you’re running the test on? How much memory has it got? Are the drives you’re using flash storage or spinning HDDs?
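In case it helps, here's one way to check for forward progress while the umount appears hung. This is a sketch; the device-mapper names below match this test's layout, so adjust them to your setup:

    # Non-zero write throughput on the VDO devices means writeback is still
    # draining, just slowly, rather than truly deadlocked
    iostat -xm 5 /dev/mapper/vdo_sanity-vdo_lv /dev/mapper/vdo_sanity-vdo_pool*

    # kvdo worker threads consuming CPU also indicate activity
    top -b -n 1 | grep -i vdo

    # Dump blocked-task stacks on demand instead of waiting for the
    # 120-second hung-task timer
    echo w > /proc/sysrq-trigger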
Adding a note that this still exists in our testing, since this bug is marked MODIFIED with a Fixed In version listed.

kernel-5.14.0-130.el9    BUILT: Fri Jul 15 07:31:56 AM CDT 2022
lvm2-2.03.16-2.el9    BUILT: Thu Jul 14 11:45:18 AM CDT 2022
lvm2-libs-2.03.16-2.el9    BUILT: Thu Jul 14 11:45:18 AM CDT 2022
vdo-8.1.1.360-1.el9    BUILT: Sat Feb 12 11:34:09 PM CST 2022
kmod-kvdo-8.1.1.371-41.el9    BUILT: Sat Jul 16 03:39:21 PM CDT 2022

Jul 26 11:24:54 hayes-03 kernel: INFO: task kworker/18:0:192809 blocked for more than 122 seconds.
Jul 26 11:24:54 hayes-03 kernel: Tainted: G O --------- --- 5.14.0-130.el9.x86_64 #1
Jul 26 11:24:54 hayes-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 26 11:24:54 hayes-03 kernel: task:kworker/18:0 state:D stack: 0 pid:192809 ppid: 2 flags:0x00004000
Jul 26 11:24:54 hayes-03 kernel: Workqueue: xfs-conv/dm-3 xfs_end_io [xfs]
Jul 26 11:24:54 hayes-03 kernel: Call Trace:
Jul 26 11:24:54 hayes-03 kernel: __schedule+0x206/0x580
Jul 26 11:24:54 hayes-03 kernel: schedule+0x43/0xa0
Jul 26 11:24:54 hayes-03 kernel: rwsem_down_write_slowpath+0x27b/0x4c0
Yes, the hopefully-fixed version is kmod-kvdo-8.2.0.2 or later.
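A quick way to confirm whether a given host already has the fix, assuming the stock RHEL packaging:

    rpm -q kmod-kvdo kernel          # fixed in kmod-kvdo 8.2.0.2 and later
    modinfo kvdo | grep -i version   # version of the kvdo module on disk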
These test scenarios now run without issue on the latest rpms. Marking VERIFIED.

kernel-5.14.0-138.el9    BUILT: Sun Jul 31 06:20:38 AM CDT 2022
vdo-8.2.0.2-1.el9    BUILT: Tue Jul 19 02:28:15 PM CDT 2022
kmod-kvdo-8.2.0.2-41.el9    BUILT: Thu Jul 28 05:24:49 PM CDT 2022
lvm2-2.03.16-3.el9    BUILT: Mon Aug 1 04:42:35 AM CDT 2022
lvm2-libs-2.03.16-3.el9    BUILT: Mon Aug 1 04:42:35 AM CDT 2022

============================================================
Iteration 14 of 14 started at Wed Aug 10 14:13:03 2022
============================================================
SCENARIO - open_ext4_fsadm_vdo_resize: Create an EXT4 filesysem on VDO, mount it, and then attempt multiple online fsadm resizes with data checking (bug 2064802)
adding entry to the devices file for /dev/sde1
creating PV on hayes-03 using device /dev/sde1
pvcreate --yes -ff /dev/sde1
  Physical volume "/dev/sde1" successfully created.
adding entry to the devices file for /dev/sdd1
creating PV on hayes-03 using device /dev/sdd1
pvcreate --yes -ff /dev/sdd1
  Physical volume "/dev/sdd1" successfully created.
adding entry to the devices file for /dev/sdh1
creating PV on hayes-03 using device /dev/sdh1
pvcreate --yes -ff /dev/sdh1
  Physical volume "/dev/sdh1" successfully created.
adding entry to the devices file for /dev/sdi1
creating PV on hayes-03 using device /dev/sdi1
pvcreate --yes -ff /dev/sdi1
  Physical volume "/dev/sdi1" successfully created.
creating VG on hayes-03 using PV(s) /dev/sde1 /dev/sdd1 /dev/sdh1 /dev/sdi1
vgcreate vdo_sanity /dev/sde1 /dev/sdd1 /dev/sdh1 /dev/sdi1
  Volume group "vdo_sanity" successfully created
lvcreate --yes --type linear -n vdo_pool -L 50G vdo_sanity
  Wiping vdo signature on /dev/vdo_sanity/vdo_pool.
  Logical volume "vdo_pool" created.
lvconvert --yes --type vdo-pool -n vdo_lv -V 100G vdo_sanity/vdo_pool
  The VDO volume can address 46 GB in 23 data slabs, each 2 GB.
  It can grow to address at most 16 TB of physical storage in 8192 slabs.
  If a larger maximum size might be needed, use bigger slabs.
  Logical volume "vdo_lv" created.
  Converted vdo_sanity/vdo_pool to VDO pool volume and created virtual vdo_sanity/vdo_lv VDO volume.
  WARNING: Converting logical volume vdo_sanity/vdo_pool to VDO pool volume with formating.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
mkfs --type ext4 -F /dev/vdo_sanity/vdo_lv
mount /dev/vdo_sanity/vdo_lv /mnt/vdo_lv

Writing files to /mnt/vdo_lv
/usr/tests/sts-rhel8.7/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.1314273 -n 5000
/usr/tests/sts-rhel8.7/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.1314273 -v

Starting dd io to vdo fs to be resized

Attempt to resize the open vdo filesystem multiple times with lvextend/fsadm on hayes-03
+++ itr 1 +++
[...]
+++ itr 4 +++
Adding additional space to vdo_sanity/vdo_lv on hayes-03
lvextend --yes --resizefs -L +500M vdo_sanity/vdo_lv
  Size of logical volume vdo_sanity/vdo_lv changed from 101.46 GiB (25975 extents) to 101.95 GiB (26100 extents).
  Logical volume vdo_sanity/vdo_lv successfully resized.
Filesystem at /dev/mapper/vdo_sanity-vdo_lv is mounted on /mnt/vdo_lv; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/mapper/vdo_sanity-vdo_lv is now 26726400 (4k) blocks long.
resize2fs 1.46.5 (30-Dec-2021)

PRE:106393600.0 POST:106905600.0
PRE:{'104137560'} POST:{'104641336'}

Checking files from /mnt/vdo_lv
/usr/tests/sts-rhel8.7/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.1314273 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/Filesystem.1314273
Working dir:        /mnt/vdo_lv

umount /mnt/vdo_lv
lvremove -f vdo_sanity/vdo_lv
  Logical volume "vdo_lv" successfully removed.
removing vg vdo_sanity from hayes-03
  Volume group "vdo_sanity" successfully removed
removing pv /dev/sde1 on hayes-03
  Labels on physical volume "/dev/sde1" successfully wiped.
removing entry from the devices file for /dev/sde1
removing pv /dev/sdd1 on hayes-03
  Labels on physical volume "/dev/sdd1" successfully wiped.
removing entry from the devices file for /dev/sdd1
removing pv /dev/sdh1 on hayes-03
  Labels on physical volume "/dev/sdh1" successfully wiped.
removing entry from the devices file for /dev/sdh1
removing pv /dev/sdi1 on hayes-03
  Labels on physical volume "/dev/sdi1" successfully wiped.
removing entry from the devices file for /dev/sdi1

SCENARIO - open_xfs_fsadm_vdo_resize: Create an XFS filesysem on VDO, mount it, and then attempt multiple online fsadm resizes with data checking (bug 2064802)
adding entry to the devices file for /dev/sde1
creating PV on hayes-03 using device /dev/sde1
pvcreate --yes -ff /dev/sde1
  Physical volume "/dev/sde1" successfully created.
adding entry to the devices file for /dev/sdd1
creating PV on hayes-03 using device /dev/sdd1
pvcreate --yes -ff /dev/sdd1
  Physical volume "/dev/sdd1" successfully created.
adding entry to the devices file for /dev/sdh1
creating PV on hayes-03 using device /dev/sdh1
pvcreate --yes -ff /dev/sdh1
  Physical volume "/dev/sdh1" successfully created.
adding entry to the devices file for /dev/sdi1
creating PV on hayes-03 using device /dev/sdi1
pvcreate --yes -ff /dev/sdi1
  Physical volume "/dev/sdi1" successfully created.
creating VG on hayes-03 using PV(s) /dev/sde1 /dev/sdd1 /dev/sdh1 /dev/sdi1
vgcreate vdo_sanity /dev/sde1 /dev/sdd1 /dev/sdh1 /dev/sdi1
  Volume group "vdo_sanity" successfully created
lvcreate --yes --type linear -n vdo_pool -L 50G vdo_sanity
  Wiping vdo signature on /dev/vdo_sanity/vdo_pool.
  Logical volume "vdo_pool" created.
lvconvert --yes --type vdo-pool -n vdo_lv -V 100G vdo_sanity/vdo_pool
  The VDO volume can address 46 GB in 23 data slabs, each 2 GB.
  It can grow to address at most 16 TB of physical storage in 8192 slabs.
  If a larger maximum size might be needed, use bigger slabs.
  Logical volume "vdo_lv" created.
  Converted vdo_sanity/vdo_pool to VDO pool volume and created virtual vdo_sanity/vdo_lv VDO volume.
  WARNING: Converting logical volume vdo_sanity/vdo_pool to VDO pool volume with formating.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
mkfs --type xfs -f /dev/vdo_sanity/vdo_lv
mount /dev/vdo_sanity/vdo_lv /mnt/vdo_lv

Writing files to /mnt/vdo_lv
/usr/tests/sts-rhel8.7/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.1314273 -n 5000
/usr/tests/sts-rhel8.7/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.1314273 -v

Starting dd io to vdo fs to be resized

Attempt to resize the open vdo filesystem multiple times with lvextend/fsadm on hayes-03
+++ itr 1 +++
[...]
+++ itr 4 +++
Adding additional space to vdo_sanity/vdo_lv on hayes-03
lvextend --yes --resizefs -L +500M vdo_sanity/vdo_lv
40000+0 records in
40000+0 records out
41943040000 bytes (42 GB, 39 GiB) copied, 186.615 s, 225 MB/s
  Size of logical volume vdo_sanity/vdo_lv changed from 101.46 GiB (25975 extents) to 101.95 GiB (26100 extents).
  Logical volume vdo_sanity/vdo_lv successfully resized.
meta-data=/dev/mapper/vdo_sanity-vdo_lv isize=512    agcount=5, agsize=6553600 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=26598400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=12800, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 26598400 to 26726400

PRE:106393600.0 POST:106905600.0
PRE:{'106342400'} POST:{'106854400'}

Checking files from /mnt/vdo_lv
/usr/tests/sts-rhel8.7/bin/checkit -w /mnt/vdo_lv -f /tmp/Filesystem.1314273 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/Filesystem.1314273
Working dir:        /mnt/vdo_lv

umount /mnt/vdo_lv
lvremove -f vdo_sanity/vdo_lv
  Logical volume "vdo_lv" successfully removed.
removing vg vdo_sanity from hayes-03
  Volume group "vdo_sanity" successfully removed
removing pv /dev/sde1 on hayes-03
  Labels on physical volume "/dev/sde1" successfully wiped.
removing entry from the devices file for /dev/sde1
removing pv /dev/sdd1 on hayes-03
  Labels on physical volume "/dev/sdd1" successfully wiped.
removing entry from the devices file for /dev/sdd1
removing pv /dev/sdh1 on hayes-03
  Labels on physical volume "/dev/sdh1" successfully wiped.
removing entry from the devices file for /dev/sdh1
removing pv /dev/sdi1 on hayes-03
  Labels on physical volume "/dev/sdi1" successfully wiped.
removing entry from the devices file for /dev/sdi1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (kmod-kvdo bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:8333