Bug 1959626
| Summary: | Ability to online resize (reduce) cache origin can lead to deadlock attempting to --splitcache | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> |
| lvm2 sub component: | Cache Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED WONTFIX | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | agk, heinzm, jbrassow, mcsontos, msnitzer, prajnoha, zkabelac |
| Version: | 8.5 | Keywords: | Triaged |
| Target Milestone: | beta | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-11-11 07:25:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

I don't know if this is supposed to work or not; Zdenek should be able to tell us the status of reducing dm-cache.

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Description of problem:

[root@hayes-03 ~]# pvscan
  PV /dev/sdb1   VG cache_sanity   lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sdc1   VG cache_sanity   lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sdd1   VG cache_sanity   lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sde1   VG cache_sanity   lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sdf1   VG cache_sanity   lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdg1   VG cache_sanity   lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdh1   VG cache_sanity   lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdi1   VG cache_sanity   lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdj1   VG cache_sanity   lvm2 [<1.82 TiB / <1.82 TiB free]
  Total: 9 [<10.84 TiB] / in use: 9 [<10.84 TiB] / in no VG: 0 [0 ]

[root@hayes-03 ~]# lvcreate --yes --wipesignatures y -L 4G -n corigin cache_sanity @slow
  Wiping ext4 signature on /dev/cache_sanity/corigin.
  Logical volume "corigin" created.

[root@hayes-03 ~]# lvcreate --yes -L 4G -n pool cache_sanity @fast
  Wiping ext4 signature on /dev/cache_sanity/pool.
  Logical volume "pool" created.

[root@hayes-03 ~]# lvcreate --yes -L 12M -n pool_meta cache_sanity @fast
  Logical volume "pool_meta" created.

[root@hayes-03 ~]# lvconvert --yes --type cache-pool --cachepolicy smq --cachemode writeback -c 32 --poolmetadata cache_sanity/pool_meta cache_sanity/pool
  WARNING: Converting cache_sanity/pool and cache_sanity/pool_meta to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
  Converted cache_sanity/pool and cache_sanity/pool_meta to cache pool.

[root@hayes-03 ~]# lvconvert --yes --type cache --cachemetadataformat 1 --cachepool cache_sanity/pool cache_sanity/corigin
  Logical volume cache_sanity/corigin is now cached.

[root@hayes-03 ~]# lvs -a -o +devices
  LV                 VG           Attr       LSize  Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert Devices
  corigin            cache_sanity Cwi-a-C---  4.00g [pool_cpool] [corigin_corig] 0.00   8.72            0.00             corigin_corig(0)
  [corigin_corig]    cache_sanity owi-aoC---  4.00g                                                                      /dev/sdj1(0)
  [lvol0_pmspare]    cache_sanity ewi------- 12.00m                                                                      /dev/sdb1(0)
  [pool_cpool]       cache_sanity Cwi---C---  4.00g                              0.00   8.72            0.00             pool_cpool_cdata(0)
  [pool_cpool_cdata] cache_sanity Cwi-ao----  4.00g                                                                      /dev/sdd1(0)
  [pool_cpool_cmeta] cache_sanity ewi-ao---- 12.00m                                                                      /dev/sdd1(1024)

[root@hayes-03 ~]# lvs
  LV      VG           Attr       LSize Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert
  corigin cache_sanity Cwi-a-C--- 4.00g [pool_cpool] [corigin_corig] 0.00   8.72            0.00

[root@hayes-03 ~]# mkfs.ext4 /dev/cache_sanity/corigin
mke2fs 1.45.6 (20-Mar-2020)
Discarding device blocks: done
Creating filesystem with 1048576 4k blocks and 262144 inodes
[...]
Writing superblocks and filesystem accounting information: done

[root@hayes-03 ~]# mount /dev/cache_sanity/corigin /mnt/corigin/

[root@hayes-03 ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/cache_sanity-corigin  3.9G   16M  3.7G   1% /mnt/corigin

[root@hayes-03 ~]# lvreduce --yes -L -120M -r /dev/cache_sanity/corigin
Do you want to unmount "/mnt/corigin" ? [Y|n] y
fsck from util-linux 2.32.1
/dev/mapper/cache_sanity-corigin: 11/262144 files (0.0% non-contiguous), 36942/1048576 blocks
resize2fs 1.45.6 (20-Mar-2020)
Resizing the filesystem on /dev/mapper/cache_sanity-corigin to 1017856 (4k) blocks.
The filesystem on /dev/mapper/cache_sanity-corigin is now 1017856 (4k) blocks long.
  Size of logical volume cache_sanity/corigin_corig changed from 4.00 GiB (1024 extents) to 3.88 GiB (994 extents).
  Logical volume cache_sanity/corigin successfully resized.

[root@hayes-03 ~]# lvreduce --yes -L -120M -r /dev/cache_sanity/corigin
Do you want to unmount "/mnt/corigin" ? [Y|n] y
fsck from util-linux 2.32.1
/dev/mapper/cache_sanity-corigin: 11/262144 files (0.0% non-contiguous), 36942/1017856 blocks
resize2fs 1.45.6 (20-Mar-2020)
Resizing the filesystem on /dev/mapper/cache_sanity-corigin to 987136 (4k) blocks.
The filesystem on /dev/mapper/cache_sanity-corigin is now 987136 (4k) blocks long.
  Size of logical volume cache_sanity/corigin_corig changed from 3.88 GiB (994 extents) to <3.77 GiB (964 extents).
  Logical volume cache_sanity/corigin successfully resized.

[...]
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8142272, limit=7897088
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8142784, limit=7897088
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8139776, limit=7897088
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8142272, limit=7897088
[...]

[root@hayes-03 ~]# lvs
  LV      VG           Attr       LSize  Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert
  corigin cache_sanity Cwi-aoC--- <3.77g [pool_cpool] [corigin_corig] 3.21   12.92           0.07

[root@hayes-03 ~]# lvs -a -o +devices
  LV                 VG           Attr       LSize  Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert Devices
  corigin            cache_sanity Cwi-aoC--- <3.77g [pool_cpool] [corigin_corig] 3.21   12.92           0.07             corigin_corig(0)
  [corigin_corig]    cache_sanity owi-aoC--- <3.77g                                                                      /dev/sdj1(0)
  [lvol0_pmspare]    cache_sanity ewi------- 12.00m                                                                      /dev/sdb1(0)
  [pool_cpool]       cache_sanity Cwi---C---  4.00g                              3.21   12.92           0.07             pool_cpool_cdata(0)
  [pool_cpool_cdata] cache_sanity Cwi-ao----  4.00g                                                                      /dev/sdd1(0)
  [pool_cpool_cmeta] cache_sanity ewi-ao---- 12.00m                                                                      /dev/sdd1(1024)

# This presumably spins forever:
[root@hayes-03 ~]# lvconvert --yes --splitcache /dev/cache_sanity/corigin
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
[...]

write(1, " Flushing 3 blocks for cache ca"..., 52) = 52
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[INT KILL TERM STOP RTMIN RT_1], NULL, 8) = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[INT KILL TERM STOP RTMIN RT_1], 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], NULL, 8) = 0
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
ioctl(3, DM_TABLE_STATUS, {version=4.0.0, data_size=16384, data_start=312, uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", flags=DM_EXISTS_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG} => {version=4.43.0, data_size=475, data_start=312, dev=makedev(0xfd, 0), name="cache_sanity-corigin", uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", target_count=1, open_count=1, event_nr=5, flags=DM_EXISTS_FLAG|DM_ACTIVE_PRESENT_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG, ...}) = 0
write(1, " Flushing 3 blocks for cache ca"..., 52) = 52
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[INT KILL TERM STOP RTMIN RT_1], NULL, 8) = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[INT KILL TERM STOP RTMIN RT_1], 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], NULL, 8) = 0
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
ioctl(3, DM_TABLE_STATUS, {version=4.0.0, data_size=16384, data_start=312, uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", flags=DM_EXISTS_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG} => {version=4.43.0, data_size=475, data_start=312, dev=makedev(0xfd, 0), name="cache_sanity-corigin", uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", target_count=1, open_count=1, event_nr=5, flags=DM_EXISTS_FLAG|DM_ACTIVE_PRESENT_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG, ...}) = 0
[...]

Version-Release number of selected component (if applicable):
kernel-4.18.0-305.2.el8    BUILT: Wed May 5 10:35:03 CDT 2021
lvm2-2.03.12-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
lvm2-libs-2.03.12-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
lvm2-dbusd-2.03.12-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-libs-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-event-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-event-libs-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021

How reproducible:
Every time
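
Additional note: the "attempt to access beyond end of device" messages in the description can be related to the lvreduce output with a little sector arithmetic. The sketch below is only an illustration of that arithmetic, not an analysis of the dm-cache code. It uses the numbers quoted in the description; the function and constant names are made up for this note, and it assumes that "want" is the exclusive end sector of the rejected write, in 512-byte sectors.

```python
"""Relate the kernel's want/limit sectors to the origin sizes in the report."""

SECTOR_BYTES = 512  # assumed unit of "want" and "limit" in the kernel log

# lvreduce reported 4.00 GiB as 1024 extents, which gives 4 MiB per extent.
EXTENT_BYTES = (4 * 1024**3) // 1024
SECTORS_PER_EXTENT = EXTENT_BYTES // SECTOR_BYTES

# Origin sizes in extents, copied from the two lvreduce runs.
EXTENTS_AFTER_FIRST_REDUCE = 994
EXTENTS_AFTER_SECOND_REDUCE = 964

# Values copied from the kernel messages in the description.
LIMIT = 7897088
WANTS = [8142272, 8142784, 8139776]


def extents_to_sectors(extents: int) -> int:
    """Size of an LV with the given extent count, in 512-byte sectors."""
    return extents * SECTORS_PER_EXTENT


def extent_index(want: int) -> int:
    """Zero-based extent holding the last sector of the rejected access.

    Assumption: "want" is an exclusive end sector, so the last sector
    actually touched is want - 1.
    """
    return (want - 1) // SECTORS_PER_EXTENT


def describe(want: int) -> str:
    """Classify an access end sector against the two reduced origin sizes."""
    if want <= extents_to_sectors(EXTENTS_AFTER_SECOND_REDUCE):
        return "inside the current origin"
    if want <= extents_to_sectors(EXTENTS_AFTER_FIRST_REDUCE):
        return "in the range removed by the second lvreduce"
    return "beyond the origin as it was after the first lvreduce"


if __name__ == "__main__":
    current = extents_to_sectors(EXTENTS_AFTER_SECOND_REDUCE)
    print("limit equals 964 extents in sectors:", LIMIT == current)
    for want in WANTS:
        print(want, "-> extent", extent_index(want), "-", describe(want))
```

Under those assumptions, the limit is exactly the 964-extent origin size, and every rejected write ends inside the 30 extents removed by the second lvreduce. That is consistent with the cache still holding dirty blocks for a region the origin no longer has, which would explain why the flush count never drops, but the arithmetic alone does not prove it.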