Bug 1959626 - Ability to online resize (reduce) cache origin can lead to deadlock attempting to --splitcache
Summary: Ability to online resize (reduce) cache origin can lead to deadlock attempting to --splitcache
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: lvm2
Version: 8.5
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-05-11 23:12 UTC by Corey Marthaler
Modified: 2022-11-11 07:25 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-11 07:25:28 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+



Description Corey Marthaler 2021-05-11 23:12:34 UTC
Description of problem:
[root@hayes-03 ~]# pvscan
  PV /dev/sdb1   VG cache_sanity    lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sdc1   VG cache_sanity    lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sdd1   VG cache_sanity    lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sde1   VG cache_sanity    lvm2 [446.62 GiB / 446.62 GiB free]
  PV /dev/sdf1   VG cache_sanity    lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdg1   VG cache_sanity    lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdh1   VG cache_sanity    lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdi1   VG cache_sanity    lvm2 [<1.82 TiB / <1.82 TiB free]
  PV /dev/sdj1   VG cache_sanity    lvm2 [<1.82 TiB / <1.82 TiB free]
  Total: 9 [<10.84 TiB] / in use: 9 [<10.84 TiB] / in no VG: 0 [0   ]

[root@hayes-03 ~]# lvcreate --yes --wipesignatures y  -L 4G -n corigin cache_sanity @slow
  Wiping ext4 signature on /dev/cache_sanity/corigin.
  Logical volume "corigin" created.
[root@hayes-03 ~]# lvcreate --yes  -L 4G -n pool cache_sanity @fast
  Wiping ext4 signature on /dev/cache_sanity/pool.
  Logical volume "pool" created.
[root@hayes-03 ~]# lvcreate --yes  -L 12M -n pool_meta cache_sanity @fast
  Logical volume "pool_meta" created.
[root@hayes-03 ~]# lvconvert --yes --type cache-pool --cachepolicy smq --cachemode writeback -c 32 --poolmetadata cache_sanity/pool_meta cache_sanity/pool
  WARNING: Converting cache_sanity/pool and cache_sanity/pool_meta to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
  Converted cache_sanity/pool and cache_sanity/pool_meta to cache pool.
[root@hayes-03 ~]# lvconvert --yes --type cache --cachemetadataformat 1 --cachepool cache_sanity/pool cache_sanity/corigin
  Logical volume cache_sanity/corigin is now cached.
                                                                                                   
[root@hayes-03 ~]# lvs -a -o +devices
  LV                 VG           Attr       LSize  Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert Devices            
  corigin            cache_sanity Cwi-a-C---  4.00g [pool_cpool] [corigin_corig] 0.00   8.72            0.00             corigin_corig(0)   
  [corigin_corig]    cache_sanity owi-aoC---  4.00g                                                                      /dev/sdj1(0)       
  [lvol0_pmspare]    cache_sanity ewi------- 12.00m                                                                      /dev/sdb1(0)       
  [pool_cpool]       cache_sanity Cwi---C---  4.00g                              0.00   8.72            0.00             pool_cpool_cdata(0)
  [pool_cpool_cdata] cache_sanity Cwi-ao----  4.00g                                                                      /dev/sdd1(0)       
  [pool_cpool_cmeta] cache_sanity ewi-ao---- 12.00m                                                                      /dev/sdd1(1024)    
[root@hayes-03 ~]# lvs
  LV      VG           Attr       LSize Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert
  corigin cache_sanity Cwi-a-C--- 4.00g [pool_cpool] [corigin_corig] 0.00   8.72            0.00            
[root@hayes-03 ~]# mkfs.ext4 /dev/cache_sanity/corigin 
mke2fs 1.45.6 (20-Mar-2020)
Discarding device blocks: done                            
Creating filesystem with 1048576 4k blocks and 262144 inodes
[...]
Writing superblocks and filesystem accounting information: done 

[root@hayes-03 ~]# mount /dev/cache_sanity/corigin /mnt/corigin/
[root@hayes-03 ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/cache_sanity-corigin  3.9G   16M  3.7G   1% /mnt/corigin

[root@hayes-03 ~]# lvreduce --yes -L -120M -r /dev/cache_sanity/corigin
Do you want to unmount "/mnt/corigin" ? [Y|n] y
fsck from util-linux 2.32.1
/dev/mapper/cache_sanity-corigin: 11/262144 files (0.0% non-contiguous), 36942/1048576 blocks
resize2fs 1.45.6 (20-Mar-2020)
Resizing the filesystem on /dev/mapper/cache_sanity-corigin to 1017856 (4k) blocks.
The filesystem on /dev/mapper/cache_sanity-corigin is now 1017856 (4k) blocks long.

  Size of logical volume cache_sanity/corigin_corig changed from 4.00 GiB (1024 extents) to 3.88 GiB (994 extents).
  Logical volume cache_sanity/corigin successfully resized.


[root@hayes-03 ~]# lvreduce --yes -L -120M -r /dev/cache_sanity/corigin
Do you want to unmount "/mnt/corigin" ? [Y|n] y
fsck from util-linux 2.32.1
/dev/mapper/cache_sanity-corigin: 11/262144 files (0.0% non-contiguous), 36942/1017856 blocks
resize2fs 1.45.6 (20-Mar-2020)
Resizing the filesystem on /dev/mapper/cache_sanity-corigin to 987136 (4k) blocks.
The filesystem on /dev/mapper/cache_sanity-corigin is now 987136 (4k) blocks long.

  Size of logical volume cache_sanity/corigin_corig changed from 3.88 GiB (994 extents) to <3.77 GiB (964 extents).
  Logical volume cache_sanity/corigin successfully resized.

[...]
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8142272, limit=7897088
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8142784, limit=7897088
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8139776, limit=7897088
May 11 17:50:36 hayes-03 kernel: attempt to access beyond end of device
May 11 17:50:36 hayes-03 kernel: dm-3: rw=1, want=8142272, limit=7897088
[...]
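A plausible reading of these numbers (an assumption on my part, not confirmed elsewhere in this report): with 4 MiB extents there are 8192 sectors per extent, so the kernel's "limit=" is exactly the origin size after the second reduce, while every "want=" value falls just below the size after the first reduce. A quick shell check:

# sectors = extents * 4 MiB/extent * 2048 sectors/MiB
echo $((964 * 4 * 2048))   # 7897088 -> matches "limit=", the 964-extent origin after the second reduce
echo $((994 * 4 * 2048))   # 8142848 -> the 994-extent origin after the first reduce; all "want=" values sit just below this

In other words, the cache target still appears to be writing back blocks that were cached against the pre-reduce origin size.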

[root@hayes-03 ~]# lvs
  LV      VG           Attr       LSize  Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert
  corigin cache_sanity Cwi-aoC--- <3.77g [pool_cpool] [corigin_corig] 3.21   12.92           0.07            
[root@hayes-03 ~]# lvs -a -o +devices
  LV                 VG           Attr       LSize  Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert Devices            
  corigin            cache_sanity Cwi-aoC--- <3.77g [pool_cpool] [corigin_corig] 3.21   12.92           0.07             corigin_corig(0)   
  [corigin_corig]    cache_sanity owi-aoC--- <3.77g                                                                      /dev/sdj1(0)       
  [lvol0_pmspare]    cache_sanity ewi------- 12.00m                                                                      /dev/sdb1(0)       
  [pool_cpool]       cache_sanity Cwi---C---  4.00g                              3.21   12.92           0.07             pool_cpool_cdata(0)
  [pool_cpool_cdata] cache_sanity Cwi-ao----  4.00g                                                                      /dev/sdd1(0)       
  [pool_cpool_cmeta] cache_sanity ewi-ao---- 12.00m                                                                      /dev/sdd1(1024) 

# This presumably spins forever:  
[root@hayes-03 ~]# lvconvert --yes --splitcache /dev/cache_sanity/corigin
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  Flushing 3 blocks for cache cache_sanity/corigin.
  [...]


write(1, "  Flushing 3 blocks for cache ca"..., 52) = 52
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[INT KILL TERM STOP RTMIN RT_1], NULL, 8) = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[INT KILL TERM STOP RTMIN RT_1], 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], NULL, 8) = 0
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
ioctl(3, DM_TABLE_STATUS, {version=4.0.0, data_size=16384, data_start=312, uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", flags=DM_EXISTS_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG} => {version=4.43.0, data_size=475, data_start=312, dev=makedev(0xfd, 0), name="cache_sanity-corigin", uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", target_count=1, open_count=1, event_nr=5, flags=DM_EXISTS_FLAG|DM_ACTIVE_PRESENT_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG, ...}) = 0
write(1, "  Flushing 3 blocks for cache ca"..., 52) = 52
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=0x559ce4c03c90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[INT KILL TERM STOP RTMIN RT_1], NULL, 8) = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, NULL) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[INT KILL TERM STOP RTMIN RT_1], 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], NULL, 8) = 0
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f88e3681400}, NULL, 8) = 0
ioctl(3, DM_TABLE_STATUS, {version=4.0.0, data_size=16384, data_start=312, uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", flags=DM_EXISTS_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG} => {version=4.43.0, data_size=475, data_start=312, dev=makedev(0xfd, 0), name="cache_sanity-corigin", uuid="LVM-jbOTZnz8QRfVePqpJWBdoEfMvi3e52Ab3DYLkcqkOzeIGQo4U0pX09xKWaFzft9r", target_count=1, open_count=1, event_nr=5, flags=DM_EXISTS_FLAG|DM_ACTIVE_PRESENT_FLAG|DM_SKIP_BDGET_FLAG|DM_NOFLUSH_FLAG, ...}) = 0
[...]
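The strace shows lvconvert polling DM_TABLE_STATUS roughly every 0.5 seconds while the "Flushing 3 blocks" count never decreases. To watch the same cache counters from another terminal (assuming the usual dm-cache status line; exact field positions vary by kernel version), something like this works:

# the dm-cache status line reports, among other fields, the number of dirty blocks
watch -n1 'dmsetup status cache_sanity-corigin'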


Version-Release number of selected component (if applicable):
kernel-4.18.0-305.2.el8    BUILT: Wed May  5 10:35:03 CDT 2021
lvm2-2.03.12-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
lvm2-libs-2.03.12-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
lvm2-dbusd-2.03.12-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-libs-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-event-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021
device-mapper-event-libs-1.02.177-0.1.20210426git4dc5d4a.el8    BUILT: Mon Apr 26 08:23:33 CDT 2021


How reproducible:
Every time
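
A condensed reproducer, assembled from the commands in the description above (same VG name and PV tags; adjust PVs/devices to the local system):

lvcreate --yes --wipesignatures y -L 4G -n corigin cache_sanity @slow
lvcreate --yes -L 4G -n pool cache_sanity @fast
lvcreate --yes -L 12M -n pool_meta cache_sanity @fast
lvconvert --yes --type cache-pool --cachepolicy smq --cachemode writeback -c 32 --poolmetadata cache_sanity/pool_meta cache_sanity/pool
lvconvert --yes --type cache --cachemetadataformat 1 --cachepool cache_sanity/pool cache_sanity/corigin
mkfs.ext4 /dev/cache_sanity/corigin
mount /dev/cache_sanity/corigin /mnt/corigin
lvreduce --yes -L -120M -r /dev/cache_sanity/corigin
lvreduce --yes -L -120M -r /dev/cache_sanity/corigin
lvconvert --yes --splitcache /dev/cache_sanity/corigin   # loops forever on "Flushing 3 blocks"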

Comment 1 David Teigland 2021-06-28 19:29:13 UTC
I don't know whether this is supposed to work or not; Zdenek should be able to tell us the status of reducing dm-cache.

Comment 5 RHEL Program Management 2022-11-11 07:25:28 UTC
After evaluating this issue, we have no plans to address it further or fix it in an upcoming release; therefore, it is being closed. If plans change and this issue will be fixed in an upcoming release, the bug can be reopened.

