Description of problem:
This sort of operation used to be protected against, which makes me believe this is a regression.

[root@grant-03 ~]# lvcreate --yes -n fs_to_mirror -L 1.5G mirror_sanity
  Logical volume "fs_to_mirror" created.

[root@grant-03 ~]# lvs -a -o +devices
  LV            VG             Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  fs_to_mirror  mirror_sanity  -wi-a----- 1.50g                                                      /dev/nvme0n1p1(0)

[root@grant-03 ~]# lvconvert --type mirror -m 1 mirror_sanity/fs_to_mirror &
[1] 1560199
[root@grant-03 ~]#   Logical volume mirror_sanity/fs_to_mirror being converted.
  mirror_sanity/fs_to_mirror: Converted: 3.12%
  ABORTING: Mirror percentage check failed.
[1]+  Exit 5                  lvconvert --type mirror -m 1 mirror_sanity/fs_to_mirror

<WHILE ABOVE CMD IS RUNNING>
[root@grant-03 ~]# lvconvert --splitmirrors 1 --name new mirror_sanity/fs_to_mirror
  Logical volume mirror_sanity/fs_to_mirror converted.

[root@grant-03 ~]# lvs -a -o +devices
  LV            VG             Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  fs_to_mirror  mirror_sanity  -wi-a----- 1.50g                                                      /dev/nvme0n1p1(0)
  new           mirror_sanity  -wi-a----- 1.50g                                                      /dev/nvme1n1p1(0)

May 11 18:31:15 grant-03 systemd[1]: Started LVM2 poll daemon.
May 11 18:31:16 grant-03 dmeventd[1559481]: mirror_sanity-fs_to_mirror is now in-sync.
May 11 18:31:20 grant-03 dmeventd[1559481]: No longer monitoring mirror device mirror_sanity-fs_to_mirror for events.
May 11 18:31:30 grant-03 lvmpolld[1560234]: W: #011LVPOLL: PID 1560237: STDERR: '  ABORTING: Mirror percentage check failed.'
May 11 18:31:30 grant-03 lvmpolld[1560234]: LVMPOLLD: lvm2 cmd (PID 1560237) failed (retcode: 5)
May 11 18:32:30 grant-03 systemd[1]: lvm2-lvmpolld.service: Deactivated successfully.

Version-Release number of selected component (if applicable):
kernel-5.14.0-284.11.1.el9_2    BUILT: Wed Apr 12 04:16:55 PM CEST 2023
lvm2-2.03.21-1.el9              BUILT: Fri Apr 21 02:33:33 PM CEST 2023
lvm2-libs-2.03.21-1.el9         BUILT: Fri Apr 21 02:33:33 PM CEST 2023

How reproducible:
Every time
This appears to happen only with fast NVMe devices; I haven't been able to reproduce it with SSDs. Also, if I bump the size of the mirror way up with the NVMe devices, then I see the "Unable to split mirror mirror_sanity/fs_to_mirror that is not in-sync." message again.

So this appears to be a timing issue where LVM thinks the sync is complete, or close to complete, but it really isn't.
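For reference, here is a minimal reproducer sketch along the lines of what I've been running. The VG/LV names and the 1.5G size match the transcript above; the delay values are arbitrary assumptions and would need tuning to the speed of the PVs under test:

#!/bin/bash
# Sketch: race lvconvert --splitmirrors against an in-flight --type mirror
# conversion. VG/LV names, size, and delays are placeholders.
VG=mirror_sanity
LV=fs_to_mirror

for delay in 0 0.1 0.25 0.5 1; do
    lvcreate --yes -n "$LV" -L 1.5G "$VG"
    lvconvert --yes --type mirror -m 1 "$VG/$LV" &
    sleep "$delay"
    # The split should be rejected with "not in-sync" while the copy is running.
    lvconvert --yes --splitmirrors 1 --name new "$VG/$LV" && \
        echo "delay=$delay: split was allowed (unexpected)"
    wait
    lvremove --yes "$VG/$LV" "$VG/new" 2>/dev/null
done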
(In reply to Corey Marthaler from comment #1)
> This appears to happen only with fast NVMe devices; I haven't been able to
> reproduce it with SSDs. Also, if I bump the size of the mirror way up with
> the NVMe devices, then I see the "Unable to split mirror
> mirror_sanity/fs_to_mirror that is not in-sync." message again.
>
> So this appears to be a timing issue where LVM thinks the sync is complete,
> or close to complete, but it really isn't.

So it seems.

FYI: we've seen similar issues with dm-raid on (small) LVs backed by fast PVs which were fixed a few years ago.

No reproducer in a VM here (virtio-scsi -> host thin -> host PCIe4 NVMe):

# lvconvert -y --ty mirror -m1 t/t&lvconvert -y --splitmirrors 1 --name n t/t
[1] 5089
  Logical volume t/t being converted.
  Unable to split mirror t/t that is not in-sync.
  t/t: Converted: 75.00%

# lvs
  LV   VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log      Cpy%Sync Convert
  root rhel_vm45 -wi-ao---- <17.00g
  swap rhel_vm45 -wi-ao----   2.00g
  t    t         mwi-a-m--- 128.00m                                [t_mlog] 100.00

# lvconvert -m0 t/t
  t/t: Converted: 100.00%
  Logical volume t/t converted.

[root@vm45 ~]# lvs
  LV   VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root rhel_vm45 -wi-ao---- <17.00g
  swap rhel_vm45 -wi-ao----   2.00g
  t    t         -wi-a----- 128.00m

# lvconvert -y --ty mirror -m1 t/t&sleep 0.1;lvconvert -y --splitmirrors 1 --name n t/t
[2] 5404
  Logical volume t/t being converted.
[1]   Done                    lvconvert -y --ty mirror -m1 t/t
  Unable to split mirror t/t that is not in-sync.
  t/t: Converted: 71.88%
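While the conversion is still in flight, it may help to compare the copy percentage lvm reports with the kernel's view of the mirror target. A sketch (the dm device name below assumes the default VG-LV name mapping from the original report):

# userspace view of the copy progress
lvs -a -o lv_name,attr,copy_percent,devices mirror_sanity
# kernel view of the same mirror target; the synced/total region counts
# appear in the status line
dmsetup status mirror_sanity-fs_to_mirror

If the split is being allowed while the kernel still reports regions out of sync, that points at the in-sync check on the lvm side rather than the kernel state.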