Bug 1685257
Summary: | "Internal error: #LVs (8) != #visible LVs (3) + #snapshots (1) + #internal LVs (5) in VG" when trying to uncache or splitcache cache origin volume | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Corey Marthaler <cmarthal> | ||||||
Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> | ||||||
lvm2 sub component: | Cache Logical Volumes | QA Contact: | cluster-qe <cluster-qe> | ||||||
Status: | CLOSED ERRATA | Docs Contact: | |||||||
Severity: | unspecified | ||||||||
Priority: | unspecified | CC: | agk, heinzm, jbrassow, mcsontos, msnitzer, prajnoha, thornber, zkabelac | ||||||
Version: | 8.0 | Flags: | pm-rhel:
mirror+
|
||||||
Target Milestone: | rc | ||||||||
Target Release: | 8.0 | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | lvm2-2.03.07-1.el8 | Doc Type: | If docs needed, set a value | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2020-04-28 16:58:57 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description (Corey Marthaler, 2019-03-04 19:17:20 UTC)
Created attachment 1540713 [details]: verbose lvconvert attempt
Created attachment 1540714 [details]: another verbose lvconvert attempt
Why would the "vgchange --sysinit -ay cache_sanity" be causing Buffer I/O errors? Mar 4 14:25:20 hayes-01 qarshd[9596]: Running cmdline: lvconvert --merge cache_sanity/merge_reboot --yes Mar 4 14:25:21 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49504). Mar 4 14:25:21 hayes-01 qarshd[9610]: Talking to peer ::ffff:10.15.80.218:49504 (IPv6) Mar 4 14:25:21 hayes-01 qarshd[9610]: Running cmdline: umount /mnt/merge_reboot /mnt/corigin Mar 4 14:25:21 hayes-01 kernel: XFS (dm-16): Unmounting Filesystem Mar 4 14:25:21 hayes-01 kernel: XFS (dm-10): Unmounting Filesystem Mar 4 14:25:21 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49506). Mar 4 14:25:21 hayes-01 qarshd[9615]: Talking to peer ::ffff:10.15.80.218:49506 (IPv6) Mar 4 14:25:21 hayes-01 qarshd[9615]: Running cmdline: vgchange -an cache_sanity Mar 4 14:25:21 hayes-01 lvm[4244]: No longer monitoring snapshot cache_sanity-merge_reboot. Mar 4 14:25:22 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49508). Mar 4 14:25:22 hayes-01 qarshd[9650]: Talking to peer ::ffff:10.15.80.218:49508 (IPv6) Mar 4 14:25:22 hayes-01 qarshd[9650]: Running cmdline: vgchange --sysinit -ay cache_sanity Mar 4 14:25:22 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 0, async page read Mar 4 14:25:22 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 1, async page read Mar 4 14:25:22 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 0, async page read Mar 4 14:25:22 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 1, async page read Mar 4 14:25:22 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49510). Mar 4 14:25:22 hayes-01 qarshd[9707]: Talking to peer ::ffff:10.15.80.218:49510 (IPv6) Mar 4 14:25:23 hayes-01 qarshd[9707]: Running cmdline: lvs -a -o +devices cache_sanity Mar 4 14:25:23 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49512). Mar 4 14:25:23 hayes-01 qarshd[9713]: Talking to peer ::ffff:10.15.80.218:49512 (IPv6) Mar 4 14:25:23 hayes-01 qarshd[9713]: Running cmdline: vgchange --refresh cache_sanity Mar 4 14:25:23 hayes-01 systemd[1]: Started LVM2 poll daemon. Mar 4 14:25:23 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 0, async page read Mar 4 14:25:23 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 1, async page read Mar 4 14:25:23 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49514). Mar 4 14:25:23 hayes-01 qarshd[9755]: Talking to peer ::ffff:10.15.80.218:49514 (IPv6) Mar 4 14:25:24 hayes-01 qarshd[9755]: Running cmdline: lvs -a -o +devices cache_sanity Mar 4 14:25:24 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49516). Mar 4 14:25:24 hayes-01 qarshd[9760]: Talking to peer ::ffff:10.15.80.218:49516 (IPv6) Mar 4 14:25:24 hayes-01 qarshd[9760]: Running cmdline: lvs --noheadings -a -o lv_name --select pool_lv=pool Mar 4 14:25:24 hayes-01 systemd[1]: Started qarsh Per-Connection Server (10.15.80.218:49518). 
Mar  4 14:25:24 hayes-01 qarshd[9765]: Talking to peer ::ffff:10.15.80.218:49518 (IPv6)
Mar  4 14:25:24 hayes-01 qarshd[9765]: Running cmdline: lvconvert -vvvv --splitcache /dev/cache_sanity/corigin
Mar  4 14:25:25 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 0, async page read
Mar  4 14:25:25 hayes-01 kernel: Buffer I/O error on dev dm-16, logical block 1, async page read
Mar  4 14:25:40 hayes-01 lvmpolld[9723]: W: #011LVPOLL: PID 9748: STDERR: ' WARNING: This metadata update is NOT backed up.'
Mar  4 14:26:22 hayes-01 kernel: Buffer I/O error on dev dm-12, logical block 0, async page read
Mar  4 14:26:22 hayes-01 kernel: Buffer I/O error on dev dm-12, logical block 1, async page read
Mar  4 14:26:22 hayes-01 kernel: Buffer I/O error on dev dm-11, logical block 0, async page read
Mar  4 14:26:22 hayes-01 kernel: Buffer I/O error on dev dm-11, logical block 1, async page read
Mar  4 14:26:22 hayes-01 kernel: Buffer I/O error on dev dm-10, logical block 0, async page read
Mar  4 14:26:22 hayes-01 kernel: Buffer I/O error on dev dm-10, logical block 1, async page read

I assume I have fixes for this in my tree for upstreaming. I need to extend the test suite to cover the described case to be sure about that.

FWIW, still hitting this in final 8.1 regression testing.

kernel-4.18.0-147.4.el8    BUILT: Thu Oct  3 15:38:54 CDT 2019
lvm2-2.03.05-5.el8    BUILT: Thu Sep 26 01:40:57 CDT 2019
lvm2-libs-2.03.05-5.el8    BUILT: Thu Sep 26 01:40:57 CDT 2019
lvm2-dbusd-2.03.05-5.el8    BUILT: Thu Sep 26 01:43:33 CDT 2019
device-mapper-1.02.163-5.el8    BUILT: Thu Sep 26 01:40:57 CDT 2019
device-mapper-libs-1.02.163-5.el8    BUILT: Thu Sep 26 01:40:57 CDT 2019
device-mapper-event-1.02.163-5.el8    BUILT: Thu Sep 26 01:40:57 CDT 2019
device-mapper-event-libs-1.02.163-5.el8    BUILT: Thu Sep 26 01:40:57 CDT 2019
device-mapper-persistent-data-0.8.5-2.el8    BUILT: Wed Jun  5 10:28:04 CDT 2019
vdo-6.2.1.134-11.el8    BUILT: Fri Aug  2 10:39:03 CDT 2019
kmod-kvdo-6.2.1.138-57.el8    BUILT: Fri Sep 13 11:00:16 CDT 2019

SCENARIO - [reboot_before_cache_snap_merge_starts]
Attempt to merge an inuse snapshot, then "reboot" the machine before the merge can take place

*** Cache info for this scenario ***
* origin (slow): /dev/mapper/vPV14
* pool (fast):   /dev/mapper/vPV13
************************************

Adding "slow" and "fast" tags to corresponding pvs

Create origin (slow) volume
  lvcreate --wipesignatures y -L 4G -n corigin cache_sanity @slow

Create cache data and cache metadata (fast) volumes
  lvcreate -L 2G -n pool cache_sanity @fast
  lvcreate -L 12M -n pool_meta cache_sanity @fast

Create cache pool volume by combining the cache data and cache metadata (fast) volumes with policy: mq, mode: writethrough
  lvconvert --yes --type cache-pool --cachepolicy mq --cachemode writethrough -c 64 --poolmetadata cache_sanity/pool_meta cache_sanity/pool
  WARNING: Converting cache_sanity/pool and cache_sanity/pool_meta to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

Create cached volume by combining the cache pool (fast) and origin (slow) volumes
  lvconvert --yes --type cache --cachemetadataformat 1 --cachepool cache_sanity/pool cache_sanity/corigin

Placing an xfs filesystem on origin volume
Mounting origin volume

Making snapshot of origin volume
  lvcreate -s /dev/cache_sanity/corigin -c 128 -n merge_reboot -L 500M
Mounting snap volume

Attempt to merge snapshot cache_sanity/merge_reboot
  lvconvert --merge cache_sanity/merge_reboot --yes

umount and deactivate volume group
  vgchange --sysinit -ay cache_sanity
  vgchange --refresh cache_sanity

Separating cache pool (lvconvert --splitcache) cache_sanity/corigin from cache origin
  Internal error: #LVs (8) != #visible LVs (3) + #snapshots (1) + #internal LVs (5) in VG cache_sanity
couldn't split cache pool volume
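For reference, the failing scenario above can be strung together into one script. The sketch below only restates commands that already appear in this report; the PV tagging command (pvchange --addtag), the mkfs.xfs/mount invocations, and the -o nouuid option for mounting the xfs snapshot are assumptions about how the harness performs the steps that are only described in prose, and the device names and mount points are simply the ones used in this run.

```
#!/bin/bash
# Minimal reproducer sketch, assuming VG cache_sanity already exists and the
# mount points /mnt/corigin and /mnt/merge_reboot are present.
set -ex

SLOW=/dev/mapper/vPV14   # origin (slow) PV used in this run
FAST=/dev/mapper/vPV13   # cache pool (fast) PV used in this run

# "Adding slow and fast tags to corresponding pvs" (assumed tagging command)
pvchange --addtag slow "$SLOW"
pvchange --addtag fast "$FAST"

# Origin, cache data and cache metadata volumes
lvcreate --wipesignatures y -L 4G -n corigin cache_sanity @slow
lvcreate -L 2G  -n pool      cache_sanity @fast
lvcreate -L 12M -n pool_meta cache_sanity @fast

# Combine data + metadata into a cache pool, then attach it to the origin
lvconvert --yes --type cache-pool --cachepolicy mq --cachemode writethrough \
          -c 64 --poolmetadata cache_sanity/pool_meta cache_sanity/pool
lvconvert --yes --type cache --cachemetadataformat 1 \
          --cachepool cache_sanity/pool cache_sanity/corigin

# Filesystem, mounts, snapshot, and a merge requested while the snapshot is in use
mkfs.xfs /dev/cache_sanity/corigin
mount /dev/cache_sanity/corigin /mnt/corigin
lvcreate -s /dev/cache_sanity/corigin -c 128 -n merge_reboot -L 500M
mount -o nouuid /dev/cache_sanity/merge_reboot /mnt/merge_reboot
lvconvert --merge cache_sanity/merge_reboot --yes

# Simulated reboot: unmount, deactivate, then reactivate the way early boot would
umount /mnt/merge_reboot /mnt/corigin
vgchange -an cache_sanity
vgchange --sysinit -ay cache_sanity
vgchange --refresh cache_sanity

# On affected builds this step fails with:
#   Internal error: #LVs (8) != #visible LVs (3) + #snapshots (1) + #internal LVs (5) in VG cache_sanity
lvconvert --splitcache /dev/cache_sanity/corigin
```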
Marking verified in the latest rpms.

kernel-4.18.0-151.el8    BUILT: Fri Nov 15 13:14:53 CST 2019
lvm2-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
lvm2-libs-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
lvm2-dbusd-2.03.07-1.el8    BUILT: Mon Dec  2 00:12:23 CST 2019
lvm2-lockd-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-libs-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-event-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-event-libs-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-persistent-data-0.8.5-2.el8    BUILT: Wed Jun  5 10:28:04 CDT 2019
vdo-6.2.2.24-11.el8    BUILT: Wed Oct 30 21:22:06 CDT 2019
kmod-kvdo-6.2.2.24-60.el8    BUILT: Mon Nov 11 16:14:12 CST 2019

Both "lvconvert --splitcache" and "lvconvert --uncache" no longer fail when attempted on top of cache volumes stacked on VDO PVs.

hayes-02: pvcreate /dev/mapper/vPV15 /dev/mapper/vPV14 /dev/mapper/vPV13 /dev/mapper/vPV12
hayes-02: vgcreate cache_sanity /dev/mapper/vPV15 /dev/mapper/vPV14 /dev/mapper/vPV13 /dev/mapper/vPV12

============================================================
Iteration 1 of 1 started at Fri Dec  6 09:48:19 CST 2019
============================================================
SCENARIO - [reboot_before_cache_snap_merge_starts]
Attempt to merge an inuse snapshot, then "reboot" the machine before the merge can take place

*** Cache info for this scenario ***
* origin (slow): /dev/mapper/vPV15
* pool (fast):   /dev/mapper/vPV14
************************************

Adding "slow" and "fast" tags to corresponding pvs

Create origin (slow) volume
  lvcreate --wipesignatures y -L 4G -n corigin cache_sanity @slow

Create cache data and cache metadata (fast) volumes
  lvcreate -L 2G -n pool cache_sanity @fast
  lvcreate -L 12M -n pool_meta cache_sanity @fast

Create cache pool volume by combining the cache data and cache metadata (fast) volumes with policy: smq, mode: writeback
  lvconvert --yes --type cache-pool --cachepolicy smq --cachemode writeback -c 32 --poolmetadata cache_sanity/pool_meta cache_sanity/pool
  WARNING: Converting cache_sanity/pool and cache_sanity/pool_meta to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

Create cached volume by combining the cache pool (fast) and origin (slow) volumes
  lvconvert --yes --type cache --cachemetadataformat 2 --cachepool cache_sanity/pool cache_sanity/corigin

Placing an xfs filesystem on origin volume
Mounting origin volume

Making snapshot of origin volume
  lvcreate -s /dev/cache_sanity/corigin -c 128 -n merge_reboot -L 500M
Mounting snap volume

Attempt to merge snapshot cache_sanity/merge_reboot
  lvconvert --merge cache_sanity/merge_reboot --yes

umount and deactivate volume group
  vgchange --sysinit -ay cache_sanity
  vgchange --refresh cache_sanity

[root@hayes-02 ~]# lvs -a -o +devices
  LV                 VG           Attr       LSize  Pool         Origin          Data%  Meta%  Move Log Cpy%Sync Convert Devices
  corigin            cache_sanity Cwi-a-C---  4.00g [pool_cpool] [corigin_corig] 0.56   6.71            0.00            corigin_corig(0)
  [corigin_corig]    cache_sanity owi-aoC---  4.00g                                                                     /dev/mapper/vPV15(0)
  [lvol0_pmspare]    cache_sanity ewi------- 12.00m                                                                     /dev/mapper/vPV15(1024)
  [pool_cpool]       cache_sanity Cwi---C---  2.00g                              0.56   6.71            0.00            pool_cpool_cdata(0)
  [pool_cpool_cdata] cache_sanity Cwi-ao----  2.00g                                                                     /dev/mapper/vPV14(0)
  [pool_cpool_cmeta] cache_sanity ewi-ao---- 12.00m                                                                     /dev/mapper/vPV14(512)

[root@hayes-02 ~]# lvconvert --splitcache /dev/cache_sanity/corigin
  Flushing 0 blocks for cache cache_sanity/corigin.
  Logical volume cache_sanity/corigin is not cached and cache_sanity/pool is unused.
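The lvs output above still shows the full cached stack right after the simulated reboot, and the subsequent --splitcache now completes cleanly. As an additional sanity check (not part of the recorded test output), the detach can be confirmed from standard lvs reporting columns:

```
# After a successful splitcache, corigin should no longer report segtype "cache"
# or reference a pool, while "pool" remains in the VG as an unused cache pool.
lvs -a -o lv_name,segtype,pool_lv,lv_attr cache_sanity
```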
============================================================
Iteration 12 of 13 started at Fri Dec  6 10:28:06 CST 2019
============================================================
SCENARIO - [reboot_before_cache_snap_merge_starts]
Attempt to merge an inuse snapshot, then "reboot" the machine before the merge can take place

*** Cache info for this scenario ***
* origin (slow): /dev/mapper/vPV13
* pool (fast):   /dev/mapper/vPV12
************************************

Adding "slow" and "fast" tags to corresponding pvs

Create origin (slow) volume
  lvcreate --wipesignatures y -L 4G -n corigin cache_sanity @slow
  WARNING: xfs signature detected on /dev/cache_sanity/corigin at offset 0. Wipe it? [y/n]: [n]
  Aborted wiping of xfs.
  1 existing signature left on the device.

Create cache data and cache metadata (fast) volumes
  lvcreate -L 2G -n pool cache_sanity @fast
  lvcreate -L 12M -n pool_meta cache_sanity @fast

Create cache pool volume by combining the cache data and cache metadata (fast) volumes with policy: mq, mode: writeback
  lvconvert --yes --type cache-pool --cachepolicy mq --cachemode writeback -c 64 --poolmetadata cache_sanity/pool_meta cache_sanity/pool
  WARNING: Converting cache_sanity/pool and cache_sanity/pool_meta to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

Create cached volume by combining the cache pool (fast) and origin (slow) volumes
  lvconvert --yes --type cache --cachemetadataformat 1 --cachepool cache_sanity/pool cache_sanity/corigin

Placing an xfs filesystem on origin volume
Mounting origin volume

Making snapshot of origin volume
  lvcreate -s /dev/cache_sanity/corigin -c 128 -n merge_reboot -L 500M
Mounting snap volume

Attempt to merge snapshot cache_sanity/merge_reboot
  lvconvert --merge cache_sanity/merge_reboot --yes

umount and deactivate volume group
  vgchange --sysinit -ay cache_sanity
  vgchange --refresh cache_sanity

Uncaching cache origin (lvconvert --uncache) cache_sanity/corigin from cache origin
Removing cache origin volume cache_sanity/corigin
  lvremove -f /dev/cache_sanity/corigin
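This final iteration exercises the --uncache path rather than --splitcache. Per lvmcache(7), the difference between the two clean-up commands is what remains afterwards, roughly:

```
# --splitcache detaches the cache pool but keeps it in the VG for reuse:
lvconvert --splitcache cache_sanity/corigin
# --uncache detaches the cache pool and deletes it, leaving only the origin:
lvconvert --uncache cache_sanity/corigin
```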
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1881