Bug 1251976
| Summary: | Raid scrubbing on cache pool raid logical volumes impossible | | |
| --- | --- | --- | --- |
| Product: | Fedora | Reporter: | Fabrice Allibe <fabrice.allibe> |
| Component: | lvm2 | Assignee: | Heinz Mauelshagen <heinzm> |
| Status: | CLOSED EOL | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 22 | CC: | agk, bmarzins, bmr, cmarthal, dwysocha, heinzm, jbrassow, jonathan, lvm-team, msnitzer, prajnoha, prockai, zkabelac |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-07-19 19:54:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description: Fabrice Allibe, 2015-08-10 12:44:05 UTC
Raid scrubbing does not yet work on cache origin volumes (see bug 1169495), nor on cache pool volumes not yet associated with a cache origin volume (see bug 1169500). However, it does currently work on cache pool volumes with a valid cache origin volume.

```
# Create origin (slow) volume
lvcreate --type raid1 -m 1 -L 4G -n origin vg /dev/mapper/mpathb2 /dev/mapper/mpathg2
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 46.17% )
   0/1 mirror(s) are fully synced: ( 87.00% )
   1/1 mirror(s) are fully synced: ( 100.00% )

# Create cache data and cache metadata (fast) volumes
lvcreate --type raid1 -m 1 -L 4G -n pool vg /dev/mapper/mpathh1 /dev/mapper/mpathe1
lvcreate --type raid1 -m 1 -L 12M -n pool_meta vg /dev/mapper/mpathh1 /dev/mapper/mpathe1
Waiting until all mirror|raid volumes become fully syncd...
   1/2 mirror(s) are fully synced: ( 52.51% 100.00% )
   2/2 mirror(s) are fully synced: ( 100.00% 100.00% )

# Create cache pool volume by combining the cache data and cache metadata (fast) volumes
lvconvert --yes --type cache-pool --cachemode writeback -c 32 --poolmetadata vg/pool_meta vg/pool
  WARNING: Converting logical volume vg/pool and vg/pool_meta to pool's data and metadata volumes.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

# Create cached volume by combining the cache pool (fast) and origin (slow) volumes
lvconvert --yes --type cache --cachepool vg/pool vg/origin
```

```
[root@harding-03 ~]# lvs -a -o +devices
  LV                      Attr       LSize  Pool   Origin         Data%  Meta%  Cpy%Sync Devices
  [lvol0_pmspare]         ewi------- 12.00m                                              /dev/mapper/mpatha1(0)
  origin                  Cwi-a-C---  4.00g [pool] [origin_corig] 0.00   12.92  100.00   origin_corig(0)
  [origin_corig]          rwi-aoC---  4.00g                                     100.00   origin_corig_rimage_0(0),origin_corig_rimage_1(0)
  [origin_corig_rimage_0] iwi-aor---  4.00g                                              /dev/mapper/mpathb2(1)
  [origin_corig_rimage_1] iwi-aor---  4.00g                                              /dev/mapper/mpathg2(1)
  [origin_corig_rmeta_0]  ewi-aor---  4.00m                                              /dev/mapper/mpathb2(0)
  [origin_corig_rmeta_1]  ewi-aor---  4.00m                                              /dev/mapper/mpathg2(0)
  [pool]                  Cwi---C---  4.00g                       0.00   12.92  100.00   pool_cdata(0)
  [pool_cdata]            Cwi-aor---  4.00g                                     100.00   pool_cdata_rimage_0(0),pool_cdata_rimage_1(0)
  [pool_cdata_rimage_0]   iwi-aor---  4.00g                                              /dev/mapper/mpathh1(1)
  [pool_cdata_rimage_1]   iwi-aor---  4.00g                                              /dev/mapper/mpathe1(1)
  [pool_cdata_rmeta_0]    ewi-aor---  4.00m                                              /dev/mapper/mpathh1(0)
  [pool_cdata_rmeta_1]    ewi-aor---  4.00m                                              /dev/mapper/mpathe1(0)
  [pool_cmeta]            ewi-aor--- 12.00m                                     100.00   pool_cmeta_rimage_0(0),pool_cmeta_rimage_1(0)
  [pool_cmeta_rimage_0]   iwi-aor--- 12.00m                                              /dev/mapper/mpathh1(1026)
  [pool_cmeta_rimage_1]   iwi-aor--- 12.00m                                              /dev/mapper/mpathe1(1026)
  [pool_cmeta_rmeta_0]    ewi-aor---  4.00m                                              /dev/mapper/mpathh1(1025)
  [pool_cmeta_rmeta_1]    ewi-aor---  4.00m                                              /dev/mapper/mpathe1(1025)

[root@harding-03 ~]# lvchange --syncaction repair vg/pool_cmeta
[root@harding-03 ~]# lvchange --syncaction check vg/pool_cdata
[root@harding-03 ~]# lvs -a -o +devices,raid_sync_action,raid_mismatch_count
  LV                      Attr       LSize  Pool   Origin         Data%  Meta%  Cpy%Sync Devices                                           SyncAction Mismatches
  [lvol0_pmspare]         ewi------- 12.00m                                              /dev/mapper/mpatha1(0)
  origin                  Cwi-a-C---  4.00g [pool] [origin_corig] 0.00   12.92  100.00   origin_corig(0)
  [origin_corig]          rwi-aoC---  4.00g                                     100.00   origin_corig_rimage_0(0),origin_corig_rimage_1(0) idle       0
  [origin_corig_rimage_0] iwi-aor---  4.00g                                              /dev/mapper/mpathb2(1)
  [origin_corig_rimage_1] iwi-aor---  4.00g                                              /dev/mapper/mpathg2(1)
  [origin_corig_rmeta_0]  ewi-aor---  4.00m                                              /dev/mapper/mpathb2(0)
  [origin_corig_rmeta_1]  ewi-aor---  4.00m                                              /dev/mapper/mpathg2(0)
  [pool]                  Cwi---C---  4.00g                       0.00   12.92  100.00   pool_cdata(0)
  [pool_cdata]            Cwi-aor---  4.00g                                     6.25     pool_cdata_rimage_0(0),pool_cdata_rimage_1(0)     check      0
  [pool_cdata_rimage_0]   iwi-aor---  4.00g                                              /dev/mapper/mpathh1(1)
  [pool_cdata_rimage_1]   iwi-aor---  4.00g                                              /dev/mapper/mpathe1(1)
  [pool_cdata_rmeta_0]    ewi-aor---  4.00m                                              /dev/mapper/mpathh1(0)
  [pool_cdata_rmeta_1]    ewi-aor---  4.00m                                              /dev/mapper/mpathe1(0)
  [pool_cmeta]            ewi-aor--- 12.00m                                     100.00   pool_cmeta_rimage_0(0),pool_cmeta_rimage_1(0)     idle       0
  [pool_cmeta_rimage_0]   iwi-aor--- 12.00m                                              /dev/mapper/mpathh1(1026)
  [pool_cmeta_rimage_1]   iwi-aor--- 12.00m                                              /dev/mapper/mpathe1(1026)
  [pool_cmeta_rmeta_0]    ewi-aor---  4.00m                                              /dev/mapper/mpathh1(1025)
  [pool_cmeta_rmeta_1]    ewi-aor---  4.00m                                              /dev/mapper/mpathe1(1025)
```
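Note that `lvchange --syncaction check` only kicks the scrub off: the command returns immediately and the check runs in the background, which is why the second listing above still shows pool_cdata at 6.25% with SyncAction "check". A minimal sketch of waiting for the result, assuming the vg/pool_cdata raid1 volume from the session above:

```
# Start a scrub of vg/pool_cdata and poll raid_sync_action until the
# array goes back to "idle", then report the mismatch count.
# Sketch only; assumes the vg/pool_cdata volume created above.
lvchange --syncaction check vg/pool_cdata
while [ "$(lvs --noheadings -o raid_sync_action vg/pool_cdata | tr -d ' ')" != "idle" ]; do
    sleep 10
done
lvs -o lv_name,raid_sync_action,raid_mismatch_count vg/pool_cdata
```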
---

Your example seems to be very close to my setup. So I am now trying to trigger a syncaction on the cmeta and cdata volumes belonging to my two cache pool volumes... However, it does not work:

```
[root@serveur ~]# lvchange --syncaction check vg_serveur/cacheroot_cmeta
  vg_serveur/cacheroot_cmeta must be a RAID logical volume to perform this action.
[root@serveur ~]# lvchange --syncaction check vg_serveur/cacheroot_cdata
  vg_serveur/cacheroot_cdata must be a RAID logical volume to perform this action.
[root@serveur ~]# lvchange --syncaction check vg_serveur/cachehome_cmeta
  vg_serveur/cachehome_cmeta must be a RAID logical volume to perform this action.
[root@serveur ~]# lvchange --syncaction check vg_serveur/cachehome_cdata
  vg_serveur/cachehome_cdata must be a RAID logical volume to perform this action.
```

What is wrong here?

---

Assuming you have the same setup as in comment #0, the only RAID volumes you had there were the two origin volumes (both made up of sda2 and sdb2):

```
home                  Cwi-aoC--- 405,44g home_corig(0)
[home_corig]          rwi-aoC--- 405,44g home_corig_rimage_0(0),home_corig_rimage_1(0)
[home_corig_rimage_0] iwi-aor--- 405,44g /dev/sda2(1913)
[home_corig_rimage_1] iwi-aor--- 405,44g /dev/sdb2(1913)
[home_corig_rmeta_0]  ewi-aor---  32,00m /dev/sda2(0)
[home_corig_rmeta_1]  ewi-aor---  32,00m /dev/sdb2(1912)
root                  Cwi-aoC---  50,00g root_corig(0)
[root_corig]          rwi-aoC---  50,00g root_corig_rimage_0(0),root_corig_rimage_1(0)
[root_corig_rimage_0] iwi-aor---  50,00g /dev/sda2(1)
[root_corig_rimage_1] iwi-aor---  50,00g /dev/sdb2(1)
[root_corig_rmeta_0]  ewi-aor---  32,00m /dev/sda2(1912)
[root_corig_rmeta_1]  ewi-aor---  32,00m /dev/sdb2(0)
```

However, your pool _cdata and _cmeta volumes are just single-disk (sdc3) linears:

```
[cachehome]       Cwi---C--- 32,00g cachehome_cdata(0)
[cachehome_cdata] Cwi-ao---- 32,00g /dev/sdc3(163)
[cachehome_cmeta] ewi-ao---- 32,00m /dev/sdc3(162)
[cacheroot]       Cwi---C---  5,00g cacheroot_cdata(0)
[cacheroot_cdata] Cwi-ao----  5,00g /dev/sdc3(2)
[cacheroot_cmeta] ewi-ao---- 32,00m /dev/sdc3(1)
```

My pool _cdata and _cmeta volumes were both RAID:

```
[pool]                Cwi---C--- pool_cdata(0)
[pool_cdata]          Cwi-aor--- pool_cdata_rimage_0(0),pool_cdata_rimage_1(0)
[pool_cdata_rimage_0] iwi-aor--- /dev/mapper/mpathh1(1)
[pool_cdata_rimage_1] iwi-aor--- /dev/mapper/mpathe1(1)
[pool_cdata_rmeta_0]  ewi-aor--- /dev/mapper/mpathh1(0)
[pool_cdata_rmeta_1]  ewi-aor--- /dev/mapper/mpathe1(0)
[pool_cmeta]          ewi-aor--- pool_cmeta_rimage_0(0),pool_cmeta_rimage_1(0)
[pool_cmeta_rimage_0] iwi-aor--- /dev/mapper/mpathh1(1026)
[pool_cmeta_rimage_1] iwi-aor--- /dev/mapper/mpathe1(1026)
[pool_cmeta_rmeta_0]  ewi-aor--- /dev/mapper/mpathh1(1025)
```
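The distinction is visible directly in lvs itself: the segment type of each LV shows which ones are actually RAID and therefore accept `--syncaction`. A minimal sketch, assuming the vg_serveur layout above:

```
# Show the segment type of every LV, including hidden sub-LVs (-a).
# Only LVs whose segtype is raid1 (or another raid* type) accept
# `lvchange --syncaction`; sketch assumes the vg_serveur layout above.
lvs -a -o lv_name,segtype,devices vg_serveur
# Expected here: [home_corig] and [root_corig] report segtype raid1,
# while cacheroot_cdata/_cmeta and cachehome_cdata/_cmeta report
# linear, which is why lvchange rejects them.
```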
---

Fabrice, as Corey pointed out, you've got a setup with non-redundant pool _cdata and _cmeta linear devices. That is, your configuration will lose data if your linear cache devices (e.g. on /dev/sdc3) fail, unless you run the cache in writethrough mode, which would still cause system interruptions but would avoid data loss on your origin RAID devices if any of the cache devices fail. If this is what you had planned for, you can still run a RAID check/repair as a workaround on e.g. your home origin raid1 device with "dmsetup message vg_serveur-home_corig 0 check" or "dmsetup message vg_serveur-home_corig 0 repair" respectively. If your intention is actually the recommended, fully redundant configuration, please follow comment #1.
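For the fully redundant variant, the cache data and metadata LVs would themselves be raid1 before the pool conversion, following the same recipe as the harding-03 example above. A sketch only; the second fast device (/dev/sdd3) is hypothetical, and the sizes merely mirror the original cacheroot pool:

```
# Hypothetical sketch: mirror cdata and cmeta across two SSDs so the
# whole stack is redundant and `lvchange --syncaction` works on them.
# /dev/sdd3 is an assumed second fast device.
lvcreate --type raid1 -m 1 -L 5G  -n cacheroot      vg_serveur /dev/sdc3 /dev/sdd3
lvcreate --type raid1 -m 1 -L 32M -n cacheroot_meta vg_serveur /dev/sdc3 /dev/sdd3
lvconvert --yes --type cache-pool --cachemode writethrough \
          --poolmetadata vg_serveur/cacheroot_meta vg_serveur/cacheroot
lvconvert --yes --type cache --cachepool vg_serveur/cacheroot vg_serveur/root
```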
Logical volume "cacheroot" successfully removed [root@serveur ~]# lvs -o +raid_sync_action,raid_mismatch_count vg_serveur/root LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert SyncAction Mismatches root vg_serveur rwi-aor--- 50,00g 100,00 idle 0 [root@serveur ~]# lvcreate -n cacheroot_meta -l 1 vg_serveur /dev/sdc3 Logical volume "cacheroot_meta" created. [root@serveur ~]# lvcreate -n cacheroot -l 160 vg_serveur /dev/sdc3 Logical volume "cacheroot" created. [root@serveur ~]# lvconvert --type cache-pool --poolmetadata vg_serveur/cacheroot_meta vg_serveur/cacheroot WARNING: Converting logical volume vg_serveur/cacheroot and vg_serveur/cacheroot_meta to pool's data and metadata volumes. THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.) Do you really want to convert vg_serveur/cacheroot and vg_serveur/cacheroot_meta? [y/n]: y Converted vg_serveur/cacheroot to cache pool. [root@serveur ~]# lvconvert --type cache --cachepool vg_serveur/cacheroot vg_serveur/root Logical volume vg_serveur/root is now cached. Checking. Sounds OK: [root@serveur ~]# dmsetup status | grep write vg_serveur-home: 0 850264064 cache 8 1567/8192 128 8401/524288 822662 5090733 442926 690710 0 0 0 1 writethrough 2 migration_threshold 2048 mq 10 random_threshold 4 sequential_threshold 512 discard_promote_adjustment 1 read_promote_adjustment 4 write_promote_adjustment 8 vg_serveur-root: 0 104857600 cache 8 167/8192 128 0/81920 0 0 0 79 0 0 0 1 writethrough 2 migration_threshold 2048 mq 10 random_threshold 4 sequential_threshold 512 discard_promote_adjustment 1 read_promote_adjustment 4 write_promote_adjustment 8 Then, scrubbing on origin devices ended successfully without error. Thanks a lot ! In conclusion, using a single SSD as caching device is quite common. As a result, a big big warning should be added in documentation about "writeback" mode. Meanwhile, would it be possible to include the dmsetup tricks (to trigger check and repair on the origin devices) in the documentation? Fabrice, dmsetup is mainly a low-level testing tool and it's not meant to be used in production unless requested like in this context (please keep in mind that you have a non-resilient stack WRT your cache data and metadata devices); any other dmsetup information related to mapping targets (e.g. dm-raid) for developers/testers comes with the kernels source. The real solution should be to either display a big warning for such non-resilient stack on creation, warn and enforce writethrough or prohibit altogether unless user knows what he does and requests enforcement. Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed. |