Bug 1086442

Summary: RFE: allow raid scrubbing on cache pool raid volumes
Product: Red Hat Enterprise Linux 7 Reporter: Corey Marthaler <cmarthal>
Component: lvm2    Assignee: Petr Rockai <prockai>
lvm2 sub component: Cache Logical Volumes QA Contact: Cluster QE <mspqa-list>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: unspecified CC: agk, heinzm, jbrassow, mcsontos, msnitzer, prajnoha, prockai, zkabelac
Version: 7.0    Keywords: FutureFeature
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: lvm2-2.02.112-1.el7 Doc Type: Enhancement
Doc Text:
It is now possible to issue a scrub command on RAID volumes that are hidden underneath a cache logical volume.
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-05 13:08:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1119326    

Description Corey Marthaler 2014-04-10 22:42:17 UTC
Description of problem:
[root@harding-02 ~]# lvs -a -o +devices
  LV                             VG            Attr       LSize   Devices
  display_cache                  cache_sanity  Cwi---C---   1.00g display_cache_cdata(0)
  [display_cache_cdata]          cache_sanity  Cwi---C---   1.00g display_cache_cdata_rimage_0(0),display_cache_cdata_rimage_1(0),display_cache_cdata_rimage_2(0),display_cache_cdata_rimage_3(0)
  [display_cache_cdata_rimage_0] cache_sanity  Iwi---r--- 512.00m /dev/sdc2(1)
  [display_cache_cdata_rimage_1] cache_sanity  Iwi---r--- 512.00m /dev/sdc1(1)
  [display_cache_cdata_rimage_2] cache_sanity  Iwi---r--- 512.00m /dev/sdb1(1)
  [display_cache_cdata_rimage_3] cache_sanity  Iwi---r--- 512.00m /dev/sdb3(1)
  [display_cache_cdata_rmeta_0]  cache_sanity  ewi---r---   4.00m /dev/sdc2(0)
  [display_cache_cdata_rmeta_1]  cache_sanity  ewi---r---   4.00m /dev/sdc1(0)
  [display_cache_cdata_rmeta_2]  cache_sanity  ewi---r---   4.00m /dev/sdb1(0)
  [display_cache_cdata_rmeta_3]  cache_sanity  ewi---r---   4.00m /dev/sdb3(0)
  [display_cache_cmeta]          cache_sanity  ewi---C---   8.00m display_cache_cmeta_rimage_0(0),display_cache_cmeta_rimage_1(0),display_cache_cmeta_rimage_2(0),display_cache_cmeta_rimage_3(0)
  [display_cache_cmeta_rimage_0] cache_sanity  Iwi---r---   4.00m /dev/sdc2(130)
  [display_cache_cmeta_rimage_1] cache_sanity  Iwi---r---   4.00m /dev/sdc1(130)
  [display_cache_cmeta_rimage_2] cache_sanity  Iwi---r---   4.00m /dev/sdb1(130)
  [display_cache_cmeta_rimage_3] cache_sanity  Iwi---r---   4.00m /dev/sdb3(130)
  [display_cache_cmeta_rmeta_0]  cache_sanity  ewi---r---   4.00m /dev/sdc2(129)
  [display_cache_cmeta_rmeta_1]  cache_sanity  ewi---r---   4.00m /dev/sdc1(129)
  [display_cache_cmeta_rmeta_2]  cache_sanity  ewi---r---   4.00m /dev/sdb1(129)
  [display_cache_cmeta_rmeta_3]  cache_sanity  ewi---r---   4.00m /dev/sdb3(129)
  [lvol0_pmspare]                cache_sanity  ewi-------   8.00m /dev/sdc3(0)

[root@harding-02 ~]# 
[root@harding-02 ~]# 
[root@harding-02 ~]# lvchange --syncaction repair cache_sanity/display_cache_cdata
  Unable to change internal LV display_cache_cdata directly
[root@harding-02 ~]# lvchange --syncaction repair cache_sanity/display_cache_cmeta
  Unable to change internal LV display_cache_cmeta directly


Version-Release number of selected component (if applicable):
3.10.0-110.el7.x86_64
lvm2-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-libs-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-cluster-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-persistent-data-0.2.8-4.el7    BUILT: Fri Jan 24 14:28:55 CST 2014
cmirror-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014

Comment 2 Marian Csontos 2014-05-05 09:10:35 UTC
It is not only scrubbing that needs to be allowed:

- up and down conversion of RAID1
- lvconvert --repair ... (sketched below)

Repair in particular is critical for users of a writeback cache.

As a side note: these operations work fine for _corig.
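
For illustration, the kind of commands meant here would be roughly as follows (hypothetical invocations against the cache_sanity VG shown above, not taken from an actual run):

  # grow or shrink the number of RAID1 images under the cache data sub-LV
  lvconvert -m 2 cache_sanity/display_cache_cdata
  lvconvert -m 1 cache_sanity/display_cache_cdata

  # repair the RAID1 under the cache metadata sub-LV after a device failure
  lvconvert --repair cache_sanity/display_cache_cmeta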

Comment 3 Petr Rockai 2014-10-13 14:24:57 UTC
Fix pushed upstream as 22a6b0e40b302b333f9bd995c0b81b0146e8232a.

Comment 5 Corey Marthaler 2014-11-14 00:15:16 UTC
Still unable to scrub in .112-1, unless I'm doing it wrong.

[root@host-110 ~]# lvs -a -o +devices
  LV                    Attr       LSize   Pool Origin Data%  Meta%  Cpy%Sync Devices                                      
  corigin               rwi-a-r---   4.00g                           40.82    corigin_rimage_0(0),corigin_rimage_1(0)      
  [corigin_rimage_0]    Iwi-aor---   4.00g                                    /dev/sdc2(1)
  [corigin_rimage_1]    Iwi-aor---   4.00g                                    /dev/sde2(1)
  [corigin_rmeta_0]     ewi-aor---   4.00m                                    /dev/sdc2(0)
  [corigin_rmeta_1]     ewi-aor---   4.00m                                    /dev/sde2(0)
  [lvol0_pmspare]       ewi-------   8.00m                                    /dev/sda1(0)
  pool                  Cwi---C---   2.00g                                    pool_cdata(0)
  [pool_cdata]          Cwi---r---   2.00g                                    pool_cdata_rimage_0(0),pool_cdata_rimage_1(0)
  [pool_cdata_rimage_0] Iwi---r---   2.00g                                    /dev/sde1(1)
  [pool_cdata_rimage_1] Iwi---r---   2.00g                                    /dev/sdd2(1)
  [pool_cdata_rmeta_0]  ewi---r---   4.00m                                    /dev/sde1(0)
  [pool_cdata_rmeta_1]  ewi---r---   4.00m                                    /dev/sdd2(0)
  [pool_cmeta]          ewi---r---   8.00m                                    pool_cmeta_rimage_0(0),pool_cmeta_rimage_1(0)
  [pool_cmeta_rimage_0] Iwi---r---   8.00m                                    /dev/sde1(514)
  [pool_cmeta_rimage_1] Iwi---r---   8.00m                                    /dev/sdd2(514)
  [pool_cmeta_rmeta_0]  ewi---r---   4.00m                                    /dev/sde1(513)
  [pool_cmeta_rmeta_1]  ewi---r---   4.00m                                    /dev/sdd2(513)
[root@host-110 ~]# lvchange --syncaction repair cache_sanity/pool
  cache_sanity/pool must be a RAID logical volume to perform this action.
[root@host-110 ~]# lvchange --syncaction repair cache_sanity/pool_cdata
  Unable to send message to an inactive logical volume.
[root@host-110 ~]# lvchange --syncaction repair cache_sanity/pool_cmeta
  Unable to send message to an inactive logical volume.
[root@host-110 ~]# lvconvert --yes --type cache --cachepool cache_sanity/pool cache_sanity/corigin
  Logical volume cache_sanity/corigin is now cached.
[root@host-110 ~]# lvs -a -o +devices
  LV                       Attr       LSize   Pool   Origin          Data%  Meta%  Cpy%Sync Devices                                            
  corigin                  Cwi-a-C---   4.00g [pool] [corigin_corig] 0.02   3.47   0.00     corigin_corig(0)
  [corigin_corig]          rwi-aoC---   4.00g                                      100.00   corigin_corig_rimage_0(0),corigin_corig_rimage_1(0)
  [corigin_corig_rimage_0] iwi-aor---   4.00g                                               /dev/sdc2(1)
  [corigin_corig_rimage_1] iwi-aor---   4.00g                                               /dev/sde2(1)
  [corigin_corig_rmeta_0]  ewi-aor---   4.00m                                               /dev/sdc2(0)
  [corigin_corig_rmeta_1]  ewi-aor---   4.00m                                               /dev/sde2(0)
  [lvol0_pmspare]          ewi-------   8.00m                                               /dev/sda1(0)
  [pool]                   Cwi---C---   2.00g                        0.02   3.47   0.00     pool_cdata(0)
  [pool_cdata]             Cwi-aor---   2.00g                                      32.03    pool_cdata_rimage_0(0),pool_cdata_rimage_1(0)      
  [pool_cdata_rimage_0]    Iwi-aor---   2.00g                                               /dev/sde1(1)
  [pool_cdata_rimage_1]    Iwi-aor---   2.00g                                               /dev/sdd2(1)
  [pool_cdata_rmeta_0]     ewi-aor---   4.00m                                               /dev/sde1(0)
  [pool_cdata_rmeta_1]     ewi-aor---   4.00m                                               /dev/sdd2(0)
  [pool_cmeta]             ewi-aor---   8.00m                                      100.00   pool_cmeta_rimage_0(0),pool_cmeta_rimage_1(0)      
  [pool_cmeta_rimage_0]    iwi-aor---   8.00m                                               /dev/sde1(514)
  [pool_cmeta_rimage_1]    iwi-aor---   8.00m                                               /dev/sdd2(514)
  [pool_cmeta_rmeta_0]     ewi-aor---   4.00m                                               /dev/sde1(513)
  [pool_cmeta_rmeta_1]     ewi-aor---   4.00m                                               /dev/sdd2(513)
[root@host-110 ~]# lvchange --syncaction repair cache_sanity/corigin
  cache_sanity/corigin must be a RAID logical volume to perform this action.
[root@host-110 ~]# lvchange --syncaction repair cache_sanity/pool
  Unable to change internal LV pool directly



3.10.0-189.el7.x86_64
lvm2-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
lvm2-libs-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
lvm2-cluster-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-libs-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-event-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-event-libs-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-persistent-data-0.3.2-1.el7    BUILT: Thu Apr  3 09:58:51 CDT 2014
cmirror-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014

Comment 6 Petr Rockai 2014-11-19 14:28:53 UTC
I don't quite understand. We have the following in the testsuite and it works:

lvcreate -n corigin --type cache --cachepool $vg/cpool -l 10
 
lvchange --syncaction repair $vg/cpool_cmeta
lvchange --syncaction repair $vg/cpool_cdata

and this is what I get:

#lvconvert-cache-raid.sh:51+ lvs -a -o+seg_pe_ranges @PREFIX@vg
  LV                    VG             Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert PE Ranges                                                                                         
  cpool                 @PREFIX@vg rwi-a-r---   5.00m                                    100.00           cpool_rimage_0:0-9 cpool_rimage_1:0-9                                                             
  cpool_meta            @PREFIX@vg rwi-a-r---   5.00m                                    100.00           cpool_meta_rimage_0:0-9 cpool_meta_rimage_1:0-9                                                   
  [cpool_meta_rimage_0] @PREFIX@vg iwi-aor---   5.00m                                                     @TESTDIR@/dev/mapper/@PREFIX@pv1:1-10 
  [cpool_meta_rimage_1] @PREFIX@vg iwi-aor---   5.00m                                                     @TESTDIR@/dev/mapper/@PREFIX@pv2:1-10 
  [cpool_meta_rmeta_0]  @PREFIX@vg ewi-aor--- 512.00k                                                     @TESTDIR@/dev/mapper/@PREFIX@pv1:0-0  
  [cpool_meta_rmeta_1]  @PREFIX@vg ewi-aor--- 512.00k                                                     @TESTDIR@/dev/mapper/@PREFIX@pv2:0-0  
  [cpool_rimage_0]      @PREFIX@vg iwi-aor---   5.00m                                                     @TESTDIR@/dev/mapper/@PREFIX@pv1:12-21
  [cpool_rimage_1]      @PREFIX@vg iwi-aor---   5.00m                                                     @TESTDIR@/dev/mapper/@PREFIX@pv2:12-21
  [cpool_rmeta_0]       @PREFIX@vg ewi-aor--- 512.00k                                                     @TESTDIR@/dev/mapper/@PREFIX@pv1:11-11
  [cpool_rmeta_1]       @PREFIX@vg ewi-aor--- 512.00k                                                     @TESTDIR@/dev/mapper/@PREFIX@pv2:11-11

#lvconvert-cache-raid.sh:52+ lvconvert --yes --type cache-pool --poolmetadata @PREFIX@vg/cpool_meta @PREFIX@vg/cpool
#lvconvert-cache-raid.sh:53+ lvcreate -n corigin --type cache --cachepool @PREFIX@vg/cpool -l 10

#lvconvert-cache-raid.sh:55+ lvchange --syncaction repair @PREFIX@vg/cpool_cmeta
6,13782,234147111314,-;md: requested-resync of RAID array mdX
#lvconvert-cache-raid.sh:56+ lvchange --syncaction repair @PREFIX@vg/cpool_cdata
6,13786,234147260003,-;md: requested-resync of RAID array mdX


I think it would work if you ran the syncaction commands on the cdata/cmeta sub-LVs after activating the pool. In your transcript the syncaction commands on the cdata/cmeta LVs run before the pool is activated, while the commands you run after activation target the pool and corigin LVs instead.
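
For example, against the cache_sanity VG from comment 5 the working order would be roughly (untested sketch reusing the commands already shown there):

  # attaching the pool to an origin activates the hidden cdata/cmeta sub-LVs
  lvconvert --yes --type cache --cachepool cache_sanity/pool cache_sanity/corigin

  # scrub the now-active hidden RAID sub-LVs
  lvchange --syncaction repair cache_sanity/pool_cdata
  lvchange --syncaction repair cache_sanity/pool_cmeta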

Comment 7 Corey Marthaler 2014-11-19 19:56:26 UTC
This works for me as well now that I also upgraded kernels (to 3.10.0-200.el7.bz1159001v2.x86_64).

Comment 8 Corey Marthaler 2014-11-20 23:30:15 UTC
Scrubbing doesn't work on cache pools that have not yet been used with a cache origin device, since they apparently can't be activated. Is this expected behavior?


[root@host-116 ~]# lvcreate --type raid1 -m 1 -L 2G -n pool cache_sanity /dev/sde2 /dev/sda2
  Logical volume "pool" created.
[root@host-116 ~]# lvcreate --type raid1 -m 1 -L 8M -n pool_meta cache_sanity /dev/sde2 /dev/sda2
  Logical volume "pool_meta" created.
[root@host-116 ~]# lvs -a -o +devices
  LV                   Attr       LSize   Pool Origin Cpy%Sync Devices
  pool                 rwi-a-r---   2.00g             100.00   pool_rimage_0(0),pool_rimage_1(0)
  pool_meta            rwi-a-r---   8.00m             100.00   pool_meta_rimage_0(0),pool_meta_rimage_1(0)
  [pool_meta_rimage_0] iwi-aor---   8.00m                      /dev/sde2(514)
  [pool_meta_rimage_1] iwi-aor---   8.00m                      /dev/sda2(514)
  [pool_meta_rmeta_0]  ewi-aor---   4.00m                      /dev/sde2(513)
  [pool_meta_rmeta_1]  ewi-aor---   4.00m                      /dev/sda2(513)
  [pool_rimage_0]      iwi-aor---   2.00g                      /dev/sde2(1)
  [pool_rimage_1]      iwi-aor---   2.00g                      /dev/sda2(1)
  [pool_rmeta_0]       ewi-aor---   4.00m                      /dev/sde2(0)
  [pool_rmeta_1]       ewi-aor---   4.00m                      /dev/sda2(0)

[root@host-116 ~]# lvconvert --yes --type cache-pool --poolmetadata cache_sanity/pool_meta cache_sanity/pool
  WARNING: Converting logical volume cache_sanity/pool and cache_sanity/pool_meta to pool's data and metadata volumes.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
  Converted cache_sanity/pool to cache pool.

[root@host-116 ~]# lvs -a -o +devices
  LV                    Attr       LSize   Pool Origin Cpy%Sync Devices
  [lvol0_pmspare]       ewi-------   8.00m                      /dev/sda2(516)
  pool                  Cwi---C---   2.00g                      pool_cdata(0)
  [pool_cdata]          Cwi---r---   2.00g                      pool_cdata_rimage_0(0),pool_cdata_rimage_1(0)
  [pool_cdata_rimage_0] Iwi---r---   2.00g                      /dev/sde2(1)
  [pool_cdata_rimage_1] Iwi---r---   2.00g                      /dev/sda2(1)
  [pool_cdata_rmeta_0]  ewi---r---   4.00m                      /dev/sde2(0)
  [pool_cdata_rmeta_1]  ewi---r---   4.00m                      /dev/sda2(0)
  [pool_cmeta]          ewi---r---   8.00m                      pool_cmeta_rimage_0(0),pool_cmeta_rimage_1(0)
  [pool_cmeta_rimage_0] Iwi---r---   8.00m                      /dev/sde2(514)
  [pool_cmeta_rimage_1] Iwi---r---   8.00m                      /dev/sda2(514)
  [pool_cmeta_rmeta_0]  ewi---r---   4.00m                      /dev/sde2(513)
  [pool_cmeta_rmeta_1]  ewi---r---   4.00m                      /dev/sda2(513)

[root@host-116 ~]# lvchange --syncaction repair cache_sanity/pool_cdata
  Unable to send message to an inactive logical volume.
[root@host-116 ~]# lvchange --syncaction repair cache_sanity/pool_cmeta
  Unable to send message to an inactive logical volume.
[root@host-116 ~]# lvchange -ay cache_sanity/pool

[root@host-116 ~]# lvs -a -o +devices
  LV                    Attr       LSize   Pool Origin Cpy%Sync Devices
  [lvol0_pmspare]       ewi-------   8.00m                      /dev/sda2(516)
  pool                  Cwi---C---   2.00g                      pool_cdata(0)
  [pool_cdata]          Cwi---r---   2.00g                      pool_cdata_rimage_0(0),pool_cdata_rimage_1(0)
  [pool_cdata_rimage_0] Iwi---r---   2.00g                      /dev/sde2(1)
  [pool_cdata_rimage_1] Iwi---r---   2.00g                      /dev/sda2(1)
  [pool_cdata_rmeta_0]  ewi---r---   4.00m                      /dev/sde2(0)
  [pool_cdata_rmeta_1]  ewi---r---   4.00m                      /dev/sda2(0)
  [pool_cmeta]          ewi---r---   8.00m                      pool_cmeta_rimage_0(0),pool_cmeta_rimage_1(0)
  [pool_cmeta_rimage_0] Iwi---r---   8.00m                      /dev/sde2(514)
  [pool_cmeta_rimage_1] Iwi---r---   8.00m                      /dev/sda2(514)
  [pool_cmeta_rmeta_0]  ewi---r---   4.00m                      /dev/sde2(513)
  [pool_cmeta_rmeta_1]  ewi---r---   4.00m                      /dev/sda2(513)

[root@host-116 ~]# lvchange --syncaction repair cache_sanity/pool_cdata
  Unable to send message to an inactive logical volume.
[root@host-116 ~]# lvchange --syncaction repair cache_sanity/pool_cmeta
  Unable to send message to an inactive logical volume.


3.10.0-200.el7.bz1159001v2.x86_64

lvm2-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
lvm2-libs-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
lvm2-cluster-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-libs-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-event-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-event-libs-1.02.91-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014
device-mapper-persistent-data-0.4.1-1.el7    BUILT: Tue Oct 28 09:49:38 CDT 2014
cmirror-2.02.112-1.el7    BUILT: Tue Nov 11 09:39:35 CST 2014

Comment 9 Corey Marthaler 2014-12-01 20:41:21 UTC
Marking this verified in the latest rpms. RAID scrubbing does work on cache pool volumes that are currently associated with a cache origin volume.

Scrubbing does not work on stand-alone cache pool volumes (bug 1169500) or on cache origin volumes (bug 1169495).
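
In short, the verified flow is along these lines (condensed from the transcripts above; it assumes a corigin LV already exists in the VG, as in comment 5):

  # build RAID1 data and metadata LVs and turn them into a cache pool
  lvcreate --type raid1 -m 1 -L 2G -n pool cache_sanity
  lvcreate --type raid1 -m 1 -L 8M -n pool_meta cache_sanity
  lvconvert --yes --type cache-pool --poolmetadata cache_sanity/pool_meta cache_sanity/pool

  # attach the pool to an origin, then scrub the hidden RAID sub-LVs
  lvconvert --yes --type cache --cachepool cache_sanity/pool cache_sanity/corigin
  lvchange --syncaction repair cache_sanity/pool_cdata
  lvchange --syncaction repair cache_sanity/pool_cmeta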


3.10.0-206.el7.x86_64
lvm2-2.02.114-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014
lvm2-libs-2.02.114-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014
lvm2-cluster-2.02.114-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014
device-mapper-1.02.92-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014
device-mapper-libs-1.02.92-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014
device-mapper-event-1.02.92-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014
device-mapper-event-libs-1.02.92-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014
device-mapper-persistent-data-0.4.1-2.el7    BUILT: Wed Nov 12 12:39:46 CST 2014
cmirror-2.02.114-2.el7    BUILT: Mon Dec  1 10:57:14 CST 2014

Comment 11 errata-xmlrpc 2015-03-05 13:08:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0513.html