Bug 1373637
| Summary: | LVM cache: Ability to repair cache origin if RAID | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jonathan Earl Brassow <jbrassow> |
| Component: | lvm2 | Assignee: | Heinz Mauelshagen <heinzm> |
| lvm2 sub component: | Mirroring and RAID | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, prockai, rbednar, zkabelac |
| Version: | 7.2 | ||
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | lvm2-2.02.169-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-08-01 21:47:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1385242 | ||
|
Description
Jonathan Earl Brassow
2016-09-06 20:08:09 UTC
Just to be clear, we obviously want things to work in automatic mode too, so set 'allocate' as the fault handling policy and kill a device in the _corig sub-LV; it should work. Zdenek/Agk should be consulted if they want to see the top-level LV handled too, but that would be extremely confusing to me (i.e. which sub-LV would be meant by that?). It is better and easier to address the affected sub-LV, IMHO. If more than one sub-LV is affected at the same time, dmeventd should raise multiple events for the different dm devices, and they should be handled in turn.

Also, please be aware that it is possible this has been fixed in an earlier release but never formally tested and signed off by QA. That would certainly explain the fact that it already works.

Adding QA ack for 7.4. An automated test for verification is available; see the QA Whiteboard.

Any raid cache sub-LV and any raid corig sub-LV can be repaired:

```
# lvs -aoname,attr,size,segtype,syncpercent,datastripes,stripesize,reshapelenle,datacopies,regionsize,devices tb|sed 's/ *$//'
  LV                     Attr       LSize   Type       Cpy%Sync #DStr Stripe RSize #Cpy Region  Devices
  [cache]                Cwi---C--- 100.00m cache-pool 0.00         1      0           1      0 cache_cdata(0)
  [cache_cdata]          Cwi-aor--- 100.00m raid1      100.00       2      0           2 512.00k cache_cdata_rimage_0(0),cache_cdata_rimage_1(0)
  [cache_cdata_rimage_0] iwi-aor--- 100.00m linear                  1      0           1      0 /dev/sda(1)
  [cache_cdata_rimage_1] iwi-aor--- 100.00m linear                  1      0           1      0 /dev/sdaa(1)
  [cache_cdata_rmeta_0]  ewi-aor---   4.00m linear                  1      0           1      0 /dev/sda(0)
  [cache_cdata_rmeta_1]  ewi-aor---   4.00m linear                  1      0           1      0 /dev/sdaa(0)
  [cache_cmeta]          ewi-aor---   8.00m raid1      100.00       2      0           2 512.00k cache_cmeta_rimage_0(0),cache_cmeta_rimage_1(0)
  [cache_cmeta_rimage_0] iwi-aor---   8.00m linear                  1      0           1      0 /dev/sdab(1)
  [cache_cmeta_rimage_1] iwi-aor---   8.00m linear                  1      0           1      0 /dev/sdac(1)
  [cache_cmeta_rmeta_0]  ewi-aor---   4.00m linear                  1      0           1      0 /dev/sdab(0)
  [cache_cmeta_rmeta_1]  ewi-aor---   4.00m linear                  1      0           1      0 /dev/sdac(0)
  [lvol0_pmspare]        ewi-------   8.00m linear                  1      0           1      0 /dev/sdd(0)
  r                      Cwi-a-C---   1.00g cache      0.00         1      0           1      0 r_corig(0)
  [r_corig]              rwi-aoC---   1.00g raid1      100.00       2      0           2 512.00k r_corig_rimage_0(0),r_corig_rimage_1(0)
  [r_corig_rimage_0]     iwi-aor---   1.00g linear                  1      0           1      0 /dev/sdad(1)
  [r_corig_rimage_1]     iwi-aor---   1.00g linear                  1      0           1      0 /dev/sdae(1)
  [r_corig_rmeta_0]      ewi-aor---   4.00m linear                  1      0           1      0 /dev/sdad(0)
  [r_corig_rmeta_1]      ewi-aor---   4.00m linear                  1      0           1      0 /dev/sdae(0)
```

```
# lvconvert -y --repair tb/cache_cdata
  tb/cache_cdata does not contain devices specified to replace.
  Faulty devices in tb/cache_cdata successfully replaced.

# lvconvert -y --repair tb/cache_cmeta
  tb/cache_cmeta does not contain devices specified to replace.
  Faulty devices in tb/cache_cmeta successfully replaced.

# lvconvert -y --repair tb/r_corig
  tb/r_corig does not contain devices specified to replace.
  Faulty devices in tb/r_corig successfully replaced.
```

Verified.
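For context, a minimal sketch (not taken from the report) of how a cached RAID stack like the one listed above might be assembled and then repaired after a device failure. The VG name `tb` and the PV names are assumptions carried over from the lvs output; exact sizes and device layout may differ from the actual test setup.

```
# Build a RAID1 origin plus RAID1-backed cache data and metadata LVs
# (names, sizes, and devices assumed from the lvs output above).
lvcreate --type raid1 -m 1 -L 1g   -n r          tb /dev/sdad /dev/sdae
lvcreate --type raid1 -m 1 -L 100m -n cache      tb /dev/sda  /dev/sdaa
lvcreate --type raid1 -m 1 -L 8m   -n cache_meta tb /dev/sdab /dev/sdac

# Combine data and metadata into a cache pool, then attach it to the origin.
lvconvert -y --type cache-pool --poolmetadata tb/cache_meta tb/cache
lvconvert -y --type cache --cachepool tb/cache tb/r

# After a PV under any of the raid sub-LVs fails, each affected sub-LV can be
# repaired individually, as exercised in the verification above.
lvconvert -y --repair tb/cache_cdata
lvconvert -y --repair tb/cache_cmeta
lvconvert -y --repair tb/r_corig
```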
Posting links to test results.

Scenario: kill_primary_synced_raid1_2legs

- PASS (warn policy): https://beaker.cluster-qe.lab.eng.brq.redhat.com/logs/2017/06/587/58770/191571/530715/TESTOUT.log
- PASS (allocate policy): https://beaker.cluster-qe.lab.eng.brq.redhat.com/logs/2017/06/587/58769/191568/530706/TESTOUT.log

```
3.10.0-671.el7.x86_64

lvm2-2.02.171-2.el7                                BUILT: Wed May 24 16:02:34 CEST 2017
lvm2-libs-2.02.171-2.el7                           BUILT: Wed May 24 16:02:34 CEST 2017
lvm2-cluster-2.02.171-2.el7                        BUILT: Wed May 24 16:02:34 CEST 2017
device-mapper-1.02.140-2.el7                       BUILT: Wed May 24 16:02:34 CEST 2017
device-mapper-libs-1.02.140-2.el7                  BUILT: Wed May 24 16:02:34 CEST 2017
device-mapper-event-1.02.140-2.el7                 BUILT: Wed May 24 16:02:34 CEST 2017
device-mapper-event-libs-1.02.140-2.el7            BUILT: Wed May 24 16:02:34 CEST 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 17:15:46 CEST 2017
cmirror-2.02.171-2.el7                             BUILT: Wed May 24 16:02:34 CEST 2017
```

NOTE: Also verified that both scenarios fail with lvm2-2.02.166-1.el7.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222
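For reference, the two tested scenarios differ only in the RAID fault handling policy set in lvm.conf. The excerpt below is an illustrative sketch of that setting, not a copy of the configuration used in the test runs.

```
# /etc/lvm/lvm.conf -- illustrative excerpt (not from the test logs)
activation {
	# Policy used by dmeventd when a RAID image device fails:
	#   "warn"     -- only log the failure; repair is done manually with
	#                 "lvconvert --repair <VG>/<sub-LV>", as in the comment above.
	#   "allocate" -- automatically replace the failed image with space
	#                 from another PV in the volume group.
	raid_fault_policy = "allocate"
}
```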