Bug 1124414

Summary: LVM cache: dmeventd does not work on RAID volumes that are under cache
Product: Red Hat Enterprise Linux 7
Reporter: Jonathan Earl Brassow <jbrassow>
Component: lvm2
Assignee: Petr Rockai <prockai>
lvm2 sub component: Cache Logical Volumes
QA Contact: Cluster QE <mspqa-list>
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
CC: agk, cmarthal, heinzm, jbrassow, mcsontos, msnitzer, nperic, prajnoha, prockai, zkabelac
Version: 7.1
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.02.112-1.el7
Doc Type: Bug Fix
Doc Text: For RAID volumes, dmeventd monitoring was no longer possible once a persistent cache had been created on top of them, which prevented automatic repair after a disk failure. This has been corrected.
Last Closed: 2015-03-05 13:09:27 UTC
Type: Bug
Bug Blocks: 1119326

Description Jonathan Earl Brassow 2014-07-29 12:47:03 UTC
dmeventd does not work on RAID volumes that are components of a cache LV.  With raid_fault_policy set to "allocate", I get the following results:

[root@bp-01 ~]# devices  vg                                                                                                                            
  LV                            Attr       Cpy%Sync Devices                                                                                                                                                                                                                                                   
  lv                            Cwi-a-C---          lv_corig(0)                                                                                                 
  lv_cachepool                  Cwi---C---          lv_cachepool_cdata(0)                                        
  [lv_cachepool_cdata]          Cwi-aoC--- 100.00   lv_cachepool_cdata_rimage_0(0),lv_cachepool_cdata_rimage_1(0)
  [lv_cachepool_cdata_rimage_0] iwi-aor---          /dev/sdc1(2562)                                              
  [lv_cachepool_cdata_rimage_1] iwi-aor---          /dev/sdd1(2562)                                              
  [lv_cachepool_cdata_rmeta_0]  ewi-aor---          /dev/sdc1(2561)                                              
  [lv_cachepool_cdata_rmeta_1]  ewi-aor---          /dev/sdd1(2561)                                              
  [lv_cachepool_cmeta]          ewi-aoC--- 100.00   lv_cachepool_cmeta_rimage_0(0),lv_cachepool_cmeta_rimage_1(0)
  [lv_cachepool_cmeta_rimage_0] iwi-aor---          /dev/sdc1(2819)                                              
  [lv_cachepool_cmeta_rimage_1] iwi-aor---          /dev/sdd1(2819)                                              
  [lv_cachepool_cmeta_rmeta_0]  ewi-aor---          /dev/sdc1(2818)                                              
  [lv_cachepool_cmeta_rmeta_1]  ewi-aor---          /dev/sdd1(2818)                                              
  [lv_corig]                    rwi-aor--- 100.00   lv_corig_rimage_0(0),lv_corig_rimage_1(0)                    
  [lv_corig_rimage_0]           iwi-aor---          /dev/sdc1(1)                                                 
  [lv_corig_rimage_1]           iwi-aor---          /dev/sdd1(1)                                                 
  [lv_corig_rmeta_0]            ewi-aor---          /dev/sdc1(0)                                                 
  [lv_corig_rmeta_1]            ewi-aor---          /dev/sdd1(0)                                                 
  [lvol0_pmspare]               ewi-------          /dev/sdc1(2822)                                              
[root@bp-01 ~]# off.sh sdc
Turning off sdc

FAILURE TO OPERATE ON 'lv_cachepool_cmeta':
Jul 29 07:36:50 bp-01 kernel: md/raid1:mdX: Disk failure on dm-14, disabling device.
Jul 29 07:36:50 bp-01 kernel: md/raid1:mdX: Operation continuing on 1 devices.
Jul 29 07:36:50 bp-01 lvm[3372]: Names including "_cmeta" are reserved. Please choose a different LV name.
Jul 29 07:36:50 bp-01 lvm[3372]: Run `lvconvert --help' for more information.
Jul 29 07:36:50 bp-01 lvm[3372]: Repair of RAID device vg-lv_cachepool_cmeta failed.
Jul 29 07:36:50 bp-01 lvm[3372]: Failed to process event for vg-lv_cachepool_cmeta
...

FAILURE TO OPERATE ON 'lv_cachepool_cdata':
Jul 29 07:36:50 bp-01 lvm[3372]: Device #0 of raid1 array, vg-lv_cachepool_cdata, has failed.
Jul 29 07:36:50 bp-01 lvm[3372]: Names including "_cdata" are reserved. Please choose a different LV name.
Jul 29 07:36:50 bp-01 lvm[3372]: Run `lvconvert --help' for more information.
Jul 29 07:36:50 bp-01 lvm[3372]: Repair of RAID device vg-lv_cachepool_cdata failed.
Jul 29 07:36:50 bp-01 lvm[3372]: Failed to process event for vg-lv_cachepool_cdata


SUCCESSFUL OPERATION ON 'lv_corig':
Jul 29 07:38:36 bp-01 lvm[3372]: Device #0 of raid1 array, vg-lv_corig, has failed.
Jul 29 07:38:37 bp-01 kernel: device-mapper: raid: Device 0 specified for rebuild: Clearing superblock
...
Jul 29 07:38:40 bp-01 lvm[3372]: Faulty devices in vg/lv_corig successfully replaced.

Comment 2 Marian Csontos 2014-07-29 13:32:43 UTC
Already reported in Bug 1086442.

Comment 3 Petr Rockai 2014-10-07 07:03:19 UTC
I have relaxed the LV name restriction check for both lvconvert --splitmirrors and, relevant here, lvconvert --repair. The latter must be able to operate on any RAID LV, whether or not it is part of a larger aggregate LV. As this bug shows, that matters especially for dmeventd monitoring. Everything should now work as expected; the upstream fix is commit b66f16fd63014c958d65ab37df4b72f1566176d3.
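
As a rough illustration of what the relaxed check permits, the repair that dmeventd attempts on a failed RAID sub-LV can be approximated by hand. The sketch below uses the VG and LV names from the description (vg, lv_cachepool_cmeta, lv_cachepool_cdata). The helper names repair_cmd and repair_cache_sublvs and the DRY_RUN variable are invented for this example, and --use-policies is included on the assumption that the repair should follow raid_fault_policy the way the monitored repair does. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: build the lvconvert repair command for one RAID LV.
# --use-policies asks lvconvert to follow the configured raid_fault_policy.
repair_cmd() {
    vg=$1
    lv=$2
    printf 'lvconvert --repair --use-policies %s/%s\n' "$vg" "$lv"
}

# Print (default) or run the repair for both RAID sub-LVs of a cache pool.
# Set DRY_RUN=0 to actually execute the commands (requires root and LVM).
repair_cache_sublvs() {
    vg=$1
    pool=$2
    for suffix in cmeta cdata; do
        cmd=$(repair_cmd "$vg" "${pool}_${suffix}")
        if [ "${DRY_RUN:-1}" = "0" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

# Names taken from the description; prints two lvconvert command lines.
repair_cache_sublvs vg lv_cachepool
```

Before the fix, commands of this shape were rejected with the "Names including ... are reserved" message shown in the description.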

Comment 5 Nenad Peric 2015-01-27 15:07:19 UTC
Tested with multiple failures of the cache data, cache metadata, and cached LVs.
All of them were RAID1 underneath.

[root@tardis-03 ~]# lvs -a -olv_name,devices,sync_percent
  Couldn't find device with uuid ZXzlVf-FsX7-9IA3-0XOW-j0Zc-XPot-O3xVJV.
  Couldn't find device with uuid 8XXlJc-bser-qzmy-Nm3q-i6MD-Yx17-T12SFe.
  Couldn't find device with uuid sqiLxb-EuUH-ydO1-FvRz-l3RE-l9ja-xYKQ0z.
  LV                      Devices                                           Cpy%Sync
  home                    /dev/sda2(1024)                                           
  root                    /dev/sda2(58577)                                          
  swap                    /dev/sda2(0)                                              
  [cache0]                cache0_cdata(0)                                   99.14   
  [cache0_cdata]          cache0_cdata_rimage_0(0),cache0_cdata_rimage_1(0) 100.00  
  [cache0_cdata_rimage_0] /dev/sdb1(14)                                             
  [cache0_cdata_rimage_1] /dev/sdf1(1)                                              
  [cache0_cdata_rmeta_0]  /dev/sdb1(13)                                             
  [cache0_cdata_rmeta_1]  /dev/sdf1(0)                                              
  [cache0_cmeta]          cache0_cmeta_rimage_0(0),cache0_cmeta_rimage_1(0) 7.69    
  [cache0_cmeta_rimage_0] /dev/sdb1(14096)                                          
  [cache0_cmeta_rimage_1] /dev/sdh1(1)                                              
  [cache0_cmeta_rmeta_0]  /dev/sdb1(14095)                                          
  [cache0_cmeta_rmeta_1]  /dev/sdh1(0)                                              
  lvol0                   lvol0_corig(0)                                    99.14   
  [lvol0_corig]           lvol0_corig_rimage_0(0),lvol0_corig_rimage_1(0)   16.84   
  [lvol0_corig_rimage_0]  /dev/sdc1(1)                                              
  [lvol0_corig_rimage_1]  /dev/sdb1(1295)                                           
  [lvol0_corig_rmeta_0]   /dev/sdc1(0)                                              
  [lvol0_corig_rmeta_1]   /dev/sdb1(1294)                                           
  [lvol1_pmspare]         /dev/sdb1(0)                             


The missing legs were replaced successfully, and the name errors did not occur.

Marking VERIFIED with:

kernel-3.10.0-223.el7.x86_64
lvm2-2.02.115-2.el7.x86_64
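
The absence of the name errors can also be checked mechanically by scanning the system log for the message quoted in the description. This is a minimal sketch: the function name count_name_errors is made up for the example, and the pattern is derived only from the two log lines shown in this report, so other wordings would not be caught.

```shell
#!/bin/sh
# Hypothetical helper: count reserved-name rejections in log text on stdin.
# Matches messages of the form seen in the description, e.g.
#   Names including "_cmeta" are reserved. Please choose a different LV name.
count_name_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches; ignore that.
    grep -c 'Names including "_[a-z]*" are reserved' || true
}

# Sample input copied from the description; the two rejections are counted.
count_name_errors <<'EOF'
Jul 29 07:36:50 bp-01 lvm[3372]: Names including "_cmeta" are reserved. Please choose a different LV name.
Jul 29 07:36:50 bp-01 lvm[3372]: Repair of RAID device vg-lv_cachepool_cmeta failed.
Jul 29 07:36:50 bp-01 lvm[3372]: Names including "_cdata" are reserved. Please choose a different LV name.
EOF
```

On a fixed system the same function, fed the log written during a leg failure (for example with input redirected from /var/log/messages, assuming that is where syslog writes), would be expected to print 0.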

Comment 7 errata-xmlrpc 2015-03-05 13:09:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0513.html