1124414 – LVM cache: dmeventd does not work on RAID volumes that are under cache



Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	lvm2
Sub Component:
Version:	7.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Petr Rockai
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1119326

Reported:	2014-07-29 12:47 UTC by Jonathan Earl Brassow
Modified:	2021-09-03 12:37 UTC
CC List:	10 users
Fixed In Version:	lvm2-2.02.112-1.el7
Doc Type:	Bug Fix
Doc Text:	Previously, dmeventd could no longer monitor RAID volumes once a persistent cache was created on top of them, which prevented automatic repair after a disk failure. This has been corrected.
Clone Of:
Environment:
Last Closed:	2015-03-05 13:09:27 UTC
Target Upstream Version:
Embargoed:


Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:0513	0	normal	SHIPPED_LIVE	lvm2 bug fix and enhancement update	2015-03-05 16:14:41 UTC

Description Jonathan Earl Brassow 2014-07-29 12:47:03 UTC

dmeventd does not work on RAID volumes that are components of a cache LV.  With raid_fault_policy set to "allocate", I get the following results:

[root@bp-01 ~]# devices  vg                                                                                                                            
  LV                            Attr       Cpy%Sync Devices                                                                                                                                                                                                                                                   
  lv                            Cwi-a-C---          lv_corig(0)                                                                                                 
  lv_cachepool                  Cwi---C---          lv_cachepool_cdata(0)                                        
  [lv_cachepool_cdata]          Cwi-aoC--- 100.00   lv_cachepool_cdata_rimage_0(0),lv_cachepool_cdata_rimage_1(0)
  [lv_cachepool_cdata_rimage_0] iwi-aor---          /dev/sdc1(2562)                                              
  [lv_cachepool_cdata_rimage_1] iwi-aor---          /dev/sdd1(2562)                                              
  [lv_cachepool_cdata_rmeta_0]  ewi-aor---          /dev/sdc1(2561)                                              
  [lv_cachepool_cdata_rmeta_1]  ewi-aor---          /dev/sdd1(2561)                                              
  [lv_cachepool_cmeta]          ewi-aoC--- 100.00   lv_cachepool_cmeta_rimage_0(0),lv_cachepool_cmeta_rimage_1(0)
  [lv_cachepool_cmeta_rimage_0] iwi-aor---          /dev/sdc1(2819)                                              
  [lv_cachepool_cmeta_rimage_1] iwi-aor---          /dev/sdd1(2819)                                              
  [lv_cachepool_cmeta_rmeta_0]  ewi-aor---          /dev/sdc1(2818)                                              
  [lv_cachepool_cmeta_rmeta_1]  ewi-aor---          /dev/sdd1(2818)                                              
  [lv_corig]                    rwi-aor--- 100.00   lv_corig_rimage_0(0),lv_corig_rimage_1(0)                    
  [lv_corig_rimage_0]           iwi-aor---          /dev/sdc1(1)                                                 
  [lv_corig_rimage_1]           iwi-aor---          /dev/sdd1(1)                                                 
  [lv_corig_rmeta_0]            ewi-aor---          /dev/sdc1(0)                                                 
  [lv_corig_rmeta_1]            ewi-aor---          /dev/sdd1(0)                                                 
  [lvol0_pmspare]               ewi-------          /dev/sdc1(2822)                                              
[root@bp-01 ~]# off.sh sdc
Turning off sdc

FAILURE TO OPERATE ON 'lv_cachepool_cmeta':
Jul 29 07:36:50 bp-01 kernel: md/raid1:mdX: Disk failure on dm-14, disabling device.
Jul 29 07:36:50 bp-01 kernel: md/raid1:mdX: Operation continuing on 1 devices.
Jul 29 07:36:50 bp-01 lvm[3372]: Names including "_cmeta" are reserved. Please choose a different LV name.
Jul 29 07:36:50 bp-01 lvm[3372]: Run `lvconvert --help' for more information.
Jul 29 07:36:50 bp-01 lvm[3372]: Repair of RAID device vg-lv_cachepool_cmeta failed.
Jul 29 07:36:50 bp-01 lvm[3372]: Failed to process event for vg-lv_cachepool_cmeta
...

FAILURE TO OPERATE ON 'lv_cachepool_cdata':
Jul 29 07:36:50 bp-01 lvm[3372]: Device #0 of raid1 array, vg-lv_cachepool_cdata, has failed.
Jul 29 07:36:50 bp-01 lvm[3372]: Names including "_cdata" are reserved. Please choose a different LV name.
Jul 29 07:36:50 bp-01 lvm[3372]: Run `lvconvert --help' for more information.
Jul 29 07:36:50 bp-01 lvm[3372]: Repair of RAID device vg-lv_cachepool_cdata failed.
Jul 29 07:36:50 bp-01 lvm[3372]: Failed to process event for vg-lv_cachepool_cdata


SUCCESSFUL OPERATION ON 'lv_corig':
Jul 29 07:38:36 bp-01 lvm[3372]: Device #0 of raid1 array, vg-lv_corig, has failed.
Jul 29 07:38:37 bp-01 kernel: device-mapper: raid: Device 0 specified for rebuild: Clearing superblock
...
Jul 29 07:38:40 bp-01 lvm[3372]: Faulty devices in vg/lv_corig successfully replaced.
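
The three outcomes above can be tallied mechanically from the syslog lines. The following is only a minimal sketch: it assumes the two message wordings shown in this report ("Repair of RAID device ... failed." and "Faulty devices in ... successfully replaced."), and the function name repair_outcomes is hypothetical, not part of lvm2.

```python
import re

# Message wordings copied from the log excerpts in this report; other
# lvm2 versions may phrase them differently (assumption).
FAILED = re.compile(r"Repair of RAID device (\S+) failed\.")
REPLACED = re.compile(r"Faulty devices in (\S+) successfully replaced\.")


def repair_outcomes(log_lines):
    """Map each device named in a repair message to 'failed' or 'replaced'."""
    outcomes = {}
    for line in log_lines:
        match = FAILED.search(line)
        if match:
            outcomes[match.group(1)] = "failed"
            continue
        match = REPLACED.search(line)
        if match:
            outcomes[match.group(1)] = "replaced"
    return outcomes


# Three lines taken verbatim from the excerpts above.
sample = [
    "Jul 29 07:36:50 bp-01 lvm[3372]: Repair of RAID device vg-lv_cachepool_cmeta failed.",
    "Jul 29 07:36:50 bp-01 lvm[3372]: Repair of RAID device vg-lv_cachepool_cdata failed.",
    "Jul 29 07:38:40 bp-01 lvm[3372]: Faulty devices in vg/lv_corig successfully replaced.",
]

for device, outcome in repair_outcomes(sample).items():
    print(device, outcome)
```

Note that the failure messages name the device-mapper device (vg-lv_cachepool_cmeta) while the success message names the LV (vg/lv_corig); the sketch reports each name as logged and does not try to normalize them.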

Comment 2 Marian Csontos 2014-07-29 13:32:43 UTC

Already reported in Bug 1086442.

Comment 3 Petr Rockai 2014-10-07 07:03:19 UTC

I have relaxed the LV name restriction check for both lvconvert --splitmirrors and, more relevant here, lvconvert --repair. The latter must be able to operate on any RAID LV, whether or not it is part of a larger aggregate LV. As this bug shows, that matters especially for dmeventd monitoring. Everything should now work as expected; the upstream fix is commit b66f16fd63014c958d65ab37df4b72f1566176d3.
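
To illustrate the idea behind the change, here is a rough model of a reserved-name check that is skipped for selected operations. It is a sketch only and not the lvm2 implementation (which is written in C): the function name, the operation labels, and the "rename" operation used in the demo are hypothetical, and the reserved list contains only the two substrings that appear in the error messages in this bug.

```python
# Hypothetical model of the name check; NOT the actual lvm2 code.
# Only the two substrings seen in this bug's error messages are listed.
RESERVED_SUBSTRINGS = ("_cmeta", "_cdata")

# Operations for which the check is relaxed, as described in this comment.
RELAXED_OPERATIONS = frozenset({"repair", "splitmirrors"})


def name_check_error(lv_name, operation, relaxed_ops=RELAXED_OPERATIONS):
    """Return the rejection message for lv_name, or None if it is accepted."""
    if operation in relaxed_ops:
        return None  # relaxed: operate on the LV regardless of its name
    for reserved in RESERVED_SUBSTRINGS:
        if reserved in lv_name:
            return (f'Names including "{reserved}" are reserved. '
                    "Please choose a different LV name.")
    return None


# Before the fix (nothing relaxed): repair of a cache sub-LV is rejected.
print(name_check_error("lv_cachepool_cmeta", "repair", relaxed_ops=frozenset()))

# After the fix: repair is allowed, other operations are still checked.
print(name_check_error("lv_cachepool_cmeta", "repair"))
print(name_check_error("lv_cachepool_cmeta", "rename"))
```

In this model, lv_corig passes the check in both cases, which matches the successful repair of lv_corig in the original report.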

Comment 5 Nenad Peric 2015-01-27 15:07:19 UTC

Tested with multiple failures of the cache data, cache metadata, and cached LVs.
All of them were RAID1 underneath.

[root@tardis-03 ~]# lvs -a -olv_name,devices,sync_percent
  Couldn't find device with uuid ZXzlVf-FsX7-9IA3-0XOW-j0Zc-XPot-O3xVJV.
  Couldn't find device with uuid 8XXlJc-bser-qzmy-Nm3q-i6MD-Yx17-T12SFe.
  Couldn't find device with uuid sqiLxb-EuUH-ydO1-FvRz-l3RE-l9ja-xYKQ0z.
  LV                      Devices                                           Cpy%Sync
  home                    /dev/sda2(1024)                                           
  root                    /dev/sda2(58577)                                          
  swap                    /dev/sda2(0)                                              
  [cache0]                cache0_cdata(0)                                   99.14   
  [cache0_cdata]          cache0_cdata_rimage_0(0),cache0_cdata_rimage_1(0) 100.00  
  [cache0_cdata_rimage_0] /dev/sdb1(14)                                             
  [cache0_cdata_rimage_1] /dev/sdf1(1)                                              
  [cache0_cdata_rmeta_0]  /dev/sdb1(13)                                             
  [cache0_cdata_rmeta_1]  /dev/sdf1(0)                                              
  [cache0_cmeta]          cache0_cmeta_rimage_0(0),cache0_cmeta_rimage_1(0) 7.69    
  [cache0_cmeta_rimage_0] /dev/sdb1(14096)                                          
  [cache0_cmeta_rimage_1] /dev/sdh1(1)                                              
  [cache0_cmeta_rmeta_0]  /dev/sdb1(14095)                                          
  [cache0_cmeta_rmeta_1]  /dev/sdh1(0)                                              
  lvol0                   lvol0_corig(0)                                    99.14   
  [lvol0_corig]           lvol0_corig_rimage_0(0),lvol0_corig_rimage_1(0)   16.84   
  [lvol0_corig_rimage_0]  /dev/sdc1(1)                                              
  [lvol0_corig_rimage_1]  /dev/sdb1(1295)                                           
  [lvol0_corig_rmeta_0]   /dev/sdc1(0)                                              
  [lvol0_corig_rmeta_1]   /dev/sdb1(1294)                                           
  [lvol1_pmspare]         /dev/sdb1(0)                             


The missing legs were replaced successfully, and the name errors were not encountered.

Marking VERIFIED with:

kernel-3.10.0-223.el7.x86_64
lvm2-2.02.115-2.el7.x86_64

Comment 7 errata-xmlrpc 2015-03-05 13:09:27 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0513.html
