Bug 1399844 - LVM RAID: Unable to refresh transiently failed device
Summary: LVM RAID: Unable to refresh transiently failed device
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1430028
Blocks: 1385242
 
Reported: 2016-11-29 21:36 UTC by Jonathan Earl Brassow
Modified: 2021-09-03 12:36 UTC
CC List: 8 users

Fixed In Version: lvm2-2.02.169-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 21:49:49 UTC
Target Upstream Version:
Embargoed:


Attachments
First "lvchange -vvvv --refresh ..." run output (118.72 KB, text/plain), 2016-11-30 14:30 UTC, Heinz Mauelshagen
Second "lvchange -vvvv --refresh ..." run output (116.44 KB, text/plain), 2016-11-30 14:30 UTC, Heinz Mauelshagen


Links
Red Hat Product Errata RHBA-2017:2222 (private: no, priority: normal, status: SHIPPED_LIVE): lvm2 bug fix and enhancement update. Last updated: 2017-08-01 18:42:41 UTC

Description Jonathan Earl Brassow 2016-11-29 21:36:38 UTC
'lvchange --refresh' seems incapable of bringing a transiently failed device back in... see below:

[root@bp-01 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices vg
  LV               Attr       LSize   Type   Cpy%Sync Devices
  raid1            rwi-a-r--- 500.00m raid1  100.00   raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] iwi-aor--- 500.00m linear          /dev/sdb1(1)
  [raid1_rimage_1] iwi-aor--- 500.00m linear          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi-aor---   4.00m linear          /dev/sdb1(0)
  [raid1_rmeta_1]  ewi-aor---   4.00m linear          /dev/sdc1(0)
[root@bp-01 ~]# off.sh sdb
Turning off sdb
[root@bp-01 ~]# dd if=/dev/zero of=/dev/vg/raid1 bs=4M count=1
1+0 records in
1+0 records out
4194304 bytes (4.2 MB) copied, 0.11769 s, 35.6 MB/s
[root@bp-01 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices vg
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
  WARNING: Couldn't find all devices for LV vg/raid1_rimage_0 while checking used and assumed devices.
  WARNING: Couldn't find all devices for LV vg/raid1_rmeta_0 while checking used and assumed devices.
  LV               Attr       LSize   Type   Cpy%Sync Devices
  raid1            rwi-a-r-p- 500.00m raid1  100.00   raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] iwi-aor-p- 500.00m linear          [unknown](1)
  [raid1_rimage_1] iwi-aor--- 500.00m linear          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi-aor-p-   4.00m linear          [unknown](0)
  [raid1_rmeta_1]  ewi-aor---   4.00m linear          /dev/sdc1(0)
[root@bp-01 ~]# vgchange -an vg
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
  WARNING: Couldn't find all devices for LV vg/raid1_rimage_0 while checking used and assumed devices.
  WARNING: Couldn't find all devices for LV vg/raid1_rmeta_0 while checking used and assumed devices.
  0 logical volume(s) in volume group "vg" now active
[root@bp-01 ~]# vgchange -ay vg
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
  1 logical volume(s) in volume group "vg" now active
[root@bp-01 ~]# lvchange -an vg/raid1
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
[root@bp-01 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices vg
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
  LV               Attr       LSize   Type   Cpy%Sync Devices
  raid1            rwi---r-p- 500.00m raid1           raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] Iwi---r-p- 500.00m linear          [unknown](1)
  [raid1_rimage_1] Iwi---r--- 500.00m linear          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi---r-p-   4.00m linear          [unknown](0)
  [raid1_rmeta_1]  ewi---r---   4.00m linear          /dev/sdc1(0)
[root@bp-01 ~]# lvchange -ay vg/raid1
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
[root@bp-01 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices vg
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
  LV               Attr       LSize   Type   Cpy%Sync Devices
  raid1            rwi-a-r-p- 500.00m raid1  100.00   raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] iwi-a-r-p- 500.00m linear          [unknown](1)
  [raid1_rimage_1] iwi-aor--- 500.00m linear          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi-a-r-p-   4.00m linear          [unknown](0)
  [raid1_rmeta_1]  ewi-aor---   4.00m linear          /dev/sdc1(0)
[root@bp-01 ~]# on.sh sdb
Turning on sdb
[root@bp-01 ~]# lvchange --refresh vg/raid1
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
  Refusing refresh of partial LV vg/raid1. Use '--activationmode partial' to override.
[root@bp-01 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices vg
  WARNING: Device for PV dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke not found or rejected by a filter.
  LV               Attr       LSize   Type   Cpy%Sync Devices
  raid1            rwi-a-r-p- 500.00m raid1  100.00   raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] iwi-a-r-p- 500.00m linear          [unknown](1)
  [raid1_rimage_1] iwi-aor--- 500.00m linear          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi-a-r-p-   4.00m linear          [unknown](0)
  [raid1_rmeta_1]  ewi-aor---   4.00m linear          /dev/sdc1(0)
[root@bp-01 ~]# pvscan --cache
[root@bp-01 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices vg
  LV               Attr       LSize   Type   Cpy%Sync Devices
  raid1            rwi-a-r--- 500.00m raid1  100.00   raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] iwi-a-r--- 500.00m linear          /dev/sdb1(1)
  [raid1_rimage_1] iwi-aor--- 500.00m linear          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi-a-r---   4.00m linear          /dev/sdb1(0)
  [raid1_rmeta_1]  ewi-aor---   4.00m linear          /dev/sdc1(0)
[root@bp-01 ~]# dmsetup status vg-raid1
0 1024000 raid raid1 2 A 1024000/1024000 idle 0 0
[root@bp-01 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices vg
  LV               Attr       LSize   Type   Cpy%Sync Devices
  raid1            rwi-a-r--- 500.00m raid1  100.00   raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] iwi-a-r--- 500.00m linear          /dev/sdb1(1)
  [raid1_rimage_1] iwi-aor--- 500.00m linear          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi-a-r---   4.00m linear          /dev/sdb1(0)
  [raid1_rmeta_1]  ewi-aor---   4.00m linear          /dev/sdc1(0)
[root@bp-01 ~]# dmsetup table
vg-raid1_rmeta_0-missing_0_0: 0 8192 error
rhel_bp--01-home: 0 853286912 linear 8:2 16517120
rhel_bp--01-swap: 0 16515072 linear 8:2 2048
rhel_bp--01-root: 0 104857600 linear 8:2 869804032
vg-raid1_rmeta_1: 0 8192 linear 8:33 2048
vg-raid1_rmeta_0: 0 8192 linear 253:3 0
vg-raid1_rimage_1: 0 1024000 linear 8:33 10240
vg-raid1_rimage_0: 0 1024000 linear 253:5 0
vg-raid1_rimage_0-missing_0_0: 0 1024000 error
vg-raid1: 0 1024000 raid raid1 3 0 region_size 1024 2 - - 253:7 253:8

Comment 1 Heinz Mauelshagen 2016-11-29 23:16:09 UTC
Jon,

I had to run "lvchange --refresh $lv" twice after your PV offline, vgchange -an, vgchange -ay, PV online scenario to make the refresh succeed on recent 7.3.

Does the same apply to 7.2?

Comment 2 Heinz Mauelshagen 2016-11-30 13:10:32 UTC
Works after two "lvchange --refresh ..." runs (why two are needed requires
further clarification) on RHEL 7.2 without lvmetad (with lvmetad,
"pvscan --cache /dev/sdb" is necessary to update the lvmetad cache),
lvm2 2.02.130(2)-RHEL7, kernel 3.10.0-327.el7.x86_64 with
dm-raid target 1.0.7:

[root@vm102 ~]# lvcreate -m1 --ty raid1 -L256 -nr --nosync ssd
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: New raid1 won't be synchronised. Don't read what you didn't write!
WARNING: ext4 signature detected on /dev/ssd/r at offset 1080. Wipe it? [y/n]: y
  Wiping ext4 signature on /dev/ssd/r.
  Logical volume "r" created.
[root@vm102 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices ssd
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  LV           Attr       LSize   Type   Cpy%Sync Devices                    
  r            Rwi-a-r--- 256.00m raid1  100.00   r_rimage_0(0),r_rimage_1(0)
  [r_rimage_0] iwi-aor--- 256.00m linear          /dev/sda(1)                
  [r_rimage_1] iwi-aor--- 256.00m linear          /dev/sdb(1)                
  [r_rmeta_0]  ewi-aor---   4.00m linear          /dev/sda(0)                
  [r_rmeta_1]  ewi-aor---   4.00m linear          /dev/sdb(0)                
[root@vm102 ~]# mkfs -t ext4 /dev/ssd/r
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done                            
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=33816576
32 block groups
8192 blocks per group, 8192 fragments per group
2048 inodes per group
Superblock backups stored on blocks: 
        8193, 24577, 40961, 57345, 73729, 204801, 221185

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done 

[root@vm102 ~]# echo offline > /sys/block/sdb/device/state
[root@vm102 ~]# lvs -a -o name,attr,size,segtype,syncpercent,devices ssd
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  Couldn't find device with uuid KC4elc-LlBl-jCVC-eBF9-lrZc-LWSd-veEOGW.
  LV           Attr       LSize   Type   Cpy%Sync Devices                    
  r            Rwi-a-r-p- 256.00m raid1  100.00   r_rimage_0(0),r_rimage_1(0)
  [r_rimage_0] iwi-aor--- 256.00m linear          /dev/sda(1)                
  [r_rimage_1] iwi-aor-p- 256.00m linear          unknown device(1)          
  [r_rmeta_0]  ewi-aor---   4.00m linear          /dev/sda(0)                
  [r_rmeta_1]  ewi-aor-p-   4.00m linear          unknown device(0)          
[root@vm102 ~]# fsck -fn /dev/ssd/r
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/ssd-r: 11/65536 files (0.0% non-contiguous), 18535/262144 blocks
[root@vm102 ~]# dmsetup  status ssd-r
0 524288 raid raid1 2 AD 524288/524288 idle 0
[root@vm102 ~]# dmsetup  table ssd-r
0 524288 raid raid1 4 0 nosync region_size 1024 2 253:2 253:3 253:4 253:5
[root@vm102 ~]# vgchange -an ssd
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  Couldn't find device with uuid KC4elc-LlBl-jCVC-eBF9-lrZc-LWSd-veEOGW.
  0 logical volume(s) in volume group "ssd" now active
[root@vm102 ~]# vgchange -ay ssd
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  Couldn't find device with uuid KC4elc-LlBl-jCVC-eBF9-lrZc-LWSd-veEOGW.
  1 logical volume(s) in volume group "ssd" now active
[root@vm102 ~]# dmsetup  status ssd-r
0 524288 raid raid1 2 AA 524288/524288 idle 0
[root@vm102 ~]# dmsetup  table ssd-r
0 524288 raid raid1 3 0 region_size 1024 2 253:2 253:3 - -
[root@vm102 ~]# fsck -fn /dev/ssd/r
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/ssd-r: 11/65536 files (0.0% non-contiguous), 18535/262144 blocks
[root@vm102 ~]# lvchange --refresh ssd/r
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  Couldn't find device with uuid KC4elc-LlBl-jCVC-eBF9-lrZc-LWSd-veEOGW.
  Refusing refresh of partial LV ssd/r. Use '--activationmode partial' to override.
[root@vm102 ~]# for d in /dev/sd*;do echo running > /sys/block/`basename $d`/device/state;done
[root@vm102 ~]# lvchange --refresh ssd/r
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
[root@vm102 ~]# dmsetup  table ssd-r
0 524288 raid raid1 3 0 region_size 1024 2 253:2 253:3 - -
[root@vm102 ~]# lvchange --refresh ssd/r
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
[root@vm102 ~]# dmsetup  table ssd-r
0 524288 raid raid1 3 0 region_size 1024 2 253:2 253:3 253:5 253:7

Comment 3 Heinz Mauelshagen 2016-11-30 14:27:11 UTC
So the workaround is two "lvchange --refresh ..." runs after the PV holding the
components has become accessible again, as mentioned before.
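
For illustration, a minimal sketch of that workaround, assuming the VG/LV names from the original report (vg/raid1) and /dev/sdb as the transiently failed PV:

# make lvm see the returned device again (needed when lvmetad is in use)
pvscan --cache /dev/sdb
# first refresh: reloads the SubLV tables, but the raid1 table still ends up degraded
lvchange --refresh vg/raid1
# second refresh: the linear mappings are live now, so the failed leg is reattached
lvchange --refresh vg/raid1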

Analyzing "lvchange -vvvv --refresh ..." shows the linear tables for *_r{meta|rimage]_1" are being reloaded together with the raid1 table
"0 524288 raid raid1 3 0 region_size 1024 2 253:2 253:3 253:5 253:7"
are being processed in the _first_ refresh, still the table output is
"0 524288 raid raid1 3 0 region_size 1024 2 253:2 253:3 - -"
until after the _second_ refresh run.

Excerpt from "lvchange -vvvv --refresh ..." of first run:
#libdm-deptree.c:2732     Loading ssd-r table (253:8)
#libdm-deptree.c:2676         Adding target to (253:8): 0 524288 raid raid1 3 0 region_size 1024 2 253:2 253:3 253:5 253:7
#ioctl/libdm-iface.c:1832         dm table   (253:8) OF   [16384] (*1)
#ioctl/libdm-iface.c:1832         dm reload   (253:8) NF   [16384] (*1)


Jon,
can you confirm this behaviour?

Comment 4 Heinz Mauelshagen 2016-11-30 14:30:00 UTC
Created attachment 1226348 [details]
First "lvchange -vvvv --refresh ... " run output

Comment 5 Heinz Mauelshagen 2016-11-30 14:30:45 UTC
Created attachment 1226350 [details]
Second "lvchange -vvvv --refresh ..." run output

Comment 6 Heinz Mauelshagen 2016-11-30 17:09:37 UTC
Transient device failures aren't supported by lvm yet.

In the first refresh run, the "*_r{meta|image}_1" SubLVs still contain mappings to error targets (the *_missing_* devices), thus causing the dm-raid constructor to fail on reading any metadata. This is because the linear mappings are loaded into the inactive slots of the respective mapped devices but are being resumed _after_ the load of the raid target (they'd need to be resumed prior to the raid1 target load).

In the second refresh run, the linear mappings are active so the raid constructor succeeds reading the RAID superblock.
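
What happens in the first run, expressed as a simplified dmsetup sequence (device names and minor numbers follow the excerpt in comment 3; the linear table arguments are made up for illustration, and this is not the exact ioctl sequence lvm issues):

# refresh run 1: the repaired linear maps only reach the INACTIVE table slots
dmsetup load ssd-r_rmeta_1  --table "0 8192 linear 8:16 2048"
dmsetup load ssd-r_rimage_1 --table "0 524288 linear 8:16 10240"
# the raid1 table is loaded while the live tables behind 253:5/253:7 are still
# the error targets, so the dm-raid constructor cannot read the RAID superblock
dmsetup load ssd-r --table "0 524288 raid raid1 3 0 region_size 1024 2 253:2 253:3 253:5 253:7"
# the linear maps only go live here - too late for the raid table load above
dmsetup resume ssd-r_rmeta_1
dmsetup resume ssd-r_rimage_1
dmsetup resume ssd-r
# refresh run 2: the linear maps are already live, so the same raid1 load succeeds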

Discussed the implications seen in the traces with Zdenek, who has ideas on how to improve this situation generically (I'll let him describe what he has in mind).

So the workaround is the double "lvchange --refresh ..." until we have enhancements to handle transient device failures generically.

Comment 7 Heinz Mauelshagen 2016-11-30 22:01:28 UTC
Upstream commit 0b8bf73a63d8 avoids the need for two "lvchange --refresh ..." runs until we have enhanced lvm to handle transient failures better.

Comment 8 Zdenek Kabelac 2016-12-01 09:23:10 UTC
The commit from Comment 7 is only a temporary hack which doesn't work with the locking mechanism we currently use, as only the TOP level has a lock and any attempt to suspend & resume remotely active subLVs will do nothing.

The trouble we have here is that our locking/activation code is 'free' to change and replace missing segments of an LV without having these changes stored in the lvm2 metadata.

So whenever the next command runs, it has no idea the table content differs from the lvm2 metadata state - at the moment there is no such deep revalidation of a device being done.

But this is still not such a big issue when we consider that lvm2 has no plan yet to support transient device failures.

State of now:

When lvm2 detects a missing PV, such a PV should be marked as MISSING_PV, and its 'reattachment' back to the VG happens only when the user requests it via 'vgextend --restoremissing'.
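
For reference, the supported recovery path described here would look roughly like this (the VG/PV names are taken from the original report and are assumptions for the example):

# re-add the previously missing PV to the VG once the device is back
vgextend --restoremissing vg /dev/sdb1
# then refresh the RaidLV so the repaired leg is picked up again
lvchange --refresh vg/raid1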

So we really cannot claim we support 'transient' disk failures - there needs to be some plan for it.

We may even need to dedicate a PV as a raid leg for this; then, however, we are not much different from direct 'mdadm' usage.

Comment 9 Heinz Mauelshagen 2016-12-05 16:11:09 UTC
(In reply to Zdenek Kabelac from comment #8)
> The commit from Comment 7 is only a temporary hack which doesn't work with
> the locking mechanism we currently use, as only the TOP level has a lock and
> any attempt to suspend & resume remotely active subLVs will do nothing.

The patch is aiming at convenience until we come up with a concept to deal with sane preload/resume sequencing. The cluster case where a user requests a refresh on one node while the RaidLV is exclusively activated on another is rather rare.

<SNIP>

Comment 10 Heinz Mauelshagen 2016-12-12 21:11:08 UTC
Posted upstream commit 87117c2b2546 in addition to 0b8bf73a63d8 to cope with remotely active, clustered RaidLVs.

Comment 14 Heinz Mauelshagen 2016-12-23 02:51:46 UTC
Upstream commit 95d68f1d0e16
(and kernel patch
"[dm-devel][PATCH] dm raid: fix transient device failure processing").

Comment 16 Corey Marthaler 2017-06-27 23:27:59 UTC
Marking verified with the latest rpms.

3.10.0-688.el7.x86_64
lvm2-2.02.171-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
lvm2-libs-2.02.171-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
lvm2-cluster-2.02.171-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-libs-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-event-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-event-libs-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 17 errata-xmlrpc 2017-08-01 21:49:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222

