Bug 903249

Summary: LVM RAID: 'lvs' does not always report the proper status of a RAID LV
Product: Red Hat Enterprise Linux 6
Reporter: Jonathan Earl Brassow <jbrassow>
Component: lvm2
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.4
CC: agk, cmarthal, dwysocha, heinzm, jbrassow, lnovich, msnitzer, nperic, prajnoha, prockai, slevine, thornber, zkabelac
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: lvm2-2.02.100-1.el6
Doc Type: Bug Fix
Doc Text:
Previously, when a device failed temporarily, the kernel noticed the interruption and regarded the device as failed; the kernel must be explicitly notified before it regards the device as alive again. LVM, however, could still see the device, so 'lvs' reported it as operating normally (i.e. without the 'partial' attribute) even though the kernel still regarded it as failed. The user had to use 'dmsetup' to find out the true state of the device. Now 'lvs' prints a 'p' (partial) attribute if a device is missing, and an 'r' (refresh/replace) attribute if the device is present but the kernel still regards it as missing. Upon seeing an 'r' attribute on a RAID logical volume, the user can decide whether the array should be refreshed (reloaded into the kernel using 'lvchange --refresh') or the device should be replaced.
Story Points: ---
Clone Of:
: 987094
Environment:
Last Closed: 2013-11-21 23:19:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 912336, 987094

Description Jonathan Earl Brassow 2013-01-23 15:00:04 UTC
When reporting on the 'partial' status of a RAID LV, lvs only relies on whether it can read the device in the array or not.  Therefore, it can miss the following scenario:
1) Create RAID LV and wait for it to sync
2) kill a device and do a little I/O so the kernel recognises the failed device
3) turn the device back on

At this point, 'lvs' will think all the devices are fine and not report anything wrong with the LV.  However, the kernel still considers the device 'dead' until it is reloaded.  This can be seen from 'dmsetup status <VG>-<LV>'.

'lvs' should query the kernel for the status information and determine the health of the devices so it can properly report the status of the array in this scenario.  We may want to use a character other than (p)artial to indicate that the device might be alive again but the array needs to be restarted.
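
A rough reproduction sketch, assuming a scratch VG named 'vg' with a RAID1 LV whose second image sits on /dev/sdc; the device name and the sysfs fault-injection method are illustrative, not taken from this report:

    lvcreate --type raid1 -m 1 -L 1G -n raid_lv vg
    lvs -a -o name,attr,copy_percent vg              # wait until Cpy%Sync reaches 100.00
    echo offline > /sys/block/sdc/device/state       # simulate a transient device failure
    dd if=/dev/zero of=/dev/vg/raid_lv bs=4K count=1 oflag=direct   # I/O so the kernel notices the failure
    echo running > /sys/block/sdc/device/state       # bring the device back
    lvs -a vg                                        # reports nothing wrong with the LV
    dmsetup status vg-raid_lv                        # but the kernel still shows the leg as 'D'ead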

Comment 2 RHEL Program Management 2013-01-27 06:47:56 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 Jonathan Earl Brassow 2013-02-01 17:44:26 UTC
It is not just the (p)artial flag that is misreported; the character that indicates whether a particular image in a RAID LV is in-sync can also be improved.  Here is an excerpt from a recent commit:

    The other case where 'lvs' gives incomplete or improper output is when a
    device is replaced or added to a RAID LV.  It should display that the RAID
    LV is in the process of sync'ing and that the new device is the only one
    that is not-in-sync - as indicated by a leading 'I' in the Attr column.
    (Remember that 'i' indicates an (i)mage that is in-sync and 'I' indicates
    an (I)mage that is not in sync.)  Here's an example of the old incorrect
    behaviour:
    [root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
      LV            VG   Attr      Cpy%Sync Devices
      lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
      [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
      [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
      [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
      [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
    [root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_p
      LV            VG   Attr      Cpy%Sync Devices
      lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rim
      [lv_rimage_0] vg   Iwi-aor--          /dev/sda1(1)
      [lv_rimage_1] vg   Iwi-aor--          /dev/sdb1(1)
      [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
      [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
      [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
      [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)                        
    ** Note that only the last device that has been added should be marked 'I'.
    
    Here is an example of the correct output after this patch is applied:
    [root@bp-01 lvm2]# lvs -a -o name,vg_name,attr,copy_percent,devices vg
      LV            VG   Attr      Cpy%Sync Devices
      lv            vg   rwi-a-r--   100.00 lv_rimage_0(0),lv_rimage_1(0)
      [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
      [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
      [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
      [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
    [root@bp-01 lvm2]# lvconvert -m +1 vg/lv; lvs -a -o name,vg_name,attr,copy_p
      LV            VG   Attr      Cpy%Sync Devices
      lv            vg   rwi-a-r--     0.00 lv_rimage_0(0),lv_rimage_1(0),lv_rim
      [lv_rimage_0] vg   iwi-aor--          /dev/sda1(1)
      [lv_rimage_1] vg   iwi-aor--          /dev/sdb1(1)
      [lv_rimage_2] vg   Iwi-aor--          /dev/sdc1(1)
      [lv_rmeta_0]  vg   ewi-aor--          /dev/sda1(0)
      [lv_rmeta_1]  vg   ewi-aor--          /dev/sdb1(0)
      [lv_rmeta_2]  vg   ewi-aor--          /dev/sdc1(0)
    ** Note only the last image is marked with an 'I'.  This is correct and we can
       tell that it isn't the whole array that is sync'ing, but just the new
       device.

Comment 4 Jonathan Earl Brassow 2013-02-01 18:00:55 UTC
These following 3 commits are in LVM upstream version 2.02.99:

PATCH3:
commit 801d4f96a8a2333361d7292d9c79ffdb5a96fac3
Author: Jonathan Brassow <jbrassow>
Date:   Fri Feb 1 11:33:54 2013 -0600

PATCH2:
commit 37ffe6a13ad56122abdc808c13af9eeb1adf6731
Author: Jonathan Brassow <jbrassow>
Date:   Fri Feb 1 11:32:18 2013 -0600

PATCH1:
commit c8242e5cf4895f13e16b598b387c876c6fab7180
Author: Jonathan Brassow <jbrassow>
Date:   Fri Feb 1 11:31:47 2013 -0600

Comment 7 Jonathan Earl Brassow 2013-05-06 20:16:18 UTC
The following commit improves the 'lvs' output even further:

PATCH4 (see other 3 in comment 4):
commit ff64e3500f6acf93dce017388445c4828111d06f
Author: Jonathan Brassow <jbrassow>
Date:   Thu Apr 11 15:33:59 2013 -0500

    RAID:  Add scrubbing support for RAID LVs
    
    New options to 'lvchange' allow users to scrub their RAID LVs.
    Synopsis:
        lvchange --syncaction {check|repair} vg/raid_lv
    
    RAID scrubbing is the process of reading all the data and parity blocks in
    an array and checking to see whether they are coherent.  'lvchange' can
    now initiate the two scrubbing operations: "check" and "repair".  "check"
    will go over the array and record the number of discrepancies but not
    repair them.  "repair" will correct the discrepancies as it finds them.
    
    'lvchange --syncaction repair vg/raid_lv' is not to be confused with
    'lvconvert --repair vg/raid_lv'.  The former initiates a background
    synchronization operation on the array, while the latter is designed to
    repair/replace failed devices in a mirror or RAID logical volume.
    
    Additional reporting has been added for 'lvs' to support the new
    operations.  Two new printable fields (which are not printed by
    default) have been added: "syncaction" and "mismatches".  These
    can be accessed using the '-o' option to 'lvs', like:
        lvs -o +syncaction,mismatches vg/lv
    "syncaction" will print the current synchronization operation that the
    RAID volume is performing.  It can be one of the following:
            - idle:   All sync operations complete (doing nothing)
            - resync: Initializing an array or recovering after a machine failure
            - recover: Replacing a device in the array
            - check: Looking for array inconsistencies
            - repair: Looking for and repairing inconsistencies
    The "mismatches" field will print the number of discrepancies found during
    a check or repair operation.
    
    The 'Cpy%Sync' field already available to 'lvs' will print the progress
    of any of the above syncactions, including check and repair.
    
    Finally, the lv_attr field has changed to accommodate the scrubbing operations
    as well.  The role of the 'p'artial character in the lv_attr report field
    has been expanded.  "Partial" is really an indicator of the health of a
    logical volume, and it makes sense to extend this to include other health
    indicators as well, specifically:
            'm'ismatches:  Indicates that there are discrepancies in a RAID
                           LV.  This character is shown after a scrubbing
                           operation has detected that portions of the RAID
                           are not coherent.
            'r'efresh   :  Indicates that a device in a RAID array has suffered
                           a failure and the kernel regards it as failed -
                           even though LVM can read the device label and
                           considers the device to be ok.  The LV should be
                           'r'efreshed to notify the kernel that the device is
                           now available, or the device should be 'r'eplaced
                           if it is suspected of failing.
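
For illustration, a minimal usage sketch of the two scrubbing operations and the unrelated device-repair command (the 'vg/raid_lv' name is a placeholder):

    lvchange --syncaction check vg/raid_lv     # scrub: count discrepancies but do not correct them
    lvchange --syncaction repair vg/raid_lv    # scrub: correct discrepancies as they are found
    lvconvert --repair vg/raid_lv              # not a scrub: repairs/replaces failed devices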

Comment 14 Nenad Peric 2013-10-03 09:15:08 UTC
The printable fields syncaction and mismatches are not present in lvs:

lvs -o +mismatches vg/raid10
Unrecognised field: mismatches

lvs -o +syncaction vg/raid10
Unrecognised field: syncaction

Tested with version: lvm2-2.02.100-3.el6.x86_64

According to Comment 7, these fields should be present.
Could you please clarify whether they were perhaps not included for some reason and should not be tested for?

Comment 15 Alasdair Kergon 2013-10-03 09:29:20 UTC
Try lvs -o help:


    copy_percent           - For RAID, mirrors and pvmove, current percentage in-sync.
    sync_percent           - For RAID, mirrors and pvmove, current percentage in-sync.
    raid_mismatch_count    - For RAID, number of mismatches found or repaired.
    raid_sync_action       - For RAID, the current synchronization action being performed.
    raid_write_behind      - For RAID1, the number of outstanding writes allowed to writemostly devices.
    raid_min_recovery_rate - For RAID1, the minimum recovery I/O load in kiB/sec/disk.
    raid_max_recovery_rate - For RAID1, the maximum recovery I/O load in kiB/sec/disk.
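
For example, the new reporting fields can be queried on the LV from Comment 14 like this:

lvs -o +raid_sync_action,raid_mismatch_count vg/raid10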

Comment 16 Nenad Peric 2013-10-03 11:29:18 UTC
Ok cool, so this was changed compared to the instructions/remarks in Comment 7. 
Thanks for the pointer.

Comment 17 Nenad Peric 2013-10-03 12:46:18 UTC
Tested lvs output with raid1, raid4, raid5, raid6 and raid10. 
The 'I' flag only shows on a device which was replaced (as expected).
Here's the output from the raid5 test:

[root@virt-013 yum.repos.d]# lvs -a -o name,vg_name,attr,raid_sync_action,lv_size,copy_percent,devices
  /dev/sdc1: read failed after 0 of 1024 at 10733879296: Input/output error
  /dev/sdc1: read failed after 0 of 1024 at 10733948928: Input/output error
  /dev/sdc1: read failed after 0 of 1024 at 0: Input/output error
  /dev/sdc1: read failed after 0 of 1024 at 4096: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid 6w287X-9pmZ-hZGL-rQW3-pUY5-6O7j-zcdPrf.
  LV               VG         Attr       SyncAction LSize   Cpy%Sync Devices                                                                
  raid5            vg         rwi-a-r-p- idle         2.00g   100.00 raid5_rimage_0(0),raid5_rimage_1(0),raid5_rimage_2(0),raid5_rimage_3(0)
  [raid5_rimage_0] vg         iwi-aor---            684.00m          /dev/sda1(1)                                                           
  [raid5_rimage_1] vg         iwi-aor---            684.00m          /dev/sdb1(1)                                                           
  [raid5_rimage_2] vg         iwi-aor-p-            684.00m          unknown device(1)                                                      
  [raid5_rimage_3] vg         iwi-aor---            684.00m          /dev/sdd1(1)                                                           
  [raid5_rmeta_0]  vg         ewi-aor---              4.00m          /dev/sda1(0)                                                           
  [raid5_rmeta_1]  vg         ewi-aor---              4.00m          /dev/sdb1(0)                                                           
  [raid5_rmeta_2]  vg         ewi-aor-p-              4.00m          unknown device(0)                                                      
  [raid5_rmeta_3]  vg         ewi-aor---              4.00m          /dev/sdd1(0)                                                           
  lv_root          vg_virt013 -wi-ao----              6.71g          /dev/vda2(0)                                                           
  lv_swap          vg_virt013 -wi-ao----            816.00m          /dev/vda2(1718)                                                        
[root@virt-013 yum.repos.d]# dd if=/dev/urandom of=/dev/vg/raid5 bs=512 count=10240



[root@virt-013 yum.repos.d]# lvs -a -o name,vg_name,attr,raid_sync_action,lv_size,copy_percent,devices
  Couldn't find device with uuid 6w287X-9pmZ-hZGL-rQW3-pUY5-6O7j-zcdPrf.
  LV               VG         Attr       SyncAction LSize   Cpy%Sync Devices                                                                
  raid5            vg         rwi-a-r--- recover      2.00g    60.62 raid5_rimage_0(0),raid5_rimage_1(0),raid5_rimage_2(0),raid5_rimage_3(0)
  [raid5_rimage_0] vg         iwi-aor---            684.00m          /dev/sda1(1)                                                           
  [raid5_rimage_1] vg         iwi-aor---            684.00m          /dev/sdb1(1)                                                           
  [raid5_rimage_2] vg         Iwi-aor---            684.00m          /dev/sde1(1)                                                           
  [raid5_rimage_3] vg         iwi-aor---            684.00m          /dev/sdd1(1)                                                           
  [raid5_rmeta_0]  vg         ewi-aor---              4.00m          /dev/sda1(0)                                                           
  [raid5_rmeta_1]  vg         ewi-aor---              4.00m          /dev/sdb1(0)                                                           
  [raid5_rmeta_2]  vg         ewi-aor---              4.00m          /dev/sde1(0)                                                           
  [raid5_rmeta_3]  vg         ewi-aor---              4.00m          /dev/sdd1(0)                              

Only the added device /dev/sde1 was shown as syncing (I).

Marking verified with: lvm2-2.02.100-4.el6.x86_64

Comment 18 Nenad Peric 2013-10-03 13:03:01 UTC
Adding one more test just showing the behavior during removal and restoration of the same device:

Removed /dev/sdd1

  LV               VG         Attr       SyncAction LSize   Cpy%Sync Devices                                                                
  raid5            vg         rwi-a-r-p- idle         2.00g   100.00 raid5_rimage_0(0),raid5_rimage_1(0),raid5_rimage_2(0),raid5_rimage_3(0)
  [raid5_rimage_0] vg         iwi-aor---            684.00m          /dev/sda1(1)                                                           
  [raid5_rimage_1] vg         iwi-aor---            684.00m          /dev/sdb1(1)                                                           
  [raid5_rimage_2] vg         iwi-aor---            684.00m          /dev/sde1(1)                                                           
  [raid5_rimage_3] vg         iwi-aor-p-            684.00m          unknown device(1)                                                      
  [raid5_rmeta_0]  vg         ewi-aor---              4.00m          /dev/sda1(0)                                                           
  [raid5_rmeta_1]  vg         ewi-aor---              4.00m          /dev/sdb1(0)                                                           
  [raid5_rmeta_2]  vg         ewi-aor---              4.00m          /dev/sde1(0)                                                           
  [raid5_rmeta_3]  vg         ewi-aor-p-              4.00m          unknown device(0)                                                      
                       

Wrote some data, and brought the device back:

[root@virt-013 yum.repos.d]# lvs -a -o name,vg_name,attr,raid_sync_action,lv_size,copy_percent,devices
  Couldn't find device with uuid 6w287X-9pmZ-hZGL-rQW3-pUY5-6O7j-zcdPrf.
  LV               VG         Attr       SyncAction LSize   Cpy%Sync Devices                                                                
  raid5            vg         rwi-a-r-r- idle         2.00g   100.00 raid5_rimage_0(0),raid5_rimage_1(0),raid5_rimage_2(0),raid5_rimage_3(0)
  [raid5_rimage_0] vg         iwi-aor---            684.00m          /dev/sda1(1)                                                           
  [raid5_rimage_1] vg         iwi-aor---            684.00m          /dev/sdb1(1)                                                           
  [raid5_rimage_2] vg         iwi-aor---            684.00m          /dev/sde1(1)                                                           
  [raid5_rimage_3] vg         iwi-aor-r-            684.00m          /dev/sdd1(1)                                                           
  [raid5_rmeta_0]  vg         ewi-aor---              4.00m          /dev/sda1(0)                                                           
  [raid5_rmeta_1]  vg         ewi-aor---              4.00m          /dev/sdb1(0)                                                           
  [raid5_rmeta_2]  vg         ewi-aor---              4.00m          /dev/sde1(0)                                                           
  [raid5_rmeta_3]  vg         ewi-aor-r-              4.00m          /dev/sdd1(0)                 

The LV has an 'r' in the ninth attribute position, indicating that a refresh is needed.
dmsetup still sees the device as dead:

vg-raid5: 0 4202496 raid raid5_ls 4 AAAD 1400832/1400832 idle 0
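
For reference, the fields of that dm-raid status line break down roughly as follows (per the dm-raid target's status format; the annotation is not part of the tool output):

  0 4202496        - segment start and length, in sectors
  raid             - device-mapper target name
  raid5_ls         - RAID level and layout (raid5, left-symmetric)
  4                - number of devices in the array
  AAAD             - per-device health: 'A' = alive/in-sync, 'D' = dead/failed
  1400832/1400832  - in-sync sectors / total sectors
  idle             - current sync action
  0                - mismatch count

The 'D' in the fourth position is the leg backed by the removed /dev/sdd1.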

[root@virt-013 yum.repos.d]# lvchange --refresh vg/raid5
[root@virt-013 yum.repos.d]# lvs -a -o name,vg_name,attr,raid_sync_action,raid_mismatch_count,copy_percent,devices
  LV               VG         Attr       SyncAction Mismatches Cpy%Sync Devices                                                                
  raid5            vg         rwi-a-r--- idle                0   100.00 raid5_rimage_0(0),raid5_rimage_1(0),raid5_rimage_2(0),raid5_rimage_3(0)
  [raid5_rimage_0] vg         iwi-aor---                                /dev/sda1(1)                                                           
  [raid5_rimage_1] vg         iwi-aor---                                /dev/sdb1(1)                                                           
  [raid5_rimage_2] vg         iwi-aor---                                /dev/sde1(1)                                                           
  [raid5_rimage_3] vg         iwi-aor---                                /dev/sdd1(1)                                                           
  [raid5_rmeta_0]  vg         ewi-aor---                                /dev/sda1(0)                                                           
  [raid5_rmeta_1]  vg         ewi-aor---                                /dev/sdb1(0)                                                           
  [raid5_rmeta_2]  vg         ewi-aor---                                /dev/sde1(0)                                                           
  [raid5_rmeta_3]  vg         ewi-aor---                                /dev/sdd1(0)                                                           
  lv_root          vg_virt013 -wi-ao----                                /dev/vda2(0)                                                           
  lv_swap          vg_virt013 -wi-ao----                                /dev/vda2(1718)                                                        
[root@virt-013 yum.repos.d]# lvs -a -o name,vg_name,attr,raid_sync_action,raid_mismatch_count,copy_percent,devices
  LV               VG         Attr       SyncAction Mismatches Cpy%Sync Devices                                                                
  raid5            vg         rwi-a-r--- idle                0   100.00 raid5_rimage_0(0),raid5_rimage_1(0),raid5_rimage_2(0),raid5_rimage_3(0)
  [raid5_rimage_0] vg         iwi-aor---                                /dev/sda1(1)                                                           
  [raid5_rimage_1] vg         iwi-aor---                                /dev/sdb1(1)                                                           
  [raid5_rimage_2] vg         iwi-aor---                                /dev/sde1(1)                                                           
  [raid5_rimage_3] vg         iwi-aor---                                /dev/sdd1(1)                                                           
  [raid5_rmeta_0]  vg         ewi-aor---                                /dev/sda1(0)                                                           
  [raid5_rmeta_1]  vg         ewi-aor---                                /dev/sdb1(0)                                                           
  [raid5_rmeta_2]  vg         ewi-aor---                                /dev/sde1(0)                                                           
  [raid5_rmeta_3]  vg         ewi-aor---                                /dev/sdd1(0)

Comment 19 errata-xmlrpc 2013-11-21 23:19:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1704.html