Bug 985976
Summary: LVM RAID: Add ability to perform RAID scrubbing operations

Product: Red Hat Enterprise Linux 6
Reporter: Jonathan Earl Brassow <jbrassow>
Component: lvm2
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.5
CC: agk, cmarthal, dwysocha, heinzm, jbrassow, msnitzer, prajnoha, prockai, slevine, thornber, zkabelac
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: lvm2-2.02.100-1.el6
Doc Type: Bug Fix
Doc Text:
RAID logical volumes that are created via LVM are now capable of performing scrubbing operations. Scrubbing operations are user-initiated checks to ensure that the RAID volume is consistent. For example, a scrubbing "check" operation on a RAID1 logical volume would determine if there are any sectors in the mirror set that are not the same.
There are two scrubbing operations that can be performed: "check" and "repair". The "check" operation will examine the logical volume for any discrepancies, but will not correct them. The "repair" operation will correct any discrepancies found.
Once a "check" operation is performed, the user can tell if any mismatches were found by examining the 'lv_attr' field in the output of an 'lvs' command. The user can also find out the number of discrepancies found by including the 'raid_mismatch_count' field in the 'lvs' output. Here are a couple of examples:
# To perform a "check" on a RAID logical volume, do:
~> lvchange --syncaction check vg/lv
# To perform a "repair" on a RAID logical volume, do:
~> lvchange --syncaction repair vg/lv
# To determine the mismatch count after a "check", do:
~> lvs -o +raid_mismatch_count vg/lv
Story Points: ---
Clone Of:
Clones: 986443 (view as bug list)
Environment:
Last Closed: 2013-11-21 23:25:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 985920
Bug Blocks: 986443, 986445
Description
Jonathan Earl Brassow
2013-07-18 16:04:15 UTC
Testing procedure:

1) Testing proper 'lvs' output:

'lvs' has two new reportable fields (not printed by default): "syncaction" and "mismatches". It must be possible to report these fields and have them be correct. (This can be done while testing scrubbing functionality - i.e. #2 below.)

Ex1> lvs -o syncaction --noheadings vg/lv

The answer depends on the current operation and can be:
- idle: all sync operations complete (doing nothing)
- resync: initializing an array or recovering after a machine failure
- recover: replacing a device in the array
- check: looking for array inconsistencies
- repair: looking for and repairing inconsistencies

An attempt should be made to validate that these states are printed correctly. During an initial sync, it should read 'resync'. When finished, it should read 'idle'. When replacing a device in the array, it should read 'recover'. And so on.

Ex2> lvs -o mismatches,attr --noheadings vg/lv

This should be '0' at all times unless a "check" has been performed and discrepancies have been found in the array. The last character in the attribute field should also read 'm' if there are mismatches after a "check" has been run. (Note that in the case of a device failure or a transient device failure, the 'p'artial and 'r'eplace/'r'efresh flags take precedence over the 'm'ismatches flag.)

2) Testing correctness of scrubbing operations:

There are currently two scrubbing operations that can be performed: "check" and "repair".
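The "wait for sync" polling used throughout the procedure can be wrapped in a small helper. This is a hedged sketch, not part of the original procedure: the LVS_CMD indirection is an assumption added here only so the loop can be exercised with a stub instead of real RAID logical volumes.

```shell
#!/bin/sh
# LVS_CMD defaults to the real 'lvs' but can be overridden with a stub
# so the polling logic can be tested without actual RAID LVs.
LVS_CMD=${LVS_CMD:-lvs}

current_syncaction() {
    # Print the LV's syncaction field with whitespace stripped,
    # e.g. idle, resync, recover, check, or repair.
    $LVS_CMD -o syncaction --noheadings "$1" | tr -d ' '
}

wait_for_sync() {
    # Poll until the sync operation finishes (syncaction returns to idle).
    while [ "$(current_syncaction "$1")" != "idle" ]; do
        sleep 2
    done
}
```

A run would then look like `lvchange --syncaction check vg/lv && wait_for_sync vg/lv` before checking the mismatch count.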
To test them, do the following (or similar):

for all RAID types {
  for all devices in the array {
    Create RAID array
    Wait for sync
      - 'lvs -o syncaction' should be "resync"
      - 'lvs -o sync_percent' should grow to 100%
    Perform "check" (lvchange --syncaction check vg/lv)
    Wait for sync
      - 'lvs -o syncaction' should be "check"
      - 'lvs -o sync_percent' should grow to 100%
    'lvs -o mismatches' should be 0
    Deactivate RAID array
    Write crap to $device (inner for loop)
      - if writing to the PV directly, be sure to skip over the LVM label, etc.
      - lvm2/test/shell/lvchange-raid.sh has an example of how to do this
    Activate RAID array
    Perform "check" (lvchange --syncaction check vg/lv)
    Wait for sync
      - 'lvs -o syncaction' should be "check"
      - 'lvs -o sync_percent' should grow to 100%
    'lvs -o mismatches' should be NON-ZERO
    Perform "repair" (lvchange --syncaction repair vg/lv)
    Wait for sync
      - 'lvs -o syncaction' should be "repair"
      - 'lvs -o sync_percent' should grow to 100%
    'lvs -o mismatches' should be 0
    Perform "check" (lvchange --syncaction check vg/lv)
    Wait for sync
      - 'lvs -o syncaction' should be "check"
      - 'lvs -o sync_percent' should grow to 100%
    'lvs -o mismatches' should be 0
  }
}

3) Other sanity checks:
- You must not be able to start a "check"/"repair" while another sync operation is happening.
- If throttling is available, it should function on "check" and "repair". (https://bugzilla.redhat.com/show_bug.cgi?id=969171#c3 contains the RHEL6 test suggestions for throttling.)

In addition to the commit in comment 0, there is a fix for a segfault that must also go in:

commit 4eea66019157abd992c8802564b675fd97420c01
Author: Jonathan Brassow <jbrassow>
Date:   Fri Jul 19 10:01:48 2013 -0500

    RAID: Fix segfault when reporting raid_syncaction field on older kernel

    The status printed for dm-raid targets on older kernels does not include
    the syncaction field. This is handled by dev_manager_raid_status() just
    fine by populating the raid status structure with NULL for that field.
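The mismatch checks and the "write crap to $device" step in the loop above can be sketched as shell helpers. This is a hedged sketch: the LVS_CMD indirection and the 1MiB skip offset in corrupt_pv are assumptions made here for illustration; lvm2/test/shell/lvchange-raid.sh remains the authoritative example for corrupting a PV safely.

```shell
#!/bin/sh
# LVS_CMD can be pointed at a stub so the parsing is testable without
# real RAID logical volumes (an assumption for this sketch).
LVS_CMD=${LVS_CMD:-lvs}

mismatch_count() {
    # Print the number of discrepancies found by the last "check".
    $LVS_CMD -o raid_mismatch_count --noheadings "$1" | tr -d ' '
}

has_mismatch_flag() {
    # An 'm' in the last position of lv_attr marks mismatches
    # found after a "check" has been run.
    attr=$($LVS_CMD -o lv_attr --noheadings "$1" | tr -d ' ')
    case $attr in
        *m) return 0 ;;
        *)  return 1 ;;
    esac
}

corrupt_pv() {
    # Write random data onto the device while skipping the first 1MiB,
    # so the LVM label and metadata area are left intact. The offset is
    # an assumption for this sketch.
    dd if=/dev/urandom of="$1" bs=1M seek=1 count=8 conv=notrunc
}
```

After corrupting a leg, deactivating/activating and running a "check", `mismatch_count vg/lv` should be non-zero and `has_mismatch_flag vg/lv` should succeed.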
However, lv_raid_sync_action() does not properly handle that field being NULL. So, check for it and return 0 if it is NULL.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1704.html