Bug 1446754
| Summary: | all primary raid1 failures whether in sync or not now require user intervention | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Heinz Mauelshagen <heinzm> |
| lvm2 sub component: | Mirroring and RAID | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | agk, heinzm, jbrassow, mcsontos, msnitzer, mthacker, prajnoha, prockai, zkabelac |
| Version: | 7.4 | | |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | lvm2-2.02.171-4.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-01 21:52:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1311765 | | |
| Bug Blocks: | | | |
|
Description
Corey Marthaler
2017-04-28 19:17:03 UTC
This will have to be fixed. The key part of the summary is "whether in sync or not". The rationale for not allowing a user to automatically repair a primary leg while the array is still syncing is clear, but repair should definitely be allowed if the RAID1 is in sync at the time the failure happens.

```
[root@bp-01 ~]# devices
  WARNING: Not using lvmetad because a repair command was run.
  /dev/sdb1: read failed after 0 of 4096 at 898387345408: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 898387402752: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke.
  WARNING: Couldn't find all devices for LV vg/raid1_rimage_0 while checking used and assumed devices.
  WARNING: Couldn't find all devices for LV vg/raid1_rmeta_0 while checking used and assumed devices.
  LV               Attr       Cpy%Sync Devices
  home             -wi-ao----          /dev/sda2(2016)
  root             -wi-ao----          /dev/sda2(106177)
  swap             -wi-ao----          /dev/sda2(0)
  raid1            rwi-a-r-p- 100.00   raid1_rimage_0(0),raid1_rimage_1(0)
  [raid1_rimage_0] Iwi-aor-p-          [unknown](1)
  [raid1_rimage_1] iwi-aor---          /dev/sdc1(1)
  [raid1_rmeta_0]  ewi-aor-p-          [unknown](0)
  [raid1_rmeta_1]  ewi-aor---          /dev/sdc1(0)

[root@bp-01 ~]# lvconvert --repair vg/raid1
  WARNING: Disabling lvmetad cache for repair command.
  WARNING: Not using lvmetad because of repair.
  /dev/sdb1: read failed after 0 of 4096 at 898387345408: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 898387402752: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid dmVM0n-K1JI-wJ71-7Jto-o8r3-5IK4-QlsDke.
  WARNING: Couldn't find all devices for LV vg/raid1_rimage_0 while checking used and assumed devices.
  WARNING: Couldn't find all devices for LV vg/raid1_rmeta_0 while checking used and assumed devices.
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
  Unable to extract primary RAID image while RAID array is not in-sync (use --force option to replace).
  Failed to remove the specified images from vg/raid1.
  Failed to replace faulty devices in vg/raid1.
```

Note that the array reports 100.00 Cpy%Sync above, yet the repair is still refused as "not in-sync".

I've got a patch in testing that should fix things up; I should probably have it by tomorrow.

Patches committed upstream:
```
commit 88e649628863e78b101c584c513053fc9461c24d
Author: Jonathan Brassow <jbrassow>
Date:   Tue Jun 6 10:43:12 2017 -0500

    lvconvert: linear -> raid1 upconvert should cause "recover" not "resync"

commit acaf3a5d47fd65b2e385a516544f8e6ec8d89b2d
Author: Jonathan Brassow <jbrassow>
Date:   Tue Jun 6 10:43:49 2017 -0500

    lvconvert: Don't require a 'force' option during RAID repair.
```
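For reference when reading the `lvs` output above: the ninth character of the attribute string (the `p` in `rwi-a-r-p-`) is the "partial" flag, set when one or more devices the LV uses are missing. A minimal shell sketch of checking that flag from a script; the attribute string is copied from the failure output above, and the field position/meaning is per lvs(8):

```shell
# Attribute string for the degraded raid1 LV, taken from the lvs output above.
attr="rwi-a-r-p-"

# Per lvs(8), character 9 of lv_attr is the (p)artial flag: set when one
# or more of the devices this LV uses is missing from the system.
flag=$(printf '%s' "$attr" | cut -c9)

if [ "$flag" = "p" ]; then
    echo "LV is partial: underlying device(s) missing"
fi
```

In practice you would feed this from `lvs --noheadings -o lv_attr vg/raid1` rather than a literal string.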
Marking verified with the latest rpms.

```
# already synced
lvm[28754]: Faulty devices in black_bird/synced_primary_raid1_2legs_1 successfully replaced.

# not yet in sync (but also, not a linear upconvert)
lvm[28754]: Faulty devices in black_bird/non_synced_primary_raid1_2legs_1 successfully replaced.
```

LVM can again successfully (and automatically) repair failed raid1 primary devices while the raid fault policy is set to allocate. That said, the caveat in bug 1446780 still exists, leaving the user confused about how many times the device(s) were actually repaired, and whether the array is believed to be in sync during the repair process.

```
3.10.0-688.el7.x86_64

lvm2-2.02.171-7.el7                              BUILT: Thu Jun 22 08:35:15 CDT 2017
lvm2-libs-2.02.171-7.el7                         BUILT: Thu Jun 22 08:35:15 CDT 2017
lvm2-cluster-2.02.171-7.el7                      BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-1.02.140-7.el7                     BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-libs-1.02.140-7.el7                BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-event-1.02.140-7.el7               BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-event-libs-1.02.140-7.el7          BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7  BUILT: Mon Mar 27 10:15:46 CDT 2017
```

Since the problem described in this bug report should be resolved by a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222
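For context, the automatic repair path exercised in this verification is driven by dmeventd and the `raid_fault_policy` setting in lvm.conf. A sketch of the relevant fragment; the option names are from lvm.conf(5), but the comments are paraphrased, so check the man page for the authoritative wording:

```
activation {
    # "warn"     - log the device failure and wait for the administrator
    #              to run "lvconvert --repair" manually
    # "allocate" - have dmeventd automatically replace the failed leg
    #              using free extents elsewhere in the volume group
    raid_fault_policy = "allocate"
}
```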