Bug 1083285
| Summary: | [lvmetad] lost mirror leg not replaced when running lvmetad | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Marian Csontos <mcsontos> |
| Component: | lvm2 | Assignee: | LVM and device-mapper development team <lvm-team> |
| lvm2 sub component: | Default / Unclassified | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | agk, coughlan, heinzm, jbrassow, jharriga, mcsontos, msnitzer, prajnoha, prockai, zkabelac |
| Version: | 7.0 | ||
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-05-06 15:59:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Workaround: run mirrors without lvmetad.

Peter, could you assess the risks of not fixing this for GA?

(In reply to Marian Csontos from comment #1)
> Workaround: run mirrors without lvmetad.
>
> Peter, could you assess the risks of not fixing this for GA?

Well, lvmetad is used by default in RHEL7, and even though these are "old" mirrors (not RAID mirrors), we should still fix this I think! So I'd say this is a blocker. Now, can we get more debug info for this - dmeventd logs, lvmetad logs, the pvscan --cache logs... My suspicion is that dmeventd still sees old info, while lvmetad is updated later (but that's just a quick guess at the moment).

Reproducible in RHEL7, *working* in rawhide and upstream lvm2 version. Trying to narrow it down...

(In reply to Peter Rajnoha from comment #6)
> Reproducible in RHEL7, *working* in rawhide and upstream lvm2 version.

Nope, sorry. It's also working in RHEL7. I just had the default "remove" policy set for image faults - when using the "allocate" policy, it also works for me in RHEL7. So it must be some timing issue. Marian, can you please try again and gather the logs?

Failed to reproduce so far.

I can't reproduce this either. From the original report, it seems you had some leftover devices from a possibly failing repair earlier?

Apr 1 08:17:18 node lvm[4439]: Unable to create LV nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already in use.

I'm also somewhat confused why the kernel says "raid1" even though this seems to be old-style mirrors (might be an entirely unrelated artefact of course)?
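[Editor's note: one way to capture the logs requested above could look like the sketch below. This is an illustration, not a procedure recorded in this bug; the lvmetad `-l` log levels, the systemd unit names, and the file paths are assumptions about the RHEL7 lvm2 build.]

```sh
# Stop the system lvmetad and run it in the foreground with verbose logging
systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
lvmetad -f -l all &> /tmp/lvmetad.log &

# Re-run the failing scan with maximum verbosity
pvscan --cache -vvvv /dev/sdc1 &> /tmp/pvscan-cache.log

# dmeventd messages go to syslog; pull them out afterwards
grep -E 'lvm\[|dmeventd' /var/log/messages > /tmp/dmeventd.log
```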
(In reply to Petr Rockai from comment #13)
> I can't reproduce this either. From the original report, it seems you had
> some leftover devices from possibly failing repair earlier?
>
> Apr 1 08:17:18 node lvm[4439]: Unable to create LV
> nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already
> in use.

Nope. This is from the current run: the mirror was created successfully, and this error was produced only when repairing the failed leg.

> I'm also somewhat confused why kernel says "raid1" even though this seems to
> be old-style mirrors (might be an entirely unrelated artefact of course)?

Yes, it is confusing to see raid1 instead of mirror, even though mirror is in fact a RAID1 implementation.

Still not seen again. The fact that it is not reproducible suggests there is a timing issue in the code; a few speculative explanations include:

- lvmetad serving inconsistent metadata at some point (e.g. because of locking issues - see the next point)
- IIRC *zkabelac* has found some issues with locking in the mirror code - Zdenku, am I right?
- the mlog LV device is (is it?) removed (deactivated) during recovery, and when up-converting the mirror the links were not cleaned up yet.

Just seen another instance of the issue. I have found that one of the machines running the tests got stuck on output to the affected mirror.
So this is what the testsuite does:

```sh
# starts a task writing to the LV
# mirror with out-of-sync secondary leg on /dev/sdc1
echo offline > /sys/block/sdc/device/state
pvscan --cache /dev/sdc1
```
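[Editor's note: for completeness, the out-of-sync-mirror precondition could be set up along these lines. This is a sketch based on the device and LV names from this report, not the testsuite's actual code; `--nosync` is what leaves the secondary leg unsynchronized.]

```sh
pvcreate /dev/sda1 /dev/sdc1 /dev/sde1
vgcreate helter_skelter /dev/sda1 /dev/sdc1 /dev/sde1

# -m 1 adds one mirror image; --nosync skips the initial resync so the
# secondary leg on /dev/sdc1 stays out of sync; the last PV holds the log
lvcreate --type mirror -m 1 --nosync -L 1G \
    -n nonsyncd_secondary_2legs_1 helter_skelter /dev/sde1 /dev/sdc1 /dev/sda1
```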
This time the `lvconvert --repair` was clearly started (and managed to fail with the same error) before the `pvscan --cache` was even executed, so lvconvert IS reading outdated metadata.
I am not at all sure this can happen in the real world, where the disk fails or is pulled out (or is deleted by `echo 1 > /sys/block/sdc/device/delete`).
Perhaps the best way to check that is to put a sleep in the udev rule running the pvscan...
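[Editor's note: the delay could be injected roughly like this. This is a sketch: it assumes the stock rule file is `69-dm-lvm-metad.rules` and that it triggers the scan through a RUN key; the exact rule text differs between lvm2 versions.]

```sh
# Copy the stock rule so the edit overrides the packaged one
cp /usr/lib/udev/rules.d/69-dm-lvm-metad.rules /etc/udev/rules.d/

# In the copy, wrap the pvscan invocation in a shell with a sleep, e.g.:
# RUN+="/bin/sh -c 'sleep 5; /usr/sbin/lvm pvscan --cache --major $major --minor $minor'"

udevadm control --reload
```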
So running just the `echo offline` without `pvscan --cache`, while there is output going to the mirror LV, is enough to trigger the issue. Adding a sleep to the udev rule before calling `pvscan --cache` does not trigger the issue.

Section 26.17, "Controlling the SCSI Command Timer and Device Status"[1], of the Storage Administration Guide says:

> The Linux SCSI layer sets a timer on each command. When this timer expires,
> the SCSI layer will quiesce the host bus adapter (HBA) and wait for all
> outstanding commands to either time out or complete. Afterwards, the SCSI
> layer will activate the driver's error handler.
>
> When the error handler is triggered, it attempts the following operations
> in order (until one successfully executes):
>
> Abort the command.
> Reset the device.
> Reset the bus.
> Reset the host.
>
> If all of these operations fail, the device will be set to the offline state.

It seems this may happen on disk failure, and I wonder whether `echo offline` should be enough to trigger full recovery, including a `pvscan --cache` of the affected device? The same applies to RHEL6[2] and 5[3].

[1]: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/7-Beta/html/Storage_Administration_Guide/scsi-command-timer-device-status.html
[2]: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/scsi-command-timer-device-status.html
[3]: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/task_controlling-scsi-command-timer-onlining-devices.html

*** This bug has been marked as a duplicate of bug 1089170 ***
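[Editor's note: the narrowed-down trigger from the comments above, condensed into a single runnable snippet as a sketch; the LV and device names are the ones from this report.]

```sh
# Keep I/O flowing to the mirror LV in the background
dd if=/dev/zero of=/dev/helter_skelter/nonsyncd_secondary_2legs_1 bs=1M oflag=direct &

# Offline the disk backing the out-of-sync secondary leg; do NOT run
# `pvscan --cache` - dmeventd's repair then races with lvmetad's stale view
echo offline > /sys/block/sdc/device/state
```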
Description of problem:
Failure of an (out-of-sync) secondary mirror leg is not recovered. The recovery procedure attempts to replace the log, which is sane.

Apr 1 08:17:18 node lvm[4439]: intermediate VG write failed.
Apr 1 08:17:18 node lvm[4439]: Mirror status: 1 of 2 images failed.
Apr 1 08:17:18 node lvm[4439]: Trying to up-convert to 2 images, 1 logs.
Apr 1 08:17:18 node lvm[4439]: Unable to create LV nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already in use.
Apr 1 08:17:18 node lvm[4439]: Failed to create mirror log.

lvs -ao+devices:

```
(14:42:19) [root@cimrmanc5-node02:~]$ lvs -ao+devices
  PV 1eZHM6-QgMk-ixnB-j2PI-dA4j-hLoi-Jtwoyp not recognised. Is the device missing?
  PV 1eZHM6-QgMk-ixnB-j2PI-dA4j-hLoi-Jtwoyp not recognised. Is the device missing?
  LV                                    VG             Attr       LSize  Pool Origin Data% Move Log                             Cpy%Sync Convert Devices
  nonsyncd_secondary_2legs_1            helter_skelter mwi-aom-p-  1.00g                        nonsyncd_secondary_2legs_1_mlog 14.06            nonsyncd_secondary_2legs_1_mimage_0(0),nonsyncd_secondary_2legs_1_mimage_1(0)
  [nonsyncd_secondary_2legs_1_mimage_0] helter_skelter Iwi-aom---  1.00g                                                                         /dev/sde1(0)
  [nonsyncd_secondary_2legs_1_mimage_1] helter_skelter Iwi-aom-p-  1.00g                                                                         unknown device(0)
  [nonsyncd_secondary_2legs_1_mlog]     helter_skelter lwi-aom---  4.00m                                                                         /dev/sda1(0)
  root                                  rhel           -wi-ao---- 10.31g                                                                         /dev/vda2(307)
  swap                                  rhel           -wi-ao----  1.20g                                                                         /dev/vda2(0)
```

Later I see a lot of these:

Apr 1 08:17:18 cimrmanc5-node02 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.461451] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.462402] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.463238] device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.464342] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.465086] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 cimrmanc5-node02 kernel: sd 2:0:0:5: rejecting I/O to offline device

Relevant settings from lvm.conf:

```
mirror_log_fault_policy = "allocate"
mirror_image_fault_policy = "allocate"
```

Version-Release number of selected component (if applicable):
lvm2-2.02.105-14.el7.x86_64

How reproducible:
??? (Will try with a clean machine)

Steps to Reproduce:

```sh
# mirror with out-of-sync secondary leg on /dev/sdc1
echo offline > /sys/block/sdc/device/state
pvscan --cache /dev/sdc1
```

Actual results:
Leg is not replaced even though there are enough free PVs.

Expected results:
Leg should be replaced.

Additional info:
Apr 1 08:17:17 node qarshd[9484]: Running cmdline: echo offline > /sys/block/sdc/device/state
Apr 1 08:17:17 node systemd: Starting qarsh Per-Connection Server...
Apr 1 08:17:17 node systemd: Started qarsh Per-Connection Server.
Apr 1 08:17:17 node qarshd[9487]: Talking to peer ::ffff:192.168.109.254:60303 (IPv6)
Apr 1 08:17:17 node qarshd[9487]: Running cmdline: pvscan --cache /dev/sdc1
Apr 1 08:17:18 node kernel: [ 1650.427743] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.942262] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.942973] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.943691] device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: [ 1650.944763] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.945432] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node lvm[4439]: Secondary mirror device 253:4 sync failed.
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node lvm[4439]: Secondary mirror device 253:4 flush failed.
Apr 1 08:17:18 node lvm[4439]: Device failure in helter_skelter-nonsyncd_secondary_2legs_1.
Apr 1 08:17:18 node kernel: [ 1651.048412] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.049261] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.050035] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.050716] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 34348662784: Input/output error
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 34348728320: Input/output error
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 0: Input/output error
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 4096: Input/output error
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.087889] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.090766] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node lvm[4439]: intermediate VG write failed.
Apr 1 08:17:18 node lvm[4439]: Mirror status: 1 of 2 images failed.
Apr 1 08:17:18 node lvm[4439]: Trying to up-convert to 2 images, 1 logs.
Apr 1 08:17:18 node lvm[4439]: Unable to create LV nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already in use.
Apr 1 08:17:18 node lvm[4439]: Failed to create mirror log.
Apr 1 08:17:18 node lvm[4439]: Trying to up-convert to 2 images, 0 logs.
Apr 1 08:17:18 node lvm[4439]: WARNING: Failed to replace 1 of 1 logs in volume nonsyncd_secondary_2legs_1
Apr 1 08:17:18 node lvm[4439]: helter_skelter/nonsyncd_secondary_2legs_1: Converted: 14.8%
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.115460] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.116015] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.116993] device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: [ 1651.117935] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.118646] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: [ 1651.127704] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.128014] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: end_request: I/O error, dev sdc, sector 0