RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1089170 - LVM cannot repair mirror when lvmetad is running if the drives gives I/O errors
Summary: LVM cannot repair mirror when lvmetad is running if the drives gives I/O errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: rc
: ---
Assignee: Petr Rockai
QA Contact: Cluster QE
URL:
Whiteboard:
: 1083285 (view as bug list)
Depends On: 1085553
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-04-18 07:36 UTC by Nenad Peric
Modified: 2021-09-03 12:38 UTC (History)
8 users (show)

Fixed In Version: lvm2-2.02.115-1.el7
Doc Type: Bug Fix
Doc Text:
A problem in dmeventd monitoring plugin for mirror volumes prevented partially failed mirrors from being automatically repaired in cases where the failed device continued to be available to the system but gave IO errors. The code was changed to correctly handle present but broken devices while lvmetad is used.
Clone Of:
Environment:
Last Closed: 2015-03-05 13:08:14 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Verbose logs of mirror repair attempt with lvmetad running. (59.91 KB, text/plain)
2014-04-18 09:57 UTC, Nenad Peric
no flags Details
Logs from using failure injection (lvconvert not looping, but failing to repair) (68.46 KB, application/x-gzip)
2014-04-18 12:21 UTC, Peter Rajnoha
no flags Details
Log of lvconvert with lvmetad (real HW) (57.68 KB, text/plain)
2014-04-18 12:44 UTC, Nenad Peric
no flags Details
Log of lvconvert without lvmetad (real HW) (1.26 MB, text/plain)
2014-04-18 13:08 UTC, Nenad Peric
no flags Details
Reproducer with logs using failure injection (lvconvert looping and failing to repair) (38.64 KB, application/x-gzip)
2014-04-22 08:21 UTC, Peter Rajnoha
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:0513 0 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2015-03-05 16:14:41 UTC

Description Nenad Peric 2014-04-18 07:36:55 UTC
Description of problem:

If a drive fails in such a way that it is still present in the system, but gives out I/O errors, LVM with mirror LVs running lvmetad cannot recover from it. 
Since no event in udev is issued, lvmetad is not aware of the failure.
The repair of such a failed mirror did not complete even by running:

lvconvert --repair (called both from dmeventd and on cmd line)

Version-Release number of selected component (if applicable):

lvm2-2.02.105-14.el7.x86_64
device-mapper-1.02.84-14.el7.x86_64


How reproducible:

100%

Steps to Reproduce:

Get a device to fail with I/O errors, but still being visible in the system. 
(unmap iscsi drive assigned to the client on the storage server for example)


Actual results:

Mirror gets I/O errors, lvmetad is not aware of the failure. 
If some write is attempted it will hang and deadlock, since the repair will never really start or complete (will be stuck at 99.8%) 

Expected results:

The same behavior as without lvmetad - the repair should complete based on policies chosen in lvm.conf. 

Additional info:

Comment 1 Peter Rajnoha 2014-04-18 07:51:42 UTC
Similar thing can be observed with kernel's fault injection feature (echo 1 > /sys/block/.../make_it_fail). However, in this case, the lvconvert --repair does not hang as in the case with real HW Nenad described above, but lvconvert just finished wihtout actually repairing the failed mirror (it does finish with the mirror repaired in case lvmetad is not used). I'll post some logs shortly...

Comment 3 Nenad Peric 2014-04-18 09:57:42 UTC
Created attachment 887498 [details]
Verbose logs of mirror repair attempt with lvmetad running.

Attaching a file with verbose logs of lvconvert -vvvv --repair with lvmetad running.

Comment 4 Peter Rajnoha 2014-04-18 12:21:46 UTC
Created attachment 887554 [details]
Logs from using failure injection (lvconvert not looping, but failing to repair)

Also, ataching logs from alternative way of hitting the problem with lvmetad+lvconvert --repair on failed mirror leg. The leg shot down by failure injection: echo 1 > /sys/block/sda/make_it_fail. Contains logs from both lvconvert --repair with lvmetad (which fails to repair the mirror) and lvconvert --repari without using lvmetad which does the reapir correctly.

There's one difference though form what's seen on real HW failure - in case of simulating the problem via failure injection, the lvconvert does not loop, but just reports "successful finish" despite the mirror is not repaired. Anyway, this can also help in identifying the problem with mirror repair + lvmetad.

Comment 5 Nenad Peric 2014-04-18 12:44:45 UTC
Created attachment 887567 [details]
Log of lvconvert with lvmetad (real HW)

To avoid other influences such aas attempts at auto-repair, I have disabled dmeventd and monitoring of LVM.
This debug log is the first repair attempt, done manually on command line.
It loops in the end as mentioned by Peter.

Comment 6 Nenad Peric 2014-04-18 13:08:22 UTC
Created attachment 887568 [details]
Log of lvconvert without lvmetad (real HW)

Debug output of lvconvert without lvmetad which finishes successfully.

Comment 7 Peter Rajnoha 2014-04-22 08:21:57 UTC
Created attachment 888382 [details]
Reproducer with logs using failure injection (lvconvert looping and failing to repair)

(In reply to Peter Rajnoha from comment #4)
> There's one difference though form what's seen on real HW failure - in case
> of simulating the problem via failure injection, the lvconvert does not
> loop, but just reports "successful finish" despite the mirror is not
> repaired. Anyway, this can also help in identifying the problem with mirror
> repair + lvmetad.

And now I've managed to make the lvconvert to loop infinitely when using lvmetad and when injecting the mirror failure. Attaching complete logs with reproducer (both for working case without lvmetad and non-working case with lvmetad). The lvconvert looped at 93.8% in this test and never finished the repair.

Comment 8 Peter Rajnoha 2014-04-22 11:46:40 UTC
When pvscan --cache is run just after the mirror failure and before anything tries to write to the mirror, the subsequent lvconvert --repair works correctly then. So it seems that lvconvert is just doing wrong decision based on outdated information.

Comment 9 Jonathan Earl Brassow 2014-04-22 19:22:10 UTC
This is a duplicate of bug 1085553, but i'd rather allow this bug to stay open and dependent on 1085553 since it is not immediately obvious.

Comment 10 RHEL Program Management 2014-04-30 05:48:11 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 11 Petr Rockai 2014-05-05 11:17:02 UTC
This is pretty weird actually. We have a test which covers a very similar scenario, and even when I explicitly disable lvmetad notifications, lvconvert --repair still works OK (cf. lvconvert-repair-transient.sh). I'll try to find the crucial difference and make a test for the failing case also. Might be related to *which* legs are failing...

Comment 12 Marian Csontos 2014-05-06 15:59:53 UTC
*** Bug 1083285 has been marked as a duplicate of this bug. ***

Comment 13 Marian Csontos 2014-05-06 16:22:52 UTC
Petr, it is possible the recovery is doing nothing? I have seen few cases where mirror with core log was "repaired" and reported as success by dmeventd but the missing PV was left as one of the legs as lvconvert somehow did not see any missing devices.

I suggest moving code re-enabling the devices only after `vgreduce --removemissing`.
And moving wait before vgreduce to check the mirror was actually repaired.

From another instance of the duplicate Bug 1083285:

    Apr  1 08:46:47 node01 lvm[4441]: Log device 253:5 has failed (D).
    Apr  1 08:46:47 node01 lvm[4441]: Device failure in revolution_9-mirror_1.
    Apr  1 08:46:47 node01 lvm[4441]: /dev/sdh1: read failed after 0 of 512 at 34348662784: Input/output error
    Apr  1 08:46:47 node01 lvm[4441]: /dev/sdh1: read failed after 0 of 512 at 34348728320: Input/output error
    Apr  1 08:46:47 node01 lvm[4441]: /dev/sdh1: read failed after 0 of 512 at 0: Input/output error
    Apr  1 08:46:47 node01 lvm[4441]: /dev/sdh1: read failed after 0 of 512 at 4096: Input/output error
    Apr  1 08:46:47 node01 lvm[4441]: intermediate VG write failed.

                                      ^ seems this is the common signature of the failures

    Apr  1 08:46:47 node01 lvm[4441]: Mirror log status: 1 of 1 images failed.
    Apr  1 08:46:47 node01 lvm[4441]: Trying to up-convert to 3 images, 1 logs.
    Apr  1 08:46:47 node01 lvm[4441]: Unable to create LV mirror_1_mlog in Volume Group revolution_9: name already in use.
    Apr  1 08:46:47 node01 lvm[4441]: Failed to create mirror log. 
    Apr  1 08:46:47 node01 lvm[4441]: Trying to up-convert to 2 images, 1 logs.
    Apr  1 08:46:47 node01 lvm[4441]: intermediate VG write failed.                                                  
    Apr  1 08:46:47 node01 lvm[4441]: Trying to up-convert to 2 images, 0 logs.
    Apr  1 08:46:47 node01 lvm[4441]: WARNING: Failed to replace 1 of 3 images in volume mirror_1
    Apr  1 08:46:47 node01 lvm[4441]: WARNING: Failed to replace 1 of 1 logs in volume mirror_1
    Apr  1 08:46:47 node01 lvm[4441]: Repair of mirrored device revolution_9-mirror_1 finished successfully.

                                      ^ I have not noticed this before but it is possible the lvconvert is returning success while failing

Comment 14 Petr Rockai 2014-05-07 10:44:09 UTC
Yes, this seems to be the culprit:

#device/dev-io.c:106     /dev/sdb1: read failed after 0 of 1024 at 4096: Input/output error
#device/dev-io.c:249         <backtrace>
#format_text/format-text.c:330         <backtrace>
#format_text/format-text.c:597         <backtrace>
#metadata/metadata.c:2668         <backtrace>
#metadata/mirror.c:1041   intermediate VG write failed.
#metadata/mirror.c:1145         <backtrace>

This clearly hints at a problem in recognizing that the VG is partial and that partial write errors should be accepted in vg_write (as long as *some* MDAs write OK).

However, I can't reproduce this so far. I have a very similar test, complete with IO errors, the only difference I can think of is that my PVs are DM error targets instead of broken iSCSI devices.

(...) So it turns out the problem in my non-reproducibility is this:
#/home/mornfall/dev/lvm2/git-upstream/lib/filters/filter-persistent.c:292         @TESTDIR@/dev/mapper/@PREFIX@pv4: Skipping unusable device

When I comment out device_is_usable in the persistent filter, I get:

#/home/mornfall/dev/lvm2/git-upstream/lib/metadata/mirror.c:1041   intermediate VG write failed.

I.e. device_is_usable is being so smart that it fools our tests. Anyway, I can fix this now that I can reproduce the problem.

PS: The recovery in our upstream test certainly works, the mirror goes from ADAD+A to AA+A just fine (same for DDAA+A -> AA+A), both when the device goes away completely and when the device is replaced by an error device.

Comment 15 Petr Rockai 2014-06-08 19:20:09 UTC
Should work upstream now, although we don't have a regression test at the moment, due to abovementioned issues. I have tested this manually (with device_is_usable disabled) and it seems to work OK. I'll try to find a non-invasive way to make this more testable. The associated upstream commit is dba6dec661c6a0193d9d713c6fd0b9b008c0f26a.

Comment 17 Nenad Peric 2015-01-16 18:24:50 UTC
I am still having issues with this one. Even though the errors are not completely the same, I am not getting the device replaced and mirror repaired. 
It is just stuck at 0%
Will try to get more info...

Comment 18 Nenad Peric 2015-01-20 16:05:19 UTC
Ok this is now fixed in the newest build:

Jan 20 16:55:50 tardis-03 lvm[2085]: /dev/sdb1: read failed after 0 of 2048 at 0: Input/output error
Jan 20 16:55:50 tardis-03 lvm[2085]: No PV label found on /dev/sdb1.
Jan 20 16:55:50 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:55:58 tardis-03 lvm[2085]: Mirror status: 1 of 2 images failed.
Jan 20 16:55:58 tardis-03 lvm[2085]: Trying to up-convert to 2 images, 1 logs.
Jan 20 16:55:59 tardis-03 lvm[2085]: Monitoring mirror device vg-mirror_lv for events.
Jan 20 16:55:59 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:55:59 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 0.0%
Jan 20 16:56:14 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:56:14 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 21.7%
Jan 20 16:56:29 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:56:29 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 34.2%
Jan 20 16:56:44 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:56:44 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 43.1%
Jan 20 16:56:59 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:56:59 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 51.7%
Jan 20 16:57:14 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:57:14 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 60.4%
Jan 20 16:57:29 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:57:29 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 69.4%
Jan 20 16:57:44 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:57:44 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 78.1%
Jan 20 16:57:59 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:57:59 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 86.6%
Jan 20 16:58:14 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:58:14 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 95.6%
Jan 20 16:58:29 tardis-03 lvm[2085]: WARNING: Device for PV YvhIyh-heiq-ugUa-JJkQ-zmG9-ds0c-46wfVY not found or rejected by a filter.
Jan 20 16:58:29 tardis-03 lvm[2085]: vg/mirror_lv: Converted: 100.0%
Jan 20 16:58:44 tardis-03 lvm[2085]: Repair of mirrored device vg-mirror_lv finished successfully.
Jan 20 16:58:44 tardis-03 lvm[2085]: vg-mirror_lv is now in-sync.
Jan 20 16:58:45 tardis-03 lvm[2085]: No longer monitoring mirror device vg-mirror_lv for events.


Although there is a spam of "WARNING" messages...

Marking VERIFIED with:

3.10.0-223.el7.x86_64

lvm2-2.02.114-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015
lvm2-libs-2.02.114-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015
lvm2-cluster-2.02.114-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015
device-mapper-1.02.92-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015
device-mapper-libs-1.02.92-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015
device-mapper-event-1.02.92-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015
device-mapper-event-libs-1.02.92-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015
device-mapper-persistent-data-0.4.1-2.el7    BUILT: Wed Nov 12 19:39:46 CET 2014
cmirror-2.02.114-6.el7    BUILT: Tue Jan 20 14:49:01 CET 2015

Comment 20 errata-xmlrpc 2015-03-05 13:08:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0513.html


Note You need to log in before you can comment on or make changes to this bug.