Bug 1083285
| Summary: | [lvmetad] lost mirror leg not replaced when running lvmetad | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Marian Csontos <mcsontos> |
| Component: | lvm2 | Assignee: | LVM and device-mapper development team <lvm-team> |
| lvm2 sub component: | Default / Unclassified | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | agk, coughlan, heinzm, jbrassow, jharriga, mcsontos, msnitzer, prajnoha, prockai, zkabelac |
| Version: | 7.0 | ||
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-05-06 15:59:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Workaround: run mirrors without lvmetad.

Peter, could you assess the risks of not fixing this for GA?

(In reply to Marian Csontos from comment #1)
> Workaround: run mirrors without lvmetad.
>
> Peter, could you assess the risks of not fixing this for GA?

Well, lvmetad is used by default in RHEL7, and even though these are "old" mirrors (not RAID mirrors), we should still fix this I think! So I'd say this is a blocker. Now, can we get more debug info for this - dmeventd logs, lvmetad logs, the pvscan --cache logs... My suspicion is that dmeventd still sees old info, while lvmetad is updated later (but that's just a quick guess at the moment).

Reproducible in RHEL7, *working* in rawhide and upstream lvm2 version. Trying to narrow it down...

(In reply to Peter Rajnoha from comment #6)
> Reproducible in RHEL7, *working* in rawhide and upstream lvm2 version.

Nope, sorry. It's also working in RHEL7. I just had the default "remove" policy set for image faults - when using the "allocate" policy, it also works for me in RHEL7. So it must be some timing issue. Marian, can you please try again and gather the logs?

Failed to reproduce so far.

I can't reproduce this either. From the original report, it seems you had some leftover devices from a possibly failing repair earlier?

Apr 1 08:17:18 node lvm[4439]: Unable to create LV nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already in use.

I'm also somewhat confused why the kernel says "raid1" even though this seems to be old-style mirrors (might be an entirely unrelated artefact of course)?
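[Editor's note: one way to capture the logs requested above could look like the sketch below. This is an illustration, not a procedure recorded in this bug; the lvmetad `-l` log levels, the systemd unit names, and the file paths are assumptions about the RHEL7 lvm2 build.]

```sh
# Stop the system lvmetad and run it in the foreground with verbose logging
systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
lvmetad -f -l all &> /tmp/lvmetad.log &

# Re-run the failing scan with maximum verbosity
pvscan --cache -vvvv /dev/sdc1 &> /tmp/pvscan-cache.log

# dmeventd messages go to syslog; pull them out afterwards
grep -E 'lvm\[|dmeventd' /var/log/messages > /tmp/dmeventd.log
```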
(In reply to Petr Rockai from comment #13)
> I can't reproduce this either. From the original report, it seems you had
> some leftover devices from possibly failing repair earlier?
>
> Apr 1 08:17:18 node lvm[4439]: Unable to create LV
> nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already
> in use.

Nope. This is from the current run: the mirror was created successfully, and this error was produced only when repairing the failed leg.

> I'm also somewhat confused why kernel says "raid1" even though this seems to
> be old-style mirrors (might be an entirely unrelated artefact of course)?

Yes, it is confusing to see raid1 instead of mirror, even though mirror is in fact a RAID1 implementation.

Still not seen again. The fact that it is not reproducible suggests there is a timing issue in the code; a few speculative explanations include:

- lvmetad serving inconsistent metadata at some point (e.g. because of locking issues - see the next point)
- IIRC *zkabelac* has found some issues with locking in the mirror code - Zdenku, am I right?
- the mlog LV device is (is it?) removed (deactivated) during recovery, and when up-converting the mirror the links were not cleaned up yet.

Just seen another instance of the issue. I have found that one of the machines running the tests got stuck on output to the affected mirror.
So this is what the testsuite does:

```sh
# starts a task writing to the LV
# mirror with out-of-sync secondary leg on /dev/sdc1
echo offline > /sys/block/sdc/device/state
pvscan --cache /dev/sdc1
```
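[Editor's note: for completeness, the out-of-sync-mirror precondition could be set up along these lines. This is a sketch based on the device and LV names from this report, not the testsuite's actual code; `--nosync` is what leaves the secondary leg unsynchronized.]

```sh
pvcreate /dev/sda1 /dev/sdc1 /dev/sde1
vgcreate helter_skelter /dev/sda1 /dev/sdc1 /dev/sde1

# -m 1 adds one mirror image; --nosync skips the initial resync so the
# secondary leg on /dev/sdc1 stays out of sync; the last PV holds the log
lvcreate --type mirror -m 1 --nosync -L 1G \
    -n nonsyncd_secondary_2legs_1 helter_skelter /dev/sde1 /dev/sdc1 /dev/sda1
```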
This time the `lvconvert --repair` was clearly started (and managed to fail with the same error) before the `pvscan --cache` was even executed, so lvconvert IS reading outdated metadata.
I am not at all sure this can happen in the real world, where the disk fails or is pulled out (or is deleted by `echo 1 > /sys/block/sdc/device/delete`).
Perhaps the best way to check that is to put a sleep in the udev rule running the pvscan...
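[Editor's note: the delay could be injected roughly like this. This is a sketch: it assumes the stock rule file is `69-dm-lvm-metad.rules` and that it triggers the scan through a RUN key; the exact rule text differs between lvm2 versions.]

```sh
# Copy the stock rule so the edit overrides the packaged one
cp /usr/lib/udev/rules.d/69-dm-lvm-metad.rules /etc/udev/rules.d/

# In the copy, wrap the pvscan invocation in a shell with a sleep, e.g.:
# RUN+="/bin/sh -c 'sleep 5; /usr/sbin/lvm pvscan --cache --major $major --minor $minor'"

udevadm control --reload
```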
So running just the `echo offline` without `pvscan --cache`, while there is output going to the mirror LV, is enough to trigger the issue. Adding a sleep to the udev rule before calling `pvscan --cache` does not trigger the issue.

Section 26.17, "Controlling the SCSI Command Timer and Device Status"[1], of the Storage Administration Guide says:

> The Linux SCSI layer sets a timer on each command. When this timer expires,
> the SCSI layer will quiesce the host bus adapter (HBA) and wait for all
> outstanding commands to either time out or complete. Afterwards, the SCSI
> layer will activate the driver's error handler.
>
> When the error handler is triggered, it attempts the following operations
> in order (until one successfully executes):
>
> Abort the command.
> Reset the device.
> Reset the bus.
> Reset the host.
>
> If all of these operations fail, the device will be set to the offline state.

It seems this may happen on disk failure, and I wonder whether `echo offline` should be enough to trigger full recovery, including a `pvscan --cache` of the affected device? The same applies to RHEL6[2] and 5[3].

[1]: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/7-Beta/html/Storage_Administration_Guide/scsi-command-timer-device-status.html
[2]: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/scsi-command-timer-device-status.html
[3]: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/task_controlling-scsi-command-timer-onlining-devices.html

*** This bug has been marked as a duplicate of bug 1089170 ***
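[Editor's note: the narrowed-down trigger from the comments above, condensed into a single runnable snippet as a sketch; the LV and device names are the ones from this report.]

```sh
# Keep I/O flowing to the mirror LV in the background
dd if=/dev/zero of=/dev/helter_skelter/nonsyncd_secondary_2legs_1 bs=1M oflag=direct &

# Offline the disk backing the out-of-sync secondary leg; do NOT run
# `pvscan --cache` - dmeventd's repair then races with lvmetad's stale view
echo offline > /sys/block/sdc/device/state
```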
Description of problem:
Failure of an (out-of-sync) secondary mirror leg is not recovered. The recovery procedure attempts to replace the log, which is sane.

Apr 1 08:17:18 node lvm[4439]: intermediate VG write failed.
Apr 1 08:17:18 node lvm[4439]: Mirror status: 1 of 2 images failed.
Apr 1 08:17:18 node lvm[4439]: Trying to up-convert to 2 images, 1 logs.
Apr 1 08:17:18 node lvm[4439]: Unable to create LV nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already in use.
Apr 1 08:17:18 node lvm[4439]: Failed to create mirror log.

lvs -ao+devices:

```
(14:42:19) [root@cimrmanc5-node02:~]$ lvs -ao+devices
  PV 1eZHM6-QgMk-ixnB-j2PI-dA4j-hLoi-Jtwoyp not recognised. Is the device missing?
  PV 1eZHM6-QgMk-ixnB-j2PI-dA4j-hLoi-Jtwoyp not recognised. Is the device missing?
  LV                                    VG             Attr       LSize  Pool Origin Data% Move Log                             Cpy%Sync Convert Devices
  nonsyncd_secondary_2legs_1            helter_skelter mwi-aom-p-  1.00g                        nonsyncd_secondary_2legs_1_mlog 14.06            nonsyncd_secondary_2legs_1_mimage_0(0),nonsyncd_secondary_2legs_1_mimage_1(0)
  [nonsyncd_secondary_2legs_1_mimage_0] helter_skelter Iwi-aom---  1.00g                                                                         /dev/sde1(0)
  [nonsyncd_secondary_2legs_1_mimage_1] helter_skelter Iwi-aom-p-  1.00g                                                                         unknown device(0)
  [nonsyncd_secondary_2legs_1_mlog]     helter_skelter lwi-aom---  4.00m                                                                         /dev/sda1(0)
  root                                  rhel           -wi-ao---- 10.31g                                                                         /dev/vda2(307)
  swap                                  rhel           -wi-ao----  1.20g                                                                         /dev/vda2(0)
```

Later I see a lot of these:

Apr 1 08:17:18 cimrmanc5-node02 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.461451] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.462402] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.463238] device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.464342] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 cimrmanc5-node02 kernel: [ 1651.465086] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 cimrmanc5-node02 kernel: sd 2:0:0:5: rejecting I/O to offline device

Relevant settings from lvm.conf:

```
mirror_log_fault_policy = "allocate"
mirror_image_fault_policy = "allocate"
```

Version-Release number of selected component (if applicable):
lvm2-2.02.105-14.el7.x86_64

How reproducible:
??? (Will try with a clean machine)

Steps to Reproduce:

```sh
# mirror with out-of-sync secondary leg on /dev/sdc1
echo offline > /sys/block/sdc/device/state
pvscan --cache /dev/sdc1
```

Actual results:
Leg is not replaced even though there are enough free PVs.

Expected results:
Leg should be replaced.

Additional info:
Apr 1 08:17:17 node qarshd[9484]: Running cmdline: echo offline > /sys/block/sdc/device/state
Apr 1 08:17:17 node systemd: Starting qarsh Per-Connection Server...
Apr 1 08:17:17 node systemd: Started qarsh Per-Connection Server.
Apr 1 08:17:17 node qarshd[9487]: Talking to peer ::ffff:192.168.109.254:60303 (IPv6)
Apr 1 08:17:17 node qarshd[9487]: Running cmdline: pvscan --cache /dev/sdc1
Apr 1 08:17:18 node kernel: [ 1650.427743] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.942262] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.942973] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.943691] device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: [ 1650.944763] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1650.945432] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node lvm[4439]: Secondary mirror device 253:4 sync failed.
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node lvm[4439]: Secondary mirror device 253:4 flush failed.
Apr 1 08:17:18 node lvm[4439]: Device failure in helter_skelter-nonsyncd_secondary_2legs_1.
Apr 1 08:17:18 node kernel: [ 1651.048412] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.049261] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.050035] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.050716] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 34348662784: Input/output error
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 34348728320: Input/output error
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 0: Input/output error
Apr 1 08:17:18 node lvm[4439]: /dev/sdc1: read failed after 0 of 512 at 4096: Input/output error
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.087889] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.090766] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node lvm[4439]: intermediate VG write failed.
Apr 1 08:17:18 node lvm[4439]: Mirror status: 1 of 2 images failed.
Apr 1 08:17:18 node lvm[4439]: Trying to up-convert to 2 images, 1 logs.
Apr 1 08:17:18 node lvm[4439]: Unable to create LV nonsyncd_secondary_2legs_1_mlog in Volume Group helter_skelter: name already in use.
Apr 1 08:17:18 node lvm[4439]: Failed to create mirror log.
Apr 1 08:17:18 node lvm[4439]: Trying to up-convert to 2 images, 0 logs.
Apr 1 08:17:18 node lvm[4439]: WARNING: Failed to replace 1 of 1 logs in volume nonsyncd_secondary_2legs_1
Apr 1 08:17:18 node lvm[4439]: helter_skelter/nonsyncd_secondary_2legs_1: Converted: 14.8%
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.115460] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.116015] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.116993] device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: [ 1651.117935] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.118646] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: [ 1651.127704] sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: [ 1651.128014] end_request: I/O error, dev sdc, sector 0
Apr 1 08:17:18 node kernel: sd 2:0:0:5: rejecting I/O to offline device
Apr 1 08:17:18 node kernel: end_request: I/O error, dev sdc, sector 0