Bug 1049934
| Summary: | device mapper gets locked up during mirror failure testing (policy: remove) | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Nenad Peric <nperic> |
| Component: | lvm2 | Assignee: | Jonathan Earl Brassow <jbrassow> |
| lvm2 sub component: | Default / Unclassified | QA Contact: | Cluster QE <mspqa-list> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | agk, heinzm, jbrassow, lmiksik, msnitzer, nperic, prajnoha, prockai, thornber, zkabelac |
| Version: | 7.0 | Keywords: | TestBlocker |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | lvm2-2.02.105-1.el7 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-06-13 11:54:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Attachments: | |||
Cluster or not? Can you determine the backtrace?

(If single-machine mode, find the process and run 'gdb program `ps -o pid= -C <hung_program_name>`' on it. Once in gdb, continue and then hit '^C' once. gdb will be stuck waiting for the program to become un-hung. Then you can resume the LV:

```
dmsetup resume /dev/mapper/<vg>-<mirror>_m*
dmsetup resume /dev/mapper/<vg>-<mirror>
```

This will release the stuck process and gdb will drop to a prompt where you can get the backtrace at the exact place where the process hung. If the resume fails, you can replace the mirror-associated devices' tables with error targets (load, suspend, resume), but it shouldn't come to that.)

It is not a cluster, just a simple single host. I cannot determine a backtrace because gdb gets stuck as well, so I cannot attach to a process. The issue is that '^C' would not let me out to a prompt. I will try to reproduce it today and attempt the steps you mentioned. Worst case, I will open a different console and issue the dmsetup commands there.

Blocked processes:

```
root     16036  0.0  0.0      0     0 ?        D    16:26   0:00 [jbd2/dm-6-8]
root     16434  1.4  0.2  25948  2716 ?        D    16:29   0:15 /usr/tests/sts-rhel7.0/bin/xdoio -v
root     16446  0.0  0.0 107920   624 ?        D    16:29   0:00 dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
root     18700  0.0  0.5 139220  5816 pts/0    D    16:44   0:00 lvs
```

I provided 'lvs' as a hung program name to gdb, although there are more:

```
[root@virt-122 ~]# gdb program `ps -o pid= -C lvs`
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-45.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
program: No such file or directory.
```
```
Attaching to process 18700
^C
```

I did not get the prompt back. In another console I did:

```
[root@virt-122 ~]# dmsetup resume /dev/mapper/revolution_9-mirror_1_m*; dmsetup resume /dev/mapper/revolution_9-mirror_1
[root@virt-122 ~]#
[root@virt-122 ~]# dmsetup status
revolution_9-mirror_1: 0 4194304 mirror 2 253:3 253:7 4096/4096 1 AA 1 core
rhel_virt--122-swap: 0 1679360 linear
rhel_virt--122-root: 0 14065664 linear
revolution_9-mirror_1_mimagetmp_3: 0 4194304 mirror 2 253:4 253:5 4080/4096 1 AD 3 disk 253:2 A
revolution_9-mirror_1_mlog: 0 8192 linear
revolution_9-mirror_1_mimage_3: 0 4194304 linear
revolution_9-mirror_1_mimage_2: 0 4194304 linear
revolution_9-mirror_1_mimage_1: 0 4194304 linear
[root@virt-122 ~]# dmsetup -c info
Name                               Maj Min Stat Open Targ Event UUID
revolution_9-mirror_1              253   6 L--w    3    1     4 LVM-mOHnVXHNwTdIUqmJb2nCpwrQXOWiBArRjI6PBTFgaswYMAekEhcUH87cddmuxgve
rhel_virt--122-swap                253   0 L--w    2    1     0 LVM-8MiCCFIsIrsh92YYS1xzWmSinn50pbMIgZftSAaDDhSwFKkA1ff07aOkH3toiqnC
rhel_virt--122-root                253   1 L--w    1    1     0 LVM-8MiCCFIsIrsh92YYS1xzWmSinn50pbMILRNFgBXJXoaTpLoyBfPpMlZLsAdCLc79
revolution_9-mirror_1_mimagetmp_3  253   3 L--w    1    1     2 LVM-mOHnVXHNwTdIUqmJb2nCpwrQXOWiBArRCzbvOcay8IekifyeVnnNYvOUKv2BuLip
revolution_9-mirror_1_mlog         253   2 L--w    1    1     0 LVM-mOHnVXHNwTdIUqmJb2nCpwrQXOWiBArRWX3Gy4XdGxAgHTrnuPiXNmxwEEtz1eBX
revolution_9-mirror_1_mimage_3    253   7 L--w    1    1     0 LVM-mOHnVXHNwTdIUqmJb2nCpwrQXOWiBArRMZfhn8dsdIYOxPSEaWG6stIlvn3av6pm
revolution_9-mirror_1_mimage_2    253   5 L--w    1    1     0 LVM-mOHnVXHNwTdIUqmJb2nCpwrQXOWiBArRQQfcdPn6Gpd9pRvIDF1bCjXYRKiALZ9X
revolution_9-mirror_1_mimage_1    253   4 L--w    1    1     0 LVM-mOHnVXHNwTdIUqmJb2nCpwrQXOWiBArRQ0ulzeqMCMZzhlRJXQI7ZmvelKw16J1L
```

I ran ps -aux again and nothing changed:

```
root     16036  0.0  0.0      0     0 ?        D    16:26   0:00 [jbd2/dm-6-8]
root     16434  1.1  0.2  25948  2716 ?        D    16:29   0:15 /usr/tests/sts-rhel7.0/bin/xdoio -v
root     16446  0.0  0.0 107920   624 ?        D    16:29   0:00 dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
root     18700  0.0  0.5 139220  5816 pts/0    D    16:44   0:00 lvs
```

Of course, I did not get the prompt back in gdb either. I then collected some info through sysrq-trigger, which I will attach as well. Not sure if it is helpful.

Created attachment 847701 [details]
The dump from sysrq_trigger for hung processes
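The recovery procedure described above (resume the mirror's sub-devices first, then the top-level mirror device) can be sketched as a small helper. This is a dry-run sketch: the dmsetup commands are echoed rather than executed, and the VG/LV names are placeholders taken from this report, not a live system.

```shell
#!/bin/sh
# Dry-run sketch of the unstick procedure from the comment above.
# Emits the dmsetup commands in order: mirror sub-devices (_m*) first,
# then the top-level mirror. Echoed only; nothing is executed.
unstick_mirror() {
    vg=$1
    lv=$2
    # Resume the mirror images and log devices first...
    echo "dmsetup resume /dev/mapper/${vg}-${lv}_m*"
    # ...then the mirror device itself.
    echo "dmsetup resume /dev/mapper/${vg}-${lv}"
}

unstick_mirror revolution_9 mirror_1
```

On a real system you would drop the `echo`s and run this as root; the ordering matters because the top-level mirror cannot make progress until its sub-devices are resumed.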
A quick look reveals that this is not the same bug we are seeing as a TIMEOUT in RHEL7 - there are no suspended DM devices on the system.
I do believe this issue has already been solved, though. A new lvm.conf parameter was introduced in version 2.02.104 (note that your RPM version is based on 2.02.103!):
```
# ignore_lvm_mirrors: Introduced in version 2.02.104
# This setting determines whether logical volumes of "mirror" segment
# type are scanned for LVM labels. This affects the ability of
# mirrors to be used as physical volumes. If 'ignore_lvm_mirrors'
# is set to '1', it becomes impossible to create volume groups on top
# of mirror logical volumes - i.e. to stack volume groups on mirrors.
#
# Allowing mirror logical volumes to be scanned (setting the value to '0')
# can potentially cause LVM processes and I/O to the mirror to become
# blocked. This is due to the way that the "mirror" segment type handles
# failures. In order for the hang to manifest itself, an LVM command must
# be run just after a failure and before the automatic LVM repair process
# takes place OR there must be failures in multiple mirrors in the same
# volume group at the same time with write failures occurring moments
# before a scan of the mirror's labels.
#
# Note that these scanning limitations do not apply to the LVM RAID
# types, like "raid1". The RAID segment types handle failures in a
# different way and are not subject to possible process or I/O blocking.
#
# It is encouraged that users set 'ignore_lvm_mirrors' to 1 if they
# are using the "mirror" segment type. Users that require volume group
# stacking on mirrored logical volumes should consider using the "raid1"
# segment type. The "raid1" segment type is not available for
# active/active clustered volume groups.
#
# Set to 1 to disallow stacking and thereby avoid a possible deadlock.
ignore_lvm_mirrors = 1
```
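A quick way to confirm the option is set as intended is to parse it back out of the config file. The snippet below works on a sample fragment written to a temp file rather than the live /etc/lvm/lvm.conf, so it is purely illustrative; on a real host you would point it at the actual config (or ask the LVM tools for the effective value).

```shell
#!/bin/sh
# Illustrative only: extract the ignore_lvm_mirrors value from a sample
# lvm.conf fragment. A real check would read /etc/lvm/lvm.conf instead.
sample=$(mktemp)
cat > "$sample" <<'EOF'
devices {
    # Avoid the mirror label-scan deadlock described above.
    ignore_lvm_mirrors = 1
}
EOF

# Split on '=', strip spaces from the value side.
value=$(awk -F'=' '/^[[:space:]]*ignore_lvm_mirrors/ {gsub(/ /, "", $2); print $2}' "$sample")
echo "ignore_lvm_mirrors=${value}"
rm -f "$sample"
```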
Here is the upstream commit for this fix:
```
commit d5896f0afd28096c3dc04a9c7e9627b387f145fb
Author: Jonathan Brassow <jbrassow>
Date:   Tue Oct 22 19:14:33 2013 -0500
```
Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors
There is a problem with the way mirrors have been designed to handle
failures that is resulting in stuck LVM processes and hung I/O. When
mirrors encounter a write failure, they block I/O and notify userspace
to reconfigure the mirror to remove failed devices. This process is
open to a couple races:
1) Any LVM process other than the one that is meant to deal with the
mirror failure can attempt to read the mirror, fail, and block other
LVM commands (including the repair command) from proceeding due to
holding a lock on the volume group.
2) If there are multiple mirrors that suffer a failure in the same
volume group, a repair can block while attempting to read the LVM
label from one mirror while trying to repair the other.
Mitigation of these races has been attempted by disallowing label reading
of mirrors that are either suspended or are indicated as blocking by
the kernel. While this has closed the window of opportunity for hitting
the above problems considerably, it hasn't closed it completely. This is
because it is still possible to start an LVM command, read the status of
the mirror as healthy, and then perform the read for the label at the
moment after the failure is discovered by the kernel.
I can see two solutions to this problem:
1) Allow users to configure whether mirrors can be candidates for LVM
labels (i.e. whether PVs can be created on mirror LVs). If the user
chooses to allow label scanning of mirror LVs, it will be at the expense
of a possible hang in I/O or LVM processes.
2) Instrument a way to allow asynchronous label reading - allowing
blocked label reads to be ignored while continuing to process the LVM
command. This would allow LVM commands to continue even
though they would have otherwise blocked trying to read a mirror. They
can then release their lock and allow a repair command to commence. In
the event of #2 above, the repair command already in progress can continue
and repair the failed mirror.
This patch brings solution #1. If solution #2 is developed later on, the
configuration option created in #1 can be negated - allowing mirrors to
be scanned for labels by default once again.
BTW, this is the same issue as bug 1049934.

Tested with:

```
kernel-3.10.0-76.el7.x86_64
lvm2-2.02.105-1.el7.x86_64
device-mapper-1.02.84-1.el7.x86_64
```

part of lvm config:

```
mirror_log_fault_policy="remove"
mirror_image_fault_policy="remove"
use_lvmetad=0
ignore_suspended_devices=0
ignore_lvm_mirrors=1
```
Failed on scenario:

```
********* Mirror info for this scenario *********
* mirrors: mirror_1
* leg devices: /dev/sdb1 /dev/sdc1 /dev/sda1
* log devices: /dev/sdd1
* failpv(s): /dev/sdc1
* failnode(s): virt-007.cluster-qe.lab.eng.brq.redhat.com
* lvmetad: 0
* leg fault policy: remove
* log fault policy: remove
*************************************************
```
Attaching the dump from sysrq-trigger for tasks in uninterruptible sleep.
Created attachment 854890 [details]
New sysrq-trigger dump for tasks in uninterruptible sleep
Added the attachment, and output from dmsetup is here:
```
[root@virt-007 ~]# dmsetup status
revolution_9-mirror_1_mimage_2-missing_0_0: 0 4194304 error
revolution_9-mirror_1: 0 4194304 mirror 2 253:3 253:7 4096/4096 1 AA 1 core
revolution_9-mirror_1_mimagetmp_3: 0 4194304 mirror 2 253:4 253:5 4073/4096 1 AD 3 disk 253:2 A
revolution_9-mirror_1_mlog: 0 8192 linear
revolution_9-mirror_1_mimage_3: 0 4194304 linear
rhel_virt--007-swap: 0 1679360 linear
rhel_virt--007-root: 0 14065664 linear
revolution_9-mirror_1_mimage_2: 0 4194304 linear
revolution_9-mirror_1_mimage_1: 0 4194304 linear
[root@virt-007 ~]# dmsetup ls
revolution_9-mirror_1_mimage_2-missing_0_0 (253:8)
revolution_9-mirror_1                      (253:6)
revolution_9-mirror_1_mimagetmp_3          (253:3)
revolution_9-mirror_1_mlog                 (253:2)
revolution_9-mirror_1_mimage_3             (253:7)
rhel_virt--007-swap                        (253:0)
rhel_virt--007-root                        (253:1)
revolution_9-mirror_1_mimage_2             (253:5)
revolution_9-mirror_1_mimage_1             (253:4)
[root@virt-007 ~]# dmsetup info -c
Name                                       Maj Min Stat Open Targ Event UUID
revolution_9-mirror_1_mimage_2-missing_0_0 253   8 L--w    1    1     0 LVM-9GOGLHMfDtDOKa5U6yYCsyruEcuK7hr6Ch9SGN66RHuHk0jXx93y2RUQht1REZfz-missing_0_0
revolution_9-mirror_1                      253   6 LI-w    1    1     4 LVM-9GOGLHMfDtDOKa5U6yYCsyruEcuK7hr6qAwQN1OZxaQzK4ruGIvYLHpRTJaJDPPv
revolution_9-mirror_1_mimagetmp_3          253   3 LI-w    2    1     4 LVM-9GOGLHMfDtDOKa5U6yYCsyruEcuK7hr6CLnGO6wmMZkYNJifwgqyMgWTD6UlkxxT
revolution_9-mirror_1_mlog                 253   2 L--w    2    1     0 LVM-9GOGLHMfDtDOKa5U6yYCsyruEcuK7hr6ez71xCVNPw1qdvaUE41ZqGeSUPBfjC8C
revolution_9-mirror_1_mimage_3             253   7 L--w    2    1     0 LVM-9GOGLHMfDtDOKa5U6yYCsyruEcuK7hr6PcMnWk6rF73OYyBKD0iK6QNfmvFn9Wns
rhel_virt--007-swap                        253   0 L--w    2    1     0 LVM-Q4TEZHce4pd5WNDkovJvy6Laqfk1qaR4qzTee3xl8bhkIl6IU8KpbZBcFMVAWRAW
rhel_virt--007-root                        253   1 L--w    1    1     0 LVM-Q4TEZHce4pd5WNDkovJvy6Laqfk1qaR4kPaILncyf6fcBbq3YRnLn3VaV7mzw75H
revolution_9-mirror_1_mimage_2             253   5 LI-w    2    1     0 LVM-9GOGLHMfDtDOKa5U6yYCsyruEcuK7hr6Ch9SGN66RHuHk0jXx93y2RUQht1REZfz
revolution_9-mirror_1_mimage_1             253   4 L--w    2    1     0 LVM-9GOGLHMfDtDOKa5U6yYCsyruEcuK7hr6rCXT2dq4fpP7KCkt4XsBIfb68Xf2EXKu
```
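In `dmsetup status` output like the above, a mirror line carries per-device health characters after the recovery-region count ('A' for alive, 'D' for dead, going by the dm-mirror status conventions), so the `AD` on the mimagetmp device indicates one failed leg. A small sketch of pulling that field out of a saved status line; the sample line is copied from the output above, and the field position assumes a 2-way mirror:

```shell
#!/bin/sh
# Sketch: extract the per-device health characters from a 2-way dm-mirror
# status line ('A' = alive, 'D' = dead). Field 10 is correct only for a
# mirror with exactly two legs; more legs shift the field position.
line='revolution_9-mirror_1_mimagetmp_3: 0 4194304 mirror 2 253:4 253:5 4073/4096 1 AD 3 disk 253:2 A'
health=$(echo "$line" | awk '{print $10}')
echo "health=${health}"
```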
Is 'dmeventd' running at the end of the test? If it dies, then everything stops.

'dmeventd' seems to die for me right after the mirror becomes in-sync. If that is truly the case, then of course nothing will work. The program is being killed even before receiving an event from the kernel - the mirror was 66% in-sync when dmeventd was killed... I have no idea why yet.

```
(gdb) break process_event
Breakpoint 1 at 0x7f4272ef5d90: file dmeventd_mirror.c, line 155.
(gdb) c
Continuing.
[Thread 0x7f4272587700 (LWP 19765) exited]
Program terminated with signal SIGKILL, Killed.
The program no longer exists.
```

dmeventd seems to be dying for the same reasons as bug 1039218: daemons spawned from commands run through 'qarsh' are being killed. (It does take a minute or two before this happens, though.) Here is the example.

Creating through 'qarsh':

```
[root@host-074 ~]# qarsh localhost lvcreate --type mirror -m 1 -L 4G -n lv vg; lvs -a -o name,attr,copy_percent vg; ps -C dmeventd; sleep 180; lvs -a -o name,attr,copy_percent vg; ps -C dmeventd
  Logical volume "lv" created
  LV            Attr       Cpy%Sync
  lv            mwi-a-m---     0.10
  [lv_mimage_0] Iwi-aom---
  [lv_mimage_1] Iwi-aom---
  [lv_mlog]     lwi-aom---
  PID TTY          TIME CMD
27212 ?        00:00:00 dmeventd
  LV            Attr       Cpy%Sync
  lv            mwi-a-m---    47.95
  [lv_mimage_0] Iwi-aom---
  [lv_mimage_1] Iwi-aom---
  [lv_mlog]     lwi-aom---
  PID TTY          TIME CMD
[root@host-074 ~]#
```

Creating through 'ssh':

```
[root@host-074 ~]# ssh localhost lvcreate --type mirror -m 1 -L 4G -n lv vg; lvs -a -o name,attr,copy_percent vg; ps -C dmeventd; sleep 180; lvs -a -o name,attr,copy_percent vg; ps -C dmeventd
root@localhost's password:
  Logical volume "lv" created
  LV            Attr       Cpy%Sync
  lv            mwi-aom---     0.10
  [lv_mimage_0] Iwi-aom---
  [lv_mimage_1] Iwi-aom---
  [lv_mlog]     lwi-aom---
  PID TTY          TIME CMD
27401 ?        00:00:00 dmeventd
  LV            Attr       Cpy%Sync
  lv            mwi-a-m---    46.00
  [lv_mimage_0] Iwi-aom---
  [lv_mimage_1] Iwi-aom---
  [lv_mlog]     lwi-aom---
  PID TTY          TIME CMD
27401 ?        00:00:00 dmeventd
[root@host-074 ~]#
```

Note the difference: after 180 seconds, dmeventd is gone in the 'qarsh' case but still running in the 'ssh' case.

So, there were two problems.
The first was allowing label scans of mirror LVs. That has been fixed in the latest release (which is why I am moving the bug directly to MODIFIED). The second problem lies with the test suite - specifically 'qarsh' - and is a duplicate of bug 1039218.

Tested with newer versions of the LVM packages and passed all mirror tests successfully. Last tested with:

lvm2-2.02.105-7.el7.x86_64
device-mapper-1.02.84-7.el7.x86_64

Marking this BZ verified.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
Created attachment 847161 [details]
The output from the sysrq blocked-tasks dump, and dmsetup outputs.

Description of problem:
When running mirror testing (revolution tests) with policies set to 'remove', the device mapper gets stuck and no I/O works. LVM commands do not return either, but get stuck waiting. The processes end up in uninterruptible sleep (D). It does not always happen during the same test.

Version-Release number of selected component (if applicable):
lvm2-2.02.103-10.el7.x86_64
device-mapper-1.02.82-10.el7.x86_64
kernel-3.10.0-64.el7.x86_64

How reproducible:

```
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] ================================================================================
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] Iteration 0.6 started at Wed Jan 8 13:57:16 CET 2014
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] ================================================================================
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] ACTUAL LEG ORDER: /dev/sda1 /dev/sdc1 /dev/sdd1
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] Scenario kill_random_devices: Kill random devices (log and legs)
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove]
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] ********* Mirror info for this scenario *********
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * mirrors: mirror_1
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * leg devices: /dev/sda1 /dev/sdc1 /dev/sdd1
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * log devices: /dev/sde1
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * failpv(s): /dev/sde1 /dev/sda1
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * failnode(s): virt-123.cluster-qe.lab.eng.brq.redhat.com
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * lvmetad: 0
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * leg fault policy: remove
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] * log fault policy: remove
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] *************************************************
```

It runs for a while and, after removal of the device, it gets stuck:

```
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] Disabling device sde on virt-123.cluster-qe.lab.eng.brq.redhat.com
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] Disabling device sda on virt-123.cluster-qe.lab.eng.brq.redhat.com
[lvm_single_mirror_failure] [lvm_single_revolution_9_remove] <lock-up>
```

Steps to Reproduce:
So far, just by running the revolution tests with 'remove' policies.

Actual results:
device mapper and lvm get stuck

Expected results:
Not to get stuck.

Additional info:
Attached a file with a dump from sysrq and the outputs from 'dmsetup status' and 'dmsetup info -c'.

As far as lvm.conf goes:

```
mirror_segtype_default="mirror"
raid10_segtype_default="mirror"
use_lvmetad=0
raid_fault_policy="warn"
mirror_log_fault_policy="remove"
mirror_image_fault_policy="remove"
```