Bug 1380521
| Summary: | mirror type conversion/repair deadlock after device failure (device-mapper: raid1: Write error during recovery) | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Heinz Mauelshagen <heinzm> |
| lvm2 sub component: | Mirroring and RAID | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED ERRATA | Docs Contact: | Steven J. Levine <slevine> |
| Severity: | medium | | |
| Priority: | high | CC: | agk, heinzm, jbrassow, jreznik, mcsontos, msnitzer, mthacker, pasik, prajnoha, prockai, rbednar, slevine, zkabelac |
| Version: | 7.3 | Keywords: | Regression, ZStream |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | x86_64 | OS: | Linux |
| Whiteboard: | | | |
| Fixed In Version: | lvm2-2.02.169-1.el7 | Doc Type: | Bug Fix |
| Story Points: | --- | Clone Of: | |
| : | 1431146 (view as bug list) | Environment: | |
| Last Closed: | 2017-08-01 21:47:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1385242, 1431146 | | |
| Attachments: | | | |

**Doc Text:**

> Interaction problems no longer occur with the *lvmetad* daemon when the `mirror` segment type is used
>
> Previously, when the legacy `mirror` segment type was used to create mirrored logical volumes with 3 or more legs, interaction problems could occur with the *lvmetad* daemon. The problems were observed only after a second device failure, when the mirror fault policies were set to the non-default `allocate` option, when *lvmetad* was in use, and when the machine had not been rebooted between device failure events. This bug has been fixed, and the described interaction problems no longer occur.
*** Bug 1380833 has been marked as a duplicate of this bug. ***

This is easily reproducible. It appears that lvm may be using the failed device as the new allocation/repair device. Also, should a mirror type volume be reporting "device-mapper: raid1: Write error during recovery"?
================================================================================
Iteration 0.2 started at Mon Oct 3 10:52:58 CDT 2016
================================================================================
ACTUAL LEG ORDER: /dev/sda1 /dev/sdc1 /dev/sde1
Scenario kill_primary_leg: Kill primary leg
********* Mirror info for this scenario *********
* mirrors: mirror_1
* leg devices: /dev/sda1 /dev/sdc1 /dev/sde1
* log devices: /dev/sdg1
* failpv(s): /dev/sda1
* failnode(s): host-122
* lvmetad: 1
* leg fault policy: allocate
* log fault policy: allocate
*************************************************
Current mirror/raid device structure(s):
LV Attr LSize Cpy%Sync Devices
mirror_1 mwi-aom--- 2.00g 100.00 mirror_1_mimage_0(0),mirror_1_mimage_1(0),mirror_1_mimage_2(0)
[mirror_1_mimage_0] iwi-aom--- 2.00g /dev/sda1(0)
[mirror_1_mimage_1] iwi-aom--- 2.00g /dev/sdc1(0)
[mirror_1_mimage_2] iwi-aom--- 2.00g /dev/sde1(0)
[mirror_1_mlog] lwi-aom--- 4.00m /dev/sdg1(0)
PV=/dev/sda1
mirror_1_mimage_0: 4.1
PV=/dev/sda1
mirror_1_mimage_0: 4.1
Writing verification files (checkit) to mirror(s) on...
---- host-122 ----
<start name="host-122_mirror_1" pid="9561" time="Mon Oct 3 10:53:05 2016" type="cmd" />
Sleeping 15 seconds to get some outstanding I/O locks before the failure
Verifying files (checkit) on mirror(s) on...
---- host-122 ----
Disabling device sda on host-122
rescan device...
/dev/sda1: read failed after 0 of 512 at 21467824128: Input/output error
/dev/sda1: read failed after 0 of 512 at 21467938816: Input/output error
/dev/sda1: read failed after 0 of 512 at 0: Input/output error
/dev/sda1: read failed after 0 of 512 at 4096: Input/output error
/dev/sda1: read failed after 0 of 2048 at 0: Input/output error
Getting recovery check start time from /var/log/messages: Oct 3 10:53
Attempting I/O to cause mirror down conversion(s) on host-122
dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.139669 s, 300 MB/s
[root@host-122 ~]# lvs -a -o +devices
LV VG Attr LSize Log Cpy%Sync Convert Devices
mirror_1 revolution_9 cwi-aom--- 2.00g 0.00 [mirror_1_mimagetmp_3] mirror_1_mimagetmp_3(0),mirror_1_mimage_3(0)
[mirror_1_mimage_1] revolution_9 iwi-aom--- 2.00g /dev/sdc1(0)
[mirror_1_mimage_2] revolution_9 iwi-aom--- 2.00g /dev/sde1(0)
[mirror_1_mimage_3] revolution_9 Iwi-aom--- 2.00g /dev/sda1(0)
[mirror_1_mimagetmp_3] revolution_9 mwi-aom--- 2.00g [mirror_1_mlog] 100.00 mirror_1_mimage_1(0),mirror_1_mimage_2(0)
[mirror_1_mlog] revolution_9 lwi-aom--- 4.00m /dev/sdg1(0)
# LOG
Oct 3 10:53:27 host-122 qarshd[4743]: Running cmdline: echo offline > /sys/block/sda/device/state
Oct 3 10:53:28 host-122 qarshd[4747]: Running cmdline: pvscan --cache /dev/sda1
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
[...]
Oct 3 10:53:28 host-122 lvm[1278]: Log device 253:2 has failed (D).
Oct 3 10:53:28 host-122 lvm[1278]: Device failure in revolution_9-mirror_1.
Oct 3 10:53:28 host-122 lvm[1278]: WARNING: Device for PV EvH7eU-edwm-qhxJ-w3LI-PgGP-s3xE-pt0sGE not found or rejected by a filter.
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:28 host-122 lvm[1278]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Oct 3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:29 host-122 lvm[1278]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Oct 3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct 3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct 3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct 3 10:53:29 host-122 multipathd: dm-3: remove map (uevent)
Oct 3 10:53:29 host-122 multipathd: dm-3: devmap not registered, can't remove
Oct 3 10:53:29 host-122 multipathd: dm-3: remove map (uevent)
Oct 3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:29 host-122 lvm[1278]: Mirror status: 1 of 3 images failed.
Oct 3 10:53:29 host-122 lvm[1278]: Trying to up-convert to 3 images, 1 logs.
Oct 3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct 3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct 3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:29 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 lvm[1278]: WARNING: Failed to replace 1 of 3 images in volume revolution_9/mirror_1.
Oct 3 10:53:30 host-122 lvm[1278]: WARNING: Failed to replace 1 of 1 logs in volume revolution_9/mirror_1.
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct 3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
[...]
Oct 3 10:53:41 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:41 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct 3 10:53:45 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:54:00 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:54:15 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:54:30 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:54:45 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:55:00 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:55:15 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:55:30 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct 3 10:55:42 host-122 kernel: INFO: task jbd2/dm-6-8:4143 blocked for more than 120 seconds.
Oct 3 10:55:42 host-122 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 3 10:55:42 host-122 kernel: jbd2/dm-6-8 D ffff880027d54400 0 4143 2 0x00000080
Oct 3 10:55:42 host-122 kernel: ffff88003b403c90 0000000000000046 ffff88003aa94e70 ffff88003b403fd8
Oct 3 10:55:42 host-122 kernel: ffff88003b403fd8 ffff88003b403fd8 ffff88003aa94e70 ffff88003b403da0
Oct 3 10:55:42 host-122 kernel: ffff88003bd4a8c0 ffff88003aa94e70 ffff88003b403d88 ffff880027d54400
Oct 3 10:55:42 host-122 kernel: Call Trace:
Oct 3 10:55:42 host-122 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Oct 3 10:55:42 host-122 kernel: [<ffffffffa03a4238>] jbd2_journal_commit_transaction+0x248/0x19a0 [jbd2]
Oct 3 10:55:42 host-122 kernel: [<ffffffff810ca86e>] ? account_entity_dequeue+0xae/0xd0
Oct 3 10:55:42 host-122 kernel: [<ffffffff810ce35c>] ? dequeue_entity+0x11c/0x5d0
Oct 3 10:55:42 host-122 kernel: [<ffffffff81060abf>] ? kvm_clock_read+0x1f/0x30
Oct 3 10:55:42 host-122 kernel: [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Oct 3 10:55:42 host-122 kernel: [<ffffffff8109807e>] ? try_to_del_timer_sync+0x5e/0x90
Oct 3 10:55:42 host-122 kernel: [<ffffffffa03a9e99>] kjournald2+0xc9/0x260 [jbd2]
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Oct 3 10:55:42 host-122 kernel: [<ffffffffa03a9dd0>] ? commit_timeout+0x10/0x10 [jbd2]
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct 3 10:55:42 host-122 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct 3 10:55:42 host-122 kernel: INFO: task kworker/u2:4:4224 blocked for more than 120 seconds.
Oct 3 10:55:42 host-122 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 3 10:55:42 host-122 kernel: kworker/u2:4 D ffffffff81689410 0 4224 2 0x00000080
Oct 3 10:55:42 host-122 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:6)
Oct 3 10:55:42 host-122 kernel: ffff88001f48f830 0000000000000046 ffff88003aa96dd0 ffff88001f48ffd8
Oct 3 10:55:42 host-122 kernel: ffff88001f48ffd8 ffff88001f48ffd8 ffff88003aa96dd0 ffff88003fc16c40
Oct 3 10:55:42 host-122 kernel: 0000000000000000 7fffffffffffffff ffff88003ff5b6e8 ffffffff81689410
Oct 3 10:55:42 host-122 kernel: Call Trace:
Oct 3 10:55:42 host-122 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Oct 3 10:55:42 host-122 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Oct 3 10:55:42 host-122 kernel: [<ffffffff81688de9>] schedule_timeout+0x239/0x2d0
Oct 3 10:55:42 host-122 kernel: [<ffffffff815b6513>] ? ip_output+0x73/0xe0
Oct 3 10:55:42 host-122 kernel: [<ffffffff81060aef>] ? kvm_clock_get_cycles+0x1f/0x30
Oct 3 10:55:42 host-122 kernel: [<ffffffff810eaa6c>] ? ktime_get_ts64+0x4c/0xf0
Oct 3 10:55:42 host-122 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Oct 3 10:55:42 host-122 kernel: [<ffffffff8168a93e>] io_schedule_timeout+0xae/0x130
Oct 3 10:55:42 host-122 kernel: [<ffffffff8168a9d8>] io_schedule+0x18/0x20
Oct 3 10:55:42 host-122 kernel: [<ffffffff81689421>] bit_wait_io+0x11/0x50
Oct 3 10:55:42 host-122 kernel: [<ffffffff81688f45>] __wait_on_bit+0x65/0x90
Oct 3 10:55:42 host-122 kernel: [<ffffffff8117fb61>] wait_on_page_bit+0x81/0xa0
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
Oct 3 10:55:42 host-122 kernel: [<ffffffffa03c4273>] mpage_prepare_extent_to_map+0x1e3/0x2e0 [ext4]
Oct 3 10:55:42 host-122 kernel: [<ffffffffa03c887b>] ext4_writepages+0x45b/0xd60 [ext4]
Oct 3 10:55:42 host-122 kernel: [<ffffffff812ab683>] ? selinux_socket_sendmsg+0x23/0x30
Oct 3 10:55:42 host-122 kernel: [<ffffffff810c95a8>] ? __enqueue_entity+0x78/0x80
Oct 3 10:55:42 host-122 kernel: [<ffffffff8118d24e>] do_writepages+0x1e/0x40
Oct 3 10:55:42 host-122 kernel: [<ffffffff81228670>] __writeback_single_inode+0x40/0x210
Oct 3 10:55:42 host-122 kernel: [<ffffffff8122935e>] writeback_sb_inodes+0x25e/0x420
Oct 3 10:55:42 host-122 kernel: [<ffffffff81229c9f>] wb_writeback+0xff/0x2f0
Oct 3 10:55:42 host-122 kernel: [<ffffffff810ab776>] ? set_worker_desc+0x86/0xb0
Oct 3 10:55:42 host-122 kernel: [<ffffffff8122bc45>] bdi_writeback_workfn+0x115/0x460
Oct 3 10:55:42 host-122 kernel: [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
Oct 3 10:55:42 host-122 kernel: [<ffffffff810a8d76>] worker_thread+0x126/0x410
Oct 3 10:55:42 host-122 kernel: [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct 3 10:55:42 host-122 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Oct 3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
With 'mirror_{image,log}_fault_policy = "allocate"' I'm able to reproduce the failing dmeventd repair, but it succeeds when run manually.
No deadlocks occur.
[root@rhel-7-3 ~]# lvm version
LVM version: 2.02.166(2)-RHEL7 (2016-09-28)
Library version: 1.02.135-RHEL7 (2016-09-28)
Driver version: 4.34.0
[root@rhel-7-3 ~]# uname -r
3.10.0-512.el7.x86_64
1. Note that this never fails on the first failure iteration.
2. Verified that this test case passes with the 7.2.z rpms/kernel, and continues to fail with older 7.3 rpms on the current kernel. Will try now with an older 7.3 kernel.

This also fails with the following rpms/kernel, so it's apparently been around for a while in 7.3:

3.10.0-495.el7.x86_64    Build Date: Mon 22 Aug 2016 07:01:35 PM CDT
lvm2-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
lvm2-libs-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
lvm2-cluster-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-libs-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-event-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-event-libs-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016
cmirror-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016

We've had some success when lvmetad is disabled. It is likely this will be the workaround.
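A sketch of what disabling lvmetad could look like on a RHEL 7 box. These are the standard configuration knobs, not steps verified in this report:

```shell
# Workaround sketch (assumed standard RHEL 7 configuration, not verified
# in this report): make LVM scan disks directly instead of using lvmetad.
#
#   /etc/lvm/lvm.conf:
#       global {
#           use_lvmetad = 0
#       }
#
#   then stop the daemon and its activation socket:
#       systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
```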
Firstly, the reproducer is to:
*) set mirror fault policies to "allocate"
1) create 3-way mirror (the log being on a separate device, so 4 devices)
- do not use '--nosync'
2) wait for sync
3) kill device under *mimage_1
- dmeventd will respond by down-converting, then up-convert to 3-way mirror again
4) wait for sync
5) kill device under *mimage_0
*) POW!
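The steps above can be sketched as a shell script. This is a hypothetical dry run, not the QA harness itself: the VG name (revolution_9), device names, and sizes are taken from the logs in this report, and RUN=echo keeps every command a no-op print.

```shell
#!/bin/sh
# Hypothetical sketch of the reproducer steps above. RUN=echo makes this
# a dry run that only prints the commands; set RUN= (empty) on a
# disposable test box to execute them for real.
RUN=${RUN:-echo}

reproduce() {
    # *) mirror fault policies must already be "allocate" in lvm.conf
    #    (mirror_image_fault_policy / mirror_log_fault_policy)

    # 1) create a 3-way mirror, log on a separate device (4 PVs total),
    #    without --nosync
    $RUN lvcreate --type mirror -m 2 -L 2G -n mirror_1 \
         revolution_9 /dev/sda1 /dev/sdc1 /dev/sde1 /dev/sdg1

    # 2) wait for sync (poll Cpy%Sync until 100.00)
    $RUN lvs -a -o +devices revolution_9

    # 3) kill the device under *mimage_1
    $RUN sh -c 'echo offline > /sys/block/sdc/device/state'

    # 4) dmeventd down-converts, then up-converts back to 3-way;
    #    wait for sync again

    # 5) kill the device under *mimage_0
    $RUN sh -c 'echo offline > /sys/block/sda/device/state'

    # *) POW! (failed repair / deadlock observed when lvmetad is enabled)
}

reproduce
```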
Created attachment 1207668 [details]
Detailed output showing allocation from failed PV on second repair
contains verbose output from dmeventd
Created attachment 1207670 [details]
typescript from failing run
All the commands run and outputs to hit problem of allocating from missing PV.
Created attachment 1207671 [details]
dmeventd -ddd -l -f
Output from dmeventd during the whole time the 'typescript' above was being captured.
Created attachment 1207672 [details]
logs during time of typescript
In my tests (using just a 2-leg mirror, failing the primary leg, and doing a single-byte write to the mirror):

[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 76 MIRROR STATUS : 2 253:5 253:6 20/20 1 AA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 153 @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 76 MIRROR STATUS : 2 253:5 253:6 20/20 1 RA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 60 Primary mirror device 253:5 read failed.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 153 @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:dm ioctl/libdm-iface.c:1838 dm waitevent LVM-DU2y0afw0O8N1cYrNAJcQ2f7EVgKSYfhfSFVrDLVQl4WsSjTDMYf72wByocKytmA [ opencount flush ] [16384] (*1)
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:dm dmeventd.c: 886 Completed waitevent task for @PREFIX@vg-LV1.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 76 MIRROR STATUS : 2 253:5 253:6 20/20 1 AA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 153 @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 76 MIRROR STATUS : 2 253:5 253:6 20/20 1 RA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 60 Primary mirror device 253:5 read failed.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 153 @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 76 MIRROR STATUS : 2 253:5 253:6 19/20 1 DA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 64 Primary mirror device 253:5 has failed (D).
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr dmeventd_mirror.c: 156 Device failure in @PREFIX@vg-LV1.

We see we get the first 'R' error and after that 'D':

raid1: Mirror read failed from 253:5. Trying alternative device.
raid1: Mirror read failed.
So when writes are not 'sector-aligned' and the mirror is forced to read itself, there is this 'error' subpart. Yet I'm not clear how this relates to ignore_suspended_devices.

Got a fix in testing for the initial read error.

The reproducer is simple:
*) set all mirror fault policies to "allocate"
1) create 3-way mirror
2) wait for sync
3) kill _mimage_1
4) perform 'dd' write to trigger dmeventd auto repair
5) wait for repair and 100% sync
6) kill _mimage_0
7) perform 'dd' write to trigger dmeventd auto repair
*) observe problem (depends on failure type used)

I've tested permutations of 'lvmetad', 'lvmpolld', and 'ignore_suspended_devices'. The test only fails when 'lvmetad' is used. When lvmetad is used, if I 'echo offline > /sys/block/$dev/device/state' the device, dmeventd segfaults. If I 'echo 1 > /sys/block/$dev/device/delete' the device instead, the lvconvert allocates the replacement device from a failed PV (very bad), but only when 'ignore_suspended_devices = 0'. So 'ignore_suspended_devices = 1' only helps one case, not all, and we still don't have a reason for why that is yet.

Slight edit to doc text, just moving things around: moved sentences so that it lists the issues next to each other, then provides the workaround. Separated out the parenthetical comment about raid1 into its own small paragraph.

Upstream commit 76f6951c3e8f

Upstream commit e5b6f2685a46 replaces old 76f6951c3e8f. And there is another on top of that. Looks like the best backport strategy is to take the whole file from upstream. Heinz, could you please correct me if I am wrong?

...looking at master, it would also require 506d88a2ec8c, which may require other changes. I will look at that possibility tomorrow.

Created attachment 1277565 [details]
test result

I was not able to reproduce this even after more than 40 iterations of the scenario which initially triggered the bug. Also tried the manual reproducer from Comment 18 without hitting the issue. So I'm marking this as verified.
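The two failure-injection paths compared here can be sketched side by side. This is a dry-run illustration (RUN=echo prints instead of executing; DEV is a placeholder), using only the sysfs writes already quoted in this report:

```shell
#!/bin/sh
# Dry-run sketch of the two failure-injection methods compared above.
# RUN=echo prints the commands instead of running them; DEV is a
# placeholder device name.
RUN=${RUN:-echo}
DEV=${DEV:-sda}

# Method 1: mark the SCSI device offline. With lvmetad in use, this is
# the path where dmeventd was seen to segfault.
offline_dev() { $RUN sh -c "echo offline > /sys/block/$DEV/device/state"; }

# Method 2: delete the SCSI device entirely. Here lvconvert allocated the
# replacement from the failed PV, but only with ignore_suspended_devices=0.
delete_dev() { $RUN sh -c "echo 1 > /sys/block/$DEV/device/delete"; }

offline_dev
delete_dev
```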
Attaching one of the iterations as a test result.

3.10.0-663.el7.x86_64
lvm2-2.02.171-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017
lvm2-libs-2.02.171-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017
lvm2-cluster-2.02.171-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017
device-mapper-1.02.140-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017
device-mapper-libs-1.02.140-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017
device-mapper-event-1.02.140-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017
device-mapper-event-libs-1.02.140-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 17:15:46 CEST 2017
cmirror-2.02.171-1.el7    BUILT: Wed May 3 14:05:13 CEST 2017

Roman, lvm2-2.02.171-7 and kernel-3.10.0-685.el7.x86_64 also tested OK here.

Automatically disabling lvmetad on repair is the workaround to avoid the cache incoherencies causing these deadlocks (mentioned in comment #9). Disabling lvmetad was introduced with upstream commit 506d88a2ec8cc4960b907b2cf34ca5786d060943.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222
Description of problem:
This may be an outlier as these scenarios normally pass.

================================================================================
Iteration 0.2 started at Thu Sep 29 14:45:47 CDT 2016
================================================================================
ACTUAL LEG ORDER: /dev/sda1 /dev/sdb1 /dev/sdf1
Scenario kill_primary_leg: Kill primary leg
********* Mirror info for this scenario *********
* mirrors: mirror_1
* leg devices: /dev/sda1 /dev/sdb1 /dev/sdf1
* log devices: /dev/sde1
* failpv(s): /dev/sda1
* failnode(s): host-117
* lvmetad: 1
* leg fault policy: allocate
* log fault policy: allocate
*************************************************
Current mirror/raid device structure(s):
LV Attr LSize Cpy%Sync Devices
mirror_1 mwi-aom--- 2.00g 100.00 mirror_1_mimage_0(0),mirror_1_mimage_1(0),mirror_1_mimage_2(0)
[mirror_1_mimage_0] iwi-aom--- 2.00g /dev/sda1(0)
[mirror_1_mimage_1] iwi-aom--- 2.00g /dev/sdb1(0)
[mirror_1_mimage_2] iwi-aom--- 2.00g /dev/sdf1(0)
[mirror_1_mlog] lwi-aom--- 4.00m /dev/sde1(0)
PV=/dev/sda1
mirror_1_mimage_0: 4.1
PV=/dev/sda1
mirror_1_mimage_0: 4.1
Writing verification files (checkit) to mirror(s) on...
---- host-117 ----
<start name="host-117_mirror_1" pid="5009" time="Thu Sep 29 14:45:52 2016" type="cmd" />
Sleeping 15 seconds to get some outstanding I/O locks before the failure
Verifying files (checkit) on mirror(s) on...
---- host-117 ----
Disabling device sda on host-117
rescan device...
/dev/sda1: read failed after 0 of 512 at 21467824128: Input/output error
/dev/sda1: read failed after 0 of 512 at 21467938816: Input/output error
/dev/sda1: read failed after 0 of 512 at 0: Input/output error
/dev/sda1: read failed after 0 of 512 at 4096: Input/output error
/dev/sda1: read failed after 0 of 2048 at 0: Input/output error
Getting recovery check start time from /var/log/messages: Sep 29 14:46
Attempting I/O to cause mirror down conversion(s) on host-117
dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.258832 s, 162 MB/s
Sep 29 14:46:15 host-117 qarshd[4720]: Running cmdline: echo offline > /sys/block/sda/device/state
Sep 29 14:46:15 host-117 systemd: Started qarsh Per-Connection Server (10.15.80.224:45394).
Sep 29 14:46:15 host-117 systemd: Starting qarsh Per-Connection Server (10.15.80.224:45394)...
Sep 29 14:46:15 host-117 qarshd[4724]: Talking to peer ::ffff:10.15.80.224:45394 (IPv6)
Sep 29 14:46:15 host-117 qarshd[4724]: Running cmdline: pvscan --cache /dev/sda1
Sep 29 14:46:15 host-117 kernel: sd 3:0:0:1: rejecting I/O to offline device
Sep 29 14:46:17 host-117 kernel: sd 3:0:0:1: rejecting I/O to offline device
Sep 29 14:46:17 host-117 kernel: sd 3:0:0:1: rejecting I/O to offline device
Sep 29 14:46:17 host-117 lvm[1236]: Primary mirror device 253:3 has failed (D).
Sep 29 14:46:17 host-117 lvm[1236]: Device failure in revolution_9-mirror_1.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Device for PV XdwUg9-Ky9m-DWod-Pqdh-f8aJ-0x0N-XGW6Z7 not found or rejected by a filter.
Sep 29 14:46:17 host-117 lvm[1236]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Sep 29 14:46:17 host-117 lvm[1236]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 lvm[1236]: Mirror status: 1 of 3 images failed.
Sep 29 14:46:17 host-117 lvm[1236]: Trying to up-convert to 3 images, 1 logs.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 multipathd: dm-3: remove map (uevent)
Sep 29 14:46:17 host-117 multipathd: dm-3: devmap not registered, can't remove
Sep 29 14:46:17 host-117 multipathd: dm-3: remove map (uevent)
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 lvm[1236]: WARNING: Failed to replace 1 of 3 images in volume revolution_9/mirror_1.
Sep 29 14:46:18 host-117 lvm[1236]: WARNING: Failed to replace 1 of 1 logs in volume revolution_9/mirror_1.
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:21 host-117 qarshd[4832]: Running cmdline: dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
Sep 29 14:46:23 host-117 kernel: recovery_complete: 842 callbacks suppressed
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:48 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:03 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:18 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:33 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:48 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:03 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:18 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:33 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:48 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:49:03 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:49:16 host-117 kernel: INFO: task kworker/u2:5:1121 blocked for more than 120 seconds.
Sep 29 14:49:16 host-117 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 29 14:49:16 host-117 kernel: kworker/u2:5 D ffffffff81689410 0 1121 2 0x00000080
Sep 29 14:49:16 host-117 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:6)
Sep 29 14:49:16 host-117 kernel: ffff88003ce17830 0000000000000046 ffff880035ee2f10 ffff88003ce17fd8
Sep 29 14:49:16 host-117 kernel: ffff88003ce17fd8 ffff88003ce17fd8 ffff880035ee2f10 ffff88003fc16c40
Sep 29 14:49:16 host-117 kernel: 0000000000000000 7fffffffffffffff ffff88003ff619e8 ffffffff81689410
Sep 29 14:49:16 host-117 kernel: Call Trace:
Sep 29 14:49:16 host-117 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Sep 29 14:49:16 host-117 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Sep 29 14:49:16 host-117 kernel: [<ffffffff81688de9>] schedule_timeout+0x239/0x2d0
Sep 29 14:49:16 host-117 kernel: [<ffffffff815b6513>] ? ip_output+0x73/0xe0
Sep 29 14:49:16 host-117 kernel: [<ffffffff81060aef>] ? kvm_clock_get_cycles+0x1f/0x30
Sep 29 14:49:16 host-117 kernel: [<ffffffff810eaa6c>] ? ktime_get_ts64+0x4c/0xf0
Sep 29 14:49:16 host-117 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Sep 29 14:49:16 host-117 kernel: [<ffffffff8168a93e>] io_schedule_timeout+0xae/0x130
Sep 29 14:49:16 host-117 kernel: [<ffffffff8168a9d8>] io_schedule+0x18/0x20
Sep 29 14:49:16 host-117 kernel: [<ffffffff81689421>] bit_wait_io+0x11/0x50
Sep 29 14:49:16 host-117 kernel: [<ffffffff81688f45>] __wait_on_bit+0x65/0x90
Sep 29 14:49:16 host-117 kernel: [<ffffffff8117fb61>] wait_on_page_bit+0x81/0xa0
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
Sep 29 14:49:16 host-117 kernel: [<ffffffffa0387273>] mpage_prepare_extent_to_map+0x1e3/0x2e0 [ext4]
Sep 29 14:49:16 host-117 kernel: [<ffffffffa038b87b>] ext4_writepages+0x45b/0xd60 [ext4]
Sep 29 14:49:16 host-117 kernel: [<ffffffff812ab683>] ? selinux_socket_sendmsg+0x23/0x30
Sep 29 14:49:16 host-117 kernel: [<ffffffff810c95a8>] ? __enqueue_entity+0x78/0x80
Sep 29 14:49:16 host-117 kernel: [<ffffffff8118d24e>] do_writepages+0x1e/0x40
Sep 29 14:49:16 host-117 kernel: [<ffffffff81228670>] __writeback_single_inode+0x40/0x210
Sep 29 14:49:16 host-117 kernel: [<ffffffff8122935e>] writeback_sb_inodes+0x25e/0x420
Sep 29 14:49:16 host-117 kernel: [<ffffffff81229c9f>] wb_writeback+0xff/0x2f0
Sep 29 14:49:16 host-117 kernel: [<ffffffff810ab776>] ? set_worker_desc+0x86/0xb0
Sep 29 14:49:16 host-117 kernel: [<ffffffff8122bc45>] bdi_writeback_workfn+0x115/0x460
Sep 29 14:49:16 host-117 kernel: [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
Sep 29 14:49:16 host-117 kernel: [<ffffffff810a8d76>] worker_thread+0x126/0x410
Sep 29 14:49:16 host-117 kernel: [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Sep 29 14:49:16 host-117 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140

Version-Release number of selected component (if applicable):
3.10.0-510.el7.x86_64
lvm2-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
lvm2-libs-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
lvm2-cluster-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-libs-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-event-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-event-libs-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016