Bug 1380521 - mirror type conversion/repair deadlock after device failure (device-mapper: raid1: Write error during recovery)
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Keywords: Regression, ZStream
Duplicates: 1380833
Depends On:
Blocks: 1385242 1431146
 
Reported: 2016-09-29 20:22 UTC by Corey Marthaler
Modified: 2017-08-02 08:13 UTC
13 users

Interaction problems no longer occur with the *lvmetad* daemon when `mirror` segment type is used

Previously, when the legacy `mirror` segment type was used to create mirrored logical volumes with 3 or more legs, interaction problems could occur with the *lvmetad* daemon. Problems observed occurred only after a second device failure, when mirror fault policies were set to the non-default `allocate` option, when *lvmetad* was used, and there had been no reboot of the machine between device failure events. This bug has been fixed, and the described interaction problems no longer occur.
Clone Of:
Cloned to: 1431146
Last Closed: 2017-08-01 21:47:18 UTC


Attachments
Detailed output showing allocation from failed PV on second repair (332.23 KB, text/plain), 2016-10-05 19:23 UTC, Jonathan Earl Brassow
typescript from failing run (7.97 KB, text/plain), 2016-10-05 19:50 UTC, Jonathan Earl Brassow
dmeventd -ddd -l -f (808.45 KB, text/plain), 2016-10-05 19:52 UTC, Jonathan Earl Brassow
logs during time of typescript (14.03 KB, text/plain), 2016-10-05 19:56 UTC, Jonathan Earl Brassow
test result (8.57 KB, text/plain), 2017-05-10 08:18 UTC, Roman Bednář


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:2222 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2017-08-01 18:42:41 UTC

Description Corey Marthaler 2016-09-29 20:22:15 UTC
Description of problem:
This may be an outlier as these scenarios normally pass.

================================================================================
Iteration 0.2 started at Thu Sep 29 14:45:47 CDT 2016
================================================================================
ACTUAL LEG ORDER: /dev/sda1 /dev/sdb1 /dev/sdf1
Scenario kill_primary_leg: Kill primary leg
********* Mirror info for this scenario *********
* mirrors:            mirror_1
* leg devices:        /dev/sda1 /dev/sdb1 /dev/sdf1
* log devices:        /dev/sde1
* failpv(s):          /dev/sda1
* failnode(s):        host-117
* lvmetad:            1
* leg fault policy:   allocate
* log fault policy:   allocate
*************************************************


Current mirror/raid device structure(s):
  LV                  Attr       LSize   Cpy%Sync Devices                                                       
   mirror_1            mwi-aom---   2.00g 100.00   mirror_1_mimage_0(0),mirror_1_mimage_1(0),mirror_1_mimage_2(0)
   [mirror_1_mimage_0] iwi-aom---   2.00g          /dev/sda1(0)                                                  
   [mirror_1_mimage_1] iwi-aom---   2.00g          /dev/sdb1(0)                                                  
   [mirror_1_mimage_2] iwi-aom---   2.00g          /dev/sdf1(0)                                                  
   [mirror_1_mlog]     lwi-aom---   4.00m          /dev/sde1(0)                                                  

PV=/dev/sda1
        mirror_1_mimage_0: 4.1
PV=/dev/sda1
        mirror_1_mimage_0: 4.1

Writing verification files (checkit) to mirror(s) on...
        ---- host-117 ----

<start name="host-117_mirror_1"  pid="5009" time="Thu Sep 29 14:45:52 2016" type="cmd" />
Sleeping 15 seconds to get some outstanding I/O locks before the failure
Verifying files (checkit) on mirror(s) on...
        ---- host-117 ----

Disabling device sda on host-117
rescan device...
  /dev/sda1: read failed after 0 of 512 at 21467824128: Input/output error
  /dev/sda1: read failed after 0 of 512 at 21467938816: Input/output error
  /dev/sda1: read failed after 0 of 512 at 0: Input/output error
  /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sda1: read failed after 0 of 2048 at 0: Input/output error


Getting recovery check start time from /var/log/messages: Sep 29 14:46
Attempting I/O to cause mirror down conversion(s) on host-117
dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.258832 s, 162 MB/s




Sep 29 14:46:15 host-117 qarshd[4720]: Running cmdline: echo offline > /sys/block/sda/device/state
Sep 29 14:46:15 host-117 systemd: Started qarsh Per-Connection Server (10.15.80.224:45394).
Sep 29 14:46:15 host-117 systemd: Starting qarsh Per-Connection Server (10.15.80.224:45394)...
Sep 29 14:46:15 host-117 qarshd[4724]: Talking to peer ::ffff:10.15.80.224:45394 (IPv6)
Sep 29 14:46:15 host-117 qarshd[4724]: Running cmdline: pvscan --cache /dev/sda1
Sep 29 14:46:15 host-117 kernel: sd 3:0:0:1: rejecting I/O to offline device
Sep 29 14:46:17 host-117 kernel: sd 3:0:0:1: rejecting I/O to offline device
Sep 29 14:46:17 host-117 kernel: sd 3:0:0:1: rejecting I/O to offline device
Sep 29 14:46:17 host-117 lvm[1236]: Primary mirror device 253:3 has failed (D).
Sep 29 14:46:17 host-117 lvm[1236]: Device failure in revolution_9-mirror_1.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Device for PV XdwUg9-Ky9m-DWod-Pqdh-f8aJ-0x0N-XGW6Z7 not found or rejected by a filter.
Sep 29 14:46:17 host-117 lvm[1236]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Sep 29 14:46:17 host-117 lvm[1236]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 lvm[1236]: Mirror status: 1 of 3 images failed.
Sep 29 14:46:17 host-117 lvm[1236]: Trying to up-convert to 3 images, 1 logs.
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:17 host-117 multipathd: dm-3: remove map (uevent)
Sep 29 14:46:17 host-117 multipathd: dm-3: devmap not registered, can't remove
Sep 29 14:46:17 host-117 multipathd: dm-3: remove map (uevent)
Sep 29 14:46:17 host-117 lvm[1236]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 lvm[1236]: WARNING: Failed to replace 1 of 3 images in volume revolution_9/mirror_1.
Sep 29 14:46:18 host-117 lvm[1236]: WARNING: Failed to replace 1 of 1 logs in volume revolution_9/mirror_1.
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:18 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:21 host-117 qarshd[4832]: Running cmdline: dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
Sep 29 14:46:23 host-117 kernel: recovery_complete: 842 callbacks suppressed
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:23 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 14:46:48 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:03 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:18 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:33 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:47:48 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:03 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:18 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:33 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:48:48 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:49:03 host-117 lvm[1236]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 14:49:16 host-117 kernel: INFO: task kworker/u2:5:1121 blocked for more than 120 seconds.
Sep 29 14:49:16 host-117 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 29 14:49:16 host-117 kernel: kworker/u2:5    D ffffffff81689410     0  1121      2 0x00000080
Sep 29 14:49:16 host-117 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:6)
Sep 29 14:49:16 host-117 kernel: ffff88003ce17830 0000000000000046 ffff880035ee2f10 ffff88003ce17fd8
Sep 29 14:49:16 host-117 kernel: ffff88003ce17fd8 ffff88003ce17fd8 ffff880035ee2f10 ffff88003fc16c40
Sep 29 14:49:16 host-117 kernel: 0000000000000000 7fffffffffffffff ffff88003ff619e8 ffffffff81689410
Sep 29 14:49:16 host-117 kernel: Call Trace:
Sep 29 14:49:16 host-117 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Sep 29 14:49:16 host-117 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Sep 29 14:49:16 host-117 kernel: [<ffffffff81688de9>] schedule_timeout+0x239/0x2d0
Sep 29 14:49:16 host-117 kernel: [<ffffffff815b6513>] ? ip_output+0x73/0xe0
Sep 29 14:49:16 host-117 kernel: [<ffffffff81060aef>] ? kvm_clock_get_cycles+0x1f/0x30
Sep 29 14:49:16 host-117 kernel: [<ffffffff810eaa6c>] ? ktime_get_ts64+0x4c/0xf0
Sep 29 14:49:16 host-117 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Sep 29 14:49:16 host-117 kernel: [<ffffffff8168a93e>] io_schedule_timeout+0xae/0x130
Sep 29 14:49:16 host-117 kernel: [<ffffffff8168a9d8>] io_schedule+0x18/0x20
Sep 29 14:49:16 host-117 kernel: [<ffffffff81689421>] bit_wait_io+0x11/0x50
Sep 29 14:49:16 host-117 kernel: [<ffffffff81688f45>] __wait_on_bit+0x65/0x90
Sep 29 14:49:16 host-117 kernel: [<ffffffff8117fb61>] wait_on_page_bit+0x81/0xa0
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
Sep 29 14:49:16 host-117 kernel: [<ffffffffa0387273>] mpage_prepare_extent_to_map+0x1e3/0x2e0 [ext4]
Sep 29 14:49:16 host-117 kernel: [<ffffffffa038b87b>] ext4_writepages+0x45b/0xd60 [ext4]
Sep 29 14:49:16 host-117 kernel: [<ffffffff812ab683>] ? selinux_socket_sendmsg+0x23/0x30
Sep 29 14:49:16 host-117 kernel: [<ffffffff810c95a8>] ? __enqueue_entity+0x78/0x80
Sep 29 14:49:16 host-117 kernel: [<ffffffff8118d24e>] do_writepages+0x1e/0x40
Sep 29 14:49:16 host-117 kernel: [<ffffffff81228670>] __writeback_single_inode+0x40/0x210
Sep 29 14:49:16 host-117 kernel: [<ffffffff8122935e>] writeback_sb_inodes+0x25e/0x420
Sep 29 14:49:16 host-117 kernel: [<ffffffff81229c9f>] wb_writeback+0xff/0x2f0
Sep 29 14:49:16 host-117 kernel: [<ffffffff810ab776>] ? set_worker_desc+0x86/0xb0
Sep 29 14:49:16 host-117 kernel: [<ffffffff8122bc45>] bdi_writeback_workfn+0x115/0x460
Sep 29 14:49:16 host-117 kernel: [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
Sep 29 14:49:16 host-117 kernel: [<ffffffff810a8d76>] worker_thread+0x126/0x410
Sep 29 14:49:16 host-117 kernel: [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Sep 29 14:49:16 host-117 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Sep 29 14:49:16 host-117 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140


Version-Release number of selected component (if applicable):
3.10.0-510.el7.x86_64

lvm2-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
lvm2-libs-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
lvm2-cluster-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-libs-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-event-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-event-libs-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016

Comment 2 Corey Marthaler 2016-10-03 18:38:39 UTC
*** Bug 1380833 has been marked as a duplicate of this bug. ***

Comment 3 Corey Marthaler 2016-10-03 18:51:50 UTC
This is easily reproducible.  It appears that lvm may be using the failed device as the new allocation/repair device.  Also, should a mirror type volume be reporting "device-mapper: raid1: Write error during recovery"?



================================================================================
Iteration 0.2 started at Mon Oct  3 10:52:58 CDT 2016
================================================================================
ACTUAL LEG ORDER: /dev/sda1 /dev/sdc1 /dev/sde1
Scenario kill_primary_leg: Kill primary leg
********* Mirror info for this scenario *********
* mirrors:            mirror_1
* leg devices:        /dev/sda1 /dev/sdc1 /dev/sde1
* log devices:        /dev/sdg1
* failpv(s):          /dev/sda1
* failnode(s):        host-122
* lvmetad:            1
* leg fault policy:   allocate
* log fault policy:   allocate
*************************************************
                                                                                                                                                                                                                  
Current mirror/raid device structure(s):                                                                                                                                                                          
  LV                  Attr       LSize Cpy%Sync Devices                                                                                                                                                           
  mirror_1            mwi-aom--- 2.00g 100.00   mirror_1_mimage_0(0),mirror_1_mimage_1(0),mirror_1_mimage_2(0)                                                                                                    
  [mirror_1_mimage_0] iwi-aom--- 2.00g          /dev/sda1(0)                                                                                                                                                      
  [mirror_1_mimage_1] iwi-aom--- 2.00g          /dev/sdc1(0)                                                                                                                                                      
  [mirror_1_mimage_2] iwi-aom--- 2.00g          /dev/sde1(0)                                                                                                                                                      
  [mirror_1_mlog]     lwi-aom--- 4.00m          /dev/sdg1(0)

PV=/dev/sda1
        mirror_1_mimage_0: 4.1
PV=/dev/sda1
        mirror_1_mimage_0: 4.1

Writing verification files (checkit) to mirror(s) on...
        ---- host-122 ----

<start name="host-122_mirror_1"  pid="9561" time="Mon Oct  3 10:53:05 2016" type="cmd" />
Sleeping 15 seconds to get some outstanding I/O locks before the failure
Verifying files (checkit) on mirror(s) on...
        ---- host-122 ----


Disabling device sda on host-122
rescan device...
  /dev/sda1: read failed after 0 of 512 at 21467824128: Input/output error
  /dev/sda1: read failed after 0 of 512 at 21467938816: Input/output error
  /dev/sda1: read failed after 0 of 512 at 0: Input/output error
  /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sda1: read failed after 0 of 2048 at 0: Input/output error


Getting recovery check start time from /var/log/messages: Oct  3 10:53
Attempting I/O to cause mirror down conversion(s) on host-122
dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.139669 s, 300 MB/s

[root@host-122 ~]# lvs -a -o +devices
  LV                     VG            Attr       LSize    Log             Cpy%Sync Convert                Devices
  mirror_1               revolution_9  cwi-aom---   2.00g                  0.00     [mirror_1_mimagetmp_3] mirror_1_mimagetmp_3(0),mirror_1_mimage_3(0)
  [mirror_1_mimage_1]    revolution_9  iwi-aom---   2.00g                                                  /dev/sdc1(0)
  [mirror_1_mimage_2]    revolution_9  iwi-aom---   2.00g                                                  /dev/sde1(0)
  [mirror_1_mimage_3]    revolution_9  Iwi-aom---   2.00g                                                  /dev/sda1(0)
  [mirror_1_mimagetmp_3] revolution_9  mwi-aom---   2.00g  [mirror_1_mlog] 100.00                          mirror_1_mimage_1(0),mirror_1_mimage_2(0)   
  [mirror_1_mlog]        revolution_9  lwi-aom---   4.00m                                                  /dev/sdg1(0)


# LOG

Oct  3 10:53:27 host-122 qarshd[4743]: Running cmdline: echo offline > /sys/block/sda/device/state
Oct  3 10:53:28 host-122 qarshd[4747]: Running cmdline: pvscan --cache /dev/sda1
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
[...]
Oct  3 10:53:28 host-122 lvm[1278]: Log device 253:2 has failed (D).
Oct  3 10:53:28 host-122 lvm[1278]: Device failure in revolution_9-mirror_1.
Oct  3 10:53:28 host-122 lvm[1278]: WARNING: Device for PV EvH7eU-edwm-qhxJ-w3LI-PgGP-s3xE-pt0sGE not found or rejected by a filter.
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 lvm[1278]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 multipathd: dm-3: remove map (uevent)
Oct  3 10:53:29 host-122 multipathd: dm-3: devmap not registered, can't remove
Oct  3 10:53:29 host-122 multipathd: dm-3: remove map (uevent)
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: Mirror status: 1 of 3 images failed.
Oct  3 10:53:29 host-122 lvm[1278]: Trying to up-convert to 3 images, 1 logs.
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 lvm[1278]: WARNING: Failed to replace 1 of 3 images in volume revolution_9/mirror_1.
Oct  3 10:53:30 host-122 lvm[1278]: WARNING: Failed to replace 1 of 1 logs in volume revolution_9/mirror_1.
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
[...]
Oct  3 10:53:41 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:41 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:45 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:00 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:15 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:30 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:45 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:00 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:15 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:30 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:42 host-122 kernel: INFO: task jbd2/dm-6-8:4143 blocked for more than 120 seconds.
Oct  3 10:55:42 host-122 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  3 10:55:42 host-122 kernel: jbd2/dm-6-8     D ffff880027d54400     0  4143      2 0x00000080
Oct  3 10:55:42 host-122 kernel: ffff88003b403c90 0000000000000046 ffff88003aa94e70 ffff88003b403fd8
Oct  3 10:55:42 host-122 kernel: ffff88003b403fd8 ffff88003b403fd8 ffff88003aa94e70 ffff88003b403da0
Oct  3 10:55:42 host-122 kernel: ffff88003bd4a8c0 ffff88003aa94e70 ffff88003b403d88 ffff880027d54400
Oct  3 10:55:42 host-122 kernel: Call Trace:
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03a4238>] jbd2_journal_commit_transaction+0x248/0x19a0 [jbd2]
Oct  3 10:55:42 host-122 kernel: [<ffffffff810ca86e>] ? account_entity_dequeue+0xae/0xd0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810ce35c>] ? dequeue_entity+0x11c/0x5d0
Oct  3 10:55:42 host-122 kernel: [<ffffffff81060abf>] ? kvm_clock_read+0x1f/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff8109807e>] ? try_to_del_timer_sync+0x5e/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03a9e99>] kjournald2+0xc9/0x260 [jbd2]
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03a9dd0>] ? commit_timeout+0x10/0x10 [jbd2]
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct  3 10:55:42 host-122 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct  3 10:55:42 host-122 kernel: INFO: task kworker/u2:4:4224 blocked for more than 120 seconds.
Oct  3 10:55:42 host-122 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  3 10:55:42 host-122 kernel: kworker/u2:4    D ffffffff81689410     0  4224      2 0x00000080
Oct  3 10:55:42 host-122 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:6)
Oct  3 10:55:42 host-122 kernel: ffff88001f48f830 0000000000000046 ffff88003aa96dd0 ffff88001f48ffd8
Oct  3 10:55:42 host-122 kernel: ffff88001f48ffd8 ffff88001f48ffd8 ffff88003aa96dd0 ffff88003fc16c40
Oct  3 10:55:42 host-122 kernel: 0000000000000000 7fffffffffffffff ffff88003ff5b6e8 ffffffff81689410
Oct  3 10:55:42 host-122 kernel: Call Trace:
Oct  3 10:55:42 host-122 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Oct  3 10:55:42 host-122 kernel: [<ffffffff81688de9>] schedule_timeout+0x239/0x2d0
Oct  3 10:55:42 host-122 kernel: [<ffffffff815b6513>] ? ip_output+0x73/0xe0
Oct  3 10:55:42 host-122 kernel: [<ffffffff81060aef>] ? kvm_clock_get_cycles+0x1f/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff810eaa6c>] ? ktime_get_ts64+0x4c/0xf0
Oct  3 10:55:42 host-122 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168a93e>] io_schedule_timeout+0xae/0x130
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168a9d8>] io_schedule+0x18/0x20
Oct  3 10:55:42 host-122 kernel: [<ffffffff81689421>] bit_wait_io+0x11/0x50
Oct  3 10:55:42 host-122 kernel: [<ffffffff81688f45>] __wait_on_bit+0x65/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffff8117fb61>] wait_on_page_bit+0x81/0xa0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03c4273>] mpage_prepare_extent_to_map+0x1e3/0x2e0 [ext4]
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03c887b>] ext4_writepages+0x45b/0xd60 [ext4]
Oct  3 10:55:42 host-122 kernel: [<ffffffff812ab683>] ? selinux_socket_sendmsg+0x23/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff810c95a8>] ? __enqueue_entity+0x78/0x80
Oct  3 10:55:42 host-122 kernel: [<ffffffff8118d24e>] do_writepages+0x1e/0x40
Oct  3 10:55:42 host-122 kernel: [<ffffffff81228670>] __writeback_single_inode+0x40/0x210
Oct  3 10:55:42 host-122 kernel: [<ffffffff8122935e>] writeback_sb_inodes+0x25e/0x420
Oct  3 10:55:42 host-122 kernel: [<ffffffff81229c9f>] wb_writeback+0xff/0x2f0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810ab776>] ? set_worker_desc+0x86/0xb0
Oct  3 10:55:42 host-122 kernel: [<ffffffff8122bc45>] bdi_writeback_workfn+0x115/0x460
Oct  3 10:55:42 host-122 kernel: [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
Oct  3 10:55:42 host-122 kernel: [<ffffffff810a8d76>] worker_thread+0x126/0x410
Oct  3 10:55:42 host-122 kernel: [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct  3 10:55:42 host-122 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140

Comment 4 Heinz Mauelshagen 2016-10-04 15:55:21 UTC
With 'mirror_{image,log}_fault_policy = "allocate"' I'm able to reproduce the failing dmeventd repair, but the repair succeeds when I run it manually.

No deadlocks occur.

[root@rhel-7-3 ~]# lvm version
  LVM version:     2.02.166(2)-RHEL7 (2016-09-28)
  Library version: 1.02.135-RHEL7 (2016-09-28)
  Driver version:  4.34.0
[root@rhel-7-3 ~]# uname -r
3.10.0-512.el7.x86_64
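
For reference, the fault policies mentioned above live in the activation section of /etc/lvm/lvm.conf; a minimal fragment matching the tested (non-default) configuration might look like the sketch below:

```
# /etc/lvm/lvm.conf (activation section) -- illustrative fragment only.
# "allocate" tells dmeventd to replace a failed mirror leg/log from
# spare space in the VG, rather than just removing it ("remove").
activation {
    mirror_image_fault_policy = "allocate"
    mirror_log_fault_policy = "allocate"
}
```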

Comment 5 Corey Marthaler 2016-10-04 17:06:59 UTC
1. Note that this never fails on the first failure iteration.


2. Verified that this test case passes with the 7.2.z rpms/kernel, and continues to fail with the older 7.3 rpms but the current kernel. Will try now with an older 7.3 kernel.

Comment 7 Corey Marthaler 2016-10-04 19:21:25 UTC
This also fails with the following rpms/kernel, so apparently it has been around for a while in 7.3.

3.10.0-495.el7.x86_64  Build Date  : Mon 22 Aug 2016 07:01:35 PM CDT

lvm2-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
lvm2-libs-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
lvm2-cluster-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-libs-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-event-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-event-libs-1.02.133-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016
cmirror-2.02.164-3.el7    BUILT: Wed Aug 24 05:20:41 CDT 2016

Comment 9 Jonathan Earl Brassow 2016-10-05 15:03:46 UTC
We've had some success when lvmetad is disabled.  It is likely this will be the work-around.
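
As a sketch of that workaround (an assumption based on the standard lvmetad switch, not a tested procedure from this bug), disabling lvmetad means setting the following in the global section of /etc/lvm/lvm.conf and stopping the lvm2-lvmetad service:

```
# /etc/lvm/lvm.conf (global section) -- assumed workaround sketch.
# 0 disables the lvmetad metadata cache; LVM commands then scan disks directly.
global {
    use_lvmetad = 0
}
```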

The reproducer is:
*) set mirror fault policies to "allocate"
1) create a 3-way mirror (the log being on a separate device, so 4 devices)
    - do not use '--nosync'
2) wait for sync
3) kill the device under *mimage_1
    - dmeventd will respond by down-converting, then up-converting to a 3-way mirror again
4) wait for sync
5) kill the device under *mimage_0
*) POW!
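
The steps above could be sketched as the following shell sequence (a hypothetical illustration only: device names, sizes, and the VG name are made up, the sysfs offline trick mirrors what the test harness in the description does, and the sleep stands in for dmeventd's repair window; it needs root and destroys data on the named devices):

```
# Assumed reproducer sketch for comment 9; names are illustrative.
vgcreate vg /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# 3-way mirror with a disk log on a separate (4th) device; no --nosync.
lvcreate --type mirror -m 2 -L 2G -n mirror_1 vg

# Wait for the initial sync to complete.
until lvs --noheadings -o copy_percent vg/mirror_1 | grep -q 100.00; do
    sleep 5
done

# Kill the device under *mimage_1; dmeventd down-converts then up-converts.
echo offline > /sys/block/sdb/device/state
pvscan --cache /dev/sdb1
sleep 60    # let the repair and resync finish

# Kill the device under *mimage_0 -- the deadlock is observed here.
echo offline > /sys/block/sda/device/state
pvscan --cache /dev/sda1
```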

Comment 10 Jonathan Earl Brassow 2016-10-05 19:23 UTC
Created attachment 1207668 [details]
Detailed output showing allocation from failed PV on second repair

contains verbose output from dmeventd

Comment 12 Jonathan Earl Brassow 2016-10-05 19:50 UTC
Created attachment 1207670 [details]
typescript from failing run

All the commands run and outputs to hit problem of allocating from missing PV.

Comment 13 Jonathan Earl Brassow 2016-10-05 19:52 UTC
Created attachment 1207671 [details]
dmeventd -ddd -l -f

Output from dmeventd during the whole time the 'typescript' above was being captured.

Comment 14 Jonathan Earl Brassow 2016-10-05 19:56 UTC
Created attachment 1207672 [details]
logs during time of typescript

Comment 16 Zdenek Kabelac 2016-10-06 14:08:28 UTC
In my tests:

(using just 2 leg mirror and failing primary leg and doing single byte write to a mirror)

[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  76 MIRROR STATUS : 2 253:5 253:6 20/20 1 AA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c: 153   @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  76 MIRROR STATUS : 2 253:5 253:6 20/20 1 RA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  60 Primary mirror device 253:5 read failed.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c: 153   @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:dm             ioctl/libdm-iface.c:1838       dm waitevent  LVM-DU2y0afw0O8N1cYrNAJcQ2f7EVgKSYfhfSFVrDLVQl4WsSjTDMYf72wByocKytmA [ opencount flush ]   [16384] (*1)
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:dm                      dmeventd.c: 886       Completed waitevent task for @PREFIX@vg-LV1.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  76 MIRROR STATUS : 2 253:5 253:6 20/20 1 AA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c: 153   @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  76 MIRROR STATUS : 2 253:5 253:6 20/20 1 RA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  60 Primary mirror device 253:5 read failed.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c: 153   @PREFIX@vg-LV1 is now in-sync.
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  76 MIRROR STATUS : 2 253:5 253:6 19/20 1 DA 3 disk 253:4 A
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c:  64 Primary mirror device 253:5 has failed (D).
[ 0:07] ## DMEVENTD: [ 0:02] 571b8700:mirr             dmeventd_mirror.c: 156 Device failure in @PREFIX@vg-LV1.



We see that we first get an 'R' error and after that 'D':

raid1: Mirror read failed from 253:5. Trying alternative device.
raid1: Mirror read failed.


So when writes are not 'sector-aligned' and the mirror is forced to read from itself, this 'error' subpath is hit.

Yet I'm not clear how this relates to 'ignore_suspended_devices'.

Comment 17 Heinz Mauelshagen 2016-10-10 14:11:35 UTC
Got fix in testing for the initial read error.

Comment 18 Jonathan Earl Brassow 2016-10-10 20:33:44 UTC
The reproducer is simple:
*) set all mirror fault policies to "allocate"
1) create 3-way mirror
2) wait for sync
3) kill _mimage_1
4) perform 'dd' write to trigger dmeventd auto repair 
5) wait for repair and 100% sync
6) kill _mimage_0
7) perform 'dd' write to trigger dmeventd auto repair 
*) observe problem (depends on failure type used)
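The steps above can be sketched as a shell script. Everything here is hypothetical (device paths, VG/LV names, sizes, which PV ends up under mimage_1) and destructive, so it is guarded to act only when DEV1..DEV4 are set to scratch devices and it runs as root:

```shell
# Reproducer sketch for comment 18. Run ONLY on a disposable test box.
DEV1=${DEV1:-}; DEV2=${DEV2:-}; DEV3=${DEV3:-}; DEV4=${DEV4:-}
if [ -z "$DEV4" ] || [ "$(id -u)" -ne 0 ]; then
    status=skipped
    echo "skipping: set DEV1..DEV4 to scratch block devices and run as root"
else
    status=ran
    vgcreate testvg "$DEV1" "$DEV2" "$DEV3" "$DEV4"
    # 1) 3-way mirror; no --nosync, so a real initial sync happens
    lvcreate --type mirror -m 2 -L 500M -n mirror_lv testvg
    # 2) wait for sync
    while [ "$(lvs --noheadings -o copy_percent testvg/mirror_lv | tr -d ' ')" != "100.00" ]; do
        sleep 5
    done
    # 3) kill the device under mimage_1 (assumed here to be DEV2,
    #    i.e. images allocated in device order)
    echo offline > "/sys/block/$(basename "$DEV2")/device/state"
    # 4) a small direct write triggers the dmeventd auto repair
    dd if=/dev/zero of=/dev/testvg/mirror_lv bs=512 count=1 oflag=direct
    # 5) wait for repair and 100% sync again (loop as above), then
    # 6-7) repeat steps 3-4 for the device under mimage_0 and watch
    #      for the dmeventd segfault / bad allocation described below.
fi
```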

I've tested permutations of 'lvmetad', 'lvmpolld', and 'ignore_suspended_devices'.  The test only fails when 'lvmetad' is used.

When lvmetad is used, if I 'echo offline > /sys/block/$dev/device/state' the device, dmeventd segfaults.  If I 'echo 1 > /sys/block/$dev/device/delete' the device instead, the lvconvert allocates the replacement device from a failed PV (very bad), but only when 'ignore_suspended_devices = 0'.

So 'ignore_suspended_devices = 1' only helps in one case, not all, and we still don't have an explanation for why that is.

Comment 21 Steven J. Levine 2016-10-13 16:19:05 UTC
Slight edit to the doc text, just moving things around: moved sentences so that the issues are listed next to each other, followed by the workaround. Separated the parenthetical comment about raid1 into its own small paragraph.

Comment 23 Heinz Mauelshagen 2017-03-09 19:42:14 UTC
Upstream commit 76f6951c3e8f

Comment 27 Marian Csontos 2017-04-27 14:01:05 UTC
Upstream commit e5b6f2685a46 replaces old 76f6951c3e8f. And there is another on top of that. Looks like the best backport strategy is to take the whole file from upstream. Heinz, could you please correct me if I am wrong?

Comment 28 Marian Csontos 2017-04-27 15:40:43 UTC
...looking at master it would also require 506d88a2ec8c which may require other changes. I will look at that possibility tomorrow.

Comment 30 Roman Bednář 2017-05-10 08:18 UTC
Created attachment 1277565 [details]
test result

I was not able to reproduce this even after more than 40 iterations of the scenario that initially triggered the bug. I also tried the manual reproducer from Comment 18 without hitting the issue, so I'm marking this as verified. Attaching one of the iterations as the test result.



3.10.0-663.el7.x86_64

lvm2-2.02.171-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017
lvm2-libs-2.02.171-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017
lvm2-cluster-2.02.171-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017
device-mapper-1.02.140-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017
device-mapper-libs-1.02.140-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017
device-mapper-event-1.02.140-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017
device-mapper-event-libs-1.02.140-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 17:15:46 CEST 2017
cmirror-2.02.171-1.el7    BUILT: Wed May  3 14:05:13 CEST 2017

Comment 31 Heinz Mauelshagen 2017-06-23 11:14:59 UTC
Roman,

lvm2-2.02.171-7 and kernel-3.10.0-685.el7.x86_64 also tested ok here.

lvmetad is now automatically disabled on repair as a workaround to avoid the cache incoherencies causing these deadlocks (mentioned in comment #9).

Disabling lvmetad was introduced with upstream
commit 506d88a2ec8cc4960b907b2cf34ca5786d060943.
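Before that commit, the workaround from comment 9 could be applied by hand. A sketch for RHEL 7 (unit names as shipped in the el7 lvm2 packages); it is guarded behind an explicit APPLY_WORKAROUND=1 opt-in since it changes system state:

```shell
# Manual pre-fix workaround sketch: stop lvmetad and tell the lvm
# tools not to use it, so repair cannot race with its cache.
if [ "${APPLY_WORKAROUND:-0}" != 1 ] || [ "$(id -u)" -ne 0 ] \
   || ! command -v systemctl >/dev/null 2>&1; then
    status=skipped
    echo "skipping: set APPLY_WORKAROUND=1 and run as root on a RHEL 7 host"
else
    status=ran
    # Stop the daemon and its activation socket, and keep it off.
    systemctl stop lvm2-lvmetad.socket lvm2-lvmetad.service
    systemctl disable lvm2-lvmetad.socket
    # Flip use_lvmetad off in the global section of lvm.conf.
    sed -i 's/^\([[:space:]]*use_lvmetad[[:space:]]*=\).*/\1 0/' /etc/lvm/lvm.conf
fi
```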

Comment 32 errata-xmlrpc 2017-08-01 21:47:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222

