Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1380833

Summary: mirror allocation/repair deadlock
Product: Red Hat Enterprise Linux 7 Reporter: Corey Marthaler <cmarthal>
Component: lvm2Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Mirroring and RAID QA Contact: cluster-qe <cluster-qe>
Status: CLOSED DUPLICATE Docs Contact:
Severity: high    
Priority: unspecified CC: agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, zkabelac
Version: 7.3   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-10-03 18:38:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Corey Marthaler 2016-09-30 17:13:08 UTC
Description of problem:
================================================================================
Iteration 0.6 started at Thu Sep 29 17:17:49 CDT 2016
================================================================================
ACTUAL LEG ORDER: /dev/sdc1 /dev/sdb1
Scenario kill_non_primary_leg: Kill non primary leg
********* Mirror info for this scenario *********
* mirrors:            mirror_1
* leg devices:        /dev/sdc1 /dev/sdb1
* log devices:        
* failpv(s):          /dev/sdb1
* failnode(s):        host-117
* lvmetad:            1
* leg fault policy:   allocate
* log fault policy:   allocate
*************************************************


Current mirror/raid device structure(s):
  LV                  Attr       LSize   Cpy%Sync Devices                                  
   mirror_1            mwi-aom---   2.00g 100.00   mirror_1_mimage_0(0),mirror_1_mimage_1(0)
   [mirror_1_mimage_0] iwi-aom---   2.00g          /dev/sdc1(0)                             
   [mirror_1_mimage_1] iwi-aom---   2.00g          /dev/sdb1(0)                             
   root                -wi-ao----   6.20g          /dev/vda2(205)                           
   swap                -wi-ao---- 820.00m          /dev/vda2(0)                             


PV=/dev/sdb1
        mirror_1_mimage_1: 5.1
PV=/dev/sdb1
        mirror_1_mimage_1: 5.1

Writing verification files (checkit) to mirror(s) on...
        ---- host-117 ----

<start name="host-117_mirror_1"  pid="15277" time="Thu Sep 29 17:17:54 2016" type="cmd" />
Sleeping 15 seconds to get some outsanding I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- host-117 ----

Name             GrpID RgID ObjType ArID ArStart ArSize  RMrg/s WMrg/s R/s  W/s  RSz/s WSz/s   AvgRqSz QSize Util% AWait RdAWait WrAWait
host-117_load        0    0 group      0 284.00m 980.00k   0.00   0.00 0.00 2.00     0 980.00k 490.00k  0.13  6.60 10.50    0.00   10.50
Name             GrpID RgID ObjType RgStart RgSize  #Areas ArSize  ProgID 
host-117_load        0    0 group   284.00m 980.00k      1 980.00k dmstats

Disabling device sdb on host-117rescan device...
  /dev/sdb1: read failed after 0 of 512 at 21467824128: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 21467938816: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sdb1: read failed after 0 of 2048 at 0: Input/output error


Getting recovery check start time from /var/log/messages: Sep 29 17:18
Attempting I/O to cause mirror down conversion(s) on host-117
dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.265292 s, 158 MB/s

[deadlock]


  LV                  VG            Attr       LSize   Cpy%Sync Devices                                  
  mirror_1            revolution_9  mwi-aom---   2.00g 0.00     mirror_1_mimage_0(0),mirror_1_mimage_1(0)
  [mirror_1_mimage_0] revolution_9  Iwi-aom---   2.00g          /dev/sdc1(0)                             
  [mirror_1_mimage_1] revolution_9  Iwi-aom---   2.00g          /dev/sdb1(0)                             




Sep 29 17:18:19 host-117 lvm[3828]: Log device 253:2 has failed (D).
Sep 29 17:18:19 host-117 lvm[3828]: Device failure in revolution_9-mirror_1.
Sep 29 17:18:19 host-117 lvm[3828]: WARNING: Device for PV 3Ho10n-LX0i-OVfp-48NY-l1Md-1is9-mUfiuT not found or rejected by a filter.
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 lvm[3828]: /dev/sdb1: read failed after 0 of 512 at 4096: Input/output error
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 lvm[3828]: /dev/sdb1: read failed after 0 of 512 at 4096: Input/output error
Sep 29 17:18:19 host-117 lvm[3828]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 lvm[3828]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 lvm[3828]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 17:18:19 host-117 multipathd: dm-3: remove map (uevent)
Sep 29 17:18:19 host-117 multipathd: dm-3: devmap not registered, can't remove
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 lvm[3828]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 17:18:19 host-117 multipathd: dm-3: remove map (uevent)
Sep 29 17:18:19 host-117 multipathd: dm-2: remove map (uevent)
Sep 29 17:18:19 host-117 multipathd: dm-2: devmap not registered, can't remove
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 lvm[3828]: Mirror status: 1 of 2 images failed.
Sep 29 17:18:19 host-117 lvm[3828]: Trying to up-convert to 2 images, 0 logs.
Sep 29 17:18:19 host-117 lvm[3828]: WARNING: Failed to write an MDA of VG revolution_9.
Sep 29 17:18:19 host-117 multipathd: dm-2: remove map (uevent)
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device
Sep 29 17:18:19 host-117 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Sep 29 17:18:19 host-117 kernel: sd 2:0:0:1: rejecting I/O to offline device


Sep 29 17:18:34 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:18:49 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:19:04 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:19:19 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:19:34 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:19:49 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:20:01 host-117 systemd: Started Session 3 of user root.
Sep 29 17:20:01 host-117 systemd: Starting Session 3 of user root.
Sep 29 17:20:04 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:20:19 host-117 lvm[3828]: revolution_9/mirror_1: Converted: 0.00%
Sep 29 17:20:24 host-117 kernel: INFO: task jbd2/dm-6-8:3964 blocked for more than 120 seconds.
Sep 29 17:20:24 host-117 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 29 17:20:24 host-117 kernel: jbd2/dm-6-8     D ffff880028a99400     0  3964      2 0x00000080
Sep 29 17:20:24 host-117 kernel: ffff88003abefc90 0000000000000046 ffff880028081f60 ffff88003abeffd8
Sep 29 17:20:24 host-117 kernel: ffff88003abeffd8 ffff88003abeffd8 ffff880028081f60 ffff88003abefda0
Sep 29 17:20:24 host-117 kernel: ffff88003d3de8c0 ffff880028081f60 ffff88003abefd88 ffff880028a99400
Sep 29 17:20:24 host-117 kernel: Call Trace:
Sep 29 17:20:24 host-117 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Sep 29 17:20:24 host-117 kernel: [<ffffffffa037d238>] jbd2_journal_commit_transaction+0x248/0x19a0 [jbd2]
Sep 29 17:20:24 host-117 kernel: [<ffffffff810ca86e>] ? account_entity_dequeue+0xae/0xd0
Sep 29 17:20:24 host-117 kernel: [<ffffffff810ce35c>] ? dequeue_entity+0x11c/0x5d0
Sep 29 17:20:24 host-117 kernel: [<ffffffff81060abf>] ? kvm_clock_read+0x1f/0x30
Sep 29 17:20:24 host-117 kernel: [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
Sep 29 17:20:24 host-117 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Sep 29 17:20:24 host-117 kernel: [<ffffffff8109807e>] ? try_to_del_timer_sync+0x5e/0x90
Sep 29 17:20:24 host-117 kernel: [<ffffffffa0382e99>] kjournald2+0xc9/0x260 [jbd2]
Sep 29 17:20:24 host-117 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Sep 29 17:20:24 host-117 kernel: [<ffffffffa0382dd0>] ? commit_timeout+0x10/0x10 [jbd2]
Sep 29 17:20:24 host-117 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Sep 29 17:20:24 host-117 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Sep 29 17:20:24 host-117 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Sep 29 17:20:24 host-117 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Sep 29 17:20:24 host-117 kernel: INFO: task kworker/u2:2:4786 blocked for more than 120 seconds.
Sep 29 17:20:24 host-117 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 29 17:20:24 host-117 kernel: kworker/u2:2    D ffffffff81689410     0  4786      2 0x00000080
Sep 29 17:20:24 host-117 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:6)
Sep 29 17:20:24 host-117 kernel: ffff8800224eb830 0000000000000046 ffff880020d48000 ffff8800224ebfd8
Sep 29 17:20:24 host-117 kernel: ffff8800224ebfd8 ffff8800224ebfd8 ffff880020d48000 ffff88003fc16c40
Sep 29 17:20:24 host-117 kernel: 0000000000000000 7fffffffffffffff ffff88003ff5e6e8 ffffffff81689410
Sep 29 17:20:24 host-117 kernel: Call Trace:
Sep 29 17:20:24 host-117 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Sep 29 17:20:24 host-117 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Sep 29 17:20:24 host-117 kernel: [<ffffffff81688de9>] schedule_timeout+0x239/0x2d0
Sep 29 17:20:24 host-117 kernel: [<ffffffff815b6513>] ? ip_output+0x73/0xe0
Sep 29 17:20:24 host-117 kernel: [<ffffffff81060aef>] ? kvm_clock_get_cycles+0x1f/0x30
Sep 29 17:20:24 host-117 kernel: [<ffffffff810eaa6c>] ? ktime_get_ts64+0x4c/0xf0
Sep 29 17:20:24 host-117 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Sep 29 17:20:24 host-117 kernel: [<ffffffff8168a93e>] io_schedule_timeout+0xae/0x130
Sep 29 17:20:24 host-117 kernel: [<ffffffff8168a9d8>] io_schedule+0x18/0x20
Sep 29 17:20:24 host-117 kernel: [<ffffffff81689421>] bit_wait_io+0x11/0x50
Sep 29 17:20:24 host-117 kernel: [<ffffffff81688f45>] __wait_on_bit+0x65/0x90
Sep 29 17:20:24 host-117 kernel: [<ffffffff8117fb61>] wait_on_page_bit+0x81/0xa0
Sep 29 17:20:24 host-117 kernel: [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
Sep 29 17:20:24 host-117 kernel: [<ffffffffa039d273>] mpage_prepare_extent_to_map+0x1e3/0x2e0 [ext4]
Sep 29 17:20:24 host-117 kernel: [<ffffffffa03a187b>] ext4_writepages+0x45b/0xd60 [ext4]
Sep 29 17:20:24 host-117 kernel: [<ffffffff812ab683>] ? selinux_socket_sendmsg+0x23/0x30
Sep 29 17:20:24 host-117 kernel: [<ffffffff810c95a8>] ? __enqueue_entity+0x78/0x80
Sep 29 17:20:24 host-117 kernel: [<ffffffff8118d24e>] do_writepages+0x1e/0x40
Sep 29 17:20:24 host-117 kernel: [<ffffffff81228670>] __writeback_single_inode+0x40/0x210
Sep 29 17:20:24 host-117 kernel: [<ffffffff8122935e>] writeback_sb_inodes+0x25e/0x420
Sep 29 17:20:24 host-117 kernel: [<ffffffff81229c9f>] wb_writeback+0xff/0x2f0
Sep 29 17:20:24 host-117 kernel: [<ffffffff810ab776>] ? set_worker_desc+0x86/0xb0
Sep 29 17:20:24 host-117 kernel: [<ffffffff8122bc45>] bdi_writeback_workfn+0x115/0x460
Sep 29 17:20:24 host-117 kernel: [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
Sep 29 17:20:24 host-117 kernel: [<ffffffff810a8d76>] worker_thread+0x126/0x410


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Corey Marthaler 2016-09-30 17:13:40 UTC
3.10.0-510.el7.x86_64

lvm2-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
lvm2-libs-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
lvm2-cluster-2.02.166-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-libs-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-event-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-event-libs-1.02.135-1.el7    BUILT: Wed Sep 28 02:26:52 CDT 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016

Comment 3 Corey Marthaler 2016-10-03 16:10:49 UTC
This is reproducible easily.  It appears that lvm may be using the failed device for allocation of the new device?

================================================================================
Iteration 0.2 started at Mon Oct  3 10:52:58 CDT 2016
================================================================================
ACTUAL LEG ORDER: /dev/sda1 /dev/sdc1 /dev/sde1
Scenario kill_primary_leg: Kill primary leg
********* Mirror info for this scenario *********
* mirrors:            mirror_1
* leg devices:        /dev/sda1 /dev/sdc1 /dev/sde1
* log devices:        /dev/sdg1
* failpv(s):          /dev/sda1
* failnode(s):        host-122
* lvmetad:            1
* leg fault policy:   allocate
* log fault policy:   allocate
*************************************************
                                                                                                                                                                                                                  
Current mirror/raid device structure(s):                                                                                                                                                                          
  LV                  Attr       LSize Cpy%Sync Devices                                                                                                                                                           
  mirror_1            mwi-aom--- 2.00g 100.00   mirror_1_mimage_0(0),mirror_1_mimage_1(0),mirror_1_mimage_2(0)                                                                                                    
  [mirror_1_mimage_0] iwi-aom--- 2.00g          /dev/sda1(0)                                                                                                                                                      
  [mirror_1_mimage_1] iwi-aom--- 2.00g          /dev/sdc1(0)                                                                                                                                                      
  [mirror_1_mimage_2] iwi-aom--- 2.00g          /dev/sde1(0)                                                                                                                                                      
  [mirror_1_mlog]     lwi-aom--- 4.00m          /dev/sdg1(0)

PV=/dev/sda1
        mirror_1_mimage_0: 4.1
PV=/dev/sda1
        mirror_1_mimage_0: 4.1

Writing verification files (checkit) to mirror(s) on...
        ---- host-122 ----

<start name="host-122_mirror_1"  pid="9561" time="Mon Oct  3 10:53:05 2016" type="cmd" />
Sleeping 15 seconds to get some outsanding I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- host-122 ----


Disabling device sda on host-122rescan device...
  /dev/sda1: read failed after 0 of 512 at 21467824128: Input/output error
  /dev/sda1: read failed after 0 of 512 at 21467938816: Input/output error
  /dev/sda1: read failed after 0 of 512 at 0: Input/output error
  /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sda1: read failed after 0 of 2048 at 0: Input/output error


Getting recovery check start time from /var/log/messages: Oct  3 10:53
Attempting I/O to cause mirror down conversion(s) on host-122
dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.139669 s, 300 MB/s

[root@host-122 ~]# lvs -a -o +devices
  LV                     VG            Attr       LSize    Log             Cpy%Sync Convert                Devices
  mirror_1               revolution_9  cwi-aom---   2.00g                  0.00     [mirror_1_mimagetmp_3] mirror_1_mimagetmp_3(0),mirror_1_mimage_3(0)
  [mirror_1_mimage_1]    revolution_9  iwi-aom---   2.00g                                                  /dev/sdc1(0)
  [mirror_1_mimage_2]    revolution_9  iwi-aom---   2.00g                                                  /dev/sde1(0)
  [mirror_1_mimage_3]    revolution_9  Iwi-aom---   2.00g                                                  /dev/sda1(0)
  [mirror_1_mimagetmp_3] revolution_9  mwi-aom---   2.00g  [mirror_1_mlog] 100.00                          mirror_1_mimage_1(0),mirror_1_mimage_2(0)   
  [mirror_1_mlog]        revolution_9  lwi-aom---   4.00m                                                  /dev/sdg1(0)


# LOG

Oct  3 10:53:27 host-122 qarshd[4743]: Running cmdline: echo offline > /sys/block/sda/device/state
Oct  3 10:53:28 host-122 qarshd[4747]: Running cmdline: pvscan --cache /dev/sda1
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
[...]
Oct  3 10:53:28 host-122 lvm[1278]: Log device 253:2 has failed (D).
Oct  3 10:53:28 host-122 lvm[1278]: Device failure in revolution_9-mirror_1.
Oct  3 10:53:28 host-122 lvm[1278]: WARNING: Device for PV EvH7eU-edwm-qhxJ-w3LI-PgGP-s3xE-pt0sGE not found or rejected by a filter.
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:28 host-122 lvm[1278]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: /dev/sda1: read failed after 0 of 512 at 4096: Input/output error
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 multipathd: dm-3: remove map (uevent)
Oct  3 10:53:29 host-122 multipathd: dm-3: devmap not registered, can't remove
Oct  3 10:53:29 host-122 multipathd: dm-3: remove map (uevent)
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: Mirror status: 1 of 3 images failed.
Oct  3 10:53:29 host-122 lvm[1278]: Trying to up-convert to 3 images, 1 logs.
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 lvm[1278]: WARNING: Failed to write an MDA of VG revolution_9.
Oct  3 10:53:29 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:29 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 lvm[1278]: WARNING: Failed to replace 1 of 3 images in volume revolution_9/mirror_1.
Oct  3 10:53:30 host-122 lvm[1278]: WARNING: Failed to replace 1 of 1 logs in volume revolution_9/mirror_1.
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
Oct  3 10:53:30 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:30 host-122 kernel: device-mapper: raid1: Write error during recovery (error = 0x1)
[...]
Oct  3 10:53:41 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:41 host-122 kernel: sd 3:0:0:1: rejecting I/O to offline device
Oct  3 10:53:45 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:00 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:15 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:30 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:54:45 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:00 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:15 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:30 host-122 lvm[1278]: revolution_9/mirror_1: Converted: 0.00%
Oct  3 10:55:42 host-122 kernel: INFO: task jbd2/dm-6-8:4143 blocked for more than 120 seconds.
Oct  3 10:55:42 host-122 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  3 10:55:42 host-122 kernel: jbd2/dm-6-8     D ffff880027d54400     0  4143      2 0x00000080
Oct  3 10:55:42 host-122 kernel: ffff88003b403c90 0000000000000046 ffff88003aa94e70 ffff88003b403fd8
Oct  3 10:55:42 host-122 kernel: ffff88003b403fd8 ffff88003b403fd8 ffff88003aa94e70 ffff88003b403da0
Oct  3 10:55:42 host-122 kernel: ffff88003bd4a8c0 ffff88003aa94e70 ffff88003b403d88 ffff880027d54400
Oct  3 10:55:42 host-122 kernel: Call Trace:
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03a4238>] jbd2_journal_commit_transaction+0x248/0x19a0 [jbd2]
Oct  3 10:55:42 host-122 kernel: [<ffffffff810ca86e>] ? account_entity_dequeue+0xae/0xd0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810ce35c>] ? dequeue_entity+0x11c/0x5d0
Oct  3 10:55:42 host-122 kernel: [<ffffffff81060abf>] ? kvm_clock_read+0x1f/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff8109807e>] ? try_to_del_timer_sync+0x5e/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03a9e99>] kjournald2+0xc9/0x260 [jbd2]
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03a9dd0>] ? commit_timeout+0x10/0x10 [jbd2]
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct  3 10:55:42 host-122 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct  3 10:55:42 host-122 kernel: INFO: task kworker/u2:4:4224 blocked for more than 120 seconds.
Oct  3 10:55:42 host-122 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  3 10:55:42 host-122 kernel: kworker/u2:4    D ffffffff81689410     0  4224      2 0x00000080
Oct  3 10:55:42 host-122 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:6)
Oct  3 10:55:42 host-122 kernel: ffff88001f48f830 0000000000000046 ffff88003aa96dd0 ffff88001f48ffd8
Oct  3 10:55:42 host-122 kernel: ffff88001f48ffd8 ffff88001f48ffd8 ffff88003aa96dd0 ffff88003fc16c40
Oct  3 10:55:42 host-122 kernel: 0000000000000000 7fffffffffffffff ffff88003ff5b6e8 ffffffff81689410
Oct  3 10:55:42 host-122 kernel: Call Trace:
Oct  3 10:55:42 host-122 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168b399>] schedule+0x29/0x70
Oct  3 10:55:42 host-122 kernel: [<ffffffff81688de9>] schedule_timeout+0x239/0x2d0
Oct  3 10:55:42 host-122 kernel: [<ffffffff815b6513>] ? ip_output+0x73/0xe0
Oct  3 10:55:42 host-122 kernel: [<ffffffff81060aef>] ? kvm_clock_get_cycles+0x1f/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff810eaa6c>] ? ktime_get_ts64+0x4c/0xf0
Oct  3 10:55:42 host-122 kernel: [<ffffffff81689410>] ? bit_wait+0x50/0x50
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168a93e>] io_schedule_timeout+0xae/0x130
Oct  3 10:55:42 host-122 kernel: [<ffffffff8168a9d8>] io_schedule+0x18/0x20
Oct  3 10:55:42 host-122 kernel: [<ffffffff81689421>] bit_wait_io+0x11/0x50
Oct  3 10:55:42 host-122 kernel: [<ffffffff81688f45>] __wait_on_bit+0x65/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffff8117fb61>] wait_on_page_bit+0x81/0xa0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03c4273>] mpage_prepare_extent_to_map+0x1e3/0x2e0 [ext4]
Oct  3 10:55:42 host-122 kernel: [<ffffffffa03c887b>] ext4_writepages+0x45b/0xd60 [ext4]
Oct  3 10:55:42 host-122 kernel: [<ffffffff812ab683>] ? selinux_socket_sendmsg+0x23/0x30
Oct  3 10:55:42 host-122 kernel: [<ffffffff810c95a8>] ? __enqueue_entity+0x78/0x80
Oct  3 10:55:42 host-122 kernel: [<ffffffff8118d24e>] do_writepages+0x1e/0x40
Oct  3 10:55:42 host-122 kernel: [<ffffffff81228670>] __writeback_single_inode+0x40/0x210
Oct  3 10:55:42 host-122 kernel: [<ffffffff8122935e>] writeback_sb_inodes+0x25e/0x420
Oct  3 10:55:42 host-122 kernel: [<ffffffff81229c9f>] wb_writeback+0xff/0x2f0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810ab776>] ? set_worker_desc+0x86/0xb0
Oct  3 10:55:42 host-122 kernel: [<ffffffff8122bc45>] bdi_writeback_workfn+0x115/0x460
Oct  3 10:55:42 host-122 kernel: [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
Oct  3 10:55:42 host-122 kernel: [<ffffffff810a8d76>] worker_thread+0x126/0x410
Oct  3 10:55:42 host-122 kernel: [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Oct  3 10:55:42 host-122 kernel: [<ffffffff81696218>] ret_from_fork+0x58/0x90
Oct  3 10:55:42 host-122 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140

Comment 4 Corey Marthaler 2016-10-03 18:38:39 UTC
same scenarios and appears to be the same issue, marking as dup of 1380521.

*** This bug has been marked as a duplicate of bug 1380521 ***