Description of problem:
This seems similar to bug 1219222 and was found while attempting to verify it. However, bug 1219222 was nearly 100% reproducible, whereas this issue has been seen only once in hundreds of iterations.

================================================================================
Iteration 4.17 started at Wed Feb 17 07:06:21 CST 2016
================================================================================
Scenario kill_second_spanned_primary_synced_raid1_2legs: Kill primary leg of synced 2 leg raid1 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_spanned_primary_raid1_2legs_1
* sync:               1
* type:               raid1
* -m |-i value:       1
* leg devices:        /dev/sdf1 /dev/sdc1 /dev/sda1 /dev/sde1
* spanned legs:       1
* failpv(s):          /dev/sda1
* additional snap:    /dev/sdf1
* failnode(s):        host-030.virt.lab.msp.redhat.com
* lvmetad:            0
* raid fault policy:  warn
******************************************************

Creating raids(s) on host-030.virt.lab.msp.redhat.com...
host-030.virt.lab.msp.redhat.com: lvcreate --type raid1 -m 1 -n synced_spanned_primary_raid1_2legs_1 -L 500M black_bird /dev/sdf1:0-62 /dev/sdc1:0-62 /dev/sda1:0-62 /dev/sde1:0-62

EXCLUSIVELY ACTIVATING RAID on host-030.virt.lab.msp.redhat.com

Current mirror/raid device structure(s):
  LV                                              Attr       LSize   Cpy%Sync Devices
  synced_spanned_primary_raid1_2legs_1            rwi-a-r--- 500.00m 0.80     synced_spanned_primary_raid1_2legs_1_rimage_0(0),synced_spanned_primary_raid1_2legs_1_rimage_1(0)
  [synced_spanned_primary_raid1_2legs_1_rimage_0] Iwi-aor--- 500.00m          /dev/sdf1(1)
  [synced_spanned_primary_raid1_2legs_1_rimage_0] Iwi-aor--- 500.00m          /dev/sda1(0)
  [synced_spanned_primary_raid1_2legs_1_rimage_1] Iwi-aor--- 500.00m          /dev/sdc1(1)
  [synced_spanned_primary_raid1_2legs_1_rimage_1] Iwi-aor--- 500.00m          /dev/sde1(0)
  [synced_spanned_primary_raid1_2legs_1_rmeta_0]  ewi-aor---   4.00m          /dev/sdf1(0)
  [synced_spanned_primary_raid1_2legs_1_rmeta_1]  ewi-aor---   4.00m          /dev/sdc1(0)

Waiting until all mirror|raid volumes become fully syncd...
  1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on host-030.virt.lab.msp.redhat.com...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on host-030.virt.lab.msp.redhat.com...

PV=/dev/sda1
        synced_spanned_primary_raid1_2legs_1_rimage_0: 2

Creating a snapshot volume of each of the raids

Writing verification files (checkit) to mirror(s) on...
        ---- host-030.virt.lab.msp.redhat.com ----

<start name="host-030.virt.lab.msp.redhat.com_synced_spanned_primary_raid1_2legs_1" pid="2580" time="Wed Feb 17 07:06:57 2016" type="cmd" />

Sleeping 15 seconds to get some outstanding I/O locks before the failure

Verifying files (checkit) on mirror(s) on...
        ---- host-030.virt.lab.msp.redhat.com ----

Disabling device sda on host-030.virt.lab.msp.redhat.com

simple pvs failed

Feb 17 07:07:19 host-030 qarshd[28377]: Running cmdline: echo offline > /sys/block/sda/device/state
Feb 17 07:07:19 host-030 xinetd[3157]: EXIT: qarsh status=0 pid=28377 duration=0(sec)
Feb 17 07:07:26 host-030 kernel: sd 2:0:0:1: rejecting I/O to offline device
Feb 17 07:07:26 host-030 kernel: md/raid1:mdX: dm-3: rescheduling sector 507904
Feb 17 07:07:26 host-030 kernel: sd 2:0:0:1: rejecting I/O to offline device
Feb 17 07:07:26 host-030 kernel: md/raid1:mdX: dm-3: rescheduling sector 509072
Feb 17 07:07:26 host-030 kernel: sd 2:0:0:1: rejecting I/O to offline device
Feb 17 07:07:26 host-030 kernel: md/raid1:mdX: dm-3: rescheduling sector 884736
Feb 17 07:07:26 host-030 kernel: md/raid1:mdX: dm-3: rescheduling sector 884744
Feb 17 07:07:26 host-030 kernel: sd 2:0:0:1: rejecting I/O to offline device
Feb 17 07:07:26 host-030 kernel: md/raid1:mdX: dm-3: rescheduling sector 917504
Feb 17 07:07:26 host-030 kernel: sd 2:0:0:1: rejecting I/O to offline device
Feb 17 07:07:26 host-030 lvm[13214]: Device #0 of raid1 array, black_bird-synced_spanned_primary_raid1_2legs_1-real, has failed.
Feb 17 07:07:26 host-030 lvm[13214]: /dev/sda1: read failed after 0 of 2048 at 0: Input/output error
Feb 17 07:07:26 host-030 kernel: sd 2:0:0:1: rejecting I/O to offline device
Feb 17 07:07:26 host-030 lvm[13214]: /dev/sda1: read failed after 0 of 512 at 21467824128: Input/output error
[...]
Feb 17 07:07:34 host-030 qarshd[28380]: Running cmdline: pvs -a
[...]
Feb 17 07:09:34 host-030 xinetd[3157]: EXIT: qarsh status=0 pid=28380 duration=120(sec)
Feb 17 07:09:54 host-030 kernel: INFO: task clvmd:7838 blocked for more than 120 seconds.
Feb 17 07:09:54 host-030 kernel: Not tainted 2.6.32-615.el6.x86_64 #1
Feb 17 07:09:54 host-030 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 17 07:09:54 host-030 kernel: clvmd         D 0000000000000000     0  7838      1 0x00000080
Feb 17 07:09:54 host-030 kernel: ffff88003661b5c8 0000000000000086 0000000000000000 0000000300000001
Feb 17 07:09:54 host-030 kernel: 0000000000000000 ffff880014da3e50 0000000000000282 0000000000000003
Feb 17 07:09:54 host-030 kernel: 0000000000000001 0000000000000282 ffff88003a5b7068 ffff88003661bfd8
Feb 17 07:09:54 host-030 kernel: Call Trace:
Feb 17 07:09:54 host-030 kernel: [<ffffffff814290a9>] ? md_wakeup_thread+0x39/0x80
Feb 17 07:09:54 host-030 kernel: [<ffffffffa052267b>] wait_barrier+0xdb/0x260 [raid1]
Feb 17 07:09:54 host-030 kernel: [<ffffffff810a6ac0>] ? autoremove_wake_function+0x0/0x40
Feb 17 07:09:54 host-030 kernel: [<ffffffffa052536f>] make_request+0x8f/0xe50 [raid1]
Feb 17 07:09:54 host-030 kernel: [<ffffffff81130d55>] ? mempool_alloc_slab+0x15/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff81130ef3>] ? mempool_alloc+0x63/0x140
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d611b>] ? bio_alloc_bioset+0x5b/0xf0
Feb 17 07:09:54 host-030 kernel: [<ffffffffa053e018>] raid_map+0x18/0x20 [dm_raid]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa0002568>] __map_bio+0x48/0x140 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa0002cec>] __split_and_process_bio+0x4bc/0x5e0 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d87ca>] ? dio_bio_end_io+0x5a/0x90
Feb 17 07:09:54 host-030 kernel: [<ffffffffa00043d7>] dm_make_request+0x117/0x180 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffff8127ddd0>] generic_make_request+0x240/0x5a0
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d946c>] ? do_direct_IO+0x57c/0xfa0
Feb 17 07:09:54 host-030 kernel: [<ffffffffa0001fe3>] ? dm_merge_bvec+0x63/0x100 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa05ba0db>] ? origin_merge+0x3b/0x50 [dm_snapshot]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa000205c>] ? dm_merge_bvec+0xdc/0x100 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffff8127e1a0>] submit_bio+0x70/0x120
Feb 17 07:09:54 host-030 kernel: [<ffffffff811dab0d>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d6370>] ? blkdev_get_block+0x0/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff811db177>] __blockdev_direct_IO+0x77/0xe0
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d6370>] ? blkdev_get_block+0x0/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d73f7>] blkdev_direct_IO+0x57/0x60
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d6370>] ? blkdev_get_block+0x0/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff8113043b>] generic_file_aio_read+0x6bb/0x700
Feb 17 07:09:54 host-030 kernel: [<ffffffff81242261>] ? avc_has_perm+0x71/0x90
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d68e1>] blkdev_aio_read+0x51/0x80
Feb 17 07:09:54 host-030 kernel: [<ffffffff81199aaa>] do_sync_read+0xfa/0x140
Feb 17 07:09:54 host-030 kernel: [<ffffffff810a6ac0>] ? autoremove_wake_function+0x0/0x40
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d670c>] ? block_ioctl+0x3c/0x40
Feb 17 07:09:54 host-030 kernel: [<ffffffff811af582>] ? vfs_ioctl+0x22/0xa0
Feb 17 07:09:54 host-030 kernel: [<ffffffff81247d9b>] ? selinux_file_permission+0xfb/0x150
Feb 17 07:09:54 host-030 kernel: [<ffffffff8123aa26>] ? security_file_permission+0x16/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff8119a3a5>] vfs_read+0xb5/0x1a0
Feb 17 07:09:54 host-030 kernel: [<ffffffff8119b156>] ? fget_light_pos+0x16/0x50
Feb 17 07:09:54 host-030 kernel: [<ffffffff8119a6f1>] sys_read+0x51/0xb0
Feb 17 07:09:54 host-030 kernel: [<ffffffff810ee45e>] ? __audit_syscall_exit+0x25e/0x290
Feb 17 07:09:54 host-030 kernel: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
Feb 17 07:09:54 host-030 kernel: INFO: task dmeventd:28347 blocked for more than 120 seconds.
Feb 17 07:09:54 host-030 kernel: Not tainted 2.6.32-615.el6.x86_64 #1
Feb 17 07:09:54 host-030 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 17 07:09:54 host-030 kernel: dmeventd      D 0000000000000000     0 28347      1 0x00000080
Feb 17 07:09:54 host-030 kernel: ffff88002aacf5c8 0000000000000086 0000000000000000 0000000300000001
Feb 17 07:09:54 host-030 kernel: 0000000000000000 ffff880014da3e50 0000000000000282 0000000000000003
Feb 17 07:09:54 host-030 kernel: 0000000000000001 0000000000000282 ffff880022ab9ad8 ffff88002aacffd8
Feb 17 07:09:54 host-030 kernel: Call Trace:
Feb 17 07:09:54 host-030 kernel: [<ffffffff814290a9>] ? md_wakeup_thread+0x39/0x80
Feb 17 07:09:54 host-030 kernel: [<ffffffffa052267b>] wait_barrier+0xdb/0x260 [raid1]
Feb 17 07:09:54 host-030 kernel: [<ffffffff810a6ac0>] ? autoremove_wake_function+0x0/0x40
Feb 17 07:09:54 host-030 kernel: [<ffffffffa052536f>] make_request+0x8f/0xe50 [raid1]
Feb 17 07:09:54 host-030 kernel: [<ffffffff81130d55>] ? mempool_alloc_slab+0x15/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff81130ef3>] ? mempool_alloc+0x63/0x140
Feb 17 07:09:54 host-030 kernel: [<ffffffff81296f79>] ? cfq_set_request+0x329/0x560
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d611b>] ? bio_alloc_bioset+0x5b/0xf0
Feb 17 07:09:54 host-030 kernel: [<ffffffffa053e018>] raid_map+0x18/0x20 [dm_raid]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa0002568>] __map_bio+0x48/0x140 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa0002cec>] __split_and_process_bio+0x4bc/0x5e0 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa00043d7>] dm_make_request+0x117/0x180 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffff8127ddd0>] generic_make_request+0x240/0x5a0
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d946c>] ? do_direct_IO+0x57c/0xfa0
Feb 17 07:09:54 host-030 kernel: [<ffffffffa0001fe3>] ? dm_merge_bvec+0x63/0x100 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa05ba0db>] ? origin_merge+0x3b/0x50 [dm_snapshot]
Feb 17 07:09:54 host-030 kernel: [<ffffffffa000205c>] ? dm_merge_bvec+0xdc/0x100 [dm_mod]
Feb 17 07:09:54 host-030 kernel: [<ffffffff8127e1a0>] submit_bio+0x70/0x120
Feb 17 07:09:54 host-030 kernel: [<ffffffff811dab0d>] __blockdev_direct_IO_newtrunc+0xc7d/0x1270
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d6370>] ? blkdev_get_block+0x0/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff811db177>] __blockdev_direct_IO+0x77/0xe0
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d6370>] ? blkdev_get_block+0x0/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d73f7>] blkdev_direct_IO+0x57/0x60
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d6370>] ? blkdev_get_block+0x0/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff8113043b>] generic_file_aio_read+0x6bb/0x700
Feb 17 07:09:54 host-030 kernel: [<ffffffff81242261>] ? avc_has_perm+0x71/0x90
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d68e1>] blkdev_aio_read+0x51/0x80
Feb 17 07:09:54 host-030 kernel: [<ffffffff81199aaa>] do_sync_read+0xfa/0x140
Feb 17 07:09:54 host-030 kernel: [<ffffffff810a6ac0>] ? autoremove_wake_function+0x0/0x40
Feb 17 07:09:54 host-030 kernel: [<ffffffff811d670c>] ? block_ioctl+0x3c/0x40
Feb 17 07:09:54 host-030 kernel: [<ffffffff811af582>] ? vfs_ioctl+0x22/0xa0
Feb 17 07:09:54 host-030 kernel: [<ffffffff81247d9b>] ? selinux_file_permission+0xfb/0x150
Feb 17 07:09:54 host-030 kernel: [<ffffffff8123aa26>] ? security_file_permission+0x16/0x20
Feb 17 07:09:54 host-030 kernel: [<ffffffff8119a3a5>] vfs_read+0xb5/0x1a0
Feb 17 07:09:54 host-030 kernel: [<ffffffff8119b156>] ? fget_light_pos+0x16/0x50
Feb 17 07:09:54 host-030 kernel: [<ffffffff8119a6f1>] sys_read+0x51/0xb0
Feb 17 07:09:54 host-030 kernel: [<ffffffff810ee45e>] ? __audit_syscall_exit+0x25e/0x290
Feb 17 07:09:54 host-030 kernel: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b

Version-Release number of selected component (if applicable):
2.6.32-615.el6.x86_64

lvm2-2.02.141-2.el6                               BUILT: Wed Feb 10 07:49:03 CST 2016
lvm2-libs-2.02.141-2.el6                          BUILT: Wed Feb 10 07:49:03 CST 2016
lvm2-cluster-2.02.141-2.el6                       BUILT: Wed Feb 10 07:49:03 CST 2016
udev-147-2.71.el6                                 BUILT: Wed Feb 10 07:07:17 CST 2016
device-mapper-1.02.115-2.el6                      BUILT: Wed Feb 10 07:49:03 CST 2016
device-mapper-libs-1.02.115-2.el6                 BUILT: Wed Feb 10 07:49:03 CST 2016
device-mapper-event-1.02.115-2.el6                BUILT: Wed Feb 10 07:49:03 CST 2016
device-mapper-event-libs-1.02.115-2.el6           BUILT: Wed Feb 10 07:49:03 CST 2016
device-mapper-persistent-data-0.6.2-0.1.rc1.el6   BUILT: Wed Feb 10 09:52:15 CST 2016
cmirror-2.02.141-2.el6                            BUILT: Wed Feb 10 07:49:03 CST 2016

How reproducible:
Only once so far in many iterations
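For anyone retrying this by hand: per the qarshd log line above, the harness fails the PV by offlining the backing SCSI disk through sysfs (and the usual way to bring it back is to write "running" to the same node). A minimal sketch, assuming the disk name from this iteration (sda); the helpers below only build the command strings rather than executing them, since running them would actually take the disk offline:

```shell
# Build (but do not execute) the sysfs commands used to fail and restore
# the disk backing the failed PV. Offlining the disk is what produces the
# "sd 2:0:0:1: rejecting I/O to offline device" kernel messages above.
offline_cmd() {
    # $1 is the kernel disk name backing the failed PV, e.g. "sda" for /dev/sda1
    printf 'echo offline > /sys/block/%s/device/state\n' "$1"
}

restore_cmd() {
    # Writing "running" returns an offlined SCSI device to service
    printf 'echo running > /sys/block/%s/device/state\n' "$1"
}

offline_cmd sda
restore_cmd sda
```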