Hide Forgot
Description of problem: Hit this issue while attempting to mkfs a HA LVM mirror. Test Output: Creating PVs out of /dev/etherd/e1.1p1 /dev/etherd/e1.1p2 /dev/etherd/e1.1p3 /dev/etherd/e1.1p4 /dev/etherd/e1.1p5 Creating VG out of /dev/etherd/e1.1p1 /dev/etherd/e1.1p2 /dev/etherd/e1.1p3 /dev/etherd/e1.1p4 /dev/etherd/e1.1p5 Creating LV(s) on VG HAYES (with hayes-01 tag for activation) WARNING: New mirror won't be synchronised. Don't read what you didn't write! WARNING: New mirror won't be synchronised. Don't read what you didn't write! Creating ext3 filesystems on LV(s) mke2fs 1.41.12 (17-May-2010) [root@hayes-01 ~]# lvs -a -o +devices [appears stuck] [root@hayes-02 ~]# lvs -a -o +devices LV VG Attr LSize Log Copy% Devices ha1 HAYES Mwi--- 25.00g ha1_mlog ha1_mimage_0(0),ha1_mimage_1(0) [ha1_mimage_0] HAYES Iwi--- 25.00g /dev/etherd/e1.1p1(0) [ha1_mimage_1] HAYES Iwi--- 25.00g /dev/etherd/e1.1p2(0) [ha1_mlog] HAYES lwi--- 4.00m /dev/etherd/e1.1p5(0) ha2 HAYES Mwi--- 25.00g ha2_mlog ha2_mimage_0(0),ha2_mimage_1(0) [ha2_mimage_0] HAYES Iwi--- 25.00g /dev/etherd/e1.1p1(6400) [ha2_mimage_1] HAYES Iwi--- 25.00g /dev/etherd/e1.1p2(6400) [ha2_mlog] HAYES lwi--- 4.00m /dev/etherd/e1.1p5(1) aoe: 0030488d63d2 e1.1 v4467 has 19046937725 sectors etherd/e1.1: p1 p2 p3 p4 p5 DLM (built Feb 22 2011 11:17:55) installed INFO: task kmirrord:2436 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kmirrord D 0000000000000003 0 2436 2 0x00000080 ffff880219f1fb40 0000000000000046 0000000000000000 ffff8801e3728e00 ffff880219f1fb60 ffffffffa0009413 0000000019f1fb70 00000000ffff3f2a ffff880219198638 ffff880219f1ffd8 000000000000f558 ffff880219198638 Call Trace: [<ffffffffa0009413>] ? dispatch_io+0x233/0x260 [dm_mod] [<ffffffff814d9663>] io_schedule+0x73/0xc0 [<ffffffffa00094d5>] sync_io+0x95/0x110 [dm_mod] [<ffffffffa0009777>] ? dm_io+0x1b7/0x1c0 [dm_mod] [<ffffffffa0008fa0>] ? bvec_next_page+0x0/0x10 [dm_mod] [<ffffffffa0322209>] ? mirror_flush+0xd9/0x150 [dm_mirror] [<ffffffffa0009777>] dm_io+0x1b7/0x1c0 [dm_mod] [<ffffffffa0009040>] ? vm_get_page+0x0/0x70 [dm_mod] [<ffffffffa0008fb0>] ? vm_next_page+0x0/0x30 [dm_mod] [<ffffffffa02b1781>] disk_flush+0x91/0x170 [dm_log] [<ffffffff8110f34e>] ? mempool_kfree+0xe/0x10 [<ffffffff8110f4b5>] ? mempool_free+0x95/0xa0 [<ffffffffa02dbe63>] dm_rh_update_states+0x2d3/0x2f0 [dm_region_hash] [<ffffffffa03228d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] [<ffffffffa032297d>] do_mirror+0xad/0x6e0 [dm_mirror] [<ffffffffa03228d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] [<ffffffff810883b0>] worker_thread+0x170/0x2a0 [<ffffffff8108dce0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81088240>] ? worker_thread+0x0/0x2a0 [<ffffffff8108d976>] kthread+0x96/0xa0 [<ffffffff8100c1ca>] child_rip+0xa/0x20 [<ffffffff8108d8e0>] ? kthread+0x0/0xa0 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 INFO: task kmirrord:2436 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kmirrord D 0000000000000003 0 2436 2 0x00000080 ffff880219f1fb40 0000000000000046 0000000000000000 ffff8801df137c80 ffff880219f1fb60 ffffffffa0009413 0000000019f1fb70 000000010004031b ffff880219198638 ffff880219f1ffd8 000000000000f558 ffff880219198638 Call Trace: [<ffffffffa0009413>] ? dispatch_io+0x233/0x260 [dm_mod] [<ffffffff814d9663>] io_schedule+0x73/0xc0 [<ffffffffa00094d5>] sync_io+0x95/0x110 [dm_mod] [<ffffffffa0009777>] ? dm_io+0x1b7/0x1c0 [dm_mod] [<ffffffffa0008fa0>] ? bvec_next_page+0x0/0x10 [dm_mod] [<ffffffffa0322209>] ? mirror_flush+0xd9/0x150 [dm_mirror] [<ffffffffa0009777>] dm_io+0x1b7/0x1c0 [dm_mod] [<ffffffffa0009040>] ? vm_get_page+0x0/0x70 [dm_mod] [<ffffffffa0008fb0>] ? vm_next_page+0x0/0x30 [dm_mod] [<ffffffffa02b1781>] disk_flush+0x91/0x170 [dm_log] [<ffffffff8110f34e>] ? mempool_kfree+0xe/0x10 [<ffffffff8110f4b5>] ? mempool_free+0x95/0xa0 [<ffffffffa02dbe63>] dm_rh_update_states+0x2d3/0x2f0 [dm_region_hash] [<ffffffffa03228d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] [<ffffffffa032297d>] do_mirror+0xad/0x6e0 [dm_mirror] [<ffffffffa03228d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] [<ffffffff810883b0>] worker_thread+0x170/0x2a0 [<ffffffff8108dce0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81088240>] ? worker_thread+0x0/0x2a0 [<ffffffff8108d976>] kthread+0x96/0xa0 [<ffffffff8100c1ca>] child_rip+0xa/0x20 [<ffffffff8108d8e0>] ? kthread+0x0/0xa0 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 INFO: task lvs:2559 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. lvs D 0000000000000002 0 2559 2510 0x00000080 ffff880117ba9b88 0000000000000086 0000000000000000 00000000000000d0 ffff880117ba9b38 000000000d2a122a ffff880117ba9b78 000000010004034b ffff880119285078 ffff880117ba9fd8 000000000000f558 ffff880119285078 Call Trace: [<ffffffff814d9663>] io_schedule+0x73/0xc0 [<ffffffff811aa49e>] __blockdev_direct_IO+0x70e/0xc40 [<ffffffff811a80e7>] blkdev_direct_IO+0x57/0x60 [<ffffffff811a72d0>] ? blkdev_get_blocks+0x0/0xc0 [<ffffffff8110eb8b>] generic_file_aio_read+0x6bb/0x700 [<ffffffff8120ae21>] ? avc_has_perm+0x71/0x90 [<ffffffff8120494f>] ? security_inode_permission+0x1f/0x30 [<ffffffff81170a9a>] do_sync_read+0xfa/0x140 [<ffffffff8108dce0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff811a769c>] ? block_ioctl+0x3c/0x40 [<ffffffff81183412>] ? vfs_ioctl+0x22/0xa0 [<ffffffff8121037b>] ? selinux_file_permission+0xfb/0x150 [<ffffffff812037e6>] ? security_file_permission+0x16/0x20 [<ffffffff811714c5>] vfs_read+0xb5/0x1a0 [<ffffffff810d1652>] ? audit_syscall_entry+0x272/0x2a0 [<ffffffff81171601>] sys_read+0x51/0x90 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b Version-Release number of selected component (if applicable): 2.6.32-118.el6.x86_64 lvm2-2.02.83-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011 lvm2-libs-2.02.83-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011 lvm2-cluster-2.02.83-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011 udev-147-2.34.el6 BUILT: Wed Feb 16 08:09:21 CST 2011 device-mapper-1.02.62-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011 device-mapper-libs-1.02.62-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011 device-mapper-event-1.02.62-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011 device-mapper-event-libs-1.02.62-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011 cmirror-2.02.83-2.el6 BUILT: Tue Feb 8 10:10:57 CST 2011
Could you please attach output of: dmsetup info. Looks like some node stayed suspended.
Just reproduced this again. Jul 26 15:15:47 hayes-01 qarshd[3295]: Running cmdline: mkfs.ext3 /dev/HAYES/ha1 Jul 26 15:19:32 hayes-01 kernel: INFO: task kmirrord:2431 blocked for more than 120 seconds. Jul 26 15:19:32 hayes-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jul 26 15:19:32 hayes-01 kernel: kmirrord D 0000000000000002 0 2431 2 0x00000080 Jul 26 15:19:32 hayes-01 kernel: ffff88011a6c3b40 0000000000000046 0000000000000000 ffff8800be8fd6c0 Jul 26 15:19:32 hayes-01 kernel: ffff88011a6c3b60 ffffffffa00094c3 000000001a6c3b70 000000010004dc5f Jul 26 15:19:32 hayes-01 kernel: ffff8801197a5078 ffff88011a6c3fd8 000000000000f598 ffff8801197a5078 Jul 26 15:19:32 hayes-01 kernel: Call Trace: Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa00094c3>] ? dispatch_io+0x233/0x260 [dm_mod] Jul 26 15:19:32 hayes-01 kernel: [<ffffffff814dbad3>] io_schedule+0x73/0xc0 Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa0009585>] sync_io+0x95/0x110 [dm_mod] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa0009827>] ? dm_io+0x1b7/0x1c0 [dm_mod] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa0009050>] ? bvec_next_page+0x0/0x10 [dm_mod] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa0324209>] ? mirror_flush+0xd9/0x150 [dm_mirror] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa0009827>] dm_io+0x1b7/0x1c0 [dm_mod] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa00090f0>] ? vm_get_page+0x0/0x70 [dm_mod] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa0009060>] ? vm_next_page+0x0/0x30 [dm_mod] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa02b3781>] disk_flush+0x91/0x170 [dm_log] Jul 26 15:19:32 hayes-01 kernel: [<ffffffff8110f95e>] ? mempool_kfree+0xe/0x10 Jul 26 15:19:32 hayes-01 kernel: [<ffffffff8110fac5>] ? mempool_free+0x95/0xa0 Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa02dde63>] dm_rh_update_states+0x2d3/0x2f0 [dm_region_hash] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa03248d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa032497d>] do_mirror+0xad/0x6e0 [dm_mirror] Jul 26 15:19:32 hayes-01 kernel: [<ffffffffa03248d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] Jul 26 15:19:32 hayes-01 kernel: [<ffffffff81088830>] worker_thread+0x170/0x2a0 Jul 26 15:19:32 hayes-01 kernel: [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40 Jul 26 15:19:32 hayes-01 kernel: [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0 Jul 26 15:19:32 hayes-01 kernel: [<ffffffff8108ddf6>] kthread+0x96/0xa0 Jul 26 15:19:32 hayes-01 kernel: [<ffffffff8100c1ca>] child_rip+0xa/0x20 Jul 26 15:19:32 hayes-01 kernel: [<ffffffff8108dd60>] ? kthread+0x0/0xa0 Jul 26 15:19:32 hayes-01 kernel: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 Jul 26 15:21:32 hayes-01 kernel: INFO: task kmirrord:2431 blocked for more than 120 seconds. Looks like all the devices are "ACTIVE": [root@hayes-01 ~]# dmsetup info | grep State State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE State: ACTIVE
Bumping priority now that this is being seen more often. Creating LV(s) on VG HAYES (with hayes-01 tag for activation) WARNING: New mirror won't be synchronised. Don't read what you didn't write! WARNING: New mirror won't be synchronised. Don't read what you didn't write! WARNING: New mirror won't be synchronised. Don't read what you didn't write! WARNING: New mirror won't be synchronised. Don't read what you didn't write! Creating ext3 filesystems on LV(s) mke2fs 1.41.12 (17-May-2010) mke2fs 1.41.12 (17-May-2010) INFO: task kmirrord:2725 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kmirrord D 0000000000000002 0 2725 2 0x00000080 ffff88021af79b40 0000000000000046 0000000000000000 ffff8800d9d80800 ffff88021af79b60 ffffffffa00095a3 000000001af79b70 ffffffffa00091d0 ffff88021a8f3078 ffff88021af79fd8 000000000000f4e8 ffff88021a8f3078 Call Trace: [<ffffffffa00095a3>] ? dispatch_io+0x233/0x260 [dm_mod] [<ffffffffa00091d0>] ? vm_get_page+0x0/0x70 [dm_mod] [<ffffffff814e0943>] io_schedule+0x73/0xc0 [<ffffffffa0009665>] sync_io+0x95/0x110 [dm_mod] [<ffffffffa0009907>] ? dm_io+0x1b7/0x1c0 [dm_mod] [<ffffffffa033b209>] ? mirror_flush+0xd9/0x150 [dm_mirror] [<ffffffffa0009907>] dm_io+0x1b7/0x1c0 [dm_mod] [<ffffffffa00091d0>] ? vm_get_page+0x0/0x70 [dm_mod] [<ffffffffa0009140>] ? vm_next_page+0x0/0x30 [dm_mod] [<ffffffffa02e9781>] disk_flush+0x91/0x170 [dm_log] [<ffffffff8111002e>] ? mempool_kfree+0xe/0x10 [<ffffffff81110195>] ? mempool_free+0x95/0xa0 [<ffffffffa0331e63>] dm_rh_update_states+0x2d3/0x2f0 [dm_region_hash] [<ffffffffa033b8d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] [<ffffffffa033b97d>] do_mirror+0xad/0x6e0 [dm_mirror] [<ffffffffa033b8d0>] ? do_mirror+0x0/0x6e0 [dm_mirror] [<ffffffff81088e50>] worker_thread+0x170/0x2a0 [<ffffffff8108e780>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81088ce0>] ? worker_thread+0x0/0x2a0 [<ffffffff8108e416>] kthread+0x96/0xa0 [<ffffffff8100c10a>] child_rip+0xa/0x20 [<ffffffff8108e380>] ? kthread+0x0/0xa0 [<ffffffff8100c100>] ? child_rip+0x0/0x20 2.6.32-174.el6.x86_64 lvm2-2.02.83-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011 lvm2-libs-2.02.83-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011 lvm2-cluster-2.02.83-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011 udev-147-2.35.el6 BUILT: Wed Mar 30 07:32:05 CDT 2011 device-mapper-1.02.62-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011 device-mapper-libs-1.02.62-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011 device-mapper-event-1.02.62-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011 device-mapper-event-libs-1.02.62-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011 cmirror-2.02.83-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011
What is the underlying hardware?
Is this still a problem after 2.5 years (and if so can we get fresh diagnostics from a current release), or can we close it?
No reports from the field. No reproducers in > 2yrs. I am closing this bug. Please re-open if you can reproduce.