| Summary: | Service failed to appear after update to cluster.conf | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Madison Kelly <mkelly> |
| Component: | rgmanager | Assignee: | Ryan McCabe <rmccabe> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.4 | CC: | apl30329, cfeist, cluster-maint, fdinitto, mkelly, tlavigne |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-08-17 15:53:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Madison Kelly
2013-11-14 21:56:43 UTC
Debugging notes:

* I restarted cman and rgmanager on an-c05n02; no change.
* I logged into vm02-win2012 and powered it off. an-c05n01 did not detect it as failed or restart it.
* Live migration of the VM that was seen, vm01-win2008, from an-c05n01 to an-c05n02 worked.
* Stopping rgmanager on an-c05n01, the affected node, never completed (I left it for over 20 minutes).

I tried to fence the node with 'fence_node an-c05n01.alteeve.ca', which worked but took a long time to start. Once an-c05n01 rebooted, it rejoined the cluster and 'clustat' then reported the right list of services.

As an aside, here are the logs from 'an-c05n01', starting with the first VM being added successfully, a successful VM migration and another migration which eventually failed and triggered a fence.

====
Nov 14 14:11:38 an-c05n01 ricci[13654]: Executing '/usr/bin/virsh nodeinfo'
Nov 14 14:11:38 an-c05n01 ricci[13656]: Executing '/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/queue/351299187'
Nov 14 14:11:38 an-c05n01 modcluster: Updating cluster.conf
Nov 14 14:11:38 an-c05n01 corosync[568]: [QUORUM] Members[2]: 1 2
Nov 14 14:11:38 an-c05n01 rgmanager[824]: Reconfiguring
Nov 14 14:11:40 an-c05n01 rgmanager[824]: Initializing vm:vm01-win2008
Nov 14 14:11:40 an-c05n01 rgmanager[824]: vm:vm01-win2008 was added to the config, but I am not initializing it.
Nov 14 14:11:41 an-c05n01 rgmanager[824]: Starting stopped service vm:vm01-win2008
Nov 14 14:11:41 an-c05n01 rgmanager[824]: Service vm:vm01-win2008 started
Nov 14 14:28:45 an-c05n01 rgmanager[824]: Migrating vm:vm01-win2008 to an-c05n02.alteeve.ca
Nov 14 14:28:47 an-c05n01 rgmanager[27042]: [vm] Migrate vm01-win2008 to an-c05n02.alteeve.ca failed:
Nov 14 14:28:47 an-c05n01 rgmanager[27064]: [vm] error: unable to connect to server at 'an-c05n02.alteeve.ca:49152': No route to host
Nov 14 14:28:47 an-c05n01 rgmanager[824]: migrate on vm "vm01-win2008" returned 150 (unspecified)
Nov 14 14:28:47 an-c05n01 rgmanager[824]: Migration of vm:vm01-win2008 to an-c05n02.alteeve.ca failed; return code 150
Nov 14 14:33:02 an-c05n01 rgmanager[824]: Migrating vm:vm01-win2008 to an-c05n02.alteeve.ca
Nov 14 14:33:07 an-c05n01 kernel: vbr2: port 2(vnet0) entering disabled state
Nov 14 14:33:07 an-c05n01 kernel: device vnet0 left promiscuous mode
Nov 14 14:33:07 an-c05n01 kernel: vbr2: port 2(vnet0) entering disabled state
Nov 14 14:33:08 an-c05n01 rgmanager[824]: Migration of vm:vm01-win2008 to an-c05n02.alteeve.ca completed
Nov 14 14:33:09 an-c05n01 ntpd[2268]: Deleting interface #42 vnet0, fe80::fc54:ff:fe8e:6732#123, interface stats: received=0, sent=0, dropped=0, active_time=1829 secs
Nov 14 14:33:18 an-c05n01 kernel: device vnet0 entered promiscuous mode
Nov 14 14:33:18 an-c05n01 kernel: vbr2: port 2(vnet0) entering forwarding state
Nov 14 14:33:21 an-c05n01 ntpd[2268]: Listening on interface #43 vnet0, fe80::fc54:ff:fe8e:6732#123 Enabled
Nov 14 14:33:33 an-c05n01 kernel: vbr2: port 2(vnet0) entering forwarding state
Nov 14 14:34:32 an-c05n01 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Nov 14 14:34:32 an-c05n01 kernel: nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
Nov 14 14:45:10 an-c05n01 kernel: INFO: task glock_workqueue:20413 blocked for more than 120 seconds.
Nov 14 14:45:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:45:10 an-c05n01 kernel: glock_workque D 0000000000000007 0 20413 2 0x00000080 Nov 14 14:45:10 an-c05n01 kernel: ffff8804e12d9c70 0000000000000046 0000000000000000 ffff8803333e5000 Nov 14 14:45:10 an-c05n01 kernel: ffff8805b68d7f00 0000000000002018 ffff8804e12d9c10 ffffffffa036362f Nov 14 14:45:10 an-c05n01 kernel: ffff880638a745f8 ffff8804e12d9fd8 000000000000fb88 ffff880638a745f8 Nov 14 14:45:10 an-c05n01 kernel: Call Trace: Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036362f>] ? gfs2_log_write_buf+0xaf/0xd0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150e953>] io_schedule+0x73/0xc0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa0362dea>] gfs2_log_flush+0x45a/0x680 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035fdfd>] inode_go_sync+0x7d/0x160 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035e966>] do_xmote+0x156/0x280 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150e1c0>] ? thread_return+0x4e/0x76e Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035eb81>] run_queue+0xf1/0x1d0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035f35a>] glock_work_func+0x7a/0x1d0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035f2e0>] ? glock_work_func+0x0/0x1d0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81090be0>] worker_thread+0x170/0x2a0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81090a70>] ? worker_thread+0x0/0x2a0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? 
child_rip+0x0/0x20 Nov 14 14:45:10 an-c05n01 kernel: INFO: task gfs2_logd:2754 blocked for more than 120 seconds. Nov 14 14:45:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:45:10 an-c05n01 kernel: gfs2_logd D 0000000000000004 0 2754 2 0x00000080 Nov 14 14:45:10 an-c05n01 kernel: ffff8801e2b75cf0 0000000000000046 ffff880028296768 ffff88032d9b6078 Nov 14 14:45:10 an-c05n01 kernel: ffff8801e2b75c90 ffffffff810656d3 0000000000000000 ffff88032d9b6078 Nov 14 14:45:10 an-c05n01 kernel: ffff88032d9b65f8 ffff8801e2b75fd8 000000000000fb88 ffff88032d9b65f8 Nov 14 14:45:10 an-c05n01 kernel: Call Trace: Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810656d3>] ? dequeue_entity+0x113/0x2e0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810096f0>] ? __switch_to+0xd0/0x320 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81510725>] rwsem_down_failed_common+0x95/0x1d0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81510883>] rwsem_down_write_failed+0x23/0x30 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff812838c3>] call_rwsem_down_write_failed+0x13/0x20 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150fd82>] ? down_write+0x32/0x40 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa03629bf>] gfs2_log_flush+0x2f/0x680 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa0361b2f>] ? gfs2_ail1_empty+0x14f/0x1b0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa03630e9>] gfs2_logd+0xd9/0x140 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa0363010>] ? gfs2_logd+0x0/0x140 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:45:10 an-c05n01 kernel: INFO: task gfs2_quotad:2755 blocked for more than 120 seconds. 
Nov 14 14:45:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:45:10 an-c05n01 kernel: gfs2_quotad D 0000000000000007 0 2755 2 0x00000080 Nov 14 14:45:10 an-c05n01 kernel: ffff8802b55d7c20 0000000000000046 0000000100000007 ffff88034ac30b60 Nov 14 14:45:10 an-c05n01 kernel: 00000000ffffffff 0000000000000008 0000000000016700 0000000000016700 Nov 14 14:45:10 an-c05n01 kernel: ffff88016f8f3058 ffff8802b55d7fd8 000000000000fb88 ffff88016f8f3058 Nov 14 14:45:10 an-c05n01 kernel: Call Trace: Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150e1c0>] ? thread_return+0x4e/0x76e Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035db15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035efb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81081b5b>] ? try_to_del_timer_sync+0x7b/0xe0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa0377f39>] gfs2_statfs_sync+0x59/0x1c0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150efba>] ? schedule_timeout+0x19a/0x2e0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa0377f31>] ? gfs2_statfs_sync+0x51/0x1c0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036f947>] quotad_check_timeo+0x57/0xb0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036fbd4>] gfs2_quotad+0x234/0x2b0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096da0>] ? 
autoremove_wake_function+0x0/0x40 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036f9a0>] ? gfs2_quotad+0x0/0x2b0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:45:10 an-c05n01 kernel: INFO: task flush-253:1:4989 blocked for more than 120 seconds. Nov 14 14:45:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:45:10 an-c05n01 kernel: flush-253:1 D 0000000000000013 0 4989 2 0x00000080 Nov 14 14:45:10 an-c05n01 kernel: ffff8801ed0039d0 0000000000000046 0000000000000000 ffff8801ed003a08 Nov 14 14:45:10 an-c05n01 kernel: ffffffff81143c27 ffff8801ed003968 0000000000000282 ffff8801ed003978 Nov 14 14:45:10 an-c05n01 kernel: ffff880337647af8 ffff8801ed003fd8 000000000000fb88 ffff880337647af8 Nov 14 14:45:10 an-c05n01 kernel: Call Trace: Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81143c27>] ? handle_pte_fault+0x487/0xb50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8112a3e1>] ? get_page_from_freelist+0x3d1/0x830 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096de0>] ? 
wake_bit_function+0x0/0x50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035db15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035efb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810575d5>] ? select_idle_sibling+0x95/0x150 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa037757e>] gfs2_write_inode+0x2ae/0x320 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa0377575>] ? gfs2_write_inode+0x2a5/0x320 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff811acb6c>] writeback_single_inode+0x20c/0x290 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff811ace4e>] writeback_sb_inodes+0xce/0x180 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff811acfab>] writeback_inodes_wb+0xab/0x1b0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff811ad34b>] wb_writeback+0x29b/0x3f0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150e1c0>] ? thread_return+0x4e/0x76e Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81081be2>] ? del_timer_sync+0x22/0x30 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff811ad645>] wb_do_writeback+0x1a5/0x240 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff811ad743>] bdi_writeback_task+0x63/0x1b0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096c67>] ? bit_waitqueue+0x17/0xd0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8113cc50>] ? bdi_start_fn+0x0/0x100 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8113ccd6>] bdi_start_fn+0x86/0x100 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8113cc50>] ? bdi_start_fn+0x0/0x100 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:45:10 an-c05n01 kernel: INFO: task clusterfs.sh:4995 blocked for more than 120 seconds. Nov 14 14:45:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Nov 14 14:45:10 an-c05n01 kernel: clusterfs.sh D 0000000000000000 0 4995 4993 0x00000080 Nov 14 14:45:10 an-c05n01 kernel: ffff88063959fc88 0000000000000082 0000000000000000 ffffffff81190538 Nov 14 14:45:10 an-c05n01 kernel: ffff88016fbae3c0 0000000600000000 0000000000000000 ffff880600000000 Nov 14 14:45:10 an-c05n01 kernel: ffff8806370fb058 ffff88063959ffd8 000000000000fb88 ffff8806370fb058 Nov 14 14:45:10 an-c05n01 kernel: Call Trace: Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81190538>] ? follow_managed+0x158/0x2e0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035db15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035efb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036e585>] gfs2_getattr+0xb5/0xf0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036e57d>] ? gfs2_getattr+0xad/0xf0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81186d51>] vfs_getattr+0x51/0x80 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81186de0>] vfs_fstatat+0x60/0x80 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81186e6e>] vfs_lstat+0x1e/0x20 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81186e94>] sys_newlstat+0x24/0x50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810dc937>] ? audit_syscall_entry+0x1d7/0x200 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810dc685>] ? 
__audit_syscall_exit+0x265/0x290 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Nov 14 14:45:10 an-c05n01 kernel: INFO: task vm.sh:5097 blocked for more than 120 seconds. Nov 14 14:45:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:45:10 an-c05n01 kernel: vm.sh D 000000000000000d 0 5097 5081 0x00000080 Nov 14 14:45:10 an-c05n01 kernel: ffff8806356e5ac8 0000000000000082 0000000000000000 000200da00000000 Nov 14 14:45:10 an-c05n01 kernel: 0000000000000286 0000000000000030 0000000000000000 ffff880000022dc0 Nov 14 14:45:10 an-c05n01 kernel: ffff88063576f058 ffff8806356e5fd8 000000000000fb88 ffff88063576f058 Nov 14 14:45:10 an-c05n01 kernel: Call Trace: Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035db15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa035efb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036ca3c>] gfs2_permission+0xec/0x100 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffffa036ca34>] ? gfs2_permission+0xe4/0x100 [gfs2] Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8119099d>] __link_path_walk+0xad/0x1030 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81143897>] ? 
handle_pte_fault+0xf7/0xb50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81191baa>] path_walk+0x6a/0xe0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81191d7b>] do_path_lookup+0x5b/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81192a07>] user_path_at+0x57/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81513bfe>] ? do_page_fault+0x3e/0xa0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81186dbc>] vfs_fstatat+0x3c/0x80 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81085271>] ? do_sigaction+0x91/0x1d0 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81186f2b>] vfs_stat+0x1b/0x20 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff81186f54>] sys_newstat+0x24/0x50 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff810dc937>] ? audit_syscall_entry+0x1d7/0x200 Nov 14 14:45:10 an-c05n01 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Nov 14 14:47:10 an-c05n01 kernel: INFO: task glock_workqueue:20413 blocked for more than 120 seconds. Nov 14 14:47:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:47:10 an-c05n01 kernel: glock_workque D 0000000000000007 0 20413 2 0x00000080 Nov 14 14:47:10 an-c05n01 kernel: ffff8804e12d9c70 0000000000000046 0000000000000000 ffff8803333e5000 Nov 14 14:47:10 an-c05n01 kernel: ffff8805b68d7f00 0000000000002018 ffff8804e12d9c10 ffffffffa036362f Nov 14 14:47:10 an-c05n01 kernel: ffff880638a745f8 ffff8804e12d9fd8 000000000000fb88 ffff880638a745f8 Nov 14 14:47:10 an-c05n01 kernel: Call Trace: Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa036362f>] ? gfs2_log_write_buf+0xaf/0xd0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150e953>] io_schedule+0x73/0xc0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa0362dea>] gfs2_log_flush+0x45a/0x680 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096da0>] ? 
autoremove_wake_function+0x0/0x40 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035fdfd>] inode_go_sync+0x7d/0x160 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035e966>] do_xmote+0x156/0x280 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150e1c0>] ? thread_return+0x4e/0x76e Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035eb81>] run_queue+0xf1/0x1d0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035f35a>] glock_work_func+0x7a/0x1d0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035f2e0>] ? glock_work_func+0x0/0x1d0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81090be0>] worker_thread+0x170/0x2a0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81090a70>] ? worker_thread+0x0/0x2a0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:47:10 an-c05n01 kernel: INFO: task gfs2_logd:2754 blocked for more than 120 seconds. Nov 14 14:47:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:47:10 an-c05n01 kernel: gfs2_logd D 0000000000000004 0 2754 2 0x00000080 Nov 14 14:47:10 an-c05n01 kernel: ffff8801e2b75cf0 0000000000000046 ffff880028296768 ffff88032d9b6078 Nov 14 14:47:10 an-c05n01 kernel: ffff8801e2b75c90 ffffffff810656d3 0000000000000000 ffff88032d9b6078 Nov 14 14:47:10 an-c05n01 kernel: ffff88032d9b65f8 ffff8801e2b75fd8 000000000000fb88 ffff88032d9b65f8 Nov 14 14:47:10 an-c05n01 kernel: Call Trace: Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff810656d3>] ? dequeue_entity+0x113/0x2e0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff810096f0>] ? 
__switch_to+0xd0/0x320 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81510725>] rwsem_down_failed_common+0x95/0x1d0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81510883>] rwsem_down_write_failed+0x23/0x30 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff812838c3>] call_rwsem_down_write_failed+0x13/0x20 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150fd82>] ? down_write+0x32/0x40 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa03629bf>] gfs2_log_flush+0x2f/0x680 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa0361b2f>] ? gfs2_ail1_empty+0x14f/0x1b0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa03630e9>] gfs2_logd+0xd9/0x140 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa0363010>] ? gfs2_logd+0x0/0x140 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:47:10 an-c05n01 kernel: INFO: task gfs2_quotad:2755 blocked for more than 120 seconds. Nov 14 14:47:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:47:10 an-c05n01 kernel: gfs2_quotad D 0000000000000007 0 2755 2 0x00000080 Nov 14 14:47:10 an-c05n01 kernel: ffff8802b55d7c20 0000000000000046 0000000100000007 ffff88034ac30b60 Nov 14 14:47:10 an-c05n01 kernel: 00000000ffffffff 0000000000000008 0000000000016700 0000000000016700 Nov 14 14:47:10 an-c05n01 kernel: ffff88016f8f3058 ffff8802b55d7fd8 000000000000fb88 ffff88016f8f3058 Nov 14 14:47:10 an-c05n01 kernel: Call Trace: Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035b8ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150e1c0>] ? thread_return+0x4e/0x76e Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035db15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035efb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81081b5b>] ? try_to_del_timer_sync+0x7b/0xe0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa0377f39>] gfs2_statfs_sync+0x59/0x1c0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150efba>] ? schedule_timeout+0x19a/0x2e0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa0377f31>] ? gfs2_statfs_sync+0x51/0x1c0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa036f947>] quotad_check_timeo+0x57/0xb0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa036fbd4>] gfs2_quotad+0x234/0x2b0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa036f9a0>] ? gfs2_quotad+0x0/0x2b0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:47:10 an-c05n01 kernel: INFO: task flush-253:1:4989 blocked for more than 120 seconds. Nov 14 14:47:10 an-c05n01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Nov 14 14:47:10 an-c05n01 kernel: flush-253:1 D 0000000000000013 0 4989 2 0x00000080 Nov 14 14:47:10 an-c05n01 kernel: ffff8801ed0039d0 0000000000000046 0000000000000000 ffff8801ed003a08 Nov 14 14:47:10 an-c05n01 kernel: ffffffff81143c27 ffff8801ed003968 0000000000000282 ffff8801ed003978 Nov 14 14:47:10 an-c05n01 kernel: ffff880337647af8 ffff8801ed003fd8 000000000000fb88 ffff880337647af8 Nov 14 14:47:10 an-c05n01 kernel: Call Trace: Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81143c27>] ? handle_pte_fault+0x487/0xb50 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035b8ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8112a3e1>] ? get_page_from_freelist+0x3d1/0x830 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035b8a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035db15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa035efb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff810575d5>] ? select_idle_sibling+0x95/0x150 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa037757e>] gfs2_write_inode+0x2ae/0x320 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffffa0377575>] ? gfs2_write_inode+0x2a5/0x320 [gfs2] Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff811acb6c>] writeback_single_inode+0x20c/0x290 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff811ace4e>] writeback_sb_inodes+0xce/0x180 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff811acfab>] writeback_inodes_wb+0xab/0x1b0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff811ad34b>] wb_writeback+0x29b/0x3f0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8150e1c0>] ? 
thread_return+0x4e/0x76e Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81081be2>] ? del_timer_sync+0x22/0x30 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff811ad645>] wb_do_writeback+0x1a5/0x240 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff811ad743>] bdi_writeback_task+0x63/0x1b0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096c67>] ? bit_waitqueue+0x17/0xd0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8113cc50>] ? bdi_start_fn+0x0/0x100 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8113ccd6>] bdi_start_fn+0x86/0x100 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8113cc50>] ? bdi_start_fn+0x0/0x100 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:47:10 an-c05n01 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:50:17 an-c05n01 kernel: block drbd0: sock_recvmsg returned -113 Nov 14 14:50:17 an-c05n01 kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 ) Nov 14 14:50:17 an-c05n01 kernel: block drbd0: short read expecting header on sock: r=-113 Nov 14 14:50:17 an-c05n01 kernel: block drbd0: asender terminated Nov 14 14:50:17 an-c05n01 kernel: block drbd0: Terminating drbd0_asender Nov 14 14:50:17 an-c05n01 kernel: block drbd0: Connection closed Nov 14 14:50:17 an-c05n01 kernel: block drbd0: conn( BrokenPipe -> Unconnected ) Nov 14 14:50:17 an-c05n01 kernel: block drbd0: receiver terminated Nov 14 14:50:17 an-c05n01 kernel: block drbd0: Restarting drbd0_receiver Nov 14 14:50:17 an-c05n01 kernel: block drbd0: receiver (re)started Nov 14 14:50:17 an-c05n01 kernel: block drbd0: conn( Unconnected -> WFConnection ) Nov 14 14:50:17 an-c05n01 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 Nov 14 14:50:17 an-c05n01 rhcs_fence: Attempting to fence peer using RHCS from DRBD... 
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: Handshake successful: Agreed network protocol version 97
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: conn( WFConnection -> WFReportParams )
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: Starting asender thread (from drbd0_receiver [2269])
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: data-integrity-alg: <not-used>
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: drbd_sync_handshake:
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: self D518242FE2FB5DC5:0000000000000000:B99100FEF8DE5D0D:B99000FEF8DE5D0D bits:0 flags:0
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: peer D518242FE2FB5DC5:0000000000000000:B99100FEF8DE5D0D:B99000FEF8DE5D0D bits:0 flags:0
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: uuid_compare()=0 by rule 40
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
Nov 14 14:50:18 an-c05n01 kernel: block drbd0: susp( 1 -> 0 )
Nov 14 14:50:19 an-c05n01 rgmanager[824]: Migrating vm:vm01-win2008 to an-c05n02.alteeve.ca
Nov 14 14:50:19 an-c05n01 rgmanager[6296]: [vm] Migrate vm01-win2008 to an-c05n02.alteeve.ca failed:
Nov 14 14:50:19 an-c05n01 rgmanager[6321]: [vm] error: End of file while reading data: : Input/output error
Nov 14 14:50:19 an-c05n01 rgmanager[824]: migrate on vm "vm01-win2008" returned 150 (unspecified)
Nov 14 14:50:19 an-c05n01 rgmanager[824]: Migration of vm:vm01-win2008 to an-c05n02.alteeve.ca failed; return code 150
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: PingAck did not arrive in time.
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: asender terminated
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: Terminating drbd1_asender
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: Connection closed
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: conn( NetworkFailure -> Unconnected )
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: receiver terminated
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: Restarting drbd1_receiver
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: receiver (re)started
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: conn( Unconnected -> WFConnection )
Nov 14 14:50:27 an-c05n01 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1
Nov 14 14:50:27 an-c05n01 rhcs_fence: Attempting to fence peer using RHCS from DRBD...
Nov 14 14:50:27 an-c05n01 corosync[568]: [TOTEM ] A processor failed, forming new configuration.
Nov 14 14:50:29 an-c05n01 corosync[568]: [QUORUM] Members[1]: 1
Nov 14 14:50:29 an-c05n01 corosync[568]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 14 14:50:29 an-c05n01 corosync[568]: [CPG ] chosen downlist: sender r(0) ip(10.20.50.1) ; members(old:2 left:1)
Nov 14 14:50:29 an-c05n01 corosync[568]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 14 14:50:29 an-c05n01 kernel: dlm: closing connection to node 2
Nov 14 14:50:29 an-c05n01 fenced[629]: fencing node an-c05n02.alteeve.ca
Nov 14 14:50:29 an-c05n01 kernel: GFS2: fsid=an-cluster-05:shared.0: jid=1: Trying to acquire journal lock...
Nov 14 14:50:34 an-c05n01 fenced[629]: fence an-c05n02.alteeve.ca success Nov 14 14:50:34 an-c05n01 fence_node[6003]: fence an-c05n02.alteeve.ca success Nov 14 14:50:34 an-c05n01 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 7 (0x700) Nov 14 14:50:34 an-c05n01 kernel: block drbd0: fence-peer helper returned 7 (peer was stonithed) Nov 14 14:50:34 an-c05n01 kernel: block drbd0: PingAck did not arrive in time. Nov 14 14:50:34 an-c05n01 kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 ) Nov 14 14:50:34 an-c05n01 kernel: block drbd0: asender terminated Nov 14 14:50:34 an-c05n01 kernel: block drbd0: Terminating drbd0_asender Nov 14 14:50:34 an-c05n01 kernel: block drbd0: Connection closed Nov 14 14:50:34 an-c05n01 kernel: block drbd0: conn( NetworkFailure -> Unconnected ) Nov 14 14:50:34 an-c05n01 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 Nov 14 14:50:34 an-c05n01 kernel: block drbd0: receiver terminated Nov 14 14:50:34 an-c05n01 kernel: block drbd0: Restarting drbd0_receiver Nov 14 14:50:34 an-c05n01 kernel: block drbd0: receiver (re)started Nov 14 14:50:34 an-c05n01 kernel: block drbd0: conn( Unconnected -> WFConnection ) Nov 14 14:50:34 an-c05n01 rhcs_fence: Attempting to fence peer using RHCS from DRBD... 
Nov 14 14:50:35 an-c05n01 rgmanager[824]: Marking service:storage_n02 as stopped: Restricted domain unavailable Nov 14 14:50:35 an-c05n01 rgmanager[824]: Marking service:libvirtd_n02 as stopped: Restricted domain unavailable Nov 14 14:50:35 an-c05n01 fence_node[6534]: fence an-c05n02.alteeve.ca success Nov 14 14:50:35 an-c05n01 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1 exit code 7 (0x700) Nov 14 14:50:35 an-c05n01 kernel: block drbd1: fence-peer helper returned 7 (peer was stonithed) Nov 14 14:50:35 an-c05n01 kernel: block drbd1: pdsk( DUnknown -> Outdated ) Nov 14 14:50:35 an-c05n01 kernel: block drbd1: new current UUID 6E4A04368AE808A7:E58DE6AA3626CB4F:80662A6D66C86DCD:80652A6D66C86DCD Nov 14 14:50:35 an-c05n01 kernel: block drbd1: susp( 1 -> 0 ) Nov 14 14:50:50 an-c05n01 fence_node[6651]: fence an-c05n02.alteeve.ca success Nov 14 14:50:50 an-c05n01 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 7 (0x700) Nov 14 14:50:50 an-c05n01 kernel: block drbd0: fence-peer helper returned 7 (peer was stonithed) Nov 14 14:50:50 an-c05n01 kernel: block drbd0: pdsk( DUnknown -> Outdated ) Nov 14 14:50:50 an-c05n01 kernel: block drbd0: new current UUID 9446E186D2EA252F:D518242FE2FB5DC5:B99100FEF8DE5D0D:B99000FEF8DE5D0D Nov 14 14:50:50 an-c05n01 kernel: block drbd0: susp( 1 -> 0 ) Nov 14 14:50:50 an-c05n01 kernel: GFS2: fsid=an-cluster-05:shared.0: jid=1: Looking at journal... Nov 14 14:50:50 an-c05n01 kernel: GFS2: fsid=an-cluster-05:shared.0: jid=1: Done ==== Here are the logs from an-c05n02 during the same time period. ==== Nov 14 14:11:38 an-c05n02 rgmanager[31902]: Reconfiguring Nov 14 14:11:40 an-c05n02 rgmanager[31902]: Initializing vm:vm01-win2008 Nov 14 14:11:40 an-c05n02 rgmanager[31902]: vm:vm01-win2008 was added to the config, but I am not initializing it. 
Nov 14 14:28:46 an-c05n02 kernel: device vnet0 entered promiscuous mode Nov 14 14:28:46 an-c05n02 kernel: vbr2: port 2(vnet0) entering forwarding state Nov 14 14:28:46 an-c05n02 kernel: vbr2: port 2(vnet0) entering disabled state Nov 14 14:28:46 an-c05n02 kernel: device vnet0 left promiscuous mode Nov 14 14:28:46 an-c05n02 kernel: vbr2: port 2(vnet0) entering disabled state Nov 14 14:33:02 an-c05n02 kernel: device vnet0 entered promiscuous mode Nov 14 14:33:02 an-c05n02 kernel: vbr2: port 2(vnet0) entering forwarding state Nov 14 14:33:05 an-c05n02 ntpd[2227]: Listening on interface #21 vnet0, fe80::fc54:ff:fe8e:6732#123 Enabled Nov 14 14:33:17 an-c05n02 kernel: vbr2: port 2(vnet0) entering forwarding state Nov 14 14:33:18 an-c05n02 rgmanager[31902]: Migrating vm:vm01-win2008 to an-c05n01.alteeve.ca Nov 14 14:33:23 an-c05n02 kernel: vbr2: port 2(vnet0) entering disabled state Nov 14 14:33:23 an-c05n02 kernel: device vnet0 left promiscuous mode Nov 14 14:33:23 an-c05n02 kernel: vbr2: port 2(vnet0) entering disabled state Nov 14 14:33:24 an-c05n02 rgmanager[31902]: Migration of vm:vm01-win2008 to an-c05n01.alteeve.ca completed Nov 14 14:33:25 an-c05n02 ntpd[2227]: Deleting interface #21 vnet0, fe80::fc54:ff:fe8e:6732#123, interface stats: received=0, sent=0, dropped=0, active_time=20 secs Nov 14 14:34:33 an-c05n02 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team Nov 14 14:34:33 an-c05n02 kernel: nf_conntrack version 0.5.0 (16384 buckets, 65536 max) Nov 14 14:44:05 an-c05n02 kernel: INFO: task clusterfs.sh:30260 blocked for more than 120 seconds. Nov 14 14:44:05 an-c05n02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Nov 14 14:44:05 an-c05n02 kernel: clusterfs.sh D 0000000000000006 0 30260 30193 0x00000080 Nov 14 14:44:05 an-c05n02 kernel: ffff88032f275c88 0000000000000086 ffff88030000001c ffffffffa0443660 Nov 14 14:44:05 an-c05n02 kernel: ffff8805abd95ca8 ffffffffa0443610 ffff880300000003 ffff8805abd95d58 Nov 14 14:44:05 an-c05n02 kernel: ffff88032e802638 ffff88032f275fd8 000000000000fb88 ffff88032e802638 Nov 14 14:44:05 an-c05n02 kernel: Call Trace: Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa0443660>] ? gdlm_ast+0x0/0x210 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa0443610>] ? gdlm_bast+0x0/0x50 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa04238ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa0425b15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa0426fb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa0436585>] gfs2_getattr+0xb5/0xf0 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffffa043657d>] ? gfs2_getattr+0xad/0xf0 [gfs2] Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff81186d51>] vfs_getattr+0x51/0x80 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff81186de0>] vfs_fstatat+0x60/0x80 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff81186e6e>] vfs_lstat+0x1e/0x20 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff81186e94>] sys_newlstat+0x24/0x50 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff810dc937>] ? audit_syscall_entry+0x1d7/0x200 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff810dc685>] ? 
__audit_syscall_exit+0x265/0x290 Nov 14 14:44:05 an-c05n02 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Nov 14 14:46:05 an-c05n02 kernel: INFO: task gfs2_quotad:1395 blocked for more than 120 seconds. Nov 14 14:46:05 an-c05n02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:46:05 an-c05n02 kernel: gfs2_quotad D 0000000000000013 0 1395 2 0x00000080 Nov 14 14:46:05 an-c05n02 kernel: ffff880636eb1c20 0000000000000046 0000000000000000 ffffffffa0443660 Nov 14 14:46:05 an-c05n02 kernel: ffff880636fd5ce8 ffffffffa0443610 ffff880600000005 ffff880636fd5d98 Nov 14 14:46:05 an-c05n02 kernel: ffff880639861058 ffff880636eb1fd8 000000000000fb88 ffff880639861058 Nov 14 14:46:05 an-c05n02 kernel: Call Trace: Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0443660>] ? gdlm_ast+0x0/0x210 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0443610>] ? gdlm_bast+0x0/0x50 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa04238ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0425b15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0426fb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81081b5b>] ? try_to_del_timer_sync+0x7b/0xe0 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa043ff39>] gfs2_statfs_sync+0x59/0x1c0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8150efba>] ? schedule_timeout+0x19a/0x2e0 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa043ff31>] ? 
gfs2_statfs_sync+0x51/0x1c0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0437947>] quotad_check_timeo+0x57/0xb0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0437bd4>] gfs2_quotad+0x234/0x2b0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa04379a0>] ? gfs2_quotad+0x0/0x2b0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:46:05 an-c05n02 kernel: INFO: task clusterfs.sh:30260 blocked for more than 120 seconds. Nov 14 14:46:05 an-c05n02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:46:05 an-c05n02 kernel: clusterfs.sh D 0000000000000006 0 30260 30193 0x00000080 Nov 14 14:46:05 an-c05n02 kernel: ffff88032f275c88 0000000000000086 ffff88030000001c ffffffffa0443660 Nov 14 14:46:05 an-c05n02 kernel: ffff8805abd95ca8 ffffffffa0443610 ffff880300000003 ffff8805abd95d58 Nov 14 14:46:05 an-c05n02 kernel: ffff88032e802638 ffff88032f275fd8 000000000000fb88 ffff88032e802638 Nov 14 14:46:05 an-c05n02 kernel: Call Trace: Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0443660>] ? gdlm_ast+0x0/0x210 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0443610>] ? gdlm_bast+0x0/0x50 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa04238ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0425b15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0426fb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa0436585>] gfs2_getattr+0xb5/0xf0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffffa043657d>] ? gfs2_getattr+0xad/0xf0 [gfs2] Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81186d51>] vfs_getattr+0x51/0x80 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81186de0>] vfs_fstatat+0x60/0x80 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81186e6e>] vfs_lstat+0x1e/0x20 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff81186e94>] sys_newlstat+0x24/0x50 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff810dc937>] ? audit_syscall_entry+0x1d7/0x200 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290 Nov 14 14:46:05 an-c05n02 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Nov 14 14:48:05 an-c05n02 kernel: INFO: task gfs2_quotad:1395 blocked for more than 120 seconds. Nov 14 14:48:05 an-c05n02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:48:05 an-c05n02 kernel: gfs2_quotad D 0000000000000013 0 1395 2 0x00000080 Nov 14 14:48:05 an-c05n02 kernel: ffff880636eb1c20 0000000000000046 0000000000000000 ffffffffa0443660 Nov 14 14:48:05 an-c05n02 kernel: ffff880636fd5ce8 ffffffffa0443610 ffff880600000005 ffff880636fd5d98 Nov 14 14:48:05 an-c05n02 kernel: ffff880639861058 ffff880636eb1fd8 000000000000fb88 ffff880639861058 Nov 14 14:48:05 an-c05n02 kernel: Call Trace: Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0443660>] ? gdlm_ast+0x0/0x210 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0443610>] ? gdlm_bast+0x0/0x50 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa04238ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0425b15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0426fb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81081b5b>] ? try_to_del_timer_sync+0x7b/0xe0 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa043ff39>] gfs2_statfs_sync+0x59/0x1c0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8150efba>] ? schedule_timeout+0x19a/0x2e0 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa043ff31>] ? gfs2_statfs_sync+0x51/0x1c0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0437947>] quotad_check_timeo+0x57/0xb0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0437bd4>] gfs2_quotad+0x234/0x2b0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa04379a0>] ? gfs2_quotad+0x0/0x2b0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:48:05 an-c05n02 kernel: INFO: task clusterfs.sh:30260 blocked for more than 120 seconds. Nov 14 14:48:05 an-c05n02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Nov 14 14:48:05 an-c05n02 kernel: clusterfs.sh D 0000000000000006 0 30260 30193 0x00000080 Nov 14 14:48:05 an-c05n02 kernel: ffff88032f275c88 0000000000000086 ffff88030000001c ffffffffa0443660 Nov 14 14:48:05 an-c05n02 kernel: ffff8805abd95ca8 ffffffffa0443610 ffff880300000003 ffff8805abd95d58 Nov 14 14:48:05 an-c05n02 kernel: ffff88032e802638 ffff88032f275fd8 000000000000fb88 ffff88032e802638 Nov 14 14:48:05 an-c05n02 kernel: Call Trace: Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0443660>] ? gdlm_ast+0x0/0x210 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0443610>] ? gdlm_bast+0x0/0x50 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa04238ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0425b15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0426fb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa0436585>] gfs2_getattr+0xb5/0xf0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffffa043657d>] ? gfs2_getattr+0xad/0xf0 [gfs2] Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81186d51>] vfs_getattr+0x51/0x80 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81186de0>] vfs_fstatat+0x60/0x80 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81186e6e>] vfs_lstat+0x1e/0x20 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff81186e94>] sys_newlstat+0x24/0x50 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff810dc937>] ? audit_syscall_entry+0x1d7/0x200 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff810dc685>] ? 
__audit_syscall_exit+0x265/0x290 Nov 14 14:48:05 an-c05n02 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Nov 14 14:50:05 an-c05n02 kernel: INFO: task gfs2_quotad:1395 blocked for more than 120 seconds. Nov 14 14:50:05 an-c05n02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:50:05 an-c05n02 kernel: gfs2_quotad D 0000000000000013 0 1395 2 0x00000080 Nov 14 14:50:05 an-c05n02 kernel: ffff880636eb1c20 0000000000000046 0000000000000000 ffffffffa0443660 Nov 14 14:50:05 an-c05n02 kernel: ffff880636fd5ce8 ffffffffa0443610 ffff880600000005 ffff880636fd5d98 Nov 14 14:50:05 an-c05n02 kernel: ffff880639861058 ffff880636eb1fd8 000000000000fb88 ffff880639861058 Nov 14 14:50:05 an-c05n02 kernel: Call Trace: Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0443660>] ? gdlm_ast+0x0/0x210 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0443610>] ? gdlm_bast+0x0/0x50 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa04238ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0425b15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0426fb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81081b5b>] ? try_to_del_timer_sync+0x7b/0xe0 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa043ff39>] gfs2_statfs_sync+0x59/0x1c0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8150efba>] ? schedule_timeout+0x19a/0x2e0 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa043ff31>] ? 
gfs2_statfs_sync+0x51/0x1c0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0437947>] quotad_check_timeo+0x57/0xb0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0437bd4>] gfs2_quotad+0x234/0x2b0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa04379a0>] ? gfs2_quotad+0x0/0x2b0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81096a36>] kthread+0x96/0xa0 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Nov 14 14:50:05 an-c05n02 kernel: INFO: task clusterfs.sh:30260 blocked for more than 120 seconds. Nov 14 14:50:05 an-c05n02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 14 14:50:05 an-c05n02 kernel: clusterfs.sh D 0000000000000006 0 30260 30193 0x00000080 Nov 14 14:50:05 an-c05n02 kernel: ffff88032f275c88 0000000000000086 ffff88030000001c ffffffffa0443660 Nov 14 14:50:05 an-c05n02 kernel: ffff8805abd95ca8 ffffffffa0443610 ffff880300000003 ffff8805abd95d58 Nov 14 14:50:05 an-c05n02 kernel: ffff88032e802638 ffff88032f275fd8 000000000000fb88 ffff88032e802638 Nov 14 14:50:05 an-c05n02 kernel: Call Trace: Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0443660>] ? gdlm_ast+0x0/0x210 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0443610>] ? gdlm_bast+0x0/0x50 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa04238ae>] gfs2_glock_holder_wait+0xe/0x20 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8150f30f>] __wait_on_bit+0x5f/0x90 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa04238a0>] ? 
gfs2_glock_holder_wait+0x0/0x20 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8150f3b8>] out_of_line_wait_on_bit+0x78/0x90 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0425b15>] gfs2_glock_wait+0x45/0x90 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0426fb3>] gfs2_glock_nq+0x2d3/0x3e0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa0436585>] gfs2_getattr+0xb5/0xf0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffffa043657d>] ? gfs2_getattr+0xad/0xf0 [gfs2] Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81186d51>] vfs_getattr+0x51/0x80 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81186de0>] vfs_fstatat+0x60/0x80 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81186e6e>] vfs_lstat+0x1e/0x20 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff81186e94>] sys_newlstat+0x24/0x50 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff810dc937>] ? audit_syscall_entry+0x1d7/0x200 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290 Nov 14 14:50:05 an-c05n02 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Nov 14 14:50:17 an-c05n02 kernel: block drbd0: meta connection shut down by peer. 
Nov 14 14:50:17 an-c05n02 kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 ) Nov 14 14:50:17 an-c05n02 kernel: block drbd0: asender terminated Nov 14 14:50:17 an-c05n02 kernel: block drbd0: Terminating drbd0_asender Nov 14 14:50:17 an-c05n02 kernel: block drbd0: Connection closed Nov 14 14:50:17 an-c05n02 kernel: block drbd0: conn( NetworkFailure -> Unconnected ) Nov 14 14:50:17 an-c05n02 kernel: block drbd0: receiver terminated Nov 14 14:50:17 an-c05n02 kernel: block drbd0: Restarting drbd0_receiver Nov 14 14:50:17 an-c05n02 kernel: block drbd0: receiver (re)started Nov 14 14:50:17 an-c05n02 kernel: block drbd0: conn( Unconnected -> WFConnection ) Nov 14 14:50:17 an-c05n02 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 Nov 14 14:50:17 an-c05n02 corosync[31675]: cman killed by node 1 because we were killed by cman_tool or other application Nov 14 14:50:17 an-c05n02 rhcs_fence: Attempting to fence peer using RHCS from DRBD... Nov 14 14:50:17 an-c05n02 rhcs_fence: Unable to find local node name. 
Nov 14 14:50:17 an-c05n02 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 1 (0x100) Nov 14 14:50:17 an-c05n02 kernel: block drbd0: fence-peer helper broken, returned 1 Nov 14 14:50:17 an-c05n02 rgmanager[31902]: #67: Shutting down uncleanly Nov 14 14:50:17 an-c05n02 fenced[31732]: cluster is down, exiting Nov 14 14:50:17 an-c05n02 dlm_controld[31758]: cluster is down, exiting Nov 14 14:50:17 an-c05n02 gfs_controld[31807]: cluster is down, exiting Nov 14 14:50:17 an-c05n02 fenced[31732]: daemon cpg_dispatch error 2 Nov 14 14:50:17 an-c05n02 gfs_controld[31807]: daemon cpg_dispatch error 2 Nov 14 14:50:17 an-c05n02 dlm_controld[31758]: daemon cpg_dispatch error 2 Nov 14 14:50:17 an-c05n02 fenced[31732]: cpg_dispatch error 2 Nov 14 14:50:17 an-c05n02 dlm_controld[31758]: cpg_dispatch error 2 Nov 14 14:50:17 an-c05n02 gfs_controld[31807]: cpg_dispatch error 2 Nov 14 14:50:17 an-c05n02 dlm_controld[31758]: cpg_dispatch error 2 Nov 14 14:50:17 an-c05n02 dlm_controld[31758]: cpg_dispatch error 2 Nov 14 14:50:18 an-c05n02 rgmanager[31279]: [script] Executing /etc/init.d/clvmd stop Nov 14 14:50:18 an-c05n02 rgmanager[31299]: [script] Executing /etc/init.d/libvirtd stop Nov 14 14:50:18 an-c05n02 rgmanager[31327]: [vm] Could not determine Hypervisor Nov 14 14:50:18 an-c05n02 rgmanager[31902]: stop on vm "vm01-win2008" returned 2 (invalid argument(s)) Nov 14 14:50:18 an-c05n02 rgmanager[31346]: [script] Executing /etc/init.d/libvirtd stop Nov 14 14:50:18 an-c05n02 kernel: block drbd0: Handshake successful: Agreed network protocol version 97 Nov 14 14:50:18 an-c05n02 kernel: block drbd0: conn( WFConnection -> WFReportParams ) Nov 14 14:50:18 an-c05n02 kernel: block drbd0: Starting asender thread (from drbd0_receiver [863]) Nov 14 14:50:18 an-c05n02 kernel: block drbd0: data-integrity-alg: <not-used> Nov 14 14:50:18 an-c05n02 kernel: block drbd0: drbd_sync_handshake: Nov 14 14:50:18 an-c05n02 kernel: block drbd0: self 
D518242FE2FB5DC5:0000000000000000:B99100FEF8DE5D0D:B99000FEF8DE5D0D bits:0 flags:0 Nov 14 14:50:18 an-c05n02 kernel: block drbd0: peer D518242FE2FB5DC5:0000000000000000:B99100FEF8DE5D0D:B99000FEF8DE5D0D bits:0 flags:0 Nov 14 14:50:18 an-c05n02 kernel: block drbd0: uuid_compare()=0 by rule 40 Nov 14 14:50:18 an-c05n02 kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate ) Nov 14 14:50:18 an-c05n02 kernel: block drbd0: susp( 1 -> 0 ) Nov 14 14:50:19 an-c05n02 rgmanager[31631]: [clusterfs] unmounting /shared Nov 14 14:50:25 an-c05n02 kernel: dlm: closing connection to node 2 Nov 14 14:50:25 an-c05n02 kernel: dlm: closing connection to node 1 Nov 14 14:50:25 an-c05n02 kernel: dlm: shared: no userland control daemon, stopping lockspace Nov 14 14:50:25 an-c05n02 kernel: dlm: clvmd: no userland control daemon, stopping lockspace Nov 14 14:50:25 an-c05n02 kernel: dlm: rgmanager: no userland control daemon, stopping lockspace Write failed: Broken pipe
====

On a totally different topic; While submitting the above, I got:

====
Mid-air collision detected!

Someone else has made changes to bug 1030669 at the same time you were trying to. The changes made were:

No changes have been made to this bug yet.

Added the comment(s):

Your comment was:
<comment #2>
====

I think I am on a roll.

This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.

I do, it *just* happened a few days ago.

(In reply to digimer from comment #3)
> On a totally different topic; While submitting the above, I got:
>
> ====
> Mid-air collision detected!
>
> Someone else has made changes to bug 1030669 at the same time you were
> trying to. The changes made were:
>
> No changes have been made to this bug yet.
>
> Added the comment(s):
>
>
>
> Your comment was:
> <comment #2>
> ====
>
> I think I am on a roll.

Bugbot was adding flags to the bugzilla. It happens a lot if you add a comment shortly after submitting the bug. Nothing you need to panic about :)

We haven't been able to reproduce this. Digimer, can you still see this issue? It looks like it might be GFS related, but it's difficult for us to determine the cause without a good reproducer.

Commenting to stop the "outstanding bugs" emails. |