Description of problem: ========================= I am seeing frequent kernel hangs both on fuse clients and gluster nodes. For example I have seen this on systemic setup, due to which two of my nodes in a 4node cluster are hung due to this. With most likeliness this is due to a lot of healing required and mtsh set to 4 However on one of my functional cluster node setup(6 node cluster) I created a 1x2 volume and with compound fops enabled and mtsh set to 4 this time i see that there are kernel hung messages on one of the clients. there were two clients which were doing linux untar into two different directories One brick was brought down and then up . looks like healing was happening even in this case Version-Release number of selected component (if applicable): 3.8.4-5 hung message [ 2881.176517] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
INFO: task tee:23167 blocked for more than 120 seconds. Not tainted 2.6.32-504.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. tee D 0000000000000006 0 23167 1 0x00000084 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8 Call Trace: [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse] [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse] [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse] [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse] [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse] [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0 [<ffffffff8118ea21>] sys_write+0x51/0x90 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b INFO: task tee:23167 blocked for more than 120 seconds. Not tainted 2.6.32-504.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. tee D 0000000000000006 0 23167 1 0x00000084 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8 Call Trace: [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse] [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse] [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse] [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse] [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse] [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0 [<ffffffff8118ea21>] sys_write+0x51/0x90 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b INFO: task mv:23181 blocked for more than 120 seconds. Not tainted 2.6.32-504.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mv D 000000000000000e 0 23181 1 0x00000084 ffff8801f69bba68 0000000000000086 0000000000000000 0000000000000246 ffff8801f69bb9f8 0000000000000287 000ba969e7ad340e 0000000300000001 ffff8801f69bba08 00000001c3a06de7 ffff8802792de5f8 ffff8801f69bbfd8 Call Trace: [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse] [<ffffffffa038cccc>] fuse_do_getattr+0x10c/0x2c0 [fuse] [<ffffffffa038cefd>] fuse_update_attributes+0x7d/0x90 [fuse] [<ffffffffa038e038>] fuse_permission+0xb8/0x1f0 [fuse] [<ffffffff8119e1e3>] __link_path_walk+0xb3/0x1000 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00 [<ffffffff8152ae5e>] ? mutex_lock+0x1e/0x50 [<ffffffff81336e6a>] ? copy_termios+0x6a/0x80 [<ffffffff8119f3ea>] path_walk+0x6a/0xe0 [<ffffffff8119f5fb>] filename_lookup+0x6b/0xc0 [<ffffffff811a0727>] user_path_at+0x57/0xa0 [<ffffffff8104d18c>] ? __do_page_fault+0x1ec/0x480 [<ffffffff81193bc0>] vfs_fstatat+0x50/0xa0 [<ffffffff811a38f4>] ? do_vfs_ioctl+0x84/0x580 [<ffffffff81193d3b>] vfs_stat+0x1b/0x20 [<ffffffff81193d64>] sys_newstat+0x24/0x50 [<ffffffff810e5c87>] ? audit_syscall_entry+0x1d7/0x200 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b INFO: task tee:23167 blocked for more than 120 seconds. Not tainted 2.6.32-504.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. tee D 0000000000000006 0 23167 1 0x00000084 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8 Call Trace: [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse] [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse] [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse] [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse] [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse] [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0 [<ffffffff8118ea21>] sys_write+0x51/0x90 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b INFO: task mv:23181 blocked for more than 120 seconds. Not tainted 2.6.32-504.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mv D 000000000000000e 0 23181 1 0x00000084 ffff8801f69bba68 0000000000000086 0000000000000000 0000000000000246 ffff8801f69bb9f8 0000000000000287 000ba969e7ad340e 0000000300000001 ffff8801f69bba08 00000001c3a06de7 ffff8802792de5f8 ffff8801f69bbfd8 Call Trace: [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse] [<ffffffffa038cccc>] fuse_do_getattr+0x10c/0x2c0 [fuse] [<ffffffffa038cefd>] fuse_update_attributes+0x7d/0x90 [fuse] [<ffffffffa038e038>] fuse_permission+0xb8/0x1f0 [fuse] [<ffffffff8119e1e3>] __link_path_walk+0xb3/0x1000 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00 [<ffffffff8152ae5e>] ? mutex_lock+0x1e/0x50 [<ffffffff81336e6a>] ? copy_termios+0x6a/0x80 [<ffffffff8119f3ea>] path_walk+0x6a/0xe0 [<ffffffff8119f5fb>] filename_lookup+0x6b/0xc0 [<ffffffff811a0727>] user_path_at+0x57/0xa0 [<ffffffff8104d18c>] ? __do_page_fault+0x1ec/0x480 [<ffffffff81193bc0>] vfs_fstatat+0x50/0xa0 [<ffffffff811a38f4>] ? do_vfs_ioctl+0x84/0x580 [<ffffffff81193d3b>] vfs_stat+0x1b/0x20 [<ffffffff81193d64>] sys_newstat+0x24/0x50 [<ffffffff810e5c87>] ? audit_syscall_entry+0x1d7/0x200 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b INFO: task tee:23167 blocked for more than 120 seconds. Not tainted 2.6.32-504.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. tee D 0000000000000006 0 23167 1 0x00000084 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8 Call Trace: [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse] [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse] [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse] [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse] [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse] [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0 [<ffffffff8118ea21>] sys_write+0x51/0x90 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b INFO: task mv:23181 blocked for more than 120 seconds. Not tainted 2.6.32-504.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mv D 000000000000000e 0 23181 1 0x00000084 ffff8801f69bba68 0000000000000086 0000000000000000 0000000000000246 ffff8801f69bb9f8 0000000000000287 000ba969e7ad340e 0000000300000001 ffff8801f69bba08 00000001c3a06de7 ffff8802792de5f8 ffff8801f69bbfd8 Call Trace: [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse] [<ffffffffa038cccc>] fuse_do_getattr+0x10c/0x2c0 [fuse] [<ffffffffa038cefd>] fuse_update_attributes+0x7d/0x90 [fuse] [<ffffffffa038e038>] fuse_permission+0xb8/0x1f0 [fuse] [<ffffffff8119e1e3>] __link_path_walk+0xb3/0x1000 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00 [<ffffffff8152ae5e>] ? mutex_lock+0x1e/0x50 [<ffffffff81336e6a>] ? copy_termios+0x6a/0x80 [<ffffffff8119f3ea>] path_walk+0x6a/0xe0 [<ffffffff8119f5fb>] filename_lookup+0x6b/0xc0 [<ffffffff811a0727>] user_path_at+0x57/0xa0 [<ffffffff8104d18c>] ? __do_page_fault+0x1ec/0x480 [<ffffffff81193bc0>] vfs_fstatat+0x50/0xa0 [<ffffffff811a38f4>] ? do_vfs_ioctl+0x84/0x580 [<ffffffff81193d3b>] vfs_stat+0x1b/0x20 [<ffffffff81193d64>] sys_newstat+0x24/0x50 [<ffffffff810e5c87>] ? audit_syscall_entry+0x1d7/0x200 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b SELinux: initialized (dev fuse, type fuse), uses genfs_contexts SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
Kernel hangs were noticed while running my systemic testing on 3.2 regression cycle Following is the dmesg -T [Thu Dec 29 19:27:39 2016] type=1305 audit(1483019860.651:1363): audit_enabled=1 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1 [Thu Dec 29 19:27:39 2016] type=1305 audit(1483019860.651:1364): audit_pid=26454 old=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1 [Thu Dec 29 19:30:30 2016] fuse init (API version 7.22) [Sat Dec 31 23:40:18 2016] INFO: task xfsaild/dm-0:487 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] xfsaild/dm-0 D ffff880451cfb000 0 487 2 0x00000000 [Sat Dec 31 23:40:18 2016] ffff8804502dbd58 0000000000000046 ffff880451f00fb0 ffff8804502dbfd8 [Sat Dec 31 23:40:18 2016] ffff8804502dbfd8 ffff8804502dbfd8 ffff880451f00fb0 ffff8804513d5100 [Sat Dec 31 23:40:18 2016] 0000000000000000 ffff880451f00fb0 ffff8804510ee528 ffff880451cfb000 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168b579>] schedule+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffffa027a97d>] _xfs_log_force+0x1bd/0x2b0 [xfs] [Sat Dec 31 23:40:18 2016] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffffa027aa96>] xfs_log_force+0x26/0x80 [xfs] [Sat Dec 31 23:40:18 2016] [<ffffffffa0286360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs] [Sat Dec 31 23:40:18 2016] [<ffffffffa02864ba>] xfsaild+0x15a/0x660 [xfs] [Sat Dec 31 23:40:18 2016] [<ffffffffa0286360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs] [Sat Dec 31 23:40:18 2016] [<ffffffff810b052f>] kthread+0xcf/0xe0 [Sat Dec 31 23:40:18 2016] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 [Sat Dec 31 23:40:18 2016] [<ffffffff81696418>] ret_from_fork+0x58/0x90 [Sat Dec 31 23:40:18 2016] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:21336 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 21336 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff8803643f3c70 0000000000000086 ffff880451e55e20 ffff8803643f3fd8 [Sat Dec 31 23:40:18 2016] ffff8803643f3fd8 ffff8803643f3fd8 ffff880451e55e20 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff880451e55e20 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff810f4f30>] ? futex_wake+0x80/0x160 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:22591 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 22591 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff8804512efc70 0000000000000086 ffff880451e54e70 ffff8804512effd8 [Sat Dec 31 23:40:18 2016] ffff8804512effd8 ffff8804512effd8 ffff880451e54e70 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff880451e54e70 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff810f4f30>] ? futex_wake+0x80/0x160 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:22654 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 22654 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff880440abbc70 0000000000000086 ffff88032274bec0 ffff880440abbfd8 [Sat Dec 31 23:40:18 2016] ffff880440abbfd8 ffff880440abbfd8 ffff88032274bec0 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff88032274bec0 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff810f4f30>] ? futex_wake+0x80/0x160 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:23515 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 23515 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff8802c97cfc70 0000000000000086 ffff88032538af10 ffff8802c97cffd8 [Sat Dec 31 23:40:18 2016] ffff8802c97cffd8 ffff8802c97cffd8 ffff88032538af10 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff88032538af10 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff81057bc3>] ? x2apic_send_IPI_mask+0x13/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:24449 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 24449 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff88029bebbc70 0000000000000086 ffff880326aa8000 ffff88029bebbfd8 [Sat Dec 31 23:40:18 2016] ffff88029bebbfd8 ffff88029bebbfd8 ffff880326aa8000 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff880326aa8000 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff810f4f30>] ? futex_wake+0x80/0x160 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1380 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 1380 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff88000adffc70 0000000000000086 ffff880451d14e70 ffff88000adfffd8 [Sat Dec 31 23:40:18 2016] ffff88000adfffd8 ffff88000adfffd8 ffff880451d14e70 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff880451d14e70 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff810f4f30>] ? futex_wake+0x80/0x160 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1381 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 1381 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff88004483bc70 0000000000000086 ffff8803de232f10 ffff88004483bfd8 [Sat Dec 31 23:40:18 2016] ffff88004483bfd8 ffff88004483bfd8 ffff8803de232f10 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff8803de232f10 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff810f4f30>] ? futex_wake+0x80/0x160 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1382 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 1382 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff88001c543c70 0000000000000086 ffff880326ad0000 ffff88001c543fd8 [Sat Dec 31 23:40:18 2016] ffff88001c543fd8 ffff88001c543fd8 ffff880326ad0000 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff880326ad0000 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1719 blocked for more than 120 seconds. [Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Sat Dec 31 23:40:18 2016] glusterfsd D ffff8804507fa028 0 1719 1 0x00000080 [Sat Dec 31 23:40:18 2016] ffff88000f23bc70 0000000000000086 ffff880036373ec0 ffff88000f23bfd8 [Sat Dec 31 23:40:18 2016] ffff88000f23bfd8 ffff88000f23bfd8 ffff880036373ec0 ffff8804507fa020 [Sat Dec 31 23:40:18 2016] ffff8804507fa024 ffff880036373ec0 00000000ffffffff ffff8804507fa028 [Sat Dec 31 23:40:18 2016] Call Trace: [Sat Dec 31 23:40:18 2016] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70 [Sat Dec 31 23:40:18 2016] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0 [Sat Dec 31 23:40:18 2016] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f [Sat Dec 31 23:40:18 2016] [<ffffffff8120cb9f>] do_last+0x28f/0x12a0 [Sat Dec 31 23:40:18 2016] [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200 [Sat Dec 31 23:40:18 2016] [<ffffffff8120dc72>] path_openat+0xc2/0x490 [Sat Dec 31 23:40:18 2016] [<ffffffff810f4f30>] ? futex_wake+0x80/0x160 [Sat Dec 31 23:40:18 2016] [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0 [Sat Dec 31 23:40:18 2016] [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0 [Sat Dec 31 23:40:18 2016] [<ffffffff811fd40e>] SyS_open+0x1e/0x20 [Sat Dec 31 23:40:18 2016] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b [Sun Jan 1 05:29:54 2017] Clock: inserting leap second 23:59:60 UTC execution status https://docs.google.com/spreadsheets/d/1iP5Mi1TewBFVh8HTmlcBm9072Bgsbgkr3CLcGmawDys/edit#gid=632186609 Also view 1409472 - brick crashed on systemic setup
sosreports and cores are available at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1409472/
I have updated BZ#https://bugzilla.redhat.com/show_bug.cgi?id=1389516 - During random write testing on LVM thinp back volumes hangs / hung task timeouts are seen I have put a need info on Mike Snitzer, the bug assignee