Bug 1397907 - seeing frequent kernel hangs when doing operations both on fuse client and gluster nodes on replica volumes
Summary: seeing frequent kernel hangs when doing operations both on fuse client and gl...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
low
urgent
Target Milestone: ---
: ---
Assignee: Pranith Kumar K
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On: 1389516
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-11-23 14:25 UTC by Nag Pavan Chilakam
Modified: 2018-05-30 11:35 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-30 11:35:09 UTC
Embargoed:


Attachments (Terms of Use)

Description Nag Pavan Chilakam 2016-11-23 14:25:00 UTC
Description of problem:
=========================
I am seeing frequent kernel hangs both on fuse clients and gluster nodes.
For example I have seen this on systemic setup, due to which two of my nodes in a 4node cluster are hung due to this.
With most likeliness this is due to a lot of healing required and mtsh set to 4

However on one of my functional cluster node setup(6 node cluster)
I created a 1x2 volume and with compound fops enabled and mtsh set to 4
this time i see that there are kernel hung messages on one of the clients.
there were two clients which were doing linux untar into two different directories
One brick was brought down and then up . looks like healing was happening even in this case

Version-Release number of selected component (if applicable):
3.8.4-5



hung message

[ 2881.176517] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message

Comment 4 Nag Pavan Chilakam 2016-11-30 05:54:02 UTC
INFO: task tee:23167 blocked for more than 120 seconds.
      Not tainted 2.6.32-504.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
tee           D 0000000000000006     0 23167      1 0x00000084
 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246
 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001
 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8
Call Trace:
 [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse]
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse]
 [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50
 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse]
 [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse]
 [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse]
 [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse]
 [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0
 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20
 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0
 [<ffffffff8118ea21>] sys_write+0x51/0x90
 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task tee:23167 blocked for more than 120 seconds.
      Not tainted 2.6.32-504.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
tee           D 0000000000000006     0 23167      1 0x00000084
 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246
 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001
 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8
Call Trace:
 [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse]
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse]
 [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50
 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse]
 [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse]
 [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse]
 [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse]
 [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0
 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20
 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0
 [<ffffffff8118ea21>] sys_write+0x51/0x90
 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task mv:23181 blocked for more than 120 seconds.
      Not tainted 2.6.32-504.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mv            D 000000000000000e     0 23181      1 0x00000084
 ffff8801f69bba68 0000000000000086 0000000000000000 0000000000000246
 ffff8801f69bb9f8 0000000000000287 000ba969e7ad340e 0000000300000001
 ffff8801f69bba08 00000001c3a06de7 ffff8802792de5f8 ffff8801f69bbfd8
Call Trace:
 [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse]
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse]
 [<ffffffffa038cccc>] fuse_do_getattr+0x10c/0x2c0 [fuse]
 [<ffffffffa038cefd>] fuse_update_attributes+0x7d/0x90 [fuse]
 [<ffffffffa038e038>] fuse_permission+0xb8/0x1f0 [fuse]
 [<ffffffff8119e1e3>] __link_path_walk+0xb3/0x1000
 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00
 [<ffffffff8152ae5e>] ? mutex_lock+0x1e/0x50
 [<ffffffff81336e6a>] ? copy_termios+0x6a/0x80
 [<ffffffff8119f3ea>] path_walk+0x6a/0xe0
 [<ffffffff8119f5fb>] filename_lookup+0x6b/0xc0
 [<ffffffff811a0727>] user_path_at+0x57/0xa0
 [<ffffffff8104d18c>] ? __do_page_fault+0x1ec/0x480
 [<ffffffff81193bc0>] vfs_fstatat+0x50/0xa0
 [<ffffffff811a38f4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff81193d3b>] vfs_stat+0x1b/0x20
 [<ffffffff81193d64>] sys_newstat+0x24/0x50
 [<ffffffff810e5c87>] ? audit_syscall_entry+0x1d7/0x200
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task tee:23167 blocked for more than 120 seconds.
      Not tainted 2.6.32-504.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
tee           D 0000000000000006     0 23167      1 0x00000084
 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246
 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001
 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8
Call Trace:
 [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse]
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse]
 [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50
 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse]
 [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse]
 [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse]
 [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse]
 [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0
 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20
 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0
 [<ffffffff8118ea21>] sys_write+0x51/0x90
 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task mv:23181 blocked for more than 120 seconds.
      Not tainted 2.6.32-504.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mv            D 000000000000000e     0 23181      1 0x00000084
 ffff8801f69bba68 0000000000000086 0000000000000000 0000000000000246
 ffff8801f69bb9f8 0000000000000287 000ba969e7ad340e 0000000300000001
 ffff8801f69bba08 00000001c3a06de7 ffff8802792de5f8 ffff8801f69bbfd8
Call Trace:
 [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse]
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse]
 [<ffffffffa038cccc>] fuse_do_getattr+0x10c/0x2c0 [fuse]
 [<ffffffffa038cefd>] fuse_update_attributes+0x7d/0x90 [fuse]
 [<ffffffffa038e038>] fuse_permission+0xb8/0x1f0 [fuse]
 [<ffffffff8119e1e3>] __link_path_walk+0xb3/0x1000
 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00
 [<ffffffff8152ae5e>] ? mutex_lock+0x1e/0x50
 [<ffffffff81336e6a>] ? copy_termios+0x6a/0x80
 [<ffffffff8119f3ea>] path_walk+0x6a/0xe0
 [<ffffffff8119f5fb>] filename_lookup+0x6b/0xc0
 [<ffffffff811a0727>] user_path_at+0x57/0xa0
 [<ffffffff8104d18c>] ? __do_page_fault+0x1ec/0x480
 [<ffffffff81193bc0>] vfs_fstatat+0x50/0xa0
 [<ffffffff811a38f4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff81193d3b>] vfs_stat+0x1b/0x20
 [<ffffffff81193d64>] sys_newstat+0x24/0x50
 [<ffffffff810e5c87>] ? audit_syscall_entry+0x1d7/0x200
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task tee:23167 blocked for more than 120 seconds.
      Not tainted 2.6.32-504.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
tee           D 0000000000000006     0 23167      1 0x00000084
 ffff8803c48cbb58 0000000000000086 0000000000000000 0000000000000246
 ffff8803c48cbae8 0000000000000283 000ba9645d36114f 0000000300000001
 ffff8803c48cbaf8 00000001c3a0121a ffff88047b939058 ffff8803c48cbfd8
Call Trace:
 [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse]
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa038ed89>] ? fuse_wait_on_page_writeback+0x39/0x1b0 [fuse]
 [<ffffffff8112485e>] ? add_to_page_cache_lru+0x3e/0x50
 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse]
 [<ffffffffa038fcf7>] fuse_send_write+0xe7/0x120 [fuse]
 [<ffffffffa03905a6>] fuse_perform_write+0x2e6/0x580 [fuse]
 [<ffffffffa0390b53>] fuse_file_aio_write+0x313/0x3c0 [fuse]
 [<ffffffff8118dd5a>] do_sync_write+0xfa/0x140
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff813382fb>] ? put_ldisc+0x5b/0xc0
 [<ffffffff8123a5eb>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8122d446>] ? security_file_permission+0x16/0x20
 [<ffffffff8118e058>] vfs_write+0xb8/0x1a0
 [<ffffffff8118ea21>] sys_write+0x51/0x90
 [<ffffffff810e5a7e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
INFO: task mv:23181 blocked for more than 120 seconds.
      Not tainted 2.6.32-504.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mv            D 000000000000000e     0 23181      1 0x00000084
 ffff8801f69bba68 0000000000000086 0000000000000000 0000000000000246
 ffff8801f69bb9f8 0000000000000287 000ba969e7ad340e 0000000300000001
 ffff8801f69bba08 00000001c3a06de7 ffff8802792de5f8 ffff8801f69bbfd8
Call Trace:
 [<ffffffffa038933d>] __fuse_request_send+0xed/0x2b0 [fuse]
 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0389512>] fuse_request_send+0x12/0x20 [fuse]
 [<ffffffffa038cccc>] fuse_do_getattr+0x10c/0x2c0 [fuse]
 [<ffffffffa038cefd>] fuse_update_attributes+0x7d/0x90 [fuse]
 [<ffffffffa038e038>] fuse_permission+0xb8/0x1f0 [fuse]
 [<ffffffff8119e1e3>] __link_path_walk+0xb3/0x1000
 [<ffffffff8114f0b7>] ? handle_pte_fault+0xf7/0xb00
 [<ffffffff8152ae5e>] ? mutex_lock+0x1e/0x50
 [<ffffffff81336e6a>] ? copy_termios+0x6a/0x80
 [<ffffffff8119f3ea>] path_walk+0x6a/0xe0
 [<ffffffff8119f5fb>] filename_lookup+0x6b/0xc0
 [<ffffffff811a0727>] user_path_at+0x57/0xa0
 [<ffffffff8104d18c>] ? __do_page_fault+0x1ec/0x480
 [<ffffffff81193bc0>] vfs_fstatat+0x50/0xa0
 [<ffffffff811a38f4>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff81193d3b>] vfs_stat+0x1b/0x20
 [<ffffffff81193d64>] sys_newstat+0x24/0x50
 [<ffffffff810e5c87>] ? audit_syscall_entry+0x1d7/0x200
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
SELinux: initialized (dev fuse, type fuse), uses genfs_contexts

Comment 12 Nag Pavan Chilakam 2017-01-02 07:01:19 UTC
Kernel hangs were noticed while running my systemic testing on 3.2 regression cycle
Following is the dmesg -T
[Thu Dec 29 19:27:39 2016] type=1305 audit(1483019860.651:1363): audit_enabled=1 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[Thu Dec 29 19:27:39 2016] type=1305 audit(1483019860.651:1364): audit_pid=26454 old=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
[Thu Dec 29 19:30:30 2016] fuse init (API version 7.22)
[Sat Dec 31 23:40:18 2016] INFO: task xfsaild/dm-0:487 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] xfsaild/dm-0    D ffff880451cfb000     0   487      2 0x00000000
[Sat Dec 31 23:40:18 2016]  ffff8804502dbd58 0000000000000046 ffff880451f00fb0 ffff8804502dbfd8
[Sat Dec 31 23:40:18 2016]  ffff8804502dbfd8 ffff8804502dbfd8 ffff880451f00fb0 ffff8804513d5100
[Sat Dec 31 23:40:18 2016]  0000000000000000 ffff880451f00fb0 ffff8804510ee528 ffff880451cfb000
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168b579>] schedule+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffffa027a97d>] _xfs_log_force+0x1bd/0x2b0 [xfs]
[Sat Dec 31 23:40:18 2016]  [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffffa027aa96>] xfs_log_force+0x26/0x80 [xfs]
[Sat Dec 31 23:40:18 2016]  [<ffffffffa0286360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[Sat Dec 31 23:40:18 2016]  [<ffffffffa02864ba>] xfsaild+0x15a/0x660 [xfs]
[Sat Dec 31 23:40:18 2016]  [<ffffffffa0286360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[Sat Dec 31 23:40:18 2016]  [<ffffffff810b052f>] kthread+0xcf/0xe0
[Sat Dec 31 23:40:18 2016]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[Sat Dec 31 23:40:18 2016]  [<ffffffff81696418>] ret_from_fork+0x58/0x90
[Sat Dec 31 23:40:18 2016]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:21336 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0 21336      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff8803643f3c70 0000000000000086 ffff880451e55e20 ffff8803643f3fd8
[Sat Dec 31 23:40:18 2016]  ffff8803643f3fd8 ffff8803643f3fd8 ffff880451e55e20 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff880451e55e20 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff810f4f30>] ? futex_wake+0x80/0x160
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:22591 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0 22591      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff8804512efc70 0000000000000086 ffff880451e54e70 ffff8804512effd8
[Sat Dec 31 23:40:18 2016]  ffff8804512effd8 ffff8804512effd8 ffff880451e54e70 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff880451e54e70 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff810f4f30>] ? futex_wake+0x80/0x160
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:22654 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0 22654      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff880440abbc70 0000000000000086 ffff88032274bec0 ffff880440abbfd8
[Sat Dec 31 23:40:18 2016]  ffff880440abbfd8 ffff880440abbfd8 ffff88032274bec0 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff88032274bec0 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff810f4f30>] ? futex_wake+0x80/0x160
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:23515 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0 23515      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff8802c97cfc70 0000000000000086 ffff88032538af10 ffff8802c97cffd8
[Sat Dec 31 23:40:18 2016]  ffff8802c97cffd8 ffff8802c97cffd8 ffff88032538af10 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff88032538af10 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff81057bc3>] ? x2apic_send_IPI_mask+0x13/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:24449 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0 24449      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff88029bebbc70 0000000000000086 ffff880326aa8000 ffff88029bebbfd8
[Sat Dec 31 23:40:18 2016]  ffff88029bebbfd8 ffff88029bebbfd8 ffff880326aa8000 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff880326aa8000 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff810f4f30>] ? futex_wake+0x80/0x160
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1380 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0  1380      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff88000adffc70 0000000000000086 ffff880451d14e70 ffff88000adfffd8
[Sat Dec 31 23:40:18 2016]  ffff88000adfffd8 ffff88000adfffd8 ffff880451d14e70 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff880451d14e70 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff810f4f30>] ? futex_wake+0x80/0x160
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1381 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0  1381      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff88004483bc70 0000000000000086 ffff8803de232f10 ffff88004483bfd8
[Sat Dec 31 23:40:18 2016]  ffff88004483bfd8 ffff88004483bfd8 ffff8803de232f10 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff8803de232f10 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff810f4f30>] ? futex_wake+0x80/0x160
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1382 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0  1382      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff88001c543c70 0000000000000086 ffff880326ad0000 ffff88001c543fd8
[Sat Dec 31 23:40:18 2016]  ffff88001c543fd8 ffff88001c543fd8 ffff880326ad0000 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff880326ad0000 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sat Dec 31 23:40:18 2016] INFO: task glusterfsd:1719 blocked for more than 120 seconds.
[Sat Dec 31 23:40:18 2016] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sat Dec 31 23:40:18 2016] glusterfsd      D ffff8804507fa028     0  1719      1 0x00000080
[Sat Dec 31 23:40:18 2016]  ffff88000f23bc70 0000000000000086 ffff880036373ec0 ffff88000f23bfd8
[Sat Dec 31 23:40:18 2016]  ffff88000f23bfd8 ffff88000f23bfd8 ffff880036373ec0 ffff8804507fa020
[Sat Dec 31 23:40:18 2016]  ffff8804507fa024 ffff880036373ec0 00000000ffffffff ffff8804507fa028
[Sat Dec 31 23:40:18 2016] Call Trace:
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120cb9f>] do_last+0x28f/0x12a0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811de2f6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120dc72>] path_openat+0xc2/0x490
[Sat Dec 31 23:40:18 2016]  [<ffffffff810f4f30>] ? futex_wake+0x80/0x160
[Sat Dec 31 23:40:18 2016]  [<ffffffff8120fdeb>] do_filp_open+0x4b/0xb0
[Sat Dec 31 23:40:18 2016]  [<ffffffff8121ca67>] ? __alloc_fd+0xa7/0x130
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd2f3>] do_sys_open+0xf3/0x1f0
[Sat Dec 31 23:40:18 2016]  [<ffffffff811fd40e>] SyS_open+0x1e/0x20
[Sat Dec 31 23:40:18 2016]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[Sun Jan  1 05:29:54 2017] Clock: inserting leap second 23:59:60 UTC


execution status
https://docs.google.com/spreadsheets/d/1iP5Mi1TewBFVh8HTmlcBm9072Bgsbgkr3CLcGmawDys/edit#gid=632186609


Also view 1409472 - brick crashed on systemic setup

Comment 14 Nag Pavan Chilakam 2017-01-02 07:55:41 UTC
sosreports and cores are available at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1409472/

Comment 16 Nag Pavan Chilakam 2017-01-04 07:56:42 UTC
I have updated BZ#https://bugzilla.redhat.com/show_bug.cgi?id=1389516 - During random write testing on LVM thinp back volumes hangs / hung task timeouts are seen 
I have put a need info on Mike Snitzer, the bug assignee


Note You need to log in before you can comment on or make changes to this bug.