Bug 833612

Summary: Umount of fuse mount hangs [during gluster volume stop]
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Justin Bautista <jbautist>
Component: fuse
Assignee: Csaba Henk <csaba>
Status: CLOSED ERRATA
QA Contact: Sudhir D <sdharane>
Severity: high
Priority: medium
Version: 2.0
CC: amarts, gluster-bugs, rfortier, sbhaloth, shaines
Hardware: Unspecified
OS: Linux
Fixed In Version: glusterfs-3.4.0qa5-1
Doc Type: Bug Fix
Last Closed: 2013-09-23 22:36:14 UTC
Type: Bug

Description Justin Bautista 2012-06-19 21:30:38 UTC
Description of problem:
Gluster commands hang after `gluster volume stop`. See the dmesg output in Additional info. The mount in question was the internal Samba mount, which the CTDB teardown hook attempted to unmount.
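
No explicit reproduction steps were recorded. Based on the description, a minimal reproduction sketch might look like the following; the server name `server1`, volume name `samba-vol`, and mount point `/mnt/samba-vol` are assumptions, not names from the report:

```shell
# Assumed names: server1 (gluster server), samba-vol (volume),
# /mnt/samba-vol (mount point). Run as root on a client node.

# Mount the volume over FUSE, as the CTDB/Samba setup does internally.
mount -t glusterfs server1:/samba-vol /mnt/samba-vol

# Stop the volume while the FUSE mount is still active.
gluster volume stop samba-vol

# Attempt to unmount. On affected versions this umount hangs, and after
# 120 seconds the kernel hung-task watchdog logs the trace shown below
# ("task umount ... blocked for more than 120 seconds" in dmesg).
umount /mnt/samba-vol
```

This requires a live GlusterFS cluster, so it can only be run against a test deployment; check `dmesg` for the hung-task warning after the final step.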
 
Actual results:
Gluster volume stop does not succeed.

Expected results:
The gluster volume should stop successfully.

Additional info:

INFO: task umount:8609 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 0000000000000000     0  8609   8604 0x00000000
 ffff88003c023b98 0000000000000086 0000000000000000 ffff88003c7f80c0
 ffff88003c7f80f8 0000000000000000 ffff88003c023b58 ffffffff81060680
 ffff88003c7f8678 ffff88003c023fd8 000000000000f4e8 ffff88003c7f8678
Call Trace:
 [<ffffffff81060680>] ? pick_next_task_fair+0xd0/0x130
 [<ffffffff814ed3b5>] ? thread_return+0x1b3/0x76e
 [<ffffffff814ee0c5>] schedule_timeout+0x215/0x2e0
 [<ffffffff8106266b>] ? enqueue_rt_entity+0x6b/0x80
 [<ffffffff814edd43>] wait_for_common+0x123/0x180
 [<ffffffff8105ea30>] ? default_wake_function+0x0/0x20
 [<ffffffff814ede5d>] wait_for_completion+0x1d/0x20
 [<ffffffff81060c21>] synchronize_sched_expedited+0x1c1/0x280
 [<ffffffff810de0ae>] synchronize_rcu_expedited+0xe/0x10
 [<ffffffff8113451a>] bdi_remove_from_list+0x4a/0x60
 [<ffffffff811345e1>] bdi_unregister+0xb1/0x170
 [<ffffffff81134796>] bdi_destroy+0xf6/0x150
 [<ffffffffa031caa3>] fuse_conn_kill+0xc3/0xd0 [fuse]
 [<ffffffffa031cb1a>] fuse_put_super+0x6a/0x80 [fuse]
 [<ffffffff81178c8b>] generic_shutdown_super+0x5b/0xe0
 [<ffffffff81178d76>] kill_anon_super+0x16/0x60
 [<ffffffffa031bd92>] fuse_kill_sb_anon+0x52/0x60 [fuse]
 [<ffffffff81179d00>] deactivate_super+0x70/0x90
 [<ffffffff81195caf>] mntput_no_expire+0xbf/0x110
 [<ffffffff8119674b>] sys_umount+0x7b/0x3a0
 [<ffffffff81081841>] ? sigprocmask+0x71/0x110
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
[The identical hung-task trace for umount:8609 repeats three more times in dmesg, at 120-second intervals.]

Comment 2 Vidya Sakar 2012-07-20 06:57:37 UTC
Csaba, can you take a look at this?

Comment 3 Amar Tumballi 2012-08-23 06:44:55 UTC
This bug is not seen on the current master branch (which will soon be branched as RHS 2.1.0). Before considering it for a fix, we want to confirm that the bug still exists on RHS servers. If it cannot be reproduced, we would like to close this.

Comment 4 Amar Tumballi 2013-01-11 07:42:39 UTC
For QE: no particular fix went in for this issue; it may have been resolved as a side effect of other patches. Please verify that the steps reported in the Description now work correctly.

Comment 5 surabhi 2013-07-02 07:31:36 UTC
Bug verified on version:
glusterfs 3.4.0.12rhs.beta1 built on Jun 28 2013 06:41:37

Comment 7 Scott Haines 2013-09-23 22:36:14 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html