Bug 764210 (GLUSTER-2478)

Summary: Crash using fuse mount
Product: [Community] GlusterFS
Reporter: zaterio <daniel.ortiz>
Component: glusterd
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Version: 3.1.2
CC: amarts, gluster-bugs, jdarcy, nsathyan, vijay
Hardware: x86_64
OS: Linux
Attachments: Logs

Description zaterio 2011-03-01 07:00:37 EST
Created attachment 444
Comment 1 zaterio 2011-03-01 09:59:22 EST
I currently have a GlusterFS cluster in production, with two distributed-replicated volumes on two nodes connected through an independent 1 Gbps Ethernet link. One volume holds the nginx document root (the "mega" volume) and the other holds PHP5 sessions (the "sessions" volume).
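
For context, the layout described above corresponds roughly to the sketch below; hostnames, brick paths, and the mount point are assumptions, not values taken from the attached volume configuration:

# assumed layout: two bricks per node, so the volume distributes across two replica pairs
gluster volume create sessions replica 2 \
    node1:/export/sessions-a node2:/export/sessions-a \
    node1:/export/sessions-b node2:/export/sessions-b
gluster volume start sessions

# FUSE mount on each web server (mount point assumed)
mount -t glusterfs node1:/sessions /var/lib/php5/sessions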


node1:~# gluster -V
glusterfs 3.1.2 built on Jan 16 2011 18:14:56
Repository revision: v3.1.1-64-gf2a067c
Copyright (c) 2006-2010 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU Affero General Public License.

uname -a 
2.6.32-5-amd64 #1 SMP Wed Jan 12 03:40:32 UTC 2011 x86_64 GNU/Linux
(Debian Squeeze)

After two weeks of normal operation, the sessions volume went down and I had to restart node2 and node1. Apparently only the sessions volume has problems, not the mega volume. For the moment I have switched to a non-distributed local directory on each node for sessions, which so far has worked around the problem. I have attached the logs from both servers along with the volume configuration.
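
For reference, the temporary workaround amounts to pointing PHP at a local directory instead of the FUSE mount; the paths and ownership below are assumptions for a Debian nginx/PHP5 setup:

# create a local, non-replicated session directory on each node (path assumed)
mkdir -p /var/lib/php5/sessions-local
chown www-data:www-data /var/lib/php5/sessions-local

# in php.ini, point sessions at the local directory instead of the gluster mount:
#   session.save_path = "/var/lib/php5/sessions-local"
# then restart nginx/PHP so the change takes effect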

Only on node2 do I see this trace:

[ 6720.716830] INFO: task df:10319 blocked for more than 120 seconds.
[ 6720.716876] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6720.716922] df            D ffff880001895780     0 10319      1 0x00000004
[ 6720.716928]  ffff88007e421c40 0000000000000082 ffff880038a26ce8 000000010017ad6b
[ 6720.716934]  0000000000000092 ffff880037bacc30 000000000000f9e0 ffff880038995fd8
[ 6720.716939]  0000000000015780 0000000000015780 ffff880038a269f0 ffff880038a26ce8
[ 6720.716944] Call Trace:
[ 6720.716955]  [<ffffffff8103f9cb>] ? __wake_up+0x30/0x44
[ 6720.716969]  [<ffffffffa030bace>] ? fuse_request_send+0x1a2/0x255 [fuse]
[ 6720.716974]  [<ffffffff81064d2a>] ? autoremove_wake_function+0x0/0x2e
[ 6720.716982]  [<ffffffffa03126b5>] ? fuse_statfs+0xbf/0x133 [fuse]
[ 6720.716987]  [<ffffffff810edf26>] ? vfs_statfs+0x5b/0x76
[ 6720.716991]  [<ffffffff810ee134>] ? sys_statfs+0x3e/0x92
[ 6720.716997]  [<ffffffff8101195b>] ? device_not_available+0x1b/0x20
[ 6720.717001]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[ 6720.717005] INFO: task df:10341 blocked for more than 120 seconds.
[ 6720.717034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6720.717079] df            D 0000000000000000     0 10341      1 0x00000004
[ 6720.717084]  ffff88007fb5b880 0000000000000086 0000000000000000 000000010017b962
[ 6720.717089]  0000000000000092 ffff880037bacc30 000000000000f9e0 ffff880038a05fd8
[ 6720.717094]  0000000000015780 0000000000015780 ffff880038a454c0 ffff880038a457b8
[ 6720.717098] Call Trace:
[ 6720.717105]  [<ffffffffa030bace>] ? fuse_request_send+0x1a2/0x255 [fuse]
[ 6720.717109]  [<ffffffff81064d2a>] ? autoremove_wake_function+0x0/0x2e
[ 6720.717116]  [<ffffffffa03126b5>] ? fuse_statfs+0xbf/0x133 [fuse]
[ 6720.717120]  [<ffffffff810edf26>] ? vfs_statfs+0x5b/0x76
[ 6720.717124]  [<ffffffff810ee134>] ? sys_statfs+0x3e/0x92
[ 6720.717128]  [<ffffffff8101195b>] ? device_not_available+0x1b/0x20
[ 6720.717132]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[ 6720.717136] INFO: task df:10350 blocked for more than 120 seconds.
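
The traces show df processes stuck in sys_statfs via fuse_statfs against the FUSE mount. A quick way to confirm whether the mount is still answering statfs without wedging the shell (mount point assumed) is something like:

# bounded statfs check; exit status 124 means the call timed out and the mount is hung
timeout 30 stat -f /var/lib/php5/sessions; echo "exit: $?"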

Regards,

Zaterio
Comment 2 Amar Tumballi 2011-07-15 02:01:59 EDT
Hi Zaterio,

Thanks for the bug report, and sorry for the delay in responding. In the meantime we have seen similar hangs in GlusterFS caused by some operations. Is upgrading to version 3.2.2 feasible for you? Please upgrade; this issue should go away.
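
A rough outline of the upgrade on each node, one node at a time (the package source and init script name are assumptions for Debian):

/etc/init.d/glusterd stop      # init script may be named glusterfs-server on Debian
# install the 3.2.2 packages for your distribution (e.g. from download.gluster.org)
/etc/init.d/glusterd start
gluster -V                     # should now report glusterfs 3.2.2
gluster volume info            # confirm both volumes are present and started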

Regards,
Amar
Comment 3 Amar Tumballi 2011-08-29 00:03:31 EDT
With version 3.2.2 we are not seeing any hangs. Please re-open if you see the issue again.