Bug 764210 (GLUSTER-2478)

Summary: Crash using fuse mount
Product: [Community] GlusterFS
Reporter: zaterio <daniel.ortiz>
Component: glusterd
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Version: 3.1.2
CC: amarts, gluster-bugs, jdarcy, nsathyan, vijay
Hardware: x86_64
OS: Linux
Attachments: Logs

Description zaterio 2011-03-01 07:00:37 EST
Created attachment 444
Comment 1 zaterio 2011-03-01 09:59:22 EST
I currently have a GlusterFS cluster in production, with two distributed-replicated volumes on two nodes connected through an independent 1 Gbps Ethernet link. One volume holds the nginx document root (the "mega" volume) and the other holds PHP5 sessions (the "sessions" volume).
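
For context, the layout described above corresponds roughly to the sketch below; hostnames, brick paths, and the mount point are assumptions, not values taken from the attached volume configuration:

# assumed layout: two bricks per node, so the volume distributes across two replica pairs
gluster volume create sessions replica 2 \
    node1:/export/sessions-a node2:/export/sessions-a \
    node1:/export/sessions-b node2:/export/sessions-b
gluster volume start sessions

# FUSE mount on each web server (mount point assumed)
mount -t glusterfs node1:/sessions /var/lib/php5/sessions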


node1:~# gluster -V
glusterfs 3.1.2 built on Jan 16 2011 18:14:56
Repository revision: v3.1.1-64-gf2a067c
Copyright (c) 2006-2010 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU Affero General Public License.

uname -a 
2.6.32-5-amd64 #1 SMP Wed Jan 12 03:40:32 UTC 2011 x86_64 GNU/Linux
(Debian Squeeze)

After two weeks of normal operation, the sessions volume went down and I had to restart node2 and node1. Apparently only the sessions volume has problems, not the mega volume. For the moment I have switched to a non-distributed local directory on each node for sessions, which so far has worked around the problem. I have attached the logs from both servers along with the volume configuration.
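
For reference, the temporary workaround amounts to pointing PHP at a local directory instead of the FUSE mount; the paths and ownership below are assumptions for a Debian nginx/PHP5 setup:

# create a local, non-replicated session directory on each node (path assumed)
mkdir -p /var/lib/php5/sessions-local
chown www-data:www-data /var/lib/php5/sessions-local

# in php.ini, point sessions at the local directory instead of the gluster mount:
#   session.save_path = "/var/lib/php5/sessions-local"
# then restart nginx/PHP so the change takes effect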

Only on node2 do I see this trace:

[ 6720.716830] INFO: task df:10319 blocked for more than 120 seconds.
[ 6720.716876] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6720.716922] df            D ffff880001895780     0 10319      1 0x00000004
[ 6720.716928]  ffff88007e421c40 0000000000000082 ffff880038a26ce8 000000010017ad6b
[ 6720.716934]  0000000000000092 ffff880037bacc30 000000000000f9e0 ffff880038995fd8
[ 6720.716939]  0000000000015780 0000000000015780 ffff880038a269f0 ffff880038a26ce8
[ 6720.716944] Call Trace:
[ 6720.716955]  [<ffffffff8103f9cb>] ? __wake_up+0x30/0x44
[ 6720.716969]  [<ffffffffa030bace>] ? fuse_request_send+0x1a2/0x255 [fuse]
[ 6720.716974]  [<ffffffff81064d2a>] ? autoremove_wake_function+0x0/0x2e
[ 6720.716982]  [<ffffffffa03126b5>] ? fuse_statfs+0xbf/0x133 [fuse]
[ 6720.716987]  [<ffffffff810edf26>] ? vfs_statfs+0x5b/0x76
[ 6720.716991]  [<ffffffff810ee134>] ? sys_statfs+0x3e/0x92
[ 6720.716997]  [<ffffffff8101195b>] ? device_not_available+0x1b/0x20
[ 6720.717001]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[ 6720.717005] INFO: task df:10341 blocked for more than 120 seconds.
[ 6720.717034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6720.717079] df            D 0000000000000000     0 10341      1 0x00000004
[ 6720.717084]  ffff88007fb5b880 0000000000000086 0000000000000000 000000010017b962
[ 6720.717089]  0000000000000092 ffff880037bacc30 000000000000f9e0 ffff880038a05fd8
[ 6720.717094]  0000000000015780 0000000000015780 ffff880038a454c0 ffff880038a457b8
[ 6720.717098] Call Trace:
[ 6720.717105]  [<ffffffffa030bace>] ? fuse_request_send+0x1a2/0x255 [fuse]
[ 6720.717109]  [<ffffffff81064d2a>] ? autoremove_wake_function+0x0/0x2e
[ 6720.717116]  [<ffffffffa03126b5>] ? fuse_statfs+0xbf/0x133 [fuse]
[ 6720.717120]  [<ffffffff810edf26>] ? vfs_statfs+0x5b/0x76
[ 6720.717124]  [<ffffffff810ee134>] ? sys_statfs+0x3e/0x92
[ 6720.717128]  [<ffffffff8101195b>] ? device_not_available+0x1b/0x20
[ 6720.717132]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[ 6720.717136] INFO: task df:10350 blocked for more than 120 seconds.
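
The traces show df processes stuck in sys_statfs via fuse_statfs against the FUSE mount. A quick way to confirm whether the mount is still answering statfs without wedging the shell (mount point assumed) is something like:

# bounded statfs check; exit status 124 means the call timed out and the mount is hung
timeout 30 stat -f /var/lib/php5/sessions; echo "exit: $?"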

Regards,

Zaterio
Comment 2 Amar Tumballi 2011-07-15 02:01:59 EDT
Hi Zaterio,

Thanks for the bug report, and sorry for the delay in responding. In the meantime we have seen similar hangs in GlusterFS caused by some operations. Is upgrading to version 3.2.2 feasible for you? Please upgrade; this issue should go away.
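
A rough outline of the upgrade on each node, one node at a time (the package source and init script name are assumptions for Debian):

/etc/init.d/glusterd stop      # init script may be named glusterfs-server on Debian
# install the 3.2.2 packages for your distribution (e.g. from download.gluster.org)
/etc/init.d/glusterd start
gluster -V                     # should now report glusterfs 3.2.2
gluster volume info            # confirm both volumes are present and started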

Regards,
Amar
Comment 3 Amar Tumballi 2011-08-29 00:03:31 EDT
With version 3.2.2 we are not seeing any hangs. Please re-open if you see the issue again.