Bug 1001585 - glusterd loses connection with the bricks
Summary: glusterd loses connection with the bricks
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-08-27 10:47 UTC by Joe Julian
Modified: 2015-10-07 12:21 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-10-07 12:21:30 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
gluster volume status (14.99 KB, text/plain)
2013-08-27 10:47 UTC, Joe Julian
glusterd dumpfile (21.75 KB, text/plain)
2013-08-27 10:49 UTC, Joe Julian
md5 hash calculations for all bricks (36.01 KB, text/plain)
2013-08-28 20:17 UTC, Joe Julian

Description Joe Julian 2013-08-27 10:47:07 UTC
Created attachment 790887 [details]
gluster volume status

Description of problem:
gluster volume status reports N/A for the ports even though the bricks are online and functional. The glusterd log file also fills up with "[2013-08-27 10:39:47.875339] E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)" every 3 seconds.
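
To confirm the retry interval from a log file, something like the following may help. It is only a sketch, not part of GlusterFS: the regular expression is written against the single log line quoted above, and the function name refused_intervals is made up for illustration.

```python
import re
from datetime import datetime

# Matches the socket_connect error quoted in this report.
# Assumption: other glusterd log line formats are not handled.
REFUSED = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] E "
    r"\[socket\.c:\d+:socket_connect\] .*Connection refused"
)


def refused_intervals(lines):
    """Return the seconds elapsed between consecutive 'Connection refused' errors."""
    stamps = []
    for line in lines:
        match = REFUSED.match(line)
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            )
    # Pair each timestamp with the next one and take the difference.
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Passing an open glusterd log file (any iterable of lines) returns a list of gaps; values clustered around 3.0 would match the behaviour described here.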

strace shows that glusterd is attempting to connect to
/var/run/4efb008e4e433ff7735a5a76111461d1.socket

Although that file exists, lsof shows no open connections. Perhaps related, that socket file also existed in /tmp.
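
A socket file that exists on disk but has no listener behind it refuses connections, which would be consistent with the log message. The sketch below is one way to check that from outside glusterd; probe_unix_socket is a hypothetical helper, and it assumes the socket is a stream-type Unix socket.

```python
import socket


def probe_unix_socket(path, timeout=1.0):
    """Classify a Unix socket path as 'listening', 'refused', 'missing' or 'denied'."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "listening"
    except ConnectionRefusedError:
        # The file exists but no process is accepting connections on it.
        return "refused"
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "denied"
    finally:
        sock.close()


if __name__ == "__main__":
    # Path taken from the strace output in this report.
    print(probe_unix_socket("/var/run/4efb008e4e433ff7735a5a76111461d1.socket"))
```

A result of "refused" would mean the file is a stale leftover or the process that should own it never started.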

Restarting glusterd has no effect.

Version-Release number of selected component (if applicable):
3.4.0

Comment 1 Joe Julian 2013-08-27 10:49:45 UTC
Created attachment 790889 [details]
glusterd dumpfile

Comment 2 Joe Julian 2013-08-28 20:17:18 UTC
Created attachment 791528 [details]
md5 hash calculations for all bricks

These are the md5 hashes for all my bricks. Notice that 4efb008e4e433ff7735a5a76111461d1 doesn't exist.

Comment 3 Joe Julian 2013-08-28 22:22:44 UTC
Aha, found the connection attempt failure. Looks like it's because I don't have any of my volumes available via nfs:

[2013-08-28 22:07:03.333679] D [run.c:190:runner_log] 0-: Starting the nfs/glustershd services: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/4efb008e4e433ff7735a5a76111461d1.socket

4efb008e4e433ff7735a5a76111461d1 is the md5sum of /var/lib/glusterd/nfs/run/run-ceed91d5-e8d1-434d-9d47-63e914c93424, where ceed91d5-e8d1-434d-9d47-63e914c93424 is the UUID for this server in glusterd.info.
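
For anyone wanting to match a socket file name back to its source, here is a minimal sketch of the mapping described above. It assumes the MD5 is taken over the path string itself (not over file contents) and that the result is placed under /var/run with a .socket suffix, as in the log line. The helper name socket_path_for is hypothetical and this is not glusterd's actual code.

```python
import hashlib


def socket_path_for(key, run_dir="/var/run"):
    """Build '<run_dir>/<md5 hex of key>.socket' from a key string."""
    # Assumption: the digest is computed over the UTF-8 bytes of the key string.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{run_dir}/{digest}.socket"


if __name__ == "__main__":
    # Key as reported in this comment: the nfs run directory plus "run-<server UUID>".
    key = "/var/lib/glusterd/nfs/run/run-ceed91d5-e8d1-434d-9d47-63e914c93424"
    print(socket_path_for(key))
```

Running the same function over each brick path and the nfs key shows which one, if any, produces a given socket name.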

Comment 4 Niels de Vos 2015-05-17 22:00:46 UTC
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained; at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release, and will not get fixed in a 3.4 version any more. Please verify whether newer versions are affected by the reported problem. If that is the case, update the bug with a note, and update the version if you can. In case updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.

Comment 5 Kaleb KEITHLEY 2015-10-07 12:21:30 UTC
GlusterFS 3.4.x has reached end-of-life.

If this bug still exists in a later release please reopen this and change the version or open a new bug.

