Bug 795131

Summary: glusterd is killed when volume statedump operation is performed
Product: [Community] GlusterFS Reporter: Shwetha Panduranga <shwetha.h.panduranga>
Component: glusterd Assignee: Kaushal <kaushal>
Status: CLOSED WORKSFORME QA Contact:
Severity: medium Docs Contact:
Priority: high    
Version: mainline CC: gluster-bugs, kaushal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-03-28 03:13:31 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Shwetha Panduranga 2012-02-19 10:06:52 EST
Description of problem:
The glusterd process is killed when a volume statedump is performed. This happens when the statedump file is large (e.g. 123K).

Version-Release number of selected component (if applicable):
mainline

How reproducible:
Occasionally

Steps to Reproduce:
1.gluster volume statedump <volumename> all 

Actual results:
Dumps the state information of the volume into files and kills the glusterd process.

Expected results:
Should dump the state information of the volume into a file and should not kill glusterd.

Additional info:
[root@APP-SERVER1 ~]# ps -ef | grep gluster
root      1131 32390  0 23:02 pts/0    00:00:00 grep gluster
root     25773     1  0 Feb15 ?        00:12:28 /usr/local/sbin/glusterfsd -s localhost --volfile-id datastore.192.168.2.35.export1 -p /etc/glusterd/vols/datastore/run/192.168.2.35-export1.pid -S /tmp/2333e1dc05141f93d1455de5da4d7188.socket --brick-name /export1 -l /usr/local/var/log/glusterfs/bricks/export1.log --brick-port 24009 --xlator-option datastore-server.listen-port=24009
root     29803     1  0 Feb16 ?        00:00:01 glusterd
root     29823     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /usr/local/var/log/glusterfs/nfs.log
root     29830     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/glusterd/glustershd/run/glustershd.pid -l /usr/local/var/log/glusterfs/glustershd.log -S /tmp/ed4a200f26afd8f124ea87fd774eee7a.socket


[root@APP-SERVER1 ~]# gluster volume statedump datastore all
Connection failed. Please check if gluster daemon is operational.


[root@APP-SERVER1 ~]# ps -ef | grep gluster
root      1140 32390  0 23:02 pts/0    00:00:00 grep gluster
root     25773     1  0 Feb15 ?        00:12:30 /usr/local/sbin/glusterfsd -s localhost --volfile-id datastore.192.168.2.35.export1 -p /etc/glusterd/vols/datastore/run/192.168.2.35-export1.pid -S /tmp/2333e1dc05141f93d1455de5da4d7188.socket --brick-name /export1 -l /usr/local/var/log/glusterfs/bricks/export1.log --brick-port 24009 --xlator-option datastore-server.listen-port=24009
root     29823     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /usr/local/var/log/glusterfs/nfs.log
root     29830     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/glusterd/glustershd/run/glustershd.pid -l /usr/local/var/log/glusterfs/glustershd.log -S /tmp/ed4a200f26afd8f124ea87fd774eee7a.socket
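The two `ps -ef` listings above show that `glusterd` (pid 29803) is running before the statedump and gone afterwards, while the brick, NFS, and self-heal daemons survive. A minimal sketch that makes this comparison explicit by diffing the daemon names in the two listings (the embedded sample lines are abbreviated from the output above; this is an illustration, not part of the reporter's procedure):

```python
def gluster_procs(ps_output):
    """Return the set of gluster daemon names found in `ps -ef` output."""
    names = set()
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        # In `ps -ef` output, field 8 (index 7) is the start of the command.
        base = fields[7].rsplit("/", 1)[-1]
        if base.startswith("gluster"):
            names.add(base)
    return names

# Abbreviated from the listings in this report.
before = """\
root 25773 1 0 Feb15 ? 00:12:28 /usr/local/sbin/glusterfsd -s localhost
root 29803 1 0 Feb16 ? 00:00:01 glusterd
root 29823 1 0 Feb16 ? 00:00:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol
root 29830 1 0 Feb16 ? 00:00:00 /usr/local/sbin/glusterfs -s localhost
"""
after = """\
root 25773 1 0 Feb15 ? 00:12:30 /usr/local/sbin/glusterfsd -s localhost
root 29823 1 0 Feb16 ? 00:00:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol
root 29830 1 0 Feb16 ? 00:00:00 /usr/local/sbin/glusterfs -s localhost
"""
print(gluster_procs(before) - gluster_procs(after))  # daemons lost across the statedump
```

This also explains the "Connection failed. Please check if gluster daemon is operational." message: the CLI talks to glusterd on port 24007, so once glusterd dies, every subsequent gluster command fails the same way.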

Glusterd Logs:-
---------------
[2012-02-17 23:02:23.296831] I [glusterd-volume-ops.c:539:glusterd_handle_cli_statedump_volume] 0-glusterd: Recieved statedump request for volume datastore with options all
[2012-02-17 23:02:23.296918] I [glusterd-utils.c:262:glusterd_lock] 0-glusterd: Cluster lock held by bd6f3667-7c8f-4b06-90d7-eedf12139065
[2012-02-17 23:02:23.296956] I [glusterd-handler.c:448:glusterd_op_txn_begin] 0-management: Acquired local lock

Brick Log:-
---------
[2012-02-17 23:06:21.120155] D [socket.c:1796:socket_event_handler] 0-transport: disconnecting now
[2012-02-17 23:06:24.125217] D [common-utils.c:161:gf_resolve_ip6] 0-resolver: returning ip-::1 (port-24007) for hostname: localhost and port: 24007
[2012-02-17 23:06:24.129801] D [common-utils.c:181:gf_resolve_ip6] 0-resolver: next DNS query will return: ip-127.0.0.1 port-24007
[2012-02-17 23:06:24.129981] D [socket.c:289:__socket_disconnect] 0-glusterfs: shutdown() returned -1. Transport endpoint is not connected
[2012-02-17 23:06:24.130080] D [socket.c:193:__socket_rwv] 0-glusterfs: EOF from peer 127.0.0.1:24007
[2012-02-17 23:06:24.130132] D [socket.c:1510:__socket_proto_state_machine] 0-glusterfs: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:24007)
Comment 1 Kaushal 2012-02-20 00:38:07 EST
Can you get the full glusterd DEBUG log when this happens again? The log given is not very informative.
Comment 2 Kaushal 2012-03-01 02:17:12 EST
If "volume statedump" causes glusterd to crash for anyone else, please reply with the glusterd logs.
Comment 3 Kaushal 2012-03-28 02:45:30 EDT
One last call for information on this.
Comment 4 Shwetha Panduranga 2012-03-28 02:53:57 EDT
Unable to recreate this issue.
Comment 5 Kaushal 2012-03-28 03:13:31 EDT
Closing as the issue cannot be reproduced.