Description of problem:
The glusterd process is killed when a volume statedump is performed. This happens when the statedump file size is large (e.g. 123K).

Version-Release number of selected component (if applicable): mainline

How reproducible: occasionally

Steps to Reproduce:
1. gluster volume statedump <volumename> all

Actual results:
Dumps the state information of the volume into files and kills the glusterd process.

Expected results:
Should dump the state information of the volume into a file without killing glusterd.

Additional info:

[root@APP-SERVER1 ~]# ps -ef | grep gluster
root      1131 32390  0 23:02 pts/0    00:00:00 grep gluster
root     25773     1  0 Feb15 ?        00:12:28 /usr/local/sbin/glusterfsd -s localhost --volfile-id datastore.192.168.2.35.export1 -p /etc/glusterd/vols/datastore/run/192.168.2.35-export1.pid -S /tmp/2333e1dc05141f93d1455de5da4d7188.socket --brick-name /export1 -l /usr/local/var/log/glusterfs/bricks/export1.log --brick-port 24009 --xlator-option datastore-server.listen-port=24009
root     29803     1  0 Feb16 ?        00:00:01 glusterd
root     29823     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /usr/local/var/log/glusterfs/nfs.log
root     29830     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/glusterd/glustershd/run/glustershd.pid -l /usr/local/var/log/glusterfs/glustershd.log -S /tmp/ed4a200f26afd8f124ea87fd774eee7a.socket

[root@APP-SERVER1 ~]# gluster volume statedump datastore all
Connection failed. Please check if gluster daemon is operational.

[root@APP-SERVER1 ~]# ps -ef | grep gluster
root      1140 32390  0 23:02 pts/0    00:00:00 grep gluster
root     25773     1  0 Feb15 ?        00:12:30 /usr/local/sbin/glusterfsd -s localhost --volfile-id datastore.192.168.2.35.export1 -p /etc/glusterd/vols/datastore/run/192.168.2.35-export1.pid -S /tmp/2333e1dc05141f93d1455de5da4d7188.socket --brick-name /export1 -l /usr/local/var/log/glusterfs/bricks/export1.log --brick-port 24009 --xlator-option datastore-server.listen-port=24009
root     29823     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /usr/local/var/log/glusterfs/nfs.log
root     29830     1  0 Feb16 ?        00:00:00 /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/glusterd/glustershd/run/glustershd.pid -l /usr/local/var/log/glusterfs/glustershd.log -S /tmp/ed4a200f26afd8f124ea87fd774eee7a.socket

Glusterd logs:
---------------
[2012-02-17 23:02:23.296831] I [glusterd-volume-ops.c:539:glusterd_handle_cli_statedump_volume] 0-glusterd: Recieved statedump request for volume datastore with options all
[2012-02-17 23:02:23.296918] I [glusterd-utils.c:262:glusterd_lock] 0-glusterd: Cluster lock held by bd6f3667-7c8f-4b06-90d7-eedf12139065
[2012-02-17 23:02:23.296956] I [glusterd-handler.c:448:glusterd_op_txn_begin] 0-management: Acquired local lock

Brick log:
---------
[2012-02-17 23:06:21.120155] D [socket.c:1796:socket_event_handler] 0-transport: disconnecting now
[2012-02-17 23:06:24.125217] D [common-utils.c:161:gf_resolve_ip6] 0-resolver: returning ip-::1 (port-24007) for hostname: localhost and port: 24007
[2012-02-17 23:06:24.129801] D [common-utils.c:181:gf_resolve_ip6] 0-resolver: next DNS query will return: ip-127.0.0.1 port-24007
[2012-02-17 23:06:24.129981] D [socket.c:289:__socket_disconnect] 0-glusterfs: shutdown() returned -1. Transport endpoint is not connected
[2012-02-17 23:06:24.130080] D [socket.c:193:__socket_rwv] 0-glusterfs: EOF from peer 127.0.0.1:24007
[2012-02-17 23:06:24.130132] D [socket.c:1510:__socket_proto_state_machine] 0-glusterfs: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:24007)
Can you get the full glusterd DEBUG log when this happens again? The log provided above is not very informative.
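One possible way to capture that DEBUG log is sketched below. It assumes the `--debug` option of glusterd (run in the foreground at DEBUG log level, logging to stderr) is available in this build, and that the service is managed via `service glusterd`; the log path `/tmp/glusterd-debug.log` is just an example. This is a diagnostic sketch, not the confirmed reproduction procedure.

```shell
# Stop the managed glusterd instance first
service glusterd stop

# Allow a core dump in case glusterd crashes again during the statedump
ulimit -c unlimited

# Restart glusterd in the foreground at DEBUG level (assumed --debug flag),
# capturing its stderr output to a file
glusterd --debug 2> /tmp/glusterd-debug.log &

# Trigger the suspected crash
gluster volume statedump datastore all

# If glusterd died, attach the end of the debug log (and any core file) here
tail -n 100 /tmp/glusterd-debug.log
ls -l core* 2>/dev/null
```

If a core file is produced, a backtrace from `gdb /usr/local/sbin/glusterd <corefile>` (`bt full`) would also help pinpoint where the crash occurs.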
If "volume statedump" causes glusterd to crash for anyone else, please reply with the glusterd logs.
One last call for information on this.
Unable to recreate this issue.
Closing as the issue cannot be reproduced.