Bug 765434 (GLUSTER-3702)

Summary: cannot start glusterd
Product: [Community] GlusterFS Reporter: shylesh <shylesh>
Component: glusterd Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: pre-release CC: gluster-bugs, nsathyan, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.4.0 Doc Type: Bug Fix
Last Closed: 2013-07-24 13:40:07 EDT Type: ---

Description shylesh 2011-10-05 07:41:15 EDT
Not able to start glusterd

It was working fine until some time ago.
==============================================

[2011-10-05 16:57:48.360213] E [socket.c:2110:socket_connect] 0-management: connection attempt failed (No such file or directory)
[2011-10-05 16:57:49.361899] E [socket.c:2110:socket_connect] 0-management: connection attempt failed (No such file or directory)
[2011-10-05 16:57:50.368096] W [socket.c:1510:__socket_proto_state_machine] 0-management: reading from socket failed. Error (Transport endpoint is not connected), peer (192.168.2.44:24007)
[2011-10-05 16:57:51.365305] E [socket.c:2110:socket_connect] 0-management: connection attempt failed (No such file or directory)
[2011-10-05 16:57:52.366050] E [socket.c:2110:socket_connect] 0-management: connection attempt failed (No such file or directory)
[2011-10-05 16:57:54.397413] E [socket.c:2110:socket_connect] 0-management: connection attempt failed (No such file or directory)
[2011-10-05 16:57:55.399162] E [socket.c:2110:socket_connect] 0-management: connection attempt failed (No such file or directory)
[2011-10-05 16:57:57.402527] E [socket.c:2110:socket_connect] 0-management: connection attempt failed (No such file or directory)
[2011-10-05 16:57:58.120876] I [glusterfsd.c:1569:main] 0-/opt/glusterfs/3.3.0qa14/sbin/glusterd: Started Running /opt/glusterfs/3.3.0qa14/sbin/glusterd version 3.3.0qa14
[2011-10-05 16:57:58.221287] I [glusterd.c:805:init] 0-management: Using /etc/glusterd as working directory
[2011-10-05 16:57:58.277366] E [rpc-transport.c:261:rpc_transport_load] 0-rpc-transport: /opt/glusterfs/3.3.0qa14/lib64/glusterfs/3.3.0qa14/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2011-10-05 16:57:58.277399] E [rpc-transport.c:265:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2011-10-05 16:57:58.277418] W [rpcsvc.c:1320:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2011-10-05 16:57:58.289334] I [glusterd.c:92:glusterd_uuid_init] 0-glusterd: retrieved UUID: 18e4836d-a628-45f3-a040-44595e3b6d12
[2011-10-05 16:58:00.165106] E [glusterd-store.c:1205:glusterd_store_handle_retrieve] 0-glusterd: Unable to retrieve store handle for /etc/glusterd/vols/vol-replicate/info, error: No such file or directory
[2011-10-05 16:58:00.165176] E [glusterd-store.c:1975:glusterd_store_retrieve_volumes] 0-: Unable to restore volume: vol-replicate
[2011-10-05 16:58:00.165208] E [xlator.c:393:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2011-10-05 16:58:00.165228] E [graph.c:321:glusterfs_graph_init] 0-management: initializing translator failed
[2011-10-05 16:58:00.165242] E [graph.c:517:glusterfs_graph_activate] 0-graph: init failed
[2011-10-05 16:58:00.177626] W [glusterfsd.c:774:cleanup_and_exit] (-->/opt/glusterfs/3.3.0qa14/sbin/glusterd(main+0x47b) [0x405efb] (-->/opt/glusterfs/3.3.0qa14/sbin/glusterd(glusterfs_volumes_init+0x18b) [0x404c5b] (-->/opt/glusterfs/3.3.0qa14/sbin/glusterd(glusterfs_process_volfp+0x17a) [0x404aba]))) 0-: received signum (0), shutting down
[2011-10-05 17:00:45.807326] I [glusterfsd.c:1569:main] 0-glusterd: Started Running glusterd version 3.3.0qa14
[2011-10-05 17:00:45.818875] I [glusterd.c:805:init] 0-management: Using /etc/glusterd as working directory
[2011-10-05 17:00:45.820957] E [rpc-transport.c:261:rpc_transport_load] 0-rpc-transport: /opt/glusterfs/3.3.0qa14/lib64/glusterfs/3.3.0qa14/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2011-10-05 17:00:45.820987] E [rpc-transport.c:265:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2011-10-05 17:00:45.821008] W [rpcsvc.c:1320:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2011-10-05 17:00:45.821128] I [glusterd.c:92:glusterd_uuid_init] 0-glusterd: retrieved UUID: 18e4836d-a628-45f3-a040-44595e3b6d12
[2011-10-05 17:00:47.78545] E [glusterd-store.c:1205:glusterd_store_handle_retrieve] 0-glusterd: Unable to retrieve store handle for /etc/glusterd/vols/vol-replicate/info, error: No such file or directory
[2011-10-05 17:00:47.78605] E [glusterd-store.c:1975:glusterd_store_retrieve_volumes] 0-: Unable to restore volume: vol-replicate
[2011-10-05 17:00:47.78636] E [xlator.c:393:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2011-10-05 17:00:47.78660] E [graph.c:321:glusterfs_graph_init] 0-management: initializing translator failed
[2011-10-05 17:00:47.78675] E [graph.c:517:glusterfs_graph_activate] 0-graph: init failed
[2011-10-05 17:00:47.78911] W [glusterfsd.c:774:cleanup_and_exit] (-->glusterd(main+0x47b) [0x405efb] (-->glusterd(glusterfs_volumes_init+0x18b) [0x404c5b] (-->glusterd(glusterfs_process_volfp+0x17a) [0x404aba]))) 0-: received signum (0), shutting down
================================================================
Comment 1 krishnan parthasarathi 2011-10-06 06:36:22 EDT
Shylesh, 
The logs suggest that /etc/glusterd/vols/vol-replicate/info is missing. Were the files under /etc/glusterd removed manually instead of through a delete volume operation? Can you provide more information on which operations were performed before glusterd went into the state mentioned?
Comment 2 Vijay Bellur 2011-10-06 07:38:16 EDT
(In reply to comment #1)
I examined the setup. The root partition was out of disk space, so we ended up saving partial state. We need to make writes to the glusterd store atomic: we should either write the full state or write nothing at all.
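A minimal sketch of the all-or-nothing write described above, assuming a POSIX filesystem. It is an illustration of the general temp-file-plus-rename technique, not the glusterd implementation; the function name `atomic_write` and the `.tmp-` prefix are hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem (rename is only atomic within a filesystem).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # A full disk (ENOSPC) surfaces here, before `path` is touched.
            os.fsync(tmp.fileno())
        # Swap the complete file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, drop the partial temp file; the old `path` is untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern, a write that fails halfway leaves the previous file in place instead of a truncated one, so a restart reads the last complete state.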
Comment 3 shylesh 2011-10-06 22:38:12 EDT
Thanks, Vijay. As you said, it was a disk space issue. I freed up some space and glusterd started.
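For anyone hitting the same symptom, a quick free-space check on the filesystem holding the working directory can rule this cause in or out. The sketch below uses Python's standard `shutil.disk_usage`; the helper name `has_free_space` and the threshold are hypothetical.

```python
import shutil


def has_free_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has at least `min_free_bytes` free."""
    return shutil.disk_usage(path).free >= min_free_bytes


if __name__ == "__main__":
    # "/" is used because the root partition was the one that filled up here;
    # the 64 MiB threshold is an arbitrary example value.
    print(has_free_space("/", 64 * 1024 * 1024))
```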
Comment 4 shylesh 2011-10-06 22:57:44 EDT
*** Bug 3693 has been marked as a duplicate of this bug. ***
Comment 5 krishnan parthasarathi 2011-10-13 04:48:44 EDT
*** Bug 3696 has been marked as a duplicate of this bug. ***
Comment 6 Vijay Bellur 2012-07-19 12:03:56 EDT
CHANGE: http://review.gluster.com/654 (glusterd: atomic store update.) merged in master by Anand Avati (avati@redhat.com)
Comment 7 Vijay Bellur 2012-07-29 14:24:34 EDT
CHANGE: http://review.gluster.com/3726 (glusterd: Ensured 'store' data reaches disk.) merged in master by Anand Avati (avati@redhat.com)
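The content of this change is not shown in the report. As a general illustration of what making data "reach disk" usually involves on Linux, the sketch below flushes both a file and its parent directory entry with `os.fsync`. The helper name `fsync_file_and_dir` is hypothetical, and syncing a directory through a read-only descriptor is assumed to be supported, as it is on Linux.

```python
import os


def fsync_file_and_dir(path: str) -> None:
    """Flush `path` and its parent directory entry to stable storage."""
    # Flush the file's own data and metadata.
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    # Flush the directory so a newly created or renamed entry also survives a crash.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the directory sync, a file's contents can be on disk while its name is not, which matters after a create or rename followed by a power loss.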