I got a problem, steps as follows: 1)gluster peer probe node-2, gluster peer probe node-3; 2)gluster vol create test node-{1,2,3}:/disk1 force; 3)gluster vol start test; 4)mount.glusterfs 127.0.0.1:/test /test/; then I try to /etc/init.d/glusterd restart, but glusterd restart failed. this is log: [2016-03-22 18:35:56.301154] I [MSGID: 106513] [glusterd-store.c:2058:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 40000 [2016-03-22 18:35:57.084676] I [MSGID: 106544] [glusterd.c:155:glusterd_uuid_init] 0-management: retrieved UUID: 136c8841-c5fe-41de-a5ad-c8939016ea2d [2016-03-22 18:35:57.092168] E [MSGID: 106201] [glusterd-store.c:3071:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: test [2016-03-22 18:35:57.092459] E [MSGID: 101019] [xlator.c:430:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again [2016-03-22 18:35:57.092499] E [MSGID: 101066] [graph.c:324:glusterfs_graph_init] 0-management: initializing translator failed [2016-03-22 18:35:57.092523] E [MSGID: 101176] [graph.c:670:glusterfs_graph_activate] 0-graph: init failed [2016-03-22 18:35:57.093132] W [glusterfsd.c:1251:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xee) [0x40a470] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x19d) [0x40a37d] -->/usr/sbin/glusterd(cleanup_and_exit+0x88) [0x408137] ) 0-: received signum (0), shutting down
Could you run /etc/init.d/glusterd -LDEBUG to get the debug logs such that we can figure out why did that fail? Also can you just paste the following output from all the three nodes: ls -lrt /var/lib/glusterd/vols/test
(In reply to Atin Mukherjee from comment #1) > Could you run /etc/init.d/glusterd -LDEBUG to get the debug logs such that > we can figure out why did that fail? > > > Also can you just paste the following output from all the three nodes: > > ls -lrt /var/lib/glusterd/vols/test root@node-1:~# ls -lrt /var/lib/glusterd/vols/test total 52 -rw------- 1 root root 3010 Mar 22 18:06 test.node-1.disk1.vol -rw------- 1 root root 2050 Mar 22 18:06 trusted-test.tcp-fuse.vol -rw------- 1 root root 3010 Mar 22 18:06 test.node-3.disk1.vol -rw------- 1 root root 3010 Mar 22 18:06 test.node-2.disk1.vol -rw------- 1 root root 1708 Mar 22 18:06 test.tcp-fuse.vol -rw------- 1 root root 1280 Mar 22 18:06 test-rebalance.vol -rw------- 1 root root 13 Mar 22 18:06 snapd.info -rw------- 1 root root 546 Mar 22 18:06 info drwxr-xr-x 2 root root 4096 Mar 22 18:06 bricks -rw------- 1 root root 93 Mar 22 18:06 node_state.info -rw------- 1 root root 15 Mar 22 18:54 cksum -rw------- 1 root root 0 Mar 22 18:54 quota.conf -rw------- 1 root root 18 Mar 22 18:54 quota.cksum drwxr-xr-x 2 root root 4096 Mar 22 19:32 run
(In reply to jiademing.dd from comment #2) > (In reply to Atin Mukherjee from comment #1) > > Could you run /etc/init.d/glusterd -LDEBUG to get the debug logs such that > > we can figure out why did that fail? > > > > > > Also can you just paste the following output from all the three nodes: > > > > ls -lrt /var/lib/glusterd/vols/test > > root@node-1:~# ls -lrt /var/lib/glusterd/vols/test > total 52 > -rw------- 1 root root 3010 Mar 22 18:06 test.node-1.disk1.vol > -rw------- 1 root root 2050 Mar 22 18:06 trusted-test.tcp-fuse.vol > -rw------- 1 root root 3010 Mar 22 18:06 test.node-3.disk1.vol > -rw------- 1 root root 3010 Mar 22 18:06 test.node-2.disk1.vol > -rw------- 1 root root 1708 Mar 22 18:06 test.tcp-fuse.vol > -rw------- 1 root root 1280 Mar 22 18:06 test-rebalance.vol > -rw------- 1 root root 13 Mar 22 18:06 snapd.info > -rw------- 1 root root 546 Mar 22 18:06 info > drwxr-xr-x 2 root root 4096 Mar 22 18:06 bricks > -rw------- 1 root root 93 Mar 22 18:06 node_state.info > -rw------- 1 root root 15 Mar 22 18:54 cksum > -rw------- 1 root root 0 Mar 22 18:54 quota.conf > -rw------- 1 root root 18 Mar 22 18:54 quota.cksum > drwxr-xr-x 2 root root 4096 Mar 22 19:32 run root@node-2:~# ls -lrt /var/lib/glusterd/vols/test total 48 -rw------- 1 root root 3010 Mar 22 18:08 test.node-1.disk1.vol -rw------- 1 root root 3010 Mar 22 18:08 test.node-3.disk1.vol -rw------- 1 root root 3010 Mar 22 18:08 test.node-2.disk1.vol -rw------- 1 root root 2050 Mar 22 18:08 trusted-test.tcp-fuse.vol -rw------- 1 root root 1708 Mar 22 18:08 test.tcp-fuse.vol -rw------- 1 root root 1280 Mar 22 18:08 test-rebalance.vol drwxr-xr-x 2 root root 4096 Mar 22 18:08 run -rw------- 1 root root 13 Mar 22 18:08 snapd.info -rw------- 1 root root 546 Mar 22 18:08 info drwxr-xr-x 2 root root 4096 Mar 22 18:08 bricks -rw------- 1 root root 93 Mar 22 18:08 node_state.info -rw------- 1 root root 15 Mar 22 18:08 cksum root@node-2:~# root@node-3:~# root@node-3:~# ls -lrt /var/lib/glusterd/vols/test total 48 -rw------- 1 root root 3010 Mar 22 18:06 test.node-3.disk1.vol -rw------- 1 root root 3010 Mar 22 18:06 test.node-2.disk1.vol -rw------- 1 root root 3010 Mar 22 18:06 test.node-1.disk1.vol -rw------- 1 root root 2050 Mar 22 18:06 trusted-test.tcp-fuse.vol -rw------- 1 root root 1708 Mar 22 18:06 test.tcp-fuse.vol -rw------- 1 root root 1280 Mar 22 18:06 test-rebalance.vol drwxr-xr-x 2 root root 4096 Mar 22 18:06 run -rw------- 1 root root 13 Mar 22 18:07 snapd.info -rw------- 1 root root 546 Mar 22 18:07 info drwxr-xr-x 2 root root 4096 Mar 22 18:07 bricks -rw------- 1 root root 93 Mar 22 18:07 node_state.info -rw------- 1 root root 15 Mar 22 18:07 cksum root@node-3:~#
Ok, how about the debug logs?
(In reply to Atin Mukherjee from comment #4) > Ok, how about the debug logs? [2016-03-23 09:41:13.108244] I [MSGID: 100030] [glusterfsd.c:2329:main] 0-glusterd: Started running glusterd version 3.8dev (args: glusterd --log-level=DEBUG) [2016-03-23 09:41:13.109426] D [logging.c:1878:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5 [2016-03-23 09:41:13.110786] D [MSGID: 0] [glusterfsd.c:633:get_volfp] 0-glusterfsd: loading volume file /etc/glusterfs/glusterd.vol [2016-03-23 09:41:13.115008] I [MSGID: 106478] [glusterd.c:1333:init] 0-management: Maximum allowed open file descriptors set to 65536 [2016-03-23 09:41:13.115200] I [MSGID: 106479] [glusterd.c:1382:init] 0-management: Using /var/lib/glusterd as working directory [2016-03-23 09:41:13.115476] D [MSGID: 0] [glusterd.c:406:glusterd_rpcsvc_options_build] 0-glusterd: listen-backlog value: 128 [2016-03-23 09:41:13.115783] D [rpcsvc.c:2320:rpcsvc_init] 0-rpc-service: RPC service inited. [2016-03-23 09:41:13.115945] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0 [2016-03-23 09:41:13.116069] D [rpc-transport.c:283:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.8dev/rpc-transport/socket.so [2016-03-23 09:41:13.121414] D [socket.c:3842:socket_init] 0-socket.management: Configued transport.tcp-user-timeout=0 [2016-03-23 09:41:13.121529] D [socket.c:3925:socket_init] 0-socket.management: SSL support on the I/O path is NOT enabled [2016-03-23 09:41:13.121624] D [socket.c:3928:socket_init] 0-socket.management: SSL support for glusterd is NOT enabled [2016-03-23 09:41:13.121709] D [socket.c:3945:socket_init] 0-socket.management: using system polling thread [2016-03-23 09:41:13.121809] D [name.c:565:server_fill_address_family] 0-socket.management: option address-family not specified, defaulting to inet [2016-03-23 09:41:13.122061] D [rpc-transport.c:283:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.8dev/rpc-transport/rdma.so [2016-03-23 09:41:13.122179] E [rpc-transport.c:287:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.8dev/rpc-transport/rdma.so: cannot open shared object file: No such file or directory [2016-03-23 09:41:13.122266] W [rpc-transport.c:291:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine [2016-03-23 09:41:13.122351] W [rpcsvc.c:1593:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed [2016-03-23 09:41:13.122433] E [MSGID: 106243] [glusterd.c:1606:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport [2016-03-23 09:41:13.122534] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: GlusterD svc peer, Num: 1238437, Ver: 2, Port: 0 [2016-03-23 09:41:13.122619] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: GlusterD svc cli read-only, Num: 1238463, Ver: 2, Port: 0 [2016-03-23 09:41:13.122704] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: GlusterD svc mgmt, Num: 1238433, Ver: 2, Port: 0 [2016-03-23 09:41:13.122789] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: GlusterD svc mgmt v3, Num: 1238433, Ver: 3, Port: 0 [2016-03-23 09:41:13.122873] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: Gluster Portmap, Num: 34123456, Ver: 1, Port: 0 [2016-03-23 09:41:13.122956] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: Gluster Handshake, Num: 14398633, Ver: 2, Port: 0 [2016-03-23 09:41:13.123039] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: Gluster MGMT Handshake, Num: 1239873, Ver: 1, Port: 0 [2016-03-23 09:41:13.123155] D [rpcsvc.c:2320:rpcsvc_init] 0-rpc-service: RPC service inited. [2016-03-23 09:41:13.123240] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0 [2016-03-23 09:41:13.123338] D [rpc-transport.c:283:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.8dev/rpc-transport/socket.so [2016-03-23 09:41:13.123530] D [socket.c:3791:socket_init] 0-socket.management: disabling nodelay [2016-03-23 09:41:13.123617] D [socket.c:3842:socket_init] 0-socket.management: Configued transport.tcp-user-timeout=0 [2016-03-23 09:41:13.123703] D [socket.c:3925:socket_init] 0-socket.management: SSL support on the I/O path is NOT enabled [2016-03-23 09:41:13.123786] D [socket.c:3928:socket_init] 0-socket.management: SSL support for glusterd is NOT enabled [2016-03-23 09:41:13.123901] D [socket.c:3945:socket_init] 0-socket.management: using system polling thread [2016-03-23 09:41:13.124061] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: GlusterD svc cli, Num: 1238463, Ver: 2, Port: 0 [2016-03-23 09:41:13.124152] D [rpcsvc.c:1870:rpcsvc_program_register] 0-rpc-service: New program registered: Gluster Handshake (CLI Getspec), Num: 14398633, Ver: 2, Port: 0 [2016-03-23 09:41:13.124307] D [MSGID: 0] [glusterd-utils.c:6211:glusterd_sm_tr_log_init] 0-glusterd: returning 0 [2016-03-23 09:41:13.124406] D [MSGID: 0] [glusterd.c:1674:init] 0-management: cannot get run-with-valgrind value [2016-03-23 09:41:13.223125] D [MSGID: 0] [glusterd.c:465:glusterd_check_gsync_present] 0-glusterd: Returning 0 [2016-03-23 09:41:13.223437] D [MSGID: 0] [glusterd.c:601:glusterd_crt_georep_folders] 0-glusterd: Returning 0 [2016-03-23 09:41:15.818931] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-: Returning 0 [2016-03-23 09:41:15.819077] D [MSGID: 0] [store.c:438:gf_store_handle_retrieve] 0-: Returning 0 [2016-03-23 09:41:15.819226] D [MSGID: 0] [store.c:306:gf_store_retrieve_value] 0-: key operating-version found [2016-03-23 09:41:15.819312] I [MSGID: 106513] [glusterd-store.c:2075:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 40000 [2016-03-23 09:41:15.819534] D [MSGID: 0] [glusterd-utils.c:583:glusterd_volinfo_new] 0-management: Returning 0 [2016-03-23 09:41:15.826305] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-: Returning 0 [2016-03-23 09:41:15.826645] D [MSGID: 0] [store.c:438:gf_store_handle_retrieve] 0-: Returning 0 [2016-03-23 09:41:15.826970] D [MSGID: 0] [store.c:500:gf_store_iter_new] 0-: Returning with 0 [2016-03-23 09:41:15.827314] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.827598] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = type value = 0 [2016-03-23 09:41:15.827838] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.828093] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = count value = 3 [2016-03-23 09:41:15.828315] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.828511] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = status value = 1 [2016-03-23 09:41:15.828726] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.828945] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = sub_count value = 0 [2016-03-23 09:41:15.829219] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.829413] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = stripe_count value = 1 [2016-03-23 09:41:15.829630] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.829824] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = replica_count value = 1 [2016-03-23 09:41:15.830052] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.830260] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = disperse_count value = 0 [2016-03-23 09:41:15.830456] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.830639] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = redundancy_count value = 0 [2016-03-23 09:41:15.830852] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.831036] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = version value = 2 [2016-03-23 09:41:15.831243] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.831420] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = transport-type value = 0 [2016-03-23 09:41:15.831616] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.831804] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = volume-id value = 55c09bc1-9b4f-449d-b6cc-e74e8b76fdf2 [2016-03-23 09:41:15.832051] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.832242] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = username value = 5261bdfd-eede-44ad-8703-8c9364000995 [2016-03-23 09:41:15.832479] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.832664] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = password value = e1a69bbe-6c05-43d2-aa17-6f68e0b28442 [2016-03-23 09:41:15.832871] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.833485] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = op-version value = 30704 [2016-03-23 09:41:15.833699] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.833912] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = client-op-version value = 3 [2016-03-23 09:41:15.834140] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.834348] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = quota-version value = 0 [2016-03-23 09:41:15.834562] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.834757] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = parent_volname value = N/A [2016-03-23 09:41:15.834972] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.835176] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = restored_from_snap value = 00000000-0000-0000-0000-000000000000 [2016-03-23 09:41:15.835405] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.835597] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = snap-max-hard-limit value = 256 [2016-03-23 09:41:15.835824] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.836023] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = transport.address-family value = inet [2016-03-23 09:41:15.836275] D [MSGID: 0] [glusterd-store.c:2801:glusterd_store_update_volinfo] 0-management: Parsed as Volume-set:key=transport.address-family,value:inet [2016-03-23 09:41:15.836502] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.836695] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = performance.readdir-ahead value = on [2016-03-23 09:41:15.836907] D [MSGID: 0] [glusterd-store.c:2801:glusterd_store_update_volinfo] 0-management: Parsed as Volume-set:key=performance.readdir-ahead,value:on [2016-03-23 09:41:15.837209] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.837418] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = brick-0 value = node-1:-disk1 [2016-03-23 09:41:15.837636] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.837829] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = brick-1 value = node-2:-disk1 [2016-03-23 09:41:15.838060] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.838304] D [MSGID: 0] [glusterd-store.c:2616:glusterd_store_update_volinfo] 0-management: key = brick-2 value = node-3:-disk1 [2016-03-23 09:41:15.838563] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with -1 [2016-03-23 09:41:15.838820] D [MSGID: 0] [store.c:500:gf_store_iter_new] 0-: Returning with 0 [2016-03-23 09:41:15.839066] D [MSGID: 0] [glusterd-utils.c:996:glusterd_brickinfo_new] 0-management: Returning 0 [2016-03-23 09:41:15.839312] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:15.845549] D [logging.c:2040:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk The message "D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0" repeated 21 times between [2016-03-23 09:41:15.839312] and [2016-03-23 09:41:15.839790] [2016-03-23 09:41:15.845538] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-: Returning 0 [2016-03-23 09:41:15.846188] D [MSGID: 0] [store.c:438:gf_store_handle_retrieve] 0-: Returning 0 [2016-03-23 09:41:15.846414] D [MSGID: 0] [store.c:500:gf_store_iter_new] 0-: Returning with 0 [2016-03-23 09:41:15.846670] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:16.644630] D [logging.c:2040:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk The message "D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0" repeated 7 times between [2016-03-23 09:41:15.846670] and [2016-03-23 09:41:16.644609] [2016-03-23 09:41:16.644622] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with -1 [2016-03-23 09:41:16.645870] D [MSGID: 0] [glusterd-peer-utils.c:133:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: node-1 [2016-03-23 09:41:16.646123] D [MSGID: 0] [common-utils.c:3275:gf_is_local_addr] 0-management: 10.10.21.50 [2016-03-23 09:41:16.646413] D [MSGID: 0] [common-utils.c:3160:gf_interface_search] 0-management: 10.10.21.50 is local address at interface eth0 [2016-03-23 09:41:16.646609] D [MSGID: 0] [store.c:306:gf_store_retrieve_value] 0-: key UUID found [2016-03-23 09:41:16.646789] D [MSGID: 0] [glusterd-store.c:2150:glusterd_retrieve_uuid] 0-management: Returning 0 [2016-03-23 09:41:16.646942] I [MSGID: 106544] [glusterd.c:155:glusterd_uuid_init] 0-management: retrieved UUID: 136c8841-c5fe-41de-a5ad-c8939016ea2d [2016-03-23 09:41:16.647079] D [MSGID: 0] [glusterd-peer-utils.c:168:glusterd_hostname_to_uuid] 0-management: returning 0 [2016-03-23 09:41:16.647212] D [MSGID: 0] [glusterd-utils.c:1037:glusterd_resolve_brick] 0-management: Returning 0 [2016-03-23 09:41:16.647380] D [MSGID: 0] [glusterd-utils.c:996:glusterd_brickinfo_new] 0-management: Returning 0 [2016-03-23 09:41:16.647525] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:16.654775] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-: Returning 0 [2016-03-23 09:41:16.654928] D [MSGID: 0] [store.c:438:gf_store_handle_retrieve] 0-: Returning 0 [2016-03-23 09:41:16.655079] D [MSGID: 0] [store.c:500:gf_store_iter_new] 0-: Returning with 0 [2016-03-23 09:41:16.655245] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0 [2016-03-23 09:41:16.655461] D [logging.c:2040:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk The message "D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with 0" repeated 7 times between [2016-03-23 09:41:16.655245] and [2016-03-23 09:41:16.655446] [2016-03-23 09:41:16.655456] D [MSGID: 0] [store.c:613:gf_store_iter_get_next] 0-: Returning with -1 [2016-03-23 09:41:16.655977] D [MSGID: 0] [glusterd-peer-utils.c:133:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: node-2 [2016-03-23 09:41:16.656183] D [MSGID: 0] [common-utils.c:3275:gf_is_local_addr] 0-management: 10.10.21.52 [2016-03-23 09:41:16.656717] D [logging.c:2040:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk The message "D [MSGID: 0] [common-utils.c:3275:gf_is_local_addr] 0-management: 10.10.21.52 " repeated 2 times between [2016-03-23 09:41:16.656183] and [2016-03-23 09:41:16.656612] [2016-03-23 09:41:16.656710] D [MSGID: 0] [common-utils.c:3291:gf_is_local_addr] 0-management: node-2 is not local [2016-03-23 09:41:16.657170] D [MSGID: 0] [glusterd-peer-utils.c:168:glusterd_hostname_to_uuid] 0-management: returning -1 [2016-03-23 09:41:16.657295] D [MSGID: 0] [glusterd-utils.c:1037:glusterd_resolve_brick] 0-management: Returning -1 [2016-03-23 09:41:16.657422] D [MSGID: 0] [glusterd-store.c:2446:glusterd_store_retrieve_bricks] 0-management: Returning with -1 [2016-03-23 09:41:16.657564] D [MSGID: 0] [glusterd-utils.c:913:glusterd_volume_brickinfos_delete] 0-management: Returning 0 [2016-03-23 09:41:16.657700] D [MSGID: 0] [store.c:459:gf_store_handle_destroy] 0-: Returning 0 [2016-03-23 09:41:16.657854] D [MSGID: 0] [glusterd-utils.c:971:glusterd_volinfo_delete] 0-management: Returning 0 [2016-03-23 09:41:16.657982] E [MSGID: 106201] [glusterd-store.c:3122:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: test [2016-03-23 09:41:16.658123] D [MSGID: 0] [glusterd-store.c:3147:glusterd_store_retrieve_volumes] 0-management: Returning with -1 [2016-03-23 09:41:16.658247] D [MSGID: 0] [glusterd-store.c:4417:glusterd_restore] 0-management: Returning -1 [2016-03-23 09:41:16.658599] E [MSGID: 101019] [xlator.c:430:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again [2016-03-23 09:41:16.658779] E [MSGID: 101066] [graph.c:324:glusterfs_graph_init] 0-management: initializing translator failed [2016-03-23 09:41:16.658969] E [MSGID: 101176] [graph.c:670:glusterfs_graph_activate] 0-graph: init failed [2016-03-23 09:41:16.659481] D [logging.c:1852:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 5 extra log messages [2016-03-23 09:41:16.659680] D [logging.c:1855:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 5 extra log messages [2016-03-23 09:41:16.660306] W [glusterfsd.c:1251:cleanup_and_exit] (-->glusterd(glusterfs_volumes_init+0xee) [0x40a470] -->glusterd(glusterfs_process_volfp+0x19d) [0x40a37d] -->glusterd(cleanup_and_exit+0x88) [0x408137] ) 0-: received signum (0), shutting down [2016-03-23 09:41:16.660584] D [glusterfsd-mgmt.c:2368:glusterfs_mgmt_pmap_signout] 0-fsd-mgmt: portmapper signout arguments not given
(In reply to Atin Mukherjee from comment #4) > Ok, how about the debug logs? I tried to analysis the code, in my opinion, the function glusterd_store_retrieve_bricks() needs to get remote hostname's peerinfo, but at this time has not call glusterd_store_retrieve_peers(), so glusterd restart failed. I adjust the call sequence about glusterd_store_retrieve_volumes() and glusterd_store_retrieve_peers() in glusterd_restore(),after this modification, glusterd restart OK. Why retrieve_volumes before retrieve_peers? Whether there are other reasons?
It seems to me that you are referring at master branch code. Are you sure your latest head is up to date. We actually fixed this issue with a patch [1], can you quickly check whether your git tree has the commit 9227386 which is "glusterd: Avoid ret value of glusterd_resolve_brick in retreive brick path" ? If not please git pull, recompile and then try to restart. [1] http://review.gluster.org/13588 ~Atin
(In reply to Atin Mukherjee from comment #8) > It seems to me that you are referring at master branch code. Are you sure > your latest head is up to date. We actually fixed this issue with a patch > [1], can you quickly check whether your git tree has the commit 9227386 > which is "glusterd: Avoid ret value of glusterd_resolve_brick in retreive > brick path" ? If not please git pull, recompile and then try to restart. > > [1] http://review.gluster.org/13588 > > ~Atin Yes,after I got this patch, glusterd restart OK. But it seems that (void)glusterd_resolve_brick (brickinfo) nothing to do. Why not call glusterd_store_retrieve_peers() before glusterd_store_retrieve_bricks()?
Well, I am pretty sure that there has been some history behind not restoring peers before volumes. I need to recollect that information. I am closing this bug now.