Bug 1568516

Summary: glusterfsd wont start at boot
Product: [Community] GlusterFS Reporter: Gabriel Sousa <ferreiragabrielsousa>
Component: glusterdAssignee: bugs <bugs>
Status: CLOSED EOL QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.13CC: amukherj, bugs, ferreiragabrielsousa
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-06-20 18:24:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gabriel Sousa 2018-04-17 16:38:16 UTC
Description of problem:

glusterfsd wont start at boot

Version-Release number of selected component (if applicable):

glusterfs 3.13.2

How reproducible:

just reboot the server and glusterfsd wont start

Additional info:

[2018-04-17 16:20:56.514372] I [MSGID: 100030] [glusterfsd.c:2556:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.13.2 (args: /usr/sbin/glusterfsd -s 10.7.101.46 --volfile-id gv001.10.7.101.46.data-vol001-gv001 -p /var/run/gluster/vols/gv001/10.7.101.46-data-vol001-gv001.pid -S /var/run/gluster/91fc1a4502f94e6b87ad4bd343ba2c12.socket --brick-name /data/vol001/gv001 -l /var/log/glusterfs/bricks/data-vol001-gv001.log --xlator-option *-posix.glusterd-uuid=fa244f30-53e8-4e28-b2f0-5d87a5406f07 --process-name brick --brick-port 49152 --xlator-option gv001-server.listen-port=49152)
[2018-04-17 16:20:56.519515] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-04-17 16:20:56.522188] W [socket.c:3216:socket_connect] 0-glusterfs: Error disabling sockopt IPV6_V6ONLY: "Protocol not available"
[2018-04-17 16:20:56.522234] E [socket.c:3314:socket_connect] 0-glusterfs: connection attempt on  failed, (Network is unreachable)
[2018-04-17 16:20:56.522753] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-04-17 16:20:56.522789] I [glusterfsd-mgmt.c:2209:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: 10.7.101.46
[2018-04-17 16:20:56.522798] I [glusterfsd-mgmt.c:2230:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2018-04-17 16:20:56.522954] W [glusterfsd.c:1393:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xab) [0x7f1888ff502b] -->/usr/sbin/glusterfsd(+0x115bd) [0x556d223e95bd] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x556d223e249b] ) 0-: received signum (1), shutting down
[2018-04-17 16:20:56.525252] W [socket.c:3216:socket_connect] 0-glusterfs: Error disabling sockopt IPV6_V6ONLY: "Protocol not available"
[2018-04-17 16:20:56.525305] E [socket.c:3314:socket_connect] 0-glusterfs: connection attempt on  failed, (Network is unreachable)
[2018-04-17 16:20:56.525318] I [socket.c:3629:socket_submit_request] 0-glusterfs: not connected (priv->connected = 0)
[2018-04-17 16:20:56.525324] W [rpc-clnt.c:1679:rpc_clnt_submit] 0-glusterfs: failed to submit rpc-request (XID: 0x2 Program: Gluster Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)

Comment 1 Atin Mukherjee 2018-04-18 12:03:17 UTC
was brick multiplexing enabled on this setup? 

output of "gluster volume info" along with glusterd log would help us to determine the cause.

Comment 2 Gabriel Sousa 2018-04-19 10:24:48 UTC
i dont know what is brick multiplexing, so i will say that i didn't enable
========================================================================
Volume Name: gv001
Type: Replicate
Volume ID: 5e8b4bc9-e696-4089-8f09-6ddef54c0546
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.7.101.46:/data/vol001/gv001
Brick2: 10.7.101.23:/data/vol001/gv001
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
features.cache-invalidation: on
=======================================================================
[root@STG006 glusterfs]$ cat glusterd.log
[2018-04-19 10:22:12.591109] I [MSGID: 100030] [glusterfsd.c:2556:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.13.2 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2018-04-19 10:22:12.621018] I [MSGID: 106478] [glusterd.c:1423:init] 0-management: Maximum allowed open file descriptors set to 65536
[2018-04-19 10:22:12.621059] I [MSGID: 106479] [glusterd.c:1481:init] 0-management: Using /var/lib/glusterd as working directory
[2018-04-19 10:22:12.621068] I [MSGID: 106479] [glusterd.c:1486:init] 0-management: Using /var/run/gluster as pid file working directory
[2018-04-19 10:22:12.630138] E [rpc-transport.c:283:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.13.2/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2018-04-19 10:22:12.630153] W [rpc-transport.c:287:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2018-04-19 10:22:12.630160] W [rpcsvc.c:1770:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2018-04-19 10:22:12.630168] E [MSGID: 106243] [glusterd.c:1769:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2018-04-19 10:22:12.639078] I [MSGID: 106228] [glusterd.c:499:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system [No such file or directory]
[2018-04-19 10:22:12.641327] I [MSGID: 106513] [glusterd-store.c:2241:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 31202
[2018-04-19 10:22:12.740756] I [MSGID: 106498] [glusterd-handler.c:3603:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2018-04-19 10:22:12.740836] W [MSGID: 106062] [glusterd-handler.c:3400:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2018-04-19 10:22:12.740859] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2018-04-19 10:22:12.740929] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-management: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-04-19 10:22:12.744988] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Protocol not available"
[2018-04-19 10:22:12.745165] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: fa244f30-53e8-4e28-b2f0-5d87a5406f07
[2018-04-19 10:22:12.745875] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2018-04-19 10:22:12.745981] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2018-04-19 10:22:12.746004] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: nfs service is stopped
[2018-04-19 10:22:12.746018] I [MSGID: 106600] [glusterd-nfs-svc.c:82:glusterd_nfssvc_manager] 0-management: nfs/server.so xlator is not installed
[2018-04-19 10:22:12.746049] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2018-04-19 10:22:12.746653] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2018-04-19 10:22:12.746671] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: glustershd service is stopped
[2018-04-19 10:22:12.746692] I [MSGID: 106567] [glusterd-svc-mgmt.c:197:glusterd_svc_start] 0-management: Starting glustershd service
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.listen-backlog 10
  8:     option event-threads 1
  9:     option ping-timeout 0
 10:     option transport.socket.read-fail-log off
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-time 10
 13:     option transport-type rdma
 14:     option working-directory /var/lib/glusterd
 15: end-volume
 16:
+------------------------------------------------------------------------------+
[2018-04-19 10:22:12.747622] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-04-19 10:22:12.768869] W [socket.c:3216:socket_connect] 0-glustershd: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:13.769261] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2018-04-19 10:22:13.769438] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2018-04-19 10:22:13.769462] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: quotad service is stopped
[2018-04-19 10:22:13.769493] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2018-04-19 10:22:13.769601] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2018-04-19 10:22:13.769616] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: bitd service is stopped
[2018-04-19 10:22:13.769642] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2018-04-19 10:22:13.769744] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2018-04-19 10:22:13.769775] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: scrub service is stopped
[2018-04-19 10:22:13.769811] I [glusterd-utils.c:5962:glusterd_brick_start] 0-management: starting a fresh brick process for brick /data/vol001/gv001
[2018-04-19 10:22:13.774393] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2018-04-19 10:22:13.774472] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:13.811746] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2018-04-19 10:22:13.811930] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-gfproxyd: setting frame-timeout to 600
[2018-04-19 10:22:13.812993] I [socket.c:2482:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2018-04-19 10:22:17.595121] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:20.595503] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:23.595922] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:25.376016] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: b162011e-b7cb-45d8-9032-157154d82d2c, host: stg007.siege.red, port: 0
[2018-04-19 10:22:25.381771] I [MSGID: 106492] [glusterd-handler.c:2718:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: b162011e-b7cb-45d8-9032-157154d82d2c
[2018-04-19 10:22:25.381800] I [MSGID: 106502] [glusterd-handler.c:2763:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2018-04-19 10:22:25.384987] I [MSGID: 106493] [glusterd-rpc-ops.c:701:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: b162011e-b7cb-45d8-9032-157154d82d2c
[2018-04-19 10:22:25.385211] I [MSGID: 106163] [glusterd-handshake.c:1361:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31202
[2018-04-19 10:22:25.390651] I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: b162011e-b7cb-45d8-9032-157154d82d2c
[2018-04-19 10:22:25.394073] I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to stg007.siege.red (0), ret: 0, op_ret: 0
[2018-04-19 10:22:25.404188] I [MSGID: 106492] [glusterd-handler.c:2718:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: b162011e-b7cb-45d8-9032-157154d82d2c
[2018-04-19 10:22:25.404246] I [MSGID: 106502] [glusterd-handler.c:2763:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2018-04-19 10:22:25.404350] I [MSGID: 106493] [glusterd-rpc-ops.c:701:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: b162011e-b7cb-45d8-9032-157154d82d2c
[2018-04-19 10:22:26.596475] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:29.611909] W [MSGID: 101095] [xlator.c:162:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/3.13.2/xlator/nfs/server.so: cannot open shared object file: No such file or directory
[2018-04-19 10:22:29.875020] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:32.875424] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:35.875878] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:38.876343] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:41.876829] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:44.877264] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
[2018-04-19 10:22:47.877692] W [socket.c:3216:socket_connect] 0-management: Error disabling sockopt IPV6_V6ONLY: "Operation not supported"
=======================================================================

[root@STG006 glusterfs]$ cat bricks/data-vol001-gv001.log
[2018-04-19 10:22:13.773628] I [MSGID: 100030] [glusterfsd.c:2556:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.13.2 (args: /usr/sbin/glusterfsd -s 10.7.101.46 --volfile-id gv001.10.7.101.46.data-vol001-gv001 -p /var/run/gluster/vols/gv001/10.7.101.46-data-vol001-gv001.pid -S /var/run/gluster/91fc1a4502f94e6b87ad4bd343ba2c12.socket --brick-name /data/vol001/gv001 -l /var/log/glusterfs/bricks/data-vol001-gv001.log --xlator-option *-posix.glusterd-uuid=fa244f30-53e8-4e28-b2f0-5d87a5406f07 --process-name brick --brick-port 49152 --xlator-option gv001-server.listen-port=49152)
[2018-04-19 10:22:13.778359] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2018-04-19 10:22:13.782196] W [socket.c:3216:socket_connect] 0-glusterfs: Error disabling sockopt IPV6_V6ONLY: "Protocol not available"
[2018-04-19 10:22:13.782243] E [socket.c:3314:socket_connect] 0-glusterfs: connection attempt on  failed, (Network is unreachable)
[2018-04-19 10:22:13.782335] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-04-19 10:22:13.782367] I [glusterfsd-mgmt.c:2209:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: 10.7.101.46
[2018-04-19 10:22:13.782376] I [glusterfsd-mgmt.c:2230:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2018-04-19 10:22:13.782523] W [glusterfsd.c:1393:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xab) [0x7f12deef202b] -->/usr/sbin/glusterfsd(+0x115bd) [0x561da6b755bd] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x561da6b6e49b] ) 0-: received signum (1), shutting down
[2018-04-19 10:22:13.785323] W [socket.c:3216:socket_connect] 0-glusterfs: Error disabling sockopt IPV6_V6ONLY: "Protocol not available"
[2018-04-19 10:22:13.785366] E [socket.c:3314:socket_connect] 0-glusterfs: connection attempt on  failed, (Network is unreachable)
[2018-04-19 10:22:13.785379] I [socket.c:3629:socket_submit_request] 0-glusterfs: not connected (priv->connected = 0)
[2018-04-19 10:22:13.785385] W [rpc-clnt.c:1679:rpc_clnt_submit] 0-glusterfs: failed to submit rpc-request (XID: 0x2 Program: Gluster Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)

========================================================================
MY WORKARROUND,

[root@STG006 glusterfs]$ crontab -l
@reboot (sleep 10; /usr/bin/systemctl restart glusterd)

Comment 3 Shyamsundar 2018-06-20 18:24:55 UTC
This bug reported is against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and the Version field be marked appropriately.