Bug 1506487
| Summary: | glusterfs / glusterfsd processes may not start properly upon reboot/restart | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Sylvain <srn> |
| Component: | glusterd | Assignee: | bugs <bugs> |
| Status: | CLOSED NOTABUG | QA Contact: | |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.12 | CC: | abhishku, bugs, jack.wong, mchangir, sankarshan, sarora, srn |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-10-26 15:54:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description Sylvain 2017-10-26 08:11:25 UTC
Forgot the main thing: glusterd.log; see attachments.

During this run, none of the glusterfs/glusterfsd processes survived.

[root@rk4glust2 ~]# cat /var/log/glusterfs/bricks/mnt-dd1-brick.log
...
[2017-10-26 08:22:10.915434] W [socket.c:593:__socket_rwv] 0-glusterfs: readv on 10.45.4.2:24007 failed (Connection reset by peer)
...

[root@rk4glust2 ~]# cat /var/log/glusterfs/bricks/mnt-dd2-brick.log
...
[2017-10-26 08:21:55.676857] W [socket.c:593:__socket_rwv] 0-glusterfs: readv on 10.45.4.2:24007 failed (Connection reset by peer)
...

[root@rk4glust2 ~]# cat /var/log/glusterfs/glustershd.log
...
[2017-10-26 08:22:00.348887] W [socket.c:593:__socket_rwv] 0-glusterfs: readv on 127.0.0.1:24007 failed (Connection reset by peer)
...

Created attachment 1343597 [details]
Log of gluster daemon when glusterfs/glusterfsd failed
Sylvain,

1. Are you using Gluster for Virtual Machine image storage?
2. Are Gluster nodes running on bare metal?
3. Are you using arbiter bricks?

If possible, could you share TRACE level logs of one of the bricks that dies after a reboot?

To set the log level to TRACE:
# gluster volume set <vol> diagnostics.brick-log-level TRACE

To reset the log level to the default (INFO), repeat the above command with INFO in place of TRACE.

(In reply to Milind Changire from comment #3)
> 1. Are you using Gluster for Virtual Machine image storage ?
No, I'm using the "regular" RPM.

> 2. Are Gluster nodes running on bare metal ?
For this particular test cluster, no; I'm using 30 VMs distributed across 10 physical hosts (3 VMs per host); each VM has 4 CPUs and 8 GB of RAM.

> 3. Are you using arbiter bricks ?
No. I thought it was related when I was on a replica 2 volume, but now that I am on a replica 3 volume (where an arbiter shouldn't be needed, I think) I have the same problem.

> If possible, could you share TRACE level logs of one of the brick that dies
> after a reboot ?
Here you go. The bad news is that it didn't change anything for the failing glusterfsd process; I think the problem happens too early.

You can find attached:
- mnt-dd1-brick_NOK.log (the one which failed);
- mnt-dd2-brick_OK.log (the one which succeeded).

Created attachment 1343613 [details]
Failed brick with TRACE level set
Created attachment 1343614 [details]
SUCCESS brick with TRACE level set
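For reference, the log-level cycle suggested in comment #3 amounts to the following sketch. The volume name "brosto" is taken from the brick logs later in this report; substitute your own volume name:

# gluster volume get brosto diagnostics.brick-log-level
# gluster volume set brosto diagnostics.brick-log-level TRACE
(reboot the node or restart the Gluster services, then collect /var/log/glusterfs/bricks/*.log)
# gluster volume set brosto diagnostics.brick-log-level INFO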
(In reply to Sylvain from comment #4)
> (In reply to Milind Changire from comment #3)
> > 1. Are you using Gluster for Virtual Machine image storage ?
> No, i'm using the "regular" RPM;

You probably misunderstood my question. I want to know whether the files that represent the virtual machine backing store are inside a Gluster volume. E.g., assuming you are using qemu and you have a VM1.qcow2 file representing the first virtual machine, does VM1.qcow2 reside inside a Gluster volume?

Oh, sorry. Then no: my VMs and all their related files/disks are hosted on bare metal. The only Gluster volume involved here is the one I'm testing ON TOP of the VMs, and for now it doesn't contain anything.

Could you please attempt to collect the Gluster Daemon (glusterd) startup logs at TRACE level after a reboot? You will have to add "option log-level TRACE" to /etc/glusterfs/glusterd.vol and then reboot the Gluster node.

Done; see glusterd_TRACE.log. This time the two glusterfsd processes died but glusterfs survived.

[root@rk4glust2 ~]# cat /var/log/glusterfs/bricks/mnt-dd1-brick.log
[2017-10-26 11:12:57.064505] I [MSGID: 100030] [glusterfsd.c:2496:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.1 (args: /usr/sbin/glusterfsd -s 10.45.4.2 --volfile-id brosto.10.45.4.2.mnt-dd1-brick -p /var/run/gluster/vols/brosto/10.45.4.2-mnt-dd1-brick.pid -S /var/run/gluster/53ec230bdae6d472b0240b5b54a49ed7.socket --brick-name /mnt/dd1/brick -l /var/log/glusterfs/bricks/mnt-dd1-brick.log --xlator-option *-posix.glusterd-uuid=d1dfb9d4-6c05-45be-b0eb-13298c0cf367 --brick-port 49152 --xlator-option brosto-server.listen-port=49152)
...
[2017-10-26 11:13:13.244834] W [socket.c:593:__socket_rwv] 0-glusterfs: readv on 10.45.4.2:24007 failed (Connection reset by peer)
...

[root@rk4glust2 ~]# cat /var/log/glusterfs/bricks/mnt-dd2-brick.log
[2017-10-26 11:12:57.069304] I [MSGID: 100030] [glusterfsd.c:2496:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.1 (args: /usr/sbin/glusterfsd -s 10.45.4.2 --volfile-id brosto.10.45.4.2.mnt-dd2-brick -p /var/run/gluster/vols/brosto/10.45.4.2-mnt-dd2-brick.pid -S /var/run/gluster/c5c5852563f2288eef42f52e6161b23b.socket --brick-name /mnt/dd2/brick -l /var/log/glusterfs/bricks/mnt-dd2-brick.log --xlator-option *-posix.glusterd-uuid=d1dfb9d4-6c05-45be-b0eb-13298c0cf367 --brick-port 49153 --xlator-option brosto-server.listen-port=49153)
...
[2017-10-26 11:13:30.428871] W [socket.c:593:__socket_rwv] 0-glusterfs: readv on 10.45.4.2:24007 failed (Connection reset by peer)
...

Created attachment 1343686 [details]
glusterd.log with TRACE when glusterfsd die
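A minimal sketch of the glusterd.vol change being asked for above, assuming the stock management volume definition shipped with the 3.12 packages (your file may contain additional option lines; only the log-level line is the addition):

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option log-level TRACE
end-volume

TRACE is extremely verbose, so remember to remove the line (or set it back to INFO) once the logs have been collected.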
When brick processes fail, there is a message like

[2017-10-26 11:13:30.623896] W [socket.c:593:__socket_rwv] 0-management: readv on /var/run/gluster/c5c5852563f2288eef42f52e6161b23b.socket failed (No data available)

in glusterd.log. This socket is the same one passed as a command-line parameter to the corresponding glusterfsd process (... -S /var/run/gluster/c5c5852563f2288eef42f52e6161b23b.socket ...) and it does exist on the system.

It seems the log is still at INFO level, though, despite the option in the .vol file. I'll set it to TRACE in the service file and upload again.

See glusterd_TRACE_2.log. This time, one glusterfsd process and the glusterfs process died.

[root@rk4glust2 ~]# cat /var/log/glusterfs/bricks/mnt-dd1-brick.log
[2017-10-26 11:39:05.711158] I [MSGID: 100030] [glusterfsd.c:2496:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.1 (args: /usr/sbin/glusterfsd -s 10.45.4.2 --volfile-id brosto.10.45.4.2.mnt-dd1-brick -p /var/run/gluster/vols/brosto/10.45.4.2-mnt-dd1-brick.pid -S /var/run/gluster/53ec230bdae6d472b0240b5b54a49ed7.socket --brick-name /mnt/dd1/brick -l /var/log/glusterfs/bricks/mnt-dd1-brick.log --xlator-option *-posix.glusterd-uuid=d1dfb9d4-6c05-45be-b0eb-13298c0cf367 --brick-port 49152 --xlator-option brosto-server.listen-port=49152)
[2017-10-26 11:39:05.719876] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2017-10-26 11:39:05.724531] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-10-26 11:39:34.909346] W [socket.c:593:__socket_rwv] 0-glusterfs: readv on 10.45.4.2:24007 failed (Connection reset by peer)
[2017-10-26 11:39:34.909623] I [glusterfsd-mgmt.c:2208:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: 10.45.4.2
[2017-10-26 11:39:34.909649] I [glusterfsd-mgmt.c:2229:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-10-26 11:39:34.910146] W [glusterfsd.c:1347:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xab) [0x7f9c0f5bb00b] -->/usr/sbin/glusterfsd(+0x1126d) [0x7f9c0fcfb26d] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x7f9c0fcf418b] ) 0-: received signum (1), shutting down
[2017-10-26 11:39:34.910219] I [socket.c:3598:socket_submit_request] 0-glusterfs: not connected (priv->connected = -1)
[2017-10-26 11:39:34.910238] W [rpc-clnt.c:1679:rpc_clnt_submit] 0-glusterfs: failed to submit rpc-request (XID: 0x3 Program: Gluster Portmap, ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)

[root@rk4glust2 ~]# cat /var/log/glusterfs/glustershd.log
[2017-10-26 11:39:04.694881] I [MSGID: 100030] [glusterfsd.c:2496:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.1 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/eefb7938899437a1c25e42edaef663ce.socket --xlator-option *replicate*.node-uuid=d1dfb9d4-6c05-45be-b0eb-13298c0cf367)
[2017-10-26 11:39:04.701327] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2017-10-26 11:39:04.705235] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-10-26 11:41:11.932949] E [socket.c:2369:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection timed out); disconnecting socket
[2017-10-26 11:41:11.933104] I [glusterfsd-mgmt.c:2208:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: localhost
[2017-10-26 11:41:11.933148] I [glusterfsd-mgmt.c:2229:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-10-26 11:41:11.933747] W [glusterfsd.c:1347:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xab) [0x7f3c598eb00b] -->/usr/sbin/glusterfs(+0x1126d) [0x7f3c5a02b26d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7f3c5a02418b] ) 0-: received signum (1), shutting down

Created attachment 1343702 [details]
glusterd.log on TRACE (for real) when 1 glusterfsd + glusterfs die
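One way to do what is described above (setting the glusterd log level "in the service file") is a systemd drop-in that appends --log-level to the daemon's command line. This is only a sketch: copy the real ExecStart line from "systemctl cat glusterd" on your own system rather than from here, since distributions ship slightly different unit files:

# mkdir -p /etc/systemd/system/glusterd.service.d
# cat > /etc/systemd/system/glusterd.service.d/trace.conf <<EOF
[Service]
ExecStart=
ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level TRACE
EOF
# systemctl daemon-reload
# systemctl restart glusterd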
Sylvain,
Could you check if adding/changing "option transport.listen-backlog 100" in /etc/glusterfs/glusterd.vol helps any bit?

I restarted the services two times, then rebooted one time. All my processes stayed alive during these three tests, so I guess you fixed my issue :) Thank you very much for the time you spent on this.

Just a couple of things I'm wondering:
- From what I understand, my problem was due to the size of my cluster; is there a document somewhere dedicated to large cluster deployments, with guidelines, parameters to tune (just like this one), etc.?
- Is it equivalent to run "gluster volume set mysto transport.listen-backlog 100"?

(In reply to Sylvain from comment #17)
> I restarted services two times, then rebooted one time. All my processes
> stayed alive during these 3 tests; so .... I guess you fixed my issue :)
> Thank you very much for the time you spent for me.

I'm glad the effort paid off for you.

> Just a couple of thing I'm wondering:
>
> - From what I understand, my problem was due to the size of my cluster; is
> there somewhere a document dedicated to large cluster deployment, with
> guidelines, parameters to tune (just like this one), etc ?

Upstream Gluster documentation is available at: http://docs.gluster.org/en/latest/

All possible cluster-wide and volume-wide options can be listed using:
# gluster volume get all all

You can then set them using the command format:
# gluster volume set <vol> <option> <value>

and retrieve a specific value using:
# gluster volume get <vol> <option>
or
# gluster volume get <vol> all

Help for these options can be listed by:
# gluster volume help

> - Is it equivalent to run "gluster volume set mysto transport.listen-backlog
> 100" ?

Yes, that's the way you'd set the option for all the bricks. But for glusterd, you'll have to edit the /etc/glusterfs/glusterd.vol file manually. There is no command line for setting options for glusterd, other than the actual command line in the service file.

Ok, got it; I guess I can close the ticket now. Thanks again.

(In reply to Milind Changire from comment #16)
> Sylvain,
> Could you check if adding/changing "option transport.listen-backlog 100" in
> /etc/glusterfs/glusterd.vol helps any bit ?

We have run into this issue too. Thank you for your suggestion. It was quite helpful. Tweaking "transport.listen-backlog" fixed the problem for us.

One thing I want to note is that 100 may be too low. We are running about 40 Gluster volumes on a single server. We had to set "transport.listen-backlog" higher than 1024 before all our glusterfsd processes consistently started up. Because 1024 is higher than the default net.core.somaxconn kernel configuration value of 128, we also had to increase net.core.somaxconn to make that take effect. Otherwise, the kernel would silently truncate the listen backlog to SOMAXCONN (https://manpages.debian.org/stretch/manpages-dev/listen.2.en.html#NOTES).

We only had to edit /etc/glusterfs/glusterd.vol. We did not have to set transport.listen-backlog on any of our Gluster volumes.
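A minimal sketch of the glusterd.vol change from comment #16, again assuming the stock management volume definition (your file may carry additional options; 100 is only the value suggested above, and the last comment reports needing a much larger value for roughly 40 volumes):

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.listen-backlog 100
end-volume

glusterd has to be restarted for the change to take effect (e.g. # systemctl restart glusterd). As noted in the last comment, a backlog larger than the kernel's net.core.somaxconn is silently truncated, so the sysctl has to be raised as well in that case. A sketch, with values mirroring the figures mentioned above and an arbitrary sysctl.d file name; size them to your own deployment:

# sysctl -w net.core.somaxconn=1024
# echo "net.core.somaxconn = 1024" > /etc/sysctl.d/90-somaxconn.conf
# sysctl --system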