Created attachment 604042 [details] etc-glusterfs-glusterd.vol.log After disabling NFS, the message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds in glusterd log How reproducible: Happens every time NFS is disabled (on a Replicated volume, at least) Steps to Reproduce: 1. Disable NFS by running 'gluster volume set VOLUMENAME nfs.disable on' 2. Watch the log by running 'tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log' Actual results: The message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds, even though the NFS-related lines have disappeared from 'gluster volume status' output and the NFS service is indeed disabled. Additional info: Attached goes the 'etc-glusterfs-glusterd.vol.log' file right after the command 'gluster volume set VOLUMENAME nfs.disable on' is executed
glusterd generates this log as it tries to reconnect to non-existent unix domain socket (after doing nfs.disable=on): (gdb) p this.peerinfo $6 = {sockaddr = {ss_family = 1, __ss_align = 3905014932482385250, __ss_padding = "3b1f8e050d13478d02f8ecd.socket\000\000\220\375\315\n \177\000\000\215\b[\"\377\177\000\000p\t\000\374\037\177\000\000p\t\000\374\037\177\000\000\260\375\315\n \177\000\000\231\277\322\016 \177\000\000`\t\000\374\037\177\000\000 \t\000\374\037\177", '\000' <repeats 17 times>}, sockaddr_len = 110, identifier = "/tmp/1be55df163b1f8e050d13478d02f8ecd.socket", '\000' <repeats 63 times>} (gdb)
This is still present in 3.3.1.
This is still present in 3.4, but now outputs "E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)" instead.
This bug still present in 3.4.1 with a lot of messages in etc-glusterfs-glusterd.vol.log E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)
The version that this bug has been reported against, does not get any updates from the Gluster Community anymore. Please verify if this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug. If there has been no update before 9 December 2014, this bug will get automatocally closed.
still valid for glusterfs-3.5.3-1.el6.x86_64
running with debug loglevel, getting the following in etc-glusterfs-glusterd.vol.log : [2014-12-12 15:24:41.894149] D [socket.c:2903:socket_connect] 0-management: connection attempt on failed, (Connection refused) [2014-12-12 15:24:41.894187] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT [2014-12-12 15:24:44.894791] D [socket.c:2903:socket_connect] 0-management: connection attempt on failed, (Connection refused) [2014-12-12 15:24:44.894833] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT I can confirm that it is NFS disable related. running [root@UAK1-NAS-SRV2 glusterfs]# ps ax | grep "glusterd " 5658 ? Ssl 0:01 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid --log-level DEBUG [root@UAK1-NAS-SRV2 glusterfs]# strace -f -p 5658 Process 5658 attached with 6 threads - interrupt to quit [pid 5938] futex(0x162fbe4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...> [pid 5661] restart_syscall(<... resuming interrupted call ...> <unfinished ...> [pid 5937] restart_syscall(<... resuming interrupted call ...> <unfinished ...> [pid 5660] restart_syscall(<... resuming interrupted call ...> <unfinished ...> [pid 5659] rt_sigtimedwait([HUP INT USR1 USR2 TERM], NULL, NULL, 8 <unfinished ...> [pid 5658] epoll_wait(3, <unfinished ...> [pid 5937] <... restart_syscall resumed> ) = 0 [pid 5937] nanosleep({1, 0}, NULL) = 0 [pid 5937] socket(PF_FILE, SOCK_STREAM, 0) = 36 [pid 5937] fcntl(36, F_GETFL) = 0x2 (flags O_RDWR) [pid 5937] fcntl(36, F_SETFL, O_RDWR|O_NONBLOCK) = 0 [pid 5937] connect(36, {sa_family=AF_FILE, path="/var/run/20b48f9afa4a303f48fdec85f12b322f.socket"}, 110) = -1 ECONNREFUSED (Connection refused) [pid 5937] write(4, "[2014-12-12 15:26:12.796533] D ["..., 128) = 128 [pid 5937] close(36) = 0 [pid 5937] write(4, "[2014-12-12 15:26:12.796914] D ["..., 125) = 125 [pid 5937] nanosleep({1, 0}, NULL) = 0 Process 5658 detached Process 5659 detached Process 5660 detached Process 5661 detached Process 5937 detached Process 5938 detached [root@UAK1-NAS-SRV2 glusterfs]# ls -l "/var/run/20b48f9afa4a303f48fdec85f12b322f.socket" srwxr-xr-x 1 root root 0 Dec 12 19:02 /var/run/20b48f9afa4a303f48fdec85f12b322f.socket # netstat -an| grep 20b48f9afa4a303f48fdec85f12b322f # # less nfs.log [2014-12-12 15:02:51.991649] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.3 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/20b48f9afa4a303f48fdec85f12b322f.socket) [2014-12-12 15:02:51.995114] I [socket.c:3645:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled [2014-12-12 15:02:51.995158] I [socket.c:3660:socket_init] 0-socket.glusterfsd: using system polling thread [2014-12-12 15:02:51.995296] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled [2014-12-12 15:02:51.995314] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread [2014-12-12 15:02:52.044904] I [nfs.c:725:nfs_init_state] 0-nfs: NFS is manually disabled: Exiting [2014-12-12 15:02:52.045348] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x351b6e767d] (-->/lib64/libpthread.so.0() [0x351be07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x4053e5]))) 0-: received signum (15), shutting down
still valid for glusterfs-3.6.2-1.fc21.x86_64 [root@node01 glusterfs]# tail etc-glusterfs-glusterd.vol.log [2015-02-27 09:42:59.174338] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-02-27 09:43:02.175082] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-02-27 09:43:05.175699] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [root@node01 glusterfs]# tail nfs.log| grep 5cf388ba8b9610c52604425a16955178 [2015-02-27 09:41:25.369990] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5cf388ba8b9610c52604425a16955178.socket)
[root@node01 glusterfs]# rpm -q glusterfs glusterfs-3.7.1-1.fc22.x86_64 [root@node01 glusterfs]# gluster vol info vol2 Volume Name: vol2 Type: Distribute Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec Status: Started Number of Bricks: 1 Transport-type: tcp Bricks: Brick1: node01:/export/sdb1/brick2 Options Reconfigured: nfs.disable: on [root@node01 glusterfs]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log [2015-06-03 09:04:26.545507] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) [2015-06-03 09:04:26.545684] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-03 09:04:29.546227] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) [2015-06-03 09:04:29.546370] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-03 09:04:32.546903] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) [2015-06-03 09:04:32.547056] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-03 09:04:35.547560] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) [2015-06-03 09:04:35.547708] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-03 09:04:38.548274] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) [2015-06-03 09:04:38.548414] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) https://bugzilla.redhat.com/show_bug.cgi?id=1199944 https://bugzilla.redhat.com/show_bug.cgi?id=1199936
https://bugzilla.redhat.com/show_bug.cgi?id=1222065
[root@node01 glusterfs]# rpm -q glusterfs glusterfs-3.7.2-3.fc22.x86_64 [root@node01 mator]# gluster vol set vol2 nfs.disable off volume set: success [root@node01 glusterfs]# gluster vol info vol2 Volume Name: vol2 Type: Distribute Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec Status: Started Number of Bricks: 1 Transport-type: tcp Bricks: Brick1: node01:/export/sdb1/brick2 Options Reconfigured: nfs.disable: off [root@node01 mator]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log [2015-06-25 08:41:03.247283] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-25 08:41:06.248123] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-25 08:41:09.248778] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-25 08:41:12.249404] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument) [2015-06-25 08:41:15.249963] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
as a bonus, additional 3 second interval messages, on disconnecting one of the nodes: node01# systemctl stop glusterd node02# tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log ... [2015-06-25 08:51:27.558925] W [socket.c:642:__socket_rwv] 0-management: readv on 192.168.136.104:24007 failed (No data available) [2015-06-25 08:51:27.559132] I [MSGID: 106004] [glusterd-handler.c:4814:__glusterd_peer_rpc_notify] 0-management: Peer <node01> (<6fbbd20d-cac0-4944-ba73-4cd4690870db>), in state <Peer in Cluster>, has disconnected from glusterd. [2015-06-25 08:51:38.080144] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument [2015-06-25 08:51:38.080183] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument [2015-06-25 08:51:39.081989] E [socket.c:2332:socket_connect_finish] 0-management: connection to 192.168.136.104:24007 failed (Connection refused) [2015-06-25 08:51:41.084448] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument [2015-06-25 08:51:41.084514] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument [2015-06-25 08:51:44.088688] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument [2015-06-25 08:51:44.088744] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument [2015-06-25 08:51:47.093036] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument [2015-06-25 08:51:47.093101] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument [2015-06-25 08:51:50.097422] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument [2015-06-25 08:51:50.097493] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument [2015-06-25 08:51:53.101715] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
because of the large number of bugs filed against mainline version\ is ambiguous and about to be removed as a choice. If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.