Bug 847821

Summary: After disabling NFS the message "0-transport: disconnecting now" keeps appearing in the logs
Product: [Community] GlusterFS
Component: glusterd
Reporter: Joao F <jfraga7>
Assignee: bugs <bugs>
Status: CLOSED EOL
Severity: low
Priority: medium
Version: mainline
Hardware: x86_64
OS: Linux
Keywords: Triaged
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-10-22 11:46:38 EDT
CC: btlchan, bugs, gluster-bugs, mailbox, matorola, m.karandashev, pierre.francois, rwheeler, tm, vagarwal, vpvainio

Attachments:
etc-glusterfs-glusterd.vol.log

Description Joao F 2012-08-13 12:30:22 EDT
Created attachment 604042 [details]
etc-glusterfs-glusterd.vol.log

After disabling NFS, the message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds in the glusterd log.

How reproducible:

Happens every time NFS is disabled (on a Replicated volume, at least)

Steps to Reproduce:
1. Disable NFS by running 'gluster volume set VOLUMENAME nfs.disable on'
2. Watch the log by running 'tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log'
  
Actual results:

The message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds, even though the NFS-related lines have disappeared from 'gluster volume status' output and the NFS service is indeed disabled.

Additional info:

Attached is the 'etc-glusterfs-glusterd.vol.log' file, captured right after the command 'gluster volume set VOLUMENAME nfs.disable on' is executed.
Comment 1 Krishna Srinivas 2012-10-22 05:51:52 EDT
glusterd generates this log message as it tries to reconnect to a non-existent unix domain socket (after setting nfs.disable=on):

(gdb) p this.peerinfo
$6 = {sockaddr = {ss_family = 1, __ss_align = 3905014932482385250, 
    __ss_padding = "3b1f8e050d13478d02f8ecd.socket\000\000\220\375\315\n \177\000\000\215\b[\"\377\177\000\000p\t\000\374\037\177\000\000p\t\000\374\037\177\000\000\260\375\315\n \177\000\000\231\277\322\016 \177\000\000`\t\000\374\037\177\000\000 \t\000\374\037\177", '\000' <repeats 17 times>}, 
  sockaddr_len = 110, identifier = "/tmp/1be55df163b1f8e050d13478d02f8ecd.socket", '\000' <repeats 63 times>}
(gdb)
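
For illustration, here is a minimal C sketch (not the actual glusterd/rpc code) of what that reconnect timer effectively does: retry connect() on the NFS service's unix socket path every 3 seconds and log the failure each time. The path is taken from the gdb output above; the loop and error reporting are assumptions made for illustration.

/* Illustrative sketch only: glusterd's reconnect behavior reduced to
 * its observable effect -- retry connect() on the NFS unix socket
 * every 3 seconds, logging the failure each time. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    /* socket path from the gdb output above */
    const char *path = "/tmp/1be55df163b1f8e050d13478d02f8ecd.socket";

    for (;;) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0)
            /* ENOENT if the path is gone; ECONNREFUSED if a stale
             * socket file remains with no listener behind it */
            fprintf(stderr, "connect on %s failed: %s\n",
                    path, strerror(errno));

        close(fd);
        sleep(3); /* matches the 3-second interval seen in the logs */
    }
}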
Comment 2 Pierre-Francois Laquerre 2012-11-29 10:04:30 EST
This is still present in 3.3.1.
Comment 3 Pierre-Francois Laquerre 2013-07-17 10:34:30 EDT
This is still present in 3.4, but now outputs "E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)" instead.
Comment 4 madox33 2013-11-15 01:41:02 EST
This bug is still present in 3.4.1, with a lot of messages in etc-glusterfs-glusterd.vol.log:
E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)
Comment 5 Niels de Vos 2014-11-27 09:53:47 EST
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.
Comment 6 Anatoly Pugachev 2014-12-12 07:33:18 EST
still valid for glusterfs-3.5.3-1.el6.x86_64
Comment 7 Anatoly Pugachev 2014-12-12 10:34:05 EST
Running with debug log level, I get the following in etc-glusterfs-glusterd.vol.log:

[2014-12-12 15:24:41.894149] D [socket.c:2903:socket_connect] 0-management: connection attempt on  failed, (Connection refused)
[2014-12-12 15:24:41.894187] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT
[2014-12-12 15:24:44.894791] D [socket.c:2903:socket_connect] 0-management: connection attempt on  failed, (Connection refused)
[2014-12-12 15:24:44.894833] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT

I can confirm that it is related to disabling NFS.

Running:

[root@UAK1-NAS-SRV2 glusterfs]# ps ax | grep "glusterd "
 5658 ?        Ssl    0:01 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid --log-level DEBUG

[root@UAK1-NAS-SRV2 glusterfs]# strace -f -p 5658
Process 5658 attached with 6 threads - interrupt to quit
[pid  5938] futex(0x162fbe4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  5661] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5937] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5660] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5659] rt_sigtimedwait([HUP INT USR1 USR2 TERM], NULL, NULL, 8 <unfinished ...>
[pid  5658] epoll_wait(3,  <unfinished ...>
[pid  5937] <... restart_syscall resumed> ) = 0
[pid  5937] nanosleep({1, 0}, NULL)     = 0
[pid  5937] socket(PF_FILE, SOCK_STREAM, 0) = 36
[pid  5937] fcntl(36, F_GETFL)          = 0x2 (flags O_RDWR)
[pid  5937] fcntl(36, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  5937] connect(36, {sa_family=AF_FILE, path="/var/run/20b48f9afa4a303f48fdec85f12b322f.socket"}, 110) = -1 ECONNREFUSED (Connection refused)
[pid  5937] write(4, "[2014-12-12 15:26:12.796533] D ["..., 128) = 128
[pid  5937] close(36)                   = 0
[pid  5937] write(4, "[2014-12-12 15:26:12.796914] D ["..., 125) = 125
[pid  5937] nanosleep({1, 0}, NULL)     = 0
Process 5658 detached
Process 5659 detached
Process 5660 detached
Process 5661 detached
Process 5937 detached
Process 5938 detached

[root@UAK1-NAS-SRV2 glusterfs]# ls -l "/var/run/20b48f9afa4a303f48fdec85f12b322f.socket"
srwxr-xr-x 1 root root 0 Dec 12 19:02 /var/run/20b48f9afa4a303f48fdec85f12b322f.socket

# netstat -an| grep 20b48f9afa4a303f48fdec85f12b322f
#

# less nfs.log
[2014-12-12 15:02:51.991649] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.3 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/20b48f9afa4a303f48fdec85f12b322f.socket)
[2014-12-12 15:02:51.995114] I [socket.c:3645:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled
[2014-12-12 15:02:51.995158] I [socket.c:3660:socket_init] 0-socket.glusterfsd: using system polling thread
[2014-12-12 15:02:51.995296] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-12-12 15:02:51.995314] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
[2014-12-12 15:02:52.044904] I [nfs.c:725:nfs_init_state] 0-nfs: NFS is manually disabled: Exiting
[2014-12-12 15:02:52.045348] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x351b6e767d] (-->/lib64/libpthread.so.0() [0x351be07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x4053e5]))) 0-: received signum (15), shutting down
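
The ls/netstat pair above, together with this nfs.log excerpt, shows the mechanism: the NFS glusterfs process starts, sees nfs.disable=on, and exits, but its socket file survives in /var/run, so glusterd's connect() attempts are refused. A minimal C sketch of that stale-socket state (the path /tmp/stale-demo.socket is hypothetical):

/* Sketch: create a unix socket file, close the socket so only the
 * filesystem entry survives, then connect() to it. The kernel
 * answers ECONNREFUSED -- the same error strace showed above. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    const char *path = "/tmp/stale-demo.socket"; /* hypothetical path */
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    unlink(path); /* start clean */
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    bind(srv, (struct sockaddr *)&addr, sizeof(addr)); /* creates the file */
    close(srv); /* the "server" exits; the socket file stays behind */

    int cli = socket(AF_UNIX, SOCK_STREAM, 0);
    if (connect(cli, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        printf("connect: %s\n", strerror(errno)); /* Connection refused */

    close(cli);
    unlink(path);
    return 0;
}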
Comment 8 Anatoly Pugachev 2015-02-27 04:43:54 EST
still valid for glusterfs-3.6.2-1.fc21.x86_64

[root@node01 glusterfs]# tail  etc-glusterfs-glusterd.vol.log  
[2015-02-27 09:42:59.174338] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-02-27 09:43:02.175082] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-02-27 09:43:05.175699] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)

[root@node01 glusterfs]# tail nfs.log| grep 5cf388ba8b9610c52604425a16955178
[2015-02-27 09:41:25.369990] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5cf388ba8b9610c52604425a16955178.socket)
Comment 9 Anatoly Pugachev 2015-06-03 05:05:29 EDT
[root@node01 glusterfs]# rpm -q glusterfs
glusterfs-3.7.1-1.fc22.x86_64

[root@node01 glusterfs]# gluster vol info vol2
 
Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: on

[root@node01 glusterfs]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-06-03 09:04:26.545507] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:26.545684] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:29.546227] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:29.546370] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:32.546903] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:32.547056] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:35.547560] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:35.547708] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:38.548274] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:38.548414] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)


Related bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1199944
https://bugzilla.redhat.com/show_bug.cgi?id=1199936
Comment 10 Anatoly Pugachev 2015-06-04 10:48:05 EDT
Related bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1222065
Comment 11 Anatoly Pugachev 2015-06-25 04:41:48 EDT
[root@node01 glusterfs]# rpm -q glusterfs
glusterfs-3.7.2-3.fc22.x86_64

[root@node01 mator]# gluster vol set vol2 nfs.disable off
volume set: success

[root@node01 glusterfs]# gluster vol info vol2
 
Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: off

[root@node01 mator]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-06-25 08:41:03.247283] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:06.248123] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:09.248778] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:12.249404] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:15.249963] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
Comment 12 Anatoly Pugachev 2015-06-25 04:56:16 EDT
As a bonus, here are additional messages at 3-second intervals when one of the nodes is disconnected:

node01# systemctl stop glusterd

node02# tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
...
[2015-06-25 08:51:27.558925] W [socket.c:642:__socket_rwv] 0-management: readv on 192.168.136.104:24007 failed (No data available)
[2015-06-25 08:51:27.559132] I [MSGID: 106004] [glusterd-handler.c:4814:__glusterd_peer_rpc_notify] 0-management: Peer <node01> (<6fbbd20d-cac0-4944-ba73-4cd4690870db>), in state <Peer in Cluster>, has disconnected from glusterd.
[2015-06-25 08:51:38.080144] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:38.080183] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:39.081989] E [socket.c:2332:socket_connect_finish] 0-management: connection to 192.168.136.104:24007 failed (Connection refused)
[2015-06-25 08:51:41.084448] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:41.084514] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:44.088688] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:44.088744] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:47.093036] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:47.093101] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:50.097422] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:50.097493] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:53.101715] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
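
The keepalive warnings have a simple cause: Linux rejects a negative TCP_USER_TIMEOUT value with EINVAL, so whatever computation in glusterd produced -1000 makes every setsockopt() call fail. A minimal sketch of just that kernel behavior (the value -1000 is taken from the log above):

/* Sketch: a negative TCP_USER_TIMEOUT is rejected by the Linux
 * kernel with EINVAL, which is what produces the repeated
 * "failed to set TCP_USER_TIMEOUT -1000" warnings above. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int timeout = -1000; /* the value seen in the glusterd log */

    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &timeout, sizeof(timeout)) != 0)
        printf("setsockopt(TCP_USER_TIMEOUT, %d): %s\n",
               timeout, strerror(errno)); /* prints "Invalid argument" */

    close(fd);
    return 0;
}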
Comment 13 Kaleb KEITHLEY 2015-10-22 11:46:38 EDT
Because of the large number of bugs filed against the 'mainline' version, which is ambiguous and about to be removed as a choice, this bug is being closed.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.