Bug 847821 - After disabling NFS the message "0-transport: disconnecting now" keeps appearing in the logs
Status: CLOSED EOL
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64 Linux
Priority: medium, Severity: low
Assigned To: bugs@gluster.org
: Triaged
Depends On:
Blocks:
Reported: 2012-08-13 12:30 EDT by Joao F
Modified: 2015-10-22 11:46 EDT
11 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-22 11:46:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---


Attachments
etc-glusterfs-glusterd.vol.log (2.49 KB, text/x-log)
2012-08-13 12:30 EDT, Joao F

Description Joao F 2012-08-13 12:30:22 EDT
Created attachment 604042 [details]
etc-glusterfs-glusterd.vol.log

After disabling NFS, the message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds in glusterd log

How reproducible:

Happens every time NFS is disabled (on a Replicated volume, at least)

Steps to Reproduce:
1. Disable NFS by running 'gluster volume set VOLUMENAME nfs.disable on'
2. Watch the log by running 'tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log'
  
Actual results:

The message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds, even though the NFS-related lines have disappeared from 'gluster volume status' output and the NFS service is indeed disabled.

Additional info:

Attached is the 'etc-glusterfs-glusterd.vol.log' file, captured right after the command 'gluster volume set VOLUMENAME nfs.disable on' is executed.
Comment 1 Krishna Srinivas 2012-10-22 05:51:52 EDT
glusterd generates this log message as it tries to reconnect to a non-existent unix domain socket (after setting nfs.disable=on):

(gdb) p this.peerinfo
$6 = {sockaddr = {ss_family = 1, __ss_align = 3905014932482385250, 
    __ss_padding = "3b1f8e050d13478d02f8ecd.socket\000\000\220\375\315\n \177\000\000\215\b[\"\377\177\000\000p\t\000\374\037\177\000\000p\t\000\374\037\177\000\000\260\375\315\n \177\000\000\231\277\322\016 \177\000\000`\t\000\374\037\177\000\000 \t\000\374\037\177", '\000' <repeats 17 times>}, 
  sockaddr_len = 110, identifier = "/tmp/1be55df163b1f8e050d13478d02f8ecd.socket", '\000' <repeats 63 times>}
(gdb)
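The reconnect loop diagnosed above can be reproduced in miniature, independent of gluster. This is a minimal Python sketch (the path is a hypothetical stand-in for the per-volume NFS socket under /var/run, not glusterd code) showing that a connect() to a socket path that no longer exists fails immediately, which is what glusterd keeps retrying and logging every 3 seconds:

```python
import errno
import os
import socket
import tempfile

# Hypothetical stand-in for the per-volume NFS socket under /var/run;
# after 'nfs.disable on' the real socket file is gone, so every
# reconnect attempt fails immediately, is logged, and is retried.
missing_path = os.path.join(tempfile.mkdtemp(), "nfs-stand-in.socket")

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
try:
    sock.connect(missing_path)
    err = 0
except OSError as exc:
    err = exc.errno  # ENOENT: no socket file at that path
finally:
    sock.close()

print(errno.errorcode[err])  # ENOENT
```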
Comment 2 Pierre-Francois Laquerre 2012-11-29 10:04:30 EST
This is still present in 3.3.1.
Comment 3 Pierre-Francois Laquerre 2013-07-17 10:34:30 EDT
This is still present in 3.4, but now outputs "E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)" instead.
Comment 4 madox33 2013-11-15 01:41:02 EST
This bug is still present in 3.4.1, with a lot of messages in etc-glusterfs-glusterd.vol.log:
E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)
Comment 5 Niels de Vos 2014-11-27 09:53:47 EST
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.
Comment 6 Anatoly Pugachev 2014-12-12 07:33:18 EST
still valid for glusterfs-3.5.3-1.el6.x86_64
Comment 7 Anatoly Pugachev 2014-12-12 10:34:05 EST
running with debug loglevel, getting the following in etc-glusterfs-glusterd.vol.log :

[2014-12-12 15:24:41.894149] D [socket.c:2903:socket_connect] 0-management: connection attempt on  failed, (Connection refused)
[2014-12-12 15:24:41.894187] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT
[2014-12-12 15:24:44.894791] D [socket.c:2903:socket_connect] 0-management: connection attempt on  failed, (Connection refused)
[2014-12-12 15:24:44.894833] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT

I can confirm that it is NFS disable related.

running 

[root@UAK1-NAS-SRV2 glusterfs]# ps ax | grep "glusterd "
 5658 ?        Ssl    0:01 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid --log-level DEBUG

[root@UAK1-NAS-SRV2 glusterfs]# strace -f -p 5658
Process 5658 attached with 6 threads - interrupt to quit
[pid  5938] futex(0x162fbe4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  5661] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5937] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5660] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5659] rt_sigtimedwait([HUP INT USR1 USR2 TERM], NULL, NULL, 8 <unfinished ...>
[pid  5658] epoll_wait(3,  <unfinished ...>
[pid  5937] <... restart_syscall resumed> ) = 0
[pid  5937] nanosleep({1, 0}, NULL)     = 0
[pid  5937] socket(PF_FILE, SOCK_STREAM, 0) = 36
[pid  5937] fcntl(36, F_GETFL)          = 0x2 (flags O_RDWR)
[pid  5937] fcntl(36, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  5937] connect(36, {sa_family=AF_FILE, path="/var/run/20b48f9afa4a303f48fdec85f12b322f.socket"}, 110) = -1 ECONNREFUSED (Connection refused)
[pid  5937] write(4, "[2014-12-12 15:26:12.796533] D ["..., 128) = 128
[pid  5937] close(36)                   = 0
[pid  5937] write(4, "[2014-12-12 15:26:12.796914] D ["..., 125) = 125
[pid  5937] nanosleep({1, 0}, NULL)     = 0
Process 5658 detached
Process 5659 detached
Process 5660 detached
Process 5661 detached
Process 5937 detached
Process 5938 detached

[root@UAK1-NAS-SRV2 glusterfs]# ls -l "/var/run/20b48f9afa4a303f48fdec85f12b322f.socket"
srwxr-xr-x 1 root root 0 Dec 12 19:02 /var/run/20b48f9afa4a303f48fdec85f12b322f.socket

# netstat -an| grep 20b48f9afa4a303f48fdec85f12b322f
#

# less nfs.log
[2014-12-12 15:02:51.991649] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.3 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/20b48f9afa4a303f48fdec85f12b322f.socket)
[2014-12-12 15:02:51.995114] I [socket.c:3645:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled
[2014-12-12 15:02:51.995158] I [socket.c:3660:socket_init] 0-socket.glusterfsd: using system polling thread
[2014-12-12 15:02:51.995296] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-12-12 15:02:51.995314] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
[2014-12-12 15:02:52.044904] I [nfs.c:725:nfs_init_state] 0-nfs: NFS is manually disabled: Exiting
[2014-12-12 15:02:52.045348] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x351b6e767d] (-->/lib64/libpthread.so.0() [0x351be07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x4053e5]))) 0-: received signum (15), shutting down
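The strace, ls and netstat output above shows a slightly different variant: the socket file still exists, but the NFS glusterfs process exited right after "NFS is manually disabled", so nothing is listening on it. A small sketch (using a throwaway path, not gluster's) reproduces why connect() then returns ECONNREFUSED rather than ENOENT:

```python
import errno
import os
import socket
import tempfile

path = os.path.join(tempfile.mkdtemp(), "stale-stand-in.socket")

# Create a listening unix socket, then close it WITHOUT unlinking the
# file, mimicking the NFS process exiting and leaving its socket behind.
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)
srv.listen(1)
srv.close()

assert os.path.exists(path)  # file still there, like the 'ls -l' output

cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
try:
    cli.connect(path)
    err = 0
except OSError as exc:
    err = exc.errno  # ECONNREFUSED: nothing is accepting on the stale socket
finally:
    cli.close()

print(errno.errorcode[err])  # ECONNREFUSED
```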
Comment 8 Anatoly Pugachev 2015-02-27 04:43:54 EST
still valid for glusterfs-3.6.2-1.fc21.x86_64

[root@node01 glusterfs]# tail  etc-glusterfs-glusterd.vol.log  
[2015-02-27 09:42:59.174338] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-02-27 09:43:02.175082] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-02-27 09:43:05.175699] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)

[root@node01 glusterfs]# tail nfs.log| grep 5cf388ba8b9610c52604425a16955178
[2015-02-27 09:41:25.369990] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5cf388ba8b9610c52604425a16955178.socket)
Comment 9 Anatoly Pugachev 2015-06-03 05:05:29 EDT
[root@node01 glusterfs]# rpm -q glusterfs
glusterfs-3.7.1-1.fc22.x86_64

[root@node01 glusterfs]# gluster vol info vol2
 
Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: on

[root@node01 glusterfs]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-06-03 09:04:26.545507] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:26.545684] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:29.546227] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:29.546370] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:32.546903] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:32.547056] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:35.547560] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:35.547708] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:38.548274] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:38.548414] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)


https://bugzilla.redhat.com/show_bug.cgi?id=1199944
https://bugzilla.redhat.com/show_bug.cgi?id=1199936
Comment 10 Anatoly Pugachev 2015-06-04 10:48:05 EDT
https://bugzilla.redhat.com/show_bug.cgi?id=1222065
Comment 11 Anatoly Pugachev 2015-06-25 04:41:48 EDT
[root@node01 glusterfs]# rpm -q glusterfs
glusterfs-3.7.2-3.fc22.x86_64

[root@node01 mator]# gluster vol set vol2 nfs.disable off
volume set: success

[root@node01 glusterfs]# gluster vol info vol2
 
Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: off

[root@node01 mator]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-06-25 08:41:03.247283] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:06.248123] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:09.248778] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:12.249404] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:15.249963] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
Comment 12 Anatoly Pugachev 2015-06-25 04:56:16 EDT
As a bonus, there are additional messages at 3-second intervals when one of the nodes is disconnected:

node01# systemctl stop glusterd

node02# tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
...
[2015-06-25 08:51:27.558925] W [socket.c:642:__socket_rwv] 0-management: readv on 192.168.136.104:24007 failed (No data available)
[2015-06-25 08:51:27.559132] I [MSGID: 106004] [glusterd-handler.c:4814:__glusterd_peer_rpc_notify] 0-management: Peer <node01> (<6fbbd20d-cac0-4944-ba73-4cd4690870db>), in state <Peer in Cluster>, has disconnected from glusterd.
[2015-06-25 08:51:38.080144] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:38.080183] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:39.081989] E [socket.c:2332:socket_connect_finish] 0-management: connection to 192.168.136.104:24007 failed (Connection refused)
[2015-06-25 08:51:41.084448] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:41.084514] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:44.088688] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:44.088744] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:47.093036] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:47.093101] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:50.097422] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:50.097493] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:53.101715] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
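The repeating "failed to set TCP_USER_TIMEOUT -1000 ... Invalid argument" warning is consistent with the kernel rejecting negative values for that socket option. A sketch (not glusterd code; requires Linux with TCP_USER_TIMEOUT support) suggesting the same EINVAL:

```python
import errno
import socket

# The log above shows a value of -1000 being passed for TCP_USER_TIMEOUT.
# The Linux kernel rejects negative values for this option, which would
# explain the 'Invalid argument' warning seen every retry interval.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, -1000)
    err = 0
except OSError as exc:
    err = exc.errno
finally:
    sock.close()

print(errno.errorcode[err])  # EINVAL on Linux
```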
Comment 13 Kaleb KEITHLEY 2015-10-22 11:46:38 EDT
Because of the large number of bugs filed against it, the 'mainline' version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.
