Bug 847821 - After disabling NFS the message "0-transport: disconnecting now" keeps appearing in the logs
Summary: After disabling NFS the message "0-transport: disconnecting now" keeps appearing in the logs
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-08-13 16:30 UTC by Joao F
Modified: 2015-10-22 15:46 UTC
CC: 11 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-10-22 15:46:38 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
etc-glusterfs-glusterd.vol.log (2.49 KB, text/x-log)
2012-08-13 16:30 UTC, Joao F

Description Joao F 2012-08-13 16:30:22 UTC
Created attachment 604042 [details]
etc-glusterfs-glusterd.vol.log

After disabling NFS, the message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds in the glusterd log.

How reproducible:

Happens every time NFS is disabled (on a Replicated volume, at least)

Steps to Reproduce:
1. Disable NFS by running 'gluster volume set VOLUMENAME nfs.disable on'
2. Watch the log by running 'tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log'
  
Actual results:

The message "[socket.c:1798:socket_event_handler] 0-transport: disconnecting now" appears every 3 seconds, even though the NFS-related lines have disappeared from 'gluster volume status' output and the NFS service is indeed disabled.

Additional info:

Attached is the 'etc-glusterfs-glusterd.vol.log' file captured right after the command 'gluster volume set VOLUMENAME nfs.disable on' is executed.

Comment 1 Krishna Srinivas 2012-10-22 09:51:52 UTC
glusterd generates this log as it tries to reconnect to a non-existent unix domain socket (after setting nfs.disable=on):

(gdb) p this.peerinfo
$6 = {sockaddr = {ss_family = 1, __ss_align = 3905014932482385250, 
    __ss_padding = "3b1f8e050d13478d02f8ecd.socket\000\000\220\375\315\n \177\000\000\215\b[\"\377\177\000\000p\t\000\374\037\177\000\000p\t\000\374\037\177\000\000\260\375\315\n \177\000\000\231\277\322\016 \177\000\000`\t\000\374\037\177\000\000 \t\000\374\037\177", '\000' <repeats 17 times>}, 
  sockaddr_len = 110, identifier = "/tmp/1be55df163b1f8e050d13478d02f8ecd.socket", '\000' <repeats 63 times>}
(gdb)
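
To make the reconnect behaviour concrete, here is a minimal standalone sketch (not GlusterFS code; the socket path is hypothetical) of the pattern the gdb output above implies: a client that keeps retrying connect() on a unix-domain socket path every 3 seconds and logs a line on each failure, which is what produces the repeating message.

/* Minimal sketch (not GlusterFS code): retry connect() on a unix-domain
 * socket path every 3 seconds, the way glusterd's reconnect timer does
 * after nfs.disable=on removes the NFS process. The path is hypothetical;
 * glusterd retries indefinitely, this sketch stops after 5 attempts. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    const char *path = "/tmp/example-nfs.socket";   /* hypothetical path */
    int i;

    for (i = 0; i < 5; i++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return 1;

        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            /* ENOENT if the socket file is gone, ECONNREFUSED if the file
             * still exists but nothing listens on it any more */
            fprintf(stderr, "connect to %s failed: %s\n",
                    path, strerror(errno));

        close(fd);
        sleep(3);    /* matches the 3-second interval seen in the log */
    }
    return 0;
}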

Comment 2 Pierre-Francois Laquerre 2012-11-29 15:04:30 UTC
This is still present in 3.3.1.

Comment 3 Pierre-Francois Laquerre 2013-07-17 14:34:30 UTC
This is still present in 3.4, but now outputs "E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)" instead.

Comment 4 madox33 2013-11-15 06:41:02 UTC
This bug is still present in 3.4.1, with a lot of messages in etc-glusterfs-glusterd.vol.log:
E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)

Comment 5 Niels de Vos 2014-11-27 14:53:47 UTC
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify if this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will get automatically closed.

Comment 6 Anatoly Pugachev 2014-12-12 12:33:18 UTC
still valid for glusterfs-3.5.3-1.el6.x86_64

Comment 7 Anatoly Pugachev 2014-12-12 15:34:05 UTC
Running with debug log level, I get the following in etc-glusterfs-glusterd.vol.log:

[2014-12-12 15:24:41.894149] D [socket.c:2903:socket_connect] 0-management: connection attempt on  failed, (Connection refused)
[2014-12-12 15:24:41.894187] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT
[2014-12-12 15:24:44.894791] D [socket.c:2903:socket_connect] 0-management: connection attempt on  failed, (Connection refused)
[2014-12-12 15:24:44.894833] D [glusterd-handler.c:3766:__glusterd_nodesvc_rpc_notify] 0-management: got RPC_CLNT_DISCONNECT

I can confirm that it is related to disabling NFS.

The running glusterd process:

[root@UAK1-NAS-SRV2 glusterfs]# ps ax | grep "glusterd "
 5658 ?        Ssl    0:01 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid --log-level DEBUG

[root@UAK1-NAS-SRV2 glusterfs]# strace -f -p 5658
Process 5658 attached with 6 threads - interrupt to quit
[pid  5938] futex(0x162fbe4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  5661] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5937] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5660] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  5659] rt_sigtimedwait([HUP INT USR1 USR2 TERM], NULL, NULL, 8 <unfinished ...>
[pid  5658] epoll_wait(3,  <unfinished ...>
[pid  5937] <... restart_syscall resumed> ) = 0
[pid  5937] nanosleep({1, 0}, NULL)     = 0
[pid  5937] socket(PF_FILE, SOCK_STREAM, 0) = 36
[pid  5937] fcntl(36, F_GETFL)          = 0x2 (flags O_RDWR)
[pid  5937] fcntl(36, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  5937] connect(36, {sa_family=AF_FILE, path="/var/run/20b48f9afa4a303f48fdec85f12b322f.socket"}, 110) = -1 ECONNREFUSED (Connection refused)
[pid  5937] write(4, "[2014-12-12 15:26:12.796533] D ["..., 128) = 128
[pid  5937] close(36)                   = 0
[pid  5937] write(4, "[2014-12-12 15:26:12.796914] D ["..., 125) = 125
[pid  5937] nanosleep({1, 0}, NULL)     = 0
Process 5658 detached
Process 5659 detached
Process 5660 detached
Process 5661 detached
Process 5937 detached
Process 5938 detached

[root@UAK1-NAS-SRV2 glusterfs]# ls -l "/var/run/20b48f9afa4a303f48fdec85f12b322f.socket"
srwxr-xr-x 1 root root 0 Dec 12 19:02 /var/run/20b48f9afa4a303f48fdec85f12b322f.socket

# netstat -an| grep 20b48f9afa4a303f48fdec85f12b322f
#

# less nfs.log
[2014-12-12 15:02:51.991649] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.3 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/20b48f9afa4a303f48fdec85f12b322f.socket)
[2014-12-12 15:02:51.995114] I [socket.c:3645:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled
[2014-12-12 15:02:51.995158] I [socket.c:3660:socket_init] 0-socket.glusterfsd: using system polling thread
[2014-12-12 15:02:51.995296] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-12-12 15:02:51.995314] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
[2014-12-12 15:02:52.044904] I [nfs.c:725:nfs_init_state] 0-nfs: NFS is manually disabled: Exiting
[2014-12-12 15:02:52.045348] W [glusterfsd.c:1095:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x351b6e767d] (-->/lib64/libpthread.so.0() [0x351be07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x4053e5]))) 0-: received signum (15), shutting down
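
Taken together, the strace, netstat and nfs.log output above shows the sequence: the NFS glusterfs process creates /var/run/20b48f9afa4a303f48fdec85f12b322f.socket, exits because NFS is manually disabled, and leaves the socket file behind, so glusterd's periodic connect() attempts hit a file with no listener and get ECONNREFUSED. A minimal sketch of that situation follows (not GlusterFS code; the socket path is hypothetical).

/* Sketch (not GlusterFS code): a listener binds a unix socket and exits
 * without unlinking it, as the NFS glusterfs process does when it logs
 * "NFS is manually disabled: Exiting". The socket file remains on disk,
 * so a later connect() fails with ECONNREFUSED rather than ENOENT,
 * matching the strace output above. The path is hypothetical. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

static void fill_addr(struct sockaddr_un *addr, const char *path)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strncpy(addr->sun_path, path, sizeof(addr->sun_path) - 1);
}

int main(void)
{
    const char *path = "/tmp/stale-example.socket";  /* hypothetical */
    struct sockaddr_un addr;

    /* "NFS" side: bind + listen, then close without unlink(), leaving a
     * stale socket file behind, as happens when the process exits. */
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    fill_addr(&addr, path);
    unlink(path);                                   /* start clean */
    if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return 1;
    listen(srv, 1);
    close(srv);                            /* listener gone, file stays */

    /* "glusterd" side: the file still exists, but connect() is refused. */
    int cli = socket(AF_UNIX, SOCK_STREAM, 0);
    fill_addr(&addr, path);
    if (connect(cli, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        fprintf(stderr, "connect: %s\n",
                strerror(errno));          /* Connection refused */

    close(cli);
    unlink(path);
    return 0;
}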

Comment 8 Anatoly Pugachev 2015-02-27 09:43:54 UTC
still valid for glusterfs-3.6.2-1.fc21.x86_64

[root@node01 glusterfs]# tail  etc-glusterfs-glusterd.vol.log  
[2015-02-27 09:42:59.174338] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-02-27 09:43:02.175082] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-02-27 09:43:05.175699] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)

[root@node01 glusterfs]# tail nfs.log| grep 5cf388ba8b9610c52604425a16955178
[2015-02-27 09:41:25.369990] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5cf388ba8b9610c52604425a16955178.socket)

Comment 9 Anatoly Pugachev 2015-06-03 09:05:29 UTC
[root@node01 glusterfs]# rpm -q glusterfs
glusterfs-3.7.1-1.fc22.x86_64

[root@node01 glusterfs]# gluster vol info vol2
 
Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: on

[root@node01 glusterfs]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-06-03 09:04:26.545507] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:26.545684] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:29.546227] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:29.546370] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:32.546903] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:32.547056] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:35.547560] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:35.547708] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-03 09:04:38.548274] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket, (No such file or directory) 
[2015-06-03 09:04:38.548414] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)


https://bugzilla.redhat.com/show_bug.cgi?id=1199944
https://bugzilla.redhat.com/show_bug.cgi?id=1199936

Comment 10 Anatoly Pugachev 2015-06-04 14:48:05 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1222065

Comment 11 Anatoly Pugachev 2015-06-25 08:41:48 UTC
[root@node01 glusterfs]# rpm -q glusterfs
glusterfs-3.7.2-3.fc22.x86_64

[root@node01 mator]# gluster vol set vol2 nfs.disable off
volume set: success

[root@node01 glusterfs]# gluster vol info vol2
 
Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: off

[root@node01 mator]# tail -10 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-06-25 08:41:03.247283] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:06.248123] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:09.248778] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:12.249404] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:41:15.249963] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)

Comment 12 Anatoly Pugachev 2015-06-25 08:56:16 UTC
As a bonus, additional messages at 3-second intervals appear when one of the nodes is disconnected:

node01# systemctl stop glusterd

node02# tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
...
[2015-06-25 08:51:27.558925] W [socket.c:642:__socket_rwv] 0-management: readv on 192.168.136.104:24007 failed (No data available)
[2015-06-25 08:51:27.559132] I [MSGID: 106004] [glusterd-handler.c:4814:__glusterd_peer_rpc_notify] 0-management: Peer <node01> (<6fbbd20d-cac0-4944-ba73-4cd4690870db>), in state <Peer in Cluster>, has disconnected from glusterd.
[2015-06-25 08:51:38.080144] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:38.080183] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:39.081989] E [socket.c:2332:socket_connect_finish] 0-management: connection to 192.168.136.104:24007 failed (Connection refused)
[2015-06-25 08:51:41.084448] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:41.084514] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:44.088688] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:44.088744] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:47.093036] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:47.093101] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:50.097422] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
[2015-06-25 08:51:50.097493] E [socket.c:3015:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-06-25 08:51:53.101715] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 7, Invalid argument
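
The TCP_USER_TIMEOUT warning above appears to come from passing a negative value to setsockopt(): on Linux this option takes a non-negative timeout in milliseconds and rejects negative values with EINVAL, which would also explain the adjacent "Failed to set keep-alive: Invalid argument" error on every reconnect attempt. A minimal sketch of that check (not GlusterFS code):

/* Sketch (not GlusterFS code): TCP_USER_TIMEOUT must be >= 0 on Linux,
 * so a negative value such as the -1000 seen in the warning above is
 * rejected with EINVAL. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int timeout_ms = -1000;                  /* the bad value from the log */

    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &timeout_ms, sizeof(timeout_ms)) < 0)
        fprintf(stderr, "setsockopt(TCP_USER_TIMEOUT, %d): %s\n",
                timeout_ms, strerror(errno)); /* Invalid argument */

    close(fd);
    return 0;
}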

Comment 13 Kaleb KEITHLEY 2015-10-22 15:46:38 UTC
Because of the large number of bugs filed against it, the "mainline" version is ambiguous and about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

