Bug 1222065

Summary: GlusterD fills the logs when the NFS-server is disabled
Product: [Community] GlusterFS
Component: glusterd
Version: 3.7.0
Reporter: Niels de Vos <ndevos>
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
CC: alex, amukherj, bugs, enrico.tagliavini, gluster-bugs, kaushal, kparthas, manu, matorola, nsathyan, plarsen
Keywords: Reopened, Triaged
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
URL: http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10956
Fixed In Version: glusterfs-3.7.2
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-07-06 13:53:17 UTC
Bug Depends On: 1222378
Bug Blocks: 1227179, 1233025
Attachments: glusterfs log node01

Description Niels de Vos 2015-05-15 16:37:09 UTC
Description of problem:
When Gluster/NFS is disabled with the volume option nfs.disable=true, GlusterD fills its own logs with the following messages:

[2015-05-15 16:25:34.826068] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/2feeb86ac1b9bb2039781dcb18de449b.socket failed (Invalid argument)
[2015-05-15 16:25:37.826562] W [socket.c:3059:socket_connect] 0-nfs: Ignore failed connection attempt on /var/run/gluster/2feeb86ac1b9bb2039781dcb18de449b.socket, (No such file or directory) 

These lines are repeated every 3 seconds.
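
To gauge the rate, one can count the warnings in glusterd's log; a minimal sketch, using the log path from the reproduction steps below (adjust if your distribution places glusterd's log elsewhere):

# Count the repeated 0-nfs warnings logged so far; at one every 3 seconds
# this grows by roughly 20 per minute.
grep -c '0-nfs: readv on .*\.socket failed' /var/log/glusterfs/etc-glusterfs-glusterd.vol.log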

Version-Release number of selected component (if applicable):
3.7.0

How reproducible:
100%

Steps to Reproduce:
1. create and start a volume
2. gluster volume set $VOLUME nfs.disable true
3. tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
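
Spelled out as commands, a minimal sketch of the reproducer (volume name, host, and brick path are placeholders, not taken from the report):

# 1. Create and start a single-brick test volume.
gluster volume create testvol node01:/export/brick1 force
gluster volume start testvol

# 2. Disable the Gluster/NFS server for the volume.
gluster volume set testvol nfs.disable true

# 3. Watch glusterd's own log fill with the 0-nfs warnings.
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log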

Actual results:
Logs fill rapidly.

Expected results:
No log messages; the NFS server was disabled intentionally, so glusterd should not keep trying to connect to it.

Additional info:
Reported by Manu: http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10956

Comment 1 Anand Avati 2015-06-02 04:24:04 UTC
COMMIT: http://review.gluster.org/10963 committed in release-3.7 by Kaushal M (kaushal) 
------
commit efdb183c62be68f1a371237d1caac4347288f538
Author: Krishnan Parthasarathi <kparthas>
Date:   Tue May 19 14:48:01 2015 +0530

    glusterd: fix repeated connection to nfssvc failed msgs
    
    ... and disable reconnect timer on rpc_clnt_disconnect.
    
    Root Cause
    ----------
    
    The gluster-NFS service isn't started unless at least one
    started volume has the NFS service enabled. Before this fix,
    glusterd would initiate a connect even when the gluster-NFS
    service hadn't been (re)started. Compounding that,
    glusterd_conn_disconnect doesn't disable the reconnect timer,
    so the reconnect timer could still be executing when the
    timer event was being removed.
    
    Change-Id: Iadcb5cff9eafefa95eaf3a1a9413eeb682d3aaac
    BUG: 1222065
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/10830
    Reviewed-by: Atin Mukherjee <amukherj>
    Reviewed-by: Gaurav Kumar Garg <ggarg>
    Reviewed-by: Kaushal M <kaushal>
    Reviewed-on: http://review.gluster.org/10963
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System <jenkins.org>
    Reviewed-by: Emmanuel Dreyfus <manu>
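
In practice, the fix means that once no started volume has NFS enabled, glusterd neither starts the NFS service nor keeps reconnecting to its socket. A quick way to confirm that state on a fixed build (a sketch; "testvol" is the placeholder volume from the reproducer above, and the log path is the usual default):

# After disabling NFS on the last started volume, no gluster NFS process
# should remain and glusterd should log no further 0-nfs warnings.
gluster volume set testvol nfs.disable true
ps -ef | grep '[g]lusterfs' | grep nfs                         # expect no output
tail -n 5 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log    # no new 0-nfs lines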

Comment 2 Niels de Vos 2015-06-02 08:20:19 UTC
The required changes to fix this bug have not made it into glusterfs-3.7.1. This bug is now getting tracked for glusterfs-3.7.2.

Comment 3 Anatoly Pugachev 2015-06-04 14:48:54 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=847821

Comment 4 Niels de Vos 2015-06-20 09:48:24 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.2, please reopen this bug report.

glusterfs-3.7.2 has been announced on the Gluster Packaging mailing list [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/packaging/2015-June/000006.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

Comment 5 Anatoly Pugachev 2015-06-25 08:38:11 UTC
[root@node01 mator]# rpm -qa glusterfs*
glusterfs-server-3.7.2-3.fc22.x86_64
glusterfs-fuse-3.7.2-3.fc22.x86_64
glusterfs-libs-3.7.2-3.fc22.x86_64
glusterfs-rdma-3.7.2-3.fc22.x86_64
glusterfs-cli-3.7.2-3.fc22.x86_64
glusterfs-client-xlators-3.7.2-3.fc22.x86_64
glusterfs-geo-replication-3.7.2-3.fc22.x86_64
glusterfs-api-3.7.2-3.fc22.x86_64
glusterfs-3.7.2-3.fc22.x86_64

[root@node01 mator]# gluster vol info vol2

Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Stopped
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: on


No repeated NFS-socket-related messages in the logs so far... let's try to play a bit:

[root@node01 mator]# gluster vol set vol2 nfs.disable off
volume set: success
[root@node01 mator]# gluster vol info vol2

Volume Name: vol2
Type: Distribute
Volume ID: 070d64bd-a120-43fb-b524-fb85403343ec
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick2
Options Reconfigured:
nfs.disable: off

# tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-06-25 08:31:02.835985] I [glusterd-utils.c:4096:glusterd_nfs_pmap_deregister] 0-: De-registered ACL v3 successfully
[2015-06-25 08:31:02.870606] I [run.c:190:runner_log] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x147)[0x7f08802c9b87] (--> /lib64/libglusterfs.so.0(runner_log+0xf6)[0x7f08803146e6] (--> /usr/lib64/glusterfs/3.7.2/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x51d)[0x7f0874e3403d] (--> /usr/lib64/glusterfs/3.7.2/xlator/mgmt/glusterd.so(+0xd6512)[0x7f0874e34512] (--> /lib64/libpthread.so.0(+0x7555)[0x7f087f12d555] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=vol2 -o nfs.disable=off --gd-workdir=/var/lib/glusterd
[2015-06-25 08:31:02.885468] I [run.c:190:runner_log] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x147)[0x7f08802c9b87] (--> /lib64/libglusterfs.so.0(runner_log+0xf6)[0x7f08803146e6] (--> /usr/lib64/glusterfs/3.7.2/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x51d)[0x7f0874e3403d] (--> /usr/lib64/glusterfs/3.7.2/xlator/mgmt/glusterd.so(+0xd6512)[0x7f0874e34512] (--> /lib64/libpthread.so.0(+0x7555)[0x7f087f12d555] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S32gluster_enable_shared_storage.sh --volname=vol2 -o nfs.disable=off --gd-workdir=/var/lib/glusterd
[2015-06-25 08:31:06.109914] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:31:09.110526] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:31:09.942705] I [glusterd-handler.c:1404:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-06-25 08:31:09.943677] I [glusterd-handler.c:1404:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-06-25 08:31:09.944671] I [glusterd-handler.c:1404:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-06-25 08:31:12.111402] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
[2015-06-25 08:31:15.112063] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
...
The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:330:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 13 times between [2015-06-25 08:29:46.906695] and [2015-06-25 08:31:42.117820]
...
[2015-06-25 08:33:42.145842] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket failed (Invalid argument)
The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:330:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 39 times between [2015-06-25 08:31:45.118640] and [2015-06-25 08:33:42.145947]

Comment 6 Anatoly Pugachev 2015-06-25 08:39:06 UTC
missed "gluster vol start vol2" in previous message.

Comment 7 Anatoly Pugachev 2015-06-25 08:39:34 UTC
Can someone reopen this bug, please? Thanks.

Comment 8 Niels de Vos 2015-06-30 10:44:39 UTC
11:45 < mator> ndevos, how do i reopen bug in bugzilla.redhat.com ? 
               https://bugzilla.redhat.com/show_bug.cgi?id=1222065
11:45 <@glusterbot> Bug 1222065: high, high, ---, kparthas, CLOSED 
                    CURRENTRELEASE, GlusterD fills the logs when the NFS-server 
                    is disabled
11:47 < ndevos> mator: are you still hitting that bug with glusterfs-3.7.2? the 
                exact same problem?
11:48 < mator> ndevos, yeah


Please attach current logs and the exact version of the packages that you used.

Thanks!

Comment 9 Anatoly Pugachev 2015-06-30 11:19:54 UTC
Version info is in my previous comment; logs are in the attachment. Thanks.

Comment 10 Anatoly Pugachev 2015-06-30 11:22:36 UTC
Created attachment 1044665 [details]
glusterfs log node01

Log file attachment; it covers the upgrade to 3.7.2 and the NFS-related log messages, from node01.

Comment 11 krishnan parthasarathi 2015-07-01 05:08:25 UTC
(In reply to Anatoly Pugachev from comment #10)
> Created attachment 1044665 [details]
> glusterfs log node01
> 
> file logs attachment, it has upgrade to 3.7.2 and NFS related logs, from
> node01

IIUC, you are seeing repeated log messages when nfs.disable is set to "off" on vol2. As a side effect of setting nfs.disable to off, we have at least one started volume (i.e., vol2) that has the NFS service enabled. This tells glusterd that it has to start the NFS service if it's not running. I wonder why glusterd is failing to 'reach' the gluster NFS process via the unix domain socket file mentioned in the log messages. Could you check whether the gluster NFS process is running on the node where you see these messages in glusterd's log file? It would help if you could attach the gluster NFS and glusterd log files.
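
For reference, a minimal sketch of those checks, reusing the socket path from the log excerpt above; the nfs.log location is the usual default and may differ on your install:

# Is the gluster NFS server process running on this node?
gluster volume status vol2 nfs

# Does the unix domain socket from the warnings actually exist?
ls -l /var/run/gluster/5cf388ba8b9610c52604425a16955178.socket

# The gluster NFS log requested above is normally written here.
less /var/log/glusterfs/nfs.log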

Comment 12 Anatoly Pugachev 2015-07-06 13:07:17 UTC
Well, I was unable to reproduce this with glusterfs-3.7.2, unlike in my previous attempt. We can close this bug report as solved now. Thanks.

Comment 13 Niels de Vos 2015-07-06 13:53:17 UTC
Thanks for the confirmation!

Comment 14 Enrico Tagliavini 2015-07-14 11:50:10 UTC
With 3.6.4 this is still happening.

Comment 15 Alex Atkinson 2015-11-05 04:08:32 UTC
Still happening. glusterfs 3.7.5 built on Oct  7 2015 16:27:17

[2015-11-05 04:06:29.300039] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/0bba59acfb6bd5aaa82b3079be6b8686.socket failed (Invalid argument)

Please fix. Thanks!

Comment 16 Atin Mukherjee 2015-11-05 04:32:21 UTC
(In reply to Alex Atkinson from comment #15)
> Still happening. glusterfs 3.7.5 built on Oct  7 2015 16:27:17
> 
> [2015-11-05 04:06:29.300039] W [socket.c:588:__socket_rwv] 0-nfs: readv on
> /var/run/gluster/0bba59acfb6bd5aaa82b3079be6b8686.socket failed (Invalid
> argument)
> 
> Please fix. Thanks!

Could you please provide the reproduction steps here? Is it exactly the same issue this BZ began with?

Comment 17 Alex Atkinson 2015-11-05 04:43:26 UTC
Simple setup with defaults: two hosts, each with an ext4 mount as the brick, replica 2, no stripes, mounted on the client with mount.glusterfs.
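
For context, that setup corresponds to roughly the following commands (hostnames, volume name, and brick paths are placeholders, not taken from the report):

# Two hosts, one ext4-backed brick each, replica 2 (all names are
# placeholders for this sketch).
gluster peer probe host2
gluster volume create repvol replica 2 host1:/bricks/b1 host2:/bricks/b1
gluster volume start repvol

# On the client:
mount -t glusterfs host1:/repvol /mnt/repvol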

Comment 18 Peter Larsen 2016-05-31 01:56:16 UTC
rpm -q glusterfs-server
glusterfs-server-3.7.5-19.el7rhgs.x86_64

Same issue. Single node/brick setup; an nfs-server is already running on the host outside of Gluster, but other than that it is a simple ext4-backed volume being used by RHEV.
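
Worth noting: a kernel NFS server running on the same host competes with Gluster/NFS for the rpcbind registrations, which can itself keep the gluster NFS service from starting cleanly. A quick check (a sketch, assuming systemd and rpcbind are in use; $VOLUME is a placeholder as in the reproduction steps):

# Is the kernel NFS server running and registered with the portmapper?
systemctl status nfs-server
rpcinfo -p | grep -w nfs

# What does glusterd report for its own NFS server?
gluster volume status $VOLUME nfs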

Comment 19 Peter Larsen 2020-09-14 15:49:27 UTC
Clearing NEEDINFO