Bug 1380655 - Continuous errors getting in the mount log when the volume mount server glusterd is down.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rpc
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Mohit Agrawal
QA Contact: Byreddy
URL:
Whiteboard:
Depends On: 1388877
Blocks: 1351528 1394108 1394109
Reported: 2016-09-30 09:06 UTC by Byreddy
Modified: 2017-03-23 06:06 UTC

Fixed In Version: glusterfs-3.8.4-6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1388877 1394108 1394109 (view as bug list)
Environment:
Last Closed: 2017-03-23 06:06:54 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 0 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 09:18:45 UTC

Description Byreddy 2016-09-30 09:06:45 UTC
Description of problem:
=======================
When glusterd on the volume's mount server is down, the error messages below are written to the volume mount log continuously, once every 3 seconds.

<START>
[2016-09-30 08:45:54.917489] E [glusterfsd-mgmt.c:1922:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 10.70.43.190 (Transport endpoint is not connected)
[2016-09-30 08:45:54.917542] I [glusterfsd-mgmt.c:1939:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2016-09-30 08:45:57.924521] E [glusterfsd-mgmt.c:1922:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 10.70.43.190 (Transport endpoint is not connected)
[2016-09-30 08:45:57.924585] I [glusterfsd-mgmt.c:1939:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2016-09-30 08:46:00.931708] E [glusterfsd-mgmt.c:1922:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 10.70.43.190 (Transport endpoint is not connected)
[2016-09-30 08:46:00.931781] I [glusterfsd-mgmt.c:1939:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2016-09-30 08:46:03.938789] E [glusterfsd-mgmt.c:1922:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 10.70.43.190 (Transport endpoint is not connected)
[2016-09-30 08:46:03.938857] I [glusterfsd-mgmt.c:1939:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
<END>


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.8.4-2


How reproducible:
=================
Always


Steps to Reproduce:
===================
1. Set up a one- or two-node cluster.
2. Create a replica volume (I used 7 x 2 = 14).
3. FUSE-mount the volume.
4. Stop glusterd on the node from which the volume is mounted.
5. Check the volume mount log.



Actual results:
===============
Error messages are logged continuously, once every 3 seconds.

Expected results:
=================
The rate at which these errors are logged should be limited, or some other solution should be provided.
At one message pair every 3 seconds, the log will consume a lot of storage if the volume mount server is down for any reason.
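
As a rough illustration of the storage concern, the sketch below estimates log growth from the two messages quoted in the description. It assumes one error/info pair per 3-second retry and one byte per character plus a newline; the function name and the 24-hour window are chosen here only for illustration.

```python
# The two lines logged per retry cycle, copied from the mount log excerpt above.
ERROR_LINE = ("[2016-09-30 08:45:54.917489] E [glusterfsd-mgmt.c:1922:mgmt_rpc_notify] "
              "0-glusterfsd-mgmt: failed to connect with remote-host: 10.70.43.190 "
              "(Transport endpoint is not connected)")
INFO_LINE = ("[2016-09-30 08:45:54.917542] I [glusterfsd-mgmt.c:1939:mgmt_rpc_notify] "
             "0-glusterfsd-mgmt: Exhausted all volfile servers")


def estimate_log_bytes(outage_seconds, interval_seconds=3):
    """Estimate bytes written to the mount log during an outage."""
    cycles = outage_seconds // interval_seconds
    bytes_per_cycle = len(ERROR_LINE) + len(INFO_LINE) + 2  # +2 for the newlines
    return cycles * bytes_per_cycle


if __name__ == "__main__":
    day = 24 * 60 * 60  # assumed outage length: one day
    print(f"{estimate_log_bytes(day) / 1e6:.1f} MB per day of outage, per mount")
```

The figure is per mount log, so a client with several mounts from the same server would multiply it accordingly.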


Additional info:

Comment 2 Ravishankar N 2016-09-30 09:17:51 UTC
Changing component to core since this is not relevant to FUSE per se and the behaviour can be observed on gNFS mounts too.

Comment 3 Byreddy 2016-10-18 06:42:04 UTC
This issue is not present in the last GA build.

Comment 5 Atin Mukherjee 2016-10-26 04:36:14 UTC
Apologies Byreddy, I completely missed comment 3. I will move this back to 3.2.0 for further analysis. Thanks for catching it!

Comment 6 Atin Mukherjee 2016-10-26 11:33:21 UTC
Upstream mainline patch http://review.gluster.org/15732 has been posted for review.

Comment 7 Mohit Agrawal 2016-10-27 05:20:09 UTC
Hi,
 
 The mgmt_rpc_notify messages appear continuously in this build because this patch (http://review.gluster.org/#/c/13002/) removed a check that used to guard the code block executed on RPC_CLNT_DISCONNECT.
 
 To reduce the frequency of the messages, change gf_log to GF_LOG_OCCASIONALLY.

Regards
Mohit Agrawal
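
The idea behind the suggested change can be sketched as counter-based log throttling. This is an illustrative Python model, not the glusterfs C code: the class name `OccasionalLogger` and the interval of 42 calls are assumptions made for the example, and the actual behaviour of GF_LOG_OCCASIONALLY is defined in the glusterfs logging headers.

```python
import logging


class OccasionalLogger:
    """Emit a message only on the first of every `every` calls."""

    def __init__(self, logger, every=42):  # 42 is an assumed interval
        self.logger = logger
        self.every = every
        self.calls = 0

    def log(self, level, msg, *args):
        # Emit on calls 0, every, 2*every, ...; every call is counted.
        emit = self.calls % self.every == 0
        self.calls += 1
        if emit:
            self.logger.log(level, msg, *args)
        return emit


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    throttled = OccasionalLogger(logging.getLogger("glusterfsd-mgmt"), every=42)
    # Simulate 100 failed reconnect attempts; only attempts 0, 42 and 84 are logged.
    emitted = sum(
        throttled.log(logging.ERROR, "failed to connect with remote-host: %s", "10.70.43.190")
        for _ in range(100)
    )
    print(emitted)  # 3
```

Because the first call is always emitted, the initial disconnect is still reported immediately; only the repeats are suppressed.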

Comment 12 Byreddy 2016-12-08 06:07:07 UTC
Verified this BZ using the build glusterfs-3.8.4-7.

The fix works: far fewer error messages are now logged than before when the volfile server is down.

[2016-12-08 05:54:55.846722] W [socket.c:590:__socket_rwv] 0-glusterfs: readv on 10.70.41.198:24007 failed (No data available)
[2016-12-08 05:54:55.846894] E [glusterfsd-mgmt.c:1924:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 10.70.41.198 (No data available)
[2016-12-08 05:54:55.846919] I [glusterfsd-mgmt.c:1942:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2016-12-08 05:55:07.740290] E [socket.c:2309:socket_connect_finish] 0-glusterfs: connection to 10.70.41.198:24007 failed (Connection refused)



[2016-12-08 05:57:10.035103] E [glusterfsd-mgmt.c:1924:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 10.70.41.198 (Transport endpoint is not connected)
[2016-12-08 05:57:10.035203] I [glusterfsd-mgmt.c:1942:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers


Moving to verified state.
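
To make the comparison concrete, the gap between successive "failed to connect" messages can be computed from the timestamps quoted in this report. The helper name below is chosen here for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # timestamp layout used in the quoted log lines


def gap_seconds(first, second):
    """Seconds elapsed between two log timestamps."""
    delta = datetime.strptime(second, FMT) - datetime.strptime(first, FMT)
    return delta.total_seconds()


if __name__ == "__main__":
    # Before the fix: two consecutive error lines from the description.
    before = gap_seconds("2016-09-30 08:45:54.917489", "2016-09-30 08:45:57.924521")
    # After the fix: the two mgmt_rpc_notify error lines from this comment.
    after = gap_seconds("2016-12-08 05:54:55.846894", "2016-12-08 05:57:10.035103")
    print(f"before: {before:.1f} s, after: {after:.1f} s")
    # prints "before: 3.0 s, after: 134.2 s"
```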

Comment 14 errata-xmlrpc 2017-03-23 06:06:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

