Bug 1383979 - Getting error messages in glusterd.log when peer detach is done
Summary: Getting error messages in glusterd.log when peer detach is done
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Gaurav Yadav
QA Contact: Bala Konda Reddy M
URL:
Whiteboard:
Depends On: 1421607
Blocks: 1417147
 
Reported: 2016-10-12 09:49 UTC by Byreddy
Modified: 2017-09-21 04:54 UTC (History)
11 users

Fixed In Version: glusterfs-3.8.4-19
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1421607
Environment:
Last Closed: 2017-09-21 04:28:23 UTC
Embargoed:


Attachments (Terms of Use)
peer detach related test case (38.74 KB, image/png)
2016-10-13 04:33 UTC, Byreddy


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:2774 0 normal SHIPPED_LIVE glusterfs bug fix and enhancement update 2017-09-21 08:16:29 UTC

Description Byreddy 2016-10-12 09:49:04 UTC
Description of problem:
=======================
The following error messages appear in glusterd.log when a node is detached (deprobed) from the cluster.

[2016-10-12 07:31:07.381464] E [MSGID: 106029] [glusterd-utils.c:7767:glusterd_check_files_identical] 0-management: stat on file: /var/lib/glusterd//-server.vol failed (No such file or directory) [No such file or directory]
[2016-10-12 07:31:07.381736] E [MSGID: 106570] [glusterd-utils.c:7196:glusterd_friend_remove_cleanup_vols] 0-management: Failed to reconfigure all daemon services.



Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.8.4-2


How reproducible:
=================
Always


Steps to Reproduce:
===================
1. Create a two-node cluster (n1 and n2) using the 3.8.4-2 build
2. Detach node n2 from n1:  n1# gluster peer detach n2
3. Check n1's glusterd.log for error messages
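Step 3 can be sketched as a log scan for the two MSGIDs reported above. This is a self-contained illustration: it writes the sample lines from this report to a temporary file and scans that, the same way you would scan the real log (assumed default location: /var/log/glusterfs/glusterd.log).

```shell
# Self-contained sketch: write the two sample error lines to a temp "log",
# then scan it as you would the real glusterd.log.
log=$(mktemp)
cat > "$log" <<'EOF'
[2016-10-12 07:31:07.381464] E [MSGID: 106029] [glusterd-utils.c:7767:glusterd_check_files_identical] 0-management: stat on file: /var/lib/glusterd//-server.vol failed (No such file or directory) [No such file or directory]
[2016-10-12 07:31:07.381736] E [MSGID: 106570] [glusterd-utils.c:7196:glusterd_friend_remove_cleanup_vols] 0-management: Failed to reconfigure all daemon services.
EOF
# Count the peer-detach error entries; 2 means the bug reproduced.
grep -cE 'MSGID: 10(6029|6570)' "$log"
rm -f "$log"
```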

Actual results:
===============
Getting error messages in glusterd.log when peer detach is done


Expected results:
=================
No error messages should be logged.


Additional info:

Comment 3 Niels de Vos 2016-10-12 10:27:06 UTC
I assume there is a Glusto or other regression test case for this. Please point to its location or attach it to this BZ. Thanks!

Comment 4 Niels de Vos 2016-10-12 10:31:16 UTC
Also, could you mention how urgent this is? Severity is set to "high", but priority is set to "undefined". If this problem causes the log to grow rapidly, we should fix it soon, otherwise we'll move it out to a later update.

Comment 5 Byreddy 2016-10-12 10:49:38 UTC
(In reply to Niels de Vos from comment #3)
> I assume there is a Glusto or other regression test case for this. Please
> point to its location or attach it to this BZ. Thanks!

The test-case bug that caught this issue is https://bugzilla.redhat.com/show_bug.cgi?id=1246946

Comment 7 Byreddy 2016-10-12 10:53:04 UTC
Sorry, I cleared the needinfo flags of others; I will set them back.

Comment 8 Byreddy 2016-10-12 11:05:30 UTC
(In reply to Niels de Vos from comment #4)
> Also, could you mention how urgent this is? Severity is set to "high", but
> priority is set to "undefined". If this problem causes the log to grow
> rapidly, we should fix it soon, otherwise we'll move it out to a later
> update.

This is not urgent, it is not a blocker, and there is no functionality loss. But we have a regression test case (mentioned above) which will be marked as failed during the regression cycle, and the Regression keyword will be added to this bug.

Also, these error messages are not continuous; every peer detach operation throws just those two error messages.

Comment 9 Kaleb KEITHLEY 2016-10-12 11:33:23 UTC
I'm not really sure how changing the default for starting gNFS or not would have anything to do with peer probing or related log messages.

Can you elaborate?

Comment 10 Atin Mukherjee 2016-10-12 11:37:08 UTC
(In reply to Byreddy from comment #8)
> (In reply to Niels de Vos from comment #4)
> > Also, could you mention how urgent this is? Severity is set to "high", but
> > priority is set to "undefined". If this problem causes the log to grow
> > rapidly, we should fix it soon, otherwise we'll move it out to a later
> > update.
> 
> This is not urgent and it's not blocker and no functionality loss But we
> have regression test case ( mentioned above) which will be marked as failed
> during regression cycle and Regression keyword will be added to this bug.

I disagree! Why would you want to mark a test failed given the test has actually passed? A couple of error entries in the log are not grounds to fail a test case or apply the Regression keyword, IMO.

Rahul - please chime in with your thoughts.
> 
> and these error messages are not continuous, for every peer detach
> operation, it will throw those two error messages.

Comment 11 Atin Mukherjee 2016-10-12 11:41:02 UTC
(In reply to Kaleb KEITHLEY from comment #9)
> I'm not really sure how changing the default for starting gNFS or not would
> have anything to do with peer probing or related log messages.
> 
> Can you elaborate?

glusterd_friend_remove () ==> glusterd_friend_remove_cleanup_vols () ==> glusterd_svcs_reconfigure () ==> glusterd_nfssvc_reconfigure (), where the last function is called unconditionally (it should be called only if gNFS is active).

And this is for peer detach code path.

Comment 12 Byreddy 2016-10-12 11:50:10 UTC
(In reply to Atin Mukherjee from comment #10)
> (In reply to Byreddy from comment #8)
> > (In reply to Niels de Vos from comment #4)
> > > Also, could you mention how urgent this is? Severity is set to "high", but
> > > priority is set to "undefined". If this problem causes the log to grow
> > > rapidly, we should fix it soon, otherwise we'll move it out to a later
> > > update.
> > 
> > This is not urgent and it's not blocker and no functionality loss But we
> > have regression test case ( mentioned above) which will be marked as failed
> > during regression cycle and Regression keyword will be added to this bug.
> 
> I disagree! Why would you want to mark a test failed given the test has
> actually passed? On the basis of having couple of error entries in the log a
> test case can not be failed and regression keyword can not be used IMO.
> 

As per the test case, peer detach should not produce any error messages, but currently it throws errors. This issue was not present in the last GA release, so from my side it is a regression.


> Rahul - please chime in with your thoughts.
> > 
> > and these error messages are not continuous, for every peer detach
> > operation, it will throw those two error messages.

Comment 15 Byreddy 2016-10-13 04:33:55 UTC
Created attachment 1209905 [details]
peer detach related test case

Comment 20 Atin Mukherjee 2017-02-09 05:22:10 UTC
Gaurav - Can you start looking into this issue? I'd like to get this fixed in next release.

Comment 21 Atin Mukherjee 2017-02-13 10:46:47 UTC
upstream patch : https://review.gluster.org/#/c/16607

Comment 23 Atin Mukherjee 2017-03-24 09:44:45 UTC
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/101314/

Comment 25 Bala Konda Reddy M 2017-05-10 10:33:00 UTC
Build: 3.8.4-24

Based on the patch, tested the below scenarios:
1. Detach a peer with no volume created in the cluster.
2. Detach a peer after deleting all the volumes which were created but never started.

Detaching a peer does not produce any error messages in the glusterd log.

Hence marking the bz as verified.

Comment 27 errata-xmlrpc 2017-09-21 04:28:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774


