Bug 1576661

Summary: [Ganesha] "NLM :MAJ :GRANTED_MSG RPC call failed with return code 19. Removing the blocking lock" messages were observed while performing multi client locking test (v3)
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Manisha Saini <msaini>
Component: nfs-ganesha Assignee: Kaleb KEITHLEY <kkeithle>
Status: CLOSED WONTFIX QA Contact: Manisha Saini <msaini>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rhgs-3.4 CC: dang, ffilz, grajoria, jthottan, pasik, rhs-bugs, sankarshan, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-19 10:55:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Manisha Saini 2018-05-10 05:48:32 UTC
Description of problem:

While performing multi-client locking tests over the v3 protocol, "NLM :MAJ :GRANTED_MSG RPC call failed with return code 19. Removing the blocking lock" messages are observed in ganesha.log.

Test-
1. Take a lock on the file from client1 in order to write to the file.
2. Try to take the lock from a 2nd client at the same time (the lock is not granted).
3. Release the lock from client1 (the lock is now granted to client2).
4. Perform the above steps 2-3 times.

At step 3, when the lock is released from client1, the following messages were observed in ganesha.log.
I am not seeing any functional impact on the locking test due to these messages.


ganesha.log

----------
10/05/2018 11:09:59 : epoch 1af30000 : moonshine.lab.eng.blr.redhat.com : ganesha.nfsd-12606[State_Async] nlm_send_async :NLM :EVENT :failed to resolve rhs-client8.lab.eng.blr.redhat.com to an address: Name or service not known
10/05/2018 11:09:59 : epoch 1af30000 : moonshine.lab.eng.blr.redhat.com : ganesha.nfsd-12606[State_Async] nlm_send_async :NLM :EVENT :failed to resolve rhs-client8.lab.eng.blr.redhat.com to an address: Name or service not known
10/05/2018 11:09:59 : epoch 1af30000 : moonshine.lab.eng.blr.redhat.com : ganesha.nfsd-12606[State_Async] nlm_send_async :NLM :MAJ :NLM async Client exceeded retry count 2
10/05/2018 11:09:59 : epoch 1af30000 : moonshine.lab.eng.blr.redhat.com : ganesha.nfsd-12606[State_Async] nlm4_send_grant_msg :NLM :MAJ :GRANTED_MSG RPC call failed with return code 19. Removing the blocking lock

-----------


Version-Release number of selected component (if applicable):

# rpm -qa | grep ganesha
nfs-ganesha-gluster-2.5.5-6.el7rhgs.x86_64
nfs-ganesha-2.5.5-6.el7rhgs.x86_64
glusterfs-ganesha-3.12.2-9.el7rhgs.x86_64



How reproducible:
2/2

Steps to Reproduce:
1. Create a 6-node ganesha cluster
2. Create a distributed-disperse volume
3. Export the volume via ganesha
4. Mount the volume on 2 clients with the v3 protocol
5. Perform the file locking test from the 2 clients

Actual results:

While performing the locking test, "GRANTED_MSG RPC call" failure messages were observed:

"NLM :MAJ :GRANTED_MSG RPC call failed with return code 19. Removing the blocking lock"


Expected results:

No such RPC failure messages should be observed in ganesha.log


Additional info:

Snippet of locking test--

---------------------
Client 1--

[root@rhs-client6 home]# ./a.out /mnt/ganesha/1G 
opening /mnt/ganesha/1G
opened; hit Enter to lock... 
locking
locked; hit Enter to write... 
Write succeeeded 
locked; hit Enter to unlock... 
unlocking


Client 2--

[root@rhs-client8 home]# ./a.out /mnt/ganesha/1G 
opening /mnt/ganesha/1G
opened; hit Enter to lock... 
locking
locked; hit Enter to write... 
Write succeeeded 
locked; hit Enter to unlock...

-----------------------
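
The source of a.out is not attached to this bug, so the following is only a rough, non-interactive sketch of the lock / write / unlock sequence shown in the snippet above, assuming the test takes a whole-file exclusive POSIX lock. The function name lock_write_unlock and the payload are hypothetical. On an NFSv3 mount such POSIX locks are carried over NLM, which is why the second client's blocked request is later answered with a GRANTED callback from the server.

```python
import fcntl
import os


def lock_write_unlock(path, payload=b"test\n"):
    """Take an exclusive POSIX lock on path, write payload, release the lock.

    Hypothetical stand-in for the a.out test; the real program is not shown
    in this report.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_EX without LOCK_NB blocks until the lock is granted, which is
        # the situation client2 is in while client1 still holds the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            written = os.write(fd, payload)
        finally:
            # Releasing the lock is what lets the other, blocked client
            # proceed.
            fcntl.lockf(fd, fcntl.LOCK_UN)
        return written
    finally:
        os.close(fd)
```

Running lock_write_unlock("/mnt/ganesha/1G") from both clients, with client1 pausing before the unlock, would approximate steps 1-3 of the test.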

Comment 5 Daniel Gryniewicz 2018-05-10 12:43:53 UTC
This appears to be a DNS issue, not a Ganesha issue:

failed to resolve rhs-client8.lab.eng.blr.redhat.com to an address: Name or service not known

The call that returned this failure is just getaddrinfo(), so it's a system library call that's failing.
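
To check the same lookup outside of Ganesha, something along these lines can be run on the server node. It is a minimal sketch that goes through the system getaddrinfo() via Python's socket module; the helper name resolve_client is hypothetical and this is not the code path Ganesha itself uses.

```python
import socket


def resolve_client(hostname):
    """Return the sorted unique addresses for hostname, or [] if lookup fails."""
    try:
        # Same system resolver call that the failing log line points at.
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror as err:
        print(f"failed to resolve {hostname} to an address: {err}")
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); the address
    # string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})
```

If resolve_client("rhs-client8.lab.eng.blr.redhat.com") returns an empty list on the server node, the failure is in name resolution (DNS or /etc/hosts) rather than in Ganesha.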

Comment 9 Frank Filz 2018-11-19 17:54:28 UTC
It's not entirely free of functionality issues. However, a patch has been submitted upstream to convert the message to a LogEvent (maybe the other LogMajor should also be reduced).

https://review.gerrithub.io/#/c/ffilz/nfs-ganesha/+/433842/