Bug 813787 - NFS: locking tests fails
Summary: NFS: locking tests fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: pre-release
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 817967
 
Reported: 2012-04-18 12:41 UTC by Sachidananda Urs
Modified: 2015-12-01 16:45 UTC (History)

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:19:28 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Sachidananda Urs 2012-04-18 12:41:06 UTC
While running the cthon lock tests, the simple locking test fails.

Corresponding logs:

[2012-04-18 08:31:03.341302] W [xdr-rpcclnt.c:88:rpc_request_to_xdr] 0-rpc: failed to encode call msg
[2012-04-18 08:31:03.341351] E [rpc-clnt.c:1268:rpc_clnt_record_build_record] 0-nfs-test-big-client-1: Failed to build record header
[2012-04-18 08:31:03.341371] W [rpc-clnt.c:1328:rpc_clnt_record] 0-nfs-test-big-client-1: cannot build rpc-record
[2012-04-18 08:31:03.341386] W [rpc-clnt.c:1467:rpc_clnt_submit] 0-nfs-test-big-client-1: cannot build rpc-record
[2012-04-18 08:31:03.341404] W [client3_1-fops.c:2173:client3_1_lk_cbk] 0-nfs-test-big-client-1: remote operation failed: Transport endpoint is not connected

Comment 1 Amar Tumballi 2012-04-18 14:07:11 UTC
little more log snippet will help.

Comment 2 Saurabh 2012-04-18 15:49:52 UTC
for me all cthon lock tests are a pass on qa-35 setup for a distribute-replicate(2X2) volume, btw I am using cthon from the suggested cthon repo,
git://linux-nfs.org/~steved/cthon04.git

Comment 3 Sachidananda Urs 2012-04-18 16:01:50 UTC
(In reply to comment #1)
> little more log snippet will help.

No more details are logged. The log above just repeats, with no messages other than those already shown.

Comment 4 Sachidananda Urs 2012-04-18 16:03:30 UTC
(In reply to comment #2)
> for me all cthon lock tests are a pass on qa-35 setup for a
> distribute-replicate(2X2) volume, btw I am using cthon from the suggested cthon
> repo,
> git://linux-nfs.org/~steved/cthon04.git

The one I tried is a 4-node distribute volume. However, that should not matter. Locking also fails for lock tests other than cthon; a small program that locks a region of a file fails as well.
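For reference, a minimal sketch of such a region-lock check. This is not the reporter's program, which was not attached; the function name `try_lock_region` and the file path passed to it are hypothetical. It takes and releases a POSIX write lock with `fcntl(F_SETLK)`, which on an NFS mount is the kind of request that reaches the server's lock path.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX declarations under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: take and release a write lock on the byte range
 * [start, start + len) of `path`.
 * Returns 0 on success, or the errno value of the first failing call. */
int try_lock_region(const char *path, off_t start, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return errno;

    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;      /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;   /* l_start is relative to the start of the file */
    fl.l_start = start;
    fl.l_len = len;

    int err = 0;
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        err = errno;          /* non-blocking request was refused or failed */
    } else {
        fl.l_type = F_UNLCK;  /* release the same range again */
        if (fcntl(fd, F_SETLK, &fl) < 0)
            err = errno;
    }

    close(fd);
    return err;
}
```

Calling `try_lock_region("/mnt/nfs-test/lockfile", 0, 16)` (path assumed) should return 0 on a healthy mount; a non-zero return is the errno to compare against the nfs.log messages.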

Comment 5 Sachidananda Urs 2012-04-18 16:08:59 UTC
The network is quite flaky; the link goes up and down quite often. I am not sure whether this contributes to the issue, since the other FOPs work quite transparently.

Comment 6 Krishna Srinivas 2012-04-18 20:30:39 UTC
If I change the client hostname to 33 characters, things work fine. If I change it to 34 characters, they do not. I think we see this behaviour when the length of the lock owner goes beyond some value (40, if we include the pid and hostname). The problem seems to be in encoding the lock arguments (the lock owner in particular).

Comment 7 Saurabh 2012-04-19 05:58:14 UTC
I changed the hostname of the client and reran the cthon tests. I now see one failure, and only after changing the hostname to a long name:

[root@longname-nlmtest-verification-purpose cthon04]# hostname
longname-nlmtest-verification-purpose

[root@longname-nlmtest-verification-purpose cthon04]# hostname | wc -c
38

cthon fails in Test #10:

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [               0,               3] PASSED.
	Parent: 10.1  - F_ULOCK [               1,               1] PASSED.
	Child:  10.2  - F_TEST  [               0,               1] PASSED.
	Child:  10.3  - F_TEST  [               2,               1] PASSED.
	Child:  10.4  - F_TEST  [               3,          ENDING] FAILED!
	Child:  **** Expected success, returned EACCES...
	Child:  **** Probably implementation error.

**  CHILD pass 1 results: 48/48 pass, 0/0 warn, 1/1 fail (pass/total).
	Parent: Child died

** PARENT pass 1 results: 29/29 pass, 0/0 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt/nfs-test mounted
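
The Test #10 sequence above can be approximated outside cthon. The sketch below is not cthon code. It assumes the bracketed pairs in the output are [offset, length] and that the first two F_TEST probes are expected to find the range locked while the third is expected to find it free. The names `split_lock_probe` and `is_locked` are hypothetical.

```c
#define _XOPEN_SOURCE 700   /* expose lockf() under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if [off, off + len) is locked by another process, 0 if it is
 * free. A length of 0 means "from off to the end of the file and beyond". */
static int is_locked(int fd, off_t off, off_t len)
{
    if (lseek(fd, off, SEEK_SET) < 0)
        return 1;
    return lockf(fd, F_TEST, len) != 0;
}

/* Parent locks bytes 0..2 and then unlocks byte 1, splitting the lock.
 * A child process then probes three ranges and reports a bitmask:
 *   bit 0: [0, 1] locked, bit 1: [2, 1] locked, bit 2: [3, end] locked.
 * Returns the bitmask, or -1 if the setup itself failed. */
int split_lock_probe(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    if (lseek(fd, 0, SEEK_SET) < 0 || lockf(fd, F_TLOCK, 3) < 0 ||
        lseek(fd, 1, SEEK_SET) < 0 || lockf(fd, F_ULOCK, 1) < 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: record locks are not inherited across fork(), so these
         * probes see the parent's locks as held by another process. */
        int mask = 0;
        mask |= is_locked(fd, 0, 1) << 0;
        mask |= is_locked(fd, 2, 1) << 1;
        mask |= is_locked(fd, 3, 0) << 2;
        _exit(mask);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        close(fd);
        return -1;
    }
    close(fd);   /* closing the descriptor also drops the parent's locks */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under these assumptions a correct implementation returns 3 (bytes 0 and 2 locked, byte 3 onward free). The failure reported in step 10.4 would correspond to bit 2 also being set.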

###############################################################
Logs from /var/log/glusterfs/nfs.log:

[2012-04-19 05:53:47.187445] W [xdr-rpcclnt.c:88:rpc_request_to_xdr] 0-rpc: failed to encode call msg
[2012-04-19 05:53:47.187568] E [rpc-clnt.c:1268:rpc_clnt_record_build_record] 0-dist-rep-client-2: Failed to build record header
[2012-04-19 05:53:47.187614] W [rpc-clnt.c:1328:rpc_clnt_record] 0-dist-rep-client-2: cannot build rpc-record
[2012-04-19 05:53:47.187652] W [rpc-clnt.c:1467:rpc_clnt_submit] 0-dist-rep-client-2: cannot build rpc-record
[2012-04-19 05:53:47.187693] W [client3_1-fops.c:2173:client3_1_lk_cbk] 0-dist-rep-client-2: remote operation failed: Transport endpoint is not connected
[2012-04-19 05:53:47.187742] W [xdr-rpcclnt.c:88:rpc_request_to_xdr] 0-rpc: failed to encode call msg
[2012-04-19 05:53:47.187783] E [rpc-clnt.c:1268:rpc_clnt_record_build_record] 0-dist-rep-client-3: Failed to build record header
[2012-04-19 05:53:47.187820] W [rpc-clnt.c:1328:rpc_clnt_record] 0-dist-rep-client-3: cannot build rpc-record
[2012-04-19 05:53:47.187855] W [rpc-clnt.c:1467:rpc_clnt_submit] 0-dist-rep-client-3: cannot build rpc-record
[2012-04-19 05:53:47.187892] W [client3_1-fops.c:2173:client3_1_lk_cbk] 0-dist-rep-client-3: remote operation failed: Transport endpoint is not connected



##############################################################

[root@RHS-71 ~]# glusterfs -V
glusterfs 3.3.0qa35 built on Apr 17 2012 11:22:39


[root@RHS-71 ~]# gluster volume info
 
Volume Name: dist-rep
Type: Distributed-Replicate
Volume ID: 0a559a33-6cbe-4853-8f75-f7db6c880cc4
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 172.17.251.71:/export/dr
Brick2: 172.17.251.72:/export/drr
Brick3: 172.17.251.73:/export/ddr
Brick4: 172.17.251.74:/export/ddrr

Comment 8 Anand Avati 2012-04-19 07:28:46 UTC
CHANGE: http://review.gluster.com/3191 (rpc-clnt: use the correct xdr_size for getting the iobuf) merged in master by Anand Avati (avati)
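
The change title points at sizing the buffer from the XDR size of the request. A minimal sketch of that general idea follows; it is not the actual patch. It assumes the lock owner travels as an XDR variable-length opaque (a 4-byte length followed by the data, padded to a multiple of 4, per RFC 4506). All function names are hypothetical and none of them are GlusterFS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encoded size of an XDR variable-length opaque of `len` bytes:
 * 4-byte big-endian length, then the data zero-padded to a multiple of 4. */
size_t opaque_xdr_size(size_t len)
{
    return 4 + ((len + 3) & ~(size_t)3);
}

/* Hypothetical encoder: writes the opaque into buf[0..cap).
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t encode_opaque(uint8_t *buf, size_t cap, const void *data, size_t len)
{
    size_t need = opaque_xdr_size(len);
    if (need > cap)
        return 0;                      /* the "failed to encode" case */

    buf[0] = (uint8_t)(len >> 24);
    buf[1] = (uint8_t)(len >> 16);
    buf[2] = (uint8_t)(len >> 8);
    buf[3] = (uint8_t)len;
    memset(buf + 4, 0, need - 4);      /* zero the padding bytes */
    memcpy(buf + 4, data, len);
    return need;
}

/* Size the buffer from the value being encoded instead of a fixed guess.
 * On success *out points to a malloc'ed buffer that the caller must free. */
size_t encode_owner_sized(const void *owner, size_t len, uint8_t **out)
{
    size_t need = opaque_xdr_size(len);
    *out = malloc(need);
    if (*out == NULL)
        return 0;
    return encode_opaque(*out, need, owner, len);
}
```

With a fixed buffer of an arbitrary 32 bytes, `encode_opaque` rejects the 37-character hostname from comment 7 because it needs 44 bytes, while `encode_owner_sized` succeeds for any owner length. The threshold here is illustrative only and is not meant to match the 33/34-character boundary observed in comment 6.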

Comment 9 Amar Tumballi 2012-04-19 07:38:23 UTC
Thanks to Krishna for pointing me to the right place (by comment #6)

Comment 10 Sachidananda Urs 2012-04-20 05:58:57 UTC
Works on 3.3.0qa36

