Bug 1224250 - nfs-ganesha: cthon does not finish when failover is triggered by killing nfs-ganesha process
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: nfs-ganesha
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.3
Assignee: Soumya Koduri
QA Contact: Shashank Raj
URL:
Whiteboard:
Depends On: 1242358
Blocks: 1202842 1216951 1299184
Reported: 2015-05-22 11:30 UTC by Saurabh
Modified: 2016-11-08 03:52 UTC

Fixed In Version: nfs-ganesha-2.3.1-5
Doc Type: Bug Fix
Doc Text:
Previously, when an nfs-ganesha cluster was configured, the nfs-ganesha processes on all nodes could start at the same time, so most of them ended up with the same epoch value. As a consequence, after failover the NFS server sent NFS4ERR_FHEXPIRED instead of NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID, and NFSv4 clients could not recover their locks. With this fix, a new option "EPOCH_EXEC" is added to '/etc/sysconfig/ganesha'; it takes the path of the script (default: '/bin/true') that generates the epoch value. For Gluster, a new script, '/usr/libexec/ganesha/generate_epoch.py', is added and is used to generate the epoch value. A new helper service, 'nfs-ganesha-config', processes the init options provided in '/etc/sysconfig/ganesha' and copies the results to '/run/sysconfig/ganesha', which nfs-ganesha reads at startup. Now, NFS-Ganesha has a unique epoch value on each node of the cluster, resulting in smooth failover and lock recovery.
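Editorial note: the idea of a per-node epoch can be illustrated with a minimal Python sketch. This is NOT the shipped generate_epoch.py; the function name make_epoch, the node_index parameter, and the 24-bit/8-bit split of a 32-bit value are all assumptions made only for illustration.

```python
import time


def make_epoch(now_seconds, node_index):
    """Combine a timestamp with a per-node index into a 32-bit value.

    Hypothetical layout (an assumption, not the real script's format):
    upper 24 bits = low 24 bits of the time in seconds,
    lower 8 bits  = index of the node within the cluster.
    """
    time_part = (int(now_seconds) & 0xFFFFFF) << 8
    node_part = node_index & 0xFF
    return time_part | node_part


if __name__ == "__main__":
    # Two nodes starting in the same second still get different epochs.
    now = time.time()
    print(make_epoch(now, node_index=1), make_epoch(now, node_index=2))
```

With this layout, nodes that start in the same second differ in the low byte, which is the property the fix relies on. The time part wraps after 2^24 seconds, so a real implementation would need to handle that.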
Clone Of:
Environment:
Last Closed: 2016-06-23 05:31:54 UTC
Embargoed:


Attachments
sosreport of node4 (14.07 MB, application/x-xz), 2015-05-22 11:37 UTC, Saurabh
packet trace from client (706.27 KB, application/octet-stream), 2015-06-23 10:32 UTC, Saurabh
nfs13 messages (525.05 KB, text/plain), 2015-06-23 10:33 UTC, Saurabh
nfs13 ganesha.log (10.77 KB, text/plain), 2015-06-23 10:33 UTC, Saurabh
second packet trace from client (11.98 MB, application/octet-stream), 2015-06-23 12:17 UTC, Saurabh
nfs14 ganesha.log (18.21 KB, text/plain), 2015-06-23 12:19 UTC, Saurabh
nfs14 messages (521.54 KB, text/plain), 2015-06-23 12:19 UTC, Saurabh


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1242358 0 high CLOSED Different epoch values for each of NFS-Ganesha heads 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHEA-2016:1247 0 normal SHIPPED_LIVE nfs-ganesha update for Red Hat Gluster Storage 3.1 update 3 2016-06-23 09:12:43 UTC

Internal Links: 1242358

Description Saurabh 2015-05-22 11:30:55 UTC
Description of problem:
Executed the cthon lock test; it did not finish once failover was triggered by killing the nfs-ganesha process.

Version-Release number of selected component (if applicable):
glusterfs-3.7.0-2.el6rhs.x86_64
nfs-ganesha-2.2.0-0.el6.x86_64

How reproducible:
happened in the first attempt itself

Steps to Reproduce:
1. create a volume of type 6x2, start it
2. setup nfs-ganesha and bring it up
3. mount the volume with vers=4, using the virtual IP
4. execute the cthon lock test
5. bring nfs-ganesha down using the command "kill -s TERM pid"
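
Editorial note: step 5 can also be scripted. The following is a minimal Python sketch equivalent to "kill -s TERM pid"; the helper name send_term is hypothetical, and finding the nfs-ganesha PID is left to the caller.

```python
import os
import signal


def send_term(pid):
    """Send SIGTERM to the given process ID, like `kill -s TERM <pid>`."""
    os.kill(pid, signal.SIGTERM)
```

SIGTERM asks the process to shut down and can be handled by it, unlike SIGKILL, so this matches the signal used in the reproduction steps.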

Actual results:
Result after step 5:
Test #6 - Try to lock the MAXEOF byte.
        Parent: 6.0  - F_TLOCK [7fffffffffffffff,               1] PASSED.
        Child:  6.1  - F_TEST  [7ffffffffffffffe,               1] PASSED.
        Child:  6.2  - F_TEST  [7ffffffffffffffe,               2] PASSED.
        Child:  6.3  - F_TEST  [7ffffffffffffffe,          ENDING] PASSED.
        Child:  6.4  - F_TEST  [7fffffffffffffff,               1] PASSED.
        Child:  6.5  - F_TEST  [7fffffffffffffff,               2] PASSED.
        Child:  6.6  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
        Child:  6.7  - F_TEST  [8000000000000000,          ENDING] PASSED.
        Child:  6.8  - F_TEST  [8000000000000000,               1] PASSED.
        Child:  6.9  - F_TEST  [8000000000000000,7fffffffffffffff] PASSED.
        Child:  6.10 - F_TEST  [8000000000000000,8000000000000000] PASSED.
        Parent: 6.11 - F_ULOCK [7fffffffffffffff,               1] PASSED.

Test #7 - Test parent/child mutual exclusion.
        Parent: 7.0  - F_TLOCK [             ffc,               9] PASSED.
        Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
        Parent: Now free child to run, should block on lock.
        Parent: Check data in file to insure child blocked.
tlock: testfile read: Input/output error
        Child:  7.3  - F_LOCK  [             ffc,               9] PASSED.
        Child:  Write child's version of the data and release lock.

** PARENT pass 1 results: 22/22 pass, 0/0 warn, 0/0 fail (pass/total).

**  CHILD pass 1 results: 45/45 pass, 0/0 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt mounted
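
Editorial note: for readers unfamiliar with Test #7, the following Python sketch shows the mutual-exclusion property being checked: while one process holds a byte-range lock, a second process must not be able to take it. It is not cthon code. The offset 0xffc and length 9 are taken from the output above; the function name and the use of a non-blocking attempt in the child (cthon uses a blocking F_LOCK) are assumptions made to keep the sketch short.

```python
import fcntl
import os

OFFSET = 0xFFC  # start of the region locked in Test #7
LENGTH = 9      # length of that region in bytes


def child_is_blocked(path):
    """Return True if a child process cannot take the lock the parent holds."""
    with open(path, "r+b") as parent_file:
        # Parent takes an exclusive, non-blocking lock on the byte range.
        fcntl.lockf(parent_file, fcntl.LOCK_EX | fcntl.LOCK_NB, LENGTH, OFFSET)
        pid = os.fork()
        if pid == 0:
            # Child: open the file independently and try the same range.
            status = 1
            try:
                with open(path, "r+b") as child_file:
                    fcntl.lockf(child_file, fcntl.LOCK_EX | fcntl.LOCK_NB,
                                LENGTH, OFFSET)
            except OSError:
                status = 0  # lock refused, as mutual exclusion requires
            os._exit(status)
        _, wait_status = os.waitpid(pid, 0)
        return os.WEXITSTATUS(wait_status) == 0
```

On an NFSv4 mount the same calls are translated into lock requests to the server, which is why lost lock state after failover surfaces in this test.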


Expected results:

The cthon test should pass after the grace period required for failover.

Additional info:
[root@nfs4 ~]# pcs status
Cluster name: ganesha-ha-360
Last updated: Fri May 22 16:44:10 2015
Last change: Fri May 22 16:37:16 2015
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
17 Resources configured


Online: [ nfs1 nfs2 nfs3 nfs4 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 nfs1-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs1 
 nfs1-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs1 
 nfs2-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs2 
 nfs2-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs2 
 nfs3-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs3 
 nfs3-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs3 
 nfs4-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs3 
 nfs4-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs3 
 nfs4-dead_ip-1	(ocf::heartbeat:Dummy):	Started nfs4

Comment 2 Saurabh 2015-05-22 11:37:11 UTC
Created attachment 1028695 [details]
sosreport of node4

Comment 3 Saurabh 2015-06-17 10:07:19 UTC
This case still fails for me 

[root@nfs11 ~]# pcs status
Cluster name: reaper
Last updated: Wed Jun 17 20:45:28 2015
Last change: Wed Jun 17 20:45:10 2015
Stack: cman
Current DC: nfs11 - partition with quorum
Version: 1.1.11-97629de
8 Nodes configured
33 Resources configured


Online: [ nfs11 nfs12 nfs13 nfs14 nfs15 nfs16 nfs17 nfs18 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs11 nfs12 nfs13 nfs14 nfs15 nfs16 nfs17 nfs18 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs11 nfs12 nfs13 nfs14 nfs15 nfs16 nfs17 nfs18 ]
 nfs11-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs18 
 nfs11-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs18 
 nfs12-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs12 
 nfs12-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs12 
 nfs13-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs13 
 nfs13-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs13 
 nfs14-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs14 
 nfs14-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs14 
 nfs15-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs15 
 nfs15-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs15 
 nfs16-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs16 
 nfs16-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs16 
 nfs17-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs17 
 nfs17-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs17 
 nfs18-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs18 
 nfs18-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs18 
 nfs11-dead_ip-1	(ocf::heartbeat:Dummy):	Started nfs11 




cthon test,

Test #7 - Test parent/child mutual exclusion.
	Parent: 7.0  - F_TLOCK [             ffc,               9] PASSED.
	Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
	Parent: Now free child to run, should block on lock.
	Parent: Check data in file to insure child blocked.
tlock: testfile read: Input/output error
	Child:  7.3  - F_LOCK  [             ffc,               9] PASSED.
	Child:  Write child's version of the data and release lock.

** PARENT pass 1 results: 22/22 pass, 0/0 warn, 0/0 fail (pass/total).

**  CHILD pass 1 results: 45/45 pass, 0/0 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt mounted

real	0m26.258s
user	0m0.013s
sys	0m0.049s

Comment 4 Saurabh 2015-06-17 11:05:29 UTC
This is an important BZ, as the cthon lock test gets EIO during failover.

Comment 5 Soumya Koduri 2015-06-18 11:42:33 UTC
We are unable to reproduce the issue. Cthon tests passed even after failover - 

[root@dhcp42-82 cthon04]# ./server  -l -o vers=4 -p /vol1 -m /mnt -N 1 10.70.40.179
Start tests on path /mnt/dhcp42-82.test [y/n]? y

sh ./runtests  -l  /mnt/dhcp42-82.test

Starting LOCKING tests: test directory /mnt/dhcp42-82.test (arg: /mnt/dhcp42-82.test)

Testing native post-LFS locking

Creating parent/child synchronization pipes.

Test #1 - Test regions of an unlocked file.
	Parent: 1.1  - F_TEST  [               0,               1] PASSED.
	Parent: 1.2  - F_TEST  [               0,          ENDING] PASSED.
	Parent: 1.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Parent: 1.4  - F_TEST  [               1,               1] PASSED.
	Parent: 1.5  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 1.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Parent: 1.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Parent: 1.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Parent: 1.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.

Test #2 - Try to lock the whole file.
	Parent: 2.0  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  2.1  - F_TEST  [               0,               1] PASSED.
	Child:  2.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  2.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Child:  2.4  - F_TEST  [               1,               1] PASSED.
	Child:  2.5  - F_TEST  [               1,          ENDING] PASSED.
	Child:  2.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Child:  2.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  2.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  2.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.
	Parent: 2.10 - F_ULOCK [               0,          ENDING] PASSED.

Test #3 - Try to lock just the 1st byte.
	Parent: 3.0  - F_TLOCK [               0,               1] PASSED.
	Child:  3.1  - F_TEST  [               0,               1] PASSED.
	Child:  3.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  3.3  - F_TEST  [               1,               1] PASSED.
	Child:  3.4  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 3.5  - F_ULOCK [               0,               1] PASSED.

Test #4 - Try to lock the 2nd byte, test around it.
	Parent: 4.0  - F_TLOCK [               1,               1] PASSED.
	Child:  4.1  - F_TEST  [               0,               1] PASSED.
	Child:  4.2  - F_TEST  [               0,               2] PASSED.
	Child:  4.3  - F_TEST  [               0,          ENDING] PASSED.
	Child:  4.4  - F_TEST  [               1,               1] PASSED.
	Child:  4.5  - F_TEST  [               1,               2] PASSED.
	Child:  4.6  - F_TEST  [               1,          ENDING] PASSED.
	Child:  4.7  - F_TEST  [               2,               1] PASSED.
	Child:  4.8  - F_TEST  [               2,               2] PASSED.
	Child:  4.9  - F_TEST  [               2,          ENDING] PASSED.
	Parent: 4.10 - F_ULOCK [               1,               1] PASSED.

Test #5 - Try to lock 1st and 2nd bytes, test around them.
	Parent: 5.0  - F_TLOCK [               0,               1] PASSED.
	Parent: 5.1  - F_TLOCK [               2,               1] PASSED.
	Child:  5.2  - F_TEST  [               0,               1] PASSED.
	Child:  5.3  - F_TEST  [               0,               2] PASSED.
	Child:  5.4  - F_TEST  [               0,          ENDING] PASSED.
	Child:  5.5  - F_TEST  [               1,               1] PASSED.
	Child:  5.6  - F_TEST  [               1,               2] PASSED.
	Child:  5.7  - F_TEST  [               1,          ENDING] PASSED.
	Child:  5.8  - F_TEST  [               2,               1] PASSED.
	Child:  5.9  - F_TEST  [               2,               2] PASSED.
	Child:  5.10 - F_TEST  [               2,          ENDING] PASSED.
	Child:  5.11 - F_TEST  [               3,               1] PASSED.
	Child:  5.12 - F_TEST  [               3,               2] PASSED.
	Child:  5.13 - F_TEST  [               3,          ENDING] PASSED.
	Parent: 5.14 - F_ULOCK [               0,               1] PASSED.
	Parent: 5.15 - F_ULOCK [               2,               1] PASSED.

Test #6 - Try to lock the MAXEOF byte.
	Parent: 6.0  - F_TLOCK [7fffffffffffffff,               1] PASSED.
	Child:  6.1  - F_TEST  [7ffffffffffffffe,               1] PASSED.
	Child:  6.2  - F_TEST  [7ffffffffffffffe,               2] PASSED.
	Child:  6.3  - F_TEST  [7ffffffffffffffe,          ENDING] PASSED.
	Child:  6.4  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  6.5  - F_TEST  [7fffffffffffffff,               2] PASSED.
	Child:  6.6  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  6.7  - F_TEST  [8000000000000000,          ENDING] PASSED.
	Child:  6.8  - F_TEST  [8000000000000000,               1] PASSED.
	Child:  6.9  - F_TEST  [8000000000000000,7fffffffffffffff] PASSED.
	Child:  6.10 - F_TEST  [8000000000000000,8000000000000000] PASSED.
	Parent: 6.11 - F_ULOCK [7fffffffffffffff,               1] PASSED.

Test #7 - Test parent/child mutual exclusion.
	Parent: 7.0  - F_TLOCK [             ffc,               9] PASSED.
	Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
	Parent: Now free child to run, should block on lock.
	Parent: Check data in file to insure child blocked.
	Parent: Read 'aaaa eh' from testfile [ 4092, 7 ].
	Parent: 7.1  - COMPARE [             ffc,               7] PASSED.

[root@clus2 ganesha]# service nfs-ganesha stop
Stopping ganesha.nfsd:                                     [  OK  ]
[root@clus2 ganesha]# 
[root@clus2 ganesha]# pcs status
Cluster name: ganesha-soumya
Last updated: Thu Jun 18 17:08:20 2015
Last change: Thu Jun 18 16:48:07 2015
Stack: cman
Current DC: clus1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
9 Resources configured


Online: [ clus1 clus2 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ clus1 clus2 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ clus1 clus2 ]
 clus1-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started clus1 
 clus1-trigger_ip-1	(ocf::heartbeat:Dummy):	Started clus1 
 clus2-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started clus1 
 clus2-trigger_ip-1	(ocf::heartbeat:Dummy):	Started clus1 
 clus2-dead_ip-1	(ocf::heartbeat:Dummy):	Started clus2 

[root@clus2 ganesha]# 


	Parent: Now unlock region so child will unblock.
	Parent: 7.2  - F_ULOCK [             ffc,               9] PASSED.
	Child:  7.3  - F_LOCK  [             ffc,               9] PASSED.
	Child:  Write child's version of the data and release lock.
	Parent: Now try to regain lock, parent should block.
	Child:  Wrote 'bebebebeb' to testfile [ 4092, 9 ].
	Child:  7.4  - F_ULOCK [             ffc,               9] PASSED.
	Parent: 7.5  - F_LOCK  [             ffc,               9] PASSED.
	Parent: Check data in file to insure child unblocked.
	Parent: Read 'bebebebeb' from testfile [ 4092, 9 ].
	Parent: 7.6  - COMPARE [             ffc,               9] PASSED.
	Parent: 7.7  - F_ULOCK [             ffc,               9] PASSED.

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [               0,               3] PASSED.
	Parent: 10.1  - F_ULOCK [               1,               1] PASSED.
	Child:  10.2  - F_TEST  [               0,               1] PASSED.
	Child:  10.3  - F_TEST  [               2,               1] PASSED.
	Child:  10.4  - F_TEST  [               3,          ENDING] PASSED.
	Child:  10.5  - F_TEST  [               1,               1] PASSED.
	Parent: 10.6  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.7  - F_ULOCK [               2,               1] PASSED.
	Child:  10.8  - F_TEST  [               0,               3] PASSED.
	Parent: 10.9  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.10 - F_TLOCK [               1,               3] PASSED.
	Parent: 10.11 - F_ULOCK [               2,               1] PASSED.
	Child:  10.12 - F_TEST  [               1,               1] PASSED.
	Child:  10.13 - F_TEST  [               3,               1] PASSED.
	Child:  10.14 - F_TEST  [               4,          ENDING] PASSED.
	Child:  10.15 - F_TEST  [               2,               1] PASSED.
	Child:  10.16 - F_TEST  [               0,               1] PASSED.

Test #11 - Make sure close() releases the process's locks.
	Parent: 11.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Child:  11.1  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: 11.3  - F_TLOCK [              1d,             5b7] PASSED.
	Parent: 11.4  - F_TLOCK [            2000,              57] PASSED.
	Parent: Closed testfile.
	Child:  11.5  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.6  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.7  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ].
	Parent: Closed testfile.
	Child:  11.8  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.9  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.10 - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Truncated testfile.
	Parent: Closed testfile.
	Child:  11.11 - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.12 - F_ULOCK [               0,          ENDING] PASSED.

Test #12 - Signalled process should release locks.
	Child:  12.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Killed child process.
	Parent: 12.1  - F_TLOCK [               0,          ENDING] PASSED.

Test #13 - Check locking and mmap semantics.
	Parent: 13.0  - F_TLOCK [             ffe,          ENDING] PASSED.
	Parent: 13.1  - mmap [               0,            1000] WARNING!
	Parent: **** Expected EAGAIN, returned success...
	Parent: 13.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: unmap testfile.
	Parent: 13.3  - mmap [               0,            1000] PASSED.
	Parent: 13.4  - F_TLOCK [             ffe,          ENDING] PASSED.

Test #14 - Rate test performing I/O on unlocked and locked file.
	Parent: File Unlocked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [7876.92 +/- 26.15 KB/s].
	Parent: 14.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: File Locked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [7641.79 +/- 16.75 KB/s].
	Parent: 14.1  - F_ULOCK [               0,          ENDING] PASSED.

Test #15 - Test 2nd open and I/O after lock and close.
	Parent: Second open succeeded.
	Parent: 15.0  - F_LOCK  [               0,          ENDING] PASSED.
	Parent: 15.1  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ].
	Parent: Read 'abcdefghij' from testfile [ 0, 11 ].
	Parent: 15.2  - COMPARE [               0,               b] PASSED.

** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total).

**  CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).

Testing non-native 64 bit LFS locking

Creating parent/child synchronization pipes.

Test #1 - Test regions of an unlocked file.
	Parent: 1.1  - F_TEST  [               0,               1] PASSED.
	Parent: 1.2  - F_TEST  [               0,          ENDING] PASSED.
	Parent: 1.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Parent: 1.4  - F_TEST  [               1,               1] PASSED.
	Parent: 1.5  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 1.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Parent: 1.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Parent: 1.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Parent: 1.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.

Test #2 - Try to lock the whole file.
	Parent: 2.0  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  2.1  - F_TEST  [               0,               1] PASSED.
	Child:  2.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  2.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Child:  2.4  - F_TEST  [               1,               1] PASSED.
	Child:  2.5  - F_TEST  [               1,          ENDING] PASSED.
	Child:  2.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Child:  2.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  2.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  2.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.
	Parent: 2.10 - F_ULOCK [               0,          ENDING] PASSED.

Test #3 - Try to lock just the 1st byte.
	Parent: 3.0  - F_TLOCK [               0,               1] PASSED.
	Child:  3.1  - F_TEST  [               0,               1] PASSED.
	Child:  3.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  3.3  - F_TEST  [               1,               1] PASSED.
	Child:  3.4  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 3.5  - F_ULOCK [               0,               1] PASSED.

Test #4 - Try to lock the 2nd byte, test around it.
	Parent: 4.0  - F_TLOCK [               1,               1] PASSED.
	Child:  4.1  - F_TEST  [               0,               1] PASSED.
	Child:  4.2  - F_TEST  [               0,               2] PASSED.
	Child:  4.3  - F_TEST  [               0,          ENDING] PASSED.
	Child:  4.4  - F_TEST  [               1,               1] PASSED.
	Child:  4.5  - F_TEST  [               1,               2] PASSED.
	Child:  4.6  - F_TEST  [               1,          ENDING] PASSED.
	Child:  4.7  - F_TEST  [               2,               1] PASSED.
	Child:  4.8  - F_TEST  [               2,               2] PASSED.
	Child:  4.9  - F_TEST  [               2,          ENDING] PASSED.
	Parent: 4.10 - F_ULOCK [               1,               1] PASSED.

Test #5 - Try to lock 1st and 2nd bytes, test around them.
	Parent: 5.0  - F_TLOCK [               0,               1] PASSED.
	Parent: 5.1  - F_TLOCK [               2,               1] PASSED.
	Child:  5.2  - F_TEST  [               0,               1] PASSED.
	Child:  5.3  - F_TEST  [               0,               2] PASSED.
	Child:  5.4  - F_TEST  [               0,          ENDING] PASSED.
	Child:  5.5  - F_TEST  [               1,               1] PASSED.
	Child:  5.6  - F_TEST  [               1,               2] PASSED.
	Child:  5.7  - F_TEST  [               1,          ENDING] PASSED.
	Child:  5.8  - F_TEST  [               2,               1] PASSED.
	Child:  5.9  - F_TEST  [               2,               2] PASSED.
	Child:  5.10 - F_TEST  [               2,          ENDING] PASSED.
	Child:  5.11 - F_TEST  [               3,               1] PASSED.
	Child:  5.12 - F_TEST  [               3,               2] PASSED.
	Child:  5.13 - F_TEST  [               3,          ENDING] PASSED.
	Parent: 5.14 - F_ULOCK [               0,               1] PASSED.
	Parent: 5.15 - F_ULOCK [               2,               1] PASSED.

Test #6 - Try to lock the MAXEOF byte.
	Parent: 6.0  - F_TLOCK [7fffffffffffffff,               1] PASSED.
	Child:  6.1  - F_TEST  [7ffffffffffffffe,               1] PASSED.
	Child:  6.2  - F_TEST  [7ffffffffffffffe,               2] PASSED.
	Child:  6.3  - F_TEST  [7ffffffffffffffe,          ENDING] PASSED.
	Child:  6.4  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  6.5  - F_TEST  [7fffffffffffffff,               2] PASSED.
	Child:  6.6  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  6.7  - F_TEST  [8000000000000000,          ENDING] PASSED.
	Child:  6.8  - F_TEST  [8000000000000000,               1] PASSED.
	Child:  6.9  - F_TEST  [8000000000000000,7fffffffffffffff] PASSED.
	Child:  6.10 - F_TEST  [8000000000000000,8000000000000000] PASSED.
	Parent: 6.11 - F_ULOCK [7fffffffffffffff,               1] PASSED.

Test #7 - Test parent/child mutual exclusion.
	Parent: 7.0  - F_TLOCK [             ffc,               9] PASSED.
	Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
	Parent: Now free child to run, should block on lock.
	Parent: Check data in file to insure child blocked.
	Parent: Read 'aaaa eh' from testfile [ 4092, 7 ].
	Parent: 7.1  - COMPARE [             ffc,               7] PASSED.
	Parent: Now unlock region so child will unblock.
	Parent: 7.2  - F_ULOCK [             ffc,               9] PASSED.
	Child:  7.3  - F_LOCK  [             ffc,               9] PASSED.
	Parent: Now try to regain lock, parent should block.
	Child:  Write child's version of the data and release lock.
	Child:  Wrote 'bebebebeb' to testfile [ 4092, 9 ].
	Child:  7.4  - F_ULOCK [             ffc,               9] PASSED.
	Parent: 7.5  - F_LOCK  [             ffc,               9] PASSED.
	Parent: Check data in file to insure child unblocked.
	Parent: Read 'bebebebeb' from testfile [ 4092, 9 ].
	Parent: 7.6  - COMPARE [             ffc,               9] PASSED.
	Parent: 7.7  - F_ULOCK [             ffc,               9] PASSED.

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [               0,               3] PASSED.
	Parent: 10.1  - F_ULOCK [               1,               1] PASSED.
	Child:  10.2  - F_TEST  [               0,               1] PASSED.
	Child:  10.3  - F_TEST  [               2,               1] PASSED.
	Child:  10.4  - F_TEST  [               3,          ENDING] PASSED.
	Child:  10.5  - F_TEST  [               1,               1] PASSED.
	Parent: 10.6  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.7  - F_ULOCK [               2,               1] PASSED.
	Child:  10.8  - F_TEST  [               0,               3] PASSED.
	Parent: 10.9  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.10 - F_TLOCK [               1,               3] PASSED.
	Parent: 10.11 - F_ULOCK [               2,               1] PASSED.
	Child:  10.12 - F_TEST  [               1,               1] PASSED.
	Child:  10.13 - F_TEST  [               3,               1] PASSED.
	Child:  10.14 - F_TEST  [               4,          ENDING] PASSED.
	Child:  10.15 - F_TEST  [               2,               1] PASSED.
	Child:  10.16 - F_TEST  [               0,               1] PASSED.

Test #11 - Make sure close() releases the process's locks.
	Parent: 11.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Child:  11.1  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: 11.3  - F_TLOCK [              1d,             5b7] PASSED.
	Parent: 11.4  - F_TLOCK [            2000,              57] PASSED.
	Parent: Closed testfile.
	Child:  11.5  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.6  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.7  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ].
	Parent: Closed testfile.
	Child:  11.8  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.9  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.10 - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Truncated testfile.
	Parent: Closed testfile.
	Child:  11.11 - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.12 - F_ULOCK [               0,          ENDING] PASSED.

Test #12 - Signalled process should release locks.
	Child:  12.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Killed child process.
	Parent: 12.1  - F_TLOCK [               0,          ENDING] PASSED.

Test #13 - Check locking and mmap semantics.
	Parent: 13.0  - F_TLOCK [             ffe,          ENDING] PASSED.
	Parent: 13.1  - mmap [               0,            1000] WARNING!
	Parent: **** Expected EAGAIN, returned success...
	Parent: 13.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: unmap testfile.
	Parent: 13.3  - mmap [               0,            1000] PASSED.
	Parent: 13.4  - F_TLOCK [             ffe,          ENDING] PASSED.

Test #14 - Rate test performing I/O on unlocked and locked file.
	Parent: File Unlocked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [17655.17 +/- 90.20 KB/s].
	Parent: 14.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: File Locked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [16516.13 +/- 69.39 KB/s].
	Parent: 14.1  - F_ULOCK [               0,          ENDING] PASSED.

Test #15 - Test 2nd open and I/O after lock and close.
	Parent: Second open succeeded.
	Parent: 15.0  - F_LOCK  [               0,          ENDING] PASSED.
	Parent: 15.1  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ].
	Parent: Read 'abcdefghij' from testfile [ 0, 11 ].
	Parent: 15.2  - COMPARE [               0,               b] PASSED.

** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total).

**  CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).
Congratulations, you passed the locking tests!

All tests completed
[root@dhcp42-82 cthon04]# 



Please reproduce the issue and collect the logs below during failover:
1) On all the nodes: "ls -ld /var/lib/nfs"
2) On the node where NFS-Ganesha is killed: "ls /var/lib/nfs/ganesha/v4recov"
3) "tail -f /var/log/ganesha.log" on the node that the ganesha service failed over to.
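
Editorial note: the one-shot commands requested above could be gathered with a small Python helper such as the sketch below. The names DIAGNOSTIC_COMMANDS and collect are hypothetical; "tail -f" is left out because it does not terminate and is better run by hand in a separate terminal.

```python
import subprocess

# Commands taken from the request above; run on the relevant nodes.
DIAGNOSTIC_COMMANDS = [
    ["ls", "-ld", "/var/lib/nfs"],
    ["ls", "/var/lib/nfs/ganesha/v4recov"],
]


def collect(commands):
    """Run each command; return {command line: (exit code, combined output)}."""
    results = {}
    for argv in commands:
        proc = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True)
        results[" ".join(argv)] = (proc.returncode, proc.stdout)
    return results


if __name__ == "__main__":
    for command, (code, output) in collect(DIAGNOSTIC_COMMANDS).items():
        print(f"$ {command} (exit {code})\n{output}")
```

Capturing the exit code along with the output makes a missing directory (for example, no v4recov entries) visible in the collected results.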

Comment 6 Saurabh 2015-06-23 10:31:05 UTC
Hi, 

I tried this again and I was able to see the error,

Test #7 - Test parent/child mutual exclusion.
	Parent: 7.0  - F_TLOCK [             ffc,               9] PASSED.
	Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
	Parent: Now free child to run, should block on lock.
	Parent: Check data in file to insure child blocked.
tlock: testfile read: Input/output error
	Child:  7.3  - F_LOCK  [             ffc,               9] PASSED.
	Child:  Write child's version of the data and release lock.

** PARENT pass 1 results: 	Child:  Wrote 'bebebebeb' to testfile [ 4092, 9 ].
22/22 pass, 0/0 warn, 0/0 fail (pass/total).

**  CHILD pass 1 results: 45/45 pass, 0/0 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt mounted

real	1m24.708s
user	0m0.018s
sys	0m0.082s


I made two attempts to reproduce the issue.
In the first attempt, the cthon lock test failed with EIO, but no NFS grace related lock was mentioned in the ganesha.log file.

In the second attempt, the cthon lock test failed with EIO, and an NFS grace related lock was mentioned in the ganesha.log file.

Comment 7 Saurabh 2015-06-23 10:32:53 UTC
Created attachment 1042215 [details]
packet trace from client

Comment 8 Saurabh 2015-06-23 10:33:26 UTC
Created attachment 1042216 [details]
nfs13 messages

Comment 9 Saurabh 2015-06-23 10:33:47 UTC
Created attachment 1042217 [details]
nfs13 ganesha.log

Comment 10 Soumya Koduri 2015-06-23 11:01:22 UTC
I do not see any error cases in the pkt_trace, nor is any issue reported in the logs.
However, from the pkt_trace, it seems strange that the client, after reboot, hasn't requested an OPEN with reclaim set.
I suspect this may have something to do with the cthon tool or the nfs-client. What is the client version? As discussed, please re-run the test on a clean mount. Along with the above logs and pkt_trace, please capture strace output too. Thanks!

Comment 11 Saurabh 2015-06-23 11:17:04 UTC
Do you want strace output for the cthon lock test or for some other process? Can you be specific?

Comment 12 Saurabh 2015-06-23 12:15:23 UTC
This issue seems to be intermittent: in my most recent attempts to reproduce it, I didn't see the failure.
Sharing the latest packet trace, as this is a success case (cthon completed without failure).

Comment 13 Saurabh 2015-06-23 12:17:57 UTC
Created attachment 1042286 [details]
second packet trace from client

Comment 14 Saurabh 2015-06-23 12:19:20 UTC
Created attachment 1042289 [details]
nfs14 ganesha.log

Comment 15 Saurabh 2015-06-23 12:19:49 UTC
Created attachment 1042291 [details]
nfs14 messages

Comment 16 Soumya Koduri 2015-06-24 06:37:56 UTC
From the pkt trace which reported the failure:
After failover, the server sent NFS4ERR_EXPIRED to the client instead of
NFS4ERR_STALE_CLIENTID/STATEID. This is most likely because both
nfs-ganesha servers (the one which failed and the one which took over the VIP) may have had the same epoch value.

From discussions with the nfs-ganesha community, it has been confirmed that for nfs-ganesha servers to run in a cluster, the epoch value on each system must be different.

To confirm this, I request Saurabh to set a different epoch value on each
system before bringing up the nfs-ganesha cluster and to re-check whether we hit this issue again. Thanks!
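
To illustrate why a per-node component matters, here is a minimal sketch of one way to derive epoch values that differ between nodes even when the servers start in the same second. It is only an illustration, not the script shipped with any fix: the function name make_epoch, the node identifiers, and the 32/32 bit layout (start time in the high bits, a CRC32 of the node identifier in the low bits) are all assumptions made for this example.

```python
import time
import zlib
from typing import Optional


def make_epoch(node_id: str, now: Optional[int] = None) -> int:
    """Build a 64-bit epoch value for one cluster node (hypothetical layout).

    High 32 bits: start time in seconds, so a restarted server gets a new epoch.
    Low 32 bits: CRC32 of the node identifier, so two nodes that start in the
    same second are very unlikely to end up with the same epoch.
    """
    seconds = int(time.time()) if now is None else now
    node_bits = zlib.crc32(node_id.encode("utf-8")) & 0xFFFFFFFF
    return ((seconds & 0xFFFFFFFF) << 32) | node_bits


if __name__ == "__main__":
    # Two hypothetical nodes started at the same instant.
    start = int(time.time())
    for node in ("node-a", "node-b"):
        print(node, make_epoch(node, start))
```

A timestamp alone is not enough here, because nodes brought up together can share the same second. Mixing in something that is stable and distinct per node is what separates the values.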

Comment 21 Meghana 2015-07-13 08:50:24 UTC
This works when different epoch values are set manually on the NFS-Ganesha servers.
To make the change automatic, I have opened a new bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1242358

Comment 22 monti lawrence 2015-07-22 20:20:15 UTC
Doc text is edited. Please sign off to be included in Known Issues.

Comment 23 Soumya Koduri 2015-07-27 08:58:49 UTC
By mistake, this bug may have got marked as Bug_Fix instead of Known_issue. 
Corrected the same. Please verify the doc text.

Comment 24 Soumya Koduri 2015-07-27 09:03:51 UTC
Please update the doc text. Thanks!

Comment 25 Anjana Suparna Sriram 2015-07-27 17:57:36 UTC
Please review the edited doc text and sign off to include it in the Known Issues chapter.

Comment 26 Soumya Koduri 2015-07-28 07:33:50 UTC
Doc text looks good to me.

Comment 29 Soumya Koduri 2016-04-05 09:20:33 UTC
BZ#1242358 fixes this issue as well. More details on the changes required are at -
https://bugzilla.redhat.com/show_bug.cgi?id=1242358#c5

Comment 32 Shashank Raj 2016-05-31 06:50:01 UTC
Verified this bug with the latest glusterfs-3.7.9-6 and nfs-ganesha-2.3.1-7 builds. Below are the observations:

1. Create a 6x2 dist-rep volume and export it via ganesha.
2. Mount the volume with vers=4 on the client.
3. Start the cthon lock test from the client:

[root@dhcp43-8 ~]# cd cthon04/
[root@dhcp43-8 cthon04]# ./server -l -o vers=4 -p /testvolume -m /mnt/nfs1 -N 1 10.70.44.92

4. Bring nfs-ganesha down on the mounted node using the commands below:

[root@dhcp46-247 ~]# ps aux|grep ganesha
root      1371  0.0  0.0 107888   632 pts/0    S+   11:49   0:00 tailf /var/log/ganesha.log
root      1949  0.0  0.0 112644   956 pts/1    S+   11:51   0:00 grep --color=auto ganesha
root     30914  0.1  1.4 2467856 118880 ?      Ssl  11:46   0:00 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -E 6290732818990563328

[root@dhcp46-247 ~]# kill -s TERM 30914

5. Observe that when the failover happens, the cthon lock test pauses during the grace period and completes without any issues once the grace period ends.
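
Before triggering the failover in the steps above, it can be useful to confirm that each node's ganesha.nfsd really is running with a different epoch. The sketch below pulls the value after "-E" out of a process command line like the one shown in the "ps aux" output and reports any epoch shared by more than one node. The helper names and the node names in the usage example are hypothetical. Collecting the command lines from each node is left out.

```python
import shlex
from typing import Dict, List, Optional


def epoch_from_cmdline(cmdline: str) -> Optional[int]:
    """Return the integer following "-E" in a ganesha.nfsd command line, if any."""
    args = shlex.split(cmdline)
    for i, arg in enumerate(args[:-1]):
        if arg == "-E":
            return int(args[i + 1])
    return None


def duplicate_epochs(epochs_by_node: Dict[str, int]) -> Dict[int, List[str]]:
    """Map each epoch used by more than one node to the nodes that share it."""
    nodes_by_epoch: Dict[int, List[str]] = {}
    for node, epoch in sorted(epochs_by_node.items()):
        nodes_by_epoch.setdefault(epoch, []).append(node)
    return {e: nodes for e, nodes in nodes_by_epoch.items() if len(nodes) > 1}


if __name__ == "__main__":
    # Command line taken from the ps output above; node names are made up.
    cmd = ("/usr/bin/ganesha.nfsd -L /var/log/ganesha.log "
           "-f /etc/ganesha/ganesha.conf -N NIV_EVENT -E 6290732818990563328")
    epochs = {"node1": epoch_from_cmdline(cmd)}
    print(epochs, duplicate_epochs(epochs))
```

An empty result from duplicate_epochs means no two nodes share an epoch, which is the precondition for the failover test.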

>>>>> pcs status

[root@dhcp46-247 ~]# pcs status
Cluster name: G1464676674.54
Last updated: Tue May 31 11:53:47 2016          Last change: Tue May 31 11:53:40 2016 by hacluster via crmd on dhcp47-139.lab.eng.blr.redhat.com
Stack: corosync
Current DC: dhcp47-139.lab.eng.blr.redhat.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
4 nodes and 16 resources configured

Online: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-247.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-247.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-247.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]
     Stopped: [ dhcp46-247.lab.eng.blr.redhat.com ]
 dhcp46-247.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr):        Started dhcp46-202.lab.eng.blr.redhat.com
 dhcp46-26.lab.eng.blr.redhat.com-cluster_ip-1  (ocf::heartbeat:IPaddr):        Started dhcp46-26.lab.eng.blr.redhat.com
 dhcp47-139.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr):        Started dhcp47-139.lab.eng.blr.redhat.com
 dhcp46-202.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr):        Started dhcp46-202.lab.eng.blr.redhat.com

PCSD Status:
  dhcp46-247.lab.eng.blr.redhat.com: Online
  dhcp46-26.lab.eng.blr.redhat.com: Online
  dhcp47-139.lab.eng.blr.redhat.com: Online
  dhcp46-202.lab.eng.blr.redhat.com: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

>>>>> cthon lock test completes without any issues after failover

[root@dhcp43-8 cthon04]# ./server -l -o vers=4 -p /testvolume -m /mnt/nfs1 -N 1 10.70.44.92
sh ./runtests  -l -t /mnt/nfs1/dhcp43-8.test

Starting LOCKING tests: test directory /mnt/nfs1/dhcp43-8.test (arg: -t)

Testing native post-LFS locking

Creating parent/child synchronization pipes.

Test #1 - Test regions of an unlocked file.
	Parent: 1.1  - F_TEST  [               0,               1] PASSED.
	Parent: 1.2  - F_TEST  [               0,          ENDING] PASSED.
	Parent: 1.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Parent: 1.4  - F_TEST  [               1,               1] PASSED.
	Parent: 1.5  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 1.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Parent: 1.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Parent: 1.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Parent: 1.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.

Test #2 - Try to lock the whole file.
	Parent: 2.0  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  2.1  - F_TEST  [               0,               1] PASSED.
	Child:  2.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  2.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Child:  2.4  - F_TEST  [               1,               1] PASSED.
	Child:  2.5  - F_TEST  [               1,          ENDING] PASSED.
	Child:  2.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Child:  2.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  2.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  2.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.
	Parent: 2.10 - F_ULOCK [               0,          ENDING] PASSED.

Test #3 - Try to lock just the 1st byte.
	Parent: 3.0  - F_TLOCK [               0,               1] PASSED.
	Child:  3.1  - F_TEST  [               0,               1] PASSED.
	Child:  3.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  3.3  - F_TEST  [               1,               1] PASSED.
	Child:  3.4  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 3.5  - F_ULOCK [               0,               1] PASSED.

Test #4 - Try to lock the 2nd byte, test around it.
	Parent: 4.0  - F_TLOCK [               1,               1] PASSED.
	Child:  4.1  - F_TEST  [               0,               1] PASSED.
	Child:  4.2  - F_TEST  [               0,               2] PASSED.
	Child:  4.3  - F_TEST  [               0,          ENDING] PASSED.
	Child:  4.4  - F_TEST  [               1,               1] PASSED.
	Child:  4.5  - F_TEST  [               1,               2] PASSED.
	Child:  4.6  - F_TEST  [               1,          ENDING] PASSED.
	Child:  4.7  - F_TEST  [               2,               1] PASSED.
	Child:  4.8  - F_TEST  [               2,               2] PASSED.
	Child:  4.9  - F_TEST  [               2,          ENDING] PASSED.
	Parent: 4.10 - F_ULOCK [               1,               1] PASSED.

Test #5 - Try to lock 1st and 2nd bytes, test around them.
	Parent: 5.0  - F_TLOCK [               0,               1] PASSED.
	Parent: 5.1  - F_TLOCK [               2,               1] PASSED.
	Child:  5.2  - F_TEST  [               0,               1] PASSED.
	Child:  5.3  - F_TEST  [               0,               2] PASSED.
	Child:  5.4  - F_TEST  [               0,          ENDING] PASSED.
	Child:  5.5  - F_TEST  [               1,               1] PASSED.
	Child:  5.6  - F_TEST  [               1,               2] PASSED.
	Child:  5.7  - F_TEST  [               1,          ENDING] PASSED.
	Child:  5.8  - F_TEST  [               2,               1] PASSED.
	Child:  5.9  - F_TEST  [               2,               2] PASSED.
	Child:  5.10 - F_TEST  [               2,          ENDING] PASSED.
	Child:  5.11 - F_TEST  [               3,               1] PASSED.
	Child:  5.12 - F_TEST  [               3,               2] PASSED.
	Child:  5.13 - F_TEST  [               3,          ENDING] PASSED.
	Parent: 5.14 - F_ULOCK [               0,               1] PASSED.
	Parent: 5.15 - F_ULOCK [               2,               1] PASSED.

Test #6 - Try to lock the MAXEOF byte.
	Parent: 6.0  - F_TLOCK [7fffffffffffffff,               1] PASSED.
	Child:  6.1  - F_TEST  [7ffffffffffffffe,               1] PASSED.
	Child:  6.2  - F_TEST  [7ffffffffffffffe,               2] PASSED.
	Child:  6.3  - F_TEST  [7ffffffffffffffe,          ENDING] PASSED.
	Child:  6.4  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  6.5  - F_TEST  [7fffffffffffffff,               2] PASSED.
	Child:  6.6  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  6.7  - F_TEST  [8000000000000000,          ENDING] PASSED.
	Child:  6.8  - F_TEST  [8000000000000000,               1] PASSED.
	Child:  6.9  - F_TEST  [8000000000000000,7fffffffffffffff] PASSED.
	Child:  6.10 - F_TEST  [8000000000000000,8000000000000000] PASSED.
	Parent: 6.11 - F_ULOCK [7fffffffffffffff,               1] PASSED.

Test #7 - Test parent/child mutual exclusion.
	Parent: 7.0  - F_TLOCK [             ffc,               9] PASSED.
	Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
	Parent: Now free child to run, should block on lock.
	Parent: Check data in file to insure child blocked.
	Parent: Read 'aaaa eh' from testfile [ 4092, 7 ].
	Parent: 7.1  - COMPARE [             ffc,               7] PASSED.
	Parent: Now unlock region so child will unblock.
	Parent: 7.2  - F_ULOCK [             ffc,               9] PASSED.
	Child:  7.3  - F_LOCK  [             ffc,               9] PASSED.
	Child:  Write child's version of the data and release lock.
	Parent: Now try to regain lock, parent should block.
	Child:  Wrote 'bebebebeb' to testfile [ 4092, 9 ].
	Child:  7.4  - F_ULOCK [             ffc,               9] PASSED.
	Parent: 7.5  - F_LOCK  [             ffc,               9] PASSED.
	Parent: Check data in file to insure child unblocked.
	Parent: Read 'bebebebeb' from testfile [ 4092, 9 ].
	Parent: 7.6  - COMPARE [             ffc,               9] PASSED.
	Parent: 7.7  - F_ULOCK [             ffc,               9] PASSED.

Test #8 - Rate test performing lock/unlock cycles.
	Parent: Performed 1000 lock/unlock cycles in 1620 msecs. [74074 lpm].

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [               0,               3] PASSED.
	Parent: 10.1  - F_ULOCK [               1,               1] PASSED.
	Child:  10.2  - F_TEST  [               0,               1] PASSED.
	Child:  10.3  - F_TEST  [               2,               1] PASSED.
	Child:  10.4  - F_TEST  [               3,          ENDING] PASSED.
	Child:  10.5  - F_TEST  [               1,               1] PASSED.
	Parent: 10.6  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.7  - F_ULOCK [               2,               1] PASSED.
	Child:  10.8  - F_TEST  [               0,               3] PASSED.
	Parent: 10.9  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.10 - F_TLOCK [               1,               3] PASSED.
	Parent: 10.11 - F_ULOCK [               2,               1] PASSED.
	Child:  10.12 - F_TEST  [               1,               1] PASSED.
	Child:  10.13 - F_TEST  [               3,               1] PASSED.
	Child:  10.14 - F_TEST  [               4,          ENDING] PASSED.
	Child:  10.15 - F_TEST  [               2,               1] PASSED.
	Child:  10.16 - F_TEST  [               0,               1] PASSED.

Test #11 - Make sure close() releases the process's locks.
	Parent: 11.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Child:  11.1  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: 11.3  - F_TLOCK [              1d,             5b7] PASSED.
	Parent: 11.4  - F_TLOCK [            2000,              57] PASSED.
	Parent: Closed testfile.
	Child:  11.5  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.6  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.7  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ].
	Parent: Closed testfile.
	Child:  11.8  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.9  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.10 - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Truncated testfile.
	Parent: Closed testfile.
	Child:  11.11 - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.12 - F_ULOCK [               0,          ENDING] PASSED.

Test #12 - Signalled process should release locks.
	Child:  12.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Killed child process.
	Parent: 12.1  - F_TLOCK [               0,          ENDING] PASSED.

Test #13 - Check locking and mmap semantics.
	Parent: 13.0  - F_TLOCK [             ffe,          ENDING] PASSED.
	Parent: 13.1  - mmap [               0,            1000] WARNING!
	Parent: **** Expected EAGAIN, returned success...
	Parent: 13.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: unmap testfile.
	Parent: 13.3  - mmap [               0,            1000] PASSED.
	Parent: 13.4  - F_TLOCK [             ffe,          ENDING] PASSED.
	Parent: unmap testfile.

Test #14 - Rate test performing I/O on unlocked and locked file.
	Parent: File Unlocked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [12800.00 +/- 76.80 KB/s].
	Parent: 14.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: File Locked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [11906.98 +/- 105.99 KB/s].
	Parent: 14.1  - F_ULOCK [               0,          ENDING] PASSED.

Test #15 - Test 2nd open and I/O after lock and close.
	Parent: Second open succeeded.
	Parent: 15.0  - F_LOCK  [               0,          ENDING] PASSED.
	Parent: 15.1  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ].
	Parent: Read 'abcdefghij' from testfile [ 0, 11 ].
	Parent: 15.2  - COMPARE [               0,               b] PASSED.

** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total).

**  CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).

Testing non-native 64 bit LFS locking

Creating parent/child synchronization pipes.

Test #1 - Test regions of an unlocked file.
	Parent: 1.1  - F_TEST  [               0,               1] PASSED.
	Parent: 1.2  - F_TEST  [               0,          ENDING] PASSED.
	Parent: 1.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Parent: 1.4  - F_TEST  [               1,               1] PASSED.
	Parent: 1.5  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 1.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Parent: 1.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Parent: 1.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Parent: 1.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.

Test #2 - Try to lock the whole file.
	Parent: 2.0  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  2.1  - F_TEST  [               0,               1] PASSED.
	Child:  2.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  2.3  - F_TEST  [               0,7fffffffffffffff] PASSED.
	Child:  2.4  - F_TEST  [               1,               1] PASSED.
	Child:  2.5  - F_TEST  [               1,          ENDING] PASSED.
	Child:  2.6  - F_TEST  [               1,7fffffffffffffff] PASSED.
	Child:  2.7  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  2.8  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  2.9  - F_TEST  [7fffffffffffffff,7fffffffffffffff] PASSED.
	Parent: 2.10 - F_ULOCK [               0,          ENDING] PASSED.

Test #3 - Try to lock just the 1st byte.
	Parent: 3.0  - F_TLOCK [               0,               1] PASSED.
	Child:  3.1  - F_TEST  [               0,               1] PASSED.
	Child:  3.2  - F_TEST  [               0,          ENDING] PASSED.
	Child:  3.3  - F_TEST  [               1,               1] PASSED.
	Child:  3.4  - F_TEST  [               1,          ENDING] PASSED.
	Parent: 3.5  - F_ULOCK [               0,               1] PASSED.

Test #4 - Try to lock the 2nd byte, test around it.
	Parent: 4.0  - F_TLOCK [               1,               1] PASSED.
	Child:  4.1  - F_TEST  [               0,               1] PASSED.
	Child:  4.2  - F_TEST  [               0,               2] PASSED.
	Child:  4.3  - F_TEST  [               0,          ENDING] PASSED.
	Child:  4.4  - F_TEST  [               1,               1] PASSED.
	Child:  4.5  - F_TEST  [               1,               2] PASSED.
	Child:  4.6  - F_TEST  [               1,          ENDING] PASSED.
	Child:  4.7  - F_TEST  [               2,               1] PASSED.
	Child:  4.8  - F_TEST  [               2,               2] PASSED.
	Child:  4.9  - F_TEST  [               2,          ENDING] PASSED.
	Parent: 4.10 - F_ULOCK [               1,               1] PASSED.

Test #5 - Try to lock 1st and 2nd bytes, test around them.
	Parent: 5.0  - F_TLOCK [               0,               1] PASSED.
	Parent: 5.1  - F_TLOCK [               2,               1] PASSED.
	Child:  5.2  - F_TEST  [               0,               1] PASSED.
	Child:  5.3  - F_TEST  [               0,               2] PASSED.
	Child:  5.4  - F_TEST  [               0,          ENDING] PASSED.
	Child:  5.5  - F_TEST  [               1,               1] PASSED.
	Child:  5.6  - F_TEST  [               1,               2] PASSED.
	Child:  5.7  - F_TEST  [               1,          ENDING] PASSED.
	Child:  5.8  - F_TEST  [               2,               1] PASSED.
	Child:  5.9  - F_TEST  [               2,               2] PASSED.
	Child:  5.10 - F_TEST  [               2,          ENDING] PASSED.
	Child:  5.11 - F_TEST  [               3,               1] PASSED.
	Child:  5.12 - F_TEST  [               3,               2] PASSED.
	Child:  5.13 - F_TEST  [               3,          ENDING] PASSED.
	Parent: 5.14 - F_ULOCK [               0,               1] PASSED.
	Parent: 5.15 - F_ULOCK [               2,               1] PASSED.

Test #6 - Try to lock the MAXEOF byte.
	Parent: 6.0  - F_TLOCK [7fffffffffffffff,               1] PASSED.
	Child:  6.1  - F_TEST  [7ffffffffffffffe,               1] PASSED.
	Child:  6.2  - F_TEST  [7ffffffffffffffe,               2] PASSED.
	Child:  6.3  - F_TEST  [7ffffffffffffffe,          ENDING] PASSED.
	Child:  6.4  - F_TEST  [7fffffffffffffff,               1] PASSED.
	Child:  6.5  - F_TEST  [7fffffffffffffff,               2] PASSED.
	Child:  6.6  - F_TEST  [7fffffffffffffff,          ENDING] PASSED.
	Child:  6.7  - F_TEST  [8000000000000000,          ENDING] PASSED.
	Child:  6.8  - F_TEST  [8000000000000000,               1] PASSED.
	Child:  6.9  - F_TEST  [8000000000000000,7fffffffffffffff] PASSED.
	Child:  6.10 - F_TEST  [8000000000000000,8000000000000000] PASSED.
	Parent: 6.11 - F_ULOCK [7fffffffffffffff,               1] PASSED.

Test #7 - Test parent/child mutual exclusion.
	Parent: 7.0  - F_TLOCK [             ffc,               9] PASSED.
	Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
	Parent: Now free child to run, should block on lock.
	Parent: Check data in file to insure child blocked.
	Parent: Read 'aaaa eh' from testfile [ 4092, 7 ].
	Parent: 7.1  - COMPARE [             ffc,               7] PASSED.
	Parent: Now unlock region so child will unblock.
	Parent: 7.2  - F_ULOCK [             ffc,               9] PASSED.
	Child:  7.3  - F_LOCK  [             ffc,               9] PASSED.
	Parent: Now try to regain lock, parent should block.
	Child:  Write child's version of the data and release lock.
	Child:  Wrote 'bebebebeb' to testfile [ 4092, 9 ].
	Child:  7.4  - F_ULOCK [             ffc,               9] PASSED.
	Parent: 7.5  - F_LOCK  [             ffc,               9] PASSED.
	Parent: Check data in file to insure child unblocked.
	Parent: Read 'bebebebeb' from testfile [ 4092, 9 ].
	Parent: 7.6  - COMPARE [             ffc,               9] PASSED.
	Parent: 7.7  - F_ULOCK [             ffc,               9] PASSED.

Test #8 - Rate test performing lock/unlock cycles.
	Parent: Performed 1000 lock/unlock cycles in 1810 msecs. [66298 lpm].

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [               0,               3] PASSED.
	Parent: 10.1  - F_ULOCK [               1,               1] PASSED.
	Child:  10.2  - F_TEST  [               0,               1] PASSED.
	Child:  10.3  - F_TEST  [               2,               1] PASSED.
	Child:  10.4  - F_TEST  [               3,          ENDING] PASSED.
	Child:  10.5  - F_TEST  [               1,               1] PASSED.
	Parent: 10.6  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.7  - F_ULOCK [               2,               1] PASSED.
	Child:  10.8  - F_TEST  [               0,               3] PASSED.
	Parent: 10.9  - F_ULOCK [               0,               1] PASSED.
	Parent: 10.10 - F_TLOCK [               1,               3] PASSED.
	Parent: 10.11 - F_ULOCK [               2,               1] PASSED.
	Child:  10.12 - F_TEST  [               1,               1] PASSED.
	Child:  10.13 - F_TEST  [               3,               1] PASSED.
	Child:  10.14 - F_TEST  [               4,          ENDING] PASSED.
	Child:  10.15 - F_TEST  [               2,               1] PASSED.
	Child:  10.16 - F_TEST  [               0,               1] PASSED.

Test #11 - Make sure close() releases the process's locks.
	Parent: 11.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Child:  11.1  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: 11.3  - F_TLOCK [              1d,             5b7] PASSED.
	Parent: 11.4  - F_TLOCK [            2000,              57] PASSED.
	Parent: Closed testfile.
	Child:  11.5  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.6  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.7  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ].
	Parent: Closed testfile.
	Child:  11.8  - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.9  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.10 - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Truncated testfile.
	Parent: Closed testfile.
	Child:  11.11 - F_TLOCK [               0,          ENDING] PASSED.
	Child:  11.12 - F_ULOCK [               0,          ENDING] PASSED.

Test #12 - Signalled process should release locks.
	Child:  12.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: Killed child process.
	Parent: 12.1  - F_TLOCK [               0,          ENDING] PASSED.

Test #13 - Check locking and mmap semantics.
	Parent: 13.0  - F_TLOCK [             ffe,          ENDING] PASSED.
	Parent: 13.1  - mmap [               0,            1000] WARNING!
	Parent: **** Expected EAGAIN, returned success...
	Parent: 13.2  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: unmap testfile.
	Parent: 13.3  - mmap [               0,            1000] PASSED.
	Parent: 13.4  - F_TLOCK [             ffe,          ENDING] PASSED.
	Parent: unmap testfile.

Test #14 - Rate test performing I/O on unlocked and locked file.
	Parent: File Unlocked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [13837.84 +/- 105.99 KB/s].
	Parent: 14.0  - F_TLOCK [               0,          ENDING] PASSED.
	Parent: File Locked
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Truncated testfile.
	Parent: Wrote and read 256 KB file 10 times; [11636.36 +/- 37.93 KB/s].
	Parent: 14.1  - F_ULOCK [               0,          ENDING] PASSED.

Test #15 - Test 2nd open and I/O after lock and close.
	Parent: Second open succeeded.
	Parent: 15.0  - F_LOCK  [               0,          ENDING] PASSED.
	Parent: 15.1  - F_ULOCK [               0,          ENDING] PASSED.
	Parent: Closed testfile.
	Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ].
	Parent: Read 'abcdefghij' from testfile [ 0, 11 ].
	Parent: 15.2  - COMPARE [               0,               b] PASSED.

** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total).

**  CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).
Congratulations, you passed the locking tests!

All tests completed
[root@dhcp43-8 cthon04]# 

************************************************************************

Based on the above observation, marking this bug as Verified.

Comment 33 Divya 2016-06-13 10:49:27 UTC
Soumya,

Could you review and sign off on the edited doc text?

Comment 34 Soumya Koduri 2016-06-13 12:14:11 UTC
Doc text looks good to me.

Comment 36 errata-xmlrpc 2016-06-23 05:31:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1247

