Bug 1224250
Summary: nfs-ganesha: cthon does not finish when failover is triggered by killing nfs-ganesha process
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Saurabh <saujain>
Component: nfs-ganesha
Assignee: Soumya Koduri <skoduri>
Status: CLOSED ERRATA
QA Contact: Shashank Raj <sraj>
Severity: urgent
Docs Contact:
Priority: high
Version: rhgs-3.1
CC: asriram, asrivast, divya, kkeithle, mlawrenc, mzywusko, nlevinki, rcyriac, rhinduja, sankarshan, sashinde, skoduri, smohan
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.3
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: nfs-ganesha-2.3.1-5
Doc Type: Bug Fix
Doc Text:
Previously, while configuring an nfs-ganesha cluster, there were cases wherein the nfs-ganesha process on each node came up at the same time, so that most of them had the same epoch value. As a consequence, with the same epoch value on all the NFS-Ganesha heads, the NFS server sent the NFS4ERR_FHEXPIRED error instead of NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID after failover, and NFSv4 clients were not able to recover locks. With this fix, a new option "EPOCH_EXEC" is added to '/etc/sysconfig/ganesha'; it takes the path of the script (default: '/bin/true') that is used to generate the epoch value. For Gluster, a new script, '/usr/libexec/ganesha/generate_epoch.py', is added and is used to generate the epoch value. A new helper service, 'nfs-ganesha-config', is added to process the init options provided in '/etc/sysconfig/ganesha' and copy the results to '/run/sysconfig/ganesha', which nfs-ganesha uses when starting. Now, NFS-Ganesha has a unique epoch value on each node of the cluster, resulting in smooth failover and lock recovery.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-23 05:31:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1242358
Bug Blocks: 1202842, 1216951, 1299184
Attachments:
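To illustrate why the Doc Text insists on a distinct epoch per node, here is a minimal Python sketch. It is not the shipped '/usr/libexec/ganesha/generate_epoch.py'. The bit layout (start time in the upper 32 bits, low 32 bits of a node UUID in the lower 32 bits), the idea that a client ID carries the low 32 bits of the issuing server's epoch, and the names make_epoch, issue_clientid and is_stale_clientid are all assumptions made for this example.

```python
import time
import uuid
from typing import Optional

MASK32 = 0xFFFFFFFF


def make_epoch(node_uuid: str, now_seconds: Optional[int] = None) -> int:
    """Build a 64-bit epoch: start time in the upper half, node identity below.

    Assumed layout, for illustration only.
    """
    if now_seconds is None:
        now_seconds = int(time.time())
    node_bits = uuid.UUID(node_uuid).int & MASK32
    return ((now_seconds & MASK32) << 32) | node_bits


def issue_clientid(server_epoch: int, counter: int) -> int:
    """Hypothetical client ID: low 32 bits of the epoch, then a counter."""
    return ((server_epoch & MASK32) << 32) | (counter & MASK32)


def is_stale_clientid(clientid: int, server_epoch: int) -> bool:
    """Treat a client ID as stale when it was issued under another epoch."""
    return (clientid >> 32) != (server_epoch & MASK32)


if __name__ == "__main__":
    start = 1464676674  # both heads start within the same second

    # Timestamp-only epochs collide, so the takeover head cannot tell
    # that the client ID was issued by a different server instance.
    head_a = head_b = start
    cid = issue_clientid(head_a, counter=1)
    print(is_stale_clientid(cid, head_b))  # False

    # Mixing in a per-node identifier keeps the epochs apart.
    head_a = make_epoch("11111111-2222-3333-4444-555555555555", start)
    head_b = make_epoch("11111111-2222-3333-4444-666666666666", start)
    cid = issue_clientid(head_a, counter=1)
    print(is_stale_clientid(cid, head_b))  # True
```

In this toy model, only the second case lets the takeover node report the client ID as stale, which is the signal an NFSv4 client needs before it starts reclaiming its locks.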
Description
Saurabh
2015-05-22 11:30:55 UTC
Created attachment 1028695: sosreport of node4
This case still fails for me.

[root@nfs11 ~]# pcs status
Cluster name: reaper
Last updated: Wed Jun 17 20:45:28 2015
Last change: Wed Jun 17 20:45:10 2015
Stack: cman
Current DC: nfs11 - partition with quorum
Version: 1.1.11-97629de
8 Nodes configured
33 Resources configured
Online: [ nfs11 nfs12 nfs13 nfs14 nfs15 nfs16 nfs17 nfs18 ]
Full list of resources:
Clone Set: nfs-mon-clone [nfs-mon]
Started: [ nfs11 nfs12 nfs13 nfs14 nfs15 nfs16 nfs17 nfs18 ]
Clone Set: nfs-grace-clone [nfs-grace]
Started: [ nfs11 nfs12 nfs13 nfs14 nfs15 nfs16 nfs17 nfs18 ]
nfs11-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs18
nfs11-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs18
nfs12-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs12
nfs12-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs12
nfs13-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs13
nfs13-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs13
nfs14-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs14
nfs14-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs14
nfs15-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs15
nfs15-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs15
nfs16-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs16
nfs16-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs16
nfs17-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs17
nfs17-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs17
nfs18-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs18
nfs18-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs18
nfs11-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs11

cthon test:

Test #7 - Test parent/child mutual exclusion.
Parent: 7.0 - F_TLOCK [ ffc, 9] PASSED.
Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
Parent: Now free child to run, should block on lock.
Parent: Check data in file to insure child blocked.
tlock: testfile read: Input/output error
Child: 7.3 - F_LOCK [ ffc, 9] PASSED.
Child: Write child's version of the data and release lock.
** PARENT pass 1 results: 22/22 pass, 0/0 warn, 0/0 fail (pass/total).
** CHILD pass 1 results: 45/45 pass, 0/0 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt mounted
real 0m26.258s
user 0m0.013s
sys 0m0.049s

This is an important BZ, as the cthon lock test returns EIO during failover.

We are unable to reproduce the issue. The cthon tests passed even after failover:

[root@dhcp42-82 cthon04]# ./server -l -o vers=4 -p /vol1 -m /mnt -N 1 10.70.40.179
Start tests on path /mnt/dhcp42-82.test [y/n]? y
sh ./runtests -l /mnt/dhcp42-82.test
Starting LOCKING tests: test directory /mnt/dhcp42-82.test (arg: /mnt/dhcp42-82.test)
Testing native post-LFS locking Creating parent/child synchronization pipes. Test #1 - Test regions of an unlocked file. Parent: 1.1 - F_TEST [ 0, 1] PASSED. Parent: 1.2 - F_TEST [ 0, ENDING] PASSED. Parent: 1.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Parent: 1.4 - F_TEST [ 1, 1] PASSED. Parent: 1.5 - F_TEST [ 1, ENDING] PASSED. Parent: 1.6 - F_TEST [ 1,7fffffffffffffff] PASSED. Parent: 1.7 - F_TEST [7fffffffffffffff, 1] PASSED. Parent: 1.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Parent: 1.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Test #2 - Try to lock the whole file. Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED. Child: 2.1 - F_TEST [ 0, 1] PASSED. Child: 2.2 - F_TEST [ 0, ENDING] PASSED. Child: 2.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Child: 2.4 - F_TEST [ 1, 1] PASSED. Child: 2.5 - F_TEST [ 1, ENDING] PASSED. Child: 2.6 - F_TEST [ 1,7fffffffffffffff] PASSED. Child: 2.7 - F_TEST [7fffffffffffffff, 1] PASSED. Child: 2.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Child: 2.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Parent: 2.10 - F_ULOCK [ 0, ENDING] PASSED. Test #3 - Try to lock just the 1st byte. Parent: 3.0 - F_TLOCK [ 0, 1] PASSED. Child: 3.1 - F_TEST [ 0, 1] PASSED. Child: 3.2 - F_TEST [ 0, ENDING] PASSED. Child: 3.3 - F_TEST [ 1, 1] PASSED. Child: 3.4 - F_TEST [ 1, ENDING] PASSED. Parent: 3.5 - F_ULOCK [ 0, 1] PASSED. Test #4 - Try to lock the 2nd byte, test around it.
Parent: 4.0 - F_TLOCK [ 1, 1] PASSED. Child: 4.1 - F_TEST [ 0, 1] PASSED. Child: 4.2 - F_TEST [ 0, 2] PASSED. Child: 4.3 - F_TEST [ 0, ENDING] PASSED. Child: 4.4 - F_TEST [ 1, 1] PASSED. Child: 4.5 - F_TEST [ 1, 2] PASSED. Child: 4.6 - F_TEST [ 1, ENDING] PASSED. Child: 4.7 - F_TEST [ 2, 1] PASSED. Child: 4.8 - F_TEST [ 2, 2] PASSED. Child: 4.9 - F_TEST [ 2, ENDING] PASSED. Parent: 4.10 - F_ULOCK [ 1, 1] PASSED. Test #5 - Try to lock 1st and 2nd bytes, test around them. Parent: 5.0 - F_TLOCK [ 0, 1] PASSED. Parent: 5.1 - F_TLOCK [ 2, 1] PASSED. Child: 5.2 - F_TEST [ 0, 1] PASSED. Child: 5.3 - F_TEST [ 0, 2] PASSED. Child: 5.4 - F_TEST [ 0, ENDING] PASSED. Child: 5.5 - F_TEST [ 1, 1] PASSED. Child: 5.6 - F_TEST [ 1, 2] PASSED. Child: 5.7 - F_TEST [ 1, ENDING] PASSED. Child: 5.8 - F_TEST [ 2, 1] PASSED. Child: 5.9 - F_TEST [ 2, 2] PASSED. Child: 5.10 - F_TEST [ 2, ENDING] PASSED. Child: 5.11 - F_TEST [ 3, 1] PASSED. Child: 5.12 - F_TEST [ 3, 2] PASSED. Child: 5.13 - F_TEST [ 3, ENDING] PASSED. Parent: 5.14 - F_ULOCK [ 0, 1] PASSED. Parent: 5.15 - F_ULOCK [ 2, 1] PASSED. Test #6 - Try to lock the MAXEOF byte. Parent: 6.0 - F_TLOCK [7fffffffffffffff, 1] PASSED. Child: 6.1 - F_TEST [7ffffffffffffffe, 1] PASSED. Child: 6.2 - F_TEST [7ffffffffffffffe, 2] PASSED. Child: 6.3 - F_TEST [7ffffffffffffffe, ENDING] PASSED. Child: 6.4 - F_TEST [7fffffffffffffff, 1] PASSED. Child: 6.5 - F_TEST [7fffffffffffffff, 2] PASSED. Child: 6.6 - F_TEST [7fffffffffffffff, ENDING] PASSED. Child: 6.7 - F_TEST [8000000000000000, ENDING] PASSED. Child: 6.8 - F_TEST [8000000000000000, 1] PASSED. Child: 6.9 - F_TEST [8000000000000000,7fffffffffffffff] PASSED. Child: 6.10 - F_TEST [8000000000000000,8000000000000000] PASSED. Parent: 6.11 - F_ULOCK [7fffffffffffffff, 1] PASSED. Test #7 - Test parent/child mutual exclusion. Parent: 7.0 - F_TLOCK [ ffc, 9] PASSED. Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ]. Parent: Now free child to run, should block on lock. 
Parent: Check data in file to insure child blocked. Parent: Read 'aaaa eh' from testfile [ 4092, 7 ]. Parent: 7.1 - COMPARE [ ffc, 7] PASSED. [root@clus2 ganesha]# service nfs-ganesha stop Stopping ganesha.nfsd: [ OK ] [root@clus2 ganesha]# [root@clus2 ganesha]# pcs status Cluster name: ganesha-soumya Last updated: Thu Jun 18 17:08:20 2015 Last change: Thu Jun 18 16:48:07 2015 Stack: cman Current DC: clus1 - partition with quorum Version: 1.1.11-97629de 2 Nodes configured 9 Resources configured Online: [ clus1 clus2 ] Full list of resources: Clone Set: nfs-mon-clone [nfs-mon] Started: [ clus1 clus2 ] Clone Set: nfs-grace-clone [nfs-grace] Started: [ clus1 clus2 ] clus1-cluster_ip-1 (ocf::heartbeat:IPaddr): Started clus1 clus1-trigger_ip-1 (ocf::heartbeat:Dummy): Started clus1 clus2-cluster_ip-1 (ocf::heartbeat:IPaddr): Started clus1 clus2-trigger_ip-1 (ocf::heartbeat:Dummy): Started clus1 clus2-dead_ip-1 (ocf::heartbeat:Dummy): Started clus2 [root@clus2 ganesha]# Parent: Now unlock region so child will unblock. Parent: 7.2 - F_ULOCK [ ffc, 9] PASSED. Child: 7.3 - F_LOCK [ ffc, 9] PASSED. Child: Write child's version of the data and release lock. Parent: Now try to regain lock, parent should block. Child: Wrote 'bebebebeb' to testfile [ 4092, 9 ]. Child: 7.4 - F_ULOCK [ ffc, 9] PASSED. Parent: 7.5 - F_LOCK [ ffc, 9] PASSED. Parent: Check data in file to insure child unblocked. Parent: Read 'bebebebeb' from testfile [ 4092, 9 ]. Parent: 7.6 - COMPARE [ ffc, 9] PASSED. Parent: 7.7 - F_ULOCK [ ffc, 9] PASSED. Test #10 - Make sure a locked region is split properly. Parent: 10.0 - F_TLOCK [ 0, 3] PASSED. Parent: 10.1 - F_ULOCK [ 1, 1] PASSED. Child: 10.2 - F_TEST [ 0, 1] PASSED. Child: 10.3 - F_TEST [ 2, 1] PASSED. Child: 10.4 - F_TEST [ 3, ENDING] PASSED. Child: 10.5 - F_TEST [ 1, 1] PASSED. Parent: 10.6 - F_ULOCK [ 0, 1] PASSED. Parent: 10.7 - F_ULOCK [ 2, 1] PASSED. Child: 10.8 - F_TEST [ 0, 3] PASSED. Parent: 10.9 - F_ULOCK [ 0, 1] PASSED. 
Parent: 10.10 - F_TLOCK [ 1, 3] PASSED. Parent: 10.11 - F_ULOCK [ 2, 1] PASSED. Child: 10.12 - F_TEST [ 1, 1] PASSED. Child: 10.13 - F_TEST [ 3, 1] PASSED. Child: 10.14 - F_TEST [ 4, ENDING] PASSED. Child: 10.15 - F_TEST [ 2, 1] PASSED. Child: 10.16 - F_TEST [ 0, 1] PASSED. Test #11 - Make sure close() releases the process's locks. Parent: 11.0 - F_TLOCK [ 0, ENDING] PASSED. Parent: Closed testfile. Child: 11.1 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.2 - F_ULOCK [ 0, ENDING] PASSED. Parent: 11.3 - F_TLOCK [ 1d, 5b7] PASSED. Parent: 11.4 - F_TLOCK [ 2000, 57] PASSED. Parent: Closed testfile. Child: 11.5 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.6 - F_ULOCK [ 0, ENDING] PASSED. Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ]. Parent: 11.7 - F_TLOCK [ 0, ENDING] PASSED. Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ]. Parent: Closed testfile. Child: 11.8 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.9 - F_ULOCK [ 0, ENDING] PASSED. Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ]. Parent: 11.10 - F_TLOCK [ 0, ENDING] PASSED. Parent: Truncated testfile. Parent: Closed testfile. Child: 11.11 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.12 - F_ULOCK [ 0, ENDING] PASSED. Test #12 - Signalled process should release locks. Child: 12.0 - F_TLOCK [ 0, ENDING] PASSED. Parent: Killed child process. Parent: 12.1 - F_TLOCK [ 0, ENDING] PASSED. Test #13 - Check locking and mmap semantics. Parent: 13.0 - F_TLOCK [ ffe, ENDING] PASSED. Parent: 13.1 - mmap [ 0, 1000] WARNING! Parent: **** Expected EAGAIN, returned success... Parent: 13.2 - F_ULOCK [ 0, ENDING] PASSED. Parent: unmap testfile. Parent: 13.3 - mmap [ 0, 1000] PASSED. Parent: 13.4 - F_TLOCK [ ffe, ENDING] PASSED. Test #14 - Rate test performing I/O on unlocked and locked file. Parent: File Unlocked Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. 
Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Wrote and read 256 KB file 10 times; [7876.92 +/- 26.15 KB/s]. Parent: 14.0 - F_TLOCK [ 0, ENDING] PASSED. Parent: File Locked Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Wrote and read 256 KB file 10 times; [7641.79 +/- 16.75 KB/s]. Parent: 14.1 - F_ULOCK [ 0, ENDING] PASSED. Test #15 - Test 2nd open and I/O after lock and close. Parent: Second open succeeded. Parent: 15.0 - F_LOCK [ 0, ENDING] PASSED. Parent: 15.1 - F_ULOCK [ 0, ENDING] PASSED. Parent: Closed testfile. Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ]. Parent: Read 'abcdefghij' from testfile [ 0, 11 ]. Parent: 15.2 - COMPARE [ 0, b] PASSED. ** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total). ** CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total). Testing non-native 64 bit LFS locking Creating parent/child synchronization pipes. Test #1 - Test regions of an unlocked file. Parent: 1.1 - F_TEST [ 0, 1] PASSED. Parent: 1.2 - F_TEST [ 0, ENDING] PASSED. Parent: 1.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Parent: 1.4 - F_TEST [ 1, 1] PASSED. Parent: 1.5 - F_TEST [ 1, ENDING] PASSED. Parent: 1.6 - F_TEST [ 1,7fffffffffffffff] PASSED. Parent: 1.7 - F_TEST [7fffffffffffffff, 1] PASSED. Parent: 1.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Parent: 1.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Test #2 - Try to lock the whole file. Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED. Child: 2.1 - F_TEST [ 0, 1] PASSED. Child: 2.2 - F_TEST [ 0, ENDING] PASSED. Child: 2.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Child: 2.4 - F_TEST [ 1, 1] PASSED. Child: 2.5 - F_TEST [ 1, ENDING] PASSED. Child: 2.6 - F_TEST [ 1,7fffffffffffffff] PASSED. 
Child: 2.7 - F_TEST [7fffffffffffffff, 1] PASSED. Child: 2.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Child: 2.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Parent: 2.10 - F_ULOCK [ 0, ENDING] PASSED. Test #3 - Try to lock just the 1st byte. Parent: 3.0 - F_TLOCK [ 0, 1] PASSED. Child: 3.1 - F_TEST [ 0, 1] PASSED. Child: 3.2 - F_TEST [ 0, ENDING] PASSED. Child: 3.3 - F_TEST [ 1, 1] PASSED. Child: 3.4 - F_TEST [ 1, ENDING] PASSED. Parent: 3.5 - F_ULOCK [ 0, 1] PASSED. Test #4 - Try to lock the 2nd byte, test around it. Parent: 4.0 - F_TLOCK [ 1, 1] PASSED. Child: 4.1 - F_TEST [ 0, 1] PASSED. Child: 4.2 - F_TEST [ 0, 2] PASSED. Child: 4.3 - F_TEST [ 0, ENDING] PASSED. Child: 4.4 - F_TEST [ 1, 1] PASSED. Child: 4.5 - F_TEST [ 1, 2] PASSED. Child: 4.6 - F_TEST [ 1, ENDING] PASSED. Child: 4.7 - F_TEST [ 2, 1] PASSED. Child: 4.8 - F_TEST [ 2, 2] PASSED. Child: 4.9 - F_TEST [ 2, ENDING] PASSED. Parent: 4.10 - F_ULOCK [ 1, 1] PASSED. Test #5 - Try to lock 1st and 2nd bytes, test around them. Parent: 5.0 - F_TLOCK [ 0, 1] PASSED. Parent: 5.1 - F_TLOCK [ 2, 1] PASSED. Child: 5.2 - F_TEST [ 0, 1] PASSED. Child: 5.3 - F_TEST [ 0, 2] PASSED. Child: 5.4 - F_TEST [ 0, ENDING] PASSED. Child: 5.5 - F_TEST [ 1, 1] PASSED. Child: 5.6 - F_TEST [ 1, 2] PASSED. Child: 5.7 - F_TEST [ 1, ENDING] PASSED. Child: 5.8 - F_TEST [ 2, 1] PASSED. Child: 5.9 - F_TEST [ 2, 2] PASSED. Child: 5.10 - F_TEST [ 2, ENDING] PASSED. Child: 5.11 - F_TEST [ 3, 1] PASSED. Child: 5.12 - F_TEST [ 3, 2] PASSED. Child: 5.13 - F_TEST [ 3, ENDING] PASSED. Parent: 5.14 - F_ULOCK [ 0, 1] PASSED. Parent: 5.15 - F_ULOCK [ 2, 1] PASSED. Test #6 - Try to lock the MAXEOF byte. Parent: 6.0 - F_TLOCK [7fffffffffffffff, 1] PASSED. Child: 6.1 - F_TEST [7ffffffffffffffe, 1] PASSED. Child: 6.2 - F_TEST [7ffffffffffffffe, 2] PASSED. Child: 6.3 - F_TEST [7ffffffffffffffe, ENDING] PASSED. Child: 6.4 - F_TEST [7fffffffffffffff, 1] PASSED. Child: 6.5 - F_TEST [7fffffffffffffff, 2] PASSED. 
Child: 6.6 - F_TEST [7fffffffffffffff, ENDING] PASSED. Child: 6.7 - F_TEST [8000000000000000, ENDING] PASSED. Child: 6.8 - F_TEST [8000000000000000, 1] PASSED. Child: 6.9 - F_TEST [8000000000000000,7fffffffffffffff] PASSED. Child: 6.10 - F_TEST [8000000000000000,8000000000000000] PASSED. Parent: 6.11 - F_ULOCK [7fffffffffffffff, 1] PASSED. Test #7 - Test parent/child mutual exclusion. Parent: 7.0 - F_TLOCK [ ffc, 9] PASSED. Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ]. Parent: Now free child to run, should block on lock. Parent: Check data in file to insure child blocked. Parent: Read 'aaaa eh' from testfile [ 4092, 7 ]. Parent: 7.1 - COMPARE [ ffc, 7] PASSED. Parent: Now unlock region so child will unblock. Parent: 7.2 - F_ULOCK [ ffc, 9] PASSED. Child: 7.3 - F_LOCK [ ffc, 9] PASSED. Parent: Now try to regain lock, parent should block. Child: Write child's version of the data and release lock. Child: Wrote 'bebebebeb' to testfile [ 4092, 9 ]. Child: 7.4 - F_ULOCK [ ffc, 9] PASSED. Parent: 7.5 - F_LOCK [ ffc, 9] PASSED. Parent: Check data in file to insure child unblocked. Parent: Read 'bebebebeb' from testfile [ 4092, 9 ]. Parent: 7.6 - COMPARE [ ffc, 9] PASSED. Parent: 7.7 - F_ULOCK [ ffc, 9] PASSED. Test #10 - Make sure a locked region is split properly. Parent: 10.0 - F_TLOCK [ 0, 3] PASSED. Parent: 10.1 - F_ULOCK [ 1, 1] PASSED. Child: 10.2 - F_TEST [ 0, 1] PASSED. Child: 10.3 - F_TEST [ 2, 1] PASSED. Child: 10.4 - F_TEST [ 3, ENDING] PASSED. Child: 10.5 - F_TEST [ 1, 1] PASSED. Parent: 10.6 - F_ULOCK [ 0, 1] PASSED. Parent: 10.7 - F_ULOCK [ 2, 1] PASSED. Child: 10.8 - F_TEST [ 0, 3] PASSED. Parent: 10.9 - F_ULOCK [ 0, 1] PASSED. Parent: 10.10 - F_TLOCK [ 1, 3] PASSED. Parent: 10.11 - F_ULOCK [ 2, 1] PASSED. Child: 10.12 - F_TEST [ 1, 1] PASSED. Child: 10.13 - F_TEST [ 3, 1] PASSED. Child: 10.14 - F_TEST [ 4, ENDING] PASSED. Child: 10.15 - F_TEST [ 2, 1] PASSED. Child: 10.16 - F_TEST [ 0, 1] PASSED. Test #11 - Make sure close() releases the process's locks. 
Parent: 11.0 - F_TLOCK [ 0, ENDING] PASSED. Parent: Closed testfile. Child: 11.1 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.2 - F_ULOCK [ 0, ENDING] PASSED. Parent: 11.3 - F_TLOCK [ 1d, 5b7] PASSED. Parent: 11.4 - F_TLOCK [ 2000, 57] PASSED. Parent: Closed testfile. Child: 11.5 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.6 - F_ULOCK [ 0, ENDING] PASSED. Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ]. Parent: 11.7 - F_TLOCK [ 0, ENDING] PASSED. Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ]. Parent: Closed testfile. Child: 11.8 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.9 - F_ULOCK [ 0, ENDING] PASSED. Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ]. Parent: 11.10 - F_TLOCK [ 0, ENDING] PASSED. Parent: Truncated testfile. Parent: Closed testfile. Child: 11.11 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.12 - F_ULOCK [ 0, ENDING] PASSED. Test #12 - Signalled process should release locks. Child: 12.0 - F_TLOCK [ 0, ENDING] PASSED. Parent: Killed child process. Parent: 12.1 - F_TLOCK [ 0, ENDING] PASSED. Test #13 - Check locking and mmap semantics. Parent: 13.0 - F_TLOCK [ ffe, ENDING] PASSED. Parent: 13.1 - mmap [ 0, 1000] WARNING! Parent: **** Expected EAGAIN, returned success... Parent: 13.2 - F_ULOCK [ 0, ENDING] PASSED. Parent: unmap testfile. Parent: 13.3 - mmap [ 0, 1000] PASSED. Parent: 13.4 - F_TLOCK [ ffe, ENDING] PASSED. Test #14 - Rate test performing I/O on unlocked and locked file. Parent: File Unlocked Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Wrote and read 256 KB file 10 times; [17655.17 +/- 90.20 KB/s]. Parent: 14.0 - F_TLOCK [ 0, ENDING] PASSED. Parent: File Locked Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. 
Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Truncated testfile. Parent: Wrote and read 256 KB file 10 times; [16516.13 +/- 69.39 KB/s]. Parent: 14.1 - F_ULOCK [ 0, ENDING] PASSED. Test #15 - Test 2nd open and I/O after lock and close. Parent: Second open succeeded. Parent: 15.0 - F_LOCK [ 0, ENDING] PASSED. Parent: 15.1 - F_ULOCK [ 0, ENDING] PASSED. Parent: Closed testfile. Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ]. Parent: Read 'abcdefghij' from testfile [ 0, 11 ]. Parent: 15.2 - COMPARE [ 0, b] PASSED. ** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total). ** CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total). Congratulations, you passed the locking tests! All tests completed [root@dhcp42-82 cthon04]#

Please reproduce the issue and collect the logs below while doing failover:
1) On all the nodes: "ls -ld /var/lib/nfs"
2) On the node where NFS-Ganesha is killed: "ls /var/lib/nfs/ganesha/v4recov"
3) "tail -f /var/log/ganesha.log" on the node to which the ganesha service would have failed over.

Hi, I tried this again and I was able to see the error:

Test #7 - Test parent/child mutual exclusion.
Parent: 7.0 - F_TLOCK [ ffc, 9] PASSED.
Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
Parent: Now free child to run, should block on lock.
Parent: Check data in file to insure child blocked.
tlock: testfile read: Input/output error
Child: 7.3 - F_LOCK [ ffc, 9] PASSED.
Child: Write child's version of the data and release lock.
** PARENT pass 1 results: Child: Wrote 'bebebebeb' to testfile [ 4092, 9 ].
22/22 pass, 0/0 warn, 0/0 fail (pass/total).
** CHILD pass 1 results: 45/45 pass, 0/0 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt mounted
real 1m24.708s
user 0m0.018s
sys 0m0.082s

I made two attempts to reproduce the issue. In the first attempt, the cthon lock test failed with EIO, but no NFS grace related lock was mentioned in the ganesha.log file. In the second attempt, the cthon lock test failed with EIO, and an NFS grace related lock was mentioned in the ganesha.log file.

Created attachment 1042215: packet trace from client
Created attachment 1042216: nfs13 messages
Created attachment 1042217: nfs13 ganesha.log
I do not see any error cases in the pkt_trace, nor any issue reported in the logs. However, from the pkt_trace it seems strange that the client, after reboot, has not requested OPEN with reclaim set. I suspect this has something to do with the cthon tool or the NFS client. What is the client version? As discussed, please re-run the test on a clean mount. Along with the above logs and pkt_trace, please capture strace output too. Thanks!

Should the strace output be for the cthon lock test or for some other process? Can you be specific?

This issue is intermittent for me: in the most recent attempts to reproduce it, I did not see the failure. Sharing the latest packet trace, as this is a success case (cthon completed without failure).

Created attachment 1042286: second packet trace from client
Created attachment 1042289: nfs14 ganesha.log
Created attachment 1042291: nfs14 messages
From the pkt trace of the failed run: after failover, the server sent NFS4ERR_EXPIRED to the client instead of NFS4ERR_STALE_CLIENTID/STATEID. This is most likely because both nfs-ganesha servers (the one which failed and the one which took over the VIP) may have had the same epoch value. From discussions with the nfs-ganesha community, it has been confirmed that for nfs-ganesha servers to run in a cluster, the epoch value on each system should be different. To confirm this, I request Saurabh to set a different epoch value on each system before bringing up the nfs-ganesha cluster, and to re-check whether we hit this issue again. Thanks!

This works with different epoch values set manually on the NFS-Ganesha servers. To make the change automatic, I have opened a new bug: https://bugzilla.redhat.com/show_bug.cgi?id=1242358

Doc text is edited. Please sign off to be included in Known Issues.

By mistake, this bug may have been marked as Bug_Fix instead of Known_issue. Corrected the same. Please verify the doc text.

Please update the doc text. Thanks!

Please review the edited doc text and sign off to include it in the Known Issues chapter.

Doc text looks good to me.

BZ#1242358 fixes this issue as well. More details on the changes required are at https://bugzilla.redhat.com/show_bug.cgi?id=1242358#c5

Verified this bug with the latest glusterfs-3.7.9-6 and nfs-ganesha-2.3.1-7 builds. Below are the observations:

1. Create a 6x2 dist-rep volume and export it via ganesha.
2. Mount the volume with vers4 on the client.
3. Start the cthon lock test from the client:
[root@dhcp43-8 ~]# cd cthon04/
[root@dhcp43-8 cthon04]# ./server -l -o vers=4 -p /testvolume -m /mnt/nfs1 -N 1 10.70.44.92
4. Bring nfs-ganesha down on the mounted node using the command below:
[root@dhcp46-247 ~]# ps aux|grep ganesha
root 1371 0.0 0.0 107888 632 pts/0 S+ 11:49 0:00 tailf /var/log/ganesha.log
root 1949 0.0 0.0 112644 956 pts/1 S+ 11:51 0:00 grep --color=auto ganesha
root 30914 0.1 1.4 2467856 118880 ? Ssl 11:46 0:00 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -E 6290732818990563328
[root@dhcp46-247 ~]# kill -s TERM 30914
5. Observe that when the failover happens, the cthon lock test stops during the grace period and completes without any issues after the grace period.

>>>>> pcs status
[root@dhcp46-247 ~]# pcs status
Cluster name: G1464676674.54
Last updated: Tue May 31 11:53:47 2016
Last change: Tue May 31 11:53:40 2016 by hacluster via crmd on dhcp47-139.lab.eng.blr.redhat.com
Stack: corosync
Current DC: dhcp47-139.lab.eng.blr.redhat.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
4 nodes and 16 resources configured
Online: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-247.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]
Full list of resources:
Clone Set: nfs_setup-clone [nfs_setup]
Started: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-247.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]
Clone Set: nfs-mon-clone [nfs-mon]
Started: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-247.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]
Clone Set: nfs-grace-clone [nfs-grace]
Started: [ dhcp46-202.lab.eng.blr.redhat.com dhcp46-26.lab.eng.blr.redhat.com dhcp47-139.lab.eng.blr.redhat.com ]
Stopped: [ dhcp46-247.lab.eng.blr.redhat.com ]
dhcp46-247.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-202.lab.eng.blr.redhat.com
dhcp46-26.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-26.lab.eng.blr.redhat.com
dhcp47-139.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp47-139.lab.eng.blr.redhat.com
dhcp46-202.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-202.lab.eng.blr.redhat.com
PCSD Status:
dhcp46-247.lab.eng.blr.redhat.com: Online
dhcp46-26.lab.eng.blr.redhat.com: Online
dhcp47-139.lab.eng.blr.redhat.com: Online
dhcp46-202.lab.eng.blr.redhat.com: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/disabled

>>>>> cthon lock test completes without any issues after failover
[root@dhcp43-8 cthon04]# ./server -l -o vers=4 -p /testvolume -m /mnt/nfs1 -N 1 10.70.44.92
sh ./runtests -l -t /mnt/nfs1/dhcp43-8.test
Starting LOCKING tests: test directory /mnt/nfs1/dhcp43-8.test (arg: -t)
Testing native post-LFS locking Creating parent/child synchronization pipes. Test #1 - Test regions of an unlocked file. Parent: 1.1 - F_TEST [ 0, 1] PASSED. Parent: 1.2 - F_TEST [ 0, ENDING] PASSED. Parent: 1.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Parent: 1.4 - F_TEST [ 1, 1] PASSED. Parent: 1.5 - F_TEST [ 1, ENDING] PASSED. Parent: 1.6 - F_TEST [ 1,7fffffffffffffff] PASSED. Parent: 1.7 - F_TEST [7fffffffffffffff, 1] PASSED. Parent: 1.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Parent: 1.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Test #2 - Try to lock the whole file. Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED. Child: 2.1 - F_TEST [ 0, 1] PASSED. Child: 2.2 - F_TEST [ 0, ENDING] PASSED. Child: 2.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Child: 2.4 - F_TEST [ 1, 1] PASSED. Child: 2.5 - F_TEST [ 1, ENDING] PASSED. Child: 2.6 - F_TEST [ 1,7fffffffffffffff] PASSED. Child: 2.7 - F_TEST [7fffffffffffffff, 1] PASSED. Child: 2.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Child: 2.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Parent: 2.10 - F_ULOCK [ 0, ENDING] PASSED. Test #3 - Try to lock just the 1st byte. Parent: 3.0 - F_TLOCK [ 0, 1] PASSED. Child: 3.1 - F_TEST [ 0, 1] PASSED. Child: 3.2 - F_TEST [ 0, ENDING] PASSED.
Child: 3.3 - F_TEST [ 1, 1] PASSED. Child: 3.4 - F_TEST [ 1, ENDING] PASSED. Parent: 3.5 - F_ULOCK [ 0, 1] PASSED. Test #4 - Try to lock the 2nd byte, test around it. Parent: 4.0 - F_TLOCK [ 1, 1] PASSED. Child: 4.1 - F_TEST [ 0, 1] PASSED. Child: 4.2 - F_TEST [ 0, 2] PASSED. Child: 4.3 - F_TEST [ 0, ENDING] PASSED. Child: 4.4 - F_TEST [ 1, 1] PASSED. Child: 4.5 - F_TEST [ 1, 2] PASSED. Child: 4.6 - F_TEST [ 1, ENDING] PASSED. Child: 4.7 - F_TEST [ 2, 1] PASSED. Child: 4.8 - F_TEST [ 2, 2] PASSED. Child: 4.9 - F_TEST [ 2, ENDING] PASSED. Parent: 4.10 - F_ULOCK [ 1, 1] PASSED. Test #5 - Try to lock 1st and 2nd bytes, test around them. Parent: 5.0 - F_TLOCK [ 0, 1] PASSED. Parent: 5.1 - F_TLOCK [ 2, 1] PASSED. Child: 5.2 - F_TEST [ 0, 1] PASSED. Child: 5.3 - F_TEST [ 0, 2] PASSED. Child: 5.4 - F_TEST [ 0, ENDING] PASSED. Child: 5.5 - F_TEST [ 1, 1] PASSED. Child: 5.6 - F_TEST [ 1, 2] PASSED. Child: 5.7 - F_TEST [ 1, ENDING] PASSED. Child: 5.8 - F_TEST [ 2, 1] PASSED. Child: 5.9 - F_TEST [ 2, 2] PASSED. Child: 5.10 - F_TEST [ 2, ENDING] PASSED. Child: 5.11 - F_TEST [ 3, 1] PASSED. Child: 5.12 - F_TEST [ 3, 2] PASSED. Child: 5.13 - F_TEST [ 3, ENDING] PASSED. Parent: 5.14 - F_ULOCK [ 0, 1] PASSED. Parent: 5.15 - F_ULOCK [ 2, 1] PASSED. Test #6 - Try to lock the MAXEOF byte. Parent: 6.0 - F_TLOCK [7fffffffffffffff, 1] PASSED. Child: 6.1 - F_TEST [7ffffffffffffffe, 1] PASSED. Child: 6.2 - F_TEST [7ffffffffffffffe, 2] PASSED. Child: 6.3 - F_TEST [7ffffffffffffffe, ENDING] PASSED. Child: 6.4 - F_TEST [7fffffffffffffff, 1] PASSED. Child: 6.5 - F_TEST [7fffffffffffffff, 2] PASSED. Child: 6.6 - F_TEST [7fffffffffffffff, ENDING] PASSED. Child: 6.7 - F_TEST [8000000000000000, ENDING] PASSED. Child: 6.8 - F_TEST [8000000000000000, 1] PASSED. Child: 6.9 - F_TEST [8000000000000000,7fffffffffffffff] PASSED. Child: 6.10 - F_TEST [8000000000000000,8000000000000000] PASSED. Parent: 6.11 - F_ULOCK [7fffffffffffffff, 1] PASSED. Test #7 - Test parent/child mutual exclusion. 
Parent: 7.0 - F_TLOCK [ ffc, 9] PASSED. Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ]. Parent: Now free child to run, should block on lock. Parent: Check data in file to insure child blocked. Parent: Read 'aaaa eh' from testfile [ 4092, 7 ]. Parent: 7.1 - COMPARE [ ffc, 7] PASSED. Parent: Now unlock region so child will unblock. Parent: 7.2 - F_ULOCK [ ffc, 9] PASSED. Child: 7.3 - F_LOCK [ ffc, 9] PASSED. Child: Write child's version of the data and release lock. Parent: Now try to regain lock, parent should block. Child: Wrote 'bebebebeb' to testfile [ 4092, 9 ]. Child: 7.4 - F_ULOCK [ ffc, 9] PASSED. Parent: 7.5 - F_LOCK [ ffc, 9] PASSED. Parent: Check data in file to insure child unblocked. Parent: Read 'bebebebeb' from testfile [ 4092, 9 ]. Parent: 7.6 - COMPARE [ ffc, 9] PASSED. Parent: 7.7 - F_ULOCK [ ffc, 9] PASSED. Test #8 - Rate test performing lock/unlock cycles. Parent: Performed 1000 lock/unlock cycles in 1620 msecs. [74074 lpm]. Test #10 - Make sure a locked region is split properly. Parent: 10.0 - F_TLOCK [ 0, 3] PASSED. Parent: 10.1 - F_ULOCK [ 1, 1] PASSED. Child: 10.2 - F_TEST [ 0, 1] PASSED. Child: 10.3 - F_TEST [ 2, 1] PASSED. Child: 10.4 - F_TEST [ 3, ENDING] PASSED. Child: 10.5 - F_TEST [ 1, 1] PASSED. Parent: 10.6 - F_ULOCK [ 0, 1] PASSED. Parent: 10.7 - F_ULOCK [ 2, 1] PASSED. Child: 10.8 - F_TEST [ 0, 3] PASSED. Parent: 10.9 - F_ULOCK [ 0, 1] PASSED. Parent: 10.10 - F_TLOCK [ 1, 3] PASSED. Parent: 10.11 - F_ULOCK [ 2, 1] PASSED. Child: 10.12 - F_TEST [ 1, 1] PASSED. Child: 10.13 - F_TEST [ 3, 1] PASSED. Child: 10.14 - F_TEST [ 4, ENDING] PASSED. Child: 10.15 - F_TEST [ 2, 1] PASSED. Child: 10.16 - F_TEST [ 0, 1] PASSED. Test #11 - Make sure close() releases the process's locks. Parent: 11.0 - F_TLOCK [ 0, ENDING] PASSED. Parent: Closed testfile. Child: 11.1 - F_TLOCK [ 0, ENDING] PASSED. Child: 11.2 - F_ULOCK [ 0, ENDING] PASSED. Parent: 11.3 - F_TLOCK [ 1d, 5b7] PASSED. Parent: 11.4 - F_TLOCK [ 2000, 57] PASSED. Parent: Closed testfile. 
Child: 11.5 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.6 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
Parent: 11.7 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ].
Parent: Closed testfile.
Child: 11.8 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.9 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
Parent: 11.10 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Truncated testfile.
Parent: Closed testfile.
Child: 11.11 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.12 - F_ULOCK [ 0, ENDING] PASSED.

Test #12 - Signalled process should release locks.
Child: 12.0 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Killed child process.
Parent: 12.1 - F_TLOCK [ 0, ENDING] PASSED.

Test #13 - Check locking and mmap semantics.
Parent: 13.0 - F_TLOCK [ ffe, ENDING] PASSED.
Parent: 13.1 - mmap [ 0, 1000] WARNING!
Parent: **** Expected EAGAIN, returned success...
Parent: 13.2 - F_ULOCK [ 0, ENDING] PASSED.
Parent: unmap testfile.
Parent: 13.3 - mmap [ 0, 1000] PASSED.
Parent: 13.4 - F_TLOCK [ ffe, ENDING] PASSED.
Parent: unmap testfile.

Test #14 - Rate test performing I/O on unlocked and locked file.
Parent: File Unlocked
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Wrote and read 256 KB file 10 times; [12800.00 +/- 76.80 KB/s].
Parent: 14.0 - F_TLOCK [ 0, ENDING] PASSED.
Parent: File Locked
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Wrote and read 256 KB file 10 times; [11906.98 +/- 105.99 KB/s].
Parent: 14.1 - F_ULOCK [ 0, ENDING] PASSED.

Test #15 - Test 2nd open and I/O after lock and close.
Parent: Second open succeeded.
Parent: 15.0 - F_LOCK [ 0, ENDING] PASSED.
Parent: 15.1 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Closed testfile.
Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ].
Parent: Read 'abcdefghij' from testfile [ 0, 11 ].
Parent: 15.2 - COMPARE [ 0, b] PASSED.

** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total).
** CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).

Testing non-native 64 bit LFS locking
Creating parent/child synchronization pipes.

Test #1 - Test regions of an unlocked file.
Parent: 1.1 - F_TEST [ 0, 1] PASSED.
Parent: 1.2 - F_TEST [ 0, ENDING] PASSED.
Parent: 1.3 - F_TEST [ 0,7fffffffffffffff] PASSED.
Parent: 1.4 - F_TEST [ 1, 1] PASSED.
Parent: 1.5 - F_TEST [ 1, ENDING] PASSED.
Parent: 1.6 - F_TEST [ 1,7fffffffffffffff] PASSED.
Parent: 1.7 - F_TEST [7fffffffffffffff, 1] PASSED.
Parent: 1.8 - F_TEST [7fffffffffffffff, ENDING] PASSED.
Parent: 1.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED.

Test #2 - Try to lock the whole file.
Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED.
Child: 2.1 - F_TEST [ 0, 1] PASSED.
Child: 2.2 - F_TEST [ 0, ENDING] PASSED.
Child: 2.3 - F_TEST [ 0,7fffffffffffffff] PASSED.
Child: 2.4 - F_TEST [ 1, 1] PASSED.
Child: 2.5 - F_TEST [ 1, ENDING] PASSED.
Child: 2.6 - F_TEST [ 1,7fffffffffffffff] PASSED.
Child: 2.7 - F_TEST [7fffffffffffffff, 1] PASSED.
Child: 2.8 - F_TEST [7fffffffffffffff, ENDING] PASSED.
Child: 2.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED.
Parent: 2.10 - F_ULOCK [ 0, ENDING] PASSED.

Test #3 - Try to lock just the 1st byte.
Parent: 3.0 - F_TLOCK [ 0, 1] PASSED.
Child: 3.1 - F_TEST [ 0, 1] PASSED.
Child: 3.2 - F_TEST [ 0, ENDING] PASSED.
Child: 3.3 - F_TEST [ 1, 1] PASSED.
Child: 3.4 - F_TEST [ 1, ENDING] PASSED.
Parent: 3.5 - F_ULOCK [ 0, 1] PASSED.

Test #4 - Try to lock the 2nd byte, test around it.
Parent: 4.0 - F_TLOCK [ 1, 1] PASSED.
Child: 4.1 - F_TEST [ 0, 1] PASSED.
Child: 4.2 - F_TEST [ 0, 2] PASSED.
Child: 4.3 - F_TEST [ 0, ENDING] PASSED.
Child: 4.4 - F_TEST [ 1, 1] PASSED.
Child: 4.5 - F_TEST [ 1, 2] PASSED.
Child: 4.6 - F_TEST [ 1, ENDING] PASSED.
Child: 4.7 - F_TEST [ 2, 1] PASSED.
Child: 4.8 - F_TEST [ 2, 2] PASSED.
Child: 4.9 - F_TEST [ 2, ENDING] PASSED.
Parent: 4.10 - F_ULOCK [ 1, 1] PASSED.

Test #5 - Try to lock 1st and 2nd bytes, test around them.
Parent: 5.0 - F_TLOCK [ 0, 1] PASSED.
Parent: 5.1 - F_TLOCK [ 2, 1] PASSED.
Child: 5.2 - F_TEST [ 0, 1] PASSED.
Child: 5.3 - F_TEST [ 0, 2] PASSED.
Child: 5.4 - F_TEST [ 0, ENDING] PASSED.
Child: 5.5 - F_TEST [ 1, 1] PASSED.
Child: 5.6 - F_TEST [ 1, 2] PASSED.
Child: 5.7 - F_TEST [ 1, ENDING] PASSED.
Child: 5.8 - F_TEST [ 2, 1] PASSED.
Child: 5.9 - F_TEST [ 2, 2] PASSED.
Child: 5.10 - F_TEST [ 2, ENDING] PASSED.
Child: 5.11 - F_TEST [ 3, 1] PASSED.
Child: 5.12 - F_TEST [ 3, 2] PASSED.
Child: 5.13 - F_TEST [ 3, ENDING] PASSED.
Parent: 5.14 - F_ULOCK [ 0, 1] PASSED.
Parent: 5.15 - F_ULOCK [ 2, 1] PASSED.

Test #6 - Try to lock the MAXEOF byte.
Parent: 6.0 - F_TLOCK [7fffffffffffffff, 1] PASSED.
Child: 6.1 - F_TEST [7ffffffffffffffe, 1] PASSED.
Child: 6.2 - F_TEST [7ffffffffffffffe, 2] PASSED.
Child: 6.3 - F_TEST [7ffffffffffffffe, ENDING] PASSED.
Child: 6.4 - F_TEST [7fffffffffffffff, 1] PASSED.
Child: 6.5 - F_TEST [7fffffffffffffff, 2] PASSED.
Child: 6.6 - F_TEST [7fffffffffffffff, ENDING] PASSED.
Child: 6.7 - F_TEST [8000000000000000, ENDING] PASSED.
Child: 6.8 - F_TEST [8000000000000000, 1] PASSED.
Child: 6.9 - F_TEST [8000000000000000,7fffffffffffffff] PASSED.
Child: 6.10 - F_TEST [8000000000000000,8000000000000000] PASSED.
Parent: 6.11 - F_ULOCK [7fffffffffffffff, 1] PASSED.

Test #7 - Test parent/child mutual exclusion.
Parent: 7.0 - F_TLOCK [ ffc, 9] PASSED.
Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
Parent: Now free child to run, should block on lock.
Parent: Check data in file to insure child blocked.
Parent: Read 'aaaa eh' from testfile [ 4092, 7 ].
Parent: 7.1 - COMPARE [ ffc, 7] PASSED.
Parent: Now unlock region so child will unblock.
Parent: 7.2 - F_ULOCK [ ffc, 9] PASSED.
Child: 7.3 - F_LOCK [ ffc, 9] PASSED.
Parent: Now try to regain lock, parent should block.
Child: Write child's version of the data and release lock.
Child: Wrote 'bebebebeb' to testfile [ 4092, 9 ].
Child: 7.4 - F_ULOCK [ ffc, 9] PASSED.
Parent: 7.5 - F_LOCK [ ffc, 9] PASSED.
Parent: Check data in file to insure child unblocked.
Parent: Read 'bebebebeb' from testfile [ 4092, 9 ].
Parent: 7.6 - COMPARE [ ffc, 9] PASSED.
Parent: 7.7 - F_ULOCK [ ffc, 9] PASSED.

Test #8 - Rate test performing lock/unlock cycles.
Parent: Performed 1000 lock/unlock cycles in 1810 msecs. [66298 lpm].

Test #10 - Make sure a locked region is split properly.
Parent: 10.0 - F_TLOCK [ 0, 3] PASSED.
Parent: 10.1 - F_ULOCK [ 1, 1] PASSED.
Child: 10.2 - F_TEST [ 0, 1] PASSED.
Child: 10.3 - F_TEST [ 2, 1] PASSED.
Child: 10.4 - F_TEST [ 3, ENDING] PASSED.
Child: 10.5 - F_TEST [ 1, 1] PASSED.
Parent: 10.6 - F_ULOCK [ 0, 1] PASSED.
Parent: 10.7 - F_ULOCK [ 2, 1] PASSED.
Child: 10.8 - F_TEST [ 0, 3] PASSED.
Parent: 10.9 - F_ULOCK [ 0, 1] PASSED.
Parent: 10.10 - F_TLOCK [ 1, 3] PASSED.
Parent: 10.11 - F_ULOCK [ 2, 1] PASSED.
Child: 10.12 - F_TEST [ 1, 1] PASSED.
Child: 10.13 - F_TEST [ 3, 1] PASSED.
Child: 10.14 - F_TEST [ 4, ENDING] PASSED.
Child: 10.15 - F_TEST [ 2, 1] PASSED.
Child: 10.16 - F_TEST [ 0, 1] PASSED.

Test #11 - Make sure close() releases the process's locks.
Parent: 11.0 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Closed testfile.
Child: 11.1 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.2 - F_ULOCK [ 0, ENDING] PASSED.
Parent: 11.3 - F_TLOCK [ 1d, 5b7] PASSED.
Parent: 11.4 - F_TLOCK [ 2000, 57] PASSED.
Parent: Closed testfile.
Child: 11.5 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.6 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
Parent: 11.7 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ].
Parent: Closed testfile.
Child: 11.8 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.9 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
Parent: 11.10 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Truncated testfile.
Parent: Closed testfile.
Child: 11.11 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.12 - F_ULOCK [ 0, ENDING] PASSED.

Test #12 - Signalled process should release locks.
Child: 12.0 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Killed child process.
Parent: 12.1 - F_TLOCK [ 0, ENDING] PASSED.

Test #13 - Check locking and mmap semantics.
Parent: 13.0 - F_TLOCK [ ffe, ENDING] PASSED.
Parent: 13.1 - mmap [ 0, 1000] WARNING!
Parent: **** Expected EAGAIN, returned success...
Parent: 13.2 - F_ULOCK [ 0, ENDING] PASSED.
Parent: unmap testfile.
Parent: 13.3 - mmap [ 0, 1000] PASSED.
Parent: 13.4 - F_TLOCK [ ffe, ENDING] PASSED.
Parent: unmap testfile.

Test #14 - Rate test performing I/O on unlocked and locked file.
Parent: File Unlocked
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Wrote and read 256 KB file 10 times; [13837.84 +/- 105.99 KB/s].
Parent: 14.0 - F_TLOCK [ 0, ENDING] PASSED.
Parent: File Locked
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Truncated testfile.
Parent: Wrote and read 256 KB file 10 times; [11636.36 +/- 37.93 KB/s].
Parent: 14.1 - F_ULOCK [ 0, ENDING] PASSED.

Test #15 - Test 2nd open and I/O after lock and close.
Parent: Second open succeeded.
Parent: 15.0 - F_LOCK [ 0, ENDING] PASSED.
Parent: 15.1 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Closed testfile.
Parent: Wrote 'abcdefghij' to testfile [ 0, 11 ].
Parent: Read 'abcdefghij' from testfile [ 0, 11 ].
Parent: 15.2 - COMPARE [ 0, b] PASSED.

** PARENT pass 1 results: 49/49 pass, 1/1 warn, 0/0 fail (pass/total).
** CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).
Congratulations, you passed the locking tests!

All tests completed
[root@dhcp43-8 cthon04]#
************************************************************************

Based on the above observation, marking this bug as Verified.

Soumya, could you review and sign off on the edited doc text?

Doc text looks good to me.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1247
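
For anyone reading the cthon output above without the test source at hand, the F_TLOCK / F_TEST / F_ULOCK entries correspond to POSIX `lockf()` commands applied to a byte range. The following is only a minimal sketch of the pattern in Test #3 (parent locks the first byte, child probes around it), written with Python's `os.lockf`; it is not the cthon04 code, it runs against a local temporary file rather than an NFS mount, and the helper names `probe_from_child` and `demo` are made up for illustration.

```python
import os
import tempfile


def probe_from_child(fd, offset, length):
    """Fork a child that runs F_TEST on [offset, offset + length).

    Returns True if the child found the region locked by another process.
    (Hypothetical helper, not part of cthon04.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: record locks are per process and are not inherited across
        # fork, so the parent's lock is a foreign lock from here.
        try:
            os.lseek(fd, offset, os.SEEK_SET)
            os.lockf(fd, os.F_TEST, length)
            os._exit(0)  # region is free
        except OSError:
            os._exit(1)  # region is held by the parent
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1


def demo():
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.write(fd, b"x" * 16)

        # Comparable to "Parent: 3.0 - F_TLOCK [ 0, 1]": lock only byte 0.
        os.lseek(fd, 0, os.SEEK_SET)
        os.lockf(fd, os.F_TLOCK, 1)

        results = {
            "byte0_locked": probe_from_child(fd, 0, 1),
            "byte1_locked": probe_from_child(fd, 1, 1),
        }

        # Comparable to "Parent: 3.5 - F_ULOCK [ 0, 1]": release byte 0.
        # The child shares the file offset, so seek again before unlocking.
        os.lseek(fd, 0, os.SEEK_SET)
        os.lockf(fd, os.F_ULOCK, 1)
        results["byte0_after_unlock"] = probe_from_child(fd, 0, 1)
        return results


if __name__ == "__main__":
    print(demo())
```

In the cthon log a line is marked PASSED when the observed result matches the expected one, so a child F_TEST on a region the parent holds passes by failing to acquire it. Over NFSv4 these calls translate into lock state on the server, which is the state a client has to reclaim after failover.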