Bug 434966 - [RHEL4 U7] Kernel NFS Connectathon Test#12, 12.1 Failing
[RHEL4 U7] Kernel NFS Connectathon Test#12, 12.1 Failing
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel (Show other bugs)
4.7
All Linux
low Severity medium
: rc
: ---
Assigned To: Peter Staubach
Martin Jenner
http://rhts.redhat.com/cgi-bin/rhts/t...
:
Depends On:
Blocks: 448929 461304
  Show dependency treegraph
 
Reported: 2008-02-26 10:41 EST by Jeff Burke
Modified: 2009-01-12 10:38 EST (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-01-12 10:38:51 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Jeff Burke 2008-02-26 10:41:53 EST
Description of problem:
Test #12 - Signalled process should release locks.
	Parent: Killed child process.
	Parent: 12.1  - F_TLOCK [       0,  ENDING] FAILED!
	Parent: **** Expected success, returned EAGAIN...
	Parent: **** Probably implementation error.

Version-Release number of selected component (if applicable):
2.6.9-68.13.EL

How reproducible:
 Intermittent 

Steps to Reproduce:
1. Install RHEL4-U6 Baseline. 
2. Install 2.6.9-68.13.EL kernel
3. Run the RTHS Connectathon test
  
Actual results:
Capturing on eth0
===== Starting 'nfsvers=2_udp' test 'lock' =====
----- start: Mon Feb 25 19:55:20 EST 2008 -----
./server -l -F nfs -onfsvers=2,udp -p /export/home rhel5-nfs
Start tests on path /mnt/rhel5-nfs/hp-xw4200-01.test [y/n]? 
sh ./runtests  -l -t /mnt/rhel5-nfs/hp-xw4200-01.test

Starting LOCKING tests: test directory /mnt/rhel5-nfs/hp-xw4200-01.test (arg: -t)

Testing native post-LFS locking

Creating parent/child synchronization pipes.

Test #1 - Test regions of an unlocked file.
	Parent: 1.1  - F_TEST  [       0,       1] PASSED.
	Parent: 1.2  - F_TEST  [       0,  ENDING] PASSED.
	Parent: 1.3  - F_TEST  [       0,7fffffff] PASSED.
	Parent: 1.4  - F_TEST  [       1,       1] PASSED.
	Parent: 1.5  - F_TEST  [       1,  ENDING] PASSED.
	Parent: 1.6  - F_TEST  [       1,7fffffff] PASSED.
	Parent: 1.7  - F_TEST  [7fffffff,       1] PASSED.
	Parent: 1.8  - F_TEST  [7fffffff,  ENDING] PASSED.
	Parent: 1.9  - F_TEST  [7fffffff,7fffffff] PASSED.

Test #2 - Try to lock the whole file.
	Parent: 2.0  - F_TLOCK [       0,  ENDING] PASSED.
	Child:  2.1  - F_TEST  [       0,       1] PASSED.
	Child:  2.2  - F_TEST  [       0,  ENDING] PASSED.
	Child:  2.3  - F_TEST  [       0,7fffffff] PASSED.
	Child:  2.4  - F_TEST  [       1,       1] PASSED.
	Child:  2.5  - F_TEST  [       1,  ENDING] PASSED.
	Child:  2.6  - F_TEST  [       1,7fffffff] PASSED.
	Child:  2.7  - F_TEST  [7fffffff,       1] PASSED.
	Child:  2.8  - F_TEST  [7fffffff,  ENDING] PASSED.
	Child:  2.9  - F_TEST  [7fffffff,7fffffff] PASSED.
	Parent: 2.10 - F_ULOCK [       0,  ENDING] PASSED.

Test #3 - Try to lock just the 1st byte.
	Parent: 3.0  - F_TLOCK [       0,       1] PASSED.
	Child:  3.1  - F_TEST  [       0,       1] PASSED.
	Child:  3.2  - F_TEST  [       0,  ENDING] PASSED.
	Child:  3.3  - F_TEST  [       1,       1] PASSED.
	Child:  3.4  - F_TEST  [       1,  ENDING] PASSED.
	Parent: 3.5  - F_ULOCK [       0,       1] PASSED.

Test #4 - Try to lock the 2nd byte, test around it.
	Parent: 4.0  - F_TLOCK [       1,       1] PASSED.
	Child:  4.1  - F_TEST  [       0,       1] PASSED.
	Child:  4.2  - F_TEST  [       0,       2] PASSED.
	Child:  4.3  - F_TEST  [       0,  ENDING] PASSED.
	Child:  4.4  - F_TEST  [       1,       1] PASSED.
	Child:  4.5  - F_TEST  [       1,       2] PASSED.
	Child:  4.6  - F_TEST  [       1,  ENDING] PASSED.
	Child:  4.7  - F_TEST  [       2,       1] PASSED.
	Child:  4.8  - F_TEST  [       2,       2] PASSED.
	Child:  4.9  - F_TEST  [       2,  ENDING] PASSED.
	Parent: 4.10 - F_ULOCK [       1,       1] PASSED.

Test #5 - Try to lock 1st and 2nd bytes, test around them.
	Parent: 5.0  - F_TLOCK [       0,       1] PASSED.
	Parent: 5.1  - F_TLOCK [       2,       1] PASSED.
	Child:  5.2  - F_TEST  [       0,       1] PASSED.
	Child:  5.3  - F_TEST  [       0,       2] PASSED.
	Child:  5.4  - F_TEST  [       0,  ENDING] PASSED.
	Child:  5.5  - F_TEST  [       1,       1] PASSED.
	Child:  5.6  - F_TEST  [       1,       2] PASSED.
	Child:  5.7  - F_TEST  [       1,  ENDING] PASSED.
	Child:  5.8  - F_TEST  [       2,       1] PASSED.
	Child:  5.9  - F_TEST  [       2,       2] PASSED.
	Child:  5.10 - F_TEST  [       2,  ENDING] PASSED.
	Child:  5.11 - F_TEST  [       3,       1] PASSED.
	Child:  5.12 - F_TEST  [       3,       2] PASSED.
	Child:  5.13 - F_TEST  [       3,  ENDING] PASSED.
	Parent: 5.14 - F_ULOCK [       0,       1] PASSED.
	Parent: 5.15 - F_ULOCK [       2,       1] PASSED.

Test #6 - Try to lock the MAXEOF byte.
	Parent: 6.0  - F_TLOCK [7fffffff,       1] PASSED.
	Child:  6.1  - F_TEST  [7ffffffe,       1] PASSED.
	Child:  6.2  - F_TEST  [7ffffffe,       2] PASSED.
	Child:  6.3  - F_TEST  [7ffffffe,  ENDING] PASSED.
	Child:  6.4  - F_TEST  [7fffffff,       1] PASSED.
	Child:  6.5  - F_TEST  [7fffffff,       2] PASSED.
	Child:  6.6  - F_TEST  [7fffffff,  ENDING] PASSED.
	Child:  6.7  - F_TEST  [80000000,  ENDING] PASSED.
	Child:  6.8  - F_TEST  [80000000,       1] PASSED.
	Child:  6.9  - F_TEST  [80000000,7fffffff] PASSED.
	Child:  6.10 - F_TEST  [80000000,80000000] WARNING!
	Child:  **** Expected EINVAL, returned EACCES...
	Parent: 6.11 - F_ULOCK [7fffffff,       1] PASSED.

Test #7 - Test parent/child mutual exclusion.
	Parent: 7.0  - F_TLOCK [     ffc,       9] PASSED.
	Parent: Wrote 'aaaa eh' to testfile [ 4092, 7 ].
	Parent: Now free child to run, should block on lock.
	Parent: Check data in file to insure child blocked.
	Parent: Read 'aaaa eh' from testfile [ 4092, 7 ].
	Parent: 7.1  - COMPARE [     ffc,       7] PASSED.
	Parent: Now unlock region so child will unblock.
	Parent: 7.2  - F_ULOCK [     ffc,       9] PASSED.
	Child:  7.3  - F_LOCK  [     ffc,       9] PASSED.
	Child:  Write child's version of the data and release lock.
	Parent: Now try to regain lock, parent should block.
	Child:  Wrote 'bebebebeb' to testfile [ 4092, 9 ].
	Child:  7.4  - F_ULOCK [     ffc,       9] PASSED.
	Parent: 7.5  - F_LOCK  [     ffc,       9] PASSED.
	Parent: Check data in file to insure child unblocked.
	Parent: Read 'bebebebeb' from testfile [ 4092, 9 ].
	Parent: 7.6  - COMPARE [     ffc,       9] PASSED.
	Parent: 7.7  - F_ULOCK [     ffc,       9] PASSED.

Test #8 - Rate test performing lock/unlock cycles.
	Parent: Performed 1000 lock/unlock cycles in 1230 msecs. [97560 lpm].

Test #10 - Make sure a locked region is split properly.
	Parent: 10.0  - F_TLOCK [       0,       3] PASSED.
	Parent: 10.1  - F_ULOCK [       1,       1] PASSED.
	Child:  10.2  - F_TEST  [       0,       1] PASSED.
	Child:  10.3  - F_TEST  [       2,       1] PASSED.
	Child:  10.4  - F_TEST  [       3,  ENDING] PASSED.
	Child:  10.5  - F_TEST  [       1,       1] PASSED.
	Parent: 10.6  - F_ULOCK [       0,       1] PASSED.
	Parent: 10.7  - F_ULOCK [       2,       1] PASSED.
	Child:  10.8  - F_TEST  [       0,       3] PASSED.
	Parent: 10.9  - F_ULOCK [       0,       1] PASSED.
	Parent: 10.10 - F_TLOCK [       1,       3] PASSED.
	Parent: 10.11 - F_ULOCK [       2,       1] PASSED.
	Child:  10.12 - F_TEST  [       1,       1] PASSED.
	Child:  10.13 - F_TEST  [       3,       1] PASSED.
	Child:  10.14 - F_TEST  [       4,  ENDING] PASSED.
	Child:  10.15 - F_TEST  [       2,       1] PASSED.
	Child:  10.16 - F_TEST  [       0,       1] PASSED.

Test #11 - Make sure close() releases the process's locks.
	Parent: 11.0  - F_TLOCK [       0,  ENDING] PASSED.
	Parent: Closed testfile.
	Child:  11.1  - F_TLOCK [       0,  ENDING] PASSED.
	Child:  11.2  - F_ULOCK [       0,  ENDING] PASSED.
	Parent: 11.3  - F_TLOCK [      1d,     5b7] PASSED.
	Parent: 11.4  - F_TLOCK [    2000,      57] PASSED.
	Parent: Closed testfile.
	Child:  11.5  - F_TLOCK [       0,  ENDING] PASSED.
	Child:  11.6  - F_ULOCK [       0,  ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.7  - F_TLOCK [       0,  ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 13, 16 ].
	Parent: Closed testfile.
	Child:  11.8  - F_TLOCK [       0,  ENDING] PASSED.
	Child:  11.9  - F_ULOCK [       0,  ENDING] PASSED.
	Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
	Parent: 11.10 - F_TLOCK [       0,  ENDING] PASSED.
	Parent: Truncated testfile.
	Parent: Closed testfile.
	Child:  11.11 - F_TLOCK [       0,  ENDING] PASSED.
	Child:  11.12 - F_ULOCK [       0,  ENDING] PASSED.

Test #12 - Signalled process should release locks.
	Parent: Killed child process.
	Parent: 12.1  - F_TLOCK [       0,  ENDING] FAILED!
	Parent: **** Expected success, returned EAGAIN...
	Parent: **** Probably implementation error.

** PARENT pass 1 results: 39/39 pass, 0/0 warn, 1/1 fail (pass/total).

**  CHILD pass 1 results: 63/63 pass, 1/1 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt/rhel5-nfs mounted
----- end: Mon Feb 25 19:55:37 EST 2008 -----
----- return code: 1 -----
4451 packets captured

Expected results:
 This test should pass all the time

Additional info:
 We have seen this against several static server RHEL5, Sol9
Comment 1 Jeff Burke 2008-03-01 14:44:30 EST
Not sure if this requires a separate bug at this point in time. We are seeing
this issue with the latest RHEL5.2 kernel 2.6.9.18-84.el5

Test #12 - Signalled process should release locks.
	Child:  12.0  - F_TLOCK [       0,  ENDING] PASSED.
	Parent: Killed child process.
	Parent: 12.1  - F_TLOCK [       0,  ENDING] FAILED!
	Parent: **** Expected success, returned EAGAIN...
	Parent: **** Probably implementation error.

** PARENT pass 1 results: 39/39 pass, 0/0 warn, 1/1 fail (pass/total).

**  CHILD pass 1 results: 64/64 pass, 0/0 warn, 0/0 fail (pass/total).
lock tests failed
Tests failed, leaving /mnt/sol9-nfs mounted
----- end: Sat Mar  1 01:52:37 EST 2008 -----
----- return code: 1 -----
4407 packets captured

http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=2091722
Comment 2 Peter Staubach 2008-03-03 09:34:42 EST
I looked at the packet capture for this last failure.  It appears to me
that sol9 failed to respond for long enough that test12, which is very
timing dependent, failed.  I believe that the backoff delays caused the
child of the child process to delay dropping its lock long enough for
the parent process to fail to acquire the lock when it tried.

Perhaps running the lock tests with a longer wait time will help.
Otherwise, we could/should look at rewriting this test to eliminate
the timing dependencies.
Comment 6 RHEL Product and Program Management 2008-05-29 11:36:53 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 RHEL Product and Program Management 2008-09-03 09:16:05 EDT
Updating PM score.
Comment 9 RHEL Product and Program Management 2009-01-12 10:38:51 EST
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

Note You need to log in before you can comment on or make changes to this bug.