Red Hat Bugzilla – Bug 239691
connectathon lock/test11 fails on CIFS w/o unix extensions
Last modified: 2009-02-17 13:07:07 EST
Lock test11 in the connectathon suite fails if unix extensions are disabled.
I haven't fully investigated the cause yet, but here are the results:
# ./tlocklfs -t 11
Creating parent/child synchronization pipes.
Test #11 - Make sure close() releases the process's locks.
Parent: 11.0 - F_TLOCK [ 0, ENDING] PASSED.
Parent: Closed testfile.
Child: 11.1 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.2 - F_ULOCK [ 0, ENDING] PASSED.
Parent: 11.3 - F_TLOCK [ 1d, 5b7] PASSED.
Parent: 11.4 - F_TLOCK [ 2000, 57] PASSED.
Parent: Closed testfile.
Child: 11.5 - F_TLOCK [ 0, ENDING] PASSED.
Child: 11.6 - F_ULOCK [ 0, ENDING] PASSED.
Parent: Wrote '123456789abcdef' to testfile [ 0, 16 ].
Parent: 11.7 - F_TLOCK [ 0, ENDING] PASSED.
tlock: testfile write: Permission denied
** PARENT pass 1 results: 4/4 pass, 0/0 warn, 0/0 fail (pass/total).
** CHILD pass 1 results: 4/4 pass, 0/0 warn, 0/0 fail (pass/total).
This actually works against samba w/o unix extensions but fails against win2k3. The problem seems to occur when the parent process does this:
write_testfile(data, (off_t)0, datalen, COMMENT);
test(11, 7, F_TLOCK, (off_t)0, (off_t)0, PASS, FATAL);
write_testfile(data, (off_t)(datalen - 3), datalen, COMMENT);
...so we write some data to the file, lock it, and then attempt to write again. On the wire, we see this:
30 0.012869 10.11.231.229 -> 10.11.231.44 SMB Write AndX Request, FID: 0x4005, 16 bytes at offset 0
31 0.015167 10.11.231.44 -> 10.11.231.229 SMB Write AndX Response, FID: 0x4005, 16 bytes
32 0.015540 10.11.231.229 -> 10.11.231.44 SMB Locking AndX Request, FID: 0x4004
33 0.016017 10.11.231.44 -> 10.11.231.229 SMB Locking AndX Response, FID: 0x4004
34 0.016347 10.11.231.229 -> 10.11.231.44 SMB Write AndX Request, FID: 0x4005, 29 bytes at offset 0
35 0.017007 10.11.231.44 -> 10.11.231.229 SMB Write AndX Response, FID: 0x4005, 0 bytes, Error: STATUS_FILE_LOCK_CONFLICT
...note that the FIDs are different even though they should be the same (since the operations are from the same process). So either the write or the lock is picking up the wrong FID. We don't notice this with unix servers since they don't enforce mandatory locking.
My guess is that either the locking or write code is using find_writable_file() and it's not picking up the filehandle we want.
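For reference, the write, lock, write sequence described above can be sketched as a standalone function. This is only a minimal sketch, not the connectathon code: the function name `write_lock_write` is made up for illustration, and it assumes `lockf()` is the locking call in use. It runs in a single process with a single descriptor, so on its own it will probably not hit the FID mismatch, which needs a second open handle on the same file (as the child process has in the real test).

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write to the file, take a non-blocking lock over
 * the whole file, then write again through the same descriptor.
 * Returns 0 on success or an errno value from the first failing step.
 */
int write_lock_write(const char *path)
{
    static const char data[] = "123456789abcdef";  /* 16 bytes incl. NUL */
    const size_t datalen = sizeof(data);
    int err = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return errno;

    errno = 0;
    if (pwrite(fd, data, datalen, 0) != (ssize_t)datalen)
        err = errno ? errno : EIO;          /* first write failed or was short */
    else if (lockf(fd, F_TLOCK, 0) != 0)    /* lock from offset 0 to end of file */
        err = errno;
    else if (pwrite(fd, data, datalen, (off_t)(datalen - 3)) != (ssize_t)datalen)
        err = errno ? errno : EIO;          /* second write, overlapping the lock */

    close(fd);
    unlink(path);
    return err;
}
```

On a local filesystem this should return 0. A nonzero return on the second write (EACCES, matching the "Permission denied" above) would be the symptom reported here.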
While Linux uses non-mandatory locking, Samba does normally enforce locking for its clients. Did you reproduce this against recent samba versions using "strict locking = yes"?
Even with strict locking, I don't seem to be able to reproduce this against samba for some reason. When I test against samba, it looks like it ends up using the same FID for all three operations. I'm not yet clear on why this is, but it seems to be pretty consistent and I think it must be something in the client...
Did a quick and dirty patch to make find_writable_file() skip any files that don't have a pid that matches current->tgid. This fixes the test, so it convinces me that I at least understand the basic problem. As to why we don't see this against samba, I'm still not clear. Still, it makes some sense to make find_writable_file() prefer to return certain filehandles. I'm thinking:
1) if the FH's pid matches current->tgid, then that filehandle should be preferred over any others.
2) if no filehandle has a pid that matches current->tgid, then we'd want to prefer filehandles that have write locks that overlap the range of data being written.
#1 should be relatively easy to implement. #2 seems harder. I may try to just implement #1 first and leave a comment about #2 for future reference.
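Preference #1 can be sketched in userspace C roughly as follows. This is only an illustration of the selection rule, not the kernel patch: `struct open_handle` and `pick_writable_handle()` are hypothetical stand-ins for the CIFS open-file list and find_writable_file(), and the `tgid` argument stands in for current->tgid. Preference #2 (overlapping write locks) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for an open CIFS filehandle. */
struct open_handle {
    unsigned short fid;  /* server-assigned file ID */
    pid_t pid;           /* tgid of the process that opened the handle */
    int writable;        /* non-zero if the handle was opened for writing */
};

/*
 * Return a writable handle, preferring one whose pid matches tgid.
 * Falls back to the first writable handle found, or NULL if there is none.
 */
const struct open_handle *
pick_writable_handle(const struct open_handle *handles, size_t n, pid_t tgid)
{
    const struct open_handle *fallback = NULL;

    assert(handles != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!handles[i].writable)
            continue;
        if (handles[i].pid == tgid)
            return &handles[i];       /* preference #1: same process */
        if (fallback == NULL)
            fallback = &handles[i];   /* remember the first writable handle */
    }
    return fallback;
}
```

With two writable handles on the file, as in the trace above (FIDs 0x4005 and 0x4004), a rule like this would make the write go out on the same FID that the writing process used for its lock, provided that process owns a writable handle.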
Patchset went upstream. Closing as duplicate of cifs update BZ.
*** This bug has been marked as a duplicate of bug 465143 ***