Bug 83545 - Locking sometimes fails from Tru64 5.1B
Summary: Locking sometimes fails from Tru64 5.1B
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: knfsd
Version: 7.3
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Steve Dickson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-02-05 15:31 UTC by Jeremy Sanders
Modified: 2007-04-18 16:50 UTC (History)
0 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-09-01 18:15:30 UTC
Embargoed:



Description Jeremy Sanders 2003-02-05 15:31:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021212

Description of problem:
We're locking mail spools over NFS from a Tru64 5.1B client machine to a Linux
2.4.18-17.7.x-i686 P4 (RedHat) server using ext3. We test with a simple C
program which opens the file on the Tru64 machine, calls lockf(fd, F_LOCK, 1)
and then closes it after 10 seconds. Some users are able to lock their mail
spool (only one seen so far, actually), but many users are unable to!
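
For readers who want to reproduce this, the test program described above could
look roughly like the sketch below. This is a reconstruction from the
description, not the original source: the function name lock_spool and the
hold_seconds parameter are made up for illustration, and the error reporting is
an addition.

```c
/* Sketch of the lock test described in this report (not the reporter's code). */
#define _XOPEN_SOURCE 700   /* make lockf() visible under strict compiler modes */

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Open `path`, take a blocking lock on its first byte with lockf(),
 * hold it for `hold_seconds`, then close the file, which releases the lock.
 * Returns 0 on success, -1 on failure.
 * The name lock_spool and the hold_seconds parameter are illustrative.
 */
int lock_spool(const char *path, unsigned int hold_seconds)
{
    /* lockf() requires a descriptor that is open for writing. */
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        fprintf(stderr, "open %s: %s\n", path, strerror(errno));
        return -1;
    }

    /* F_LOCK blocks until the lock is granted; a hang would show up here. */
    if (lockf(fd, F_LOCK, 1) < 0) {
        fprintf(stderr, "lockf %s: %s\n", path, strerror(errno));
        close(fd);
        return -1;
    }

    sleep(hold_seconds);

    /* Closing the descriptor drops the lock. */
    return close(fd) == 0 ? 0 : -1;
}
```

Calling lock_spool(path, 10) from a small main(), with path pointing at a spool
file on the NFS mount, would match the 10-second hold described above.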

The mail spool files:

[root@xserv1 mail]# ls -l rmj jss
-rw-------    1 jss      users      111406 Feb  4 14:55 jss
-rw-------    1 rmj      users        3196 Feb  4 14:48 rmj

[root@xserv1 mail]# ls -ln rmj jss
-rw-------    1 914      15         111406 Feb  4 14:55 jss
-rw-------    1 90       15           3196 Feb  4 14:48 rmj

The tcpdump traces for locking each user's mail spool follow. Here tru64 is the
client, linux is the Linux server machine, and server is an IP alias address
for linux.

For rmj where locking works:

15:04:37.123839 tru64.1420724309 > server.nfs: 108 getattr fh Unknown/1
15:04:37.123931 linux.nfs > tru64.1420724309: reply ok 112 (DF)
15:04:37.124128 tru64.1437501525 > server.nfs: 116 lookup fh Unknown/1 "rmj"
15:04:37.124181 linux.nfs > tru64.1437501525: reply ok 232 (DF)
15:04:37.124408 tru64.1454278741 > server.nfs: 112 getattr fh Unknown/1
15:04:37.124467 linux.nfs > tru64.1454278741: reply ok 112 (DF)
15:04:37.124649 tru64.1471055957 > server.nfs: 112 getattr fh Unknown/1
15:04:37.124704 linux.nfs > tru64.1471055957: reply ok 112 (DF)
15:04:37.124887 tru64.1487833173 > server.nfs: 112 getattr fh Unknown/1
15:04:37.124929 linux.nfs > tru64.1487833173: reply ok 112 (DF)
15:04:37.125516 tru64.693 > server.32775:  udp 168 (DF)
15:04:37.125599 linux.32775 > tru64.693:  udp 24 (DF)
15:04:37.125630 linux.794 > tru64.sunrpc:  udp 56 (DF)
15:04:37.125849 tru64.sunrpc > linux.794:  udp 28
15:04:37.125897 linux.799 > tru64.1035:  udp 136 (DF)
15:04:47.126257 tru64.693 > server.32775:  udp 152 (DF)
15:04:47.126361 linux.799 > tru64.1035:  udp 136 (DF)
15:04:47.126371 linux.32775 > tru64.693:  udp 24 (DF)

For jss where locking hangs:


15:12:53.746512 tru64.1840220245 > server.nfs: 108 getattr fh Unknown/1
15:12:53.746607 linux.nfs > tru64.1840220245: reply ok 112 (DF)
15:12:53.746804 tru64.1856997461 > server.nfs: 108 getattr fh Unknown/1
15:12:53.746849 linux.nfs > tru64.1856997461: reply ok 112 (DF)
15:12:53.747041 tru64.1873774677 > server.nfs: 108 getattr fh Unknown/1
15:12:53.747084 linux.nfs > tru64.1873774677: reply ok 112 (DF)
15:12:53.747270 tru64.1890551893 > server.nfs: 116 lookup fh Unknown/1 "jss"
15:12:53.747330 linux.nfs > tru64.1890551893: reply ok 232 (DF)
15:12:53.747561 tru64.1907329109 > server.nfs: 112 getattr fh Unknown/1
15:12:53.747604 linux.nfs > tru64.1907329109: reply ok 112 (DF)
15:12:53.747783 tru64.1924106325 > server.nfs: 112 getattr fh Unknown/1
15:12:53.747837 linux.nfs > tru64.1924106325: reply ok 112 (DF)
15:12:53.748014 tru64.1940883541 > server.nfs: 112 getattr fh Unknown/1
15:12:53.748068 linux.nfs > tru64.1940883541: reply ok 112 (DF)
15:12:53.748687 tru64.693 > server.32775:  udp 168 (DF)
15:12:53.749174 linux.32775 > tru64.693:  udp 24 (DF)
15:12:53.749186 linux.796 > tru64.sunrpc:  udp 56 (DF)
15:12:53.749414 tru64.sunrpc > linux.796:  udp 28
15:12:53.749443 linux.797 > tru64.1035:  udp 92 (DF)
15:12:54.758566 tru64.693 > server.32775:  udp 168 (DF)
15:12:54.758649 linux.797 > tru64.1035:  udp 92 (DF)
15:12:54.758658 linux.32775 > tru64.693:  udp 24 (DF)
15:12:55.764532 tru64.693 > server.32775:  udp 168 (DF)
15:12:55.764613 linux.797 > tru64.1035:  udp 92 (DF)
15:12:55.764623 linux.32775 > tru64.693:  udp 24 (DF)
15:12:56.770309 tru64.693 > server.32775:  udp 168 (DF)
15:12:56.770371 linux.797 > tru64.1035:  udp 92 (DF)
15:12:56.770378 linux.32775 > tru64.693:  udp 24 (DF)
15:12:57.774218 tru64.693 > server.32775:  udp 168 (DF)
15:12:57.774290 linux.797 > tru64.1035:  udp 92 (DF)
15:12:57.774299 linux.32775 > tru64.693:  udp 24 (DF)
15:12:58.782025 tru64.693 > server.32775:  udp 168 (DF)
15:12:58.782102 linux.797 > tru64.1035:  udp 92 (DF)
15:12:58.782111 linux.32775 > tru64.693:  udp 24 (DF)
15:12:59.787892 tru64.693 > server.32775:  udp 168 (DF)
15:12:59.787967 linux.797 > tru64.1035:  udp 92 (DF)
15:12:59.787974 linux.32775 > tru64.693:  udp 24 (DF)
15:13:00.786273 tru64.693 > server.32775:  udp 160 (DF)
15:13:00.786336 linux.797 > tru64.1035:  udp 92 (DF)
15:13:00.786345 linux.32775 > tru64.693:  udp 24 (DF)
[ carries on like this ]

Interestingly, if the user who can lock their spool copies it to another file
with the same permissions and ownership, they cannot lock the copy (even if it
is in the same directory).

Also, locking only fails if group and others don't have r/w access to the file.


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Attempt to lock a file with -rw------- permissions on the Linux server from
the Tru64 client using a lockf call over NFS (UDP, V3).

Actual Results:  Locking fails.

Expected Results:  Locking should work.

Additional info:

Comment 1 Steve Dickson 2004-09-01 18:15:30 UTC
We seem to do better with locking in later kernels, like those in
Fedora Core 1 and 2. Please try one of those kernels and
reopen if necessary.

