From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021212

Description of problem:
We're locking mail spools over NFS from a Tru64 5.1B client machine to a Linux 2.4.18-17.7.x-i686 P4 (RedHat) server using ext3. Using a simple C program which opens the file on the Tru64 machine, calls lockf(fd, F_LOCK, 1) and then closes the file after 10 seconds, some users are able to lock their mail spool (only one seen so far, actually), while many users are unable to.

The mail spool files:

[root@xserv1 mail]# ls -l rmj jss
-rw-------    1 jss      users      111406 Feb  4 14:55 jss
-rw-------    1 rmj      users        3196 Feb  4 14:48 rmj
[root@xserv1 mail]# ls -ln rmj jss
-rw-------    1 914      15         111406 Feb  4 14:55 jss
-rw-------    1 90       15           3196 Feb  4 14:48 rmj

The tcpdump traces for locking the mail spool of each user follow. Here tru64 is the client, linux is the Linux server machine, and server is an IP alias address for linux.

For rmj, where locking works:

15:04:37.123839 tru64.1420724309 > server.nfs: 108 getattr fh Unknown/1
15:04:37.123931 linux.nfs > tru64.1420724309: reply ok 112 (DF)
15:04:37.124128 tru64.1437501525 > server.nfs: 116 lookup fh Unknown/1 "rmj"
15:04:37.124181 linux.nfs > tru64.1437501525: reply ok 232 (DF)
15:04:37.124408 tru64.1454278741 > server.nfs: 112 getattr fh Unknown/1
15:04:37.124467 linux.nfs > tru64.1454278741: reply ok 112 (DF)
15:04:37.124649 tru64.1471055957 > server.nfs: 112 getattr fh Unknown/1
15:04:37.124704 linux.nfs > tru64.1471055957: reply ok 112 (DF)
15:04:37.124887 tru64.1487833173 > server.nfs: 112 getattr fh Unknown/1
15:04:37.124929 linux.nfs > tru64.1487833173: reply ok 112 (DF)
15:04:37.125516 tru64.693 > server.32775: udp 168 (DF)
15:04:37.125599 linux.32775 > tru64.693: udp 24 (DF)
15:04:37.125630 linux.794 > tru64.sunrpc: udp 56 (DF)
15:04:37.125849 tru64.sunrpc > linux.794: udp 28
15:04:37.125897 linux.799 > tru64.1035: udp 136 (DF)
15:04:47.126257 tru64.693 > server.32775: udp 152 (DF)
15:04:47.126361 linux.799 > tru64.1035: udp 136 (DF)
15:04:47.126371 linux.32775 > tru64.693: udp 24 (DF)

For jss, where locking hangs:

15:12:53.746512 tru64.1840220245 > server.nfs: 108 getattr fh Unknown/1
15:12:53.746607 linux.nfs > tru64.1840220245: reply ok 112 (DF)
15:12:53.746804 tru64.1856997461 > server.nfs: 108 getattr fh Unknown/1
15:12:53.746849 linux.nfs > tru64.1856997461: reply ok 112 (DF)
15:12:53.747041 tru64.1873774677 > server.nfs: 108 getattr fh Unknown/1
15:12:53.747084 linux.nfs > tru64.1873774677: reply ok 112 (DF)
15:12:53.747270 tru64.1890551893 > server.nfs: 116 lookup fh Unknown/1 "jss"
15:12:53.747330 linux.nfs > tru64.1890551893: reply ok 232 (DF)
15:12:53.747561 tru64.1907329109 > server.nfs: 112 getattr fh Unknown/1
15:12:53.747604 linux.nfs > tru64.1907329109: reply ok 112 (DF)
15:12:53.747783 tru64.1924106325 > server.nfs: 112 getattr fh Unknown/1
15:12:53.747837 linux.nfs > tru64.1924106325: reply ok 112 (DF)
15:12:53.748014 tru64.1940883541 > server.nfs: 112 getattr fh Unknown/1
15:12:53.748068 linux.nfs > tru64.1940883541: reply ok 112 (DF)
15:12:53.748687 tru64.693 > server.32775: udp 168 (DF)
15:12:53.749174 linux.32775 > tru64.693: udp 24 (DF)
15:12:53.749186 linux.796 > tru64.sunrpc: udp 56 (DF)
15:12:53.749414 tru64.sunrpc > linux.796: udp 28
15:12:53.749443 linux.797 > tru64.1035: udp 92 (DF)
15:12:54.758566 tru64.693 > server.32775: udp 168 (DF)
15:12:54.758649 linux.797 > tru64.1035: udp 92 (DF)
15:12:54.758658 linux.32775 > tru64.693: udp 24 (DF)
15:12:55.764532 tru64.693 > server.32775: udp 168 (DF)
15:12:55.764613 linux.797 > tru64.1035: udp 92 (DF)
15:12:55.764623 linux.32775 > tru64.693: udp 24 (DF)
15:12:56.770309 tru64.693 > server.32775: udp 168 (DF)
15:12:56.770371 linux.797 > tru64.1035: udp 92 (DF)
15:12:56.770378 linux.32775 > tru64.693: udp 24 (DF)
15:12:57.774218 tru64.693 > server.32775: udp 168 (DF)
15:12:57.774290 linux.797 > tru64.1035: udp 92 (DF)
15:12:57.774299 linux.32775 > tru64.693: udp 24 (DF)
15:12:58.782025 tru64.693 > server.32775: udp 168 (DF)
15:12:58.782102 linux.797 > tru64.1035: udp 92 (DF)
15:12:58.782111 linux.32775 > tru64.693: udp 24 (DF)
15:12:59.787892 tru64.693 > server.32775: udp 168 (DF)
15:12:59.787967 linux.797 > tru64.1035: udp 92 (DF)
15:12:59.787974 linux.32775 > tru64.693: udp 24 (DF)
15:13:00.786273 tru64.693 > server.32775: udp 160 (DF)
15:13:00.786336 linux.797 > tru64.1035: udp 92 (DF)
15:13:00.786345 linux.32775 > tru64.693: udp 24 (DF)
[ carries on like this ]

Interestingly, if the user who can lock their spool copies it to another file with the same permissions and ownership, they cannot lock the copy (even if it is in the same directory). Also, locking only fails if group and others don't have r/w access to the file.

Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
1. Attempt to lock a file with -rw------- permissions on the Linux server from the Tru64 client using a lockf call over NFS (UDP, V3).

Actual Results:
Locking fails.

Expected Results:
Locking should work.

Additional info:
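For anyone trying to reproduce this, the test program described above could look roughly like the sketch below. This is a reconstruction from the description, not the original source: the function name lock_and_hold, its parameters, and the error handling are assumptions made for illustration. It opens the file read/write, takes a blocking one-byte lockf lock at the current offset, holds it for the given number of seconds, and closes the file, which releases the lock.

```c
#define _XOPEN_SOURCE 500   /* expose lockf() when compiling in strict -std= modes */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (the name and signature are illustrative, not from
 * the report). Locks the first byte of `path` with lockf(), holds the lock
 * for `hold_seconds`, then closes the file. Returns 0 on success, -1 on error.
 */
int lock_and_hold(const char *path, unsigned int hold_seconds)
{
    /* lockf() takes an exclusive lock, so the descriptor must be writable. */
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* F_LOCK blocks until the lock is granted; a call that never returns
     * here would match the "locking hangs" case in the report. */
    if (lockf(fd, F_LOCK, 1) < 0) {
        perror("lockf");
        close(fd);
        return -1;
    }

    /* Hold the lock for a while (10 seconds in the report). */
    sleep(hold_seconds);

    /* Closing the descriptor releases the lock. */
    if (close(fd) < 0) {
        perror("close");
        return -1;
    }
    return 0;
}
```

Calling lock_and_hold(argv[1], 10) from a small main() on the Tru64 client, once as each user against that user's own spool file, should correspond to the two traces above, assuming the original program behaved as described.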
We seem to do better with locking in later kernels, such as those in Fedora Core 1 and 2. Please try one of those kernels and reopen this bug if necessary.