From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.3) Gecko/20010808

Description of problem:
At the university, our e-mail server was recently moved from a Solaris box to a Red Hat Linux 7.1 box. Ever since, some users have experienced mail delivery delays.

We use sendmail+procmail for e-mail delivery on the Red Hat Linux box. /var/spool/mail is exported to a number of NFS clients, so that users can keep using mailtool on Solaris, as well as other MUAs on Solaris or Red Hat Linux that access mailboxes in /var/spool/mail.

The problem is that sometimes, after a client or gateway machine crashes, mailboxes remain locked and procmail can no longer deliver e-mail to them. This blocks delivery of the message not only to that particular user, but also to many other recipients of the same message.

lslk shows that the mailboxes are locked by remote processes, but the clients that used to hold the locks no longer believe they hold them. Often the client itself was rebooted or went down due to a power failure; other times, a gateway between the server and the client remained down for an extended period of time.

Anyway, the point is that there are only two ways to get e-mail delivery to proceed:

(i) reboot the server, so that the lock state is reset after clients are contacted

(ii) copy the locked mailbox file to a new file and (atomically?) rename the new file to the old one, then kill the sendmail and procmail processes waiting for the lock. The lock remains, but on a removed file, so it no longer gets in the way.

I suppose there's something wrong with lock recovery on remounting, or with lock time-outs, or I'm missing something about NFS locking.
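For reference, workaround (ii) can be sketched roughly as follows. This is only an illustration, not the exact procedure used on the server: the function name `replace_with_unlocked_copy` and the `.unlock-` temporary prefix are made up for this example, and it assumes the script runs on the server itself, with enough privileges to restore ownership, while delivery to the mailbox is blocked anyway.

```python
import os
import shutil
import stat
import tempfile


def replace_with_unlocked_copy(mailbox_path):
    """Copy a mailbox to a fresh file and rename it over the original.

    The stale lock stays attached to the old (now unlinked) inode, so new
    openers of the path see a file with no lock on it.
    """
    directory = os.path.dirname(os.path.abspath(mailbox_path))
    st = os.stat(mailbox_path)

    # The temporary file must live in the same directory (same filesystem),
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".unlock-")
    try:
        with os.fdopen(fd, "wb") as dst, open(mailbox_path, "rb") as src:
            shutil.copyfileobj(src, dst)
            dst.flush()
            os.fsync(dst.fileno())

        # Preserve permissions and, where allowed, ownership.
        os.chmod(tmp_path, stat.S_IMODE(st.st_mode))
        try:
            os.chown(tmp_path, st.st_uid, st.st_gid)
        except PermissionError:
            pass  # not running as root; ownership is left as created

        # Atomic replacement of the locked file by the unlocked copy.
        os.rename(tmp_path, mailbox_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Anything appended to the old file between the copy and the rename would be lost, so this only makes sense while nothing can write to the mailbox. The sendmail and procmail processes still waiting on the old inode have to be killed separately, as described above.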
Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
1. Set up mail delivery on a Red Hat Linux 7.1 box with sendmail and procmail
2. Export /var/spool/mail and let users use tools that acquire locks on mailbox files
3. Let clients crash, freeze or lose contact with the server for extended periods of time
4. Expect complaints from users that they're no longer getting e-mail.
5. Verify with strace that procmail is waiting for a lock, and check with lslk that the lock is held by a remote process that is no longer there.

Actual Results: The lock is never released, requiring manual intervention.

Expected Results: When the client re-connects, the server should attempt to release the lock.

Additional info:
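The check in step 5 can also be approximated without strace or lslk. The sketch below assumes the stuck lock is a POSIX fcntl() lock (which may not match how procmail was built on a given system), and `is_locked` is a hypothetical helper name. It tries to take a non-blocking exclusive lock and reports whether some other process already holds a conflicting one.

```python
import errno
import fcntl


def is_locked(path):
    """Return True if another process holds a conflicting fcntl lock on path."""
    with open(path, "r+b") as f:
        try:
            # Non-blocking attempt: fails immediately if the lock is held.
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return True
            raise
        # We got the lock, so nobody else held it; release it right away.
        fcntl.lockf(f, fcntl.LOCK_UN)
        return False
```

Unlike lslk, this does not say who holds the lock, and it briefly takes the lock itself when the mailbox turns out to be free.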
Some patches went into later kernels that should take care of this issue.