Bug 52278

Summary: locks remain forever after client crash
Product: [Retired] Red Hat Linux
Reporter: Alexandre Oliva <aoliva>
Component: kernel
Assignee: Steve Dickson <steved>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: 7.1
CC: zaitcev
Target Milestone: ---
Target Release: ---
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-08-11 10:46:52 UTC

Description Alexandre Oliva 2001-08-22 11:28:04 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.3) Gecko/20010808

Description of problem:
At the university, our e-mail server was recently moved from a Solaris box
to a Red Hat Linux 7.1 box.  Ever since, some users have experienced mail
delivery delays.  We use sendmail+procmail for e-mail delivery on the Red
Hat Linux box.  /var/spool/mail is exported to a number of NFS clients, so
that users can keep on using mailtool on Solaris, and other MUAs on Solaris
or Red Hat Linux that access mailboxes in /var/spool/mail.

The problem is that, sometimes, after some client or gateway machine
crashes, mailboxes remain locked, and procmail is no longer able to deliver
e-mail to these mailboxes, blocking delivery of the message not only to
that particular user, but also to many other recipients of the same message.

lslk shows that the mailboxes are locked by remote processes, but the
clients that used to hold the locks no longer believe they hold them:
oftentimes, the client itself was rebooted or went down due to a power
failure; other times, a gateway between the server and the client remained
down for an extended period of time.

Anyway, the point is that there are only two ways to get e-mail delivery to
proceed:

(i) reboot the server, so that the lock state is reset after clients are
contacted

(ii) copy the locked mailbox file to a new file and (atomically?) rename
the new file over the old one, then kill the sendmail and procmail
processes waiting for the lock.  The lock remains, but on a removed file,
so it no longer gets in the way.
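
For reference, workaround (ii) could be scripted roughly as follows.  This
is only a sketch, not a tested procedure: replace_locked_mailbox is a
hypothetical helper name, it assumes it runs on the server as a user allowed
to write in the spool directory, and it does nothing about mail that might
be delivered between the copy and the rename.

```python
import os
import shutil
import tempfile


def replace_locked_mailbox(path):
    """Copy `path` to a fresh file and rename the copy over the original.

    The stale lock stays attached to the old inode, which is unlinked by
    the rename.  Returns (old_inode, new_inode) for verification.
    Hypothetical helper, for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    st = os.stat(path)

    # Create the copy in the same directory so that the rename does not
    # cross filesystems.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".unlock-")
    os.close(fd)
    try:
        shutil.copy2(path, tmp)  # contents, mode and timestamps
        try:
            os.chown(tmp, st.st_uid, st.st_gid)  # keep the mailbox owner
        except PermissionError:
            pass  # not running as root; ownership is left as-is
        os.replace(tmp, path)  # rename(2): atomic on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return st.st_ino, os.stat(path).st_ino
```

The sendmail and procmail processes already blocked on the old inode still
have to be killed afterwards, as described in (ii).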

I suppose there's something wrong with lock recovery on remounting, or on
lock time-outs, or I'm missing something about NFS locking.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Set up mail delivery on a Red Hat Linux 7.1 box with sendmail and procmail
2. Export /var/spool/mail and let users use tools that acquire locks on
mailbox files
3. Let clients crash, freeze or lose contact with the server for extended
periods of time
4. Expect complaints from users that they're no longer getting e-mail.
5. Verify with strace that procmail is waiting for a lock, and check with
lslk that the lock is held by a remote process that is no longer there.
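
As a complement to strace and lslk in step 5, a small probe run on the
server can tell whether a mailbox is currently lockable.  This is a minimal
sketch under the assumption that the lock in question is a POSIX fcntl()
lock; mailbox_lock_is_free is a hypothetical name, and the probe does not
see dotlock files.

```python
import errno
import fcntl


def mailbox_lock_is_free(path):
    """Return True if an exclusive fcntl lock on `path` can be taken now.

    Hypothetical helper: it tries a non-blocking whole-file lock and
    releases it immediately, so it never waits like procmail does.
    """
    # An exclusive lock requires a descriptor opened for writing.
    with open(path, "r+b") as f:
        try:
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as e:
            if e.errno in (errno.EACCES, errno.EAGAIN):
                return False  # some other process (local or remote) holds it
            raise
        fcntl.lockf(f, fcntl.LOCK_UN)
        return True
```

If this returns False for a mailbox whose only lock holder, according to
lslk, is a client that has since rebooted, the lock is most likely one of
the stale ones described above.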

	

Actual Results:  The lock is never released, requiring manual intervention.

Expected Results:  When the client re-connects, the server should attempt
to release the lock.

Additional info:

Comment 1 Steve Dickson 2004-08-11 10:46:52 UTC
Some patches went into later kernels that should take care of
this issue.