Bug 52278 - locks remain forever after client crash
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Steve Dickson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-08-22 11:28 UTC by Alexandre Oliva
Modified: 2007-04-18 16:36 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2004-08-11 10:46:52 UTC
Embargoed:



Description Alexandre Oliva 2001-08-22 11:28:04 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.3) Gecko/20010808

Description of problem:
At the university, our e-mail server was recently moved from a Solaris box
to a Red Hat Linux 7.1 box.  Ever since, some users have experienced mail
delivery delays.  We use sendmail+procmail for e-mail delivery on the Red
Hat Linux box.  /var/spool/mail is exported to a number of NFS clients, so
that users can keep on using mailtool on Solaris, and other MUAs on Solaris
or Red Hat Linux that access mailboxes in /var/spool/mail.

The problem is that, sometimes, after some client or gateway machine
crashes, mailboxes remain locked, and procmail is no longer able to deliver
e-mail to these mailboxes, blocking delivery of the message not only to
that particular user, but also to many other recipients of the same message.

lslk shows that the mailboxes are locked by remote processes, but the
clients that used to hold the locks no longer believe they hold them:
oftentimes, the client itself was rebooted or went down due to a power
failure; other times, a gateway between the server and the client remained
down for an extended period of time.

Anyway, the point is that there are only two ways to get e-mail delivery to
proceed:

(i) reboot the server, so that the lock state is reset after clients are
contacted

(ii) copy the locked mailbox file to a new file, (atomically?) rename the
new file over the old one, then kill the sendmail and procmail processes
waiting for the lock.  The lock remains, but on a removed file, so it no
longer gets in the way.
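
A minimal sketch of workaround (ii), in Python.  The function name
replace_locked_mailbox and the ".unlock-" temporary-file prefix are made up
for illustration.  It assumes the copy can be created in the same directory
as the mailbox, so that the rename stays on one filesystem and is atomic.
It does not kill the waiting sendmail and procmail processes, and mail
appended to the old file between the copy and the rename would be lost.

```python
import os
import shutil
import tempfile


def replace_locked_mailbox(mailbox_path):
    """Copy mailbox_path to a fresh file and rename it over the original.

    The rename gives the path a new inode, so a stale lock stays attached
    to the old, now unlinked, inode.  Returns the new inode number.
    """
    directory = os.path.dirname(os.path.abspath(mailbox_path))
    # Create the copy next to the mailbox so the rename does not cross
    # filesystems; that is what makes the replacement atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".unlock-")
    os.close(fd)
    try:
        shutil.copy2(mailbox_path, tmp_path)  # contents, mode, timestamps
        st = os.stat(mailbox_path)
        try:
            os.chown(tmp_path, st.st_uid, st.st_gid)  # usually needs root
        except PermissionError:
            pass  # not privileged: the copy keeps the caller's ownership
        os.replace(tmp_path, mailbox_path)  # atomic rename(2)
    except BaseException:
        # Do not leave a stray copy behind if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return os.stat(mailbox_path).st_ino
```

Run as root on the server, e.g. replace_locked_mailbox("/var/spool/mail/USER"),
where USER stands for the affected account.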

I suppose there's something wrong with lock recovery on remounting, or on
lock time-outs, or I'm missing something about NFS locking.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Set up mail delivery on a Red Hat Linux 7.1 box with sendmail and procmail
2. Export /var/spool/mail and let users use tools that acquire locks on
mailbox files
3. Let clients crash, freeze or lose contact with the server for extended
periods of time
4. Expect complaints from users that they're no longer getting e-mail.
5. Verify with strace that procmail is waiting for a lock, and check with
lslk that the lock is held by a remote process that is no longer there.
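
As a rough complement to step 5, a non-blocking probe run on the server can
tell whether a conflicting lock is held on a mailbox.  This is only a
sketch: mailbox_is_locked is a made-up name, and it assumes the stale lock
is an fcntl-style (POSIX) lock, which is what NFS clients normally request
through lockd.  Unlike lslk, it cannot say who holds the lock.

```python
import errno
import fcntl
import os


def mailbox_is_locked(path):
    """Return True if another owner holds a conflicting fcntl lock on path."""
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # Try to take an exclusive lock without blocking.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # EACCES or EAGAIN means somebody else holds a conflicting lock.
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return True
            raise
        # The probe got the lock, so nobody else had it; give it back.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return False
    finally:
        # Closing the descriptor drops any fcntl locks this process holds
        # on the file, so do not call this from a process that needs them.
        os.close(fd)
```

Note that the probe briefly takes the lock itself when the mailbox is free,
so a delivery attempt arriving at that instant may have to retry.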

	

Actual Results:  The lock is never released, requiring manual intervention.

Expected Results:  When the client re-connects, the server should attempt
to release the lock.

Additional info:

Comment 1 Steve Dickson 2004-08-11 10:46:52 UTC
Some patches went into later kernels that should take care of
this issue.

