Red Hat Bugzilla – Full Text Bug Listing
Summary: API Breakage: NFS "No locks available" with kernel 2.4.21-15.ELsmp
Product: Red Hat Enterprise Linux 3
Reporter: Ole Holm Nielsen <ole.h.nielsen>
Component: kernel
Assignee: Ken Preslan <kpreslan>
Status: CLOSED ERRATA
Version: 3.0
CC: alan, brilong, david.grierson, herrold, howen, joshua, kanderso, k.georgiou, nhorman, nhruby, pamadio, peterm, petrides, riel, sattia, sct, steved, tao, tburke, t.h.amundsen
Doc Type: Bug Fix
Last Closed: 2005-05-18 09:27:35 EDT
Description Ole Holm Nielsen 2004-05-18 06:27:07 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124

Description of problem:
When we upgraded our mail server to RHEL 3.0 Update 2, Sendmail's /usr/bin/vacation stopped working. Whenever an autoreply is invoked by vacation, the maillog shows entries like this:

May 18 11:27:30 servfys vacation: vacation: .vacation.db: No locks available
May 18 11:27:30 servfys sendmail: i4I9RHX6024262: to=|"/usr/bin/vacation helle", ctladdr=<firstname.lastname@example.org> (344/250), delay=00:00:13, xdelay=00:00:00, mailer=prog, pri=1859352, dsn=5.3.0, stat=unknown mailer error 1

and vacation exits without sending an autoreply. When the user's home directory is copied to a local disk, vacation works as it did before (with kernel-smp-2.4.21-9.0.3.EL), so the problem must be specific to NFS. The vacation code tries to create a lock on the user's file $HOME/.vacation.db, and this apparently fails on NFS-mounted filesystems with the latest kernel. One might suspect a more general problem with NFS locks, but I don't know how to test this.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-15.EL

How reproducible:
Always

Steps to Reproduce:
Set up Sendmail's vacation tool in the user's .forward file:

helle, |"/usr/bin/vacation helle"

and initialize the .vacation.db with "vacation -i".

Actual Results:
Whenever a mail is sent to the user, the maillog shows a "No locks available" error, and the sender receives an error message:

----- The following addresses had permanent fatal errors -----
|"/usr/bin/vacation helle"
(reason: Data format error)
(expanded from: <email@example.com>)

Expected Results:
The vacation program should return an auto-reply to the sender.

Additional info:
/usr/bin/vacation is part of Sendmail but is not installed by default; I built Sendmail myself and copied vacation to /usr/bin/.
Comment 1 Trond H. Amundsen 2004-05-18 09:43:39 EDT
We have also noticed this bug. It seems to be specific to locking over NFS using flock(); fcntl() works fine. Example using exim_lock from Exim 4.30:

$ touch foo
$ exim_lock -v -fcntl foo
exim_lock: fcntl() lock successfully applied
exim_lock: locking foo succeeded: running /local/gnu/bin/bash ...
$ exit
exim_lock: foo closed
$ exim_lock -v -flock foo
exim_lock: flock() failed: No locks available
exim_lock: file closed
... waiting

With the 2.4.21-9.0.3.EL kernel family, both flock() and fcntl() work as expected.
Comment 2 Rik van Riel 2004-05-18 09:53:40 EDT
Reassigning NFS locking problem to our NFS maintainer.
Comment 3 Sameh Attia 2004-05-18 10:56:57 EDT
I just upgraded to the same version of RHEL AS 3.0 and have the same problem with NFS, but with ezmlm and qmail. I tested the setup on systems running RHEL AS 3.0 Update 2, and all gave the same "no locks available" results with ezmlm. I managed to work around it by specifying nolock in the mount options, but I'm not sure whether this is safe.
Comment 4 Steve Dickson 2004-06-14 14:02:27 EDT
BSD flock locks are not supported.
Comment 5 Keith 2004-07-01 12:29:58 EDT
Why is this CLOSED NOTABUG? It's incredibly Bad And Rude(tm) to break support for a widely-used system call in a kernel patch, and then claim it's "not our problem". This is simply not good enough!
Comment 6 Steve Dickson 2004-07-02 07:18:15 EDT
It is my understanding that flock locks have never worked correctly in Linux kernels, so the claim that we, Red Hat, have added some patch that breaks or removes support is simply wrong. flocks are not supported in upstream kernels, so they are not supported in Red Hat kernels.
Comment 7 Keith 2004-07-05 09:17:45 EDT
Point taken, and I shouldn't have used the word "support", but "functionality". Nevertheless, from my (exceptionally naive, presumably) point of view, if the functionality of a particular system call or library function changes, it is either intentional (and therefore ought to be documented) or it is unintentional (and is therefore a bug).

What appears to have happened is that the behaviour of an flock() call on a file on an NFS-mounted filesystem has changed from "maybe working, maybe not, but flock() reported success anyway" to "flock() reports failure". This is a pretty drastic change for a kernel patch, regardless of whether flock() is "officially" supported - there are so many third-party applications written for the BSD-based world - and therefore that are likely to use flock() instead of fcntl() - that such an arbitrary change cannot be made without some kind of "HEADS UP" warning. That is, if it was intentional. If it wasn't intentional, then by anyone's book it's a bug.

If this has crept in as a result of changes to the upstream kernel, I sympathize, but one of the things I'd expect of Red Hat is to provide some production continuity and quality that cushions their users from changes such as this, however well-intentioned. To simply dismiss this with a "flock() isn't supported, go away" belies the reality that there is an awful lot of third-party software out there that relies on not-so-well-documented features (of which there are more than a few), and drastically changing functionality without warning people beforehand is going to upset a lot of customers, which I'd kind of expect Red Hat to consider a Bad Thing(tm).

Of course, from my POV the right thing to do would be to make flock() work properly under NFS in Linux. But that'll never happen, will it?
Comment 8 Sameh Attia 2004-07-05 09:27:44 EDT
We are going to test a vanilla kernel and check. If this works, then I would consider Red Hat really a bad thing. We were going to upgrade about 70 servers to RHEL AS. We were advised not to go with Red Hat and to go with Debian instead. It seems we should follow that advice.
Comment 10 Steve Dickson 2004-07-08 12:21:12 EDT
> Nevertheless, from my (exceptionally naive, presumably) point of
> view, if the functionality of a particular system call or library
> function changes, it is either intentional (and therefore ought to
> be documented) or it is unintentional (and is therefore a bug).

I truly sympathize with the fact that apps ported from BSD are breaking, but I think it's more of a porting issue than anything. The flock(2) man page clearly states: "flock(2) does not lock files over NFS. Use fcntl(2) instead."

> What appears to have happened is that the behaviour of an flock()
> call on a file on an NFS-mounted filesystem has changed from "maybe
> working, maybe not, but flock() reported success anyway" to "flock()
> reports failure".

No. It went from lying about functionality it was not providing (child processes would not inherit the locks) to telling the application that this locking style is not supported on this filesystem.

> If this has crept in as a result of changes to the upstream kernel,

It did... See http://sourceforge.net/mailarchive/forum.php?thread_id=4784837&forum_id=4930

> I sympathize, but one of the things I'd expect of Red Hat is to
> provide some production continuity and quality that cushions
> their users from changes such as this, however well-intentioned.

Well, it's not clear to me that continuing functionality that is clearly broken is a good thing either...

> drastically changing functionality without warning people
> beforehand is going to upset a lot of customers, which I'd
> kind of expect Red Hat to consider a Bad Thing(tm).

Letting applications think they can pass locks to their child processes seems to me to be a fairly major issue... And no, we do not like to upset customers, and we work very hard not to... but again, I think this is a porting issue, since it is clearly documented that flocks do not work over NFS.

> Of course, from my POV the right thing to do would be to make flock()
> work properly under NFS in Linux. But that'll never happen, will it?
Agreed, and there is talk of doing just that; see the mail thread above. W.r.t. "We were advised to not go with Red Hat and go with Debian instead": this code is in the latest release of Debian, SuSE, and the upstream kernels, so I would hope you would reconsider.
Comment 11 Konstantin Ryabitsev 2004-08-01 15:26:21 EDT
I've hacked vacation slightly to create the .db files in /var/lib/vacation instead of the users' home directories. This solves the problem of gdbm not being able to open the db due to flock() over NFS. http://linux.duke.edu/~icon/RPMS/SRPMS/vacation-220.127.116.11-0.duke.4.el3.src.rpm Of course, the *real* solution to the problem would be to stop using this version of vacation. :)
Comment 12 Ole Holm Nielsen 2004-08-03 16:06:35 EDT
To solve the original problem of the vacation program not working, I decided to replace vacation with an alternative. Noting that procmail is the default mailer in Red Hat's /etc/mail/sendmail.mc, I searched the net for vacation-like setups for procmail. I found a really excellent solution here: http://sial.org/howto/procmail/procmailrc-vacation which I can highly recommend.
Comment 13 Brian Smith 2004-09-08 08:05:49 EDT
In vacation you can also add GDBM_NOLOCK to the vacation source at vacation.sf.net. Granted, this might be unsafe... but it is (hopefully) unlikely and non-serious that a second access will take place on the db file in such a short time. Change:

db = gdbm_open(VDB, 128, ((iflag || nflag) ? GDBM_NEWDB : GDBM_WRITER), ...)

to:

db = gdbm_open(VDB, 128, ((iflag || nflag) ? (GDBM_NEWDB | GDBM_NOLOCK) : (GDBM_WRITER | GDBM_NOLOCK)), ...)
Comment 14 Adam Spiers 2004-11-09 07:12:25 EST
I notice this change also breaks the 'make test' phase of building Perl (e.g. 5.8.5) when the source resides on NFS :-(
Comment 15 Joshua Jensen 2004-12-17 11:49:15 EST
I've looked at the differences between the 2.4.21-15.EL kernel and the previous one. This flock-over-NFS behaviour change seems to be caused by the linux-2.4.21-gfs-enablers.patch; if you remove that patch from the kernel, the old flock behavior comes back. Seems like a major change to the RHEL3 API to me.
Comment 16 Alan Cox 2004-12-27 09:21:06 EST
Agreed, reopening as a major API breakage.
Comment 17 Steve Dickson 2005-01-03 08:50:55 EST
My comments in comment #6 were not completely accurate. We (or I) did not add any patches to the NFS code that change the level of flock() support. We never supported flocks, and we continue to not support flock()s over NFS. To be quite clear, we are using the exact same NFS locking code (i.e. nfs_lock()) in every RHEL3 release (including the initial release), so nothing has changed w.r.t. the NFS locking code. What has changed, and broke kABI, is the GFS enabler patch that Joshua pointed out. The breakage occurs because the flock() system call (i.e. sys_flock()) was rewritten; this rewrite is causing the NFS failure. I'm reassigning to our GFS guy...
Comment 19 Ken Preslan 2005-01-03 18:04:18 EST
Is there a clean way of adding an "if NFS filesystem" switch into sys_flock()? The simplest way of fixing this may be to change the error code returned by NFS: in nfs_lock(), return LOCK_USE_CLNT instead of -ENOLCK if (fl->fl_flags & FL_POSIX). That will cause the VFS to process the flock request, and things go back to the way they were before. In the 2.6 kernel, there is a separate flock() file_operation. In RHEL3, the lock() operation is overloaded to handle both flock and fcntl locks. We could (again) extend the file_operations structure to add a field, if we want to put up with the added ugliness.
Comment 20 Ken Preslan 2005-01-03 18:06:04 EST
Errr... That should have been if !(fl->fl_flags & FL_POSIX).
Comment 21 Alan Cox 2005-01-03 18:11:12 EST
For RHEL3 we'd break ABI if we added another file_op. Other than that observation I agree with your proposed change to the NFS return for non POSIX locks.
Comment 22 Ken Preslan 2005-01-03 20:57:45 EST
Created attachment 109305 [details] patch to make NFS return LOCK_USE_CLNT for flock calls As trivial as it is, attached is a patch that fixes the problem by returning LOCK_USE_CLNT as discussed above. I'm not 100% sure whether the check of fl_owner should be broken out so that it still returns -ENOLCK. Opinions?
Comment 24 Steve Dickson 2005-01-05 09:04:57 EST
Created attachment 109372 [details] Proposed Patch This patch checks that the FL_FLOCK bit is set and returns LOCK_USE_CLNT (causing a local lock to be created) only when the module parameter nfs_local_flocks is set.
Comment 25 Steve Dickson 2005-01-05 09:16:42 EST
Created attachment 109373 [details] Updated Patch per Alan's comments
Comment 26 Alan Cox 2005-01-05 09:24:35 EST
This second patch is still wrong: it doesn't default to nfs_local_flocks as discussed. Also, the mode is wrong; 444 is read-only, meaning it can't be changed. Please fix the patch and post it to the list.
Comment 27 Alan Cox 2005-01-05 11:01:55 EST
OK, the mode is settable at insmod time and viewable at runtime. Steve clarified that this aspect is correct and intentional.
Comment 28 Steve Dickson 2005-01-05 13:47:03 EST
Created attachment 109386 [details] A patch that introduces the FS_BROKEN_FLOCK flag This patch stops the NFS code from being called (similar to how it's done upstream) via a new FS_BROKEN_FLOCK flag in the file_system_type structure.
Comment 29 Ernie Petrides 2005-01-11 18:54:44 EST
A fix for this problem has just been committed to the RHEL3 U5 patch pool this evening (in kernel version 2.4.21-27.7.EL). Although there wasn't total agreement on the best way to resolve the issue, I decided that the safest and least source-code-perturbing approach was to make nfs_lock() return 0 in the FL_FLOCK case.
Comment 30 Brian Long 2005-01-13 11:12:37 EST
Does this mean we will need to pass the nfs_local_flocks module option to the U5 kernel and the old flock over NFS behaviour (from 2.4.21-9.0.3 and earlier) will return? What exact entry in modules.conf is needed? Can we get testing RPMs of 2.4.21-27.7.EL so we can verify internally this patch works as expected? Thanks!
Comment 31 Steve Dickson 2005-01-13 17:17:32 EST
No... It was decided to use a patch very similar to the one posted in comment #28. Ernie will update this bz when the patch is committed
Comment 33 Brian Long 2005-01-14 10:34:50 EST
Comment #29 says a patch was committed. Is this patch going to re-enable flock over NFS? If so, could I get test kernels posted on someone's people page so we can test? Thanks.
Comment 34 Ernie Petrides 2005-01-14 19:42:08 EST
The fix committed to 2.4.21-27.7.EL on 11-Jan-2005 (3 days ago) has been slightly reworked such that the flock() avoidance test has been moved from nfs_lock() up to the f/s-independent layer. Note that this new fix has the exact same functional effect as the prior fix (specifically, that flock() syscalls on NFS files will no longer return ENOLCK errors despite the fact that locking is enforced only among processes on the local client). The new fix for this problem has just been committed to the RHEL3 U5 patch pool this evening (in kernel version 2.4.21-27.8.EL). Ken, please make a test kernel available to interested parties (after the official internal build completes later tonight). Thanks in advance. -ernie
Comment 35 Ken Preslan 2005-01-18 17:02:49 EST
Test kernels: http://people.redhat.com/kpreslan/2.4.21-27.8.EL/
Comment 36 Brian Long 2005-01-19 10:42:32 EST
Thank you. Can you upload kernel-source as well? We have third party tools (IBM Rational ClearCase) that must link against the kernel for our full testing. We've built kernel-source from your SRPM, but we'd prefer to use one provided by you directly.
Comment 37 Ken Preslan 2005-01-19 15:58:18 EST
kernel-source uploaded: http://people.redhat.com/kpreslan/2.4.21-27.8.EL/
Comment 38 Brian Long 2005-01-26 22:14:02 EST
Ken, so far we have shown flock() over NFS is working again, but when we try flock() while running IBM Rational ClearCase (the use case which was affected in the first place), it still fails. We are asking IBM in parallel how the MVFS kernel module behaves, and whether your patch to get flock() over NFS working at the fs-independent layer is still not going to work because MVFS is its own filesystem that also relies on NFS.
Comment 39 Michael Martinez 2005-01-28 14:58:30 EST
So this patch doesn't actually make flock() work over NFS, it just makes it exit 0? Michael Martinez
Comment 40 Ken Preslan 2005-01-28 15:05:19 EST
It makes flock() over NFS work in the same way it does in the mainline 2.4 and 2.6 trees. Different flock holders can conflict with other holders on the same machine, but not with holders on other machines.
Comment 41 Brian Long 2005-02-01 09:32:21 EST
I would like to request you engage IBM Rational Level 3 support regarding Clearcase 6.0 and their MVFS filesystem. We absolutely require ClearCase MVFS (on top of NFS) + flock() to work. Currently, flock() over raw NFS works with the 27.8 kernel, but when we use ClearCase and NFS, it still fails. Reference IBM PMR 67696 and 53463. Thank you!
Comment 42 Howard Owen 2005-02-11 17:46:37 EST
We seem to have worked out the MVFS issues with a ClearCase patch that leverages the FS_BROKEN_FLOCK mechanism to achieve the same result as the NFS code. The combination of their patch and yours works to meet Cisco's need. Thanks!
Comment 43 Ernie Petrides 2005-02-22 04:24:36 EST
Please note that the flock()-over-NFS avoidance fix committed to interim U5 kernel version 2.4.21-27.8.EL has been reworked yet again by changing the FS_BROKEN_FLOCK f/s flag to FS_WANT_FLOCK with an inverted sense. We have done this to allow the original version of MVFS (of ClearCase) to work with NFS in the RHEL3 U5 kernel without changes. If in the future MVFS is implemented over GFS as well, then MVFS's flock() handler will need the new conditionalized logic using fs_want_flock() that is now in fs/locks.c. These changes have just been committed to the RHEL3 U5 patch pool this evening (in kernel version 2.4.21-27.17.EL). Brian Long and/or Howard Owen, please undo the ClearCase patch that was previously needed to leverage the now-defunct FS_BROKEN_FLOCK mechanism.
Comment 45 Brian Long 2005-02-22 11:04:07 EST
I need to test the 2.4.21-27.17.EL kernel in order to know if IBM Rational can cancel their current patch which checks for FS_BROKEN_FLOCK. Is U5 beta going to be released this week? If not, I need the 2.4.21-27.17.EL kernel, kernel-smp and kernel-source built for testing. Thanks.
Comment 46 Ernie Petrides 2005-02-23 18:04:29 EST
Brian, one more (hopefully final) update: after checking in the -27.17.EL fix, it was brought to our attention that this would cause a *different* incompatibility with 3rd party file systems. So after 3 tries to resolve this issue, it has been decided to go back to the *original* fix that was committed to the -27.7.EL kernel (as explained in comment #29). So, the final fix for this problem has just been committed to the RHEL3 U5 patch pool this afternoon (in kernel version 2.4.21-28.EL, which is the U5 beta candidate kernel). I will attach the patch (with respect to U4).
Comment 47 Ernie Petrides 2005-02-23 18:10:21 EST
Created attachment 111360 [details] NFS flock() avoidance patch This is the U5 change (with respect to U4) to resolve this issue.
Comment 48 Tim Powers 2005-05-18 09:27:35 EDT
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-294.html