Bug 123415
Summary: | API Breakage: NFS "No locks available" with kernel 2.4.21-15.ELsmp | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Ole Holm Nielsen <ole.h.nielsen> | ||||||||||||
Component: | kernel | Assignee: | Ken Preslan <kpreslan> | ||||||||||||
Status: | CLOSED ERRATA | QA Contact: | |||||||||||||
Severity: | medium | Docs Contact: | |||||||||||||
Priority: | high | ||||||||||||||
Version: | 3.0 | CC: | alan, brilong, david.grierson, herrold, howen, joshua, kanderso, k.georgiou, nhorman, nhruby, pamadio, peterm, petrides, riel, sattia, sct, steved, tao, tburke, t.h.amundsen | ||||||||||||
Target Milestone: | --- | Keywords: | Regression | ||||||||||||
Target Release: | --- | ||||||||||||||
Hardware: | i686 | ||||||||||||||
OS: | Linux | ||||||||||||||
Whiteboard: | |||||||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||
Clone Of: | Environment: | ||||||||||||||
Last Closed: | 2005-05-18 13:27:35 UTC | Type: | --- | ||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||
Documentation: | --- | CRM: | |||||||||||||
Verified Versions: | Category: | --- | |||||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||
Embargoed: | |||||||||||||||
Bug Depends On: | |||||||||||||||
Bug Blocks: | 132991 | ||||||||||||||
Attachments: |
|
Description
Ole Holm Nielsen
2004-05-18 10:27:07 UTC
We have also noticed this bug. It seems to be spesific to locking over NFS using flock(), while fcntl() works fine. Example using exim_lock from Exim 4.30: $ touch foo $ exim_lock -v -fcntl foo exim_lock: fcntl() lock successfully applied exim_lock: locking foo succeeded: running /local/gnu/bin/bash ... $ exit exim_lock: foo closed $ exim_lock -v -flock foo exim_lock: flock() failed: No locks available exim_lock: file closed ... waiting With kernel family 2.4.21-9.0.3.EL both flock() and fcntl() work as expected. Reassigning NFS locking problem to our NFS maintainer. I just upgraded too to the same version of RHELAS 3.0 I have the same problem with NFS but with ezmlm and qmail. I tested the setup on parties running RHELAS 3.0 Update 2 and all gave the same results no locks available with ezmlm. I managed to workaround it by specifying nolock in the mount option but Im not sure is this safe or not. BSD flock locks are not supported. Why is this CLOSED NOTABUG? It's incredibly Bad And Rude(tm) to break support for a widely-used system call in a kernel patch, and then claim it's "not our problem". This is simply not good enough! It is my understanding that flocks have never worked in correctly in linux kernels. So the clami that we, Redhat, have added some patch that breaks or removed support is simply wrong. flocks are not supported in upstreams kernels so they are not supported in Redhat Kernels. Point taken, and I shouldn't have used the word "support", but "functionality". Nevertheless, from my (exceptionally naive, presumably) point of view, if the functionality of a particular system call or library function changes, it is either intentional (and therefore ought to be documented) or it is unintentional (and is therefore a bug). What appears to have happened is that the behaviour of an flock() call on a file on an NFS-mounted filesystem has changed from "maybe working, maybe not, but flock() reported success anyway" to "flock() reports failure". This is a pretty drastic change for a kernel patch, regardless of whether flock() is "officially" supported - there are so many third-party applications written for the BSD-based world - and therefore that are likely to use flock() instead of fcntl() - that such an arbitrary change cannot be made without some kind of "HEADS UP" warning. That is, if it was intentional. If it wasn't intentional, then by anyone's book it's a bug. If this has crept in as a result of changes to the upstream kernel, I sympathize, but one of the things I'd expect of Red Hat is to provide some production continuity and quality that cushions their users from changes such as this, however well-intentioned. To simply dismiss this with a "flock() isn't supported, go away" belies the reality that there is an awful lot of third-party software out there that relies on not-so-well-documented features (of which there are more than a few), and drastically changing functionality without warning people beforehand is going to upset a lot of customers, which I'd kind of expect Red Hat to consider a Bad Thing(tm). Of course, from my POV the right thing to do would be to make flock() work properly under NFS in Linux. But that'll never happen, will it? We are going to test a vanila kernel and check. If this works then I would consider Red Hat really a bad thing. We were going to upgrade about 70 servers to RHELAS. We were advised to not go with Red Hat and go with Debian instead. Seems that we should follow the advice. > Nevertheless, from my (exceptionally naive, presumably) point of > view, if the functionality of a particular system call or library > function changes, it is either intentional (and therefore ought to > be documented) or it is unintentional (and is therefore a bug). I truly sympathize with the fact that apps ported from BSD are breaking but I think its more of an porting issue than anything. The flock(2) man pages clearly states "flock(2) does not lock files over NFS. Use fcntl(2) instead:" > What appears to have happened is that the behaviour of an flock() > call on a file on an NFS-mounted filesystem has changed from "maybe > working, maybe not, but flock() reported success anyway" to "flock() > reports failure". No. It when from lying about functionality it was not doing (child process would not inherit the locks) to telling the application that this type of locking style is not supported on this filesystem. > If this has crept in as a result of changes to the upstream kernel, It was... See http://sourceforge.net/mailarchive/forum.php?thread_id=4784837&forum_id=4930 > I sympathize, but one of the things I'd expect of Red Hat is to > provide some production continuity and quality that cushions > their users from changes such as this, however well-intentioned. Well its not clear to me that continuing a functionality that is clearly broken is a good thing either... > drastically changing functionality without warning people > beforehand is going to upset a lot of customers, which I'd > kind of expect Red Hat to consider a Bad Thing(tm). Letting applications think that can pass locks to their child process, seems to me, to be a fairly major issue.... And no we not like and work very hard not to upset customers... but again I think this is a porting issue since it is clearly documented that flocks do not work on NFS. > Of course,from my POV the right thing to do would be to makeflock() > work properly under NFS in Linux. But that'll never happen, will it? Agreed, and there is talk of doing just that... see above mail thread... W.R.T "We were advised to not go with Red Hat and go with Debian instead" This code is in the latest release of Debian, SuSE, and the upstream kernels.... So I would hope you would reconsider.... I've hacked vacation slightly to create the .db files in /var/lib/vacation and not in the users' home directories. This solves the problem of gdbm not being able to open the db due to flock() over NFS. http://linux.duke.edu/~icon/RPMS/SRPMS/vacation-1.2.6.1-0.duke.4.el3.src.rpm Of course, the *real* solution to the problem would be to stop using this version of vacation. :) Solving the original problem of the vacation program not working, I decided to replace vacation by an alternative. Noting that procmail is the default mailer i Redhat's /etc/mail/sendmail.mc, I searched the net for vacation-like setups for procmail. I found a really excellent solution here: http://sial.org/howto/procmail/procmailrc-vacation which I can highly recommend. In vacation you can also add GDBM_NOLOCK to the vacation source at vacation.sf.net. Granted this might be unsafe...but it is (hopefully) unlikely and non-serious that a 2nd access will take place on the db file in such a short time: db = gdbm_open(VDB, 128, ((iflag || nflag) ? GDBM_NEWDB : GDBM_WRITER) to gdbm_open(VDB, 128, ((iflag || nflag) ? (GDBM_NEWDB | GDBM_NOLOCK) : (GDBM_WRITER | GDBM_NOLOCK) I notice this change also breaks the 'make test' phase of building Perl (e.g. 5.8.5) when the source resides on NFS :-( I've looked at the differences between the 2.4.21-15.EL kernel and the one previous.... this flock-over-NFS behaviour change seems to be caused by the linux-2.4.21-gfs-enablers.patch. If you remove that patch from the kernel, the old flock behavior comes back again. Seems like a major change to the RHEL3 API to me. Agreed, reopening as a major API breakage. My comments in comment #6 were were not completely accurate. We (or I) did not add any patches to the NFS code that change the type of support of flock()s. We never supported flocks and we continue to not support flock()s over NFS. To be quite clear, we are using the exact same nfs locking code (i.e. nfs_lock()) in every release RHEL3 (including the initial release). So Nothing has change w.r.t. the NFS locking code... But was has changed and broke KABI is the GFS enabler patch that Joshua have pointed out. The breakage occurs because the flock() system call (i.e. sys_flock()) was rewritten. This rewrite is causing the NFS failure. I'm reassign to our GFS guy... Is there clean way of adding an "if NFS filesystem" switch into sys_flock()? The simplest way of fixing this may be to change the error code returned by NFS. In nfs_lock(), return LOCK_USE_CLNT instead of returning -ENOLOCK if (fl->fl_flags & FL_POSIX). That will cause the VFS to process flock request and things go back to the way they were before. In the 2.6, kernel, there is a separate flock() file_operation. In RHEL3, the lock() operation is overloaded to do both flock and fcntl locks. We could (again) extend the file_operations structure to add a field, if we want to put up with the added ugliness. Errr... That should have been if !(fl->fl_flags & FL_POSIX). For RHEL3 we'd break ABI if we added another file_op. Other than that observation I agree with your proposed change to the NFS return for non POSIX locks. Created attachment 109305 [details]
patch to make NFS return LOCK_USE_CLNT for flock calls
As trivial as is, attached is a patch that fixes the problem by returning
LOCK_USE_CLNT as discussed above. I'm not 100% sure whether or not the check
of fl_owner should be broken out so that it still returns -ENOLCK. Opinions?
Created attachment 109372 [details]
Proposed Patch
This patch checks to make sure the FL_FLOCK bit is set
and will only return LOCK_USE_CLNT (causing a local
lock to be created) when the module parameter
nfs_local_flocks is set.
Created attachment 109373 [details]
Updated Patch per Alan's comments
This second patch is still wrong. It doesn't default to nfs_local_flocks as discussed. Also the mode is wrong - a 444 is read-only meaning it can't be changed. Please fix the patch and post to the list Ok mode is settable at insmod viewable at runtime. Steve clarified this aspect is correct and intentional Created attachment 109386 [details]
A patch that introduces the FS_BROKEN_FLOCK flag
This patch stop the nfs code from being called (similar to
how its done in upstream) with the use of a new
FS_BROKEN_FLOCKS flag in file_system_type structure.
A fix for this problem has just been committed to the RHEL3 U5 patch pool this evening (in kernel version 2.4.21-27.7.EL). Although there wasn't total agreement on the best way to resolve the issue, I decided that the safest and least source-code-perturbing approach was to make nfs_lock() return 0 in the FL_FLOCK case. Does this mean we will need to pass the nfs_local_flocks module option to the U5 kernel and the old flock over NFS behaviour (from 2.4.21-9.0.3 and earlier) will return? What exact entry in modules.conf is needed? Can we get testing RPMs of 2.4.21-27.7.EL so we can verify internally this patch works as expected? Thanks! No... It was decided to use a patch very similar to the one posted in comment #28. Ernie will update this bz when the patch is committed Comment #29 says a patch was committed. Is this patch going to re-enable flock over NFS? If so, could I get test kernels posted on someone's people page so we can test? Thanks. The fix committed to 2.4.21-27.7.EL on 11-Jan-2005 (3 days ago) has been slightly reworked such that the flock() avoidance test has been moved from nfs_lock() up to the f/s-independent layer. Note that this new fix has the exact same functional effect as the prior fix (specifically, that flock() syscalls on NFS files will no longer return ENOLCK errors despite the fact that locking is enforced only among processes on the local client). The new fix for this problem has just been committed to the RHEL3 U5 patch pool this evening (in kernel version 2.4.21-27.8.EL). Ken, please make a test kernel available to interested parties (after the official internal build completes later tonight). Thanks in advance. -ernie Test kernels: http://people.redhat.com/kpreslan/2.4.21-27.8.EL/ Thank you. Can you upload kernel-source as well? We have third party tools (IBM Rational ClearCase) that must link against the kernel for our full testing. We've built kernel-source from your SRPM, but we'd prefer to use one provided by you directly. kernel-source uploaded: http://people.redhat.com/kpreslan/2.4.21-27.8.EL/ Ken, so far we have shown flock() over NFS is working again, but when we try flock() while running IBM Rational ClearCase (the use case which was affected in the first place), it still fails. We are asking IBM in parallel how the MVFS kernel module behaves and if your patch to get flock() over NFS working at the fs-independent layer is still not going to work because MVFS is it's own filesystem that also relies on NFS. So this patch doesn't actually make flock() work over NFS, it just makes it exit 0? Michael Martinez It makes flock() over NFS work in the same way it does in the mainline 2.4 and 2.6 trees. Different flock holders can conflict with other holders on the same machine, but not with holders on other machines. I would like to request you engage IBM Rational Level 3 support regarding Clearcase 6.0 and their MVFS filesystem. We absolutely require ClearCase MVFS (on top of NFS) + flock() to work. Currently, flock() over raw NFS works with the 27.8 kernel, but when we use ClearCase and NFS, it still fails. Reference IBM PMR 67696 and 53463. Thank you! We seem to have worked out the MVFS issues with a Clearcase patch that leverages the FS_BROKEN_FLOCK mechanism to acheive the same result as for NFS code. The combination of their patch and yours works to meet Cisco's need. Thanks! Please note that the flock()-over-NFS avoidance fix committed to interim U5 kernel version 2.4.21-27.8.EL has been reworked yet again by changing the FS_BROKEN_FLOCK f/s flag to FS_WANT_FLOCK with an inverted sense. We have done this to allow the original version of MVFS (of ClearCase) to work with NFS in the RHEL3 U5 kernel without changes. If in the future MVFS is implemented over GFS as well, then MVFS's flock() handler will need the new conditionalized logic using fs_want_flock() that is now in fs/locks.c. These changes have just been committed to the RHEL3 U5 patch pool this evening (in kernel version 2.4.21-27.17.EL). Brian Long and/or Howard Owen, please undo the ClearCase patch that was previously needed to leverage the now-defunct FS_BROKEN_FLOCK mechanism. I need to test the 2.4.21-27.17.EL kernel in order to know if IBM Rational can cancel their current patch which checks for FS_BROKEN_FLOCK. Is U5 beta going to be released this week? If not, I need the 2.4.21-27.17.EL kernel, kernel-smp and kernel-source built for testing. Thanks. Brian, one more (hopefully final) update: after checking in the -27.17.EL fix, it was brought to our attention that this would cause a *different* incompatibility with 3rd party file systems. So after 3 tries to resolve this issue, it has been decided to go back to the *original* fix that was committed to the -27.7.EL kernel (as explained in comment #29). So, the final fix for this problem has just been committed to the RHEL3 U5 patch pool this afternoon (in kernel version 2.4.21-28.EL, which is the U5 beta candidate kernel). I will attach the patch (with respect to U4). Created attachment 111360 [details]
NFS flock() avoidance patch
This is the U5 change (with respect to U4) to resolve this issue.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-294.html |