Bug 580863 - NFS: Update lockd to fully support GFS2 (or other cluster filesystems)
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Steve Whitehouse
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-04-09 09:48 UTC by Steve Whitehouse
Modified: 2019-05-31 14:07 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: ---
Embargoed:



Description Steve Whitehouse 2010-04-09 09:48:54 UTC
Using NFS lockd with GFS2 in clustered mode is problematic at the moment. There are probably a number of issues which need to be looked at, including the VFS, lockd and GFS2/dlm/dlm_controld.

This is a bug to track the work required to make this work as intended. There will probably be a number of sub tasks before the work is complete.

Comment 1 Steve Whitehouse 2010-04-09 09:56:11 UTC
Bruce's seven patches:
http://git.linux-nfs.org/?p=bfields/linux-topics.git;a=shortlog;h=refs/heads/fair-queueing

Comment 2 Steve Whitehouse 2010-04-09 09:59:19 UTC
Thread on linux-nfs:

http://thread.gmane.org/gmane.linux.nfs/31448

Comment 3 Steve Whitehouse 2010-04-13 11:33:44 UTC
Issues to address:

 - BKL removal in locks.c
 - Sharing of fcntl locks between NFS & local/remote processes
 - Recovery
 - Lock cancellation interface

Anything I missed?

Comment 4 Steve Whitehouse 2010-07-16 16:16:13 UTC
See also this patch:

http://lkml.org/lkml/2010/7/10/125

Comment 5 Alan Brown 2011-01-26 21:10:57 UTC
Where are you at on this Steve?

Comment 6 Steve Whitehouse 2011-01-27 10:27:17 UTC
Well some pieces of the work are done in upstream now. The BKL removal, for example. Also there have been some other updates relating to fcntl locking which have improved the base from which we have to work.

There is still the issue of how we deal with lock recovery though (for fcntl locks that is). We don't currently have a solution to that problem, and I suspect it will be a while yet before we do. Any suggestions are very welcome though.

Comment 7 Colin.Simpson 2012-04-20 20:33:50 UTC
Would this bug report, if addressed, allow a GFS2 filesystem to be used concurrently through NFS and local access (Samba, for example) without causing potential filesystem corruption?

If so great, any update on this?

Comment 8 Steve Whitehouse 2012-04-21 11:54:18 UTC
The short answer to comment #7 is yes. However, this is actually a very complicated question, and it is not just going to be a case of "here is a patch which provides this feature" but something that we'll be working towards over a period of time.

Both NFS and Samba do locking in their own, mutually incompatible ways. So we'll be making steps towards this eventual goal over a period of time.

The first step is to allow clustered samba exports from multiple nodes - that is already working, and we support that in RHEL 6.2 and above.

There are other tasks which are going on at the same time, to fill in a few more pieces of the puzzle. It is likely to be some time before we finally get to the eventual goal you outlined in comment #7.

In the meantime, we would be very interested in collecting use case info from people, since that will help us prioritise certain aspects of the design. So if you have any potential applications for such a system that you could share with us, we would be very interested.

Comment 9 Alan Brown 2012-04-23 10:46:20 UTC
A lot of the problem with NFS stems from nfsd being in kernel space instead of userspace where it should be.

As one of the people responsible for that (back in ~1993-4) all I can do is apologise. Samba didn't exist then and lock races/compatibility weren't issues that were considered (we did it to speed up access for 8-bit PCnfs clients).

I _strongly_ suggest that an effort be made to move NFS back into userspace.

Comment 10 Colin.Simpson 2012-04-23 15:17:44 UTC
On comment #8: acknowledged. I know it's quite a long process to get all the required pieces in place.

Basically I want to have a clustered Intranet server providing HA for file and print services to both Windows and Linux clients via Samba and NFS respectively. 

Normally, sharing a file system through both of these should work on other file systems, although locking issues between them could cause file-level corruption. On GFS2, however, this can result in filesystem corruption. My information came from a very helpful support reply I received last year (copied at the end of this post). The only way around this, according to RH, was to re-export the NFS.

Interestingly, a thread on a Samba mailing list suggests locking should work between NFS and Samba, and that you shouldn't re-export NFS on Samba:

http://lists.samba.org/archive/samba-technical/2012-April/082954.html

so still a bit confused.

I can't imagine that anyone nowadays who has a workstation environment with NFS exports wouldn't also want to re-export to CIFS.

So basically, back to the main point: GFS2 should be a filesystem with "no surprises" and allow me to re-export on CIFS and NFS without causing filesystem corruption.

On comment #9, I couldn't agree more; I have always wanted NFS back in userspace. It just locks up or stops local fs umounts way too often. It would be nice to be able to free everything simply by killing a process (as with Samba, to be honest).


Original support info:
"Exporting NFS and Samba from the same directory tree is unsupported regardless of the filesystem in question, be it GFS, GFS2, EXT3, EXT4, or NFS. The reason that it is unsupported is that NFS and Samba use different locking systems. NFS uses BSD style locks (flock) and Samba uses POSIX locks (fcntl). The flock and fcntl locking systems are not compatible. So if NFS takes out an exclusive-write flock on a file (because an NFS client is making a change to that file), Samba could come along and read from or write to the file even though NFS expects the file to be exclusively locked. The converse could happen as well, where Samba could ex-lock a file and NFS could come along and read, write, or even delete that file. The end result will be file corruption.

The situation is even more complex on GFS2 as GFS2 employs its own locking layer called "glocks." The conflict between flocks, plocks, and glocks can result not only in file corruption (as you'd see on other filesystems) but filesystem metadata corruption resulting in withdraws or panics and necessitating a filesystem check and repair. The situation is further complicated as the only correct way to export NFS from GFS2 requires modifying the behavior of the GFS2 locking layer, thus making the possibility of filesystem corruption or lock contention even more pressing; this is why when exporting NFS from GFS2 one isn't even supposed to access the filesystem via any method other than NFS."
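
As a rough illustration of the flock/fcntl independence the support reply describes, the following Python sketch holds an exclusive flock on one descriptor and then probes the same file through a second descriptor with both lock types. The function name `probe_lock_interaction` is made up for this example, and the behaviour shown is assumed to apply to a local Linux filesystem only; NFS clients, and GFS2 with its cluster-wide plocks, may behave differently. It says nothing about which lock type lockd or Samba actually takes.

```python
import fcntl
import os
import tempfile


def probe_lock_interaction(path):
    """Hold an exclusive flock on `path`, then probe it via a second descriptor.

    Returns (second_flock_blocked, fcntl_lock_blocked).
    Hypothetical helper for illustration; not part of any NFS or Samba API.
    """
    holder = os.open(path, os.O_RDWR)   # descriptor that owns the flock
    other = os.open(path, os.O_RDWR)    # independent open of the same file
    try:
        fcntl.flock(holder, fcntl.LOCK_EX)

        # A second BSD-style lock on a different open file description
        # should conflict with the one already held.
        try:
            fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            second_flock_blocked = False
            fcntl.flock(other, fcntl.LOCK_UN)
        except OSError:
            second_flock_blocked = True

        # A POSIX (fcntl) whole-file lock is tracked separately, so on a
        # local Linux filesystem it is assumed not to see the flock.
        try:
            fcntl.lockf(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl_lock_blocked = False
            fcntl.lockf(other, fcntl.LOCK_UN)
        except OSError:
            fcntl_lock_blocked = True

        return second_flock_blocked, fcntl_lock_blocked
    finally:
        # Closing the descriptors drops any locks still held through them.
        os.close(other)
        os.close(holder)


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        print(probe_lock_interaction(tmp_path))
    finally:
        os.unlink(tmp_path)
```

On a local Linux filesystem this is expected to return `(True, False)`: the second flock is refused, while the fcntl lock is granted as if no lock were held.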

Comment 11 Alan Brown 2013-01-24 14:08:46 UTC
Has this progressed in any way since last year?

Comment 12 Steve Whitehouse 2013-01-24 14:36:28 UTC
Well at least the BKL removal has been done now in locks.c (and the rest of the kernel) so we are at least one step along the way.

Bruce may be able to give more information on the current plans for locks.c - he was working on a solution to unify the locking between NFS and Samba (and, I hope userspace too), but I'm not sure of the current status.

This is however still very much on our todo list. We have been working on some similar (i.e. important for NFS/Samba integration) issues recently. Since none of these items is standalone, and they all require cooperation between multiple subsystems, implementation is not simple, unfortunately.

Comment 13 Alan Brown 2013-10-28 13:37:10 UTC
Complete change of tactic:

Why not just ditch Kernel NFSD entirely and replace it with NFS-Ganesha? It's entirely in userspace and plays nicely with other disk access, including Samba.

It saves trying to bang square pegs into round holes.

Comment 14 J. Bruce Fields 2013-10-28 18:01:18 UTC
(In reply to Alan Brown from comment #13)
> Complete change of tactic:
> 
> Why not just ditch Kernel NFSD entirely and replace it with NFS-Ganesha?
> It's entirely in userspace and plays nicely with other disk access,
> including Samba.


We're following Ganesha with interest.  For now I suspect its consistency with other filesystem access is actually less tight than knfsd's, but the details here are fairly complicated.

There are also a number of things we can do (such as kernel support for rich ACLs and delegations, for example) that benefit both projects.

