In order to migrate clients using a filesystem from server A to server B, we need to first unmount the filesystem from server A. But that unmount can fail with -EBUSY, for reasons including:

- An in-progress rpc is using the filesystem.
- An rpc recently used the filesystem, and A's export caches still hold a reference.
- A client holds a lock, or (for NFSv4 clients) an open or delegation, or (for NFSv4.1 and higher clients) a pNFS layout.

We could handle the first two issues by temporarily stopping rpc.mountd, flushing the export caches (with exportfs -f), unmounting, then restarting rpc.mountd. But that's insufficient in the presence of NLM or NFSv4 state.

The way we currently deal with this is roughly the following (a rough shell sketch appears below):

- Configure all clients accessing the filesystem to mount it through a single floating IP address.
- Shut down nfsd on both A and B.
- Unmount the filesystem from A and mount it on B.
- Move the floating IP address.
- Restart nfsd on both A and B.
- Allow the clients to reclaim their state during B's grace period.

Shutting down the server on A is what makes the unmount work reliably. One problem with this procedure is that all the other clients (including clients accessing different exports through different server IP addresses) are forced to wait for the server restart. We'd like to fix that.
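For concreteness, here is a hedged sketch of the current procedure. The hostnames (serverA, serverB), the export path (/export/data), the shared block device (/dev/datavg/data), the floating IP (192.0.2.10/24 on eth0), and the use of systemd units and ssh are all illustrative assumptions, not what the existing HA agents literally do:

    #!/bin/sh
    set -e

    # Stop nfsd on both servers so no rpc or lock can pin the filesystem on A.
    ssh serverA systemctl stop nfs-server
    ssh serverB systemctl stop nfs-server

    # With the server down, the unmount on A should no longer return -EBUSY.
    # Assumes the filesystem lives on shared storage visible to both servers.
    ssh serverA umount /export/data
    ssh serverB mount /dev/datavg/data /export/data

    # Move the floating IP that all clients of this filesystem use.
    ssh serverA ip addr del 192.0.2.10/24 dev eth0
    ssh serverB ip addr add 192.0.2.10/24 dev eth0

    # Restart nfsd; clients reconnect to the floating IP and reclaim their
    # locks, opens, and delegations during B's grace period.
    ssh serverA systemctl start nfs-server
    ssh serverB systemctl start nfs-server

The key point the sketch makes visible is the blast radius: every client of either server sits through the nfsd restarts and B's grace period, not just the clients of the filesystem being migrated.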
Alternatives include:

- Extend nfsd's unlock_filesystem/unlock_ip interfaces. They provide a way to forcefully drop NLM locks, but don't handle v4 state. This may be the simplest fix. (A sketch of how the existing interfaces are driven today appears after this list.)
- Teach umount to forcibly remove nfs locks; I believe that's essentially what https://bugzilla.redhat.com/show_bug.cgi?id=749044#c10 is suggesting. This may violate user expectations in some cases and shouldn't be the default behavior. (Client applications will likely crash, rather than hanging while waiting for the filesystem to return as they would across a server reboot.) However, we could provide an allow_force_umount export option. Implementation is somewhat tricky. Kinglong Mee has tried to do something similar just for the references held by the export caches, and this could build on that work (which builds in turn on Al Viro's mount pin work). This might also have other uses, e.g. for people who want to decommission some filesystem badly enough that they don't mind crashing client applications.
- Finish nfsd containerization and run a separate container for each floating IP: this allows independent shutdown and startup of nfs servers for each floating IP without affecting other floating IPs on the same machine. That also prevents unnecessary grace-period delays for other clients of server B (the target of the migration). It should also make client configuration more foolproof, since only exports meant to be used over a given floating IP will be visible over that IP. I believe the last missing piece here is containerization of usermode helpers. Ian Kent has done some work on that; I'm not sure where it stands.

The first two solutions force us to shut down the floating IP before unexporting and unmounting (to prevent the client from seeing spurious ESTALE errors), and I believe both leave us vulnerable to the ACK storm problems described in https://bugzilla.redhat.com/show_bug.cgi?id=1161795.
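For reference, the first alternative would build on interfaces that already exist; a hedged sketch of driving them today follows. The export path (/export/data) and floating IP (192.0.2.10) are assumptions for illustration, and note that these files currently release only NLM state — the proposal is to extend them (or add similar controls) to also drop NFSv4 opens, delegations, and layouts:

    #!/bin/sh
    # Drop all NLM locks held by clients of a particular server (floating) IP:
    echo 192.0.2.10 > /proc/fs/nfsd/unlock_ip

    # Or drop all NLM locks on a particular exported filesystem:
    echo /export/data > /proc/fs/nfsd/unlock_filesystem

    # Combined with stopping rpc.mountd and flushing the export caches
    # (exportfs -f), this improves the odds that the unmount succeeds --
    # but any v4 opens, delegations, or layouts will still pin the mount.
    umount /export/data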
There's an existing solution similar to containerization -- use a virtual machine for each floating IP or export. That gives you all the features of the containerized setup, it works today, and creates a server architecture that is pNFS-ready (for some layouts). We rarely hear about this HA NFS architecture because the people that use it are not having any of these problems, so I think we tend to forget about it.
(In reply to Benjamin Coddington from comment #3)
> There's an existing solution similar to containerization -- use a virtual
> machine for each floating IP or export. That gives you all the features of
> the containerized setup, it works today, and creates a server architecture
> that is pNFS-ready (for some layouts). We rarely hear about this HA NFS
> architecture because the people that use it are not having any of these
> problems, so I think we tend to forget about it.

I'm all for it. Can we figure out whether we have users for whom that isn't an adequate replacement for the NFS HA stuff? If so, can we fix any problems with the VM-based approach and deprecate the existing NFS HA agents?
*** Bug 1145930 has been marked as a duplicate of this bug. ***
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.