Bug 1007607 - systemd fails to reboot/shutdown if the system has a stale NFS handle
Summary: systemd fails to reboot/shutdown if the system has a stale NFS handle
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 980088 1007745
Blocks:
Reported: 2013-09-12 23:09 UTC by John Schmitt
Modified: 2014-02-12 07:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-02-12 07:45:53 UTC
Type: Bug
Embargoed:


Links
System: Red Hat Bugzilla
ID: 1007745
Private: 0
Priority: unspecified
Status: CLOSED
Summary: Cannot unmount "stale" NFS handles
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1007745

Description John Schmitt 2013-09-12 23:09:08 UTC
Some network event left NFS handles "stale".  No combination of killing processes, remounting, or restarting services cleared the stale handle.

When all else failed, I initiated a reboot.  This also failed.  On the console, the affected systems keep displaying the following forever:

[...timestamp...] nfs: server my.ip.add.ress not responding, still trying
Unmounting /mnt/myMountPoint
Could not unmount /mnt/myMountPoint: Stale file handle

See also:

https://bugzilla.redhat.com/show_bug.cgi?id=851665
https://bugzilla.redhat.com/show_bug.cgi?id=750926

Comment 1 Lennart Poettering 2013-09-13 03:36:52 UTC
Well, what are we supposed to do with this? This hangs in the kernel...

Comment 2 John Schmitt 2013-09-13 09:04:11 UTC
How about limiting the time shutdown should take?  Limiting the number of attempts to unmount?

Comment 3 Lennart Poettering 2013-09-13 20:24:29 UTC
(In reply to John Schmitt from comment #2)
> How about limiting the time shutdown should take? 

Well, we just invoke umount(), and the kernel then blocks, which is something we cannot cancel.

That said, we actually turn on the hardware watchdog when entering the shutdown phase (if you happen to have one, but almost all systems from the last few years do), so after a long timeout of 10min the machine should simply reset. (This is configurable via ShutdownWatchdogSec= in system.conf. Note, however, that this is subject to hardware limitations, and a lot of hardware can't do such long watchdog timeouts...)

> Limiting the number of
> attempts to unmount?

We do that.
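
On the point about limiting how long shutdown waits: since a blocked umount() cannot be cancelled from within the calling process, one possible workaround is to issue the call from a child process and let the parent give up after a deadline. The following is only a minimal sketch of that idea, not what systemd does in the version discussed here. The helper name umount_with_timeout and its return codes are made up for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not systemd code): attempt umount in a child process
 * and wait at most timeout_sec seconds for it.
 * Returns 0 on success, 1 if the unmount failed, 2 on timeout,
 * -1 if fork() or waitpid() failed. */
int umount_with_timeout(const char *path, int timeout_sec)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this call may block in the kernel indefinitely. */
        _exit(umount2(path, 0) == 0 ? 0 : 1);
    }

    /* Parent: poll every 100 ms until the child exits or the deadline passes. */
    struct timespec tick = { 0, 100 * 1000 * 1000 };
    for (int i = 0; i < timeout_sec * 10; i++) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
        if (r < 0 && errno != EINTR)
            return -1;
        nanosleep(&tick, NULL);
    }

    /* Deadline passed. SIGKILL may not take effect while the child is stuck
     * in uninterruptible sleep, so the child is not reaped here. */
    kill(pid, SIGKILL);
    return 2;
}
```

The parent can then carry on with shutdown even though the child may remain stuck in the kernel. This does not free the mount, it only bounds the wait.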

Comment 4 Orion Poplawski 2013-09-13 21:19:24 UTC
One problem wrt netfs: netfs would call umount with '-f -l'.  It does not appear that systemd uses MNT_FORCE|MNT_DETACH, which is going to be necessary in the case of an unreachable NFS server.
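
For reference, the flags mentioned above are passed to the Linux umount2() system call. The sketch below shows roughly what a forced, lazy unmount looks like at that level. The helper name force_lazy_umount is made up for illustration and this is not systemd's code.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical helper (not systemd code): roughly the syscall-level
 * counterpart of `umount -f -l <path>`.
 * Returns 0 on success, -1 on failure (errno is set by umount2). */
int force_lazy_umount(const char *path)
{
    /* MNT_FORCE:  ask the filesystem to abort pending requests, which may
     *             avoid waiting for an unreachable server (risk of data loss).
     * MNT_DETACH: lazy unmount; detach the mount from the tree now and clean
     *             up once it is no longer busy. */
    if (umount2(path, MNT_FORCE | MNT_DETACH) < 0) {
        fprintf(stderr, "umount2(%s) failed: %s\n", path, strerror(errno));
        return -1;
    }
    return 0;
}
```

Calling this requires the privilege to unmount (typically root), for example force_lazy_umount("/mnt/myMountPoint").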

Comment 5 John Schmitt 2013-09-16 22:50:02 UTC
I've been trying to take advantage of ShutdownWatchdogSec.  Sadly, my vmware VMs do not have a /dev/watchdog.  I have been able to use 

systemctl reboot --force

though.

Comment 6 John Schmitt 2014-02-12 07:45:53 UTC
I no longer see this with Fedora 20 with the 3.12 kernel.

