Bug 1334573
Summary: | During shutdown, systemd causes the machine to hang indefinitely if /usr/local is a symlink to an automounted location, requiring someone to physically hit the power button to power cycle | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Ashima Rawat <arawat> |
Component: | systemd | Assignee: | Michal Sekletar <msekleta> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | qe-baseos-daemons |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.0 | CC: | awshaikh, ccheney, gabriel, helpdesk, kwalker, msekleta, msz, systemd-maint-list |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-06-17 19:27:49 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1298243, 1420851 |
Description
Ashima Rawat
2016-05-10 05:02:59 UTC
I would say that this is an unsupported scenario. Systemd requires the full /usr to be prepared from the initrd. So definitely not an automount, and if that directory should be mounted from NFS, then it should be done inside the initrd.

(In reply to Lukáš Nykrýn from comment #2)
> I would say that this is an unsupported scenario. Systemd requires the
> full /usr to be prepared from the initrd. So definitely not an automount,
> and if that directory should be mounted from NFS, then it should be done
> inside the initrd.

Hi there. I'm the one who originally reported this to Red Hat, through which Ashima was able to reproduce the issue on Red Hat's end, confirm the issue, and create this report for us. (Great job Ashima, your patience through all the back-and-forth is greatly appreciated.)

I can't find any documentation which states that /usr and its subdirectories must be in a certain state for systemd to function properly. If such documentation exists, could you link me to it? I could have missed it.

Because RHEL7 can be rendered unusable for production environments with a few basic commands, if this can't be fixed, I think it needs to be documented somewhere that /usr and all of its subdirectories must exist in some specific state for systemd to function properly.

I don't want to get too much into non-technical, debatable discussion here, but the one thing I will say is that the use of /usr/local as an NFS mount is not too bizarre. Our organization has been doing it for more than 20 years across various non-systemd operating systems, including RHEL5 and RHEL6. A quick Google search for "/usr/local nfs" brings back an enormous amount of discussion on the use of /usr/local as an NFS mount.
One of the leading sources of documentation on Linux filesystem hierarchy practices (in general, not specific to any distribution) states that /usr/local "might be just mounted read-only from somewhere else":

http://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/usr.html

I realize that Red Hat is not bound by the practices set forth by third parties, but I just want to show that investigating and correcting the root cause of this issue would not just be for us; it would benefit the entire Linux community by keeping existing, pre-systemd practices compatible in the systemd world. We are not the only ones with this issue, and it will continue to slowly affect organizations that haven't yet made, but eventually will make, the jump to systemd.

Sorry, I missed the part that this is shutdown related; I don't know why I thought that you wrote that this happens during boot.

Do you use NetworkManager? If so, could you try adding After=dbus.service to its unit file?

(In reply to Lukáš Nykrýn from comment #4)
> Do you use NetworkManager? If so, could you try adding After=dbus.service to
> its unit file?

Thank you for your time and consideration, Lukáš. Unfortunately, "After=dbus.service" in the NetworkManager.service unit file didn't show any improvement. Just to eliminate NetworkManager from the equation, I disabled NetworkManager, and the same problem occurs during shutdown.

At Ashima's suggestion in my open support case with Red Hat, adding "remote-fs.target" to the After= clause of autofs.service did improve the situation, in that the reboot failure rate has gone down from 100% to about 20%-50%. Unlike before, where the failure was consistent, it now seems to be random luck whether or not it hangs. The first 5 reboots I did with that change were fine, and I was overjoyed, thinking the problem had been solved. But then over the next 5 reboots, it failed twice.
I went on to do about 20-30 additional reboots, and found the failure rate to be somewhere between 20% and 50%.

Hi Lukáš,

Please let me know of any progress on this Bugzilla. I suggested a workaround to the customer, but it doesn't seem to work. Looking forward to your input on this.

Thanks,
Ashima

I get a similar problem on an up-to-date CentOS 7 system (kernel 3.10.0-327.36.2.el7.x86_64, systemd-219-19.el7_2.13, autofs-5.0.7-54.el7), where /usr/local is automounted directly (by auto.direct) instead of being a symlink. The shutdown process stops, displaying:

Stopping LSB: Bring up/down networking...

It usually helps to stop autofs manually (systemctl stop autofs) before shutdown, but I have also seen this command hang (once or twice): I could still log in to the machine remotely, but it was impossible to reboot it without a hard reset.

Adding "remote-fs.target" to the After= clause of autofs.service did not help at all. The shutdown process hung immediately, showing:

(1 of 2) A start job is running for Restore of /run/initramfs (14 s/no limit)

regards,
Michal

There have been a number of alterations to avoid hangs during the final stages of the shutdown process in recent systemd revisions. One of them was a backport for another bug report, below.

Bug 1519245 - hangs on reboot or shutdown when nfs file system mounted [rhel-7.4.z]

It included the following patchset:

* Thu Dec 07 2017 Lukas Nykryn <lnykryn> - 219-42.5
- unmount: Pass in mount options when remounting read-only (#1312002)
- shutdown: don't remount,ro network filesystems. (#6588) (#1312002)
- shutdown: fix incorrect fscanf() result check (#6806) (#1312002)

This may avoid the complete hang condition on shutdown by carefully avoiding NFS filesystems during remount operations.

Would it be possible to verify whether an update to the systemd version in the following errata resolves the condition?
https://access.redhat.com/errata/RHBA-2018:0155

- Kyle Walker

I have asked the customer to upgrade the systemd package and let us know the results.

Based on my update in comment 24, and the lack of further reported instances, I am closing this bug as CURRENTRELEASE. If this issue is suspected to recur, please open a new bug report and refer to this one.
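
For anyone checking whether a host matches the configuration discussed in this report (/usr/local resolving onto an automounted or NFS location), the following is a minimal Python sketch, not something taken from the thread. The helper names (parse_mounts, covering_mount, is_network_backed) and the set of filesystem types treated as network-backed are assumptions made for illustration, and the parser ignores the octal escapes (such as \040 for a space) that /proc/mounts uses in paths.

```python
import os

# Assumption: these filesystem types count as "automounted or NFS" here.
NETWORK_FS = {"autofs", "nfs", "nfs4"}


def parse_mounts(text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def covering_mount(path, entries):
    """Return the entry whose mount point is the longest prefix of path.

    When several entries share a mount point, the last one listed wins,
    since it is the mount stacked on top.
    """
    best = None
    for mount_point, fs_type in entries:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_network_backed(path, mounts_text):
    """True if path, after resolving symlinks, sits on an autofs/NFS mount."""
    real = os.path.realpath(path)
    hit = covering_mount(real, parse_mounts(mounts_text))
    return hit is not None and hit[1] in NETWORK_FS
```

On a live system this could be called as is_network_backed("/usr/local", open("/proc/self/mounts").read()). A True result only indicates that the host resembles the reported setup; it does not by itself show that shutdown will hang.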