Bug 1369790

Summary: System fails to shutdown with /usr on iSCSI
Product: Red Hat Enterprise Linux 7 Reporter: Jan Stodola <jstodola>
Component: initscripts Assignee: David Kaspar // Dee'Kej <deekej>
Status: CLOSED ERRATA QA Contact: Leos Pol <lpol>
Severity: high Docs Contact: Filip Hanzelka <fhanzelk>
Priority: high    
Version: 7.3 CC: deekej, lnykryn, mbanas, msekleta
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: initscripts-9.49.38-1.el7 Doc Type: Release Note
Doc Text:
The system no longer fails to terminate with `/usr` on *iSCSI* or *NFS*

In previous versions of Red Hat Enterprise Linux 7, the termination of the system sometimes failed and the system remained hung if the `/usr` folder was mounted over a network (for example, *NFS* or *iSCSI*). This issue has been resolved, and the system should now shut down normally.
Story Points: ---
Clone Of:
: 1446171 (view as bug list) Environment:
Last Closed: 2017-08-01 07:29:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1446171    
Bug Blocks: 1256306, 1380361, 1393867, 1400961    
Attachments:
Description Flags
console.log none

Description Jan Stodola 2016-08-24 12:11:55 UTC
Created attachment 1193615 [details]
console.log

Description of problem:
There are errors when shutting down a system with /usr on iSCSI (/boot, / and swap are on a local disk):

[root@localhost ~]# lsblk 
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0    8G  0 disk 
└─sda1   8:1    0    8G  0 part /usr
vda    253:0    0    8G  0 disk 
├─vda1 253:1    0  500M  0 part /boot
├─vda2 253:2    0  4.9G  0 part /
└─vda3 253:3    0  2.6G  0 part [SWAP]
[root@localhost ~]# reboot
[  OK  ] Started Show Plymouth Reboot Screen.
[  OK  ] Stopped Permit User Sessions.
[  OK  ] Stopped Dynamic System Tuning Daemon.
[  OK  ] Stopped Postfix Mail Transport Agent.
[  OK  ] Stopped LSB: Starts the Spacewalk Daemon.
[  OK  ] Stopped target Network is Online.
[  OK  ] Stopped target Remote File Systems.
[  OK  ] Stopped target Remote File Systems (Pre).
[  OK  ] Stopped Login and scanning of iSCSI devices.
         Stopping Login and scanning of iSCSI devices...
         Stopping Logout off all iSCSI sessions on shutdown...
[  OK  ] Stopped Logout off all iSCSI sessions on shutdown.
         Stopping Open-iSCSI...
[  OK  ] Stopped Open-iSCSI.
[  OK  ] Stopped target Network.
         Stopping LSB: Bring up/down networking...
[  OK  ] Started Restore /run/initramfs.
[     *] A stop job is running for LSB: Bring up/down networking (10s / 5min)[  560.001501]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295218066, last ping 4295223072, now 4295228080
[  560.003572]  connection1:0: detected conn error (1022)
[   ***] A stop job is running for LSB: Bring up/down networking (51s / 5min)

The system will not shut down; see the attached console.log.


Version-Release number of selected component (if applicable):
RHEL-7.2 GA
RHEL-7.3 Beta (anaconda-21.48.22.82-1.el7, systemd-219-26.el7, dracut-033-453.el7)

How reproducible:
always

Steps to Reproduce:
1. Install a system with /usr on iSCSI and /boot, / and swap stored on a local disk
2. Try to reboot the installed system

Actual results:
error messages, connection timeout, system doesn't reboot

Expected results:
system reboots without errors, all file systems are unmounted cleanly

Comment 2 Michal Sekletar 2016-08-26 13:40:28 UTC
This is again caused by /etc/init.d/network bringing network interfaces down before we can cleanly unmount /usr in the shutdown initrd.

We already have the following code in the network script, which is supposed to solve the same issue for rootfs:

rootfs=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/" && $3 != "rootfs") { print $3; }}' /proc/mounts)
rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)
	
if [[ "$rootfs" == nfs* || "$rootopts" =~ _r?netdev ]] ; then
	exit 1
fi

systemctl show --property=RequiredBy -- -.mount | grep -q 'remote-fs.target' && exit 1

When I added the following to the network script, I was able to reboot the machine cleanly.

usrfs=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/usr") { print $3; }}' /proc/mounts)
usropts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/usr") { print $4; }}' /etc/fstab)
	
if [[ "$usrfs" == nfs* || "$usropts" =~ _r?netdev ]] ; then
	exit 1
fi

Note that I changed how filesystem options are gathered. I think that grepping /etc/mtab for _netdev is useless, because that option never appears there; it is only present in /etc/fstab.
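
The option check described above can be sketched as a small standalone helper. This is only an illustration, not the code that was merged: the function name has_netdev_opt, the optional file argument, and the sample fstab entries are all hypothetical, and the helper is written in portable sh rather than the bash used by the network script.

```shell
# Hypothetical helper (not part of initscripts): succeed if MOUNTPOINT is
# listed with the _netdev or _rnetdev option in an fstab-style file.
# Usage: has_netdev_opt MOUNTPOINT [FSTAB_FILE]
has_netdev_opt() {
    mountpoint=$1
    fstab=${2:-/etc/fstab}
    # Field 2 is the mount point, field 4 the option list; skip comment lines.
    opts=$(awk -v mp="$mountpoint" '$1 !~ /^#/ && $2 == mp { print $4 }' "$fstab")
    case "$opts" in
        *_netdev*|*_rnetdev*) return 0 ;;
        *) return 1 ;;
    esac
}

# Demonstration against a throwaway file with made-up entries.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# device    mountpoint  type  options           dump pass
/dev/vda2   /           xfs   defaults          0 0
/dev/sda1   /usr        xfs   defaults,_netdev  0 0
EOF

if has_netdev_opt /usr "$sample"; then
    echo "/usr is marked as a network device"
fi
rm -f "$sample"
```

Taking the file as a parameter keeps the check easy to try against test data; on a real system the second argument would simply be omitted so that /etc/fstab is read.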

Reassigning to initscripts.

PS: Just adding systemctl show --property=RequiredBy -- usr.mount | grep -q 'remote-fs.target' && exit 1 wasn't enough, because a non-root filesystem that is already mounted from the initrd actually has RequiredBy=initrd-root-fs.target instead.
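
To make the difference concrete, here is a minimal sketch of the RequiredBy test applied to sample property values. The helper name required_by is hypothetical, and the second sample value is an assumption about what a mount set up from /etc/fstab at runtime might report; only the first value is taken from the comment above.

```shell
# Hypothetical helper: read `systemctl show --property=RequiredBy` output on
# stdin and succeed only if the named target appears in the RequiredBy= list.
required_by() {
    target=$1
    while IFS= read -r line; do
        case "$line" in
            RequiredBy=*)
                # Strip the key, then compare each space-separated unit name.
                for unit in ${line#RequiredBy=}; do
                    [ "$unit" = "$target" ] && return 0
                done
                ;;
        esac
    done
    return 1
}

# Value reported for a /usr that was already mounted from the initrd:
if echo 'RequiredBy=initrd-root-fs.target' | required_by remote-fs.target; then
    echo "matched: network would be left up"
else
    echo "not matched: network would be brought down"
fi

# Assumed value for a mount that remote-fs.target pulls in:
if echo 'RequiredBy=remote-fs.target' | required_by remote-fs.target; then
    echo "matched: network would be left up"
fi
```

On a live system the input would come from systemctl show --property=RequiredBy -- usr.mount rather than from echo.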

Comment 3 Lukáš Nykrýn 2016-08-29 07:26:56 UTC
Ack for fixing this. By the way, the workaround here is to disable the network service completely and use NetworkManager (NM) instead.

Comment 4 Lukáš Nykrýn 2016-10-25 14:36:30 UTC
diff --git a/rc.d/init.d/network b/rc.d/init.d/network
index 541a400..7d765fd 100755
--- a/rc.d/init.d/network
+++ b/rc.d/init.d/network
@@ -165,7 +165,7 @@ case "$1" in
         rootfs=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/" && $3 != "rootfs") { print $3; }}' /proc/mounts)
         rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)
        
-       if [[ "$rootfs" == nfs* || "$rootopts" =~ _r?netdev ]] || systemctl show --property=RequiredBy -- -.mount | grep -q 'remote-fs.target' ; then
+       if [[ "$rootfs" == nfs* || "$rootopts" =~ _r?netdev ]] || systemctl show --property=RequiredBy -- -.mount | grep -q 'remote-fs.target' || systemctl show --property=RequiredBy -- usr.mount | grep -q 'remote-fs.target' ; then
                net_log $"rootfs is on network filesystem, leaving network up"
                exit 1
        fi

Comment 10 David Kaspar // Dee'Kej 2017-04-27 12:41:39 UTC
After a discussion between the systemd and dracut developers, it was decided to keep the functionality in initscripts; the issue itself will be fixed in dracut:
https://github.com/fedora-sysv/initscripts/pull/94 (BZ #1446171 has been cloned)


On the initscripts side, the patch for this is already present as a result of 9.70-sync (BZ #1392766):
https://github.com/fedora-sysv/initscripts/commit/0649ea591e8d95415847da3ceeb4c2ba49d01248#diff-29b1f2fa45d29ec13422684108bd12e6R165

Comment 16 errata-xmlrpc 2017-08-01 07:29:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2286