Bug 488853

Summary: Insufficient iSCSI root detection in iscsid and network init scripts
Product: [Fedora] Fedora
Reporter: Radek Hladik <rhladik>
Component: iscsi-initiator-utils
Assignee: Mike Christie <mchristi>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 10
CC: agrover, hdegoede, mchristi
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-12-01 09:36:42 UTC Type: ---

Description Radek Hladik 2009-03-05 22:47:45 UTC
Description of problem:
We use a local disk and an iSCSI-connected disk in an md RAID as the root filesystem. The detection in initscripts does not evaluate this as a network-attached root. As a result, the network interface is brought down on shutdown, and on startup iscsid is not started.

Version-Release number of selected component (if applicable):
iscsi-initiator-utils-6.2.0.870-1.0.fc10.x86_64
initscripts-8.86-1.x86_64


How reproducible:


Steps to Reproduce:
1. Configure system to use local and iSCSI disk in a mirror as root
i.e.:
md0 : active raid1 sda[0] sdc[1](W)
      31000000 blocks [2/2] [UU]
(sda is local SATA disk, sdc is remote iSCSI disk via eth0)
2. Create an initrd, making sure that it contains the iSCSI utils, configs, etc.
3. Make sure the array is synced.
4. Reboot.
  
Actual results:
eth0 is brought down on network shutdown, and the iSCSI drive (sdc) is removed from the array. On the next boot, the initial ramdisk starts the RAID with only the local drive (kicking the remote one as non-fresh). If the node is configured to start automatically (node.startup = automatic in the device file in /var/lib/iscsi/.../default), the iscsid daemon starts and the iscsi init script tries to log in to the session, failing with an "already exists" error.
If the setting is manual (and this is the only session), both the iscsi and iscsid scripts do nothing.
In every case the array is degraded and needs to be reconstructed. With the manual setting, the iSCSI session is also not managed by a daemon (iscsid is not running at all).
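
A rough sketch of how the degraded state can be spotted from the md status text shown in the steps above. The function name degraded_md_arrays and the optional file argument are illustrative only; the awk program assumes the usual /proc/mdstat layout, where the member map such as [UU] or [U_] is the last field of the "blocks" line.

```shell
# Print the name of every md array listed in an mdstat-format file
# ($1, default /proc/mdstat) whose member map contains "_",
# i.e. at least one member is missing.
degraded_md_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         / blocks / && $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }' "${1:-/proc/mdstat}"
}
```

Called without arguments it reads the live /proc/mdstat; an empty result means no array is reported as degraded.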

Expected results:
eth0 should not be brought down on shutdown, and the array should be stopped with both members in sync. On startup the array should be started with 2 of 2 drives and iscsid should start. The array should be in sync after reboot.

Additional info:
I've tried two nasty hacks just to see if these are the only problems and I've managed to get it working as expected.
/etc/rc.d/init.d/iscsid:

root_is_iscsi() {
    rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)
    [[ "$rootopts" =~ "_netdev" ]]
    #hack 
    /bin/true  
}

/etc/rc.d/init.d/network:

  stop)
        # Don't shut the network down if root is on NFS or a network
        # block device.
        rootfs=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/" && $3 != "rootfs") { print $3; }}' /proc/mounts)
        rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)

        if [[ "$rootfs" =~ ^nfs ]] || [[ "$rootopts" =~ "_netdev|_rnetdev" ]] ; then
                exit 1
        fi
        #hack
        echo 'Not shutting network'
        exit 1

(I present these just to show the idea, so they are not in patch format.)

Comment 1 Mike Christie 2009-05-15 20:44:49 UTC
Sorry for the late reply.

For the iscsid script modification we should have

root_is_iscsi() {
    rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $4; }}' /etc/mtab)
    [[ "$rootopts" =~ "_netdev" ]]
}

What is the difference between this and what you are doing? Is our code buggy and not returning true at the right time?


And then for the network script we have


        if [[ "$rootfs" =~ ^nfs ]] || [[ "$rootopts" =~ "_netdev|_rnetdev" ]] ; then
                exit 1
        fi


Does your root fs have the _netdev option set in the fstab?
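
To see what this kind of check evaluates to on a given system, a sketch along the following lines can be run against a copy of the mount table. The function name root_has_netdev and the optional file argument are illustrative and not part of the shipped scripts; it uses a plain substring match with case rather than =~, so it does not depend on how a particular bash version treats quoted regular expressions.

```shell
# Succeed if the "/" entry of an mtab-format file ($1, default /etc/mtab)
# lists _netdev among its mount options.
root_has_netdev() {
    rootopts=$(awk '$1 !~ /^#/ && $2 == "/" { print $4 }' "${1:-/etc/mtab}")
    case "$rootopts" in
        *_netdev*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

For example, `root_has_netdev && echo "root is marked _netdev"` checks the running system.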

Comment 2 Radek Hladik 2009-05-16 20:20:37 UTC
The difference is that I do not have the root fs marked as _netdev in fstab, and I am unsure if I should. I know it would solve this issue, but I do not know whether it would cause other problems.
The idea is that the root fs is on an md array, which is constructed from a local disk (SATA) and a remote disk (iSCSI). So in normal situations I would like to start the array using both disks, and that requires the network. But if there is any trouble, I would like to start it from either disk. In fact I would like to be able to start the array even from only the remote disk if needed.
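
For illustration only, here is a sketch of a detection that does not rely on fstab options: it walks the sysfs "slaves" links of a block device and looks for a member that sits behind an iSCSI session. This is not what the shipped scripts do. The function name dev_uses_iscsi and the SYSFS_ROOT variable are made up for this example, and it assumes that iSCSI disks resolve to a sysfs path containing a sessionN component and that /sys/class/block is available.

```shell
# Overridable so the function can be tried against a fake tree.
: "${SYSFS_ROOT:=/sys}"

# Succeed if block device $1 (e.g. md0), or any device it is built
# from, resolves to a sysfs path below an iSCSI session node.
dev_uses_iscsi() {
    local dev path slave
    dev=$1
    path=$(readlink -f "$SYSFS_ROOT/class/block/$dev") || return 1
    case "$path" in
        */session[0-9]*/*) return 0 ;;   # assumption: .../hostN/sessionM/...
    esac
    # Recurse into the members of stacked devices (md, dm).
    for slave in "$SYSFS_ROOT/class/block/$dev"/slaves/*; do
        [ -e "$slave" ] || continue
        dev_uses_iscsi "${slave##*/}" && return 0
    done
    return 1
}
```

With the array from the description, `dev_uses_iscsi md0` would be expected to succeed because sdc is one of its members, while a purely local array would not match.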

Comment 3 Bug Zapper 2009-11-18 11:17:43 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 4 Hans de Goede 2009-12-01 09:36:42 UTC
Sorry for the late response. This is not a bug in the iscsi initscripts: if your root partially depends on iscsi, it should have the _netdev option in /etc/fstab.

Also, the iscsi init scripts are certainly not responsible for keeping your network up. That is controlled through the regular initscripts (or NetworkManager) and the /etc/sysconfig/network-scripts/ifcfg-eth0 file.

Comment 5 Radek Hladik 2009-12-01 12:04:57 UTC
I understand that not bringing the network down is the job of the network scripts, and I know that both initscripts would work with the _netdev option. But I am wondering whether adding it to fstab would break anything else. The man page says:
 _netdev
The filesystem resides on a device that requires network access (used to prevent the system from attempting to mount these filesystems until the network has been enabled on the system).

But my root does not require network access in all situations. And I surely would not want to prevent it from mounting if there is a problem enabling network.

The reason why I am so persistent with this issue is that (at least as I understand it) the initrd's iscsistart command initiates the TCP connection to the iSCSI target, hands it over to the kernel and exits. Then, when iscsid starts, it "takes over" error handling and other management for the connection. So if the iscsi-initiator initscript decides that iscsid is not needed, the connection would remain in some sort of temporary state and could fail.
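
A minimal sketch of how that situation could be checked for after boot: sessions exist in the kernel, but no daemon is running to manage them. The function name sessions_without_daemon and its two optional arguments are hypothetical; it assumes that kernel iSCSI sessions appear as sessionN entries under /sys/class/iscsi_session and that pidof is available.

```shell
# Succeed if at least one iSCSI session exists under $1 (default /sys)
# while no process named $2 (default iscsid) is running.
sessions_without_daemon() {
    local sysfs daemon s
    sysfs=${1:-/sys}
    daemon=${2:-iscsid}
    for s in "$sysfs"/class/iscsi_session/session*; do
        [ -e "$s" ] || continue
        # A session exists; it is unmanaged only if the daemon is absent.
        if pidof "$daemon" >/dev/null 2>&1; then
            return 1
        fi
        return 0
    done
    return 1   # no sessions at all
}
```

For example, `sessions_without_daemon && echo "iSCSI session present but iscsid is not running"`.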

Comment 6 Hans de Goede 2009-12-01 12:26:13 UTC
(In reply to comment #5)
> But my root does not require network access in all situations. And I surely
> would not want to prevent it from mounting if there is a problem enabling
> network.
> 

Your root gets mounted by the initrd, not by the regular initscripts, and the initrd does not care about the _netdev flag.

> The reason why I am so persistent with this issue is that (at least as I
> understand it) initrd iscsistart command initiates the TCP connection to iSCSI
> target, hands it to the kernel and finishes. And then when iscsid starts it
> "takes over" the error handling and other management for the connection. So if
> the iscsi-initiator initscript decides that iscsid is not needed then the
> connection would remain in some sort of temporary state and could fail.

That is correct, so add the _netdev flag: your root is (partly) dependent on the network, so that is the correct thing to do.
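
As a sketch of that change, the following prints a copy of an fstab with _netdev appended to the options of the "/" entry, leaving all other lines as they are. The function name add_netdev_to_root is made up for this example; note that the rewritten root line comes out tab-separated. Review the output before replacing /etc/fstab with it.

```shell
# Print fstab-format file $1 with ",_netdev" appended to the mount
# options of the "/" entry, unless the option is already present.
add_netdev_to_root() {
    awk 'BEGIN { OFS = "\t" }
         $1 !~ /^#/ && $2 == "/" && $4 !~ /(^|,)_netdev(,|$)/ { $4 = $4 ",_netdev" }
         { print }' "$1"
}
```

For example, `add_netdev_to_root /etc/fstab > /tmp/fstab.new` writes the modified copy to a scratch file for inspection.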