Bug 436341 - iscsid init script blows away the network initscript symlinks
Summary: iscsid init script blows away the network initscript symlinks
Keywords:
Status: CLOSED DUPLICATE of bug 437522
Alias: None
Product: Fedora
Classification: Fedora
Component: initscripts
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Mike Christie
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F9Blocker 246960
 
Reported: 2008-03-06 17:03 UTC by Daniel Berrangé
Modified: 2018-04-11 09:22 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-04-09 04:19:06 UTC
Type: ---
Embargoed:


Attachments
check for iscsi sessions when stopping network (831 bytes, patch)
2008-04-06 07:34 UTC, Mike Christie

Description Daniel Berrangé 2008-03-06 17:03:35 UTC
Description of problem:
In the start() function of the /etc/init.d/iscsid initscript, there is code
which disables the 'network' service in runlevels 06, and also blows away the
symlinks.

The iscsid initscript has no business doing this unconditionally. This disables
networking shutdown even if no iscsi targets are active. The network initscript
already has logic to determine if network filesystems are active & avoids
shutting down. It could also look for iSCSI block devices.

Version-Release number of selected component (if applicable):
iscsi-initiator-utils-6.2.0.868-0.3.fc9.x86_64

Also impacts F8

iscsi-initiator-utils-6.2.0.865-0.2.fc8

How reproducible:
Always

Steps to Reproduce:
1. 'service iscsid start'
2. Look for rc0.d and rc6.d symlinks for the 'network' service

Actual results:
'network' initscript is missing shutdown symlinks

Expected results:
iscsid doesn't touch network initscript settings at all

Additional info:
This is the bad code from iscsid

        echo -n $"Turning off network shutdown. "
        # we do not want iscsi or network to run during system shutdown
        # incase there are RAID or multipath devices using
        # iscsi disks
        chkconfig --level 06 network off
        rm /etc/rc0.d/*network
        rm /etc/rc6.d/*network

Notice how it will blow away *any* initscript whose name matches
'<something>network'
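
To illustrate the glob problem, here is a minimal sketch that runs against a scratch directory rather than the real /etc/rc0.d. The function name list_glob_matches and the file names K85examplenetwork and K75netfs are hypothetical, chosen only for the demonstration.

```shell
# Hypothetical demo: list the names that the glob '*network' matches
# in a given directory. Nothing here touches the real rc directories.
list_glob_matches() {
    dir=$1
    for f in "$dir"/*network; do
        # When nothing matches, the shell leaves the pattern as-is; skip it.
        [ -e "$f" ] || continue
        basename "$f"
    done
}

# Build a scratch directory with one real 'network' entry and two others.
demo_dir=$(mktemp -d)
touch "$demo_dir/K90network" "$demo_dir/K85examplenetwork" "$demo_dir/K75netfs"
list_glob_matches "$demo_dir"
rm -rf "$demo_dir"
```

Both K90network and K85examplenetwork are listed, while K75netfs is not, so an 'rm' with the same pattern would remove more than the 'network' symlink.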

Comment 1 Mike Christie 2008-03-07 15:19:21 UTC
(In reply to comment #0)
> Description of problem:
> In the start() function of the /etc/init.d/iscsid initscript, there is code
> which disables the 'network' service in runlevels 06, and also blows away the
> symlinks.
> 
> The iscsid initscript has no business doing this unconditionally. This disables
> networking shutdown even if no iscsi targets are active. The network initscript


I agree this is a nasty hack.


> already has logic to determine if network filesystems are active & avoids
> shutting down. It could also look for iSCSI block devices.


I am not sure what you want the netfs script to do. Do you want it to shut down
iscsi disks and users of iscsi disks like MD and DM (and third party users like
EMC and Veritas) when it is run? The iscsi disk and MD/DM part is easier (but
still a pain :)). The 3rd party part is harder, but I agree it should be done,
and Fedora is a good place to get vendors and us started on this.

What about root though? For that we need the network up for the entire shutdown
process, unless we find a way to load everything in memory. For iscsi root, do
you think anaconda (the program that sets up iscsi root) should configure the
network scripts to not shut down (I think it used to not do that, but I will check)?

Comment 2 Daniel Berrangé 2008-03-08 14:48:26 UTC
I wasn't referring to the 'netfs'  script - I'm talking about the 'network'
script. The 'network' script's 'stop()'  function looks to see if any network
filesystems are still active, and if so, doesn't stop networking. This has the
same effect as you seem to be trying to achieve by removing the shutdown/reboot
symlinks.  The 'network' script's  stop() function could also look to see if any
iSCSI targets are currently active on the host.
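
As a rough sketch of what such a check could look like (this is not the code from the network script or from any attached patch), the function below treats any session* entry under a sysfs directory as an active session. The path /sys/class/iscsi_session is an assumption about where the kernel's iSCSI transport class exposes sessions, and the names iscsi_sessions_active and ISCSI_SESSION_DIR are hypothetical.

```shell
# Assumption: active iSCSI sessions appear as session* entries here.
# The variable can be overridden, which also makes the check testable.
ISCSI_SESSION_DIR=${ISCSI_SESSION_DIR:-/sys/class/iscsi_session}

# Hypothetical helper: succeed (return 0) if at least one session exists.
iscsi_sessions_active() {
    for s in "$ISCSI_SESSION_DIR"/session*; do
        # An unmatched glob stays literal and fails the -e test.
        [ -e "$s" ] && return 0
    done
    return 1
}

# Inside a stop() function the check might be used like this:
#   if iscsi_sessions_active; then
#       echo "iSCSI sessions still active; leaving the network up."
#       exit 1
#   fi
```

Keeping the check in the 'network' script means the decision is made at shutdown time, based on what is actually in use, rather than at iscsid start time.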



Comment 3 Mike Christie 2008-03-10 17:31:50 UTC
(In reply to comment #2)
> I wasn't referring to the 'netfs'  script - I'm talking about the 'network'
> script. The 'network' script's 'stop()'  function looks to see if any network
> filesystems are still active, and if so, doesn't stop networking. This has the
> same effect as you seem to be trying to achieve by removing the shutdown/reboot
> symlinks.  The 'network' script's  stop() function could also look to see if any
> iSCSI targets are currently active on the host.
> 

Ah thanks. I did not see that before in the net script.


Comment 4 Jesse Keating 2008-04-03 20:03:30 UTC
Mike, can you fix this please?  Final freeze is just a few days away...

Comment 5 Mike Christie 2008-04-04 19:02:31 UTC
I am using 437522 for the iscsi fixup.

I will use this BZ to send the initscript guys a patch to the network script.
Switching component to initscripts.

Comment 6 Mike Christie 2008-04-06 07:34:18 UTC
Created attachment 301415 [details]
check for iscsi sessions when stopping network

Hey initscript guys,

The attached patch checks whether an iscsi session is running before stopping the
network service. If one is running, we exit out. This patch will allow us to get
rid of the really crazy hacks in the iscsi script.

For FC10, instead of just failing the "/etc/init.d/network stop" call, we can
modify the netfs script to shut down the users of the iscsi disk (in FC10 we
will probably have to deal with FCoE too).

Comment 7 Mike Christie 2008-04-07 22:10:38 UTC
Adding Bill Nottingham since this is initscripts stuff and because it is related to
437522 which he made for the iscsi side.

Bill, for 437522 I removed all the chkconfig and network symlink hackery from
the iscsi init script.

For this bz, I added a patch in comment #6 to the network script which will
check for running sessions. If an iscsi session is running, then we do not shut
down the network. The iscsi device shutdown and network device shutdown will then
work the way they do for iscsi root boot, where the kernel calls the driver model
shutdown callouts.

Optimally, we probably want to check if an iscsi session is running, and if it is
not being used for root then we want to call into the netfs script (or something
like it that handles the block level users) which would then stop the users of
the iscsi disk. In the new script or modified netfs one, we would call the lvm,
dm, md and 3rd party multipath scripts to shut down the block layers if they are
using iscsi.

It seemed too late to do all this, so I am just keeping the old behavior where
the network is left on and the iscsi and block layers are left running, like how
we handle iscsi root. I will make bugzillas for FC10 for each subsystem that
could use a block net device to add initscript stop calls that shut down the device.

Comment 8 Bill Nottingham 2008-04-07 22:57:22 UTC
(In reply to comment #7)
> I will make bugzillas for FC10 for each subsystem that
> could use a block net device to add initscript stop calls that shutdown the
> device.

That seems very very wrong, especially once you start stacking things.

Under what cases is this patch actually needed? I'm not seeing what it will fix
- at the time network is shut down, the only filesystem that should be accessed
is the root filesystem, which is already handled.


Comment 9 Mike Christie 2008-04-08 22:16:56 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > I will make bugzillas for FC10 for each subsystem that
> > could use a block net device to add initscript stop calls that shutdown the
> > device.
> 
> That seems very very wrong, especially once you start stacking things.
> 
> Under what cases is this patch actually needed? I'm not seeing what it will fix
> - at the time network is shut down, the only filesystems that should be accessed
> is the root filesystem, which is already handled.
> 

It was originally needed because in RHEL4/FC5 some programs that are called late
in the shutdown sequence would try to access iscsi disks or disks that used
iscsi disks like with software raid or multipath. For example when
/etc/init.d/halt did the remount read-only, mount would probe disks for mount
labels. And when it got to a dm-multipath or a raid one that was using iscsi we
would hang or go into recovery (kick out raid disks and cause problems like
unnecessary resyncs on the next reboot), because the iscsi disk that it was built
on was not there or the iscsi disk could not access the network.

So for RHEL4 we did a script which would tear down multipath devices using iscsi
devices. Then we had to handle md and dm raid, then we had to handle 3rd party
volume management and multipath devices, so we eventually did the hack we have now.

In FC6 or maybe FC7 we fixed things like the mount label search issue, so I
think as far as the scripts and programs we control we might be ok.

I have just been worried about the case where a program was accessing an iscsi
disk or a disk using an iscsi disk, and during shutdown the program was still
shutting down when /etc/init.d/halt sent the TERM signal to everything. In that
case, without the hacks, the iscsi disk would be shut down and the network would
be off, so data would not get written properly. Is that too paranoid? I do not
have a bug report for something like this, so do you think it would be ok to just
shut down iscsi and the network and remove the hacks and see if we do hit
something like this? If we do, we can handle it then.

Also, do you think it is possible that an app could be using a disk (no fs), there
could be data in cache that needs to be written, and the app gets shut down ok,
but the data is not written to disk when the "/etc/init.d/iscsi or network stop"
call is made? For example, if the app did not do a sync to make sure data is on
disk before shutting down, and then the init.d scripts ran and shut down the
network and iscsi, then the data might not get written to disk. Maybe this is too
paranoid too? What do you think?

Comment 10 Bill Nottingham 2008-04-09 02:54:33 UTC
(In reply to comment #9)
> It was originally needed because in RHEL4/FC5 some programs that are called late
> in the shutdown sequence would try to access iscsi disks or disks that used
> iscsi disks like with software raid or multipath. For example when
> /etc/init.d/halt did the remount read-only, mount would probe disks for mount
> labels. And when it got to a dm-multipath or a raid one that was using iscsi we
> would hang or go into recovery (kick out raid disks and cause problems like
> unnecessary resyncs on the next reboot), because the iscsi disk that it was built
> on was not there or the iscsi disk could not access the network.

As long as the filesystems are properly marked with _netdev, the right thing
should happen.
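
For reference, a small sketch of how one might list the mount points that carry the _netdev option. It assumes the usual whitespace-separated fstab layout with mount options in the fourth field; the helper name netdev_mounts is hypothetical and is not part of initscripts.

```shell
# Hypothetical helper: print the mount point of every fstab entry whose
# option list (4th field) contains _netdev. Defaults to /etc/fstab.
netdev_mounts() {
    fstab=${1:-/etc/fstab}
    awk '$1 !~ /^#/ && NF >= 4 {
        # Split the comma-separated options and look for an exact match.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "_netdev") { print $2; break }
    }' "$fstab"
}
```

Running 'netdev_mounts' with no argument reads /etc/fstab; passing another file path is handy for trying it out on a copy.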

> I have just been worried about if a program was accessing a iscsi disk or a disk
> using a iscsi disk, and during shutdown the program was shutting down when
> /etc/init.d/halt sent the term signal to everything. In that case, without
> the hacks, the iscsi disk would be shutdown and the network would be off so data
> would not get written properly. Is that too paranoid? I do not have bug report
> for something like this, so do you think it would be ok to just shutdown iscsi
> and the network and remove the hacks and see if we do hit something like this?

I think this should be OK, yes.

> Also do you think it is possible that a app could be using a disk (no fs), there
> could be data in cache that needs to be written, and the app gets shutdown ok,
> but the data is not written to disk when the "/etc/init.d/iscsi or network stop"
> call is made. For example if the app did not do a sync to make sure data is on
> disk before shutting down, then the init.d scripts ran and shutdown the network
> and iscsi then the data could not get written to disk. Maybe this is too paranoid
> too? What do you think?

Using... raw devices? Just opening /dev/sda?  I'm not sure this is a case to
optimize for.

Comment 11 Mike Christie 2008-04-09 04:18:04 UTC
(In reply to comment #10)
> As long as the filesystems are properly marked with _netdev, the right thing
> should happen.

It didn't. Previously mount would scan every block device to find labels, so
that is how we would run into problems.

In FC9 it looks like we do not scan anymore so we are ok.

> 
> Using... raw devices? Just opening /dev/sda?  I'm not sure this is a case to
> optimize for.

Yeah, just opening /dev/sda.

I am not trying to optimize. I am just trying to not get bug reports :)

I will remove the net hacks and see how it goes.

Thanks. I am going to close this bug then. I have 437522 for the iscsi init
script changes.

Comment 12 Mike Christie 2008-04-09 04:19:06 UTC

*** This bug has been marked as a duplicate of 437522 ***

