Bug 456228

Summary: [NetApp 5.3 bug] RHEL 5.2 iSCSI SANBoot intermittent freeze during startup
Product: Red Hat Enterprise Linux 5
Reporter: Ritesh Raj Sarraf <rsarraf>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: high
Priority: high
Version: 5.3
CC: agk, andriusb, bmarzins, bmr, christophe.varoqui, coughlan, cward, dwysocha, edamato, egoggin, heinzm, junichi.nomura, kueda, lmb, mbroz, mchristi, mgahagan, nandkumar.mane, prockai, tranlan, xdl-redhat-bugzilla
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-01-20 22:08:31 UTC
Bug Blocks: 373081    
Attachments:
* sanboot freeze sysrq (flags: none)
* log of multipathd and iscsid with full verbosity (flags: none)
* SANBoot NON-Freeze Log (flags: none)
* SAN Boot Freeze Log (flags: none)
* multipath-stop.patch (flags: none)

Description Ritesh Raj Sarraf 2008-07-22 12:19:46 UTC
Description of problem:

On a freshly installed RHEL 5.2 SANBoot host, we are seeing intermittent hangs
during boot-up.
Here's the flow:
* iSCSI, network and multipath get enabled from the initrd.
* LUNs are fetched.
* Root LUN is mounted and boot proceeds in initrd.
* Boot exits from initrd and proceeds with real init.
* Runlevel 5 is entered and a bunch of scripts get executed
* iscsid service gets started.


At some point during this sequence, intermittently, the host hangs.

The first suspicion was the firewall. But the issue is also seen with the
firewall disabled.
The second suspicion was bad hardware. But the issue is seen on a VM as well.

From the SysRq data that was gathered, we noticed that the OS was hung in the
SCSI Error Handler.
We're in a hung state because we're using multipath with "queue_if_no_path". So
the hang is not the real problem IMO.

The SCSI Error Handler messages do indicate that it could be a problem with
either iscsid or the network. From what I've investigated, it doesn't look to be
a network problem.

I also tried starting the iscsid daemon from the initrd, but that did not help
either.

Any thoughts?



Version-Release number of selected component (if applicable):
* RHEL 5.2
* SANBooted Host with NetApp Storage Controller

How reproducible:
Very Reproducible.

Steps to Reproduce:
1. Install RHEL 5.2 with SANBoot iSCSI (with Multipath)
2. Reboot the Host
3. Not consistent but on a couple of reboots, you should hit the freeze.
  
Actual results:
RHEL 5.2 iSCSI SANBoot Host freezes during start-up

Expected results:
RHEL 5.2 iSCSI SANBoot Host should boot without problems.

Comment 1 Ritesh Raj Sarraf 2008-07-22 12:21:45 UTC
Created attachment 312338 [details]
sanboot freeze sysrq

Comment 2 Mike Christie 2008-07-22 18:47:08 UTC
What SCSI error handler messages are you seeing? I did not see them in
the log you attached.

Are you using dm-multipath with one session/path or are there multiple ones?

I saw this in the log:
ping timeout of 5 secs expired, last rx 4295152736, last ping 4295152736, now
4295208168

If you were using dm-multipath and only had one session at the time, then the
iscsi layer would fail commands with DID_BUS_BUSY. Since dm-mpath is being used
the command would get fast failed. If dm-mpath only had one path and if
multipathd was not yet running, then with queue_if_no_path dm-mpath would queue
IO and multipathd would never detect if the single path came back.


Even if we did not hit an iSCSI ping timeout like the one above, we would run
into trouble if we were using dm-multipath with multiple paths and multipathd
was not running when iscsid is started. When we get here

Turning off network shutdown. Starting iSCSI daemon: [  OK  ]

the initiator would fail any running IO on all paths with DID_BUS_BUSY while it
syncs itself up with userspace. So if dm-mpath is used the command would get
fast failed, and all paths would be failed and dm would sit there queueing if
queue_if_no_path was set.
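
To see whether a given map would queue rather than fail IO in this situation, the feature list in its device-mapper table can be inspected. The following is only a minimal sketch: has_queue_if_no_path is a hypothetical helper, and the two sample lines are illustrative strings shaped like `dmsetup table` output for multipath maps, not data from this bug.

```shell
#!/bin/bash
# has_queue_if_no_path LINE: succeed if a dm-multipath table line lists the
# queue_if_no_path feature as a separate word. (Hypothetical helper.)
has_queue_if_no_path() {
    case " $1 " in
        *" queue_if_no_path "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Illustrative lines only; on a live host they could come from: dmsetup table
sample='mpath0: 0 20971520 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:16 1000'
no_queue='mpath1: 0 20971520 multipath 0 0 1 1 round-robin 0 1 1 8:32 1000'

for line in "$sample" "$no_queue"; do
    name=${line%%:*}
    if has_queue_if_no_path "$line"; then
        echo "$name: queues IO when all paths are down"
    else
        echo "$name: fails IO when all paths are down"
    fi
done
```

A map reported as queueing will block IO for as long as every path is failed, so something (normally multipathd) has to be running to reinstate paths when they come back.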

Comment 3 Ritesh Raj Sarraf 2008-07-22 20:44:37 UTC
(In reply to comment #2)
> Are you using dm-multipath with one session/path or are there multiple ones?
>
Yes. For 5.2, in initrd, we are only left with a single session when logging
in. We have a patch scheduled for 5.3 to allow multiple session login when in
initrd.

> I saw this in the log:
> ping timeout of 5 secs expired, last rx 4295152736, last ping 4295152736, now
> 4295208168
>
> If you were using dm-multipath and only had one session at the time, then the
> iscsi layer would fail commands with DID_BUS_BUSY. Since dm-mpath is being used
> the command would get fast failed. If dm-mpath only had one path and if
> multipathd was not yet running, then with queue_if_no_path dm-mpath would queue
> IO and multipathd would never detect if the single path came back.
>

Okay! I'll retry with multipathd running. multipathd in 5.2 was changed to store
all prio_callouts in RAM, so I think we won't have a dependency on the root LUN.

Comment 4 Ritesh Raj Sarraf 2008-07-24 16:22:18 UTC
The following is what I tried.

* Enabled multipathd to start before iscsid.

With this, I believe, even when iscsid starts and we lose connectivity,
multipathd with the queue_if_no_path feature will take care of keeping the OS
alive. And since multipathd has the ramfs fix, which makes all prio apps and
scsi_id available through RAM, multipathd will be able to track the status of
the root LUN and its path even when the iscsid connection gets
terminated/re-established.

This looked good. We didn't see the hang/freeze from the earlier case.

The new problem looks different though. The OS still gets hung. I'd enabled
iscsid and multipathd with full verbosity and was able to capture the logs.
At the time of the hang, as per multipathd logs, multipathd reports that all the
paths are up and available. There are also no iscsi errors. But still the OS was
hung.

With no pointers left to follow, I triggered a saK from SysRq. saK killed
the multipathd process, and then the OS boot proceeded, only to freeze again.

From the SysRq document about saK:

sa'K' (Secure Access Key) is useful when you want to be sure there is no
trojan program running at console which could grab your password
when you would try to login. It will kill all programs on given console,
thus letting you make sure that the login prompt you see is actually
the one from init, not some trojan program.
IMPORTANT: In its true form it is not a true SAK like the one in a :IMPORTANT
IMPORTANT: c2 compliant system, and it should not be mistaken as   :IMPORTANT
IMPORTANT: such.                                                   :IMPORTANT
       It seems others find it useful as (System Attention Key) which is
useful when you want to exit a program that will not let you switch consoles.
(For example, X or a svgalib program.)


My interpretation of the text ("It will kill all programs on given console,") is
that any process that has not detached from the console (daemonized) is a good
candidate to be saKed.

But that doesn't seem right either, because multipathd reported multiple paths
to the LUN. That would only be possible if the multipathd daemon had properly
daemonized itself and init had proceeded to the next script, i.e. iscsid.

So multipathd definitely isn't holding up the boot process.

Any ideas what might be going on there now?
Any ideas why the kernel chose to saK the multipathd process?

Comment 5 Ritesh Raj Sarraf 2008-07-24 16:30:18 UTC
Created attachment 312578 [details]
log of multipathd and iscsid with full verbosity

Comment 6 Mike Christie 2008-07-25 02:38:58 UTC
Do the log runs start out with one path, and then paths get added when iscsid is
started? There are multiple runs, right? It looks like some things are missing.

Do you know if other services are started after iscsid? It looked like it in
the logs, but I could not tell for sure.


Comment 7 Ritesh Raj Sarraf 2008-07-25 09:12:58 UTC
Let me try again.

I've enabled multipathd to start before iscsid. So multipathd is now
S06multipathd in rc5.d, which is just before S07iscsid.

I'm also attaching two separate trimmed logs which should give you a better idea
of what the problem is now.

The OS boots with 1 single path when started. The 1 path login takes place in
the initrd. Then real init is executed. Then S06multipathd gets executed. Then
S07iscsid gets executed. Then later iscsi service gets executed which creates
the remaining number of sessions.
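
The ordering described above (S06multipathd running before S07iscsid) can be checked from the rc directory link names. This is a rough sketch only; starts_before is a hypothetical helper, and the link names passed to it are the ones mentioned in this comment.

```shell
#!/bin/bash
# starts_before FIRST SECOND LINK...: succeed if the S-link for FIRST has a
# lower start priority than the S-link for SECOND among the given link names.
# (Hypothetical helper, not part of any package.)
starts_before() {
    local first=$1 second=$2 a="" b="" link
    shift 2
    for link in "$@"; do
        case "$link" in
            S[0-9][0-9]"$first")  a=${link:1:2} ;;
            S[0-9][0-9]"$second") b=${link:1:2} ;;
        esac
    done
    # 10# forces base 10 so that priorities such as 08 are not read as octal.
    [ -n "$a" ] && [ -n "$b" ] && [ "$((10#$a))" -lt "$((10#$b))" ]
}

# Link names as described in this comment; on a live host they could come
# from: ls /etc/rc5.d
if starts_before multipathd iscsid S06multipathd S07iscsid; then
    echo "multipathd starts before iscsid"
else
    echo "multipathd does NOT start before iscsid"
fi
```

The comparison is strict on purpose: with equal priorities the links sort by name, which would put iscsid ahead of multipathd.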

To test, the OS is running in a reboot loop.

In the logs where we see the freeze, there are fewer iscsid-related messages
than in the non-freeze logs. But interestingly, multipathd on the frozen host
had been reporting all 4 paths as up.

Comment 8 Ritesh Raj Sarraf 2008-07-25 09:14:20 UTC
Created attachment 312635 [details]
SANBoot NON-Freeze Log

Comment 9 Ritesh Raj Sarraf 2008-07-25 09:15:08 UTC
Created attachment 312636 [details]
SAN Boot Freeze Log

Comment 11 Ritesh Raj Sarraf 2008-08-11 17:35:51 UTC
Mike,

We need to have multipathd service start before iscsid.

With that in place, if the user is SAN Booting with "queue_if_no_path", when iscsid starts, the OS is able to survive.
In our labs this has been under test for more than 10 days without trouble.


======
The following changes are needed to ensure proper configuration of SAN-booted
iSCSI.

* Do the normal iSCSI SAN Boot installation as recommended by HU
* On boot, switch to Single User Mode.
* Create the /etc/multipath.conf file as recommended by HU.
* Make sure the multipathd service is started before the iscsid service. This requires
modifying the multipathd script and changing the start priority from SXX to S06.
This assumes that the default iscsi installation sets the iscsid start priority to S07.
If not, you need to make sure that multipathd's start priority is lower than iscsid's
start priority.
The line should look something like this:
# chkconfig: - 06 87
* Now run chkconfig multipathd off followed by chkconfig multipathd on
* You also need to ensure that multipathd doesn't get killed during shutdown.
You can either disable multipathd in rc6 and rc0 or make some modifications to the multipathd
init script.
The stop section should look like this:
stop() {
        rootopts=$(awk '{ if ($1 !~ /^[ \t]*#/ && $2 == "/") { print $1; }}' /etc/mtab)
        if [[ "$rootopts" =~ "mapper" ]] ; then
                echo $"Can not shutdown multipathd. Root is on a multipathed disk."
                exit 1
        fi
        echo -n $"Stopping $prog daemon: "
        killproc $DAEMON
        RETVAL=$?
        [ $RETVAL -eq 0 ] && rm -f $lockdir/$prog
        echo
}


=========


Mike, can we have this too? During shutdown, we shouldn't stop multipathd if root dev is on a multipathed device.
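
The root-device test used in the stop() function above can be tried in isolation before touching the init script. The sketch below is an assumption-laden illustration: root_on_mapper is a hypothetical helper, the mtab contents are made up, and it reads from a file argument so that it can be run against a copy instead of the live /etc/mtab.

```shell
#!/bin/bash
# root_on_mapper [FILE]: succeed if the device mounted on "/" in FILE
# (mtab format, default /etc/mtab) has "mapper" in its name.
# (Hypothetical helper mirroring the check in the stop() function.)
root_on_mapper() {
    local mtab=${1:-/etc/mtab} rootdev
    rootdev=$(awk '$1 !~ /^[ \t]*#/ && $2 == "/" { print $1 }' "$mtab")
    case "$rootdev" in
        *mapper*) return 0 ;;
        *) return 1 ;;
    esac
}

# Made-up mtab contents for illustration only.
tmp=$(mktemp)
printf '%s\n' '/dev/mapper/mpath0p2 / ext3 rw 0 0' 'proc /proc proc rw 0 0' > "$tmp"

if root_on_mapper "$tmp"; then
    echo "root is on a device-mapper device; leave multipathd running"
else
    echo "root is not on a device-mapper device; multipathd may be stopped"
fi
rm -f "$tmp"
```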

Comment 13 Ritesh Raj Sarraf 2008-08-25 15:03:51 UTC
Mike,

Here's a patch for your ease.

It takes care of the start priority and the stop check.

When stopping, it checks whether the root dev is a multipathed device (either LVM on top of multipath or plain multipath), and if so, it doesn't stop the daemon.
This is similar to how mount behaves: if the dev is busy, mount errors out with a "dev busy" message.

I don't see this patch making any drastic changes in behavior, so I recommend it for RHEL5 U3.
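
For anyone verifying the start-priority half of the change, the priority can be read back from the chkconfig header of an init script. This is a small sketch under assumptions: chkconfig_start_prio is a hypothetical helper, and the header written to the temporary file is only an example using the "06 87" values from comment 11.

```shell
#!/bin/bash
# chkconfig_start_prio FILE: print the start priority from the
# "# chkconfig: <runlevels> <start> <stop>" header of an init script.
# (Hypothetical helper.)
chkconfig_start_prio() {
    awk '/^#[ \t]*chkconfig:/ { print $(NF-1); exit }' "$1"
}

# Example header for illustration; a real check would point at the
# installed multipathd init script instead.
tmp=$(mktemp)
printf '%s\n' '#!/bin/bash' '# chkconfig: - 06 87' '# description: example header' > "$tmp"
echo "start priority: $(chkconfig_start_prio "$tmp")"
rm -f "$tmp"
```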

Comment 14 Ritesh Raj Sarraf 2008-08-25 15:06:21 UTC
Created attachment 314924 [details]
multipath-stop.patch

apply with `patch -p0`

Comment 15 Mike Christie 2008-08-25 16:07:42 UTC
Sorry about that guys. I thought I posted on this.

It seems fine to me, but the multipath init script is Ben's code, so I am transferring the bugzilla to device-mapper-multipath.

Comment 16 Ben Marzinski 2008-08-28 19:23:33 UTC
Patch applied. Thanks.

Comment 17 Mike Christie 2008-09-12 18:13:53 UTC
(In reply to comment #11)
> With that in place, if the user is SAN Booting with "queue_if_no_path", when
> iscsid starts, the OS is able to survive.
> In our labs this has been under test for more than 10 days without trouble.
> 

Hey Netapp guys,

How are you setting queue_if_no_path or no_path_retry for this type of setup? Is it the default for your hardware or did you set it manually? If the second one, how do you set it manually for boot? Did you guys have another bugzilla so that the initramfs/installer tools detect if multipath is being used for root and, if so, will not set up multipath to fail right away if no paths are available? I thought you did but I cannot find it and I think some users are hitting it.

Comment 18 Ritesh Raj Sarraf 2008-09-12 20:48:30 UTC
(In reply to comment #17)
> 
> Hey Netapp guys,
> 
> How are you setting queue_if_no_path or no_path_retry for this type of setup?
> Is it the default for your hardware or did you set it manually? If the second
> one, how do you set it manually for boot? Did you guys have another bugzilla so
> that the initramfs/installer tools detect if multipath is being used for root
> and, if so, will not set up multipath to fail right away if no paths are
> available? I thought you did but I cannot find it and I think some users are
> hitting it.

We rely on queue_if_no_path.
We set it manually right after the installation, before the first normal boot. The default installation doesn't provide any conf file, so we boot into Single User Mode and create one. And then with multipathd starting before iscsid, we're able to survive path failures.

For the initramfs/installer, there's no auto-detection done. In the current state, iscsi starts with a single session from initramfs and reaches the real init. Then, when the iscsi/iscsid service is started, it logs in to all the nodes. We have a patch pending for 5.3 which should enable login to all the discovered nodes from initrd itself.
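
A quick way to tell the single-session initramfs state apart from the state after the iscsi service has logged in everywhere is to count the session lines that `iscsiadm -m session` prints. The sketch below is illustrative only: count_sessions is a hypothetical helper, the sample lines are invented (documentation addresses and an example IQN), and the exact line format may differ between open-iscsi versions, which is why it only counts non-blank lines.

```shell
#!/bin/bash
# count_sessions: count non-blank lines on stdin, one line per session.
# (Hypothetical helper; it does not parse the line contents.)
count_sessions() {
    grep -c '[^[:space:]]'
}

# Invented sample lines; on a live host the input could come from:
#   iscsiadm -m session
sample_sessions='tcp: [1] 192.0.2.10:3260,1000 iqn.1992-08.com.example:target0
tcp: [2] 192.0.2.11:3260,1001 iqn.1992-08.com.example:target0'

n=$(printf '%s\n' "$sample_sessions" | count_sessions)
if [ "$n" -le 1 ]; then
    echo "$n session: still in the single-session state from initramfs"
else
    echo "$n sessions: additional logins have taken place"
fi
```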

Comment 20 nandkumar mane 2008-11-06 06:30:15 UTC
The freeze issue is fixed in RHEL 5.3 Beta.

Thank you for including the patch (comment #14).

Comment 22 errata-xmlrpc 2009-01-20 22:08:31 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0232.html