Bug 435165

Summary: iscsi install fails to boot post-install
Product: Red Hat Enterprise Linux 5
Component: mkinitrd
Version: 5.2
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: low
Target Milestone: beta
Keywords: Regression
Reporter: James Laska <jlaska>
Assignee: Peter Jones <pjones>
CC: atodorov, jturner, mzazrivec, pjones
Fixed In Version: RHBA-2008-0437
Doc Type: Bug Fix
Last Closed: 2008-05-21 15:26:35 UTC
Attachments:
Description              Flags
boot.log                 none
/init (ia64)             none
/init (i386)             none
boot.log (i386)          none
r5u1-iscsi-files.tgz     none
r5u2-iscsi-files.tgz     none

Description James Laska 2008-02-27 18:30:30 UTC
# TREE RHEL5.2-Server-20080225.2/ia64

During an ia64 iSCSI installation I see the following stderr messages near the
end of the install.  This installation was done over a serial console (as are
most ia64 installs).

I suspect we are not capturing iscsiadm output when attempting to process an
iBFT firmware read.

                                                                                
                 +-------------+ Post Install +--------------+                  
                 |                                           |                  
                 | Performing post install configuration...iscsiadm: Could not
read fw values.
                 |                                           |                 
              iscsiadm: Could not read fw
values.+-------------------------------------------+                  
                                                                                

The post-install system then panics during bootup.  Still testing, but I suspect
something was not configured properly with regard to iSCSI.

Kickstart used -
http://hank.test.redhat.com/autotest/testcases/rel-eng_RHEL5.2-Server-20080225.2_5-ia64/iscsi-distill-yes-english-nfs-TUI-auto-ibm-ds300.test.redhat.com/ks.cfg

anaconda.log -
http://hank.test.redhat.com/autotest/testcases/rel-eng_RHEL5.2-Server-20080225.2_5-ia64/iscsi-distill-yes-english-nfs-TUI-auto-ibm-ds300.test.redhat.com/anamon/anaconda.log

Additional information:

Not sure if this is related, but on a similar i386 installation, I'm seeing a
nash SIGSEGV during post-install boot up.  The addresses given in the backtrace
are:

0x804fa2e - /usr/src/debug////////mkinitrd-5.1.19.6/nash/nash.c:2630
0x903420 - ??:0
0x80516a5 - /usr/src/debug////////mkinitrd-5.1.19.6/nash/network.c:468
0x805180a - /usr/src/debug////////mkinitrd-5.1.19.6/nash/network.c:501
0x804f189 - /usr/src/debug////////mkinitrd-5.1.19.6/nash/nash.c:2348
0x804f8fc - /usr/src/debug////////mkinitrd-5.1.19.6/nash/nash.c:2594
0x804fde1 - /usr/src/debug////////mkinitrd-5.1.19.6/nash/nash.c:2708
0x8180808 - ??:0
0x8048131 - ??:0
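
The report does not say how the addresses above were mapped to source lines.
One common way to get output in this "address - file:line" shape is addr2line
from binutils, run against a binary that has debug information.  A minimal
sketch, assuming addr2line is installed; the helper name resolve_addrs and the
debuginfo path in the usage line are assumptions, not taken from this report:

```shell
# Hypothetical helper: print "ADDRESS - FILE:LINE" for each address given.
# $1 is the binary (or separate debuginfo file); the rest are hex addresses.
resolve_addrs() {
    binary=$1
    shift
    for addr in "$@"; do
        # addr2line prints FILE:LINE, or "??" placeholders when it cannot
        # map the address (for example, addresses outside the binary).
        printf '%s - %s\n' "$addr" "$(addr2line -e "$binary" "$addr")"
    done
}
```

Usage might look like this (path assumed, from a mkinitrd debuginfo package):
resolve_addrs /usr/lib/debug/sbin/nash.debug 0x804fa2e 0x80516a5 0x805180a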

What can I check prior to rebooting after the install to determine if something
is broken?  Thoughts/suggestions?

Comment 1 James Laska 2008-02-27 18:30:30 UTC
Created attachment 296102 [details]
boot.log

Comment 2 James Laska 2008-02-27 18:50:36 UTC
Created attachment 296105 [details]
/init (ia64)

Still working to gather '/init' from the ramdisk on the i386 failure, but here
is the ia64 /init.

The failed system appears to only have 1 NIC (eth0)
# ip link list | grep -A1 eth
3: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:03:47:fd:ba:c0 brd ff:ff:ff:ff:ff:ff

Comment 3 James Laska 2008-02-27 20:07:36 UTC
Created attachment 296112 [details]
/init (i386)

Attaching the i386 ramdisk '/init' script

<6>pci0000:00: eth0: (PCI Express:2.5GB/s:Width x1) 00:13:20:f3:e9:9f
<6>pci0000:00: eth0: Intel(R) PRO/1000 Network Connection
<6>pci0000:00: eth0: MAC: 4, PHY: 6, PBA No: ffffff-0ff
<6>ACPI: PCI Interrupt 0000:05:00.0[A] -> GSI 19 (level, low) -> IRQ 193
<7>PCI: Setting latency timer of device 0000:05:00.0 to 64
<6>0000:00:1c.3: eth1: (PCI Express:2.5GB/s:Width x1) 00:15:17:3a:0a:c2
<6>0000:00:1c.3: eth1: Intel(R) PRO/1000 Network Connection
<6>0000:00:1c.3: eth1: MAC: 1, PHY: 4, PBA No: d50861-003

Comment 4 James Laska 2008-02-27 20:48:50 UTC
Created attachment 296117 [details]
boot.log (i386)

Comment 5 James Laska 2008-02-27 21:41:04 UTC
Created attachment 296127 [details]
r5u1-iscsi-files.tgz

Attaching anaconda files from a working i386 RHEL5-U1 install (includes /init)

> tar -ztvf Desktop/r5u1-iscsi-files.tgz 
-rw-r--r-- root/0	261615 2008-02-27 16:36 tmp/anaconda.log
-rw-r--r-- root/0	 23372 2008-02-27 16:36 tmp/syslog
-rw-r--r-- root/0	    32 2008-02-27 16:34 tmp/lvmout
-rwx------ root/0	  3162 2008-02-27 16:37 tmp/init.sh
-rw------- root/0	 11241 2008-02-27 16:31 tmp/ks.cfg

Comment 6 James Laska 2008-02-27 21:42:53 UTC
Created attachment 296128 [details]
r5u2-iscsi-files.tgz

Attaching anaconda files from a FAILED i386 RHEL5-U2 install (includes /init)

> tar -ztvf Desktop/r5u2-iscsi-files.tgz 
-rwx------ root/0	  3185 2008-02-27 16:15 tmp/init.sh
-rw-r--r-- root/0	259849 2008-02-27 16:13 tmp/anaconda.log
-rw-r--r-- root/0	 22727 2008-02-27 16:13 tmp/syslog
-rw------- root/0	 11244 2008-02-27 16:08 tmp/ks.cfg


There is a small diff between the two /init scripts.  The iscsi part of the
diff is as follows:

@@ -69,0 +70,5 @@
+echo Bringing up eth0
+netname 00:17:08:2A:72:3D eth0
+network --device eth0 --bootproto dhcp
+echo Attaching to iSCSI storage
+/bin/iscsistart -t iqn.1986-03.com.ibm.25166155.20070201123353.qe-rtt-i386-disk1 -i iqn.rhel5.i386 -g 1 -a 192.168.33.219
@@ -86,4 +90,0 @@
-echo Bringing up eth0
-network --device eth0 --bootproto dhcp
-echo Attaching to iSCSI storage
-/bin/iscsistart -t iqn.1986-03.com.ibm.25166155.20070201123353.qe-rtt-i386-disk1 -i iqn.rhel5.i386 -g 1 -a 192.168.33.219
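
The zero-context hunk headers above (for example "@@ -69,0 +70,5 @@") are what
diff -U0 produces, so a comparison of this kind can probably be regenerated
from the two attached tarballs.  A minimal sketch, assuming both archives store
the script as tmp/init.sh (as the tar listings show); the helper name diff_init
and the scratch-directory layout are illustrative only:

```shell
# Hypothetical helper: unpack tmp/init.sh from two tarballs and compare them.
# $1 is the older (working) archive, $2 the newer (failing) one.
diff_init() {
    old_tgz=$1
    new_tgz=$2
    work=$(mktemp -d) || return 1
    mkdir "$work/old" "$work/new"
    tar -xzf "$old_tgz" -C "$work/old" tmp/init.sh || return 1
    tar -xzf "$new_tgz" -C "$work/new" tmp/init.sh || return 1
    # -U0 prints only the changed lines, with no surrounding context.
    diff -U0 "$work/old/tmp/init.sh" "$work/new/tmp/init.sh"
    status=$?
    rm -rf "$work"
    # Like diff itself: 0 means identical, 1 means the scripts differ.
    return $status
}
```

Usage: diff_init r5u1-iscsi-files.tgz r5u2-iscsi-files.tgz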

Comment 8 James Laska 2008-02-28 13:07:30 UTC
The presence of "netname 00:17:08:2A:72:3D eth0" in the '/init' seems to be
causing nash to SIGSEGV on i386 and ia64 (but not ppc or x86_64).

When I remove the netname line, update the initrd.img, and reboot ... the
network is started and iscsistart works without error.

The culprit seems to be netname causing nash to die.
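
For anyone who wants to reproduce the workaround above, here is a minimal
sketch of the "remove the netname line, update the initrd.img" step.  It
assumes the initrd is a gzip-compressed newc cpio archive, that GNU sed is
available for in-place editing, and that it runs as root so device nodes in
the image survive the round trip.  The helper name strip_netname and the file
names in the usage line are hypothetical:

```shell
# Hypothetical helper: copy an initrd, deleting any "netname" line from /init.
# $1 is the existing initrd image, $2 is the path for the rebuilt image.
strip_netname() {
    in=$1
    out=$2
    work=$(mktemp -d) || return 1
    # Unpack the compressed cpio archive into the scratch directory.
    gzip -dc "$in" | (cd "$work" && cpio -id) || return 1
    # Drop only the netname line; the rest of /init is left untouched.
    sed -i '/^[[:space:]]*netname[[:space:]]/d' "$work/init" || return 1
    # Repack in newc format and recompress.
    (cd "$work" && find . | cpio -o -H newc | gzip -9) > "$out" || return 1
    rm -rf "$work"
}
```

Usage: strip_netname /boot/initrd-VERSION.img /boot/initrd-VERSION.nonetname.img
Writing to a new file and keeping the original image makes it easy to boot the
old initrd again if the rebuilt one misbehaves.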



Comment 13 errata-xmlrpc 2008-05-21 15:26:35 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0437.html