Bug 850560 - Unable to install RHSS 2.0 on IBM x3550 M3
Summary: Unable to install RHSS 2.0 on IBM x3550 M3
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: storage-server-tools
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Anthony Towns
QA Contact: Ben Turner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-08-21 20:22 UTC by Steve Reichard
Modified: 2014-07-11 06:39 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-29 13:57:17 UTC
Embargoed:


Attachments
anaconda abort info (1.79 KB, text/x-log)
2012-08-21 20:22 UTC, Steve Reichard
no flags
movie of install (8.18 MB, video/ogg)
2012-08-21 20:33 UTC, Steve Reichard
no flags
anaconda traceback (1.11 MB, application/octet-stream)
2012-10-05 17:21 UTC, Matthew Davis
no flags
Screen shot during EFI boot. (73.03 KB, image/png)
2013-04-17 18:19 UTC, Ben Turner
no flags

Description Steve Reichard 2012-08-21 20:22:14 UTC
Created attachment 606038
anaconda abort info

Description of problem:



In the final stages of the anaconda install of RHS 2.0 on my IBM x3550 M3 (firmware recently updated), I get an exception.

This same server has successfully installed RHEL 6.2 GA and 6.3.


Here is some config info:

[root@rhs-spr2 ~]# lspci
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
00:02.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 2 (rev 22)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
00:10.0 PIC: Intel Corporation 5520/5500/X58 Physical and Link Layer Registers Port 0 (rev 22)
00:10.1 PIC: Intel Corporation 5520/5500/X58 Routing and Protocol Layer Registers Port 0 (rev 22)
00:11.0 PIC: Intel Corporation 5520/5500 Physical and Link Layer Registers Port 1 (rev 22)
00:11.1 PIC: Intel Corporation 5520/5500 Routing & Protocol Layer Register Port 1 (rev 22)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 22)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 22)
00:15.0 PIC: Intel Corporation 5520/5500/X58 Trusted Execution Technology Registers (rev 22)
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 5
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIB (ICH10) LPC Interface Controller
00:1f.2 IDE interface: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller #1
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller #2
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
06:00.0 PCI bridge: Vitesse Semiconductor VSC452 [SuperBMC] (rev 01)
07:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EV
0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
0b:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
[root@rhs-spr2 ~]# lspci -s 01:00.0 -vv
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
	Subsystem: IBM Device 03c7
	Physical Slot: 5
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: I/O ports at 1000 [size=256]
	Region 1: Memory at 97940000 (64-bit, non-prefetchable) [size=16K]
	Region 3: Memory at 97900000 (64-bit, non-prefetchable) [size=256K]
	Expansion ROM at 90000000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
		DevCtl:	Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+
		DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB
	Capabilities: [d0] Vital Product Data
		Unknown small resource type 00, will not decode more.
	Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
		Vector table: BAR=1 offset=00002000
		PBA: BAR=1 offset=00003800
	Capabilities: [100] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
		UESvrt:	DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [138] Power Budgeting <?>
	Kernel driver in use: megaraid_sas
	Kernel modules: megaraid_sas

[root@rhs-spr2 ~]# 

I will attach the log of the abort and a movie of the anaconda interaction.




Version-Release number of selected component (if applicable):




The product didn't install, so the version is not easy to tell; however, the ISO was downloaded from access and is named RHS-2.0-20120621.2-RHS-x86_64-DVD1.iso





How reproducible:

Every attempt on multiple similar servers.
Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 2 Steve Reichard 2012-08-21 20:33:04 UTC
Created attachment 606039
movie of install

Comment 3 Matthew Davis 2012-10-05 17:20:23 UTC
I've hit the same bug on the x3650 M4. I'm attaching my anaconda traceback. Based on the traceback and logs, it seems to me RHS doesn't know how to handle EFI bootloaders.

Comment 4 Matthew Davis 2012-10-05 17:21:04 UTC
Created attachment 622302
anaconda traceback

Comment 5 Matthew Davis 2012-10-05 21:20:30 UTC
Fwiw, I loaded the ISO up via PXE and the install completed successfully.

Comment 6 Vijay Bellur 2012-10-22 07:10:04 UTC
AJ, Is there anything worth actioning here?

Comment 11 Ben Turner 2013-04-17 18:17:44 UTC
OK, I was able to reproduce the error "RuntimeError: Error running efibootmgr: No such file or directory" on one of my x3250m4 systems when mounting the install ISO on the IMM, using RHS-2.0-20121110.0-RHS-x86_64-DVD1.iso.  I can confirm that this issue is seen only on EFI boots and that booting in legacy mode works around it.
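
For anyone triaging a similar report, here is a minimal Python sketch of the two checks implied above: whether the machine booted via EFI, and whether efibootmgr can be found at all. It is not anaconda's code. The function names (booted_via_efi, find_tool, run_tool) are made up for illustration, and it assumes a Linux system where the kernel exposes /sys/firmware/efi on EFI boots.

```python
import os
import shutil
import subprocess


def booted_via_efi(sysfs_root="/sys"):
    """Return True if the running Linux kernel exposes EFI firmware data."""
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


def find_tool(name="efibootmgr", search_path=None):
    """Return the full path of `name`, or None if it is not on the search path."""
    return shutil.which(name, path=search_path)


def run_tool(argv):
    """Run a command; wrap a missing binary in a RuntimeError.

    The message format only imitates the error quoted in this comment;
    it is an assumption, not a copy of the installer's implementation.
    """
    try:
        return subprocess.run(argv, capture_output=True, text=True, check=False)
    except FileNotFoundError as exc:
        # exc.strerror carries the OS text for ENOENT.
        raise RuntimeError("Error running %s: %s" % (argv[0], exc.strerror)) from exc


if __name__ == "__main__":
    print("EFI boot:", booted_via_efi())
    print("efibootmgr:", find_tool() or "not found")
```

On a legacy (BIOS) boot the first check is False, which would be consistent with the installer never reaching the efibootmgr call there.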

Next I tried with the u4 gold ISO, RHS-2.0-20130410.0-RHS-x86_64-DVD1.iso.  I was able to successfully install but first boot hung just spamming messages like:

kobject_add_internal failed for HDDP-fab7e9e1-39dd-4f2b-8480-e20e906cb6dw with -EEXIST don't try to register things with the same name in the same directory.

I couldn't think of a better way to capture these messages so I took a screenshot (see attached).

So now the installer will go the whole way through, but there appears to be something with the config that is causing problems.

I started troubleshooting the -EEXIST problem by rebooting the system into single user mode.  I was able to get into single user mode without problems.  Just to note efibootmgr is installed:

# rpm -q efibootmgr
efibootmgr-0.5.4-9.el6.x86_64

After checking the efibootmgr version I ran init 3 and was able to get to multi-user with networking.  Everything seemed normal in runlevel 3 so I configured networking and rebooted.  The -EEXIST problem happened again.

The last thing I tried was updating the kernel in case this was a kernel bug.  I updated from:

kernel-2.6.32-220.32.1.el6.x86_64 (version from the ISO)

To:

kernel-2.6.32-220.34.1.el6.x86_64 (latest 6.2 EUS)

Same thing.
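
To double-check which of two kernel package names is newer, a small Python sketch like the following can help. It is a simplified numeric comparison written for names shaped like the two above, not RPM's real version-comparison algorithm, and release_key is a hypothetical helper name.

```python
import re


def release_key(nvr):
    """Split e.g. 'kernel-2.6.32-220.34.1.el6.x86_64' into comparable integers.

    Assumes a purely numeric version and release followed by '.elN';
    anything else raises ValueError.
    """
    m = re.match(r"kernel-(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el\d+", nvr)
    if m is None:
        raise ValueError("unrecognised kernel package name: %r" % nvr)
    version, release = m.groups()
    return (tuple(int(p) for p in version.split(".")),
            tuple(int(p) for p in release.split(".")))


if __name__ == "__main__":
    old = "kernel-2.6.32-220.32.1.el6.x86_64"
    new = "kernel-2.6.32-220.34.1.el6.x86_64"
    print(release_key(old) < release_key(new))
```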

At this point I don't have much input on RCA other than it looks like some sort of race condition with kobject_add_internal on uEFI boots, and going into single user mode works around this.

Comment 12 Ben Turner 2013-04-17 18:19:49 UTC
Created attachment 736991
Screen shot during EFI boot.

Comment 13 Ben Turner 2013-04-17 19:37:43 UTC
A link to the kernel source where we are seeing the error:

http://src3.org/RHEL6-2.6.32+220.el6/lib/kobject.c#L158

        error = create_dir(kobj);
        if (error) {
                kobj_kset_leave(kobj);
                kobject_put(parent);
                kobj->parent = NULL;

                /* be noisy on error issues */
                if (error == -EEXIST)
                        printk(KERN_ERR "%s failed for %s with "
                               "-EEXIST, don't try to register things with "
                               "the same name in the same directory.\n",
                               __func__, kobject_name(kobj));
                else
                        printk(KERN_ERR "%s failed for %s (%d)\n",
                               __func__, kobject_name(kobj), error);
                dump_stack();
        } else
                kobj->state_in_sysfs = 1;

        return error;
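
As a userspace analogy only (this is not kernel code), the same failure can be shown with ordinary directories: creating a second directory with the same name under the same parent fails with EEXIST, which is the condition the excerpt above logs. The register helper and the name "HDDP-example" are hypothetical.

```python
import errno
import os
import tempfile


def register(parent_dir, name):
    """Create one directory per registered name under parent_dir.

    Returns 0 on success or a negative errno value, imitating the
    kernel's return convention.
    """
    try:
        os.mkdir(os.path.join(parent_dir, name))
        return 0
    except OSError as exc:
        return -exc.errno


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        first = register(parent, "HDDP-example")    # first registration
        second = register(parent, "HDDP-example")   # duplicate name
        print(first, second == -errno.EEXIST)
```

The second call fails because something already created an entry with that name, which suggests asking what registered the first one rather than why the second failed.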

Comment 14 Ben Turner 2013-04-23 00:56:19 UTC
I ran several installs of RHS-2.1-20130415.n.2-RHS-x86_64-DVD1.iso on my x3250s and everything worked properly.  Adding the efibootmgr package to the ISO fixed the issue and I am not having any problems booting uEFI.

Comment 15 Ben Turner 2013-04-26 20:01:35 UTC
I tested this again today and didn't have any problems.  I am not sure what happened on the couple of boots where I had the -EEXIST error, but I am confident that the efibootmgr package resolved the issue this BZ was opened for.  If I run into any other instances of the -EEXIST error I will open a new BZ, but from my perspective this BZ is resolved and can be closed.

Comment 16 Ben Turner 2013-04-29 13:57:17 UTC
Closing current release.  The efibootmgr package was added to the RHS 2.0 u4 ISO and the anaconda crash is not reproducible in either the 2.0 u4 or the 2.1 alpha ISO.

Also note that after a reset of the IMM I did not see the "kobject_add_internal failed" error.

