Created attachment 606038 [details]
anaconda abort info

Description of problem:
In the final stages of the anaconda install of RHS 2.0 on my IBM x3550 M3 (f/w recently updated) I get an exception. This same server has successfully installed RHEL 6.2 GA and 6.3.

Here is some config info:

[root@rhs-spr2 ~]# lspci
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
00:02.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 2 (rev 22)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
00:10.0 PIC: Intel Corporation 5520/5500/X58 Physical and Link Layer Registers Port 0 (rev 22)
00:10.1 PIC: Intel Corporation 5520/5500/X58 Routing and Protocol Layer Registers Port 0 (rev 22)
00:11.0 PIC: Intel Corporation 5520/5500 Physical and Link Layer Registers Port 1 (rev 22)
00:11.1 PIC: Intel Corporation 5520/5500 Routing & Protocol Layer Register Port 1 (rev 22)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 22)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 22)
00:15.0 PIC: Intel Corporation 5520/5500/X58 Trusted Execution Technology Registers (rev 22)
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 5
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIB (ICH10) LPC Interface Controller
00:1f.2 IDE interface: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller #1
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller #2
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
06:00.0 PCI bridge: Vitesse Semiconductor VSC452 [SuperBMC] (rev 01)
07:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EV
0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
0b:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)

[root@rhs-spr2 ~]# lspci -s 01:00.0 -vv
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
    Subsystem: IBM Device 03c7
    Physical Slot: 5
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 16
    Region 0: I/O ports at 1000 [size=256]
    Region 1: Memory at 97940000 (64-bit, non-prefetchable) [size=16K]
    Region 3: Memory at 97900000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at 90000000 [disabled] [size=128K]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
        DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 128 bytes, MaxReadReq 4096 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range BC, TimeoutDis+
        DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis-
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB
    Capabilities: [d0] Vital Product Data
        Unknown small resource type 00, will not decode more.
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
        Vector table: BAR=1 offset=00002000
        PBA: BAR=1 offset=00003800
    Capabilities: [100] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [138] Power Budgeting <?>
    Kernel driver in use: megaraid_sas
    Kernel modules: megaraid_sas
[root@rhs-spr2 ~]#

I will attach the log of the abort and a movie of the anaconda interaction.

Version-Release number of selected component (if applicable):
Didn't install, so not easy to tell; however, the ISO was downloaded from access and is named RHS-2.0-20120621.2-RHS-x86_64-DVD1.iso

How reproducible:
Every attempt on multiple similar servers.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Created attachment 606039 [details] movie of install
I've hit the same bug on the x3650 M4. I'm attaching my anaconda traceback. Based on the traceback and logs, it seems to me RHS doesn't know how to handle EFI bootloaders.
Created attachment 622302 [details] anaconda traceback
Fwiw, I loaded the ISO up via PXE and the install completed successfully.
AJ, Is there anything worth actioning here?
OK, I was able to repro the error "RuntimeError: Error running efibootmgr: No such file or directory" on one of my x3250m4 systems when mounting the install ISO on the IMM. I used RHS-2.0-20121110.0-RHS-x86_64-DVD1.iso. I can confirm that this issue is seen only on EFI boots, and booting from legacy will work around it.

Next I tried with the u4 gold ISO, RHS-2.0-20130410.0-RHS-x86_64-DVD1.iso. I was able to install successfully, but first boot hung, just spamming messages like:

kobject_add_internal failed for HDDP-fab7e9e1-39dd-4f2b-8480-e20e906cb6dw with -EEXIST don't try to register things with the same name in the same directory.

I couldn't think of a better way to capture these messages, so I took a screenshot (see attached). So now the installer will go the whole way through, but there appears to be something in the config that is causing problems.

I started troubleshooting the -EEXIST problem by rebooting the system into single user mode. I was able to get into single user mode without problems. Just to note, efibootmgr is installed:

# rpm -q efibootmgr
efibootmgr-0.5.4-9.el6.x86_64

After checking the efibootmgr version I ran init 3 and was able to get to multi-user with networking. Everything seemed normal in runlevel 3, so I configured networking and rebooted. The -EEXIST problem happened again.

The last thing I tried was updating the kernel in case this was a kernel bug. I updated from:

kernel-2.6.32-220.32.1.el6.x86_64 (version from the ISO)

To:

kernel-2.6.32-220.34.1.el6.x86_64 (latest 6.2 EUS)

Same thing. At this point I don't have much input on RCA other than that it looks like some sort of race condition with kobject_add_internal on uEFI boots, and going into single user mode works around this.
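For anyone triaging a similar failure, here is a minimal sketch (not anaconda code) of the two conditions described in this comment: whether the system booted via EFI, and whether an efibootmgr binary can be found. It assumes that the presence of /sys/firmware/efi indicates an EFI boot on Linux. The function names and the wording of the returned messages are hypothetical.

```python
import os
import shutil


def booted_via_efi(sysfs_efi_dir="/sys/firmware/efi"):
    """Return True if the kernel exposes EFI data in sysfs (assumed EFI boot)."""
    return os.path.isdir(sysfs_efi_dir)


def efibootmgr_path():
    """Return the path of efibootmgr on PATH, or None if it is missing."""
    return shutil.which("efibootmgr")


def bootloader_install_risk(is_efi, tool_path):
    """Summarise the situation; hypothetical helper for illustration only."""
    if not is_efi:
        return "legacy boot: efibootmgr is not used"
    if tool_path is None:
        return "EFI boot without efibootmgr: boot entry setup is likely to fail"
    return "EFI boot with efibootmgr available"


if __name__ == "__main__":
    print(bootloader_install_risk(booted_via_efi(), efibootmgr_path()))
```

Run on an installed system this only reports the current state. It does not reproduce the installer's own logic.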
Created attachment 736991 [details] Screen shot during EFI boot.
A link to the kernel source where we are seeing the error:

http://src3.org/RHEL6-2.6.32+220.el6/lib/kobject.c#L158

    error = create_dir(kobj);
    if (error) {
        kobj_kset_leave(kobj);
        kobject_put(parent);
        kobj->parent = NULL;

        /* be noisy on error issues */
        if (error == -EEXIST)
            printk(KERN_ERR "%s failed for %s with "
                   "-EEXIST, don't try to register things with "
                   "the same name in the same directory.\n",
                   __func__, kobject_name(kobj));
        else
            printk(KERN_ERR "%s failed for %s (%d)\n",
                   __func__, kobject_name(kobj), error);
        dump_stack();
    } else
        kobj->state_in_sysfs = 1;

    return error;
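Since the messages scroll by too fast to read on the console, a small parser may help once they are captured from a serial console or log. The sketch below is an illustration only: the regular expression is derived from the printk format string in the excerpt above, the function name is hypothetical, and it assumes kobject names contain no whitespace. It counts how often each name was rejected with -EEXIST.

```python
import re
from collections import Counter

# Derived from the format string "%s failed for %s with -EEXIST, ..." above.
# Assumption: the kobject name contains no whitespace.
EEXIST_RE = re.compile(r"(\w+) failed for (\S+) with -EEXIST")


def count_eexist_names(lines):
    """Count -EEXIST failures per kobject name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = EEXIST_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    # Sample input: the message quoted earlier in this bug, repeated twice.
    msg = ("kobject_add_internal failed for "
           "HDDP-fab7e9e1-39dd-4f2b-8480-e20e906cb6dw with -EEXIST, "
           "don't try to register things with the same name in the "
           "same directory.")
    sample = [msg, msg, "some unrelated kernel message"]
    for name, count in count_eexist_names(sample).most_common():
        print(name, count)
```

A name that shows up many times would suggest the same object is being re-registered in a loop rather than many distinct collisions.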
I ran several installs of RHS-2.1-20130415.n.2-RHS-x86_64-DVD1.iso on my x3250s and everything worked properly. Adding the efibootmgr package to the ISO fixed the issue and I am not having any problems booting uEFI.
I tested this again today and didn't have any problems. I am not sure what happened on the couple of boots where I had the -EEXIST error, but I am confident that the efibootmgr package resolved the issue this BZ was opened for. If I run into any other instances of the -EEXIST error I will open a new BZ, but from my perspective this BZ is resolved and can be closed.
Closing current release. The efibootmgr package was added to the RHS 2.0 u4 ISO, and the anaconda crash is not reproducible in either the 2.0 u4 or the 2.1 alpha ISO. Also note that after a reset of the IMM I did not see the "kobject_add_internal failed" error.