Description of problem:
When a RHEL 6.2 HVM guest is booted with three ioemu NICs and an xvdX block device, the guest fails to boot and prints a call trace.

Version-Release number of selected component (if applicable):
Guest: 6.2-20111117.0
Host: 2.6.18-298.el5xen
xen: xen-3.0.3-135.el5

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the following parameters:
disk = [ "file:/root/img/RHEL-Server-6.2-64-20111117.0-hvm.raw,xvda,w" ]
vif = [ "type=ioemu,mac=00:16:36:43:70:e3,bridge=xenbr0,script=vif-bridge",
        "type=ioemu,mac=00:16:36:43:70:e6,bridge=xenbr0,script=vif-bridge",
        "type=ioemu,mac=00:16:36:43:70:e2,bridge=xenbr0,script=vif-bridge" ]

Actual results:
The guest fails to boot and prints a call trace.

Expected results:
The guest should boot successfully without a call trace.

Additional info:
1. Tested on 6.2-20111117.0. Both "file:" and "tap:aio:" lead to the problem. It happens with three emulated NICs in every combination tried (three default; one rtl8139, one e1000 and one pcnet; three rtl8139; two rtl8139 and one e1000). Two NICs do not trigger the call trace. Three netfront vifs work fine, and there is no problem when using an hdX block device.
2. Tested on the 6.1 release: it happens there too.
3. Tested on the 5.7 release: the guest boots successfully.
Created attachment 535369 [details] boot log
Created attachment 535370 [details] xm dmesg log
Can you provide access to the system where it happens?
The LVM setup on the RHEL-Server-6.2-64-20111117.0-hvm.raw image is broken; you probably need to reinstall the 6.2 guest. I had to use the RHEL-Server-6.1-64-hvm.raw image for the test. I can now reproduce it on my workstation with the following config; xen_emul_unplug=never must be on the kernel command line:

name = "rhel61x64FVfile"
uuid = "40e9b5a0-546a-dcf9-b949-c590f31b927a"
maxmem = 2048
memory = 2048
vcpus = 4
builder = "hvm"
kernel = "/usr/lib/xen/boot/hvmloader"
boot = "c"
pae = 1
acpi = 1
apic = 1
localtime = 0
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"
device_model = "/usr/lib64/xen/bin/qemu-dm"
sdl = 0
vnc = 1
vncunused = 1
keymap = "en-us"
disk = [ "file:/var/lib/xen/images/rhel61x64FVfile,xvda,w" ]
vif = [ "mac=00:16:3e:5d:02:d6,bridge=xenbr0,script=vif-bridge",
        "mac=00:16:3e:23:5d:a4,bridge=xenbr0,script=vif-bridge",
        "mac=00:16:3e:10:03:59,bridge=xenbr0,script=vif-bridge" ]
parallel = "none"
serial = "pty"

The guest gets stuck after:

dracut: Switching root
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
udev: starting version 147
piix4_smbus 0000:00:01.2: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
alloc irq_desc for 32 on node -1
alloc kstat_irqs on node -1
8139cp 0000:00:04.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
eth0: RTL-8139C+ at 0xffffc90001140000, 00:16:3e:5d:02:d6, IRQ 32
8139cp 0000:00:04.0: setting latency timer to 64
alloc irq_desc for 36 on node -1
alloc kstat_irqs on node -1
8139cp 0000:00:05.0: PCI INT A -> GSI 36 (level, low) -> IRQ 36
eth1: RTL-8139C+ at 0xffffc90001144100, 00:16:3e:23:5d:a4, IRQ 36
8139cp 0000:00:05.0: setting latency timer to 64
8139cp 0000:00:08.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
eth2: RTL-8139C+ at 0xffffc90001148200, 00:16:3e:10:03:59, IRQ 17
8139cp 0000:00:08.0: setting latency timer to 64
INFO: task ip:623 blocked for more than 120 seconds ...
It seems that fc16 doesn't have this issue. With xen_emul_unplug=never, fc16 doesn't try to use PV-on-HVM at all, but rhel6 uses the PV block device instead of the emulated one.
Created attachment 537515 [details] [RHEL6.3 PATCH 1/4] Do not init xen platform pci if xen_emul_unplug=never
Created attachment 537516 [details] [RHEL6.3 PATCH 2/4] x86/io_apic: add get_nr_irqs_gsi()
Created attachment 537517 [details] [RHEL6.3 PATCH 3/4] xen: add get_nr_hw_irqs req for finding an unbound irq number in reverse order
Created attachment 537518 [details] [RHEL6.3 PATCH 4/4] xen: Find an unbound irq number in reverse order (high to low).
Created attachment 537519 [details] [RHEL6.3 COVER LETTER 0/4] xen: Fix hung with 3 emulated nics when xen_emul_unplug=never
To QE: it should be tested with both of the following settings.

xen_emul_unplug=never - disables all PV drivers, so it only hides the problem: blkfront isn't loaded and doesn't take an irq.

xen_emul_unplug=ide-disks - on an affected kernel blkfront will get the irq of the 3rd nic. On a fixed kernel it will get an irq from the upper bound (~25x).
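The difference between the two search orders can be illustrated with a toy model. This is only a sketch, not the kernel code from the patches: the function names, the starting point FIRST_DYNAMIC = 16 and the limit NR_IRQS = 848 are assumptions chosen so that the numbers resemble the logs in this report. Only the NIC GSIs (32, 36, 17) are taken from the boot log above.

```python
# Toy model of "find an unbound irq": forward (low to high) vs reverse (high to low).
# FIRST_DYNAMIC and NR_IRQS are assumptions for illustration, not kernel values.

def find_unbound_irq(bound, first, nr_irqs, reverse):
    """Return the first IRQ number in [first, nr_irqs) that is not in `bound`."""
    if reverse:
        order = range(nr_irqs - 1, first - 1, -1)
    else:
        order = range(first, nr_irqs)
    for irq in order:
        if irq not in bound:
            return irq
    raise RuntimeError("no unbound irq left")


def allocate_dynamic(names, first, nr_irqs, reverse):
    """Bind one dynamic IRQ per name and return {irq: name}."""
    bound = {}
    for name in names:
        bound[find_unbound_irq(bound, first, nr_irqs, reverse)] = name
    return bound


# GSIs claimed later by the emulated NICs (from the boot log in this report).
NIC_GSIS = {"eth0": 32, "eth1": 36, "eth2": 17}
FIRST_DYNAMIC = 16   # assumption: dynamic IRQs start right after the legacy range
NR_IRQS = 848        # assumption: upper bound of the IRQ number space


def collisions(dynamic):
    """Return {nic: pv_user} for every NIC whose GSI was already handed out."""
    return {nic: dynamic[gsi] for nic, gsi in NIC_GSIS.items() if gsi in dynamic}


forward = allocate_dynamic(["xenbus", "blkif"], FIRST_DYNAMIC, NR_IRQS, reverse=False)
backward = allocate_dynamic(["xenbus", "blkif"], FIRST_DYNAMIC, NR_IRQS, reverse=True)

print(forward, collisions(forward))    # blkif lands on the GSI of the 3rd nic
print(backward, collisions(backward))  # dynamic IRQs stay clear of the NIC GSIs
```

In this model the forward search hands blkif the same number that the third NIC later needs, while the reverse search keeps the PV IRQs at the top of the range, away from the GSIs.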
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Patch(es) available on kernel-2.6.32-229.el6
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
In previous RHEL6 releases, the kernel option xen_emul_unplug=never did not disable the xen platform pci device, which led to para-virtual devices being used instead of emulated ones. This fix, in addition to fixing the irq allocation issue for emulated network devices, makes it possible to disable para-virtual drivers with the xen_emul_unplug=never kernel option, as described in "Virtualization Guide: Edition 5.8", chapter "12.3.5. Xen Para-virtualized Drivers on Red Hat Enterprise Linux 6".
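When testing, it can help to confirm which xen_emul_unplug value a guest was actually booted with. The following is a minimal sketch, assuming the command line is available as a string (on a running guest it could be read from /proc/cmdline). The helper name is hypothetical, and returning the last occurrence is a choice made for this sketch, not a statement about how the kernel resolves duplicates.

```python
# Minimal sketch: extract the xen_emul_unplug value from a kernel command line.
# The helper name is hypothetical; the option values come from this report.

def xen_emul_unplug_value(cmdline):
    """Return the value of xen_emul_unplug in `cmdline`, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "xen_emul_unplug" and sep:
            value = val  # keep the last occurrence (a choice of this sketch)
    return value


# Example command lines (the root= values are made up for illustration).
print(xen_emul_unplug_value("ro root=/dev/sda1 xen_emul_unplug=never"))
print(xen_emul_unplug_value("ro root=/dev/sda1 xen_emul_unplug=ide-disks quiet"))
print(xen_emul_unplug_value("ro root=/dev/sda1 quiet"))
```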
Verified on RHEL6.3-20120416 (kernel 2.6.32-262), and reproduced on the RHEL6.2 release (kernel 2.6.32-220).

Reproduce Steps:
Boot the guest with the following parameters:
disk = [ "file:/root/img/RHEL-Server-6.2-64-hvm.raw,xvda,w" ]
vif = [ "type=ioemu,mac=00:16:36:43:70:e3,bridge=xenbr0,script=vif-bridge",
        "type=ioemu,mac=00:16:36:43:70:e6,bridge=xenbr0,script=vif-bridge",
        "type=ioemu,mac=00:16:36:43:70:e2,bridge=xenbr0,script=vif-bridge" ]

Result (with xen_emul_unplug=never in the kernel line):
The guest failed to boot and printed a call trace.

Verify Steps:
1. Boot the guest with the following parameters:
disk = [ "file:/root/img/RHEL-Server-6.3-64-hvm.raw,xvda,w" ]
vif = [ "type=ioemu,mac=00:16:36:43:70:e3,bridge=xenbr0,script=vif-bridge",
        "type=ioemu,mac=00:16:36:43:70:e6,bridge=xenbr0,script=vif-bridge",
        "type=ioemu,mac=00:16:36:43:70:e2,bridge=xenbr0,script=vif-bridge" ]
2. Boot the guest with the following parameters:
disk = [ "file:/root/img/RHEL-Server-6.3-64-hvm.raw,xvda,w" ]
vif = [ "type=netfront,mac=00:16:36:43:70:e3,bridge=xenbr0,script=vif-bridge",
        "type=netfront,mac=00:16:36:43:70:e6,bridge=xenbr0,script=vif-bridge",
        "type=netfront,mac=00:16:36:43:70:e2,bridge=xenbr0,script=vif-bridge" ]

Result:
1. With "xen_emul_unplug=never" in the guest kernel line, all ioemu NICs work fine, and no netfront NICs work because all PV drivers are disabled:
# lsmod | grep xen
(nothing shows up)
2. With "xen_emul_unplug=ide-disks" in the guest kernel line, both ioemu and netfront NICs work fine.
# lsmod | grep xen
xen_netfront 18905 0
xen_blkfront 15687 3

Also checked that on the fixed kernel blkfront gets its irq from the upper bound:

[guest]# cat /proc/interrupts
         CPU0     CPU1     CPU2     CPU3
  0:      156        0        0        0  IO-APIC-edge     timer
  1:      410       20       25       17  IO-APIC-edge     i8042
  4:      338       51       68       10  IO-APIC-edge     serial
  8:        1        0        0        0  IO-APIC-edge     rtc0
  9:        0        0        0        0  IO-APIC-fasteoi  acpi
 12:     1066       54      119       30  IO-APIC-edge     i8042
 14:        0        0        0        0  IO-APIC-edge     ata_piix
 15:        0        0        0        0  IO-APIC-edge     ata_piix
 17:    29030       68       55       51  IO-APIC-fasteoi  eth2
 28:     6684     6516     6576     6663  IO-APIC-fasteoi  xen-platform-pci
 32:    27938      314       52       64  IO-APIC-fasteoi  eth1
 36:    28340       50      309       44  IO-APIC-fasteoi  eth3
846:     6656        0        0        0  xen-dyn-event    blkif
847:      102        0        0        0  xen-dyn-event    xenbus
NMI:        0        0        0        0  Non-maskable interrupts
LOC:  9092081  9147301  7385229  7909594  Local timer interrupts
SPU:        0        0        0        0  Spurious interrupts
PMI:        0        0        0        0  Performance monitoring interrupts
IWI:        0        0        0        0  IRQ work interrupts
RES:    11425    12438    11469    12208  Rescheduling interrupts
CAL:       39      187      196      190  Function call interrupts
TLB:     1106      934      954     1077  TLB shootdowns
TRM:        0        0        0        0  Thermal event interrupts
THR:        0        0        0        0  Threshold APIC interrupts
MCE:        0        0        0        0  Machine check exceptions
MCP:       37       37       37       37  Machine check polls
ERR:        0
MIS:        0

So changing this bug to VERIFIED.
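The check on /proc/interrupts can also be scripted. The sketch below is an assumption about how one might automate it, not part of the QE procedure: the function names are hypothetical, and the sample text is a subset of the rows from the listing in this report. It reports whether every xen-dyn-event IRQ is numbered above every IO-APIC IRQ.

```python
# Sketch: verify that Xen dynamic event IRQs sit above all IO-APIC IRQs.
# Function names are hypothetical; SAMPLE is a subset of the listing in this report.

def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_number: (chip, device_name)}."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():           # skip NMI:, LOC:, ERR:, ...
            continue
        fields = rest.split()
        rows[int(label)] = (fields[ncpu], " ".join(fields[ncpu + 1:]))
    return rows


def dynamic_irqs_above_gsis(rows):
    """True if at least one xen-dyn IRQ exists and all are above every IO-APIC IRQ."""
    dyn = [irq for irq, (chip, _) in rows.items() if chip.startswith("xen-dyn")]
    hw = [irq for irq, (chip, _) in rows.items() if chip.startswith("IO-APIC")]
    return bool(dyn) and min(dyn) > max(hw, default=-1)


SAMPLE = """\
         CPU0     CPU1     CPU2     CPU3
 17:    29030       68       55       51  IO-APIC-fasteoi  eth2
 28:     6684     6516     6576     6663  IO-APIC-fasteoi  xen-platform-pci
 32:    27938      314       52       64  IO-APIC-fasteoi  eth1
846:     6656        0        0        0  xen-dyn-event    blkif
847:      102        0        0        0  xen-dyn-event    xenbus
NMI:        0        0        0        0  Non-maskable interrupts
"""

rows = parse_interrupts(SAMPLE)
print(rows[846], dynamic_irqs_above_gsis(rows))
```

On a guest, the same functions could be fed the contents of /proc/interrupts instead of SAMPLE.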
*** Bug 794579 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-0862.html