Bug 756307
Summary: | Failed to boot RHEL6.2 hvm guest with three NICs when using xvdx disk | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Shengnan Wang <shwang> |
Component: | kernel | Assignee: | Igor Mammedov <imammedo> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 6.2 | CC: | drjones, imammedo, leiwang, mrezanin, qguan, qwan, xen-maint, yuzhou |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | xen | ||
Fixed In Version: | kernel-2.6.32-229.el6 | Doc Type: | Bug Fix |
Doc Text: |
In previous RHEL6 releases, the kernel option xen_emul_unplug=never did not disable the xen platform pci device, which led to para-virtual devices being used instead of emulated ones. This fix, in addition to fixing an irq allocation issue for emulated network devices, allows para-virtual drivers to be disabled using the xen_emul_unplug=never kernel option, as described in "Virtualization Guide: Edition 5.8", chapter "12.3.5. Xen Para-virtualized Drivers on Red Hat Enterprise Linux 6".
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2012-06-20 08:07:34 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 523117 | ||
Attachments: |
Description
Shengnan Wang
2011-11-23 08:29:15 UTC
Created attachment 535369 [details]
boot log
Created attachment 535370 [details]
xm dmesg log
Can you provide access to the system where it happens?

The LVM setup on the RHEL-Server-6.2-64-20111117.0-hvm.raw image is broken; you probably need to reinstall the 6.2 guest.

I had to use the RHEL-Server-6.1-64-hvm.raw image for the test. Now I can reproduce it on my ws with the following config; xen_emul_unplug=never must be on the kernel command line:

    name = "rhel61x64FVfile"
    uuid = "40e9b5a0-546a-dcf9-b949-c590f31b927a"
    maxmem = 2048
    memory = 2048
    vcpus = 4
    builder = "hvm"
    kernel = "/usr/lib/xen/boot/hvmloader"
    boot = "c"
    pae = 1
    acpi = 1
    apic = 1
    localtime = 0
    on_poweroff = "destroy"
    on_reboot = "restart"
    on_crash = "restart"
    device_model = "/usr/lib64/xen/bin/qemu-dm"
    sdl = 0
    vnc = 1
    vncunused = 1
    keymap = "en-us"
    disk = [ "file:/var/lib/xen/images/rhel61x64FVfile,xvda,w" ]
    vif = [ "mac=00:16:3e:5d:02:d6,bridge=xenbr0,script=vif-bridge",
            "mac=00:16:3e:23:5d:a4,bridge=xenbr0,script=vif-bridge",
            "mac=00:16:3e:10:03:59,bridge=xenbr0,script=vif-bridge" ]
    parallel = "none"
    serial = "pty"

The guest gets stuck after:

    dracut: Switching root
    SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
    udev: starting version 147
    piix4_smbus 0000:00:01.2: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
    8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
    alloc irq_desc for 32 on node -1
    alloc kstat_irqs on node -1
    8139cp 0000:00:04.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
    eth0: RTL-8139C+ at 0xffffc90001140000, 00:16:3e:5d:02:d6, IRQ 32
    8139cp 0000:00:04.0: setting latency timer to 64
    alloc irq_desc for 36 on node -1
    alloc kstat_irqs on node -1
    8139cp 0000:00:05.0: PCI INT A -> GSI 36 (level, low) -> IRQ 36
    eth1: RTL-8139C+ at 0xffffc90001144100, 00:16:3e:23:5d:a4, IRQ 36
    8139cp 0000:00:05.0: setting latency timer to 64
    8139cp 0000:00:08.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
    eth2: RTL-8139C+ at 0xffffc90001148200, 00:16:3e:10:03:59, IRQ 17
    8139cp 0000:00:08.0: setting latency timer to 64
    INFO: task ip:623 blocked for more than 120 seconds
    ...
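For triage, the NIC-to-IRQ assignments can be pulled out of a boot log like the one above with a short script. The following is only a minimal sketch in Python: the function name nic_irqs and the regular expression are illustrative assumptions, and the sample lines are copied from the log quoted in this comment.

```python
import re

# Lines copied from the boot log quoted in the comment above.
BOOT_LOG = """\
8139cp 0000:00:04.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
eth0: RTL-8139C+ at 0xffffc90001140000, 00:16:3e:5d:02:d6, IRQ 32
8139cp 0000:00:05.0: PCI INT A -> GSI 36 (level, low) -> IRQ 36
eth1: RTL-8139C+ at 0xffffc90001144100, 00:16:3e:23:5d:a4, IRQ 36
8139cp 0000:00:08.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
eth2: RTL-8139C+ at 0xffffc90001148200, 00:16:3e:10:03:59, IRQ 17
"""

# Hypothetical pattern for the "ethN: <chip> at <addr>, <mac>, IRQ <n>" lines.
NIC_LINE = re.compile(
    r"^(?P<dev>eth\d+): .*?, "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}), IRQ (?P<irq>\d+)$"
)


def nic_irqs(log_text):
    """Return {device: (mac, irq)} for every NIC summary line in the log."""
    result = {}
    for line in log_text.splitlines():
        match = NIC_LINE.match(line.strip())
        if match:
            result[match.group("dev")] = (match.group("mac"), int(match.group("irq")))
    return result


if __name__ == "__main__":
    for dev, (mac, irq) in sorted(nic_irqs(BOOT_LOG).items()):
        print(dev, mac, irq)
```

In this log the third NIC (eth2) is the one sitting on a low IRQ number (17), which is the device the later comments say blkfront collides with on an affected kernel.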
It seems that fc16 doesn't have this issue. With xen_emul_unplug=never, fc16 doesn't try to use pv on hvm at all, but rhel6 uses the pv block device instead of the emulated one.
Created attachment 537515 [details]
[RHEL6.3 PATCH 1/4] Do not init xen platform pci if xen_emul_unplug=never
Created attachment 537516 [details]
[RHEL6.3 PATCH 2/4] x86/io_apic: add get_nr_irqs_gsi()
Created attachment 537517 [details]
[RHEL6.3 PATCH 3/4] xen: add get_nr_hw_irqs req for finding an unbound irq number in reverse order
Created attachment 537518 [details]
[RHEL6.3 PATCH 4/4] xen: Find an unbound irq number in reverse order (high to low).
Created attachment 537519 [details]
[RHEL6.3 COVER LETTER 0/4] xen: Fix hung with 3 emulated nics when xen_emul_unplug=never
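The idea named in the titles of patches 2/4 to 4/4 (search for an unbound irq number from high to low, using the number of GSI irqs as a boundary) can be illustrated with a toy model. This is a sketch in Python, not the kernel code: find_unbound_irq, the table size of 256, the GSI boundary of 48 and the set of bound irqs are all hypothetical values chosen for illustration.

```python
def find_unbound_irq(bound, nr_irqs, reverse):
    """Return the first irq number in [0, nr_irqs) that is not in `bound`.

    reverse=False scans low to high; reverse=True scans high to low.
    Toy model only; it does not reproduce the kernel's data structures.
    """
    candidates = range(nr_irqs)
    if reverse:
        candidates = reversed(candidates)
    for irq in candidates:
        if irq not in bound:
            return irq
    raise RuntimeError("no unbound irq available")


# Hypothetical state: legacy irqs 0-15 plus two NICs already bound.
NR_IRQS = 256        # assumed size of the irq table
NR_IRQS_GSI = 48     # assumed number of irqs that hardware GSIs may claim
bound = set(range(16)) | {32, 36}

low_first = find_unbound_irq(bound, NR_IRQS, reverse=False)
high_first = find_unbound_irq(bound, NR_IRQS, reverse=True)

# A low-to-high scan hands out a number inside the GSI range, which an
# emulated device probed later may also be routed to; a high-to-low scan
# stays clear of that range.
print(low_first, low_first < NR_IRQS_GSI)
print(high_first, high_first < NR_IRQS_GSI)
```

The model does not try to predict which exact number an affected kernel hands to a para-virtual device; it only shows why allocating from the top of the range keeps dynamically bound irqs away from the numbers that emulated PCI devices use.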
To QE: it should be tested with:

xen_emul_unplug=never - disables all pv drivers, so it just hides the problem because blkfront isn't loaded and doesn't take an irq.

xen_emul_unplug=ide-disks - on an affected kernel, blkfront will get the irq of the 3rd nic. On a fixed kernel, it will get an irq from the upper bound (~25x).

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Patch(es) available on kernel-2.6.32-229.el6

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
In previous RHEL6 releases, the kernel option xen_emul_unplug=never did not disable the xen platform pci device, which led to para-virtual devices being used instead of emulated ones. This fix, in addition to fixing an irq allocation issue for emulated network devices, allows para-virtual drivers to be disabled using the xen_emul_unplug=never kernel option, as described in "Virtualization Guide: Edition 5.8", chapter "12.3.5. Xen Para-virtualized Drivers on Red Hat Enterprise Linux 6".

Verified on RHEL6.3-20120416 (kernel 2.6.32-262), and reproduced on the RHEL6.2 release (kernel 2.6.32-220).

Reproduce Steps:
Boot up the guest with the following parameters:

    disk = [ "file:/root/img/RHEL-Server-6.2-64-hvm.raw,xvda,w"]
    vif = ["type=ioemu,mac=00:16:36:43:70:e3,bridge=xenbr0,script=vif-bridge",
           "type=ioemu,mac=00:16:36:43:70:e6,bridge=xenbr0,script=vif-bridge",
           "type=ioemu,mac=00:16:36:43:70:e2,bridge=xenbr0,script=vif-bridge"]

Result:
Fix (with xen_emul_unplug=never in kernel line): the guest failed to boot up and showed a call trace.

Verify Steps:
1. Boot up the guest with the following parameters:

    disk = [ "file:/root/img/RHEL-Server-6.3-64-hvm.raw,xvda,w"]
    vif = ["type=ioemu,mac=00:16:36:43:70:e3,bridge=xenbr0,script=vif-bridge",
           "type=ioemu,mac=00:16:36:43:70:e6,bridge=xenbr0,script=vif-bridge",
           "type=ioemu,mac=00:16:36:43:70:e2,bridge=xenbr0,script=vif-bridge"]

2. Boot up the guest with the following parameters:

    disk = [ "file:/root/img/RHEL-Server-6.3-64-hvm.raw,xvda,w"]
    vif = ["type=netfront,mac=00:16:36:43:70:e3,bridge=xenbr0,script=vif-bridge",
           "type=netfront,mac=00:16:36:43:70:e6,bridge=xenbr0,script=vif-bridge",
           "type=netfront,mac=00:16:36:43:70:e2,bridge=xenbr0,script=vif-bridge"]

Result:
1. With "xen_emul_unplug=never" in the guest kernel line, all ioemu nics work fine, and no netfront nic works because all pv drivers are disabled:

    # lsmod | grep xen
    (nothing shows up)

2. With "xen_emul_unplug=ide-disks" in the guest kernel line, both ioemu and netfront nics work fine:

    # lsmod | grep xen
    xen_netfront 18905 0
    xen_blkfront 15687 3

Also check that on the fixed kernel blkfront gets its irq from the upper bound:

    [guest]# cat /proc/interrupts
               CPU0       CPU1       CPU2       CPU3
      0:        156          0          0          0   IO-APIC-edge      timer
      1:        410         20         25         17   IO-APIC-edge      i8042
      4:        338         51         68         10   IO-APIC-edge      serial
      8:          1          0          0          0   IO-APIC-edge      rtc0
      9:          0          0          0          0   IO-APIC-fasteoi   acpi
     12:       1066         54        119         30   IO-APIC-edge      i8042
     14:          0          0          0          0   IO-APIC-edge      ata_piix
     15:          0          0          0          0   IO-APIC-edge      ata_piix
     17:      29030         68         55         51   IO-APIC-fasteoi   eth2
     28:       6684       6516       6576       6663   IO-APIC-fasteoi   xen-platform-pci
     32:      27938        314         52         64   IO-APIC-fasteoi   eth1
     36:      28340         50        309         44   IO-APIC-fasteoi   eth3
    846:       6656          0          0          0   xen-dyn-event     blkif
    847:        102          0          0          0   xen-dyn-event     xenbus
    NMI:          0          0          0          0   Non-maskable interrupts
    LOC:    9092081    9147301    7385229    7909594   Local timer interrupts
    SPU:          0          0          0          0   Spurious interrupts
    PMI:          0          0          0          0   Performance monitoring interrupts
    IWI:          0          0          0          0   IRQ work interrupts
    RES:      11425      12438      11469      12208   Rescheduling interrupts
    CAL:         39        187        196        190   Function call interrupts
    TLB:       1106        934        954       1077   TLB shootdowns
    TRM:          0          0          0          0   Thermal event interrupts
    THR:          0          0          0          0   Threshold APIC interrupts
    MCE:          0          0          0          0   Machine check exceptions
    MCP:         37         37         37         37   Machine check polls
    ERR:          0
    MIS:          0

So change this bug into VERIFIED.

*** Bug 794579 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-0862.html |
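The /proc/interrupts check from the verification comment above could be scripted roughly as follows. This is a sketch in Python under stated assumptions: the helper names parse_interrupts and pv_irqs_above_hw are hypothetical, the "IO-APIC" and "xen-dyn-event" labels are taken from the output quoted in this report, and the sample is a subset of that output.

```python
# Subset of the /proc/interrupts output quoted in the verification comment.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 17:      29030         68         55         51   IO-APIC-fasteoi   eth2
 28:       6684       6516       6576       6663   IO-APIC-fasteoi   xen-platform-pci
 32:      27938        314         52         64   IO-APIC-fasteoi   eth1
 36:      28340         50        309         44   IO-APIC-fasteoi   eth3
846:       6656          0          0          0   xen-dyn-event     blkif
847:        102          0          0          0   xen-dyn-event     xenbus
NMI:          0          0          0          0   Non-maskable interrupts
"""


def parse_interrupts(text):
    """Return a list of (irq, chip, name) for the numbered irq rows."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())          # header row lists one column per CPU
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue
        label = fields[0].rstrip(":")
        if not label.isdigit():
            continue                       # skip NMI:, LOC:, ERR: and similar
        rest = fields[1 + ncpus:]          # drop the per-CPU counters
        if len(rest) < 2:
            continue
        rows.append((int(label), rest[0], " ".join(rest[1:])))
    return rows


def pv_irqs_above_hw(rows):
    """True if every xen-dyn-event irq is above every IO-APIC irq."""
    hw = [irq for irq, chip, _ in rows if chip.startswith("IO-APIC")]
    pv = [irq for irq, chip, _ in rows if chip == "xen-dyn-event"]
    return bool(hw) and bool(pv) and min(pv) > max(hw)


if __name__ == "__main__":
    print(pv_irqs_above_hw(parse_interrupts(SAMPLE)))
```

On a guest, the same functions could be fed the contents of /proc/interrupts instead of the embedded sample; the comparison only tells whether the para-virtual irqs sit above the emulated ones, not whether the devices themselves work.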