Bug 681133 - RHEL 5.6 32bit SMP guest hang at boot up
Summary: RHEL 5.6 32bit SMP guest hang at boot up
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Zachary Amsden
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Duplicates: 688951 (view as bug list)
Depends On:
Blocks: Rhel6KvmTier1 684385
Reported: 2011-03-01 08:21 UTC by Joy Pu
Modified: 2018-11-14 13:49 UTC (History)

Fixed In Version: kernel-2.6.32-130.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-05-23 20:40:34 UTC
Target Upstream Version:
Embargoed:


Attachments
Whole log for kvm trace (12.17 MB, application/octet-stream)
2011-03-01 08:23 UTC, Joy Pu
Created an attachment for dmesg, /var/log/messages and sosreport of the host, also xml file of the guest (1.14 MB, application/x-gzip)
2011-03-27 03:39 UTC, IBM Bug Proxy
Patch fixing the problem (1.24 KB, application/octet-stream)
2011-04-06 16:59 UTC, Zachary Amsden


Links
System: Red Hat Product Errata
ID: RHSA-2011:0542
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 6.1 kernel security, bug fix and enhancement update
Last Updated: 2011-05-19 11:58:07 UTC

Description Joy Pu 2011-03-01 08:21:33 UTC
Description:
On a RHEL 6.1 host, a RHEL 5.6 32-bit guest hangs at boot during installation. The install process never starts and the guest hangs. A 64-bit guest works well. This happens on both Intel and AMD hosts, and with URL, NFS, and CD installs.

Version-Release number of selected component (if applicable):
Host:2.6.32-118.el6.x86_64
qemu-kvm:
rpm -qa |grep qemu
gpxe-roms-qemu-0.9.7-6.4.el6.noarch
qemu-kvm-debuginfo-0.12.1.2-2.148.el6.x86_64
qemu-img-0.12.1.2-2.148.el6.x86_64
qemu-kvm-0.12.1.2-2.148.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.148.el6.x86_64

How reproducible:
mostly

Steps to Reproduce:
1. Set up the environment
#/usr/sbin/brctl addbr vbr0; \
#echo 1 > /proc/sys/net/ipv6/conf/vbr0/disable_ipv6; \
#echo 1 > /proc/sys/net/ipv4/ip_forward; \
#/usr/sbin/brctl stp vbr0 on; \
#/usr/sbin/brctl setfd vbr0 0; \
#ifconfig vbr0 192.168.58.1; \
#ifconfig vbr0 up; \
#iptables -t nat -A POSTROUTING \
-s 192.168.58.254/24 ! -d 192.168.58.254/24 -j MASQUERADE;
#dnsmasq --strict-order --bind-interfaces \
--listen-address 192.168.58.1 \
--dhcp-range 192.168.58.2,192.168.58.254 \
--enable-tftp --tftp-root /home/tests/kvm/images/tftpboot \
--dhcp-boot pxelinux.0 --dhcp-no-override


2. Install the guest with the following command line; the guest will hang during boot.
#qemu-kvm -name 'vm1' -chardev socket,id=human_monitor_Qefe,path=/tmp/monitor-humanmonitor1-20110301-110905-SWMy,server,nowait -mon chardev=human_monitor_Qefe,mode=readline -chardev socket,id=serial_RLn3,path=/tmp/serial-20110301-110905-SWMy,server,nowait -device isa-serial,chardev=serial_RLn3 -drive file='/usr/auto/test/autotest-devel/client/tests/kvm/images/RHEL-Server-5.6-32.qcow2',index=0,if=none,id=drive-ide0-0-0,media=disk,cache=none,format=qcow2,aio=native -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -device rtl8139,netdev=idaAkxCQ,mac=9a:2e:3f:52:e6:dd,id=ndev00idaAkxCQ,bus=pci.0,addr=0x3 -netdev tap,id=idaAkxCQ,ifname='t0-110905-SWMy',script='/usr/auto/test/autotest-devel/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 2046 -smp 2,cores=1,threads=1,sockets=2 -drive file='/usr/auto/test/autotest-devel/client/tests/kvm/isos/linux/RHEL-Server-5.6-i386-DVD.iso',index=1,if=none,id=drive-ide0-0-1,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive file='/usr/auto/test/autotest-devel/client/tests/kvm/isos/../images/rhel56-32/ks.iso',index=2,if=none,id=drive-ide0-1-0,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -cpu cpu64-rhel6,+sse2,+x2apic -kernel '/usr/auto/test/autotest-devel/client/tests/kvm/images/rhel56-32/vmlinuz' -initrd '/usr/auto/test/autotest-devel/client/tests/kvm/images/rhel56-32/initrd.img' -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host,driftfix=none  -boot order=cdn,once=n,menu=off   -usbdevice tablet -no-kvm-pit-reinjection --append 'ks=cdrom nicdelay=60 console=ttyS0,115200 console=tty0' -enable-kvm


Expected results:
Guest installation finishes successfully.

Additional info:
1.Host cpuinfo
processor	: 2
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: AMD Phenom(tm) 8750 Triple-Core Processor
stepping	: 3
cpu MHz		: 1200.000
cache size	: 512 KB
physical id	: 0
siblings	: 3
core id		: 2
cpu cores	: 3
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
bogomips	: 4809.90
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate


2. Trace log (full log in attachment)
           qemu-8870  [002]  3261.979612: kvm_inj_virq: irq 239
            qemu-8870  [002]  3261.979613: kvm_entry: vcpu 1
            qemu-8870  [002]  3261.979616: kvm_exit: reason npf rip 0xc0418934
            qemu-8870  [002]  3261.979616: kvm_page_fault: address fee000b0 error_code 6
            qemu-8870  [002]  3261.979619: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
            qemu-8870  [002]  3261.979620: kvm_apic: apic_write APIC_EOI = 0x0
            qemu-8870  [002]  3261.979621: kvm_entry: vcpu 1
            qemu-8870  [002]  3261.979624: kvm_exit: reason hlt rip 0xc0403c4a
            qemu-8869  [001]  3261.979755: kvm_exit: reason interrupt rip 0xc042d72b
            qemu-8869  [001]  3261.979755: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge) (coalesced)
            qemu-8869  [001]  3261.979756: kvm_entry: vcpu 0
            qemu-8869  [001]  3261.980521: kvm_exit: reason interrupt rip 0xc042d94f
            qemu-8869  [001]  3261.980521: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge) (coalesced)
            qemu-8869  [001]  3261.980521: kvm_entry: vcpu 0
            qemu-8869  [001]  3261.980601: kvm_exit: reason interrupt rip 0xc042d73d
            qemu-8869  [001]  3261.980602: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge) (coalesced)
            qemu-8869  [001]  3261.980602: kvm_entry: vcpu 0
            qemu-8870  [000]  3261.980631: kvm_apic_accept_irq: apicid 1 vec 239 (Fixed|edge)
            qemu-8870  [000]  3261.980642: kvm_inj_virq: irq 239
            qemu-8870  [000]  3261.980645: kvm_entry: vcpu 1
            qemu-8870  [000]  3261.980653: kvm_exit: reason npf rip 0xc0418934
            qemu-8870  [000]  3261.980654: kvm_page_fault: address fee000b0 error_code 6
            qemu-8870  [000]  3261.980672: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
            qemu-8870  [000]  3261.980673: kvm_apic: apic_write APIC_EOI = 0x0
            qemu-8870  [000]  3261.980676: kvm_entry: vcpu 1
            qemu-8870  [000]  3261.980687: kvm_exit: reason hlt rip 0xc0403c4a
            qemu-8869  [001]  3261.980753: kvm_exit: reason interrupt rip 0xc042d94c
            qemu-8869  [001]  3261.980754: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge) (coalesced)
            qemu-8869  [001]  3261.980754: kvm_entry: vcpu 0
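To summarize a trace like the excerpt above, one option is to count how often each (apicid, vector) pair is accepted versus marked "(coalesced)". In the excerpt, every vector 239 delivery to apicid 0 is coalesced, while apicid 1 accepts and acknowledges it. The following is only a rough sketch in Python: the regular expression assumes the text layout shown above, and the function name `count_accept_irq` is hypothetical, not part of any KVM tooling.

```python
import re
from collections import Counter

# Matches kvm_apic_accept_irq events in the trace text layout shown above.
ACCEPT_IRQ = re.compile(
    r"kvm_apic_accept_irq: apicid (?P<apicid>\d+) vec (?P<vec>\d+) "
    r"\((?P<mode>[^)]*)\)(?P<coalesced> \(coalesced\))?"
)

def count_accept_irq(lines):
    """Return (delivered, coalesced) Counters keyed by (apicid, vector)."""
    delivered = Counter()
    coalesced = Counter()
    for line in lines:
        m = ACCEPT_IRQ.search(line)
        if not m:
            continue  # not an accept_irq event, e.g. kvm_entry or kvm_exit
        key = (int(m.group("apicid")), int(m.group("vec")))
        if m.group("coalesced"):
            coalesced[key] += 1
        else:
            delivered[key] += 1
    return delivered, coalesced

# A few lines copied from the excerpt above.
SAMPLE = """\
qemu-8869  [001]  3261.979755: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge) (coalesced)
qemu-8869  [001]  3261.980521: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge) (coalesced)
qemu-8870  [000]  3261.980631: kvm_apic_accept_irq: apicid 1 vec 239 (Fixed|edge)
qemu-8870  [000]  3261.980642: kvm_inj_virq: irq 239
""".splitlines()

delivered, coalesced = count_accept_irq(SAMPLE)
print("delivered:", dict(delivered))
print("coalesced:", dict(coalesced))
```

Run against the whole attached log, a large coalesced count for one apicid with no deliveries would point at the vCPU that has stopped taking the interrupt.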

Comment 2 Joy Pu 2011-03-01 08:23:55 UTC
Created attachment 481551 [details]
Whole log for kvm trace

Comment 3 Joy Pu 2011-03-02 06:57:51 UTC
I also found this problem on some machines outside of the install process. The 32-bit RHEL 5.6 guest hangs while booting on a RHEL 6.1 host but not on a RHEL 5.6 host, so I set the priority to high.

Comment 4 Dor Laor 2011-03-02 11:13:22 UTC
Please help rnd analyze this - it's too hard to drill down the issue with so many parameters. Please drop the -kernel -initrd, -qxl and other devices and test if it works. If it does, bisect the devices to know who is the faulty one.
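One way to narrow down the options is to generate command-line variants that each drop a single option group and test them in turn. The sketch below is in Python and is only illustrative: it is a leave-one-out enumeration rather than a true bisection, the option values are copied from the command line in the description, the drive and network options are omitted for brevity, and the helper name `leave_one_out` is hypothetical. It only prints the variants and never launches qemu.

```python
# Base options and optional groups are copied from the command line in this
# report; the helper itself is hypothetical and only prints, it never runs qemu.
BASE = ["qemu-kvm", "-m", "2046", "-enable-kvm"]
OPTIONAL = {
    "smp": ["-smp", "2,cores=1,threads=1,sockets=2"],
    "vga": ["-vga", "qxl"],
    "usb": ["-usbdevice", "tablet"],
    "pit": ["-no-kvm-pit-reinjection"],
}

def leave_one_out(base, optional):
    """Yield (dropped_name, argv) with exactly one optional group removed."""
    for dropped in optional:
        argv = list(base)
        for name, group in optional.items():
            if name != dropped:
                argv.extend(group)
        yield dropped, argv

variants = dict(leave_one_out(BASE, OPTIONAL))
for dropped, argv in variants.items():
    print(f"without {dropped}: {' '.join(argv)}")
```

If the guest boots only with the variant that drops one particular group, that group is the likely trigger.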

Comment 5 Joy Pu 2011-03-02 13:08:08 UTC
(In reply to comment #4)
> Please help rnd analyze this - it's too hard to drill down the issue with so
> many parameters. Please drop the -kernel -initrd, -qxl and other devices and
> test if it works. If it does, bisect the devices to know who is the faulty one.

It seems the key parameter is -smp. I tried with this command line with -smp 2,cores=1,threads=1,sockets=2 and without -smp. It will only hang when using -smp 2,cores=1,threads=1,sockets=2.

command line:
/usr/auto/test/autotest-devel/client/tests/kvm/qemu -name 'vm1' -chardev socket,id=human_monitor_PulE,path=/tmp/monitor-humanmonitor1-20110301-110905-SWMy,server,nowait -mon chardev=human_monitor_PulE,mode=readline -chardev socket,id=serial_nYLc,path=/tmp/serial-20110301-110905-SWMy,server,nowait -device isa-serial,chardev=serial_nYLc -drive file='/usr/auto/test/autotest-devel/client/tests/kvm/images/RHEL-Server-5.6-32.qcow2',index=0,if=none,id=drive-ide0-0-0,media=disk,cache=none,format=qcow2,aio=native -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -device rtl8139,netdev=idkrV1T7,mac=9a:2e:3f:52:5b:9e,id=ndev00idkrV1T7,bus=pci.0,addr=0x3 -netdev tap,id=idkrV1T7,ifname='t0-110905-SWMy',script='/usr/auto/test/autotest-devel/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 2046 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic -vnc :0   -boot order=cdn,once=c,menu=off   -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm

I also used gdb to get the thread info for the hung guest, interrupting the process once the guest had already hung. Here are the results:


(gdb) c
Continuing.
[Thread 0x7f07fedfe700 (LWP 17315) exited]
[New Thread 0x7f07fedfe700 (LWP 17321)]
[Thread 0x7f07fedfe700 (LWP 17321) exited]
^C
Program received signal SIGINT, Interrupt.
0x0000003ae10de923 in select () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003ae10de923 in select () from /lib64/libc.so.6
#1  0x000000000040b8d0 in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4421
#2  0x000000000042b35a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2164
#3  0x000000000040eeb5 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4638
#4  main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6852
(gdb) info thread
  4 Thread 0x7f0805b54700 (LWP 17302)  0x0000003ae10dde87 in ioctl () from /lib64/libc.so.6
  3 Thread 0x7f0805150700 (LWP 17303)  0x0000003ae10dde87 in ioctl () from /lib64/libc.so.6
* 1 Thread 0x7f0805d7a940 (LWP 17284)  0x0000003ae10de923 in select () from /lib64/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f0805150700 (LWP 17303))]#0  0x0000003ae10dde87 in ioctl () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003ae10dde87 in ioctl () from /lib64/libc.so.6
#1  0x000000000042cf4f in kvm_run (env=0x1824010) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:927
#2  0x000000000042d3d9 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1663
#3  0x000000000042e11f in kvm_main_loop_cpu (_env=0x1824010) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1931
#4  ap_main_loop (_env=0x1824010) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1981
#5  0x0000003ae18077e1 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003ae10e5dcd in clone () from /lib64/libc.so.6
(gdb) thread 4
[Switching to thread 4 (Thread 0x7f0805b54700 (LWP 17302))]#0  0x0000003ae10dde87 in ioctl () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003ae10dde87 in ioctl () from /lib64/libc.so.6
#1  0x000000000042cf4f in kvm_run (env=0x180ae70) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:927
#2  0x000000000042d3d9 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1663
#3  0x000000000042e11f in kvm_main_loop_cpu (_env=0x180ae70) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1931
#4  ap_main_loop (_env=0x180ae70) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1981
#5  0x0000003ae18077e1 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003ae10e5dcd in clone () from /lib64/libc.so.6
(gdb)

Comment 6 Zachary Amsden 2011-03-03 01:59:07 UTC
We have an upstream report of a similar problem - 32 bit SMP guests hang, which bisected to a patch of mine.  It's very likely related.  I looked at the RHEL6 logic and it looks like my patches went in fine.

To rule this out or target it as suspect, I will prepare a brew build without my patches.

Comment 7 Joy Pu 2011-03-03 02:14:14 UTC
Modified the bug name according to comment 6.

Comment 10 Dor Laor 2011-03-22 13:06:05 UTC
*** Bug 688951 has been marked as a duplicate of this bug. ***

Comment 11 Joseph Kachuck 2011-03-23 13:40:13 UTC
=== In Red Hat Customer Portal Case 00434734 ===
--- Comment by IBM bug, proxy on 3/23/2011 2:31 AM ---

------- Comment From santwana.samantray.com 2011-03-23 02:28 EDT-------
Hi All,

I verified this issue on RHEL6.1 Beta(k.v-2.6.32-122.el6), and the issue is still reproducible.
While installing a RHEL5.6 32-bit guest using qemu-kvm, the installation halts.
During installation of RHEL5.6-64 bit guest, this issue isn't noticed.

Thanks,
Santwana

Comment 12 Joseph Kachuck 2011-03-24 15:51:12 UTC
=== In Red Hat Customer Portal Case 00434734 ===
--- Comment by IBM bug, proxy on 3/24/2011 8:21 AM ---

------- Comment From santwana.samantray.com 2011-03-24 08:17 EDT-------
Hello,

After a lot of trials using qemu-kvm as well as virt-manager, I could conclude that the issue is related to vCPUs.
When assigning just a single CPU to the guest, installation works fine using qemu-kvm as well as virt-manager.
When allocating more than one vCPU, the installation gets stuck.
selinux is in a Permissive mode.

Installation also works fine with a single CPU using an NFS-mounted path for the ISO.
The host is an 8-processor Intel(R) Xeon(R) system.

This issue isn't seen with a 64-bit guest, i.e., installation works fine even with vcpus > 1.

Thanks,
Santwana

Comment 13 Zachary Amsden 2011-03-24 23:03:57 UTC
I've posted a patch which should fix the regression at least .. although I'm not completely satisfied that even that solves the entire problem.  We may have a more subtle bug lying underneath this that just got exposed.

Comment 15 Laine Stump 2011-03-27 03:23:34 UTC
*** Bug 688951 has been marked as a duplicate of this bug. ***

Comment 16 IBM Bug Proxy 2011-03-27 03:39:14 UTC
Created attachment 487983 [details]
Created an attachment for dmesg, /var/log/messages and sosreport of the host, also xml file of the guest

Comment 17 IBM Bug Proxy 2011-04-05 12:57:12 UTC
------- Comment From santwana.samantray.com 2011-04-05 08:47 EDT-------
Hello Redhat,

This issue is still reproducible with the kernel version : 2.6.32-125.el6.x86_64.

Comment 18 Zachary Amsden 2011-04-05 21:34:48 UTC
Yes, bug is still there, patch has not been applied.

Comment 19 IBM Bug Proxy 2011-04-06 16:53:01 UTC
------- Comment From markwiz.com 2011-04-06 12:48 EDT-------
This bug blocks around 9% of our tests. If the fix is not included in Snap2, there is no way we can complete our testing before the End of Partner testing date.

Is there a place we can get the kernel with the patch applied so we can test it earlier?

Comment 20 Zachary Amsden 2011-04-06 16:59:01 UTC
Created attachment 490339 [details]
Patch fixing the problem

Comment 21 Zachary Amsden 2011-04-06 17:00:30 UTC
I've attached a patch which fixes the problems.  The patch is fairly trivial, but apparently waiting on some ack or flags to be set before being included in release?

Comment 22 Aristeu Rozanski 2011-04-07 13:51:31 UTC
Patch(es) available on kernel-2.6.32-130.el6

Comment 24 John Jarvis 2011-04-07 14:13:12 UTC
This fix is approved and planned for inclusion in snapshot 3.

Comment 25 Joy Pu 2011-04-08 05:42:01 UTC
Can reproduce with kernel-2.6.32-128.el6; kernel-2.6.32-130.el6 works well.
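For anyone checking hosts against the fixed build, a rough Python sketch of the comparison follows. It assumes the fix first appears in build 130, as stated in "Fixed In Version: kernel-2.6.32-130.el6", and it compares only the build number after "2.6.32-". This is a simplification, not RPM's version comparison, and it does not account for z-stream kernels that might carry a backport under a lower build number. The function names are hypothetical.

```python
import re

FIXED_BUILD = 130  # from "Fixed In Version: kernel-2.6.32-130.el6"

def el6_build_number(release):
    """Extract the build number N from a string like '2.6.32-N.el6[.arch]'."""
    m = re.search(r"2\.6\.32-(\d+)(?:\.\d+)*\.el6", release)
    if m is None:
        raise ValueError(f"not a RHEL 6 2.6.32 kernel release: {release!r}")
    return int(m.group(1))

def has_fix(release):
    """True if the build number is at or above the fixed build (simplified)."""
    return el6_build_number(release) >= FIXED_BUILD

# Kernel versions mentioned in this report.
for rel in ("2.6.32-118.el6.x86_64", "kernel-2.6.32-128.el6", "kernel-2.6.32-130.el6"):
    print(rel, has_fix(rel))
```

On a host, the input string would typically come from `uname -r`.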

Comment 26 IBM Bug Proxy 2011-04-08 06:12:57 UTC
------- Comment From santwana.samantray.com 2011-04-08 02:08 EDT-------
Hello Redhat,

I verified this issue on Snap2 kernel(2.6.32-128.el6.x86_64), after applying the patch, and the RHEL5.6 32-bit guest installation is working fine with vcpus > 1.

Comment 28 IBM Bug Proxy 2011-04-08 17:13:37 UTC
------- Comment From pradeepkumars.com 2011-04-08 13:02 EDT-------
*** Bug 71382 has been marked as a duplicate of this bug. ***

Comment 29 IBM Bug Proxy 2011-04-10 03:33:04 UTC
------- Comment From pradeepkumars.com 2011-04-09 23:24 EDT-------
*** Bug 71434 has been marked as a duplicate of this bug. ***

Comment 30 Miya Chen 2011-04-11 03:30:22 UTC
Moved to VERIFIED based on comment #25 and comment #26.

Comment 31 IBM Bug Proxy 2011-04-20 07:04:08 UTC
------- Comment From santwana.samantray.com 2011-04-20 02:59 EDT-------
Hello Redhat,

I verified this issue with RHEL6.1-Snap3 (k.v-2.6.32-130.el6), and the issue seems to be fixed. We can install the RHEL5.6 32-bit guest with smp > 1 .

Thanks for your support,
Santwana

Comment 32 errata-xmlrpc 2011-05-23 20:40:34 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

