Bug 599034 - RHEL5.5 i386 guest installation stops at "running /sbin/loader" when using 8 vCPUs
Summary: RHEL5.5 i386 guest installation stops at "running /sbin/loader" when using 8 vCPUs
Keywords:
Status: CLOSED DUPLICATE of bug 570824
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.6
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Marcelo Tosatti
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: Rhel5KvmTier1
 
Reported: 2010-06-02 14:52 UTC by lihuang
Modified: 2013-01-09 22:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-06-11 02:32:56 UTC
Target Upstream Version:
Embargoed:


Attachments
kvmstrace.tar.gz (376.79 KB, application/x-gzip)
2010-06-02 15:14 UTC, lihuang

Description lihuang 2010-06-02 14:52:49 UTC

Comment 1 lihuang 2010-06-02 15:04:02 UTC
Description of problem:
Tested Windows 2008 R2, RHEL5.5 x86_64, and RHEL5.5 i386 guests.
Only the RHEL5.5 i386 installation stops at "running /sbin/loader" when "-smp 8" is set.



Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 5.5-2.2 (3)
kvm-83-164.el5_5.10
kernel-2.6.18-194.3.1.el5

How reproducible:
Nearly 100%.
2~3 times it stopped at the "probe video card" stage.


Steps to Reproduce:
1. install rhel5.5 i386 guest with -smp 8
/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -drive file=rhel5u5.raw,media=disk,if=ide,cache=none -smp 8 -m 8G -cpu qemu64,+sse2 -vnc :11 -monitor stdio -net nic,vlan=0,macaddr=20:20:28:99:01:19,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup -uuid c0a7deee-62bb-4771-86a6-95b728415aca -boot dn -no-kvm-pit-reinjection

2.
3.
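
The reproduction step above can be scripted so that only the vCPU count changes between runs. This is a minimal sketch in Python, assuming the same binary path and flags as the command in step 1. The helper name `build_qemu_cmd` and its `image` parameter are illustrative and not part of the original report. The function only builds the argument list; launching the guest is left to the caller.

```python
def build_qemu_cmd(smp, image="rhel5u5.raw"):
    """Return the qemu-kvm argument list from step 1 with a chosen -smp value.

    All flags are copied from the reported command line; only the vCPU
    count and the disk image path are parameters (hypothetical helper).
    """
    return [
        "/usr/libexec/qemu-kvm",
        "-no-hpet",
        "-usbdevice", "tablet",
        "-rtc-td-hack",
        "-drive", "file=%s,media=disk,if=ide,cache=none" % image,
        "-smp", str(smp),
        "-m", "8G",
        "-cpu", "qemu64,+sse2",
        "-vnc", ":11",
        "-monitor", "stdio",
        "-net", "nic,vlan=0,macaddr=20:20:28:99:01:19,model=e1000",
        "-net", "tap,vlan=0,script=/etc/qemu-ifup",
        "-uuid", "c0a7deee-62bb-4771-86a6-95b728415aca",
        "-boot", "dn",
        "-no-kvm-pit-reinjection",
    ]
```

Calling `build_qemu_cmd(8)` gives the failing configuration, and the resulting list can be passed to `subprocess.call` on the host.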

Actual results:


Expected results:


Additional info:

Comment 2 lihuang 2010-06-02 15:14:25 UTC
Created attachment 419067
kvmstrace.tar.gz

Attached is a kvmstrace capture covering 2 seconds.

Comment 3 lihuang 2010-06-02 15:20:37 UTC
top - 15:07:48 up  9:39,  4 users,  load average: 1.25, 2.09, 1.87
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.5%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  1054284312k total,  7793200k used, 1046491112k free,   153904k buffers
Swap:  5484536k total,        0k used,  5484536k free,  6331696k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                         
12622 root      15   0 8493m 531m 4024 S  0.0  0.1  41:01.28 qemu-kvm                        



kvm statistics

 efer_reload                  8       0
 exits                139046534   20182
 fpu_reload             1029659       0
 halt_exits            43385891    7997
 halt_wakeup             182210      57
 host_state_reload     45862806    8165
 hypercalls                   0       0
 insn_emulation        69693572    9125
 insn_emulation_fail          0       0
 invlpg                 1308131       0
 io_exits              11365440    2024
 irq_exits              2520626      11
 irq_injections        49305075    9022
 irq_window             4713540    1022
 kvm_request_irq              0       0
 largepages                   0       0
 mmio_exits              971746      75
 mmu_cache_miss           10378       0
 mmu_flooded               6541       0
 mmu_pde_zapped            9021       0
 mmu_pte_updated          59308       0
 mmu_pte_write           190838       0
 mmu_recycled                 0       0
 mmu_shadow_zapped        14510       0
 mmu_unsync                  18       0
 mmu_unsync_global            0       0
 nmi_injections               0       0
 nmi_window                   0       0
 pf_fixed                967459       0
 pf_guest                 45571       0
 remote_tlb_flush        254110       0
 request_nmi                  0       0
 signal_exits                17       0
 tlb_flush             11527535       0
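
To compare counters in a dump like the one above, the rates can be expressed relative to the overall exit rate. This is a rough sketch in Python, assuming each line has the form `name total rate` and that the last column is the per-interval delta. The function names and the `SAMPLE` excerpt (a few lines copied from the dump) are illustrative only.

```python
def parse_kvm_stat(text):
    """Parse 'name total rate' lines into {name: (total, rate)}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and headers
        name, total, rate = parts
        if total.isdigit() and rate.isdigit():
            stats[name] = (int(total), int(rate))
    return stats


def rate_relative_to(stats, base="exits"):
    """Return {name: rate / base rate} for counters with a nonzero rate."""
    base_rate = stats[base][1]
    if base_rate == 0:
        return {}
    return {
        name: rate / base_rate
        for name, (_total, rate) in stats.items()
        if rate > 0 and name != base
    }


# Excerpt of the statistics above, used only as sample input.
SAMPLE = """
 exits                139046534   20182
 halt_exits            43385891    7997
 insn_emulation        69693572    9125
 io_exits              11365440    2024
 irq_injections        49305075    9022
 pf_fixed                967459       0
"""
```

With this sample, `rate_relative_to(parse_kvm_stat(SAMPLE))["halt_exits"]` is 7997 / 20182, so halt exits run at roughly 40% of the total exit rate.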


CPU info :

processor       : 95
vendor_id       : GenuineIntel
cpu family      : 6
model           : 29
model name      : Intel(R) Xeon(R) CPU           E7450  @ 2.40GHz
stepping        : 1
cpu MHz         : 2398.852
cache size      : 12288 KB
physical id     : 15
siblings        : 6
core id         : 5
cpu cores       : 6
apicid          : 125
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips        : 4798.22
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

[root@dhcp-66-83-236 ~]# cat /proc/meminfo 
MemTotal:     1054284312 kB
MemFree:      1044514964 kB
Buffers:        136348 kB
Cached:        8273128 kB
SwapCached:          0 kB
Active:        3681176 kB
Inactive:      5375216 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     1054284312 kB
LowFree:      1044514964 kB
SwapTotal:     5484536 kB
SwapFree:      5484536 kB
Dirty:              56 kB
Writeback:           0 kB
AnonPages:      647148 kB
Mapped:          15536 kB
Slab:           377860 kB
PageTables:       6740 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  532626692 kB
Committed_AS: 10259836 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    495208 kB
VmallocChunk: 34359243059 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

Comment 4 lihuang 2010-06-02 15:52:02 UTC
More Test result

install RHEL5.5 i686 guest. ( cdrom,pxe boot)
-smp 1/2/4 PASS
-smp 8 
    virtio blk/ide          FAIL
    virtio net/e1000        FAIL
    with/without -no-acpi   FAIL
 
boot a pre-installed RHEL5.5 i686 guest.
-smp 8
    with/without -no-acpi   FAIL (stops at "starting udev", process takes 0% CPU)
    with kernel line   clocksource=acpi_pm   PASS.
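
When repeating the passing case, it helps to confirm that the guest really booted with the clocksource override. This is a small sketch in Python, assuming the guest exposes its boot parameters in `/proc/cmdline`. The helper names and the example command line in the usage note are hypothetical.

```python
def clocksource_param(cmdline):
    """Return the value of clocksource= on a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("clocksource="):
            return token.split("=", 1)[1]
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the kernel command line of the running system."""
    with open(path) as f:
        return f.read()
```

Inside the guest, `clocksource_param(read_cmdline())` should return `"acpi_pm"` for the PASS configuration. For a made-up command line such as `"ro quiet clocksource=acpi_pm"` it returns `"acpi_pm"`, and it returns `None` when the parameter is absent.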

Comment 6 lihuang 2010-06-02 16:20:32 UTC
(In reply to comment #4)
> More Test result
> 
> install RHEL5.5 i686 guest. ( cdrom,pxe boot)
> -smp 1/2/4 PASS
> -smp 8 
>     virtio blk/ide          FAIL
>     virtio net/e1000        FAIL
>     with/without -no-acpi   FAIL
> 
Using -kernel .. -initrd .. -append clocksource=acpi_pm also passes the test.


> boot a pre-installed RHEL5.5 i686 guest.
> -smp 8
>     with/without -no-acpi   FAIL (stops at "starting udev", process takes 0% CPU)
>     with kernel line   clocksource=acpi_pm   PASS.

Comment 8 lihuang 2010-06-04 04:23:11 UTC
FYI. 
So far the bug has only been reproduced on the E7450, 96 CPU host.

Tested on another Intel host, E5520 with 16 CPUs: PASS with -smp 8/16.


processor       : 15
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 2261.061
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 4522.04
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: [8]

Comment 9 XinSun 2010-06-04 04:30:23 UTC
I can also reproduce this problem when using rhevm (sm71) to install a VM with 8 vCPUs (either 8 sockets with 1 core per socket, or 2 sockets with 4 cores per socket) on the 96 CPU host (rhev-hypervisor-5.5-2.2.4).

Comment 10 Lawrence Lim 2010-06-04 04:51:42 UTC
Is it a regression? If so, since which version?

Comment 11 Marcelo Tosatti 2010-06-04 23:15:00 UTC
lihuang, 

Can you please arrange access to this machine (or a similar Intel Xeon with >= 8 CPUs) and RHEV 5.5-2.2 (3)?

There is not enough information to clarify the issue.

Comment 12 Dor Laor 2010-06-06 11:48:32 UTC
Could it be a pvclock issue, since clocksource=acpi_pm does work?

Comment 13 lihuang 2010-06-06 16:53:24 UTC
(In reply to comment #11)
> lihuang, 
> 
> Can you please arrange access to this machine (or similar Intel Xeon with >= 8
> CPUs) and RHEV 5.5-2.2 (3). 
> 
> There is not enough information to clarify the issue.    

I guess you need a host that can reproduce the bug, but so far we have only reproduced it on the E7450 (96 CPU host), and that machine is unavailable until this Thursday.

I will update the BZ when it is ready.

Comment 14 lihuang 2010-06-06 16:55:31 UTC
(In reply to comment #10)
> Is it a regression? If so, since which version?    

Will test previous versions when the machine is available.

Comment 15 Lawrence Lim 2010-06-07 02:39:09 UTC
What about the 16 CPU host? After Thu, it's already RC.

Comment 16 lihuang 2010-06-07 08:31:56 UTC
Already tested on the 16 CPU host; the result is PASS. See 599034#c8.

Comment 17 Glauber Costa 2010-06-07 14:14:14 UTC
If this is in fact proven to happen only with kvmclock, it would be good to test with the new set of patches that just got committed to RHEL5, and make sure they work.

I just checked, and kernel-2.6.18-202.el5 has all the patches you need.

Comment 18 lihuang 2010-06-08 08:33:04 UTC

(In reply to comment #4)
> More Test result
> 
> install RHEL5.5 i686 guest. ( cdrom,pxe boot)
> -smp 1/2/4 PASS
> -smp 8 
>     virtio blk/ide          FAIL
>     virtio net/e1000        FAIL
>     with/without -no-acpi   FAIL
> 
> boot a pre-installed RHEL5.5 i686 guest.
> -smp 8
>     with/without -no-acpi   FAIL (stops at "starting udev", process takes 0% CPU)
>     with kernel line   clocksource=acpi_pm   PASS.    

Updated kernel to -202.el5: boot PASS (using kvm-clock).

Comment 19 Dor Laor 2010-06-10 15:07:20 UTC
Can you close it?

Comment 20 Glauber Costa 2010-06-11 02:32:56 UTC

*** This bug has been marked as a duplicate of bug 570824 ***

