Bug 504264 - vm use 100% cpu after migration on 5.4 host ( Intel i7 cpu)
Summary: vm use 100% cpu after migration on 5.4 host ( Intel i7 cpu)
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
: ---
Assignee: Glauber Costa
QA Contact: Lawrence Lim
URL:
Whiteboard:
Depends On: 501693
Blocks: LiveMigration
 
Reported: 2009-06-05 08:38 UTC by Suqin Huang
Modified: 2014-03-26 00:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-07-06 08:49:41 UTC
Target Upstream Version:
Embargoed:


Attachments
strace result (12.89 MB, text/plain)
2009-06-05 08:45 UTC, Suqin Huang
strace new (1.81 MB, text/plain)
2009-06-09 11:38 UTC, Suqin Huang
kvmtrace, cannot be read directly, needs to be analyzed first. I failed to analyze it on my machine (10.46 MB, application/octet-stream)
2009-06-09 11:41 UTC, Suqin Huang

Description Suqin Huang 2009-06-05 08:38:18 UTC
Description of problem:
The VM hangs after migration on a 5.4 host.

Version-Release number of selected component (if applicable):
kvm-83-60.el5

How reproducible:
always

Steps to Reproduce:
1. command on source host:
# qemu-kvm -no-hpet -usbdevice tablet  -rtc-td-hack -m 1G -uuid `uuidgen` -net nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup  -drive file=rhel-5.3-server-64.qcow2,if=ide -boot c -vnc :10
2. command on destination host:
# qemu-kvm -no-hpet -usbdevice tablet  -rtc-td-hack -m 1G -uuid `uuidgen` -net nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup  -drive file=rhel-5.3-server-64.qcow2,if=ide -boot c -incoming tcp:0:4567 -vnc :10
3. in the qemu monitor on the source host: migrate -d tcp:destination_ip:4567
  
Actual results:
The VM hangs after migration.

Expected results:


Additional info:

1. top

9835 root      15   0 1222m 1.0g 2492 S 105.5  8.8   0:33.29 qemu-kvm          
 9650 root      15   0 61572 6468 2548 S  2.0  0.1   0:39.49 ssh                
 5398 root      15   0 59712  15m 5500 S  0.3  0.1   0:14.45 Xvnc               
 9649 root      18   0 53844 1904 1456 S  0.3  0.0   0:02.56 scp                
 9854 root      15   0 12872 1216  816 R  0.3  0.0   0:00.01 top                
    1 root      15   0 10348  692  584 S  0.0  0.0   0:00.98 init               
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0        
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0        
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0         
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/1        
    6 root      34  19     0    0    0 S  0.0  0.0   0:13.39 ksoftirqd/1        
    7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1         
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/2        
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2        
   10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2         
   11 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/3        
   12 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/3

2. kvm_stat:
kvm statistics

 efer_reload                  0       0
 exits                  9859261  154336
 fpu_reload                  14       0
 halt_exits                1238       0
 halt_wakeup                 27       0
 host_state_reload      8952430  140499
 hypercalls                   0       0
 insn_emulation         2733548   42792
 insn_emulation_fail          0       0
 invlpg                       0       0
 io_exits               8950775  140501
 irq_exits                 3799      15
 irq_injections          424300    6532
 irq_window              408355    6293
 kvm_request_irq              0       0
 largepages                   0       0
 mmio_exits                  33       0
 mmu_cache_miss             194       0
 mmu_flooded                  0       0
 mmu_pde_zapped               0       0
 mmu_pte_updated              0       0
 mmu_pte_write                0       0

3. The host can ping the guest before migration.
4. The host cannot ping the guest after migration.
5. guest info:
    5.3server-64, 5.4server-64
6. host info:
Red Hat Enterprise Linux Server release 5.4 Beta (Tikanga) (RHEL5.4-Server-20090526.nightly/)

model name      : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz

7. strace result:
attached
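
A rough way to read the kvm_stat numbers above is to compute what share of all VM exits each counter accounts for. The sketch below is only an illustration, not part of the original report: it parses a few lines copied from the output in this description and assumes the first numeric column is the cumulative total and the second is the change over the last sampling interval. The function names are made up for this example.

```python
# A few counters copied from the kvm_stat output in this report.
# Assumed layout: name, cumulative total, change over the last interval.
KVM_STAT_SAMPLE = """\
 exits                  9859261  154336
 halt_exits                1238       0
 insn_emulation         2733548   42792
 io_exits               8950775  140501
 irq_exits                 3799      15
 mmio_exits                  33       0
"""


def parse_kvm_stat(text):
    """Return {counter: (total, last_interval)} for each 'name total rate' line."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            counters[parts[0]] = (int(parts[1]), int(parts[2]))
    return counters


def exit_share(counters, name):
    """Fraction of all VM exits (by cumulative total) attributed to `name`."""
    total_exits = counters["exits"][0]
    return counters[name][0] / total_exits if total_exits else 0.0


counters = parse_kvm_stat(KVM_STAT_SAMPLE)
for name in ("io_exits", "halt_exits", "mmio_exits"):
    print(f"{name:12s} {exit_share(counters, name):6.1%}")
```

With these figures, io_exits make up roughly 90% of all exits while halt_exits are not increasing at all, which suggests (but does not prove) that the guest is looping on port I/O rather than idling.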

Comment 1 Suqin Huang 2009-06-05 08:45:54 UTC
Created attachment 346617 [details]
strace result

Comment 2 Lawrence Lim 2009-06-05 13:01:31 UTC
Other OS OK?

Comment 3 Suqin Huang 2009-06-05 13:17:24 UTC
Migration works on a 5.3 host.

Comment 6 Yaniv Kaul 2009-06-05 16:42:42 UTC
Please pin the guest to a specific CPU (using taskset) and run kvmtrace for 1 second.
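
For the pinning step, a minimal sketch of how the taskset command line could be assembled is shown below. It is an illustration only: the helper name is hypothetical, PID 9835 is the qemu-kvm process from the top output in the description, and CPU 0 is an arbitrary choice. The kvmtrace invocation is left out because its options are not given in this thread.

```python
def taskset_argv(pid, cpu):
    """Build `taskset -cp <cpu> <pid>`, which pins an existing process to one CPU."""
    return ["taskset", "-cp", str(cpu), str(pid)]


# PID taken from the `top` output in the description; CPU 0 is an assumption.
argv = taskset_argv(9835, 0)
print(" ".join(argv))
```

The list can be passed to subprocess.run on the host once the real PID is known.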

Comment 9 Dor Laor 2009-06-08 19:26:10 UTC
Please re-test using kvm-83-72

Comment 10 Suqin Huang 2009-06-09 11:37:34 UTC
the same result:
1. command used:
qemu-kvm -no-hpet -usbdevice tablet  -rtc-td-hack -m 1G -uuid `uuidgen` -net nic,model=e1000,macaddr=00:1a:4a:16:66:86,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup -drive file=5.4s-64.qcow2,if=ide -boot c -vnc :10
2. top
5547 root      15   0 1223m 1.0g 2488 S 100.9  8.8   0:35.12 qemu-kvm
    1 root      15   0 10348  692  584 S  0.0  0.0   0:01.12 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/1
    6 root      34  19     0    0    0 S  0.0  0.0   0:01.30 ksoftirqd/1
    7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/2
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2
   10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2
   11 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/3
   12 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/3
   13 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3
   14 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/4
   15 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/4

3. kvm_stat:
 efer_reload                  0       0
 exits                 11948201  158472
 fpu_reload                  46       0
 halt_exits                6092       0
 halt_wakeup                 34       0
 host_state_reload     10836795  144278
 hypercalls                   0       0
 insn_emulation         3312346   43889
 insn_emulation_fail          0       0
 invlpg                       0       0
 io_exits              10829757  144276
 irq_exits                 2432      31
 irq_injections          519128    6655
 irq_window              493493    6510
 kvm_request_irq              0       0
 largepages                   0       0
 mmio_exits                 450       0
 mmu_cache_miss             484       0
 mmu_flooded                  0       0
 mmu_pde_zapped               0       0
 mmu_pte_updated              0       0

4. strace: attached
5. kvmtrace: attached
6. kvm version:
kvm-83-73.el5

Comment 11 Suqin Huang 2009-06-09 11:38:35 UTC
Created attachment 347008 [details]
strace new

Comment 12 Suqin Huang 2009-06-09 11:41:13 UTC
Created attachment 347009 [details]
kvmtrace, cannot be read directly, needs to be analyzed first. I failed to analyze it on my machine

Comment 13 Dor Laor 2009-06-09 12:00:15 UTC
(In reply to comment #0)
> Description of problem:
> vm is hang after migration on 5.4 host
> 
> Version-Release number of selected component (if applicable):
> kvm-83-60.el5
> 
> How reproducible:
> always
> 
> Steps to Reproduce:
> 1. command on source host:
> # qemu-kvm -no-hpet -usbdevice tablet  -rtc-td-hack -m 1G -uuid `uuidgen` -net
> nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net
> tap,vlan=0,script=/etc/qemu-ifup  -drive file=rhel-5.3-server-64.qcow2,if=ide
> -boot c -vnc :10
          
               ^^^^

> 2. command on destination host:
> # qemu-kvm -no-hpet -usbdevice tablet  -rtc-td-hack -m 1G -uuid `uuidgen` -net
> nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net
> tap,vlan=0,script=/etc/qemu-ifup  -drive file=rhel-5.3-server-64.qcow2,if=ide
> -boot c -incoming tco:0:4567 -vnc :10

                       ^^^          ^^^^^
> 3.

I can't see how this works since according to this cmd line you use the same host for the destination and both src/dst use -vnc :10.

Comment 14 Suqin Huang 2009-06-09 12:09:38 UTC
(In reply to comment #13)
> (In reply to comment #0)
> > Description of problem:
> > vm is hang after migration on 5.4 host
> > 
> > Version-Release number of selected component (if applicable):
> > kvm-83-60.el5
> > 
> > How reproducible:
> > always
> > 
> > Steps to Reproduce:
> > 1. command on source host:
> > # qemu-kvm -no-hpet -usbdevice tablet  -rtc-td-hack -m 1G -uuid `uuidgen` -net
> > nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net
> > tap,vlan=0,script=/etc/qemu-ifup  -drive file=rhel-5.3-server-64.qcow2,if=ide
> > -boot c -vnc :10
> 
>                ^^^^
> 
> > 2. command on destination host:
> > # qemu-kvm -no-hpet -usbdevice tablet  -rtc-td-hack -m 1G -uuid `uuidgen` -net
> > nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net
> > tap,vlan=0,script=/etc/qemu-ifup  -drive file=rhel-5.3-server-64.qcow2,if=ide
> > -boot c -incoming tcp:0:4567 -vnc :10
> 
>                        ^^^          ^^^^^
> > 3.
> 
> I can't see how this works since according to this cmd line you use the same
> host for the destination and both src/dst use -vnc :10.  

They are on different machines.
The destination host just listens for the source host, so it uses -incoming tcp:0:4567, and on the source host the (qemu) monitor command is: migrate -d tcp:destination_ip:4567

Since they are on different machines, both can use -vnc :10 at the same time.
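
The setup discussed here (same guest configuration on both hosts, with only the listening and display options differing) can be checked mechanically. The sketch below is an illustration under stated assumptions, not a tool used in this bug: the helper names are hypothetical, the command lines are shortened versions of the ones in the description, and 192.0.2.10 is a placeholder destination address. It treats -incoming, -vnc and -uuid as per-host options because those are the ones that differ in the commands posted in this report.

```python
import shlex

# Options allowed to differ between source and destination; each takes one value.
# -uuid is listed because the posted commands generate it with `uuidgen`.
PER_HOST_OPTIONS = {"-incoming", "-vnc", "-uuid"}

SRC_CMD = (
    "qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -m 1G "
    "-net nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 "
    "-net tap,vlan=0,script=/etc/qemu-ifup "
    "-drive file=rhel-5.3-server-64.qcow2,if=ide -boot c -vnc :10"
)
DST_CMD = SRC_CMD.replace("-vnc :10", "-incoming tcp:0:4567 -vnc :10")


def guest_config(cmdline):
    """Tokenize a qemu-kvm command line, dropping per-host options and their values."""
    kept, skip_next = [], False
    for tok in shlex.split(cmdline):
        if skip_next:
            skip_next = False
        elif tok in PER_HOST_OPTIONS:
            skip_next = True
        else:
            kept.append(tok)
    return kept


def same_guest(src_cmd, dst_cmd):
    """True when both command lines describe the same guest, ignoring per-host options."""
    return guest_config(src_cmd) == guest_config(dst_cmd)


def migrate_command(dst_ip, port):
    """Monitor command to type on the source host."""
    return f"migrate -d tcp:{dst_ip}:{port}"


print(same_guest(SRC_CMD, DST_CMD))
print(migrate_command("192.0.2.10", 4567))
```

The comparison is order-sensitive, which is stricter than qemu requires; it is kept that way here for simplicity.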

Comment 15 Suqin Huang 2009-06-15 08:21:24 UTC
host: AMD 
=>pass

host: intel (not i7)
=>pass

host: intel (i7):
ept enabled -> ept disabled => failed
ept disabled -> ept disabled => failed
ept enabled -> ept enabled => failed
guests include rhel5.3, rhel5.4, rhel4.8, windows2008 => all failed

Comment 16 Dor Laor 2009-06-15 12:35:37 UTC
(In reply to comment #15)
> host: AMD 
> =>pass
> 
> host: intel (not i7)
> =>pass
> 
> host: intel (i7):
> ept enabled -> ept disabled => failed
> ept disabled -> ept disabled => failed
> ept enabled -> ept enabled => failed
> guest include rhel5.3, rhel5.4,rhel4.8, window2008 => all failed  

Did you use -cpu qemu64,+sse2? I don't see it above.
Please retest with that.

Comment 17 Suqin Huang 2009-06-16 06:31:31 UTC
1. command used:
source host:
qemu-kvm -no-hpet -cpu qemu64,+sse2 -rtc-td-hack -m 1G -uuid `uuidgen` -net nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup -drive file=RHEL-5.4-32.qcow2,if=ide,boot=on -boot c -vnc :6

destination host:
qemu-kvm -no-hpet -cpu qemu64,+sse2 -rtc-td-hack -m 1G -uuid `uuidgen` -net nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup -drive file=RHEL-5.4-32.qcow2,if=ide,boot=on -boot c -vnc :8 -incoming tcp:0:4586

2. Test on Beta5:
(qemu)info migration
Migration status: failed

3. Test on 5.4 nightly(0602)
After migration, the destination guest's screen is black, the source guest hangs, and the monitor on the source guest does not respond.


you can try on my machine
10.66.70.6 & 10.66.70.31 
passwd: redhat

Comment 18 Suqin Huang 2009-06-16 07:07:47 UTC
command used:
source host:
#qemu-kvm -no-hpet -cpu qemu64,+sse2 -rtc-td-hack -m 1G -uuid 2406a5a1-d905-4f42-bf11-b1f6069fa8d3 -net
nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net
tap,vlan=0,script=/etc/qemu-ifup -drive file=RHEL-5.4-32.qcow2,if=ide,boot=on
-boot c -vnc :6

destination host:
#qemu-kvm -no-hpet -cpu qemu64,+sse2 -rtc-td-hack -m 1G -uuid 2406a5a1-d905-4f42-bf11-b1f6069fa8d3 -net
nic,model=e1000,macaddr=00:1a:4a:16:97:86,vlan=0 -net
tap,vlan=0,script=/etc/qemu-ifup -drive file=RHEL-5.4-32.qcow2,if=ide,boot=on
-boot c -vnc :8 -incoming tcp:0:4568

Comment 19 Dor Laor 2009-06-16 08:11:19 UTC
I just ran it now. I hope we didn't kill the image, since the guest cannot even boot. The kernel crashed.
Can you re-start with a fresh image?
Also, DO NOT use boot=on for ide.

Comment 20 lihuang 2009-06-23 10:43:52 UTC
Cannot reproduce the issue on an i7 5U4 host. Migration completes successfully.
CPU usage changes from 0.3%~0.9% to 0% on the src host, and from 0% to 0.3%~1% on the dst host.

Tested ping-pong migration
1. enable_ept=1 <==> enable_ept=1
2. enable_ept=1 <==> enable_ept=0
3. enable_ept=0 <==> enable_ept=0

[root@intel-i7-12-2 ~]# dmesg | tail
switch: port 2(vm1) entering learning state
vm1: no IPv6 routers present
switch: topology change detected, propagating
switch: port 2(vm1) entering forwarding state
kvm: 22176: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 22176: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffd76970
kvm: 22176: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
kvm: 22176: cpu1 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 22176: cpu1 unimplemented perfctr wrmsr: 0xc1 data 0xffd76970
kvm: 22176: cpu1 unimplemented perfctr wrmsr: 0x186 data 0x530079


CLI 1. A freshly installed 5u3 guest using IDE block and e1000 NIC
[root@intel-i7-12-2 images]# /usr/libexec/qemu-kvm -cpu qemu64,+sse2 -drive file=rhel5u3-server-x86_64.qcow2,cache=off,format=qcow2,index=0 -no-hpet -usbdevice tablet -rtc-td-hack -name vm1 -smp 2 -m 2048 -boot c -net nic,vlan=1,macaddr=00:1a:4a:fe:32:01,model=e1000 -net tap,vlan=1,ifname=vm1,script=/etc/qemu-ifup -vnc :12 -monitor stdio -incoming tcp:0:5800

CLI 2. A freshly installed 5u4server guest using virtio block and virtio NIC
[root@intel-i7-12-2 images]# /usr/libexec/qemu-kvm -cpu qemu64,+sse2 -drive file=rhel5u3.virtio,if=virtio,boot=on,cache=off,format=qcow2,index=0 -no-hpet -usbdevice tablet -rtc-td-hack -name vm1 -smp 2 -m 2048 -boot c -net nic,vlan=1,macaddr=00:1a:4a:fe:32:02,model=virtio -net tap,vlan=1,ifname=vm1,script=/etc/qemu-ifup -vnc :12 -monitor stdio -incoming tcp:0:5800


qemu monitor cmd: migrate  -d tcp:$dst_ip:5800


[root@intel-i7-12-2 images]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.4 Beta (Tikanga)
[root@intel-i7-12-2 images]# rpm -q kernel kvm
kernel-2.6.18-152.el5
kvm-83-77.el5

The same test is running on rhevh 5.4-2.0.99 (7.1). Results coming soon.

Comment 21 lihuang 2009-06-23 12:11:20 UTC
> 
> CLI 1 . a fresh install 5u3 guest using ide block and e1000 NIC
> [root@intel-i7-12-2 images]# /usr/libexec/qemu-kvm -cpu qemu64,+sse2 -drive
> file=rhel5u3-server-x86_64.qcow2,cache=off,format=qcow2,index=0 -no-hpet
> -usbdevice tablet -rtc-td-hack -name vm1 -smp 2 -m 2048 -boot c -net
> nic,vlan=1,macaddr=00:1a:4a:fe:32:01,model=e1000 -net
> tap,vlan=1,ifname=vm1,script=/etc/qemu-ifup -vnc :12 -monitor stdio -incoming
> tcp:0:5800
> 
> CLI 2 . a fresh instll 5u4server guest using virtio block and virtio NIC
> [root@intel-i7-12-2 images]# /usr/libexec/qemu-kvm -cpu qemu64,+sse2 -drive
> file=rhel5u3.virtio,if=virtio,boot=on,cache=off,format=qcow2,index=0 -no-hpet
> -usbdevice tablet -rtc-td-hack -name vm1 -smp 2 -m 2048 -boot c -net
> nic,vlan=1,macaddr=00:1a:4a:fe:32:02,model=virtio -net
> tap,vlan=1,ifname=vm1,script=/etc/qemu-ifup -vnc :12 -monitor stdio -incoming
> tcp:0:5800
> 

also tested 4G/8G mem. => pass 
(host has 12G RAM)

Comment 22 lihuang 2009-06-23 12:41:57 UTC
Cannot reproduce this issue on a non-Intel-i7 host.
Per comment #15, updated the bug summary.


CPU info:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
stepping	: 4
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr popcnt lahf_lm
bogomips	: 5319.89
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management: [8]
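
To compare hosts quickly, the relevant fields of this output can be extracted with a few lines of code. The sketch below is illustrative only: it parses a subset of the lines pasted above (the function name is made up) and checks for the vmx flag. The flags line in this output has no EPT entry, so the sketch does not try to detect EPT.

```python
# Subset of the /proc/cpuinfo output pasted in this comment.
CPUINFO_SAMPLE = """\
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr popcnt lahf_lm
"""


def parse_cpuinfo(text):
    """Return {field: value} for one processor block of /proc/cpuinfo text."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


info = parse_cpuinfo(CPUINFO_SAMPLE)
flags = set(info["flags"].split())

print(info["vendor_id"], "family", info["cpu family"], "model", info["model"])
print("vmx (Intel VT-x) present:", "vmx" in flags)
```

On a live host the same function could be fed the first processor block read from /proc/cpuinfo.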

Comment 23 lihuang 2009-06-23 16:46:36 UTC
When I ran the test on:

[root@dhcp-66-70-6 ~]# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor release 5.4-2.0.99 (7.1)
[root@dhcp-66-70-6 ~]# rpm -q kvm
kvm-83-77.el5
[root@dhcp-66-70-6 ~]# rpm -q kernel
kernel-2.6.18-154.el5

I also cannot reproduce the original bug (CPU usage reaching 100%).

CLI:
root     30111 29041  0 13:35 pts/3    00:00:01 /usr/libexec/qemu-kvm -drive file=/mnt/rhel53-64-ser-virtio.qcow2,if=virtio,cache=off,index=0,boot=on -net nic,macaddr=20:20:20:00:18:59,model=virtio -net tap,script=/etc/qemu-ifup -rtc-td-hack -no-hpet -usbdevice tablet -cpu qemu64,+sse2 -smp 2 -m 4096 -vnc :12 -monitor stdio -boot c -incoming tcp:0:5800


But there is another issue that is not 100% reproducible.
I opened a new bug for it:

  Bug 507659 -  Migrate command not end and vm responseless on Nahalem host

Comment 24 Dor Laor 2009-07-06 07:20:27 UTC
So can you close this bug?

Comment 25 lihuang 2009-07-06 08:49:41 UTC
Please reopen if you can reproduce the original issue

