Bug 601045 - guest migration turns failed by the end (16G + stress load)
Summary: guest migration turns failed by the end (16G + stress load)
Keywords:
Status: CLOSED DUPLICATE of bug 513765
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.6
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Juan Quintela
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 599330
Depends On: 599330
Blocks: 565939 568128 Rhel5KvmTier1 643970 645188
Reported: 2010-06-07 05:22 UTC by lihuang
Modified: 2013-01-11 03:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 599330
Clones: 643970 645188
Environment:
Last Closed: 2010-11-24 13:40:58 UTC
Target Upstream Version:
Embargoed:


Attachments
kvmtrace in test 1 (5.70 MB, application/x-gzip)
2010-06-07 05:27 UTC, lihuang
kvmtrace in test 2 (204.19 KB, application/x-gzip)
2010-06-07 05:28 UTC, lihuang

Description lihuang 2010-06-07 05:22:09 UTC
Cloned to the kvm component.

More Test:
1. 16g v-mem guest + stress load ( stress -c 3 --vm 12 --vm-bytes 1G )
=> FAILED
remaining ram stuck at 1842200 kbytes 

Migration status: active
transferred ram: 14956920 kbytes
remaining ram: 1842200 kbytes
total ram: 16797708 kbytes

2. 16g v-mem guest + stress load ( stress -c 3 --vm 12 --vm-bytes 256M )
=> FAILED
remaining ram stuck at 2987840 kbytes 
Migration status: active
transferred ram: 29963348 kbytes
remaining ram: 2987840 kbytes
total ram: 16797708 kbytes 

There is a similar bug about migration without load: bug 513765.

3. 8g  v-mem guest + stress load ( stress -c 3 --vm 12 --vm-bytes 1G )
==> PASS
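
The monitor figures from tests 1 and 2 can be cross-checked with a little arithmetic. The sketch below is illustrative only: `parse_migrate_status` and `passes_over_ram` are hypothetical helper names, not part of qemu-kvm. It parses the status text quoted above and reports how many times the guest's total RAM has effectively been sent. A ratio above 1, as in test 2, would be consistent with pages being re-sent because the guest keeps dirtying them, but that reading is an assumption, not something stated in this report.

```python
import re

# Status text copied from tests 1 and 2 above.
TEST1 = """\
Migration status: active
transferred ram: 14956920 kbytes
remaining ram: 1842200 kbytes
total ram: 16797708 kbytes
"""

TEST2 = """\
Migration status: active
transferred ram: 29963348 kbytes
remaining ram: 2987840 kbytes
total ram: 16797708 kbytes
"""


def parse_migrate_status(text):
    """Return {'transferred': kB, 'remaining': kB, 'total': kB}."""
    fields = {}
    pattern = r"^(\w+) ram:\s*(\d+) kbytes"
    for name, value in re.findall(pattern, text, re.MULTILINE):
        fields[name] = int(value)
    return fields


def passes_over_ram(status):
    """How many multiples of total guest RAM have been transferred so far."""
    return status["transferred"] / status["total"]


for label, text in (("test 1", TEST1), ("test 2", TEST2)):
    s = parse_migrate_status(text)
    print(f"{label}: sent {passes_over_ram(s):.2f}x total RAM, "
          f"{s['remaining'] / s['total']:.1%} still remaining")
```

Sampling the same fields several times during one migration and comparing the `remaining` values would show whether the migration is still making progress or has stalled.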




+++ This bug was initially created as a clone of Bug #599330 +++

Description of problem:
While migrating a 4vcpu/16g guest with stress load running on it, the migration started ok but ended in failure.

Version-Release number of selected component (if applicable):
sm71 (rhevh-hypervisor-5.5-2.2.4, rhevm-2.2.0.46140)

How reproducible:
always

Steps to Reproduce:
1- create a guest with 4vcpu and 16g memory.
2- install any os on it. boot this guest up.
3- load 75% memory stress on the guest.
   e.g  #stress -m 48   (load 12g memory stress on rhel)
4- migrate this guest to another host
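
As a sanity check on step 3, the memory touched by `stress` is roughly the number of `-m` workers times the bytes per worker. The sketch below assumes the `stress` default of 256 MB per worker when no per-worker size is given (please confirm with `stress --help` on the guest); `stress_vm_total_mib` is a hypothetical helper used only for this illustration.

```python
# Assumption: stress allocates 256 MB per -m/--vm worker by default.
DEFAULT_VM_MIB = 256


def stress_vm_total_mib(workers, mib_per_worker=DEFAULT_VM_MIB):
    """Total memory (MiB) exercised by the given number of vm workers."""
    return workers * mib_per_worker


guest_mib = 16 * 1024               # 16g guest from step 1
load_mib = stress_vm_total_mib(48)  # "stress -m 48" from step 3
print(f"{load_mib} MiB = {load_mib / guest_mib:.0%} of guest RAM")
```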
  
Actual results:
migration ends in failure. the guest is still running on the source host.

Expected results:
migration succeeded. guest runs on the target host.
OR
give a warning and do not start the migration if conditions are not suitable for it.

Additional info:
1- the source host and target host are in the same cluster, and each host has 8cpu/32gb memory.
2- it fails regardless of the guest os.
3- there are 2 screenshots, taken when the migration started and ended. hope they can help.
4- some info from vdc-log.txt below (for two failed migrations):
------------------------
02Jun 09:45:50 [3424] INFO  - Running command: MigrateVmToServerCommand
02Jun 09:45:50 [5768] INFO  - IncreasePendingVms::MigrateVmIncreasing vds intel-5310-32-2 pending vcpu count, now 4. Vm: IIS_win08r2_64
02Jun 09:55:58 [2756] ERROR - Rerun vm 66b6a11f-3063-4dc7-a825-f585c9037326. Called from vds intel-5310-32-1
-------------------
03Jun 03:26:33 [5520] INFO  - Running command: MigrateVmToServerCommand
03Jun 03:26:34 [4116] INFO  - IncreasePendingVms::MigrateVmIncreasing vds intel-5310-32-2 pending vcpu count, now 4. Vm: Mysql_rhel5u5_64
------------------------
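
To see how long the first migration ran before the error was logged, the timestamps in the excerpt above can be subtracted. This is a minimal sketch: `parse_line` is a hypothetical helper, the regular expression only covers the line shape seen in this excerpt, and the year 2010 is assumed because the log lines themselves carry no year.

```python
import re
from datetime import datetime

# Two lines copied from the first vdc-log.txt excerpt above.
LOG = """\
02Jun 09:45:50 [3424] INFO  - Running command: MigrateVmToServerCommand
02Jun 09:55:58 [2756] ERROR - Rerun vm 66b6a11f-3063-4dc7-a825-f585c9037326. Called from vds intel-5310-32-1
"""

LINE_RE = re.compile(r"^(\d{2}\w{3} \d{2}:\d{2}:\d{2}) \[\d+\] (\w+)\s+- (.*)$")


def parse_line(line, year=2010):
    """Split one log line into (datetime, level, message); None if no match.

    The year is an assumption: the log format does not include it.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    stamp, level, message = m.groups()
    when = datetime.strptime(f"{year} {stamp}", "%Y %d%b %H:%M:%S")
    return when, level, message


events = [e for e in map(parse_line, LOG.splitlines()) if e]
start = next(t for t, _, msg in events if "MigrateVmToServerCommand" in msg)
failed = next(t for t, level, _ in events if level == "ERROR")
elapsed = failed - start
print(f"error logged {elapsed} after the migrate command")
```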


(In reply to comment #7)
> I assume this bug is on the reporting issue to user, and you opened another bug
> to kvm on the fact the migration is failing?    

Lingqing Lu and I didn't file the other migration bug for kvm.

Comment 1 lihuang 2010-06-07 05:24:47 UTC
kvm status in test 1


kvm statistics

 efer_reload                 30       0
 exits                198251092  390809
 fpu_reload              633903     504
 halt_exits             3874161       0
 halt_wakeup             184263       0
 host_state_reload      6951987    1663
 hypercalls                   0       0
 insn_emulation         9582288    5507
 insn_emulation_fail          0       0
 invlpg                  117224       0
 io_exits               1222678     590
 irq_exits              2283870    2787
 irq_injections         7821956    5100
 irq_window             1908481    1521
 kvm_request_irq              0       0
 largepages                   0       0
 mmio_exits              372732       0
 mmu_cache_miss          268363     374
 mmu_flooded              28435       0
 mmu_pde_zapped          279824     373
 mmu_pte_updated            901       0
 mmu_pte_write           333181     373
 mmu_recycled                 0       0
 mmu_shadow_zapped       259495       0
 mmu_unsync                 118      -4
 mmu_unsync_global            0       0
 nmi_injections               0       0
 nmi_window                   0       0
 pf_fixed              93753433  191083
 pf_guest              84925057  189560
 remote_tlb_flush        770846     425
 request_nmi                  0       0
 signal_exits                 1       0
 tlb_flush              1531492     942 
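
A quick way to read the table above is to compare each counter's second column against the `exits` row. The sketch below assumes the second column is the change during the last sampling interval (how kvm_stat normally presents it, but not stated in this report), and `parse_kvm_stat` is a hypothetical helper. The counters are not guaranteed to be one-per-exit, so the ratios are only a rough indication of what dominates.

```python
# A few rows copied from the kvm statistics above.
SAMPLE = """\
 exits                198251092  390809
 pf_fixed              93753433  191083
 pf_guest              84925057  189560
 io_exits               1222678     590
"""


def parse_kvm_stat(text):
    """Map counter name -> (cumulative total, last-interval delta)."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            name, total, delta = parts
            stats[name] = (int(total), int(delta))
    return stats


stats = parse_kvm_stat(SAMPLE)
exits_delta = stats["exits"][1]
ratios = {name: stats[name][1] / exits_delta
          for name in ("pf_fixed", "pf_guest", "io_exits")}
for name, ratio in ratios.items():
    print(f"{name:10s} {ratio:6.1%} relative to exits in the last interval")
```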



qemu-kvm command line :
/usr/libexec/qemu-kvm -no-hpet -no-kvm-pit-reinjection -usbdevice tablet -rtc-td-hack -startdate 2010-06-04T17:02:11 -name Mysql_rhel5u5_64 -smp 4,cores=1 -k en-us -m 16384 -boot c -net nic,vlan=1,macaddr=00:1a:4a:42:46:00,model=virtio -net tap,vlan=1,ifname=virtio_10_1,script=no -drive file=/rhev/data-center/ea8dd427-53d4-441c-8bdf-8eb4c205ff15/6df2e9d8-1366-4f28-aac2-380a7954e738/images/09d33ef8-104d-438f-81f3-a7a398407e28/f81c19f0-c0af-494e-b221-bc1847256711,media=disk,if=virtio,cache=off,serial=8f-81f3-a7a398407e28,boot=on,format=raw,werror=stop -pidfile /var/vdsm/7d73dc91-4f55-46d7-82e2-5cae180487c4.pid -vnc 0:10,password -cpu qemu64,+sse2,+cx16,+ssse3 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-4,serial=44454C4C-5900-1051-8031-C3C04F4D3258_00:22:19:bb:4a:d3,uuid=7d73dc91-4f55-46d7-82e2-5cae180487c4 -vmchannel di:0200,unix:/var/vdsm/7d73dc91-4f55-46d7-82e2-5cae180487c4.guest.socket,server -monitor unix:/var/vdsm/7d73dc91-4f55-46d7-82e2-5cae180487c4.monitor.socket,server

Host cpuinfo: 
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU           E5310  @ 1.60GHz
stepping        : 11
cpu MHz         : 1595.926
cache size      : 4096 KB
physical id     : 1
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 3191.91
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

host meminfo :
MemTotal:     32809788 kB
MemFree:      13321836 kB
Buffers:         40188 kB
Cached:       19249952 kB
SwapCached:          0 kB
Active:         200504 kB
Inactive:     19146268 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     32809788 kB
LowFree:      13321836 kB
SwapTotal:     1023992 kB
SwapFree:      1023992 kB
Dirty:              56 kB
Writeback:           0 kB
AnonPages:       56844 kB
Mapped:          11008 kB
Slab:            93472 kB
PageTables:       4008 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  17428884 kB
Committed_AS:   485296 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    272716 kB
VmallocChunk: 34359464619 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

Comment 2 lihuang 2010-06-07 05:27:33 UTC
Created attachment 421714 [details]
kvmtrace in test 1

Comment 3 lihuang 2010-06-07 05:28:25 UTC
Created attachment 421716 [details]
kvmtrace in test 2

Comment 4 Lawrence Lim 2010-06-07 08:32:16 UTC
Is this specific to OS? Or applicable to all OS?

Comment 5 XinSun 2010-06-07 09:25:04 UTC
*** Bug 599330 has been marked as a duplicate of this bug. ***

Comment 6 lihuang 2010-06-07 14:49:39 UTC
FYI.


Same test run on another host:
1. RHEV Hypervisor 5.5-2.2 (0.10). RHEL5.4 i386 PAE , 16g v-mem, 75% load. with npt
   --> PASS

2. RHEV Hypervisor 5.5-2.2 (4)   . RHEL5.5 x86_64   . 16g v-mem, 75% load. with npt
   --> PASS

3. RHEV Hypervisor 5.5-2.2 (4)   . RHEL5.5 x86_64   . 16g v-mem, 75% load. without npt
   --> PASS


processor       : 11
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 8
model name      : Six-Core AMD Opteron(tm) Processor 2427
stepping        : 0
cpu MHz         : 2200.026
cache size      : 512 KB
physical id     : 1
siblings        : 6
core id         : 5
cpu cores       : 6
apicid          : 13
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc nonstop_tsc pni cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse 3dnowprefetch osvw
bogomips        : 4399.42
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate [8]


[root@amd-2427-32-1 ~]# cat /proc/meminfo 
MemTotal:     32835876 kB
MemFree:      32166204 kB
Buffers:         53688 kB
Cached:         464344 kB
SwapCached:          0 kB
Active:         155556 kB
Inactive:       410880 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     32835876 kB
LowFree:      32166204 kB
SwapTotal:    24809464 kB
SwapFree:     24809464 kB
Dirty:               0 kB
Writeback:           0 kB
AnonPages:       48428 kB
Mapped:          13204 kB
Slab:            45488 kB
PageTables:       2840 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  41227400 kB
Committed_AS:   418440 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    546184 kB
VmallocChunk: 34359190131 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

Comment 7 Juan Quintela 2010-10-26 17:24:34 UTC
Reproduced; patches have been posted.

Comment 8 Juan Quintela 2010-11-24 13:40:58 UTC

*** This bug has been marked as a duplicate of bug 513765 ***

