Bug 1807351 - Migrate L1 guest with running L2 guest with postcopy=on L2 guest hang after migration
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Amnon Ilan
QA Contact: Qinghua Cheng
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-26 07:35 UTC by Qinghua Cheng
Modified: 2020-03-06 05:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-05 09:09:10 UTC
Type: Bug
Target Upstream Version:


Description Qinghua Cheng 2020-02-26 07:35:32 UTC
Description of problem:

Migration test from an 8.1 host to an 8.2 host: after migrating an L1 guest that has a running L2 guest with postcopy enabled, the L2 guest hangs and qemu-kvm reports an error message.

Version-Release number of selected component (if applicable):

8.1 host: 
kernel: 4.18.0-147.6.1.el8_1.x86_64
qemu-kvm: qemu-kvm-2.12.0-88.module+el8.1.0+5708+85d8e057.3.x86_64

8.2 host:
kernel: 4.18.0-179.el8.x86_64
qemu-kvm: qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64

How reproducible:


Steps to Reproduce:
1. Start an L1 guest on the 8.1 host with -cpu host
2. Start an L2 guest inside the L1 guest
3. Start an L1 guest on the 8.2 host and have it wait for the incoming migration
4. Run migrate_set_capability postcopy-ram on on both the 8.1 host and the 8.2 host
5. Start migration from the 8.1 host to the 8.2 host: migrate tcp:<8.2 host ip>:<port>
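
The monitor side of the steps above can be sketched as a small Python helper that only builds the command strings. This is an illustrative sketch, not the script used in the test: the function names, the IP address 192.0.2.10 and the port 4444 are made up for the example. The optional start_postcopy branch is an assumption, since the report does not say whether migrate_start_postcopy was issued; it uses migrate -d so that the monitor returns and the switch to postcopy can be requested.

```python
def incoming_option(port: int) -> str:
    """QEMU command-line option that makes the destination wait for migration (step 3)."""
    return f"-incoming tcp:0:{port}"


def monitor_commands(dest_ip: str, port: int, start_postcopy: bool = False) -> dict:
    """Return the HMP monitor commands for each side, in the order they are typed."""
    # Step 4: the capability has to be enabled on both source and destination.
    capability = "migrate_set_capability postcopy-ram on"
    source = [capability]
    if start_postcopy:
        # Assumption, not stated in the report: detach the migrate command,
        # then explicitly ask QEMU to switch to postcopy.
        source.append(f"migrate -d tcp:{dest_ip}:{port}")
        source.append("migrate_start_postcopy")
    else:
        # Step 5 exactly as written in the report.
        source.append(f"migrate tcp:{dest_ip}:{port}")
    return {"destination": [capability], "source": source}


if __name__ == "__main__":
    # Hypothetical address and port, for illustration only.
    print("destination qemu-kvm option:", incoming_option(4444))
    for side, commands in monitor_commands("192.0.2.10", 4444).items():
        for command in commands:
            print(f"[{side}] (qemu) {command}")
```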

Actual results:
After migration, qemu-kvm reports an error and the L2 guest no longer works:
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) KVM: entry failed, hardware error 0x7
RAX=ffffffff93092eb0 RBX=0000000000000005 RCX=7fffffd461f6c03f RDX=0000000000000001
RSI=0000000000000005 RDI=ffff93baf795d5c0 RBP=0000000000000005 RSP=ffffa8d200cdbea8
R8 =ffffa8d2019bbd30 R9 =0000000000000000 R10=000000000002f5ee R11=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=ffffffff9309325e RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff93baf7940000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 fffffe00000da000 0000206f 00008b00 DPL=0 TSS64-busy
GDT=     fffffe00000d8000 0000007f
IDT=     fffffe0000000000 00000fff
CR0=80050033 CR2=00007f8b8c016238 CR3=000000025a00a001 CR4=00760ee0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=08 75 c4 eb 80 90 e9 07 00 00 00 0f 00 2d d6 5e 57 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d c6 5e 57 00 f4 c3 90 90 0f 1f 44 00 00 41 55 41 54 55 53 e8


Expected results:
After migration, both the L1 and L2 guests work as before.

Additional info:

With postcopy migration between two 8.2 builds (8.2 <-> 8.2), migration succeeds and both the L1 and L2 guests work well.

Postcopy migration from the 8.2 host to the 8.1 host also works.

Host 8.1:

root@dell-per440-08 ~ # lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              20
On-line CPU(s) list: 0-19
Thread(s) per core:  2
Core(s) per socket:  10
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz
Stepping:            7
CPU MHz:             2496.206
CPU max MHz:         3200.0000
CPU min MHz:         1000.0000
BogoMIPS:            4400.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            14080K
NUMA node0 CPU(s):   0-19
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities

Host 8.2:

root@dell-per440-09 ~ # lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              20
On-line CPU(s) list: 0-19
Thread(s) per core:  2
Core(s) per socket:  10
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz
Stepping:            7
CPU MHz:             2907.423
CPU max MHz:         3200.0000
CPU min MHz:         1000.0000
BogoMIPS:            4400.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            14080K
NUMA node0 CPU(s):   0-19
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities

Comment 5 Dr. David Alan Gilbert 2020-03-05 09:09:10 UTC
Closed as not-a-bug because:
   a) 8.1 is missing the code for migrating nested guest state
   b) -cpu host was used across mixed versions

With a named CPU model, this should work between future versions.
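
As a rough illustration of point b), the Python sketch below builds a -cpu argument from a named model instead of host. The comment does not name a model, so Skylake-Server and the +vmx feature flag (to expose VT-x to the L1 guest for nesting) are assumptions chosen only for the example, and cpu_option is a hypothetical helper. Whether a given model is available and suitable on both hosts has to be checked separately. It does not address point a): an 8.1 source still lacks the nested-state migration code.

```python
def cpu_option(model: str, features: tuple = ()) -> str:
    """Build a QEMU -cpu argument from a named model plus extra feature flags."""
    if model == "host":
        # -cpu host passes through whatever the local host and QEMU version
        # expose, which is what the comment above advises against for
        # migration between mixed versions.
        raise ValueError("use a named CPU model rather than 'host' for migration")
    parts = [model] + [f"+{feature}" for feature in features]
    return "-cpu " + ",".join(parts)


if __name__ == "__main__":
    # Hypothetical choice of model and feature, for illustration only.
    print(cpu_option("Skylake-Server", ("vmx",)))
```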

