Bug 1109056
| Summary: | Error while loading state for instance 0x0 of device 'ram' when do migration between 2 different intel hosts | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Shuang Yu <shuyu> |
| Component: | qemu-kvm | Assignee: | Eduardo Habkost <ehabkost> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.1 | CC: | amit.shah, bcao, dgilbert, ehabkost, hhuang, huding, juzhang, lijin, michen, qzhang, rbalakri, shuyu, virt-maint, xfu |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-02-05 16:26:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Shuang Yu
2014-06-13 07:33:08 UTC
Hi Shuyu,

Is this a qemu-kvm-rhev-only bug? Can you test the qemu-kvm version as well? If it is not specific to qemu-kvm-rhev, please update the component to qemu-kvm.

Best Regards,
Junyi

cache=none is missing on the disk, but this bug should be a different issue. Can you check and report:
a- without balloon
b- the only difference between these two CPUs is "aes"; could you test whether that feature is seen inside the guest or filtered?
c- if it still fails, cat /proc/meminfo of the guests launched on both hosts would be a starting point.

Thanks, Juan.

Reproduced this issue on 2 Intel hosts with different CPU models; hit the same issue when migrating between the 2 different Intel hosts.

Version-Release number of selected component (if applicable):
3.10.0-121.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev.x86_64
seabios-1.7.2.2-12.el7.x86_64
virtio-win-prewhql86

Guest: win7-32

Steps:
1. Start the VM without the balloon driver on source host A:
CLI: /usr/libexec/qemu-kvm -m 2G -cpu Nehalem -smp 2 -monitor stdio -vnc :10 -netdev tap,id=hostnet1,script=/etc/qemu-ifup -device e1000,netdev=hostnet1,id=net1,mac=00:22:52:00:04:44 -usb -device usb-tablet,id=tablet1 -drive file=win7-32.qcow2,format=qcow2,if=none,id=drive1 -device ide-drive,drive=drive1,id=disk1 -cdrom en_windows_7_ultimate_x86_dvd_x15-65921.iso -name win7-32

2. Start the listening port on destination host B (the image is shared via NFS):
CLI: /usr/libexec/qemu-kvm -m 2G -cpu Nehalem -smp 2 -monitor stdio -vnc :10 -netdev tap,id=hostnet1,script=/etc/qemu-ifup -device e1000,netdev=hostnet1,id=net1,mac=00:22:52:00:04:44 -usb -device usb-tablet,id=tablet1 -drive file=win7-32.qcow2,format=qcow2,if=none,id=drive1 -device ide-drive,drive=drive1,id=disk1 -cdrom en_windows_7_ultimate_x86_dvd_x15-65921.iso -name win7-32 -incoming tcp:0:5888

3. Do live migration:
(qemu) migrate -d tcp:<ip of host B>:5888

4. Start the VM without the balloon driver on source host B, with the same CLI as step 1. Start the listening port on destination host A, with the same CLI as step 2 (the image is shared via NFS), and do live migration.

5. Start the VM with the balloon driver on source host A:
/usr/libexec/qemu-kvm -m 2G -cpu Nehalem -smp 2 -monitor stdio -vnc :10 -netdev tap,id=hostnet1,script=/etc/qemu-ifup -device e1000,netdev=hostnet1,id=net1,mac=00:22:52:00:04:44 -usb -device usb-tablet,id=tablet1 -drive file=win7-32.qcow2,format=qcow2,if=none,id=drive1 -device ide-drive,drive=drive1,id=disk1 -cdrom en_windows_7_ultimate_x86_dvd_x15-65921.iso -name win7-32 -device virtio-balloon-pci,id=balloon

6. Start the listening port on destination host B (the image is shared via NFS):
/usr/libexec/qemu-kvm -m 2G -cpu Nehalem -smp 2 -monitor stdio -vnc :10 -netdev tap,id=hostnet1,script=/etc/qemu-ifup -device e1000,netdev=hostnet1,id=net1,mac=00:22:52:00:04:44 -usb -device usb-tablet,id=tablet1 -drive file=win7-32.qcow2,format=qcow2,if=none,id=drive1 -device ide-drive,drive=drive1,id=disk1 -cdrom en_windows_7_ultimate_x86_dvd_x15-65921.iso -name win7-32 -device virtio-balloon-pci,id=balloon -incoming tcp:0:5888

7. Do live migration:
(qemu) migrate -d tcp:<ip of host B>:5888

8. Start the VM with the balloon driver on source host B, with the same CLI as step 5. Start the listening port on destination host A, with the same CLI as step 6 (the image is shared via NFS), and do live migration.

Actual results:
All 4 migrations failed; each time the qemu-kvm process quit with "(qemu) qemu: warning: error while loading state for instance 0x0 of device 'ram'"

Expected results:
Migration succeeds.

Additional info:
1. Inside the guest, use "x86info.exe -a" to get the CPU information:

C:\Program Files\GnuWin32\bin>x86info.exe -a
x86info v1.21. Dave Jones 2001-2007
Feedback to <davej>.

Found 2 CPUs
issmp(): /dev/mem: No such file or directory
--------------------------------------------------------------------------
CPU #1
/dev/cpu/0/cpuid: No such file or directory
eax in: 0x00000000, eax = 00000004 ebx = 756e6547 ecx = 6c65746e edx = 49656e69
eax in: 0x00000001, eax = 000106a3 ebx = 00000800 ecx = 80b82201 edx = 078bfbfd
eax in: 0x00000002, eax = 00000001 ebx = 00000000 ecx = 00000000 edx = 002c307d
eax in: 0x00000003, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x00000004, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000000, eax = 8000000a ebx = 756e6547 ecx = 6c65746e edx = 49656e69
eax in: 0x80000001, eax = 000106a3 ebx = 00000000 ecx = 00000001 edx = 20100800
eax in: 0x80000002, eax = 65746e49 ebx = 6f43206c ecx = 69206572 edx = 78392037
eax in: 0x80000003, eax = 4e282078 ebx = 6c616865 ecx = 43206d65 edx = 7373616c
eax in: 0x80000004, eax = 726f4320 ebx = 37692065 ecx = 00000029 edx = 00000000
eax in: 0x80000005, eax = 01ff01ff ebx = 01ff01ff ecx = 40020140 edx = 40020140
eax in: 0x80000006, eax = 00000000 ebx = 42004200 ecx = 02008140 edx = 00000000
eax in: 0x80000007, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000008, eax = 00003024 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000009, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000a, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
Family: 6 Model: 10 Stepping: 3 Type: 0 Brand: 0
CPU Model: Pentium II (Deschutes) Original OEM
Feature flags: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflsh mmx fxsr sse sse2
Extended feature flags: sse3 ssse3 cx16 [19] [20] [21] [23] [31] SYSCALL xd em64t lahf_lm
Cache info
L1 Instruction cache: 32KB, 8-way associative. 64 byte line size.
L1 Data cache: 32KB, 8-way associative. 64 byte line size.
L2 unified cache: 2MB, sectored, 8-way associative. 64 byte line size.
TLB info
Processor serial: 0001-06A3-0000-0000-0000-0000
/dev/cpu/0/msr: No such file or directory
Connector type: Slot 1 (242 Contact Cartridge)
MTRR registers:
MTRRcap (0xfe): MTRRphysBase0 (0x200): MTRRphysMask0 (0x201): MTRRphysBase1 (0x202): MTRRphysMask1 (0x203): MTRRphysBase2 (0x204): MTRRphysMask2 (0x205): MTRRphysBase3 (0x206): MTRRphysMask3 (0x207): MTRRphysBase4 (0x208): MTRRphysMask4 (0x209): MTRRphysBase5 (0x20a): MTRRphysMask5 (0x20b): MTRRphysBase6 (0x20c): MTRRphysMask6 (0x20d): MTRRphysBase7 (0x20e): MTRRphysMask7 (0x20f): MTRRfix64K_00000 (0x250): MTRRfix16K_80000 (0x258): MTRRfix16K_A0000 (0x259): MTRRfix4K_C8000 (0x269): MTRRfix4K_D0000 0x26a: MTRRfix4K_D8000 0x26b: MTRRfix4K_E0000 0x26c: MTRRfix4K_E8000 0x26d: MTRRfix4K_F0000 0x26e: MTRRfix4K_F8000 0x26f: MTRRdefType (0x2ff):
3.40GHz processor (estimate).
--------------------------------------------------------------------------
CPU #2
eax in: 0x00000000, eax = 00000004 ebx = 756e6547 ecx = 6c65746e edx = 49656e69
eax in: 0x00000001, eax = 000106a3 ebx = 00000800 ecx = 80b82201 edx = 078bfbfd
eax in: 0x00000002, eax = 00000001 ebx = 00000000 ecx = 00000000 edx = 002c307d
eax in: 0x00000003, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x00000004, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000000, eax = 8000000a ebx = 756e6547 ecx = 6c65746e edx = 49656e69
eax in: 0x80000001, eax = 000106a3 ebx = 00000000 ecx = 00000001 edx = 20100800
eax in: 0x80000002, eax = 65746e49 ebx = 6f43206c ecx = 69206572 edx = 78392037
eax in: 0x80000003, eax = 4e282078 ebx = 6c616865 ecx = 43206d65 edx = 7373616c
eax in: 0x80000004, eax = 726f4320 ebx = 37692065 ecx = 00000029 edx = 00000000
eax in: 0x80000005, eax = 01ff01ff ebx = 01ff01ff ecx = 40020140 edx = 40020140
eax in: 0x80000006, eax = 00000000 ebx = 42004200 ecx = 02008140 edx = 00000000
eax in: 0x80000007, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000008, eax = 00003024 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000009, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000a, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
Family: 6 Model: 10 Stepping: 3 Type: 0 Brand: 0
CPU Model: Pentium II (Deschutes) Original OEM
Feature flags: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflsh mmx fxsr sse sse2
Extended feature flags: sse3 ssse3 cx16 [19] [20] [21] [23] [31] SYSCALL xd em64t lahf_lm
Cache info
L1 Instruction cache: 32KB, 8-way associative. 64 byte line size.
L1 Data cache: 32KB, 8-way associative. 64 byte line size.
L2 unified cache: 2MB, sectored, 8-way associative. 64 byte line size.
TLB info
Processor serial: 0001-06A3-0000-0000-0000-0000
Connector type: Slot 1 (242 Contact Cartridge)
MTRR registers:
MTRRcap (0xfe): MTRRphysBase0 (0x200): MTRRphysMask0 (0x201): MTRRphysBase1 (0x202): MTRRphysMask1 (0x203): MTRRphysBase2 (0x204): MTRRphysMask2 (0x205): MTRRphysBase3 (0x206): MTRRphysMask3 (0x207): MTRRphysBase4 (0x208): MTRRphysMask4 (0x209): MTRRphysBase5 (0x20a): MTRRphysMask5 (0x20b): MTRRphysBase6 (0x20c): MTRRphysMask6 (0x20d): MTRRphysBase7 (0x20e): MTRRphysMask7 (0x20f): MTRRfix64K_00000 (0x250): MTRRfix16K_80000 (0x258): MTRRfix16K_A0000 (0x259): MTRRfix4K_C8000 (0x269): MTRRfix4K_D0000 0x26a: MTRRfix4K_D8000 0x26b: MTRRfix4K_E0000 0x26c: MTRRfix4K_E8000 0x26d: MTRRfix4K_F0000 0x26e: MTRRfix4K_F8000 0x26f: MTRRdefType (0x2ff):
3.40GHz processor (estimate).
--------------------------------------------------------------------------
WARNING: Detected SMP, but unable to access cpuid driver.
Used Uniprocessor CPU routines. Results inaccurate.

C:\Program Files\GnuWin32\bin>

2. The host information:

Host A:
#cat /proc/cpuinfo:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping : 7
microcode : 0x29
cpu MHz : 3276.882
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6784.71
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

#cat /proc/meminfo:
MemTotal: 7736640 kB
MemFree: 142560 kB
MemAvailable: 5036368 kB
Buffers: 728 kB
Cached: 4929644 kB
SwapCached: 1020 kB
Active: 4134476 kB
Inactive: 3011776 kB
Active(anon): 1447124 kB
Inactive(anon): 794016 kB
Active(file): 2687352 kB
Inactive(file): 2217760 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 8142844 kB
SwapFree: 8138388 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 2214452 kB
Mapped: 35724 kB
Shmem: 25260 kB
Slab: 298604 kB
SReclaimable: 242112 kB
SUnreclaim: 56492 kB
KernelStack: 1960 kB
PageTables: 10720 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 12011164 kB
Committed_AS: 2651040 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 563308 kB
VmallocChunk: 34359081920 kB
HardwareCorrupted: 0 kB
AnonHugePages: 2129920 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 84768 kB
DirectMap2M: 8124416 kB

Host B:
#cat /proc/cpuinfo:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
stepping : 7
microcode : 0x29
cpu MHz : 1861.695
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6185.51
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

#cat /proc/meminfo
MemTotal: 16049188 kB
MemFree: 3682812 kB
MemAvailable: 13281916 kB
Buffers: 12 kB
Cached: 9639600 kB
SwapCached: 12204 kB
Active: 10401008 kB
Inactive: 1429688 kB
Active(anon): 2181808 kB
Inactive(anon): 34368 kB
Active(file): 8219200 kB
Inactive(file): 1395320 kB
Unevictable: 5852 kB
Mlocked: 5852 kB
SwapTotal: 8200188 kB
SwapFree: 8074744 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 2188060 kB
Mapped: 31712 kB
Shmem: 22236 kB
Slab: 310456 kB
SReclaimable: 237988 kB
SUnreclaim: 72468 kB
KernelStack: 2160 kB
PageTables: 13140 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 16224780 kB
Committed_AS: 2609392 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 380504 kB
VmallocChunk: 34359316440 kB
HardwareCorrupted: 0 kB
AnonHugePages: 2152448 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 159184 kB
DirectMap2M: 16496640 kB

(In reply to shuyu from comment #5)
> Reproduce this issue on 2 intel hosts with different cpu models ,hit the
> same issue when do migration between 2 different intel hosts

I have questions about the steps in comment #5. When exactly does migration fail? I see you do migration in steps 3, 4, 7, 8. Does it fail in all of them? If it fails in all cases, is there a specific reason comment #5 includes steps to reproduce with the balloon driver, if migration already fails without it?

> 1.Inside the guest,use "x86info.exe -a" get the cpu infomation:

On which host was it running? Can you report the guest-side x86info results on both hosts? (They should be exactly the same, but it is good to confirm that anyway.)

All error return paths (except for the version_id check) in ram_load() have error_report() calls. Do you have the full QEMU error output? Are there any additional error messages explaining why the RAM load failed?

Other issues/questions:
1. Why are you testing the 7.0 qemu-kvm-rhev package (1.5.3) and not qemu-kvm-rhev-2.1.2?
2. You are not specifying the machine-type. You must always specify the machine-type when live-migrating.
3. Please confirm the qemu-kvm and *bios versions on both hosts, not only one host.

All signs point to incompatible QEMU versions being used without an explicit machine-type option, and no extra information was provided. Please reopen the bug if it is reproducible when specifying the machine-type explicitly.

Clearing needinfo as the bug is closed. |
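For anyone retesting this: the claim that "aes" is the only difference between the two host CPUs can be checked mechanically from the two /proc/cpuinfo dumps rather than by eye. A minimal sketch follows, assuming the cpuinfo text of each host has been saved to a string. The helper names `cpu_flags` and `flag_diff` are hypothetical, and the two sample strings are shortened excerpts of the host flags above, not the full lists.

```python
from __future__ import annotations


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


def flag_diff(host_a: str, host_b: str) -> tuple[set[str], set[str]]:
    """Return (flags only on host A, flags only on host B)."""
    a, b = cpu_flags(host_a), cpu_flags(host_b)
    return a - b, b - a


# Shortened excerpts of the host dumps in this bug (assumption: the real
# input would be the complete output of `cat /proc/cpuinfo` on each host).
HOST_A = (
    "model name : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz\n"
    "flags : fpu vme sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx\n"
)
HOST_B = (
    "model name : Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz\n"
    "flags : fpu vme sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx\n"
)

only_a, only_b = flag_diff(HOST_A, HOST_B)
print("only on host A:", sorted(only_a))
print("only on host B:", sorted(only_b))
```

This only compares what the host kernels report. Whether the guest sees a given flag still depends on the -cpu model and machine-type passed to QEMU, which is what the questions above are asking about.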