Bug 1003293
Summary: qemu crash when boot from snapshot image file
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm
Version: 7.0
Status: CLOSED WORKSFORME
Reporter: Xu Tian <xutian>
Assignee: Jeff Cody <jcody>
QA Contact: Virtualization Bugs <virt-bugs>
CC: acathrow, hhuang, juzhang, virt-maint, xutian, xwei
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-11-05 19:28:06 UTC
Could not reproduce this issue with a RHEL 6.4 guest.

Checked memory usage with top on the host: when qemu crashed it was using about 51% of memory (win7 x86_64 guest), but only about 10% of memory when the guest booted from the sn1 image (RHEL 6.4 x86_64 guest).

Thanks,
Xu

(In reply to xu from comment #2)
> Could not reproduce this issue with a RHEL 6.4 guest.
> [...]

After changing the guest memory (Windows) to 2048M, the guest can boot from the sn1 image and no crash happens. It is strange that qemu eats so much memory when booting from a snapshot image file.

Thanks,
Xu

Tested the migration case (2 x 4G VMs on an 8G host) and hit the same seg fault, but it works for me after downgrading glibc to:

glibc-common-2.17-4.el7.x86_64
glibc-devel-2.17-4.el7.x86_64
glibc-headers-2.17-4.el7.x86_64
glibc-2.17-4.el7.x86_64
glibc-debuginfo-common-2.17-4.el7.x86_64

Regards,
Xiaoqing.

(In reply to Xiaoqing Wei from comment #4)
> Tested the migration case (2 x 4G VMs on an 8G host) and hit the same seg
> fault, but it works for me after downgrading glibc to:
> [...]

Oops, I have to take this comment back, as more rounds of testing still seg fault :(

I've been unable to reproduce this on qemu-kvm-1.5.3-2.el7. This is on a machine with a little less than 8GB of RAM and 8GB of swap.

A couple of questions:

In comment #4, Xiaoqing referenced a "seg fault". I assume that means the same abort that was hit in the backtrace and in the description (and not an actual SEGFAULT)?

Also, the /proc/meminfo in the description shows <8GB of RAM and ~8GB of swap. What does the qemu memory usage look like while running the guest with the base, sn1, and sn2 images?

Is there anything else running on this host (any other qemu instances, etc.)?

Thanks!

(In reply to Jeff Cody from comment #6)
> In comment #4, Xiaoqing referenced a "seg fault". I assume that means the
> same abort that was hit in the backtrace and in the description (and not an
> actual SEGFAULT)?

Yes, the backtrace that Xiaoqing mentioned in comment #4 is the same as the one I posted in the attachment.

> What does the qemu memory usage look like while running the guest with the
> base, sn1, and sn2 images?

No crash with sn2, crash with sn1, so I am posting the 'top' command output here.

> Is there anything else running on this host (any other qemu instances,
> etc.)?

Only one qemu instance runs on that host; you can check the output of 'top'.

Created attachment 794593 [details]
top output when booting with sn2 (no crash)
Created attachment 794594 [details]
top output when booting with sn1 (crashed)
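The 'top' snapshots attached above capture the whole system; when comparing the sn1 and sn2 boots it can be easier to log a single per-process figure over time. A minimal sketch (the helper name is mine, not part of this report):

```shell
# Print "epoch_seconds rss_kb" for a single process (e.g. the qemu PID),
# so successive samples can be diffed or plotted.
sample_rss() {
    # `ps -o rss= -p PID` prints the resident set size in kB, with no header.
    ps -o rss= -p "$1" | awk -v t="$(date +%s)" '{print t, $1}'
}

# Example: log the oldest qemu process once per second until interrupted:
#   while true; do sample_rss "$(pgrep -o qemu)"; sleep 1; done >> qemu_rss.txt
```

This records only the resident set size, so it misses swap usage; for the full picture the system-wide loop suggested in the thread is still needed.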
Thanks for the attachments. Unfortunately, the memory usage does not seem to add up, and it is not accounted for in the buffers/cache usage either.

I hate to ask, but could you run it again, this time starting the following command in a different terminal window prior to starting the qemu process? Let it run throughout the test, and then terminate it (with ^C) after the test is complete:

while [ 1 ]; do ps krsz -e -o pid,vsz,rsz,comm,args=; echo -e "\nfree:"; free; echo -e "\n\n"; sleep 1; done | tee ps_output.txt

Afterwards, could you attach ps_output.txt for a run where qemu aborts and for one where it does not (you may want to bzip2 ps_output.txt before attaching it)? Also note that this command displays the command-line arguments of all processes, so if you have anything sensitive on a command line (e.g. passwords passed by argument) you may want to censor that information.

Thanks again,
Jeff

Hi Xu,

Can you have a look at comment #10 and give feedback?

(In reply to Jeff Cody from comment #10)
> I hate to ask, but could you run it again, this time starting the following
> command in a different terminal window prior to starting the qemu process?
> [...]

Reproduce steps:

1. Boot the base image, then make the live snapshot chain base -> sn1 -> sn2 (no crash).
   'ps' output file: ps_output_make_snapshot_chain.txt
2. Shut down the guest, then boot from sn2 (no crash).
   'ps' output file: ps_output_boot_sn2.txt
3. Shut down the guest, then boot from sn1 (no crash).
   'ps' output file: ps_output_boot_sn1.txt
4. Shut down the guest, then boot from base (crashed).
   'ps' output file: ps_output_boot_base_crashed.txt

Thanks,
Xu

Created attachment 795811 [details]
ps command output
Decompress the attachment file and you will see the ps output files:
ps_output.tgz.xz && tar -xzvf ps_output.tgz
(In reply to xu from comment #13)
> ps_output.tgz.xz && tar -xzvf ps_output.tgz

That should be:

xz -d ps_output.tgz.xz && tar -xzvf ps_output.tgz

I have been unable to reproduce this problem. I believe the issue may lie in memory allocation by autotest and residual qemu instances, rather than in a bug in qemu itself. QEMU aborts because it is unable to allocate the memory requested. If you can still reproduce the issue outside of autotest, please reopen this bug or file a new one.
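For retrying the snapshot steps outside of autotest, the blockdev-snapshot-sync commands from the description can be generated and fed to the guest's QMP socket by hand. A sketch under the assumption that qemu was started with a QMP unix socket (the socket path and drive id below are placeholders; QMP requires a qmp_capabilities handshake before any other command):

```shell
# Emit the QMP blockdev-snapshot-sync command used in the reproduce steps.
# Arguments: device id, path of the new snapshot file (format fixed to qcow2).
qmp_snapshot_sync() {
    printf '{"execute": "blockdev-snapshot-sync", "arguments": {"device": "%s", "snapshot-file": "%s", "format": "qcow2"}}\n' \
        "$1" "$2"
}

# Example, piping the handshake plus one snapshot command into the socket:
#   { echo '{"execute": "qmp_capabilities"}'; \
#     qmp_snapshot_sync drive-ide0-0-0 /path/to/sn1.qcow2; } \
#     | socat - UNIX-CONNECT:/tmp/qmp.sock
```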
Created attachment 792646 [details]
backtrace

Description of problem:
When booting a win7 64-bit guest from a snapshot image, qemu crashed and reported "Failed to allocate 4294967296 B: Cannot allocate memory".

Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-2.el7.x86_64
kernel-3.10.0-15.el7.x86_64
glibc-2.17-27.el7.x86_64

How reproducible:
about 80%

Steps to Reproduce:
1. Boot the win7 guest:

/root/test/autotest-devel/client/tests/virt/qemu/qemu \
    -name 'virt-tests-vm1' \
    -nodefaults \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20130902-003659-dpmQSqE4,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20130902-003659-dpmQSqE4,server,nowait \
    -device isa-serial,chardev=serial_id_serial1 \
    -chardev socket,id=seabioslog_id_20130902-003659-dpmQSqE4,path=/tmp/seabios-20130902-003659-dpmQSqE4,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20130902-003659-dpmQSqE4,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=0x4 \
    -drive file='/root/test/autotest-devel/client/tests/virt/shared/data/images/win7-64.qcow2',index=0,if=none,id=drive-ide0-0-0,media=disk,cache=writeback,snapshot=off,format=qcow2,aio=native \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0 \
    -device rtl8139,netdev=idSjgfXF,mac='9a:30:31:32:33:34',bus=pci.0,addr=0x3,id='idQb0e95' \
    -netdev tap,id=idSjgfXF,vhost=on,vhostfd=25,fd=24 \
    -m 4096 \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
    -cpu 'SandyBridge',hv_relaxed \
    -M pc \
    -drive file='/root/test/autotest-devel/client/tests/virt/shared/data/isos/windows/winutils.iso',index=1,if=none,id=drive-ide0-0-1,media=cdrom,format=raw \
    -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -vga std \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off \
    -enable-kvm

2. Create a file in the guest:
D:\coreutils\DummyCMD.exe C:\test\image1 1048576 1

3. Create live snapshot file sn1:
{'execute': 'blockdev-snapshot-sync', 'arguments': {'device': u'drive-ide0-0-0', 'snapshot-file': '/root/test/autotest-devel/client/tests/virt/shared/data/images/sn1.qcow2', 'format': 'qcow2'}, 'id': 'xXsGotfc'}

4. Create a file in the guest:
D:\coreutils\DummyCMD.exe C:\test\sn1 1048576 1

5. Create live snapshot file sn2:
{'execute': 'blockdev-snapshot-sync', 'arguments': {'device': u'drive-ide0-0-0', 'snapshot-file': '/root/test/autotest-devel/client/tests/virt/shared/data/images/sn2.qcow2', 'format': 'qcow2'}, 'id': 'xXsGotfc'}

6. Create a file in the guest:
D:\coreutils\DummyCMD.exe C:\test\sn2 1048576 1

7. Shut down the guest.
8. Boot the guest with sn2, check the file in the guest, then shut down the guest.
9. Boot the guest with sn1, check the file in the guest, then shut down the guest.

Actual results:
qemu crashes at step 8 or step 9.

Expected results:
Guest works fine.

Additional info:
[root@localhost qemu]# cat /proc/meminfo
MemTotal:        7791860 kB
MemFree:         6955336 kB
Buffers:               0 kB
Cached:           646232 kB
SwapCached:         3780 kB
Active:           394932 kB
Inactive:         258260 kB
Active(anon):       2312 kB
Inactive(anon):     5160 kB
Active(file):     392620 kB
Inactive(file):   253100 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       8142844 kB
SwapFree:        8101444 kB
Dirty:                20 kB
Writeback:             0 kB
AnonPages:          3952 kB
Mapped:             4916 kB
Shmem:               460 kB
Slab:              71192 kB
SReclaimable:      22596 kB
SUnreclaim:        48596 kB
KernelStack:        1464 kB
PageTables:         4520 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    12038772 kB
Committed_AS:     220052 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      357436 kB
VmallocChunk:   34359372444 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      101840 kB
DirectMap2M:     8165376 kB

cpu info:
...
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping        : 7
microcode       : 0x25
cpu MHz         : 1666.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips        : 6784.18
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

See the full backtrace in the attachment.
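Since the failure is "Failed to allocate 4294967296 B" for a -m 4096 guest, a quick sanity check against the /proc/meminfo numbers above is whether MemFree plus SwapFree covers the requested guest RAM. This is a rough heuristic only (it ignores overcommit accounting and qemu's own overhead), and the helper name is mine:

```shell
# fits_in_memory GUEST_MB [MEMINFO_FILE]
# Exit 0 if MemFree + SwapFree >= the requested guest RAM, 1 otherwise.
fits_in_memory() {
    guest_mb=$1
    meminfo=${2:-/proc/meminfo}
    awk -v need_kb=$((guest_mb * 1024)) '
        /^MemFree:/  { avail += $2 }
        /^SwapFree:/ { avail += $2 }
        END { exit (avail >= need_kb) ? 0 : 1 }
    ' "$meminfo"
}
```

With the values from this report (MemFree 6955336 kB, SwapFree 8101444 kB), a 4096 MB guest fits on paper, which is consistent with the closing theory that something else (e.g. residual qemu instances) was consuming the memory when the abort occurred.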