Description of problem:
The win8-32 qcow2 image is damaged when system_reset is issued during wakeup from S3.

Version-Release number of selected component (if applicable):
kernel-2.6.32-416.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.398.el6.x86_64
virtio-win-prewhql-0.1.68
spice-server-0.12.4-2.el6.x86_64
seabios-0.6.1.2-28.el6.x86_64
vgabios-0.6b-3.7.el6.noarch

How reproducible:
1/2

Steps to Reproduce:
1. Boot the guest with qxl ("-spice port=5931,disable-ticketing -vga qxl -global qxl-vga.revision=3").
CLI:
/usr/libexec/qemu-kvm \
 -M rhel6.5.0 \
 -m 2G \
 -smp 2,cores=2 \
 -cpu 'SandyBridge' \
 -usb \
 -device usb-tablet \
 -enable-kvm \
 -drive file=win8-32.qcow2,format=qcow2,if=none,id=drive-blk,cache=writeback,rerror=stop,werror=stop,serial=disk0 \
 -device ide-drive,drive=drive-blk,id=blk0-0-0-0,bootindex=1 \
 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no \
 -device e1000,netdev=hostnet0,mac=fe:23:40:21:22:31,id=net0 \
 -uuid 8ff96365-d565-4304-8534-f1282aa90267 \
 -no-kvm-pit-reinjection \
 -chardev socket,id=111a,path=/tmp/monitor-win8-32,server,nowait \
 -mon chardev=111a,mode=readline \
 -name win8-32 \
 -spice port=5931,disable-ticketing \
 -vga qxl \
 -global qxl-vga.revision=3 \
 -rtc base=localtime,clock=host,driftfix=slew \
 -global PIIX4_PM.disable_s3=0 \
 -global PIIX4_PM.disable_s4=0 \
 -monitor stdio
2. Install the qxl driver qxlwddm-0.2-1.
3. Put the guest into S3.
4. Wake the guest up.
5. During step 4, issue system_reset.
6. Repeat steps 3-5 many times.

Actual results:
After step 6, the win8-32 qcow2 image is damaged. qemu-img check output:

$ qemu-img check win8-32-old.qcow2
ERROR cluster 343662 refcount=1 reference=2

1 errors were found on the image.
Data may be corrupted, or further writes to the image may corrupt it.
Image end offset: 23912579072

Expected results:
No errors are found on the image after system_reset is issued during wakeup from S3.

Additional info:
win8-32 could not wake up successfully from S3.
Created attachment 795820 [details]
Screenshot of the guest during wakeup from S3; system_reset was issued while the guest was in this state.
system_reset in qemu monitor is like pressing the reset button on a physical computer or plugging off the power cord and plugging it back in. Since there's no guest cooperation, the guest could be in the middle of using the disk, have unwritten data in memory, and all that will be lost when system_reset is invoked.

Not surprising this can cause image corruption.
(In reply to Amit Shah from comment #5)
> system_reset in qemu monitor is like pressing the reset button on a physical
> computer or plugging off the power cord and plugging it back in. Since
> there's no guest cooperation, the guest could be in the middle of using the
> disk, have unwritten data in memory, and all that will be lost when
> system_reset is invoked.
>
> Not surprising this can cause image corruption.

Amit, is that to say this is not a bug?
This looks very much like a bug, ERRORs reported by qemu-img check are almost always bugs.

Can you try if this is reproducible on RHEL 7?

Is it really necessary to install the QXL driver or does it happen without it as well?

Do you still have the image around, so I could have a look at it?
(In reply to Kevin Wolf from comment #7)
> This looks very much like a bug, ERRORs reported by qemu-img check are almost
> always bugs.
>
> Can you try if this is reproducible on RHEL 7?
>
> Is it really necessary to install the QXL driver or does it happen without it
> as well?
>
> Do you still have the image around, so I could have a look at it?

Hi Kevin,

I could not reproduce it again on a RHEL 6 host, with or without the QXL driver installed, so I could not analyze the cause of the image corruption. I still have the image and will upload it.
This is a case of two data clusters pointing to the same cluster in the image file, as shown by the following 'qemu-img map' output:

Offset          Length          Mapped to       File
...
0x546d70000     0x10000         0x53e6e0000     win8-32-old.qcow2
...
0x7e13f0000     0x90000         0x53e670000     win8-32-old.qcow2
...

The corrupted cluster (343662 * 64k = 0x53e6e0000) is contained in the "mapped to" area of both allocations.
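The arithmetic in this comment can be checked with a short Python sketch. It only replays the numbers quoted from 'qemu-img check' and 'qemu-img map'; the helper names `cluster_offset` and `covers` are made up for illustration and are not part of qemu.

```python
CLUSTER_SIZE = 64 * 1024  # 64 KiB, the cluster size used in the comment above


def cluster_offset(index, cluster_size=CLUSTER_SIZE):
    """Host offset in the image file of the cluster with the given index."""
    return index * cluster_size


def covers(mapped_to, length, offset):
    """True if the host range [mapped_to, mapped_to + length) contains offset."""
    return mapped_to <= offset < mapped_to + length


# Values copied from the 'qemu-img check' and 'qemu-img map' output.
bad = cluster_offset(343662)
extents = [
    # (guest offset, length, mapped to)
    (0x546d70000, 0x10000, 0x53e6e0000),
    (0x7e13f0000, 0x90000, 0x53e670000),
]

print(hex(bad))  # 0x53e6e0000
for guest, length, mapped in extents:
    # Both extents report True: each one covers the corrupted cluster.
    print(hex(guest), covers(mapped, length, bad))
```

The second extent spans 0x53e670000 up to (but not including) 0x53e700000, which is why it also contains 0x53e6e0000.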
Created attachment 796723 [details]
qemu-img map/check output (with debug messages added to check)

Attaching some qemu-img outputs I gathered on the machine with the broken image. One is the output of 'qemu-img map' as of current upstream master; the other is the output of a 'qemu-img check' with added debug output for each reference that is found and accounted for.
Not sure how relevant it is, but from the qemu-img check output:

UPDATE: cluster offset=0x473400000 -> refcount 1
UPDATE: cluster offset=0x473410000 -> refcount 1
UPDATE: cluster offset=0x53e6e0000 -> refcount 1
UPDATE: cluster offset=0x540c90000 -> refcount 1
UPDATE: cluster offset=0x540ca0000 -> refcount 1
...
UPDATE: cluster offset=0x53e6c0000 -> refcount 1
UPDATE: cluster offset=0x53e6d0000 -> refcount 1
UPDATE: cluster offset=0x53e6e0000 -> refcount 2
UPDATE: cluster offset=0x53e6f0000 -> refcount 1
UPDATE: cluster offset=0x53e740000 -> refcount 1

This shows that one of the allocations is a single cluster allocation, whereas the other one is part of a longer contiguous allocation.
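As a rough illustration of how a reference count of 2 arises here, the sketch below tallies references per host cluster for the two allocations from the 'qemu-img map' output. This is a simplified model and not qemu's actual check code: it counts only data cluster references, assumes cluster-aligned extents, and ignores metadata such as L1/L2 tables and refcount blocks. The function name `count_references` is hypothetical.

```python
from collections import Counter

CLUSTER_SIZE = 0x10000  # 64 KiB clusters, as in this image


def count_references(extents, cluster_size=CLUSTER_SIZE):
    """Count how many data mappings reference each host cluster.

    extents is a list of (mapped_to, length) pairs, assumed cluster-aligned.
    """
    refs = Counter()
    for mapped_to, length in extents:
        for off in range(mapped_to, mapped_to + length, cluster_size):
            refs[off] += 1
    return refs


# The two allocations quoted from 'qemu-img map' earlier in this bug.
extents = [
    (0x53e6e0000, 0x10000),  # single cluster allocation
    (0x53e670000, 0x90000),  # contiguous allocation of nine clusters
]

refs = count_references(extents)
shared = {hex(off): n for off, n in refs.items() if n > 1}
print(shared)  # {'0x53e6e0000': 2}
```

Only the one cluster is referenced twice, while the refcount stored in the image for it is 1, which matches the "refcount=1 reference=2" error reported by qemu-img check.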
S3/S4 support is tech-preview in RHEL6 and it'll be promoted to fully supported at some point, but only in RHEL7. Therefore we're closing all S3/S4 related bugs in RHEL6. New bugs will be considered only if they're regressions or break some important use-case or certification. RHEL7 is being more extensively tested and effort from QE is underway in certifying that this particular bug is not present there. Please reopen with a justification if you believe this bug should not be closed. We'll consider them on a case-by-case basis following a best effort approach. Thank you.