Bug 574495
Summary: | QEMU: VM crash with qcow2 over block device error (and virtio-blk driver) | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Yaniv Kaul <ykaul> | ||||||
Component: | kvm | Assignee: | Kevin Wolf <kwolf> | ||||||
Status: | CLOSED DUPLICATE | QA Contact: | Virtualization Bugs <virt-bugs> | ||||||
Severity: | urgent | Docs Contact: | |||||||
Priority: | high | ||||||||
Version: | 5.5 | CC: | akong, cpelland, jinzishuai, jkt, llim, tburke, virt-maint, ykaul | ||||||
Target Milestone: | rc | ||||||||
Target Release: | --- | ||||||||
Hardware: | All | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2010-05-10 11:12:51 UTC | Type: | --- | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | |||||||||
Bug Blocks: | 556823, 580948 | ||||||||
Attachments: |
|
Description
Yaniv Kaul
2010-03-17 16:34:41 UTC
Complete command line: /usr/libexec/qemu-kvm -no-hpet -usb -rtc-td-hack -startdate 2010-3-17T9:11:24 -name W2K8_R2PV_COW_V -smp 1,cores=1 -k en-us -m 512 -boot cdn -net nic,vlan=1,macaddr=00:1a:4a:16:97:1d,model=virtio -net tap,vlan=1,ifname=virtio_10_1,script=no -drive file=/rhev/data-center/6bc3de76-9469-4852-91d0-b2bf668f941c/beddbe7d-e512-4d0c-b11f-a2ae5ae98e9e/images/725c6944-2fb2-49f2-b84d-edf3b7787270/fb5a14c2-224a-40fc-af5c-364efbb80861,media=disk,if=virtio,cache=off,serial=f2-b84d-edf3b7787270,boot=on,format=qcow2,werror=stop -drive file=/rhev/data-center/6bc3de76-9469-4852-91d0-b2bf668f941c/dfe7d294-0cc3-484c-a2d4-eb6908c49960/images/11111111-1111-1111-1111-111111111111/en_windows_server_2008_r2_standard_enterprise_datacenter_and_web_x64_dvd_x15-59754.iso,media=cdrom,index=2,if=ide -fda /rhev/data-center/6bc3de76-9469-4852-91d0-b2bf668f941c/dfe7d294-0cc3-484c-a2d4-eb6908c49960/images/11111111-1111-1111-1111-111111111111/virtio-drivers-1.0.0-8.vfd -pidfile /var/vdsm/557eb7ff-7e25-4dc5-a9f4-c60cf36afadf.pid -vnc 0:10,password -cpu qemu64,+sse2,+cx16,+ssse3,+sse4.1,+sse4.2,+popcnt -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer="Red Hat",product="RHEL",version=5Server-5.5.0.2,serial="9978F05A-B189-11DE-9BD8-00215EC7F8AC_00:21:5e:c7:f8:ac",uuid="557eb7ff-7e25-4dc5-a9f4-c60cf36afadf" -vmchannel di:0200,unix:/var/vdsm/557eb7ff-7e25-4dc5-a9f4-c60cf36afadf.guest.socket,server -monitor unix:/var/vdsm/557eb7ff-7e25-4dc5-a9f4-c60cf36afadf.monitor.socket,server 1>/var/vdsm/557eb7ff-7e25-4dc5-a9f4-c60cf36afadf.stdio.dump 2>&1; /usr/bin/sudo /usr/bin/tunctl -d virtio_10_1 Created attachment 400826 [details]
qemu-img check results
Created attachment 400828 [details]
VDSM log
Meh, I seem to have missed that my comment wasn't added yesterday because Yaniv added something else and now it's gone. Let me see if I can remember what it said... One thing I noticed was that the failure happened at the very last clusters in the first refcount block, with the second refcount block not yet allocated. So the problem could either be that the refcount of these last clusters was already corrupted and would have become negative; or that the update crossed the boundary and tried to decrease the refcount there, which led to the allocation a new refcount block which might have gone wrong. That alloc_refcount_block can turn any write errors into EINVAL is another observation I made. Yaniv said that shortly before the crash a high watermark was reached and this might actually have been an ENOSPC. Kevin, does the new ref count fix might fix it too? Good question. Without that fix I/O errors during refcount block allocation could lead to almost any kind of corruption. It might have fixed the problem completely (if a previous refcount block allocation error has caused the situation), it might have fixed part of it (the abort() would still happen, but with no additional image corruption) or it might be completely unrelated - we can't know. Yaniv, something like this hasn't happened again since you reported the bug? (In reply to comment #6) > Good question. Without that fix I/O errors during refcount block allocation > could lead to almost any kind of corruption. It might have fixed the problem > completely (if a previous refcount block allocation error has caused the > situation), it might have fixed part of it (the abort() would still happen, but > with no additional image corruption) or it might be completely unrelated - we > can't know. > > Yaniv, something like this hasn't happened again since you reported the bug? Nope. As long as nobody can reproduce, we can't do anything about it anyway. I'm closing this as a duplicate of the bug Dor mentioned. If later it turns out that the patches don't fix this case, please reopen. *** This bug has been marked as a duplicate of bug 567940 *** I can reproduce this problem with a Windows 2008 server VM running on RHEL-5.5 with the virtio disk and network drivers. I happens when I try to attach a disk to the VM using "virsh attach-disk" command to attach a disk image file in qcow2 format. The log I got is also qcow2: free_clusters failed: Invalid argument |