Bug 816505
Summary: | [qemu-kvm]disk checking for consistency happens sometimes when rebooting guest after migrating. | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | dawu |
Component: | qemu-kvm | Assignee: | Ronen Hod <rhod> |
Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 7.0 | CC: | acathrow, areis, armbru, bcao, bsarathy, hhuang, juzhang, knoel, mdeng, michen, mkenneth, mtosatti, pbonzini, qiguo, rhod, shuang, virt-maint, vrozenfe, yvugenfi |
Target Milestone: | rc | | |
Target Release: | 7.0 | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2014-04-28 11:43:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 798682 | | |
Attachments: | | | |
Description
dawu
2012-04-26 09:52:04 UTC
Created attachment 580417 [details]
win2k8-32-diskchecking-1
Created attachment 580418 [details]
win2k8-32-diskchecking-2
Created attachment 580419 [details]
win2k8-32-diskchecking-3
Tested without the balloon driver on win2k3-64 three times and did not hit this issue. I'll try more runs on win2k8-32 without the balloon driver and update the results.

Thanks!
Best Regards,
Dawn

(In reply to comment #4)
> Tested without the balloon driver on win2k3-64 three times and did not hit
> this issue. I'll try more runs on win2k8-32 without the balloon driver and
> update the results.
>
> Thanks!
> Best Regards,
> Dawn

1. Tried more rounds of ping-pong migration between 2 hosts on fresh images of win2k8-32 and win2k3-64 without the balloon driver, and also hit this issue.

CLI:
    /usr/libexec/qemu-kvm -m 2G -smp 2 -cpu cpu64-rhel6,+x2apic -usb -device usb-tablet \
      -drive file=win2k3-64-fun-balloon.raw,format=raw,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,cache=none \
      -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
      -netdev tap,id=hostnet0,script=/etc/qemu-ifup0 \
      -device e1000,netdev=hostnet0,mac=00:10:16:23:78:01,bus=pci.0,addr=0x4 \
      -uuid a65b5920-b410-4606-8c4a-eb2eacb58f96 -rtc base=localtime -no-kvm-pit-reinjection \
      -monitor stdio -name rhel63 -spice disable-ticketing,port=5931 -vga qxl \
      -bios /usr/share/seabios/bios-pm.bin

Please refer to the attached "2k8-32-NoBalloonDriver-1.png", "2k8-32-noBalloonDriver-2.png", "2k3-64-noBalloonDriver-1.png" and "2k3-64-noballoonDriver-2.png".

Note: The md5sum of the image is different before migration and after migration -> shutdown -> re-start, since there is a file corrupted or missing.

2. For the rhel6.3 guest, I tried 5 times and have not found any issue so far.

Best Regards,
Dawn

Created attachment 580685 [details]
2k8-32-NoBalloonDriver-1
Created attachment 580686 [details]
2k8-32-noBalloonDriver-2
Created attachment 580687 [details]
2k3-64-noBalloonDriver-1
Created attachment 580688 [details]
2k3-64-noballoonDriver-2
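
The md5sum comparison mentioned in the note above could be scripted roughly as follows. This is only a sketch: the function names `image_md5` and `images_match` are hypothetical and not part of any tool used in this report. Keep in mind that any write by the guest (including an ordinary boot) changes the image, so a differing checksum on its own does not prove corruption; the comparison is only meaningful between copies that are expected to be identical.

```python
import hashlib


def image_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as image:
        # Read fixed-size chunks so a multi-gigabyte image is never held in memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(path_a, path_b):
    """Return True when both files have the same MD5 digest."""
    return image_md5(path_a) == image_md5(path_b)
```

For example, `images_match("before.raw", "after.raw")` (hypothetical file names) would compare a copy of the image taken before the test with the image after migration -> shutdown -> re-start.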
Please check guest event log for kernel crash entries.

(In reply to comment #11)
> Please check guest event log for kernel crash entries.

Hi Yan,

Please refer to the attachment "ErrorLogFromKernel.txt" for details. I collected the Kernel-General event logs from the Event Viewer; if that is not what you need, please let me know.

Thanks!
Best Regards,
Dawn

Created attachment 582019 [details]
ErrorLogFromKernel.txt for win2k8-32
This looks like an IDE bug. We haven't been able to reproduce it yet.
Could you:
- try to reproduce from libvirt (although all options look right)
- take screenshots of the IDE controller properties in the migration destination after each migration, reboot, and only attach them when they get a disk check.

Just to be sure if we find any pattern there.

Thanks, Juan.

(In reply to comment #14)
> This looks like an IDE bug. We haven't been able to reproduce it yet.
> Could you:
> - try to reproduce from libvirt (although all options look right)

Hi Juan,

I have tried on win2k8-32 6 times and didn't hit this issue with libvirt.

> - take screenshots of the IDE controller properties in the
> migration destination after each migration, reboot, and only attach them
> when they get a disk check.

I'd like to confirm three points about the IDE controller properties:
1. Does this refer to "intel(R) 82371SB PCI Bus Master IDE Controller" under Device Manager -> IDE ATA/ATAPI controllers -> intel(R) 82371SB PCI Bus Master IDE Controller?
2. If so, what should I focus on? Info from all tabs ("General" / "Driver" / "Details" / "Resources")? If so, I'll take a screenshot of each tab one by one. The "Details" tab has many options, so if needed, could you tell me which options you want so that I can capture the corresponding info for you?
3. You said "reboot, and only attach them when they get a disk check." You mean to take the screenshots when a disk check happens, as in the screenshot "win2k8-32-diskchecking-1", right?

Please refer to the attachment "IDE_properties.JPG" for details.

Thanks!
Best Regards,
Dawn

>
> Just to be sure if we find any pattern there.
>
> Thanks, Juan.

Could you test using virtio block and networking and see if the problem goes away?
Suspicion is that the problem is in the IDE code, but that would help confirm it.

(In reply to comment #16)
> Could you test using virtio block and networking and see if the problem goes
> away?
> Suspicion is that the problem is in the IDE code, but that would help
> confirm it.

Hi Juan,

This issue still happened when using virtio block and networking.
It is not easy to reproduce: sometimes the first migration hits it, and sometimes it only shows up after many loops of ping-pong migration.

Best Regards,
Dawn

From the event viewer file:

"{Registry Hive Recovered} Registry hive (file): '\SystemRoot\System32\Config\SOFTWARE' was corrupted and it has been recovered. Some data might have been lost."

So Windows performed chkdsk because it encountered file system corruption.

Can you describe the details of the shared storage setup?

(In reply to comment #18)
> From the event viewer file:
>
> "{Registry Hive Recovered} Registry hive (file):
> '\SystemRoot\System32\Config\SOFTWARE' was corrupted and it has been
> recovered. Some data might have been lost."
>
> So Windows performed chkdsk because it encountered file system corruption.
>
> Can you describe the details of the shared storage setup?

Hi Marcelo,

Shared storage setup steps:

on shared host hostC:
1. vi /etc/exports
   /home *(rw,no_root_squash)
2. service nfs start

on test hosts hostA and hostB:
3. mount hostA:/home /mnt on hostA
   mount hostB:/home /mnt on hostB

Best Regards,
Dawn

I guess you mean:

mount hostc:/home /mnt

on both hosts, right?

(In reply to comment #20)
> I guess you mean:
>
> mount hostc:/home /mnt
>
> on both hosts, right?

Juan,

Sorry for my typing mistakes, you are right.

Best Regards,
Dawn

Vadim,
Can you reproduce this and help figure out what's happening from within Windows?

writethrough cache option is not valid for migration.

Their examples show cache=none on all the disks. My suspicion was that the balloon driver was also doing something strange, but I don't know either :-( The bug is as strange as it can be.

Does it mean that the problem is not reproducible if the balloon was deflated before migration?

Hi Qian,

Can you run a test and update the testing result?

Best Regards,
Junyi

(In reply to juzhang from comment #34)
> Hi Qian,
>
> Can you run a test and update the testing result?
>
> Best Regards,
> Junyi

Tested this bug according to comment #6 on RHEL7 hosts with a Windows 2008 32-bit guest. Tested 10 times; it cannot be reproduced.

Components:
# rpm -q qemu-kvm
qemu-kvm-1.5.3-60.el7.x86_64
# uname -r
3.10.0-121.el7.x86_64

For nfs server:
# cat /etc/exports
/home/ *(rw,no_root_squash)

cli:
# /usr/libexec/qemu-kvm -m 4G -smp 4 -cpu Penryn -usb -device usb-tablet \
    -drive file=/mnt/win2008-32.qcow2,format=qcow2,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,cache=none \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
    -netdev tap,id=hostnet0,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown \
    -device e1000,netdev=hostnet0,mac=00:10:16:23:78:01,bus=pci.0,addr=0x4 \
    -uuid a65b5920-b410-4606-8c4a-eb2eacb58f96 -rtc base=localtime -no-kvm-pit-reinjection \
    -monitor stdio -name m2008 -vnc :10 -vga std -bios /usr/share/seabios/bios.bin -boot menu=on

Steps:
migration -> shutdown -> re-start

Tested 10 times; it cannot be reproduced.

Thanks,

I do not see us doing anything with this BZ. It does not reproduce well, and even less so on RHEL7. I will close it, and we can reopen once we have a reproducer.
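
The thread points out that both command lines use cache=none on every disk and that writethrough is not considered valid for migration. For anyone re-running the test, that property can be checked mechanically before migrating. The sketch below is an assumption-laden illustration, not a QEMU tool: `drive_cache_modes` and `drives_not_cache_none` are hypothetical helper names, and the parser only handles simple `key=value` lists (it does not handle QEMU's `,,` escaping inside option values).

```python
import shlex


def drive_cache_modes(command_line):
    """Map each -drive id to its cache= value (None when cache= is absent)."""
    args = shlex.split(command_line)
    modes = {}
    for index, arg in enumerate(args):
        if arg != "-drive" or index + 1 >= len(args):
            continue
        # Turn "key=value,key=value" into a dict; a bare flag maps to "".
        options = dict(
            part.partition("=")[::2] for part in args[index + 1].split(",")
        )
        # Fall back to a positional name when the drive has no id= option.
        name = options.get("id", "drive%d" % len(modes))
        modes[name] = options.get("cache")
    return modes


def drives_not_cache_none(command_line):
    """Return the sorted ids of drives whose cache mode is not exactly 'none'."""
    modes = drive_cache_modes(command_line)
    return sorted(name for name, mode in modes.items() if mode != "none")
```

An empty list from `drives_not_cache_none` means every `-drive` on the command line explicitly sets cache=none; a drive with no cache= option is reported as well, because its effective mode then depends on the QEMU default.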