| Summary: | qcow2 image corrupted after ping-pong live migration while scp file from host to guest | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Mike Cao <bcao> |
| Component: | qemu-kvm | Assignee: | Juan Quintela <quintela> |
| Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.1 | CC: | bcao, chayang, chellwig, gcosta, Jes.Sorensen, khong, kwolf, michen, mkenneth, mshao, tburke, virt-maint |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-04-19 08:40:17 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |
Juan, do we close all images or at least bdrv_flush them before migrating? I think we only call qemu_aio_flush, which is not enough.

Hit same issue with windows 2008r2 guest.

We do a bdrv_flush(), so we should be good here.

Can QE help analyze this case?
We probably have a block IO issue while migrating. So please don't do networking, just block IO and live migration in a loop without guest reboot and report what happens.

What Ethernet hardware is in that host?

If it is rtl8169 based, could you please try and disable hw checksumming support? I have at least one system here, where hw csum on the rtl8169 is bad and corrupts NFS if I do not disable it.

    /sbin/ethtool -K eth0 rx off

If it is not rtl8169, please ignore this comment.

Jes

Mike,

Can you please confirm or not, if this problem was seen on systems with rtl8169 hardware, or if it has been seen on other systems as well?

Thanks,
Jes

(In reply to comment #7)
> Mike,
>
> Can you please confirm or not, if this problem was seen on systems with
> rtl8169 hardware, or if it has been seen on other systems as well?
>
> Thanks,
> Jes

I found this bug on a host with rtl8169 hardware. Will try on other systems with e1000e. Since one of the hosts with rtl8169 hardware is broken now, I will do local live migration instead to try to reproduce it.

Mike,

Any chance you can get access to a set of machines with rtl8169, and try to reproduce it the 'old way'? Once that is done, try and disable the checksums as described above and see if the problem goes away?

That would be the ideal test to determine if this is rtl8169 related.

Thanks,
Jes

I have reproduced it once. Whole host networking died. Trying to reproduce it with normal console & serial console to see what is happening.

Hit same issue with steps in comment #0 on AMD host.
CLI:

    /usr/libexec/qemu-kvm -M rhel6.1.0 -enable-kvm -m 4096 -smp 4 -name rhel5.6-32 -uuid `uuidgen` -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -boot dc -drive file=/dev/chayang/rhel5.6-32,if=none,id=drive-virtio0-0-0,media=disk,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-virtio0-0-0,id=virt0-0-0 -netdev tap,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:40:81:11:53 -usb -device usb-tablet,id=input1 -vnc :0 -monitor stdio -balloon none

    # qemu-img check /dev/chayang/rhel5.6-32
    ERROR OFLAG_COPIED: offset=80000000c1400000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1410000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1420000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1430000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1440000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1450000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1460000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1470000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1480000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c1490000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000c14a0000 refcount=0
    ...
    ERROR OFLAG_COPIED: offset=80000000d6f20000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000d6f30000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000d6f40000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000d6f50000 refcount=0
    ERROR OFLAG_COPIED: offset=80000000d6ef0000 refcount=0
    ERROR cluster 49472 refcount=0 reference=1
    ERROR cluster 49473 refcount=0 reference=1
    ERROR cluster 49474 refcount=0 reference=1
    ERROR cluster 49475 refcount=0 reference=1
    ERROR cluster 49476 refcount=0 reference=1
    ERROR cluster 49477 refcount=0 reference=1
    ERROR cluster 49478 refcount=0 reference=1
    ERROR cluster 49479 refcount=0 reference=1
    ERROR cluster 49480 refcount=0 reference=1
    ...
    11116 errors were found on the image.
    Data may be corrupted, or further writes to the image may corrupt it.

Additional info:

    # brctl show
    bridge name     bridge id               STP enabled     interfaces
    switch          8000.0024217fb7f9       no              eth0
                                                            tap0

    # ethtool -i eth0
    driver: tg3
    version: 3.113
    firmware-version: 5754-v3.26
    bus-info: 0000:3f:00.0

    # lspci -vvv -s 3f:00.0
    3f:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express (rev 02)
        Subsystem: Hewlett-Packard Company Device 3029
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 25
        Region 0: Memory at f0200000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at <ignored> [disabled]
        Capabilities: [48] Power Management version 3
            Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
            Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
            Product Name: Broadcom NetLink Gigabit Ethernet Controller
            Read-only fields:
                [PN] Part number: BCM95754
                [EC] Engineering changes: 106679-15
                [SN] Serial number: 0123456789
                [MN] Manufacture ID: 31 34 65 34
                [RV] Reserved: checksum good, 30 byte(s) reserved
            Read/write fields:
                [YA] Asset tag: XYZ01234567
                [RW] Read-write area: 107 byte(s) free
            End
        Capabilities: [58] Vendor Specific Information <?>
        Capabilities: [e8] MSI: Enable+ Count=1/1 Maskable- 64bit+
            Address: 00000000fee0f00c Data: 4181
        Capabilities: [d0] Express (v1) Endpoint, MSI 00
            DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
            DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                MaxPayload 128 bytes, MaxReadReq 4096 bytes
            DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
            LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Latency L0 <4us, L1 <64us
                ClockPM- Surprise- LLActRep- BwNot-
            LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
            LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100] Advanced Error Reporting
            UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
            UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
            UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
            CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
            CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
            AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [13c] Virtual Channel <?>
        Capabilities: [160] Device Serial Number 00-24-21-ff-fe-7f-b7-f9
        Capabilities: [16c] Power Budgeting <?>
        Kernel driver in use: tg3
        Kernel modules: tg3

Hi,

I notice that so far we have only seen this bug when using iSCSI for storage.

We need to try and narrow down further what causes this bug - ie. is it networking, is it iSCSI, or is it something else.

1) As discussed per irc earlier, could you try and reproduce the problem using NFS backed storage?
2) Could you also try to reproduce it using LVM storage and migration on just one host?
3) Could you try to reproduce it without guest networking, ie. using file copy inside the guest instead of scp, but on iSCSI. If it fails on iSCSI, try NFS as well, and last on LVM as in 2).

Thanks,
Jes

(In reply to comment #5)
> Can QE help analyze this case?
> We probably have a block IO issue while migrating. So please don't do
> networking, just block IO and live migration in a loop without guest reboot and
> report what happens.

Hi dor,

Have tested iscsi as storage to do ping-pong migration for about 10 times, launched 3 iozone -a processes in guest instead of scp files from host to guest. After ping-pong migration, I checked block image, the following is what I got in src host and dst host.
BTW, when boot again the block image, guest fails to launch X windows (please take a look at screenshot).

Before migration: checked the block image both in src and dst host, no errors were found on the image.

After 10 times migration:

src host:

    # qemu-img check /dev/chayang/rhel5.6-21-bac
    ERROR OFLAG_COPIED: offset=800000020e520000 refcount=0
    ERROR OFLAG_COPIED: offset=800000020e530000 refcount=0
    ERROR OFLAG_COPIED: offset=800000020e570000 refcount=0
    ERROR OFLAG_COPIED: offset=800000020e580000 refcount=0
    ...
    ERROR OFLAG_COPIED: offset=8000000210910000 refcount=0
    ERROR OFLAG_COPIED: offset=8000000210920000 refcount=0
    ERROR OFLAG_COPIED: offset=8000000210930000 refcount=0
    ERROR OFLAG_COPIED: offset=8000000210940000 refcount=0
    ERROR cluster 118364 refcount=1 reference=2
    ERROR cluster 118365 refcount=1 reference=2
    ERROR cluster 118366 refcount=1 reference=2
    ERROR cluster 118367 refcount=1 reference=2
    ERROR cluster 118368 refcount=1 reference=2
    ERROR cluster 118369 refcount=1 reference=2
    ERROR cluster 118370 refcount=1 reference=2
    ...
    ERROR cluster 135315 refcount=0 reference=1
    ERROR cluster 135316 refcount=0 reference=1
    1262 errors were found on the image.
    Data may be corrupted, or further writes to the image may corrupt it.

dst host:

    # qemu-img check /dev/chayang/rhel5.6-21-bac
    ERROR OFLAG_COPIED: offset=800000020db70000 refcount=0
    ERROR OFLAG_COPIED: offset=800000020db80000 refcount=0
    ERROR OFLAG_COPIED: offset=800000020db90000 refcount=0
    ERROR OFLAG_COPIED: offset=800000020dba0000 refcount=0
    ...
    ERROR OFLAG_COPIED: offset=8000000210940000 refcount=0
    ERROR OFLAG_COPIED: offset=800000020e400000 refcount=0
    ERROR cluster 114205 refcount=1 reference=2
    ERROR cluster 114206 refcount=1 reference=2
    ERROR cluster 114207 refcount=1 reference=2
    ERROR cluster 114208 refcount=1 reference=2
    ...
    ERROR cluster 135315 refcount=0 reference=1
    ERROR cluster 135316 refcount=0 reference=1
    2313 errors were found on the image.
    Data may be corrupted, or further writes to the image may corrupt it.

Created attachment 492084 [details]
see picture name
Created attachment 492100 [details]
pci info on the host
Thanks. The next stage is to see whether this happens w/o qcow2 by using a raw image instead.

(In reply to comment #7)
> Mike,
>
> Can you please confirm or not, if this problem was seen on systems with
> rtl8169 hardware, or if it has been seen on other systems as well?
>
> Thanks,
> Jes

(In reply to comment #12)
> 2) Could you also try to reproduce it using LVM storage and migration
> on just one host?
>

Tried on this machine with the following steps:

1. start kvm while image is lvm (qcow2 format)
2. scp file from host to guest in a loop
3. do ping-pong live migration
4. shutdown VM, then run #qemu-img check

Actual Results:
No errors found by #qemu-img check. Cannot reproduce this issue.

Additional info:
chayang tried on the AMD host (for host info refer to comment #15), could not reproduce it either.

Hi,

So we are still down to iSCSI and QCOW2 - what type of iSCSI server are you using?

Thanks,
Jes

(In reply to comment #18)
> Hi,
>
> So we are still down to iSCSI and QCOW2 - what type of iSCSI server
> are you using?
>
> Thanks,
> Jes

We made a local disk partition the iSCSI target, then used scsi-target-utils to configure it as follows:

    setenforce 0
    iptables -F
    tgtadm --lld iscsi --mode target --op new --tid 1 --target iqn.mike.com:s1
    tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 --backing-store /dev/sdb1
    tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address ALL

Interesting!

Just to clarify, so if you use iSCSI on the host, and run QCOW2, then do the ping pong on the same host, you can reproduce the problem? If you use LVM instead, you cannot reproduce it?

Thanks,
Jes

(In reply to comment #12)
> Hi,
>
> I notice that so far we have only seen this bug when using iSCSI
> for storage.
>
> We need to try and narrow down further what causes this bug - ie.
> is it networking, is it iSCSI, or is it something else.
>
> 1) As discussed per irc earlier, could you try and reproduce the problem
> using NFS backed storage?

Hi Jes,

Tried to reproduce the problem using NFS backed storage, scp files from host to guest. After ping-pong 10 times on two hosts, checked the qcow2 image:

    # qemu-img check /mnt/RHEL-Server-5.6-32.qcow2
    No errors were found on the image.

> 2) Could you also try to reproduce it using LVM storage and migration
> on just one host?

Please refer to comment #17

> 3) Could you try to reproduce it without guest networking, ie. using
> file copy inside the guest instead of scp, but on iSCSI. If it fails

Please refer to comment #13

> on iSCSI, try NFS as well, and last on LVM as in 2).

Running 3 iozone processes in guest, after ping-pong 10 times, checked the image:

    # qemu-img check /mnt/RHEL-Server-5.6-32.qcow2
    No errors were found on the image.

> Thanks,
> Jes

CLI I am using for test:

    /usr/libexec/qemu-kvm -M rhel6.1.0 -enable-kvm -m 4096 -smp 4 -name rhel5.6-32 -uuid `uuidgen` -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -boot c -drive file=/mnt/RHEL-Server-5.6-32.qcow2,if=none,id=drive-virtio0-0-0,media=disk,format=qcow2,cache=none -device virtio-blk-pci,drive=drive-virtio0-0-0,id=virt0-0-0 -netdev tap,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:40:81:11:53 -usb -device usb-tablet,id=input1 -vnc :0 -monitor stdio -balloon none -incoming tcp:0:7000

(In reply to comment #20)
> Interesting!
>
> Just to clarify, so if you use iSCSI on the host, and run QCOW2, then do the
> ping pong on the same host, you can reproduce the problem? If you use LVM
> instead, you cannot reproduce it?
>
> Thanks,
> Jes

Hi Jes,

I tried iscsi+qcow2 in one host, scp files from host to guest, ping-pong 5 times, then checked the image:

    # qemu-img check /dev/chayang/test-ping-pong
    No errors were found on the image.

Chayang,

How many times did you have to ping pong before you saw the error? Was 5 ping pongs enough?

Thanks,
Jes

Hi,

Kevin and I were chatting about this on irc, and we are not quite sure. When you export the disk locally via iSCSI, how is it accessed? Ie. is your configuration:

1) /dev/sdX on host A is exported as an iSCSI target to host B, accessed directly on host A, but over iSCSI from host B.
2) /dev/sdX on host A is exported as an iSCSI target to both host A and host B, and imported as an iSCSI device on both host A and B?

Thanks,
Jes

(In reply to comment #24)
> Hi,
> Kevin and I were chatting about this on irc, and we are not quite sure.
> When you export the disk locally via iSCSI, how is it accessed. Ie. is your
> configuration:
> 1) /dev/sdX on host A is exported as an iSCSI target to host B, accessed
> directly on host A, but over iSCSI from host B.
> 2) /dev/sdX on host A is exported as an iSCSI target to both host A and host
> B, and imported as an iSCSI device on both host A and B?
> Thanks,
> Jes

Hi, Jes,

Seems to be the 1st one, because we cannot use iscsi-initiator to connect to hostA (scsi-target-utils) itself. If we do so, it will report a dup vg error.

Following are my steps.

1. configure iscsi target on hostA:

    #tgtadm --lld iscsi --mode target --op new --tid 1 --target iqn.mike.com:s1
    #tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 --backing-store /dev/sdb1
    #tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address ALL

2. configure iscsi initiator on hostB:

    #iscsiadm -m node -T iqn.mike.com:s1 -p [host A] -l

3. on hostA (or hostB):

    #pvcreate /dev/sdb1
    #vgcreate vgtest /dev/sdb1
    #lvcreate -L 20G -n RHEL5u6 vgtest
    #qemu-img create -f qcow2 /dev/vgtest/RHEL5u6

4. on hostB (if step 3 used hostB, this step uses hostA instead):

    #vgscan
    #lvscan
    #lvchange -ay /dev/vgtest/RHEL5u6

5. on host A:

    <commandLine> -drive file=/dev/vgtest/RHEL5u6

   on host B:

    <commandLine> -drive file=/dev/vgtest/RHEL5u6 -incoming tcp:0:5888

Hi Mike,

Thank you for the explanation. Given that you are using 1) we should focus on two things: See if we can reproduce it on raw as Dor requested in #16 and also see if we can reproduce it against an external iSCSI server.

Thanks,
Jes

Note that using buffered I/O with shared storage will cause exactly this kind of corruption. From the list of tools above it looks like you use the userspace "tgt" iSCSI target. What does the configuration for the affected LUN look like? If it does not include the "direct" argument it will use buffered I/O and thus cause corruption like this.

(In reply to comment #27)
> Note that using buffered I/O with shared storage will cause exactly this kind
> of corruption. From the list of tools above it looks like you use the
> userspace "tgt" iSCSI target. What does the configuration for the affected LUN
> look like? If it does not include the "direct" argument it will use buffered
> I/O and thus cause corruption like this.

For the configuration, please refer to comment #25.

Based on Christoph's findings, it looks like the default is for tgtd to use buffered IO on the iSCSI target, which could explain this corruption.

Please try running the tests specifying --bsoflags=direct for the exported devices and see if it makes a difference to the corruption.

Thanks,
Jes

Using the following CLI to export a target, cannot reproduce Bug 681472 after ping-pong 10 times with qcow2 image.

    tgtadm --lld iscsi --mode target --op new --tid 1 --targetname iqn.chayang.com:test
    tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 --backing-store /dev/sda5 --bsoflags=direct
    tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address ALL

Tried 2 more times with --bsoflags=direct (referring to comment #30).

Steps:

1. in the guest: #for((;;)) do iozone -a ;done
2. on the host: scp file to guest in a loop
3. ping-pong live migration 10 times

Actual Results:
Tried 2 times, could not reproduce.

This is excellent news!

It looks like this was an iSCSI configuration issue in the end, rather than a QEMU bug.

Mike, please make sure to document this within QE so everybody knows to use --bsoflags=direct when using iSCSI locally like this.

Closing

Jes

It is actually good practice for the future to refrain from doing this and to use a completely different server for the shared storage. Even if the above is used, there are always other things that might get in our way. This bug took a lot of resources from us all.

Jes/Juan/Kevin/Christoph and QE, thanks for your efforts closing the bug.

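The export that stopped the corruption in this thread (a tgt LUN created with --bsoflags=direct) can be wrapped in a small helper. This is only a sketch under stated assumptions: the function name `tgt_export_direct` is hypothetical, the three tgtadm invocations are the ones quoted from the comment that reported the fix, and the helper only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch: print the tgtadm commands that export a block device as an iSCSI
# LUN with direct (unbuffered) I/O, as used in this report.
# tgt_export_direct is a hypothetical helper name, not part of tgt.
tgt_export_direct() {
    tid=$1   # target id, e.g. 1
    iqn=$2   # target name, e.g. iqn.chayang.com:test
    dev=$3   # backing block device, e.g. /dev/sda5

    echo "tgtadm --lld iscsi --mode target --op new --tid $tid --targetname $iqn"
    # --bsoflags=direct is the option that avoided the corruption here
    echo "tgtadm --lld iscsi --mode logicalunit --op new --tid $tid --lun 1 --backing-store $dev --bsoflags=direct"
    echo "tgtadm --lld iscsi --mode target --op bind --tid $tid --initiator-address ALL"
}

# Usage: review the output first, then pipe it to a root shell, e.g.
#   tgt_export_direct 1 iqn.chayang.com:test /dev/sda5 | sh
```

Printing rather than executing keeps the helper safe to run on a machine without tgt installed and makes the difference from the original, buffered export easy to see.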
Description of problem:

Version-Release number of selected component (if applicable):

    # uname -r
    2.6.32-118.el6.x86_64
    # rpm -q qemu-kvm
    qemu-kvm-0.12.1.2-2.148.el6.x86_64

How reproducible:
only 1 time

Steps to Reproduce:

1. start a VM (rhel6 guest) in src host, eg:

    /usr/libexec/qemu-kvm -enable-kvm -m 4G -smp 4 -name rhel6U1 -uuid adcbfb49-3411-1701-3c36-6bdbc00bedb9 -rtc base=utc,clock=host,driftfix=slew -boot c -drive file=/dev/s2/share,if=none,id=mike_d1,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=mike_d1,id=mike_d1 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=a2:54:50:a4:c2:c1 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -device usb-tablet,id=input0 -vnc :2 -device virtio-balloon-pci,id=ballooning -monitor stdio

2. start a listening port

3. scp file from host to guest:

    #ssh-copy-id <guest ip>
    #for ((i=1;i<=10000;i++)); do scp -l 102400 /tt1 <guest-ip>:/; ssh <guest-ip> "rm -rf /tt1"; echo $i+"times completed"; done

4. do live migration between 2 hosts back & forth.

Actual results:

After several ping-pong migrations, the guest responds very slowly to the host and the scp process stalls while transferring the file to the guest, although the guest's network is fine:

    #tt1 60% 904MB 0.0KB/s - stalled

Then reboot the guest:

    #(qemu)system_reset

The image is corrupted:

    #qemu-img check /dev/s2/share
    ...
    ERROR cluster 134730 refcount=1 reference=2
    ERROR cluster 134731 refcount=1 reference=2
    260 errors were found on the image.
    Data may be corrupted, or further writes to the image may corrupt it.

Expected results:

Additional info:

1. only occurred 1 time, cannot reproduce with other qcow2 images
2. The corrupted image was installed this week; only some stress tools have ever been run on it.
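When comparing images across migration rounds, it can help to reduce the long `qemu-img check` listings to per-category counts. The following is a minimal sketch, assuming the two message formats quoted in this report (`ERROR OFLAG_COPIED: ...` and `ERROR cluster N refcount=A reference=B`); the function name `summarize_check` is hypothetical, and other qemu-img versions may word these messages differently.

```shell
#!/bin/sh
# Sketch: count the two error categories seen in this report from
# `qemu-img check` output read on stdin.
# summarize_check is a hypothetical helper name.
summarize_check() {
    awk '
        /^ERROR OFLAG_COPIED:/ { oflag++ }      # L2 entry marked copied but refcount != 1
        /^ERROR cluster /      { cluster++ }    # stored refcount differs from computed one
        END {
            printf "oflag_copied=%d refcount_mismatch=%d\n", oflag, cluster
        }
    '
}

# Usage (image path taken from this report; stderr is merged in case the
# errors are written there):
#   qemu-img check /dev/s2/share 2>&1 | summarize_check
```

Running it on the source and destination hosts after each round gives two short lines to compare instead of thousands of ERROR entries.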