Bug 1185172
| Field | Value |
|---|---|
| Summary | The blockcopy command hangs during the mirror phase with a raw disk |
| Product | Red Hat Enterprise Linux 7 |
| Component | qemu-kvm-rhev |
| Version | 7.1 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Target Milestone | rc |
| Fixed In Version | qemu-kvm-rhev-2.9.0-1.el7 |
| Last Closed | 2017-08-01 23:27:12 UTC |
| Type | Bug |
| Reporter | zhenfeng wang <zhwang> |
| Assignee | Jeff Cody <jcody> |
| QA Contact | Qianqian Zhu <qizhu> |
| CC | coli, dyuan, gsun, hhuang, huding, jdenemar, juzhang, knoel, lhuang, michen, mrezanin, mzhan, qizhu, rbalakri, virt-bugs, virt-maint, yanyang |
Description
zhenfeng wang
2015-01-23 06:39:40 UTC
A QEMU colleague tried this with the pure qemu command line and did not hit the issue. The steps below came from them, run with qemu-kvm-rhev-2.1.2-20.el7.x86_64. Hope this info is helpful for you.

/usr/libexec/qemu-kvm -enable-kvm -M pc-i440fx-rhel7.0.0 -smp 4 -m 4G -name rhel6.3-64 -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 -drive file=/home/RHEL-Server-7.1-64-virtio.raw,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=raw,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0,bootindex=1 -boot order=cd -monitor stdio -readconfig nfs/ich9-ehci-uhci.cfg -device usb-tablet,id=input0 -chardev socket,id=s1,path=/tmp/s1,server,nowait -device isa-serial,chardev=s1 -monitor tcp::1235,server,nowait -vga qxl -global qxl-vga.revision=3 -spice port=5930,disable-ticketing -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -vnc :10 -qmp tcp:0:5555,server,nowait -sandbox on -cpu host -netdev tap,script=/etc/qemu-ifup,id=netdev0 -device virtio-net-pci,netdev=netdev0,id=device-net0,mac=aa:54:00:11:22:33

{ 'execute' : 'qmp_capabilities' }
{"return": {}}
{"timestamp": {"seconds": 1421989798, "microseconds": 674653}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "device-net0", "path": "/machine/peripheral/device-net0/virtio-backend"}}
{ "execute": "drive-mirror", "arguments": { "device": "drive-virtio-disk0", "target": "/home/sn1", "format": "qcow2", "mode": "absolute-paths", "sync": "true", "speed": 1000000000, "on-source-error": "stop", "on-target-error": "stop" } }
{"error": {"class": "GenericError", "desc": "Invalid parameter 'true'"}}
{ "execute": "drive-mirror", "arguments": { "device": "drive-virtio-disk0", "target": "/home/sn1", "format": "qcow2", "mode": "absolute-paths", "sync": "full", "speed": 1000000000, "on-source-error": "stop", "on-target-error": "stop" } }
{"return": {}}
{"timestamp": {"seconds": 1421990163, "microseconds": 676036}, "event": "VNC_CONNECTED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5910", "host": "0.0.0.0"}, "client": {"family": "ipv4", "service": "39823", "host": "10.66.115.154"}}}
{"timestamp": {"seconds": 1421990163, "microseconds": 748665}, "event": "VNC_INITIALIZED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5910", "host": "0.0.0.0"}, "client": {"family": "ipv4", "service": "39823", "host": "10.66.115.154"}}}
{"timestamp": {"seconds": 1421990391, "microseconds": 198984}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-virtio-disk0", "len": 21474836480, "offset": 21474836480, "speed": 1000000000, "type": "mirror"}}
{"execute": "block-job-complete", "arguments": { "device": "drive-virtio-disk0"} }
{"return": {}}
{"timestamp": {"seconds": 1421991719, "microseconds": 70199}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 21474836480, "offset": 21474836480, "speed": 1000000000, "type": "mirror"}}
{"timestamp": {"seconds": 1421991754, "microseconds": 416189}, "event": "RESET"}
{"timestamp": {"seconds": 1421991754, "microseconds": 455172}, "event": "RESET"}
{"timestamp": {"seconds": 1421991776, "microseconds": 303532}, "event": "SHUTDOWN"}
{"timestamp": {"seconds": 1421991776, "microseconds": 325294}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "ide1-cd0", "tray-open": true}}
{"timestamp": {"seconds": 1421991776, "microseconds": 325447}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "floppy0", "tray-open": true}}
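The drive-mirror workflow above (start the mirror, wait for BLOCK_JOB_READY, then pivot with block-job-complete) can also be driven from a small script. The following is a minimal sketch, not part of the original report: host/port match the -qmp tcp:0:5555 option above, and the device and target names are only examples.

```python
#!/usr/bin/env python3
# Minimal QMP client sketch for the drive-mirror -> BLOCK_JOB_READY ->
# block-job-complete sequence shown above.  Host, port, device and target
# are examples only; adjust them to match the actual -qmp option and disk.
import json
import socket

HOST, PORT = "localhost", 5555        # e.g. -qmp tcp:0:5555,server,nowait
DEVICE = "drive-virtio-disk0"
TARGET = "/home/sn1"                  # example mirror target

sock = socket.create_connection((HOST, PORT))
qmp = sock.makefile("rw", encoding="utf-8")

def send(cmd):
    qmp.write(json.dumps(cmd) + "\n")
    qmp.flush()

def recv():
    return json.loads(qmp.readline())

recv()                                # QMP greeting banner
send({"execute": "qmp_capabilities"})
recv()                                # {"return": {}}

# Start the mirror; note that "sync" must be "full"/"top"/"none" -- the
# string "true" is rejected, as seen in the transcript above.
send({"execute": "drive-mirror",
      "arguments": {"device": DEVICE, "target": TARGET,
                    "format": "qcow2", "mode": "absolute-paths",
                    "sync": "full"}})

# Read replies and asynchronous events until the job is ready to pivot.
while True:
    if recv().get("event") == "BLOCK_JOB_READY":
        break

# Pivot the guest onto the mirror target and wait for the completion event.
send({"execute": "block-job-complete", "arguments": {"device": DEVICE}})
while True:
    if recv().get("event") == "BLOCK_JOB_COMPLETED":
        break
```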
Hi, I tested the qemu side and it does have the problem. I have filed a bug: Bug 1185178 - QMP drive_mirror command with "format":"raw" hangs.

Could you provide the output of "virsh domcontrol $DOMAIN" while qemu hangs (i.e., when virsh blockjob times out)?

(In reply to Shaolong Hu from comment #3)
> Hi,
> I test qemu side, does have problem, has file a bug:
> Bug 1185178 - QMP drive_mirror command with "format":"raw" hang

A negative result: so far there is no evidence of a QEMU problem.

Terminal 1:
# virsh blockcopy rhel7f vda /var/lib/libvirt/images/f.img --wait --verbose    ==> hangs here

Terminal 2:
# virsh domcontrol rhel7f
occupied (3.300s)
# virsh blockjob rhel7f vda
error: Timed out during operation: cannot acquire state change lock
# virsh domcontrol rhel7f
occupied (40.461s)

This means libvirt is stuck waiting for a response from QEMU. Can you attach libvirtd debug logs so that we can look at what commands we sent to QEMU and which one caused it to hang?
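The same "virsh domcontrol" and "virsh blockjob" diagnostics can be run through the libvirt Python bindings. A minimal sketch, assuming python3-libvirt is installed; controlInfo() and blockJobInfo() wrap virDomainGetControlInfo and virDomainGetBlockJobInfo, and the exact return-value details may vary with the libvirt version:

```python
#!/usr/bin/env python3
# Sketch of the same diagnostics as "virsh domcontrol" and "virsh blockjob",
# using the libvirt Python bindings.  Domain and disk names are the ones
# from the report; adjust for your own setup.
import libvirt

DOMAIN = "rhel7f"
DISK = "vda"

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName(DOMAIN)

# Monitor/control interface state, as reported by "virsh domcontrol":
# 0 = ok, 1 = job, 2 = occupied, 3 = error; stateTime is in milliseconds.
state, details, state_time = dom.controlInfo()
print("control state: %d (for %d ms)" % (state, state_time))

# Block job progress on the disk ("virsh blockjob"); an empty result means
# no job is currently running.
info = dom.blockJobInfo(DISK, 0)
if info:
    end = info.get("end", 0)
    pct = 100.0 * info.get("cur", 0) / end if end else 0.0
    print("block copy: %.0f%% (cur=%s end=%s)" % (pct, info.get("cur"), end))
else:
    print("no active block job on", DISK)
```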
Created attachment 983230 [details]
the log about libvirt
Created attachment 983231 [details]
the qemu log
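The attached libvirtd debug log can be scanned for the last command written to the QEMU monitor by looking for QEMU_MONITOR_IO_WRITE entries, which is how the finding in the next comment was obtained. A minimal sketch; the log path is a hypothetical example, point it at the attached log:

```python
#!/usr/bin/env python3
# Sketch: find the last QMP command libvirtd wrote to the monitor by
# scanning a libvirtd debug log for QEMU_MONITOR_IO_WRITE entries.
import re

LOG = "/var/log/libvirt/libvirtd.log"   # hypothetical path to the debug log

last = None
with open(LOG, errors="replace") as fh:
    for line in fh:
        if "QEMU_MONITOR_IO_WRITE" in line:
            last = line.rstrip()

if last:
    # Pull out the JSON payload after "buf=" for readability, if present.
    m = re.search(r"buf=\s*(\{.*)", last)
    print(m.group(1) if m else last)
else:
    print("no monitor writes found in", LOG)
```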
So the last QMP command libvirt sent to QEMU is:

2015-01-23 09:13:34.199+0000: 14530: debug : qemuMonitorIOWrite:507 : QEMU_MONITOR_IO_WRITE: mon=0x7fba8c00d750 buf= { "execute":"drive-mirror", "arguments":{ "device":"drive-virtio-disk0", "target":"/var/lib/libvirt/images/f1.img", "speed":0, "sync":"full", "mode":"absolute-paths", "format":"raw" }, "id":"libvirt-96" } len=195 ret=195 errno=0

and we got no answer from QEMU for 4 minutes. The only thing in the qemu log is:

Formatting '/var/lib/libvirt/images/f1.img', fmt=raw size=2147483648

Btw, this bug cannot be reproduced with the qemu-kvm QMP command line directly so far, no matter whether the target format is raw or qcow2.

Hi Jeff,

With the latest qemu-kvm-rhev-2.3.0-24.el7.x86_64, the virsh blockcopy command no longer hangs if we execute it alone. However, the blockcopy command quits automatically if we run blockcopy in one terminal and run blockjob several times in another terminal. The copy job itself keeps running after the blockcopy command quits, but it executes very slowly; I waited for more than 20 minutes and it still did not finish.

pkginfo:
libvirt-1.2.17-9.el7.x86_64
kernel-3.10.0-316.el7.x86_64
qemu-kvm-rhev-2.3.0-24.el7.x86_64

Steps:
1. Create a guest:
# virsh create rhel7raw.xml
Domain rhel7raw created from rhel7raw.xml

2. In the first terminal, start a blockcopy once the guest has fully started:
# time virsh blockcopy rhel7raw vda /var/lib/libvirt/images/f.img --wait --verbose

3. In a second terminal, repeatedly check the block job info; it sometimes reports the following error:
# virsh blockjob rhel7raw vda
error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainGetBlockJobInfo)
# time virsh blockjob rhel7raw vda
error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainGetBlockJobInfo)
real 0m30.010s
user 0m0.003s
sys 0m0.008s
# time virsh blockjob rhel7raw vda
Block Copy: [ 4 %]
real 0m9.508s
user 0m0.007s
sys 0m0.003s
# time virsh blockjob rhel7raw vda    ===> waits a long time here

4. Re-check the first terminal: the virsh blockcopy command has quit unexpectedly:
# time virsh blockcopy rhel7raw vda /var/lib/libvirt/images/f.img --wait --verbose
Block Copy: [ 23 %]error: failed to query job for disk vda
error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainGetBlockJobInfo)
real 3m52.704s
user 0m0.008s
sys 0m0.012s

5. Go back to the second terminal: the blockjob command from step 3 has finally returned the block job info, but only after a very long wait. Checking further, the copy job is still running but progresses very slowly:
# time virsh blockjob rhel7raw vda
Block Copy: [ 26 %]
real 7m34.682s
user 0m0.015s
sys 0m0.013s
# time virsh blockjob rhel7raw vda
Block Copy: [ 30 %]
real 4m21.903s
user 0m0.010s
sys 0m0.014s

KVM QE cannot reproduce this issue on qemu-kvm-rhev-2.9.0-1.el7.x86_64.

Steps:
1. Launch a guest with a raw disk:
/usr/libexec/qemu-kvm -smp 1 -m 1024 -name linux -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 -drive file=/home/kvm_autotest_root/images/rhel74-64-virtio.raw,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=raw,aio=native -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0,bootindex=1 -monitor stdio -qmp tcp::1234,server,nowait
2. Start the block mirror:
{ "execute": "drive-mirror", "arguments": { "device": "drive-virtio-disk0", "target": "/home/sn1", "format": "qcow2", "mode": "absolute-paths", "sync": "full"}}
{"return": {}}
{"execute":"query-block-jobs"}
{"return": [{"io-status": "ok", "device": "drive-virtio-disk0", "busy": true, "len": 21475491840, "offset": 3991928832, "paused": false, "speed": 0, "ready": false, "type": "mirror"}]}
{"timestamp": {"seconds": 1493014995, "microseconds": 370734}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-virtio-disk0", "len": 21486698496, "offset": 21486698496, "speed": 0, "type": "mirror"}}
{"execute": "block-job-complete", "arguments": { "device": "drive-virtio-disk0"} }
{"return": {}}

Result: Block mirror goes well, no hang.

Per Comment #20, moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
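For completeness, the query-block-jobs progress check used in the verification steps above can also be polled from a script. A minimal sketch, assuming the -qmp tcp::1234 server from the verification command line; connection details are examples only:

```python
#!/usr/bin/env python3
# Sketch: poll mirror progress with query-block-jobs over QMP, matching the
# verification steps above.  Host/port are examples (-qmp tcp::1234).
import json
import socket
import time

sock = socket.create_connection(("localhost", 1234))
qmp = sock.makefile("rw", encoding="utf-8")

def command(name, **arguments):
    req = {"execute": name}
    if arguments:
        req["arguments"] = arguments
    qmp.write(json.dumps(req) + "\n")
    qmp.flush()
    # Skip asynchronous events; return the first reply or error.
    while True:
        msg = json.loads(qmp.readline())
        if "return" in msg or "error" in msg:
            return msg

qmp.readline()                        # QMP greeting banner
command("qmp_capabilities")

# Print offset/len until the block job disappears or reports ready.
while True:
    jobs = command("query-block-jobs").get("return", [])
    if not jobs:
        print("no active block jobs")
        break
    job = jobs[0]
    print("offset %d / len %d, ready=%s" % (job["offset"], job["len"], job["ready"]))
    if job["ready"]:
        break
    time.sleep(5)
```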