Bug 1256044
Summary: alias for virtio disk is not removed from internal structures right at the time the DEVICE_DELETED event is delivered

Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.1
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Reporter: Amit Aviram <aaviram>
Assignee: Markus Armbruster <armbru>
QA Contact: CongLi <coli>
CC: aaviram, amureini, armbru, bmcclain, chayang, derez, dyuan, eblake, hhan, huding, juzhang, knoel, michen, pbonzini, pkrempa, pzhang, rbalakri, snagar, virt-maint, xfu, xuzhang, ylavi
Doc Type: If docs needed, set a value
Story Points: ---
Clones: 1262399 (view as bug list)
Type: Bug
Regression: ---
Bug Blocks: 1172230, 1288337, 1310118
Last Closed: 2017-12-05 14:04:27 UTC
Description
Amit Aviram
2015-08-23 15:38:25 UTC
Created attachment 1066062 [details]
Scripts and logs
Can you add this to the 7.2\3.6 tracking?

Your script calls dom.detachDevice() but does not ensure that the detach completes. Detaching a device requires guest cooperation: dom.detachDevice() merely returns success if the detach was requested, without waiting for it to complete. You MUST poll the domain XML to see whether the guest has acted on the request before attempting to re-attach another device of the same name. It might be nice if libvirt added a flag to dom.detachDevice() that waited for the guest to respond before returning control, but commands that require guest cooperation are generally risky, as there is no telling how long they might take. Also, it is possible to register to receive an event when the detach actually happens, instead of having to poll the domain XML. At this point, I think it is a usage bug, not a libvirt problem.

But the script does check whether the device is still in the XML:

    while 'fc2a7a5e-6e9b-4765-82d3-8bf5f57a643b' in dom.XMLDesc():
        print('2nd device is stuck.')

Note that this checks whether the device's serial is still in the domain's XML. The script might not describe the error well: at the point that this serial is actually in dom.XMLDesc(), it NEVER gets removed, and even after I kill the script I still see it in dom.XMLDesc(). It may be that this is a qemu bug; maybe there is some problem with the guest and libvirt doesn't get an answer regarding the detach result?

I cannot reproduce it with libvirt-1.2.17-8.el7.x86_64, qemu-kvm-rhev-2.3.0-22.el7.x86_64.
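The poll-the-domain-XML advice above can be sketched as a small helper. This is an illustrative sketch, not libvirt API: `wait_for_detach` and its `get_xml` callable are hypothetical names; with libvirt-python you would pass something like `lambda: dom.XMLDesc(0)` for `get_xml`.

```python
import time

def wait_for_detach(get_xml, marker, timeout=5.0, interval=0.1):
    """Poll the domain XML until `marker` (e.g. the disk's serial or
    target dev) disappears, i.e. the guest has completed the unplug.
    Returns True on success, False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if marker not in get_xml():
            return True
        time.sleep(interval)
    return False
```

A cleaner alternative, as noted above, is to register for the device-removed event (libvirt's VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED) instead of polling, but polling with a timeout is the simplest fix for a script like the reproducer.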
Steps:

in terminal 1:

for i in {1..1000}; do
echo try $i times
virsh attach-device virt-tests-vm1 /dev/stdin <<EOF
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/tmp/attach.img'/>
  <target dev='vdb' bus='virtio'/>
</disk>
EOF
virsh detach-device virt-tests-vm1 /dev/stdin <<EOF
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/tmp/attach.img'/>
  <target dev='vdb' bus='virtio'/>
</disk>
EOF
done

in terminal 2:

for i in {1..1000}; do
echo try $i times
virsh attach-device virt-tests-vm1 /dev/stdin <<EOF
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/tmp/attach2.img'/>
  <target dev='vdc' bus='virtio'/>
</disk>
EOF
virsh detach-device virt-tests-vm1 /dev/stdin <<EOF
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/tmp/attach2.img'/>
  <target dev='vdc' bus='virtio'/>
</disk>
EOF
done

Attach & detach device work well. I also cannot reproduce it with the scripts the reporter uploaded on libvirt-1.2.17-8.el7.x86_64, qemu-kvm-rhev-2.3.0-22.el7.x86_64.

I've managed to reproduce the issue by running just one of the reproducer scripts, so the issue is not bound to a race between two disks being attached/detached. The issue reproduces after a few attach/detach cycles of the script.
The following communication happens between qemu and libvirtd (annotated):

### one of the "good" attach/detach cycles:

# add backend:
13:14:39.756+0000 L: buf={"execute":"__com.redhat_drive_add","arguments":{"file":"/var/lib/libvirt/images/a","id":"drive-virtio-disk0","format":"raw","serial":"9f09e0db-cc0b-4a93-826b-c4a3fda0b208","cache":"none","werror":"stop","rerror":"stop","aio":"threads"},"id":"libvirt-844"}
13:14:39.764+0000 Q: buf={"return": {}, "id": "libvirt-844"}
# add frontend:
13:14:39.764+0000 L: buf={"execute":"device_add","arguments":{"driver":"virtio-blk-pci","scsi":"off","bus":"pci.0","addr":"0x7","drive":"drive-virtio-disk0","id":"virtio-disk0"},"id":"libvirt-845"}
13:14:39.774+0000 Q: buf={"return": {}, "id": "libvirt-845"}
# query alias list:
13:14:39.775+0000 L: buf={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-846"}
13:14:39.778+0000 Q: buf={"return": [{"name": "virtio-disk0", "type": "child<virtio-blk-pci>"}, {"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "video0", "type": "child<cirrus-vga>"}, {"name": "ide0-1-0", "type": "child<ide-cd>"}, {"name": "virtio-serial0", "type": "child<virtio-serial-pci>"}, {"name": "scsi0", "type": "child<virtio-scsi-pci>"}, {"name": "usb", "type": "child<piix3-usb-uhci>"}, {"name": "type", "type": "string"}], "id": "libvirt-846"}
# unplug frontend:
13:14:39.779+0000 L: buf={"execute":"device_del","arguments":{"id":"virtio-disk0"},"id":"libvirt-847"}
13:14:39.789+0000 Q: buf={"return": {}, "id": "libvirt-847"}
# get DEVICE_DELETED event for the unplug:
13:14:39.815+0000 Q: buf={"timestamp": {"seconds": 1441890879, "microseconds": 815331}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-disk0/virtio-backend"}}
13:14:39.816+0000 Q: buf={"timestamp": {"seconds": 1441890879, "microseconds": 816058}, "event": "DEVICE_DELETED", "data": {"device": "virtio-disk0", "path": "/machine/peripheral/virtio-disk0"}}
# try to unplug disk backend ...
13:14:39.816+0000 L: buf={"execute":"__com.redhat_drive_del","arguments":{"id":"drive-virtio-disk0"},"id":"libvirt-848"}
# unplugging of the backend fails, since unplugging the frontend removes the backend too. Note that the downstream impl is broken since it tries to use the HMP implementation, which does not properly report errors so that they are propagated via QMP. We ignore this error, and claim successful detach.
13:14:39.820+0000 Q: buf={"id": "libvirt-848", "error": {"class": "GenericError", "desc": "An undefined error has occurred"}}
# refresh alias list
13:14:39.821+0000 L: buf={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-849"}
13:14:39.824+0000 Q: buf={"return": [{"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "video0", "type": "child<cirrus-vga>"}, {"name": "ide0-1-0", "type": "child<ide-cd>"}, {"name": "virtio-serial0", "type": "child<virtio-serial-pci>"}, {"name": "scsi0", "type": "child<virtio-scsi-pci>"}, {"name": "usb", "type": "child<piix3-usb-uhci>"}, {"name": "type", "type": "string"}], "id": "libvirt-849"}
# done

# NEXT CYCLE
13:14:39.831+0000 L: buf={"execute":"__com.redhat_drive_add","arguments":{"file":"/var/lib/libvirt/images/a","id":"drive-virtio-disk0","format":"raw","serial":"9f09e0db-cc0b-4a93-826b-c4a3fda0b208","cache":"none","werror":"stop","rerror":"stop","aio":"threads"},"id":"libvirt-850"}
13:14:39.839+0000 Q: buf={"return": {}, "id": "libvirt-850"}
13:14:39.840+0000 L: buf={"execute":"device_add","arguments":{"driver":"virtio-blk-pci","scsi":"off","bus":"pci.0","addr":"0x7","drive":"drive-virtio-disk0","id":"virtio-disk0"},"id":"libvirt-851"}
13:14:39.849+0000 Q: buf={"return": {}, "id": "libvirt-851"}
13:14:39.850+0000 L: buf={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-852"}
13:14:39.853+0000 Q: buf={"return": [{"name": "virtio-disk0", "type": "child<virtio-blk-pci>"}, {"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "video0", "type": "child<cirrus-vga>"}, {"name": "ide0-1-0", "type": "child<ide-cd>"}, {"name": "virtio-serial0", "type": "child<virtio-serial-pci>"}, {"name": "scsi0", "type": "child<virtio-scsi-pci>"}, {"name": "usb", "type": "child<piix3-usb-uhci>"}, {"name": "type", "type": "string"}], "id": "libvirt-852"}
13:14:39.854+0000 L: buf={"execute":"device_del","arguments":{"id":"virtio-disk0"},"id":"libvirt-853"}
13:14:39.863+0000 Q: buf={"return": {}, "id": "libvirt-853"}
13:14:39.882+0000 Q: buf={"timestamp": {"seconds": 1441890879, "microseconds": 882492}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-disk0/virtio-backend"}}
13:14:39.883+0000 Q: buf={"timestamp": {"seconds": 1441890879, "microseconds": 883033}, "event": "DEVICE_DELETED", "data": {"device": "virtio-disk0", "path": "/machine/peripheral/virtio-disk0"}}
# Note that we've received the DEVICE_DELETED event ...
13:14:39.883+0000 L: buf={"execute":"__com.redhat_drive_del","arguments":{"id":"drive-virtio-disk0"},"id":"libvirt-854"}
13:14:39.887+0000 Q: buf={"id": "libvirt-854", "error": {"class": "GenericError", "desc": "An undefined error has occurred"}}
13:14:39.889+0000 L: buf={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-855"}
13:14:39.891+0000 Q: buf={"return": [{"name": "balloon0", "type": "child<virtio-balloon-pci>"}, {"name": "video0", "type": "child<cirrus-vga>"}, {"name": "ide0-1-0", "type": "child<ide-cd>"}, {"name": "virtio-serial0", "type": "child<virtio-serial-pci>"}, {"name": "scsi0", "type": "child<virtio-scsi-pci>"}, {"name": "usb", "type": "child<piix3-usb-uhci>"}, {"name": "type", "type": "string"}], "id": "libvirt-855"}
# and virtio-disk0 is not among the devices, but ..
# NEXT cycle (the one that fails)
13:14:39.899+0000 L: buf={"execute":"__com.redhat_drive_add","arguments":{"file":"/var/lib/libvirt/images/a","id":"drive-virtio-disk0","format":"raw","serial":"9f09e0db-cc0b-4a93-826b-c4a3fda0b208","cache":"none","werror":"stop","rerror":"stop","aio":"threads"},"id":"libvirt-856"}
13:14:39.907+0000 Q: buf={"return": {}, "id": "libvirt-856"}
13:14:39.907+0000 L: buf={"execute":"device_add","arguments":{"driver":"virtio-blk-pci","scsi":"off","bus":"pci.0","addr":"0x7","drive":"drive-virtio-disk0","id":"virtio-disk0"},"id":"libvirt-857"}
# we try to add 'virtio-disk0' yet again but it fails!
13:14:39.912+0000 Q: buf={"id": "libvirt-857", "error": {"class": "GenericError", "desc": "Duplicate ID 'virtio-disk0' for device"}}
# now we try to delete the backend, but due to a bug in libvirt we use an invalid string, so the backend won't be removed.
13:14:39.912+0000 L: buf={"execute":"__com.redhat_drive_del","arguments":{"id":"file=/var/lib/libvirt/images/a,if=none,id=drive-virtio-disk0,format=raw,serial=9f09e0db-cc0b-4a93-826b-c4a3fda0b208,cache=none,werror=stop,rerror=stop,aio=threads"},"id":"libvirt-858"}
13:14:39.919+0000 Q: buf={"id": "libvirt-858", "error": {"class": "GenericError", "desc": "An undefined error has occurred"}}
13:20:51.685+0000 L: buf={"execute":"__com.redhat_drive_del","arguments":{"id":"drive-virtio-disk0"},"id":"libvirt-859"}
13:20:51.689+0000 Q: buf={"return": {}, "id": "libvirt-859"}
# After a manual delete of the dangling backend, the script can be restarted and works for a few cycles again.

The conclusion from the above log is that QEMU has a race window where it fires the DEVICE_DELETED event but does not yet free the device alias from its internal structures. In a tight loop as used by the reproducer script (causeError.py, in the archive attached)

The libvirt bug described above is caused by a bug in commit 8125113c.
Tested with:
libvirt-1.2.17-8.el7.x86_64
qemu-kvm-rhev-2.3.0-22.el7.x86_64

Finishing the sentence from comment 11:

In a tight loop as used by the reproducer script (causeError.py, in the archive attached), libvirt sometimes hits the window and the attach of a disk with the same alias gets rejected. Adding a short sleep to the original reproducer allows the script to complete all 1000 plug/unplug cycles.

Reassigning to qemu and lowering severity, since a tight loop of attaching/detaching disks is not really a viable use case. I'll clone this bug to track the bug in the cleanup path in libvirt that does not correctly remove the backend, so that after the failure to attach we will be able to recover.

Excellent analysis, Peter.

Clue: device_add fails with "Duplicate ID 'virtio-disk0' for device". For this to happen, the QemuOpts with this ID must still exist. Precondition for that: DEVICE_DELETED gets sent before the QemuOpts gets deleted. Is this the case? Turns out it is.

Instead of demonstrating it with virtio-blk-pci, let me demonstrate with a scsi-hd disk, because that's *much* simpler to understand: (1) it's only one device (virtio-blk-pci is actually two under the hood, virtio-blk-pci and virtio-blk-device), and (2) it doesn't do the PCI hot-unplug ACPI dance.

Here's my reproducer:

* Run QEMU under GDB. I use latest upstream here, shouldn't matter.
* Set a breakpoint on qapi_event_send_device_deleted() and on drive_info_del().
* Start a guest.
* Once the guest has booted, run in the monitor:

(qemu) device_add virtio-scsi-pci,id=scsi0
(qemu) drive_add "" if=none,file=tmp.qcow2,id=dr0
OK
(qemu) device_add scsi-hd,drive=dr0,id=sd0
(qemu) device_del sd0

GDB breaks at qapi_event_send_device_deleted(), backtrace appended.

* Continue. GDB breaks at drive_info_del(), backtrace appended. Note this is still within the same object_unparent() call. Shows that we send the event *before* we delete the QemuOpts.
Now we have to figure out:

* How can the next device_add get in between? I suspect it can only with a PCI unplug.
* How can we fix this?
  - Can we prevent a device_add getting in between?
  - Can we delay the DEVICE_DELETED until the QemuOpts is gone?
  - Can we move deletion of QemuOpts back in time, before sending DEVICE_DELETED?

First backtrace, cut off after qmp_device_del():

#0  0x0000555555a57ef1 in qapi_event_send_device_deleted (has_device=true, device=0x5555575ecb40 "sd0", path=0x555557269b50 "/machine/peripheral/sd0", errp=0x555556411ea8 <error_abort>) at qapi-event.c:831
#1  0x0000555555816474 in device_unparent (obj=0x555556c41840) at /work/armbru/qemu/hw/core/qdev.c:1242
#2  0x000055555596549c in object_finalize_child_property (obj=0x5555564b5d00, name=0x555557269ad0 "sd0", opaque=0x555556c41840) at /work/armbru/qemu/qom/object.c:1234
#3  0x0000555555964955 in object_property_del (obj=0x5555564b5d00, name=0x555557269ad0 "sd0", errp=0x0) at /work/armbru/qemu/qom/object.c:928
#4  0x000055555596367f in object_property_del_child (obj=0x5555564b5d00, child=0x555556c41840, errp=0x0) at /work/armbru/qemu/qom/object.c:384
#5  0x00005555559636cc in object_unparent (obj=0x555556c41840) at /work/armbru/qemu/qom/object.c:393
#6  0x0000555555813f05 in qdev_simple_device_unplug_cb (hotplug_dev=0x5555564b2360, dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/core/qdev.c:344
#7  0x00005555556b0699 in virtio_scsi_hotunplug (hotplug_dev=0x5555564b2360, dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/scsi/virtio-scsi.c:809
#8  0x0000555555819908 in hotplug_handler_unplug (plug_handler=0x5555564b2360, plugged_dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/core/hotplug.c:44
#9  0x0000555555813dc4 in qdev_unplug (dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/core/qdev.c:303
#10 0x000055555576be85 in qmp_device_del (id=0x555557962100 "sd0", errp=0x7fffffffc9a8) at
/work/armbru/qemu/qdev-monitor.c:801

Second backtrace, cut off after qmp_device_del():

#0  0x0000555555a026e4 in drive_info_del (dinfo=0x555557814860) at /work/armbru/qemu/block/block-backend.c:159
#1  0x0000555555a026c9 in blk_delete (blk=0x555556b545a0) at /work/armbru/qemu/block/block-backend.c:153
#2  0x0000555555a0279d in blk_unref (blk=0x555556b545a0) at /work/armbru/qemu/block/block-backend.c:186
#3  0x0000555555762560 in blockdev_auto_del (blk=0x555556b545a0) at /work/armbru/qemu/blockdev.c:144
#4  0x0000555555820a8b in release_drive (obj=0x555556c41840, name=0x5555564b2600 "drive", opaque=0x555555f446c0 <scsi_hd_properties>) at /work/armbru/qemu/hw/core/qdev-properties-system.c:105
#5  0x00005555559635d6 in object_property_del_all (obj=0x555556c41840) at /work/armbru/qemu/qom/object.c:368
#6  0x000055555596375c in object_finalize (data=0x555556c41840) at /work/armbru/qemu/qom/object.c:413
#7  0x000055555596460f in object_unref (obj=0x555556c41840) at /work/armbru/qemu/qom/object.c:848
#8  0x00005555559654b4 in object_finalize_child_property (obj=0x5555564b5d00, name=0x555557269ad0 "sd0", opaque=0x555556c41840) at /work/armbru/qemu/qom/object.c:1237
#9  0x0000555555964955 in object_property_del (obj=0x5555564b5d00, name=0x555557269ad0 "sd0", errp=0x0) at /work/armbru/qemu/qom/object.c:928
x555556c41840, errp=0x0) at /work/armbru/qemu/qom/object.c:384
#11 0x00005555559636cc in object_unparent (obj=0x555556c41840) at /work/armbru/qemu/qom/object.c:393
#12 0x0000555555813f05 in qdev_simple_device_unplug_cb (hotplug_dev=0x5555564b2360, dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/core/qdev.c:344
#13 0x00005555556b0699 in virtio_scsi_hotunplug (hotplug_dev=0x5555564b2360, dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/scsi/virtio-scsi.c:809
#14 0x0000555555819908 in hotplug_handler_unplug (plug_handler=0x5555564b2360, plugged_dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/core/hotplug.c:44
#15 0x0000555555813dc4 in
qdev_unplug (dev=0x555556c41840, errp=0x7fffffffc9a8) at /work/armbru/qemu/hw/core/qdev.c:303
#16 0x000055555576be85 in qmp_device_del (id=0x555557962100 "sd0", errp=0x7fffffffc9a8) at /work/armbru/qemu/qdev-monitor.c:801

(In reply to yangyang from comment #10)
> I also cannot reproduce it with the scripts reporter uploaded on
> libvirt-1.2.17-8.el7.x86_64, qemu-kvm-rhev-2.3.0-22.el7.x86_64.

The python scripts in the attachment seem to reproduce the error more easily. Trying the python scripts, on average every 1000 attach/detach loops gets one 'Duplicate ID' error. But with shell scripts using the same device XML, I have never reproduced it with latest libvirt.

Michael had reported this to me and we had started discussing it, so I can take it too.

Latest upstream patches, waiting for review:
http://lists.gnu.org/archive/html/qemu-devel/2016-02/msg04978.html

Patches have been reviewed, general approach looks good, but v2 needed.

Peter Krempa thinks our fixing of libvirt bug 1262399 makes hitting this bug unlikely.

(In reply to Markus Armbruster from comment #27)
> Patches have been reviewed, general approach looks good, but v2 needed.
>
> Peter Krempa thinks our fixing of libvirt bug 1262399 makes hitting this
> bug unlikely.

I think that the bug itself is very unlikely to cause problems, as it's very unlikely for users to repeatedly unplug and plug disks in a very tight loop.

The libvirt fix allows the operation to be retried "later" after qemu rejected one of the hotplug attempts. Previously we would desync the state and the operation would fail forever after the first occurrence.

In conclusion, the bug is VERY unlikely to be hit in the first place, and the operation can be successfully finished by retrying now that the libvirt bug mentioned above is fixed.

(In reply to Peter Krempa from comment #28)
> (In reply to Markus Armbruster from comment #27)
> > Patches have been reviewed, general approach looks good, but v2 needed.
> >
> > Peter Krempa thinks our fixing of libvirt bug 1262399 makes hitting this
> > bug unlikely.
>
> I think that the bug itself is very unlikely to cause problems as it's very
> unlikely for users to repeatedly unplug and plug disks in a very tight loop.

This came up from a backup partner which does exactly this.

(In reply to Peter Krempa from comment #28)
[...]
> I think that the bug itself is very unlikely to cause problems as it's very
> unlikely for users to repeatedly unplug and plug disks in a very tight loop.
>
> The libvirt fix allows the operation to be retried "later" after qemu
> rejected one of the hotplug attempts. Previously we would desync the state
> and the operation would fail forever after the first occurrence.

As Allon said before, the loop demonstrated here just helps us reproduce this bug easily; the real issue is that once in a while our partners experience the race described above when performing a bunch of attach and detach actions. In big setups of RHEV, which are typical in partner environments, this scenario is not that unlikely, unfortunately.

> In conclusion, the bug is VERY unlikely to be hit in the first place, and
> the operation can be successfully finished by retrying now that the libvirt
> bug mentioned above is fixed.
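The retry approach discussed above (re-attempt the hotplug once qemu's race window has closed) can be sketched as a generic wrapper. All names here are illustrative, not libvirt or qemu API; only the "Duplicate ID" error text matches what qemu returns in the log above.

```python
import time

def retry_transient(op, is_transient, attempts=5, delay=0.05):
    """Run `op` and retry when it raises an error that `is_transient`
    classifies as retryable (e.g. QEMU's "Duplicate ID ... for device"
    while the DEVICE_DELETED race window is still open)."""
    for i in range(attempts):
        try:
            return op()
        except Exception as exc:
            if not is_transient(exc) or i == attempts - 1:
                raise
            time.sleep(delay * (i + 1))  # back off a little more each try
```

In practice `op` would be the attach call (e.g. a `dom.attachDevice(xml)` wrapper) and `is_transient` would match the "Duplicate ID" error message, mirroring the "try again, possibly after waiting a little" workaround recommended later in this bug.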
(In reply to Peter Krempa from comment #11)
[...]
> The conclusion from the above log is that QEMU has a race window where it
> fires the DEVICE_DELETED event but it does not yet free the device alias
> from its internal structures.
>
> The libvirt bug described above is caused by a bug in commit 8125113c.

Peter,

Thanks for the deep analysis. IIUC, after solving Bug 1262399, libvirt won't block a new try at plugging a disk after a failed attempt; before the fix, the internal structures in libvirt regarding this disk were not cleaned up as they should have been.

This is a great fix. However, reading your analysis, I understand that detaching a disk will always appear to work, even if QEMU fails to remove the disk, as any error is being ignored. If that is the case, that means inconsistency, where libvirt is not aware of a disk still plugged in qemu. Shouldn't a failed return value from detaching prevent libvirt from deleting the disk in the first place, keeping it in its structures?

(In reply to Amit Aviram from comment #31)
[...]
> IIUC, after solving Bug 1262399, libvirt won't block a new try at plugging
> a disk after a failed attempt; before the fix, the internal structures in
> libvirt regarding this disk were not cleaned up as they should have been.
Yes, a new attempt to plug the disk will now work once qemu clears the old device from its structures, i.e. after the race window in qemu closes. The issue with libvirt was that it did not clean up after the failed hotplug attempt: disk hotplug is a two-step operation and the failure happened in the second step. The first step, which adds the backend for the disk, was not cleaned up.

> This is a great fix. However, reading your analysis, I understand that
> detaching a disk will always appear to work, even if QEMU fails to remove
> the disk, as any error is being ignored.

That is not true. Libvirt removes the disk from its internal structures only after it receives the DEVICE_DELETED event from qemu. This approach was present even prior to this BZ. Until the event is received, libvirt still keeps the disk in the definition. The issue is just the wrong ordering of the DEVICE_DELETED event relative to qemu actually clearing the device name after the device was ejected from the guest. The issue happened because libvirt tried to reuse the device name used previously.

> If that is the case, that means inconsistency, where libvirt is not aware
> of a disk still plugged in qemu. Shouldn't a failed return value from
> detaching prevent libvirt from deleting the disk in the first place,
> keeping it in its structures?

If I understand this correctly, the previous paragraph should clear that up. The only place where errors are ignored is in the cleanup path of the first step of the hotplug operation, when the second step fails. This is why the problem was not caught in the first place. Since a failure in a cleanup path after a failed operation can't make anything worse, the semantics are unfortunately correct.

Peter is correct: this bug is about slightly premature emission of DEVICE_DELETED, which opens a race window where an attempt to reuse the same ID fails. Workaround: try again, possibly after waiting a little. See comment#19.
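The ordering problem described above can be modeled in a few lines: if DEVICE_DELETED is emitted while the device ID is still registered, a listener that re-adds the device immediately on the event hits "Duplicate ID"; releasing the ID before emitting the event closes the window. This is a toy model for illustration only, not QEMU code; all class and method names are invented.

```python
class ToyDeviceTable:
    """Toy model of ID bookkeeping around a DEVICE_DELETED-style event."""

    def __init__(self):
        self.ids = set()

    def add(self, dev_id):
        if dev_id in self.ids:
            raise RuntimeError("Duplicate ID '%s' for device" % dev_id)
        self.ids.add(dev_id)

    def delete_buggy(self, dev_id, on_event):
        # Bug: the event fires while the ID is still registered, so a
        # listener that re-adds right away sees a duplicate ID.
        on_event(dev_id)
        self.ids.discard(dev_id)

    def delete_fixed(self, dev_id, on_event):
        # Fix: free the ID first, then emit the event.
        self.ids.discard(dev_id)
        on_event(dev_id)
```

Note how `delete_fixed` corresponds to one of the fix options listed earlier (moving deletion of the QemuOpts before sending DEVICE_DELETED), while libvirt's retry workaround tolerates `delete_buggy` by simply trying again later.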
Some work around DEVICE_DELETED landed in 2.11 (commit f7b879e07), which we backported to fix bug 1445460. I suspect this bug shares the same root cause, and got fixed, too. Can you still reproduce it with qemu-kvm-rhev-2.10.0-8.el7 or later? If yes, I'm wrong.

Amit is no longer with Red Hat, unfortunately. Daniel - can you address Markus' needinfo please?

(In reply to Markus Armbruster from comment #41)
> Some work around DEVICE_DELETED landed in 2.11 (commit f7b879e07), which we
> backported to fix bug 1445460. I suspect this bug shares the same root
> cause, and got fixed, too. Can you still reproduce it with
> qemu-kvm-rhev-2.10.0-8.el7 or later? If yes, I'm wrong.

IIUC the scenario, it seems to work with qemu-kvm-ev-2.9.0-16.el7_4.8.1.x86_64.

Thanks, Daniel! Closing as duplicate.

*** This bug has been marked as a duplicate of bug 1445460 ***