Bug 758658
Summary: | device_add fails the second time it is run (after device_del) | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Michal Privoznik <mprivozn>
Component: | qemu-kvm | Assignee: | Markus Armbruster <armbru>
Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | 6.2 | CC: | abaron, acathrow, bsarathy, cpelland, dkenigsb, ilvovsky, juzhang, michen, minchan, mkenneth, rhod, shyu, tburke, virt-maint
Target Milestone: | rc | Keywords: | TestBlocker
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2012-02-05 12:37:02 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 707622, 748534, 773650, 773651, 773665, 773677, 773696 | |
Attachments: | | |
Description
Michal Privoznik
2011-11-30 11:28:23 UTC
This bug is a blocker for disk hotplug/hot-unplug in vdsm.

During testing, Igor and I were unable to detach the SCSI controller even though we had first detached the virtio disk. I attach libvirtd logs:

15:24:00.945: 25854: debug : qemuMonitorJSONCommandWithFd:228 : Send command '{"execute":"__com.redhat_drive_del","arguments":{"id":"drive-virtio-disk6"},"id":"libvirt-22"}' for write with FD -1
15:24:00.946: 25847: debug : qemuMonitorJSONIOProcessLine:119 : Line [{"return": {}, "id": "libvirt-22"}]
15:24:00.947: 25854: debug : qemuMonitorJSONCommandWithFd:228 : Send command '{"execute":"device_del","arguments":{"id":"virtio-disk6"},"id":"libvirt-23"}' for write with FD -1
15:24:00.947: 25847: debug : qemuMonitorJSONIOProcessLine:119 : Line [{"return": {}, "id": "libvirt-23"}]

Those devices were added via:

15:23:21.303: 25851: debug : qemuMonitorJSONCommandWithFd:228 : Send command '{"execute":"__com.redhat_drive_add","arguments":{"file":"/rhev/data-center/baf6e036-6ecb-499f-a545-605aca58c1ea/165634e4-c23c-42c3-9822-c885f60bc376/images/11e999ee-dd8a-462e-a2a9-50058232fabd/11f58385-b59c-4fce-8fe7-94b4b8a0d92f","id":"drive-virtio-disk6","format":"raw","serial":"2e-a2a9-50058232fabd","cache":"none","werror":"stop","rerror":"stop","aio":"threads"},"id":"libvirt-20"}' for write with FD -1
15:23:21.307: 25847: debug : qemuMonitorJSONIOProcessLine:119 : Line [{"return": {}, "id": "libvirt-20"}]
15:23:21.308: 25851: debug : qemuMonitorJSONCommandWithFd:228 : Send command '{"execute":"device_add","arguments":{"driver":"virtio-blk-pci","bus":"pci.0","addr":"0x5","drive":"drive-virtio-disk6","id":"virtio-disk6"},"id":"libvirt-21"}' for write with FD -1
15:23:21.309: 25847: debug : qemuMonitorJSONIOProcessLine:119 : Line [{"return": {}, "id": "libvirt-21"}]

Created attachment 538575: libvirtd.log
Just a question: do we support, or plan to support, emulated SCSI disks in RHEL 6.3? Thanks.

What's the guest OS?
How many disks did the VM have?
Was the disk mounted by the guest at the time of the hot unplug?

In addition to Dor's questions (comment#8), I have another one: what incorrect behavior exactly are you reporting?

I think I understand what commands you sent. I tried them locally, like this:

{"QMP": {"version": {"qemu": {"micro": 1, "minor": 12, "major": 0}, "package": "(qemu-kvm-devel)"}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{"execute":"__com.redhat_drive_add","arguments":{"file":"tmp.qcow2","id":"drive-virtio-disk6","format":"raw","serial":"2e-a2a9-50058232fabd","cache":"none","werror":"stop","rerror":"stop","aio":"threads"},"id":"libvirt-20"}
{"return": {}, "id": "libvirt-20"}
{"execute":"device_add","arguments":{"driver":"virtio-blk-pci","bus":"pci.0","addr":"0x5","drive":"drive-virtio-disk6","id":"virtio-disk6"},"id":"libvirt-21"}
{"return": {}, "id": "libvirt-21"}
{"execute":"__com.redhat_drive_del","arguments":{"id":"drive-virtio-disk6"},"id":"libvirt-22"}
{"return": {}, "id": "libvirt-22"}
{"execute":"device_del","arguments":{"id":"virtio-disk6"},"id":"libvirt-23"}
{"return": {}, "id": "libvirt-23"}

The device is visible in the monitor (info pci) and in the guest (/sys/bus/pci/) after the device_add. It's gone in both shortly after the device_del.

If that's not what you see, please explain in more detail.

(In reply to comment #8)
> What's the guest OS?
> How many disks did the VM have?
> Was the disk mounted by the guest at the time of the hot unplug?

Dor,
Actually, I found it while working on the hotplug/hot-unplug feature on the vdsm side, and asked the libvirt guys for help. So, the flow is very simple:
1. Start VM with single ide disk (without OS at all).
2. Hotplug second virtio disk (empty of course)
3. Unplug this disk.

I found the problem only because I tried to reattach the unplugged disk again.
It failed inside libvirt, and Michal found that it is a qemu problem. I just wanted to help the libvirt guys (probably I shouldn't have) because I found it while working on my features on the vdsm side. I thought that my recipe was very clear, but I am probably wrong. Anyway, if you need a *reproducer* you should talk to the libvirt fellows, because libvirt is the right place to do it.

Michal, can you please help?

Thanks,
Igor.

A high-level description such as yours is often enough to reproduce the bug with an acceptable amount of guesswork (naturally, I appreciate it when bug reporters go the extra mile and give me a detailed description right away). But this time, I tried, I failed, so I do need the detailed description to make progress.

Michal, could you please go through comment#8 and comment#9 and answer the questions there?

Markus, did you try the full flow?
1. run vm
2. plug disk
3. unplug disk
4. plug disk again (this fails as 3 did not really do anything)

I agree that Igor's initial description was a bit misleading, as step 4 was a comment after the flow, but this should work (fail) for you. It reproduces consistently in our environment.

I don't think your response is called for. I wonder if you even read Igor's full response (which includes the second device_add step), as it details the flow properly.

Of course I tried these steps. Many times, in fact. And I just tried them again, same result: the second plug works just fine.

Now, I don't doubt for a minute that your test case fails for you. But you still haven't shown that test case to me! All I got is high-level descriptions. I tried but failed at guessing the actual test case from them.

Please write up something I can reproduce. I'm pretty sure it'll take you less time than I already spent guessing.
FYI, kvm-qe tried two scenarios with qemu-kvm-0.12.1.2-2.213.el6.x86_64.

1) Start "with os" and load the "acpiphp module" in the guest: device_add succeeds the second time and on later runs (after device_del).

steps
1. hot add drive and device
(qemu) __com.redhat_drive_add file=/home/qcow2/second-disk-virtio.qcow2,format=qcow2,id=second-disk
(qemu) device_add driver=virtio-blk-pci,drive=second-disk,id=disk2
2. hot remove device via device_del
(qemu) device_del disk2
3. hot add drive and device a second time
(qemu) __com.redhat_drive_add file=/home/qcow2/second-disk-virtio.qcow2,format=qcow2,id=second-disk
(qemu) device_add driver=virtio-blk-pci,drive=second-disk,id=disk2
4. hot remove device again via "__com.redhat_drive_del" and device_del
(qemu) __com.redhat_drive_del second-disk
(qemu) device_del disk2
5. hot add drive and device a third time
(qemu) __com.redhat_drive_add file=/home/qcow2/second-disk-virtio.qcow2,format=qcow2,id=second-disk
(qemu) device_add driver=virtio-blk-pci,drive=second-disk,id=disk2

2) Start "without os": this hits the same problem. However, I think this is normal, since PCI hot remove needs ACPI support in the guest. Please correct me if I am mistaken.

steps
(qemu) __com.redhat_drive_add file=/home/qcow2/second-disk-virtio.qcow2,format=qcow2,id=second-disk
(qemu) device_add driver=virtio-blk-pci,drive=second-disk,id=disk2
(qemu) device_del disk2 (in fact, the PCI device is not removed)
(qemu) device_add driver=virtio-blk-pci,drive=second-disk,id=second-disk
Duplicate ID 'second-disk' for device
(qemu) __com.redhat_drive_add file=/home/qcow2/second-disk-virtio.qcow2,format=qcow2,id=second-disk
Duplicate ID 'second-disk' for drive
(qemu)

---snip comment from comment10---
1. Start VM with single ide disk (without OS at all).
2. Hotplug second virtio disk (empty of course)
3. Unplug this disk.

Hi, Markus
According to my testing, running without a guest OS, or not loading acpiphp in the guest, may be the root cause of this issue. Please correct me if I am mistaken.
Best Regards & Thanks,
Junyi

Steps to reproduce:

1) in a terminal run:
virt-install --name pxe --ram 512 --nodisks --pxe

We need a guest without support for PCI hotplug. A guest without an OS would be sufficient.

from another terminal:
2) qemu-img create /tmp/test.img 10M
3) virsh attach-disk pxe /tmp/test.img vda
4) virsh detach-disk pxe vda
5) virsh attach-disk pxe /tmp/test.img vda
error: Failed to attach disk
error: internal error unable to execute QEMU command 'device_add': Duplicate ID 'virtio-disk0' for device

What is wrong here is: Either qemu should really detach disk in step 3, or report an error; But saying "yeah, detached" and not detaching is causing others split brain.

Aha, we're talking about a guest that doesn't support PCI hotplug! That's *exactly* why Dor asked for the guest OS in comment#8.

Unlike some other buses such as USB, PCI unplug requires guest cooperation. Here's how the ACPI unplug dance works when it works:

a. Management app sends device_del command.
b. QEMU dispatches the unplug request to the appropriate device code.
c. QEMU virtual ACPI device asks BIOS politely to give up the PCI device, and returns successfully.
d. QEMU sends "okay" reply to management app.
e. Meanwhile, BIOS asks guest OS politely to give up the PCI device.
f. Guest OS does what it needs to do to give up the device, then notifies the BIOS.
g. BIOS pokes the (QEMU virtual) ACPI device to signal guest software has given up the device.
h. QEMU destroys the virtual PCI device.

Steps e-g take an unpredictable amount of time. In particular, if the guest OS doesn't support ACPI, or just doesn't feel like giving up the device, we get stuck in step f.

Step d's "okay" reply does *not* mean the device is gone. It merely means an unplug attempt has been initiated.

You say 'What is wrong here is: Either qemu should really detach disk in step 3, or report an error; But saying "yeah, detached" and not detaching is causing others split brain.'
Unfortunately that's not how device_del works. We can't make device_del wait for the device going away, because that can take an unpredictable amount of time (including forever). You'd be rightly upset if we'd block the monitor that long.

device_del works as designed. If there is a bug, it's further up the stack, possibly in libvirt.

Re comment#28

Junyi, thanks for your test. I really appreciate the level of detail there. Your observations match mine, and what the real hardware does: PCI hot plug works, but it requires guest cooperation. It cannot work when the guest doesn't cooperate.

PCI cold plug (device_del before guest starts) doesn't work, but only because it's not implemented in qemu-kvm-rhel6.

(In reply to comment #31)
> Re comment#28
>
> Junyi, thanks for your test. I really appreciate the level of detail there.
> Your observations match mine, and what the real hardware does: PCI hot plug
> works, but it requires guest cooperation. It cannot work when the guest doesn't
> cooperate.
>
> PCI cold plug (device_del before guest starts) doesn't work, but only because
> it's not implemented in qemu-kvm-rhel6.

If ovirt likes it, we may consider this as an RFE. Other than that, I tend to agree with Markus's recommendation of closing it.
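
Editorial illustration (not code from libvirt, vdsm or qemu): the comments above explain that the "okay" reply to device_del only means an unplug attempt has been initiated, and that the guest may take arbitrarily long, or never cooperate. A minimal sketch of how a management application might cope with that is to poll for the device with a timeout before reusing its ID. The function name `wait_for_unplug` and the `device_present` callback are hypothetical; how the caller checks for the device (for example by inspecting the monitor's `info pci` output) is left as an assumption.

```python
import time
from typing import Callable


def wait_for_unplug(
    device_present: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> bool:
    """Poll until the device is gone or the timeout expires.

    device_present is a hypothetical caller-supplied check that returns
    True while the hypervisor still lists the device.
    Returns True if the device disappeared, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if not device_present():
            # The guest released the device and it was destroyed (step h).
            return True
        if time.monotonic() >= deadline:
            # The guest never cooperated; the device ID is still in use.
            return False
        time.sleep(interval)
```

A caller would send device_del, then call `wait_for_unplug(...)` and only report the disk as detached (or allow a second device_add with the same ID) when it returns True. On False it can report an error instead of assuming the device is gone.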