Hide Forgot
Description of problem: Hot add two vfs with same bootindex leading to qemu-kvm process quit Version-Release number of selected component (if applicable): #qemu-kvm version qemu-kvm-0.12.1.2-2.144.el6.x86_64 #uname -r 2.6.32-115.el6.x86_64 #rpm -qa | grep seabios seabios-0.6.1.2-3.el6.x86_64 How reproducible: 100% Steps to Reproduce: 1.Generate VF #modprobe -r igb;modprobe igb max_vfs=7 2.Unbind two vfs from host lspci -n | grep 09:10.2 09:10.2 0200: 8086:10ca (rev 01) echo "8086 10ca" >/sys/bus/pci/drivers/pci-stub/new_id echo 0000:09:10.2 >/sys/bus/pci/devices/0000\:09\:10.2/driver/unbind echo 0000:09:10.2 >/sys/bus/pci/drivers/pci-stub/bind #lspci -n | grep 09:10.3 09:10.3 0200: 8086:10ca (rev 01) #echo "8086 10ca" >/sys/bus/pci/drivers/pci-stub/new_id #echo 0000:09:10.3 >/sys/bus/pci/devices/0000\:09\:10.3/driver/unbind #echo 0000:09:10.3 >/sys/bus/pci/drivers/pci-stub/bind 3.boot guest /usr/libexec/qemu-kvm -m 2G -smp 2 -drive file=/root/zhangjunyi/rhel6.1-ide.qcow2,if=none,id=test,boot=on,cache=none,format=qcow2,werror=stop,rerror=stop -device virtio-blk-pci,drive=test -cpu qemu64,+sse2,+x2apic -monitor stdio -boot c -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:11:22:45:66:95 -vnc :9 -qmp tcp:0:4446,server,nowait 4.hotadd two vfs with the same bootindex #(qemu) device_add pci-assign,host=09:10.2,id=pci1,bootindex=10 #(qemu) device_add pci-assign,host=09:10.3,id=pci3,bootindex=10 Actual results: qemu-kvm process was quit with message "Two devices with same boot index 10" Expected results: should prevent hot-add instead of quit process,image that,if some very important jobs running in guest currently.it's very aggressive let qemu-kvm quit directly. Additional info:
The above error has nothing to do with device assignment in particular, it will happen with any device that supports bootindex. The new bootindex option does not work well with hotplug. add_boot_device_path() should not call exit, it should return an error if there's a conflict. There also appears to be no remove function, so if a device is specified with a bootindex and later removed, what happens? Assigning back to Gleb.
Don't add two devices with same same bootindex. This does not suppose to work neither from command line nor from qemu monitor. Closing as NOTABUG. Alex is correct that we probably need to add del_boot_device_path(), but hot-unplug will not cause crash without it too, so this is not high priority.
(In reply to comment #4) > Don't add two devices with same same bootindex. This does not suppose to work > neither from command line nor from qemu monitor. Closing as NOTABUG. Alex is > correct that we probably need to add del_boot_device_path(), but hot-unplug > will not cause crash without it too, so this is not high priority. Gleb, should we at least ignore the bootindex for hotplugged device, maybe printing a warning? I'm not a fan of leaving holes that are so easy to trigger and result in qemu calling exit() while the guest is running.
(In reply to comment #5) > (In reply to comment #4) > > Don't add two devices with same same bootindex. This does not suppose to work > > neither from command line nor from qemu monitor. Closing as NOTABUG. Alex is > > correct that we probably need to add del_boot_device_path(), but hot-unplug > > will not cause crash without it too, so this is not high priority. > > Gleb, should we at least ignore the bootindex for hotplugged device, maybe > printing a warning? I'm not a fan of leaving holes that are so easy to trigger > and result in qemu calling exit() while the guest is running. Why should we ignore bootindex for hotplugged device? User may want to hotplug a device and boot from in on the next boot for instance. About exit() I am not sure what can be done here. This is not exit that can be triggered by a guest and qemu usually exits on wrong command line (user may not notice the warning if we will print one). Also printing a warning is not appropriate response on monitor command failure since the one who issues the command may not even have access to stderr. I can change the code to not treat two devices with the same bootindex as an error, but then user may unintentionally specify two devices with the same bootindex and be surprised by a result.
(In reply to comment #7) > Why should we ignore bootindex for hotplugged device? User may want to hotplug > a device and boot from in on the next boot for instance. About exit() I am not > sure what can be done here. This is not exit that can be triggered by a guest > and qemu usually exits on wrong command line (user may not notice the warning > if we will print one). Also printing a warning is not appropriate response on > monitor command failure since the one who issues the command may not even have > access to stderr. I can change the code to not treat two devices with the same > bootindex as an error, but then user may unintentionally specify two devices > with the same bootindex and be surprised by a result. Gleb, it sounds like you're validating why this should not be closed. If we want to allow users to hotplug devices and boot from it on the next boot, then we need to make it such that a trivial typo on the hotplug doesn't cause the VM to exit. It seems like a fairly simple matter to have add_boot_device_path() return an error and make the device initfn fail when this occurs. If an initfn fails before the VM is running, I think we automatically abort. I actually started fixing this, but decided to re-assign to you after I saw there was no del_boot_device_path().
(In reply to comment #9) > (In reply to comment #7) > > Why should we ignore bootindex for hotplugged device? User may want to hotplug > > a device and boot from in on the next boot for instance. About exit() I am not > > sure what can be done here. This is not exit that can be triggered by a guest > > and qemu usually exits on wrong command line (user may not notice the warning > > if we will print one). Also printing a warning is not appropriate response on > > monitor command failure since the one who issues the command may not even have > > access to stderr. I can change the code to not treat two devices with the same > > bootindex as an error, but then user may unintentionally specify two devices > > with the same bootindex and be surprised by a result. > > Gleb, it sounds like you're validating why this should not be closed. If we > want to allow users to hotplug devices and boot from it on the next boot, then > we need to make it such that a trivial typo on the hotplug doesn't cause the VM > to exit. It seems like a fairly simple matter to have add_boot_device_path() > return an error and make the device initfn fail when this occurs. If an initfn > fails before the VM is running, I think we automatically abort. I actually > started fixing this, but decided to re-assign to you after I saw there was no > del_boot_device_path(). Are you saying that if initfn fails during machine creation qemu aborts, but if it fails during hot plug it doesn't? If yes then we may change the behaviour for 6.2. We may make add_boot_device_path() to return token or an error. Token will be used for del_boot_device_path().
(In reply to comment #11) > Are you saying that if initfn fails during machine creation qemu aborts, but if > it fails during hot plug it doesn't? If yes then we may change the behaviour > for 6.2. We may make add_boot_device_path() to return token or an error. Token > will be used for del_boot_device_path(). Yes, exactly. See how device_init_func is called versus do_device_add.