Bug 963588

Summary: qemu quits when hot-plugging a device with a bootindex that is already used by another device
Product: Red Hat Enterprise Linux 6
Reporter: Sibiao Luo <sluo>
Component: qemu-kvm
Assignee: Marcel Apfelbaum <marcel>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: high
Version: 6.5
CC: acathrow, bsarathy, chayang, flang, juzhang, lnovich, marcel, michen, mkenneth, pbonzini, qzhang, rhod, sradvan, virt-maint, xfu
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 969968
Environment:
Last Closed: 2014-07-13 14:25:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 771545, 1086603, 1314591, 1362084
Bug Blocks: 969968

Description Sibiao Luo 2013-05-16 08:16:58 UTC
Description of problem:
Boot a guest whose devices specify bootindex values, then hot-plug a device with a bootindex that is already used by another device. qemu quits.

Version-Release number of selected component (if applicable):
host info:
kernel-2.6.32-377.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.369.el6.x86_64
seabios-0.6.1.2-27.el6.x86_64
guest info:
kernel-2.6.32-377.el6.x86_64

How reproducible:
3/3

Steps to Reproduce:
1. Boot a guest with bootindex values specified.
e.g:# /usr/libexec/qemu-kvm -S -M rhel6.5.0 -cpu host -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -no-kvm-pit-reinjection -name sluo-test -uuid a51eb497-bfd7-47c0-8b5b-0853716e3ce5 -rtc base=localtime,clock=host,driftfix=slew -drive file=/home/RHEL-Server-6.4-64-virtio.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,serial=QEMU-DISK1 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-system-disk,id=system-disk,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=08:2e:5f:0a:1d:b1,bus=pci.0,addr=0x5,bootindex=2,ioeventfd=off -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x6 -qmp tcp:0:4444,server,nowait -k en-us -boot menu=on -vnc :1 -spice disable-ticketing,port=5931 -vga qxl -monitor stdio
2. Hot-plug a device with a bootindex that is already used by another device.
(qemu) __com.redhat_drive_add file=/dev/sdd,id=drive-scsi-disk,format=raw,media=disk,cache=none,aio=native
(qemu) device_add virtio-blk-pci,scsi=on,bus=pci.0,addr=0x7,drive=drive-scsi-disk,id=scsi-disk,bootindex=2
  
Actual results:
After step 2, qemu quits with an error message. The host dmesg shows nothing related to it.
(qemu) __com.redhat_drive_add file=/dev/sdd,id=drive-scsi-disk,format=raw,media=disk,cache=none,aio=native
(qemu) device_add virtio-blk-pci,scsi=on,bus=pci.0,addr=0x7,drive=drive-scsi-disk,id=scsi-disk,bootindex=2
Two devices with same boot index 2
/etc/qemu-ifdown: could not launch network script

Expected results:
qemu should not quit; an error message is enough.

Additional info:

Comment 2 Markus Armbruster 2013-06-07 14:57:20 UTC
Root cause: add_boot_device_path() calls exit() instead of returning failure.

To fix, we need to make it return an error code, and forward it up all call chains until we reach the spot where it should be handled.  I don't expect it to be hard, just t-e-d-i-o-u-s.
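
A minimal sketch of the error-propagation pattern described above, not QEMU's actual code. The names boot_list_add() and hotplug_device() are hypothetical, and the linked list is a simplified stand-in for whatever add_boot_device_path() maintains. The point is only that the duplicate check reports failure to its caller instead of calling exit(), so the hot-plug path can print the message and keep the process alive.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the boot-device list. */
typedef struct BootEntry {
    int bootindex;
    const char *dev_id;
    struct BootEntry *next;
} BootEntry;

static BootEntry *boot_list;

/*
 * Register a device under a boot index.
 * Returns 0 on success, -1 on failure (duplicate index or out of memory).
 * On failure a message is written to errbuf; the process is NOT terminated.
 */
int boot_list_add(int bootindex, const char *dev_id,
                  char *errbuf, size_t errlen)
{
    for (BootEntry *e = boot_list; e != NULL; e = e->next) {
        if (e->bootindex == bootindex) {
            snprintf(errbuf, errlen,
                     "Two devices with same boot index %d", bootindex);
            return -1;
        }
    }

    BootEntry *node = malloc(sizeof(*node));
    if (node == NULL) {
        snprintf(errbuf, errlen, "out of memory");
        return -1;
    }
    node->bootindex = bootindex;
    node->dev_id = dev_id;
    node->next = boot_list;
    boot_list = node;
    return 0;
}

/*
 * Hypothetical hot-plug entry point: it forwards the error up the call
 * chain so the monitor command can report it and return normally.
 */
int hotplug_device(const char *dev_id, int bootindex,
                   char *errbuf, size_t errlen)
{
    if (boot_list_add(bootindex, dev_id, errbuf, errlen) < 0) {
        return -1;
    }
    return 0;
}
```

With this shape, a caller that adds "system-disk" at index 1 and "virtio-net-pci0" at index 2 and then tries "scsi-disk" at index 2 gets -1 and the message in errbuf, and decides for itself how to report it.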

Upstream first.

Comment 7 Marcel Apfelbaum 2013-09-16 10:23:02 UTC
The boot order is passed in fw_cfg and updated only once, at "machine done". The list is not updated after that point, so modifying the boot order from the monitor does not work at all.

To solve this issue we can either:
1. Disallow the use of bootindex at hot-plug.
2. Change the architecture so that changing the boot order during hot-plug becomes possible.

This is currently being discussed upstream.
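
Option 1 could look roughly like the sketch below. It is an illustration under stated assumptions, not the upstream design: the machine_done flag, notify_machine_done() and check_bootindex_allowed() are hypothetical names, and a negative bootindex is taken to mean "not specified".

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag: set once the boot order has been handed to firmware. */
static bool machine_done;

/* Hypothetical hook called at the "machine done" point. */
void notify_machine_done(void)
{
    machine_done = true;
}

/*
 * Decide whether a device may carry a bootindex.
 * Assumption: bootindex < 0 means the property was not given.
 * Returns 0 if allowed, -1 (with a message in errbuf) if the device is
 * being hot-plugged after the boot order was already fixed.
 */
int check_bootindex_allowed(int bootindex, char *errbuf, size_t errlen)
{
    if (bootindex >= 0 && machine_done) {
        snprintf(errbuf, errlen,
                 "bootindex is not supported for hot-plugged devices");
        return -1;
    }
    return 0;
}
```

The check is cheap and leaves cold-plugged devices unaffected; the cost is that a hot-plugged device can never be placed in the boot order, which is what option 2 would address.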

Comment 12 Ronen Hod 2014-07-13 14:25:02 UTC
Will be handled in RHEL7 BZ#771545.
Closing, since not important enough for RHEL6