Bug 970159

Summary: qemu-kvm-rhevm [race]: vm pauses with 'block I/O error in device '': No medium found (123)' when hounplug a disk and cannot be resumed
Product: Red Hat Enterprise Linux 6 Reporter: Dafna Ron <dron>
Component: qemu-kvmAssignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.4CC: acathrow, bsarathy, chayang, hateya, juzhang, lnovich, michen, minovotn, mkenneth, qzhang, stefanha, virt-maint, zhzhang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.387.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-11-21 06:58:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 896690    
Attachments:
Description Flags
logs none

Description Dafna Ron 2013-06-03 14:57:32 UTC
Created attachment 756350 [details]
logs

Description of problem:

I hotunpluged a vm and vm paused and I could not resume it. 
qemu reports: 

block I/O error in device '': No medium found (123)

Version-Release number of selected component (if applicable):

qemu-kvm-rhev-0.12.1.2-2.355.el6_4.3.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.355.el6_4.3.x86_64
qemu-img-rhev-0.12.1.2-2.355.el6_4.3.x86_64
vdsm-4.10.2-22.0.el6ev.x86_64
libvirt-0.10.2-18.el6_4.5.x86_64

How reproducible:

race - happened to me twice. 

Steps to Reproduce:

for rhevm its a race and I was unable to reproduce it but Paolo Bozini suggested the following: 

do a libvirt detach-disk while grub runs, wait at the GRUB menu, then choose an entry

and Stephan Hajnuczi tested and said reproduction worked. 

Actual results:

vm pauses and cannot be resumed. 

Expected results:

vm should not pause but if so we should at least be able to resume it. 

Additional info: logs


vm log from qemu: 

[root@cougar02 ~]# cat /var/log/libvirt/qemu/vm1.log 
2013-06-03 13:03:56.672+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name vm1 -S -M rhel6.4.0 -cpu Opteron_G3 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -uuid e95d8a97-e71c-4d31-90f7-a455f64ba7d2 -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=6Server-6.4.0.4.el6,serial=eb3b1d1d-5762-4e22-97dd-cba15fb6eb22,uuid=e95d8a97-e71c-4d31-90f7-a455f64ba7d2 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2013-06-03T13:03:56,driftfix=slew -no-shutdown -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/81ef11d0-4c0c-47b4-8953-d61a6af442d8/images/3d4db7ae-8d4a-4177-842e-13fc64b9d1fb/0e19002e-0e1e-4421-ac4c-471f050d8300,if=none,id=drive-virtio-disk0,format=raw,serial=3d4db7ae-8d4a-4177-842e-13fc64b9d1fb,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/7414f930-bbdb-4ec6-8132-4640cbb3c722/images/dfc826eb-6a02-4f9b-bbfd-ce2404087c9a/85ece7fe-ce9f-4bff-825d-ce5c79d4e27d,if=none,id=drive-virtio-disk1,format=qcow2,serial=dfc826eb-6a02-4f9b-bbfd-ce2404087c9a,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:23:a1:1e,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/vm1.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/vm1.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice port=5900,tls-port=5901,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -incoming tcp:[::]:49167 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8
block I/O error in device '': No medium found (123)
block I/O error in device '': No medium found (123)
block I/O error in device '': No medium found (123)

libvirt:

2013-06-03 13:06:25.930+0000: 10195: debug : qemuTeardownDiskPathDeny:112 : Process path /rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/7414f930-bbdb-4ec6-8132-4640cbb3c722/images/dfc826eb-6a02-4f9b-bbfd-ce2404087c9a/85ece7fe-ce9f-4bff-825d-ce5c79d4e27d for disk
2013-06-03 13:06:25.930+0000: 10195: debug : virCgroupSetValueStr:331 : Set value '/cgroup/devices/libvirt/qemu/vm1/devices.deny' to 'b 253:34 rwm'
2013-06-03 13:06:25.930+0000: 10195: debug : virFileClose:72 : Closed fd 25
2013-06-03 13:06:25.969+0000: 10195: debug : virFileClose:72 : Closed fd 25
2013-06-03 13:06:25.969+0000: 10195: debug : qemuDomainObjEndJob:937 : Stopping job: modify (async=none)
2013-06-03 13:06:25.969+0000: 10195: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7fb4dc0bb9a0
2013-06-03 13:06:25.969+0000: 10195: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7fb4dc0d2950
2013-06-03 13:06:25.969+0000: 10195: debug : virDomainFree:2281 : dom=0x7fb4e80c9600, (VM: name=vm1, uuid=e95d8a97-e71c-4d31-90f7-a455f64ba7d2)
2013-06-03 13:06:25.969+0000: 10195: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7fb4e80c9600
2013-06-03 13:06:25.969+0000: 10187: error : qemuProcessFindDomainDiskByAlias:406 : internal error no disk found with alias 
2013-06-03 13:06:25.969+0000: 10195: debug : virObjectUnref:137 : OBJECT_DISPOSE: obj=0x7fb4e80c9600
2013-06-03 13:06:25.969+0000: 10187: debug : qemuProcessHandleIOError:933 : Transitioned guest vm1 to paused state due to IO error
2013-06-03 13:06:25.969+0000: 10195: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7fb4f41061e0
2013-06-03 13:06:25.969+0000: 10195: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x102c6b0
2013-06-03 13:06:25.969+0000: 10195: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x1033e00
2013-06-03 13:06:25.970+0000: 10187: debug : qemuProcessHandleIOError:943 : Preserving lock state '(null)'
2013-06-03 13:06:26.003+0000: 10187: debug : virFileClose:72 : Closed fd 25


this is the hotunplug from vdsm log: 

Thread-197316::DEBUG::2013-06-03 16:06:25,654::libvirtvm::1830::vm.Vm::(hotunplugDisk) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::Hotunplug disk xml: <disk device="disk" snapshot="no" type="block">
        <address bus="0x00" domain="0x0000" function="0x0" slot="0x07" type="pci"/>
        <source dev="/rhev/data-center/7fd33b43-a9f4-4eb7-a885-e9583a929ceb/7414f930-bbdb-4ec6-8132-4640cbb3c722/images/dfc826eb-6a02-4f9b-bbfd-ce2404087c9a/85ece7fe-ce9f-4bff-825d-ce5c79d4e27d"/>
        <target bus="virtio" dev="vdb"/>
        <serial>dfc826eb-6a02-4f9b-bbfd-ce2404087c9a</serial>
        <driver cache="none" error_policy="stop" io="native" name="qemu" type="qcow2"/>
</disk>

libvirtEventLoop::INFO::2013-06-03 16:06:26,004::libvirtvm::2519::vm.Vm::(_onAbnormalStop) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::abnormal vm stop device  error eother
Thread-197316::DEBUG::2013-06-03 16:06:26,006::task::579::TaskManager.Task::(_updateState) Task=`703cf1e7-f3fb-436c-8a21-7248c22e09db`::moving from state init -> state preparing


if I try to resume it now I get: 

Thread-198122::DEBUG::2013-06-03 16:40:19,663::BindingXMLRPC::913::vds::(wrapper) client [10.35.161.49]::call vmCont with ('e95d8a97-e71c-4d31-90f7-a455f64ba7d2',) {} flowID [43c4ff4f]
libvirtEventLoop::DEBUG::2013-06-03 16:40:19,769::libvirtvm::3068::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::event Resumed detail 0 opaque None
libvirtEventLoop::DEBUG::2013-06-03 16:40:19,772::libvirtvm::3068::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::event Resumed detail 0 opaque None
libvirtEventLoop::INFO::2013-06-03 16:40:19,773::libvirtvm::2519::vm.Vm::(_onAbnormalStop) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::abnormal vm stop device  error eother
libvirtEventLoop::DEBUG::2013-06-03 16:40:19,773::libvirtvm::3068::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::event Suspended detail 2 opaque None
libvirtEventLoop::INFO::2013-06-03 16:40:19,774::libvirtvm::2519::vm.Vm::(_onAbnormalStop) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::abnormal vm stop device  error eother
libvirtEventLoop::INFO::2013-06-03 16:40:19,774::libvirtvm::2519::vm.Vm::(_onAbnormalStop) vmId=`e95d8a97-e71c-4d31-90f7-a455f64ba7d2`::abnormal vm stop device  error eother

Comment 1 Stefan Hajnoczi 2013-06-03 15:01:40 UTC
Patch posted:
http://thread.gmane.org/gmane.comp.emulators.qemu/214536

Comment 2 Chao Yang 2013-06-04 05:32:52 UTC
Reproduced with qemu-kvm-0.12.1.2-2.375.el6.x86_64, 2.6.32-383.el6.x86_64. Guest paused due to "No medium found (123)"

Steps:
1. boot a guest with data disk attached
2. wait on guest grub phase
3. hot unplug data disk by:
(qemu) device_del virtio-disk1
4. continue to boot guest by selecting a boot entry
5. hot unplug disk drive:
(qemu) __com.redhat_drive_del drive-virtio-disk1

Actual Result:

(qemu) block I/O error in device '': No medium found (123)

(qemu) info status 
VM status: paused (io-error)

{"timestamp": {"seconds": 1370323629, "microseconds": 933064}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "__com.redhat_debug_info": {"message": "No medium found", "errno": 123}, "__com.redhat_reason": "eother", "operation": "read", "action": "stop"}}
{"timestamp": {"seconds": 1370323629, "microseconds": 934711}, "event": "STOP"}


Providing with qa_ack+ based on above.

Comment 3 Chao Yang 2013-06-04 05:35:20 UTC
CLI:
/usr/libexec/qemu-kvm -M rhel6.5.0 -cpu Nehalem -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1 -name test -rtc base=utc -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0 -drive file=/home/images/rhel6.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/home/images/test.raw,if=none,id=drive-virtio-disk1,format=raw,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:42:0b:38,bus=pci.0 -spice port=8000,disable-ticketing -k en-us -vga qxl -global qxl-vga.vram_size=67108864 -monitor stdio -boot menu=on -qmp tcp:0:4455,server,nowait

Comment 10 Michal Novotny 2013-08-12 17:24:42 UTC
Fixed in version qemu-kvm-0.12.1.2-2.387.el6.

Michal

Comment 12 zhonglinzhang 2013-08-23 08:27:00 UTC
reproduce this issue with qemu-kvm-0.12.1.2-2.386.el6.x86_64 and 2.6.32-414.el6.x86_64 kernel

Steps:
1. boot a guest with data disk attached
the command line:
/usr/libexec/qemu-kvm -M pc -cpu SandyBridge  -enable-kvm -m 4G -smp 4,sockets=1,cores=2,threads=2 -name scalability-test -rtc base=localtime,clock=host,driftfix=slew  -k en-us  -boot menu=on -spice disable-ticketing,port=5930 -vga qxl -monitor stdio   -qmp tcp:0:7778,server,nowait -serial unix:/tmp/ttyS0,server,nowait -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0        -drive file=/home/RHEL-Server-6.5-64.qcow2,if=none,id=drive-system-disk,media=disk,format=qcow2,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-system-disk,id=system-disk,bootindex=1,addr=0x5      -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:22:15:27:54:8d,bus=pci.0,addr=0x9  -drive file=/home/adddisk.qcow2,if=none,id=drive-s1-disk,media=disk,format=qcow2,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-s1-disk,id=s1-disk,addr=0x7 

2. wait on guest boot guest by selecting a boot entry
(qemu) stop

3. hot unplug data disk by:
(qemu) device_del s1-disk

4. hot unplug disk drive:
(qemu) __com.redhat_drive_del drive-s1-disk

5. (qemu) cont

Actual Result:
(qemu) block I/O error in device '': No medium found (123)
(qemu) info status
VM status: paused (io-error)


verify with qemu-kvm-0.12.1.2-2.397.el6.x86_64 2.6.32-414.el6.x86_64 kernel
steps as above:
Actual Result:
(qemu) block I/O error in device '': No medium found (123)
(qemu) info status
VM status: running

After boot guest
(qemu) info block
drive-system-disk: removable=0 io-status=ok file=/home/RHEL-Server-6.5-64.qcow2 ro=0 drv=qcow2 encrypted=0
ide1-cd0: removable=1 locked=0 tray-open=0 io-status=ok [not inserted]
floppy0: removable=1 locked=0 tray-open=0 [not inserted]
sd0: removable=1 locked=0 tray-open=0 [not inserted]

hotplug one data disk
(qemu) __com.redhat_drive_add file=/home/adddisk.qcow2,id=drive-s2-disk
(qemu) device_add virtio-blk-pci,drive=drive-s2-disk,bus=pci.0,id=s2-disk

unhotplug data disk
(qemu) device_del s2-disk

success to hotplug and unhotplug,and guest work well.

Based above, this issue has been fixd.

Comment 14 errata-xmlrpc 2013-11-21 06:58:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1553.html