Bug 784324

Summary: [qemu-kvm] guest crash when connection to storage is blocked on host machine
Product: Fedora
Component: qemu
Version: 16
Hardware: x86_64
OS: Linux
Status: CLOSED INSUFFICIENT_DATA
Severity: high
Priority: high
Reporter: Dafna Ron <dron>
Assignee: Fedora Virtualization Maintainers <virt-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: abaron, acathrow, amit.shah, bazulay, berrange, cfergeau, clalancette, crobinso, dougsland, dwmw2, ehabkost, extras-orphan, gcosta, hateya, iheim, itamar, jaswinder, jforbes, knoel, markmc, mgoldboi, notting, quintela, scottt.tw, virt-maint, ykaul
Doc Type: Bug Fix
Last Closed: 2012-07-10 07:06:55 EDT
Attachments: logs

Description Dafna Ron 2012-01-24 10:21:08 EST
Created attachment 557253 [details]
logs

Description of problem:

Blocking storage access from the host using iptables causes the VMs running on the HSM host to shut down.

Version-Release number of selected component (if applicable):

vdsm-4.9.3.1-0.fc16.x86_64

How reproducible:

100%

Steps to Reproduce:
1. On an iSCSI storage setup with VMs running on 2 hosts, block storage access from the host using iptables.
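
The blocking step can be sketched with iptables; the storage address below is a placeholder and the iSCSI port is an assumption (iSCSI defaults to TCP 3260), neither is taken from the attached logs:

```shell
# Hypothetical sketch of the blocking step. STORAGE_IP is a placeholder
# (TEST-NET-1 address); iSCSI uses TCP port 3260 by default.
STORAGE_IP=192.0.2.10

# drop all outgoing iSCSI traffic from the host to the storage server
iptables -A OUTPUT -d "$STORAGE_IP" -p tcp --dport 3260 -j DROP

# ...observe the state of the VMs on the HSM host...

# restore connectivity afterwards
iptables -D OUTPUT -d "$STORAGE_IP" -p tcp --dport 3260 -j DROP
```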
  
Actual results:

VMs running on the HSM host shut down.

Expected results:

VMs should change state to Unknown.

Additional info: full logs are attached

first error: 

Thread-71::ERROR::2012-01-24 16:59:30,768::utils::393::vm.Vm::(collect) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stats function failed: <AdvancedStatsFunction _sampleDiskLatency at 0x1e28960>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 389, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 266, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 155, in _sampleDiskLatency
    stats = _blockstatsParses(out)
  File "/usr/share/vdsm/libvirtvm.py", line 142, in _blockstatsParses
    'flush_op':devStats['flush_operations'],
KeyError: 'flush_operations'
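
The KeyError suggests that the blockstats output qemu returned for this device did not include the flush counters (plausible for a device in an error state or an older qemu); the vdsm parser indexed the dict unconditionally. A minimal defensive sketch, with hypothetical names not taken from the vdsm source:

```python
# Hypothetical defensive variant of the blockstats parsing that crashed above:
# fall back to 0 when a counter (e.g. flush_operations) is absent instead of
# raising KeyError.
def parse_blockstats(dev_stats):
    """dev_stats: dict of counters parsed from qemu's 'info blockstats'."""
    return {
        'rd_op': dev_stats.get('rd_operations', 0),
        'wr_op': dev_stats.get('wr_operations', 0),
        'flush_op': dev_stats.get('flush_operations', 0),  # missing in this bug
    }
```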


second error: 

Thread-53::ERROR::2012-01-24 17:04:43,590::domainMonitor::120::Storage.DomainMonitor::(_monitorDomain) Error while collecting domain `038e5836-a83a-4aa2-9fd3-77b2ee910511` monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 105, in _monitorDomain
    nextStatus.readDelay = domain.getReadDelay()
  File "/usr/share/vdsm/storage/blockSD.py", line 423, in getReadDelay
    f.read(4096)
  File "/usr/share/vdsm/storage/fileUtils.py", line 287, in read
    raise OSError(err, msg)
OSError: [Errno 5] Input/output error
Thread-53::DEBUG::2012-01-24 17:04:43,593::domainMonitor::130::Storage.DomainMonitor::(_monitorDomain) Domain `038e5836-a83a-4aa2-9fd3-77b2ee910511` changed its status to Invalid
libvirtEventLoop::INFO::2012-01-24 17:04:43,605::libvirtvm::1301::vm.Vm::(_onAbnormalStop) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::abnormal vm stop device ide0-0-0 error 
libvirtEventLoop::DEBUG::2012-01-24 17:04:43,606::libvirtvm::1475::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::event Suspended detail 2 opaque None
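
getReadDelay times a small read from the storage domain's block device; once the LUN is blocked, the read fails with EIO and the domain monitor flips the domain to Invalid, as the log shows. A rough sketch of that idea (the function name and the omission of O_DIRECT are my simplifications, not vdsm's code):

```python
import os
import time

def read_delay(path, size=4096):
    """Time a small read from `path`; raises OSError (e.g. Errno 5 / EIO)
    if the underlying storage is unreachable, as in the traceback above."""
    start = time.time()
    fd = os.open(path, os.O_RDONLY)  # vdsm reads with O_DIRECT; omitted here
    try:
        os.read(fd, size)
    finally:
        os.close(fd)
    return time.time() - start
```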


vm is killed:

Thread-71::ERROR::2012-01-24 17:04:59,585::utils::393::vm.Vm::(collect) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x1e27b40>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 389, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 266, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 79, in _highWrite
    self._vm._dom.blockInfo(vmDrive.path, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 483, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 79, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1383, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: Unable to read from monitor: Connection reset by peer
libvirtEventLoop::DEBUG::2012-01-24 17:04:59,587::libvirtvm::1475::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::event Stopped detail 5 opaque None
libvirtEventLoop::INFO::2012-01-24 17:04:59,589::vm::581::vm.Vm::(_onQemuDeath) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::underlying process disconnected
libvirtEventLoop::INFO::2012-01-24 17:04:59,590::libvirtvm::1360::vm.Vm::(releaseVm) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Release VM resources
libvirtEventLoop::DEBUG::2012-01-24 17:04:59,599::utils::330::vm.Vm::(stop) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stop statistics collection
Thread-71::DEBUG::2012-01-24 17:04:59,600::utils::361::vm.Vm::(run) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stats thread finished
libvirtEventLoop::WARNING::2012-01-24 17:04:59,603::libvirtvm::1371::vm.Vm::(releaseVm) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 1368, in releaseVm
    self._dom.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 483, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 79, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 553, in destroy
    if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '9e160556-2e1f-4624-8edc-a7fb06e2fe09'
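
This final traceback is a secondary symptom: qemu has already died and libvirt reaped the domain, so by the time releaseVm calls destroy() there is nothing left to destroy. A hedged sketch of idempotent teardown; the error class and code are parameterized because the exact libvirt constants are assumptions here:

```python
# Hypothetical sketch: treat "domain already gone" as success so that
# teardown after a qemu crash (as in the log above) does not raise.
def safe_destroy(dom, error_cls, no_domain_code):
    """dom: a libvirt-like domain object; error_cls: exception type raised
    by destroy(); no_domain_code: code meaning VIR_ERR_NO_DOMAIN."""
    try:
        dom.destroy()
        return True
    except error_cls as e:
        if e.get_error_code() == no_domain_code:
            return True  # qemu already exited; nothing left to release
        raise
```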
Comment 1 Haim 2012-01-30 08:58:19 EST
guest crash in qemu log:

qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.15.1/hw/ide/pci.c:314: bmdma_cmd_writeb: Assertion `bm->bus->dma->aiocb == ((void *)0)' failed.
2012-01-30 13:25:57.922+0000: shutting down

qemu-kvm guest command line:

2012-01-30 13:09:50.908+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=spice /usr/bin/qemu-kvm -S -M pc-0.14 -cpu qemu64,+misalignsse,+sse4a,+abm,+lahf_lm,+rdtscp,+popcnt,+cx16,+monitor -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name XP -uuid 732d1381-be78-47f8-a9ee-60d694174f7f -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=16-1,serial=44454C4C-3800-1046-8035-B4C04F34344A_00:22:19:14:7E:B8,uuid=732d1381-be78-47f8-a9ee-60d694174f7f -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/XP.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2012-01-30T13:09:50,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -drive file=/rhev/data-center/8a006ced-ec17-403c-a4bc-4fe7ebca8097/4941f9d5-7e2b-4415-abf7-59b1e9ddc171/images/3fa8f4a7-9ddf-41de-b5f7-93ad8e7181c3/da23186c-d70c-4264-b24a-bc75296c3e38,if=none,id=drive-ide0-0-0,format=qcow2,serial=3fa8f4a7-9ddf-41de-b5f7-93ad8e7181c3,cache=none,werror=stop,rerror=stop,aio=native -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=28,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:23:61:05,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/XP.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -usb -spice port=5900,tls-port=5901,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=inputs -k en-us -vga qxl -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0
Comment 2 Yaniv Kaul 2012-01-30 09:05:05 EST
Can we capture the stack of the guest and provide a nice trace to kvm?
Comment 3 Haim 2012-01-30 09:35:59 EST
(In reply to comment #2)
> Can we capture the stack of the guest and provide a nice trace to kvm?

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/qemu-kvm -S -M pc-0.14 -cpu qemu64,+misalignsse,+sse4a,+abm,+lahf_lm,+'.
Program terminated with signal 6, Aborted.
#0  0x00007f68b62b5285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f68b62b5285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f68b62b6b9b in __GI_abort () at abort.c:91
#2  0x00007f68b799c2c3 in ?? ()
#3  0x00007f68bbfd6eb0 in ?? ()
#4  0x000000000000170d in ?? ()
#5  0x00007f68bbfd7060 in ?? ()
#6  0x00007f68bad9de1f in qemu_spice_init () at ui/spice-core.c:614
#7  0x00007f68bad286fb in module_call_init (type=<optimized out>) at module.c:78
#8  0x00007f68bacf7962 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /usr/src/debug/qemu-kvm-0.15.1/vl.c:3226
Comment 4 Fedora Admin XMLRPC Client 2012-03-15 13:54:39 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 5 Cole Robinson 2012-05-28 20:04:48 EDT
Build ID says this is F16, not F15
Comment 6 Cole Robinson 2012-07-10 07:06:55 EDT
Stack trace in Comment #3 isn't complete; you probably need to run debuginfo-install qemu-system-x86. If you can still reproduce, please attach a stack trace without any ?? lines and reopen.
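
For reference, installing the debug symbols and re-extracting the backtrace from the saved core might look like the following; the package name matches the comment above, while the core-file path is an illustrative placeholder:

```shell
# Illustrative only: install qemu debug symbols, then re-run gdb against
# the saved core file (path is a placeholder, not from this report).
debuginfo-install -y qemu-system-x86
gdb /usr/bin/qemu-kvm /var/tmp/qemu-core -batch -ex 'bt full'
```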