Bug 784324 - [qemu-kvm] guest crash when connection to storage is blocked on host machine
Summary: [qemu-kvm] guest crash when connection to storage is blocked on host machine
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 16
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-01-24 15:21 UTC by Dafna Ron
Modified: 2013-01-09 11:58 UTC
CC: 26 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-07-10 11:06:55 UTC
Type: ---
Embargoed:


Attachments
logs (582.81 KB, application/x-gzip)
2012-01-24 15:21 UTC, Dafna Ron

Description Dafna Ron 2012-01-24 15:21:08 UTC
Created attachment 557253 [details]
logs

Description of problem:

Blocking the storage connection from the host using iptables causes the VMs running on the HSM host to shut down.

Version-Release number of selected component (if applicable):

vdsm-4.9.3.1-0.fc16.x86_64

How reproducible:

100%

Steps to Reproduce:
1. In an iSCSI storage setup with VMs running on 2 hosts, block the storage connection from the host using iptables (see the sketch below).
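
A minimal sketch of the blocking step, assuming the default iSCSI target port 3260 and a placeholder portal address (192.0.2.10); neither value is taken from the original report:

  # On the host, drop all traffic to and from the iSCSI portal (placeholder address)
  iptables -A OUTPUT -d 192.0.2.10 -p tcp --dport 3260 -j DROP
  iptables -A INPUT -s 192.0.2.10 -p tcp --sport 3260 -j DROP
  # To restore connectivity afterwards, delete the same rules
  iptables -D OUTPUT -d 192.0.2.10 -p tcp --dport 3260 -j DROP
  iptables -D INPUT -s 192.0.2.10 -p tcp --sport 3260 -j DROP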
  
Actual results:

VMs running on the HSM host shut down.

Expected results:

The VMs should change state to Unknown.

Additional info: full logs are attached

first error: 

Thread-71::ERROR::2012-01-24 16:59:30,768::utils::393::vm.Vm::(collect) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stats function failed: <AdvancedStatsFunction _sampleDiskLatency at 0x1e28960>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 389, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 266, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 155, in _sampleDiskLatency
    stats = _blockstatsParses(out)
  File "/usr/share/vdsm/libvirtvm.py", line 142, in _blockstatsParses
    'flush_op':devStats['flush_operations'],
KeyError: 'flush_operations'


second error: 

Thread-53::ERROR::2012-01-24 17:04:43,590::domainMonitor::120::Storage.DomainMonitor::(_monitorDomain) Error while collecting domain `038e5836-a83a-4aa2-9fd3-77b2ee910511` monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 105, in _monitorDomain
    nextStatus.readDelay = domain.getReadDelay()
  File "/usr/share/vdsm/storage/blockSD.py", line 423, in getReadDelay
    f.read(4096)
  File "/usr/share/vdsm/storage/fileUtils.py", line 287, in read
    raise OSError(err, msg)
OSError: [Errno 5] Input/output error
Thread-53::DEBUG::2012-01-24 17:04:43,593::domainMonitor::130::Storage.DomainMonitor::(_monitorDomain) Domain `038e5836-a83a-4aa2-9fd3-77b2ee910511` changed its status to Invalid
libvirtEventLoop::INFO::2012-01-24 17:04:43,605::libvirtvm::1301::vm.Vm::(_onAbnormalStop) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::abnormal vm stop device ide0-0-0 error 
libvirtEventLoop::DEBUG::2012-01-24 17:04:43,606::libvirtvm::1475::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::event Suspended detail 2 opaque None


vm is killed:

Thread-71::ERROR::2012-01-24 17:04:59,585::utils::393::vm.Vm::(collect) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x1e27b40>
Traceback (most recent call last):
  File "/usr/share/vdsm/utils.py", line 389, in collect
    statsFunction()
  File "/usr/share/vdsm/utils.py", line 266, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 79, in _highWrite
    self._vm._dom.blockInfo(vmDrive.path, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 483, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 79, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1383, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: Unable to read from monitor: Connection reset by peer
libvirtEventLoop::DEBUG::2012-01-24 17:04:59,587::libvirtvm::1475::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::event Stopped detail 5 opaque None
libvirtEventLoop::INFO::2012-01-24 17:04:59,589::vm::581::vm.Vm::(_onQemuDeath) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::underlying process disconnected
libvirtEventLoop::INFO::2012-01-24 17:04:59,590::libvirtvm::1360::vm.Vm::(releaseVm) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Release VM resources
libvirtEventLoop::DEBUG::2012-01-24 17:04:59,599::utils::330::vm.Vm::(stop) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stop statistics collection
Thread-71::DEBUG::2012-01-24 17:04:59,600::utils::361::vm.Vm::(run) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Stats thread finished
libvirtEventLoop::WARNING::2012-01-24 17:04:59,603::libvirtvm::1371::vm.Vm::(releaseVm) vmId=`9e160556-2e1f-4624-8edc-a7fb06e2fe09`::Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 1368, in releaseVm
    self._dom.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 483, in f
    ret = attr(*args, **kwargs)
  File "/usr/share/vdsm/libvirtconnection.py", line 79, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 553, in destroy
    if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '9e160556-2e1f-4624-8edc-a7fb06e2fe09'

Comment 1 Haim 2012-01-30 13:58:19 UTC
Guest crash in the qemu log:

qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.15.1/hw/ide/pci.c:314: bmdma_cmd_writeb: Assertion `bm->bus->dma->aiocb == ((void *)0)' failed.
2012-01-30 13:25:57.922+0000: shutting down

qemu-kvm guest command line:

2012-01-30 13:09:50.908+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=spice /usr/bin/qemu-kvm -S -M pc-0.14 -cpu qemu64,+misalignsse,+sse4a,+abm,+lahf_lm,+rdtscp,+popcnt,+cx16,+monitor -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name XP -uuid 732d1381-be78-47f8-a9ee-60d694174f7f -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=16-1,serial=44454C4C-3800-1046-8035-B4C04F34344A_00:22:19:14:7E:B8,uuid=732d1381-be78-47f8-a9ee-60d694174f7f -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/XP.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2012-01-30T13:09:50,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -drive file=/rhev/data-center/8a006ced-ec17-403c-a4bc-4fe7ebca8097/4941f9d5-7e2b-4415-abf7-59b1e9ddc171/images/3fa8f4a7-9ddf-41de-b5f7-93ad8e7181c3/da23186c-d70c-4264-b24a-bc75296c3e38,if=none,id=drive-ide0-0-0,format=qcow2,serial=3fa8f4a7-9ddf-41de-b5f7-93ad8e7181c3,cache=none,werror=stop,rerror=stop,aio=native -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=28,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:23:61:05,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/XP.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -usb -spice port=5900,tls-port=5901,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=inputs -k en-us -vga qxl -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0

Comment 2 Yaniv Kaul 2012-01-30 14:05:05 UTC
Can we capture the stack of the guest and provide a proper trace to kvm?

Comment 3 Haim 2012-01-30 14:35:59 UTC
(In reply to comment #2)
> Can we capture the stack of the guest and provide a proper trace to kvm?

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/qemu-kvm -S -M pc-0.14 -cpu qemu64,+misalignsse,+sse4a,+abm,+lahf_lm,+'.
Program terminated with signal 6, Aborted.
#0  0x00007f68b62b5285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f68b62b5285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f68b62b6b9b in __GI_abort () at abort.c:91
#2  0x00007f68b799c2c3 in ?? ()
#3  0x00007f68bbfd6eb0 in ?? ()
#4  0x000000000000170d in ?? ()
#5  0x00007f68bbfd7060 in ?? ()
#6  0x00007f68bad9de1f in qemu_spice_init () at ui/spice-core.c:614
#7  0x00007f68bad286fb in module_call_init (type=<optimized out>) at module.c:78
#8  0x00007f68bacf7962 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /usr/src/debug/qemu-kvm-0.15.1/vl.c:3226

Comment 4 Fedora Admin XMLRPC Client 2012-03-15 17:54:39 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 5 Cole Robinson 2012-05-29 00:04:48 UTC
Build ID says this is F16, not F15

Comment 6 Cole Robinson 2012-07-10 11:06:55 UTC
The stack trace in Comment #3 isn't complete; you probably need to run debuginfo-install qemu-system-x86. If you can still reproduce, please attach a stack trace without any ?? lines and reopen.
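
A sketch of how such a trace could be gathered on the host; the core file path below is a placeholder, and the debuginfo subpackage name simply follows the suggestion above:

  # Install qemu debug symbols so gdb can resolve the ?? frames
  debuginfo-install qemu-system-x86

  # Open the core dump (placeholder path) with the matching binary
  gdb /usr/bin/qemu-kvm /path/to/core

  # Inside gdb, collect a full backtrace of all threads
  (gdb) thread apply all bt full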

