Description of problem:
VM guests occasionally hard shut down unexpectedly with the error:
"qemu-system-x86_64: block.c:2806: bdrv_error_action: Assertion `error >= 0' failed."
In my environment only my Windows 2008R2 guests appear to be affected, although I know of another environment where SLES guests are affected, as per this email: http://lists.gnu.org/archive/html/qemu-discuss/2014-06/msg00094.html

Version-Release number of selected component (if applicable):
* qemu-system-x86-1.6.2-5.fc20.x86_64

How reproducible:
Difficult to reproduce; for me it occurs roughly once a week per guest VM.

Additional info:
* I am running OpenStack Icehouse on Fedora 20 (via packstack)
* Kernels: kernel-3.14.4-200.fc20.x86_64 / kernel-3.14.8-200.fc20.x86_64
* libvirt-1.1.3.5-2.fc20.x86_64
* Guest OS: Windows 2008R2
* Guest VirtIO driver version 0.1-81
* Guest storage is via NFS export from a NetApp FAS 6220 cluster.
* These unexpected shutdowns do not occur for me at busy times for either the guests or the hosts.
*** Bug 1147398 has been marked as a duplicate of this bug. ***
bdrv_error_action is called from 3 places, so what would help most here is a stack trace. The easiest approach is to enable core dumps and make sure the core dump is captured when qemu fails.
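As a rough illustration of the "enable core dumps" step, here is a minimal Python sketch that raises the core-size limit before launching a process by hand. It assumes qemu is started directly from a wrapper; the helper names `allow_core_dumps` and `run_with_core_dumps` are made up for this example. It does not cover a qemu process spawned by libvirtd, which inherits its limits from libvirtd instead, and where the core file ends up still depends on the host's core_pattern setting.

```python
import resource
import subprocess
import sys


def allow_core_dumps():
    """Raise the soft core-size limit of the calling process to its hard limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def run_with_core_dumps(argv):
    """Run argv as a child process whose core limit is raised just before exec.

    Returns the child's return code. A negative value means the child was
    killed by that signal number; a failed assert() ends in SIGABRT.
    """
    proc = subprocess.Popen(argv, preexec_fn=allow_core_dumps)
    return proc.wait()


# Hypothetical usage (the qemu arguments are placeholders, not from this bug):
#   rc = run_with_core_dumps(["qemu-system-x86_64", "-m", "1024"])
```

If the hard limit itself is 0, the soft limit stays at 0 and no core file is written, so the hard limit would have to be raised by a privileged parent first.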
Thanks for the idea. That sounds better than my plan of recompiling qemu with debug messages. Can you give a hint on how to achieve this in an oVirt/libvirt environment?
I hit this problem with qemu-1.6.1 too; in my case it affects Debian 7 guests.
This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Since F20 is EOL soon, closing this. If anyone can still reproduce this with F21+, please reopen and I'll take a look.
We see this in upstream openstack CI testing, viewable here:

http://logs.openstack.org/07/251407/2/check/gate-tempest-dsvm-full/144f7fc/logs/libvirt/libvirtd.txt.gz#_2015-11-30_18_20_18_168

2015-11-30 18:20:18.168+0000: 31539: error : qemuMonitorIO:656 : internal error: End of file from monitor
2015-11-30 18:20:18.168+0000: 31539: debug : qemuMonitorIO:710 : Error on monitor internal error: End of file from monitor
2015-11-30 18:20:18.168+0000: 31539: debug : qemuMonitorIO:731 : Triggering EOF callback
2015-11-30 18:20:18.168+0000: 31539: debug : qemuProcessHandleMonitorEOF:300 : Received EOF on 0x7fa310011240 'instance-00000066'
2015-11-30 18:20:18.168+0000: 31539: debug : qemuProcessHandleMonitorEOF:318 : Monitor connection to 'instance-00000066' closed without SHUTDOWN event; assuming the domain crashed
2015-11-30 18:20:18.168+0000: 31539: debug : virObjectEventNew:643 : obj=0x7fa340aab850
2015-11-30 18:20:18.168+0000: 31539: debug : qemuProcessStop:4235 : Shutting down vm=0x7fa310011240 name=instance-00000066 id=150 pid=17830 flags=0

This was the domain log:

http://logs.openstack.org/07/251407/2/check/gate-tempest-dsvm-full/144f7fc/logs/libvirt/qemu/instance-00000066.txt.gz

I noticed this:

char device redirected to /dev/pts/1 (label charserial1)
qemu-system-x86_64: /build/qemu-5LgLIn/qemu-2.0.0+dfsg/block.c:3491: bdrv_error_action: Assertion `error >= 0' failed.
2015-11-30 18:20:18.168+0000: shutting down

This is a volume-backed VM. I think around the time that this fails, we should be trying to plug a virtual interface.
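For anyone wanting to check how often their guests hit this, a small Python sketch along these lines could scan a domain log for glibc-style assertion failures like the one quoted above. The regular expression and the `find_assertions` helper are assumptions made for illustration, modelled only on the two assertion lines quoted in this bug; they are not part of qemu or libvirt.

```python
import re

# Matches lines such as:
#   qemu-system-x86_64: block.c:2806: bdrv_error_action: Assertion `error >= 0' failed.
ASSERT_RE = re.compile(
    r"^(?P<prog>[^:\s]+): (?P<file>\S+):(?P<line>\d+): "
    r"(?P<func>\w+): Assertion `(?P<expr>.+)' failed\.$"
)


def find_assertions(lines):
    """Return (source file, line number, function, expression) per assertion line."""
    hits = []
    for text in lines:
        match = ASSERT_RE.match(text.strip())
        if match:
            hits.append(
                (match["file"], int(match["line"]), match["func"], match["expr"])
            )
    return hits


# Hypothetical usage against a local copy of a domain log:
#   with open("instance-00000066.txt") as log:
#       for hit in find_assertions(log):
#           print(hit)
```

The differing line numbers (block.c:2806 in the original report, block.c:3491 here) are what one would expect from the different qemu versions involved, so grouping results by function name rather than by line number is probably more useful.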
Possibly also helpful:

http://logs.openstack.org/07/251407/2/check/gate-tempest-dsvm-full/144f7fc/logs/screen-n-net.txt.gz#_2015-11-30_18_19_45_252

2015-11-30 18:19:45.251 DEBUG oslo_concurrency.processutils [req-8911e8c7-2466-408f-832e-af4b78e9adec tempest-TestVolumeBootPattern-2142876884 tempest-TestVolumeBootPattern-1970238908] CMD "sudo nova-rootwrap /etc/nova/rootwrap.conf ebtables --concurrent -t nat -D PREROUTING --logical-in br100 -p ipv4 --ip-src 10.1.0.3 ! --ip-dst 10.1.0.0/20 -j redirect --redirect-target ACCEPT" returned: 255 in 0.147s execute /usr/local/lib/python2.7/dist-packages/oslo_concurrency/processutils.py:297
Regarding comment 7: this is OpenStack Mitaka, with libvirt 1.2.2 and QEMU 2.0.0 on an Ubuntu 14.04 compute host.
If you are hitting this on Ubuntu, you need to file an Ubuntu bug.