Bug 714344

Summary: qemu dies after suspend & resume operations with an assert message.
Product: Red Hat Enterprise Linux 6
Component: qemu-kvm
Version: 6.1
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: urgent
Priority: unspecified
Target Milestone: rc
Target Release: ---
Reporter: tuhongj
Assignee: Juan Quintela <quintela>
QA Contact: Virtualization Bugs <virt-bugs>
Docs Contact:
CC: bugproxy, juzhang, kwolf, mkenneth, tburke, virt-maint
Doc Type: Bug Fix
Last Closed: 2011-08-26 14:14:07 UTC
Attachments:
  [abrt] new crash was detected (flags: none)
  sosreport (flags: none)

Description tuhongj 2011-06-18 09:33:01 UTC
Created attachment 505376: [abrt] new crash was detected

Description of problem:

The virtual instance (qemu-kvm process) disappears after virsh suspend & resume operations. 

qemu
Version-Release number of selected component (if applicable):

qemu-kvm-0.12.1.2-2.160.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.160.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.160.el6.x86_64
libvirt-0.8.8-1.el6.x86_64
libvirt-client-0.8.8-1.el6.x86_64
libvirt-devel-0.8.8-1.el6.x86_64

How reproducible:

It is very easy to reproduce especially with a non-virtio base image. A non-virtio image somehow can not be reproduced every time.

Steps to Reproduce:
1. virsh suspend instance_id
2. wait for a little while
3. virsh resume instance_id
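
The three steps above can be scripted so the cycle is easy to repeat. The following is only a minimal sketch, not something taken from the reporter's setup: the function name, the default wait times, and the check against `virsh domstate` output are assumptions made for illustration. The `run` and `sleep` parameters exist only so the logic can be exercised without libvirt.

```python
import subprocess
import time


def suspend_resume_cycle(domain, wait_seconds=10.0, settle_seconds=2.0,
                         run=subprocess.run, sleep=time.sleep):
    """Run one suspend/wait/resume cycle and report whether the guest survived.

    The wait times are arbitrary illustrative defaults, not values from the report.
    """
    run(["virsh", "suspend", domain], check=True)   # step 1
    sleep(wait_seconds)                             # step 2: wait for a little while
    run(["virsh", "resume", domain], check=False)   # step 3
    sleep(settle_seconds)                           # give a crashing qemu-kvm time to exit
    state = run(["virsh", "domstate", domain],
                capture_output=True, text=True, check=False)
    # A surviving guest is expected to report "running"; a guest whose
    # qemu-kvm process died reports something else or virsh fails.
    return state.returncode == 0 and state.stdout.strip() == "running"
```

Calling `suspend_resume_cycle("instance_id")` in a loop until it returns `False` would be one way to catch a run where the process disappears.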
  
Actual results:

That instance (qemu-kvm process) is gone!!!

Expected results:

The instance should continue to work

Additional info:

https://bugzilla.redhat.com/show_bug.cgi?id=589681

This assertion appears in the qemu instance log:
qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/hw/ide/internal.h:517: bmdma_active_if: Assertion `bmdma->unit != (uint8_t)-1' failed.
/var/log/messages also contains:
Jun 17 14:34:17 r002idp018 abrt[20851]: saved core dump of pid 29187 (/usr/libexec/qemu-kvm) to /var/spool/abrt/ccpp-1308292435-29187.new/coredump (2163900416 bytes)
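
When checking several instance logs for the same failure, the assertion message can be picked apart mechanically. This is a rough sketch that assumes the usual glibc layout "program: file:line: function: Assertion `expr' failed."; the names `ASSERT_RE` and `parse_assert_line` are made up for this example.

```python
import re

# Assumed layout of a glibc assert message:
#   program: file:line: function: Assertion `expr' failed.
ASSERT_RE = re.compile(
    r"^(?P<program>[^:]+): (?P<file>[^:]+):(?P<line>\d+): "
    r"(?P<function>\w+): Assertion `(?P<expr>.*)' failed\.$"
)


def parse_assert_line(text):
    """Return the fields of an assert message, or None if the text does not match."""
    # Collapse line wraps and repeated spaces so a wrapped message still matches.
    normalized = " ".join(text.split())
    match = ASSERT_RE.match(normalized)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    return fields
```

For the message above this yields the function `bmdma_active_if` and the expression `bmdma->unit != (uint8_t)-1`. In C, `(uint8_t)-1` is 255, so the assertion fires when `unit` holds that value, which presumably serves as a "no unit" marker.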

The call stack from that core file:
#0  0x000000376c232a45 in raise () from /lib64/libc.so.6
#1  0x000000376c234225 in abort () from /lib64/libc.so.6
#2  0x000000376c22b9d5 in __assert_fail () from /lib64/libc.so.6
#3  0x000000000043e3da in bmdma_active_if (opaque=0x371ffe0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/internal.h:517
#4  ide_dma_restart_bh (opaque=0x371ffe0) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:696
#5  0x00000000004115ed in qemu_bh_poll () at /usr/src/debug/qemu-kvm-0.12.1.2/async.c:150
#6  0x000000000040bbd1 in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4472
#7  0x000000000042b55a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2164
#8  0x000000000040ef55 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4640
#9  main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6845

We need to ship our product based on RHEL 6.1, and this is a severe problem
that will block our shipment. It would be good to have it fixed as soon as
possible; otherwise we will have to switch to another OS or to Xen.

Comment 2 juzhang 2011-06-18 10:36:38 UTC
This issue is probably a duplicate of bz703554.
---- snip from bz703554 ----
Core dump occurs with qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/hw/ide/internal.h:516: bmdma_active_if: Assertion `bmdma->unit != (uint8_t)-1' failed.

https://bugzilla.redhat.com/show_bug.cgi?id=703554#c9

Comment 3 tuhongj 2011-06-19 09:50:48 UTC
Is that an internal bug report? I'm not authorized to access https://bugzilla.redhat.com/show_bug.cgi?id=703554#c9
Since this defect is quite easy to reproduce, may I know how soon I can expect a fix?

Comment 4 tuhongj 2011-06-20 12:55:11 UTC
After further testing, it seems that if I use an externally attached swap file, suspend and resume trigger the assertion.

Comment 5 Suqin Huang 2011-06-22 23:07:15 UTC
*** Bug 715195 has been marked as a duplicate of this bug. ***

Comment 6 IBM Bug Proxy 2011-06-22 23:14:23 UTC
Created attachment 506091: sosreport

Comment 8 IBM Bug Proxy 2011-08-24 12:31:08 UTC
------- Comment From vahegde1.ibm.com 2011-08-24 08:23 EDT-------
Hi Red Hat,

We tried upstream qemu-kvm (qemu-kvm-0.15.0) and were not able to reproduce this issue. However, I have not been able to figure out which patch fixed it.

Any update on this issue? Will the fix be included in the RHEL 6.2 cycle?

Thanks
Vasant

Comment 9 Kevin Wolf 2011-08-24 13:06:03 UTC
This is probably a duplicate of bug 698537 which is already fixed in current 6.2 packages. Can you give it a try?

Comment 10 IBM Bug Proxy 2011-08-24 13:41:35 UTC
------- Comment From vahegde1.ibm.com 2011-08-24 09:31 EDT-------
Hi Kevin,
(In reply to comment #20)
> This is probably a duplicate of bug 698537 which is already fixed in current
> 6.2 packages. Can you give it a try?

I see that bz698537 is verified against qemu-kvm-0.12.1.2-2.172.el6. I have downloaded these packages from the RHN site and will give them a try.

I will get back with the result soon.

Thanks
Vasant

Comment 11 IBM Bug Proxy 2011-08-26 13:00:54 UTC
------- Comment From vahegde1.ibm.com 2011-08-26 08:59 EDT-------
Hi Kevin,

Tried with qemu-kvm-0.12.1.2-2.172.el6 and it's working fine for me. Can we assume this version will be included in RHEL 6.2?

Thanks
Vasant
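
For anyone checking whether an installed build is at least the one tested above, a comparison along these lines may help. It is a simplified sketch that only compares the leading numeric parts of the version and release, not RPM's full version comparison. Treating 2.172 as the threshold is an assumption based on the build that was tested here; the fix may have landed in an earlier build. The helper names are hypothetical.

```python
import re

# Splits "name-version-release" as printed by "rpm -q" (the arch suffix stays in the release).
NVR_RE = re.compile(r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<release>[^-]+)$")


def leading_numbers(part):
    """Return the leading dot-separated integers, e.g. "2.160.el6.x86_64" -> (2, 160)."""
    numbers = []
    for piece in part.split("."):
        if not piece.isdigit():
            break
        numbers.append(int(piece))
    return tuple(numbers)


def build_key(nvr):
    """Turn a package string into a comparable (version, release) key."""
    match = NVR_RE.match(nvr)
    if match is None:
        raise ValueError(f"not a name-version-release string: {nvr!r}")
    return (leading_numbers(match.group("version")),
            leading_numbers(match.group("release")))


def at_least(installed, tested="qemu-kvm-0.12.1.2-2.172.el6"):
    """True if `installed` is the tested build or newer (assumed threshold)."""
    return build_key(installed) >= build_key(tested)
```

With the package from the original description, `at_least("qemu-kvm-0.12.1.2-2.160.el6.x86_64")` is false, which matches the build where the crash was reported.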

Comment 12 Kevin Wolf 2011-08-26 14:14:07 UTC
Thanks for testing this. Yes, the fix will be included in RHEL 6.2.

*** This bug has been marked as a duplicate of bug 698537 ***