Bug 853369 - libvirt error: "could not destroy libvirt domain: Requested operation is not valid: domain is not running" is unclear, qemu is actually segfaulting
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-08-31 09:06 UTC by Richard W.M. Jones
Modified: 2013-10-16 15:41 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-10-16 15:41:52 UTC
Embargoed:


Attachments
libvirt.log (40.58 KB, text/plain)
2012-08-31 11:18 UTC, Richard W.M. Jones
libvirtd.log (log file from daemon) (1.49 MB, text/plain)
2012-08-31 11:28 UTC, Richard W.M. Jones

Description Richard W.M. Jones 2012-08-31 09:06:33 UTC
Description of problem:

When trying to shut down a transient domain which *is* running I get:

*stdin*:31: libguestfs: error: could not destroy libvirt domain: Requested operation is not valid: domain is not running [code=55 domain=10]

This used to work in libvirt-0.10.0-0rc0.2.fc18.x86_64 but
seems to have broken in Rawhide (libvirt-0.10.0-1.fc19.x86_64).

Version-Release number of selected component (if applicable):

libvirt-0.10.0-1.fc19.x86_64

How reproducible:

At least once.

Steps to Reproduce:
1. Build libguestfs in Rawhide.
 
Actual results:

http://kojipkgs.fedoraproject.org//work/tasks/8635/4438635/build.log

Comment 1 Richard W.M. Jones 2012-08-31 09:36:51 UTC
Second attempt at building also fails the same way:
http://kojipkgs.fedoraproject.org//work/tasks/931/4440931/build.log

As requested I'll try to get libvirt logs of this.

Comment 2 Osier Yang 2012-08-31 09:58:01 UTC
I couldn't reproduce the problem on top of libvirt git.

Comment 3 Osier Yang 2012-08-31 09:58:49 UTC
(In reply to comment #2)
> I couldn't reproduce the problem on top of libvirt git.

No libguestfs building surely, :-) Just trying to destroy a transient domain.

Comment 4 Richard W.M. Jones 2012-08-31 11:08:25 UTC
I get a similar but different error when running this
on my local machine:

libguestfs: recv_from_daemon: 40 bytes: 20 00 f5 f5 | 00 00 00 04 | 00 00 01 1a | 00 00 00 01 | 00 12 34 04 | ...
libguestfs: error: could not destroy libvirt domain: End of file while reading data: Input/output error [code=38 domain=7]
libguestfs-test-tool: shutdown failed
libguestfs: closing guestfs handle 0x665f80 (state 0)
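
The bracketed suffix on the two errors can be decoded by hand. A minimal sketch, assuming the numbers libguestfs prints are libvirt's virErrorNumber and virErrorDomain values; the tables list only the values seen in this report, and decode_suffix is a hypothetical helper, not part of libguestfs or libvirt:

```python
import re

# Assumption: these mirror libvirt's virErrorNumber / virErrorDomain enums.
# Only the values that appear in this bug report are listed.
ERROR_CODES = {38: "VIR_ERR_SYSTEM_ERROR", 55: "VIR_ERR_OPERATION_INVALID"}
ERROR_DOMAINS = {7: "VIR_FROM_RPC", 10: "VIR_FROM_QEMU"}


def decode_suffix(message):
    """Return (code name, domain name) for a '[code=N domain=M]' suffix.

    Returns None when the message carries no such suffix.
    """
    match = re.search(r"\[code=(\d+) domain=(\d+)\]", message)
    if match is None:
        return None
    code, domain = int(match.group(1)), int(match.group(2))
    return (
        ERROR_CODES.get(code, f"unknown code {code}"),
        ERROR_DOMAINS.get(domain, f"unknown domain {domain}"),
    )
```

On that reading, the "domain is not running" error was raised by the QEMU driver, while the "End of file while reading data" error was raised by the RPC layer.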

Comment 5 Richard W.M. Jones 2012-08-31 11:18:27 UTC
Created attachment 608486 [details]
libvirt.log

Actually when running locally, I get both errors.

Attached is the libvirt log requested.

(In reply to comment #3)
> (In reply to comment #2)
> > I couldn't reproduce the problem on top of libvirt git.
> 
> No libguestfs building surely, :-) Just trying to destroy a transient domain.

What did you do to try to reproduce this?  libguestfs is a big
C program and it creates and destroys the transient guest
entirely through the API:
https://github.com/libguestfs/libguestfs/blob/87cb1549761c9441b0fa7ee9b6a85b8eeb164c5c/src/launch-libvirt.c
I'm pretty sure it's not libguestfs at fault here since
(a) it works fine with other libvirt and (b) its use of the
API is very simple.

Comment 6 Richard W.M. Jones 2012-08-31 11:28:49 UTC
Created attachment 608490 [details]
libvirtd.log (log file from daemon)

Comment 7 Richard W.M. Jones 2012-08-31 11:42:14 UTC
Looking at it closer, I think what's happening is that
qemu segfaults when libvirt sends it a signal to shut down.
(That's a bug in qemu obviously).  But then libvirt ought
to be able to distinguish this case -- we really care if
qemu segfaults, but it could indicate data integrity issues.

I will try and catch the qemu segfault if I can.
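
How a parent process can tell a crash from a clean exit can be sketched with the wait status alone. This is a generic POSIX illustration in Python, not libvirt's code: the child script is a hypothetical stand-in for a qemu that crashes when it receives SIGTERM, and describe_exit and terminate_and_report are made-up names.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a process that segfaults when asked to terminate.
CHILD = r"""
import os, signal, time

def crash(signum, frame):
    # Make sure SIGSEGV has its default (fatal) disposition, then raise it.
    signal.signal(signal.SIGSEGV, signal.SIG_DFL)
    os.kill(os.getpid(), signal.SIGSEGV)

signal.signal(signal.SIGTERM, crash)
print("ready", flush=True)   # tell the parent the handler is installed
time.sleep(30)
"""


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def terminate_and_report(argv):
    """Start argv, send it SIGTERM once it is ready, and describe how it died."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    proc.stdout.readline()            # block until the child prints "ready"
    proc.send_signal(signal.SIGTERM)
    proc.wait(timeout=10)
    proc.stdout.close()
    return describe_exit(proc.returncode)
```

Calling terminate_and_report([sys.executable, "-c", CHILD]) sends SIGTERM to the child and reports the signal that actually killed it, which is the distinction asked for here.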

Comment 8 Richard W.M. Jones 2012-08-31 11:42:55 UTC
(In reply to comment #7)
> qemu segfaults, but it could indicate data integrity issues.

s/but/because/

Comment 9 Osier Yang 2012-08-31 12:13:55 UTC
(In reply to comment #5)
> Created attachment 608486 [details]
> libvirt.log
> 
> Actually when running locally, I get both errors.
> 
> Attached is the libvirt log requested.
> 
> (In reply to comment #3)
> > (In reply to comment #2)
> > > I couldn't reproduce the problem on top of libvirt git.
> > 
> > No libguestfs building surely, :-) Just trying to destroy a transient domain.
> 
> What did you do to try to reproduce this?  libguestfs is a big
> C program and it creates and destroys the transient guest
> entirely through the API:

I simply used virsh to destroy a transient domain.

> https://github.com/libguestfs/libguestfs/blob/
> 87cb1549761c9441b0fa7ee9b6a85b8eeb164c5c/src/launch-libvirt.c
> I'm pretty sure it's not libguestfs at fault here since
> (a) it works fine with other libvirt and (b) its use of the
> API is very simple.

Comment 10 Richard W.M. Jones 2012-08-31 12:42:53 UTC
So I've verified that what is happening is that
qemu is segfaulting when libvirtd sends it a
signal (new bug 853408).

But definitely libvirt could improve the error
message here.  It's a good thing that libvirt
indicates some sort of error, because we really
want to know when this fails, but it should say
something like 'qemu just segfaulted'.

Comment 11 Daniel Berrangé 2012-09-04 11:57:44 UTC
From the POV of the virDomainDestroy() command, whether QEMU segfaults or shuts down cleanly is academic, since this command makes no guarantees about how QEMU is stopped, and indeed will even send SIGKILL to QEMU, which arguably has a similar effect to SEGV. So having QEMU SEGV after sending it a SIGTERM should be considered 'Success' for this function. As such we should not be returning the 'Operation is not valid' error code.

Comment 12 Richard W.M. Jones 2012-09-04 12:20:00 UTC
13:07 <@rwmjones> danpb: what should I be using if I care about whether qemu shuts down without segfaulting?
13:09 < danpb> oh, pass the GRACEFUL flag to virDomainDestroy
13:09 < danpb> that means we'll only ever ask qemu to do a clean shutdown, and never try to SIGKILL it
13:10 < danpb> if we pass that flag, then you are right that we should report the SEGV as an error condition for virDomainDestroy
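
In the C API the flag discussed here is VIR_DOMAIN_DESTROY_GRACEFUL, and it is accepted by virDomainDestroyFlags() rather than virDomainDestroy(), which takes no flags argument. A minimal sketch of the call through the Python bindings, where the method is destroyFlags. The constant is redefined locally and FakeDomain is a hypothetical stand-in so the snippet runs without libvirt installed; destroy_gracefully is a made-up name, and this is not the change made in libguestfs.

```python
# Assumption: mirrors libvirt's VIR_DOMAIN_DESTROY_GRACEFUL (1 << 0).
# With the real bindings, use libvirt.VIR_DOMAIN_DESTROY_GRACEFUL instead.
VIR_DOMAIN_DESTROY_GRACEFUL = 1


def destroy_gracefully(dom):
    """Ask libvirt to stop `dom` without escalating to SIGKILL.

    `dom` is expected to be a libvirt.virDomain, or any object exposing
    a compatible destroyFlags(flags) method. With the real bindings a
    failure surfaces as a libvirt.libvirtError exception.
    """
    return dom.destroyFlags(VIR_DOMAIN_DESTROY_GRACEFUL)


class FakeDomain:
    """Hypothetical stand-in that records the flags it was called with."""

    def __init__(self):
        self.flags_seen = None

    def destroyFlags(self, flags):
        self.flags_seen = flags
        return 0  # the real method returns 0 on success
```

With a real connection, `dom` would come from something like conn.lookupByName(name), and the caller would catch libvirt.libvirtError to learn that the graceful shutdown did not succeed.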

Comment 13 Richard W.M. Jones 2012-09-04 15:53:21 UTC
(In reply to comment #12)
> 13:07 <@rwmjones> danpb: what should I be using if I care about whether qemu
> shuts down without segfaulting?
> 13:09 < danpb> oh, pass the GRACEFUL flag to virDomainDestroy
> 13:09 < danpb> that means we'll only ever ask qemu to do a clean shutdown,
> and never try to SIGKILL it
> 13:10 < danpb> if we pass that flag, then you are right that we should
> report the SEGV as an error condition for virDomainDestroy

I have fixed this in libguestfs 1.19.39.

Comment 14 Richard W.M. Jones 2013-10-16 15:41:52 UTC
Closing as UPSTREAM based on comment 13.

