Bug 853369 - libvirt error: "could not destroy libvirt domain: Requested operation is not valid: domain is not running" is unclear, qemu is actually segfaulting
Status: CLOSED UPSTREAM
Product: Virtualization Tools
Classification: Community
Component: libvirt
Assigned To: Libvirt Maintainers
Reported: 2012-08-31 05:06 EDT by Richard W.M. Jones
Modified: 2013-10-16 11:41 EDT

Doc Type: Bug Fix
Last Closed: 2013-10-16 11:41:52 EDT
Type: Bug


Attachments
libvirt.log (40.58 KB, text/plain)
2012-08-31 07:18 EDT, Richard W.M. Jones
libvirtd.log (log file from daemon) (1.49 MB, text/plain)
2012-08-31 07:28 EDT, Richard W.M. Jones

Description Richard W.M. Jones 2012-08-31 05:06:33 EDT
Description of problem:

When trying to shut down a transient domain which *is* running I get:

*stdin*:31: libguestfs: error: could not destroy libvirt domain: Requested operation is not valid: domain is not running [code=55 domain=10]

This used to work in libvirt-0.10.0-0rc0.2.fc18.x86_64 but
seems to have broken in Rawhide (libvirt-0.10.0-1.fc19.x86_64).

Version-Release number of selected component (if applicable):

libvirt-0.10.0-1.fc19.x86_64

How reproducible:

At least once.

Steps to Reproduce:
1. Build libguestfs in Rawhide.
 
Actual results:

http://kojipkgs.fedoraproject.org//work/tasks/8635/4438635/build.log
Comment 1 Richard W.M. Jones 2012-08-31 05:36:51 EDT
Second attempt at building also fails the same way:
http://kojipkgs.fedoraproject.org//work/tasks/931/4440931/build.log

As requested I'll try to get libvirt logs of this.
Comment 2 Osier Yang 2012-08-31 05:58:01 EDT
I couldn't reproduce the problem on top of libvirt git.
Comment 3 Osier Yang 2012-08-31 05:58:49 EDT
(In reply to comment #2)
> I couldn't reproduce the problem on top of libvirt git.

No libguestfs building surely, :-) Just trying to destroy a transient domain.
Comment 4 Richard W.M. Jones 2012-08-31 07:08:25 EDT
I get a similar but different error when running this
on my local machine:

libguestfs: recv_from_daemon: 40 bytes: 20 00 f5 f5 | 00 00 00 04 | 00 00 01 1a | 00 00 00 01 | 00 12 34 04 | ...
libguestfs: error: could not destroy libvirt domain: End of file while reading data: Input/output error [code=38 domain=7]
libguestfs-test-tool: shutdown failed
libguestfs: closing guestfs handle 0x665f80 (state 0)
Comment 5 Richard W.M. Jones 2012-08-31 07:18:27 EDT
Created attachment 608486
libvirt.log

Actually when running locally, I get both errors.

Attached is the libvirt log requested.

(In reply to comment #3)
> (In reply to comment #2)
> > I couldn't reproduce the problem on top of libvirt git.
> 
> No libguestfs building surely, :-) Just trying to destroy a transient domain.

What did you do to try to reproduce this?  libguestfs is a big
C program and it creates and destroys the transient guest
entirely through the API:
https://github.com/libguestfs/libguestfs/blob/87cb1549761c9441b0fa7ee9b6a85b8eeb164c5c/src/launch-libvirt.c
I'm pretty sure it's not libguestfs at fault here since
(a) it works fine with other libvirt and (b) its use of the
API is very simple.
Comment 6 Richard W.M. Jones 2012-08-31 07:28:49 EDT
Created attachment 608490
libvirtd.log (log file from daemon)
Comment 7 Richard W.M. Jones 2012-08-31 07:42:14 EDT
Looking at it closer, I think what's happening is that
qemu segfaults when libvirt sends it a signal to shut down.
(That's a bug in qemu obviously).  But then libvirt ought
to be able to distinguish this case -- we really care if
qemu segfaults, but it could indicate data integrity issues.

I will try and catch the qemu segfault if I can.
Comment 8 Richard W.M. Jones 2012-08-31 07:42:55 EDT
(In reply to comment #7)
> qemu segfaults, but it could indicate data integrity issues.

s/but/because/
Comment 9 Osier Yang 2012-08-31 08:13:55 EDT
(In reply to comment #5)
> Created attachment 608486 [details]
> libvirt.log
> 
> Actually when running locally, I get both errors.
> 
> Attached is the libvirt log requested.
> 
> (In reply to comment #3)
> > (In reply to comment #2)
> > > I couldn't reproduce the problem on top of libvirt git.
> > 
> > No libguestfs building surely, :-) Just trying to destroy a transient domain.
> 
> What did you do to try to reproduce this?  libguestfs is a big
> C program and it creates and destroys the transient guest
> entirely through the API:

I simply used virsh to destroy a transient domain.

> https://github.com/libguestfs/libguestfs/blob/
> 87cb1549761c9441b0fa7ee9b6a85b8eeb164c5c/src/launch-libvirt.c
> I'm pretty sure it's not libguestfs at fault here since
> (a) it works fine with other libvirt and (b) its use of the
> API is very simple.
Comment 10 Richard W.M. Jones 2012-08-31 08:42:53 EDT
So I've verified that what is happening is that
qemu is segfaulting when libvirtd sends it a
signal (new bug 853408).

But definitely libvirt could improve the error
message here.  It's a good thing that libvirt
indicates some sort of error, because we really
want to know when this fails, but it should say
something like 'qemu just segfaulted'.
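
A minimal sketch of how a parent process can tell a crash apart from the termination it asked for, assuming a POSIX system. This is not libvirt code: `classify_status`, `spawn_crashy_child` and `terminate_and_classify` are hypothetical names used only for illustration, and the child merely mimics the behaviour described above by raising SIGSEGV when it receives SIGTERM.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* How a child process ended (hypothetical names, for illustration only). */
enum child_end {
    ENDED_CLEAN,               /* exit status 0 */
    ENDED_NONZERO,             /* exited with a non-zero status */
    ENDED_BY_REQUESTED_SIGNAL, /* killed by the signal we sent */
    ENDED_CRASHED,             /* killed by some other signal, e.g. SIGSEGV */
    ENDED_UNKNOWN              /* could not signal or reap the child */
};

/* Classify a waitpid() status. 'sent_sig' is the signal we delivered
 * ourselves (0 if none), so death by that signal is expected. */
static enum child_end classify_status(int status, int sent_sig)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status) == 0 ? ENDED_CLEAN : ENDED_NONZERO;
    if (WIFSIGNALED(status))
        return (sent_sig != 0 && WTERMSIG(status) == sent_sig)
                   ? ENDED_BY_REQUESTED_SIGNAL : ENDED_CRASHED;
    return ENDED_UNKNOWN;
}

/* SIGTERM handler for the demo child: crash instead of exiting. */
static void crash_on_term(int sig)
{
    (void)sig;
    raise(SIGSEGV);
}

/* Fork a child that dies with SIGSEGV when it receives SIGTERM.
 * A pipe is used so the parent only continues once the handler is
 * installed. Returns the child's pid, or -1 on failure. */
static pid_t spawn_crashy_child(void)
{
    int ready[2];
    if (pipe(ready) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        close(ready[0]);
        signal(SIGSEGV, SIG_DFL);
        signal(SIGTERM, crash_on_term);
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        close(ready[1]);
        for (;;)
            pause();
    }
    close(ready[1]);
    char c;
    ssize_t n = read(ready[0], &c, 1);
    close(ready[0]);
    if (n != 1) {
        /* Child never became ready: clean it up and report failure. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return -1;
    }
    return pid;
}

/* Send SIGTERM, reap the child, and report how it really ended. */
static enum child_end terminate_and_classify(pid_t pid)
{
    int status;
    assert(pid > 0);
    if (kill(pid, SIGTERM) < 0)
        return ENDED_UNKNOWN;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return ENDED_UNKNOWN;
    }
    return classify_status(status, SIGTERM);
}
```

With this classification a caller could report "process crashed" for `ENDED_CRASHED` rather than a generic "not running" error, which is the kind of distinction requested above.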
Comment 11 Daniel Berrange 2012-09-04 07:57:44 EDT
From the POV of the virDomainDestroy() command, whether QEMU segfaults or shuts down cleanly is academic, since this command makes no guarantees about how QEMU is stopped, and indeed will even send SIGKILL to QEMU, which arguably has a similar effect to SEGV. So having QEMU SEGV after sending it a SIGTERM should be considered "Success" for this function. As such we should not be returning the "Operation is not valid" error code.
Comment 12 Richard W.M. Jones 2012-09-04 08:20:00 EDT
13:07 <@rwmjones> danpb: what should I be using if I care about whether qemu shuts down without segfaulting?
13:09 < danpb> oh, pass the GRACEFUL flag to virDomainDestroy
13:09 < danpb> that means we'll only ever ask qemu to do a clean shutdown, and never try to SIGKILL it
13:10 < danpb> if we pass that flag, then you are right that we should report the SEGV as an error condition for virDomainDestroy
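
The difference between the two modes can be sketched with plain POSIX process handling. This is an illustration of the semantics discussed in comments 11 and 12, not libvirt's implementation: `stop_child`, `wait_with_timeout`, the `graceful` parameter and the timeout are all hypothetical. In libvirt itself the flag is passed through the API, as `virDomainDestroyFlags(dom, VIR_DOMAIN_DESTROY_GRACEFUL)`.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Poll for up to 'timeout_ms' for the child to terminate.
 * Returns 1 and fills *status if it was reaped, 0 on timeout, -1 on error. */
static int wait_with_timeout(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    for (int waited = 0; ; waited += 10) {
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r == pid)
            return 1;
        if (r < 0 && errno != EINTR)
            return -1;
        if (waited >= timeout_ms)
            return 0;
        nanosleep(&step, NULL);
    }
}

/* Hypothetical stand-in for a "destroy" operation on a child process.
 *
 * graceful == false: send SIGTERM, escalate to SIGKILL on timeout;
 *                    any way of dying counts as success.
 * graceful == true:  send SIGTERM only; a timeout, or death by a signal
 *                    other than SIGTERM (e.g. SIGSEGV), is a failure.
 *
 * Returns 0 on success, -1 on failure. */
static int stop_child(pid_t pid, bool graceful, int timeout_ms)
{
    int status;
    assert(pid > 0);

    if (kill(pid, SIGTERM) < 0)
        return -1;

    int r = wait_with_timeout(pid, timeout_ms, &status);
    if (r < 0)
        return -1;

    if (r == 0) {
        if (graceful)
            return -1;              /* never escalate to SIGKILL */
        if (kill(pid, SIGKILL) < 0)
            return -1;
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)
                return -1;
        }
        return 0;
    }

    if (!graceful)
        return 0;                   /* how it died is irrelevant */
    if (WIFEXITED(status))
        return 0;
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM) ? 0 : -1;
}
```

In the forced mode a process that segfaults on SIGTERM is still "stopped", so the call succeeds. Only in the graceful mode does the crash surface as an error, which matches the behaviour agreed on in the IRC exchange above.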
Comment 13 Richard W.M. Jones 2012-09-04 11:53:21 EDT
(In reply to comment #12)
> 13:07 <@rwmjones> danpb: what should I be using if I care about whether qemu
> shuts down without segfaulting?
> 13:09 < danpb> oh, pass the GRACEFUL flag to virDomainDestroy
> 13:09 < danpb> that means we'll only ever ask qemu to do a clean shutdown,
> and never try to SIGKILL it
> 13:10 < danpb> if we pass that flag, then you are right that we should
> report the SEGV as an error condition for virDomainDestroy

I have fixed this in libguestfs 1.19.39.
Comment 14 Richard W.M. Jones 2013-10-16 11:41:52 EDT
Closing / upstream based on comment 13.
