Bug 1445600

Summary: destroying a "paused guest" outputs an error, but the destroy actually succeeds
Product: Red Hat Enterprise Linux 7
Reporter: lijuan men <lmen>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA
QA Contact: Meina Li <meili>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.4
CC: dyuan, fjin, jdenemar, lmen, rbalakri, xuzhang
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-3.8.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 10:43:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (Description / Flags):
  log info / none
  log-libvirt-3.2.0-3.el7.x86_64 / none

Description lijuan men 2017-04-26 05:36:14 UTC
Description of problem:
destroying a "paused guest" outputs an error, but the destroy actually succeeds

Version-Release number of selected component (if applicable):
libvirt-3.2.0-3.el7
qemu-kvm-rhev-2.9.0-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest whose XML points at an IP that cannot be reached, then press Ctrl+C in the terminal:
[root@lmen1 ~]# virsh dumpxml aaa
...
  <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='rbd' name='lmen/lmen.img'>
        <host name='10.73.75.59' port='6789'/>     --> this IP cannot be reached
      </source>
      <target dev='vdc' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1a' function='0x0'/>
    </disk>
...


[root@lmen1 ~]# virsh start aaa
^C

2. Check the guest status:

[root@lmen1 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 12    uefi                           running
 14    aaa                           *** paused  ***

3. Destroy the guest:

[root@lmen1 ~]# virsh destroy aaa
error: Failed to destroy domain aaa
error: Requested operation is not valid: domain is not running

[root@lmen1 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 12    uefi                           running
 -     aaa                       ***  shut off  ***


Actual results:
When destroying the guest, libvirt outputs the error above, but the destroy actually succeeds and the guest ends up in the "shut off" state.

Expected results:
virsh destroy should succeed without printing an error.

Additional info:

Comment 2 Jiri Denemark 2017-04-26 06:06:26 UTC
Well, destroy complains there's nothing to destroy since the domain is not running anymore. And this is confirmed by listing domains afterwards. That said, this looks like the domain just died by itself. Would you mind sharing qemu logs and debug logs from libvirtd with us?

Comment 3 lijuan men 2017-05-02 06:01:05 UTC
Created attachment 1275563 [details]
log info

Comment 4 lijuan men 2017-05-02 06:04:41 UTC
Uploaded the log as an attachment, thanks.

Comment 5 Jiri Denemark 2017-05-02 08:11:42 UTC
The logs confirm my theory from comment 2:

qemuProcessReportLogError:1845 : internal error: process exited while connecting to monitor: profiling:/builddir/build/BUILD/libvirt-3.2.0/src/access/.libs/libvirt_driver_access_la-viraccessdriverpolkit.gcda:Cannot open

That is, the domain died before virsh destroy. The reason for the error shows that you have *virtcov* libvirt packages installed. Please, don't file bugs which cannot be reproduced with the normal builds.

Comment 6 lijuan men 2017-05-02 08:36:27 UTC
(In reply to Jiri Denemark from comment #5)
> The logs confirm my theory from comment 2:
> 
> qemuProcessReportLogError:1845 : internal error: process exited while
> connecting to monitor:
> profiling:/builddir/build/BUILD/libvirt-3.2.0/src/access/.libs/
> libvirt_driver_access_la-viraccessdriverpolkit.gcda:Cannot open
> 
> That is, the domain died before virsh destroy. The reason for the error
> shows that you have *virtcov* libvirt packages installed. Please, don't file
> bugs which cannot be reproduced with the normal builds.

Sorry for uploading the wrong log.

However, I can reproduce the issue with the normal build:
libvirt-3.2.0-3.el7.x86_64

I will upload the correct log file again.

Comment 7 lijuan men 2017-05-02 08:40:15 UTC
Created attachment 1275618 [details]
log-libvirt-3.2.0-3.el7.x86_64

Comment 8 Jiri Denemark 2017-05-02 10:48:15 UTC
Ah, so the situation is actually a bit different. The domain does not die by itself; this is a case of trying to destroy a domain while it is still being started. qemuDomainDestroyFlags first marks the domain as being destroyed, then kills it to make sure it can acquire a job, and waits for that job. While our monitor error handler correctly does not clean up such domains, the code which starts a domain ignores the flag and cleans the domain up after seeing an unexpected EOF on the monitor. Thus when qemuDomainDestroyFlags actually gets the job, there is no running domain left to destroy.

Comment 9 Jiri Denemark 2017-09-11 14:34:02 UTC
Fixed upstream by

commit c5d1dcbcd904c8a27b4addf7cf6debcbdd641d75
Refs: v3.7.0-42-gc5d1dcbcd9
Author:     Jiri Denemark <jdenemar>
AuthorDate: Fri Sep 8 20:44:34 2017 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Sep 11 16:32:15 2017 +0200

    qemu: Don't report failure to destroy a destroyed domain

    When destroying a domain libvirt marks it internally with a
    beingDestroyed flag to make sure the qemuDomainDestroyFlags API itself
    cleans up after the domain rather than letting an uninformed EOF handler
    do it. However, when the domain is being started at the moment libvirt
    was asked to destroy it, only the starting thread can properly clean up
    after the domain and thus it ignores the beingDestroyed flag. Once
    qemuDomainDestroyFlags finally gets a job, the domain may not be running
    anymore, which should not be reported as an error if the domain has been
    starting up.

    https://bugzilla.redhat.com/show_bug.cgi?id=1445600

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Martin Kletzander <mkletzan>

Comment 11 Meina Li 2017-10-18 09:20:19 UTC
Test env components:
kernel-3.10.0-734.el7.x86_64
libvirt-3.8.0-1.el7.x86_64
qemu-kvm-rhev-2.10.0-2.el7.x86_64

Steps to verify:
1. Start a guest whose XML points at an IP that cannot be reached, then press Ctrl+C in the terminal:
# virsh dumpxml lmn
...
  <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='rbd' name='lmen/lmen.img'>
        <host name='10.73.75.59' port='6789'/>     --> this IP cannot be reached
      </source>
      <target dev='vdc' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1a' function='0x0'/>
    </disk>
...


# virsh start lmn
^C

2. Check the guest status:

# virsh list --all
 Id    Name                           State
----------------------------------------------------                          
 15    lmn                           *** paused  ***

3. Destroy the guest:

# virsh destroy lmn
Domain lmn destroyed

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     lmn                       ***  shut off  ***

The results are as expected, so I am moving this bug to VERIFIED.

Comment 15 errata-xmlrpc 2018-04-10 10:43:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704