Bug 2168769

Summary: Saving a guest sometimes fails after restarting the service (mainly RHEL 6 guests)
Product: Red Hat Enterprise Linux 9
Reporter: Meina Li <meili>
Component: libvirt
libvirt sub component: Storage
Assignee: Peter Krempa <pkrempa>
QA Contact: Meina Li <meili>
Status: CLOSED ERRATA
Severity: unspecified
Priority: high
CC: jdenemar, lmen, pkrempa, virt-maint, yicui
Version: 9.2
Keywords: Automation, Regression, Triaged
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-9.0.0-5.el9
Last Closed: 2023-05-09 07:27:45 UTC
Type: Bug
Attachments:
libvirtd.log (flags: none)
fill-libvirtd.log (flags: none)

Description Meina Li 2023-02-10 03:41:26 UTC
Created attachment 1943206 [details]
libvirtd.log

Description of problem:
Saving a guest sometimes fails after restarting the service (mainly RHEL 6 guests)

Version-Release number of selected component (if applicable):
libvirt-9.0.0-3.el9.x86_64
qemu-kvm-7.2.0-7.el9.x86_64

How reproducible:
100% in automated testing, but only around 50% in manual testing.

Steps to Reproduce:
1. Prepare a running RHEL 6 SeaBIOS guest with a virtio-transitional model disk.
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type='file' device='disk' model='virtio-transitional'>
     <driver name='qemu' type='qcow2'/>
     <source file='/var/lib/libvirt/images/rhel6.qcow2'/>
     <backingStore/>
     <target dev='vda' bus='virtio'/>
     <alias name='virtio-disk0'/>
     <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
   </disk>

2. Create snapshots for the rhel6 guest.
# for i in {1..3}; do virsh snapshot-create-as rhel s$i --disk-only; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created

3. Restart the service.
# systemctl restart virtqemud

4. Save the guest.
# virsh save rhel rhel.save
error: Failed to save domain 'rhel' to rhel.save
error: operation failed: job 'domain save' unexpectedly failed

Actual results:
The save fails as shown in step 4 above.

Expected results:
The save succeeds.

Additional info:
1) It reproduces reliably in the libvirt auto case: virtio_transitional_blk.rhel6_guest.default.with_virtio_blk.snapshot_test.virtio_transitional
2) On some machines it reproduces 100% of the time, while on others it may not reproduce at all.
3) Sometimes it can also be reproduced on the latest RHEL 9.
4) libvirtd.log:
2023-02-09 06:06:59.633+0000: 1465479: info : qemuMonitorSend:861 : QEMU_MONITOR_SEND_MSG: mon=0x7fe9dc0e3460 msg={"execute":"blockdev-del","arguments":{"node-name":"libvirt-4-storage"},"id":"libvirt-12"}
 fd=-1
2023-02-09 06:06:59.633+0000: 1465480: info : virObjectRef:400 : OBJECT_REF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.633+0000: 1465480: info : qemuMonitorIOWrite:366 : QEMU_MONITOR_IO_WRITE: mon=0x7fe9dc0e3460 buf={"execute":"blockdev-del","arguments":{"node-name":"libvirt-4-storage"},"id":"libvirt-12"}
 len=92 ret=92 errno=0
2023-02-09 06:06:59.633+0000: 1465480: info : virObjectRef:400 : OBJECT_REF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.633+0000: 1465480: info : virObjectUnref:378 : OBJECT_UNREF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.633+0000: 1465480: info : virObjectUnref:378 : OBJECT_UNREF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.635+0000: 1465480: info : virObjectRef:400 : OBJECT_REF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.635+0000: 1465480: info : qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY: mon=0x7fe9dc0e3460 reply={"id": "libvirt-12", "error": {"class": "GenericError", "desc": "Block device libvirt-4-storage is in use"}}
2023-02-09 06:06:59.635+0000: 1465480: info : virObjectRef:400 : OBJECT_REF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.635+0000: 1465480: info : virObjectUnref:378 : OBJECT_UNREF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.635+0000: 1465480: info : virObjectUnref:378 : OBJECT_UNREF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.635+0000: 1465479: info : virObjectUnref:378 : OBJECT_UNREF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.635+0000: 1465479: info : virObjectRef:400 : OBJECT_REF: obj=0x7fe9dc0e3460
2023-02-09 06:06:59.635+0000: 1465479: error : qemuMonitorJSONCheckErrorFull:366 : internal error: unable to execute QEMU command 'blockdev-del': Block device libvirt-4-storage is in use

Comment 1 Peter Krempa 2023-02-10 08:23:11 UTC
Looks like we don't dismiss the 'blockdev-create' jobs properly after the snapshots. After restart libvirt tries to clean them up but they are already in use. I'm not sure what's up with the save job though.
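To illustrate this analysis, one possible diagnostic (a sketch, not part of the original report) is to ask QEMU directly which jobs it still knows about after the snapshots: if leftover 'blockdev-create' jobs are the problem, they should show up over QMP as concluded but never dismissed, and storage nodes such as 'libvirt-4-storage' from the log should still be listed as in use. The commands below assume the domain name 'rhel' used throughout this report; 'query-jobs' and 'query-named-block-nodes' are standard QMP commands.
# virsh qemu-monitor-command rhel '{"execute":"query-jobs"}' --pretty
# virsh qemu-monitor-command rhel '{"execute":"query-named-block-nodes"}' --pretty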

Comment 2 Peter Krempa 2023-02-10 08:24:31 UTC
The log you've attached is not a debug log. Please include one which doesn't filter out the debug-level messages, e.g. use the filter:

 3:remote 4:event 3:util.json 3:util.object 3:util.dbus 3:util.netlink 3:node_device 3:rpc 3:access 1:*
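(For reference, and not part of the original comment: one way to apply such a filter is at runtime with virt-admin against the modular virtqemud daemon used in the reproduction steps; the log file path below is only an example. The same values can also be set persistently as log_filters and log_outputs in /etc/libvirt/virtqemud.conf, followed by a daemon restart.)
# virt-admin -c virtqemud:///system daemon-log-filters "3:remote 4:event 3:util.json 3:util.object 3:util.dbus 3:util.netlink 3:node_device 3:rpc 3:access 1:*"
# virt-admin -c virtqemud:///system daemon-log-outputs "1:file:/var/log/libvirt/virtqemud.log"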

Comment 3 Meina Li 2023-02-10 09:15:53 UTC
Created attachment 1943269 [details]
fill-libvirtd.log

Comment 4 Peter Krempa 2023-02-10 17:05:49 UTC
While I was not able to reproduce the issue, I'm fairly sure I know what's going on.

After a recent commit, libvirt doesn't properly progress through the blockjob state transitions: the state of the 'blockdev-create' job gets stuck at 'pending' in libvirt's internal data, while in QEMU the job properly transitions to 'concluded'. Libvirt no longer watches the job, so it never dismisses it.

Now, after a restart of the daemon, libvirt tries to clean up the leftover blockjobs, and as part of the cleanup of the 'blockdev-create' job it removes the security labels from the images, which are still in use.

QEMU then can't actually access the images any more, and that causes the migration (which 'virsh save' performs internally to write the guest state to a file) to fail.
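A quick way to observe the label removal described above (a sketch, assuming the default SELinux and DAC security drivers and the image paths from this report) is to compare ownership and SELinux context of the images in the chain before and after the daemon restart; after the buggy cleanup, files of the still-active chain lose the qemu ownership and svirt image labeling even though QEMU keeps them open.
# ls -lZ /var/lib/libvirt/images/
# systemctl restart virtqemud
# ls -lZ /var/lib/libvirt/images/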

Comment 6 Peter Krempa 2023-02-13 13:26:40 UTC
Fixed upstream:

commit c433c2434c0459df98ed3355ef615e341acd9009
Author: Peter Krempa <pkrempa>
Date:   Fri Feb 10 17:16:43 2023 +0100

    qemu: blockjob: Handle 'pending' blockjob state only when we need it
    
    The 'pending' state needs to be handled by the blockjob code only when
    the snapshot code requests a block-commit without auto-finalization.
    
    If we always handle it we fail to properly remove the blockjob data for
    the 'blockdev-create' job as that also transitions through 'pending' but
    we'd never update it once it reaches 'concluded' as the code already
    thinks that the job has finished and is no longer watching it.
    
    Introduce a 'processPending' property into block job data and set it
    only when we know that we need to process 'pending'.
    
    Fixes: 90d9bc9d74a5157167548b26c00b1a016655e295
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2168769
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Pavel Hrdina <phrdina>

Comment 8 Meina Li 2023-02-15 05:33:52 UTC
Pre-verified Version:
libvirt-9.0.0-5.el9.x86_64
qemu-kvm-7.2.0-8.el9.x86_64

Pre-verified Steps:
1. The related auto cases passed.
virtio_transitional_blk.rhel6_guest.default.with_virtio_blk.snapshot_test.virtio_transitional
backingchain.lifecycle_test.check_mirror_with_restart_libvirtd.blockcopy

2. The related manual cases also passed.
2.1 For the RHEL 6 guest with virtio-transitional: verified with the steps in the Description above.
2.2 For the backingchain case:
# virsh domstate rhel
running
# virsh blockcopy rhel vda /tmp/rhel.copy  --transient-job --shallow
Block Copy started
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/var/lib/libvirt/images/rhel.qcow2" index="1"/>
      <backingStore/>
      <mirror type="file" file="/tmp/rhel.copy" format="qcow2" job="copy">
        <format type="qcow2"/>
        <source file="/tmp/rhel.copy" index="2"/>
        <backingStore/>
      </mirror>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </disk>
# systemctl restart virtqemud
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/var/lib/libvirt/images/rhel.qcow2" index="1"/>
      <backingStore/>
      <mirror type="file" file="/tmp/rhel.copy" format="qcow2" job="copy" ready="yes">
        <format type="qcow2"/>
        <source file="/tmp/rhel.copy" index="2"/>
        <backingStore/>
      </mirror>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </disk>
# virsh blockjob rhel vda --pivot
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/tmp/rhel.copy" index="2"/>
      <backingStore/>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </disk>

Comment 11 Meina Li 2023-02-20 08:39:58 UTC
Verified Version:
libvirt-8.0.0-5.module+el8.6.0+14480+c0a3aa0f.x86_64
qemu-kvm-6.2.0-11.module+el8.6.0+16538+01ea313d.6.x86_64

Verified Steps:
S1: Save the RHEL 6 guest after restarting the service
1. Prepare a running RHEL 6 SeaBIOS guest with a virtio-transitional model disk.
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type='file' device='disk' model='virtio-transitional'>
     <driver name='qemu' type='qcow2'/>
     <source file='/var/lib/libvirt/images/rhel6.qcow2'/>
     <backingStore/>
     <target dev='vda' bus='virtio'/>
     <alias name='virtio-disk0'/>
     <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
   </disk>
2. Create snapshots for the rhel6 guest.
# for i in {1..3}; do virsh snapshot-create-as rhel s$i --disk-only; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created
3. Restart the service.
# systemctl restart virtqemud
4. Save the guest and restore it.
# virsh save rhel rhel.save
Domain 'rhel' saved to rhel.save
# virsh restore rhel.save
Domain restored from rhel.save

S2: Check the mirror status of the guest after restarting the service.
1. Prepare a running guest.
# virsh domstate rhel
running
2. Do blockcopy for it.
# virsh blockcopy rhel vda /tmp/rhel.copy  --transient-job --shallow
Block Copy started
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/var/lib/libvirt/images/rhel.qcow2" index="1"/>
      <backingStore/>
      <mirror type="file" file="/tmp/rhel.copy" format="qcow2" job="copy">
        <format type="qcow2"/>
        <source file="/tmp/rhel.copy" index="2"/>
        <backingStore/>
      </mirror>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </disk>
3. Restart virtqemud and wait a while, then check the XML.
# systemctl restart virtqemud
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/var/lib/libvirt/images/rhel.qcow2" index="1"/>
      <backingStore/>
      <mirror type="file" file="/tmp/rhel.copy" format="qcow2" job="copy" ready="yes">
        <format type="qcow2"/>
        <source file="/tmp/rhel.copy" index="2"/>
        <backingStore/>
      </mirror>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </disk>
4. Pivot the blockjob and check the status.
# virsh blockjob rhel vda --pivot
# virsh dumpxml rhel | xmllint --xpath //disk -
<disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/tmp/rhel.copy" index="2"/>
      <backingStore/>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </disk>

Comment 13 errata-xmlrpc 2023-05-09 07:27:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171