Bug 1689165 - Cannot get the active block job when doing managedsave after restarting libvirtd
Summary: Cannot get the active block job when doing managedsave after restarting libvirtd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Peter Krempa
QA Contact: yisun
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-15 10:34 UTC by yafu
Modified: 2020-11-06 03:37 UTC
CC List: 7 users

Fixed In Version: libvirt-5.3.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 07:13:50 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHBA-2019:3723 (last updated 2019-11-06 07:14:05 UTC)

Description yafu 2019-03-15 10:34:38 UTC
Description of problem:
Cannot get the active block job when doing managedsave after restarting libvirtd.


Version-Release number of selected component (if applicable):
libvirt-5.0.0-6.virtcov.el8.x86_64
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64


How reproducible:
100%

Steps to Reproduce:
1.Prepare a guest with multiple external disk-only snapshot:
#virsh dumpxml rhel8
 <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/rhel8.s4'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/var/lib/libvirt/images/rhel8.s3'/>
        <backingStore type='file' index='2'>
          <format type='qcow2'/>
          <source file='/var/lib/libvirt/images/rhel8.s2'/>
          <backingStore type='file' index='3'>
            <format type='qcow2'/>
            <source file='/var/lib/libvirt/images/rhel8.s1'/>
            <backingStore type='file' index='4'>
              <format type='qcow2'/>
              <source file='/var/lib/libvirt/images/rhel8.qcow2'/>
              <backingStore/>
            </backingStore>
          </backingStore>
        </backingStore>
      </backingStore>
      <target dev='sda' bus='scsi'/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
   </disk>

2.Do blockcommit with '--active':
#virsh blockcommit rhel8 sda --base sda[2] --active --wait --verbose
Block commit: [100 %]
Now in synchronized phase

3.Check the blockjob status:
# virsh blockjob rhel8 sda --info
Active Block Commit: [100 %]

4.Do managedsave:
# virsh managedsave rhel8
error: Failed to save domain rhel8 state
error: Requested operation is not valid: domain has active block job

5.Restart libvirtd service:
# systemctl restart libvirtd

6.Do managedsave again:
#virsh managedsave rhel8
error: Failed to save domain rhel8 state
error: operation failed: domain save job: unexpectedly failed

Actual results:
After restarting libvirtd, libvirt no longer knows about the active block job, so managedsave fails with "operation failed: domain save job: unexpectedly failed" instead of being rejected cleanly.

Expected results:
Managedsave should fail with the same "domain has active block job" error after libvirtd is restarted, rather than the generic "domain save job: unexpectedly failed".
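
The refusal itself is the correct behaviour while the commit job sits in the synchronized phase; the job has to be pivoted (or aborted) before a save can go through. A minimal sketch of the intended end-to-end flow, assuming a running guest named rhel8 whose disk layout matches the XML above (all names are taken from this report, not from the fix):

# Build the external snapshot chain used in step 1 (disk-only snapshots).
for s in s1 s2 s3 s4; do
    virsh snapshot-create-as rhel8 "$s" --disk-only
done

# Start the live commit; with --active it stays in the synchronized phase.
virsh blockcommit rhel8 sda --base 'sda[2]' --active --wait --verbose

# While the job is active, managedsave is expected to be refused with
# "domain has active block job", both before and after a daemon restart.
virsh managedsave rhel8
systemctl restart libvirtd
virsh managedsave rhel8

# Pivot to finish the commit; with no block job left, the save can proceed.
virsh blockjob rhel8 sda --pivot
virsh managedsave rhel8

Pivoting (or aborting) the job is what actually clears the "domain has active block job" condition; the bug tracked here is only about libvirt forgetting that the job exists once libvirtd is restarted.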

Comment 1 Peter Krempa 2019-03-15 14:27:33 UTC
This might have been broken by one of my refactors.

Comment 2 Peter Krempa 2019-03-22 17:06:41 UTC
This was already fixed recently:

commit 9ed9124d0d72fbc1dbaa4859fcfdc998ce060488
Author:     Peter Krempa <pkrempa>
AuthorDate: Thu Oct 18 12:34:49 2018 +0200
Commit:     Peter Krempa <pkrempa>
CommitDate: Thu Jan 17 17:12:50 2019 +0100

    qemu: process: refresh block jobs on reconnect
    
    Block job state was widely untracked by libvirt across restarts which
    was allowed by a stateless block job finishing handler which discarded
    disk state and redetected it. This is undesirable since we'll need to
    track more information for individual blockjobs due to -blockdev
    integration requirements.
    
    In case of legacy blockjobs we can recover whether the job is present at
    reconnect time by querying qemu. Adding tracking whether a job is
    present will allow simplification of the non-shared-storage cancellation
    code.

git describe --contains 9ed9124d0d7
v5.1.0-rc1~462
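
To see the state that libvirt re-reads from QEMU when it reconnects, the job remains visible on the QEMU side; a small sketch using the monitor passthrough (the domain name rhel8 is taken from the report, and query-block-jobs is the generic QMP query for legacy block jobs, not necessarily the exact call the fix issues):

# Ask QEMU itself which block jobs are currently running on the domain.
virsh qemu-monitor-command rhel8 --pretty '{"execute": "query-block-jobs"}'

# After a daemon restart, libvirt should report the same job again.
systemctl restart libvirtd
virsh blockjob rhel8 sda --info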

Comment 4 yisun 2019-07-05 08:18:32 UTC
verified with: libvirt-5.5.0-1.module+el8.1.0+3580+d7f6488d.x86_64

[root@dell-per730-67 ~]# virsh domblklist avocado-vt-vm1
 Target   Source
------------------------------------------------------------------------
 vda      /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2

[root@dell-per730-67 ~]# for i in {s1,s2,s3,s4}; do virsh snapshot-create-as avocado-vt-vm1 $i --disk-only; done
Domain snapshot s1 created
Domain snapshot s2 created
Domain snapshot s3 created
Domain snapshot s4 created


[root@dell-per730-67 ~]# virsh blockcommit avocado-vt-vm1 vda --base vda[2] --active --wait --verbose
Block commit: [100 %]
Now in synchronized phase

[root@dell-per730-67 ~]# virsh blockjob avocado-vt-vm1 vda --info
Active Block Commit: [100 %]


[root@dell-per730-67 ~]# virsh managedsave avocado-vt-vm1
error: Failed to save domain avocado-vt-vm1 state
error: Requested operation is not valid: domain has active block job

[root@dell-per730-67 ~]# systemctl restart libvirtd
[root@dell-per730-67 ~]# virsh managedsave avocado-vt-vm1
error: Failed to save domain avocado-vt-vm1 state
error: Requested operation is not valid: domain has active block job
<====== same expected error message here. issue fixed.


[root@dell-per730-67 ~]# virsh blockjob avocado-vt-vm1 vda --info
Active Block Commit: [100 %]
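
Not part of the verification above, but as a usage note: the commit job left running in the transcript can also be cancelled instead of pivoted, which keeps the original image chain untouched; once no block job remains, managedsave is expected to be accepted. A sketch reusing the names from the transcript:

# Cancel the active commit; nothing is committed and the top image stays in use.
virsh blockjob avocado-vt-vm1 vda --abort

# With no block job left, the save should now go through (the guest is saved
# and shut off, as usual for managedsave).
virsh blockjob avocado-vt-vm1 vda --info
virsh managedsave avocado-vt-vm1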

Comment 6 errata-xmlrpc 2019-11-06 07:13:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723

