Bug 1937598
| Summary: | [incremental_backup]libvirt should release lock held by remoteDispatchDomainBackupBegin after guest destroyed | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | yafu <yafu> |
| Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
| Status: | CLOSED ERRATA | QA Contact: | yisun |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 8.4 | CC: | jdenemar, jen, jsuchane, lmen, pkrempa, virt-maint, xuzhang, yisun |
| Target Milestone: | rc | Keywords: | Regression, Triaged |
| Target Release: | 8.4 | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-7.0.0-9.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-05-25 06:48:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Not reproduced on libvirt-7.0.0-6.module+el8.4.0+10144+c3d3c217.x86_64
Fixed upstream by:
commit 55d175c073b15c039337e46b81c3cef907e55e7b
Author: Peter Krempa <pkrempa>
Date: Thu Mar 11 16:18:50 2021 +0100
qemuBackupJobTerminate: Fix job termination for inactive VMs
Commit cb29e4e801d didn't take into account that the VM can be inactive
when it's destroyed. This means that the job would remain active also
when the VM became inactive.
To fix this properly:
1) Remove the bogus VM liveness check and early return
(reverts the aforementioned commit)
2) Conditionalize the stats assignment only when the stats object is
present
(properly fix the crash when VM dies when reconnecting)
3) end the asyncjob only when it was already set
(prevent corruption of priv->jobs_queued)
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1937598
Fixes: cb29e4e801d
Signed-off-by: Peter Krempa <pkrempa>
Reviewed-by: Ján Tomko <jtomko>
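The three points of the fix above can be illustrated with a toy model. This is a minimal Python sketch, not libvirt's C code: the `Vm` class, its fields (`active`, `backup`, `async_job`, `current_stats`, `completed`, `jobs_queued`) and the `liveness_check` switch are hypothetical stand-ins chosen for illustration. Only the control flow follows the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vm:
    # Hypothetical stand-in for the domain object and its private job data.
    active: bool = True
    backup: Optional[str] = None           # set while a backup job exists
    async_job: Optional[str] = None        # "backup" while the async job is held
    current_stats: Optional[dict] = None   # may be absent, e.g. during reconnect
    completed: Optional[dict] = None       # stats of the last finished job
    jobs_queued: int = 0


def begin_backup(vm: Vm) -> None:
    """Start a (modelled) backup job and take the async job."""
    vm.backup = "backup-def"
    vm.async_job = "backup"
    vm.current_stats = {"operation": "backup"}
    vm.jobs_queued += 1


def terminate(vm: Vm, status: str, liveness_check: bool = False) -> None:
    """Terminate the backup job.

    liveness_check=True models the early return described for cb29e4e801d;
    the default models the behaviour described in the fix.
    """
    # (1) The fix removes this check: an inactive VM must still drop its job.
    if liveness_check and not vm.active:
        return

    # (2) Assign the completed stats only when a stats object is present.
    if vm.current_stats is not None:
        vm.completed = dict(vm.current_stats, status=status)
        vm.current_stats = None

    vm.backup = None

    # (3) End the async job only when it was actually set, so the queue
    #     counter is never decremented twice.
    if vm.async_job == "backup":
        vm.async_job = None
        vm.jobs_queued -= 1
```

With `liveness_check=True` a destroyed (inactive) VM keeps `async_job == "backup"`, which corresponds to the lock that was never released. With the default arguments the job is cleared, and calling `terminate()` a second time leaves `jobs_queued` unchanged.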
commit aa372e5a0115ef94d55193e4fd85f622213e225c
Author: Peter Krempa <pkrempa>
Date: Thu Mar 11 16:14:17 2021 +0100
backup: Store 'apiFlags' in private section of virDomainBackupDef
'qemuBackupJobTerminate' needs the API flags to see whether
VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL. Unfortunately when called via
qemuProcessReconnect()->qemuProcessStop() early (e.g. if the qemu
process died while we were reconnecting) the job is cleared temporarily
so that other APIs can be called. This would mean that we couldn't clean
up the files in some cases.
Save the 'apiFlags' inside the backup object and set it from the
'qemuDomainJobObj' 'apiFlags' member when reconnecting to a VM.
Signed-off-by: Peter Krempa <pkrempa>
Reviewed-by: Ján Tomko <jtomko>
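The reason for keeping a second copy of the flags can be shown with a small model. This is a minimal sketch under stated assumptions: `Job`, `BackupDef`, `reconnect()` and `should_remove_files()` are hypothetical names, the numeric value of `REUSE_EXTERNAL` is illustrative, and the cleanup rule (remove files unless the caller asked to reuse existing ones, skip cleanup when the flags are unknown) is an assumption made for the example rather than something taken from the libvirt sources.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative value only; the real constant is defined by libvirt.
REUSE_EXTERNAL = 1 << 0


@dataclass
class Job:
    # Hypothetical model of the job object that carries the API flags.
    api_flags: int = 0


@dataclass
class BackupDef:
    # Hypothetical model of the backup object with its private flag copy.
    api_flags: int = 0


def reconnect(job: Job, backup: BackupDef) -> None:
    """On reconnect, copy the flags from the job into the backup object."""
    backup.api_flags = job.api_flags


def flags_from_job(job: Optional[Job]) -> Optional[int]:
    """Lookup through the job: yields nothing once the job was cleared."""
    return job.api_flags if job is not None else None


def should_remove_files(flags: Optional[int]) -> bool:
    """Decide whether backup files may be removed on termination."""
    if flags is None:
        # Flags unavailable: cannot tell who created the files, so skip cleanup.
        return False
    return not (flags & REUSE_EXTERNAL)
```

In this model, once the job has been cleared temporarily, `flags_from_job(None)` gives no answer and cleanup is skipped, whereas `backup.api_flags` still holds the value that was copied in `reconnect()`.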
Exception approved in review meeting on 12 Mar 2021.
Verified on: libvirt-7.0.0-9.module+el8.4.0+10326+5e50a3b6.x86_64
result: PASS
[root@dell-per740-01 ~]# cat backup.xml
<domainbackup mode='pull'>
<server name="localhost" port="10809"/>
<disks>
<disk name='vda' backup='yes' type='file'>
<scratch file='/tmp/sratch.vda'/>
</disk>
</disks>
</domainbackup>
[root@dell-per740-01 ~]# virsh backup-begin vm1 backup.xml
Backup started
[root@dell-per740-01 ~]# virsh destroy vm1
Domain 'vm1' destroyed
[root@dell-per740-01 ~]# virsh domjobinfo vm1 --completed
Job type: Cancelled
Operation: Backup
[root@dell-per740-01 ~]# virsh backup-begin vm1 backup.xml
Backup started
[root@dell-per740-01 ~]# virsh list
Id Name State
----------------------
3 vm1 running
[root@dell-per740-01 ~]# ps -ef | grep vm1
qemu 188112 1 61 05:23 ? 00:00:29 /usr/libexec/qemu-kvm -name guest=vm1...
[root@dell-per740-01 ~]# kill -9 188112
[root@dell-per740-01 ~]# virsh domjobinfo vm1 --completed
Job type: Cancelled
Operation: Backup
[root@dell-per740-01 ~]# virsh start vm1
Domain 'vm1' started
push mode:
[root@dell-per740-01 ~]# cat push.xml
<domainbackup>
<disks>
<disk name='vda' type='file'>
<target file='/tmp/vda.backup'/>
<driver type='qcow2'/>
</disk>
</disks>
</domainbackup>
[root@dell-per740-01 ~]# virsh backup-begin vm1 push.xml
Backup started
[root@dell-per740-01 ~]# virsh destroy vm1
Domain 'vm1' destroyed
[root@dell-per740-01 ~]# virsh domjobinfo vm1 --completed
Job type: Cancelled
Operation: Backup
[root@dell-per740-01 ~]# virsh start vm1
Domain 'vm1' started
[root@dell-per740-01 ~]# virsh backup-begin vm1 push.xml
Backup started
[root@dell-per740-01 ~]# ps -ef | grep vm1
qemu 196827 1 99 10:40 ? 00:00:22 /usr/libexec/qemu-kvm -name guest=vm1..
[root@dell-per740-01 ~]# kill -9 196827
[root@dell-per740-01 ~]# virsh domjobinfo vm1 --completed
Job type: Cancelled
Operation: Backup
[root@dell-per740-01 ~]# virsh start vm1
Domain 'vm1' started
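The manual check used in the transcripts above (run `virsh domjobinfo vm1 --completed` and look for a cancelled backup job) could be scripted. The following is a minimal sketch, assuming `virsh` is on the PATH and prints `Key: value` lines as shown in this report; the helper names are hypothetical.

```python
import subprocess


def parse_domjobinfo(text: str) -> dict:
    """Turn 'Key: value' lines of virsh domjobinfo output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            info[key.strip()] = value.strip()
    return info


def backup_was_cancelled(info: dict) -> bool:
    """True when the completed job is a cancelled backup, as in the PASS runs."""
    return info.get("Job type") == "Cancelled" and info.get("Operation") == "Backup"


def completed_job_info(domain: str) -> dict:
    """Run 'virsh domjobinfo <domain> --completed' and parse its output."""
    result = subprocess.run(
        ["virsh", "domjobinfo", domain, "--completed"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_domjobinfo(result.stdout)
```

For example, `backup_was_cancelled(completed_job_info("vm1"))` would be expected to return True after the `virsh destroy` or `kill -9` steps on a fixed build, and False for the `Job type: None` output seen in the original report.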
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:2098
Description of problem:
libvirt should release the lock held by remoteDispatchDomainBackupBegin after the guest is destroyed.
Version-Release number of selected component (if applicable):
libvirt-daemon-7.0.0-8.module+el8.4.0+10233+8b7fd9eb.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Prepare a 'pull' mode backup xml:
# cat backup_pull_full.xml
<domainbackup mode='pull'>
<server name="localhost" port="10809"/>
<disks>
<disk name='vda' backup='yes' type='file'>
<scratch file='/mnt/sratch.vda'/>
</disk>
</disks>
</domainbackup>
2. Start backup with 'pull' mode:
# virsh backup-begin vm1 backup_full_pull.xml
Backup started
3. Check domain job info:
# virsh domjobinfo vm1
Job type: Unbounded
Operation: Backup
Time elapsed: 16612 ms
Temporary disk space use: 21.375 MiB
Temporary disk space total: 10.000 GiB
4. Destroy guest:
# virsh destroy vm1
Domain 'vm1' destroyed
5. Check domain job info:
# virsh domjobinfo vm1 --completed
Job type: None
6. Start guest again:
# virsh start vm1
error: Failed to start domain 'vm1'
error: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainBackupBegin)
Actual results:
libvirt does not release the lock held by remoteDispatchDomainBackupBegin after the guest is destroyed.
Expected results:
libvirt should release the lock held by remoteDispatchDomainBackupBegin after the guest is destroyed.
Additional info:
1. The issue also exists when the guest is destroyed with 'kill -9 `pidof qemu-kvm`'.
2. The issue cannot be reproduced with libvirt-6.0.0-25.5.module+el8.2.1+8680+ea98947b.x86_64
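When reproducing this in an automated test, the failure in step 6 can be recognized from the error text. The following is a minimal sketch, assuming the message has exactly the form quoted in this report; the function name is hypothetical, and other libvirt versions may word the message differently.

```python
import re
from typing import Optional

# Matches only the message form quoted in this report.
_LOCK_RE = re.compile(
    r"cannot acquire state change lock \(held by monitor=(?P<holder>[^)\s]+)\)"
)


def lock_holder(error_text: str) -> Optional[str]:
    """Return the name of the API holding the state change lock, if reported."""
    match = _LOCK_RE.search(error_text)
    return match.group("holder") if match else None
```

A test could run `virsh start vm1` after destroying the guest, pass the captured stderr to `lock_holder()`, and fail if it returns `remoteDispatchDomainBackupBegin`.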