Bug 1928838
Summary: | Don't use VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE for virDomainSnapshotCreateXML when filesystems are already frozen by virDomainFSFreeze | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Peter Krempa <pkrempa> |
Component: | openstack-nova | Assignee: | OSP DFG:Compute <osp-dfg-compute> |
Status: | CLOSED EOL | QA Contact: | OSP DFG:Compute <osp-dfg-compute> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 16.2 (Train) | CC: | alifshit, dasmith, eglynn, jhakimra, kchamart, sbauza, sgordon, vromanso |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2025-01-17 15:32:03 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Peter Krempa
2021-02-15 15:59:37 UTC
Looking at the nova-compute exception fragment[1], this is coming from the _volume_snapshot_create() method in Nova's libvirt driver, where the following seems to be the logic.

Before taking a snapshot, the _volume_snapshot_create() method checks if we can quiesce the guest:

- if the guest is capable of quiescing, then it tries guest.snapshot() with "quiesce=True" [...] # if the user requests (by specifying as a parameter on the template image from which the guest is booting) to have quiesce be part of the snapshot, and if Nova can't honour that, then raise an error
- but if the guest is _not_ capable of quiescing, then the guest.snapshot() call is re-tried with "quiesce=False"

It was introduced in this Nova commit[3] to fix a bug where Nova was attempting to quiesce when doing a volume (i.e. a detachable block device) snapshot without checking if the guest is _capable_ of quiescing or not.

[1] exception fragment from nova-compute.log
---------------------------------------------------------
[...]
2021-02-10 06:00:55.038 7 ERROR nova.virt.libvirt.driver [instance: 72786c63-160a-44d0-941e-3ce056afebe2] File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 2897, in _volume_snapshot_create
2021-02-10 06:00:55.038 7 ERROR nova.virt.libvirt.driver [instance: 72786c63-160a-44d0-941e-3ce056afebe2] reuse_ext=True, quiesce=True)
[...]
2021-02-10 05:59:29.293 7 ERROR nova.virt.libvirt.driver [instance: 72786c63-160a-44d0-941e-3ce056afebe2] File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2814, in snapshotCreateXML
2021-02-10 05:59:29.293 7 ERROR nova.virt.libvirt.driver [instance: 72786c63-160a-44d0-941e-3ce056afebe2] if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
2021-02-10 05:59:29.293 7 ERROR nova.virt.libvirt.driver [instance: 72786c63-160a-44d0-941e-3ce056afebe2] libvirt.libvirtError: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance
[...]
---------------------------------------------------------

[2] https://github.com/openstack/nova/blob/308c6007dcbced/nova/virt/libvirt/driver.py#L2791,#L2820
[3] https://opendev.org/openstack/nova/commit/e659a6e7cbb30 (libvirt: check if we can quiesce before volume-backed snapshot; 2016-09-30)

The problem isn't that 'guest.snapshot(quiesce=True)' is followed by 'guest.snapshot(quiesce=False)' if the former fails. That is a reasonable algorithm when the quiescing is done as an integral part of the libvirt snapshot API.

The problem lies with an explicit quiesce done via 'virDomainFSFreeze' (https://github.com/openstack/nova/blob/308c6007dcbced8f4e97b1712ade66b27949b712/nova/virt/libvirt/guest.py#L546) followed by a snapshot with quiesce=True; in libvirt terms, virDomainFSFreeze followed by virDomainSnapshotCreateXML(..., VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE).

The qemu guest agent doesn't allow quiescing/freezing if the filesystems are already frozen, so a snapshot with quiescing enabled will always fail if the filesystems are already quiesced.

I've also updated the libvirt docs (https://gitlab.com/libvirt/libvirt/-/commit/ec86b8fa29fa97b51382eb19ca2355c87dfcc38f) to promote the use of explicit quiescing.

While the reported behavior isn't great, as far as I understand it the impact is minimal and not user-visible.
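For reference, the ordering asked for in the summary (freeze explicitly, then snapshot without VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE) can be sketched roughly as below. This is only an illustration, not Nova's code: FakeDomain and snapshot_with_explicit_freeze() are hypothetical names, the fake models nothing but the frozen/thawed state, and the QUIESCE constant is a stand-in for libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE. The method names fsFreeze(), fsThaw() and snapshotCreateXML() mirror the libvirt-python bindings, so the helper should in principle accept a real domain object too, but that is untested here.

```python
# Stand-in for libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE; real code should
# take the constant from the libvirt module rather than hard-coding it.
QUIESCE = 64


class FakeDomain:
    """Hypothetical stand-in for a libvirt domain; models only freeze state."""

    def __init__(self):
        self.frozen = False

    def fsFreeze(self):
        # The guest agent refuses a second freeze while already frozen.
        if self.frozen:
            raise RuntimeError("guest-fsfreeze-freeze has been disabled")
        self.frozen = True

    def fsThaw(self):
        self.frozen = False

    def snapshotCreateXML(self, xml, flags=0):
        # With QUIESCE set, the snapshot call freezes the guest itself,
        # which fails if the filesystems were frozen beforehand.
        if flags & QUIESCE:
            self.fsFreeze()
            self.fsThaw()
        return "snapshot"


def snapshot_with_explicit_freeze(dom, xml, flags=0):
    """Freeze explicitly, snapshot with QUIESCE masked out, always thaw."""
    dom.fsFreeze()
    try:
        return dom.snapshotCreateXML(xml, flags & ~QUIESCE)
    finally:
        dom.fsThaw()
```

Masking the flag inside the helper means a caller that still passes quiesce-style flags does not trip the double freeze, and the finally block keeps the guest from staying frozen if the snapshot itself fails.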
Being realistic, we'll never get around to fixing this (and it's been 4 years since the bug report anyway). Closing.