Bug 1460962
| Summary: | vm cannot be started if it has a corrupted managedsave file | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | yisun |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED ERRATA | QA Contact: | Yanqiu Zhang <yanqzhan> |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | 7.4 | CC: | chhu, dyuan, fjin, jdenemar, rbalakri, xuzhang, yafu, yanqzhan, yisun, zpeng |
| Target Milestone: | rc | Keywords: | Regression, Upstream |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-3.9.0-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-04-10 10:48:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
yisun
2017-06-13 09:15:03 UTC
Oops, caused by
commit ac793bd7195ab99445cf6c6d6053439c56cef922
Author: Jiri Denemark <jdenemar>
AuthorDate: Tue Jun 6 22:27:57 2017 +0200
Commit: Jiri Denemark <jdenemar>
CommitDate: Wed Jun 7 13:36:01 2017 +0200
qemu: Fix memory leaks in qemuDomainSaveImageOpen
Signed-off-by: Jiri Denemark <jdenemar>
Reviewed-by: Pavel Hrdina <phrdina>
which switched from directly returning with -3 to a goto, but failed to change the "return -1" statement at the end of the error path.
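To illustrate the kind of regression described above, here is a minimal sketch of an error path that funnels through a single cleanup label. It is not the libvirt source: the function name `sketch_save_image_check`, the `SAVE_*` constants, and the `SKETCH_MAGIC` string are hypothetical, and only the special -3 value comes from this report. The point is that the function must return the `ret` variable at the end; a hard-coded `return -1` there would discard the -3 set earlier.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical return codes; only the special -3 is taken from the report. */
#define SAVE_OK          0
#define SAVE_ERROR      -1
#define SAVE_CORRUPTED  -3

/* Hypothetical magic string, not the one libvirt actually uses. */
#define SKETCH_MAGIC "SketchSaveMagic"

/* Validate the header of an in-memory save image. */
static int sketch_save_image_check(const char *data, size_t len)
{
    const size_t magic_len = strlen(SKETCH_MAGIC);
    char *header = NULL;
    int ret = SAVE_ERROR;

    assert(data != NULL);

    header = malloc(magic_len);
    if (!header)
        goto cleanup;               /* generic failure, ret stays -1 */

    if (len < magic_len) {
        ret = SAVE_CORRUPTED;       /* too short to be a valid image */
        goto cleanup;
    }

    memcpy(header, data, magic_len);
    if (memcmp(header, SKETCH_MAGIC, magic_len) != 0) {
        ret = SAVE_CORRUPTED;       /* header does not match */
        goto cleanup;
    }

    ret = SAVE_OK;

 cleanup:
    free(header);                   /* freed on every path, so no leak */
    /* Ending this path with "return -1;" instead would lose the -3. */
    return ret;
}
```

With this shape, a caller can tell a corrupted image (-3) apart from any other failure (-1) and react differently to each.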
Patch sent upstream for review: https://www.redhat.com/archives/libvir-list/2017-June/msg00541.html

Fixed upstream now by
commit 16e31fb38da3c2b9a35faff9ac626d947199cf13
Refs: v3.4.0-97-g16e31fb38
Author: Jiri Denemark <jdenemar>
AuthorDate: Tue Jun 13 13:25:07 2017 +0200
Commit: Jiri Denemark <jdenemar>
CommitDate: Tue Jun 13 13:46:40 2017 +0200
qemu: Fix starting a domain with corrupted managed save file
Commit v3.4.0-44-gac793bd71 fixed a memory leak, but failed to return
the special -3 value. Thus an attempt to start a domain with corrupted
managed save file would remove the corrupted file and report
"An error occurred, but the cause is unknown" instead of starting the
domain from scratch.
https://bugzilla.redhat.com/show_bug.cgi?id=1460962
Hit another issue that should have the same root cause; documenting it here. Jiri, please help to confirm.

Summary: cannot undefine a VM that used to have a corrupted managedsave file which has already been removed

Steps:
1. make a managedsave
root@localhost ~ ## virsh managedsave avocado-vt-vm1
Domain avocado-vt-vm1 state saved by libvirt

2. corrupt the managedsave file
root@localhost ~ ## echo > /var/lib/libvirt/qemu/save/avocado-vt-vm1.save

3. try to start the vm
root@localhost ~ ## virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: An error occurred, but the cause is unknown

4. now we can see the managedsave file removed
root@localhost ~ ## ll /var/lib/libvirt/qemu/save/avocado-vt-vm1.save
ls: cannot access /var/lib/libvirt/qemu/save/avocado-vt-vm1.save: No such file or directory

5. try to undefine the vm
root@localhost ~ ## virsh undefine avocado-vt-vm1
error: Refusing to undefine while domain managed save image exists    <=== now, we cannot undefine the

Yeah, it's caused by the same bug. Libvirtd still thinks the domain has a saved state since it didn't notice it was removed because it was corrupted. Restarting libvirtd should let you undefine the domain.
Reproduce this bug with libvirt-3.2.0-14.el7_4.2.x86_64

Steps to reproduce:
1.
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

# ls /var/lib/libvirt/qemu/save/
# echo > /var/lib/libvirt/qemu/save/V.save
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

# virsh start V
error: Failed to start domain V
error: An error occurred, but the cause is unknown    <== Reproduced

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

2.
# virsh start V
Domain V started

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 206   V                              running

# virsh managedsave V
Domain V state saved by libvirt

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              saved

# echo > /var/lib/libvirt/qemu/save/V.save
# virsh start V
error: Failed to start domain V
error: An error occurred, but the cause is unknown    <== Reproduced

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              saved

# virsh start V
Domain V started

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 207   V                              running

# virsh destroy V
Domain V destroyed

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              saved

Verify this bug with libvirt-3.8.0-1.el7.x86_64.

Steps to verify:
1.
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

# ls /var/lib/libvirt/qemu/save/
# echo > /var/lib/libvirt/qemu/save/V.save
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

# virsh start V
Domain V started    <== Successfully started without error.
2.
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 1     V                              running

# virsh managedsave V
Domain V state saved by libvirt

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              saved

# echo > /var/lib/libvirt/qemu/save/V.save
# virsh start V
Domain V started    <== Successfully started without error.

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 2     V                              running

# virsh destroy V
Domain V destroyed

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              saved

# virsh undefine V
error: Refusing to undefine while domain managed save image exists

# systemctl restart libvirtd
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

# virsh undefine V
Domain V has been undefined

The 'start' behavior above gives the expected result. But, Jiri, one more question: in the last few steps, after starting the guest with a corrupted image, the managed-save status can only be cleared by restarting libvirtd; starting and destroying the guest many times does not clear it. Do you think that's okay?

Oops, looks like we don't reset the managed-saved status after deleting a corrupted save image. An additional trivial patch is needed...

Patch sent upstream for review: https://www.redhat.com/archives/libvir-list/2017-October/msg01079.html

Fixed upstream by
commit f26636887fee11b3ecaa5c0a0734687cded8ed28
Refs: v3.8.0-237-gf26636887
Author: Jiri Denemark <jdenemar>
AuthorDate: Tue Oct 24 10:32:03 2017 +0200
Commit: Jiri Denemark <jdenemar>
CommitDate: Tue Oct 24 11:07:10 2017 +0200
qemu: Reset hasManagedSave after removing a corrupted image
When starting a domain with managed save image, we try to restore it
first. If the image is corrupted, we silently unlink it and just
normally start the domain. At this point the domain has no managed save
image, yet we did not reset the hasManagedSave flag.
https://bugzilla.redhat.com/show_bug.cgi?id=1460962
Signed-off-by: Jiri Denemark <jdenemar>
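The start flow described in the commit message above can be sketched roughly as follows. This is an illustrative model only, not libvirt code: `struct sketch_domain`, `sketch_restore`, and `sketch_start` are hypothetical names, the on-disk image is modeled as a boolean, and clearing the image after a successful restore is an assumption of the sketch. Only the `hasManagedSave` flag name and the -3 "corrupted image" value come from this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified domain state. */
struct sketch_domain {
    bool has_managed_save;  /* models the hasManagedSave flag */
    bool running;
    bool image_on_disk;     /* models the managed save file existing */
    bool image_corrupted;   /* models the file content being invalid */
};

/* Stand-in for restoring from the image: -3 if corrupted, 0 on success. */
static int sketch_restore(struct sketch_domain *dom)
{
    if (dom->image_corrupted)
        return -3;
    dom->running = true;
    return 0;
}

/* Try to restore from a managed save image, else start from scratch. */
static int sketch_start(struct sketch_domain *dom)
{
    assert(dom != NULL);

    if (dom->image_on_disk) {
        int rc = sketch_restore(dom);

        if (rc == 0) {
            /* Assumption of this sketch: a restored image is consumed. */
            dom->image_on_disk = false;
            dom->has_managed_save = false;
            return 0;
        }
        if (rc != -3)
            return rc;              /* any other failure is a real error */

        /* Corrupted image: silently remove it ... */
        dom->image_on_disk = false;
        dom->image_corrupted = false;
        /* ... and reset the flag, the step the follow-up patch adds. */
        dom->has_managed_save = false;
    }

    dom->running = true;            /* normal start from scratch */
    return 0;
}
```

Without the flag reset, the domain in this model would start successfully but still report a managed save image afterwards, which matches the "Refusing to undefine" symptom seen during verification.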
Verify this bug with libvirt-3.9.0-2.el7.x86_64:

1. Newly create a corrupted saved image:
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

# ls /var/lib/libvirt/qemu/save/
# echo > /var/lib/libvirt/qemu/save/V.save
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off

# virsh start V
Domain V started

# ls /var/lib/libvirt/qemu/save/V.save
# virsh destroy V
Domain V destroyed

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off    <== status is not "saved"

2. Corrupt an existing saved image:
# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 6     V                              running

# virsh managedsave V
Domain V state saved by libvirt

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              saved

# echo > /var/lib/libvirt/qemu/save/V.save
# virsh start V
Domain V started

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 7     V                              running

# virsh destroy V
Domain V destroyed

# virsh list --all --managed-save
 Id    Name                           State
----------------------------------------------------
 -     V                              shut off    <== status is not "saved"

And the guest can be undefined.

According to comment 9 and this comment, mark this bug as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704 |