| Summary: | Libvirt should support destroying a migrated guest on the source host | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Fangge Jin <fjin> | ||||
| Component: | libvirt | Assignee: | Virtualization Maintenance <virt-maint> | ||||
| Status: | CLOSED DEFERRED | QA Contact: | Fangge Jin <fjin> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | --- | CC: | berrange, dyuan, jdenemar, mzhan, rbalakri, xuzhang, yafu, zpeng | ||||
| Target Milestone: | rc | Keywords: | FutureFeature, Triaged | ||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2020-02-11 13:23:04 UTC | Type: | Feature Request | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
Destroying a guest without restoring the label is absolutely *NOT* something we will do. Having such an option is a security risk: once the guest is destroyed, any future guest may be given the same SELinux label and could thus potentially access the original guest's data.

The thing is, we can get into a split-brain situation in which a domain is already happily running on the destination host (Finish returned success) while the source host doesn't know about it (the Confirm call was never made). It's not a normal situation; it only happens after, e.g., a network failure, but it can happen. And with post-copy migration the window between Finish and Confirm is bigger, which makes it easier to get into such a situation. We need a way to recover from such a split brain, since destroying the domain on the source host currently restores the labels and effectively kills the running domain on the destination.

That is, we definitely don't want to name it "destroy without restoring labels", but we need to provide a way to do exactly this (although in a very specific case). I still think it makes sense to implement something like
virDomainDestroyFlags(dom, VIR_DOMAIN_DESTROY_MIGRATED)
which would tell libvirt the domain was actually migrated to the destination
host. The API would of course succeed only when libvirt was migrating the
domain before and the domain got paused at the end of the migration, but
couldn't be killed or resumed because we did not get the result. If it was
supposed to be resumed, the higher layer may already fix that with
virDomainResume(). But we have no way to fix the situation if the domain was
supposed to be killed and the new flag for virDomainDestroyFlags() would solve
this.
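
The precondition described above can be sketched as a small model. This is not libvirt code: VIR_DOMAIN_DESTROY_MIGRATED is only the flag proposed in this comment, and every name below (SourceDomain, may_destroy_as_migrated, the `migrated` argument) is hypothetical and chosen for illustration. The sketch assumes the source side can tell whether it started the migration, whether the domain is paused at the end of it, and whether Confirm ever ran.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SourceState(Enum):
    RUNNING = auto()
    PAUSED_MIGRATION = auto()  # paused at the end of an outgoing migration


@dataclass
class SourceDomain:
    state: SourceState
    was_migrating: bool     # this host started an outgoing migration
    confirm_received: bool  # Confirm ran, so the migration result is known


def may_destroy_as_migrated(dom: SourceDomain) -> bool:
    """True only for a paused migration source whose result never arrived."""
    return (
        dom.was_migrating
        and dom.state is SourceState.PAUSED_MIGRATION
        and not dom.confirm_received
    )


def destroy(dom: SourceDomain, migrated: bool = False) -> str:
    """Model of a destroy call; `migrated` stands in for the proposed flag."""
    if migrated:
        if not may_destroy_as_migrated(dom):
            raise ValueError("domain is not a paused, unconfirmed migration source")
        # The destination owns the disks now, so the labels are left alone.
        return "destroyed, labels left untouched"
    # Ordinary destroy: labels are restored, as libvirt does today.
    return "destroyed, labels restored"
```

In this model a plain destroy keeps today's behaviour, while the flagged variant is refused for any domain that is not stuck between Finish and Confirm, which is what keeps it from becoming a general "destroy without restoring labels" switch.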
This bug was closed deferred as a result of bug triage. Please reopen if you disagree and provide justification why this bug should get enough priority. Most important would be information about impact on customer or layered product. Please indicate requested target release.
Created attachment 1202121 [details]
libvirtd and qemu log on both hosts

Description of problem:
If the virsh client is killed during a post-copy migration, the guest is left in the paused state on the source host after the migration finishes, while it is running on the target host at the same time. Destroying the guest on the source host with "virsh destroy $guest" then changes the ownership of the guest image file to root:root. It would be better for libvirt to support destroying a guest without restoring the disk image label, so that the guest on the target host keeps working.

Version-Release number of selected component (if applicable):
libvirt-2.0.0-9.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
0. Prepare a shared image file.
1. Start a guest on the source host.
2. In terminal one, migrate the guest to the target host with the --postcopy option:
   # virsh migrate rhel7.3-0817 qemu+ssh://hp-dl385g7-06.lab.eng.pek2.redhat.com/system --live --verbose --postcopy
3. In terminal two, before the migration finishes, switch to post-copy and kill the virsh client:
   # virsh migrate-postcopy rhel7.3-0817; sleep 1; killall virsh
4. Check terminal one; the virsh command has been terminated:
   # virsh migrate rhel7.3-0817 qemu+ssh://hp-dl385g7-06.lab.eng.pek2.redhat.com/system --live --verbose --postcopy
   Migration: [ 48 %]Terminated
5. Check the guest status on both hosts:
   1) On the source host:
   # virsh list
    Id    Name                           State
   ----------------------------------------------------
    6     rhel7.3-0817                   paused
   2) On the target host:
   # virsh list
    Id    Name                           State
   ----------------------------------------------------
    1     rhel7.3-0817                   running
6. Run a command in the guest, for example:
   $ date
   Sun Sep 18 15:41:05 CST 2016
7. Destroy the guest on the source host:
   # virsh destroy rhel7.3-0817
   Domain rhel7.3-0817 destroyed
8. Check the guest image file ownership; it has changed to root:root:
   # ll /90121/fjin/rhel7.3-0817-1.qcow2 -Z
   -rw-------. root root system_u:object_r:nfs_t:s0 /90121/fjin/rhel7.3-0817-1.qcow2
9. Run a command in the guest again; it reports an I/O error:
   $ date
   bash: /usr/bin/date: Input/output error

Actual results:
As in the steps above.

Expected results:
Libvirt provides a way to destroy a guest without restoring the label of the guest image file.
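
The ownership check in step 8 could be scripted when repeating the reproduction. The following is a minimal sketch using only the Python standard library; the helper names (describe_owner, reset_to_root) are made up for this example, and it only looks at the numeric owner and mode, not the SELinux context that "ll -Z" also prints.

```python
import os
import stat


def describe_owner(path: str) -> dict:
    """Snapshot the owner, group and mode string of a file."""
    st = os.stat(path)
    return {
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mode": stat.filemode(st.st_mode),
    }


def reset_to_root(before: dict, after: dict) -> bool:
    """True if ownership changed to root:root between two snapshots."""
    changed = (before["uid"], before["gid"]) != (after["uid"], after["gid"])
    return changed and after["uid"] == 0 and after["gid"] == 0
```

Taking one snapshot of the image file before "virsh destroy" on the source host and one after, then passing both to reset_to_root(), would flag the ownership change reported here.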