Bug 822052
| Field | Value |
|---|---|
| Summary | migration will cause guest IO failure when DST sebool is virt_use_nfs=off |
| Product | Red Hat Enterprise Linux 7 |
| Component | libvirt |
| Version | 7.0 |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | unspecified |
| Reporter | zhpeng |
| Assignee | Eric Blake <eblake> |
| QA Contact | Virtualization Bugs <virt-bugs> |
| CC | acathrow, cwei, dallan, dyuan, eblake, mjenner, mzhan, weizhan, zpeng |
| Target Milestone | rc |
| Target Release | 7.1 |
| Hardware | x86_64 |
| OS | Linux |
| Fixed In Version | libvirt-1.1.1-3.el7 |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2014-06-13 13:26:00 UTC |
| Bug Depends On | 547546, 1652078 |

Description
zhpeng
2012-05-16 08:22:05 UTC
Created attachment 584892: guest io failure screenshot

I was able to reproduce the issue. It is caused by dynamic ownership on the destination host. Once qemu fails to start (because SELinux denies access to the disk image), libvirtd resets the image ownership to root:root, and NFS then denies access to it even though the file is already open. We have to finally fix dynamic ownership to restore the original owner instead of just resetting it to root:root.

The root cause of this error is bug 547546. I'll keep this bug open (rather than closing it as a duplicate) to track possible complications we need to handle during migration, and also as a reminder that fixes for bug 547546 need to be verified during migration.

I got the same error with a migrate & domjobabort testing scenario. Details are as follows:

Version:
libvirt-0.10.2-6.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.330.el6.x86_64
kernel-2.6.32-335.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
0. Prepare a migration environment.
1. Start domain "rhel_mig" on the source host:
   # virsh start rhel_mig
2. Wait for the domain to be fully started.
3. Migrate on the source:
   # virsh migrate --live rhel_mig qemu+ssh://${dest_host_ip}/system --verbose
4. Before step 3 finishes, open another terminal and cancel the migration job:
   # virsh domjobabort rhel_mig

Actual results:
After step 4, the migration job is successfully canceled, but there are buffer I/O errors in the guest, like the screenshot in this bug's attachment.

I got this error again: there are buffer I/O errors on the screen of the guest, and commands in the guest don't work. Details are as follows:

Version:
libvirt-0.9.10-21.el6_3.6.x86_64
qemu-kvm-0.12.1.2-2.295.el6_3.8.x86_64
kernel-2.6.32-279.14.1.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
0. Prepare a migration environment.
1. Define and start a network with the following testbr.xml:
   <network>
     <name>testbr</name>
     <uuid>8da85d86-fbd9-c2a1-013b-f121e7c42c8a</uuid>
     <forward mode='nat'/>
     <bridge name='testbr' stp='on' delay='0' />
     <ip address='192.168.100.1' netmask='255.255.255.0'>
       <dhcp>
         <range start='192.168.100.2' end='192.168.100.6' />
       </dhcp>
     </ip>
   </network>
   # virsh net-define testbr.xml
   # virsh net-start testbr
2. Define and start a guest with its image in the /mnt directory, replacing the interface section with the following XML:
   <interface type='network'>
     <mac address='52:54:00:c5:66:80'/>
     <source network='testbr'/>
     <target dev='vnet0'/>
     <alias name='net0'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
   </interface>
3. Issue the following commands to migrate the guest:
   # virsh start rhel_mig
   # ll -Z /mnt/rhel_mig.img
   -rw-------. qemu qemu system_u:object_r:nfs_t:s0 /mnt/test_xuwan.img
   # virsh migrate rhel_mig --live qemu+ssh://$target_ip/system
   error: Network not found: no network with matching name 'testbr'
   # ll -Z /mnt/rhel_mig.img
   -rw-------. root root system_u:object_r:nfs_t:s0 /mnt/test_xuwan.img

Actual results:
After step 3, there are buffer I/O errors in the guest, like the screenshot in this bug's attachment.

Expected results:
After step 3, migration fails, but the guest keeps running well on the source.

Xu, comments 7 and 9 are both hitting the same issue this bug is about. Basically, whenever you start a migration and the migration fails on the destination, libvirtd on the destination resets file ownership to root:root in an attempt to clean up after the failed start of the domain, and that causes I/O errors on files stored on NFS. Comments 7 and 9 differ only in the reason why migration failed on the destination. In comment 7 it failed because you aborted it, and in comment 9 it failed because the required network was not found on the destination.

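The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not libvirt's implementation: `SharedStorage`, `failed_start_reset_to_root`, and `failed_start_restore_original` are hypothetical names, and the access check is a simplifying assumption (a mode 0600 image on NFS that only its owning user can access).

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Owner = Tuple[str, str]  # (user, group)


@dataclass
class SharedStorage:
    """Toy model of image files on storage shared by source and destination."""
    owners: Dict[str, Owner] = field(default_factory=dict)

    def can_access(self, path: str, user: str) -> bool:
        # Assumption: mode 0600, so only the owning user may access the file.
        return self.owners[path][0] == user


def failed_start_reset_to_root(storage: SharedStorage, path: str) -> None:
    """Behavior reported in this bug: cleanup resets ownership to root:root."""
    storage.owners[path] = ("qemu", "qemu")  # set before starting qemu
    # ... qemu fails to start on the destination ...
    storage.owners[path] = ("root", "root")  # cleanup ignores the prior owner


def failed_start_restore_original(storage: SharedStorage, path: str) -> None:
    """Alternative: remember the owner first and put it back on cleanup."""
    original = storage.owners[path]
    storage.owners[path] = ("qemu", "qemu")
    # ... qemu fails to start on the destination ...
    storage.owners[path] = original


if __name__ == "__main__":
    image = "/mnt/rhel_mig.img"
    for cleanup in (failed_start_reset_to_root, failed_start_restore_original):
        # The source qemu is running and owns the image before migration.
        storage = SharedStorage({image: ("qemu", "qemu")})
        cleanup(storage, image)
        print(cleanup.__name__, storage.can_access(image, "qemu"))
```

In this model, the source guest keeps access only when cleanup restores the owner that was recorded before the destination touched the file.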
Bug 895826 is another reported instance where a failed migration invokes the relabeling cleanup. It proposes a patch that might solve the immediate symptoms (if migration fails, then don't attempt relabels on the destination, because the source is still using the file), without needing the more complex fix of proper ref-counting and restoring permissions to their original settings in the first place.

Upstream patch proposed:
https://www.redhat.com/archives/libvir-list/2013-August/msg01005.html

I can reproduce this with:
libvirt-1.1.1-2.el7.x86_64
qemu-kvm-1.5.2-4.el7.x86_64
kernel-3.10.0-9.el7.x86_64

Verified with build:
libvirt-1.1.1-3.el7.x86_64
qemu-kvm-1.5.2-4.el7.x86_64
kernel-3.10.0-9.el7.x86_64

Steps:
1. Prepare an NFS export and mount it on hosts A and B as a shared NFS pool.
2. Set virt_use_nfs = off on B and virt_use_nfs = on on A.
3. Define and start a guest on A.
4. Migrate the guest from A to B:
   # migrate --live aaa qemu+ssh://$hostB_ip/system
   root.106.30's password:
   error: internal error: process exited while connecting to monitor: char device redirected to /dev/pts/1 (label charserial0)
   qemu-kvm: -drive file=/var/lib/libvirt/migrate/kvm-rhel6.4-x86_64-qcow2.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image /var/lib/libvirt/migrate/kvm-rhel6.4-x86_64-qcow2.img: Permission denied

There are no I/O errors in the guest. Check the ownership of the image:
   # ll -Z
   -rw-r--r--. qemu qemu unconfined_u:object_r:virt_image_t:s0 kvm-rhel6.4-x86_64-qcow2.img

Ownership was not changed to root:root, so this works as expected. Moving to VERIFIED.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
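The ownership check used in the verification steps can also be scripted. The following is a minimal sketch under stated assumptions: `ownership` and `ownership_unchanged` are hypothetical helper names, and the check compares only the numeric uid and gid, not the SELinux label that `ll -Z` also displays.

```python
import os
from typing import Callable, Tuple


def ownership(path: str) -> Tuple[int, int]:
    """Return the (uid, gid) that currently owns the file."""
    st = os.stat(path)
    return st.st_uid, st.st_gid


def ownership_unchanged(path: str, action: Callable[[], None]) -> bool:
    """Run an action that may fail and report whether ownership survived it."""
    before = ownership(path)
    try:
        action()
    except Exception:
        # A failing action (such as a refused migration) is the case under test.
        pass
    return ownership(path) == before
```

Here `action` could be any callable that attempts the migration from step 4, for example a wrapper that runs the migrate command and raises on a non-zero exit status. A result of `True` corresponds to the "not changed to root:root" observation above.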