Created attachment 484449
A set of scripts to exercise the file system

Description of problem:
Migrating a KVM guest on non-shared LVM storage with --copy-storage-all results in a corrupted file system if the guest is under considerable I/O load.

Version-Release number of selected component (if applicable):
libvirt-0.8.5
linux-kernel-2.6.32
lvm2-2.02.54
qemu-kvm-0.12.3

How reproducible:
The error can be reproduced consistently.

Steps to Reproduce:
1. Create a guest using LVM-based storage.
2. Create an LV on the destination node for the guest to be migrated to.
3. Place the attached scripts somewhere on the guest's system.
4. Run 'runlots'.
5. Migrate the guest using the following command:
   'virsh migrate --live --persistent --undefinesource --copy-storage-all <guest-name> qemu+ssh://<target-node>/system'
6. Attempt to shut down the guest, forcing it off if necessary.
7. Make the partitions of the LV accessible on the node: 'partprobe /dev/mapper/<volume-name>'
8. Run fsck: 'fsck -n -f /dev/mapper/<volume-name>p1'
(Steps 5-8 are wrapped in the helper sketched below.)

Actual results:
fsck reports a large number of errors, far more than an unclean shutdown alone can account for.

Expected results:
A clean bill of health from fsck.

Additional info:
I suspect a race condition in the live synchronization algorithm behind --copy-storage-all.

Workaround: The only safe way to migrate guests in this scenario is without the '--live' argument: the guest is first suspended, then everything is transferred, and it is finally resumed on the target node. When the I/O load is low, live migration works as well, but that is too risky on production systems because there is no way to tell when the I/O load is too high for a successful live migration. The workaround is also very unsatisfying: for a guest with a 100 GB file system, the migration takes 45 minutes on our systems, and having migrated other guests with zero downtime got us hooked.

The attached scripts to simulate high I/O load are somewhat artificial in nature. However, the bug is motivated by a real-world scenario: we migrated a production mail server that subsequently misbehaved, finally crashed, and corrupted several of our customers' e-mails. Unfortunately, a bug of this nature can't be tested on non-production systems, because they don't reach the necessary load levels. The attached scripts reliably reproduce the failure experienced by our mail server.
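To make steps 5-8 easy to repeat, here is a minimal helper sketch (not part of attachment 484449). The guest name, target node and volume name are placeholders, and it assumes the partprobe/fsck commands are run over ssh on the node that holds the destination LV; adjust as needed.

#!/usr/bin/env python
# Hypothetical helper for steps 5-8 above; GUEST, TARGET and VOLUME are
# placeholders, and the post-migration commands are run over ssh on the
# target node.
import subprocess
import time

GUEST = "guest-name"      # <guest-name> from step 5
TARGET = "target-node"    # <target-node> from step 5
VOLUME = "volume-name"    # <volume-name> from steps 7 and 8

def run(cmd):
    print("+ " + " ".join(cmd))
    return subprocess.call(cmd)

# Step 5: live migration with a full copy of the guest's storage.
run(["virsh", "migrate", "--live", "--persistent", "--undefinesource",
     "--copy-storage-all", GUEST, "qemu+ssh://%s/system" % TARGET])

# Step 6: request a clean shutdown on the target, wait a while, then force
# the guest off ('virsh destroy' fails harmlessly if it is already down).
run(["ssh", TARGET, "virsh shutdown %s" % GUEST])
time.sleep(120)
run(["ssh", TARGET, "virsh destroy %s" % GUEST])

# Steps 7 and 8: expose the LV's partitions on the target node and run a
# read-only fsck on the first partition.
run(["ssh", TARGET, "partprobe /dev/mapper/%s" % VOLUME])
run(["ssh", TARGET, "fsck -n -f /dev/mapper/%sp1" % VOLUME])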
If the guest doesn't complete its migration, it may be necessary to kill the test scripts from the console: 'killall python'
There is nothing we can do about this from libvirt unless there is a mistake in the way we issue the migration command to QEMU, and I am not aware of one. Any data corruption will be at the QEMU layer, which is where the disk data copying and dirty-state tracking take place. So please report this problem to the QEMU bug tracker instead.
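For context, the virsh command from step 5 corresponds roughly to the following use of the libvirt Python bindings. This is only an illustrative sketch with placeholder guest and host names, not libvirt's internal code path; the point is that --copy-storage-all maps to VIR_MIGRATE_NON_SHARED_DISK, which libvirt hands off to QEMU's block migration, where the disk copying and dirty-block tracking actually happen.

# Rough Python-bindings equivalent of the virsh command from step 5;
# "guest-name" and "target-node" are placeholders.
import libvirt

flags = (libvirt.VIR_MIGRATE_LIVE               # --live
         | libvirt.VIR_MIGRATE_PERSIST_DEST     # --persistent
         | libvirt.VIR_MIGRATE_UNDEFINE_SOURCE  # --undefinesource
         | libvirt.VIR_MIGRATE_NON_SHARED_DISK) # --copy-storage-all

src = libvirt.open("qemu:///system")
dst = libvirt.open("qemu+ssh://target-node/system")
dom = src.lookupByName("guest-name")

# VIR_MIGRATE_NON_SHARED_DISK only selects QEMU's block migration; the
# disk data copying and dirty-state tracking are performed inside QEMU.
dom.migrate(dst, flags, None, None, 0)

dst.close()
src.close()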
I suppose that just goes to show my lack of understanding of the interaction between libvirt and qemu-kvm. Thanks for the pointer. The bug is now filed at the following URL: https://bugs.launchpad.net/qemu/+bug/735454