Created attachment 484449
A set of scripts to exercise the file system

Description of problem:
Migrating a KVM guest on non-shared LVM storage with --copy-storage-all results in a corrupted file system if the guest is under considerable I/O load.

Version-Release number of selected component (if applicable):
libvirt-0.8.5
linux-kernel-2.6.32
lvm2-2.02.54
qemu-kvm-0.12.3

How reproducible:
The error can be reproduced consistently.

Steps to Reproduce:
1. Create a guest using LVM-based storage.
2. Create an LV on the destination node for the guest to be migrated to.
3. Place the attached scripts somewhere on the guest's system.
4. Run 'runlots'.
5. Migrate the guest using the following command:
   'virsh migrate --live --persistent --undefinesource --copy-storage-all <guest-name> qemu+ssh://<target-node>/system'
6. Attempt to shut down the guest, forcing it off if necessary.
7. Make the partitions of the LV accessible on the node: 'partprobe /dev/mapper/<volume-name>'
8. Run fsck: 'fsck -n -f /dev/mapper/<volume-name>p1'
(Steps 5-8 are wrapped in the helper sketched below.)

Actual results:
fsck reports a large number of errors, far more than an unclean shutdown alone can account for.

Expected results:
A clean bill of health from fsck.

Additional info:
I suspect a race condition in the live synchronization algorithm behind --copy-storage-all.

Workaround: The only safe way to migrate guests in this scenario is without the '--live' argument: the guest is first suspended, then everything is transferred, and it is finally resumed on the target node. When the I/O load is low, live migration works as well, but that is too risky on production systems because there is no way to tell when the I/O load is too high for a successful live migration. The workaround is also very unsatisfying: for a guest with a 100 GB file system, the migration takes 45 minutes on our systems, and having migrated other guests with zero downtime got us hooked.

The attached scripts to simulate high I/O load are somewhat artificial in nature. However, the bug is motivated by a real-world scenario: we migrated a production mail server that subsequently misbehaved, finally crashed, and corrupted several of our customers' e-mails. Unfortunately, a bug of this nature can't be tested on non-production systems, because they don't reach the necessary load levels. The attached scripts reliably reproduce the failure experienced by our mail server.
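To make steps 5-8 easy to repeat, here is a minimal helper sketch (not part of attachment 484449). The guest name, target node and volume name are placeholders, and it assumes the partprobe/fsck commands are run over ssh on the node that holds the destination LV; adjust as needed.

#!/usr/bin/env python
# Hypothetical helper for steps 5-8 above; GUEST, TARGET and VOLUME are
# placeholders, and the post-migration commands are run over ssh on the
# target node.
import subprocess
import time

GUEST = "guest-name"      # <guest-name> from step 5
TARGET = "target-node"    # <target-node> from step 5
VOLUME = "volume-name"    # <volume-name> from steps 7 and 8

def run(cmd):
    print("+ " + " ".join(cmd))
    return subprocess.call(cmd)

# Step 5: live migration with a full copy of the guest's storage.
run(["virsh", "migrate", "--live", "--persistent", "--undefinesource",
     "--copy-storage-all", GUEST, "qemu+ssh://%s/system" % TARGET])

# Step 6: request a clean shutdown on the target, wait a while, then force
# the guest off ('virsh destroy' fails harmlessly if it is already down).
run(["ssh", TARGET, "virsh shutdown %s" % GUEST])
time.sleep(120)
run(["ssh", TARGET, "virsh destroy %s" % GUEST])

# Steps 7 and 8: expose the LV's partitions on the target node and run a
# read-only fsck on the first partition.
run(["ssh", TARGET, "partprobe /dev/mapper/%s" % VOLUME])
run(["ssh", TARGET, "fsck -n -f /dev/mapper/%sp1" % VOLUME])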
If the guest doesn't complete its migration, it may be necessary to kill the test scripts from the console: 'killall python'
There is nothing we can do about this from libvirt unless there is a mistake in the way we issue the migration command to QEMU, and I am not aware of one. Any data corruption will be at the QEMU layer, which is where the disk data copying and dirty-state tracking take place. So please report this problem to the QEMU bug tracker instead.
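For context, the virsh command from step 5 corresponds roughly to the following use of the libvirt Python bindings. This is only an illustrative sketch with placeholder guest and host names, not libvirt's internal code path; the point is that --copy-storage-all maps to VIR_MIGRATE_NON_SHARED_DISK, which libvirt hands off to QEMU's block migration, where the disk copying and dirty-block tracking actually happen.

# Rough Python-bindings equivalent of the virsh command from step 5;
# "guest-name" and "target-node" are placeholders.
import libvirt

flags = (libvirt.VIR_MIGRATE_LIVE               # --live
         | libvirt.VIR_MIGRATE_PERSIST_DEST     # --persistent
         | libvirt.VIR_MIGRATE_UNDEFINE_SOURCE  # --undefinesource
         | libvirt.VIR_MIGRATE_NON_SHARED_DISK) # --copy-storage-all

src = libvirt.open("qemu:///system")
dst = libvirt.open("qemu+ssh://target-node/system")
dom = src.lookupByName("guest-name")

# VIR_MIGRATE_NON_SHARED_DISK only selects QEMU's block migration; the
# disk data copying and dirty-state tracking are performed inside QEMU.
dom.migrate(dst, flags, None, None, 0)

dst.close()
src.close()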
I suppose that just goes to show my lack of understanding of the interaction between libvirt and qemu-kvm. Thanks for the pointer. The bug is now filed at the following URL: https://bugs.launchpad.net/qemu/+bug/735454