Bug 1285921

Summary:	Can't migrate VMs with cache != none, but 'none' doesn't work with 4k native drives
Product:	Red Hat Enterprise Linux 6	Reporter:	Madison Kelly <mkelly>
Component:	resource-agents	Assignee:	Oyvind Albrigtsen <oalbrigt>
Status:	CLOSED ERRATA	QA Contact:	cluster-qe <cluster-qe>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	6.7	CC:	agk, cluster-maint, djansa, fdinitto
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	resource-agents-3.9.5-30.el6	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-05-10 19:15:33 UTC	Type:	Bug

Description Madison Kelly 2015-11-27 00:46:48 UTC

Description of problem:

Systems with 4k native disks are not compatible with 'cache=none' (the install fails because it can't see the disk), so the cache mode must be set to 'writeback' or 'writethrough'. However, with either of those set, migration fails:

====
[root@node2 ~]# clusvcadm -M vm:server -m node2.ccrs.bcn
Trying to migrate vm:server to node2.ccrs.bcn...Failed; service running on original owner
====

Logs on host:

====
Nov 27 02:41:06 node1 rgmanager[32465]: Migrating vm:server to node2.ccrs.bcn
Nov 27 02:41:06 node1 rgmanager[684]: [vm] Migrate server to node2.ccrs.bcn failed:
Nov 27 02:41:06 node1 rgmanager[706]: [vm] error: Unsafe migration: Migration may lead to data corruption if disks use cache != none
Nov 27 02:41:06 node1 rgmanager[32465]: migrate on vm "server" returned 150 (unspecified)
Nov 27 02:41:06 node1 rgmanager[32465]: Migration of vm:server to node2.ccrs.bcn failed; return code 150
====


Version-Release number of selected component (if applicable):

resource-agents-3.9.5-24.el6_7.1.x86_64



How reproducible:

100%


Steps to Reproduce:
1. Build a machine using only 4k native drives
2. Create a VM with a dedicated clustered LV as storage and verify that 'cache=none' is not supported.
3. Use 'cache=write{back,through}', confirm VM now installs.
4. Try to migrate the VM.
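
To confirm the precondition before step 4, the cache mode of each disk can be read from the libvirt domain XML. The helpers below are a minimal sketch, not part of the resource agent: the function names are hypothetical, and they assume libvirt's usual single-quoted attribute style (cache='writeback'). A likely reason 'cache=none' fails on these drives is that it opens the disk with O_DIRECT, which requires I/O aligned to the 4096-byte logical sector size.

```shell
# Print the cache mode of every disk driver found in libvirt domain XML on stdin.
# Assumption: attributes are single-quoted, e.g. cache='writeback'.
# Disks that carry no cache attribute at all are not reported.
disk_cache_modes() {
    grep -o "cache='[a-z]*'" | sed "s/^cache='//; s/'\$//"
}

# Exit 0 if any disk on stdin uses an explicit cache mode other than 'none',
# which is the condition named in the "Unsafe migration" error.
has_unsafe_cache() {
    disk_cache_modes | grep -qv '^none$'
}

# Example use on a host (the domain name "server" is taken from this report):
#   virsh dumpxml server | disk_cache_modes
#   blockdev --getss /dev/sdX    # logical sector size; 4096 on a 4k native drive
```

If has_unsafe_cache succeeds for a domain, the migration in step 4 can be expected to fail with the error shown in the logs above.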


Actual results:

Refuses to migrate


Expected results:

The VM migrates, possibly with a warning or alert of some sort. Perhaps call a flush after pausing the VM, just before switching over to the peer?


Additional info:

Using a cman + rgmanager cluster.

Comment 2 Oyvind Albrigtsen 2015-11-30 16:27:14 UTC

Tested and created pull request for working patch: https://github.com/ClusterLabs/resource-agents/pull/707

Comment 3 Oyvind Albrigtsen 2015-11-30 16:31:35 UTC

With the patch applied, set migrate_options="--unsafe" to allow migration when cache != none.

Comment 4 Madison Kelly 2015-11-30 16:36:55 UTC

I will grab a copy of the patched RA and test in the next day or two and report back.

Thanks!

Comment 5 Oyvind Albrigtsen 2016-01-08 09:09:56 UTC

Before:
# rpm -q resource-agents
resource-agents-3.9.5-24.el6_7.1.x86_64

# clusvcadm -M vm:vm1 -m host2
Trying to migrate vm:vm1 to host2...Failed; service running on original owner
# tail -f /var/log/cluster/rgmanager.log
[vm] error: Unsafe migration: Migration may lead to data corruption if disks use cache != none


After:
Add migrate_options="--unsafe" to cluster.conf and reload the configuration.

# rpm -q resource-agents
resource-agents-3.9.5-30.el6.x86_64
# clusvcadm -M vm:vm1 -m host2
Trying to migrate vm:vm1 to host2...Success
# tail -f /var/log/cluster/rgmanager.log
[vm] virsh migrate  --live --unsafe vm1 qemu+ssh://host2/system tcp:host2
Migration of vm:vm1 to host2 completed
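
The logged command line suggests the value of migrate_options is passed straight through to virsh. The function below is only an illustrative sketch of that composition under this assumption. It is not the code from the patched vm.sh agent, and the name build_migrate_cmd is hypothetical.

```shell
# Compose a virsh live-migration command line similar to the one logged above.
#   $1 = domain name, $2 = target host, $3 = extra options (may be empty)
# Assumption: the extra options are inserted right after --live, as in the log.
build_migrate_cmd() {
    printf '%s\n' "virsh migrate --live ${3:+$3 }$1 qemu+ssh://$2/system tcp:$2"
}

# Example:
#   build_migrate_cmd vm1 host2 --unsafe
```

With an empty third argument the sketch produces the same command without --unsafe, which corresponds to the "Before" case where libvirt refuses the migration.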

Comment 7 Madison Kelly 2016-01-22 00:12:59 UTC

This appears to now work. Thank you!

Comment 10 errata-xmlrpc 2016-05-10 19:15:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0735.html