Bug 1285921 - Can't migrate VMs with cache != none, but 'none' doesn't work with 4k native drives
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: resource-agents
Version: 6.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Oyvind Albrigtsen
QA Contact: cluster-qe@redhat.com
Depends On:
Blocks:

Reported: 2015-11-26 19:46 EST by digimer
Modified: 2016-05-10 15:15 EDT
CC List: 4 users

Fixed In Version: resource-agents-3.9.5-30.el6
Doc Type: Bug Fix
Last Closed: 2016-05-10 15:15:33 EDT
Type: Bug

Attachments: None
Description digimer 2015-11-26 19:46:48 EST
Description of problem:

Systems with 4k-native disks are not compatible with 'cache=none' (the install fails because the guest can't see the disk), so cache must be set to 'writeback' or 'writethrough'. However, with either of those set, migration fails:

====
[root@node2 ~]# clusvcadm -M vm:server -m node2.ccrs.bcn
Trying to migrate vm:server to node2.ccrs.bcn...Failed; service running on original owner
====

Logs on host:

====
Nov 27 02:41:06 node1 rgmanager[32465]: Migrating vm:server to node2.ccrs.bcn
Nov 27 02:41:06 node1 rgmanager[684]: [vm] Migrate server to node2.ccrs.bcn failed:
Nov 27 02:41:06 node1 rgmanager[706]: [vm] error: Unsafe migration: Migration may lead to data corruption if disks use cache != none
Nov 27 02:41:06 node1 rgmanager[32465]: migrate on vm "server" returned 150 (unspecified)
Nov 27 02:41:06 node1 rgmanager[32465]: Migration of vm:server to node2.ccrs.bcn failed; return code 150
====
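
For reference, the cache mode in question is the cache attribute on the disk's <driver> element in the guest's libvirt XML; a disk defined roughly like this (device name and LV path are illustrative) triggers the refusal above:

====
<disk type='block' device='disk'>
  <!-- cache='writeback' (or 'writethrough') is what libvirt considers unsafe to migrate -->
  <driver name='qemu' type='raw' cache='writeback'/>
  <source dev='/dev/cluster_vg/server_lv'/>
  <target dev='vda' bus='virtio'/>
</disk>
====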


Version-Release number of selected component (if applicable):

resource-agents-3.9.5-24.el6_7.1.x86_64



How reproducible:

100%


Steps to Reproduce:
1. Build a machine using only 4k native drives
2. Create a VM with a dedicated clustered LV as its storage and verify that 'cache=none' is not supported.
3. Use 'cache=write{back,through}' and confirm the VM now installs (example commands follow this list).
4. Try to migrate the VM.
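
A rough sketch of steps 2-3, with hypothetical VG/LV names and install media:

====
# all names and paths here are illustrative; the relevant part is cache= on --disk
lvcreate -n server_lv -L 40G cluster_vg
virt-install --name server --ram 2048 --vcpus 2 \
    --disk path=/dev/cluster_vg/server_lv,cache=writeback,bus=virtio \
    --cdrom /path/to/rhel.iso --graphics vnc
====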


Actual results:

Refuses to migrate


Expected results:

The VM should migrate, though possibly with a warning or alert of some sort. Perhaps the agent could flush the cache after pausing the VM, prior to handing it over to the peer?


Additional info:

Using a cman + rgmanager cluster.
Comment 2 Oyvind Albrigtsen 2015-11-30 11:27:14 EST
Tested and created pull request for working patch: https://github.com/ClusterLabs/resource-agents/pull/707
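
As a rough sketch of the idea only (the real change is in the pull request above): the agent already builds a virsh migrate command line, and a user-supplied migrate_options string can simply be spliced into it, which is how --unsafe ends up in the command logged in comment 5:

====
# sketch, not the actual patch; variable names are hypothetical
migrate_options="--unsafe"    # value taken from the resource configuration
virsh migrate --live $migrate_options "$vm_name" \
    "qemu+ssh://${target}/system" "tcp:${target}"
====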
Comment 3 Oyvind Albrigtsen 2015-11-30 11:31:35 EST
Set migrate_options="--unsafe" to make migration work when cache != none.
Comment 4 digimer 2015-11-30 11:36:55 EST
I will grab a copy of the patched RA and test in the next day or two and report back.

Thanks!
Comment 5 Oyvind Albrigtsen 2016-01-08 04:09:56 EST
Before:
# rpm -q resource-agents
resource-agents-3.9.5-24.el6_7.1.x86_64

# clusvcadm -M vm:vm1 -m host2
Trying to migrate vm:vm1 to host2...Failed; service running on original owner
# tail -f /var/log/cluster/rgmanager.log
[vm] error: Unsafe migration: Migration may lead to data corruption if disks use cache != none


After:
Add migrate_options="--unsafe" to cluster.conf and reload the configuration.
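
For example, the vm resource entry could look like this (attributes other than migrate_options are illustrative; the VM name matches the commands below):

====
<rm>
  <vm name="vm1" migrate="live" migrate_options="--unsafe" recovery="restart"/>
</rm>
====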

# rpm -q resource-agents
resource-agents-3.9.5-30.el6.x86_64
# clusvcadm -M vm:vm1 -m host2
Trying to migrate vm:vm1 to host2...Success
# tail -f /var/log/cluster/rgmanager.log
[vm] virsh migrate  --live --unsafe vm1 qemu+ssh://host2/system tcp:host2
Migration of vm:vm1 to host2 completed
Comment 7 digimer 2016-01-21 19:12:59 EST
This appears to now work. Thank you!
Comment 10 errata-xmlrpc 2016-05-10 15:15:33 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0735.html
