Bug 1285921
| Summary: | Can't migrate VMs with cache != none, but 'none' doesn't work with 4k native drives | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Madison Kelly <mkelly> |
| Component: | resource-agents | Assignee: | Oyvind Albrigtsen <oalbrigt> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.7 | CC: | agk, cluster-maint, djansa, fdinitto |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | resource-agents-3.9.5-30.el6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-05-10 19:15:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Tested and created a pull request with a working patch: https://github.com/ClusterLabs/resource-agents/pull/707

Set migrate_options="--unsafe" to make migration work where cache != none.

I will grab a copy of the patched RA, test it in the next day or two, and report back. Thanks!

Before:

====
# rpm -q resource-agents
resource-agents-3.9.5-24.el6_7.1.x86_64

# clusvcadm -M vm:vm1 -m host2
Trying to migrate vm:vm1 to host2...Failed; service running on original owner

# tail -f /var/log/cluster/rgmanager.log
[vm] error: Unsafe migration: Migration may lead to data corruption if disks use cache != none
====

After: Add migrate_options="--unsafe" to cluster.conf and reload the configuration (see the sketch after these comments).

====
# rpm -q resource-agents
resource-agents-3.9.5-30.el6.x86_64

# clusvcadm -M vm:vm1 -m host2
Trying to migrate vm:vm1 to host2...Success

# tail -f /var/log/cluster/rgmanager.log
[vm] virsh migrate --live --unsafe vm1 qemu+ssh://host2/system tcp:host2
Migration of vm:vm1 to host2 completed
====

This appears to work now. Thank you!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0735.html
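For reference, a minimal sketch of the cluster.conf change described above, assuming a VM resource named vm1 managed by rgmanager's vm agent. The resource name and attribute placement are illustrative, not taken from the reporter's actual configuration; only migrate_options="--unsafe" is the new parameter added by the patch:

====
<!-- inside the <rm> section of /etc/cluster/cluster.conf -->
<vm name="vm1" migrate="live" migrate_options="--unsafe"/>
====

After incrementing config_version in the <cluster> tag, the change can be propagated to all nodes with cman_tool version -r.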
Description of problem:

Systems with 4k native disks are not compatible with 'cache=none' (the install fails because the guest cannot see the disk), so cache must be set to 'writeback' or 'writethrough'. However, with either of those set, migration fails:

====
[root@node2 ~]# clusvcadm -M vm:server -m node2.ccrs.bcn
Trying to migrate vm:server to node2.ccrs.bcn...Failed; service running on original owner
====

Logs on the host:

====
Nov 27 02:41:06 node1 rgmanager[32465]: Migrating vm:server to node2.ccrs.bcn
Nov 27 02:41:06 node1 rgmanager[684]: [vm] Migrate server to node2.ccrs.bcn failed:
Nov 27 02:41:06 node1 rgmanager[706]: [vm] error: Unsafe migration: Migration may lead to data corruption if disks use cache != none
Nov 27 02:41:06 node1 rgmanager[32465]: migrate on vm "server" returned 150 (unspecified)
Nov 27 02:41:06 node1 rgmanager[32465]: Migration of vm:server to node2.ccrs.bcn failed; return code 150
====

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-24.el6_7.1.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Build a machine using only 4k native drives.
2. Create a VM with a dedicated clustered LV as storage, and verify that 'cache=none' is not supported.
3. Use 'cache=write{back,through}' and confirm that the VM now installs (see the sketch after this description).
4. Try to migrate the VM.

Actual results:
Refuses to migrate.

Expected results:
The VM migrates, though possibly with a warning or alert of some sort. Possibly call a flush after pausing the VM, prior to kicking it over to the peer?

Additional info:
Using a cman + rgmanager cluster.
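For illustration, a sketch of the checks behind steps 1-3. A 4k-native drive reports 4096 for both the logical and physical sector size; the device path /dev/sda here is hypothetical:

====
# blockdev --getss --getpbsz /dev/sda
4096
4096
====

And the corresponding disk stanza of the VM's libvirt definition, where the cache mode is set; the LV path /dev/vg0/server_lv and the target device are hypothetical, not taken from the reporter's setup:

====
<disk type='block' device='disk'>
  <!-- cache='none' fails on the 4k-native backing store, so writethrough (or writeback) is used -->
  <driver name='qemu' type='raw' cache='writethrough'/>
  <source dev='/dev/vg0/server_lv'/>
  <target dev='vda' bus='virtio'/>
</disk>
====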