Bug 1242181
Summary: An option to do unsafe live migration of VirtualDomain
Product: Red Hat Enterprise Linux 7
Reporter: Sergey Urushkin <urusha.v1.0>
Component: resource-agents
Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.1
CC: agk, cluster-maint, mnovacek, urusha.v1.0
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: resource-agents-3.9.5-61.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1286650 (view as bug list)
Environment:
Last Closed: 2016-11-03 23:57:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1286650
Attachments:
Description
Sergey Urushkin
2015-07-11 22:22:00 UTC
Thanks. That patch works for me too, and it's much more flexible.

# pcs resource create VM VirtualDomain config=/etc/libvirt/qemu/vm.xml migration_transport=ssh meta allow-migrate=true

Before:

# rpm -q resource-agents
resource-agents-3.9.5-54.el7_2.6.x86_64
# pcs resource move VM host2
# tail -f /var/log/pacemaker.log
Live migration fails with:
Unsafe migration: Migration may lead to data corruption if disks use cache != none

After:

# rpm -q resource-agents
resource-agents-3.9.5-61.el7.x86_64
# pcs resource update VM migrate_options="--unsafe"
# pcs resource move VM host2
# tail -f /var/log/pacemaker.log
INFO: vmhost: Starting live migration to host2 (using virsh --connect=qemu:///system --quiet migrate --live --unsafe vmhost qemu+ssh://host2/system ).
INFO: vmhost: live migration to host2 succeeded.

I have verified that it is possible to set additional live migration parameters with resource-agents-3.9.5-81.el7.x86_64.

----

Common setup: set up bridged virtual machines that can be live-migrated between the cluster nodes and add them as cluster resources; see (1) below.

# pcs resource show R-pool-10-34-70-94
 Resource: R-pool-10-34-70-94 (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: hypervisor=qemu:///system config=/etc/libvirt/qemu/pool-10-34-70-94.xml \
              migration_transport=ssh migration_downtime=1 \
>>            migrate_options=--unsafe
  Meta Attrs: allow-migrate=true priority=100
  Utilization: cpu=2 hv_memory=1024
  Operations: start interval=0s timeout=120s (R-pool-10-34-70-94-start-interval-0s)
              stop interval=0s timeout=120s (R-pool-10-34-70-94-stop-interval-0s)
              monitor interval=10 timeout=30 (R-pool-10-34-70-94-monitor-interval-10)
              migrate_from interval=0 timeout=120s (R-pool-10-34-70-94-migrate_from-interval-0)
              migrate_to interval=0 timeout=120 (R-pool-10-34-70-94-migrate_to-interval-0)

# pcs resource move R-pool-10-34-70-94
# grep VirtualDomain /var/log/messages | tail
... INFO: pool-10-34-70-94: Starting live migration to kiff-03.cluster-qe.lab.eng.brq.redhat.com (using: virsh --connect=qemu:///system --quiet migrate --live --unsafe pool-10-34-70-94 qemu+ssh://kiff-03.cluster-qe.lab.eng.brq.redhat.com/system ).
... INFO: pool-10-34-70-94: Setting live migration downtime for pool-10-34-70-94 (using: virsh --connect=qemu:///system --quiet migrate-setmaxdowntime pool-10-34-70-94 1).
... INFO: pool-10-34-70-94: live migration to kiff-03.cluster-qe.lab.eng.brq.redhat.com succeeded.
... INFO: Domain pool-10-34-70-94 already stopped
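The INFO lines above show the mechanism: whatever string is supplied in migrate_options is passed through into the virsh migrate command line that the agent runs. As a rough illustration only (this is not the actual VirtualDomain agent source; OCF agents receive their resource parameters as OCF_RESKEY_* environment variables, and the remaining variable names here are made up), the command construction looks roughly like this:

  # Illustrative sketch, not the resource-agents implementation.
  DOMAIN_NAME="vmhost"                                   # libvirt domain name
  TARGET_NODE="host2"                                    # migration target host
  HYPERVISOR="${OCF_RESKEY_hypervisor:-qemu:///system}"  # 'hypervisor' resource parameter
  TRANSPORT="${OCF_RESKEY_migration_transport:-ssh}"     # 'migration_transport' parameter
  MIGRATE_OPTIONS="${OCF_RESKEY_migrate_options}"        # e.g. "--unsafe"

  # Produces the same command shape quoted in the log:
  #   virsh --connect=qemu:///system --quiet migrate --live --unsafe vmhost qemu+ssh://host2/system
  # MIGRATE_OPTIONS is deliberately left unquoted so that several
  # space-separated options split into separate arguments.
  virsh --connect="${HYPERVISOR}" --quiet migrate --live ${MIGRATE_OPTIONS} \
      "${DOMAIN_NAME}" "qemu+${TRANSPORT}://${TARGET_NODE}/system"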
----

>>> (1)

[root@light-02 ~]# pcs config
Cluster Name: STSRHTS19499
Corosync Nodes:
 light-02.cluster-qe.lab.eng.brq.redhat.com kiff-03.cluster-qe.lab.eng.brq.redhat.com
Pacemaker Nodes:
 kiff-03.cluster-qe.lab.eng.brq.redhat.com light-02.cluster-qe.lab.eng.brq.redhat.com

Resources:
 Clone: dlm-clone
  Meta Attrs: interleave=true ordered=true clone-max=2
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s timeout=100 (dlm-stop-interval-0s)
               monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true clone-max=2
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Attributes: with_cmirrord=1
   Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)
               monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
 Clone: shared-group-clone
  Meta Attrs: clone-max=2 interleave=true
  Group: shared-group
   Resource: shared-vg (class=ocf provider=heartbeat type=LVM)
    Attributes: exclusive=false partial_activation=false volgrpname=shared
    Operations: start interval=0s timeout=30 (shared-vg-start-interval-0s)
                stop interval=0s timeout=30 (shared-vg-stop-interval-0s)
                monitor interval=10 timeout=30 (shared-vg-monitor-interval-10)
   Resource: etc-libvirt (class=ocf provider=heartbeat type=Filesystem)
    Attributes: device=/dev/shared/etc0 directory=/etc/libvirt/qemu fstype=gfs2 options=
    Operations: start interval=0s timeout=60 (etc-libvirt-start-interval-0s)
                stop interval=0s timeout=60 (etc-libvirt-stop-interval-0s)
                monitor interval=30s (etc-libvirt-monitor-interval-30s)
   Resource: images (class=ocf provider=heartbeat type=Filesystem)
    Attributes: device=/dev/shared/images0 directory=/var/lib/libvirt/images fstype=gfs2 options=
    Operations: start interval=0s timeout=60 (images-start-interval-0s)
                stop interval=0s timeout=60 (images-stop-interval-0s)
                monitor interval=30s (images-monitor-interval-30s)
 Resource: R-pool-10-34-70-94 (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: hypervisor=qemu:///system config=/etc/libvirt/qemu/pool-10-34-70-94.xml migration_transport=ssh migration_downtime=1 migrate_options=--unsafe
  Meta Attrs: allow-migrate=true priority=100
  Utilization: cpu=2 hv_memory=1024
  Operations: start interval=0s timeout=120s (R-pool-10-34-70-94-start-interval-0s)
              stop interval=0s timeout=120s (R-pool-10-34-70-94-stop-interval-0s)
              monitor interval=10 timeout=30 (R-pool-10-34-70-94-monitor-interval-10)
              migrate_from interval=0 timeout=120s (R-pool-10-34-70-94-migrate_from-interval-0)
              migrate_to interval=0 timeout=120 (R-pool-10-34-70-94-migrate_to-interval-0)

Stonith Devices:
 Resource: fence-light-02 (class=stonith type=fence_ipmilan)
  Attributes: ipaddr=light-02-ilo login=admin passwd=admin pcmk_host_check=static-list pcmk_host_list=light-02.cluster-qe.lab.eng.brq.redhat.com delay=5
  Operations: monitor interval=60s (fence-light-02-monitor-interval-60s)
 Resource: fence-kiff-03 (class=stonith type=fence_ipmilan)
  Attributes: ipaddr=kiff-03-ilo login=admin passwd=admin pcmk_host_check=static-list pcmk_host_list=kiff-03.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-kiff-03-monitor-interval-60s)
Fencing Levels:

Location Constraints:
  Resource: R-pool-10-34-70-94
    Disabled on: light-02.cluster-qe.lab.eng.brq.redhat.com (score:-INFINITY) (role: Started) (id:cli-ban-R-pool-10-34-70-94-on-light-02.cluster-qe.lab.eng.brq.redhat.com)
  Resource: clvmd-clone
    Disabled on: pool-10-34-70-94 (score:-INFINITY) (id:location-clvmd-clone-pool-10-34-70-94--INFINITY)
    Disabled on: pool-10-34-70-95 (score:-INFINITY) (id:location-clvmd-clone-pool-10-34-70-95--INFINITY)
  Resource: dlm-clone
    Disabled on: pool-10-34-70-94 (score:-INFINITY) (id:location-dlm-clone-pool-10-34-70-94--INFINITY)
    Disabled on: pool-10-34-70-95 (score:-INFINITY) (id:location-dlm-clone-pool-10-34-70-95--INFINITY)
  Resource: shared-group-clone
    Enabled on: light-02.cluster-qe.lab.eng.brq.redhat.com (score:INFINITY) (id:location-shared-group-clone-light-02.cluster-qe.lab.eng.brq.redhat.com-INFINITY)
    Enabled on: kiff-03.cluster-qe.lab.eng.brq.redhat.com (score:INFINITY) (id:location-shared-group-clone-kiff-03.cluster-qe.lab.eng.brq.redhat.com-INFINITY)
    Disabled on: pool-10-34-70-94 (score:-INFINITY) (id:location-shared-group-clone-pool-10-34-70-94--INFINITY)
    Disabled on: pool-10-34-70-95 (score:-INFINITY) (id:location-shared-group-clone-pool-10-34-70-95--INFINITY)
Ordering Constraints:
  start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
  start clvmd-clone then start shared-group-clone (kind:Mandatory) (id:order-clvmd-clone-shared-group-clone-mandatory)
  start shared-group-clone then start R-pool-10-34-70-94 (kind:Mandatory) (id:order-shared-group-clone-R-pool-10-34-70-94-mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
  shared-group-clone with clvmd-clone (score:INFINITY) (id:colocation-shared-group-clone-clvmd-clone-INFINITY)
Ticket Constraints:

Alerts:
 Alert: forwarder (path=/usr/tests/sts-rhel7.3/pacemaker/alerts/alert_forwarder.py)
  Recipients:
   Recipient: forwarder-recipient (value=http://virt-009.cluster-qe.lab.eng.brq.redhat.com:37676/)

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: STSRHTS19499
 dc-version: 1.1.15-10.el7-e174ec8
 have-watchdog: false
 last-lrm-refresh: 1473344258
 no-quorum-policy: freeze
 stonith-enabled: true

Quorum:
 Options:

----

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2174.html
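For reference, here is a minimal sketch of pcs commands that could reproduce a VirtualDomain resource like R-pool-10-34-70-94 in the config dump above. The resource name, paths, and the shared-group-clone ordering target are taken from that dump; the storage stack (dlm, clvmd, gfs2 filesystems) is assumed to exist already, and exact pcs syntax can differ between pcs versions, so treat this as an outline rather than a verified recipe.

  # Create the live-migratable VM resource, including the --unsafe migration option.
  pcs resource create R-pool-10-34-70-94 VirtualDomain \
      hypervisor="qemu:///system" \
      config="/etc/libvirt/qemu/pool-10-34-70-94.xml" \
      migration_transport=ssh migration_downtime=1 migrate_options="--unsafe" \
      meta allow-migrate=true priority=100

  # Utilization attributes as shown in the config dump.
  pcs resource utilization R-pool-10-34-70-94 cpu=2 hv_memory=1024

  # Start the VM only after the shared storage stack is up.
  pcs constraint order start shared-group-clone then start R-pool-10-34-70-94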