Bug 1296406
| Summary: | VirtualDomain: add migration_speed and migration_downtime options | | |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Oyvind Albrigtsen <oalbrigt> |
| Component: | resource-agents | Assignee: | Oyvind Albrigtsen <oalbrigt> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.3 | CC: | agk, cfeist, cluster-maint, fdinitto, mnovacek |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | resource-agents-3.9.5-69.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-04 00:00:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description (Oyvind Albrigtsen, 2016-01-07 08:02:36 UTC)
Tested the patch and verified that it's working as expected.

----

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

----

I have verified that the new live migration parameters migration_speed and migration_downtime can be set for VirtualDomain in resource-agents-3.9.5-81.el7.x86_64. See the cluster setup at the bottom. Virtual machine live migration also works correctly without these new parameters.

migration_speed unset, takes about 10 seconds:

```
Sep 8 16:18:04 kiff-03 VirtualDomain(R-pool-10-34-70-94)[19714]: INFO: Issuing graceful shutdown request for domain pool-10-34-70-94.
Sep 8 16:18:59 kiff-03 VirtualDomain(R-pool-10-34-70-94)[20945]: INFO: pool-10-34-70-94: Starting live migration to light-02.cluster-qe.lab.eng.brq.redhat.com (using: virsh --connect=qemu:///system --quiet migrate --live pool-10-34-70-94 qemu+ssh://light-02.cluster-qe.lab.eng.brq.redhat.com/system ).
Sep 8 16:19:09 kiff-03 VirtualDomain(R-pool-10-34-70-94)[20945]: INFO: pool-10-34-70-94: live migration to light-02.cluster-qe.lab.eng.brq.redhat.com succeeded.
Sep 8 16:19:09 kiff-03 VirtualDomain(R-pool-10-34-70-94)[21163]: INFO: Domain pool-10-34-70-94 already stopped.
```

migration_speed=5 (5 MiB/s, the unit used by `virsh migrate-setspeed`), takes about 70 seconds:

```
Sep 8 16:13:37 kiff-03 VirtualDomain(R-pool-10-34-70-94)[15434]: INFO: pool-10-34-70-94: Setting live migration speed limit for pool-10-34-70-94 (using: virsh --connect=qemu:///system --quiet migrate-setspeed pool-10-34-70-94 5).
Sep 8 16:13:37 kiff-03 VirtualDomain(R-pool-10-34-70-94)[15434]: INFO: pool-10-34-70-94: Starting live migration to light-02.cluster-qe.lab.eng.brq.redhat.com (using: virsh --connect=qemu:///system --quiet migrate --live pool-10-34-70-94 qemu+ssh://light-02.cluster-qe.lab.eng.brq.redhat.com/system ).
Sep 8 16:14:47 kiff-03 VirtualDomain(R-pool-10-34-70-94)[15434]: INFO: pool-10-34-70-94: live migration to light-02.cluster-qe.lab.eng.brq.redhat.com succeeded.
```
migration_downtime=1 (1 millisecond):

From https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-KVM_live_migration-Live_KVM_migration_with_virsh.html :

> Will set a maximum tolerable downtime for a domain which is being live-migrated to another host. The specified downtime is in milliseconds. The domain specified must be the same domain that is being migrated.

```
Sep 8 16:38:15 kiff-03 VirtualDomain(R-pool-10-34-70-94)[6392]: INFO: Domain pool-10-34-70-94 already stopped.
Sep 8 16:38:42 kiff-03 VirtualDomain(R-pool-10-34-70-94)[7044]: INFO: pool-10-34-70-94: Starting live migration to light-02.cluster-qe.lab.eng.brq.redhat.com (using: virsh --connect=qemu:///system --quiet migrate --live pool-10-34-70-94 qemu+ssh://light-02.cluster-qe.lab.eng.brq.redhat.com/system ).
Sep 8 16:38:44 kiff-03 VirtualDomain(R-pool-10-34-70-94)[7044]: INFO: pool-10-34-70-94: Setting live migration downtime for pool-10-34-70-94 (using: virsh --connect=qemu:///system --quiet migrate-setmaxdowntime pool-10-34-70-94 1).
Sep 8 16:38:52 kiff-03 VirtualDomain(R-pool-10-34-70-94)[7044]: INFO: pool-10-34-70-94: live migration to light-02.cluster-qe.lab.eng.brq.redhat.com succeeded.
Sep 8 16:38:52 kiff-03 VirtualDomain(R-pool-10-34-70-94)[7253]: INFO: Domain pool-10-34-70-94 already stopped.
```
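For reference, the new options can be applied to an existing VirtualDomain resource with `pcs resource update`. This is a minimal sketch, not taken from the bug report itself; it assumes the resource name from the cluster configuration below and RHEL 7 pcs syntax:

```shell
# Sketch: set the new live-migration options on the VirtualDomain resource.
# migration_speed is a bandwidth limit in MiB/s (passed to virsh migrate-setspeed);
# migration_downtime is a maximum tolerable downtime in milliseconds
# (passed to virsh migrate-setmaxdowntime).
pcs resource update R-pool-10-34-70-94 migration_speed=5
pcs resource update R-pool-10-34-70-94 migration_downtime=1

# Setting an option to an empty value removes it from the resource again.
pcs resource update R-pool-10-34-70-94 migration_speed=
```

Both options are optional; as the verification above shows, live migration also works when they are left unset.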
----

(1) `pcs config`:

```
[root@light-02 ~]# pcs config
Cluster Name: STSRHTS19499
Corosync Nodes:
 light-02.cluster-qe.lab.eng.brq.redhat.com kiff-03.cluster-qe.lab.eng.brq.redhat.com
Pacemaker Nodes:
 kiff-03.cluster-qe.lab.eng.brq.redhat.com light-02.cluster-qe.lab.eng.brq.redhat.com

Resources:
 Clone: dlm-clone
  Meta Attrs: interleave=true ordered=true clone-max=2
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s timeout=100 (dlm-stop-interval-0s)
               monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true clone-max=2
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Attributes: with_cmirrord=1
   Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)
               monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
 Clone: shared-group-clone
  Meta Attrs: clone-max=2 interleave=true
  Group: shared-group
   Resource: shared-vg (class=ocf provider=heartbeat type=LVM)
    Attributes: exclusive=false partial_activation=false volgrpname=shared
    Operations: start interval=0s timeout=30 (shared-vg-start-interval-0s)
                stop interval=0s timeout=30 (shared-vg-stop-interval-0s)
                monitor interval=10 timeout=30 (shared-vg-monitor-interval-10)
   Resource: etc-libvirt (class=ocf provider=heartbeat type=Filesystem)
    Attributes: device=/dev/shared/etc0 directory=/etc/libvirt/qemu fstype=gfs2 options=
    Operations: start interval=0s timeout=60 (etc-libvirt-start-interval-0s)
                stop interval=0s timeout=60 (etc-libvirt-stop-interval-0s)
                monitor interval=30s (etc-libvirt-monitor-interval-30s)
   Resource: images (class=ocf provider=heartbeat type=Filesystem)
    Attributes: device=/dev/shared/images0 directory=/var/lib/libvirt/images fstype=gfs2 options=
    Operations: start interval=0s timeout=60 (images-start-interval-0s)
                stop interval=0s timeout=60 (images-stop-interval-0s)
                monitor interval=30s (images-monitor-interval-30s)
 Resource: R-pool-10-34-70-94 (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: hypervisor=qemu:///system config=/etc/libvirt/qemu/pool-10-34-70-94.xml migration_transport=ssh migration_speed=5
  Meta Attrs: allow-migrate=true priority=100
  Utilization: cpu=2 hv_memory=1024
  Operations: start interval=0s timeout=120s (R-pool-10-34-70-94-start-interval-0s)
              stop interval=0s timeout=120s (R-pool-10-34-70-94-stop-interval-0s)
              monitor interval=10 timeout=30 (R-pool-10-34-70-94-monitor-interval-10)
              migrate_from interval=0 timeout=120s (R-pool-10-34-70-94-migrate_from-interval-0)
              migrate_to interval=0 timeout=120 (R-pool-10-34-70-94-migrate_to-interval-0)

Stonith Devices:
 Resource: fence-light-02 (class=stonith type=fence_ipmilan)
  Attributes: ipaddr=light-02-ilo login=admin passwd=admin pcmk_host_check=static-list pcmk_host_list=light-02.cluster-qe.lab.eng.brq.redhat.com delay=5
  Operations: monitor interval=60s (fence-light-02-monitor-interval-60s)
 Resource: fence-kiff-03 (class=stonith type=fence_ipmilan)
  Attributes: ipaddr=kiff-03-ilo login=admin passwd=admin pcmk_host_check=static-list pcmk_host_list=kiff-03.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-kiff-03-monitor-interval-60s)
Fencing Levels:

Location Constraints:
  Resource: R-pool-10-34-70-94
    Disabled on: kiff-03.cluster-qe.lab.eng.brq.redhat.com (score:-INFINITY) (role: Started) (id:cli-ban-R-pool-10-34-70-94-on-kiff-03.cluster-qe.lab.eng.brq.redhat.com)
  Resource: clvmd-clone
    Disabled on: pool-10-34-70-94 (score:-INFINITY) (id:location-clvmd-clone-pool-10-34-70-94--INFINITY)
    Disabled on: pool-10-34-70-95 (score:-INFINITY) (id:location-clvmd-clone-pool-10-34-70-95--INFINITY)
  Resource: dlm-clone
    Disabled on: pool-10-34-70-94 (score:-INFINITY) (id:location-dlm-clone-pool-10-34-70-94--INFINITY)
    Disabled on: pool-10-34-70-95 (score:-INFINITY) (id:location-dlm-clone-pool-10-34-70-95--INFINITY)
  Resource: shared-group-clone
    Enabled on: light-02.cluster-qe.lab.eng.brq.redhat.com (score:INFINITY) (id:location-shared-group-clone-light-02.cluster-qe.lab.eng.brq.redhat.com-INFINITY)
    Enabled on: kiff-03.cluster-qe.lab.eng.brq.redhat.com (score:INFINITY) (id:location-shared-group-clone-kiff-03.cluster-qe.lab.eng.brq.redhat.com-INFINITY)
    Disabled on: pool-10-34-70-94 (score:-INFINITY) (id:location-shared-group-clone-pool-10-34-70-94--INFINITY)
    Disabled on: pool-10-34-70-95 (score:-INFINITY) (id:location-shared-group-clone-pool-10-34-70-95--INFINITY)
Ordering Constraints:
  start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
  start clvmd-clone then start shared-group-clone (kind:Mandatory) (id:order-clvmd-clone-shared-group-clone-mandatory)
  start shared-group-clone then start R-pool-10-34-70-94 (kind:Mandatory) (id:order-shared-group-clone-R-pool-10-34-70-94-mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
  shared-group-clone with clvmd-clone (score:INFINITY) (id:colocation-shared-group-clone-clvmd-clone-INFINITY)
Ticket Constraints:

Alerts:
 Alert: forwarder (path=/usr/tests/sts-rhel7.3/pacemaker/alerts/alert_forwarder.py)
  Recipients:
   Recipient: forwarder-recipient (value=http://virt-009.cluster-qe.lab.eng.brq.redhat.com:37676/)

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: STSRHTS19499
 dc-version: 1.1.15-10.el7-e174ec8
 have-watchdog: false
 last-lrm-refresh: 1473338656
 no-quorum-policy: freeze
 stonith-enabled: true

Quorum:
  Options:
```

(2) `pcs status`:

```
[root@light-02 ~]# pcs status
Cluster name: STSRHTS19499
Stack: corosync
Current DC: light-02.cluster-qe.lab.eng.brq.redhat.com (version 1.1.15-10.el7-e174ec8) - partition with quorum
Last updated: Thu Sep 8 16:14:18 2016
Last change: Thu Sep 8 16:13:37 2016 by root via crm_resource on kiff-03.cluster-qe.lab.eng.brq.redhat.com

2 nodes and 13 resources configured

Online: [ kiff-03.cluster-qe.lab.eng.brq.redhat.com light-02.cluster-qe.lab.eng.brq.redhat.com ]

Full list of resources:

 R-pool-10-34-70-94	(ocf::heartbeat:VirtualDomain):	Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
 fence-light-02	(stonith:fence_ipmilan):	Started kiff-03.cluster-qe.lab.eng.brq.redhat.com
 fence-kiff-03	(stonith:fence_ipmilan):	Started light-02.cluster-qe.lab.eng.brq.redhat.com
 Clone Set: dlm-clone [dlm]
     Started: [ kiff-03.cluster-qe.lab.eng.brq.redhat.com light-02.cluster-qe.lab.eng.brq.redhat.com ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ kiff-03.cluster-qe.lab.eng.brq.redhat.com light-02.cluster-qe.lab.eng.brq.redhat.com ]
 Clone Set: shared-group-clone [shared-group]
     Started: [ kiff-03.cluster-qe.lab.eng.brq.redhat.com light-02.cluster-qe.lab.eng.brq.redhat.com ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
```

----

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2174.html
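A quick way to confirm that the installed resource-agents build exposes the new parameters is to query the agent metadata. This is a sketch, assuming the RHEL 7 pcs command set; the grep pattern is illustrative:

```shell
# Sketch: check that the installed VirtualDomain agent advertises the new
# live-migration parameters (requires resource-agents >= 3.9.5-69.el7).
pcs resource describe ocf:heartbeat:VirtualDomain | grep -E 'migration_(speed|downtime)'
```

If the two parameter names appear in the output, the fixed package is in place.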