Bug 1706239 - [OSP15] cannot migrate instance using live migration
Summary: [OSP15] cannot migrate instance using live migration
Keywords:
Status: CLOSED DUPLICATE of bug 1722041
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 15.0 (Stein)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Assaf Muller
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-05-03 21:09 UTC by Artem Hrechanychenko
Modified: 2020-12-21 19:21 UTC
CC List: 20 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-24 14:35:38 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Bugzilla 1716335 (Priority: high, Status: CLOSED) - [OSP16] Live migration time out when using live_migration_wait_for_vif_plug=true with OVN along the patch from 1563110 - Last Updated: 2021-02-22 00:41:40 UTC

Description Artem Hrechanychenko 2019-05-03 21:09:22 UTC
Description of problem:
Deployed OSP15 with 3 controllers, 3 Ceph nodes, and 1 compute node, then scaled up to 2 compute nodes.



(overcloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks                               |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------+
| 0d5924ca-4f2f-413e-b0b3-1497632fac4e | after_deploy | ACTIVE | -          | Running     | tenantgeneve=192.168.32.27, 10.0.0.212 |
| 711ccf0c-43a3-42f1-aa00-983251bd2f47 | after_reboot | ACTIVE | -          | Running     | tenantgeneve=192.168.32.22, 10.0.0.187 |
+--------------------------------------+--------------+--------+------------+-------------+----------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ nova show after_deploy |grep hyp | awk '{ print $4 }'
compute-0.localdomain


nova live-migration after_deploy compute-1.localdomain

| 0d5924ca-4f2f-413e-b0b3-1497632fac4e | after_deploy | MIGRATING | migrating  | Running     | tenantgeneve=192.168.32.27, 10.0.0.212 |
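
To follow the migration from the undercloud, a minimal sketch using standard nova CLI commands (not part of the original report):

(overcloud) [stack@undercloud-0 ~]$ nova migration-list | grep 0d5924ca          # admin-only; shows the migration record and its status
(overcloud) [stack@undercloud-0 ~]$ nova show after_deploy | grep -E 'status|OS-EXT-SRV-ATTR:hypervisor_hostname'   # hypervisor should change once the migration completes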


From the nova-scheduler logs:
Attempting to claim resources in the placement API for instance 0d5924ca-4f2f-413e-b0b3-1497632fac4e claim_resources /usr/lib/python3.6/site-packages/nova/scheduler/utils.py:1011
2019-05-03 20:41:16.455 19 DEBUG oslo_service.periodic_task [req-38bfe2c0-97d7-4cb8-99e6-ad28c4016712 - - - - -] Running periodic task SchedulerManager._run_periodic_tasks run_periodic_tasks /usr/lib/python3.6/site-packages/oslo_service/periodic_task.py:217
2019-05-03 20:41:17.321 22 DEBUG nova.scheduler.filter_scheduler [req-b305db6d-8b34-4e94-aa88-682be3d8b107 d33a8b2c51784509bef385fca72d45b2 180947ff74a14fb892934e1fdbe8f5fc - default default] [instance: 0d5924ca-4f2f-413e-b0b3-1497632fac4e] Selected host: (compute-1.localdomain, compute-1.localdomain) ram: 1188MB disk: 26624MB io_ops: 0 instances: 1 _consume_selected_host /usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py:354
2019-05-03 20:41:17.322 22 DEBUG oslo_concurrency.lockutils [req-b305db6d-8b34-4e94-aa88-682be3d8b107 d33a8b2c51784509bef385fca72d45b2 180947ff74a14fb892934e1fdbe8f5fc - default default] Lock "('compute-1.localdomain', 'compute-1.localdomain')" acquired by "nova.scheduler.host_manager.HostState.consume_from_request.<locals>._locked" :: waited 0.000s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:327
2019-05-03 20:41:17.326 22 DEBUG oslo_concurrency.lockutils [req-b305db6d-8b34-4e94-aa88-682be3d8b107 d33a8b2c51784509bef385fca72d45b2 180947ff74a14fb892934e1fdbe8f5fc - default default] Lock "('compute-1.localdomain', 'compute-1.localdomain')" released by "nova.scheduler.host_manager.HostState.consume_from_request.<locals>._locked" :: held 0.003s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:33


2019-05-03 20:58:02.548 7 DEBUG nova.virt.libvirt.driver [req-e6b5c349-c8f3-4054-9e53-3ce64e49e36b - - - - -] skipping disk for instance-00000009 as it does not have a path _get_instance_disk_info_from_config /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:8499
2019-05-03 20:58:02.575 7 DEBUG nova.compute.resource_tracker [req-e6b5c349-c8f3-4054-9e53-3ce64e49e36b - - - - -] Hypervisor/Node resource view: name=compute-1.localdomain free_ram=5796MB free_disk=26.8125GB free_vcpus=3 pci_devices=[{"dev_id": "pci_0000_00_06_0", "address": "0000:00:06.0", "product_id": "2934", "vendor_id": "8086", "numa_node": null, "label": "label_8086_2934", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_02_0", "address": "0000:00:02.0", "product_id": "00b8", "vendor_id": "1013", "numa_node": null, "label": "label_1013_00b8", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_04_0", "address": "0000:00:04.0", "product_id": "1000", "vendor_id": "1af4", "numa_node": null, "label": "label_1af4_1000", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_03_0", "address": "0000:00:03.0", "product_id": "1000", "vendor_id": "1af4", "numa_node": null, "label": "label_1af4_1000", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_09_0", "address": "0000:00:09.0", "product_id": "1002", "vendor_id": "1af4", "numa_node": null, "label": "label_1af4_1002", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_00_0", "address": "0000:00:00.0", "product_id": "1237", "vendor_id": "8086", "numa_node": null, "label": "label_8086_1237", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_06_7", "address": "0000:00:06.7", "product_id": "293a", "vendor_id": "8086", "numa_node": null, "label": "label_8086_293a", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_06_1", "address": "0000:00:06.1", "product_id": "2935", "vendor_id": "8086", "numa_node": null, "label": "label_8086_2935", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_05_0", "address": "0000:00:05.0", "product_id": "1000", "vendor_id": "1af4", "numa_node": null, "label": "label_1af4_1000", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_07_0", "address": "0000:00:07.0", "product_id": "1003", "vendor_id": "1af4", "numa_node": null, "label": "label_1af4_1003", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_01_3", "address": "0000:00:01.3", "product_id": "7113", "vendor_id": "8086", "numa_node": null, "label": "label_8086_7113", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_0a_0", "address": "0000:00:0a.0", "product_id": "1005", "vendor_id": "1af4", "numa_node": null, "label": "label_1af4_1005", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_01_0", "address": "0000:00:01.0", "product_id": "7000", "vendor_id": "8086", "numa_node": null, "label": "label_8086_7000", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_01_1", "address": "0000:00:01.1", "product_id": "7010", "vendor_id": "8086", "numa_node": null, "label": "label_8086_7010", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_08_0", "address": "0000:00:08.0", "product_id": "1001", "vendor_id": "1af4", "numa_node": null, "label": "label_1af4_1001", "dev_type": "type-PCI"}, {"dev_id": "pci_0000_00_06_2", "address": "0000:00:06.2", "product_id": "2936", "vendor_id": "8086", "numa_node": null, "label": "label_8086_2936", "dev_type": "type-PCI"}] _report_hypervisor_resource_view /usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py:870
2019-05-03 20:58:02.576 7 DEBUG oslo_concurrency.lockutils [req-e6b5c349-c8f3-4054-9e53-3ce64e49e36b - - - - -] Lock "compute_resources" acquired by "nova.compute.resource_tracker.ResourceTracker._update_available_resource" :: waited 0.000s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:327
2019-05-03 20:58:02.635 7 DEBUG nova.compute.resource_tracker [req-e6b5c349-c8f3-4054-9e53-3ce64e49e36b - - - - -] Migration for instance 0d5924ca-4f2f-413e-b0b3-1497632fac4e refers to another host's instance! _pair_instances_to_migrations /usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py:759
2019-05-03 20:58:02.687 7 WARNING nova.compute.resource_tracker [req-e6b5c349-c8f3-4054-9e53-3ce64e49e36b - - - - -] [instance: 0d5924ca-4f2f-413e-b0b3-1497632fac4e] Instance not resizing, skipping migration.

Version-Release number of selected component (if applicable):
OSP15 release RHOS_TRUNK-15.0-RHEL-8-20190423.n.1

How reproducible:


Steps to Reproduce:
1. Deploy OSP15 with 3 controllers + 3 Ceph nodes + 1 compute node
2. Deploy an instance
3. Scale up to a second compute node
4. Try to live-migrate the instance to the newly provisioned compute node (see the CLI sketch after this list)
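
A minimal CLI sketch for steps 2 and 4 (flavor and image names are placeholders; the network name is taken from this report; steps 1 and 3 are the usual openstack overcloud deploy runs):

(overcloud) [stack@undercloud-0 ~]$ openstack server create --flavor m1.small --image cirros --network tenantgeneve after_deploy
(overcloud) [stack@undercloud-0 ~]$ nova live-migration after_deploy compute-1.localdomain
(overcloud) [stack@undercloud-0 ~]$ nova show after_deploy | grep hyp            # should report compute-1.localdomain once the migration completes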

Actual results:
Live migration does not complete; the instance is not moved to the new compute node.

Expected results:
The instance is live-migrated to the newly provisioned compute host.

Additional info:

All logs are located here: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DF%20Current%20release/job/DFG-df-deployment-15-virthost-3cont_1comp_3ceph-yes_UC_SSL-no_OC_SSL-scalecompute-ceph-ipv4-geneve-RHELOSP-31842/9/artifact/

Comment 2 Martin Schuppert 2019-05-03 21:17:37 UTC
This is likely a duplicate of [1] where we have provided a 2nd patch to run the discovery on every run.

Installed: openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.noarch
Fix is in: openstack-tripleo-heat-templates-10.5.1-0.20190429000408.3415df5.el8ost

Can we verify with the above version?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1698630
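
A quick way to confirm the installed template version on the undercloud (a sketch, not from the original comment):

[stack@undercloud-0 ~]$ rpm -q openstack-tripleo-heat-templates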

Comment 3 Artem Hrechanychenko 2019-05-03 21:25:16 UTC
installed  openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.noarch.rpm

Comment 4 Martin Schuppert 2019-05-03 21:39:39 UTC
(In reply to Artem Hrechanychenko from comment #3)
> installed 
> openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.
> noarch.rpm

we'd need openstack-tripleo-heat-templates-10.5.1-0.20190429000408.3415df5.el8ost

Comment 5 Artem Hrechanychenko 2019-05-03 21:41:55 UTC
(In reply to Martin Schuppert from comment #4)
> (In reply to Artem Hrechanychenko from comment #3)
> > installed 
> > openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.
> > noarch.rpm
> 
> we'd need
> openstack-tripleo-heat-templates-10.5.1-0.20190429000408.3415df5.el8ost

That is what we have in passed_phase1: http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/RHOS_TRUNK-15.0-RHEL-8-20190423.n.1/compose/OpenStack/x86_64/os/Packages/

Comment 6 Martin Schuppert 2019-05-03 21:45:21 UTC
(In reply to Artem Hrechanychenko from comment #5)
> (In reply to Martin Schuppert from comment #4)
> > (In reply to Artem Hrechanychenko from comment #3)
> > > installed 
> > > openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.
> > > noarch.rpm
> > 
> > we'd need
> > openstack-tripleo-heat-templates-10.5.1-0.20190429000408.3415df5.el8ost
> 
> that what we have in passed_phase1
> http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/
> RHOS_TRUNK-15.0-RHEL-8-20190423.n.1/compose/OpenStack/x86_64/os/Packages/

Then we'd need to wait for a new puddle containing the package with the fix.

Comment 7 Martin Schuppert 2019-05-09 08:21:08 UTC
(In reply to Artem Hrechanychenko from comment #5)
> (In reply to Martin Schuppert from comment #4)
> > (In reply to Artem Hrechanychenko from comment #3)
> > > installed 
> > > openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.
> > > noarch.rpm
> > 
> > we'd need
> > openstack-tripleo-heat-templates-10.5.1-0.20190429000408.3415df5.el8ost
> 
> that what we have in passed_phase1
> http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/
> RHOS_TRUNK-15.0-RHEL-8-20190423.n.1/compose/OpenStack/x86_64/os/Packages/

Can you verify with [1] which has openstack-tripleo-heat-templates-10.5.1-0.20190509000437.b674002.el8ost.noarch.rpm  ?

[1] http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/RHOS_TRUNK-15.0-RHEL-8-20190509.n.0/compose/OpenStack/x86_64/os/Packages/

Comment 8 Artem Hrechanychenko 2019-05-13 09:42:50 UTC
(In reply to Martin Schuppert from comment #7)
> (In reply to Artem Hrechanychenko from comment #5)
> > (In reply to Martin Schuppert from comment #4)
> > > (In reply to Artem Hrechanychenko from comment #3)
> > > > installed 
> > > > openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.
> > > > noarch.rpm
> > > 
> > > we'd need
> > > openstack-tripleo-heat-templates-10.5.1-0.20190429000408.3415df5.el8ost
> > 
> > that what we have in passed_phase1
> > http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/
> > RHOS_TRUNK-15.0-RHEL-8-20190423.n.1/compose/OpenStack/x86_64/os/Packages/
> 
> Can you verify with [1] which has
> openstack-tripleo-heat-templates-10.5.1-0.20190509000437.b674002.el8ost.
> noarch.rpm  ?
> 
> [1]
> http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/
> RHOS_TRUNK-15.0-RHEL-8-20190509.n.0/compose/OpenStack/x86_64/os/Packages/

http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/RHOS_TRUNK-15.0-RHEL-8-20190423.n.1/compose/OpenStack/x86_64/os/Packages/openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.noarch.rpm

Comment 9 Artem Hrechanychenko 2019-05-13 10:05:44 UTC
(In reply to Martin Schuppert from comment #7)
> (In reply to Artem Hrechanychenko from comment #5)
> > (In reply to Martin Schuppert from comment #4)
> > > (In reply to Artem Hrechanychenko from comment #3)
> > > > installed 
> > > > openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.
> > > > noarch.rpm
> > > 
> > > we'd need
> > > openstack-tripleo-heat-templates-10.5.1-0.20190429000408.3415df5.el8ost
> > 
> > that what we have in passed_phase1
> > http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/
> > RHOS_TRUNK-15.0-RHEL-8-20190423.n.1/compose/OpenStack/x86_64/os/Packages/
> 
> Can you verify with [1] which has
> openstack-tripleo-heat-templates-10.5.1-0.20190509000437.b674002.el8ost.
> noarch.rpm  ?
> 
> [1]
> http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/15.0-RHEL-8/
> RHOS_TRUNK-15.0-RHEL-8-20190509.n.0/compose/OpenStack/x86_64/os/Packages/

Ok, I'll check using that compose

Comment 11 Artom Lifshitz 2019-05-17 14:25:40 UTC
(In reply to Artem Hrechanychenko from comment #10)
> Checked using RHOS_TRUNK-15.0-RHEL-8-20190509.n.1
> failed
> https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/
> DF%20Current%20release/job/DFG-df-deployment-15-virthost-3cont_1comp_3ceph-
> yes_UC_SSL-no_OC_SSL-scalecompute-ceph-ipv4-geneve-RHELOSP-31842/13/artifact/

This is a CI run, which specific failed test should we be looking at?

Comment 12 Artem Hrechanychenko 2019-05-18 12:42:41 UTC
(In reply to Artom Lifshitz from comment #11)
> (In reply to Artem Hrechanychenko from comment #10)
> > Checked using RHOS_TRUNK-15.0-RHEL-8-20190509.n.1
> > failed
> > https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/
> > DF%20Current%20release/job/DFG-df-deployment-15-virthost-3cont_1comp_3ceph-
> > yes_UC_SSL-no_OC_SSL-scalecompute-ceph-ipv4-geneve-RHELOSP-31842/13/artifact/
> 
> This is a CI run, which specific failed test should we be looking at?

http://pastebin.test.redhat.com/764343

Comment 21 Maciej Józefczyk 2019-06-24 14:05:27 UTC
@Slawek, yes, it is the same.
The explanation of why this happens is in https://bugzilla.redhat.com/show_bug.cgi?id=1716335. It will not work in OSP13, OSP14, or OSP15.
The workaround is to set live_migration_wait_for_vif_plug=False, as @Martin mentioned (a sketch of the setting follows below).
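
A minimal sketch of the workaround, assuming the option is applied in nova.conf on the compute host(s); the exact file path and restart step depend on how the node was deployed and are assumptions here, not taken from this bug:

# nova.conf on the compute host(s) -- in containerized TripleO deployments this typically
# lives under /var/lib/config-data/puppet-generated/; the path may vary
[compute]
live_migration_wait_for_vif_plug = False

# then restart the nova-compute service/container on that host so the new value is picked up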

