Bug 1409356 - unable to unshelve the instance with sriov ports
Summary: unable to unshelve the instance with sriov ports
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 10.0 (Newton)
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: OSP DFG:Compute
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks: 1413010 1414965
Reported: 2017-01-01 09:22 UTC by Pratik Pravin Bandarkar
Modified: 2023-03-21 18:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-16 14:28:11 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2983771 0 None None None 2017-03-28 11:23:22 UTC

Description Pratik Pravin Bandarkar 2017-01-01 09:22:00 UTC
Description of problem:

If you spawn an instance with SR-IOV ports, shelve it, and then unshelve it, the unshelve operation fails with the error below:
<snip>
2017-01-01 08:04:27.491 981128 DEBUG oslo_messaging._drivers.amqpdriver [req-77629513-ef24-4ed8-8d4b-9b8ade6eb23d b8653655548a4f44a874d1e12682801e 334f810b327a4205b8742533cba5e1bd - - -] CAST unique_id: d1f61d7339f349278fb65667009292db NOTIFY exchange 'nova' topic 'versioned_notifications.error' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:432
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server [req-77629513-ef24-4ed8-8d4b-9b8ade6eb23d b8653655548a4f44a874d1e12682801e 334f810b327a4205b8742533cba5e1bd - - -] Exception during message handling
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     function_name, call_dict, binary)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.force_reraise()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 188, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     LOG.warning(msg, e, instance=instance)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.force_reraise()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 157, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/utils.py", line 613, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     kwargs['instance'], e, sys.exc_info())
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.force_reraise()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 204, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4308, in unshelve_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     do_unshelve_instance()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4307, in do_unshelve_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     filter_properties, node)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4354, in _unshelve_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.host)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2319, in setup_instance_network_on_host
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self._update_port_binding_for_instance(context, instance, host)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2407, in _update_port_binding_for_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     pci_slot)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server PortUpdateFailed: Port update failed for port cd05b3cd-e88f-43f1-a259-02d95d678dc0: Unable to correlate PCI slot 0000:08:12.7
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server 
</snip>


Version-Release number of selected component (if applicable):
RHOS10

How reproducible:
100%

Steps to Reproduce:
1. Spawn a new instance with an SR-IOV port.
2. Shelve the instance.
3. Try to unshelve it. The operation fails with PortUpdateFailed.

<snip>
[stack@ibm-x3630m4-5 ~]$ y=0; for i in $(neutron port-list |grep -i sr- |awk {'print $2'}); do ((y++)) ;nova boot --image RHEL7.2 --flavor medium --nic port-id=$i --availability-zone prod-az pbandark-$y ; done
[stack@ibm-x3630m4-5 ~]$ nova list |awk {'print $2'} |egrep -v '^$|ID' |xargs -i nova shelve {}
[stack@ibm-x3630m4-5 ~]$ nova list
+--------------------------------------+------------+-------------------+------------+-------------+----------------------------------+
| ID                                   | Name       | Status            | Task State | Power State | Networks                         |
+--------------------------------------+------------+-------------------+------------+-------------+----------------------------------+
| 8ff065ce-e49a-43db-a8f2-7bccb1b7f2d1 | pbandark-1 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.181 |
| 58388463-e10a-4ce0-bbb2-594b485db77d | pbandark-2 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.180 |
| e32dab54-b1ae-4a46-b393-f7db7fcc267e | pbandark-3 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.183 |
| c558a079-0451-4d95-b3c8-d470525e6274 | pbandark-4 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.182 |
+--------------------------------------+------------+-------------------+------------+-------------+----------------------------------+
[stack@ibm-x3630m4-5 ~]$ glance image-list
+--------------------------------------+--------------------+
| ID                                   | Name               |
+--------------------------------------+--------------------+
| c7840f05-8d0b-4e33-a1c6-338e8f21cb24 | pbandark-1-shelved |
| a5ce6082-d23a-4846-b0f5-ff03f2e2a4ac | pbandark-2-shelved |
| 06eacd81-774f-4b5b-8e89-7679259fc95c | pbandark-3-shelved |
| 0ce2b5ac-c657-4ac8-91aa-66e79ffb86fa | pbandark-4-shelved |
| 33f7f49d-8dfa-45dd-a82f-a1d7394a06f9 | RHEL7.2            |
+--------------------------------------+--------------------+

nova list |awk {'print $2'} |egrep -v '^$|ID' |xargs -i nova unshelve {}

2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4308, in unshelve_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     do_unshelve_instance()
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4307, in do_unshelve_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     filter_properties, node)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4354, in _unshelve_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     self.host)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2319, in setup_instance_network_on_host
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     self._update_port_binding_for_instance(context, instance, host)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2407, in _update_port_binding_for_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     pci_slot)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server PortUpdateFailed: Port update failed for port fd4cd2ca-2082-49ea-864a-04340a8662ea: Unable to correlate PCI slot 0000:08:10.7
</snip>


Actual results:
Unable to unshelve the instance.

Expected results:
The unshelve operation should succeed.

Additional info:

Comment 2 Sahid Ferdjaoui 2017-01-06 09:50:45 UTC
The instances are removed from their hosts and their resources are freed during an 'offload'. When Nova then unshelves instances that are in the offloaded state, the VMs can be scheduled on other hosts.

We do not create a migration context or claims for new PCI devices, so we can't update the port bindings.

The issue also occurs in master.
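
The failure mode described in this comment can be sketched roughly as follows. This is a simplified illustration, not Nova's actual code: `build_pci_mapping`, `update_port_binding`, the dictionary layout, the local `PortUpdateFailed` class, and the destination address `0000:05:10.3` are all hypothetical. Only the error text, the port ID, and the source PCI slot are taken from the traceback in the description.

```python
# Illustrative sketch only: simplified stand-ins, not Nova's real code.


class PortUpdateFailed(Exception):
    """Hypothetical stand-in for the exception seen in the traceback."""

    def __init__(self, port_id, reason):
        super().__init__(
            "Port update failed for port %s: %s" % (port_id, reason))


def build_pci_mapping(migration_context):
    """Map old PCI addresses to newly claimed ones.

    With no migration context (the shelve-offload case) there is
    nothing to map, so the result is an empty dict.
    """
    if migration_context is None:
        return {}
    return dict(zip(migration_context["old_pci_slots"],
                    migration_context["new_pci_slots"]))


def update_port_binding(port, migration_context):
    """Return the port's new PCI slot, or raise if it cannot be correlated."""
    old_slot = port["binding_profile"]["pci_slot"]
    new_slot = build_pci_mapping(migration_context).get(old_slot)
    if new_slot is None:
        raise PortUpdateFailed(
            port["id"], "Unable to correlate PCI slot %s" % old_slot)
    return new_slot


port = {"id": "cd05b3cd-e88f-43f1-a259-02d95d678dc0",
        "binding_profile": {"pci_slot": "0000:08:12.7"}}

# Unshelve after offload: no migration context, so the lookup fails.
try:
    update_port_binding(port, None)
except PortUpdateFailed as exc:
    error_message = str(exc)

# With a (hypothetical) claim on the destination host, the slot is remapped.
ctx = {"old_pci_slots": ["0000:08:12.7"],
       "new_pci_slots": ["0000:05:10.3"]}
remapped = update_port_binding(port, ctx)
```

In this sketch the only difference between the failing and the working call is whether a mapping from the old PCI address to a newly claimed one exists, which is the gap the comment points at.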

Comment 8 Stephen Finucane 2018-08-30 16:22:53 UTC
OSP 11 is EOL and I have been unable to reproduce this on OSP 13. Is this still an issue or can I mark this as closed?

Comment 9 Jaison Raju 2018-08-30 18:57:58 UTC
(In reply to Stephen Finucane from comment #8)
> OSP 11 is EOL and I have been unable to reproduce this on OSP 13. Is this
> still an issue or can I mark this as closed?

I am not sure whether this is still reproducible in 13.
I can test and report back in a week's time.
Leaving the needinfo set until I test and confirm this.

Comment 10 Stephen Finucane 2018-10-02 14:16:03 UTC
Any updates on this, Jaison?

Comment 11 Stephen Finucane 2018-10-16 14:28:11 UTC
I'm going to close this as this no longer appears to be an issue.

