
Bug 1409356

Summary: unable to unshelve the instance with sriov ports
Product: Red Hat OpenStack
Reporter: Pratik Pravin Bandarkar <pbandark>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED WORKSFORME
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Priority: high
Version: 10.0 (Newton)
CC: amuller, berrange, chrisw, dasmith, eglynn, jraju, kchamart, lyarwood, mbooth, nyechiel, sbauza, sferdjao, sgordon, srevivo, stephenfin, vromanso
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: All
OS: Linux
Doc Type: If docs needed, set a value
Last Closed: 2018-10-16 14:28:11 UTC
Type: Bug
Bug Blocks: 1413010, 1414965

Description Pratik Pravin Bandarkar 2017-01-01 09:22:00 UTC
Description of problem:

If you spawn an instance with SR-IOV ports, the unshelve operation fails with the error below:
<snip>
2017-01-01 08:04:27.491 981128 DEBUG oslo_messaging._drivers.amqpdriver [req-77629513-ef24-4ed8-8d4b-9b8ade6eb23d b8653655548a4f44a874d1e12682801e 334f810b327a4205b8742533cba5e1bd - - -] CAST unique_id: d1f61d7339f349278fb65667009292db NOTIFY exchange 'nova' topic 'versioned_notifications.error' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:432
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server [req-77629513-ef24-4ed8-8d4b-9b8ade6eb23d b8653655548a4f44a874d1e12682801e 334f810b327a4205b8742533cba5e1bd - - -] Exception during message handling
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     function_name, call_dict, binary)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.force_reraise()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 188, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     LOG.warning(msg, e, instance=instance)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.force_reraise()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 157, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/utils.py", line 613, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     kwargs['instance'], e, sys.exc_info())
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.force_reraise()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 204, in decorated_function
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4308, in unshelve_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     do_unshelve_instance()
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4307, in do_unshelve_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     filter_properties, node)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4354, in _unshelve_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self.host)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2319, in setup_instance_network_on_host
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     self._update_port_binding_for_instance(context, instance, host)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2407, in _update_port_binding_for_instance
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server     pci_slot)
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server PortUpdateFailed: Port update failed for port cd05b3cd-e88f-43f1-a259-02d95d678dc0: Unable to correlate PCI slot 0000:08:12.7
2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server 
</snip>
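
For triage, the last line of the traceback above is the useful one: it names the port and the PCI slot that could not be correlated. A minimal sketch for pulling those two values out of a nova-compute log is shown here. The regular expression is derived only from the message format visible in this report, and the function name `find_uncorrelated_slots` is hypothetical.

```python
import re

# Pattern built from the final traceback line quoted in this report.
# It is an assumption that other occurrences use the same wording.
PORT_UPDATE_FAILED = re.compile(
    r"PortUpdateFailed: Port update failed for port "
    r"(?P<port_id>[0-9a-f-]{36}): "
    r"Unable to correlate PCI slot (?P<pci_slot>[0-9a-fA-F:.]+)"
)


def find_uncorrelated_slots(lines):
    """Return (port_id, pci_slot) pairs for every matching log line."""
    results = []
    for line in lines:
        match = PORT_UPDATE_FAILED.search(line)
        if match:
            results.append((match.group("port_id"), match.group("pci_slot")))
    return results


if __name__ == "__main__":
    sample = [
        "2017-01-01 08:04:27.495 981128 ERROR oslo_messaging.rpc.server "
        "PortUpdateFailed: Port update failed for port "
        "cd05b3cd-e88f-43f1-a259-02d95d678dc0: "
        "Unable to correlate PCI slot 0000:08:12.7",
    ]
    for port_id, pci_slot in find_uncorrelated_slots(sample):
        print(port_id, pci_slot)
```

Feeding the function the lines of a compute log (for example from `open(path)`) would list each affected port next to the stale PCI address recorded in its binding profile.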


Version-Release number of selected component (if applicable):
RHOS10

How reproducible:
100%

Steps to Reproduce:
1. Spawn a new instance with an SR-IOV port.
2. Shelve the instance.
3. Try to unshelve it. The operation fails.

<snip>
[stack@ibm-x3630m4-5 ~]$ y=0; for i in $(neutron port-list |grep -i sr- |awk {'print $2'}); do ((y++)) ;nova boot --image RHEL7.2 --flavor medium --nic port-id=$i --availability-zone prod-az pbandark-$y ; done
[stack@ibm-x3630m4-5 ~]$ nova list |awk {'print $2'} |egrep -v '^$|ID' |xargs -i nova shelve {}
[stack@ibm-x3630m4-5 ~]$ nova list
+--------------------------------------+------------+-------------------+------------+-------------+----------------------------------+
| ID                                   | Name       | Status            | Task State | Power State | Networks                         |
+--------------------------------------+------------+-------------------+------------+-------------+----------------------------------+
| 8ff065ce-e49a-43db-a8f2-7bccb1b7f2d1 | pbandark-1 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.181 |
| 58388463-e10a-4ce0-bbb2-594b485db77d | pbandark-2 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.180 |
| e32dab54-b1ae-4a46-b393-f7db7fcc267e | pbandark-3 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.183 |
| c558a079-0451-4d95-b3c8-d470525e6274 | pbandark-4 | SHELVED_OFFLOADED | -          | Shutdown    | sriov-provider-171=10.65.199.182 |
+--------------------------------------+------------+-------------------+------------+-------------+----------------------------------+
[stack@ibm-x3630m4-5 ~]$ glance image-list
+--------------------------------------+--------------------+
| ID                                   | Name               |
+--------------------------------------+--------------------+
| c7840f05-8d0b-4e33-a1c6-338e8f21cb24 | pbandark-1-shelved |
| a5ce6082-d23a-4846-b0f5-ff03f2e2a4ac | pbandark-2-shelved |
| 06eacd81-774f-4b5b-8e89-7679259fc95c | pbandark-3-shelved |
| 0ce2b5ac-c657-4ac8-91aa-66e79ffb86fa | pbandark-4-shelved |
| 33f7f49d-8dfa-45dd-a82f-a1d7394a06f9 | RHEL7.2            |
+--------------------------------------+--------------------+

nova list |awk {'print $2'} |egrep -v '^$|ID' |xargs -i nova unshelve {}

2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4308, in unshelve_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     do_unshelve_instance()
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4307, in do_unshelve_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     filter_properties, node)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4354, in _unshelve_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     self.host)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2319, in setup_instance_network_on_host
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     self._update_port_binding_for_instance(context, instance, host)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 2407, in _update_port_binding_for_instance
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server     pci_slot)
2016-12-29 11:37:19.528 10861 ERROR oslo_messaging.rpc.server PortUpdateFailed: Port update failed for port fd4cd2ca-2082-49ea-864a-04340a8662ea: Unable to correlate PCI slot 0000:08:10.7
</snip>


Actual results:
Unable to unshelve the instance.

Expected results:
The unshelve operation should succeed.

Additional info:

Comment 2 Sahid Ferdjaoui 2017-01-06 09:50:45 UTC
Instances are removed from their hosts and their resources are freed during an 'offload'. When Nova later unshelves an instance in the offloaded state, the VM can be scheduled on a different host.

We do not create a migration context or claims for the new PCI devices, so we cannot update the port bindings.

This issue also occurs on master.
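
To illustrate the reasoning in this comment, here is a simplified model of the failure. It is not Nova's code. The helper names (`build_pci_mapping`, `rebind_port`) and the exception class are hypothetical stand-ins, and only the error text is taken from the traceback in the description. The assumption is that rebinding a port needs a mapping from the old PCI address to a newly claimed one, and that without a migration context and claims that mapping is empty.

```python
class PortUpdateFailed(Exception):
    """Stand-in for the exception named in the traceback (hypothetical)."""


def build_pci_mapping(old_devices, new_devices):
    """Pair each previously claimed PCI address with a newly claimed one."""
    return dict(zip(old_devices, new_devices))


def rebind_port(port_id, binding_profile, pci_mapping):
    """Return an updated binding profile, or raise if the slot is unknown."""
    old_slot = binding_profile["pci_slot"]
    new_slot = pci_mapping.get(old_slot)
    if new_slot is None:
        # No claim was recorded for this device, so nothing to correlate.
        raise PortUpdateFailed(
            "Port update failed for port %s: Unable to correlate PCI slot %s"
            % (port_id, old_slot)
        )
    updated = dict(binding_profile)
    updated["pci_slot"] = new_slot
    return updated


if __name__ == "__main__":
    profile = {"pci_slot": "0000:08:12.7"}
    port = "cd05b3cd-e88f-43f1-a259-02d95d678dc0"

    # With a claim on the destination host (address made up for the demo).
    mapping = build_pci_mapping(["0000:08:12.7"], ["0000:05:10.1"])
    print(rebind_port(port, profile, mapping))

    # Unshelve case as described above: no claims, so the mapping is empty.
    try:
        rebind_port(port, profile, build_pci_mapping([], []))
    except PortUpdateFailed as exc:
        print(exc)
```

In this model the second call fails in the same way as the unshelve in the report, because nothing ever recorded which device on the new host replaces the one the instance held before it was offloaded.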

Comment 8 Stephen Finucane 2018-08-30 16:22:53 UTC
OSP 11 is EOL and I have been unable to reproduce this on OSP 13. Is this still an issue or can I mark this as closed?

Comment 9 Jaison Raju 2018-08-30 18:57:58 UTC
(In reply to Stephen Finucane from comment #8)
> OSP 11 is EOL and I have been unable to reproduce this on OSP 13. Is this
> still an issue or can I mark this as closed?

I am not sure whether this is still reproducible in 13.
I can test and report back in a week's time.
Leaving the needinfo in place until I test and confirm this.

Comment 10 Stephen Finucane 2018-10-02 14:16:03 UTC
Any updates on this, Jaison?

Comment 11 Stephen Finucane 2018-10-16 14:28:11 UTC
I'm going to close this as this no longer appears to be an issue.