Bug 1655267 - [OSP10] Physical Function binding fails
Summary: [OSP10] Physical Function binding fails
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: ---
Assignee: Rodolfo Alonso
QA Contact: Roee Agiman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-12-02 06:22 UTC by Vadim Khitrin
Modified: 2018-12-17 13:54 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-17 13:35:07 UTC
Target Upstream Version:
Embargoed:



Description Vadim Khitrin 2018-12-02 06:22:42 UTC
Description of problem:

Physical Function (PF) binding fails on an SR-IOV-enabled compute host.

Create network:
openstack network create --provider-network-type vlan --provider-physical-network sriov-1 SR-IOV
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | UP                                   |
| availability_zone_hints   |                                      |
| availability_zones        |                                      |
| created_at                | 2018-12-02T05:38:23Z                 |
| description               |                                      |
| headers                   |                                      |
| id                        | 70dfe420-6d43-4d9e-9c1a-02a5d7a3f841 |
| ipv4_address_scope        | None                                 |
| ipv6_address_scope        | None                                 |
| mtu                       | 9000                                 |
| name                      | SR-IOV                               |
| port_security_enabled     | True                                 |
| project_id                | 37b24cddb4cf46ce890fb57bf1354426     |
| project_id                | 37b24cddb4cf46ce890fb57bf1354426     |
| provider:network_type     | vlan                                 |
| provider:physical_network | sriov-1                              |
| provider:segmentation_id  | 600                                  |
| qos_policy_id             | None                                 |
| revision_number           | 3                                    |
| router:external           | Internal                             |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tags                      | []                                   |
| updated_at                | 2018-12-02T05:38:23Z                 |
+---------------------------+--------------------------------------+

Create subnet:
openstack subnet create --network SR-IOV --subnet-range 40.0.0.0/24 --allocation-pool start=40.0.0.100,end=40.0.0.200 --dhcp SR-IOV_subnet
+-------------------+--------------------------------------+
| Field             | Value                                |
+-------------------+--------------------------------------+
| allocation_pools  | 40.0.0.100-40.0.0.200                |
| cidr              | 40.0.0.0/24                          |
| created_at        | 2018-12-02T05:41:03Z                 |
| description       |                                      |
| dns_nameservers   |                                      |
| enable_dhcp       | True                                 |
| gateway_ip        | 40.0.0.1                             |
| headers           |                                      |
| host_routes       |                                      |
| id                | 0efeaec1-33bb-4387-96fb-3c60f81bf84f |
| ip_version        | 4                                    |
| ipv6_address_mode | None                                 |
| ipv6_ra_mode      | None                                 |
| name              | SR-IOV_subnet                        |
| network_id        | 70dfe420-6d43-4d9e-9c1a-02a5d7a3f841 |
| project_id        | 37b24cddb4cf46ce890fb57bf1354426     |
| project_id        | 37b24cddb4cf46ce890fb57bf1354426     |
| revision_number   | 2                                    |
| service_types     | []                                   |
| subnetpool_id     | None                                 |
| updated_at        | 2018-12-02T05:41:03Z                 |
+-------------------+--------------------------------------+

Create port:
openstack port create --network SR-IOV --vnic-type direct-physical PF_port
+-----------------------+---------------------------------------------------------------------------+
| Field                 | Value                                                                     |
+-----------------------+---------------------------------------------------------------------------+
| admin_state_up        | UP                                                                        |
| allowed_address_pairs |                                                                           |
| binding_host_id       |                                                                           |
| binding_profile       |                                                                           |
| binding_vif_details   |                                                                           |
| binding_vif_type      | unbound                                                                   |
| binding_vnic_type     | direct-physical                                                           |
| created_at            | 2018-12-02T05:42:12Z                                                      |
| description           |                                                                           |
| device_id             |                                                                           |
| device_owner          |                                                                           |
| extra_dhcp_opts       |                                                                           |
| fixed_ips             | ip_address='40.0.0.108', subnet_id='0efeaec1-33bb-4387-96fb-3c60f81bf84f' |
| headers               |                                                                           |
| id                    | 5df91c9c-0462-4d95-ab9b-d6c17a2a9577                                      |
| mac_address           | fa:16:3e:4d:62:f3                                                         |
| name                  | PF_port                                                                   |
| network_id            | 70dfe420-6d43-4d9e-9c1a-02a5d7a3f841                                      |
| port_security_enabled | True                                                                      |
| project_id            | 37b24cddb4cf46ce890fb57bf1354426                                          |
| project_id            | 37b24cddb4cf46ce890fb57bf1354426                                          |
| qos_policy_id         | None                                                                      |
| revision_number       | 6                                                                         |
| security_groups       | 47fac4f7-a5cd-452d-9d17-f671a354e911                                      |
| status                | DOWN                                                                      |
| updated_at            | 2018-12-02T05:42:13Z                                                      |
+-----------------------+---------------------------------------------------------------------------+

Launch instance:
openstack server create --image rhel-guest-image-7.5-180.x86_64.qcow2 --flavor m1.medium.huge_pages_cpu_pinning_numa_node-0 --nic port-id=5df91c9c-0462-4d95-ab9b-d6c17a2a9577 PF_Instance
+--------------------------------------+-------------------------------------------------------------------------------------+
| Field                                | Value                                                                               |
+--------------------------------------+-------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                                              |
| OS-EXT-AZ:availability_zone          |                                                                                     |
| OS-EXT-SRV-ATTR:host                 | None                                                                                |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | None                                                                                |
| OS-EXT-SRV-ATTR:instance_name        |                                                                                     |
| OS-EXT-STS:power_state               | NOSTATE                                                                             |
| OS-EXT-STS:task_state                | scheduling                                                                          |
| OS-EXT-STS:vm_state                  | building                                                                            |
| OS-SRV-USG:launched_at               | None                                                                                |
| OS-SRV-USG:terminated_at             | None                                                                                |
| accessIPv4                           |                                                                                     |
| accessIPv6                           |                                                                                     |
| addresses                            |                                                                                     |
| adminPass                            | 9rp89F48CqZ2                                                                        |
| config_drive                         |                                                                                     |
| created                              | 2018-12-02T05:48:07Z                                                                |
| flavor                               | m1.medium.huge_pages_cpu_pinning_numa_node-0 (4e75710f-7324-4f1c-aa89-3bddb564a1d5) |
| hostId                               |                                                                                     |
| id                                   | 8bedb682-2576-4ab8-a89c-59394b313e38                                                |
| image                                | rhel-guest-image-7.5-180.x86_64.qcow2 (508d49e7-9d76-4d1d-b63c-035648e2418b)        |
| key_name                             | None                                                                                |
| name                                 | PF_Instance                                                                         |
| os-extended-volumes:volumes_attached | []                                                                                  |
| progress                             | 0                                                                                   |
| project_id                           | 37b24cddb4cf46ce890fb57bf1354426                                                    |
| properties                           |                                                                                     |
| security_groups                      | [{u'name': u'default'}]                                                             |
| status                               | BUILD                                                                               |
| updated                              | 2018-12-02T05:48:08Z                                                                |
| user_id                              | 545cc95fdca74ba28a7a44ba9e37b5fc                                                    |
+--------------------------------------+-------------------------------------------------------------------------------------+

View instance:
openstack server show PF_Instance
[stack@undercloud-0 ~]$ openstack server show 8bedb682-2576-4ab8-a89c-59394b313e38
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| Field                                | Value                                                                                                                                |
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                                                                                               |
| OS-EXT-AZ:availability_zone          |                                                                                                                                      |
| OS-EXT-SRV-ATTR:host                 | None                                                                                                                                 |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | None                                                                                                                                 |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000039                                                                                                                    |
| OS-EXT-STS:power_state               | NOSTATE                                                                                                                              |
| OS-EXT-STS:task_state                | None                                                                                                                                 |
| OS-EXT-STS:vm_state                  | error                                                                                                                                |
| OS-SRV-USG:launched_at               | None                                                                                                                                 |
| OS-SRV-USG:terminated_at             | None                                                                                                                                 |
| accessIPv4                           |                                                                                                                                      |
| accessIPv6                           |                                                                                                                                      |
| addresses                            |                                                                                                                                      |
| config_drive                         |                                                                                                                                      |
| created                              | 2018-12-02T05:48:07Z                                                                                                                 |
| fault                                | {u'message': u'Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 8bedb682-2576-4ab8-a89c-          |
|                                      | 59394b313e38. Last exception: Binding failed for port 5df91c9c-0462-4d95-ab9b-d6c17a2a9577, please check neutron logs for more       |
|                                      | information.', u'code': 500, u'details': u'  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 493, in         |
|                                      | build_instances\n    filter_properties, instances[0].uuid)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/utils.py", line  |
|                                      | 184, in populate_retry\n    raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2018-12-02T05:48:29Z'}                  |
| flavor                               | m1.medium.huge_pages_cpu_pinning_numa_node-0 (4e75710f-7324-4f1c-aa89-3bddb564a1d5)                                                  |
| hostId                               |                                                                                                                                      |
| id                                   | 8bedb682-2576-4ab8-a89c-59394b313e38                                                                                                 |
| image                                | rhel-guest-image-7.5-180.x86_64.qcow2 (508d49e7-9d76-4d1d-b63c-035648e2418b)                                                         |
| key_name                             | None                                                                                                                                 |
| name                                 | PF_Instance                                                                                                                          |
| os-extended-volumes:volumes_attached | []                                                                                                                                   |
| project_id                           | 37b24cddb4cf46ce890fb57bf1354426                                                                                                     |
| properties                           |                                                                                                                                      |
| status                               | ERROR                                                                                                                                |
| updated                              | 2018-12-02T05:48:30Z                                                                                                                 |
| user_id                              | 545cc95fdca74ba28a7a44ba9e37b5fc                                                                                                     |
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+

Snippet of error from nova-compute.log on hypervisor:
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [req-45e15a54-5a4e-4bc0-9e0e-6c33fa20d580 545cc95fdca74ba28a7a44ba9e37b5fc 37b24cddb4cf46ce890fb57bf1354426 - - -] [instance: 8bedb682-2576-4ab8-a89c-59394b313e38] Instance failed to spawn
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38] Traceback (most recent call last):
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2087, in _build_resources
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     yield resources
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1928, in _build_and_run_instance
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     block_device_info=block_device_info)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2749, in spawn
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     block_device_info=block_device_info)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4886, in _get_guest_xml
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     network_info_str = str(network_info)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/model.py", line 538, in __str__
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     return self._sync_wrapper(fn, *args, **kwargs)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/model.py", line 521, in _sync_wrapper
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     self.wait()
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/model.py", line 553, in wait
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     self[:] = self._gt.wait()
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 175, in wait
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     return self._exit_event.wait()
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     return hubs.get_hub().switch()
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     return self.greenlet.switch()
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     result = function(*args, **kwargs)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/utils.py", line 1066, in context_wrapper
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     return func(*args, **kwargs)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1418, in _allocate_network_async
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     six.reraise(*exc_info)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1401, in _allocate_network_async
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     bind_host_id=bind_host_id)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 894, in allocate_for_instance
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     bind_host_id, dhcp_opts, available_macs)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 1013, in _update_ports_for_instance
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     vif.destroy()
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     self.force_reraise()
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     six.reraise(self.type_, self.value, self.tb)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 983, in _update_ports_for_instance
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     port_client, instance, port_id, port_req_body)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 450, in _update_port
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     _ensure_no_port_binding_failure(port)
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 188, in _ensure_no_port_binding_failure
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38]     raise exception.PortBindingFailed(port_id=port['id'])
2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager [instance: 8bedb682-2576-4ab8-a89c-59394b313e38] PortBindingFailed: Binding failed for port 5df91c9c-0462-4d95-ab9b-d6c17a2a9577, please check neutron logs for more information.
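
The last frames of the traceback show nova raising PortBindingFailed from _ensure_no_port_binding_failure right after it updates the port. The following is a minimal standalone sketch of that kind of check, assuming Neutron marks a port it could not bind by setting binding:vif_type to 'binding_failed'. The exception class and the sample port dict are stand-ins for illustration, not nova's actual code.

```python
# Stand-in for nova's exception class; illustrative only.
class PortBindingFailed(Exception):
    def __init__(self, port_id):
        super().__init__(
            "Binding failed for port %s, please check neutron logs "
            "for more information." % port_id)
        self.port_id = port_id


# Assumption: the vif_type value reported when no mechanism driver
# managed to bind the port.
VIF_TYPE_BINDING_FAILED = "binding_failed"


def ensure_no_port_binding_failure(port):
    """Raise PortBindingFailed if the port dict reports a failed binding."""
    if port.get("binding:vif_type") == VIF_TYPE_BINDING_FAILED:
        raise PortBindingFailed(port_id=port["id"])


# Hypothetical port dict shaped like a Neutron API response.
failed_port = {
    "id": "5df91c9c-0462-4d95-ab9b-d6c17a2a9577",
    "binding:vif_type": "binding_failed",
}

try:
    ensure_no_port_binding_failure(failed_port)
except PortBindingFailed as exc:
    print(exc)
```

Under this reading, the nova-side exception only reports the failure. The reason for it has to be found on the Neutron side.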

Neutron logs on the controller contain only the binding attempts.

I will attach an SOS report in the comments.

Version-Release number of selected component (if applicable):
The puddle used for this report is 2018-11-27.1, but the issue was also seen in late-August puddles.


How reproducible:
Always


Steps to Reproduce:
1. Deploy an SR-IOV-capable environment
2. Create a network, a subnet, and a PF port
3. Boot an instance with the PF port
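
The steps above can be sketched as a small helper that assembles the CLI invocations shown in the description. This is only an illustration: the function name is hypothetical, the arguments are copied from this report, and the port ID must come from the output of the port-create step. The commands are built as argv lists and printed, not executed.

```python
import shlex


def build_repro_commands(port_id, network="SR-IOV", subnet="SR-IOV_subnet",
                         port="PF_port", server="PF_Instance"):
    """Return the CLI invocations from this report as argv lists."""
    return [
        ["openstack", "network", "create",
         "--provider-network-type", "vlan",
         "--provider-physical-network", "sriov-1", network],
        ["openstack", "subnet", "create", "--network", network,
         "--subnet-range", "40.0.0.0/24",
         "--allocation-pool", "start=40.0.0.100,end=40.0.0.200",
         "--dhcp", subnet],
        ["openstack", "port", "create", "--network", network,
         "--vnic-type", "direct-physical", port],
        ["openstack", "server", "create",
         "--image", "rhel-guest-image-7.5-180.x86_64.qcow2",
         "--flavor", "m1.medium.huge_pages_cpu_pinning_numa_node-0",
         "--nic", "port-id=" + port_id, server],
    ]


# Print shell-safe command lines; the port ID is the one from this report.
for argv in build_repro_commands("5df91c9c-0462-4d95-ab9b-d6c17a2a9577"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```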

Actual results:
Binding fails and the instance is not spawned.

Expected results:
The instance spawns successfully.

Additional info:

Comment 6 Rodolfo Alonso 2018-12-11 10:41:55 UTC
Hello Vadim:

The logs at http://rhos-release.virt.bos.redhat.com/log/bz1655267-1 have the same verbosity as those at http://rhos-release.virt.bos.redhat.com/log/bz1655267: there are no DEBUG messages.
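
One quick way to confirm whether a collected log contains any DEBUG messages is to count the level field of each line. The sketch below assumes the "<date> <time> <pid> <LEVEL> <logger> ..." layout seen in the nova-compute snippet in the description. The sample lines are shortened and hypothetical, not copied from the attached logs.

```python
import re
from collections import Counter

# Assumption: each log line starts with "<date> <time> <pid> <LEVEL> ".
LEVEL_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ \d+ ([A-Z]+) ")


def count_levels(lines):
    """Count log lines per level; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened, hypothetical sample lines in the same layout.
sample = [
    "2018-12-02 05:48:27.998 129094 ERROR nova.compute.manager Instance failed to spawn",
    "2018-12-02 05:48:28.100 129094 INFO nova.compute.manager Terminating instance",
]

counts = count_levels(sample)
print(dict(counts))
print("DEBUG messages present:", counts["DEBUG"] > 0)
```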

Regards.

Comment 14 Vadim Khitrin 2018-12-17 13:35:07 UTC
Hey,

First of all, thanks for all of your help, Rodolfo!

As far as I know, when OSP10 is deployed with SR-IOV, TripleO automatically populates 'supported_pci_vendor_devs' with the values '15b3:1004' and '8086:10ca' (see /etc/puppet/modules/neutron/manifests/plugins/ml2.pp).

Regardless, I am marking this as CLOSED NOTABUG: after redeploying with new values in TripleO that included '8086:1572' (Ethernet controller [0200]: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ [8086:1572] (rev 01)), PF binding was verified to work.
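
As a rough illustration of why the default list did not cover this NIC, the sketch below extracts the vendor:device pair from an `lspci -nn` style line and checks it against a list of supported IDs. The helper names are hypothetical, and the membership test is an assumption about how such a list is consulted, not the mechanism driver's actual code.

```python
import re

# Default values quoted in this comment.
DEFAULT_SUPPORTED = ["15b3:1004", "8086:10ca"]

# Matches a "[vendor:device]" pair of 4-digit hex IDs, as printed by `lspci -nn`.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def pci_id_from_lspci(line):
    """Return 'vendor:device' in lowercase, or None if no ID pair is found."""
    match = PCI_ID_RE.search(line)
    if match is None:
        return None
    return "%s:%s" % (match.group(1).lower(), match.group(2).lower())


def is_supported(line, supported):
    """Check whether the device on this lspci line is in the supported list."""
    return pci_id_from_lspci(line) in supported


# Device description taken from this comment.
x710 = ("Ethernet controller [0200]: Intel Corporation Ethernet Controller "
        "X710 for 10GbE SFP+ [8086:1572] (rev 01)")

print(pci_id_from_lspci(x710))
print(is_supported(x710, DEFAULT_SUPPORTED))
print(is_supported(x710, DEFAULT_SUPPORTED + ["8086:1572"]))
```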

Comment 15 Vadim Khitrin 2018-12-17 13:54:30 UTC
Refer to BZ#1448919 regarding how puppet generates supported_pci_vendor_devs.

