Bug 1344315

Summary: SRIOV PF/VF allocation fails with NUMA aware flavor
Product: Red Hat OpenStack
Reporter: Ricardo Noriega <rnoriega>
Component: openstack-nova
Assignee: Vladik Romanovsky <vromanso>
Status: CLOSED ERRATA
QA Contact: Prasanth Anbalagan <panbalag>
Severity: high
Priority: high
Version: 9.0 (Mitaka)
CC: berrange, dasmith, eglynn, kchamart, sbauza, sferdjao, sgordon, srevivo, vromanso, yrachman
Target Milestone: async
Keywords: ZStream
Target Release: 9.0 (Mitaka)
Hardware: x86_64
OS: Linux
Fixed In Version: openstack-nova-13.1.1-3.el7ost
Last Closed: 2016-09-21 14:08:55 UTC
Type: Bug

Description Ricardo Noriega 2016-06-09 12:03:40 UTC
Description
===========
The main failure appears to be caused by incorrect NUMA filtering in the PCI allocation mechanism. Allocation is done according to the instance NUMA topology, but this is not always correct. In particular, when a user sets hw:numa_nodes=1, the VM should take its resources from a single NUMA node, but not from any specific one.
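
To make the distinction concrete, here is a toy model of the two behaviours. It is not nova's code: PciDev, pick_strict and pick_any_single_node are hypothetical names, the PCI addresses are made up, and the layout (free NICs only on host node 1) is an assumption chosen to show the failure.

```python
from collections import namedtuple

# Hypothetical stand-in for a free PCI device; not a nova object.
PciDev = namedtuple("PciDev", ["address", "numa_node"])


def pick_strict(free_devs, instance_cell_ids):
    """Accept only devices whose host NUMA node id equals an instance cell id."""
    return [d for d in free_devs if d.numa_node in instance_cell_ids]


def pick_any_single_node(free_devs, host_nodes, count=1):
    """Accept the first host node that still has `count` free devices."""
    for node in host_nodes:
        on_node = [d for d in free_devs if d.numa_node == node]
        if len(on_node) >= count:
            return on_node[:count]
    return []


# Assumed layout: the NICs on host node 0 are in use, two on node 1 are free.
free = [PciDev("0000:81:00.0", 1), PciDev("0000:81:00.1", 1)]

# hw:numa_nodes=1 gives the instance a single cell, numbered 0.
strict = pick_strict(free, instance_cell_ids={0})
relaxed = pick_any_single_node(free, host_nodes=[0, 1])
print(strict)   # no device found, although two are free
print(relaxed)  # one device from host node 1
```

In this sketch the strict variant treats "one NUMA node" as "node 0" and finds nothing, while the relaxed variant accepts any single host node that still has a free device.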

Steps to reproduce
==================

Create a nova flavor with NUMA awareness, CPU pinning, huge pages, etc.:

# nova flavor-create prefer_pin_1 auto 2048 20 1
# nova flavor-key prefer_pin_1 set hw:numa_nodes=1
# nova flavor-key prefer_pin_1 set hw:mem_page_size=1048576
# nova flavor-key prefer_pin_1 set hw:numa_mempolicy=strict
# nova flavor-key prefer_pin_1 set hw:cpu_policy=dedicated
# nova flavor-key prefer_pin_1 set hw:cpu_thread_policy=prefer

Then instantiate VMs with direct-physical neutron ports:

neutron port-create nfv_sriov --binding:vnic-type direct-physical --name pf1
nova boot pf1 --flavor prefer_pin_1 --image centos_udev --nic port-id=a0fe88f6-07cc-4c70-b702-1915e36ed728
neutron port-create nfv_sriov --binding:vnic-type direct-physical --name pf2
nova boot pf2 --flavor prefer_pin_1 --image centos_udev --nic port-id=b96de3ec-ef94-428b-96bc-dc46623a2427

The third VM instantiation fails, even though our environment has 4 NICs configured for allocation. With a regular flavor (m1.normal), however, instantiation works:

neutron port-create nfv_sriov --binding:vnic-type direct-physical --name pf3
nova boot pf3 --flavor 2 --image centos_udev --nic port-id=52caacfe-0324-42bd-84ad-9a54d80e8fbe
neutron port-create nfv_sriov --binding:vnic-type direct-physical --name pf4
nova boot pf4 --flavor 2 --image centos_udev --nic port-id=7335a9a6-82d0-4595-bb88-754678db56ef

Expected result
===============

PCI passthrough (PFs and VFs) should work in an environment with NUMATopologyFilter enabled.

Actual result
=============

The NIC availability check does not work when NUMATopologyFilter is enabled.

Environment
===========

1 controller + 1 compute.

OpenStack Mitaka

Logs & Configs
==============

See attachment

Comment 3 Prasanth Anbalagan 2016-09-14 19:56:41 UTC
Neutron port was created with the option "--binding:vnic-type direct".
Booting an instance with --nic causes NUMATopologyFilter to fail. Without the option, booting an instance works. Please check the logs below.

A few things to note:
* The error was observed with both flavors 200 and 300 below.
* Instance boot was done after deleting all existing instances.
* The pci_devices table shows that resources are still available.

*********
VERSION
*********

[root@serverX ~(keystone_admin)]# yum list installed | grep openstack-nova
openstack-nova-api.noarch            1:13.1.1-4.el7ost       @rhelosp-9.0-puddle
openstack-nova-cert.noarch           1:13.1.1-4.el7ost       @rhelosp-9.0-puddle
openstack-nova-common.noarch         1:13.1.1-4.el7ost       @rhelosp-9.0-puddle
openstack-nova-compute.noarch        1:13.1.1-4.el7ost       @rhelosp-9.0-puddle
openstack-nova-conductor.noarch      1:13.1.1-4.el7ost       @rhelosp-9.0-puddle
openstack-nova-console.noarch        1:13.1.1-4.el7ost       @rhelosp-9.0-puddle
openstack-nova-novncproxy.noarch     1:13.1.1-4.el7ost       @rhelosp-9.0-puddle
openstack-nova-scheduler.noarch      1:13.1.1-4.el7ost       @rhelosp-9.0-puddle

**********
LOGS
**********

[root@rhos-compute-node-02 ~(keystone_admin)]# nova flavor-show 200
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                                                                                             |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                                                                             |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                                                                 |
| disk                       | 5                                                                                                                                                                 |
| extra_specs                | {"hw:cpu_policy": "dedicated", "hw:cpu_thread_policy": "prefer", "pci_passthrough:alias": "pci_pass_test:1", "hw:numa_nodes": "1", "hw:numa_mempolicy": "strict"} |
| id                         | 200                                                                                                                                                               |
| name                       | pci-pass                                                                                                                                                          |
| os-flavor-access:is_public | True                                                                                                                                                              |
| ram                        | 512                                                                                                                                                               |
| rxtx_factor                | 1.0                                                                                                                                                               |
| swap                       |                                                                                                                                                                   |
| vcpus                      | 1                                                                                                                                                                 |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
[root@rhos-compute-node-02 ~(keystone_admin)]# 
[root@rhos-compute-node-02 ~(keystone_admin)]# 
[root@rhos-compute-node-02 ~(keystone_admin)]# nova flavor-show 300
+----------------------------+-----------------------------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                                                 |
+----------------------------+-----------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                                 |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                     |
| disk                       | 5                                                                                                                     |
| extra_specs                | {"hw:cpu_policy": "dedicated", "hw:cpu_thread_policy": "prefer", "hw:numa_nodes": "1", "hw:numa_mempolicy": "strict"} |
| id                         | 300                                                                                                                   |
| name                       | pci-pass1                                                                                                             |
| os-flavor-access:is_public | True                                                                                                                  |
| ram                        | 512                                                                                                                   |
| rxtx_factor                | 1.0                                                                                                                   |
| swap                       |                                                                                                                       |
| vcpus                      | 1                                                                                                                     |
+----------------------------+-----------------------------------------------------------------------------------------------------------------------+
[root@rhos-compute-node-02 ~(keystone_admin)]# 


********************************************
WITHOUT --nic option in boot
*********************************************

[root@serverX ~(keystone_admin)]# nova show vm1
+--------------------------------------+----------------------------------------------------------+
| Property                             | Value                                                    |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                   |
| OS-EXT-AZ:availability_zone          | nova                                                     |
| OS-EXT-SRV-ATTR:host                 | serverX.lab.eng.rdu2.redhat.com                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | serverX.lab.eng.rdu2.redhat.com                          |
| OS-EXT-SRV-ATTR:instance_name        | instance-0000000a                                        |
| OS-EXT-STS:power_state               | 1                                                        |
| OS-EXT-STS:task_state                | -                                                        |
| OS-EXT-STS:vm_state                  | active                                                   |
| OS-SRV-USG:launched_at               | 2016-09-14T18:42:27.000000                               |
| OS-SRV-USG:terminated_at             | -                                                        |
| accessIPv4                           |                                                          |
| accessIPv6                           |                                                          |
| config_drive                         |                                                          |
| created                              | 2016-09-14T18:42:19Z                                     |
| flavor                               | pci-pass (200)                                           |
| hostId                               | 715eec11b0869d3f063f023d3a53bcaf1357a62a4e596f9bcb986a08 |
| id                                   | 48d81369-b26b-46e8-94b8-ab35543e9506                     |
| image                                | cirros (e1819103-0254-4b2b-a323-38cd6143073d)            |
| key_name                             | -                                                        |
| metadata                             | {}                                                       |
| name                                 | vm1                                                      |
| os-extended-volumes:volumes_attached | []                                                       |
| progress                             | 0                                                        |
| public network                       | 172.24.4.231                                             |
| security_groups                      | default                                                  |
| status                               | ACTIVE                                                   |
| tenant_id                            | 0bd41cf0d4bd4eddacfcf5a51b2b13cf                         |
| updated                              | 2016-09-14T18:42:27Z                                     |
| user_id                              | 42fa6f918169480589dca471b5240457                         |
+--------------------------------------+----------------------------------------------------------+
[root@serverX ~(keystone_admin)]# 


*****************************************************
WITH --nic option in boot 
*****************************************************

NUMATopologyFilter returned 0 hosts
2016-09-14 22:13:40.707 15706 INFO nova.filters [req-52a796ad-aef4-4ab5-9d79-70af7bdae0e8 42fa6f918169480589dca471b5240457 0bd41cf0d4bd4eddacfcf5a51b2b13cf - - -] Filtering removed all hosts for the request with instance ID '32f0b53e-3390-4d50-b9b3-9ab21913377c'. Filter results: ['RetryFilter: (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', 'RamFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 1)', 'ComputeCapabilitiesFilter: (start: 1, end: 1)', 'ImagePropertiesFilter: (start: 1, end: 1)', 'CoreFilter: (start: 1, end: 1)', 'AggregateInstanceExtraSpecsFilter: (start: 1, end: 1)', 'NUMATopologyFilter: (start: 1, end: 0)']
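
For long scheduler logs, a small helper can pick out the filter that eliminated the last host. This is only a sketch based on the log format shown above; parse_filter_results and first_rejecting_filter are hypothetical helper names, and the sample line is shortened.

```python
import re

# Matches entries such as 'RamFilter: (start: 1, end: 1)'.
ENTRY = re.compile(r"'(\w+): \(start: (\d+), end: (\d+)\)'")


def parse_filter_results(line):
    """Return (filter_name, hosts_before, hosts_after) tuples in log order."""
    return [(n, int(s), int(e)) for n, s, e in ENTRY.findall(line)]


def first_rejecting_filter(line):
    """Name of the first filter that left zero hosts, or None."""
    for name, start, end in parse_filter_results(line):
        if start > 0 and end == 0:
            return name
    return None


# Shortened copy of the "Filter results" part of the log line above.
sample = (
    "Filter results: ['RetryFilter: (start: 1, end: 1)', "
    "'CoreFilter: (start: 1, end: 1)', "
    "'NUMATopologyFilter: (start: 1, end: 0)']"
)
print(first_rejecting_filter(sample))  # NUMATopologyFilter
```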

==> nova-conductor.log <==
2016-09-14 22:13:40.710 15770 WARNING nova.scheduler.utils [req-52a796ad-aef4-4ab5-9d79-70af7bdae0e8 42fa6f918169480589dca471b5240457 0bd41cf0d4bd4eddacfcf5a51b2b13cf - - -] Failed to compute_task_build_instances: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 150, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 104, in select_destinations
    dests = self.driver.select_destinations(ctxt, spec_obj)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 74, in select_destinations
    raise exception.NoValidHost(reason=reason)

NoValidHost: No valid host was found. There are not enough hosts available.


******************
PCI_DEVICES TABLE
******************
MariaDB [nova]> select * from pci_devices;
+---------------------+---------------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+-----------------------------------+---------------+------------+-----------+--------------+
| created_at          | updated_at          | deleted_at | deleted | id | compute_node_id | address      | product_id | vendor_id | dev_type | dev_id           | label           | status    | extra_info                        | instance_uuid | request_id | numa_node | parent_addr  |
+---------------------+---------------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+-----------------------------------+---------------+------------+-----------+--------------+
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  1 |               1 | 0000:07:10.0 | 1520       | 8086      | type-VF  | pci_0000_07_10_0 | label_8086_1520 | available | {"phys_function": "0000:06:00.0"} | NULL          | NULL       |         0 | 0000:06:00.0 |
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  2 |               1 | 0000:07:10.1 | 1520       | 8086      | type-VF  | pci_0000_07_10_1 | label_8086_1520 | available | {"phys_function": "0000:06:00.1"} | NULL          | NULL       |         0 | 0000:06:00.1 |
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  3 |               1 | 0000:07:10.2 | 1520       | 8086      | type-VF  | pci_0000_07_10_2 | label_8086_1520 | available | {"phys_function": "0000:06:00.2"} | NULL          | NULL       |         0 | 0000:06:00.2 |
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  4 |               1 | 0000:07:10.3 | 1520       | 8086      | type-VF  | pci_0000_07_10_3 | label_8086_1520 | available | {"phys_function": "0000:06:00.3"} | NULL          | NULL       |         0 | 0000:06:00.3 |
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  5 |               1 | 0000:07:10.4 | 1520       | 8086      | type-VF  | pci_0000_07_10_4 | label_8086_1520 | available | {"phys_function": "0000:06:00.0"} | NULL          | NULL       |         0 | 0000:06:00.0 |
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  6 |               1 | 0000:07:10.5 | 1520       | 8086      | type-VF  | pci_0000_07_10_5 | label_8086_1520 | available | {"phys_function": "0000:06:00.1"} | NULL          | NULL       |         0 | 0000:06:00.1 |
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  7 |               1 | 0000:07:10.6 | 1520       | 8086      | type-VF  | pci_0000_07_10_6 | label_8086_1520 | available | {"phys_function": "0000:06:00.2"} | NULL          | NULL       |         0 | 0000:06:00.2 |
| 2016-09-14 17:59:14 | 2016-09-14 19:53:03 | NULL       |       0 |  8 |               1 | 0000:07:10.7 | 1520       | 8086      | type-VF  | pci_0000_07_10_7 | label_8086_1520 | available | {"phys_function": "0000:06:00.3"} | NULL          | NULL       |         0 | 0000:06:00.3 |
+---------------------+---------------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+-----------------------------------+---------------+------------+-----------+--------------+
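
To check quickly how many devices are still free on each NUMA node, the client output can be parsed and counted. The sketch below assumes the ASCII table layout printed by the mysql client; parse_mysql_table is a hypothetical helper, and the sample is trimmed to four columns and the first two rows of the table above.

```python
from collections import Counter


def parse_mysql_table(text):
    """Turn the ASCII table printed by the mysql client into a list of dicts."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
        if line.strip().startswith("|")
    ]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, r)) for r in data]


# Trimmed sample: four of the columns, first two rows.
table = """
+--------------+----------+-----------+-----------+
| address      | dev_type | status    | numa_node |
+--------------+----------+-----------+-----------+
| 0000:07:10.0 | type-VF  | available |         0 |
| 0000:07:10.1 | type-VF  | available |         0 |
+--------------+----------+-----------+-----------+
"""

devices = parse_mysql_table(table)
# Count free devices per NUMA node (node ids stay strings here).
free_per_node = Counter(
    d["numa_node"] for d in devices if d["status"] == "available"
)
print(free_per_node)
```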

Comment 4 Vladik Romanovsky 2016-09-15 02:06:05 UTC
Hi,

This is happening because you are trying to allocate VFs that have physical_network: None, but all of the devices you have whitelisted have
physical_network: physnet1

[{'count': 8, 'product_id': u'1520', u'dev_type': u'type-VF', 'numa_node': 0, 'vendor_id': u'8086', u'physical_network': u'physnet1'}]


Removing this tag makes the filters pass.
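
The mismatch can be illustrated with a simplified key-by-key comparison. This is an assumption about the general idea, not nova's actual matching code: pool_matches is a hypothetical helper, the request spec is invented for illustration, and "removing this tag" is read here as dropping physical_network from the whitelisted pool.

```python
def pool_matches(pool, spec):
    """True when every key in the request spec has the same value in the pool."""
    return all(pool.get(key) == value for key, value in spec.items())


# Pool as printed above.
pool = {
    "count": 8,
    "product_id": "1520",
    "dev_type": "type-VF",
    "numa_node": 0,
    "vendor_id": "8086",
    "physical_network": "physnet1",
}

# Hypothetical request spec carrying physical_network None.
request_spec = {"physical_network": None}

# Same pool with the physical_network tag removed.
untagged_pool = {k: v for k, v in pool.items() if k != "physical_network"}

print(pool_matches(pool, request_spec))           # False
print(pool_matches(untagged_pool, request_spec))  # True
```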
However, it then runs into the binding error again, which comes from neutron.
That might be because you didn't configure the SR-IOV agent (I'm not an expert here).
I would suggest following [1] and [2] to set it up.

Using PCI aliases (PCI passthrough without a neutron port) works fine.

Thank you!
Vladik

[1] https://docs.google.com/document/d/1qQbJlLI1hSlE4uwKpmVd0BoGSDBd8Z0lTzx5itQ6WL0/edit#heading=h.aj2vev1y0yj6
[2] http://docs.openstack.org/mitaka/networking-guide/config-sriov.html

Comment 5 Prasanth Anbalagan 2016-09-15 12:24:43 UTC
Vladik,

Thanks for looking into it. It was a configuration issue (the SR-IOV agent part was missing). It works fine now. Marking it as VERIFIED.

Comment 7 errata-xmlrpc 2016-09-21 14:08:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1916.html