Bug 1482390 - SR-IOV VF count is not set immediately on deployment
Summary: SR-IOV VF count is not set immediately on deployment
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: z4
Target Release: 10.0 (Newton)
Assignee: Brent Eagles
QA Contact: Maxim Babushkin
URL:
Whiteboard:
Depends On:
Blocks: 1454624 1485452
Reported: 2017-08-17 07:11 UTC by Maxim Babushkin
Modified: 2017-09-06 17:13 UTC (History)

Fixed In Version: puppet-tripleo-5.6.1-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1485452
Environment:
Last Closed: 2017-09-06 17:13:14 UTC
Target Upstream Version:
Embargoed:


Attachments
Sosreport of the compute before the reboot (10.11 MB, application/x-xz)
2017-08-17 07:15 UTC, Maxim Babushkin


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1712903 | 0 | None | None | None | 2017-08-24 19:26:00 UTC
OpenStack gerrit | 497664 | 0 | None | None | None | 2017-08-25 18:04:36 UTC
Red Hat Product Errata | RHBA-2017:2654 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 10 director Bug Fix Advisory | 2017-09-06 20:55:36 UTC

Description Maxim Babushkin 2017-08-17 07:11:42 UTC
Description of problem:
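An RHOS 10 SR-IOV deployment on RHEL 7.4 is deployed incorrectly: the VFs are not created on the compute node.
On RHEL 7.3 the same templates deployed correctly (see comment 2).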

Version-Release number of selected component (if applicable):
RHOS 10
RHEL 7.4

How reproducible:
Perform an RHOS 10 SR-IOV deployment using RHEL 7.4 as the OS.

Actual results:
The VFs are missing on the compute node until the node is manually rebooted.

Expected results:
The VF should appear on the compute node.

Additional info:
A manual reboot of the compute node fixes the count and adds the VFs.
Compute:
--------
VF count on the compute before the reboot:
[root@compute-0 ~]# cat /sys/class/net/ens2f0/device/sriov_numvfs 
0
[root@compute-0 ~]# cat /sys/class/net/ens2f1/device/sriov_numvfs 
0

VF count on the compute after the reboot:
[root@compute-0 ~]# cat /sys/class/net/ens2f0/device/sriov_numvfs 
5
[root@compute-0 ~]# cat /sys/class/net/ens2f1/device/sriov_numvfs 
5
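
The before/after check above can be scripted. This is only a minimal sketch, assuming the sysfs layout shown in the commands above; the function name read_vf_counts and the sysfs_root parameter are illustrative and are not part of any tool mentioned in this report.

```python
from pathlib import Path


def read_vf_counts(interfaces, sysfs_root="/sys/class/net"):
    """Return {interface: VF count} read from <sysfs_root>/<if>/device/sriov_numvfs.

    An interface whose sriov_numvfs file is missing or unreadable maps to None.
    """
    counts = {}
    for name in interfaces:
        path = Path(sysfs_root) / name / "device" / "sriov_numvfs"
        try:
            counts[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            counts[name] = None
    return counts


if __name__ == "__main__":
    # Interface names are the ones from this report; adjust for other hosts.
    for name, count in read_vf_counts(["ens2f0", "ens2f1"]).items():
        print(f"{name}: {count}")
```

The sysfs_root parameter exists only so the function can be pointed at a scratch directory when testing away from a real compute node.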


Controller:
-----------
MySQL nova database before the compute reboot:

MariaDB [nova]> select * from pci_devices;
+---------------------+------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
| created_at          | updated_at | deleted_at | deleted | id | compute_node_id | address      | product_id | vendor_id | dev_type | dev_id           | label           | status    | extra_info | instance_uuid | request_id | numa_node | parent_addr |
+---------------------+------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
| 2017-08-16 15:28:31 | NULL       | NULL       |       0 |  1 |               1 | 0000:0b:00.0 | 10fb       | 8086      | type-PF  | pci_0000_0b_00_0 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL        |
| 2017-08-16 15:28:31 | NULL       | NULL       |       0 |  2 |               1 | 0000:0b:00.1 | 10fb       | 8086      | type-PF  | pci_0000_0b_00_1 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL        |
+---------------------+------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
2 rows in set (0.00 sec)

MariaDB [nova]> select hypervisor_hostname, pci_stats from compute_nodes;

| hypervisor_hostname   | pci_stats|

| compute-0.localdomain | {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": [{"nova_object.version": "1.1", "nova_object.changes": ["count", "numa_node", "vendor_id", "product_id", "tags"], "nova_object.name": "PciDevicePool", "nova_object.data": {"count": 2, "numa_node": 0, "vendor_id": "8086", "product_id": "10fb", "tags": {"dev_type": "type-PF", "physical_network": "sriov"}}, "nova_object.namespace": "nova"}]}, "nova_object.namespace": "nova"} |

1 row in set (0.00 sec)


MySQL nova database after the compute reboot:

MariaDB [nova]> select * from pci_devices;
+---------------------+---------------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+--------------+
| created_at          | updated_at          | deleted_at | deleted | id | compute_node_id | address      | product_id | vendor_id | dev_type | dev_id           | label           | status    | extra_info | instance_uuid | request_id | numa_node | parent_addr  |
+---------------------+---------------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+--------------+
| 2017-08-16 15:28:31 | 2017-08-17 06:07:23 | NULL       |       0 |  1 |               1 | 0000:0b:00.0 | 10fb       | 8086      | type-PF  | pci_0000_0b_00_0 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL         |
| 2017-08-16 15:28:31 | 2017-08-17 06:07:23 | NULL       |       0 |  2 |               1 | 0000:0b:00.1 | 10fb       | 8086      | type-PF  | pci_0000_0b_00_1 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL         |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 |  3 |               1 | 0000:0b:10.0 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_0 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.0 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 |  4 |               1 | 0000:0b:10.1 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_1 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.1 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 |  5 |               1 | 0000:0b:10.2 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_2 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.0 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 |  6 |               1 | 0000:0b:10.3 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_3 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.1 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 |  7 |               1 | 0000:0b:10.4 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_4 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.0 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 |  8 |               1 | 0000:0b:10.5 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_5 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.1 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 |  9 |               1 | 0000:0b:10.6 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_6 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.0 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 | 10 |               1 | 0000:0b:10.7 | 10ed       | 8086      | type-VF  | pci_0000_0b_10_7 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.1 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 | 11 |               1 | 0000:0b:11.0 | 10ed       | 8086      | type-VF  | pci_0000_0b_11_0 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.0 |
| 2017-08-17 06:07:23 | NULL                | NULL       |       0 | 12 |               1 | 0000:0b:11.1 | 10ed       | 8086      | type-VF  | pci_0000_0b_11_1 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | 0000:0b:00.1 |
+---------------------+---------------------+------------+---------+----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+--------------+
12 rows in set (0.00 sec)
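
To confirm that the VFs are split evenly between the two PFs, the client output above can be grouped by parent_addr. The following is a rough sketch that assumes the table was pasted as text from the mysql client and that no cell contains a "|" character; parse_mysql_table and vfs_per_parent are hypothetical helper names, and the sample is a trimmed subset of the columns and rows shown above.

```python
from collections import Counter


def parse_mysql_table(text):
    """Parse the ASCII table printed by the mysql client into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Data and header rows start and end with "|"; border rows start with "+".
        if line.startswith("|") and line.endswith("|"):
            rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def vfs_per_parent(rows):
    """Count type-VF rows per parent_addr (the address of the owning PF)."""
    return dict(Counter(r["parent_addr"] for r in rows if r["dev_type"] == "type-VF"))


# Trimmed sample: three columns and four rows taken from the table in this report.
SAMPLE_TABLE = """
+--------------+----------+--------------+
| address      | dev_type | parent_addr  |
+--------------+----------+--------------+
| 0000:0b:00.0 | type-PF  | NULL         |
| 0000:0b:10.0 | type-VF  | 0000:0b:00.0 |
| 0000:0b:10.1 | type-VF  | 0000:0b:00.1 |
| 0000:0b:10.2 | type-VF  | 0000:0b:00.0 |
+--------------+----------+--------------+
"""

if __name__ == "__main__":
    print(vfs_per_parent(parse_mysql_table(SAMPLE_TABLE)))
```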

MariaDB [nova]> select hypervisor_hostname, pci_stats from compute_nodes;

| hypervisor_hostname   | pci_stats|

| compute-0.localdomain | {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": [{"nova_object.version": "1.1", "nova_object.changes": ["count", "numa_node", "vendor_id", "product_id", "tags"], "nova_object.name": "PciDevicePool", "nova_object.data": {"count": 2, "numa_node": 0, "vendor_id": "8086", "product_id": "10fb", "tags": {"dev_type": "type-PF", "physical_network": "sriov"}}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.1", "nova_object.changes": ["count", "numa_node", "vendor_id", "product_id", "tags"], "nova_object.name": "PciDevicePool", "nova_object.data": {"count": 10, "numa_node": 0, "vendor_id": "8086", "product_id": "10ed", "tags": {"dev_type": "type-VF", "physical_network": "sriov"}}, "nova_object.namespace": "nova"}]}, "nova_object.namespace": "nova"} |

1 row in set (0.00 sec)
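
The pci_stats column is a serialized PciDevicePoolList, so the per-type totals can be read out of it directly. This is a small sketch assuming the JSON shape shown above; pool_counts_by_type is a hypothetical helper name, and the sample is trimmed to the fields the function reads, with the counts taken from this report.

```python
import json
from collections import Counter


def pool_counts_by_type(pci_stats_json):
    """Sum the device counts of a PciDevicePoolList JSON string per dev_type tag."""
    pools = json.loads(pci_stats_json)["nova_object.data"]["objects"]
    totals = Counter()
    for pool in pools:
        data = pool["nova_object.data"]
        totals[data["tags"]["dev_type"]] += data["count"]
    return dict(totals)


# Trimmed version of the post-reboot pci_stats value from this report.
SAMPLE_PCI_STATS = json.dumps({
    "nova_object.name": "PciDevicePoolList",
    "nova_object.data": {"objects": [
        {"nova_object.name": "PciDevicePool",
         "nova_object.data": {"count": 2, "product_id": "10fb",
                              "tags": {"dev_type": "type-PF"}}},
        {"nova_object.name": "PciDevicePool",
         "nova_object.data": {"count": 10, "product_id": "10ed",
                              "tags": {"dev_type": "type-VF"}}},
    ]},
})

if __name__ == "__main__":
    print(pool_counts_by_type(SAMPLE_PCI_STATS))
```

Before the reboot the same function would report only the type-PF pool, which matches the missing VFs described in this bug.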

Comment 1 Maxim Babushkin 2017-08-17 07:15:05 UTC
Created attachment 1314563 [details]
Sosreport of the compute before the reboot

Comment 2 Maxim Babushkin 2017-08-17 16:50:41 UTC
On RHEL 7.3 the deployment passed correctly with the same templates.

Comment 10 Brent Eagles 2017-08-22 19:01:20 UTC
There appears to be a problem with the network templates being used. The network configuration files for the interfaces indicated in the report do not contain NM_CONTROLLED=yes, as required on RHEL. On RHEL, the udev rules are not fired when the PCI device is hotplugged "back" into the system, as they are on CentOS. We rely on NetworkManager to recognize the network device and bring it "up", which ultimately causes the allocate_vfs script to be called.

It is curious that the same templates worked in RHEL 7.3 as it should have behaved the same way.
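
A quick way to check the interface files for the setting described in this comment is sketched below. It assumes the usual KEY=value layout of ifcfg files and looks only for an explicit NM_CONTROLLED=yes line; has_explicit_nm_controlled_yes is a hypothetical helper name, and the path in the usage example is the conventional network-scripts location, which is an assumption rather than something stated in this report.

```python
def has_explicit_nm_controlled_yes(ifcfg_text):
    """Return True if the ifcfg content explicitly sets NM_CONTROLLED to yes.

    If the key appears more than once, the last assignment wins, as it would
    when the file is sourced by a shell.
    """
    result = False
    for line in ifcfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "NM_CONTROLLED":
            result = value.strip().strip("\"'").lower() == "yes"
    return result


if __name__ == "__main__":
    # Assumed path; the interface name is the one from this report.
    path = "/etc/sysconfig/network-scripts/ifcfg-ens2f0"
    try:
        with open(path) as handle:
            print(path, has_explicit_nm_controlled_yes(handle.read()))
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
```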

Comment 12 Yariv 2017-08-23 07:17:28 UTC
Hi Brent


Verified with Intel NICs also 

See compute.yaml

              type: interface
              name: p1p1
              use_dhcp: false
              defroute: false
              nm_controlled: true
              hotplug: true

And on the compute node, ifcfg-p1p1 contains nm_controlled=yes

Additional Suggestions?

Comment 13 Brent Eagles 2017-08-23 12:46:49 UTC
Okay, this differs somewhat from the sosreport attached to the bug, where there are no interfaces with NM_CONTROLLED=yes. Can you verify that you are getting the same behavior and, if so, provide a sosreport or similar for your test environment? We are mainly looking for the contents of the interface files and the message logs, particularly PCI plugging and NetworkManager.

Comment 14 Brent Eagles 2017-08-23 16:52:44 UTC
I did a simple test against a RHEL 7.4 install that included the types of scripts and interface file modifications that tripleo would create. The NetworkManager, ifup-local* and allocate_vfs script mechanisms seem to do the job of re-initializing the VF count. I think it will expedite things if I can get access to a system that is exhibiting the problem behavior.

Comment 16 Brent Eagles 2017-08-24 18:00:27 UTC
I've located the problem and am testing a potential fix.

The cause was a regression introduced by a recent fix I made to allow updates on compute nodes where guest instances had "consumed" a physical function, so that the PCI device was not available.

Comment 17 Brent Eagles 2017-08-24 18:45:46 UTC
Some clarifications with respect to this bug:

- this is a regression introduced by https://review.openstack.org/#/c/478503/

- the bug is that Puppet no longer writes the VF counts defined in the heat variables to the interface's sriov_numvfs file (e.g. /sys/class/net/ens2f0/device/sriov_numvfs)

- the regression was backported throughout current versions and consequently will also need to be fixed throughout

- there is a workaround that does not require compute node reboot: ifdown/ifup of the affected interfaces

- this is not specific to any particular version of RHEL.
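
The workaround in the list above can be sketched as a small helper that finds interfaces whose current VF count differs from the expected one and builds the matching ifdown/ifup commands. This is an illustration under assumptions, not the fix shipped in puppet-tripleo: the function names are hypothetical, the expected counts are the ones observed after the reboot in this report, and the commands are only printed, never executed.

```python
from pathlib import Path


def interfaces_needing_reset(expected, sysfs_root="/sys/class/net"):
    """Return the interfaces whose sriov_numvfs differs from the expected count."""
    stale = []
    for name, want in sorted(expected.items()):
        path = Path(sysfs_root) / name / "device" / "sriov_numvfs"
        try:
            have = int(path.read_text().strip())
        except (OSError, ValueError):
            have = None  # missing or unreadable file counts as a mismatch
        if have != want:
            stale.append(name)
    return stale


def workaround_commands(interfaces):
    """Build ifdown/ifup command pairs for the interfaces; nothing is executed."""
    commands = []
    for name in interfaces:
        commands.append(["ifdown", name])
        commands.append(["ifup", name])
    return commands


if __name__ == "__main__":
    # Expected counts as seen after the reboot in this report.
    expected = {"ens2f0": 5, "ens2f1": 5}
    for command in workaround_commands(interfaces_needing_reset(expected)):
        print(" ".join(command))
```

Returning the commands as argument lists keeps the sketch side-effect free; an operator can review the output before running anything on a compute node.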

Comment 18 Assaf Muller 2017-08-25 19:34:55 UTC
Setting blocker flag to '+' following discussion on rhos-pgm ML.

Comment 20 Yariv 2017-08-29 19:07:39 UTC
Verified
On NFV-CI for RHOS 10

Comment 22 errata-xmlrpc 2017-09-06 17:13:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2654

