Bug 1706862

Summary: [OSP15] nova_wait_for_compute_service.py Failing When Supplying Multiple Options In NovaPCIPassthrough
Product: Red Hat OpenStack Reporter: Vadim Khitrin <vkhitrin>
Component: openstack-tripleo-heat-templates Assignee: Martin Schuppert <mschuppe>
Status: CLOSED ERRATA QA Contact: Joe H. Rahme <jhakimra>
Severity: medium Docs Contact:
Priority: medium    
Version: 15.0 (Stein) CC: aschultz, fbaudin, lyarwood, mbooth, mburns, michele
Target Milestone: rc Keywords: Triaged
Target Release: 15.0 (Stein)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-10.5.1-0.20190511010414.fbcd4d0.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-09-21 11:21:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vadim Khitrin 2019-05-06 12:06:01 UTC
Description of problem:

During deployment of Compute nodes, when 'NovaPCIPassthrough' is set with multiple entries, for example:
  NovaPCIPassthrough:
    - devname: "eth10"
      trusted: "true"
      physical_network: "sriov-1"
    - devname: "eth11"
      trusted: "true"
      physical_network: "sriov-2"

The following entries will be generated in nova.conf:
passthrough_whitelist={"devname":"eth10","physical_network":"sriov-1","trusted":"true"}
passthrough_whitelist={"devname":"eth11","physical_network":"sriov-2","trusted":"true"}

These duplicate entries cause nova_wait_for_compute_service.py to fail with the following exception:
configparser.DuplicateOptionError: While reading from '/etc/nova/nova.conf' [line 8669]: option 'passthrough_whitelist' in section 'pci' already exists
Log of script with error: http://paste.openstack.org/show/750597/
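
The failure can be reproduced outside of a deployment. The following is a minimal sketch, assuming the script reads nova.conf with a default configparser.ConfigParser (which is strict in Python 3); the try_parse helper and the in-memory config string are made up for illustration and are not taken from the script itself:

```python
import configparser

# Two repeated options in one section, mirroring the generated nova.conf lines.
NOVA_CONF = """\
[pci]
passthrough_whitelist={"devname":"eth10","physical_network":"sriov-1","trusted":"true"}
passthrough_whitelist={"devname":"eth11","physical_network":"sriov-2","trusted":"true"}
"""


def try_parse(text):
    """Hypothetical helper: return the name of the parse error, or None."""
    parser = configparser.ConfigParser()  # strict=True is the Python 3 default
    try:
        parser.read_string(text)
    except configparser.DuplicateOptionError as exc:
        return type(exc).__name__
    return None


result = try_parse(NOVA_CONF)
print(result)
```

A strict parser rejects the second passthrough_whitelist line as soon as it is read, so the content of the values does not matter.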

When one of the entries is omitted, or both entries are supplied as one single list, the script passes. For example:
passthrough_whitelist=[{"devname":"eth10","physical_network":"sriov-1","trusted":"true"},
                       {"devname":"eth12","physical_network":"sriov-2","trusted":"true"}]

Log of script passing with the change above: http://paste.openstack.org/show/750596/
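
A rough illustration of why the single-list form gets through: configparser treats the indented second line as a continuation of the same option, so only one passthrough_whitelist key is seen. This sketch assumes a plain configparser.ConfigParser and decodes the value with json only to show that both devices survive; it is not the script's own code:

```python
import configparser
import json

# Same layout as the single-list example above, with an indented continuation line.
SINGLE_LIST_CONF = """\
[pci]
passthrough_whitelist=[{"devname":"eth10","physical_network":"sriov-1","trusted":"true"},
                       {"devname":"eth12","physical_network":"sriov-2","trusted":"true"}]
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SINGLE_LIST_CONF)  # no DuplicateOptionError: one option only

# The continuation line is joined to the first line with a newline.
raw_value = parser.get("pci", "passthrough_whitelist")
devices = json.loads(raw_value)
devnames = [device["devname"] for device in devices]
print(devnames)
```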

This doesn't seem to be an issue in TripleO itself: the Heat resource output generated by TripleO is a single list containing the required values:

u'nova::compute::pci::passthrough': u'[{"devname": "eth10", "physical_network": "sriov-1", "trusted": "true"}, {"devname": "eth11", "physical_network": "sriov-2", "trusted": "true"}]'

This is probably caused by puppet-nova.
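
To make the relationship between the two forms concrete, here is a small sketch of how a JSON list like the one above could be turned into one nova.conf line per entry. It is only an illustration of the transformation, not the puppet-nova implementation; render_repeated is a hypothetical helper:

```python
import json

# Value as shown in the Heat resource output above (a JSON-encoded list).
HIERA_VALUE = (
    '[{"devname": "eth10", "physical_network": "sriov-1", "trusted": "true"}, '
    '{"devname": "eth11", "physical_network": "sriov-2", "trusted": "true"}]'
)


def render_repeated(option, json_list):
    """Hypothetical helper: emit one 'option=<json>' line per list entry."""
    entries = json.loads(json_list)
    return [
        option + "=" + json.dumps(entry, sort_keys=True, separators=(",", ":"))
        for entry in entries
    ]


lines = render_repeated("passthrough_whitelist", HIERA_VALUE)
for line in lines:
    print(line)
```

Each list entry ends up as its own passthrough_whitelist line, which is the repeated-option layout that a strict INI parser rejects.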

In the meantime, it may be worth adding a warning so that users know they might hit this scenario when deploying.

Version-Release number of selected component (if applicable):
rpm -qa | grep tripleo
ansible-role-tripleo-modify-image-1.0.1-0.20190422122515.f1dfdc6.el8ost.noarch
openstack-tripleo-heat-templates-10.5.1-0.20190426040357.0b61dd0.el8ost.noarch
puppet-tripleo-10.4.2-0.20190426000346.bc825d0.el8ost.noarch
openstack-tripleo-common-containers-10.7.1-0.20190426083235.1988c18.el8ost.noarch
python3-tripleoclient-heat-installer-11.4.1-0.20190424070351.41fa2fc.el8ost.noarch
openstack-tripleo-validations-10.4.1-0.20190426070346.e774685.el8ost.noarch
python3-tripleo-common-10.7.1-0.20190426083235.1988c18.el8ost.noarch
python3-tripleoclient-11.4.1-0.20190424070351.41fa2fc.el8ost.noarch
openstack-tripleo-image-elements-10.4.1-0.20190426080346.7efbd4c.el8ost.noarch
openstack-tripleo-common-10.7.1-0.20190426083235.1988c18.el8ost.noarch
ansible-tripleo-ipsec-9.1.1-0.20190422122014.8c1fdab.el8ost.noarch
openstack-tripleo-puppet-elements-10.3.1-0.20190426070355.a359301.el8ost.noarch


How reproducible:
always

Steps to Reproduce:
1. Attempt to deploy the Overcloud with multiple NovaPCIPassthrough entries

Actual results:
Deployment fails

Expected results:
Deployment passes

Additional info:

Comment 1 Michele Baldessari 2019-05-08 17:12:23 UTC
I think there are two slightly separate issues here:
A) Given the NovaPCIPassthrough set as in the BZ description, is the obtained output correct or does THT/puppet-nova need fixing?

passthrough_whitelist={"devname":"eth10","physical_network":"sriov-1","trusted":"true"}
passthrough_whitelist={"devname":"eth11","physical_network":"sriov-2","trusted":"true"}

B) nova_wait_for_compute_service.py should not barf with duplicate-key errors no matter what.
That is because nova.conf does contain options which may legitimately appear more than once in the same INI section. For example, the alias option in the [pci] section can be repeated according to the stock config file:
"""
# * Supports multiple aliases by repeating the option (not by specifying
#   a list value)::
#
#     alias = {
#       "name": "QuickAssist-1",
#       "product_id": "0443",
#       "vendor_id": "8086",
#       "device_type": "type-PCI",
#       "numa_policy": "required"
#     }
#     alias = {
#       "name": "QuickAssist-2",
#       "product_id": "0444",
#       "vendor_id": "8086",
#       "device_type": "type-PCI",
#       "numa_policy": "required"
#     }
#  (multi valued)
"""

So, while I make no claim about A) (it might or might not be correct, I'll let the compute folks comment on that), I think B) needs fixing no matter what. I'll use this BZ to push a fix for at least B).
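
For B), two ways a reader of nova.conf could tolerate repeated options are sketched below. This is an illustration only and not necessarily the change pushed for this BZ. The first uses configparser with strict=False, where a later duplicate silently overrides the earlier one; the second is a hypothetical hand-rolled helper, collect_multi_valued, that keeps every value. The shortened alias values are made up for the example:

```python
import configparser
from collections import defaultdict

CONF = """\
[pci]
alias={"name":"QuickAssist-1","product_id":"0443","vendor_id":"8086"}
alias={"name":"QuickAssist-2","product_id":"0444","vendor_id":"8086"}
"""

# strict=False tolerates repeated options; the last value overrides earlier ones.
lenient = configparser.ConfigParser(strict=False, interpolation=None)
lenient.read_string(CONF)
last_alias = lenient.get("pci", "alias")


def collect_multi_valued(text):
    """Hypothetical helper: map (section, option) to every value seen.

    Deliberately simple: ignores continuation lines and inline comments.
    """
    values = defaultdict(list)
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section is not None and "=" in line:
            key, _, value = line.partition("=")
            values[(section, key.strip())].append(value.strip())
    return dict(values)


all_aliases = collect_multi_valued(CONF)[("pci", "alias")]
print(last_alias)
print(len(all_aliases))
```

strict=False is enough when the script only needs unrelated options from the file; collecting every value matters only if the multi-valued options themselves are consumed.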

Comment 9 errata-xmlrpc 2019-09-21 11:21:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811