Bug 1471531 - [RFE] Add TripleO validation of VLANs using introspected LLDP data
Summary: [RFE] Add TripleO validation of VLANs using introspected LLDP data
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-validations
Version: 13.0 (Queens)
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: Upstream M1
Target Release: 13.0 (Queens)
Assignee: Bob Fournier
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On: 1554248
Blocks:
Reported: 2017-07-16 17:22 UTC by Bob Fournier
Modified: 2018-06-27 13:33 UTC (History)

Fixed In Version: openstack-tripleo-validations-8.1.1-0.20180119231917.2ff3c79.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:32:14 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
OpenStack gerrit 512375 None master: MERGED tripleo-validations: Add validation to check VLANs against switch info in Ironic introspection data (I5adeefea1534db0ede6... 2018-02-07 14:23:51 UTC
Red Hat Product Errata RHEA-2018:2086 None None None 2018-06-27 13:33:21 UTC

Internal Links: 1337769

Description Bob Fournier 2017-07-16 17:22:29 UTC
Description of problem:
TripleO Heat Templates (THT) can define VLANs per NIC for each role (controller, compute, etc.) on isolated networks. The problem is that the network switches the NICs are attached to may not have been configured for these VLANs. As of OSP-11, LLDP data for baremetal nodes is captured during Ironic inspection, and it may include VLAN information for the attached switch ports. A pre-deployment validation can check this VLAN information to ensure that the VLANs configured in the THT NIC config files are also configured on the switch.

During this validation, NIC aliases must be converted to real NIC names, similar to what os-net-config does, so that the NICs configured in the NIC config files can be mapped to the actual NIC names in the introspected data.
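
A minimal Python sketch of the alias conversion follows. It assumes a simplified ordering rule (embedded NICs whose names start with em, eth, or eno come first, then the rest, each group in natural sort order) and ignores link state and explicit mapping files, which os-net-config also takes into account. The function names are hypothetical and are not part of os-net-config or tripleo-validations.

```python
import re

# Assumed prefixes for embedded (on-board) NICs; illustrative only.
EMBEDDED_PREFIXES = ("em", "eth", "eno")


def natural_key(name):
    """Split 'em10' into ['em', 10, ''] so that em2 sorts before em10."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


def build_alias_map(interface_names):
    """Map nic1, nic2, ... to real NIC names using the assumed ordering."""
    embedded = sorted(
        (n for n in interface_names if n.startswith(EMBEDDED_PREFIXES)),
        key=natural_key)
    others = sorted(
        (n for n in interface_names if not n.startswith(EMBEDDED_PREFIXES)),
        key=natural_key)
    ordered = embedded + others
    return {"nic%d" % (index + 1): name for index, name in enumerate(ordered)}


# Interface names as they might appear in introspection data.
alias_map = build_alias_map(["p2p2", "em2", "p2p1", "em1"])
for alias in sorted(alias_map):
    print(alias, "->", alias_map[alias])
```

With a map like this, a VLAN defined on nic2 in a NIC config file would be compared against the LLDP data recorded for em2.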

Since roles can use different VLANs (for example, the controller may use more networks than compute and therefore more VLANs), the challenge is to map roles to Ironic nodes in the pre-deployment phase. This mapping may not be available during pre-deployment validation, so it may be necessary to check only that ALL configured VLANs per switch port are available on the switches. In other words, if a role in THT has eth0 with VLANs 10, 11, and 12, all Ironic nodes must have LLDP data indicating that the switch port attached to eth0 has VLANs 10, 11, and 12. If roles can be mapped to Ironic nodes in this phase, then it will be possible to check, for example, that all controller nodes have a switch port mapped to eth0 with VLANs 10, 11, and 12.
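
The conservative check described above (every configured VLAN must be present on every node) could look roughly like the following Python sketch. The data layout, the node names, and the function name are assumptions made for illustration; this is not the code of the actual validation.

```python
def find_missing_vlans(required, lldp_by_node):
    """Return (node, nic, vlan) tuples for VLANs the switch does not report.

    required:     {nic_name: set of VLAN IDs from the NIC config files}
    lldp_by_node: {node_name: {nic_name: set of VLAN IDs seen via LLDP}}
    """
    missing = []
    for node, ports in sorted(lldp_by_node.items()):
        for nic, vlans in sorted(required.items()):
            # A NIC with no LLDP data is treated as advertising no VLANs.
            advertised = ports.get(nic, set())
            for vlan in sorted(vlans - advertised):
                missing.append((node, nic, vlan))
    return missing


# eth0 with VLANs 10, 11, and 12, as in the example above.
required = {"eth0": {10, 11, 12}}
lldp_by_node = {
    "node-a": {"eth0": {10, 11, 12}},  # hypothetical node, fully configured
    "node-b": {"eth0": {10, 12}},      # hypothetical node, VLAN 11 missing
}

problems = find_missing_vlans(required, lldp_by_node)
for node, nic, vlan in problems:
    print("VLAN ID %d not reported for %s on %s" % (vlan, nic, node))
```

If roles could be mapped to nodes, the same function could be called once per role with only that role's nodes in lldp_by_node.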

Version-Release number of selected component (if applicable):
OSP-11


How reproducible:
Always


Steps to Reproduce:
1.  Configure the network switch with VLANs that differ from those in the THT NIC config files
2.  Run the deployment

Actual results:
The deployment may eventually fail, depending on which VLANs are incorrect.


Expected results:
Pre-deployment validation detects that the switch is incorrectly configured and returns an error.


Additional info:

Comment 4 Bob Fournier 2018-04-13 16:42:31 UTC
Verification is pending a fix for https://bugzilla.redhat.com/show_bug.cgi?id=1554248

Comment 5 Bob Fournier 2018-04-30 18:52:54 UTC
Using:
openstack-tripleo-validations-8.4.0-1.el7ost.noarch

Verified by including patch https://review.openstack.org/#/c/563969 which has merged to stable/queens.

$ openstack action execution run tripleo.plan.create_container '{"container":"my-templates"}'

$ swift upload my-templates /home/stack/templates/

$ openstack workflow execution create tripleo.plan_management.v1.create_deployment_plan '{"container":"my-templates"}'

$ export TRIPLEO_PLAN_NAME=my-templates

Using a network switch running LLDP, attached to 2 nodes, with the following VLANs configured on the switch:
$ openstack baremetal introspection interface list host2
+-----------+-------------------+------------------------------+-------------------+----------------+
| Interface | MAC Address       | Switch Port VLAN IDs         | Switch Chassis ID | Switch Port ID |
+-----------+-------------------+------------------------------+-------------------+----------------+
| em1       | b0:83:fe:c6:63:86 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-0/0/25      |
| em2       | b0:83:fe:c6:63:87 | [101, 104, 2001, 2002, 2003] | 64:64:9b:32:f3:00 | ge-1/0/25      |
| p2p2      | a0:36:9f:52:7f:b3 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-1/0/26      |
| p2p1      | a0:36:9f:52:7f:b2 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-0/0/26      |
+-----------+-------------------+------------------------------+-------------------+----------------+
$ openstack baremetal introspection interface list host3
+-----------+-------------------+------------------------------+-------------------+----------------+
| Interface | MAC Address       | Switch Port VLAN IDs         | Switch Chassis ID | Switch Port ID |
+-----------+-------------------+------------------------------+-------------------+----------------+
| em1       | b0:83:fe:c6:53:21 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-0/0/23      |
| em2       | b0:83:fe:c6:53:22 | [101, 104, 2001, 2002, 2003] | 64:64:9b:32:f3:00 | ge-1/0/23      |
| p2p2      | a0:36:9f:52:7e:d9 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-1/0/24      |
| p2p1      | a0:36:9f:52:7e:d8 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-0/0/24      |
+-----------+-------------------+------------------------------+-------------------+----------------+
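
For a quick manual cross-check, the table output above can be parsed into per-interface VLAN sets. This is a rough Python sketch that assumes the exact column layout shown (interface name in the first column, VLAN list in the third); the function name is hypothetical, and the rows are copied from the host2 output.

```python
import ast

HOST2_TABLE = """\
+-----------+-------------------+------------------------------+-------------------+----------------+
| Interface | MAC Address       | Switch Port VLAN IDs         | Switch Chassis ID | Switch Port ID |
+-----------+-------------------+------------------------------+-------------------+----------------+
| em1       | b0:83:fe:c6:63:86 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-0/0/25      |
| em2       | b0:83:fe:c6:63:87 | [101, 104, 2001, 2002, 2003] | 64:64:9b:32:f3:00 | ge-1/0/25      |
| p2p2      | a0:36:9f:52:7f:b3 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-1/0/26      |
| p2p1      | a0:36:9f:52:7f:b2 | [101, 102, 104, 2001, 2002]  | 64:64:9b:32:f3:00 | ge-0/0/26      |
+-----------+-------------------+------------------------------+-------------------+----------------+
"""


def parse_vlans(table_text):
    """Return {interface: set of VLAN IDs} from the table text."""
    vlans = {}
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip the +----+ border lines
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if cells[0] == "Interface":
            continue  # skip the header row
        # The VLAN column is printed as a Python-style list literal.
        vlans[cells[0]] = set(ast.literal_eval(cells[2]))
    return vlans


vlans = parse_vlans(HOST2_TABLE)
common = set.intersection(*vlans.values())
print("VLANs available on every port:", sorted(common))
```

On this data, em2 differs from the other ports: it lacks VLAN 102 and additionally reports VLAN 2003.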

Verified passing case:

$ ansible-playbook -i /usr/bin/tripleo-ansible-inventory /usr/share/openstack-tripleo-validations/validations/switch-vlans.yaml

PLAY [undercloud] **********************************************************************************************

TASK [Gathering Facts] *****************************************************************************************
ok: [localhost]

TASK [Get Ironic Inspector swift auth_url] *********************************************************************
ok: [localhost]

TASK [Get Ironic Inspector swift password] *********************************************************************
ok: [localhost]

TASK [Check that switch vlans are present if used in nic-config files] *****************************************
ok: [localhost]

PLAY RECAP *****************************************************************************************************
localhost                  : ok=4    changed=0    unreachable=0    failed=0   


Verified failing case:
Templates changed to use VLANs that the switch does not report via LLDP.

$ ansible-playbook -i /usr/bin/tripleo-ansible-inventory /usr/share/openstack-tripleo-validations/validations/switch-vlans.yaml

PLAY [undercloud] **********************************************************************************************

TASK [Gathering Facts] *****************************************************************************************
ok: [localhost]

TASK [Get Ironic Inspector swift auth_url] *********************************************************************
ok: [localhost]

TASK [Get Ironic Inspector swift password] *********************************************************************
ok: [localhost]

TASK [Check that switch vlans are present if used in nic-config files] *****************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "VLAN ID 777 not on attached switch\nVLAN ID 888 not on attached switch\nVLAN ID 1010 not on attached switch\nVLAN ID 2020 not on attached switch\nVLAN ID 777 not on attached switch\nVLAN ID 888 not on attached switch\nVLAN ID 999 not on attached switch\nVLAN ID 1010 not on attached switch\nVLAN ID 2030 not on attached switch"}
 [WARNING]: Could not create retry file '/usr/share/openstack-tripleo-validations/validations/switch-
vlans.retry'.         [Errno 13] Permission denied: u'/usr/share/openstack-tripleo-validations/validations
/switch-vlans.retry'


PLAY RECAP *****************************************************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=1
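
The failure message above repeats some VLAN IDs. A small Python sketch such as the following can reduce it to the unique set of missing VLANs; it assumes the message keeps the "VLAN ID <n> not on attached switch" wording shown in this output, and the function name is hypothetical.

```python
import re

# The "msg" value from the failed task above.
MSG = ("VLAN ID 777 not on attached switch\n"
       "VLAN ID 888 not on attached switch\n"
       "VLAN ID 1010 not on attached switch\n"
       "VLAN ID 2020 not on attached switch\n"
       "VLAN ID 777 not on attached switch\n"
       "VLAN ID 888 not on attached switch\n"
       "VLAN ID 999 not on attached switch\n"
       "VLAN ID 1010 not on attached switch\n"
       "VLAN ID 2030 not on attached switch")


def missing_vlan_ids(message):
    """Return the sorted, de-duplicated VLAN IDs named in the message."""
    found = re.findall(r"VLAN ID (\d+) not on attached switch", message)
    return sorted({int(vlan) for vlan in found})


unique_missing = missing_vlan_ids(MSG)
print(unique_missing)
```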

Comment 7 errata-xmlrpc 2018-06-27 13:32:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

