Bug 1888712 - Worker nodes do not come up on a baremetal IPI deployment with control plane network configured on a vlan on top of bond interface due to Pending CSRs
Summary: Worker nodes do not come up on a baremetal IPI deployment with control plane ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.8.0
Assignee: Bob Fournier
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks: 1895954
 
Reported: 2020-10-15 14:35 UTC by Marius Cornea
Modified: 2024-03-25 16:44 UTC (History)
15 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Ironic Python Agent now reports vlan interfaces, including their IP addresses, in the list of interfaces during introspection.
Reason: To generate a CSR for an interface, its IP address must be provided. The addresses for the vlan interfaces in the introspection report can be used.
Result: A CSR can be obtained for all interfaces, including vlan interfaces.
Clone Of:
Clones: 1895954 (view as bug list)
Environment:
Last Closed: 2021-07-27 22:33:58 UTC
Target Upstream Version:
Embargoed:
augol: needinfo-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift baremetal-operator pull 142 0 None open Merge upstream 2021-04-06 2021-04-08 11:17:10 UTC
Github openshift ironic-image pull 154 0 None open Bug 1888712: Support for including vlan interfaces in introspection r… 2021-03-09 20:29:08 UTC
OpenStack Storyboard 2008298 0 None None None 2020-11-04 02:48:25 UTC
OpenStack gerrit 760570 0 None MERGED Bring up VLAN interfaces and include in introspection report 2021-02-18 12:03:05 UTC
OpenStack gerrit 761177 0 None MERGED Add support for vlan interfaces in dhcp-all-interfaces.sh 2021-02-18 12:03:05 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:34:13 UTC

Description Marius Cornea 2020-10-15 14:35:55 UTC
Description of problem:

Worker nodes do not come up on a baremetal IPI deployment with control plane network configured on a vlan on top of bond interface due to Pending CSRs.

Nodes have 2 NICs grouped in a bond, native vlan is set for the provisioning network and the control plane is configured over a tagged vlan. 

Version-Release number of selected component (if applicable):
4.6.0-0.ci.test-2020-10-15-100828-ci-ln-x9bt472 (including fix for BZ#18846280)

How reproducible:
100%

Steps to Reproduce:

1. Customize the OpenStack image by following https://access.redhat.com/solutions/5460671

2. Customize master and worker ign files with the following custom ignition file:

{
  "ignition": {
    "version": "2.3.0"
  },
  "storage": {
    "files": [
      {
        "path": "/etc/sysconfig/network-scripts/ifcfg-enp4s0",
        "filesystem": "root",
        "mode": 436,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,REVWSUNFPWVucDRzMApCT09UUFJPVE89bm9uZQpPTkJPT1Q9eWVzCk1BU1RFUj1ib25kMApTTEFWRT15ZXMK"
        }
      },
      {
        "path": "/etc/sysconfig/network-scripts/ifcfg-enp5s0",
        "filesystem": "root",
        "mode": 436,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,REVWSUNFPWVucDVzMApCT09UUFJPVE89bm9uZQpPTkJPT1Q9eWVzCk1BU1RFUj1ib25kMApTTEFWRT15ZXMK"
        }
      },
      {
        "path": "/etc/sysconfig/network-scripts/ifcfg-bond0",
        "filesystem": "root",
        "mode": 436,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,Qk9ORElOR19PUFRTPSJkb3duZGVsYXk9MCBsYWNwX3JhdGU9ZmFzdCBtaWltb249MTAwIG1vZGU9ODAyLjNhZCB1cGRlbGF5PTAiClRZUEU9Qm9uZApCT05ESU5HX01BU1RFUj15ZXMKQk9PVFBST1RPPWRoY3AKTkFNRT1ib25kMApERVZJQ0U9Ym9uZDAKT05CT09UPXllcwo="
        }
      },
      {
        "path": "/etc/sysconfig/network-scripts/ifcfg-bond0.404",
        "filesystem": "root",
        "mode": 436,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,REVWSUNFPWJvbmQwLjQwNApCT09UUFJPVE89ZGhjcApPTkJPT1Q9eWVzClZMQU49eWVz"
        }
      }
    ]
  }
}
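
The base64 payloads in the data URLs above are plain ifcfg files. As a quick sanity check, decoding the first one (the enp4s0 payload, copied verbatim from the ignition config above) shows a standard bond-slave configuration:

```shell
# Decode the ifcfg-enp4s0 contents embedded in the ignition config above.
echo 'REVWSUNFPWVucDRzMApCT09UUFJPVE89bm9uZQpPTkJPT1Q9eWVzCk1BU1RFUj1ib25kMApTTEFWRT15ZXMK' \
  | base64 -d
# Prints:
# DEVICE=enp4s0
# BOOTPROTO=none
# ONBOOT=yes
# MASTER=bond0
# SLAVE=yes
```

The other three payloads decode the same way to the bond0, enp5s0, and bond0.404 ifcfg files.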


3. Deploy via baremetal IPI flow

Actual results:

Worker nodes get provisioned but they do not show up as nodes and deployment eventually times out.

Expected results:
Deployment succeeds.

Additional info:

Attaching must-gather.

Comment 2 Marius Cornea 2020-10-15 14:58:10 UTC
Capturing worker nodes ip addresses:

[core@worker-0-0 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:75:0d:99 brd ff:ff:ff:ff:ff:ff
3: enp5s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:75:0d:99 brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:75:0d:99 brd ff:ff:ff:ff:ff:ff
    inet 172.22.0.99/24 brd 172.22.0.255 scope global dynamic noprefixroute bond0
       valid_lft 3035sec preferred_lft 3035sec
    inet6 fe80::5054:ff:fe75:d99/64 scope link 
       valid_lft forever preferred_lft forever
6: bond0.404@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 52:54:00:75:0d:99 brd ff:ff:ff:ff:ff:ff
7: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9e:2f:b2:22:0d:81 brd ff:ff:ff:ff:ff:ff
8: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 52:54:00:75:0d:99 brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.107/24 brd 192.168.123.255 scope global dynamic noprefixroute br-ex
       valid_lft 3038sec preferred_lft 3038sec
    inet6 fe80::9d09:2d4e:7c2f:e4b/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[core@worker-0-0 ~]$

Comment 3 Marius Cornea 2020-10-15 15:05:44 UTC
worker-0-0 introspection data:

oc get bmh openshift-worker-0-0 -n openshift-machine-api -o json | jq -r .status.hardware.nics
[
  {
    "ip": "172.22.0.16",
    "mac": "52:54:00:d9:a0:fc",
    "model": "0x1af4 0x0001",
    "name": "enp5s0",
    "pxe": false,
    "speedGbps": 0,
    "vlanId": 0
  },
  {
    "ip": "172.22.0.99",
    "mac": "52:54:00:75:0d:99",
    "model": "0x1af4 0x0001",
    "name": "enp4s0",
    "pxe": true,
    "speedGbps": 0,
    "vlanId": 0
  }
]

Comment 4 Marius Cornea 2020-10-15 15:28:29 UTC
After manually approving the CSR and comparing the machine with the node, we can see that the machine's IP addresses and hostname do not match the node's.

[kni@provisionhost-0-0 ~]$ oc -n openshift-machine-api get machine/ocp-edge-cluster-0-57w7x-worker-0-wqfls -o json | jq .status
{
  "addresses": [
    {
      "address": "172.22.0.16",
      "type": "InternalIP"
    },
    {
      "address": "172.22.0.99",
      "type": "InternalIP"
    },
    {
      "address": "localhost.localdomain",
      "type": "Hostname"
    },
    {
      "address": "localhost.localdomain",
      "type": "InternalDNS"
    }
  ],
  "lastUpdated": "2020-10-15T14:43:10Z",
  "phase": "Provisioned"
}


[kni@provisionhost-0-0 ~]$ oc -n openshift-machine-api get node/worker-0-0 -o json | jq .status.addresses
[
  {
    "address": "192.168.123.107",
    "type": "InternalIP"
  },
  {
    "address": "worker-0-0",
    "type": "Hostname"
  }
]

Comment 6 Steven Hardy 2020-10-15 15:46:48 UTC
(In reply to Marius Cornea from comment #3)
> worker-0-0 introspection data:
> 
> oc get bmh openshift-worker-0-0 -n openshift-machine-api -o json | jq -r
> .status.hardware.nics
> [
>   {
>     "ip": "172.22.0.16",
>     "mac": "52:54:00:d9:a0:fc",
>     "model": "0x1af4 0x0001",
>     "name": "enp5s0",
>     "pxe": false,
>     "speedGbps": 0,
>     "vlanId": 0
>   },
>   {
>     "ip": "172.22.0.99",
>     "mac": "52:54:00:75:0d:99",
>     "model": "0x1af4 0x0001",
>     "name": "enp4s0",
>     "pxe": true,
>     "speedGbps": 0,
>     "vlanId": 0
>   }
> ]

So yeah, here we see both nics report an IP from the provisioning network (native vlan), but there isn't any IP from the tagged network used for the control plane.

We need to figure out how to discover that data during inspection, but as a short-term workaround it may be possible to manually add the data to each BMH via a status annotation, after which auto-approval would work as expected?

That would require some manual intervention pre-deployment of the workers, but would mean that subsequent CSR approval works as expected.
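
To illustrate the data manipulation this workaround implies, here is a rough sketch of the jq step only: appending the tagged interface to the nics list that introspection reported, so the control-plane address is visible for CSR approval. The interface name and IP are the ones observed on worker-0-0 elsewhere in this bug; actually applying the edited status back to the BMH is cluster- and operator-version-specific, and is not shown.

```shell
# Introspection only reported the native-vlan nics; append an entry for the
# tagged control-plane interface (values taken from other comments in this bug).
nics='[{"ip": "172.22.0.99", "mac": "52:54:00:75:0d:99", "name": "enp4s0"}]'
echo "$nics" | jq '. + [{"ip": "192.168.123.107", "mac": "52:54:00:75:0d:99", "name": "bond0.404"}]'
```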

Comment 7 Steven Hardy 2020-10-22 16:07:24 UTC
So we had some discussion around potential solutions which I'll attempt to summarize:

- The IPA ramdisk can collect LLDP data which would (depending on the switch configuration) potentially tell us the connected VLANs for each nic
- This is enabled via the "ipa-collect-lldp" flag, which is already enabled in the OCP/metal3 ironic deployment ref https://github.com/openshift/ironic-image/blob/master/inspector.ipxe#L8
- Currently the ipa-inspection-dhcp-all-interfaces=1 does not consider any vlan information collected via LLDP

To solve the reported problem, we need the IPA ramdisk to detect IPs on all connected vlans, so that they are appended to the BMH .status.hardware.nics described in previous comments.

A potential solution is for the dhcp-all-interfaces element (or something similar) to consume the LLDP data as part of the discovery process, so that we can detect DHCP IPs available via tagged networks and not only the native vlan for each physical nic.

One other possibility would be to attempt to DHCP over a range of all possible vlans, but it seems likely this would impose an unacceptable time delay?

A final possibility would be some user-interface that allows us to specify either a smaller range, or a specific vlan, but in the case of a specific vlan there is the potential problem that we can't currently control the IPA kernel args on a per-host basis, and it's possible different pools of nodes could be on different vlans.

Overall it seems like the most elegant solution is to consume the LLDP data, but further investigation is needed to understand the effort involved.

Also, if we go with the LLDP solution, we'll need to find a way to enable VM based testing, and there still may be a requirement for some manual workaround in the case where switch model/configuration prevent reliance on LLDP.

Comment 10 Steven Hardy 2020-10-23 09:45:51 UTC
>   "ignition": {
>    "version": "2.3.0"

Note that from 4.6 onwards this should be 3.1.0, and the schema for creating files is slightly different; in particular, the "filesystem" option is no longer required/supported.
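
For illustration, here is one of the files from the reproducer re-expressed in the 3.1.0 schema (same base64 payload, with the "filesystem" key dropped) — a sketch of the schema change, not a full replacement config:

```json
{
  "ignition": {
    "version": "3.1.0"
  },
  "storage": {
    "files": [
      {
        "path": "/etc/sysconfig/network-scripts/ifcfg-bond0.404",
        "mode": 436,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,REVWSUNFPWJvbmQwLjQwNApCT09UUFJPVE89ZGhjcApPTkJPT1Q9eWVzClZMQU49eWVz"
        }
      }
    ]
  }
}
```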

Comment 12 Bob Fournier 2020-10-26 12:04:45 UTC
Investigating adding an IPA collector to bring up vlans and run dhclient. As a short-term hack/workaround it may be possible to manually approve the CSRs, according to Steve.

Comment 14 Tomas Sedovic 2020-11-23 11:23:09 UTC
Adding the UpcomingSprint keywords. The code is up for review but still being iterated on. We plan to implement it in the upcoming release.

Comment 15 Bob Fournier 2020-11-30 18:20:25 UTC
Hi Chris - you added Case 02787794 to this bug, so it is being treated as a blocker; however, I don't think this bug is related to that case. That case is strictly about setting up a bond, while this bug is about getting the VLAN IP used for the CSR, and it depends on other fixes; see https://issues.redhat.com/browse/KNIDEPLOY-1778.

Can we remove Case 02787794 from this bug?

Comment 16 Edu Alcaniz 2020-12-10 15:15:48 UTC
I detached the Bugzilla from the case.

Comment 18 Bob Fournier 2021-01-14 14:02:37 UTC
IPA images have been built and new images are available here https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1458388.

Moving to MODIFIED

Comment 20 Manuel Rodriguez 2021-01-20 14:29:45 UTC
Hi Red Hat team

I'm interested in this BZ since we are hitting the same issue described here in a current consulting engagement; I have a Red Hat case opened to follow up. I just wanted to add a few notes from our tests:

- Updating the entries (InternalIP, InternalDNS, hostname) in the machine or baremetalhost resources doesn't work; as soon as we edit the status section, the resource is not updated, and it doesn't complain either.
- Manual approval seems to work, but we noticed that the machines remain in "Provisioned" status and never move to Running (oc get machines -n openshift-machine-api).
- The machines never get the "nodeRef" field, I think because the baremetalhost couldn't grab the IP and hostname, and hence the certs are not auto-approved.
- We don't know the consequences of leaving the machines in "Provisioned" status and without the correct hostname and InternalIP. All seems to work, but we wonder if we'll encounter issues in future Day-2 operations.

I pulled the IPA images you referenced and injected them into my current lab with an OCP 4.6.12 deployment. I did it in a very ugly fashion, but it was just to test.
I confirmed that using these images and adding something like ipa-enable-vlan-interfaces=enp2s0.1000 to the PXE files prepares the interface to use a vlan and brings it up to detect DHCP.

Then I'm able to see the IP and InternalIP in the baremetalhost and machine resources. In the example below I do not have a trunk on vlan 1000; I just wanted to confirm the information is reported back in the respective resources.

$ oc get bmh worker-0 -o yaml | grep 1000
    - ip: fe80::5054:ff:fe99:30%enp2s0.1000
      name: enp2s0.1000

$ oc get machines ocp4-f2k8b-worker-0-4hcg7 -o yaml | grep -A1 1000
  - address: fe80::5054:ff:fe99:30%enp2s0.1000
    type: InternalIP

Do you have any references for how this procedure will be enabled in the deployment, and when it will be officially released? I see in the BZ info above that the target is 4.7; I just wanted to confirm.
We want to implement this, so if there is any info you can share, or help needed with testing, please contact me.

Thanks
Manuel

Comment 21 Manuel Rodriguez 2021-01-25 20:44:34 UTC
Hi, 

We found a workaround for this issue. Another challenge we are facing is that our switches do not support LACP fallback, so we cannot perform dhclient requests over a physical NIC with a VLAN. We need to build a bond with LACP in mode 802.3ad to be able to talk to the DHCP server.

In short, the workaround consists of updating the missing status data in the baremetalhost resource via the API. This information is replicated to the machine resources and the CSRs get auto-approved; all elements in the machine resource are present, and the installation completes without issues and without having to approve the CSRs manually.

I prepared a solution article to get this documented; it's just a draft now, and I'll include the link when it's approved.

Thanks.

Comment 22 Bob Fournier 2021-01-27 21:09:01 UTC
Thanks Manuel. Yes, this fix to add the vlan interfaces to the introspection report will be in 4.7.

Comment 23 Bob Fournier 2021-02-10 13:45:26 UTC
Hi Amit, this was actually delivered in 4.7, but verification is not done yet. Should we leave the Target Release at 4.7?

Comment 24 Amit Ugol 2021-02-15 10:13:38 UTC
Already moved to 4.8, thanks for noticing.

Comment 33 Bob Fournier 2021-03-18 15:53:55 UTC
Looking at the must-gather from Comment 1, it appears that LLDP is not enabled on the switches, as we don't see any LLDP info in the introspection report for enp5s0 and enp4s0.

2020-10-15T14:42:45.666756853Z 2020-10-15 14:42:45.665 1 DEBUG ironic_inspector.main [req-87ec5801-bebf-40c0-bd94-9b273d6c4e7f - - - - -] [node: MAC 52:54:00:33:14:20] Received data from the ramdisk: {'inventory': {'interfaces': [{'name': 'enp5s0', 'mac_address': '52:54:00:b8:d8:f9', 'ipv4_address': '172.22.0.237', 'ipv6_address': 'fe80::5206:484:12d1:f47d%enp5s0', 'has_carrier': True, 'lldp': [], 'vendor': '0x1af4', 'product': '0x0001', 'client_id': None, 'biosdevname': None}, {'name': 'enp4s0', 'mac_address': '52:54:00:33:14:20', 'ipv4_address': '172.22.0.216', 'ipv6_address': 'fe80::ea8e:823b:90b9:5d60%enp4s0', 'has_carrier': True, 'lldp': [],

So although https://github.com/openshift/ironic-image/pull/154 is necessary to set ipa-enable-vlan-interfaces in the kernel params, it will use LLDP by default when set to "all". We'll need another patch to set the particular VLAN interface(s) to use.

Comment 34 Bob Fournier 2021-04-08 11:17:10 UTC
The fix has merged upstream (https://github.com/metal3-io/baremetal-operator/pull/821) and this downstream backport will pick it up: https://github.com/openshift/baremetal-operator/pull/142. When that merges I will move this to ON_QA.

Comment 35 Bob Fournier 2021-04-09 11:51:02 UTC
Moving this back to ON_QA as https://github.com/openshift/baremetal-operator/pull/142 has merged. Note that this will now work automatically in setups in which the switch is using LLDP.

When the switch does not use LLDP, the interface that the VLANs are on needs to be defined via the IRONIC_INSPECTOR_VLAN_INTERFACES environment variable; see https://github.com/openshift/baremetal-operator/blob/master/docs/dev-setup.md.
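
For clarity, a minimal sketch of that override. The variable name comes from the dev-setup doc linked above; how it is injected into the running metal3/ironic deployment depends on the environment, and the exact accepted format ("all" vs. a specific interface) is described in the linked doc, so treat this as an assumption to verify:

```shell
# Name the VLAN interface explicitly when the switch provides no LLDP,
# e.g. the tagged bond interface from this bug.
export IRONIC_INSPECTOR_VLAN_INTERFACES="bond0.404"
```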

Comment 39 Nataf Sharabi 2021-04-29 11:41:22 UTC
Hi,

Since we saw that the WA is working [1]

Should I verify this bug? Or wait for a full solution?

Thanks,

Nataf


[1]
I0425 22:49:07.294758 1 csr_check.go:171] Found existing serving cert for worker-0-0
I0425 22:49:48.261301 1 controller.go:114] Reconciling CSR: csr-p4bsb
I0425 22:49:48.261822 1 csr_check.go:150] csr-p4bsb: CSR does not appear to be client csr
I0425 22:49:48.261914 1 csr_check.go:442] retrieving serving cert from worker-0-0 (192.168.123.138:10250)
I0425 22:49:48.264438 1 csr_check.go:171] Found existing serving cert for worker-0-0

Comment 44 Bob Fournier 2021-05-03 11:29:26 UTC
> Since we saw that the WA is working [1]

> Should I verify this bug? Or wait for a full solution?

> Thanks,
> Nataf

Hi Nataf,

The code fix for this bug includes the VLAN interfaces and IPs in the ironic introspection report so that VLAN IPs can be used for the CSR. However, it relies on LLDP information provided by the network switch to determine the VLAN interfaces. In the earlier log files I did not see LLDP information received by IPA in this setup. If LLDP is not available, the workaround can be used. It would be good to verify the VLAN IPs via LLDP; if that's not possible, the workaround is acceptable.

Comment 45 Nataf Sharabi 2021-06-02 09:21:59 UTC
This workaround has been tested & verified.

It is well documented here:

https://tools.apps.cee.redhat.com/support-exceptions/id/2332

Verifying.

Comment 49 errata-xmlrpc 2021-07-27 22:33:58 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

