Bug 1812661 - Worker Node does not appear in oc get nodes
Summary: Worker Node does not appear in oc get nodes
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Julia Kreger
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-11 19:27 UTC by DirectedSoul
Modified: 2020-05-28 12:12 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-28 12:12:08 UTC
Target Upstream Version:
Embargoed:



Description DirectedSoul 2020-03-11 19:27:49 UTC
Description of problem:

A worker node is up, but it is not listed by `oc get nodes`.

Version-Release number of the following components:

```
oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.5     True        False         37m     Cluster version is 4.3.5
```


How reproducible: 100%
Install the latest bare-metal (BM) IPI cluster and wait until all nodes show up in the BareMetalHost (bmh) list:

```
oc get bmh -n openshift-machine-api
NAME       STATUS   PROVISIONING STATUS      CONSUMER              BMC                                            HARDWARE PROFILE   ONLINE   ERROR
master-0   OK       externally provisioned   kni7-master-0         ipmi://[fd35:919d:4042:2:c7ed:9a9f:a9ec:100]                      true     
master-1   error    registering              kni7-master-1         ipmi://[fd35:919d:4042:2:c7ed:9a9f:a9ec:101]                      true     Failed to get power state for node 386d2445-2d26-4341-b361-bf2475fd5184. Error: IPMI call failed: power status.
master-2   OK       externally provisioned   kni7-master-2         ipmi://[fd35:919d:4042:2:c7ed:9a9f:a9ec:102]                      true     
worker-0   OK       provisioned              kni7-worker-0-tf9cx   ipmi://[fd35:919d:4042:2:c7ed:9a9f:a9ec:104]   unknown            true     
worker-1   OK       provisioned              kni7-worker-0-cjn5w   ipmi://[fd35:919d:4042:2:c7ed:9a9f:a9ec:105]   unknown            true     
worker-2   OK       provisioned              kni7-worker-0-n45ns   ipmi://[fd35:919d:4042:2:c7ed:9a9f:a9ec:106]   unknown            true    
```

The install-config specifies 3 masters and 2 workers:

```
compute:
- name: worker
  replicas: 2
controlPlane:
  name: master
  replicas: 3
  platform:
    baremetal: {}
```
But only the masters and 1 worker are visible:

```
oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
master-0.kni7.cloud.lab.eng.bos.redhat.com   Ready    master   149m   v1.16.2
master-1.kni7.cloud.lab.eng.bos.redhat.com   Ready    master   149m   v1.16.2
master-2.kni7.cloud.lab.eng.bos.redhat.com   Ready    master   149m   v1.16.2
worker-2.kni7.cloud.lab.eng.bos.redhat.com   Ready    worker   15m    v1.16.2
```

Once I scale the available worker (worker-2) with `--replicas=3`, I can see the remaining worker-0 and worker-1 get attached to kni7-worker-0-xyz machines, which is not expected. Instead, I was expecting the 3-master, 2-worker architecture specified in the install-config.yaml file, and each worker should have its own CONSUMER attached (see the `oc get bmh -n openshift-machine-api` output above).

P.S.: Both worker-0 and worker-1 are up (confirmed by logging in to the iDRAC console).
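
The mismatch between the two listings above can be checked mechanically. The following is only a sketch: `provisioned_hosts` and `node_short_names` are hypothetical helper names, and it assumes the column layout shown in this report and that each BareMetalHost name equals the short hostname of its node (as with worker-2 here).

```shell
# Hypothetical helpers; each reads the table output of the matching `oc get` on stdin.

# Names of BareMetalHosts whose PROVISIONING STATUS is exactly "provisioned".
# Assumes the layout shown in this report: field 3 is "provisioned" for these
# rows, but "externally" or "registering" for the others.
provisioned_hosts() {
  awk 'NR > 1 && $3 == "provisioned" { print $1 }' | sort
}

# Short host names of registered nodes (the text before the first dot of NAME).
node_short_names() {
  awk 'NR > 1 { split($1, parts, "."); print parts[1] }' | sort
}

# Usage against a live cluster (needs bash for process substitution):
#   comm -23 <(oc get bmh -n openshift-machine-api | provisioned_hosts) \
#            <(oc get nodes | node_short_names)
```

Run against the listings in this report, the `comm -23` line would print the provisioned hosts that have no matching node, that is, worker-0 and worker-1.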

Steps to Reproduce:
1. Deploy a BM IPI cluster and wait until the masters are up
2. Observe that no worker node is listed in `oc get nodes`
3. Check for pending CSRs, which in this case are:
```
oc get csr
NAME        AGE         REQUESTOR                                                                   CONDITION
csr-2947c   9m40s       system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-2hc6l   9m28s       system:node:worker-2.kni7.cloud.lab.eng.bos.redhat.com                      Pending
csr-7gqxb   <invalid>   system:node:worker-2.kni7.cloud.lab.eng.bos.redhat.com                      Pending
```
I had to approve these CSRs manually:

```
$ for csr in $(oc -n openshift-machine-api get csr | awk '/Pending/ {print $1}'); do oc adm certificate approve $csr;done
```
No workers show up even after this step.
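
For reference, a slightly stricter variant of the approval step might look like the sketch below. `pending_csrs` is a hypothetical helper name, not part of `oc`. It matches only on the last (CONDITION) column and skips the header row, whereas `/Pending/` matches the word anywhere on the line. CSRs are cluster-scoped, so the `-n` flag is not needed.

```shell
# Hypothetical helper: print the names of CSRs whose CONDITION column is exactly
# "Pending". Reads `oc get csr` table output on stdin, so the filter can be
# checked without a cluster.
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage against a live cluster:
#   oc get csr | pending_csrs | while read -r csr; do
#     oc adm certificate approve "$csr"
#   done
```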

Actual results:
```
oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
master-0.kni7.cloud.lab.eng.bos.redhat.com   Ready    master   149m   v1.16.2
master-1.kni7.cloud.lab.eng.bos.redhat.com   Ready    master   149m   v1.16.2
master-2.kni7.cloud.lab.eng.bos.redhat.com   Ready    master   149m   v1.16.2
worker-2.kni7.cloud.lab.eng.bos.redhat.com   Ready    worker   15m    v1.16.2
```

Expected results:

Expected 3 masters and 2 workers to show up.

Comment 4 Stephen Benjamin 2020-03-26 17:40:27 UTC
Are you still encountering this issue?

