Bug 1932967

Summary: [on-prem] Unable to deploy additional machinesets on separate subnets
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Machine Config Operator Assignee: Martin André <m.andre>
Status: CLOSED ERRATA QA Contact: weiwei jiang <wjiang>
Severity: high Docs Contact:
Priority: high    
Version: 4.7 CC: dgautam, ekasprzy, fmarting, javier.ordax, mkrejci, simore
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: For all on-premise platforms, when discovering the node IP, baremetal-runtimecfg wrongly assumed that nodes are always attached to a subnet that includes the VIP, and looked for an address in this IP range.
Consequence: Nodes on other subnets fail to generate configuration files with baremetal-runtimecfg.
Fix: Fall back to the IP address associated with the default route.
Result: New compute nodes can join the cluster when deployed on separate subnets.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-03-16 23:22:18 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1894539    
Bug Blocks:    
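
The fallback described in the Doc Text can be illustrated with a minimal sketch. This is not the baremetal-runtimecfg implementation (which is not shown in this report); the function name pick_node_ip, its parameters, and the example addresses are hypothetical and chosen only to show the selection order: prefer an address in a subnet containing the VIP, otherwise use the address associated with the default route.

```python
import ipaddress
from typing import Iterable, Optional


def pick_node_ip(
    addresses: Iterable[str],
    vip: str,
    default_route_ip: Optional[str],
) -> Optional[str]:
    """Hypothetical helper illustrating the node IP selection order.

    addresses: 'ip/prefix' strings configured on the node's interfaces.
    vip: the cluster virtual IP.
    default_route_ip: the address associated with the default route, or
        None to mimic the behavior before the fix (no fallback).
    """
    vip_addr = ipaddress.ip_address(vip)
    for cidr in addresses:
        iface = ipaddress.ip_interface(cidr)
        # Original assumption: the node has an address in the VIP's subnet.
        if vip_addr in iface.network:
            return str(iface.ip)
    # Fallback added by the fix: the node sits on a separate subnet.
    return default_route_ip
```

With a VIP of 192.0.2.10, a node holding 192.0.2.25/24 resolves to that address, while a node holding only 198.51.100.7/24 resolves to whatever is passed as default_route_ip. Passing None for the fallback reproduces the failure mode in the Cause: no address is found.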

Comment 2 weiwei jiang 2021-03-08 04:19:57 UTC
Checked with 4.6.0-0.nightly-2021-03-06-050044, and it's fixed now.

# the original subnet of the workers is 192.168.0.0/16
1. create an additional subnet with 192.168.64.0.0/16
2. add the new subnet to the original external router as an interface
3. add a new rule to the infra_id_masters security group to allow port 22623/tcp for the new subnet
4. create a new machineset to provision a new worker in the new subnet
5. wait, then check the machine list
6. wait, then check the node list

5. $ oc get machineset -A      
NAMESPACE               NAME                              DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   wj46ios308az-k8sfl-addit-08zg5w   1         1         1       1           52m
openshift-machine-api   wj46ios308az-k8sfl-worker-0       3         3         3       3           114m
$ oc get machine -A -o wide
NAMESPACE               NAME                                    PHASE     TYPE        REGION      ZONE   AGE    NODE                                    PROVIDERID   STATE
openshift-machine-api   wj46ios308az-k8sfl-addit-08zg5w-kjb5z   Running   m1.large    regionOne   nova   52m    wj46ios308az-k8sfl-addit-08zg5w-kjb5z                ACTIVE
openshift-machine-api   wj46ios308az-k8sfl-master-0             Running   m1.xlarge   regionOne   nova   114m   wj46ios308az-k8sfl-master-0                          ACTIVE
openshift-machine-api   wj46ios308az-k8sfl-master-1             Running   m1.xlarge   regionOne   nova   114m   wj46ios308az-k8sfl-master-1                          ACTIVE
openshift-machine-api   wj46ios308az-k8sfl-master-2             Running   m1.xlarge   regionOne   nova   114m   wj46ios308az-k8sfl-master-2                          ACTIVE
openshift-machine-api   wj46ios308az-k8sfl-worker-0-8xq9d       Running   m1.large    regionOne   nova   112m   wj46ios308az-k8sfl-worker-0-8xq9d                    ACTIVE
openshift-machine-api   wj46ios308az-k8sfl-worker-0-jsw69       Running   m1.large    regionOne   nova   112m   wj46ios308az-k8sfl-worker-0-jsw69                    ACTIVE
openshift-machine-api   wj46ios308az-k8sfl-worker-0-z22s4       Running   m1.large    regionOne   nova   112m   wj46ios308az-k8sfl-worker-0-z22s4                    ACTIVE

6. $ oc get nodes -o wide           
NAME                                    STATUS   ROLES    AGE    VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
wj46ios308az-k8sfl-addit-08zg5w-kjb5z   Ready    worker   43m    v1.19.0+2f3101c   192.168.64.199   <none>        Red Hat Enterprise Linux CoreOS 46.82.202103050041-0 (Ootpa)   4.18.0-193.41.1.el8_2.x86_64   cri-o://1.19.1-10.rhaos4.6.git27e71a0.el8
wj46ios308az-k8sfl-master-0             Ready    master   114m   v1.19.0+2f3101c   192.168.3.167    <none>        Red Hat Enterprise Linux CoreOS 46.82.202103050041-0 (Ootpa)   4.18.0-193.41.1.el8_2.x86_64   cri-o://1.19.1-10.rhaos4.6.git27e71a0.el8
wj46ios308az-k8sfl-master-1             Ready    master   114m   v1.19.0+2f3101c   192.168.1.215    <none>        Red Hat Enterprise Linux CoreOS 46.82.202103050041-0 (Ootpa)   4.18.0-193.41.1.el8_2.x86_64   cri-o://1.19.1-10.rhaos4.6.git27e71a0.el8
wj46ios308az-k8sfl-master-2             Ready    master   114m   v1.19.0+2f3101c   192.168.3.174    <none>        Red Hat Enterprise Linux CoreOS 46.82.202103050041-0 (Ootpa)   4.18.0-193.41.1.el8_2.x86_64   cri-o://1.19.1-10.rhaos4.6.git27e71a0.el8
wj46ios308az-k8sfl-worker-0-8xq9d       Ready    worker   102m   v1.19.0+2f3101c   192.168.0.142    <none>        Red Hat Enterprise Linux CoreOS 46.82.202103050041-0 (Ootpa)   4.18.0-193.41.1.el8_2.x86_64   cri-o://1.19.1-10.rhaos4.6.git27e71a0.el8
wj46ios308az-k8sfl-worker-0-jsw69       Ready    worker   102m   v1.19.0+2f3101c   192.168.2.251    <none>        Red Hat Enterprise Linux CoreOS 46.82.202103050041-0 (Ootpa)   4.18.0-193.41.1.el8_2.x86_64   cri-o://1.19.1-10.rhaos4.6.git27e71a0.el8
wj46ios308az-k8sfl-worker-0-z22s4       Ready    worker   102m   v1.19.0+2f3101c   192.168.1.123    <none>        Red Hat Enterprise Linux CoreOS 46.82.202103050041-0 (Ootpa)   4.18.0-193.41.1.el8_2.x86_64   cri-o://1.19.1-10.rhaos4.6.git27e71a0.el8

Comment 5 errata-xmlrpc 2021-03-16 23:22:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.21 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0753