Bug 1571194 - region=infra nodes should be infra only rather than compute & infra nodes after upgrade to 3.10
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.z
Assignee: Vikram Goyal
QA Contact: Vikram Goyal
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-24 09:28 UTC by Weihua Meng
Modified: 2019-11-20 18:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-20 18:52:25 UTC
Target Upstream Version:



Description Weihua Meng 2018-04-24 09:28:43 UTC
Description of problem:
After upgrading to 3.10, nodes labeled region=infra should have only the infra role, not both the compute and infra roles.

Version-Release number of the following components:
openshift-ansible-3.10.0-0.27.0

How reproducible:
Always

Steps to Reproduce:
1. Upgrade OCP 3.9 to 3.10.

Actual results:
OCP 3.9 (before upgrade)
[root@qe-wmengrpm392ha-master-etcd-1 ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
qe-wmengrpm392ha-master-etcd-1 Ready master 12m v1.9.1+a0ce1bc657
qe-wmengrpm392ha-master-etcd-2 Ready master 12m v1.9.1+a0ce1bc657
qe-wmengrpm392ha-master-etcd-3 Ready master 12m v1.9.1+a0ce1bc657
qe-wmengrpm392ha-node-primary-1 Ready compute 12m v1.9.1+a0ce1bc657
qe-wmengrpm392ha-node-primary-2 Ready compute 12m v1.9.1+a0ce1bc657
qe-wmengrpm392ha-nrri-1 Ready <none> 12m v1.9.1+a0ce1bc657
qe-wmengrpm392ha-nrri-2 Ready <none> 12m v1.9.1+a0ce1bc657
[root@qe-wmengrpm392ha-master-etcd-1 ~]# oc get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
qe-wmengrpm392ha-master-etcd-1 Ready master 13m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-highmem-2,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-wmengrpm392ha-master-etcd-1,node-role.kubernetes.io/master=true,role=node
qe-wmengrpm392ha-master-etcd-2 Ready master 13m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-highmem-2,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-wmengrpm392ha-master-etcd-2,node-role.kubernetes.io/master=true,role=node
qe-wmengrpm392ha-master-etcd-3 Ready master 13m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-highmem-2,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-wmengrpm392ha-master-etcd-3,node-role.kubernetes.io/master=true,role=node
qe-wmengrpm392ha-node-primary-1 Ready compute 13m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-highmem-2,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-wmengrpm392ha-node-primary-1,node-role.kubernetes.io/compute=true,region=primary,role=node
qe-wmengrpm392ha-node-primary-2 Ready compute 13m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-highmem-2,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-wmengrpm392ha-node-primary-2,node-role.kubernetes.io/compute=true,region=primary,role=node
qe-wmengrpm392ha-nrri-1 Ready <none> 13m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-highmem-2,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-wmengrpm392ha-nrri-1,region=infra,registry=enabled,role=node,router=enabled
qe-wmengrpm392ha-nrri-2 Ready <none> 13m v1.9.1+a0ce1bc657 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-highmem-2,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=qe-wmengrpm392ha-nrri-2,region=infra,registry=enabled,role=node,router=enabled

OCP 3.10 (after upgrade)

oc get nodes
NAME STATUS ROLES AGE VERSION
qe-wmengrpm39ha-master-etcd-1 Ready master 2h v1.10.0+b81c8f8
qe-wmengrpm39ha-master-etcd-2 Ready master 2h v1.10.0+b81c8f8
qe-wmengrpm39ha-master-etcd-3 Ready master 2h v1.10.0+b81c8f8
qe-wmengrpm39ha-node-primary-1 Ready compute 2h v1.10.0+b81c8f8
qe-wmengrpm39ha-node-primary-2 Ready compute 2h v1.10.0+b81c8f8
qe-wmengrpm39ha-nrri-1 Ready compute,infra 2h v1.10.0+b81c8f8
qe-wmengrpm39ha-nrri-2 Ready compute,infra 2h v1.10.0+b81c8f8

oc get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
cakephp-mysql-example-1-6q22b 1/1 Running 0 1m 10.2.4.9 qe-wmengrpm39ha-nrri-2
cakephp-mysql-example-1-build 0/1 Completed 0 1m 10.2.12.7 qe-wmengrpm39ha-node-primary-1
mysql-1-rsw72 1/1 Running 0 1m 10.2.4.8 qe-wmengrpm39ha-nrri-2

Expected results:
App pods should not run on infra nodes.
Infra nodes should not be labeled compute.

Comment 1 Vadim Rutkovsky 2018-05-17 17:35:36 UTC
region=infra is not a label our playbooks take into account. The nodes should have the node-role.kubernetes.io/infra=true label before the upgrade to avoid being marked as compute nodes.

This, however, fails with node-role.kubernetes.io/infra=true too: there is no node group associated with the node, so it gets the compute label during bootstrapping.

There are several options for solving this:
1) Guess openshift_node_group_name from existing labels
2) Ensure openshift_node_group_name is set
3) ??
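
As a rough illustration of option 1, the sketch below guesses a node group from a node's existing labels. The helper names (parse_labels, guess_node_group) are hypothetical, and the group names other than node-config-infra are assumed to follow the same naming pattern. This is only a heuristic, not what the playbooks currently do.

```python
def parse_labels(label_string):
    """Turn 'k1=v1,k2=v2' (the LABELS column of `oc get nodes --show-labels`) into a dict."""
    labels = {}
    for item in label_string.split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        labels[key] = value
    return labels


def guess_node_group(labels):
    """Guess a node group name from existing labels (hypothetical heuristic)."""
    if labels.get("node-role.kubernetes.io/master") == "true":
        return "node-config-master"  # assumed group name
    if (labels.get("node-role.kubernetes.io/infra") == "true"
            or labels.get("region") == "infra"):
        return "node-config-infra"
    # Anything else falls back to compute, which is an assumption of this sketch.
    return "node-config-compute"  # assumed group name


# Labels of an infra node before the upgrade, shortened from the 3.9 output above.
infra_labels = parse_labels("region=infra,registry=enabled,role=node,router=enabled")
group = guess_node_group(infra_labels)
```

Guessing like this stays ambiguous for nodes that carry custom labels only, which is an argument for option 2.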

Scott, should we enforce openshift_node_group_name during upgrade?

Comment 2 Scott Dodson 2018-05-18 02:03:33 UTC
Yes, we've said that we are going to require openshift_node_group_name to be set for all hosts. We need to document this too.

Comment 3 Scott Dodson 2018-05-21 13:41:29 UTC
We'll check to ensure that openshift_node_group_name is set for every single host so that there's no longer any ambiguity as to how node groups are being assigned to hosts.

Need to document this requirement.
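
The check described above could look roughly like this sketch. It assumes the inventory host variables are already available as a plain dict; the function name hosts_missing_node_group and the node-config-compute group name are illustrative assumptions, not the actual openshift-ansible implementation.

```python
def hosts_missing_node_group(hostvars):
    """Return the sorted names of hosts that do not set openshift_node_group_name."""
    return sorted(
        host for host, host_vars in hostvars.items()
        if not host_vars.get("openshift_node_group_name")
    )


# Hypothetical host variables, using host names from this report.
inventory = {
    "qe-wmengrpm39ha-node-primary-1": {
        "openshift_node_group_name": "node-config-compute",  # assumed group name
    },
    "qe-wmengrpm39ha-nrri-1": {
        "openshift_node_labels": {"region": "infra"},  # old-style labels only
    },
}

missing = hosts_missing_node_group(inventory)
if missing:
    print("openshift_node_group_name is not set for: " + ", ".join(missing))
```

Failing early on a non-empty result keeps the upgrade from silently falling back to the compute group.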

Comment 4 Weihua Meng 2018-05-30 10:56:16 UTC
Hi, Vadim.
I observed that now after upgrade to v3.10, the node labels are different from the one with openshift-ansible-3.10.0-0.27.0.

one infra node has one role (infra), 
the other node has two roles (infra & compute)

[root@wmengug39r75-master-etcd-zone1-1 ~]# oc version
oc v3.10.0-0.53.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://104.197.218.186
openshift v3.10.0-0.53.0
kubernetes v1.10.0+b81c8f8
[root@wmengug39r75-master-etcd-zone1-1 ~]# oc get nodes
NAME                                STATUS    ROLES           AGE       VERSION
wmengug39r75-master-etcd-zone1-1    Ready     master          1h        v1.10.0+b81c8f8
wmengug39r75-master-etcd-zone2-1    Ready     master          1h        v1.10.0+b81c8f8
wmengug39r75-master-etcd-zone2-2    Ready     master          1h        v1.10.0+b81c8f8
wmengug39r75-node-zone1-primary-1   Ready     compute         1h        v1.10.0+b81c8f8
wmengug39r75-node-zone2-primary-1   Ready     compute         1h        v1.10.0+b81c8f8
wmengug39r75-nrriz-1                Ready     infra           1h        v1.10.0+b81c8f8
wmengug39r75-nrriz-2                Ready     compute,infra   1h        v1.10.0+b81c8f8
[root@wmengug39r75-master-etcd-zone1-1 ~]# oc get nodes --show-labels
NAME                                STATUS    ROLES           AGE       VERSION           LABELS
wmengug39r75-master-etcd-zone1-1    Ready     master          1h        v1.10.0+b81c8f8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=wmengug39r75-master-etcd-zone1-1,node-role.kubernetes.io/master=true,role=node
wmengug39r75-master-etcd-zone2-1    Ready     master          1h        v1.10.0+b81c8f8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=wmengug39r75-master-etcd-zone2-1,node-role.kubernetes.io/master=true,role=node
wmengug39r75-master-etcd-zone2-2    Ready     master          1h        v1.10.0+b81c8f8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=wmengug39r75-master-etcd-zone2-2,node-role.kubernetes.io/master=true,role=node
wmengug39r75-node-zone1-primary-1   Ready     compute         1h        v1.10.0+b81c8f8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=wmengug39r75-node-zone1-primary-1,node-role.kubernetes.io/compute=true,region=primary,role=node
wmengug39r75-node-zone2-primary-1   Ready     compute         1h        v1.10.0+b81c8f8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=wmengug39r75-node-zone2-primary-1,node-role.kubernetes.io/compute=true,region=primary,role=node
wmengug39r75-nrriz-1                Ready     infra           1h        v1.10.0+b81c8f8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=wmengug39r75-nrriz-1,node-role.kubernetes.io/infra=true,region=infra,registry=enabled,role=node,router=enabled
wmengug39r75-nrriz-2                Ready     compute,infra   1h        v1.10.0+b81c8f8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=wmengug39r75-nrriz-2,node-role.kubernetes.io/compute=true,node-role.kubernetes.io/infra=true,region=infra,registry=enabled,role=node,router=enabled
[root@wmengug39r75-master-etcd-zone1-1 ~]#
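
To spot such nodes without reading the label dump by eye, something like the following sketch could be used. It assumes the NAME and LABELS columns have already been extracted into a dict (labels shortened here), and it relies on the ROLES column being derived from node-role.kubernetes.io/<role> label keys. The helper names are hypothetical.

```python
ROLE_PREFIX = "node-role.kubernetes.io/"


def node_roles(label_string):
    """Collect role names from node-role.kubernetes.io/<role> label keys."""
    roles = set()
    for item in label_string.split(","):
        key, _, _value = item.partition("=")
        if key.startswith(ROLE_PREFIX) and len(key) > len(ROLE_PREFIX):
            roles.add(key[len(ROLE_PREFIX):])
    return roles


def mixed_role_nodes(nodes):
    """Return the sorted names of nodes that carry both the infra and compute roles."""
    return sorted(
        name for name, labels in nodes.items()
        if {"infra", "compute"} <= node_roles(labels)
    )


# Shortened label strings for the two infra nodes in the output above.
nodes = {
    "wmengug39r75-nrriz-1":
        "node-role.kubernetes.io/infra=true,region=infra,role=node",
    "wmengug39r75-nrriz-2":
        "node-role.kubernetes.io/compute=true,node-role.kubernetes.io/infra=true,region=infra,role=node",
}

mixed = mixed_role_nodes(nodes)
```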

Comment 5 Vadim Rutkovsky 2018-05-31 15:05:10 UTC
>I observed that now after upgrade to v3.10, the node labels are different from the one with openshift-ansible-3.10.0-0.27.0.

>one infra node has one role (infra), 
>the other node has two roles (infra & compute)

Right, verified that the upgrade is working correctly if

   openshift_node_labels="{'region': 'infra'}"

is replaced by 

   openshift_node_group_name='node-config-infra'

Moving this to the documentation team.

Comment 6 Stephen Cuppett 2019-11-20 18:52:25 UTC
OCP 3.6-3.10 is no longer on full support [1]. Marking CLOSED DEFERRED. If you have a customer case with a support exception or have reproduced on 3.11+, please reopen and include those details. When reopening, please set the Target Release to the appropriate version where needed.

[1]: https://access.redhat.com/support/policy/updates/openshift

