Bug 1575051

Summary: 3.10: nodes lose install-time labels
Product: OpenShift Container Platform
Reporter: Mike Fiedler <mifiedle>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED DUPLICATE
QA Contact: Johnny Liu <jialiu>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.10.0
CC: aos-bugs, ghuang, jokerman, mmccomas, wmeng
Target Milestone: ---
Keywords: Regression
Target Release: 3.10.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-30 14:51:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Mike Fiedler 2018-05-04 15:52:30 UTC
Description of problem:

In OCP 3.9 and earlier, nodes always keep the labels specified at install time.  These labels are stored in node-config.yaml.  They survive when a node is shut down long enough to be removed from the cluster and is then restarted and registers again.

In OCP 3.10, the install-time labels are present and work fine until the node is shut down long enough to be removed from the cluster.  The node labels specified at install time are not present in the bootstrap configmap.  If a node is shut down long enough to be removed from the cluster and is then restarted, the install-time labels (e.g. region=primary, zone=default) are no longer present and the node must be relabeled.

This is a regression from 3.9.
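
For reference, the manual relabeling mentioned above could look roughly like the following. This is only a sketch: it builds an `oc label node` command line from a label dictionary without running it. The helper name `relabel_command` and the node name `node-b` are hypothetical; the labels are the ones used as examples in this report.

```python
import shlex
from typing import Dict, List


def relabel_command(node: str, labels: Dict[str, str]) -> List[str]:
    """Build the argv for `oc label node <node> key=value ...`."""
    # Sort the labels so the generated command is deterministic.
    pairs = [f"{key}={value}" for key, value in sorted(labels.items())]
    return ["oc", "label", "node", node] + pairs


# Labels taken from the example in this report; the node name is hypothetical.
command = relabel_command("node-b", {"region": "primary", "zone": "default"})
# Print a copy-pasteable command line instead of executing it.
print(" ".join(shlex.quote(part) for part in command))
```

The command is printed rather than executed so it can be reviewed before being applied to a cluster.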


Version-Release number of selected component (if applicable):  3.10.0-0.32.0


How reproducible:  Always


Steps to Reproduce:
1.  Install a cluster via openshift-ansible.  In the inventory specify labels like region=primary or region=infra.  These are commonly used in OCP 3.9 and earlier installations.  Example:

ec2-54-201-7-155.us-west-2.compute.amazonaws.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
ec2-54-245-6-1.us-west-2.compute.amazonaws.com openshift_node_labels="{'region': 'primary', 'zone': 'default'}"

2.  After install, verify the labels are there with oc get nodes --show-labels
3.  Stop the node (or the instance the node runs on) until oc get nodes no longer shows the node
4.  Restart the node and run oc get nodes --show-labels again
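
The check in steps 2 and 4 can be scripted. The sketch below assumes the usual tabular output of oc get nodes --show-labels, with the node name in the first column and a comma-separated LABELS list in the last column. The sample text, node names, and helper names (`parse_show_labels`, `missing_labels`) are illustrative and not taken from a real cluster.

```python
from typing import Dict

# Illustrative output only; node names, ages, and versions are made up.
SAMPLE = """\
NAME      STATUS    ROLES     AGE       VERSION   LABELS
node-a    Ready     <none>    1d        v1.10.0   kubernetes.io/hostname=node-a,region=infra,zone=default
node-b    Ready     <none>    5m        v1.10.0   kubernetes.io/hostname=node-b
"""


def parse_show_labels(output: str) -> Dict[str, Dict[str, str]]:
    """Map node name -> labels from `oc get nodes --show-labels` text."""
    nodes: Dict[str, Dict[str, str]] = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        name, raw = fields[0], fields[-1]  # LABELS is the last column
        labels: Dict[str, str] = {}
        if raw != "<none>":
            for pair in raw.split(","):
                key, _, value = pair.partition("=")
                labels[key] = value
        nodes[name] = labels
    return nodes


def missing_labels(actual: Dict[str, str], expected: Dict[str, str]) -> Dict[str, str]:
    """Return the expected labels that are absent or have a different value."""
    return {k: v for k, v in expected.items() if actual.get(k) != v}


nodes = parse_show_labels(SAMPLE)
for name, labels in nodes.items():
    print(name, missing_labels(labels, {"zone": "default"}))
```

In practice the text would come from capturing the command's output before step 3 and again after step 4, then comparing the two results.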

Actual results:

Install-time labels such as region=infra are no longer present.

Expected results:

Same behavior as 3.9 and earlier.  Install-time labels are present.

Additional info:


Comment 1 Mike Fiedler 2018-05-04 15:53:46 UTC
Not sure if this is Pod or Installer.  Starting it with Installer since that is where the bootstrap config is created.

Comment 2 Scott Dodson 2018-05-18 03:13:40 UTC
It's a documentation item really.

Node configuration happens via node groups, so we have to define a node group for each permutation of labels we want and then align nodes to those node groups. The node groups are translated into configmaps in the openshift-node namespace from the definition of `openshift_node_groups` in openshift-ansible.

That means it's not possible to have configuration, including labels, that is unique to each node without defining a node group per node.
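
To illustrate "a node group for each permutation of labels", here is a rough sketch that collapses per-node labels (as previously given via openshift_node_labels) into one group per distinct label set. The group naming scheme is hypothetical, and the `name`/`labels` entry shape is an assumption about what `openshift_node_groups` expects, since this report does not show the format.

```python
from typing import Dict, List, Tuple


def derive_node_groups(
    node_labels: Dict[str, Dict[str, str]]
) -> Tuple[List[dict], Dict[str, str]]:
    """Group hosts by identical label sets: one node group per permutation."""
    by_labels: Dict[Tuple[str, ...], List[str]] = {}
    for host, labels in node_labels.items():
        key = tuple(sorted(f"{k}={v}" for k, v in labels.items()))
        by_labels.setdefault(key, []).append(host)

    groups: List[dict] = []
    host_to_group: Dict[str, str] = {}
    for index, (key, hosts) in enumerate(sorted(by_labels.items())):
        name = f"node-config-group-{index}"  # hypothetical naming scheme
        # Assumed entry shape: a group name plus a list of "key=value" labels.
        groups.append({"name": name, "labels": list(key)})
        for host in hosts:
            host_to_group[host] = name
    return groups, host_to_group


# Hosts and labels from the inventory example in this report.
groups, mapping = derive_node_groups({
    "ec2-54-201-7-155.us-west-2.compute.amazonaws.com": {"region": "infra", "zone": "default"},
    "ec2-54-245-6-1.us-west-2.compute.amazonaws.com": {"region": "primary", "zone": "default"},
})
print(groups)
print(mapping)
```

If every node carries a unique label set, this yields one group per node, which is the limitation described above.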

Comment 3 Mike Fiedler 2018-05-18 10:55:27 UTC
Why let someone specify a label during the install, especially a critical one like region=infra, if we are not going to persist the configuration?   The first time the node is taken down for an extended period of time, it will no longer be able to run router, registry, elasticsearch, service broker, etc.

Comment 4 Scott Dodson 2018-05-30 14:51:48 UTC
Closing as a dupe of 1569476. We need to document how the openshift_node_groups variable should be constructed and how one maps each node or group of nodes back to those groups, which is where one will define the variables. We also need to scrub openshift-ansible for openshift_node_labels.

*** This bug has been marked as a duplicate of bug 1569476 ***

Comment 5 Gan Huang 2018-06-07 07:47:57 UTC
*** Bug 1588371 has been marked as a duplicate of this bug. ***