Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1589629

Summary: Install logging/metrics failed at TASK [Validate openshift_node_groups and openshift_node_group_name]
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Installer
Assignee: Michael Gugino <mgugino>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: high
Docs Contact:
Priority: high
Version: 3.10.0
CC: anli, aos-bugs, jiajliu, jokerman, mmccomas, vrutkovs, wmeng
Target Milestone: ---
Keywords: Regression
Target Release: 3.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-06-12 18:11:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
inventory file none

Description Junqi Zhao 2018-06-11 02:33:49 UTC
Created attachment 1449842
inventory file

Description of problem:
Install logging/metrics failed at TASK [Validate openshift_node_groups and openshift_node_group_name]
TASK [Validate openshift_node_groups and openshift_node_group_name] ***********************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:18
Sunday 10 June 2018  22:21:35 -0400 (0:00:01.829)       0:00:13.792 *********** 
fatal: [host-8-249-107.host.centralci.eng.rdu2.redhat.com]: FAILED! => {
    "msg": "last_checked_host: host-8-249-107.host.centralci.eng.rdu2.redhat.com, last_checked_var: openshift_node_group_name;openshift_node_group_name must be defined for all nodes"
}

Previously, we did not need to set openshift_node_group_name.

Version-Release number of the following components:
# rpm -qa | grep openshift-ansible
openshift-ansible-roles-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch
openshift-ansible-docs-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch
openshift-ansible-playbooks-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch
openshift-ansible-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch


How reproducible:
Always

Steps to Reproduce:
1. Deploy metrics 3.10 using the attached inventory file.

Actual results:
The installation failed at TASK [Validate openshift_node_groups and openshift_node_group_name].

Expected results:

Additional info:

Comment 1 Junqi Zhao 2018-06-11 02:35:16 UTC
Blocks metrics/logging installation.

Comment 2 Junqi Zhao 2018-06-11 02:36:36 UTC
Workaround: add skip_sanity_checks=true.

Comment 3 Michael Gugino 2018-06-11 14:58:23 UTC
We have refactored the installer a bit recently.  openshift_node_group_name is required to be set for all nodes.
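
As a rough illustration of that requirement (not the actual openshift-ansible sanity check), the following Python sketch reports which node hosts do not define openshift_node_group_name. The function name, the host names, and the dictionary layout are hypothetical and only stand in for the per-host variables of the nodes group in an inventory.

```python
# Illustrative sketch only: this is NOT the openshift-ansible implementation.
REQUIRED_VAR = "openshift_node_group_name"


def hosts_missing_node_group(node_hostvars):
    """Return the sorted names of node hosts that do not set REQUIRED_VAR.

    node_hostvars maps a host name to a dict of that host's variables.
    """
    return sorted(
        host
        for host, hostvars in node_hostvars.items()
        if not hostvars.get(REQUIRED_VAR)
    )


# Hypothetical hosts; the last one mirrors the failing case in this bug.
example_nodes = {
    "master.example.com": {"openshift_node_group_name": "node-config-master"},
    "infra.example.com": {"openshift_node_group_name": "node-config-infra"},
    "compute.example.com": {},
}

missing = hosts_missing_node_group(example_nodes)
if missing:
    print("openshift_node_group_name must be defined for all nodes; missing on:",
          ", ".join(missing))
```

A non-empty result corresponds to the "must be defined for all nodes" failure reported in the description.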

Comment 4 liujia 2018-06-12 06:14:39 UTC
Hit this when upgrading with openshift-ansible-3.10.0-0.66.0.git.79.68197f9.el7.noarch.rpm.

Comment 5 Anping Li 2018-06-12 06:27:22 UTC
What is the purpose of this variable? Is openshift_node_group_name the name of one of the configmaps in the openshift-node project?

I guess it is used to schedule pods to the specified groups, but when I set openshift_node_group_name=node-config-infra, the pods weren't scheduled to those nodes. Is that correct?

Shall we update the other roles (openshift-logging, openshift-metrics, openshift_prometheus, openshift_web_console, openshift_hosted, upgrade, etc.) accordingly?

Comment 6 Vadim Rutkovsky 2018-06-12 10:12:25 UTC
(In reply to Anping Li from comment #5)
> What is the purpose of this variable? Is openshift_node_group_name the name
> of one of the configmaps in the openshift-node project?

Yes. This configmap contains labels (and other config settings) which are applied to a node group. See https://bugzilla.redhat.com/show_bug.cgi?id=1571194

> 
> I guess it is used to schedule pods to the specified groups, but when I set
> openshift_node_group_name=node-config-infra, the pods weren't scheduled to
> those nodes. Is that correct?

Did the nodes get the required labels?

Comment 7 Anping Li 2018-06-12 10:30:21 UTC
Although there is a configmap node-config-infra, no nodes are labelled with node-role.kubernetes.io/infra=true.
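
One way to double-check the labels is to inspect the JSON node list (for example, the output of `oc get nodes -o json`) and list the nodes that carry node-role.kubernetes.io/infra=true. The Python sketch below is only an illustration under that assumption; the helper name and the sample host names are hypothetical, and the sample data is made up rather than taken from this cluster.

```python
import json

INFRA_LABEL = "node-role.kubernetes.io/infra"


def nodes_with_label(node_list, key, value="true"):
    """Return the names of nodes in a parsed node list whose label key equals value."""
    return [
        item["metadata"]["name"]
        for item in node_list.get("items", [])
        if item["metadata"].get("labels", {}).get(key) == value
    ]


# Made-up sample in the shape of a Kubernetes node list (hypothetical hosts).
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "infra.example.com",
                  "labels": {"node-role.kubernetes.io/infra": "true"}}},
    {"metadata": {"name": "compute.example.com",
                  "labels": {"node-role.kubernetes.io/compute": "true"}}}
  ]
}
""")

infra_nodes = nodes_with_label(sample, INFRA_LABEL)
print("infra nodes:", infra_nodes)
```

An empty list would match the situation described here: the configmap exists, but no node has the infra label.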

For Prometheus, I found that the default node selector has been changed from region=infra to node-role.kubernetes.io/infra=true.

For logging and metrics, there is no default node selector, so the logging and metrics pods can be scheduled to any node by default.

The node_selector is what is used to schedule pods. Do we really need openshift_node_group_name?

The OCP upgrade now also asks for openshift_node_group_name. What should be provided to upgrade all nodes in one job?

Comment 8 Scott Dodson 2018-06-12 18:11:34 UTC
This is all related to the need to document node group configuration.

*** This bug has been marked as a duplicate of bug 1569476 ***