Created attachment 1449842 [details]
inventory file

Description of problem:
Installing logging/metrics failed at TASK [Validate openshift_node_groups and openshift_node_group_name]

TASK [Validate openshift_node_groups and openshift_node_group_name] ***********************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/init/sanity_checks.yml:18
Sunday 10 June 2018 22:21:35 -0400 (0:00:01.829) 0:00:13.792 ***********
fatal: [host-8-249-107.host.centralci.eng.rdu2.redhat.com]: FAILED! => {
    "msg": "last_checked_host: host-8-249-107.host.centralci.eng.rdu2.redhat.com, last_checked_var: openshift_node_group_name;openshift_node_group_name must be defined for all nodes"
}

Previously we did not need to set openshift_node_group_name.

Version-Release number of the following components:
# rpm -qa | grep openshift-ansible
openshift-ansible-roles-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch
openshift-ansible-docs-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch
openshift-ansible-playbooks-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch
openshift-ansible-3.10.0-0.65.0.git.45.1ea4d05.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy metrics 3.10 using the attached inventory file
2.
3.

Actual results:
Failed at TASK [Validate openshift_node_groups and openshift_node_group_name]

Expected results:

Additional info:
Blocks metrics/logging installation
Add skip_sanity_checks=true as a workaround.
We have refactored the installer a bit recently. openshift_node_group_name is required to be set for all nodes.
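For illustration, the requirement behind the failing task can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not the installer's actual code: the function name find_nodes_missing_group_name, the hostvars dictionary, and the example.com host names are all hypothetical, and only the variable name and the error text come from this report.

```python
# Sketch only: this is NOT the openshift-ansible sanity check itself.
# The function name, data layout, and host names are hypothetical.


def find_nodes_missing_group_name(hostvars, node_hosts):
    """Return the node hosts that do not define openshift_node_group_name."""
    return [
        host
        for host in node_hosts
        if not hostvars.get(host, {}).get("openshift_node_group_name")
    ]


# Hypothetical per-host variables, as they might be read from an inventory.
hostvars = {
    "infra.example.com": {"openshift_node_group_name": "node-config-infra"},
    "node1.example.com": {},  # variable not set, so this host is reported
}
node_hosts = ["infra.example.com", "node1.example.com"]

missing = find_nodes_missing_group_name(hostvars, node_hosts)
if missing:
    # Mirrors the wording of the message seen in the failed task.
    print(
        "openshift_node_group_name must be defined for all nodes; "
        "missing on: " + ", ".join(missing)
    )
```

In this sketch a single host without the variable is enough to fail the whole run, which matches the behaviour reported above: every host in the nodes group needs openshift_node_group_name set in the inventory.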
Hit this when upgrading with openshift-ansible-3.10.0-0.66.0.git.79.68197f9.el7.noarch.rpm
What is the purpose of this variable? Is openshift_node_group_name the name of one of the configmaps in the openshift-node project? I guess it is used to schedule pods to the specified groups, but when I set openshift_node_group_name=node-config-infra, the pods weren't scheduled to those nodes. Is that correct? Should we update the other roles (openshift-logging/openshift-metrics/openshift_prometheus/openshift_web_console/openshift_hosted/upgrade, etc.) accordingly?
(In reply to Anping Li from comment #5)
> What's purpose for this variable? Is the openshift_node_group_name one of
> the configmap name in openshift-node project?

Yes. These configmaps contain labels (and other config settings) which would be applied to a node group. See https://bugzilla.redhat.com/show_bug.cgi?id=1571194

> I guess it is used to schedule pods to the specified groups. But when I set
> openshift_node_group_name=node-config-infra. The pods weren't scheduled to
> those nodes. Is that correct?

Did the nodes get the required labels?
Although there is a configmap node-config-infra, no nodes are labelled with node-role.kubernetes.io/infra=true. For prometheus, I found that the default node selector has been changed from region=infra to node-role.kubernetes.io/infra=true. For logging and metrics, there is no default node selector, so the logging and metrics pods can be scheduled to any node by default. The node_selector is what is used to schedule pods. Do we really need openshift_node_group_name? The OCP upgrade also asks for openshift_node_group_name. What should be provided to upgrade all nodes in one job?
This is all related to the need to document node group configuration.

*** This bug has been marked as a duplicate of bug 1569476 ***