Bug 1517994 - Installing CNS with osm_default_node_selector fails
Summary: Installing CNS with osm_default_node_selector fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.6.z
Assignee: Abhinav Dahiya
QA Contact: Johnny Liu
URL:
Whiteboard:
Duplicates: 1526422
Depends On:
Blocks: 1526613 1541443
 
Reported: 2017-11-27 21:04 UTC by Gerald Nunn
Modified: 2021-06-10 13:43 UTC
CC List: 7 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-12 05:59:59 UTC
Target Upstream Version:
Embargoed:
gnunn: needinfo-


Attachments
Inventory file (2.90 KB, text/plain), 2017-11-27 21:04 UTC, Gerald Nunn
install log (910.76 KB, text/plain), 2017-11-27 21:04 UTC, Gerald Nunn
glusterfs events (4.26 MB, text/plain), 2017-11-27 21:08 UTC, Gerald Nunn


Links
Red Hat Product Errata RHBA-2018:1106, last updated 2018-04-12 06:00:45 UTC

Description Gerald Nunn 2017-11-27 21:04:20 UTC
Created attachment 1359603 [details]
Inventory file

Description of problem:

I am installing a single master with dedicated gluster storage nodes and app nodes. I want to ensure that user pods are only ever scheduled on the app nodes and never run on the master or the gluster nodes. To do so, I add the following to my inventory:

osm_default_node_selector="region=primary"

However, this causes the playbook to fail while waiting for the daemonset to start, as shown in the attached install log. If I run the playbook without osm_default_node_selector, everything runs as expected.
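Roughly, the relevant inventory lines look like this (hostnames and the device path below are placeholders; the attached inventory file has the exact contents):

[OSEv3:vars]
osm_default_node_selector="region=primary"

[nodes]
app-node-1.example.com openshift_node_labels="{'region': 'primary'}"
gluster-node-1.example.com

[glusterfs]
gluster-node-1.example.com glusterfs_devices='["/dev/xvdb"]'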

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Gerald Nunn 2017-11-27 21:04:54 UTC
Created attachment 1359604 [details]
install log

Comment 2 Gerald Nunn 2017-11-27 21:08:29 UTC
Created attachment 1359605 [details]
glusterfs events

Comment 4 Gerald Nunn 2017-11-28 17:26:02 UTC
I'm using the package: openshift-ansible-3.6.173.0.75-1.git.0.0a44128.el7.noarch

Comment 5 Vadim Rutkovsky 2018-01-17 14:05:05 UTC
It seems both the 'openshift_storage_glusterfs_nodeselector' ("glusterfs": "storage-host") and 'osm_default_node_selector' ("region": "primary") selectors are applied to this pod, so the scheduler is looking for a node that satisfies both.

Which labels are set for your nodes?
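You can check with, for example:

oc get nodes --show-labels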

Comment 7 Gerald Nunn 2018-01-18 03:13:37 UTC
(In reply to Vadim Rutkovsky from comment #5)
> It seems both the 'openshift_storage_glusterfs_nodeselector' ("glusterfs":
> "storage-host") and 'osm_default_node_selector' ("region": "primary")
> selectors are applied to this pod, so the scheduler is looking for a node
> that satisfies both.
> 
> Which labels are set for your nodes?

As per the attached inventory file, the gluster nodes have no labels, whereas the other nodes are labelled with region=primary.

Comment 8 Vadim Rutkovsky 2018-01-18 09:59:12 UTC
Are there any pods running in glusterfs namespace?

Could you try labelling one of your nodes with both 'region=primary' and 'glusterfs=storage-host' and check whether that lets one of the pods get scheduled?
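Something like this should do it (substitute your node name; --overwrite is only needed if one of the labels is already set):

oc label node <node-name> region=primary glusterfs=storage-host --overwrite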

Comment 9 Scott Dodson 2018-01-19 19:26:46 UTC
Jose backported those fixes in https://github.com/openshift/openshift-ansible/pull/6493

The fix should be in openshift-ansible-3.6.173.0.90-1 or newer.

Comment 10 Wenkai Shi 2018-01-23 05:50:53 UTC
Verified with version openshift-ansible-3.6.173.0.96-1.git.0.2954b4a.el7. Installation with the default node selector set now succeeds.

Comment 11 Jose A. Rivera 2018-02-14 15:15:42 UTC
*** Bug 1526422 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2018-04-12 05:59:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1106

