Bug 1517994 - Installing CNS with osm_default_node_selector fails
Summary: Installing CNS with osm_default_node_selector fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.6.z
Assignee: Abhinav Dahiya
QA Contact: Johnny Liu
URL:
Whiteboard:
Duplicates: 1526422
Depends On:
Blocks: 1526613 1541443
 
Reported: 2017-11-27 21:04 UTC by Gerald Nunn
Modified: 2021-06-10 13:43 UTC
CC List: 7 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-12 05:59:59 UTC
Target Upstream Version:
Embargoed:
gnunn: needinfo-


Attachments
Inventory file (2.90 KB, text/plain), 2017-11-27 21:04 UTC, Gerald Nunn
install log (910.76 KB, text/plain), 2017-11-27 21:04 UTC, Gerald Nunn
glusterfs events (4.26 MB, text/plain), 2017-11-27 21:08 UTC, Gerald Nunn


Links
Red Hat Product Errata RHBA-2018:1106, last updated 2018-04-12 06:00:45 UTC

Description Gerald Nunn 2017-11-27 21:04:20 UTC
Created attachment 1359603 [details]
Inventory file

Description of problem:

I am installing a single master with dedicated gluster storage nodes and app nodes. I want to ensure that user pods are only ever scheduled on the app nodes and never run on the master or the gluster nodes. To do so, I add the following to my inventory:

osm_default_node_selector="region=primary"

However, this causes the playbook to fail while waiting for the daemonset to start, as shown in the attached install log. If I run the playbook without osm_default_node_selector, everything runs as expected.
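Roughly, the relevant inventory lines look like this (hostnames and the device path below are placeholders; the attached inventory file has the exact contents):

[OSEv3:vars]
osm_default_node_selector="region=primary"

[nodes]
app-node-1.example.com openshift_node_labels="{'region': 'primary'}"
gluster-node-1.example.com

[glusterfs]
gluster-node-1.example.com glusterfs_devices='["/dev/xvdb"]'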

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Gerald Nunn 2017-11-27 21:04:54 UTC
Created attachment 1359604 [details]
install log

Comment 2 Gerald Nunn 2017-11-27 21:08:29 UTC
Created attachment 1359605 [details]
glusterfs events

Comment 4 Gerald Nunn 2017-11-28 17:26:02 UTC
I'm using the package: openshift-ansible-3.6.173.0.75-1.git.0.0a44128.el7.noarch

Comment 5 Vadim Rutkovsky 2018-01-17 14:05:05 UTC
It seems both the 'openshift_storage_glusterfs_nodeselector' ("glusterfs": "storage-host") and 'osm_default_node_selector' ("region": "primary") selectors are applied to this pod, so the scheduler is looking for a node that satisfies both.

Which labels are set for your nodes?
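You can check with, for example:

oc get nodes --show-labels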

Comment 7 Gerald Nunn 2018-01-18 03:13:37 UTC
(In reply to Vadim Rutkovsky from comment #5)
> It seems both the 'openshift_storage_glusterfs_nodeselector' ("glusterfs":
> "storage-host") and 'osm_default_node_selector' ("region": "primary")
> selectors are applied to this pod, so the scheduler is looking for a node
> that satisfies both.
> 
> Which labels are set for your nodes?

As per the attached inventory file, the gluster nodes have no labels, whereas the other nodes are labelled with region=primary.

Comment 8 Vadim Rutkovsky 2018-01-18 09:59:12 UTC
Are there any pods running in glusterfs namespace?

Could you try labelling one of your nodes with both 'region=primary' and 'glusterfs=storage-host' and check whether that lets one of the pods get scheduled?
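Something like this should do it (substitute your node name; --overwrite is only needed if one of the labels is already set):

oc label node <node-name> region=primary glusterfs=storage-host --overwrite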

Comment 9 Scott Dodson 2018-01-19 19:26:46 UTC
Jose backported those fixes in https://github.com/openshift/openshift-ansible/pull/6493

The fix should be in openshift-ansible-3.6.173.0.90-1 or newer.

Comment 10 Wenkai Shi 2018-01-23 05:50:53 UTC
Verified with version openshift-ansible-3.6.173.0.96-1.git.0.2954b4a.el7. Installation with the default node selector set now succeeds.

Comment 11 Jose A. Rivera 2018-02-14 15:15:42 UTC
*** Bug 1526422 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2018-04-12 05:59:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1106

