Bug 1517994

Summary: Installing CNS with osm_default_node_selector fails
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-installer
Version: 3.6.0
Target Release: 3.6.z
Hardware: Unspecified
OS: Unspecified
Reporter: Gerald Nunn <gnunn>
Assignee: Abhinav Dahiya <adahiya>
QA Contact: Johnny Liu <jialiu>
CC: aos-bugs, aos-storage-staff, bmchugh, gnunn, jokerman, mmccomas, vrutkovs
Flags: gnunn: needinfo-
Severity: unspecified
Priority: unspecified
Status: CLOSED ERRATA
Type: Bug
Doc Type: No Doc Update
Last Closed: 2018-04-12 05:59:59 UTC
Bug Blocks: 1526613, 1541443
Attachments:
  Inventory file
  install log
  glusterfs events

Description Gerald Nunn 2017-11-27 21:04:20 UTC
Created attachment 1359603 [details]
Inventory file

Description of problem:

I am installing a single master with dedicated gluster storage and app nodes. I want to ensure that user pods are only ever scheduled on the app nodes and do not run on the master or the gluster nodes. To do so, I add the following to my inventory:

osm_default_node_selector="region=primary"

However, this causes the playbook to fail while waiting for the GlusterFS daemonset to start, as shown in the attached install log. If I run the playbook without osm_default_node_selector, everything runs as expected.
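
For reference, the relevant parts of the inventory look roughly like this (host names, device paths and the omitted groups are placeholders; the full file is attached):

[OSEv3:vars]
# cluster-wide default node selector for pods that do not set their own
osm_default_node_selector="region=primary"

[glusterfs]
# dedicated storage nodes and their raw block devices
gluster1.example.com glusterfs_devices='["/dev/sdb"]'
gluster2.example.com glusterfs_devices='["/dev/sdb"]'
gluster3.example.com glusterfs_devices='["/dev/sdb"]'

[nodes]
# app nodes carry region=primary; the gluster nodes carry no region label
app1.example.com openshift_node_labels="{'region': 'primary'}"
app2.example.com openshift_node_labels="{'region': 'primary'}"
gluster1.example.com
gluster2.example.com
gluster3.example.com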

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Gerald Nunn 2017-11-27 21:04:54 UTC
Created attachment 1359604 [details]
install log

Comment 2 Gerald Nunn 2017-11-27 21:08:29 UTC
Created attachment 1359605 [details]
glusterfs events

Comment 4 Gerald Nunn 2017-11-28 17:26:02 UTC
I'm using the package: openshift-ansible-3.6.173.0.75-1.git.0.0a44128.el7.noarch

Comment 5 Vadim Rutkovsky 2018-01-17 14:05:05 UTC
It seems both the 'openshift_storage_glusterfs_nodeselector' ("glusterfs": "storage-host") and 'osm_default_node_selector' ("region": "primary") selectors are applied to this pod, so the scheduler has to find a node that satisfies both clauses.
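
Roughly, every glusterfs pod then has to land on a node carrying both labels, something like this (illustrative, not the exact daemonset spec):

nodeSelector:
  glusterfs: storage-host   # from openshift_storage_glusterfs_nodeselector
  region: primary           # from osm_default_node_selector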

Which labels are set for your nodes?

Comment 7 Gerald Nunn 2018-01-18 03:13:37 UTC
(In reply to Vadim Rutkovsky from comment #5)
> It seems both the 'openshift_storage_glusterfs_nodeselector' ("glusterfs":
> "storage-host") and 'osm_default_node_selector' ("region": "primary")
> selectors are applied to this pod, so the scheduler has to find a node that
> satisfies both clauses.
> 
> Which labels are set for your nodes?

As per the attached inventory file, the gluster nodes have no labels, whereas the other nodes are labelled with region=primary.

Comment 8 Vadim Rutkovsky 2018-01-18 09:59:12 UTC
Are there any pods running in the glusterfs namespace?

Could you try labelling one of your nodes with both 'region=primary' and 'glusterfs=storage-host' and check whether that gets one of the pods scheduled?
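
Something along these lines should do it ('gluster1.example.com' is a placeholder for one of your gluster hosts):

# give one node both labels so it satisfies the combined selector
oc label node gluster1.example.com region=primary glusterfs=storage-host --overwrite
# then check whether a glusterfs pod gets scheduled onto it
oc get pods -n glusterfs -o wide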

Comment 9 Scott Dodson 2018-01-19 19:26:46 UTC
Jose backported those fixes in https://github.com/openshift/openshift-ansible/pull/6493

The fix should be in openshift-ansible-3.6.173.0.90-1 or newer.

Comment 10 Wenkai Shi 2018-01-23 05:50:53 UTC
Verified with openshift-ansible-3.6.173.0.96-1.git.0.2954b4a.el7. Installation with the default node selector set now succeeds.

Comment 11 Jose A. Rivera 2018-02-14 15:15:42 UTC
*** Bug 1526422 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2018-04-12 05:59:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1106