Bug 1483923
Summary: | CNS deployment fails if default node selector is set
---|---
Product: | OpenShift Container Platform
Component: | Installer
Version: | 3.6.0
Target Release: | 3.7.0
Status: | CLOSED ERRATA
Severity: | medium
Priority: | high
Reporter: | Johan Swensson <jswensso>
Assignee: | Jose A. Rivera <jarrpa>
QA Contact: | Wenkai Shi <weshi>
CC: | anrussel, aos-bugs, benjamin.affolter, bmchugh, cbucur, jialiu, jokerman, jrosenta, jswensso, mmccomas, myllynen, sdodson, stwalter, weshi
Hardware: | Unspecified
OS: | Unspecified
Type: | Bug
Doc Type: | Bug Fix
Doc Text: | Allow option to use or ignore default node selectors
Clones: | 1526422 (view as bug list)
Bug Blocks: | 1526422
Last Closed: | 2017-11-28 22:07:41 UTC
Attachments: | nodes, daemonset
Description (Johan Swensson, 2017-08-22 09:35:41 UTC)
What's the "oc describe" output for the GlusterFS daemonset and the cluster nodes? Offhand I imagine the problem is that the DaemonSet is looking for a node with both the GlusterFS label and the osm_default_node_selector label. I don't think it'd be a good idea to also automatically label all GlusterFS nodes with the default label since that opens up the possibility for other pods that may not be desired running there. Scott, is there any way to ignore osm_default_node_selector? Created attachment 1316679 [details]
nodes
Created attachment 1316680 [details]
daemonset
Uploaded the information as requested. Also, in my case, the default node selector does not match any of the gluster nodes.

PR for this is upstream: https://github.com/openshift/openshift-ansible/pull/5316

PR is merged.

No openshift-ansible build is attached to the errata and there is no errata puddle; moving this to MODIFIED.

Failed to verify with version openshift-ansible-3.6.173.0.35-1.git.0.6c318bc.el7. Installation failed when osm_default_node_selector='region=compute' was set, because there were not three nodes with the label "region=compute". It succeeded when osm_default_node_selector='role=node' was set, because the glusterfs nodes both have the label "role=node". Judging by the task "Verify target namespace exists", there seems to be no difference between openshift-ansible-3.6.173.0.35-1.git.0.6c318bc.el7 and openshift-ansible-3.6.173.0.5-3.git.0.522a92a.el7.

```
# cat roles/openshift_storage_glusterfs/tasks/glusterfs_common.yml
...
- name: Verify target namespace exists
  oc_project:
    state: present
    name: "{{ glusterfs_namespace }}"
    node_selector: "{% if glusterfs_use_default_selector %}{{ omit }}{% endif %}"
  when: glusterfs_is_native or glusterfs_heketi_is_native or glusterfs_storageclass
...
```

I mean there is no difference between those versions' output. The code did change:

```
# cat roles/openshift_storage_glusterfs/tasks/glusterfs_common.yml
...
    node_selector: "{% if glusterfs_use_default_selector %}{{ omit }}{% endif %}"
...
```

And you didn't set openshift_storage_glusterfs_use_default_selector=True?

(In reply to Jose A. Rivera from comment #14)
> And you didn't set openshift_storage_glusterfs_use_default_selector=True?

Correct, I didn't. It seems it is False by default:

```
# grep -nir "openshift_storage_glusterfs_use_default_selector" .
./roles/openshift_storage_glusterfs/defaults/main.yml:6:openshift_storage_glusterfs_use_default_selector: False
```

The following PR should resolve the issue: https://github.com/openshift/openshift-ansible/pull/5608

PR is merged.

Verified with version openshift-ansible-3.7.0-0.159.0.git.0.0cf8cf6.el7: with osm_default_node_selector='region=compute' set, and without three nodes carrying the label "region=compute", the installation succeeded.

Would this work as a workaround: if osm_default_node_selector='region=compute', you can set a node selector manually on the project itself (oc edit namespace or something similar) or on the DaemonSet, for example 'region=gluster'. A conflict can be averted as long as you overwrite the defaults at a finer granularity. So the cluster at large will follow the default node selector, but the gluster pods can follow a different node selector if you lay out the labels properly:

```
node1: region=infra,   zone=apps
node2: region=infra,   zone=internal
node3: region=compute, zone=apps
node4: region=compute, zone=internal
```

If the default node selector is region=infra, then you can set the node selector in the daemonset to override it with region=compute. Or, if you don't want to force the pods onto the compute nodes but rather onto zone=internal, you can set (in the namespace definition):

```
metadata:
  annotations:
    openshift.io/node-selector: ""
```

and set the daemonset's node selector to zone=internal.

For me, setting the node selector on the daemonset had no influence; it has to be set on the namespace. I also tried creating the glusterfs project via openshift_additional_projects and setting either no node selector or the same one the role sets by default (glusterfs=storage-host), but that didn't help. So as of now it seems the only option is to manually set a node selector on the namespace, unless someone else has another idea.
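To make the namespace-level workaround discussed in the comments above concrete, here is a minimal sketch of the project definition. This is not taken from the bug itself: the project name `glusterfs` is an assumption (use whatever `{{ glusterfs_namespace }}` resolves to in your installation). The empty `openshift.io/node-selector` annotation is the piece that overrides `osm_default_node_selector`; the pods keep the `glusterfs=storage-host` selector that the role applies.

```yaml
# Sketch only: the namespace name "glusterfs" is an assumption; substitute
# the project your installation actually uses.
apiVersion: v1
kind: Namespace
metadata:
  name: glusterfs
  annotations:
    # An empty value overrides the cluster-wide default node selector
    # (osm_default_node_selector) for this project, so no extra selector
    # is merged into the GlusterFS pods' scheduling requirements.
    openshift.io/node-selector: ""
```

Per the comment above, a nodeSelector on the DaemonSet alone was not sufficient; clearing the selector at the namespace level is what removes the conflicting default.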
Yes, you have to manually set a node selector on the entire namespace. Note that this selector can be "" (the empty string), which overrides the default node selector with nothing and thus imposes no additional node selector on pods in that namespace.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188
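For an existing project, the empty node selector mentioned in the closing comment can also be applied from the command line. A minimal sketch, assuming the GlusterFS project is named `glusterfs`; `oc annotate` sets the same `openshift.io/node-selector` annotation shown in the workaround above.

```sh
# Assumed project name "glusterfs"; adjust to your environment.
# Set (or overwrite) the project-level node selector to the empty string so
# the cluster default selector no longer applies to pods in this namespace.
oc annotate namespace glusterfs openshift.io/node-selector="" --overwrite

# Confirm the annotation is present and empty.
oc get namespace glusterfs -o yaml | grep node-selector
```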