Bug 1435401 - [Tracker Bug (OCP)] [RFE] cns-deploy should permanently set the storagenode=gluster label to enable automatic restart of GlusterFS pods
Keywords:
Status: CLOSED DUPLICATE of bug 1559271
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: CNS-deployment
Version: cns-3.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: CNS 3.7
Assignee: Michael Adam
QA Contact: Prasanth
URL:
Whiteboard:
Depends On: 1326732 1559271
Blocks:
 
Reported: 2017-03-23 17:41 UTC by Daniel Messer
Modified: 2019-02-05 10:40 UTC
CC: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-05 10:40:41 UTC
Embargoed:



Description Daniel Messer 2017-03-23 17:41:56 UTC
Description of problem:

During a cns-deploy run, a DaemonSet template containing a node selector on 'storagenode=glusterfs' is used to deploy the GlusterFS pods. The label is applied to each node via the Kubernetes API. When a node gets rebooted or is temporarily shut down, the OpenShift masters delete the node object. When the node comes back up, the label is not re-applied because the node registers from scratch.
Hence the GlusterFS pod does not start up automatically, leaving the deployment in a degraded state despite the node being back up and healthy.
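
For illustration, a minimal sketch of the relevant part of such a DaemonSet spec (the apiVersion and resource name here are my assumptions, not taken from the actual cns-deploy template):

    # Illustrative DaemonSet excerpt: pods are only scheduled onto nodes
    # that carry the storagenode=glusterfs label.
    apiVersion: extensions/v1beta1   # DaemonSet API group in the OCP 3.x era
    kind: DaemonSet
    metadata:
      name: glusterfs                # hypothetical name
    spec:
      template:
        spec:
          nodeSelector:
            storagenode: glusterfs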

Request for enhancement:

cns-deploy should modify /etc/origin/node/node-config.yaml to include the storagenode=glusterfs label so that it is present upon node registration.
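
A minimal sketch of what that change could look like, assuming the kubeletArguments mechanism in the OCP 3.x node-config.yaml (the exact stanza is an assumption on my part, not a confirmed cns-deploy change):

    # /etc/origin/node/node-config.yaml (excerpt)
    # node-labels is passed through to the kubelet, so the label is applied
    # at every node registration rather than once via the API.
    kubeletArguments:
      node-labels:
        - "storagenode=glusterfs"

Because the kubelet applies these labels itself each time it registers, the label would survive the delete/re-register cycle described above, and the DaemonSet would reschedule the GlusterFS pod automatically.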

Version-Release number of selected component (if applicable):


How reproducible:

- Deploy CNS on OpenShift Container Platform
- Observe that all GlusterFS pods are healthy
- Observe the label is present: oc get nodes --show-labels
- Shut down one of the OpenShift nodes hosting a GlusterFS pod
- Observe the node being erased by the masters: oc get nodes
- Restart the node
- Observe the node rejoin the cluster, but without the label: oc get nodes --show-labels

Actual results:

- Observe the node rejoin the cluster, but without the label: oc get nodes --show-labels
- Observe the GlusterFS pod missing: oc get pods


Expected results:

- Observe the node rejoin the cluster with the label: oc get nodes --show-labels
- Observe the GlusterFS pod spawned again: oc get pods


Additional info:

A temporary workaround is to relabel the node(s) with: oc label node <node-name> storagenode=glusterfs

Comment 2 Michael Adam 2017-03-30 13:52:52 UTC
Interesting observation, thanks!

We are doing the initial CLI labeling in cns-deploy, but if I get you right, there is no mechanism to reapply the label when a node hosting a gluster pod is brought down and up again. So the node-config.yaml change would be a way to make this labeling permanent?

Thanks - Michael

Comment 3 Daniel Messer 2017-03-31 08:47:04 UTC
Correct. You may want to check back with the OpenShift/Kubernetes folks to verify this is indeed the best way. It worked just fine in my environment.

Comment 8 Humble Chirammal 2017-08-03 04:33:03 UTC
This has been communicated to the CNS program and there are no objections to moving this out of the CNS 3.6 release. I am changing the flag accordingly.

Comment 11 Niels de Vos 2019-02-05 10:40:41 UTC
According to the dependent bz, this has been fixed through bug 1559271.

*** This bug has been marked as a duplicate of bug 1559271 ***

