Bug 1408695

Summary: cns-deploy should NOT try to re-configure a working setup and end up aborting the deployment when executed a second time
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Prasanth <pprakash>
Component: cns-deploy-tool Assignee: Raghavendra Talur <rtalur>
Status: CLOSED ERRATA QA Contact: Prasanth <pprakash>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: cns-3.4 CC: dmesser, hchiramm, jarrpa, mliyazud, pprakash, rcyriac, vinug
Target Milestone: ---   
Target Release: CNS 3.5   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: cns-deploy-4.0.0-2.el7rhgs Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-04-20 18:26:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1415598    

Description Prasanth 2016-12-26 12:43:50 UTC
Description of problem:

cns-deploy should NOT try to re-configure a working setup and end up aborting the deployment when executed a second time

Version-Release number of selected component (if applicable):
cns-deploy-3.1.0-10.el7rhgs.x86_64
openshift v3.4.0.38
kubernetes v1.4.0+776c994

How reproducible: 100%


Steps to Reproduce:
1. Install cns-deploy tool
2. Create a namespace called "storage-project"
3. Set up a router as described in our official doc
4. Create a topology file (a minimal sample is shown after these steps)
5. Execute # cns-deploy -g and set up CNS in the "storage-project" namespace
6. Once it completes, verify that the setup is working
7. Re-run the cns-deploy tool without first aborting the existing deployment.
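
For reference, a minimal topology file for step 4 might look like the following, using heketi's topology schema. The node hostname is taken from the deployment log below; the storage IP and device path are placeholders, and the other two nodes would follow the same pattern:

# Hypothetical topology.json; storage IP and device path are placeholders.
cat > topology.json <<'EOF'
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": ["dhcp47-138.lab.eng.blr.redhat.com"],
              "storage": ["10.70.47.138"]
            },
            "zone": 1
          },
          "devices": ["/dev/sdb"]
        }
      ]
    }
  ]
}
EOF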


Actual results: The tool deleted all the existing resources when it was executed a second time on an already deployed setup. See below:

##############
# cns-deploy -n storage-project -g -c oc -y -l /var/log/1-cns-deploy.log -v 
Using OpenShift CLI.
Using namespace "storage-project".

Error from server: error when creating "/usr/share/heketi/templates/deploy-heketi-template.yaml": templates "deploy-heketi" already exists
Error from server: error when creating "/usr/share/heketi/templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server: error when creating "/usr/share/heketi/templates/heketi-template.yaml": templates "heketi" already exists
Error from server: error when creating "/usr/share/heketi/templates/glusterfs-template.yaml": templates "glusterfs" already exists
Marking 'dhcp47-138.lab.eng.blr.redhat.com' as a GlusterFS node.
error: 'storagenode' already has a value (glusterfs), and --overwrite is false
Marking 'dhcp46-59.lab.eng.blr.redhat.com' as a GlusterFS node.
error: 'storagenode' already has a value (glusterfs), and --overwrite is false
Marking 'dhcp46-33.lab.eng.blr.redhat.com' as a GlusterFS node.
error: 'storagenode' already has a value (glusterfs), and --overwrite is false
Deploying GlusterFS pods.
Error from server: daemonsets.extensions "glusterfs" already exists
Waiting for GlusterFS pods to start ... Checking status of pods matching 'glusterfs-node=pod':
glusterfs-3w641   1/1       Running   0         1h
glusterfs-7rpwq   1/1       Running   0         1h
glusterfs-h9a6e   1/1       Running   0         1h
OK
Found secret 'heketi-service-account-token-zics6' in namespace 'storage-project' for heketi-service-account.
service "deploy-heketi" created
route "deploy-heketi" created
deploymentconfig "deploy-heketi" created
Waiting for deploy-heketi pod to start ... Checking status of pods matching 'glusterfs=heketi-pod':
heketi-1-eh8gx   1/1       Running   0         1h
OK
Determining heketi service URL ... OK
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   107    0   107    0     0  21125      0 --:--:-- --:--:-- --:--:-- 26750
Failed to communicate with deploy-heketi service.
Please verify that a router has been properly configured.
deploymentconfig "deploy-heketi" deleted
route "deploy-heketi" deleted
service "deploy-heketi" deleted
No resources found
service "heketi-storage-endpoints" deleted
serviceaccount "heketi-service-account" deleted
template "deploy-heketi" deleted
template "heketi" deleted
Removing label from 'dhcp47-138.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp47-138.lab.eng.blr.redhat.com" labeled
Removing label from 'dhcp46-59.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp46-59.lab.eng.blr.redhat.com" labeled
Removing label from 'dhcp46-33.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp46-33.lab.eng.blr.redhat.com" labeled
deploymentconfig "heketi" deleted
route "heketi" deleted
service "heketi" deleted
pod "heketi-1-eh8gx" deleted
daemonset "glusterfs" deleted
template "glusterfs" deleted
##############


Expected results: The tool should NOT abort an existing deployment automatically, without proper checks or input from the user


Additional info:

Comment 1 Humble Chirammal 2016-12-26 13:27:21 UTC
There is a flag called '--load' which starts the execution from the topology load step. It would be appreciated if you could validate this report with respect to that flag.
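
For reference, resuming from the topology load step would presumably look something like this (the exact combination of flags is an assumption, modeled on the invocations elsewhere in this report):

# cns-deploy -n storage-project --load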

Comment 2 Prasanth 2016-12-26 13:38:42 UTC
(In reply to Humble Chirammal from comment #1)
> There is a flag called '--load' which starts the execution from the
> topology load step. It would be appreciated if you could validate this
> report with respect to that flag.

That flag might not help in this use case, as I was not trying to resume from the topology load. Rather, this was a fully configured setup on which the deletion of resources happened on a second run. So we should ideally prevent this from happening automatically.

Comment 7 Raghavendra Talur 2017-02-27 07:47:48 UTC
patch posted at https://github.com/gluster/gluster-kubernetes/pull/175
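
The effect of the patch can be seen in the verified output in comment 9: an up-front check that bails out when a heketi pod already exists. A rough shell sketch of that kind of guard (illustrative only; the actual logic lives in the PR above, and the namespace here is hard-coded for brevity):

# Abort early if a heketi pod is already running in the target namespace.
if oc get pods --no-headers --selector='glusterfs=heketi-pod' -n storage-project 2>/dev/null | grep -q 'Running'; then
  echo "Found heketi pod running. Please destroy existing setup and try again."
  exit 1
fi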

Comment 8 Humble Chirammal 2017-02-27 13:02:15 UTC
(In reply to Raghavendra Talur from comment #7)
> patch posted at https://github.com/gluster/gluster-kubernetes/pull/175

Merged.

Comment 9 Prasanth 2017-03-27 11:09:03 UTC
Verified

##########
# cns-deploy -n storage-project -g -c oc -y -l /var/log/1-cns-deploy.log -v
Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    5d
Using namespace "storage-project".
Checking that heketi pod is not running ... 
Checking status of pods matching 'glusterfs=heketi-pod':
heketi-1-lhqfh   1/1       Running   0         2d
Found heketi pod running. Please destroy existing setup and try again.


# cns-deploy -n storage-project  -c oc -y -l /var/log/2-cns-deploy.log -v
Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    5d
Using namespace "storage-project".
Checking that heketi pod is not running ... 
Checking status of pods matching 'glusterfs=heketi-pod':
heketi-1-lhqfh   1/1       Running   0         2d
Found heketi pod running. Please destroy existing setup and try again.
##########
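
To actually tear down the existing setup before re-deploying, the tool's abort mode can be run first, along these lines (the exact flag is per cns-deploy's help text and should be treated as an assumption here):

# cns-deploy -n storage-project -g --abort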

Comment 10 Raghavendra Talur 2017-04-05 12:08:03 UTC
*** Bug 1437792 has been marked as a duplicate of this bug. ***

Comment 12 errata-xmlrpc 2017-04-20 18:26:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1112

Comment 13 vinutha 2018-12-06 19:39:23 UTC
Marking qe-test-coverage as '-' since the preferred mode of deployment is now Ansible-based.