Bug 1408695

Summary: cns-deploy should NOT try to re-configure a working setup and end up aborting the deployment when executed a second time
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Prasanth <pprakash>
Component: cns-deploy-tool Assignee: Raghavendra Talur <rtalur>
Status: CLOSED ERRATA QA Contact: Prasanth <pprakash>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: cns-3.4 CC: dmesser, hchiramm, jarrpa, mliyazud, pprakash, rcyriac, vinug
Target Milestone: ---   
Target Release: CNS 3.5   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: cns-deploy-4.0.0-2.el7rhgs Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-04-20 18:26:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1415598    

Description Prasanth 2016-12-26 12:43:50 UTC
Description of problem:

cns-deploy should NOT try to re-configure a working setup and end up aborting the deployment when executed a second time

Version-Release number of selected component (if applicable):
cns-deploy-3.1.0-10.el7rhgs.x86_64
openshift v3.4.0.38
kubernetes v1.4.0+776c994

How reproducible: 100%


Steps to Reproduce:
1. Install cns-deploy tool
2. Create a namespace called "storage-project"
3. Set up a router as described in our official doc
4. Create a topology file (a minimal sample is shown after these steps)
5. Execute # cns-deploy -g and set up CNS in the "storage-project" namespace
6. Once it completes, verify that the setup is working
7. Re-run the cns-deploy tool without first aborting the existing deployment.
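
For reference, a minimal topology file for step 4 might look like the following, using heketi's topology schema. The node hostname is taken from the deployment log below; the storage IP and device path are placeholders, and the other two nodes would follow the same pattern:

# Hypothetical topology.json; storage IP and device path are placeholders.
cat > topology.json <<'EOF'
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": ["dhcp47-138.lab.eng.blr.redhat.com"],
              "storage": ["10.70.47.138"]
            },
            "zone": 1
          },
          "devices": ["/dev/sdb"]
        }
      ]
    }
  ]
}
EOF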


Actual results: The tool deleted all the existing resources when it was executed a second time on an already deployed setup. See below:

##############
# cns-deploy -n storage-project -g -c oc -y -l /var/log/1-cns-deploy.log -v 
Using OpenShift CLI.
Using namespace "storage-project".

Error from server: error when creating "/usr/share/heketi/templates/deploy-heketi-template.yaml": templates "deploy-heketi" already exists
Error from server: error when creating "/usr/share/heketi/templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server: error when creating "/usr/share/heketi/templates/heketi-template.yaml": templates "heketi" already exists
Error from server: error when creating "/usr/share/heketi/templates/glusterfs-template.yaml": templates "glusterfs" already exists
Marking 'dhcp47-138.lab.eng.blr.redhat.com' as a GlusterFS node.
error: 'storagenode' already has a value (glusterfs), and --overwrite is false
Marking 'dhcp46-59.lab.eng.blr.redhat.com' as a GlusterFS node.
error: 'storagenode' already has a value (glusterfs), and --overwrite is false
Marking 'dhcp46-33.lab.eng.blr.redhat.com' as a GlusterFS node.
error: 'storagenode' already has a value (glusterfs), and --overwrite is false
Deploying GlusterFS pods.
Error from server: daemonsets.extensions "glusterfs" already exists
Waiting for GlusterFS pods to start ... Checking status of pods matching 'glusterfs-node=pod':
glusterfs-3w641   1/1       Running   0         1h
glusterfs-7rpwq   1/1       Running   0         1h
glusterfs-h9a6e   1/1       Running   0         1h
OK
Found secret 'heketi-service-account-token-zics6' in namespace 'storage-project' for heketi-service-account.
service "deploy-heketi" created
route "deploy-heketi" created
deploymentconfig "deploy-heketi" created
Waiting for deploy-heketi pod to start ... Checking status of pods matching 'glusterfs=heketi-pod':
heketi-1-eh8gx   1/1       Running   0         1h
OK
Determining heketi service URL ... OK
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   107    0   107    0     0  21125      0 --:--:-- --:--:-- --:--:-- 26750
Failed to communicate with deploy-heketi service.
Please verify that a router has been properly configured.
deploymentconfig "deploy-heketi" deleted
route "deploy-heketi" deleted
service "deploy-heketi" deleted
No resources found
service "heketi-storage-endpoints" deleted
serviceaccount "heketi-service-account" deleted
template "deploy-heketi" deleted
template "heketi" deleted
Removing label from 'dhcp47-138.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp47-138.lab.eng.blr.redhat.com" labeled
Removing label from 'dhcp46-59.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp46-59.lab.eng.blr.redhat.com" labeled
Removing label from 'dhcp46-33.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp46-33.lab.eng.blr.redhat.com" labeled
deploymentconfig "heketi" deleted
route "heketi" deleted
service "heketi" deleted
pod "heketi-1-eh8gx" deleted
daemonset "glusterfs" deleted
template "glusterfs" deleted
##############


Expected results: The tool should NOT abort an existing deployment automatically, without proper checks or input from the user


Additional info:

Comment 1 Humble Chirammal 2016-12-26 13:27:21 UTC
There is a flag called '--load' which starts the execution from the topology load step. It would be appreciated if you could validate this report with respect to that flag.
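
For reference, resuming from the topology load step would presumably look something like this (the exact combination of flags is an assumption, modeled on the invocations elsewhere in this report):

# cns-deploy -n storage-project --load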

Comment 2 Prasanth 2016-12-26 13:38:42 UTC
(In reply to Humble Chirammal from comment #1)
> There is a flag called '--load' which starts the execution from the
> topology load step. It would be appreciated if you could validate this
> report with respect to that flag.

That flag might not help in this use case, as I was not trying to resume from the topology load. Rather, this was a fully configured setup on which the deletion of resources happened on a second run. So we should ideally prevent this from happening automatically.

Comment 7 Raghavendra Talur 2017-02-27 07:47:48 UTC
patch posted at https://github.com/gluster/gluster-kubernetes/pull/175
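
The effect of the patch can be seen in the verified output in comment 9: an up-front check that bails out when a heketi pod already exists. A rough shell sketch of that kind of guard (illustrative only; the actual logic lives in the PR above, and the namespace here is hard-coded for brevity):

# Abort early if a heketi pod is already running in the target namespace.
if oc get pods --no-headers --selector='glusterfs=heketi-pod' -n storage-project 2>/dev/null | grep -q 'Running'; then
  echo "Found heketi pod running. Please destroy existing setup and try again."
  exit 1
fi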

Comment 8 Humble Chirammal 2017-02-27 13:02:15 UTC
(In reply to Raghavendra Talur from comment #7)
> patch posted at https://github.com/gluster/gluster-kubernetes/pull/175

Merged.

Comment 9 Prasanth 2017-03-27 11:09:03 UTC
Verified

##########
# cns-deploy -n storage-project -g -c oc -y -l /var/log/1-cns-deploy.log -v
Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    5d
Using namespace "storage-project".
Checking that heketi pod is not running ... 
Checking status of pods matching 'glusterfs=heketi-pod':
heketi-1-lhqfh   1/1       Running   0         2d
Found heketi pod running. Please destroy existing setup and try again.


# cns-deploy -n storage-project  -c oc -y -l /var/log/2-cns-deploy.log -v
Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    5d
Using namespace "storage-project".
Checking that heketi pod is not running ... 
Checking status of pods matching 'glusterfs=heketi-pod':
heketi-1-lhqfh   1/1       Running   0         2d
Found heketi pod running. Please destroy existing setup and try again.
##########
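
To actually tear down the existing setup before re-deploying, the tool's abort mode can be run first, along these lines (the exact flag is per cns-deploy's help text and should be treated as an assumption here):

# cns-deploy -n storage-project -g --abort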

Comment 10 Raghavendra Talur 2017-04-05 12:08:03 UTC
*** Bug 1437792 has been marked as a duplicate of this bug. ***

Comment 12 errata-xmlrpc 2017-04-20 18:26:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1112

Comment 13 vinutha 2018-12-06 19:39:23 UTC
Marking qe-test-coverage as '-' since the preferred mode of deployment is now Ansible-based.