Bug 1410499

Summary: CNS upgrade failing, CrashLoopBackOff

Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Apeksha <akhakhar>
Component: CNS-deployment
Assignee: Michael Adam <madam>
Status: CLOSED NOTABUG
QA Contact: Anoop <annair>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: cns-3.4
CC: akhakhar, annair, ekuric, hchiramm, jarrpa, madam, mliyazud, mzywusko, pprakash, rcyriac, rreddy, rtalur
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-01-06 16:12:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1385247

Description Apeksha 2017-01-05 15:43:37 UTC
Description of problem:
The CNS upgrade is failing: after running the deployment with --latest, the new pods go into CrashLoopBackOff.

This happens when we try to upgrade heketi-2.0.6-1.el7 to heketi 3.1; at this point only the image has been upgraded and the packages have not yet been upgraded. The upgrade from OpenShift 3.3 to 3.4 was verified and everything looks to be working.
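
For context, the upgrade step being described is presumably along these lines (the exact commands are an assumption, not copied from this environment): update the container image in the heketi DC, then trigger a new deployment, which is the "deploy --latest" step mentioned above.

# oc edit dc heketi
# oc deploy heketi --latest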

# oc get pods
NAME                                                     READY     STATUS             RESTARTS   AGE
aplo-router-3-lz1yb                                      1/1       Running            0          12m
glusterfs-dc-dhcp46-121.lab.eng.blr.redhat.com-1-zqmyl   1/1       Running            0          34m
glusterfs-dc-dhcp46-130.lab.eng.blr.redhat.com-1-h7rrs   1/1       Running            1          39m
glusterfs-dc-dhcp46-92.lab.eng.blr.redhat.com-1-je0d0    1/1       Running            0          37m
heketi-1-d1yjh                                           1/1       Running            7          37m
heketi-2-8ws0m                                           0/1       CrashLoopBackOff   5          6m
heketi-2-deploy                                          1/1       Running            0          8m
mongodb-1-mlwl4                                          1/1       Running            7          34m

Describe Heketi pod:
http://pastebin.test.redhat.com/443312

Describe Gluster pod: 
http://pastebin.test.redhat.com/443300

Describe Gluster DC:
http://pastebin.test.redhat.com/443306
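
For reference, output like the above is typically gathered with commands along these lines (pod and DC names taken from the oc get pods listing; --previous shows logs from the crashed container):

# oc describe pod heketi-2-8ws0m
# oc describe pod glusterfs-dc-dhcp46-121.lab.eng.blr.redhat.com-1-zqmyl
# oc describe dc glusterfs-dc-dhcp46-121.lab.eng.blr.redhat.com
# oc logs --previous heketi-2-8ws0m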

Comment 1 Mohamed Ashiq 2017-01-05 15:57:25 UTC
Hi,

I had a look at Apeksha's machine and saw that the old pods are not getting deleted. I manually deleted the old pod and everything looks to be working fine now. On checking, the DCs mentioned above from our templates use the Rolling strategy for updates. It looks like we don't support Recreate in heketi-2.0. We already have the fix for this issue [1].

Possible solutions:
We can provide a workaround for the customers, because even the new build will not have templates with DC support.

The workaround is to edit the DCs for glusterfs and heketi to use the Recreate strategy (a sketch follows below). This has to be done before the upgrade.
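
A minimal sketch of that edit, assuming the DC names from the pod listing above (heketi plus the glusterfs-dc-* DCs, each patched the same way); it switches spec.strategy.type from Rolling to Recreate so the old pod is removed before the new one is started:

# oc patch dc heketi -p '{"spec":{"strategy":{"type":"Recreate"}}}'
# oc patch dc glusterfs-dc-dhcp46-121.lab.eng.blr.redhat.com -p '{"spec":{"strategy":{"type":"Recreate"}}}'

The same change can be made interactively with "oc edit dc <name>".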

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1357686

We should also support upgrading from a deploymentConfig to a daemonSet.

-- Ashiq

Comment 2 Prasanth 2017-01-06 07:18:26 UTC
(In reply to Mohamed Ashiq from comment #1)
 
> Possible solutions:
> We can provide a work around for the customers, because even the new build
> will not have the templates which have dc support.
> 
> Workaround is to edit the dc to recreate strategy in glusterfs and heketi.
> This has to be done before upgrade.

Ashiq, is that the *ONLY* workaround required, considering the fact that we have moved away from deploymentConfigs in the last release to daemonSets in this release?

Comment 3 Humble Chirammal 2017-01-06 16:12:10 UTC
(In reply to Prasanth from comment #2)
> (In reply to Mohamed Ashiq from comment #1)
>  
> > Possible solutions:
> > We can provide a work around for the customers, because even the new build
> > will not have the templates which have dc support.
> > 
> > Workaround is to edit the dc to recreate strategy in glusterfs and heketi.
> > This has to be done before upgrade.
> 
> Ashiq, is that the *ONLY* work-around required considering the fact that we
> have moved away from deploymentConfig from the last release to daemonSets in
> this release?

As discussed, the 'dc to ds' change doesn't have any role here. This bug can actually be closed, as it was reported due to user error. I am closing this bug. Please feel free to reopen if needed.