Bug 1594491

Summary: current 3.10 puddle build deploys docker-registry PV twice when using CNS
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nicholas Schuetz <nick>
Component: cns-ansible
Assignee: Jose A. Rivera <jarrpa>
Status: CLOSED NOTABUG
QA Contact: Ashmitha Ambastha <asambast>
Severity: high
Priority: unspecified
Version: rhgs-3.0
CC: akhakhar, aos-bugs, aos-storage-staff, asambast, hchiramm, jarrpa, kramdoss, lxia, madam, nick, pprakash, rcyriac, rhs-bugs, rtalur, sankarshan
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2018-08-07 15:11:02 UTC
Type: Bug

Attachments:
- ansible inventory file
- Inventory file for latest builds and registry

Description Nicholas Schuetz 2018-06-23 17:18:42 UTC
Created attachment 1453955 [details]
ansible inventory file

Deploying the latest puddle build (3.10.5) with CNS enabled for all infra pods yields two docker-registry PVs of equal size.

pvc-46ce6d13-766c-11e8-ac7b-525400185802   25Gi       RWX            Delete           Bound       default/registry-claim                      glusterfs-storage                   18h
pvc-95ff382c-766c-11e8-ac7b-525400185802   10Gi       RWO            Delete           Bound       openshift-infra/metrics-cassandra-1         glusterfs-storage-block             18h
pvc-bee2c666-766c-11e8-ac7b-525400185802   10Gi       RWO            Delete           Bound       openshift-logging/logging-es-0              glusterfs-storage-block             18h
pvc-ed0ba9da-766c-11e8-ac7b-525400185802   10Gi       RWO            Delete           Bound       openshift-metrics/prometheus                glusterfs-storage                   18h
pvc-ed78dcb6-766c-11e8-ac7b-525400185802   10Gi       RWO            Delete           Bound       openshift-metrics/prometheus-alertmanager   glusterfs-storage                   18h
pvc-ede9f936-766c-11e8-ac7b-525400185802   10Gi       RWO            Delete           Bound       openshift-metrics/prometheus-alertbuffer    glusterfs-storage                   18h
registry-volume                            25Gi       RWX            Retain           Available  


registry-volume is unused.  Attaching the ansible inventory file that I used.
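The duplication can arise when the inventory both points the hosted registry at glusterfs storage (the old static "registry-volume" path) and enables the CNS StorageClass used for dynamic provisioning. A minimal sketch of the relevant settings follows; variable names are from the openshift-ansible 3.10 inventory documentation, and the hostnames are placeholders, so treat this as illustrative rather than the exact reproducer (that is in the attached inventory file):

```ini
[OSEv3:vars]
# Hosted registry backed by gluster; this path creates the static
# "registry-volume" PV via gluster endpoints at install time.
openshift_hosted_registry_storage_kind=glusterfs
openshift_hosted_registry_storage_volume_size=25Gi

# CNS cluster itself, which also installs the "glusterfs-storage"
# StorageClass used for the dynamically provisioned pvc-... PV.
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_block_deploy=true

[glusterfs]
node01.example.com glusterfs_devices='["/dev/vdb"]'
node02.example.com glusterfs_devices='["/dev/vdb"]'
node03.example.com glusterfs_devices='["/dev/vdb"]'
```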

Comment 2 Humble Chirammal 2018-07-02 06:53:06 UTC
Jose, Can you please look into this ?

Comment 3 Jose A. Rivera 2018-07-02 12:16:44 UTC
Please provide what version of openshift-ansible was used, the full inventory file, and a complete list of the steps you took for installation that led to this situation.

Comment 4 Nicholas Schuetz 2018-07-03 22:59:32 UTC
I already provided the first two (see the attachment and the text of the BZ).  Additionally, here are the steps (exactly) that I used.

https://github.com/nnachefski/ocpstuff/blob/master/install/install-beta.md#-begin

Comment 5 Nicholas Schuetz 2018-07-07 15:24:21 UTC
Just installed v3.10.14 and got the same result.  It looks like the old-style hosted PV is deployed along with a dynamic PV (via the glusterfs-storage storageclass).  The dynamic PV is the one being used (which I'm ok with).  As it is now, I just manually delete the extra registry-volume PV when the install is complete.
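The manual cleanup described above can be sketched as follows. The awk filter simply picks out PVs stuck in Available (provisioned but never bound); the sample input is condensed from the `oc get pv` output in this report, so the column layout is an assumption about that output format:

```shell
# Condensed `oc get pv` output: Bound rows have a CLAIM column,
# the stale static registry PV sits at Available with no claim.
pv_output='pvc-444326d1-81f9-11e8-b3fa-525400185802   25Gi   RWX   Delete   Bound      default/registry-claim
registry-volume                            25Gi   RWX   Retain   Available'

# Print the names of unused (Available) PVs; here: registry-volume.
echo "$pv_output" | awk '$5 == "Available" {print $1}'

# The stale PV can then be removed with:
#   oc delete pv registry-volume
```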

[root@master03 ~]# oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                    STORAGECLASS        REASON    AGE
pvc-444326d1-81f9-11e8-b3fa-525400185802   25Gi       RWX            Delete           Bound       default/registry-claim   glusterfs-storage             18s
registry-volume                            25Gi       RWX            Retain           Available                                                          50s

Comment 9 Ashmitha Ambastha 2018-07-12 10:50:21 UTC
Hi Humble, 

In my deployment I didn't see this issue. There is just one PV for the docker registry, with the default size of 5Gi.

------------------ oc get pv output ---------------

# oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                 STORAGECLASS               REASON    AGE
pvc-485dd2e6-6f44-11e8-be0b-005056a56419   5G         RWO            Delete           Bound     openshift-infra/metrics-cassandra-1   glusterfs-registry-block             28d
pvc-4a1b0746-6f44-11e8-be0b-005056a56419   5G         RWO            Delete           Bound     openshift-infra/metrics-cassandra-2   glusterfs-registry-block             28d
pvc-4bca43f3-6f44-11e8-be0b-005056a56419   5G         RWO            Delete           Bound     openshift-infra/metrics-cassandra-3   glusterfs-registry-block             28d
pvc-8de354f6-6f44-11e8-be0b-005056a56419   10Gi       RWO            Delete           Bound     openshift-logging/logging-es-0        glusterfs-registry-block             28d
pvc-a7a62c13-6f44-11e8-be0b-005056a56419   10Gi       RWO            Delete           Bound     openshift-logging/logging-es-1        glusterfs-registry-block             28d
pvc-c6dd3513-6f44-11e8-be0b-005056a56419   10Gi       RWO            Delete           Bound     openshift-logging/logging-es-2        glusterfs-registry-block             28d
registry-volume                            5Gi        RWX            Retain           Bound     default/registry-claim                                                     28d

--------------------------- End -----------------------------------

Comment 10 Ashmitha Ambastha 2018-07-12 12:12:42 UTC
The oc version in comment #9 is v3.10.15

Comment 11 Humble Chirammal 2018-07-12 12:32:53 UTC
(In reply to Ashmitha Ambastha from comment #10)
> The oc version in comment #9 is v3.10.15

Thanks!

Nicholas, it looks like the issue is not reproducible with the latest version. Is it the same at your end?

Comment 12 Nicholas Schuetz 2018-07-12 14:37:45 UTC
I'm still getting this issue on 3.10.15.  Can you share with me your CNS settings in the ansible hosts file?

Comment 13 Humble Chirammal 2018-07-18 15:17:38 UTC
(In reply to Nicholas Nachefski from comment #12)
> I'm still getting this issue on 3.10.15.  Can you share with me your CNS
> settings in the ansible hosts file?

Setting needinfo accordingly.

Comment 14 Nicholas Schuetz 2018-07-18 15:21:54 UTC
As stated previously, my hosts file is attached to this BZ.

Comment 15 krishnaram Karthick 2018-07-19 06:31:50 UTC
(In reply to Humble Chirammal from comment #11)
> (In reply to Ashmitha Ambastha from comment #10)
> > The oc version in comment #9 is v3.10.15
> 
> Thanks ! 
> 
> Nicholas, it looks like the issue is not reproducible with a latest version.
> Is it the same case at your end ?

Why is this bug ON_QA? This bug neither has the acks nor the fix. Moving it back to assigned.

Comment 16 Ashmitha Ambastha 2018-07-19 12:04:42 UTC
Created attachment 1460024 [details]
Inventory file for latest builds and registry

Comment 17 Ashmitha Ambastha 2018-07-19 12:28:57 UTC
(In reply to Nicholas Nachefski from comment #12)
> I'm still getting this issue on 3.10.15.  Can you share with me your CNS
> settings in the ansible hosts file?

I've attached an inventory file to deploy the latest OCP, 3.10.18-1, and CNS. The inventory file also deploys metrics, logging, and the gluster registry. You should be able to successfully deploy OCP+CNS (latest available builds).

Comment 18 Raghavendra Talur 2018-07-20 11:49:29 UTC
Nicholas,

Please check the inventory file attached and see if it works.

Comment 19 Nicholas Schuetz 2018-07-23 17:29:43 UTC
So I'm using the exact same hosts file, but with v3.10.23 this bug seems to be fixed.  I now get only one PV for docker-registry.  So somewhere between v3.10.18 and v3.10.23 this got fixed.  What was the issue?


[root@master03 ~]# oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                       STORAGECLASS              REASON    AGE
pvc-6a0e5b1b-8e9d-11e8-a1c0-525400185802   10Gi       RWO            Delete           Bound     openshift-infra/metrics-cassandra-1         glusterfs-storage-block             2m
pvc-90508968-8e9d-11e8-a1c0-525400185802   10Gi       RWO            Delete           Bound     openshift-logging/logging-es-0              glusterfs-storage-block             1m
pvc-b8d5de64-8e9d-11e8-a1c0-525400185802   10Gi       RWO            Delete           Bound     openshift-metrics/prometheus                glusterfs-storage                   35s
pvc-b948f9fe-8e9d-11e8-a1c0-525400185802   10Gi       RWO            Delete           Bound     openshift-metrics/prometheus-alertmanager   glusterfs-storage                   31s
pvc-b9bbf0d6-8e9d-11e8-a1c0-525400185802   10Gi       RWO            Delete           Bound     openshift-metrics/prometheus-alertbuffer    glusterfs-storage                   27s
registry-volume                            25Gi       RWX            Retain           Bound     default/registry-claim                                                          5m

Comment 22 Nicholas Schuetz 2018-08-07 15:11:02 UTC
This issue has magically gone away with the GA release.  Closing...

Comment 23 Humble Chirammal 2018-08-07 15:13:29 UTC
(In reply to Nicholas Nachefski from comment #22)
> This issue has magically gone away with the GA release.  Closing...

We are glad to hear it! Thanks for the update, Nicholas.