Bug 1753594

Summary: Unable to deploy openshift-infra metrics with glusterfs-block
Product: OpenShift Container Platform
Reporter: Radomir Ludva <rludva>
Component: Hawkular
Assignee: Ruben Vargas Palma <rvargasp>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Junqi Zhao <juzhao>
Severity: low
Docs Contact:
Priority: unspecified
Version: 3.11.0
CC: alegrand, anpicker, aos-bugs, erooth, jmartisk, kakkoyun, lcosic, pkrupa, surbania
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-24 05:35:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
- Inventory file
- hawkular-cassandra-1-______ logs
- hawkular-metrics-schema job logs

Description Radomir Ludva 2019-09-19 11:25:09 UTC
Description of problem:
Unable to deploy metrics with glusterfs-block (independent mode) storage. The PV is created and the PVC is bound, but the hawkular-metrics-schema-7cp2k job fails during execution. Logs for the job, logs for hawkular-cassandra-1-ksr4q, and the inventory file are attached.

```
~ root(ocp:openshift-infra) $ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                                                                             STORAGECLASS              REASON    AGE
pvc-222a35b0-d40c-11e9-9750-f8b156b640cb   1Gi        RWO            Delete           Bound     che/claim-che-workspace-workspacezqu0oqnhx5ybl3uc                                                 glusterfs-storage-block             8d
pvc-262a9bb0-cffa-11e9-a4b6-f8b156b640cb   40Gi       RWO            Delete           Bound     openshift-logging/logging-es-1                                                                    glusterfs-storage-block             13d
pvc-44dd12cc-d40e-11e9-9dd7-90b11c873ca3   1Gi        RWO            Delete           Bound     che/claim-che-workspace-workspacexwd369xqcwfnlewc                                                 glusterfs-storage-block             8d
pvc-6d190cdb-cffa-11e9-a4b6-f8b156b640cb   40Gi       RWO            Delete           Bound     openshift-logging/logging-es-2                                                                    glusterfs-storage-block             13d
pvc-88fca37b-d40a-11e9-9dd7-90b11c873ca3   1Gi        RWO            Delete           Bound     che/postgres-data                                                                                 glusterfs-storage-block             8d
pvc-d7982cfb-d48a-11e9-900f-90b11ca3452d   10Gi       RWO            Delete           Bound     application-monitoring/prometheus-application-monitoring-db-prometheus-application-monitoring-0   glusterfs-storage-block             7d
pvc-e05f5400-cff9-11e9-a4b6-f8b156b640cb   40Gi       RWO            Delete           Bound     openshift-logging/logging-es-0                                                                    glusterfs-storage-block             13d
pvc-ea41644f-d410-11e9-9dd7-90b11c873ca3   1Gi        RWO            Delete           Bound     che/claim-che-workspace-workspaceyc54y6e3gmtht5px                                                 glusterfs-storage-block             8d
pvc-ed5c55bd-dabd-11e9-a5c2-f8b156b640cb   20Gi       RWO            Delete           Bound     openshift-infra/metrics-cassandra-1                                                               glusterfs-storage-block             2h
~ root(ocp:openshift-infra) $ oc get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS              AGE
metrics-cassandra-1   Bound     pvc-ed5c55bd-dabd-11e9-a5c2-f8b156b640cb   20Gi       RWO            glusterfs-storage-block   2h
```

```
~ root(ocp:openshift-infra) $ oc get pods -o wide
NAME                         READY     STATUS    RESTARTS   AGE       IP            NODE                                     NOMINATED NODE
hawkular-cassandra-1-ksr4q   1/1       Running   0          2h        10.129.2.93   torii-san-infra-node.local.nutius.com    <none>
hawkular-metrics-lf84z       0/1       Running   19         2h        10.130.2.62   torii-ni-infra-node.local.nutius.com     <none>
heapster-cv7p7               0/1       Running   14         2h        10.131.0.68   torii-ichi-infra-node.local.nutius.com   <none>
```

When using ephemeral storage, everything works correctly.


Version-Release number of selected component (if applicable):
oc v3.11.141
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://torii-ichi-master.local.nutius.com:8443
openshift v3.11.141
kubernetes v1.11.0+d4cacc0


How reproducible:
Deploy a cluster with the attached inventory file (or with openshift_metrics enabled) and inspect the pods in the openshift-infra namespace. They are not created correctly and restart continuously.


Actual results:
The hawkular-cassandra-1-ksr4q pod and the hawkular-metrics-schema-7cp2k job do not complete successfully:
```
WARN  2019-09-19 09:31:38,146 [main] org.hawkular.metrics.schema.Installer:run:102 - Installation failed
com.datastax.driver.core.exceptions.OperationTimedOutException: [hawkular-cassandra/172.30.72.229:9042] Timed out waiting for server response
```
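The OperationTimedOutException above means the schema installer reached the hawkular-cassandra service address but did not get a CQL response within the driver's timeout. As a minimal diagnostic sketch (the hostname `hawkular-cassandra` and port 9042 are taken from the error message, not verified against this cluster), a plain TCP probe run from inside a pod can distinguish "port unreachable" from "connected but Cassandra is too slow to respond":

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure, connection refused, and connection timeout.
        return False


if __name__ == "__main__":
    # A False result here would point at networking or at Cassandra not
    # listening, rather than at the schema job itself.
    print(can_connect("hawkular-cassandra", 9042, timeout=5.0))
```

If the TCP connection succeeds but the installer still times out, the bottleneck is more likely Cassandra itself being slow on the glusterfs-block volume (e.g. I/O latency during schema creation) than network reachability.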

Expected results:
The hawkular-cassandra-1-ksr4q pod and the hawkular-metrics-schema-7cp2k job complete successfully.

Additional info:
This is not a customer case. I have full access to the cluster and can share access. I am not sure whether something is configured incorrectly or whether this is a bug.

Comment 1 Radomir Ludva 2019-09-19 11:29:48 UTC
Created attachment 1616686 [details]
Inventory file

Comment 2 Radomir Ludva 2019-09-19 11:30:15 UTC
Created attachment 1616687 [details]
hawkular-cassandra-1-______ logs

Comment 3 Radomir Ludva 2019-09-19 11:30:41 UTC
Created attachment 1616688 [details]
hawkular-metrics-schema job logs..