Bug 1753594 - Unable to deploy openshift-infra metrics with glusterfs-block
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Ruben Vargas Palma
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-09-19 11:25 UTC by Radomir Ludva
Modified: 2022-07-19 09:52 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-24 05:35:43 UTC
Target Upstream Version:
Embargoed:


Attachments
Inventory file (6.92 KB, text/plain)
2019-09-19 11:29 UTC, Radomir Ludva
hawkulat-cassandra-1-______ logs.. (150.21 KB, text/plain)
2019-09-19 11:30 UTC, Radomir Ludva
hawkular-metrics-schema job logs.. (15.16 KB, text/plain)
2019-09-19 11:30 UTC, Radomir Ludva

Description Radomir Ludva 2019-09-19 11:25:09 UTC
Description of problem:
Unable to deploy metrics with glusterfs-block (independent mode) storage. The PV is created and the PVC is bound, but the hawkular-metrics-schema-7cp2k job fails during execution. The job logs, the logs for hawkular-cassandra-1-ksr4q, and the inventory file are attached.

```
~ root(ocp:openshift-infra) $ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                                                                             STORAGECLASS              REASON    AGE
pvc-222a35b0-d40c-11e9-9750-f8b156b640cb   1Gi        RWO            Delete           Bound     che/claim-che-workspace-workspacezqu0oqnhx5ybl3uc                                                 glusterfs-storage-block             8d
pvc-262a9bb0-cffa-11e9-a4b6-f8b156b640cb   40Gi       RWO            Delete           Bound     openshift-logging/logging-es-1                                                                    glusterfs-storage-block             13d
pvc-44dd12cc-d40e-11e9-9dd7-90b11c873ca3   1Gi        RWO            Delete           Bound     che/claim-che-workspace-workspacexwd369xqcwfnlewc                                                 glusterfs-storage-block             8d
pvc-6d190cdb-cffa-11e9-a4b6-f8b156b640cb   40Gi       RWO            Delete           Bound     openshift-logging/logging-es-2                                                                    glusterfs-storage-block             13d
pvc-88fca37b-d40a-11e9-9dd7-90b11c873ca3   1Gi        RWO            Delete           Bound     che/postgres-data                                                                                 glusterfs-storage-block             8d
pvc-d7982cfb-d48a-11e9-900f-90b11ca3452d   10Gi       RWO            Delete           Bound     application-monitoring/prometheus-application-monitoring-db-prometheus-application-monitoring-0   glusterfs-storage-block             7d
pvc-e05f5400-cff9-11e9-a4b6-f8b156b640cb   40Gi       RWO            Delete           Bound     openshift-logging/logging-es-0                                                                    glusterfs-storage-block             13d
pvc-ea41644f-d410-11e9-9dd7-90b11c873ca3   1Gi        RWO            Delete           Bound     che/claim-che-workspace-workspaceyc54y6e3gmtht5px                                                 glusterfs-storage-block             8d
pvc-ed5c55bd-dabd-11e9-a5c2-f8b156b640cb   20Gi       RWO            Delete           Bound     openshift-infra/metrics-cassandra-1                                                               glusterfs-storage-block             2h
~ root(ocp:openshift-infra) $ oc get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS              AGE
metrics-cassandra-1   Bound     pvc-ed5c55bd-dabd-11e9-a5c2-f8b156b640cb   20Gi       RWO            glusterfs-storage-block   2h
```

```
~ root(ocp:openshift-infra) $ oc get pods -o wide
NAME                         READY     STATUS    RESTARTS   AGE       IP            NODE                                     NOMINATED NODE
hawkular-cassandra-1-ksr4q   1/1       Running   0          2h        10.129.2.93   torii-san-infra-node.local.nutius.com    <none>
hawkular-metrics-lf84z       0/1       Running   19         2h        10.130.2.62   torii-ni-infra-node.local.nutius.com     <none>
heapster-cv7p7               0/1       Running   14         2h        10.131.0.68   torii-ichi-infra-node.local.nutius.com   <none>
```

With ephemeral storage, everything works correctly.
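
To make the pod listing easier to scan, the unhealthy entries can be filtered out of the `oc get pods` table. This is only a minimal sketch: `unready_pods` is a hypothetical helper name, not part of `oc`, and it assumes the default column order shown in the listing (NAME, READY, STATUS, RESTARTS).

```shell
# Sketch: print pods that are not fully ready or that have restarted.
# Reads `oc get pods` table output on stdin.
# unready_pods is a hypothetical helper, not an oc subcommand.
unready_pods() {
  awk 'NR > 1 {
    split($2, r, "/")            # READY column, e.g. "0/1"
    if (r[1] != r[2] || $4 > 0)  # containers not all ready, or RESTARTS > 0
      print $1, $2, "restarts=" $4
  }'
}

# Usage against a live cluster (assumes `oc` is already logged in):
#   oc get pods -n openshift-infra | unready_pods
```

Applied to the listing in this report, it would keep hawkular-metrics-lf84z and heapster-cv7p7 and drop hawkular-cassandra-1-ksr4q, which reports 1/1 with no restarts.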


Version-Release number of selected component (if applicable):
oc v3.11.141
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://torii-ichi-master.local.nutius.com:8443
openshift v3.11.141
kubernetes v1.11.0+d4cacc0


How reproducible:
Deploy a cluster with the attached inventory file, or with openshift_metrics, and inspect the pods in openshift-infra. They do not start correctly and are restarted continuously.


Actual results:
hawkular-cassandra-1-ksr4q and hawkular-metrics-schema-7cp2k do not run successfully:
```
WARN  2019-09-19 09:31:38,146 [main] org.hawkular.metrics.schema.Installer:run:102 - Installation failed
com.datastax.driver.core.exceptions.OperationTimedOutException: [hawkular-cassandra/172.30.72.229:9042] Timed out waiting for server response
```
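
The attached job log is long, so it may help to extract only the endpoints that the driver timed out against. The following is a rough sketch under stated assumptions: `timed_out_endpoints` is a hypothetical helper name, and it assumes the `[host/ip:port] Timed out` message format seen in the excerpt above.

```shell
# Sketch: list the distinct endpoints named in OperationTimedOutException lines.
# Reads log text on stdin.
# timed_out_endpoints is a hypothetical helper, not part of any tool.
timed_out_endpoints() {
  # On matching lines, keep only the text inside "[...]" before "Timed out".
  sed -n '/OperationTimedOutException/s/.*\[\([^]]*\)] Timed out.*/\1/p' | sort -u
}

# Usage against the schema job pod from this report (assumes `oc` is logged in):
#   oc logs hawkular-metrics-schema-7cp2k -n openshift-infra | timed_out_endpoints
```

If only the hawkular-cassandra endpoint on port 9042 is listed, the timeouts are confined to the Cassandra connection. That alone does not show whether the block storage is the cause.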

Expected results:
hawkular-cassandra-1-ksr4q and hawkular-metrics-schema-7cp2k run successfully.

Additional info:
This is not a customer case. I have full access to the cluster and can share that access. I am not sure whether I have misconfigured something or whether this is a bug.

Comment 1 Radomir Ludva 2019-09-19 11:29:48 UTC
Created attachment 1616686
Inventory file

Comment 2 Radomir Ludva 2019-09-19 11:30:15 UTC
Created attachment 1616687
hawkulat-cassandra-1-______ logs..

Comment 3 Radomir Ludva 2019-09-19 11:30:41 UTC
Created attachment 1616688
hawkular-metrics-schema job logs..

