Description of problem:
On OCP with two Gluster clusters providing gluster-block storage, PVC provisioning through heketi times out.
It looks like OCP is not using the right gluster-block provisioner, so block volume creation never completes.
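As a quick check (not part of the original report), listing the storage classes and the glusterblock provisioner deployments shows whether both Gluster clusters register their block provisioner under the same name, gluster.org/glusterblock, which is what the description above suspects. These are standard oc calls; namespaces other than the one used in this report will differ per environment.
# Show each StorageClass and the provisioner name it is bound to.
$ oc get storageclass
# Find every glusterblock provisioner deployment across namespaces.
$ oc get dc --all-namespaces | grep -i glusterblock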
$ oc get ds
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
glusterfs-registry 3 3 3 3 3 glusterfs=registry-host 5h
$ oc get dc
NAME REVISION DESIRED CURRENT TRIGGERED BY
glusterblock-registry-provisioner-dc 1 1 1 config
heketi-registry 1 1 1 config
$ oc get pod
NAME READY STATUS RESTARTS AGE
glusterblock-registry-provisioner-dc-1-79mzx 1/1 Running 0 5h
glusterfs-registry-6t9xs 1/1 Running 0 5h
glusterfs-registry-kgh7x 1/1 Running 0 5h
glusterfs-registry-km7n5 1/1 Running 0 5h
heketi-registry-1-v2fwj 1/1 Running 0 5h
$ oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
prometheus-k8s-db-prometheus-k8s-0 Pending glusterfs-registry-block 5h
$ oc describe pvc prometheus-k8s-db-prometheus-k8s-0
Normal Provisioning 56m (x477 over 5h) gluster.org/glusterblock 7554b7cf-ff93-11e8-bfbe-0a580a810403 External provisioner is provisioning volume for claim "openshift-monitoring/prometheus-k8s-db-prometheus-k8s-0"
Normal ExternalProvisioning 2m (x8192 over 4h) persistentvolume-controller waiting for a volume to be created, either by external provisioner "gluster.org/glusterblock" or manually created by system administrator
Warning ProvisioningFailed 1m (x576 over 5h) gluster.org/glusterblock 7554b7cf-ff93-11e8-bfbe-0a580a810403 Failed to provision volume with StorageClass "glusterfs-registry-block": failed to create volume: heketi block volume creation failed: [heketi] failed to create volume: Post http://heketi-registry.openshift-glusterfs-infra.svc:8080/blockvolumes: dial tcp 172.30.210.102:8080: i/o timeout
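Since the event above reports an i/o timeout against the heketi-registry service, a reasonable next step (suggested here, not taken from the report) is to confirm that the service still has endpoints and that the heketi pod itself is responsive. The namespace openshift-glusterfs-infra is taken from the resturl in the StorageClass dump below, and the pod name comes from the oc get pod output above.
# Check that the heketi-registry service resolves to at least one endpoint.
$ oc -n openshift-glusterfs-infra get svc heketi-registry
$ oc -n openshift-glusterfs-infra get endpoints heketi-registry
# Inspect the heketi pod logs for errors around the time of the timeouts.
$ oc -n openshift-glusterfs-infra logs heketi-registry-1-v2fwj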
$ oc logs glusterblock-registry-provisioner-dc-1-79mzx | grep -i error
# assorted errors on lock operations
I1214 11:39:05.527185 1 leaderelection.go:156] attempting to acquire leader lease...
E1214 11:39:05.565884 1 leaderelection.go:273] Failed to update lock: Operation cannot be fulfilled on persistentvolumeclaims "prometheus-k8s-db-prometheus-k8s-0": the object has been modified; please apply your changes to the latest version and try again
I1214 11:41:48.206955 1 leaderelection.go:156] attempting to acquire leader lease...
E1214 11:41:48.225538 1 leaderelection.go:273] Failed to update lock: Operation cannot be fulfilled on persistentvolumeclaims "metrics-cassandra-1": the object has been modified; please apply your changes to the latest version and try again
W1214 11:44:09.557049 1 reflector.go:341] github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:644: watch of *v1.PersistentVolume ended with: The resourceVersion for the provided watch is too old.
I1214 11:45:29.515567 1 leaderelection.go:156] attempting to acquire leader lease...
E1214 11:45:29.540499 1 leaderelection.go:273] Failed to update lock: Operation cannot be fulfilled on persistentvolumeclaims "logging-es-0": the object has been modified; please apply your changes to the latest version and try again
I1214 11:46:33.784631 1 leaderelection.go:156] attempting to acquire leader lease...
E1214 11:46:33.811019 1 leaderelection.go:273] Failed to update lock: Operation cannot be fulfilled on persistentvolumeclaims "logging-es-1": the object has been modified; please apply your changes to the latest version and try again
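The lock errors above come from the provisioner library's leader election: with two Gluster clusters, each cluster runs its own glusterblock provisioner instance, and if both register under the same provisioner name they compete for the lease on every claim, so the instance that wins may be the one that cannot reach the intended heketi. A way to compare the two deployments is to list their environment; the application-side namespace and DC name are placeholders, and the PROVISIONER_NAME variable name is an assumption about how the provisioner image is configured.
# Registry-side provisioner, from this report.
$ oc -n openshift-glusterfs-infra set env dc/glusterblock-registry-provisioner-dc --list
# Application-side provisioner; substitute the real namespace and DC name.
$ oc -n <app-gluster-namespace> set env dc/<glusterblock-app-provisioner-dc> --list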
Gluster application provisioner logs:
Failed to provision volume for claim "openshift-logging/logging-es-2" with StorageClass "glusterfs-registry-block": failed to create volume: heketi block volume creation failed: [heketi] failed to create volume: Post http://heketi-registry.openshift-glusterfs-infra.svc:8080/blockvolumes: dial tcp 172.30.210.102:8080: i/o timeout
I1214 17:46:22.067862 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"openshift-logging", Name:"logging-es-2", UID:"0a7402a9-ff96-11e8-b73d-0050568e86fb", APIVersion:"v1", ResourceVersion:"134479", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' Failed to provision volume with StorageClass "glusterfs-registry-block": failed to create volume: heketi block volume creation failed: [heketi] failed to create volume: Post http://heketi-registry.openshift-glusterfs-infra.svc:8080/blockvolumes: dial tcp 172.30.210.102:8080: i/o timeout
I1214 17:46:22.356874 1 leaderelection.go:204] stopped trying to renew lease to provision for pvc openshift-logging/logging-es-2, timeout reached
W1214 17:46:22.356933 1 controller.go:686] retrying syncing claim "openshift-logging/logging-es-2" because failures 0 < threshold 15
E1214 17:46:22.356974 1 controller.go:701] error syncing claim "openshift-logging/logging-es-2": failed to create volume: heketi block volume creation failed: [heketi] failed to create volume: Post http://heketi-registry.openshift-glusterfs-infra.svc:8080/blockvolumes: dial tcp 172.30.210.102:8080: i/o timeout
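Because it is the application-side provisioner that times out against heketi-registry, it is worth testing that exact network path from inside that pod. This is a suggested check, not something from the report: the pod and namespace names are placeholders, and it assumes curl is available in the provisioner image. heketi exposes a /hello health endpoint that simply returns a greeting, so a timeout here reproduces the same failure mode as the dial tcp error above.
# Probe the registry heketi service from the application-side provisioner pod.
$ oc -n <app-gluster-namespace> rsh <glusterblock-app-provisioner-pod> \
    curl -sv --max-time 10 http://heketi-registry.openshift-glusterfs-infra.svc:8080/hello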
Version-Release number of selected component (if applicable):
atomic-openshift-clients-3.11.43-1.git.0.647ac05.el7.x86_64
redhat-release-server-7.5-8.el7.x86_64
kernel-3.10.0-862.14.4.el7.x86_64
OCS 3.12.2 (glusterd)
How reproducible:
Clean installation with two Gluster clusters providing block storage.
Expected results:
All PVCs are bound to PVs.
StorageClass Dump (if StorageClass used by PV/PVC):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: null
  name: glusterfs-registry-block
parameters:
  chapauthenabled: "true"
  hacount: "3"
  restsecretname: heketi-registry-admin-secret-block
  restsecretnamespace: openshift-glusterfs-infra
  resturl: http://heketi-registry.openshift-glusterfs-infra.svc:8080
  restuser: admin
provisioner: gluster.org/glusterblock
reclaimPolicy: Delete
volumeBindingMode: Immediate
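To rule out heketi itself, the resturl and credentials from this StorageClass can be exercised directly with heketi-cli from a host that can reach the service. This is a suggested sanity check, not part of the report; the admin key below is a placeholder for the value stored in the heketi-registry-admin-secret-block secret.
# Create a small test block volume directly through heketi, then list volumes.
$ heketi-cli -s http://heketi-registry.openshift-glusterfs-infra.svc:8080 \
    --user admin --secret '<admin-key>' blockvolume create --size=1
$ heketi-cli -s http://heketi-registry.openshift-glusterfs-infra.svc:8080 \
    --user admin --secret '<admin-key>' blockvolume list
If the create succeeds, heketi and the registry cluster are healthy and the problem lies between the provisioner and the heketi service; the test volume can then be removed with heketi-cli blockvolume delete.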
I am also experiencing this issue with a similar environment and a setup of two Gluster clusters.
Comment 10 Humble Chirammal
2019-07-09 09:41:30 UTC
It looks to me that this is a duplicate of bug #1703239, so I am closing it on that basis. Please feel free to reopen if that's not the case.
*** This bug has been marked as a duplicate of bug 1703239 ***