We are running an OCP 3.6 cluster with CNS 3.6 and are experiencing errors with our gluster-block volumes. We have four gluster-block volumes: three for Elasticsearch and one for Cassandra, i.e. the storage for logging and metrics.

# oc get pods -n default -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP             NODE
docker-registry-2-3c68p               1/1     Running   2          19h   10.216.2.88    in03.example.com
glusterblock-provisioner-dc-1-bv120   1/1     Running   3          85d   10.217.8.46    mb02.example.com
glusterfs-ftmlc                       1/1     Running   0          18h   10.80.4.137    in01.example.com
glusterfs-qt4fh                       1/1     Running   0          18h   10.80.4.138    in02.example.com
glusterfs-rn6qb                       1/1     Running   0          18h   10.80.4.139    in03.example.com
heketi-1-mdmxn                        1/1     Running   3          85d   10.216.8.127   no01.example.com
registry-console-1-d397x              1/1     Running   3          85d   10.219.15.22   no12.example.com
router1-2-pv2v6                       1/1     Running   1          20h   10.219.2.105   in02.example.com
router2-1-d7lqv                       1/1     Running   1          19h   10.216.2.87    in03.example.com

# oc rsh glusterfs-ftmlc
sh-4.2# gluster peer status
Number of Peers: 2

Hostname: 10.80.4.138
Uuid: 73905ca0-1767-4e97-b87b-a25b2b7eb9e7
State: Peer in Cluster (Connected)

Hostname: 10.80.4.139
Uuid: 48d4e9f1-05ee-4b8a-95d9-b2ced1b738a2
State: Peer in Cluster (Connected)

sh-4.2# gluster volume status
Status of volume: glusterfs-registry-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.137:/var/lib/heketi/mounts/vg_43b8a25d46fa347f0afcdd1992ce37c8/brick_432365f248dcb11d51f8d9c4ea84f95e/brick   N/A   N/A   N   N/A
Brick 10.80.4.139:/var/lib/heketi/mounts/vg_7a138fc76a774496bc6e75d94abd9ba2/brick_dad5372836a2282887073774234c815f/brick   N/A   N/A   N   N/A
Brick 10.80.4.138:/var/lib/heketi/mounts/vg_e6b5453e3167a6f5fdf2c4f4fac9822f/brick_7f0e1cd16aec9b6979fac156ffd91540/brick   N/A   N/A   N   N/A
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949

Task Status of Volume glusterfs-registry-volume
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.139:/var/lib/heketi/mounts/vg_994e32e5966f43a6a8bc54b218070306/brick_73dd9aa22b514f68b67d50bc6b6df963/brick   49153   0   Y   581
Brick 10.80.4.137:/var/lib/heketi/mounts/vg_7149f4a8bb2f3054298318190f359f30/brick_b9d3095a45a196ecd02e35de2d421e5b/brick   49153   0   Y   960
Brick 10.80.4.138:/var/lib/heketi/mounts/vg_b8a38ff05a2a762102aea81149c30584/brick_f4109a5fbba8e5068ebe92151d4ac20a/brick   49154   0   Y   612
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183

Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_0d4a7567c603918fb666fd591255ffab
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.139:/var/lib/heketi/mounts/vg_0931dc845b77f975e5d2bef5bd77af58/brick_a5c02542f3b227fe500b598a25c5121c/brick   49154   0   Y   584
Brick 10.80.4.137:/var/lib/heketi/mounts/vg_e19e0c9032852c897dbeb79c373b2b5a/brick_52079049395620180f60a10b9cdf401a/brick   49154   0   Y   966
Brick 10.80.4.138:/var/lib/heketi/mounts/vg_b8a38ff05a2a762102aea81149c30584/brick_39f15a9359e46f368240914533dc58dd/brick   49155   0   Y   614
Brick 10.80.4.139:/var/lib/heketi/mounts/vg_4c2c084ab43cae1cd6c8d7402195934a/brick_51a815469c53433f18568fa8808aacf0/brick   49154   0   Y   584
Brick 10.80.4.138:/var/lib/heketi/mounts/vg_9dffbddb230709ab01a7746d71b91d0d/brick_d1d8b9268469586b3c99f21b331773bd/brick   49155   0   Y   614
Brick 10.80.4.137:/var/lib/heketi/mounts/vg_7149f4a8bb2f3054298318190f359f30/brick_0d8171711f908c2a66055497139137e1/brick   49154   0   Y   966
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183

Task Status of Volume vol_0d4a7567c603918fb666fd591255ffab
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_912ceedd86354fbd8806d3f531b956cc
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.138:/var/lib/heketi/mounts/vg_b8a38ff05a2a762102aea81149c30584/brick_4fd2648d8b9beb1b4c8f2d60a9e5d319/brick   49154   0   Y   612
Brick 10.80.4.137:/var/lib/heketi/mounts/vg_e19e0c9032852c897dbeb79c373b2b5a/brick_adb2ff41e976cfd276e2f9f179d3497b/brick   49153   0   Y   960
Brick 10.80.4.139:/var/lib/heketi/mounts/vg_4c2c084ab43cae1cd6c8d7402195934a/brick_babd426faa0618f9402afc402c10f694/brick   49153   0   Y   581
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949

Task Status of Volume vol_912ceedd86354fbd8806d3f531b956cc
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_bd84de58ec563e823b1a50727b116031
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.139:/var/lib/heketi/mounts/vg_994e32e5966f43a6a8bc54b218070306/brick_e941a0886bbf39d2ff05d8f2d7fba490/brick   49153   0   Y   581
Brick 10.80.4.137:/var/lib/heketi/mounts/vg_d8ed3c2573b18c141167a8872e6f877e/brick_c13ffab26b671edccb0f0ad82b892163/brick   49153   0   Y   960
Brick 10.80.4.138:/var/lib/heketi/mounts/vg_795194576df37a11a0c86a52da7172a2/brick_3eee8811a363165de3295375e1b898a3/brick   49154   0   Y   612
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949

Task Status of Volume vol_bd84de58ec563e823b1a50727b116031
------------------------------------------------------------------------------
There are no active volume tasks
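A side observation from the status above: all three bricks of glusterfs-registry-volume report Online "N". We have not touched that volume yet; the usual first attempt on our side would be a forced start from inside one of the gluster pods, roughly as follows (a sketch, not yet run on this cluster):

sh-4.2# gluster volume start glusterfs-registry-volume force
sh-4.2# gluster volume status glusterfs-registry-volume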
sh-4.2# targetcli ls
o- / [...]
o- backstores [...]
| o- block [Storage Objects: 0]
| o- fileio [Storage Objects: 0]
| o- pscsi [Storage Objects: 0]
| o- ramdisk [Storage Objects: 0]
| o- user:glfs [Storage Objects: 1]
| o- blockvol_ac89c92ad12aff23392628619eb6bdb4 [vol_0d4a7567c603918fb666fd591255ffab.4.137/block-store/35d64e70-547f-40e9-b5dd-78498454bf03 (11.0GiB) activated]
| o- alua [ALUA Groups: 0]
o- iscsi [Targets: 7]
| o- iqn.2016-12.org.gluster-block:1ecf553d-fb92-48e1-926e-0ecba5f5072f [TPGs: 3]
| | o- tpg1 [gen-acls, tpg-auth, 1-way auth]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.137:3260 [OK]
| | o- tpg2 [disabled]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.138:3260 [OK]
| | o- tpg3 [disabled]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 0]
| | o- portals [Portals: 1]
| | o- 10.80.4.139:3260 [OK]
| o- iqn.2016-12.org.gluster-block:35d64e70-547f-40e9-b5dd-78498454bf03 [TPGs: 3]
| | o- tpg1 [gen-acls, tpg-auth, 1-way auth]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 1]
| | | | o- lun0 [user/blockvol_ac89c92ad12aff23392628619eb6bdb4 (None)]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.137:3260 [OK]
| | o- tpg2 [disabled]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 1]
| | | | o- lun0 [user/blockvol_ac89c92ad12aff23392628619eb6bdb4 (None)]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.138:3260 [OK]
| | o- tpg3 [disabled]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 1]
| | | o- lun0 [user/blockvol_ac89c92ad12aff23392628619eb6bdb4 (None)]
| | o- portals [Portals: 1]
| | o- 10.80.4.139:3260 [OK]
| o- iqn.2016-12.org.gluster-block:46e09beb-51b0-4ee4-a70f-18fa492e9393 [TPGs: 3]
| | o- tpg1 [gen-acls, tpg-auth, 1-way auth]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.137:3260 [OK]
| | o- tpg2 [disabled]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.138:3260 [OK]
| | o- tpg3 [disabled]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 0]
| | o- portals [Portals: 1]
| | o- 10.80.4.139:3260 [OK]
| o- iqn.2016-12.org.gluster-block:6085c84f-96b2-4ae5-9112-b8d860c8a2f7 [TPGs: 3]
| | o- tpg1 [gen-acls, tpg-auth, 1-way auth]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.137:3260 [OK]
| | o- tpg2 [disabled]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.138:3260 [OK]
| | o- tpg3 [disabled]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 0]
| | o- portals [Portals: 1]
| | o- 10.80.4.139:3260 [OK]
| o- iqn.2016-12.org.gluster-block:829e10ca-2d8b-4af7-ab33-071976260d05 [TPGs: 3]
| | o- tpg1 [gen-acls, tpg-auth, 1-way auth]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.137:3260 [OK]
| | o- tpg2 [disabled]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.138:3260 [OK]
| | o- tpg3 [disabled]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 0]
| | o- portals [Portals: 1]
| | o- 10.80.4.139:3260 [OK]
| o- iqn.2016-12.org.gluster-block:9ad684b0-4554-4b0f-be1a-d6f39a38b5c3 [TPGs: 3]
| | o- tpg1 [gen-acls, tpg-auth, 1-way auth]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.137:3260 [OK]
| | o- tpg2 [disabled]
| | | o- acls [ACLs: 0]
| | | o- luns [LUNs: 0]
| | | o- portals [Portals: 1]
| | | o- 10.80.4.138:3260 [OK]
| | o- tpg3 [disabled]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 0]
| | o- portals [Portals: 1]
| | o- 10.80.4.139:3260 [OK]
| o- iqn.2016-12.org.gluster-block:d8b29744-e9c9-470e-9d13-c004525ffa42 [TPGs: 3]
| o- tpg1 [gen-acls, tpg-auth, 1-way auth]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 0]
| | o- portals [Portals: 1]
| | o- 10.80.4.137:3260 [OK]
| o- tpg2 [disabled]
| | o- acls [ACLs: 0]
| | o- luns [LUNs: 0]
| | o- portals [Portals: 1]
| | o- 10.80.4.138:3260 [OK]
| o- tpg3 [disabled]
| o- acls [ACLs: 0]
| o- luns [LUNs: 0]
| o- portals [Portals: 1]
| o- 10.80.4.139:3260 [OK]
o- loopback [Targets: 0]
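Only one user:glfs backstore and one LUN survive in the running target configuration, even though there are seven gluster-block targets, so the block side apparently did not come back cleanly after the reboots. The rough next step on our side would be to check the block daemons and the block volumes on the hosting gluster volume from inside a gluster pod. The commands below are only a sketch and assume gluster-blockd and tcmu-runner are managed by systemd inside these pods:

sh-4.2# systemctl status gluster-blockd tcmu-runner
sh-4.2# gluster-block list vol_0d4a7567c603918fb666fd591255ffab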
-----

The glusterfs volumes are working as expected. The problem was discovered when I noticed that the elasticsearch pods had become "not ready". We tried to recreate the pods, but that forced us to reboot the underlying VMs, because the pods would not terminate due to https://bugzilla.redhat.com/show_bug.cgi?id=1561385. After the VMs were rebooted, the elasticsearch pods were created again, but they fail to mount their gluster-block PVCs.

# oc get pods -o wide -n logging
NAME                                      READY   STATUS              RESTARTS   AGE   IP             NODE
logging-es-data-master-4r3z0dcy-1-4trjm   0/1     ContainerCreating   0          9m    <none>         in01.example.com
logging-es-data-master-4r3z0dcy-1-8wq6d   0/1     Terminating         1          19h   <none>         in02.example.com
logging-es-data-master-rtg6tzck-1-0rpqg   0/1     Terminating         1          19h   <none>         in03.example.com
logging-es-data-master-rtg6tzck-1-lz6xf   0/1     ContainerCreating   0          9m    <none>         in02.example.com
logging-es-data-master-tvqvttfo-1-md56q   0/1     Terminating         2          1h    10.216.2.92    in03.example.com
logging-es-data-master-tvqvttfo-1-pg2cx   0/1     Terminating         1          19h   <none>         in03.example.com
logging-es-data-master-tvqvttfo-1-pjbzd   0/1     ContainerCreating   0          9m    <none>         in01.example.com
logging-fluentd-0qxx7                     1/1     Running             2          36d   10.216.6.44    rc01.example.com
logging-fluentd-1114w                     1/1     Running             2          36d   10.219.2.107   in02.example.com
logging-fluentd-1spjq                     1/1     Running             3          36d   10.219.4.28    rc04.example.com
logging-fluentd-1tml5                     1/1     Running             1          36d   10.216.4.31    rc02.example.com

# oc describe pod logging-es-data-master-4r3z0dcy-1-4trjm
Name:            logging-es-data-master-4r3z0dcy-1-4trjm
Namespace:       logging
Security Policy: restricted
Node:            in01.example.com/10.80.4.137
Start Time:      Fri, 20 Apr 2018 11:12:29 +0200
Labels:          component=es
                 deployment=logging-es-data-master-4r3z0dcy-1
                 deploymentconfig=logging-es-data-master-4r3z0dcy
                 logging-infra=elasticsearch
                 provider=openshift
Annotations:     kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"logging","name":"logging-es-data-master-4r3z0dcy-1","uid":"7ac8378d-27...
                 openshift.io/deployment-config.latest-version=1
                 openshift.io/deployment-config.name=logging-es-data-master-4r3z0dcy
                 openshift.io/deployment.name=logging-es-data-master-4r3z0dcy-1
                 openshift.io/scc=restricted
Status:          Pending
IP:
Controllers:     ReplicationController/logging-es-data-master-4r3z0dcy-1
Containers:
  elasticsearch:
    Container ID:
    Image:          docker-registry-default.reg.ocr.example.com:443/openshift3/logging-elasticsearch:v3.6
    Image ID:
    Ports:          9200/TCP, 9300/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  8Gi
    Requests:
      cpu:     1
      memory:  8Gi
    Readiness:  exec [/usr/share/java/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    Environment:
      DC_NAME:                  logging-es-data-master-4r3z0dcy
      NAMESPACE:                logging (v1:metadata.namespace)
      KUBERNETES_TRUST_CERT:    true
      SERVICE_DNS:              logging-es-cluster
      CLUSTER_NAME:             logging-es
      INSTANCE_RAM:             8Gi
      HEAP_DUMP_LOCATION:       /elasticsearch/persistent/heapdump.hprof
      NODE_QUORUM:              2
      RECOVER_EXPECTED_NODES:   3
      RECOVER_AFTER_TIME:       5m
      READINESS_PROBE_TIMEOUT:  30
      POD_LABEL:                component=es
      IS_MASTER:                true
      HAS_DATA:                 true
    Mounts:
      /elasticsearch/persistent from elasticsearch-storage (rw)
      /etc/elasticsearch/secret from elasticsearch (ro)
      /usr/share/java/elasticsearch/config from elasticsearch-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from aggregated-logging-elasticsearch-token-jtmtr (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  elasticsearch:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  logging-elasticsearch
    Optional:    false
  elasticsearch-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      logging-elasticsearch
    Optional:  false
  elasticsearch-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  logging-es-2
    ReadOnly:   false
  aggregated-logging-elasticsearch-token-jtmtr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  aggregated-logging-elasticsearch-token-jtmtr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  region=infra
Tolerations:     <none>
Events:
  FirstSeen  LastSeen  Count  From                       SubObjectPath  Type     Reason       Message
  ---------  --------  -----  ----                       -------------  ------   ------       -------
  10m        10m       1      default-scheduler                         Normal   Scheduled    Successfully assigned logging-es-data-master-4r3z0dcy-1-4trjm to in01.example.com
  11m        4m        4      kubelet, in01.example.com                 Warning  FailedMount  Unable to mount volumes for pod "logging-es-data-master-4r3z0dcy-1-4trjm_logging(51f45e19-447b-11e8-8388-001a4a160352)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-es-data-master-4r3z0dcy-1-4trjm". list of unattached/unmounted volumes=[elasticsearch-storage]
  11m        4m        4      kubelet, in01.example.com                 Warning  FailedSync   Error syncing pod
  12m        3m        9      kubelet, in01.example.com                 Warning  FailedMount  MountVolume.SetUp failed for volume "kubernetes.io/iscsi/51f45e19-447b-11e8-8388-001a4a160352-pvc-76fd7e26-27a2-11e8-bf94-001a4a160352" (spec.Name: "pvc-76fd7e26-27a2-11e8-bf94-001a4a160352") pod "51f45e19-447b-11e8-8388-001a4a160352" (UID: "51f45e19-447b-11e8-8388-001a4a160352") with: failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.80.4.139:3260 output: iscsiadm: Login response timeout. Waited 30 seconds and did not get response PDU.
iscsiadm: discovery login to 10.80.4.139 failed, giving up 2
iscsiadm: Could not perform SendTargets discovery: encountered non-retryable iSCSI login failure
, err exit status 19
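The same sendtargets failure against 10.80.4.139 shows up for the other two elasticsearch pods as well (their describe output follows below). Roughly what we plan to run next, directly on one of the affected nodes, is a manual discovery against both a working and the failing portal plus a session listing; this is only a sketch we have not executed yet:

[root@in01 ~]# iscsiadm -m discovery -t sendtargets -p 10.80.4.137:3260
[root@in01 ~]# iscsiadm -m discovery -t sendtargets -p 10.80.4.139:3260
[root@in01 ~]# iscsiadm -m session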
# oc describe pod logging-es-data-master-rtg6tzck-1-lz6xf
Name:            logging-es-data-master-rtg6tzck-1-lz6xf
Namespace:       logging
Security Policy: restricted
Node:            in02.example.com/10.80.4.138
Start Time:      Fri, 20 Apr 2018 11:15:17 +0200
Labels:          component=es
                 deployment=logging-es-data-master-rtg6tzck-1
                 deploymentconfig=logging-es-data-master-rtg6tzck
                 logging-infra=elasticsearch
                 provider=openshift
Annotations:     kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"logging","name":"logging-es-data-master-rtg6tzck-1","uid":"4c753c92-27...
                 openshift.io/deployment-config.latest-version=1
                 openshift.io/deployment-config.name=logging-es-data-master-rtg6tzck
                 openshift.io/deployment.name=logging-es-data-master-rtg6tzck-1
                 openshift.io/scc=restricted
Status:          Pending
IP:
Controllers:     ReplicationController/logging-es-data-master-rtg6tzck-1
Containers:
  elasticsearch:
    Container ID:
    Image:          docker-registry-default.reg.ocr.example.com:443/openshift3/logging-elasticsearch:v3.6
    Image ID:
    Ports:          9200/TCP, 9300/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  8Gi
    Requests:
      cpu:     1
      memory:  8Gi
    Readiness:  exec [/usr/share/java/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    Environment:
      DC_NAME:                  logging-es-data-master-rtg6tzck
      NAMESPACE:                logging (v1:metadata.namespace)
      KUBERNETES_TRUST_CERT:    true
      SERVICE_DNS:              logging-es-cluster
      CLUSTER_NAME:             logging-es
      INSTANCE_RAM:             8Gi
      HEAP_DUMP_LOCATION:       /elasticsearch/persistent/heapdump.hprof
      NODE_QUORUM:              2
      RECOVER_EXPECTED_NODES:   3
      RECOVER_AFTER_TIME:       5m
      READINESS_PROBE_TIMEOUT:  30
      POD_LABEL:                component=es
      IS_MASTER:                true
      HAS_DATA:                 true
    Mounts:
      /elasticsearch/persistent from elasticsearch-storage (rw)
      /etc/elasticsearch/secret from elasticsearch (ro)
      /usr/share/java/elasticsearch/config from elasticsearch-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from aggregated-logging-elasticsearch-token-jtmtr (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  elasticsearch:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  logging-elasticsearch
    Optional:    false
  elasticsearch-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      logging-elasticsearch
    Optional:  false
  elasticsearch-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  logging-es-0
    ReadOnly:   false
  aggregated-logging-elasticsearch-token-jtmtr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  aggregated-logging-elasticsearch-token-jtmtr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  region=infra
Tolerations:     <none>
Events:
  FirstSeen  LastSeen  Count  From                       SubObjectPath  Type     Reason       Message
  ---------  --------  -----  ----                       -------------  ------   ------       -------
  10m        10m       1      default-scheduler                         Normal   Scheduled    Successfully assigned logging-es-data-master-rtg6tzck-1-lz6xf to in02.example.com
  8m         1m        4      kubelet, in02.example.com                 Warning  FailedMount  Unable to mount volumes for pod "logging-es-data-master-rtg6tzck-1-lz6xf_logging(574d007d-447b-11e8-8388-001a4a160352)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-es-data-master-rtg6tzck-1-lz6xf". list of unattached/unmounted volumes=[elasticsearch-storage]
  8m         1m        4      kubelet, in02.example.com                 Warning  FailedSync   Error syncing pod
  9m         55s       9      kubelet, in02.example.com                 Warning  FailedMount  MountVolume.SetUp failed for volume "kubernetes.io/iscsi/574d007d-447b-11e8-8388-001a4a160352-pvc-4af4c793-27a2-11e8-bf94-001a4a160352" (spec.Name: "pvc-4af4c793-27a2-11e8-bf94-001a4a160352") pod "574d007d-447b-11e8-8388-001a4a160352" (UID: "574d007d-447b-11e8-8388-001a4a160352") with: failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.80.4.139:3260 output: iscsiadm: Login response timeout. Waited 30 seconds and did not get response PDU.
iscsiadm: discovery login to 10.80.4.139 failed, giving up 2
iscsiadm: Could not perform SendTargets discovery: encountered non-retryable iSCSI login failure
, err exit status 19
# oc describe pod logging-es-data-master-tvqvttfo-1-pjbzd
Name:            logging-es-data-master-tvqvttfo-1-pjbzd
Namespace:       logging
Security Policy: restricted
Node:            in01.example.com/10.80.4.137
Start Time:      Fri, 20 Apr 2018 11:12:44 +0200
Labels:          component=es
                 deployment=logging-es-data-master-tvqvttfo-1
                 deploymentconfig=logging-es-data-master-tvqvttfo
                 logging-infra=elasticsearch
                 provider=openshift
Annotations:     kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"logging","name":"logging-es-data-master-tvqvttfo-1","uid":"5ff870f0-27...
                 openshift.io/deployment-config.latest-version=1
                 openshift.io/deployment-config.name=logging-es-data-master-tvqvttfo
                 openshift.io/deployment.name=logging-es-data-master-tvqvttfo-1
                 openshift.io/scc=restricted
Status:          Pending
IP:
Controllers:     ReplicationController/logging-es-data-master-tvqvttfo-1
Containers:
  elasticsearch:
    Container ID:
    Image:          docker-registry-default.reg.ocr.example.com:443/openshift3/logging-elasticsearch:v3.6
    Image ID:
    Ports:          9200/TCP, 9300/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  8Gi
    Requests:
      cpu:     1
      memory:  8Gi
    Readiness:  exec [/usr/share/java/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    Environment:
      DC_NAME:                  logging-es-data-master-tvqvttfo
      NAMESPACE:                logging (v1:metadata.namespace)
      KUBERNETES_TRUST_CERT:    true
      SERVICE_DNS:              logging-es-cluster
      CLUSTER_NAME:             logging-es
      INSTANCE_RAM:             8Gi
      HEAP_DUMP_LOCATION:       /elasticsearch/persistent/heapdump.hprof
      NODE_QUORUM:              2
      RECOVER_EXPECTED_NODES:   3
      RECOVER_AFTER_TIME:       5m
      READINESS_PROBE_TIMEOUT:  30
      POD_LABEL:                component=es
      IS_MASTER:                true
      HAS_DATA:                 true
    Mounts:
      /elasticsearch/persistent from elasticsearch-storage (rw)
      /etc/elasticsearch/secret from elasticsearch (ro)
      /usr/share/java/elasticsearch/config from elasticsearch-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from aggregated-logging-elasticsearch-token-jtmtr (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  elasticsearch:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  logging-elasticsearch
    Optional:    false
  elasticsearch-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      logging-elasticsearch
    Optional:  false
  elasticsearch-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  logging-es-1
    ReadOnly:   false
  aggregated-logging-elasticsearch-token-jtmtr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  aggregated-logging-elasticsearch-token-jtmtr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  region=infra
Tolerations:     <none>
Events:
  FirstSeen  LastSeen  Count  From                       SubObjectPath  Type     Reason       Message
  ---------  --------  -----  ----                       -------------  ------   ------       -------
  10m        10m       1      default-scheduler                         Normal   Scheduled    Successfully assigned logging-es-data-master-tvqvttfo-1-pjbzd to in01.example.com
  11m        4m        4      kubelet, in01.example.com                 Warning  FailedMount  Unable to mount volumes for pod "logging-es-data-master-tvqvttfo-1-pjbzd_logging(5b304a15-447b-11e8-8388-001a4a160352)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-es-data-master-tvqvttfo-1-pjbzd". list of unattached/unmounted volumes=[elasticsearch-storage]
  11m        4m        4      kubelet, in01.example.com                 Warning  FailedSync   Error syncing pod
  12m        3m        9      kubelet, in01.example.com                 Warning  FailedMount  MountVolume.SetUp failed for volume "kubernetes.io/iscsi/5b304a15-447b-11e8-8388-001a4a160352-pvc-5e831446-27a2-11e8-bf94-001a4a160352" (spec.Name: "pvc-5e831446-27a2-11e8-bf94-001a4a160352") pod "5b304a15-447b-11e8-8388-001a4a160352" (UID: "5b304a15-447b-11e8-8388-001a4a160352") with: failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.80.4.139:3260 output: iscsiadm: Login response timeout. Waited 30 seconds and did not get response PDU.
iscsiadm: discovery login to 10.80.4.139 failed, giving up 2
iscsiadm: Could not perform SendTargets discovery: encountered non-retryable iSCSI login failure
, err exit status 19
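All three mounts fail against the same portal, 10.80.4.139, which according to the oc get pods output above is glusterfs-rn6qb on in03.example.com. A crude reachability check from one of the nodes would look something like the following (a sketch only; it assumes bash with /dev/tcp support):

# timeout 5 bash -c 'cat < /dev/null > /dev/tcp/10.80.4.139/3260' && echo "3260 reachable" || echo "3260 not reachable"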
-----

Also, I have tried to create new PVCs for both glusterfs and glusterblock. Creating a new glusterfs PVC works fine, but creating a glusterblock PVC does not. See the output below:

# oc get sc
NAME                         TYPE
infra-glusterblock-storage   gluster.org/glusterblock
infra-glusterfs-storage      kubernetes.io/glusterfs

# cat pvc-glusterfs.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: infra-glusterfs-storage
  name: pvc-glusterfs
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

# oc create -f pvc-glusterfs.yml
persistentvolumeclaim "pvc-glusterfs" created

# cat pvc-glusterblock.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: infra-glusterblock-storage
  name: pvc-glusterblock
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

# oc create -f pvc-glusterblock.yml
persistentvolumeclaim "pvc-glusterblock" created

# oc get pvc
NAME               STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS                 AGE
logging-es-0       Bound     pvc-4af4c793-27a2-11e8-bf94-001a4a160352   100Gi      RWO           infra-glusterblock-storage   36d
logging-es-1       Bound     pvc-5e831446-27a2-11e8-bf94-001a4a160352   100Gi      RWO           infra-glusterblock-storage   36d
logging-es-2       Bound     pvc-76fd7e26-27a2-11e8-bf94-001a4a160352   100Gi      RWO           infra-glusterblock-storage   36d
pvc-glusterblock   Pending                                                                       infra-glusterblock-storage   57s
pvc-glusterfs      Bound     pvc-f647cf18-447b-11e8-ae31-001a4a160353   10Gi       RWO           infra-glusterfs-storage      2m

[root@ma03 ~]# oc describe pvc pvc-glusterblock
Name:          pvc-glusterblock
Namespace:     logging
StorageClass:  infra-glusterblock-storage
Status:        Pending
Volume:
Labels:        <none>
Annotations:   control-plane.alpha.kubernetes.io/leader={"holderIdentity":"de5d81f8-319c-11e8-b74f-0a580adb0630","leaseDurationSeconds":15,"acquireTime":"2018-04-20T09:20:55Z","renewTime":"2018-04-20T09:21:57Z","lea...
               volume.beta.kubernetes.io/storage-class=infra-glusterblock-storage
               volume.beta.kubernetes.io/storage-provisioner=gluster.org/glusterblock
Capacity:
Access Modes:
Events:
  FirstSeen  LastSeen  Count  From                                                            SubObjectPath  Type     Reason                Message
  ---------  --------  -----  ----                                                            -------------  ------   ------                -------
  56s        44s       2      gluster.org/glusterblock de5d81f8-319c-11e8-b74f-0a580adb0630                  Warning  ProvisioningFailed    Failed to provision volume with StorageClass "infra-glusterblock-storage": failed to create volume: [heketi] error creating volume Unable to execute command on glusterfs-ftmlc:
  1m         38s       3      gluster.org/glusterblock de5d81f8-319c-11e8-b74f-0a580adb0630                  Normal   Provisioning          External provisioner is provisioning volume for claim "logging/pvc-glusterblock"
  1m         1s        22     persistentvolume-controller                                                    Normal   ExternalProvisioning  cannot find provisioner "gluster.org/glusterblock", expecting that a volume for the claim is provisioned either manually or via external software
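The ProvisioningFailed message ends with "Unable to execute command on glusterfs-ftmlc:", i.e. heketi could not run the block-volume create on the gluster pod. The obvious next step would be to pull the heketi and provisioner logs from around the time of the failed provisioning attempt; roughly (a sketch, using the pod names from the listing below):

# oc logs -n default heketi-1-mdmxn | tail -n 100
# oc logs -n default glusterblock-provisioner-dc-1-bv120 | tail -n 100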
I have the following available in /etc/target/ in the glusterfs pods:

[root@ma03 ~]# oc get pods -n default -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP             NODE
docker-registry-2-3c68p               1/1     Running   2          21h   10.216.2.88    in03.example.com
glusterblock-provisioner-dc-1-bv120   1/1     Running   3          85d   10.217.8.46    mb02.example.com
glusterfs-ftmlc                       1/1     Running   0          20h   10.80.4.137    in01.example.com
glusterfs-qt4fh                       1/1     Running   0          19h   10.80.4.138    in02.example.com
glusterfs-rn6qb                       1/1     Running   0          19h   10.80.4.139    in03.example.com
heketi-1-mdmxn                        1/1     Running   3          85d   10.216.8.127   no01.example.com
registry-console-1-d397x              1/1     Running   3          85d   10.219.15.22   no12.example.com
router1-2-pv2v6                       1/1     Running   1          22h   10.219.2.105   in02.example.com
router2-1-d7lqv                       1/1     Running   1          21h   10.216.2.87    in03.example.com

[root@ma03 ~]# oc rsh glusterfs-ftmlc
sh-4.2# ls -la /etc/target/
total 120
drwxr-xr-x.  3 root root   114 Apr 20 09:37 .
drwxr-xr-x. 67 root root  4096 Apr 19 14:59 ..
drwxr-xr-x.  2 root root  4096 Apr 20 09:37 backup
-rw-------.  1 root root 33432 Apr 20 09:37 saveconfig.json
-rw-------.  1 root root 34536 Jan 30 18:54 saveconfig.json.backup
-rw-------.  1 root root 33432 Jan 31 13:48 saveconfig.json.backup-21-01-2018
sh-4.2# ls -la /etc/target/backup/
total 424
drwxr-xr-x. 2 root root  4096 Apr 20 09:37 .
drwxr-xr-x. 3 root root   114 Apr 20 09:37 ..
-rw-------. 1 root root 33432 Apr 20 09:29 saveconfig-20180420-09:29:25.json
-rw-------. 1 root root 40097 Apr 20 09:30 saveconfig-20180420-09:30:13.json
-rw-------. 1 root root 33432 Apr 20 09:30 saveconfig-20180420-09:30:16.json
-rw-------. 1 root root 40097 Apr 20 09:32 saveconfig-20180420-09:32:49.json
-rw-------. 1 root root 33432 Apr 20 09:32 saveconfig-20180420-09:32:51.json
-rw-------. 1 root root 40097 Apr 20 09:34 saveconfig-20180420-09:34:58.json
-rw-------. 1 root root 33432 Apr 20 09:35 saveconfig-20180420-09:35:00.json
-rw-------. 1 root root 40097 Apr 20 09:37 saveconfig-20180420-09:37:13.json
-rw-------. 1 root root 33432 Apr 20 09:37 saveconfig-20180420-09:37:15.json
-rw-r--r--. 1 root root 40060 Jan 30 18:54 saveconfig.json
-rw-r--r--. 1 root root 40060 Jan 30 18:40 saveconfig.json.backup
sh-4.2# exit
exit

[root@ma03 ~]# oc rsh glusterfs-qt4fh
sh-4.2# ls -la /etc/target/
total 120
drwxr-xr-x.  3 root root   114 Apr 20 09:39 .
drwxr-xr-x. 67 root root  4096 Apr 19 15:07 ..
drwxr-xr-x.  2 root root  4096 Apr 20 09:39 backup
-rw-------.  1 root root 33432 Apr 20 09:39 saveconfig.json
-rw-------.  1 root root 34536 Jan 30 18:47 saveconfig.json.backup
-rw-------.  1 root root 33432 Jan 31 13:43 saveconfig.json.backup-31-01-2018
sh-4.2# ls -la /etc/target/backup/
total 384
drwxr-xr-x. 2 root root  4096 Apr 20 09:39 .
drwxr-xr-x. 3 root root   114 Apr 20 09:39 ..
-rw-------. 1 root root 33432 Apr 20 09:32 saveconfig-20180420-09:32:51.json
-rw-------. 1 root root 40097 Apr 20 09:32 saveconfig-20180420-09:32:52.json
-rw-------. 1 root root 33432 Apr 20 09:32 saveconfig-20180420-09:32:54.json
-rw-------. 1 root root 40097 Apr 20 09:35 saveconfig-20180420-09:35:28.json
-rw-------. 1 root root 33432 Apr 20 09:35 saveconfig-20180420-09:35:30.json
-rw-------. 1 root root 40097 Apr 20 09:37 saveconfig-20180420-09:37:36.json
-rw-------. 1 root root 33432 Apr 20 09:37 saveconfig-20180420-09:37:38.json
-rw-------. 1 root root 40097 Apr 20 09:39 saveconfig-20180420-09:39:51.json
-rw-------. 1 root root 33432 Apr 20 09:39 saveconfig-20180420-09:39:52.json
-rw-r--r--. 1 root root 40060 Jan 30 18:41 saveconfig.json
sh-4.2# exit
exit

[root@ma03 ~]# oc rsh glusterfs-rn6qb
sh-4.2# ls -la /etc/target/
total 124
drwxr-xr-x.  3 root root   114 Apr 19 14:45 .
drwxr-xr-x. 67 root root  4096 Apr 19 15:04 ..
drwxr-xr-x.  2 root root  4096 Apr 19 14:45 backup
-rw-------.  1 root root 40060 Apr 19 14:45 saveconfig.json
-rw-------.  1 root root 34536 Jan 30 18:48 saveconfig.json.backup
-rw-------.  1 root root 33432 Jan 31 13:43 saveconfig.json.backup-31-01-2018
sh-4.2# ls -la /etc/target/backup/
total 352
drwxr-xr-x. 2 root root  4096 Apr 19 14:45 .
drwxr-xr-x. 3 root root   114 Apr 19 14:45 ..
-rw-------. 1 root root 33395 Mar 14 15:19 saveconfig-20180314-15:19:15.json
-rw-------. 1 root root 26730 Mar 14 15:19 saveconfig-20180314-15:19:18.json
-rw-------. 1 root root 20065 Mar 14 15:19 saveconfig-20180314-15:19:20.json
-rw-------. 1 root root 26730 Mar 14 16:11 saveconfig-20180314-16:11:06.json
-rw-------. 1 root root 33395 Mar 14 16:11 saveconfig-20180314-16:11:39.json
-rw-------. 1 root root 40060 Mar 14 16:12 saveconfig-20180314-16:12:26.json
-rw-------. 1 root root 40060 Mar 27 12:17 saveconfig-20180327-12:17:46.json
-rw-------. 1 root root 40060 Apr 19 13:21 saveconfig-20180419-13:21:34.json
-rw-------. 1 root root 40060 Apr 19 14:45 saveconfig-20180419-14:45:44.json
-rw-r--r--. 1 root root 40060 Jan 30 18:42 saveconfig.json

[root@ma03 ~]# oc rsh heketi-1-mdmxn
sh-4.2# heketi-cli --user admin --secret <secret> volume list
Id:0d4a7567c603918fb666fd591255ffab Cluster:15fb6aa960b6c2a5580228f32b5b2b3a Name:vol_0d4a7567c603918fb666fd591255ffab [block]
Id:5d3ec04f22ce9451bec245ec1a1e1ffd Cluster:15fb6aa960b6c2a5580228f32b5b2b3a Name:heketidbstorage
Id:912ceedd86354fbd8806d3f531b956cc Cluster:15fb6aa960b6c2a5580228f32b5b2b3a Name:vol_912ceedd86354fbd8806d3f531b956cc
Id:bd84de58ec563e823b1a50727b116031 Cluster:15fb6aa960b6c2a5580228f32b5b2b3a Name:vol_bd84de58ec563e823b1a50727b116031
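Given that the running target configuration (targetcli ls above) is missing most backstores and LUNs, while /etc/target/backup/ still contains the larger (~40 kB) saveconfig snapshots, our current idea is to restore one of those snapshots on each gluster pod and re-check the targets. The following is only a sketch of that idea, untested here, with the file name picked as an example from the listing above:

sh-4.2# targetcli restoreconfig /etc/target/backup/saveconfig-20180420-09:30:13.json
sh-4.2# systemctl restart gluster-blockd
sh-4.2# targetcli ls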
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2691