Bug 1569977

Summary: [GSS] CNS gluster-block: Problem mounting existing volumes and provisioning new PVCs
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: gluster-block
Version: cns-3.6
Target Release: CNS 3.10
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Fixed In Version: gluster-block-0.2.1-19.el7rhgs
Last Closed: 2018-09-12 09:25:34 UTC
Type: Bug
Reporter: Ture Karlsson <tkarlsso>
Assignee: Prasanna Kumar Kalever <prasanna.kalever>
QA Contact: Rachael <rgeorge>
CC: kramdoss, madam, pkarampu, prasanna.kalever, rhs-bugs, sankarshan, tkarlsso, vinug
Severity: unspecified
Priority: unspecified
Bug Depends On: 1537170
Bug Blocks: 1568861, 1573420

Description Ture Karlsson 2018-04-20 12:11:40 UTC
We are running an OCP 3.6 cluster with CNS 3.6 and are experiencing errors with our gluster-block volumes. We have four gluster-block volumes: three for elasticsearch and one for cassandra, i.e., storage for logging and metrics.

# oc get pods -n default -o wide
NAME                                  READY     STATUS    RESTARTS   AGE       IP             NODE
docker-registry-2-3c68p               1/1       Running   2          19h       10.216.2.88    in03.example.com
glusterblock-provisioner-dc-1-bv120   1/1       Running   3          85d       10.217.8.46    mb02.example.com
glusterfs-ftmlc                       1/1       Running   0          18h       10.80.4.137    in01.example.com
glusterfs-qt4fh                       1/1       Running   0          18h       10.80.4.138    in02.example.com
glusterfs-rn6qb                       1/1       Running   0          18h       10.80.4.139    in03.example.com
heketi-1-mdmxn                        1/1       Running   3          85d       10.216.8.127   no01.example.com
registry-console-1-d397x              1/1       Running   3          85d       10.219.15.22   no12.example.com
router1-2-pv2v6                       1/1       Running   1          20h       10.219.2.105   in02.example.com
router2-1-d7lqv                       1/1       Running   1          19h       10.216.2.87    in03.example.com

# oc rsh glusterfs-ftmlc
sh-4.2# gluster peer status
Number of Peers: 2

Hostname: 10.80.4.138
Uuid: 73905ca0-1767-4e97-b87b-a25b2b7eb9e7
State: Peer in Cluster (Connected)

Hostname: 10.80.4.139
Uuid: 48d4e9f1-05ee-4b8a-95d9-b2ced1b738a2
State: Peer in Cluster (Connected)
sh-4.2# gluster volume status
Status of volume: glusterfs-registry-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.137:/var/lib/heketi/mounts/vg
_43b8a25d46fa347f0afcdd1992ce37c8/brick_432
365f248dcb11d51f8d9c4ea84f95e/brick         N/A       N/A        N       N/A  
Brick 10.80.4.139:/var/lib/heketi/mounts/vg
_7a138fc76a774496bc6e75d94abd9ba2/brick_dad
5372836a2282887073774234c815f/brick         N/A       N/A        N       N/A  
Brick 10.80.4.138:/var/lib/heketi/mounts/vg
_e6b5453e3167a6f5fdf2c4f4fac9822f/brick_7f0
e1cd16aec9b6979fac156ffd91540/brick         N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949
 
Task Status of Volume glusterfs-registry-volume
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.139:/var/lib/heketi/mounts/vg
_994e32e5966f43a6a8bc54b218070306/brick_73d
d9aa22b514f68b67d50bc6b6df963/brick         49153     0          Y       581  
Brick 10.80.4.137:/var/lib/heketi/mounts/vg
_7149f4a8bb2f3054298318190f359f30/brick_b9d
3095a45a196ecd02e35de2d421e5b/brick         49153     0          Y       960  
Brick 10.80.4.138:/var/lib/heketi/mounts/vg
_b8a38ff05a2a762102aea81149c30584/brick_f41
09a5fbba8e5068ebe92151d4ac20a/brick         49154     0          Y       612  
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
 
Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol_0d4a7567c603918fb666fd591255ffab
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.139:/var/lib/heketi/mounts/vg
_0931dc845b77f975e5d2bef5bd77af58/brick_a5c
02542f3b227fe500b598a25c5121c/brick         49154     0          Y       584  
Brick 10.80.4.137:/var/lib/heketi/mounts/vg
_e19e0c9032852c897dbeb79c373b2b5a/brick_520
79049395620180f60a10b9cdf401a/brick         49154     0          Y       966  
Brick 10.80.4.138:/var/lib/heketi/mounts/vg
_b8a38ff05a2a762102aea81149c30584/brick_39f
15a9359e46f368240914533dc58dd/brick         49155     0          Y       614  
Brick 10.80.4.139:/var/lib/heketi/mounts/vg
_4c2c084ab43cae1cd6c8d7402195934a/brick_51a
815469c53433f18568fa8808aacf0/brick         49154     0          Y       584  
Brick 10.80.4.138:/var/lib/heketi/mounts/vg
_9dffbddb230709ab01a7746d71b91d0d/brick_d1d
8b9268469586b3c99f21b331773bd/brick         49155     0          Y       614  
Brick 10.80.4.137:/var/lib/heketi/mounts/vg
_7149f4a8bb2f3054298318190f359f30/brick_0d8
171711f908c2a66055497139137e1/brick         49154     0          Y       966  
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
 
Task Status of Volume vol_0d4a7567c603918fb666fd591255ffab
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol_912ceedd86354fbd8806d3f531b956cc
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.138:/var/lib/heketi/mounts/vg
_b8a38ff05a2a762102aea81149c30584/brick_4fd
2648d8b9beb1b4c8f2d60a9e5d319/brick         49154     0          Y       612  
Brick 10.80.4.137:/var/lib/heketi/mounts/vg
_e19e0c9032852c897dbeb79c373b2b5a/brick_adb
2ff41e976cfd276e2f9f179d3497b/brick         49153     0          Y       960  
Brick 10.80.4.139:/var/lib/heketi/mounts/vg
_4c2c084ab43cae1cd6c8d7402195934a/brick_bab
d426faa0618f9402afc402c10f694/brick         49153     0          Y       581  
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949
 
Task Status of Volume vol_912ceedd86354fbd8806d3f531b956cc
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol_bd84de58ec563e823b1a50727b116031
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.80.4.139:/var/lib/heketi/mounts/vg
_994e32e5966f43a6a8bc54b218070306/brick_e94
1a0886bbf39d2ff05d8f2d7fba490/brick         49153     0          Y       581  
Brick 10.80.4.137:/var/lib/heketi/mounts/vg
_d8ed3c2573b18c141167a8872e6f877e/brick_c13
ffab26b671edccb0f0ad82b892163/brick         49153     0          Y       960  
Brick 10.80.4.138:/var/lib/heketi/mounts/vg
_795194576df37a11a0c86a52da7172a2/brick_3ee
e8811a363165de3295375e1b898a3/brick         49154     0          Y       612  
Self-heal Daemon on localhost               N/A       N/A        Y       36058
Self-heal Daemon on 10.80.4.139             N/A       N/A        Y       32183
Self-heal Daemon on 10.80.4.138             N/A       N/A        Y       34949
 
Task Status of Volume vol_bd84de58ec563e823b1a50727b116031
------------------------------------------------------------------------------
There are no active volume tasks

sh-4.2# targetcli ls
o- /  [...]
  o- backstores  [...]
  | o- block  [Storage Objects: 0]
  | o- fileio  [Storage Objects: 0]
  | o- pscsi  [Storage Objects: 0]
  | o- ramdisk  [Storage Objects: 0]
  | o- user:glfs  [Storage Objects: 1]
  |   o- blockvol_ac89c92ad12aff23392628619eb6bdb4  [vol_0d4a7567c603918fb666fd591255ffab.4.137/block-store/35d64e70-547f-40e9-b5dd-78498454bf03 (11.0GiB) activated]
  |     o- alua  [ALUA Groups: 0]
  o- iscsi  [Targets: 7]
  | o- iqn.2016-12.org.gluster-block:1ecf553d-fb92-48e1-926e-0ecba5f5072f  [TPGs: 3]
  | | o- tpg1  [gen-acls, tpg-auth, 1-way auth]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.137:3260  [OK]
  | | o- tpg2  [disabled]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.138:3260  [OK]
  | | o- tpg3  [disabled]
  | |   o- acls  [ACLs: 0]
  | |   o- luns  [LUNs: 0]
  | |   o- portals  [Portals: 1]
  | |     o- 10.80.4.139:3260  [OK]
  | o- iqn.2016-12.org.gluster-block:35d64e70-547f-40e9-b5dd-78498454bf03  [TPGs: 3]
  | | o- tpg1  [gen-acls, tpg-auth, 1-way auth]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 1]
  | | | | o- lun0  [user/blockvol_ac89c92ad12aff23392628619eb6bdb4 (None)]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.137:3260  [OK]
  | | o- tpg2  [disabled]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 1]
  | | | | o- lun0  [user/blockvol_ac89c92ad12aff23392628619eb6bdb4 (None)]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.138:3260  [OK]
  | | o- tpg3  [disabled]
  | |   o- acls  [ACLs: 0]
  | |   o- luns  [LUNs: 1]
  | |   | o- lun0  [user/blockvol_ac89c92ad12aff23392628619eb6bdb4 (None)]
  | |   o- portals  [Portals: 1]
  | |     o- 10.80.4.139:3260  [OK]
  | o- iqn.2016-12.org.gluster-block:46e09beb-51b0-4ee4-a70f-18fa492e9393  [TPGs: 3]
  | | o- tpg1  [gen-acls, tpg-auth, 1-way auth]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.137:3260  [OK]
  | | o- tpg2  [disabled]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.138:3260  [OK]
  | | o- tpg3  [disabled]
  | |   o- acls  [ACLs: 0]
  | |   o- luns  [LUNs: 0]
  | |   o- portals  [Portals: 1]
  | |     o- 10.80.4.139:3260  [OK]
  | o- iqn.2016-12.org.gluster-block:6085c84f-96b2-4ae5-9112-b8d860c8a2f7  [TPGs: 3]
  | | o- tpg1  [gen-acls, tpg-auth, 1-way auth]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.137:3260  [OK]
  | | o- tpg2  [disabled]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.138:3260  [OK]
  | | o- tpg3  [disabled]
  | |   o- acls  [ACLs: 0]
  | |   o- luns  [LUNs: 0]
  | |   o- portals  [Portals: 1]
  | |     o- 10.80.4.139:3260  [OK]
  | o- iqn.2016-12.org.gluster-block:829e10ca-2d8b-4af7-ab33-071976260d05  [TPGs: 3]
  | | o- tpg1  [gen-acls, tpg-auth, 1-way auth]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.137:3260  [OK]
  | | o- tpg2  [disabled]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.138:3260  [OK]
  | | o- tpg3  [disabled]
  | |   o- acls  [ACLs: 0]
  | |   o- luns  [LUNs: 0]
  | |   o- portals  [Portals: 1]
  | |     o- 10.80.4.139:3260  [OK]
  | o- iqn.2016-12.org.gluster-block:9ad684b0-4554-4b0f-be1a-d6f39a38b5c3  [TPGs: 3]
  | | o- tpg1  [gen-acls, tpg-auth, 1-way auth]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.137:3260  [OK]
  | | o- tpg2  [disabled]
  | | | o- acls  [ACLs: 0]
  | | | o- luns  [LUNs: 0]
  | | | o- portals  [Portals: 1]
  | | |   o- 10.80.4.138:3260  [OK]
  | | o- tpg3  [disabled]
  | |   o- acls  [ACLs: 0]
  | |   o- luns  [LUNs: 0]
  | |   o- portals  [Portals: 1]
  | |     o- 10.80.4.139:3260  [OK]
  | o- iqn.2016-12.org.gluster-block:d8b29744-e9c9-470e-9d13-c004525ffa42  [TPGs: 3]
  |   o- tpg1  [gen-acls, tpg-auth, 1-way auth]
  |   | o- acls  [ACLs: 0]
  |   | o- luns  [LUNs: 0]
  |   | o- portals  [Portals: 1]
  |   |   o- 10.80.4.137:3260  [OK]
  |   o- tpg2  [disabled]
  |   | o- acls  [ACLs: 0]
  |   | o- luns  [LUNs: 0]
  |   | o- portals  [Portals: 1]
  |   |   o- 10.80.4.138:3260  [OK]
  |   o- tpg3  [disabled]
  |     o- acls  [ACLs: 0]
  |     o- luns  [LUNs: 0]
  |     o- portals  [Portals: 1]
  |       o- 10.80.4.139:3260  [OK]
  o- loopback  [Targets: 0]
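
Note that only a single user:glfs backstore is present and most of the targets above have empty LUN lists. As a quick sanity check, the block-related daemons inside the glusterfs pod can be inspected as follows, assuming the CNS gluster image manages them with systemd (as it does here):

sh-4.2# systemctl status gluster-blockd
sh-4.2# systemctl status tcmu-runner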

-----

The glusterfs volumes are working as expected. The problem was discovered when I noticed that the elasticsearch pods had become "not ready". We tried to recreate the pods, but because the pods did not terminate (due to https://bugzilla.redhat.com/show_bug.cgi?id=1561385) we were forced to reboot the underlying VMs.

After the VMs were rebooted, the elasticsearch pods were created again, but they fail to mount their gluster-block PVCs.
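
For reference, the portal and IQN that the kubelet tries to log in to are recorded in the PV spec; they can be read with something like the following, using one of the PV names from the oc get pvc output further below and looking at the iscsi section (targetPortal, portals, iqn, lun):

# oc get pv pvc-76fd7e26-27a2-11e8-bf94-001a4a160352 -o yaml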

# oc get pods -o wide -n logging
NAME                                      READY     STATUS              RESTARTS   AGE       IP              NODE
logging-es-data-master-4r3z0dcy-1-4trjm   0/1       ContainerCreating   0          9m        <none>          in01.example.com
logging-es-data-master-4r3z0dcy-1-8wq6d   0/1       Terminating         1          19h       <none>          in02.example.com
logging-es-data-master-rtg6tzck-1-0rpqg   0/1       Terminating         1          19h       <none>          in03.example.com
logging-es-data-master-rtg6tzck-1-lz6xf   0/1       ContainerCreating   0          9m        <none>          in02.example.com
logging-es-data-master-tvqvttfo-1-md56q   0/1       Terminating         2          1h        10.216.2.92     in03.example.com
logging-es-data-master-tvqvttfo-1-pg2cx   0/1       Terminating         1          19h       <none>          in03.example.com
logging-es-data-master-tvqvttfo-1-pjbzd   0/1       ContainerCreating   0          9m        <none>          in01.example.com
logging-fluentd-0qxx7                     1/1       Running             2          36d       10.216.6.44     rc01.example.com
logging-fluentd-1114w                     1/1       Running             2          36d       10.219.2.107    in02.example.com
logging-fluentd-1spjq                     1/1       Running             3          36d       10.219.4.28     rc04.example.com
logging-fluentd-1tml5                     1/1       Running             1          36d       10.216.4.31     rc02.example.com

# oc describe pod logging-es-data-master-4r3z0dcy-1-4trjm
Name:			logging-es-data-master-4r3z0dcy-1-4trjm
Namespace:		logging
Security Policy:	restricted
Node:			in01.example.com/10.80.4.137
Start Time:		Fri, 20 Apr 2018 11:12:29 +0200
Labels:			component=es
			deployment=logging-es-data-master-4r3z0dcy-1
			deploymentconfig=logging-es-data-master-4r3z0dcy
			logging-infra=elasticsearch
			provider=openshift
Annotations:		kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"logging","name":"logging-es-data-master-4r3z0dcy-1","uid":"7ac8378d-27...
			openshift.io/deployment-config.latest-version=1
			openshift.io/deployment-config.name=logging-es-data-master-4r3z0dcy
			openshift.io/deployment.name=logging-es-data-master-4r3z0dcy-1
			openshift.io/scc=restricted
Status:			Pending
IP:			
Controllers:		ReplicationController/logging-es-data-master-4r3z0dcy-1
Containers:
  elasticsearch:
    Container ID:	
    Image:		docker-registry-default.reg.ocr.example.com:443/openshift3/logging-elasticsearch:v3.6
    Image ID:		
    Ports:		9200/TCP, 9300/TCP
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Limits:
      memory:	8Gi
    Requests:
      cpu:	1
      memory:	8Gi
    Readiness:	exec [/usr/share/java/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    Environment:
      DC_NAME:			logging-es-data-master-4r3z0dcy
      NAMESPACE:		logging (v1:metadata.namespace)
      KUBERNETES_TRUST_CERT:	true
      SERVICE_DNS:		logging-es-cluster
      CLUSTER_NAME:		logging-es
      INSTANCE_RAM:		8Gi
      HEAP_DUMP_LOCATION:	/elasticsearch/persistent/heapdump.hprof
      NODE_QUORUM:		2
      RECOVER_EXPECTED_NODES:	3
      RECOVER_AFTER_TIME:	5m
      READINESS_PROBE_TIMEOUT:	30
      POD_LABEL:		component=es
      IS_MASTER:		true
      HAS_DATA:			true
    Mounts:
      /elasticsearch/persistent from elasticsearch-storage (rw)
      /etc/elasticsearch/secret from elasticsearch (ro)
      /usr/share/java/elasticsearch/config from elasticsearch-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from aggregated-logging-elasticsearch-token-jtmtr (ro)
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  elasticsearch:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	logging-elasticsearch
    Optional:	false
  elasticsearch-config:
    Type:	ConfigMap (a volume populated by a ConfigMap)
    Name:	logging-elasticsearch
    Optional:	false
  elasticsearch-storage:
    Type:	PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:	logging-es-2
    ReadOnly:	false
  aggregated-logging-elasticsearch-token-jtmtr:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	aggregated-logging-elasticsearch-token-jtmtr
    Optional:	false
QoS Class:	Burstable
Node-Selectors:	region=infra
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  10m		10m		1	default-scheduler				Normal		Scheduled	Successfully assigned logging-es-data-master-4r3z0dcy-1-4trjm to in01.example.com
  11m		4m		4	kubelet, in01.example.com			Warning		FailedMount	Unable to mount volumes for pod "logging-es-data-master-4r3z0dcy-1-4trjm_logging(51f45e19-447b-11e8-8388-001a4a160352)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-es-data-master-4r3z0dcy-1-4trjm". list of unattached/unmounted volumes=[elasticsearch-storage]
  11m		4m		4	kubelet, in01.example.com			Warning		FailedSync	Error syncing pod
  12m		3m		9	kubelet, in01.example.com			Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/iscsi/51f45e19-447b-11e8-8388-001a4a160352-pvc-76fd7e26-27a2-11e8-bf94-001a4a160352" (spec.Name: "pvc-76fd7e26-27a2-11e8-bf94-001a4a160352") pod "51f45e19-447b-11e8-8388-001a4a160352" (UID: "51f45e19-447b-11e8-8388-001a4a160352") with: failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.80.4.139:3260 output: iscsiadm: Login response timeout. Waited 30 seconds and did not get response PDU.
iscsiadm: discovery login to 10.80.4.139 failed, giving up 2
iscsiadm: Could not perform SendTargets discovery: encountered non-retryable iSCSI login failure
, err exit status 19
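
For reference, the same discovery failure can be checked by hand from the affected node (in01 here), assuming iscsi-initiator-utils is installed on the host as required by the iSCSI volume plugin:

# iscsiadm -m discovery -t sendtargets -p 10.80.4.139:3260
# iscsiadm -m session

The first command repeats the SendTargets discovery that the kubelet attempts; the second lists any iSCSI sessions that are still established.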

# oc describe pod logging-es-data-master-rtg6tzck-1-lz6xf
Name:			logging-es-data-master-rtg6tzck-1-lz6xf
Namespace:		logging
Security Policy:	restricted
Node:			in02.example.com/10.80.4.138
Start Time:		Fri, 20 Apr 2018 11:15:17 +0200
Labels:			component=es
			deployment=logging-es-data-master-rtg6tzck-1
			deploymentconfig=logging-es-data-master-rtg6tzck
			logging-infra=elasticsearch
			provider=openshift
Annotations:		kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"logging","name":"logging-es-data-master-rtg6tzck-1","uid":"4c753c92-27...
			openshift.io/deployment-config.latest-version=1
			openshift.io/deployment-config.name=logging-es-data-master-rtg6tzck
			openshift.io/deployment.name=logging-es-data-master-rtg6tzck-1
			openshift.io/scc=restricted
Status:			Pending
IP:			
Controllers:		ReplicationController/logging-es-data-master-rtg6tzck-1
Containers:
  elasticsearch:
    Container ID:	
    Image:		docker-registry-default.reg.ocr.example.com:443/openshift3/logging-elasticsearch:v3.6
    Image ID:		
    Ports:		9200/TCP, 9300/TCP
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Limits:
      memory:	8Gi
    Requests:
      cpu:	1
      memory:	8Gi
    Readiness:	exec [/usr/share/java/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    Environment:
      DC_NAME:			logging-es-data-master-rtg6tzck
      NAMESPACE:		logging (v1:metadata.namespace)
      KUBERNETES_TRUST_CERT:	true
      SERVICE_DNS:		logging-es-cluster
      CLUSTER_NAME:		logging-es
      INSTANCE_RAM:		8Gi
      HEAP_DUMP_LOCATION:	/elasticsearch/persistent/heapdump.hprof
      NODE_QUORUM:		2
      RECOVER_EXPECTED_NODES:	3
      RECOVER_AFTER_TIME:	5m
      READINESS_PROBE_TIMEOUT:	30
      POD_LABEL:		component=es
      IS_MASTER:		true
      HAS_DATA:			true
    Mounts:
      /elasticsearch/persistent from elasticsearch-storage (rw)
      /etc/elasticsearch/secret from elasticsearch (ro)
      /usr/share/java/elasticsearch/config from elasticsearch-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from aggregated-logging-elasticsearch-token-jtmtr (ro)
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  elasticsearch:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	logging-elasticsearch
    Optional:	false
  elasticsearch-config:
    Type:	ConfigMap (a volume populated by a ConfigMap)
    Name:	logging-elasticsearch
    Optional:	false
  elasticsearch-storage:
    Type:	PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:	logging-es-0
    ReadOnly:	false
  aggregated-logging-elasticsearch-token-jtmtr:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	aggregated-logging-elasticsearch-token-jtmtr
    Optional:	false
QoS Class:	Burstable
Node-Selectors:	region=infra
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  10m		10m		1	default-scheduler				Normal		Scheduled	Successfully assigned logging-es-data-master-rtg6tzck-1-lz6xf to in02.example.com
  8m		1m		4	kubelet, in02.example.com			Warning		FailedMount	Unable to mount volumes for pod "logging-es-data-master-rtg6tzck-1-lz6xf_logging(574d007d-447b-11e8-8388-001a4a160352)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-es-data-master-rtg6tzck-1-lz6xf". list of unattached/unmounted volumes=[elasticsearch-storage]
  8m		1m		4	kubelet, in02.example.com			Warning		FailedSync	Error syncing pod
  9m		55s		9	kubelet, in02.example.com			Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/iscsi/574d007d-447b-11e8-8388-001a4a160352-pvc-4af4c793-27a2-11e8-bf94-001a4a160352" (spec.Name: "pvc-4af4c793-27a2-11e8-bf94-001a4a160352") pod "574d007d-447b-11e8-8388-001a4a160352" (UID: "574d007d-447b-11e8-8388-001a4a160352") with: failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.80.4.139:3260 output: iscsiadm: Login response timeout. Waited 30 seconds and did not get response PDU.
iscsiadm: discovery login to 10.80.4.139 failed, giving up 2
iscsiadm: Could not perform SendTargets discovery: encountered non-retryable iSCSI login failure
, err exit status 19

# oc describe pod logging-es-data-master-tvqvttfo-1-pjbzd
Name:			logging-es-data-master-tvqvttfo-1-pjbzd
Namespace:		logging
Security Policy:	restricted
Node:			in01.example.com/10.80.4.137
Start Time:		Fri, 20 Apr 2018 11:12:44 +0200
Labels:			component=es
			deployment=logging-es-data-master-tvqvttfo-1
			deploymentconfig=logging-es-data-master-tvqvttfo
			logging-infra=elasticsearch
			provider=openshift
Annotations:		kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"logging","name":"logging-es-data-master-tvqvttfo-1","uid":"5ff870f0-27...
			openshift.io/deployment-config.latest-version=1
			openshift.io/deployment-config.name=logging-es-data-master-tvqvttfo
			openshift.io/deployment.name=logging-es-data-master-tvqvttfo-1
			openshift.io/scc=restricted
Status:			Pending
IP:			
Controllers:		ReplicationController/logging-es-data-master-tvqvttfo-1
Containers:
  elasticsearch:
    Container ID:	
    Image:		docker-registry-default.reg.ocr.example.com:443/openshift3/logging-elasticsearch:v3.6
    Image ID:		
    Ports:		9200/TCP, 9300/TCP
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Limits:
      memory:	8Gi
    Requests:
      cpu:	1
      memory:	8Gi
    Readiness:	exec [/usr/share/java/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    Environment:
      DC_NAME:			logging-es-data-master-tvqvttfo
      NAMESPACE:		logging (v1:metadata.namespace)
      KUBERNETES_TRUST_CERT:	true
      SERVICE_DNS:		logging-es-cluster
      CLUSTER_NAME:		logging-es
      INSTANCE_RAM:		8Gi
      HEAP_DUMP_LOCATION:	/elasticsearch/persistent/heapdump.hprof
      NODE_QUORUM:		2
      RECOVER_EXPECTED_NODES:	3
      RECOVER_AFTER_TIME:	5m
      READINESS_PROBE_TIMEOUT:	30
      POD_LABEL:		component=es
      IS_MASTER:		true
      HAS_DATA:			true
    Mounts:
      /elasticsearch/persistent from elasticsearch-storage (rw)
      /etc/elasticsearch/secret from elasticsearch (ro)
      /usr/share/java/elasticsearch/config from elasticsearch-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from aggregated-logging-elasticsearch-token-jtmtr (ro)
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  elasticsearch:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	logging-elasticsearch
    Optional:	false
  elasticsearch-config:
    Type:	ConfigMap (a volume populated by a ConfigMap)
    Name:	logging-elasticsearch
    Optional:	false
  elasticsearch-storage:
    Type:	PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:	logging-es-1
    ReadOnly:	false
  aggregated-logging-elasticsearch-token-jtmtr:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	aggregated-logging-elasticsearch-token-jtmtr
    Optional:	false
QoS Class:	Burstable
Node-Selectors:	region=infra
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  10m		10m		1	default-scheduler				Normal		Scheduled	Successfully assigned logging-es-data-master-tvqvttfo-1-pjbzd to in01.example.com
  11m		4m		4	kubelet, in01.example.com			Warning		FailedMount	Unable to mount volumes for pod "logging-es-data-master-tvqvttfo-1-pjbzd_logging(5b304a15-447b-11e8-8388-001a4a160352)": timeout expired waiting for volumes to attach/mount for pod "logging"/"logging-es-data-master-tvqvttfo-1-pjbzd". list of unattached/unmounted volumes=[elasticsearch-storage]
  11m		4m		4	kubelet, in01.example.com			Warning		FailedSync	Error syncing pod
  12m		3m		9	kubelet, in01.example.com			Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/iscsi/5b304a15-447b-11e8-8388-001a4a160352-pvc-5e831446-27a2-11e8-bf94-001a4a160352" (spec.Name: "pvc-5e831446-27a2-11e8-bf94-001a4a160352") pod "5b304a15-447b-11e8-8388-001a4a160352" (UID: "5b304a15-447b-11e8-8388-001a4a160352") with: failed to get any path for iscsi disk, last err seen:
iscsi: failed to sendtargets to portal 10.80.4.139:3260 output: iscsiadm: Login response timeout. Waited 30 seconds and did not get response PDU.
iscsiadm: discovery login to 10.80.4.139 failed, giving up 2
iscsiadm: Could not perform SendTargets discovery: encountered non-retryable iSCSI login failure
, err exit status 19

-----

I have also tried to create new PVCs for both glusterfs and glusterblock. Creating new glusterfs PVCs works fine, but glusterblock PVCs stay pending and fail to provision. See the output below:

# oc get sc
NAME                              TYPE
infra-glusterblock-storage        gluster.org/glusterblock   
infra-glusterfs-storage           kubernetes.io/glusterfs 

# cat pvc-glusterfs.yml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: infra-glusterfs-storage
  name: pvc-glusterfs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

# oc create -f pvc-glusterfs.yml 
persistentvolumeclaim "pvc-glusterfs" created

# cat pvc-glusterblock.yml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: infra-glusterblock-storage
  name: pvc-glusterblock
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

# oc create -f pvc-glusterblock.yml 
persistentvolumeclaim "pvc-glusterblock" created

# oc get pvc
NAME               STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS                 AGE
logging-es-0       Bound     pvc-4af4c793-27a2-11e8-bf94-001a4a160352   100Gi      RWO           infra-glusterblock-storage   36d
logging-es-1       Bound     pvc-5e831446-27a2-11e8-bf94-001a4a160352   100Gi      RWO           infra-glusterblock-storage   36d
logging-es-2       Bound     pvc-76fd7e26-27a2-11e8-bf94-001a4a160352   100Gi      RWO           infra-glusterblock-storage   36d
pvc-glusterblock   Pending                                                                       infra-glusterblock-storage   57s
pvc-glusterfs      Bound     pvc-f647cf18-447b-11e8-ae31-001a4a160353   10Gi       RWO           infra-glusterfs-storage      2m

[root@ma03 ~]# oc describe pvc pvc-glusterblock
Name:		pvc-glusterblock
Namespace:	logging
StorageClass:	infra-glusterblock-storage
Status:		Pending
Volume:		
Labels:		<none>
Annotations:	control-plane.alpha.kubernetes.io/leader={"holderIdentity":"de5d81f8-319c-11e8-b74f-0a580adb0630","leaseDurationSeconds":15,"acquireTime":"2018-04-20T09:20:55Z","renewTime":"2018-04-20T09:21:57Z","lea...
		volume.beta.kubernetes.io/storage-class=infra-glusterblock-storage
		volume.beta.kubernetes.io/storage-provisioner=gluster.org/glusterblock
Capacity:	
Access Modes:	
Events:
  FirstSeen	LastSeen	Count	From								SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----								-------------	--------	------			-------
  56s		44s		2	gluster.org/glusterblock de5d81f8-319c-11e8-b74f-0a580adb0630			Warning		ProvisioningFailed	Failed to provision volume with StorageClass "infra-glusterblock-storage":  failed to create volume: [heketi] error creating volume Unable to execute command on glusterfs-ftmlc:
  1m		38s		3	gluster.org/glusterblock de5d81f8-319c-11e8-b74f-0a580adb0630			Normal		Provisioning		External provisioner is provisioning volume for claim "logging/pvc-glusterblock"
  1m		1s		22	persistentvolume-controller							Normal		ExternalProvisioning	cannot find provisioner "gluster.org/glusterblock", expecting that a volume for the claim is provisioned either manually or via external software
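
To narrow down whether the failure is in the external provisioner or in heketi/gluster-block itself, the same request can be tried directly against heketi from inside the heketi pod (the admin secret is the same one used in the heketi-cli session at the end of this report; the 10 GiB size here is arbitrary):

sh-4.2# heketi-cli --user admin --secret <secret> blockvolume create --size=10
sh-4.2# heketi-cli --user admin --secret <secret> blockvolume list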

I have the following available in /etc/target/ in the glusterfs pods:

[root@ma03 ~]# oc get pods -n default -o wide
NAME                                  READY     STATUS    RESTARTS   AGE       IP             NODE
docker-registry-2-3c68p               1/1       Running   2          21h       10.216.2.88    in03.example.com
glusterblock-provisioner-dc-1-bv120   1/1       Running   3          85d       10.217.8.46    mb02.example.com
glusterfs-ftmlc                       1/1       Running   0          20h       10.80.4.137    in01.example.com
glusterfs-qt4fh                       1/1       Running   0          19h       10.80.4.138    in02.example.com
glusterfs-rn6qb                       1/1       Running   0          19h       10.80.4.139    in03.example.com
heketi-1-mdmxn                        1/1       Running   3          85d       10.216.8.127   no01.example.com
registry-console-1-d397x              1/1       Running   3          85d       10.219.15.22   no12.example.com
router1-2-pv2v6                       1/1       Running   1          22h       10.219.2.105   in02.example.com
router2-1-d7lqv                       1/1       Running   1          21h       10.216.2.87    in03.example.com

[root@ma03 ~]# oc rsh glusterfs-ftmlc

sh-4.2# ls -la /etc/target/
total 120
drwxr-xr-x.  3 root root   114 Apr 20 09:37 .
drwxr-xr-x. 67 root root  4096 Apr 19 14:59 ..
drwxr-xr-x.  2 root root  4096 Apr 20 09:37 backup
-rw-------.  1 root root 33432 Apr 20 09:37 saveconfig.json
-rw-------.  1 root root 34536 Jan 30 18:54 saveconfig.json.backup
-rw-------.  1 root root 33432 Jan 31 13:48 saveconfig.json.backup-21-01-2018

sh-4.2# ls -la /etc/target/backup/
total 424
drwxr-xr-x. 2 root root  4096 Apr 20 09:37 .
drwxr-xr-x. 3 root root   114 Apr 20 09:37 ..
-rw-------. 1 root root 33432 Apr 20 09:29 saveconfig-20180420-09:29:25.json
-rw-------. 1 root root 40097 Apr 20 09:30 saveconfig-20180420-09:30:13.json
-rw-------. 1 root root 33432 Apr 20 09:30 saveconfig-20180420-09:30:16.json
-rw-------. 1 root root 40097 Apr 20 09:32 saveconfig-20180420-09:32:49.json
-rw-------. 1 root root 33432 Apr 20 09:32 saveconfig-20180420-09:32:51.json
-rw-------. 1 root root 40097 Apr 20 09:34 saveconfig-20180420-09:34:58.json
-rw-------. 1 root root 33432 Apr 20 09:35 saveconfig-20180420-09:35:00.json
-rw-------. 1 root root 40097 Apr 20 09:37 saveconfig-20180420-09:37:13.json
-rw-------. 1 root root 33432 Apr 20 09:37 saveconfig-20180420-09:37:15.json
-rw-r--r--. 1 root root 40060 Jan 30 18:54 saveconfig.json
-rw-r--r--. 1 root root 40060 Jan 30 18:40 saveconfig.json.backup

sh-4.2# exit
exit

[root@ma03 ~]# oc rsh glusterfs-qt4fh

sh-4.2# ls -la /etc/target/
total 120
drwxr-xr-x.  3 root root   114 Apr 20 09:39 .
drwxr-xr-x. 67 root root  4096 Apr 19 15:07 ..
drwxr-xr-x.  2 root root  4096 Apr 20 09:39 backup
-rw-------.  1 root root 33432 Apr 20 09:39 saveconfig.json
-rw-------.  1 root root 34536 Jan 30 18:47 saveconfig.json.backup
-rw-------.  1 root root 33432 Jan 31 13:43 saveconfig.json.backup-31-01-2018

sh-4.2# ls -la /etc/target/backup/
total 384
drwxr-xr-x. 2 root root  4096 Apr 20 09:39 .
drwxr-xr-x. 3 root root   114 Apr 20 09:39 ..
-rw-------. 1 root root 33432 Apr 20 09:32 saveconfig-20180420-09:32:51.json
-rw-------. 1 root root 40097 Apr 20 09:32 saveconfig-20180420-09:32:52.json
-rw-------. 1 root root 33432 Apr 20 09:32 saveconfig-20180420-09:32:54.json
-rw-------. 1 root root 40097 Apr 20 09:35 saveconfig-20180420-09:35:28.json
-rw-------. 1 root root 33432 Apr 20 09:35 saveconfig-20180420-09:35:30.json
-rw-------. 1 root root 40097 Apr 20 09:37 saveconfig-20180420-09:37:36.json
-rw-------. 1 root root 33432 Apr 20 09:37 saveconfig-20180420-09:37:38.json
-rw-------. 1 root root 40097 Apr 20 09:39 saveconfig-20180420-09:39:51.json
-rw-------. 1 root root 33432 Apr 20 09:39 saveconfig-20180420-09:39:52.json
-rw-r--r--. 1 root root 40060 Jan 30 18:41 saveconfig.json

sh-4.2# exit
exit

[root@ma03 ~]# oc rsh glusterfs-rn6qb

sh-4.2# ls -la /etc/target/
total 124
drwxr-xr-x.  3 root root   114 Apr 19 14:45 .
drwxr-xr-x. 67 root root  4096 Apr 19 15:04 ..
drwxr-xr-x.  2 root root  4096 Apr 19 14:45 backup
-rw-------.  1 root root 40060 Apr 19 14:45 saveconfig.json
-rw-------.  1 root root 34536 Jan 30 18:48 saveconfig.json.backup
-rw-------.  1 root root 33432 Jan 31 13:43 saveconfig.json.backup-31-01-2018

sh-4.2# ls -la /etc/target/backup/
total 352
drwxr-xr-x. 2 root root  4096 Apr 19 14:45 .
drwxr-xr-x. 3 root root   114 Apr 19 14:45 ..
-rw-------. 1 root root 33395 Mar 14 15:19 saveconfig-20180314-15:19:15.json
-rw-------. 1 root root 26730 Mar 14 15:19 saveconfig-20180314-15:19:18.json
-rw-------. 1 root root 20065 Mar 14 15:19 saveconfig-20180314-15:19:20.json
-rw-------. 1 root root 26730 Mar 14 16:11 saveconfig-20180314-16:11:06.json
-rw-------. 1 root root 33395 Mar 14 16:11 saveconfig-20180314-16:11:39.json
-rw-------. 1 root root 40060 Mar 14 16:12 saveconfig-20180314-16:12:26.json
-rw-------. 1 root root 40060 Mar 27 12:17 saveconfig-20180327-12:17:46.json
-rw-------. 1 root root 40060 Apr 19 13:21 saveconfig-20180419-13:21:34.json
-rw-------. 1 root root 40060 Apr 19 14:45 saveconfig-20180419-14:45:44.json
-rw-r--r--. 1 root root 40060 Jan 30 18:42 saveconfig.json
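
The saved target configuration is still present on all three pods. If the sparse targetcli state shown earlier is only the result of the configuration not being reloaded after the reboot, a manual reload can be attempted from inside a glusterfs pod. This is a diagnostic sketch only (gluster-blockd normally restores the configuration itself) and assumes the saveconfig.json on that pod is intact:

sh-4.2# targetcli restoreconfig savefile=/etc/target/saveconfig.json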


[root@ma03 ~]# oc rsh heketi-1-mdmxn
sh-4.2# heketi-cli --user admin --secret <secret> volume list
Id:0d4a7567c603918fb666fd591255ffab    Cluster:15fb6aa960b6c2a5580228f32b5b2b3a    Name:vol_0d4a7567c603918fb666fd591255ffab [block]
Id:5d3ec04f22ce9451bec245ec1a1e1ffd    Cluster:15fb6aa960b6c2a5580228f32b5b2b3a    Name:heketidbstorage
Id:912ceedd86354fbd8806d3f531b956cc    Cluster:15fb6aa960b6c2a5580228f32b5b2b3a    Name:vol_912ceedd86354fbd8806d3f531b956cc
Id:bd84de58ec563e823b1a50727b116031    Cluster:15fb6aa960b6c2a5580228f32b5b2b3a    Name:vol_bd84de58ec563e823b1a50727b116031
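
The block-hosting volume is vol_0d4a7567c603918fb666fd591255ffab. The gluster-block metadata for the block volumes on it can be cross-checked from inside a glusterfs pod, assuming gluster-blockd is running there; <blockname> below is a placeholder for one of the names returned by the list command:

sh-4.2# gluster-block list vol_0d4a7567c603918fb666fd591255ffab
sh-4.2# gluster-block info vol_0d4a7567c603918fb666fd591255ffab/<blockname>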

Comment 16 errata-xmlrpc 2018-09-12 09:25:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2691