Going through the gluster cmd history, I found only create commands related to volume 13 in question.

[2018-05-28 14:19:07.157559] : volume create fl_glusterfs_mongodb-13_0eccac84-6282-11e8-bce8-005056a52b66 replica 3 10.70.42.84:/var/lib/heketi/mounts/vg_ac1e7f8b95ff5e45dfc512ff80a39500/brick_059f53211e900cf48c2c1b6ec6a57292/brick 10.70.42.86:/var/lib/heketi/mounts/vg_0e56fba30532535400683bfba6418693/brick_16c4a50bda4066e6a22372fbd9be1e9a/brick 10.70.41.217:/var/lib/heketi/mounts/vg_3df896c762972fe426d32a583c98938d/brick_b09d463676bc7cf2aa568dd0861a2367/brick : SUCCESS
[2018-05-28 14:19:52.902926] : volume start fl_glusterfs_mongodb-13_0eccac84-6282-11e8-bce8-005056a52b66 : SUCCESS
[2018-05-28 14:25:44.807127] : volume create fl_glusterfs_mongodb-14_f7731709-6282-11e8-bce8-005056a52b66 replica 3 10.70.42.84:/var/lib/heketi/mounts/vg_ac1e7f8b95ff5e45dfc512ff80a39500/brick_88ef0b992801d59cd0aeea998faea381/brick 10.70.42.86:/var/lib/heketi/mounts/vg_0e56fba30532535400683bfba6418693/brick_9199142f5b8d9a044cd3d671e9942e5e/brick 10.70.41.217:/var/lib/heketi/mounts/vg_4e3c85737db8bb8de87e5e04465c37ef/brick_123e57134fbf9b8c9187eb0851a4d24b/brick : SUCCESS
[2018-05-28 14:25:55.191166] : volume start fl_glusterfs_mongodb-14_f7731709-6282-11e8-bce8-005056a52b66 : SUCCESS
[2018-05-28 14:26:00.810355] : volume create fl_glusterfs_mongodb-13_f77396c8-6282-11e8-bce8-005056a52b66 replica 3 10.70.42.84:/var/lib/heketi/mounts/vg_ea22c9a72381f27d14a7656721e62a0b/brick_80950cb457ab663b8e51ad4ed8b9f534/brick 10.70.42.86:/var/lib/heketi/mounts/vg_0e56fba30532535400683bfba6418693/brick_d018321a2ed997aef577f61f1568b8e0/brick 10.70.41.217:/var/lib/heketi/mounts/vg_4e3c85737db8bb8de87e5e04465c37ef/brick_1a8f98451ad975f958e25b99143df9a3/brick : SUCCESS
[2018-05-28 14:26:05.606277] : volume start fl_glusterfs_mongodb-13_f77396c8-6282-11e8-bce8-005056a52b66 : SUCCESS
[2018-05-28 14:51:36.020415] : v status fl_glusterfs_mongodb-13_0eccac84-6282-11e8-bce8-005056a52b66 : SUCCESS

Also, found the
following logs in events:

29m 29m 1 mongodb-13.1532d51460c932f9 PersistentVolumeClaim Warning ProvisioningFailed persistentvolume-controller Failed to provision volume with StorageClass "gluster-container": failed to create volume: failed to create volume: Get http://172.31.45.8:8080/queue/0438e64f9bff557bc3b33b2bb6112d22: dial tcp 172.31.45.8:8080: getsockopt: connection refused
29m 29m 1 mongodb-13.1532d5158e4479d3 PersistentVolumeClaim Warning ProvisioningFailed persistentvolume-controller Failed to provision volume with StorageClass "gluster-container": failed to create volume: failed to create volume: Post http://172.31.45.8:8080/volumes: dial tcp 172.31.45.8:8080: getsockopt: connection refused

Noticing that heketi logs start only at 14:25,

```
Heketi 6.0.0
[heketi] INFO 2018/05/28 14:25:21 Loaded kubernetes executor
```

it can be assumed that the heketi pod restarted between 14:19:52.902926 and 14:25:55.191166. Hence, it is possible that heketi created the volume but failed before the provisioner queried the result.

The throttling feature in heketi would reduce the occurrence of such bugs, but it does not fully fix them. The real fix would be to have a handle that can be used by both the provisioner and heketi to identify requests.
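
To make the handle idea concrete, here is a minimal sketch of how a shared request handle could make volume creation idempotent. This is only an illustration under assumptions: `VolumeServer`, `create_volume`, `provision` and the handle format are hypothetical names, not heketi's or the provisioner's actual API, and the volume name only imitates the pattern seen in the cmd history.

```python
import uuid


class VolumeServer:
    """Hypothetical stand-in for heketi; this is not its real API."""

    def __init__(self):
        self.volumes = {}    # volume name -> request handle
        self.by_handle = {}  # request handle -> volume name

    def create_volume(self, handle, pvc_name):
        # A retry that carries an already-served handle gets the existing
        # volume back instead of triggering a second create.
        if handle in self.by_handle:
            return self.by_handle[handle]
        # Name pattern only mimics the ones in the cmd history above.
        name = "fl_glusterfs_%s_%s" % (pvc_name, uuid.uuid4())
        self.volumes[name] = handle
        self.by_handle[handle] = name
        return name


def provision(server, pvc_name, handle, reply_lost=False):
    """Hypothetical provisioner that always retries with the same handle."""
    name = server.create_volume(handle, pvc_name)
    if reply_lost:
        # Simulate the reply never reaching the provisioner (for example,
        # the server went away right after creating the volume), followed
        # by a retry of the same request.
        name = server.create_volume(handle, pvc_name)
    return name


if __name__ == "__main__":
    server = VolumeServer()
    volume = provision(server, "mongodb-13", "pvc-uid-mongodb-13", reply_lost=True)
    print(volume, len(server.volumes))
```

In this sketch the retry for mongodb-13 maps back to the volume that already exists, so only one volume is created per handle. For this to survive a pod restart like the one suspected here, the handle-to-volume mapping would have to be persisted together with the volume record, which the in-memory dictionaries above do not do.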