Description of problem:
On a CNS setup, two block hosting volumes were created and block devices were created on both of them. One of the block hosting volumes was stopped and the gluster-block-target.service was restarted. After the services came back up, a block PVC was created. The creation failed because heketi tried to create the block device on the block hosting volume that was down:

[kubeexec] ERROR 2018/07/10 09:30:22 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block create vol_0c4e24dba9d66bfd28cb3d7a9871397f/test-vol_glusterfs_mongodb-6_dadfb7e7-8423-11e8-9d0f-0a580a830009 ha 3 auth enable prealloc full 10.70.43.230,10.70.43.19,10.70.43.53 1GiB --json] on glusterfs-storage-krlwr: Err[command terminated with exit code 5]: Stdout [{ "RESULT": "FAIL", "errCode": 5, "errMsg": "Check if volume vol_0c4e24dba9d66bfd28cb3d7a9871397f is operational" } ]: Stderr []

On every retry attempt heketi picked the same block hosting volume and failed again, even though another block hosting volume with sufficient space was available.
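The failure above is detectable from gluster-block's JSON output alone. As a minimal sketch (not heketi's actual code; the struct and helper names are hypothetical, only the JSON field names and errCode 5 come from the log above), a caller could classify this error so a retry can move to a different hosting volume instead of repeating the same one:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blockResult mirrors the JSON printed by `gluster-block create ... --json`
// in the error log above (field names RESULT, errCode, errMsg).
type blockResult struct {
	Result  string `json:"RESULT"`
	ErrCode int    `json:"errCode"`
	ErrMsg  string `json:"errMsg"`
}

// isHostingVolumeDown (hypothetical helper) reports whether a gluster-block
// failure indicates the block hosting volume itself is not operational,
// which is the errCode 5 case seen in this bug.
func isHostingVolumeDown(stdout []byte) bool {
	var r blockResult
	if err := json.Unmarshal(stdout, &r); err != nil {
		return false // not JSON we recognize; treat as a different failure
	}
	return r.Result == "FAIL" && r.ErrCode == 5
}

func main() {
	// Stdout captured in the log above, verbatim.
	out := []byte(`{ "RESULT": "FAIL", "errCode": 5, "errMsg": "Check if volume vol_0c4e24dba9d66bfd28cb3d7a9871397f is operational" }`)
	fmt.Println(isHostingVolumeDown(out)) // prints "true"
}
```

With such a check, a retry loop would have a signal that the chosen hosting volume is unusable rather than the request being transiently unlucky.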
Volume Name: vol_0c4e24dba9d66bfd28cb3d7a9871397f
Type: Replicate
Volume ID: 4fc33195-9d1a-41b6-8c09-f82f1ca3cfa7
Status: Stopped
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.43.230:/var/lib/heketi/mounts/vg_86720e336d04af17d0d1df0de77fabe5/brick_cd20c546d7596fb88e9b264ba82b91d3/brick
Brick2: 10.70.43.19:/var/lib/heketi/mounts/vg_4f884c058ce59bce86912bbf803ba1d5/brick_1d15b816fe5749bdbd00add791ac18fc/brick
Brick3: 10.70.43.53:/var/lib/heketi/mounts/vg_a27e963d67f8a834e0048d9cf200c5cc/brick_72e518b267fc011b747875056fa355b0/brick
Options Reconfigured:
server.allow-insecure: on
user.cifs: off
features.shard-block-size: 64MB
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: disable
performance.strict-o-direct: on
performance.readdir-ahead: off
performance.open-behind: off
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: on

Volume Name: vol_c43f0247bd000e52369dde72cc428342
Type: Replicate
Volume ID: bd0c5acc-6a67-4d18-9c8f-bea8674712fe
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.43.230:/var/lib/heketi/mounts/vg_86720e336d04af17d0d1df0de77fabe5/brick_259857c4b93ac58f3f4416308f23c656/brick
Brick2: 10.70.43.19:/var/lib/heketi/mounts/vg_4f884c058ce59bce86912bbf803ba1d5/brick_3da36f9b37c449e259a2e38781e7aec2/brick
Brick3: 10.70.43.53:/var/lib/heketi/mounts/vg_a27e963d67f8a834e0048d9cf200c5cc/brick_821191c35768c513cf977fc6f71f340f/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.open-behind: off
performance.readdir-ahead: off
performance.strict-o-direct: on
network.remote-dio: disable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
features.shard-block-size: 64MB
user.cifs: off
server.allow-insecure: on
cluster.brick-multiplex: on

Version-Release number of selected component (if applicable):
# oc version
oc v3.10.0-0.67.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

# oc rsh heketi-storage-1-4f7vz rpm -qa|grep heketi
python-heketi-7.0.0-2.el7rhgs.x86_64
heketi-client-7.0.0-2.el7rhgs.x86_64
heketi-7.0.0-2.el7rhgs.x86_64

# rpm -qa|grep gluster
glusterfs-client-xlators-3.8.4-54.12.el7rhgs.x86_64
glusterfs-fuse-3.8.4-54.12.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-54.12.el7rhgs.x86_64
glusterfs-libs-3.8.4-54.12.el7rhgs.x86_64
glusterfs-3.8.4-54.12.el7rhgs.x86_64
glusterfs-api-3.8.4-54.12.el7rhgs.x86_64
glusterfs-cli-3.8.4-54.12.el7rhgs.x86_64
glusterfs-server-3.8.4-54.12.el7rhgs.x86

How reproducible:
1/1

Steps to Reproduce:
1. Create 2 block hosting volumes
2. Create block devices on both the volumes
3. Stop one of the block hosting volumes
4. Restart the gluster-block-target service
5. Create a new block device

Actual results:
heketi tries to create the block device on the stopped block hosting volume, so the creation fails on every attempt.

Expected results:
Block device creation should succeed; heketi should skip the stopped block hosting volume and place the device on the other hosting volume, which has sufficient space.

Additional info:
Logs will be attached soon.
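The expected behavior can be sketched as a candidate filter: when choosing a block hosting volume, skip any volume whose status (as shown by `gluster volume info` above) is not Started before applying the free-space check. This is a minimal illustration of the desired selection logic, not heketi's actual implementation; the type and function names are hypothetical, and only the two volume names and statuses come from this report:

```go
package main

import "fmt"

// hostingVolume is a simplified, hypothetical stand-in for a block hosting
// volume record. Status values mirror `gluster volume info` ("Started"/"Stopped").
type hostingVolume struct {
	Name      string
	Status    string
	FreeBytes int64
}

// pickHostingVolume sketches the expected selection: ignore volumes that are
// not started, then take the first one with enough free space.
func pickHostingVolume(vols []hostingVolume, sizeBytes int64) (string, bool) {
	for _, v := range vols {
		if v.Status != "Started" {
			continue // a stopped volume can never host the new block device
		}
		if v.FreeBytes >= sizeBytes {
			return v.Name, true
		}
	}
	return "", false // no usable hosting volume; caller must report failure
}

func main() {
	// The two hosting volumes from this report; free-space figures are
	// made up for illustration, the request size matches the 1GiB PVC.
	vols := []hostingVolume{
		{Name: "vol_0c4e24dba9d66bfd28cb3d7a9871397f", Status: "Stopped", FreeBytes: 90 << 30},
		{Name: "vol_c43f0247bd000e52369dde72cc428342", Status: "Started", FreeBytes: 90 << 30},
	}
	name, ok := pickHostingVolume(vols, 1<<30)
	fmt.Println(name, ok) // prints "vol_c43f0247bd000e52369dde72cc428342 true"
}
```

With a status filter like this, the retry loop would land on the Started volume immediately instead of repeatedly failing against the Stopped one.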