Description of problem:

[root@master ~]# oc get pods
NAME                               READY     STATUS    RESTARTS   AGE
block-router-1-ll8z5               1/1       Running   0          7d
glusterblock-provisioner-1-br10s   1/1       Running   0          21h
glusterfs-4j9nx                    1/1       Running   0          47m
glusterfs-6qz73                    1/1       Running   0          13h
glusterfs-p71zc                    1/1       Running   0          13h
glusterfs-sc5tt                    1/1       Running   0          13h
heketi-1-22x5g                     1/1       Running   2          13h

[root@master ~]# oc get route
NAME      HOST/PORT                              PATH      SERVICES   PORT      TERMINATION   WILDCARD
heketi    heketi-block.cloudapps.mystorage.com             heketi     <all>                   None

[root@master ~]# export HEKETI_CLI_SERVER="http://heketi-block.cloudapps.mystorage.com"

[root@master ~]# heketi-cli volume list
Id:61c7447d1d3ba55d4ff7e19efa22a686    Cluster:c35f18bf77b5c8aaa8576d2c95e0ba8c    Name:heketidbstorage
Id:96d8648ebd2ff7dcb87be3a2587c6246    Cluster:c35f18bf77b5c8aaa8576d2c95e0ba8c    Name:vol_96d8648ebd2ff7dcb87be3a2587c6246 [block]

[root@master ~]# heketi-cli blockvolume create --size=1
Error: Unable to execute command on glusterfs-6qz73:

[root@master ~]# oc logs heketi-1-22x5g|tail -f
[kubeexec] ERROR 2017/07/27 07:52:18 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:247: Failed to run command [gluster-block create vol_96d8648ebd2ff7dcb87be3a2587c6246/blockvol_2320f4e37bccc23f81caeb0489fe3240 ha 1 auth enable 192.168.35.3 1G --json] on glusterfs-6qz73: Err[command terminated with exit code 107]: Stdout [{ "RESULT": "FAIL", "errCode": 107, "errMsg": "Not able to acquire lock on vol_96d8648ebd2ff7dcb87be3a2587c6246[Transport endpoint is not connected]" }
]: Stderr []
[kubeexec] ERROR 2017/07/27 07:52:18 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:247: Failed to run command [gluster-block delete vol_96d8648ebd2ff7dcb87be3a2587c6246/blockvol_2320f4e37bccc23f81caeb0489fe3240] on glusterfs-6qz73: Err[command terminated with exit code 107]: Stdout [Not able to acquire lock on vol_96d8648ebd2ff7dcb87be3a2587c6246[Transport endpoint is not connected]
RESULT:FAIL
]: Stderr []
[sshexec] ERROR 2017/07/27 07:52:18 /src/github.com/heketi/heketi/executors/sshexec/block_volume.go:88: Unable to delete volume blockvol_2320f4e37bccc23f81caeb0489fe3240: Unable to execute command on glusterfs-6qz73:
[asynchttp] INFO 2017/07/27 07:52:18 asynchttp.go:129: Completed job b79a3667fc50be1dcd5ffc3df30755d3 in 615.266451ms
[heketi] ERROR 2017/07/27 07:52:18 /src/github.com/heketi/heketi/apps/glusterfs/app_block_volume.go:84: Failed to create block volume: Unable to execute command on glusterfs-6qz73:
[negroni] Started GET /queue/b79a3667fc50be1dcd5ffc3df30755d3
[negroni] Completed 500 Internal Server Error in 89.38µs
[root@master ~]#

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Err[command terminated with exit code 107]: Stdout [Not able to acquire lock on vol_96d8648ebd2ff7dcb87be3a2587c6246[Transport endpoint is not connected]

The error above should have been returned to the caller, which would help a lot with troubleshooting.

Additional info:
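For illustration only, here is a minimal Python sketch of the behaviour requested under "Expected results": taking the stdout of a failed gluster-block command and carrying its message into the error returned to the caller. This is not heketi's code (heketi is written in Go); the function name describe_failure and the exact wording of the returned message are assumptions made for this example. The only details taken from the report are the JSON fields "RESULT", "errCode" and "errMsg" and the fact that some failures print plain text instead of JSON.

```python
import json


def describe_failure(pod, exit_code, stdout):
    """Build a caller-facing error string from a failed gluster-block command.

    Hypothetical helper for illustration; not part of heketi.
    """
    detail = stdout.strip()
    try:
        # With --json, gluster-block reports failures as a JSON object.
        parsed = json.loads(detail)
    except ValueError:
        # Plain-text output (e.g. "Connection failed. ...") is kept as-is.
        parsed = None
    if isinstance(parsed, dict) and "errMsg" in parsed:
        detail = parsed["errMsg"]
    message = "Unable to execute command on {}: exit code {}".format(pod, exit_code)
    if detail:
        # Append the underlying reason instead of leaving the message empty.
        message += ": " + detail
    return message


# Stdout copied from the heketi log in this report.
stdout = '''{ "RESULT": "FAIL", "errCode": 107, "errMsg": "Not able to acquire lock on vol_96d8648ebd2ff7dcb87be3a2587c6246[Transport endpoint is not connected]" }
'''
print(describe_failure("glusterfs-6qz73", 107, stdout))
```

With a message built this way, the CLI user would see the "Not able to acquire lock" reason directly rather than an empty string after "Unable to execute command on glusterfs-6qz73:".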
Upstream patch: https://github.com/heketi/heketi/pull/793/commits/0dd6237cc4d7c171d8a4166824504418f03a829f
This issue isn't fixed yet. Tried to verify the fix in build cns-deploy-5.0.0-25.el7rhgs.x86_64.

With the gluster-blockd service stopped on one node of the 3-node RHGS cluster, the volume create command failed without propagating the actual error message.

===============================================
# heketi-cli blockvolume create --size=1 --ha=3
Error:
================================================

Snippet from heketi log:
------------------------
[sshexec] ERROR 2017/09/01 15:48:43 /src/github.com/heketi/heketi/pkg/utils/ssh/ssh.go:173: Failed to run command [/bin/bash -c 'gluster-block create vol_314b6b07b1e82a528a7bd1e2d2d00d20/blockvol_e56581f3ba9793625e8b828188576f96 ha 3 auth disable 10.70.46.1,10.70.47.105,10.70.47.25 1G --json'] on dhcp47-105.lab.eng.blr.redhat.com:22: Err[Process exited with status 255]: Stdout [Connection failed. Please check if gluster-block daemon is operational.

Moving the bug back to Assigned.
This is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1479777

*** This bug has been marked as a duplicate of bug 1479777 ***