Bug 1479345 - Gluster Block failed to create volumes with an error "Transport endpoint not connected"
Status: CLOSED CURRENTRELEASE
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: CNS-deployment
Version: cns-3.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assigned To: Michael Adam
QA Contact: Anoop
Depends On:
Blocks:
 
Reported: 2017-08-08 08:00 EDT by Humble Chirammal
Modified: 2017-08-08 10:45 EDT (History)
CC: 11 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-08 10:45:59 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Humble Chirammal 2017-08-08 08:00:56 EDT
Description of problem:

Creating a gluster-block volume through heketi fails with "Transport endpoint is not connected" (gluster-block exit code 107), even though gluster-blockd, tcmu-runner, and target all report active inside the glusterfs pod.

[root@master ~]# oc get pods
NAME                               READY     STATUS        RESTARTS   AGE
block-router-1-j00m7               1/1       Running       2          3d
glusterblock-provisioner-1-d7rmn   1/1       Running       6          5d
glusterfs-0l0zk                    1/1       Running       0          4h
glusterfs-3f7mf                    1/1       Running       0          4h
glusterfs-h8rvq                    1/1       Running       0          4h
glusterfs-rnfbw                    1/1       Running       0          4h
heketi-1-pbtbw                     1/1       Running       0          6h


[root@master ~]# heketi-cli blockvolume create --size=7 --ha=1
Error: Unable to execute command on glusterfs-h8rvq:
[root@master ~]# 
[root@master ~]# 
[root@master ~]# oc logs heketi-1-pbtbw |tail
[kubeexec] ERROR 2017/08/08 11:59:03 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:247: Failed to run command [gluster-block create vol_96d8648ebd2ff7dcb87be3a2587c6246/blockvol_b6bc55dfdf00b1c6b86b474be9237c2b  ha 1 auth disable  192.168.35.3 7G --json] on glusterfs-h8rvq: Err[command terminated with exit code 107]: Stdout [{ "RESULT": "FAIL", "errCode": 107, "errMsg": "Not able to acquire lock on vol_96d8648ebd2ff7dcb87be3a2587c6246[Transport endpoint is not connected]" }
]: Stderr []
[kubeexec] ERROR 2017/08/08 11:59:03 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:247: Failed to run command [gluster-block delete vol_96d8648ebd2ff7dcb87be3a2587c6246/blockvol_b6bc55dfdf00b1c6b86b474be9237c2b] on glusterfs-h8rvq: Err[command terminated with exit code 107]: Stdout [Not able to acquire lock on vol_96d8648ebd2ff7dcb87be3a2587c6246[Transport endpoint is not connected]
RESULT:FAIL
]: Stderr []
[sshexec] ERROR 2017/08/08 11:59:03 /src/github.com/heketi/heketi/executors/sshexec/block_volume.go:101: Unable to delete volume blockvol_b6bc55dfdf00b1c6b86b474be9237c2b: Unable to execute command on glusterfs-h8rvq: 
[heketi] ERROR 2017/08/08 11:59:03 /src/github.com/heketi/heketi/apps/glusterfs/app_block_volume.go:83: Failed to create block volume: Unable to execute command on glusterfs-h8rvq: 
[asynchttp] INFO 2017/08/08 11:59:03 asynchttp.go:129: Completed job ac9eb0c625b5ec376881220c747cea10 in 547.207164ms
[negroni] Started GET /queue/ac9eb0c625b5ec376881220c747cea10
[negroni] Completed 500 Internal Server Error in 64.305µs
[root@master ~]# 
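For isolation, the same gluster-block invocation that heketi attempted (copied from the log above) can be re-run directly inside the affected glusterfs pod. This is only a diagnostic sketch; the volume and block names are the ones from this report.

# Re-run the create that heketi attempted, directly inside the affected pod
# (names taken from the heketi log above; adjust as needed)
oc exec glusterfs-h8rvq -- gluster-block create \
    vol_96d8648ebd2ff7dcb87be3a2587c6246/blockvol_b6bc55dfdf00b1c6b86b474be9237c2b \
    ha 1 auth disable 192.168.35.3 7G --json
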



[root@master ~]# oc rsh glusterfs-h8rvq
sh-4.2# 
sh-4.2# 
sh-4.2# systemctl status gluster-blockd
● gluster-blockd.service - Gluster block storage utility
   Loaded: loaded (/usr/lib/systemd/system/gluster-blockd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-08-08 07:28:07 UTC; 4h 32min ago
 Main PID: 157 (gluster-blockd)
   CGroup: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod1cd4ac9e_7c0b_11e7_aa20_52540070d000.slice/docker-ae62535102935f334d836ecb85d1d479dd4cddfa5a9bea55050bfd5ec8677e51.scope/system.slice/gluster-blockd.service
           └─157 /usr/sbin/gluster-blockd --glfs-lru-count 5 --log-level INFO...

Aug 08 07:28:07 node2 systemd[1]: Started Gluster block storage utility.
Aug 08 07:28:07 node2 systemd[1]: Starting Gluster block storage utility...
sh-4.2# systemctl status tcmu-runner   
● tcmu-runner.service - LIO Userspace-passthrough daemon
   Loaded: loaded (/usr/lib/systemd/system/tcmu-runner.service; static; vendor preset: disabled)
   Active: active (running) since Tue 2017-08-08 07:28:06 UTC; 4h 32min ago
 Main PID: 106 (tcmu-runner)
   CGroup: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod1cd4ac9e_7c0b_11e7_aa20_52540070d000.slice/docker-ae62535102935f334d836ecb85d1d479dd4cddfa5a9bea55050bfd5ec8677e51.scope/system.slice/tcmu-runner.service
           └─106 /usr/bin/tcmu-runner --tcmu-log-dir=/var/log/gluster-block/

Aug 08 07:28:06 node2 systemd[1]: Starting LIO Userspace-passthrough daemon...
Aug 08 07:28:06 node2 systemd[1]: Started LIO Userspace-passthrough daemon.
sh-4.2# systemctl status target     
● target.service - Restore LIO kernel target configuration
   Loaded: loaded (/usr/lib/systemd/system/target.service; disabled; vendor preset: disabled)
   Active: active (exited) since Tue 2017-08-08 07:28:07 UTC; 4h 32min ago
  Process: 105 ExecStart=/usr/bin/targetctl restore (code=exited, status=0/SUCCESS)
 Main PID: 105 (code=exited, status=0/SUCCESS)
   CGroup: /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod1cd4ac9e_7c0b_11e7_aa20_52540070d000.slice/docker-ae62535102935f334d836ecb85d1d479dd4cddfa5a9bea55050bfd5ec8677e51.scope/system.slice/target.service

Aug 08 07:28:06 node2 systemd[1]: Starting Restore LIO kernel target config.....
Aug 08 07:28:07 node2 target[105]: No saved config file at /etc/target/save...ng
Aug 08 07:28:07 node2 systemd[1]: Started Restore LIO kernel target configu...n.
Hint: Some lines were ellipsized, use -l to show in full.
sh-4.2# 
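gluster-blockd, tcmu-runner and target all report active on this node. To rule out a daemon problem on the other nodes, the same check can be looped over the remaining glusterfs pods; a minimal sketch, using the pod names listed above:

# Verify the block-related daemons in every glusterfs pod
for pod in glusterfs-0l0zk glusterfs-3f7mf glusterfs-h8rvq glusterfs-rnfbw; do
    echo "== $pod =="
    oc exec "$pod" -- systemctl is-active gluster-blockd tcmu-runner target
done
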





Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy CNS with the gluster-block provisioner (pods as listed above).
2. Create a block volume: heketi-cli blockvolume create --size=7 --ha=1
3. Check the heketi pod logs for the resulting error.

Actual results:
heketi-cli blockvolume create fails with "Unable to execute command on glusterfs-h8rvq"; the underlying gluster-block call returns errCode 107, "Transport endpoint is not connected".

Expected results:
The block volume is created successfully.

Additional info:
Comment 2 Humble Chirammal 2017-08-08 10:45:00 EDT
Status of volume: vol_96d8648ebd2ff7dcb87be3a2587c6246
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.35.3:/var/lib/heketi/mounts/v
g_803bb877441ad72c82b0ee6113a79511/brick_9a
7f3517851a71e13bc3ff1f3b96f4bc/brick        N/A       N/A        N       N/A  
Brick 192.168.35.4:/var/lib/heketi/mounts/v
g_799ba6da52ca29cf5404bd96b53fa957/brick_3f
ed005da1f6746d2d299b1e971fcc39/brick        49152     0          Y       414  
Brick 192.168.35.4:/var/lib/heketi/mounts/v
g_3ae4bb7cd25740f3cd84a15dbb985f48/brick_bb
9eb06eb20f6d570d4d2f6378b14742/brick        49152     0          Y       414  
Brick 192.168.35.5:/var/lib/heketi/mounts/v
g_96d6ff00fd2a38ead4c9ef9cd6db935f/brick_1a
6dcabc43a70402dc595ed89d76126f/brick        49154     0          Y       443  
Self-heal Daemon on localhost               N/A       N/A        Y       419  
Self-heal Daemon on 192.168.35.2            N/A       N/A        Y       398  
Self-heal Daemon on 192.168.35.5            N/A       N/A        Y       716  
Self-heal Daemon on 192.168.35.4            N/A       N/A        Y       405  
 
Task Status of Volume vol_96d8648ebd2ff7dcb87be3a2587c6246
------------------------------------------------------------------------------
sh-4.2# cd /var/log/glusterfs/bricks/
sh-4.2# ls
var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_809331b5ef0e23c3327cf4483258dd25-brick.log
var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_809331b5ef0e23c3327cf4483258dd25-brick.log-20170808
var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_9a7f3517851a71e13bc3ff1f3b96f4bc-brick.log
var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_9a7f3517851a71e13bc3ff1f3b96f4bc-brick.log-20170730.gz
var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_9a7f3517851a71e13bc3ff1f3b96f4bc-brick.log-20170808
sh-4.2# ps aux |grep glusterfs
root        419  1.8  0.1 534144 18796 ?        Ssl  07:28   4:56 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/02999fdf21f5e88dc48d632567ae77fc.socket --xlator-option *replicate*.node-uuid=68989b9e-0fb8-4265-80a3-87d19e7ea6c5
root        428  0.0  0.2 982604 22492 ?        Ssl  07:28   0:02 /usr/sbin/glusterfsd -s 192.168.35.3 --volfile-id vol_bd43b5b5f048c469ee16f80376d73ec3.192.168.35.3.var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_809331b5ef0e23c3327cf4483258dd25-brick -p /var/lib/glusterd/vols/vol_bd43b5b5f048c469ee16f80376d73ec3/run/192.168.35.3-var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_809331b5ef0e23c3327cf4483258dd25-brick.pid -S /var/run/gluster/e6ae8574b384167f5e9707a453d930fc.socket --brick-name /var/lib/heketi/mounts/vg_803bb877441ad72c82b0ee6113a79511/brick_809331b5ef0e23c3327cf4483258dd25/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_809331b5ef0e23c3327cf4483258dd25-brick.log --xlator-option *-posix.glusterd-uuid=68989b9e-0fb8-4265-80a3-87d19e7ea6c5 --brick-port 49155 --xlator-option vol_bd43b5b5f048c469ee16f80376d73ec3-server.listen-port=49155
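The volume status above shows the brick on 192.168.35.3 offline (Online = N), which matches the "Transport endpoint is not connected" error gluster-block hit while trying to acquire the lock on the block-hosting volume. A hedged workaround sketch (standard gluster CLI, run inside any glusterfs pod) to bring the brick process back up:

# Confirm which bricks are offline (Online column = N)
gluster volume status vol_96d8648ebd2ff7dcb87be3a2587c6246

# "start ... force" restarts brick processes that are not running,
# without affecting bricks that are already online
gluster volume start vol_96d8648ebd2ff7dcb87be3a2587c6246 force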
Comment 3 Humble Chirammal 2017-08-08 10:45:59 EDT
This is caused by a race condition that is being fixed in https://review.gluster.org/#/c/17984/.
I am closing this bug for now.
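
To check whether a build carrying that fix has landed in the glusterfs pods, an rpm query along these lines could be used (the package names are an assumption, the usual ones for these containers):

# Query the installed packages inside the pod; compare against the
# build that contains https://review.gluster.org/#/c/17984/
oc exec glusterfs-h8rvq -- rpm -q glusterfs-server gluster-block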
