Storage is mounted on /gluster/scratch. The client somehow got into a state in which "ls /gluster/scratch" always returned "Transport endpoint not connected", and no file under /var/log/glusterfs/ was appended to on each attempt. However, after

  umount /gluster/scratch
  mount /gluster/scratch

everything was fine again.

Relevant /etc/fstab entry on the client:

  storage1:/scratch3 /gluster/scratch3 glusterfs defaults,_netdev 0 0

Client info: Ubuntu 10.04 x86_64, glusterfs 3.2.5-1
Server info: Ubuntu 12.04 x86_64, glusterfs 3.2.5-1 (this system was upgraded online from 10.04 to 12.04)

How reproducible: Not really; I have seen this once in a while.

Additional info: Below is the end of /var/log/glusterfs/gluster-scratch.log, which suggests there was some sort of crash in the client, from which the client was presumably unable to recover automatically. 2012-06-04 is when the unmount/remount was done.

...
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
time of crash: 2012-04-25 12:35:57
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.5
...
More of the same:

frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-04-25 12:35:57
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.5
/lib/libc.so.6(+0x33af0)[0x7f282cdddaf0]
/usr/lib/glusterfs/3.2.5/xlator/performance/write-behind.so(wb_sync_cbk+0x30)[0x7f2829eaf060]
/usr/lib/glusterfs/3.2.5/xlator/cluster/distribute.so(dht_writev_cbk+0xd3)[0x7f282a0c6a43]
/usr/lib/glusterfs/3.2.5/xlator/protocol/client.so(client3_1_writev+0x13a)[0x7f282a30e89a]
/usr/lib/glusterfs/3.2.5/xlator/protocol/client.so(client_writev+0xa4)[0x7f282a2f2aa4]
/usr/lib/glusterfs/3.2.5/xlator/cluster/distribute.so(dht_writev+0x162)[0x7f282a0cb982]
/usr/lib/glusterfs/3.2.5/xlator/performance/write-behind.so(wb_sync+0x4fa)[0x7f2829ea838a]
/usr/lib/glusterfs/3.2.5/xlator/performance/write-behind.so(wb_do_ops+0x53)[0x7f2829eac443]
/usr/lib/glusterfs/3.2.5/xlator/performance/write-behind.so(wb_process_queue+0xe8)[0x7f2829ea97c8]
/usr/lib/glusterfs/3.2.5/xlator/performance/write-behind.so(wb_writev+0x887)[0x7f2829eabf87]
/usr/lib/glusterfs/3.2.5/xlator/performance/read-ahead.so(ra_writev+0x18f)[0x7f2829c9ddaf]
/usr/lib/glusterfs/3.2.5/xlator/performance/io-cache.so(ioc_writev+0x175)[0x7f2829a8d3f5]
/usr/lib/glusterfs/3.2.5/xlator/performance/write-behind.so(wb_sync_cbk+0xfa)[0x7f2829eaf12a]
/usr/lib/glusterfs/3.2.5/xlator/performance/quick-read.so(qr_writev+0x224)[0x7f2829880e84]
/usr/lib/glusterfs/3.2.5/xlator/protocol/client.so(client3_1_writev_cbk+0x515)[0x7f282a30acb5]
/usr/lib/glusterfs/3.2.5/xlator/performance/stat-prefetch.so(sp_writev+0x178)[0x7f28296683c8]
/usr/lib/glusterfs/3.2.5/xlator/debug/io-stats.so(io_stats_writev+0x1f6)[0x7f2829448766]
/usr/lib/glusterfs/3.2.5/xlator/mount/fuse.so(fuse_write_resume+0x181)[0x7f282bb97331]
/usr/lib/glusterfs/3.2.5/xlator/mount/fuse.so(fuse_resolve_and_resume+0x52)[0x7f282bb8d692]
/usr/lib/glusterfs/3.2.5/xlator/mount/fuse.so(+0x17e5d)[0x7f282bb9ee5d]
/lib/libpthread.so.0(+0x69ca)[0x7f282d1339ca]
/lib/libc.so.6(clone+0x6d)[0x7f282ce9070d]
---------
/usr/lib/libgfrpc.so.0(saved_frames_unwind+0x1c9)[0x7f282d56e3d9]
[2012-06-04 20:47:11.40519] I [glusterfsd.c:1493:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.2.5
[2012-06-04 20:47:11.468286] W [write-behind.c:3023:init] 0-scratch-write-behind: disabling write-behind for first 0 bytes
[2012-06-04 20:47:11.475928] I [client.c:1935:notify] 0-scratch-client-0: parent translators are ready, attempting connect on transport
[2012-06-04 20:47:11.476879] I [client.c:1935:notify] 0-scratch-client-1: parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
  1: volume scratch-client-0
  2:     type protocol/client
  3:     option remote-host storage2
  4:     option remote-subvolume /disk/scratch/scratch
  5:     option transport-type tcp
  6: end-volume
  7:
  8: volume scratch-client-1
  9:     type protocol/client
 10:     option remote-host storage3
 11:     option remote-subvolume /disk/scratch/scratch
 12:     option transport-type tcp
 13: end-volume
 14:
 15: volume scratch-dht
 16:     type cluster/distribute
 17:     subvolumes scratch-client-0 scratch-client-1
 18: end-volume
 19:
 20: volume scratch-write-behind
 21:     type performance/write-behind
 22:     subvolumes scratch-dht
 23: end-volume
 24:
 25: volume scratch-read-ahead
 26:     type performance/read-ahead
 27:     subvolumes scratch-write-behind
 28: end-volume
 29:
 30: volume scratch-io-cache
 31:     type performance/io-cache
 32:     subvolumes scratch-read-ahead
 33: end-volume
 34:
 35: volume scratch-quick-read
 36:     type performance/quick-read
 37:     subvolumes scratch-io-cache
 38: end-volume
 39:
 40: volume scratch-stat-prefetch
 41:     type performance/stat-prefetch
 42:     subvolumes scratch-quick-read
 43: end-volume
 44:
 45: volume scratch
 46:     type debug/io-stats
 47:     option latency-measurement off
 48:     option count-fop-hits off
 49:     subvolumes scratch-stat-prefetch
 50: end-volume
+------------------------------------------------------------------------------+
[2012-06-04 20:47:11.477867] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-scratch-client-0: changing port to 24009 (from 0)
[2012-06-04 20:47:11.478028] I [rpc-clnt.c:1536:rpc_clnt_reconfig] 0-scratch-client-1: changing port to 24010 (from 0)
[2012-06-04 20:47:15.164693] I [client-handshake.c:1090:select_server_supported_programs] 0-scratch-client-0: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2012-06-04 20:47:15.165256] I [client-handshake.c:1090:select_server_supported_programs] 0-scratch-client-1: Using Program GlusterFS 3.2.5, Num (1298437), Version (310)
[2012-06-04 20:47:15.165504] I [client-handshake.c:913:client_setvolume_cbk] 0-scratch-client-0: Connected to 192.168.6.71:24009, attached to remote volume '/disk/scratch/scratch'.
[2012-06-04 20:47:15.165713] I [client-handshake.c:913:client_setvolume_cbk] 0-scratch-client-1: Connected to 192.168.6.72:24010, attached to remote volume '/disk/scratch/scratch'.
[2012-06-04 20:47:15.177925] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse: switched to graph 0
[2012-06-04 20:47:15.178193] I [fuse-bridge.c:2927:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
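For what it's worth, the crash backtrace above points into the write-behind translator (wb_sync_cbk). As a hedged workaround sketch only, not a confirmed fix from this report, and assuming the stock gluster volume-set interface of the 3.x CLI, write-behind could be disabled for the affected volume (run on a server, then unmount/remount the clients):

```
# Hypothetical mitigation, untested here: disable the write-behind
# translator implicated in the backtrace for the "scratch" volume.
gluster volume set scratch performance.write-behind off
```

Disabling a performance translator costs write throughput, so this would only make sense as a stopgap while the crash itself is unresolved.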
I have the same problem with 3.2.6: from time to time, on a random basis, some server gives me the "Transport endpoint not connected" error. I have to reboot the server to make it connect again. I run Fedora 16 and Gluster 3.2.6-2.
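A full reboot should not be needed: per the original report, unmounting and remounting the wedged mount point cleared the state. A minimal sketch of that recovery step (the mount point name and the reliance on an /etc/fstab entry are assumptions taken from the report; adapt to your setup):

```shell
#!/bin/sh
# Sketch, not from the report: detect a wedged GlusterFS FUSE mount and
# remount it instead of rebooting the machine.

remount_if_wedged() {
    mp=$1
    # On a wedged FUSE mount, stat() fails with ENOTCONN
    # ("Transport endpoint is not connected").
    if stat "$mp" >/dev/null 2>&1; then
        echo "ok: $mp is responsive"
        return 0
    fi
    echo "wedged: remounting $mp"
    umount -l "$mp" 2>/dev/null   # lazy unmount in case processes hold the dead mount
    mount "$mp"                   # picks the options back up from /etc/fstab
}

# Usage example: remount_if_wedged /gluster/scratch
if [ $# -ge 1 ]; then
    remount_if_wedged "$1"
fi
```

This could be run from cron as a stopgap, but it only papers over the client crash; upgrading remains the real fix.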
Are you guys using RDMA?
(In reply to comment #3)
> are you guys using rdma?

Never mind, dumb question; I didn't see the client config at first.
In my case it's 10G Ethernet (Intel X520-DA2 cards, SFP+ cables, Netgear XSM7224S switch).
I'm using Gigabit Ethernet cards, some bonded, on both the client and the servers.
This bug is fixed in the 3.2.6 release and is no longer present in the 3.3.0 release. Please upgrade to one of those releases.

*** This bug has been marked as a duplicate of bug 767359 ***