Bug 1224646

Summary: "Transport endpoint is not connected" when accessing the .snaps dir on an EC volume with a distribute hot tier attached while IO is going on.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Triveni Rao <trao>
Component: tier
Assignee: Mohammed Rafi KC <rkavunga>
Status: CLOSED WORKSFORME
QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: dlambrig, kramdoss, rhs-bugs, sankarshan, sashinde
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: tier-interops
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-17 12:42:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Triveni Rao 2015-05-25 08:55:40 UTC
Description of problem:

"Transport endpoint is not connected" is returned when trying to access the .snaps dir on an EC volume with a distribute hot tier attached while IO is going on.


Version-Release number of selected component (if applicable):

[root@rhsqa14-vm1 ~]# rpm -qa | grep gluster
glusterfs-3.7.0-2.el6rhs.x86_64
glusterfs-cli-3.7.0-2.el6rhs.x86_64
glusterfs-libs-3.7.0-2.el6rhs.x86_64
glusterfs-client-xlators-3.7.0-2.el6rhs.x86_64
glusterfs-api-3.7.0-2.el6rhs.x86_64
glusterfs-server-3.7.0-2.el6rhs.x86_64
glusterfs-fuse-3.7.0-2.el6rhs.x86_64
glusterfs-debuginfo-3.7.0-2.el6rhs.x86_64
[root@rhsqa14-vm1 ~]# 

[root@rhsqa14-vm1 ~]# gluster --version
glusterfs 3.7.0 built on May 15 2015 01:31:12
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@rhsqa14-vm1 ~]# 

How reproducible:

Easily

Steps to Reproduce:
1. Create an EC volume as the cold tier and attach a distribute hot tier; enable USS.
2. FUSE-mount the volume, untar the Linux kernel on it, and create a snapshot.
3. Activate the snapshot while IO is going on.
4. On the mount point, try accessing the .snaps dir.
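The steps above can be sketched with the gluster CLI roughly as follows; hostnames, brick paths, and the snapshot name are placeholders rather than values from this report, and the commands assume a live two-node RHGS 3.1 (glusterfs 3.7) cluster:

```shell
# Cold tier: 4+2 disperse (EC) volume (brick paths are illustrative)
gluster volume create jim disperse 6 redundancy 2 \
    server1:/rhs/brick1/p server2:/rhs/brick1/p \
    server1:/rhs/brick2/p server2:/rhs/brick2/p \
    server1:/rhs/brick3/p server2:/rhs/brick3/p
gluster volume start jim

# Attach a distribute hot tier and enable USS (exposes .snaps on the mount)
gluster volume attach-tier jim \
    server1:/rhs/brick4/p server2:/rhs/brick4/p \
    server1:/rhs/brick5/p server2:/rhs/brick5/p
gluster volume set jim features.uss enable

# FUSE-mount the volume and start IO (kernel untar) in the background
mount -t glusterfs server1:/jim /mnt
tar -xJf linux-4.0.tar.xz -C /mnt &

# Create a snapshot and activate it while the untar is still running;
# in 3.7 the created snapshot name carries a GMT timestamp suffix
gluster snapshot create snap1 jim
gluster snapshot activate snap1_GMT-<timestamp>

# Trigger the bug: access the USS snapshot directory on the mount
ls /mnt/.snaps
```

This sketch cannot run outside a Gluster cluster, so it is illustrative only.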

Additional info:


[root@rhsqa14-vm1 ~]# gluster v info jim

Volume Name: jim
Type: Tier
Volume ID: 633cc77d-68b7-48e7-8081-509ac29fdff9
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 4
Brick1: 10.70.46.236:/rhs/brick5/p
Brick2: 10.70.46.233:/rhs/brick5/p
Brick3: 10.70.46.236:/rhs/brick4/p
Brick4: 10.70.46.233:/rhs/brick4/p
Cold Bricks:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.46.233:/rhs/brick1/p
Brick6: 10.70.46.236:/rhs/brick1/p
Brick7: 10.70.46.233:/rhs/brick2/p
Brick8: 10.70.46.236:/rhs/brick2/p
Brick9: 10.70.46.233:/rhs/brick3/p
Brick10: 10.70.46.236:/rhs/brick3/p
Options Reconfigured:
features.barrier: disable
features.uss: enable
features.inode-quota: on
features.quota: on
cluster.min-free-disk: 10
performance.readdir-ahead: on
[root@rhsqa14-vm1 ~]#

[root@rhsqa14-vm5 ~]# cd /mnt
[root@rhsqa14-vm5 mnt]# ls -la
total 80388
drwxr-xr-x.  5 root root      243 May 25 03:30 .
dr-xr-xr-x. 31 root root     4096 May 25 02:24 ..
drwx------.  4 root root      302 May 25 03:33 linux-4.0
-rw-r--r--.  1 root root 82313052 May 25 03:30 linux-4.0.tar.xz
drwxr-xr-x.  3 root root       96 May 25 03:24 .trashcan
[root@rhsqa14-vm5 mnt]# cd .snaps
-bash: cd: .snaps: Transport endpoint is not connected
[root@rhsqa14-vm5 mnt]#

Comment 2 Ashish Pandey 2015-06-04 09:19:58 UTC
Tried to recreate the bug on EC volumes with all the features enabled except tiering. It works fine and I don't see any kind of error: no "I/O Error", no "Transport endpoint is not connected".

[root@aspandey gfsa]# ll
total 16
drwx------. 7 root root 4096 Jun  4 14:45 linux-2.6.39
[root@aspandey gfsa]# pwd
/mnt/gfsa
[root@aspandey gfsa]# cd .snaps
[root@aspandey .snaps]# ll
total 2
drwxr-xr-x. 5 root root 58 Jun  4 14:24 test_1_GMT-2015.06.04-09.03.26
drwxr-xr-x. 5 root root 58 Jun  4 14:24 test_2_GMT-2015.06.04-09.13.20
drwxr-xr-x. 5 root root 58 Jun  4 14:24 test_GMT-2015.06.04-08.57.31
[root@aspandey .snaps]# pwd
/mnt/gfsa/.snaps
[root@aspandey .snaps]# cd test_GMT-2015.06.04-08.57.31/
[root@aspandey test_GMT-2015.06.04-08.57.31]# ll
total 4
drwx------. 4 root root 4096 Jun  4 14:26 linux-2.6.39
[root@aspandey test_GMT-2015.06.04-08.57.31]# pwd
/mnt/gfsa/.snaps/test_GMT-2015.06.04-08.57.31
[root@aspandey test_GMT-2015.06.04-08.57.31]# 



[root@rhs3 glusterfs]# gluster --version
glusterfs 3.7.1 built on Jun  4 2015 13:17:30
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License

[root@rhs3 glusterfs]# gluster v info gv1
 
Volume Name: gv1
Type: Disperse
Volume ID: f139b6c8-fe00-44a6-a9f9-317b1e365afd
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.42.64:/brick/a1
Brick2: 10.70.42.64:/brick/a2
Brick3: 10.70.42.64:/brick/a3
Brick4: 10.70.43.118:/brick/a4
Brick5: 10.70.43.118:/brick/a5
Brick6: 10.70.43.118:/brick/a6
Options Reconfigured:
features.barrier: disable
features.uss: on
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
cluster.disperse-self-heal-daemon: enable

Comment 3 Mohammed Rafi KC 2015-06-10 14:05:45 UTC
I tried to reproduce the issue as described, and I was able to access .snaps without any errors and with no I/O errors, but the tier daemon crashed during the attempt. I think that has nothing to do with snapshots; the crash is being investigated.

Comment 5 Dan Lambright 2016-06-22 15:29:21 UTC
Karthik, can you try to reproduce this? If you cannot, I believe it should be closed.