Description of problem:
Mount point is inaccessible when accessed.

Version-Release number of selected component (if applicable):
[root@dhcp46-231 gluster]# rpm -qa | grep gluster
gluster-nagios-addons-0.2.7-1.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-server-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-api-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-cli-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.el7rhgs.noarch
glusterfs-libs-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-fuse-3.8.4-2.26.git0a405a4.el7rhgs.x86_64

How reproducible:
Hit it once. Logs placed at rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

Steps performed:
1. Create a 1x3 arbiter volume named mdcache.
2. Mount the volume at /mnt on two different clients.
3. Replace brick0 with a new brick; check heal info and wait for the heal to complete.
4. touch files{1..10000} from the first client.
5. Replace brick2 (the arbiter) with a new brick while simultaneously creating newfiles{1..10000} on the mount point from the second client.
6. When that completes, run echo 1234 > newfiles{1..10000} from the first client using script.sh (placed with the log files).
7. Check gluster volume heal mdcache info.
8. The / directory of the brick and one more file still need to be healed.

[root@dhcp46-231 gluster]# gluster volume heal mdcache info
Brick dhcp46-231.lab.eng.blr.redhat.com:/bricks/brick1/mdcache
/ - Possibly undergoing heal
/newfiles0
Status: Connected
Number of entries: 2

Brick dhcp46-50.lab.eng.blr.redhat.com:/bricks/brick0/mdcache
Status: Connected
Number of entries: 0

Brick dhcp47-111.lab.eng.blr.redhat.com:/bricks/brick1/mdcache
/ - Possibly undergoing heal
/newfiles0
Status: Connected
Number of entries: 2

##################################################################
[root@dhcp46-231 gluster]# getfattr -d -m . -e hex /bricks/brick1/mdcache/
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/mdcache/
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.mdcache-client-1=0x000000000000000000000008
trusted.afr.mdcache-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5a5aa31b79d84f458641f7c032141e53
*****************************
[root@dhcp47-111 gluster]# getfattr -d -m . -e hex /bricks/brick1/mdcache/
getfattr: Removing leading '/' from absolute path names
# file: bricks/brick1/mdcache/
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.mdcache-client-1=0x000000000000000000000008
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5a5aa31b79d84f458641f7c032141e53

Actual results:
1) Hangs were observed on the mount points.
2) Heals could not complete.
3) "Transport endpoint is not connected" errors were observed in the client and brick logs.
4) Multiple blocked locks were observed in the statedumps of the bricks.
5) The mount point is not accessible.

Expected results:
No hangs should be observed.
No pending heals should remain.

Additional info:
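For readers of the getfattr output above: the trusted.afr.* values are pending-operation counters that AFR uses to decide what needs healing. Below is a minimal sketch to decode one of them, assuming the usual AFR layout of three 4-byte big-endian counts (data, metadata, entry) and using the brick path from this report:

# Sketch: decode a trusted.afr pending xattr into its three counters.
# Assumes the standard AFR layout (data/metadata/entry, 4 bytes each).
HEX=$(getfattr -n trusted.afr.mdcache-client-1 -e hex /bricks/brick1/mdcache/ 2>/dev/null |
      awk -F'=0x' '/^trusted.afr/ {print $2}')
echo "data pending:     $((16#${HEX:0:8}))"
echo "metadata pending: $((16#${HEX:8:8}))"
echo "entry pending:    $((16#${HEX:16:8}))"

For the value 0x000000000000000000000008 seen on both data bricks, this reads as 8 pending entry operations blamed on mdcache-client-1, which is consistent with '/' showing up as possibly undergoing heal.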
Karan,

Can you attach the brick and client log files?

regards,
Raghavendra
Also, has this test case been tried on a 3.2 build without the md-cache options?
Poornima, yes, I tried the build without md-cache, but I wasn't able to hit it.

Thanks & regards,
Karan Sandha
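For anyone repeating this retest on a build that does include md-cache, something like the following should disable the md-cache-related options instead (a sketch: the option names are those visible in the "Options Reconfigured" list later in this bug, and mdcache is the volume name from the description):

# Sketch: turn off the md-cache-related options for a retest.
gluster volume set mdcache performance.stat-prefetch off        # unloads md-cache
gluster volume set mdcache performance.cache-invalidation off
gluster volume set mdcache features.cache-invalidation off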
*** Bug 1388414 has been marked as a duplicate of this bug. ***
I hit this case in my systemic testing, where one brick of a replica pair is down. However, the client sees both bricks as down in spite of one being up. Hence, if we try to cat a file sitting on that brick, we get a "Transport endpoint is not connected" error, and if we try to write to a file on that brick, we get EIO.

Version: 3.8.4-5
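To cross-check the client's view of the bricks against the servers', something like the following can help (a sketch: the log path assumes a FUSE mount at /mnt, the message strings are typical client-translator log lines that vary across versions, and the volume name is taken from the setup in the next comment):

# Sketch: compare client-side connection state with server-side status.
VOL=sysvol   # volume name from the gluster v info output below
grep -E 'Connected to|disconnected from' /var/log/glusterfs/mnt.log | tail -n 20
gluster volume status "$VOL" | grep '^Brick'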
sosreport of the client is available at:
[qe@rhsqe-repo nchilaka]$ pwd
/var/www/html/sosreports/nchilaka
[qe@rhsqe-repo nchilaka]$
/var/www/html/sosreports/nchilaka/bug.1385605

[root@dhcp35-191 ~]# gluster v info gl

Volume Name: sysvol
Type: Distributed-Replicate
Volume ID: b1ef4d84-0614-4d5d-9e2e-b19183996e43
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 10.70.35.191:/rhs/brick1/sysvol
Brick2: 10.70.37.108:/rhs/brick1/sysvol
Brick3: 10.70.35.3:/rhs/brick1/sysvol
Brick4: 10.70.37.66:/rhs/brick1/sysvol
Brick5: 10.70.35.191:/rhs/brick2/sysvol
Brick6: 10.70.37.108:/rhs/brick2/sysvol
Brick7: 10.70.35.3:/rhs/brick2/sysvol
Brick8: 10.70.37.66:/rhs/brick2/sysvol
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.stat-prefetch: on
performance.cache-invalidation: on
cluster.shd-max-threads: 10
features.cache-invalidation-timeout: 400
features.cache-invalidation: on
performance.md-cache-timeout: 300
features.uss: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

[root@dhcp35-191 ~]# gluster v status
Status of volume: sysvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/sysvol       N/A       N/A        N       N/A
Brick 10.70.37.108:/rhs/brick1/sysvol       49152     0          Y       27848
Brick 10.70.35.3:/rhs/brick1/sysvol         N/A       N/A        N       N/A
Brick 10.70.37.66:/rhs/brick1/sysvol        49152     0          Y       28853
Brick 10.70.35.191:/rhs/brick2/sysvol       49153     0          Y       18344
Brick 10.70.37.108:/rhs/brick2/sysvol       N/A       N/A        N       N/A
Brick 10.70.35.3:/rhs/brick2/sysvol         49153     0          Y       11727
Brick 10.70.37.66:/rhs/brick2/sysvol        N/A       N/A        N       N/A
Snapshot Daemon on localhost                49154     0          Y       18461
Self-heal Daemon on localhost               N/A       N/A        Y       18364
Quota Daemon on localhost                   N/A       N/A        Y       18410
Snapshot Daemon on 10.70.35.3               49154     0          Y       11826
Self-heal Daemon on 10.70.35.3              N/A       N/A        Y       11747
Quota Daemon on 10.70.35.3                  N/A       N/A        Y       11779
Snapshot Daemon on 10.70.37.66              49154     0          Y       28970
Self-heal Daemon on 10.70.37.66             N/A       N/A        Y       28892
Quota Daemon on 10.70.37.66                 N/A       N/A        Y       28923
Snapshot Daemon on 10.70.37.108             49154     0          Y       27965
Self-heal Daemon on 10.70.37.108            N/A       N/A        Y       27887
Quota Daemon on 10.70.37.108                N/A       N/A        Y       27918

Task Status of Volume sysvol
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp35-191 ~]#
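Given the blocked locks reported earlier in this bug, the brick statedumps are worth checking on this setup as well; a minimal sketch, assuming the default dump directory /var/run/gluster (it differs if server.statedump-path was reconfigured):

# Sketch: generate brick statedumps and count blocked lock entries.
gluster volume statedump sysvol
# Lock entries in the dumps carry an ACTIVE/BLOCKED marker.
grep -c 'BLOCKED' /var/run/gluster/*.dump.*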
*** Bug 1392906 has been marked as a duplicate of this bug. ***
Patch posted upstream at http://review.gluster.org/#/c/15916
Upstream master: http://review.gluster.org/15916
Upstream release-3.8: http://review.gluster.org/16025
Upstream release-3.9: http://review.gluster.org/16026
Downstream: https://code.engineering.redhat.com/gerrit/92095
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
Thanks, Nag, for the update.

@Rejy: do we need the hotfix flag set on this bug?
*** Bug 1429145 has been marked as a duplicate of this bug. ***