Bug 763737 - (GLUSTER-2005) Mounting Gluster volume with RO bricks hangs
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: core
Version: 3.1.0
Hardware: All
OS: All
Priority: low
Severity: medium
Assigned To: Amar Tumballi
Duplicates: GLUSTER-1905
Reported: 2010-10-22 17:37 EDT by Jacob Shucart
Modified: 2015-12-01 11:45 EST
CC List: 6 users
Doc Type: Bug Fix
Mount Type: fuse
Documentation: DA

Attachments: None
Description Jacob Shucart 2010-10-22 17:37:00 EDT
I created LVM snapshots on 4 different systems and then created a distribute volume consisting of those read-only bricks (hostname:/tmp/distribute.snapshot/ was the brick name on each host); distribute-snap was the volume name.  I started the volume.

I then try to mount it on a client system, and it hangs when I access it, for example when I run df -h.  I have to forcibly unmount the volume.

mount -t glusterfs jacobgfs31:/distribute-snap /mnt

If I then look on the Gluster server I was mounting from, I see that the glusterfs process that was serving the distribute-snap volume is no longer running.
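The reproduction steps can be sketched roughly as follows. Only the brick path, the volume name, and the client mount command come from the report; the volume-group/logical-volume names and server hostnames are illustrative assumptions.

```shell
# On each of the 4 servers: take an LVM snapshot and mount it read-only
# as the brick directory (VG/LV names "vg0"/"data" are assumptions).
lvcreate --snapshot --name snap0 --size 1G /dev/vg0/data
mkdir -p /tmp/distribute.snapshot
mount -o ro /dev/vg0/snap0 /tmp/distribute.snapshot

# On one server: create and start the distribute volume over the RO bricks
# (hostnames server1..server4 are placeholders).
gluster volume create distribute-snap \
  server1:/tmp/distribute.snapshot server2:/tmp/distribute.snapshot \
  server3:/tmp/distribute.snapshot server4:/tmp/distribute.snapshot
gluster volume start distribute-snap

# On the client: the mount command returns, but subsequent access
# (e.g. df -h) hangs, as described in the report.
mount -t glusterfs jacobgfs31:/distribute-snap /mnt
```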
Comment 1 Amar Tumballi 2010-11-08 02:13:02 EST
Yes. The server process exits because it needs 'extended attribute' support from the backend bricks. If the backend bricks are in RO mode, glusterfsd fails to start or behave properly, so the mount hangs as if the volume had never been started.

As of now, GlusterFS doesn't support a read-only backend. We will soon be addressing the mount command's behavior when the volume is not started, so that users are not left with a hung mount point.
Comment 2 Jacob Shucart 2010-11-08 08:28:29 EST
If we don't support this, then we should give an error if someone tries to set things up in this way...
Comment 3 Jacob Shucart 2010-12-08 14:48:16 EST
Will this be fixed in 3.1.2?  I had a customer (Kaltura) report an issue where one of their bricks became read-only and it hung the entire volume.  This should not happen...
Comment 4 Amar Tumballi 2011-01-21 02:41:36 EST
*** Bug 1905 has been marked as a duplicate of this bug. ***
Comment 5 Anand Avati 2011-02-22 09:21:51 EST
PATCH: http://patches.gluster.com/patch/6233 in master (send the CHILD_DOWN event also to fuse)
Comment 6 Amar Tumballi 2011-02-24 03:42:39 EST
By sending the CHILD_DOWN event to fuse, in cases where some (or all) of the server processes (glusterfsd) have not started, the mount point doesn't hang but continues to work with the glusterfsd processes that are available (if none are available, as in this case, it will complain 'Transport endpoint is not connected').

It's marked for DP because we have to note in the FAQ (or similar pages) that if a user gets a 'Transport endpoint is not connected' error, (s)he should check whether all the glusterfsd processes are running fine.
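The 'Transport endpoint is not connected' message is not GlusterFS-specific text: it is the standard kernel string for the ENOTCONN errno, which FUSE returns to applications when the userspace filesystem daemon (here, glusterfs) has gone away. A minimal illustration of the mapping (not GlusterFS code):

```python
import errno
import os

# FUSE surfaces ENOTCONN to applications when the userspace
# filesystem daemon backing the mount is no longer running.
print(errno.ENOTCONN)               # the errno number (107 on Linux)
print(os.strerror(errno.ENOTCONN))  # the message users see in ls/echo above
```

So seeing this message on a Gluster mount point is a hint to check the glusterfsd processes rather than the client.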
Comment 7 Saurabh 2011-03-10 03:17:12 EST
when one glusterfsd process is killed,

[root@centos-qa-3 nfs-test]# ls
file.1   file.11  file.14  file.18  file.2   file.3   file.4  file.6  file.8  read.fsxgood
file.10  file.13  file.15  file.19  file.20  file.30  file.5  file.7  read


when the second is also killed,

[root@centos-qa-3 nfs-test]# echo test >> file.30
[root@centos-qa-3 nfs-test]# ls
file.1   file.11  file.14  file.18  file.2   file.3   file.4  file.6  file.8  read.fsxgood
file.10  file.13  file.15  file.19  file.20  file.30  file.5  file.7  read


when the third is killed, specifically when both the glusterfsd processes on one brick are killed,

[root@centos-qa-3 nfs-test]# ls
file.1   file.11  file.14  file.18  file.2   file.3   file.4  file.6  file.8  read.fsxgood
file.10  file.13  file.15  file.19  file.20  file.30  file.5  file.7  read
[root@centos-qa-3 nfs-test]# echo test1 >> file30
-bash: file30: Transport endpoint is not connected

when all the processes are killed,

[root@centos-qa-3 nfs-test]# ls
ls: .: Transport endpoint is not connected
