I created LVM snapshots on 4 different systems and then created a distribute volume consisting of those read-only bricks (hostname:/tmp/distribute.snapshot/ were the brick names); distribute-snap was the volume name. I started the volume. When I then try to mount it on a client system, it hangs operations such as df -h, and I have to forcibly unmount the volume.

mount -t glusterfs jacobgfs31:/distribute-snap /mnt

If I then look on the Gluster server I was mounting from, I see that the glusterfs process that was serving the distribute-snap volume is no longer running.
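For reference, a rough reproduction sketch (the hostnames other than jacobgfs31, the VG/LV names, and the snapshot size are assumptions, not taken from the report):

# On each of the four servers: snapshot the brick LV and mount it read-only.
lvcreate --snapshot --size 1G --name brick-snap /dev/vg0/brick
mkdir -p /tmp/distribute.snapshot
mount -o ro /dev/vg0/brick-snap /tmp/distribute.snapshot

# From one server: create and start a distribute volume over the snapshot bricks.
gluster volume create distribute-snap \
    jacobgfs31:/tmp/distribute.snapshot jacobgfs32:/tmp/distribute.snapshot \
    jacobgfs33:/tmp/distribute.snapshot jacobgfs34:/tmp/distribute.snapshot
gluster volume start distribute-snap

# On the client: the mount command returns, but subsequent access hangs.
mount -t glusterfs jacobgfs31:/distribute-snap /mnt
df -h    # hangs; recover with: umount -l /mnt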
Yes. The server process exits because it needs 'extended attribute' support from the backend bricks. If the backend bricks are in read-only mode, glusterfs fails to start or to behave properly, so this ends up looking like a mount hang against a volume that was never started. As of now, GlusterFS doesn't support read-only backends. We will soon address the mount command's behavior when the volume is not started, so users are not left with a hung mount point.
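To make the xattr dependency concrete, here is a quick check one can run on a brick (a sketch; the attribute name is illustrative, the point is that setting any trusted.* xattr fails on a read-only filesystem):

# GlusterFS must be able to set trusted.* xattrs on its bricks.
setfattr -n trusted.glusterfs.test -v working /tmp/distribute.snapshot
# On a read-only brick this fails with:
#   setfattr: /tmp/distribute.snapshot: Read-only file system
# which is why the brick process cannot start or behave properly.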
If we don't support this, then we should give an error if someone tries to set things up in this way...
Will this be fixed in 3.1.2? I had a customer (Kaltura) report an issue where one of their bricks became read-only and it hung the entire volume. This should not happen...
*** Bug 1905 has been marked as a duplicate of this bug. ***
PATCH: http://patches.gluster.com/patch/6233 in master (send the CHILD_DOWN event also to fuse)
By sending the CHILD_DOWN event to fuse, in cases where some (or all) of the servers (glusterfsd) have not started, the mount point doesn't hang; it continues to work with the glusterfsd processes that are available (if none are available, as in the case here, it will complain 'Transport endpoint is not connected'). It is marked for DP because we have to specify in the FAQ (or a similar page) that if a user gets a 'Transport endpoint is not connected' error, (s)he should check whether all the glusterfsd processes are running fine.
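A sketch of what such a FAQ entry might tell the user to run (the volume name is from this report; whether 'force' is accepted may vary by release):

# On each server, verify a glusterfsd process is running for every brick:
ps -ef | grep '[g]lusterfsd'
# If any brick process is missing, restart the volume's brick processes, e.g.:
gluster volume start distribute-snap force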
When one glusterfsd process is killed:

[root@centos-qa-3 nfs-test]# ls
file.1   file.11  file.14  file.18  file.2   file.3   file.4  file.6  file.8  read.fsxgood
file.10  file.13  file.15  file.19  file.20  file.30  file.5  file.7  read

When a second is also killed:

[root@centos-qa-3 nfs-test]# echo test >> file.30
[root@centos-qa-3 nfs-test]# ls
file.1   file.11  file.14  file.18  file.2   file.3   file.4  file.6  file.8  read.fsxgood
file.10  file.13  file.15  file.19  file.20  file.30  file.5  file.7  read

When the third is killed, specifically once both the glusterfsd processes on one brick are killed:

[root@centos-qa-3 nfs-test]# ls
file.1   file.11  file.14  file.18  file.2   file.3   file.4  file.6  file.8  read.fsxgood
file.10  file.13  file.15  file.19  file.20  file.30  file.5  file.7  read
[root@centos-qa-3 nfs-test]# echo test1 >> file30
-bash: file30: Transport endpoint is not connected

When all the processes are killed:

[root@centos-qa-3 nfs-test]# ls
ls: .: Transport endpoint is not connected
Added the details (http://gluster.qotd.co/q/mounting-gluster-volume-with-read-only-bricks-displays-transport-end-point-not-connected-error/) to the Gluster QOTD site.