Bug 1503413 - When sub-dir is mounted on Fuse client, adding bricks to the same volume unmounts the subdir from fuse client
Summary: When sub-dir is mounted on Fuse client, adding bricks to the same volume unmounts the subdir from fuse client
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: fuse
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.3.1
Assignee: Amar Tumballi
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On:
Blocks: 1475688 1505323
 
Reported: 2017-10-18 04:59 UTC by Manisha Saini
Modified: 2017-11-29 03:30 UTC (History)
CC List: 6 users

Fixed In Version: glusterfs-3.8.4-51
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1505323
Environment:
Last Closed: 2017-11-29 03:30:36 UTC
Embargoed:




Links:
Red Hat Product Errata RHBA-2017:3276 (normal, SHIPPED_LIVE): glusterfs bug fix update, last updated 2017-11-29 08:28:52 UTC

Description Manisha Saini 2017-10-18 04:59:38 UTC
Description of problem:
When sub-dir is mounted on Fuse client, adding bricks to the same volume unmounts the subdir on the fuse client

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-50.el7rhgs.x86_64

How reproducible:
2/2

Steps to Reproduce:
1. Create a 4*3 Distributed-Replicate volume
2. Mount the volume on a glusterfs Fuse client
3. Create a directory inside the mount point, say dir1
4. Set the auth.allow option on the volume for dir1
# gluster v set ganeshavol1 auth.allow "/dir1(10.70.37.*),/(*)"
volume set: success
5. Mount the sub-dir dir1 on the same client (an example mount command is shown after the add-brick command below)
6. Start creating 512 KB files inside the sub-dir mount point
7. While the I/O is in progress, add 3 bricks to the existing volume

# gluster v add-brick ganeshavol1 dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick5/b1 dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick5/b2 dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick5/b3
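
For reference, the step-5 sub-dir mount would use the glusterfs sub-directory mount syntax along these lines (the server hostname and mount point here are guesses based on the hostnames and the log file name elsewhere in this report, not taken from the original steps):

# mount -t glusterfs dhcp42-125.lab.eng.blr.redhat.com:/ganeshavol1/dir1 /mnt/Fuse-Subdir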


Actual results:
The add-brick operation succeeded on the volume. After the bricks were added, the sub-dir was automatically unmounted from the fuse client.

Expected results:
The add-brick operation should not unmount the sub-dir from the fuse client.

Additional info:

Brick logs on the newly added bricks:

1-client-12-2-0 (version: 3.8.4)
[2017-10-18 04:24:38.619538] E [server-handshake.c:342:do_path_lookup] 0-/gluster/brick5/b1: first lookup on subdir (dir1) failed: Success
[2017-10-18 04:24:38.619554] E [server-handshake.c:402:server_first_lookup] 0-ganeshavol1-server: first lookup on subdir (/dir1) failed: Invalid argument


Client logs:

# cat mnt-Fuse-Subdir.log | grep E
[2017-10-18 04:22:26.816447] I [fuse-bridge.c:4153:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2017-10-18 04:24:38.613427] E [MSGID: 114044] [client-handshake.c:1139:client_setvolume_cbk] 2-ganeshavol1-client-12: SETVOLUME on remote-host failed: subdirectory for mount "/dir1" is not found [No such file or directory]
[2017-10-18 04:24:38.613441] I [MSGID: 114049] [client-handshake.c:1249:client_setvolume_cbk] 2-ganeshavol1-client-12: sending AUTH_FAILED event
[2017-10-18 04:24:38.613462] E [fuse-bridge.c:5328:notify] 0-fuse: Server authenication failed. Shutting down.


Full sosreport will be attached shortly.

Comment 3 Amar Tumballi 2017-10-18 13:23:16 UTC
Thanks for the logs.

There are two issues to be solved here at present. I would treat the first as a blocker; the second one we can debate.

1. In the subdir mount code, there are mainly 2 cases of handshake failure between the client and server (i.e., the client process and each brick):
  a) authentication setup: the IP address of the client is not in the 'allow' list, or is in the 'reject' list.
  b) the subdirectory we want to mount does not exist yet on the volume.

In this case, if it is a), then unmounting the client makes sense (i.e., the infrastructure for this already exists). But for b), we need not do a umount and can try to connect back later.

Right now, the issue can be resolved by fixing b). The fix is under review and being tested at the moment.


2. Consider the 'add-brick' scenario, where a subdir was created earlier on the existing bricks but does not yet exist on the new bricks to handshake with. The above fix gives extra time for the directory (entry) heal to complete, so the handshake can succeed on a subsequent attempt.

But the catch here is that the self-heal can happen only from a volume-level mount, not a subdir mount. Hence, if there is no volume-level mount at all, or no operations touch these directories on that volume-level mount, those entries would never be created and the new bricks would never get used. (A rebalance can fix it, though.)

So, a possible fix (thanks to Nithya B) is to add a hook script for the add-brick operation that checks whether the volume has any subdir mount entries in auth.allow and 'stat's them on a volume-level mount.
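
A minimal shell sketch of that hook-script idea, just to illustrate the approach (the volume name, temporary mount handling and auth.allow parsing here are assumptions, not the actual hook script under review):

VOL=ganeshavol1                      # assumed volume name from this report
TMPMNT=$(mktemp -d)
mount -t glusterfs localhost:/$VOL "$TMPMNT"
# Pull sub-directory entries such as "/dir1(10.70.37.*)" out of auth.allow
# and stat each one on the volume-level mount, so the entry heal creates
# them on the newly added bricks.
gluster volume get $VOL auth.allow | awk '$1 == "auth.allow" {print $2}' \
  | tr ',' '\n' | sed 's|(.*$||' | grep -v '^/$' \
  | while read -r subdir; do
      stat "$TMPMNT$subdir" > /dev/null
    done
umount "$TMPMNT"
rmdir "$TMPMNT"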

The fix for issue 2 can be treated as a non-blocker for 3.3.1, since a rebalance after add-brick would fix it.
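
For completeness, the rebalance referred to here would be the standard one, e.g.:

# gluster volume rebalance ganeshavol1 start
# gluster volume rebalance ganeshavol1 status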

Comment 4 Amar Tumballi 2017-10-22 07:34:45 UTC
https://review.gluster.org/#/c/18550/

Comment 7 Amar Tumballi 2017-10-30 10:09:15 UTC
https://code.engineering.redhat.com/gerrit/121725

Comment 9 Manisha Saini 2017-11-08 17:01:57 UTC
Verified this with glusterfs-3.8.4-51.el7rhgs.x86_64


Steps to Reproduce:
1. Create a 4*3 Distributed-Replicate volume
2. Mount the volume on a glusterfs Fuse client
3. Create directories inside the mount point, say d1, d2, d3
4. Set the auth.allow option on the volume for the sub-directories
 gluster v set ganeshavol1 auth.allow "/d1(10.70.37.142),/d2(10.70.37.142),/d3(10.70.37.142),/(*)"
5. Mount the sub-dirs on another client (example mounts are shown after this list)
6. Run Linux untars on all the subdir mount points
7. While the I/O is in progress, add 3 bricks to the existing volume
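
The step-5 sub-dir mounts on the 10.70.37.142 client would look something like this (server hostname and mount points are placeholders):

# mount -t glusterfs dhcp42-125.lab.eng.blr.redhat.com:/ganeshavol1/d1 /mnt/d1
# mount -t glusterfs dhcp42-125.lab.eng.blr.redhat.com:/ganeshavol1/d2 /mnt/d2
# mount -t glusterfs dhcp42-125.lab.eng.blr.redhat.com:/ganeshavol1/d3 /mnt/d3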

Add-brick succeeded and the I/O kept running on the mount points.

Hence, moving this bug to Verified state.

Comment 12 errata-xmlrpc 2017-11-29 03:30:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3276

