Description of problem:
While a sub-dir is mounted on the client and add-brick is performed, doing rm -rf * on the mount point fails to delete the directories present on the mount point.

Version-Release number of selected component (if applicable):
glusterfs-api-3.8.4-51.el7rhgs.x86_64

How reproducible:
2/2

Steps to Reproduce:
1. Create a 3 x (2 + 1) = 9 Arbiter volume.
2. Mount the volume on the client via FUSE.
3. Create a directory, say "dir1", inside the mount point.
4. Set permissions for the directory on the volume:
   # gluster v set glustervol auth.allow "/dir1(10.70.37.192)"
   volume set: success
5. Mount the sub-dir "dir1" on the client:
   mount -t glusterfs dhcp42-125.lab.eng.blr.redhat.com:glustervol/dir1 /mnt/posix_Parent/
6. Create 1000 directories on the mount point.
7. Perform add-brick:
   # gluster v add-brick glustervol dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick3/3 dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick3/3 dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick3/3
   volume add-brick: success
8. After performing add-brick, do rm -rf * on the mount point.

Actual results:
rm -rf * on the mount point fails with "Transport endpoint is not connected", even though the subdir is still mounted on the client.

rm: cannot remove ‘sd979’: Transport endpoint is not connected
rm: cannot remove ‘sd98’: Transport endpoint is not connected
rm: cannot remove ‘sd980’: Transport endpoint is not connected
rm: cannot remove ‘sd981’: Transport endpoint is not connected
rm: cannot remove ‘sd982’: Transport endpoint is not connected
rm: cannot remove ‘sd983’: Transport endpoint is not connected
rm: cannot remove ‘sd984’: Transport endpoint is not connected
rm: cannot remove ‘sd985’: Transport endpoint is not connected
rm: cannot remove ‘sd986’: Transport endpoint is not connected

# df
Filesystem                                         1K-blocks      Used  Available Use% Mounted on
/dev/mapper/rhel_dhcp37--192-root                   17811456   2959232   14852224  17% /
devtmpfs                                             1930048         0    1930048   0% /dev
tmpfs                                                1940904         0    1940904   0% /dev/shm
tmpfs                                                1940904     75644    1865260   4% /run
tmpfs                                                1940904         0    1940904   0% /sys/fs/cgroup
/dev/sda1                                            1038336    219524     818812  22% /boot
rhsqe-repo.lab.eng.blr.redhat.com:/opt            1953887232 405861376 1448791040  22% /opt
tmpfs                                                 388184         0     388184   0% /run/user/0
dhcp42-125.lab.eng.blr.redhat.com:glustervol/dir1   62539776    163712   62376064   1% /mnt/posix_Parent

Client mount logs-
================
[2017-11-02 15:48:31.961601] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 11954: RMDIR() /sd974 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:31.977846] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 11962: RMDIR() /sd975 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:31.992027] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 11970: RMDIR() /sd976 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.003641] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 11978: RMDIR() /sd977 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.014953] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 11986: RMDIR() /sd978 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.026279] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 11994: RMDIR() /sd979 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.038034] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12002: RMDIR() /sd98 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.050025] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12010: RMDIR() /sd980 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.062428] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12018: RMDIR() /sd981 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.074406] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12026: RMDIR() /sd982 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.085732] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12034: RMDIR() /sd983 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.099808] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12042: RMDIR() /sd984 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.114376] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12050: RMDIR() /sd985 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.128303] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12058: RMDIR() /sd986 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.146789] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12066: RMDIR() /sd987 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.161191] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12074: RMDIR() /sd988 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.174965] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12082: RMDIR() /sd989 => -1 (Transport endpoint is not connected)
[2017-11-02 15:48:32.189015] W [fuse-bridge.c:1355:fuse_unlink_cbk] 0-glusterfs-fuse: 12090: RMDIR() /sd99 => -1 (Transport endpoint is not connected)
=================

Expected results:
rm -rf * should delete the directories present on the mount point.

Additional info:
Attaching sosreports shortly.
I propose making this a known issue, as the feature is in TP. The steps to resolve this issue are:

* After the 'add-brick' operation, do a 'stat ${all_subdirs_exported}' on the full volume mount, and then continue the operations on the subdir mount points.

Or,

* After the 'add-brick' operation, run 'rebalance' (even 'rebalance fix-layout' alone is good enough), and then continue the rm -rf operations on the subdir mount points.
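For anyone applying the first workaround by hand, here is a minimal sketch of the stat step. The function name stat_exported_subdirs and the example mount point /mnt/glustervol are hypothetical and not from this report; the sketch assumes the full volume is already FUSE-mounted on the client and that you pass in the same sub-directories that were exported through auth.allow. The second workaround is shown only as a comment, since it must be run on a server node.

```shell
#!/bin/sh
# Hypothetical helper (not part of glusterfs): stat every exported
# sub-directory through a full-volume mount after an add-brick.
#   $1     full-volume mount point (assumed, e.g. /mnt/glustervol)
#   $2...  sub-directories exported via auth.allow (e.g. /dir1)
stat_exported_subdirs() {
    mount_point=$1
    shift
    rc=0
    for subdir in "$@"; do
        # Strip one leading slash so "/dir1" and "dir1" both work.
        path="$mount_point/${subdir#/}"
        if stat "$path" >/dev/null; then
            echo "stat ok: $path"
        else
            echo "stat failed: $path" >&2
            rc=1
        fi
    done
    return $rc
}

# Workaround 1, on a client with the full volume mounted:
#   stat_exported_subdirs /mnt/glustervol /dir1
#
# Workaround 2, on a server node instead:
#   gluster volume rebalance glustervol fix-layout start
```

The function returns non-zero if any stat fails, so it can gate the subsequent rm -rf on the subdir mount points.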
https://review.gluster.org/18645 is one way to fix it, but the patch needs more review and more testing. It doesn't look like we can fix it by 3.3.1, hence I still recommend treating this as a 'known issue'. Marking it as POST, as the RCA is known and a patch to handle it automatically is posted upstream. (Note that we may need a similar hook script for replace-brick too.)
DocText Looks fine.
Verified this BZ on glusterfs-3.12.2-6.el7rhgs.x86_64

Steps-
1. Create a 4*3 dist-replicate volume.
2. Mount the volume on the client via FUSE.
3. Create 4 dirs inside the mount point.
4. Set auth.allow permissions on the volume:
   # gluster v set Ganeshavol1 auth.allow "/dir1(10.70.46.125),/dir2(10.70.46.20),/dir3(10.70.47.33),/(*)"
5. Mount the subdirs on the respective clients.
6. Perform some I/O (create directories).
7. Perform an add-brick operation on that volume:
   # gluster v add-brick Ganeshavol1 dhcp47-193.lab.eng.blr.redhat.com:/gluster/brick1/new1 dhcp46-116.lab.eng.blr.redhat.com:/gluster/brick1/new1 dhcp46-184.lab.eng.blr.redhat.com:/gluster/brick1/new1
8. Perform rm -rf * from all the mount points.

Moving this BZ to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607
Made minor changes, and everything looks good now, IMO.