Description of problem:
Auxiliary mount remains even after the crawler finishes crawling, when quota is enabled while bricks are down.

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-15.el6rhs.x86_64

How reproducible:
1/1

Steps to Reproduce:
1. Kill one of the bricks in each sub-volume
2. Enable quota
3.

Actual results:
[root@rhs-arch-srv3 tmp]# ps -ef | grep quota
root      7762     1  0 07:01 ?        00:00:00 /usr/sbin/glusterfs --volfile-server localhost --volfile-id clone2 -l /var/log/glusterfs/quota-mount-clone2.log -p /var/run/gluster/clone2.pid --client-pid -5 /var/run/gluster/clone2/
root     14877  6664  0 09:20 pts/0    00:00:00 grep quota
root     15095     1  0 Mar01 ?        00:00:01 /usr/sbin/glusterfs -s localhost --volfile-id client_per_brick/clone2.client.10.70.36.3.var-run-gluster-snaps-0c286cc539c34063aae2d17fdb66fd71-brick2-b2.vol --use-readdirp=yes --client-pid -100 -l /var/log/glusterfs/quota_crawl/var-run-gluster-snaps-0c286cc539c34063aae2d17fdb66fd71-brick2-b2.log /var/run/gluster/tmp/mntRlTPnv
root     18829     1  0 Mar01 ?        00:00:02 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/lib/glusterd/quotad/run/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/gluster/3b3b5f5b2555b948d2dbb20210fbc9d6.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off

================================
glusterd logs:
[2017-03-01 06:02:36.883481] W [MSGID: 106033] [glusterd-quota.c:331:_glusterd_quota_initiate_fs_crawl] 0-management: chdir /var/run/gluster/tmp/mntRlTPnv failed [Transport endpoint is not connected]

Expected results:
Auxiliary mount should not remain after the crawler finishes crawling.

Additional info:
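Until the fix is in place, the leftover auxiliary crawl mount can be cleaned up by hand with a lazy umount. A minimal sketch, assuming the stale mountpoint path shown in the ps output above (the mntXXXXXX suffix is random per crawl, so substitute the actual path):

```shell
# Hypothetical stale auxiliary mount path, taken from the ps output above;
# adjust to the actual leftover mountpoint on the affected node.
MNT=/var/run/gluster/tmp/mntRlTPnv

# Only attempt the umount if the path is really a mountpoint.
if mountpoint -q "$MNT"; then
    # -l = lazy: detach immediately, finish cleanup once references drop.
    umount -l "$MNT"
fi
echo "cleanup attempted for $MNT"
```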
Since the brick is down, the chdir to the mountpoint fails with "Transport endpoint is not connected". In the function _glusterd_quota_initiate_fs_crawl, we exit if chdir fails, without doing a umount:

        ret = chdir (mountdir);
        if (ret == -1) {
                gf_msg (THIS->name, GF_LOG_WARNING, errno,
                        GD_MSG_DIR_OP_FAILED,
                        "chdir %s failed", mountdir);
                exit (EXIT_FAILURE);
        }

To fix this, we need to do a lazy umount before the exit.
Upstream patch: https://review.gluster.org/#/c/16853/
The issue is fixed as part of the 3.12.2 rebase (RHGS 3.4.0); good to verify the fix.
The BZ should stay in MODIFIED state unless and until it has all the acks.
Hi,

As the fix for the bug is available, marking it against the current release and closing it. Feel free to reopen the bug if it still persists.

-Hari.