Description of problem:
Gluster Volume restart fails for arbiter volume having fuse subdir export

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-50.el6rhs.x86_64

How reproducible:
Consistently

Steps to Reproduce:
1. Create 4 x (2 + 1) arbiter volume. Start the volume
2. Mount the volume on fuse client. Create a directory inside mount point
# mount -t glusterfs 10.70.42.22:/glustervol /mnt/subdir_mount
# cd /mnt/subdir_mount
# mkdir dirr1
# ls
dirr1
3. Set auth allow for dirr1 on volume
# gluster v set glustervol auth.allow "/dirr1(10.70.37.125)"
volume set: success
4. Perform volume stop and start
# gluster v stop glustervol
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: glustervol: success
# gluster v start glustervol
volume start: glustervol: failed: Commit failed on localhost. Please check log file for details.

Actual results:
Arbiter Volume restart failed when fuse sub dir was exported

Expected results:
Arbiter Volume restart should not fail

Additional info:
================
# tailf /var/log/glusterfs/glusterd.log
[2017-10-27 06:15:43.137724] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /gluster/brick1/1 on port 49152
[2017-10-27 06:15:43.138148] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /gluster/brick1/1.rdma on port 49153
[2017-10-27 06:15:43.138847] E [MSGID: 106005] [glusterd-utils.c:5878:glusterd_brick_start] 0-management: Unable to start brick dhcp42-22.lab.eng.blr.redhat.com:/gluster/brick1/1
[2017-10-27 06:15:43.139047] E [MSGID: 106123] [glusterd-mgmt.c:317:gd_mgmt_v3_commit_fn] 0-management: Volume start commit failed.
[2017-10-27 06:15:43.139092] E [MSGID: 106123] [glusterd-mgmt.c:1456:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Start on local node
[2017-10-27 06:15:43.139123] E [MSGID: 106123] [glusterd-mgmt.c:2047:glusterd_mgmt_v3_initiate_all_phases] 0-management: Commit Op Failed
[2017-10-27 06:15:44.322101] W [MSGID: 106057] [glusterd-snapshot-utils.c:379:glusterd_snap_volinfo_find_by_volume_id] 0-management: Snap volume not found
[2017-10-27 06:15:23.207965] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
==================

Attaching sosreports shortly.
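The error-level entries in the glusterd.log excerpt above can be isolated with a small filter. This is only a sketch: the function name glusterd_errors is hypothetical, and it assumes the log line layout shown in this report, where the severity letter (I, W, E) is the third whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helper: print only the error-level (E) entries of a glusterd log.
# Assumes lines look like "[date time] E [MSGID: ...] ...", as in this report,
# so the severity letter is the third whitespace-separated field.
glusterd_errors() {
    log_file=${1:-/var/log/glusterfs/glusterd.log}
    awk '$3 == "E"' "$log_file"
}

# Example (run on the node where the volume start failed):
# glusterd_errors /var/log/glusterfs/glusterd.log | tail -n 20
```

Run against the excerpt in this report, the filter would keep the four E lines, starting with "Unable to start brick", and drop the I and W entries.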
The same issue is observed with the disperse volume type as well. Distributed-replicate works fine (see https://bugzilla.redhat.com/show_bug.cgi?id=1500720).
Did you check the brick log file for dhcp42-22.lab.eng.blr.redhat.com:/gluster/brick1/1? The glusterd log indicates that the brick failed to come up, so there should be a relevant entry in the brick log. Please check and point out the relevant entries.
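One way to locate that brick log, as a minimal sketch: by default each brick's log is written under /var/log/glusterfs/bricks/, in a file named after the brick path with the slashes turned into dashes. The helper name brick_log_path is hypothetical, and the log directory is the default location, assumed here; adjust it if the installation logs elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: derive the expected brick log file from a brick path.
# Assumes the default log directory /var/log/glusterfs/bricks and the usual
# naming scheme (leading slash dropped, remaining slashes replaced by dashes).
brick_log_path() {
    brick_path=$1
    name=$(printf '%s' "${brick_path#/}" | tr '/' '-')
    printf '%s\n' "/var/log/glusterfs/bricks/${name}.log"
}

# Example: show recent error-level entries for the brick that failed to start.
# log=$(brick_log_path /gluster/brick1/1)
# grep ' E \[' "$log" | tail -n 20
```

For the brick in this report, /gluster/brick1/1, the helper yields /var/log/glusterfs/bricks/gluster-brick1-1.log.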
Tested this with glusterfs-3.8.4-51.el7rhgs.x86_64. Tried volume restart a couple of times and did not hit this issue with this build. Complete logs are in comment#7. Moving this bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3276