+++ This bug was initially created as a clone of Bug #994405 +++

Description of problem:
While removing a directory from the mount point, if we issue an add-brick command, the rm fails with "Transport endpoint is not connected".

Version-Release number of selected component (if applicable):
3.4.0.17rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed volume.
2. Mount the volume and untar the kernel source on the mount point.
3. Run rm -rf linux-2.6.32.61 and issue an add-brick command while it is in progress.

Actual results:
After some time, errors appear on the mount point:

[root@gqac024 mnt]# rm -rf linux-2.6.32.61
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-integrator/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-iop13xx/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-iop32x/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-iop33x/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-ixp2000/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-ixp23xx/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-ixp4xx/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-kirkwood/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-ks8695/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-l7200/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-lh7a40x/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-loki/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-mmp/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-msm/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-mv78xx0/include/mach': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-mx1': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-mx2': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-mx25': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-mx3': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-mxc91231': Transport endpoint is not connected
rm: cannot remove `linux-2.6.32.61/arch/arm/mach-netx/include/mach': Transport endpoint is not connected

Expected results:
The rm should complete without errors.

Additional info:
================

RHS nodes
=========
gqac022.sbu.lab.eng.bos.redhat.com
gqac023.sbu.lab.eng.bos.redhat.com

Mounted on
==========
gqac024.sbu.lab.eng.bos.redhat.com

Mount point
===========
/mnt

add-brick issued from gqac022.sbu.lab.eng.bos.redhat.com

[root@gqac022 rpm]# gluster v info anon

Volume Name: anon
Type: Distribute
Volume ID: 61e3c5b2-cb03-4ea8-9a69-a8762191d296
Status: Started
Number of Bricks: 15
Transport-type: tcp
Bricks:
Brick1: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon1
Brick2: gqac022.sbu.lab.eng.bos.redhat.com:/home/anon2
Brick3: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon3
Brick4: gqac022.sbu.lab.eng.bos.redhat.com:/home/anon4
Brick5: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon5
Brick6: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon6
Brick7: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon7
Brick8: gqac022.sbu.lab.eng.bos.redhat.com:/home/anon8
Brick9: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon9
Brick10: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon10
Brick11: gqac022.sbu.lab.eng.bos.redhat.com:/home/anon11
Brick12: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon12
Brick13: gqac022.sbu.lab.eng.bos.redhat.com:/home/anon13
Brick14: gqac023.sbu.lab.eng.bos.redhat.com:/home/anon14
Brick15: gqac022.sbu.lab.eng.bos.redhat.com:/home/anon15

Rebalance was performed before adding the new bricks.

mnt logs
========
[2013-08-07 08:06:00.201235] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 4-anon-client-14: remote operation failed: No such file or directory. Path: <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt/kvm (28c128a7-87c1-493e-9aab-713bcbc73221)
[2013-08-07 08:06:00.201274] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 4-anon-client-13: remote operation failed: No such file or directory. Path: <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt/kvm (28c128a7-87c1-493e-9aab-713bcbc73221)
[2013-08-07 08:06:00.215054] W [client-rpc-fops.c:2316:client3_3_readdirp_cbk] 4-anon-client-13: remote operation failed: No such file or directory
[2013-08-07 08:06:00.215480] W [client-rpc-fops.c:2316:client3_3_readdirp_cbk] 4-anon-client-14: remote operation failed: No such file or directory
[2013-08-07 08:06:00.228461] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 4-anon-client-13: remote operation failed: No such file or directory. Path: <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt/kvm (28c128a7-87c1-493e-9aab-713bcbc73221)
[2013-08-07 08:06:00.228536] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 4-anon-client-14: remote operation failed: No such file or directory. Path: <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt/kvm (28c128a7-87c1-493e-9aab-713bcbc73221)
[2013-08-07 08:06:00.229528] W [client-rpc-fops.c:695:client3_3_rmdir_cbk] 4-anon-client-14: remote operation failed: No such file or directory
[2013-08-07 08:06:00.229607] W [client-rpc-fops.c:695:client3_3_rmdir_cbk] 4-anon-client-13: remote operation failed: No such file or directory
[2013-08-07 08:06:00.230174] W [fuse-bridge.c:1688:fuse_unlink_cbk] 0-glusterfs-fuse: 1154112: RMDIR() <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt/kvm => -1 (No such file or directory)
[2013-08-07 08:06:00.230932] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 4-anon-client-14: remote operation failed: No such file or directory. Path: <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt (42ca8074-6470-48c0-9731-eb7e8a5d63ea)
[2013-08-07 08:06:00.231116] W [client-rpc-fops.c:2523:client3_3_opendir_cbk] 4-anon-client-13: remote operation failed: No such file or directory. Path: <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt (42ca8074-6470-48c0-9731-eb7e8a5d63ea)
[2013-08-07 08:06:00.231941] W [client-rpc-fops.c:695:client3_3_rmdir_cbk] 4-anon-client-14: remote operation failed: No such file or directory
[2013-08-07 08:06:00.232113] W [client-rpc-fops.c:695:client3_3_rmdir_cbk] 4-anon-client-13: remote operation failed: No such file or directory
[2013-08-07 08:06:00.232618] W [fuse-bridge.c:1688:fuse_unlink_cbk] 0-glusterfs-fuse: 1154115: RMDIR() <gfid:6ef0c711-d65c-4cec-90ba-1ba87e1163e0>/virt => -1 (No such file or directory)

--- Additional comment from shylesh on 2013-08-07 04:32:29 EDT ---

sosreports @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/994405/
The main issue is that the volume file change happens irrespective of whether the newly added brick has received a port or not. Consider a scenario where the newly added brick has requested a port but has not yet received one, while the volume file change has already happened. Fops are now sent to the newly added brick as well; since that brick still does not have a port, the fop fails on it with "Transport endpoint is not connected". Also, instead of glusterd creating and pushing the volfile immediately, it can notify the client once the brick is actually added. The fix is to notify the volume file change only after the new brick has been added and has received a port.
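The ordering problem described above can be sketched with a small event-ordering simulation. This is an illustrative model only (Python for brevity): `Brick`, `Client`, and the `add_brick_*` functions are invented names for the sketch, not GlusterFS code; the real fix lives in glusterd/glusterfsd.

```python
ENOTCONN = "Transport endpoint is not connected"

class Brick:
    """A brick process; it has no port until it is started."""
    def __init__(self, name):
        self.name = name
        self.port = None           # port is assigned only once the brick starts

    def start(self):
        self.port = 49152          # brick comes up and registers a port

class Client:
    """A mount that fans fops out to every brick in its current volfile."""
    def __init__(self):
        self.bricks = []

    def on_volfile_change(self, bricks):
        self.bricks = list(bricks)

    def rmdir(self, path):
        # A fop routed to a brick with no port fails with ENOTCONN.
        for brick in self.bricks:
            if brick.port is None:
                return ENOTCONN
        return "OK"

def add_brick_buggy(client, bricks, new_brick):
    # Old ordering: the volfile is regenerated and pushed immediately,
    # whether or not the new brick has a port yet.
    bricks.append(new_brick)
    client.on_volfile_change(bricks)
    # new_brick.start() happens some time later; until then, fops fail.

def add_brick_fixed(client, bricks, new_brick):
    # Fixed ordering: notify clients only after the brick is started
    # and has received a port.
    bricks.append(new_brick)
    new_brick.start()
    client.on_volfile_change(bricks)

# Buggy ordering: the rm hits the port-less brick and fails.
client, bricks = Client(), [Brick("anon1")]
bricks[0].start()
client.on_volfile_change(bricks)
add_brick_buggy(client, bricks, Brick("anon16"))
print(client.rmdir("linux-2.6.32.61"))   # Transport endpoint is not connected

# Fixed ordering: the client only sees the brick once it is reachable.
client2, bricks2 = Client(), [Brick("anon1")]
bricks2[0].start()
client2.on_volfile_change(bricks2)
add_brick_fixed(client2, bricks2, Brick("anon16"))
print(client2.rmdir("linux-2.6.32.61"))  # OK
```

In this model the window between `on_volfile_change` and `start` is exactly the window in which the rm in the reproduction steps fails; reordering the two closes it.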
REVIEW: http://review.gluster.org/11342 (glusterfsd: newly added brick receives fops only after it is started) posted (#2) for review on master by Sakshi Bansal
REVIEW: http://review.gluster.org/11342 (glusterfsd: newly added brick receives fops only after it is started) posted (#3) for review on master by Dan Lambright (dlambrig)
REVIEW: http://review.gluster.org/11342 (glusterfsd: newly added brick receives fops only after it is started) posted (#4) for review on master by Dan Lambright (dlambrig)
REVIEW: http://review.gluster.org/11342 (glusterfsd: newly added brick receives fops only after it is started) posted (#5) for review on master by Dan Lambright (dlambrig)
REVIEW: http://review.gluster.org/11342 (glusterfsd: newly added brick receives fops only after it is started) posted (#6) for review on master by Vijay Bellur (vbellur)
REVIEW: http://review.gluster.org/11342 (glusterfsd : newly added brick receives fops only after it is started) posted (#7) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11342 (glusterfsd : newly added brick receives fops only after it is started) posted (#8) for review on master by Dan Lambright (dlambrig)
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user