Description: We tried a small test on "ACL client". For listing 50k files on root ,it took around 50seconds with readdirp enabled while the same operation took 5-6 seconds with readdirp disabled. Both the times md-cache was enabled. Observation: We observed that on the 1st test case (readdirp enabled), post readdirp a getxattr is done. The number of getxattr depends on the number of acl xattrs (I saw requests on these two: system.posix_acl_default, system.posix_acl_access). Since need_lookup flag is set, during fuse_resolve a nameless lookup is executed on the inode(getxattr being inode operation, hence the nameless lookup). Since md-cache does not serve nameless lookup, a network hop is needed for each file, costing the time. With readdirp disabled, the getxattrs are served from md-cache itself(note: we are discussing the 2nd attempt of ls -l use case).
REVIEW: https://review.gluster.org/17985 (fuse/readdirp: Remove need_lookup from fuse_readdirp_cbk) posted (#3) for review on master by Susant Palai (spalai)
COMMIT: https://review.gluster.org/17985 committed in master by Raghavendra G (rgowdapp) ------ commit 60ff231644862c0d9351352febdda7b2dfdde994 Author: Susant Palai <spalai> Date: Mon Aug 7 15:19:47 2017 +0530 fuse/readdirp: Remove need_lookup from fuse_readdirp_cbk background: Various xlators used to populate their ctx, on an explicit lookup. That means without a lookup, the translator will have either null or stale data to function. E.g. dht would depend on lookup to create linkto files on the correct node/hashed subvol, afr would rely on this lookup to heal pending data/metadata etc. So to complete above actions a lookup used to be issued on files, even their inode was populated on a readdirp_cbk. This was done by setting the need_lookup flag on all the files those were read on readdirp fop. We tried a small test on "ACL client". For listing 50k files on root itself, it took around 50seconds with readdirp enabled while the same operation took 5-6 seconds with readdirp disabled. Both the times md-cache was enabled. We observed that on the 1st test case (readdirp enabled), post readdirp a getxattr is done. The number of getxattr depends on the number of acl xattrs (I saw requests on these two: system.posix_acl_default, system.posix_acl_access). Since need_lookup flag is set, during fuse_resolve a nameless lookup is executed on the inode(getxattr being inode operation, hence the nameless lookup). Since md-cache does not serve nameless lookup, a network hop is needed for each file, costing the time. With readdirp disabled, the getxattrs are served from md-cache itself(note: we are discussing the 2nd attempt of ls -l use case). _Current affairs around need of lookup for a file to populate it's ctx_: For the xlators on client stack we discussed quite extensively about the need for a lookup fop post readdirp in all three cluster translators - afr, EC and dht. EC and dht don't really need a nameless lookup post readdirp. For afr too, the need for lookup was negated with patch (http://review.gluster.org/6010 - AFRV2), where afr added a function called afr_inode_refresh() which does a lookup and populates its inode context in case a FOP came to AFR without a lookup being issued prior to it. We ran a thread on gluster-devel asking for feedback on the need of explicit lookup post readdirp. For responses refer [1]. Refer [2] for discussions happened on gerrit. After gathering inputs from [1] and [2], it looks like there is no xlator in current state that requires an explicit lookup post readdirp to function properly. * A separate similar patch will be sent for gfapi/nfs/nfs-ganesha. Note: Only file's inode is built with readdirp. [1] http://lists.gluster.org/pipermail/gluster-devel/2017-August/053505.html [2] https://review.gluster.org/#/c/17985/ Change-Id: Ie1d68ce7bea5e1f8a1fab9a62217f478322554f5 BUG: 1492996 Signed-off-by: Susant Palai <spalai>
> We tried a small test on "ACL client". For listing 50k files on root ,it took around 50seconds with readdirp enabled while the same operation took 5-6 seconds with readdirp disabled. Both the times md-cache was enabled. What is the time taken with readdirp ENABLED and with patch #17985 present? Also, now that #17985 is merged, can this bug be moved to MODIFIED?
I saw 5-6 seconds with the patch. I will just udpate the bug after one more testing and move it to modified. Susant
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report. glusterfs-3.13.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution. [1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html [2] https://www.gluster.org/pipermail/gluster-users/
REVIEW: https://review.gluster.org/19322 (mount/fuse: cleanup useless need_lookup) posted (#2) for review on master by Kinglong Mee
REVIEW: https://review.gluster.org/19338 (gfapi: cleanup useless need_lookup) posted (#1) for review on master by Kinglong Mee
REVIEW: https://review.gluster.org/19339 (libglusterfs: cleanup useless need_lookup) posted (#1) for review on master by Kinglong Mee