Bug 1048111
| Summary: | RHS: directory long listing performance over NFS very slow until remount | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Manoj Pillai <mpillai> |
| Component: | glusterfs | Assignee: | santosh pradhan <spradhan> |
| Status: | CLOSED WORKSFORME | QA Contact: | Sudhir D <sdharane> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 2.1 | CC: | mpillai, vagarwal, vbellur |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-01-23 19:25:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Manoj Pillai
2014-01-03 08:14:42 UTC
Results in the spreadsheet are copied here for easy reference. Numbers are the "real" time reported by the time command, in seconds.
| Sequence of commands | RHS | RHS-noacl | kNFS | kNFS-noacl |
|---|---|---|---|---|
| create 50K files in <mnt-pt> | | | | |
| time ls -l <mnt-pt> | 214.5 | 183.42 | 6.46 | 1.55 |
| sleep 60; time ls -l <mnt-pt> | 181.95 | 181.65 | 5.68 | 5.54 |
| sleep 60; time ls -l <mnt-pt> | 181.19 | 180.77 | 5.69 | 5.55 |
| sleep 1; time ls -l <mnt-pt> | 178.61 | 178.24 | 0.79 | 0.73 |
| sleep 1; time ls -l <mnt-pt> | 180.79 | 179.75 | 0.9 | 0.86 |
| sleep 1; time ls -l <mnt-pt> | 178.94 | 180.77 | 0.94 | 0.93 |
| sleep 60; time ls -l <mnt-pt> | 179.39 | 181.58 | 5.67 | 5.56 |
| umount <mnt-pt>; mount again | | | | |
| time ls -l <mnt-pt> | 39.96 | 6.8 | 6.3 | 1.52 |
| sleep 60; time ls -l <mnt-pt> | 22.74 | 22.93 | 5.55 | 5.4 |
| sleep 60; time ls -l <mnt-pt> | 22.01 | 23.21 | 5.5 | 5.45 |
| sleep 1; time ls -l <mnt-pt> | 21.59 | 22.81 | 0.84 | 0.87 |
| sleep 1; time ls -l <mnt-pt> | 0.96 | 0.91 | 0.91 | 0.87 |
| sleep 1; time ls -l <mnt-pt> | 21.59 | 23.02 | 0.95 | 0.74 |
| sleep 60; time ls -l <mnt-pt> | 22.27 | 23.23 | 5.53 | 5.43 |
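For reference, the sequence above can be scripted. A minimal, scaled-down sketch (100 files in a local temporary directory rather than 50K files over an NFS mount, purely to illustrate the command sequence) looks like this:

```shell
#!/bin/bash
# Scaled-down sketch of the test sequence above. The real test created
# 50K files on an NFS mount; here we use 100 files in a local temp dir
# so the sketch runs anywhere.
MNT=$(mktemp -d)

# create phase
for i in $(seq 1 100); do
    touch "$MNT/file$i"
done

# timed listings; the real runs used "sleep 60" to let the NFS client's
# attribute cache expire between listings
time ls -l "$MNT" > /dev/null
sleep 1
time ls -l "$MNT" > /dev/null

# sanity check: the listing should cover all created files
ls "$MNT" | wc -l    # prints 100
```

Against a real gNFS or kNFS mount, MNT would point at the NFS mount point and the file count and sleeps would be restored to the values used in the table.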
Some points to note:
* kNFS behavior for the sequence of commands is the same before and after umount/mount. This is what we would expect for RHS/gNFS as well. However, until the umount, performance with gNFS is terrible.
* acl/noacl: Time taken by the first instance of ls -l after a fresh mount is significantly higher when acl is enabled than when it is not (39.96 v. 6.8). kNFS also shows a penalty for acl the first time after a fresh mount (6.3 v. 1.52). Subsequent instances of "ls -l" do not pay a penalty in the acl case.
* caching effects: after a fresh mount, "ls -l" times with gNFS are either around 22 secs or around 0.9 secs. This has to do with attribute caching timeouts in the NFS client. We have verified that by setting actimeo=120 as a client NFS mount option, we can get most of these numbers below 1 sec. kNFS shows similar behavior: in the case of kNFS, "ls -l" times are either around 5.5 secs or around 0.9 secs.
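The actimeo workaround mentioned above is a client-side mount option. A sketch, with a hypothetical server name, export path, and mount point:

```shell
# Hypothetical server/export/mount point. actimeo=120 raises the NFS
# client's attribute cache timeouts (acregmin/acregmax/acdirmin/acdirmax)
# to 120 seconds, so repeated "ls -l" runs within that window are served
# from the client cache instead of triggering GETATTR storms.
mount -t nfs -o vers=3,actimeo=120 gnfs-server:/testvol /mnt/gnfs
```

This is a config fragment, not a runnable test: it requires root and a live NFS server.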
Here are the columns for gNFS and kNFS for the same test as above, but here the time for each instance of "ls -l" is followed in the same column by the NFS calls that the client makes. This data was obtained on the NFS client by running "nfsstat -c" before and after each "ls -l" command. Some rows that don't look very different have not been annotated with NFS call information, to avoid clutter.
| Sequence of commands | RHS: time [NFS calls] | kNFS: time [NFS calls] |
|---|---|---|
| create 50K files in <mnt-pt> | | |
| time ls -l <mnt-pt> | 214.5 [2 getattr, 50K lookup, 98 access, 64642 readdirplus] | 6.46 [1 getattr, 1 access, 2441 readdirplus] |
| sleep 60; time ls -l <mnt-pt> | 181.95 [2 getattr, 50K lookup, 98 access, 64642 readdirplus] | 5.68 [50002 getattr, 1 access] |
| sleep 60; time ls -l <mnt-pt> | 181.19 [2 getattr, 50K lookup, 98 access, 64642 readdirplus] | 5.69 [50002 getattr, 1 access] |
| sleep 1; time ls -l <mnt-pt> | 178.61 [1 getattr, 50K lookup, 98 access, 64642 readdirplus] | 0.79 [1 getattr] |
| sleep 1; time ls -l <mnt-pt> | 180.79 | 0.9 |
| sleep 1; time ls -l <mnt-pt> | 178.94 | 0.94 |
| sleep 60; time ls -l <mnt-pt> | 179.39 | 5.67 |
| umount <mnt-pt>; mount again | | |
| time ls -l <mnt-pt> | 39.96 [1 getattr, 1 access, 1 access, 2451 readdirplus] | 6.3 [1 getattr, 1 access, 2441 readdirplus] |
| sleep 60; time ls -l <mnt-pt> | 22.74 [50002 getattr, 1 access] | 5.55 [50002 getattr, 1 access] |
| sleep 60; time ls -l <mnt-pt> | 22.01 | 5.5 |
| sleep 1; time ls -l <mnt-pt> | 21.59 | 0.84 |
| sleep 1; time ls -l <mnt-pt> | 0.96 [1 getattr] | 0.91 [1 getattr] |
| sleep 1; time ls -l <mnt-pt> | 21.59 | 0.95 |
| sleep 60; time ls -l <mnt-pt> | 22.27 [50002 getattr, 1 access] | 5.53 [50002 getattr, 1 access] |
Some points to note:
* after umount, the calls made by the client for gNFS and kNFS are very similar. There is a small difference in the number of readdirplus calls in the first instance after a fresh mount (2451 for gNFS v. 2441 for kNFS). But mostly, gNFS and kNFS are processing the same number of calls; kNFS is just faster.
* after the create phase, before umount, there is clearly some bug in the gNFS case that is causing the NFS client to send many additional requests, including many lookups.
* in the case of the sub-second times, when "ls -l" is issued in quick succession, almost everything is coming out of the client cache.
* I'm not seeing any difference in the number of client NFS calls with the noacl mount option. So the difference in times between acl and noacl (seen when "ls -l" is run after a fresh mount) should be coming from the auxiliary calls needed to implement acl.
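The per-command call counts quoted above were gathered by snapshotting "nfsstat -c" before and after each command and subtracting. A minimal sketch of the subtraction step, assuming each snapshot has already been reduced to "operation count" lines (the reduction itself depends on the nfsstat version's output layout, so it is not shown):

```shell
# diff_counts BEFORE AFTER: print operations whose counters changed,
# with the per-operation delta. Input files contain "operation count"
# lines, one per NFS operation.
diff_counts() {
    awk 'NR==FNR { before[$1] = $2; next }
         { d = $2 - before[$1]; if (d != 0) print $1, d }' "$1" "$2"
}

# Example with canned data mimicking reduced nfsstat -c output:
printf 'getattr 10\nlookup 5\naccess 3\n' > /tmp/before.txt
printf 'getattr 12\nlookup 5\naccess 4\n' > /tmp/after.txt
diff_counts /tmp/before.txt /tmp/after.txt
# prints:
# getattr 2
# access 1
```

In the actual measurement, the two snapshots would be taken immediately around each "ls -l" run on the client.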
It sounds like another cache inconsistency issue which is generating LOOKUP operations and taking longer. That's why unmounting (which drops the cache) and remounting gives better numbers. I guess there is another way to verify:

1. Mount the volume.
2. Create the files.
3. Drop the cache (e.g. echo 3 > /proc/sys/vm/drop_caches).
4. ls -l <mount pt>

(Drop cache followed by ls -l is exactly the same as directory_crawl() in perf-test.sh.)

I could reproduce the issue with a simple test case:

1. Create a single brick volume and NFS mount it.
2. In the mount point, create a file, e.g. testfile.
3. ls -l <mnt pt>

This is what I get in client-side rpcdebug:

```
Jan 9 00:54:45 santosh-new-4 kernel: NFS: revalidating (0:1d/1)
Jan 9 00:54:45 santosh-new-4 kernel: NFS call getattr
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply getattr: 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/1 ct=2 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs3_forget_cached_acls(0:1d/1)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: (0:1d/1) revalidation complete
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs3_get_cached_acl(0:1d/1, 32768) = fffffffffffffff5
Jan 9 00:54:45 santosh-new-4 kernel: NFS call getacl
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply getacl: 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/1 ct=2 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: nfs3_cache_acls(0:1d/1, (null), (null))
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs3_get_cached_acl(0:1d/1, 16384) = (null)
Jan 9 00:54:45 santosh-new-4 kernel: NFS call access
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply access: 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: permission(0:1d/1), mask=0x24, res=0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: open dir(/)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: revalidating (0:1d/1)
Jan 9 00:54:45 santosh-new-4 kernel: NFS call getattr
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply getattr: 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/1 ct=2 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: (0:1d/1) revalidation complete
Jan 9 00:54:45 santosh-new-4 kernel: NFS: readdir(/) starting at cookie 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: (0:1d/1) data cache invalidated    ====>>> data cache gets invalidated HERE <<<=======
Jan 9 00:54:45 santosh-new-4 kernel: NFS call readdirplus 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/1 ct=2 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply readdirplus: 492
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/-5753586402791623407 ct=1 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_do_filldir() filling ended @ cookie 512; returning = 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: readdir(/) returns 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: permission(0:1d/1), mask=0x1, res=0
Jan 9 00:54:45 santosh-new-4 kernel: NFS call lookup skdot
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/1 ct=2 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: mtime change on server for file 0:1d/1
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply lookup: 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/-5753586402791623407 ct=1 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_lookup_revalidate(/skdot) is valid
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS call access
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply access: 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: permission(0:1d/1), mask=0x1, res=0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_lookup_revalidate(/skdot) is valid
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: permission(0:1d/1), mask=0x1, res=0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_lookup_revalidate(/skdot) is valid
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: permission(0:1d/1), mask=0x1, res=0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_lookup_revalidate(/skdot) is valid
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: permission(0:1d/1), mask=0x1, res=0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_lookup_revalidate(/skdot) is valid
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs3_get_cached_acl(0:1d/-5753586402791623407, 32768) = (null)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: permission(0:1d/1), mask=0x1, res=0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_lookup_revalidate(/skdot) is valid
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs3_get_cached_acl(0:1d/-5753586402791623407, 16384) = (null)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: readdir(/) starting at cookie 3
Jan 9 00:54:45 santosh-new-4 kernel: NFS: (0:1d/1) data cache invalidated
Jan 9 00:54:45 santosh-new-4 kernel: NFS call readdirplus 0
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/1 ct=2 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: mtime change on server for file 0:1d/1    ==========>>>> This seems to be the culprit <<<<========
Jan 9 00:54:45 santosh-new-4 kernel: NFS reply readdirplus: 492
Jan 9 00:54:45 santosh-new-4 kernel: NFS: nfs_update_inode(0:1d/-5753586402791623407 ct=1 info=0x7e7f)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: dentry_delete(/skdot, 8)
Jan 9 00:54:45 santosh-new-4 kernel: NFS: readdir(/) returns 0
```

The client just does an ls -l on the mount point; STAT/GETXATTR/ACCESS calls won't change the mtime on the mount point. Seems to be a client issue. Jeff Layton pointed out an NFS client issue, i.e. BZ 976879. This does not need any code change on the Gluster NFS server side.
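For reference, a client-side trace like the one above can be captured with rpcdebug from nfs-utils (requires root; messages go to the kernel log). A sketch, with a hypothetical mount point:

```shell
# Turn on verbose NFS client debugging; messages appear in dmesg/syslog.
rpcdebug -m nfs -s all

ls -l /mnt/gnfs          # hypothetical mount point under investigation

# Turn debugging back off and inspect the captured trace.
rpcdebug -m nfs -c all
dmesg | tail -n 50
```

This is an operational fragment rather than a runnable test, since it needs root and a mounted NFS volume.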
We can close this bug :)

Thanks,
Santosh

Forgot to mention that when I tested with a Fedora 18 client box (3.11 kernel), the issue is not seen. The bug I mentioned in comment # 5 might have been fixed in the RHEL 6.5 client.

> RHS kNFS
> Time Time
> [NFS calls] [NFS calls]
> SEQUENCE OF COMMANDS
>
> create 50K files in <mnt-pt>
>
> time ls -l <mnt-pt> 214.5 6.46
> [2 getattr, 50K lookup, [1 getattr,
> 98 access, 1 access,
> 64642 readdirplus] 2441 readdirplus]
>
> sleep 60; time ls -l <mnt-pt> 181.95 5.68
> [2 getattr, 50K lookup, [50002 getattr,
> 98 access, 1 access]
> 64642 readdirplus]
>
> sleep 60; time ls -l <mnt-pt> 181.19 5.69
> [2 getattr, 50K lookup, [50002 getattr,
> 98 access, 1 access]
> 64642 readdirplus]

[Santosh] You correctly pointed out that the number of readdirplus calls is the culprit. This is in fact an NFS client readdir() performance issue which was invalidating the data cache frequently, generating more READDIRPLUS calls and impacting the performance of ls -l.

> sleep 1; time ls -l <mnt-pt> 178.61 0.79
> [1 getattr, 50K lookup, [1 getattr]
> 98 access,
> 64642 readdirplus]
>
> sleep 1; time ls -l <mnt-pt> 180.79 0.9
> sleep 1; time ls -l <mnt-pt> 178.94 0.94
> sleep 60; time ls -l <mnt-pt> 179.39 5.67
>
> umount <mnt-pt>; mount again
>
> time ls -l <mnt-pt> 39.96 6.3
> [1 getattr, 1 access, [1 getattr,
> 1 access, 1 access,
> 2451 readdirplus] 2441 readdirplus]

After umount/mount, it shows the correct behavior. Let's close the bug as the problem no longer occurs. Please feel free to reopen if you don't agree with the analysis.

Clearing needinfo.