Bug 502884
Summary: | NFSv4 Issue/slowdown when testing against the NetApp server | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Vivek Goyal <vgoyal> |
Component: | kernel | Assignee: | Jeff Layton <jlayton> |
Status: | CLOSED ERRATA | QA Contact: | yanfu,wang <yanwang> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 4.9 | CC: | bikash.choudhury, bikash, dhoward, jburke, lwang, nhorman, pbunyan, ricardo.labiaga, rwheeler, sager, staubach, steved, yanwang |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2011-02-16 15:51:29 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 589293 | ||
Attachments: |
Description
Vivek Goyal
2009-05-27 15:51:22 UTC
Created attachment 345636 [details]
Network capture
Attaching a network capture of the traffic while the slowness is occurring. It may or may not be useful, but it did look different than the network traffic during the faster sections of the test run.
Peter, as per your email I took a look at the tcpdump, and as you noted there is a connection which is constantly getting reset (well, closed with a FIN from the server). I'm not sure if it's the source of your problems, but it certainly looks suspicious.

I compared a good NFSv4 call (frame 22) with a data block from a connection that the server closes (frame 20). If you look at them, the first difference is the destination port on the server they are sending to: the good connection uses the nfs program port (733), while the connection that is perpetually getting closed is going to port 59216 on the server. There's not enough of the trace to tell, and it's possible that a program registered with the portmapper there, but it's been my experience that nfs ports tend to be fairly regular and well known. That's not to say a different service isn't registered there, although I couldn't say what it is.

Continuing, if you look at the data block in frame 20 and compare it to the good frame, we can see that frame 20 holds what looks like an rpc message (the rpc header is almost identical to that of 22, with obvious differences for xid, etc.). Wireshark seems unable to decode frame 20 properly, but a hand decode shows that it looks like a 148 byte rpc message contained in 1 fragment, with an xid of 0x3d46a3af, containing a call command in rpc v2 format. After that the wheels come off the wagon, though. The RPC program number is 0x40000000, which I cannot find defined anywhere in any well known rpc program list. Of course that makes the rest of the frame meaningless, since we don't know what the rest of it should decode as (the program version and call number seem equally garbage-like).
The netapp responds to this in frame 23 with a response containing the correct/same xid as in frame 20, but it appears to be some sort of rpc error (I need to read up on the rpc frame format for a full decode).

So in summary:
1) The RHEL4 client is attempting to connect to an odd port that doesn't appear to belong to the NFS suite
2) The NetApp server is accepting the connection
3) The RHEL4 client is pushing what appears to be valid rpc data for an unknown RPC program number/version/call
4) The NetApp server responds with what appears to be a valid rpc response containing (I think) some error data
5) The netapp closes the connection

Given that all these closures are occurring on the same port, I think a next logical step would be to run an rpcinfo on the netapp, and figure out what program 0x40000000 is and what program is listening on port 59216. That'll probably give us a better idea of what exactly this exchange is about, and likely give us more information about the change in behavior.

That unknown RPC call is the server calling back the client to recall a delegation. If you look you can see the client almost always sends an immediate DELEGRETURN call to the server after that exchange completes. Unfortunately wireshark is really bad at decoding these callbacks. They're pretty simple however, and I don't think it's uncommon for the server to close the socket after doing the call.

It looks like the delegation callbacks are pretty universally being done just after a SETATTR call. The server responds with NFS4ERR_DELAY and then recalls the delegation. The problem here though is that RHEL4 has this:

#define NFS4_POLL_RETRY_MIN (1*HZ)

...so it waits a minimum of 1s before retrying the SETATTR call. If we're getting a lot of these sorts of calls then that may explain the problem. In RHEL5 we added a patch a while back that brought this value down to .1s, which is a little more reasonable. Perhaps we should experiment with that for RHEL4 too?
The RHEL5 case for that problem was bug 441884.

Created attachment 345808 [details]
test patch -- backport of all 3 patches for 441884
This patch is a rollup of all 3 patches from 441884. I've lightly tested it and it seems to basically work. I don't have a good way to reproduce this here however, so it might be good to have someone who has been able to reproduce it test this patch.
From my own testing against filerb (note that my clients are much farther away on the network than Peter's):

Unpatched -89.EL: created 155 files 62 directories 5 levels deep in 183.14 seconds
Patched -89.EL with 3 patch set from above: created 155 files 62 directories 5 levels deep in 42.6 seconds
RHEL-5 2.6.18-150.el5debug: created 155 files 62 directories 5 levels deep in 15.61 seconds

...so I'm also seeing a significant difference between RHEL4 and 5 even with the 3 patch set so far. I'll get some captures and see if I can determine the cause.

Nothing really stands out at me in the post-patch RHEL4 capture. RHEL5 is a bit different: the server never seems to issue a NFS4ERR_DELAY and delegation recall after the setattr from a RHEL5 client, even though the clients are setting the same attribute (FATTR4_TIME_MODIFY_SET to SET_TO_SERVER_TIME4). We may need some help from netapp to explain why this is.

cc'ing Bikash and Ricardo... Can either of you shed any light on the difference I see in comment #9? I can provide captures if they'll help...

Created attachment 346127 [details]
capture of rhel4 creating file
Created attachment 346128 [details]
capture of rhel5 creating file
Ok, here are 2 captures of files being created... The first thing I notice is that RHEL4 involves a lot more calls. That in and of itself probably accounts for some of the performance differential. I still don't see exactly why the filer decides to recall the delegation from the RHEL4 client, but not the RHEL5 one on the setattr. The open and setattr calls pretty much look identical. Even though we have a patch to retry in .1s after getting NFS4ERR_DELAY, it would still probably be a big win if we never got such a delay at all.

Can you run the following commands on the filer to determine OS version and options?

version
options nfs

Thanks Ricardo, here's the info:

na3070b> version
NetApp Release 7.3.1: Thu Jan 8 01:17:51 PST 2009

na3070b> options nfs
nfs.acache.persistence.enabled on (value might be overwritten in takeover)
nfs.assist.queue.limit 40 (value might be overwritten in takeover)
nfs.export.allow_provisional_access on (value might be overwritten in takeover)
nfs.export.auto-update on (value might be overwritten in takeover)
nfs.export.exportfs_comment_on_delete on (value might be overwritten in takeover)
nfs.export.harvest.timeout 1800 (value might be overwritten in takeover)
nfs.export.neg.timeout 3600 (value might be overwritten in takeover)
nfs.export.pos.timeout 36000 (value might be overwritten in takeover)
nfs.export.resolve.timeout 6 (value might be overwritten in takeover)
nfs.hide_snapshot off
nfs.ifc.rcv.high 66340
nfs.ifc.rcv.low 33170
nfs.ifc.xmt.high 16
nfs.ifc.xmt.low 8
nfs.ipv6.enable off
nfs.kerberos.enable off
nfs.locking.check_domain on (value might be overwritten in takeover)
nfs.max_num_aux_groups 32
nfs.mount_rootonly on
nfs.mountd.trace off
nfs.netgroup.strict off
nfs.notify.carryover on
nfs.ntacl_display_permissive_perms off (value might be overwritten in takeover)
nfs.per_client_stats.enable off
nfs.require_valid_mapped_uid off
nfs.response.trace off (value might be overwritten in takeover)
nfs.response.trigger 60 (value might be overwritten in takeover)
nfs.rpcsec.ctx.high 0
nfs.rpcsec.ctx.idle 360
nfs.tcp.enable on
nfs.thin_prov.ejuke off (value might be overwritten in takeover)
nfs.udp.enable on
nfs.udp.xfersize 32768 (value might be overwritten in takeover)
nfs.v2.df_2gb_lim off (value might be overwritten in takeover)
nfs.v3.enable on (value might be overwritten in takeover)
nfs.v4.acl.enable on (value might be overwritten in takeover)
nfs.v4.enable on (value might be overwritten in takeover)
nfs.v4.id.domain redhat.com
nfs.v4.read_delegation on (value might be overwritten in takeover)
nfs.v4.setattr_acl_preserve off (value might be overwritten in takeover)
nfs.v4.write_delegation on (value might be overwritten in takeover)
nfs.webnfs.enable off
nfs.webnfs.rootdir XXX
nfs.webnfs.rootdir.set off

Jeff, is RHEL4 using the delegation stateid in the setattr call? ONTAP 7.3.1 will actually check for this.

Good catch, Ricardo. RHEL4 is sending a zeroed out stateid. I'll poke at the code and see if I can figure out why. Thanks!

Created attachment 348370 [details]
test patch 2 -- backport 3 patches for 441884 and 2 patches to fix setattr handling
Here's an updated test patch.
This adds 2 patches backported from upstream to make setattrs use the delegation stateid when there is one. My results with this are much better -- ~23s to create the files for cthon basic test 1 and it looks like it eliminates the frequent delegation recalls.
Still not quite as fast as RHEL5, but definitely getting into the same ballpark (only about an 8s delta now in my testing). I think the main difference now is that RHEL5 seems to do fewer calls overall on a create. That probably accounts for the difference here.
Peter, when you get time can you test this patch too and let me know if it looks better on your end?
I will include it into the kernel that I run. Even at ~23 seconds, this is still a far cry from the NFSv3 times of .3 seconds. I wonder if this is required or whether there is still something amiss.

My client is quite a bit farther from the server. NFSv4 + RHEL5 takes ~15s to do the same test. From a look at traces, RHEL4 definitely makes a lot more calls on the wire for each create. I'm pretty sure some of those could be optimized away, but given that we discourage using NFSv4 on RHEL4, I'm not sure it's really worthwhile. I've also included these patches in my RHEL4 test kernels: http://people.redhat.com/jlayton

After reading Jeff's Comment #20 and thinking about the amount of time this is wasting: on every kernel, for every cthon test, we have to manually review the WARN just to make sure it is this issue we are aware of. If we are not going to fix this, I would like to disable RHEL4 NFS tests on the NetApp servers. I am advocating RHEL5.x to all our field personnel and customers who are attempting to use NFSv4 in production.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Committed in 89.36.EL.
RPMS are available at http://people.redhat.com/vgoyal/rhel4/

I ran the connectathon testsuite on 2.6.9-89.1.EL (the kernel from the initial problem report) and again on the new kernel 2.6.9-94.EL in beaker. The NFSv4 tests in RHEL4 against the NetApp nfs server all pass now; please check the links below.

Against the old kernel 2.6.9-89.1.EL:
https://beaker.engineering.redhat.com/jobs/43731
https://beaker.engineering.redhat.com/jobs/43732
https://beaker.engineering.redhat.com/jobs/43734
https://beaker.engineering.redhat.com/jobs/43735
https://beaker.engineering.redhat.com/jobs/43736
https://beaker.engineering.redhat.com/jobs/43742

Against the new kernel 2.6.9-94.EL:
i386: https://beaker.engineering.redhat.com/jobs/43750
x86_64: https://beaker.engineering.redhat.com/jobs/43751, https://beaker.engineering.redhat.com/jobs/44029
ppc64: https://beaker.engineering.redhat.com/jobs/43753
s390x: https://beaker.engineering.redhat.com/jobs/43755
s390: https://beaker.engineering.redhat.com/jobs/44028
ia64: https://beaker.engineering.redhat.com/jobs/43759

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html