| Summary: | NFS is 40x slower when using 32K I/O size to copy files from a Solaris 10 NFS client to a RHEL 6.2 NFS server | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | roland.teague |
| Component: | kernel | Assignee: | nfs-maint |
| Status: | CLOSED NOTABUG | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | Version: | 6.2 |
| CC: | bengland, jlayton, perfbz, steved | Target Milestone: | rc |
| Target Release: | --- | Hardware: | x86_64 |
| OS: | Unspecified | Whiteboard: | NFS performance |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-02-16 11:06:16 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | badperf 32k I/O size tcpdump (559758); goodperf default I/O size tcpdump (559759) | | |
Description (roland.teague, 2012-01-30 19:09:25 UTC)
I think the first place to start is with some network captures that show the traffic between client and server. nfsstat info might also be interesting if there's a large difference in the number and/or type of calls between the two test runs.

Hi Roland, we worked together about 5 years ago at IBRIX.
How do you know that you are using 4-KB I/O size (paragraph 1 in your original
problem report)? It appears to me that in the "expected results" section with
your test run with good performance, you didn't specify rsize and wsize. The
Linux NFS server in RHEL6.2 will negotiate up to 1 MB RPC size (default with
NFS V4 at least). If network round-trip time is high, then a larger RPC size
should help. In both the expected and actual cases above, the dd I/O size is
1000000 bytes.
I don't know Solaris well, but is there a /proc/mounts file or equivalent on the
Solaris client, and does the mountpoint appear in it? This would tell us what
parameters were negotiated for the NFS mount in each case.
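As a hedged illustration (not from the original report): Solaris also has an
nfsstat -m option that prints per-mount NFS parameters, which should answer this
question directly, and a Linux client used for comparison can be checked the
same way:

bash-3.2# nfsstat -m         # on the Solaris client: shows the read size/write size in effect for each NFS mount

# nfsstat -m                 # on a Linux client used for comparison
# grep nfs /proc/mounts      # each NFS mount line includes rsize=...,wsize=...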
I don't see this kind of drop-off when I do the test with a Linux NFS client,
nor do I see any evidence here of a regression in NFS, since you reproduced it
on all those RHEV versions.
Can you run this on the RHEL server
# tcpdump -w /tmp/a.tcpdump -s 1500 -c 100000
before the test starts, then compress the capture and post it as an attachment?
Also, can you run this on the RHEL server
# while [ 1 ] ; do nfsstat -s ; sleep 5 ; done > nfsstat.log
before the test starts and post that? This will tell us whether there is a
stall or whether this is steady-state behavior.
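A minimal sketch of that capture workflow, assuming the NFS traffic is on the
standard port 2049 and the server's interface is eth0 (both assumptions, adjust
as needed):

# tcpdump -i eth0 -s 1500 -c 100000 -w /tmp/a.tcpdump port 2049 &   # start just before the dd test
# wait                                                              # tcpdump exits after 100000 packets
# bzip2 /tmp/a.tcpdump                                              # attach /tmp/a.tcpdump.bz2 to the bug

Restricting the capture to port 2049 keeps unrelated traffic out of the trace;
dropping the filter captures everything, which matches the command above.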
Does NFS V3 behave differently than NFS V4?
Is it possible that the network path might be the cause of this problem? There
is a way to use netperf to simulate NFS behavior to some extent at the network
level; you might want to try this and see what different RPC sizes will do
(vary the -r parameter below to simulate reads vs. writes and different RPC
sizes, and run multiple netperf processes to simulate multiple threads).
[root@perf56 ~]# netperf -v 5 -l 5 -H perf36 -t TCP_RR -- -r 512,32768
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to perf36.lab.bos.redhat.com (10.16.41.6) port 0 AF_INET : spin interval : demo : first burst 0

Alignment      Offset         RoundTrip  Trans     Throughput
Local  Remote  Local  Remote  Latency    Rate      10^6bits/s
Send   Recv    Send   Recv    usec/Tran  per sec   Outbound   Inbound
8      0       0      0       611.323    1635.797  6.700      428.814
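An illustrative follow-up, with request/response sizes chosen here as
assumptions rather than taken from the report: a 32 KB NFS write is roughly a
large request with a small reply, and a read is the reverse, so comparing runs
like these shows whether the network path itself penalizes larger transfers:

# netperf -l 30 -H perf36 -t TCP_RR -- -r 32768,512    # roughly models 32 KB NFS WRITE traffic
# netperf -l 30 -H perf36 -t TCP_RR -- -r 512,32768    # roughly models 32 KB NFS READ traffic
# netperf -l 30 -H perf36 -t TCP_RR -- -r 4096,512     # 4 KB write-sized exchange for comparison

If the transaction rate collapses only for the large-request cases, that points
at the network path (MTU, offload, or retransmission issues) rather than the
NFS server.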
Hi Ben, glad to hear you're at Red Hat. :-)

I assumed that the default rsize and wsize values were 4k. Perhaps I was mistaken, because when I specify an rsize and wsize of 4k I get performance that is worse than with the 32k block size. I'm not sure what I/O size is used on Solaris 10 when the rsize and wsize values are not specified, but I tried setting the rsize and wsize values to 1m and I still get horrible performance. I see the issue with both NFS versions 3 and 4. Here are the results when not specifying the rsize and wsize values.

bash-3.2# mount -F nfs -o vers=4 10.10.138.134:/home /ibfs1
bash-3.2# mount
/ibfs1 on 10.10.138.134:/home remote/read/write/setuid/devices/rstchown/vers=4/xattr/dev=4d40024 on Mon Feb 6 14:19:15 2012
bash-3.2# time dd if=/dev/zero of=/ibfs1/dd_test_4k bs=1000000 count=1000
1000+0 records in
1000+0 records out

real    0m16.791s
user    0m0.003s
sys     0m1.052s
bash-3.2#

I will work on getting the tcpdumps. nfsstat is showing a 50% split between putfh and write calls on both the client and server for both 4K and 32K I/O sizes.

(In reply to comment #4)
> I will work on getting the tcpdumps. nfsstat is showing a 50% split between
> putfh and write calls on both the client and server for both 4K and 32K I/O
> sizes.

Please use tshark to capture the traces, since tcpdump does not have v4 support. Something similar to:

tshark -w /tmp/data.pcap <server>
bzip2 /tmp/data.pcap

tia,

Created attachment 559758 [details]
badperf 32k I/O size tcpdump
The end customer is using NFS version 3, so I have included a tcpdump taken with NFS version 3. These are the results when I use an rsize/wsize of 32k.
Created attachment 559759 [details]
goodperf default I/O size tcpdump
This is the good-performing tcpdump of NFS version 3 using the default rsize/wsize.
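A hedged sketch of how the write sizes in these two captures can be compared
(the file names below are placeholders for the attached captures, and older
tshark builds use -R rather than -Y for the display filter): the summary line
of each NFS WRITE call includes its offset and length, so filtering for WRITE
calls makes the difference between the two runs obvious:

# tshark -r badperf.pcap -Y nfs | grep -i "write call"     # check the Len: value of each WRITE
# tshark -r goodperf.pcap -Y nfs | grep -i "write call"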
(In reply to comment #6)
> Created attachment 559758 [details]
> badperf 32k I/O size tcpdump
>
> The end customer is using NFS version 3 so I have included a tcpdump using NFS
> version 3. These are the results when I use a rsize/wsize of 32k.

With this trace you are not getting 32k writes. You are getting 32 byte writes... How are you setting the rsize/wsize sizes?

Also, the "goodperf" capture shows that Solaris is defaulting to a 32k wsize, not a 4k one. I have to wonder if Solaris understands the 'k' that you're using. I suggest specifying rsize/wsize in bytes and redoing your test. For instance: wsize=32768

I also noticed the same write sizes in the tcpdumps, which would explain the performance difference. I suspect that Solaris does not understand the "k", so I am retesting. I'm also confirming with the end customer whether they were indeed using 32k for the rsize/wsize mount options and not 32768.

Ok, given that we think we understand the problem, I'm going to go ahead and close this as NOTABUG. Roland, please feel free to reopen the bug if our analysis turns out to be incorrect or if you want to discuss it further.

So it turns out that I could not reproduce the performance issue as easily as I thought I could, due to the syntax issues on the Solaris side. The customer is still seeing the performance issue with any block size greater than 4096. We haven't been able to reproduce it, and the customer cannot reproduce it at will. I have asked them to capture tcpdumps when they reproduce the issue again. They are running RHEL 5.5.
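A minimal sketch of the retest suggested above, reusing the export and mount
point from the earlier dd run (the byte-valued rsize/wsize options are the
suggestion from this thread, not a command that was actually captured here):

bash-3.2# mount -F nfs -o vers=3,rsize=32768,wsize=32768 10.10.138.134:/home /ibfs1
bash-3.2# nfsstat -m                     # confirm the negotiated write size really is 32768
bash-3.2# time dd if=/dev/zero of=/ibfs1/dd_test_32k bs=1000000 count=1000

For reference, the good run earlier moved 1000 x 1,000,000 bytes in about 16.8
seconds, roughly 60 MB/s, so a 40x slowdown would put the bad case in the
neighborhood of 1.5 MB/s.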