Bug 830606

Summary: NFS very very slow.
Product: Fedora
Reporter: Gerry Reno <greno>
Component: kernel
Assignee: nfs-maint
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 17
CC: bfields, gansalmon, itamar, jforbes, jlayton, jonathan, kernel-maint, lczerner, madhu.chinakonda, madko, rwheeler
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2013-08-01 05:37:22 UTC
Type: Bug

Description Gerry Reno 2012-06-10 22:14:05 UTC
Description of problem:
NFS is very very slow.


Version-Release number of selected component (if applicable):
# uname -r
3.3.7-1.fc17.x86_64


How reproducible:
Always.

Steps to Reproduce:
1. Install F17
2. Create NFS mount for NAS box
3. Start backups.
  

Actual results:
Really slow NFS response.

E.g.: trying to list the NFS directories while a backup is running is almost impossible.
On other non-F17 machines this works without any problem.

# time ls -lh /mnt/Backup_Filesets_?/
/mnt/Backup_Filesets_A/
/mnt/Backup_Filesets_B/

real	13m43.501s
user	0m0.010s
sys	0m0.076s

Expected results:
NFS operates normally.
No delays in listing NFS directories.



Additional info:

Comment 1 Gerry Reno 2012-06-10 22:16:41 UTC
Something else as well.

While backups are running over the NFS mounts I am getting hangs in various things such as Chrome and even doing a ps listing.

ps will list but then will not return to the command prompt.

Chrome keeps putting up messages about page still loading and eventually tabs will appear to totally hang.

.

Comment 2 Gerry Reno 2012-06-10 22:18:21 UTC
And this machine has an SSD drive if that makes any difference.

.

Comment 3 Justin M. Forbes 2012-09-11 15:42:17 UTC
Is this still happening with 3.5.3 kernels in updates?

Comment 4 Gerry Reno 2012-09-11 16:09:42 UTC
I won't be at the location where the NAS box is located for a few weeks so I cannot test this right now.

I'm still running 3.3.7 currently.

I'll make a note to upgrade before I go to the NAS box location and test this again.

.

Comment 5 J. Bruce Fields 2012-09-11 17:27:50 UTC
Note there's a known problem with stat'ing while writing: in order to get the (posix-required) up-to-date file times from the server, the client needs to first make sure any dirty data's written out to the server.

If that's the problem, I think tuning the VM to limit the number of dirty pages might help (e.g., dirty_bytes or dirty_ratio; see Documentation/sysctl/vm.txt in the kernel source tree).
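
For illustration only, something along these lines would cap how much dirty data the client can accumulate (the byte values here are arbitrary examples, not recommendations; note that setting the *_bytes knobs overrides the corresponding *_ratio ones):

# sysctl -w vm.dirty_bytes=268435456              <- cap dirty data at ~256 MB
# sysctl -w vm.dirty_background_bytes=67108864    <- start background writeback at ~64 MB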

Comment 6 Gerry Reno 2012-09-11 18:27:23 UTC
As I said, I'm using an SSD in this machine and I have it mounted like this:
defaults,noatime,discard

in order to reduce unnecessary writes to the drive.
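
(The corresponding fstab entry is along these lines; the device name is just illustrative:)

/dev/sda2  /  ext4  defaults,noatime,discard  1 1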

Is the noatime affecting NFS?

.

Comment 7 Ric Wheeler 2012-09-12 14:41:48 UTC
Take the "discard" mount option out on the local file system. It is off by default since it can (depending on the device) be very slow.

Comment 8 Gerry Reno 2012-09-12 14:52:41 UTC
I really do not want to do that.

Without 'discard' the SSD will eventually get slower and slower.  There are multiple sites that have tested SSD performance and proven this.

I got discard working and it tests just fine as it zeroes out all the sectors on files that have been removed.

I'm using a SanDisk Extreme SSD and have not noticed any slowness at all.  In fact quite the opposite.  And it's stayed just as responsive as the day I installed it.

.

Comment 9 Ric Wheeler 2012-09-12 17:59:11 UTC
Could you test without discard and then report the results?

Comment 10 Gerry Reno 2012-09-12 18:12:29 UTC
Yes, I can disable discard temporarily to run a test.

As I said, it'll be a few weeks before I'm back where I can test it.

.

Comment 11 Lukáš Czerner 2012-09-12 18:42:26 UTC
(In reply to comment #10)
> Yes, I can disable discard temporarily to run a test.
> 
> As I said, it'll be few weeks before I'm back where I can test it.
> 

Thanks. It is important to rule out discard because, as Ric mentioned, depending on the device the -o discard mount option can make things very slow, especially in workloads with lots of freed blocks and syncs. That might not be your case, but it's worth testing anyway.

Note that if this happens to be the cause of the slowness, you can always use batched discard (see fstrim(8) from util-linux), let it run once in a while (daily, weekly, monthly), and get rid of the -o discard option.
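
For example, something like this run from a weekly cron job would do the batched trim (the path to fstrim may differ on your system):

# cat /etc/cron.weekly/fstrim
#!/bin/sh
# trim all free blocks on the root filesystem once a week
/usr/sbin/fstrim /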

Thanks!
-Lukas

Comment 12 Jeff Layton 2012-09-12 18:47:39 UTC
I'm a little fuzzy on the actual configuration here. The original description said:


Steps to Reproduce:
1. Install F17
2. Create NFS mount for NAS box
3. Start backups.

What sort of server is this NAS box? When you said that the machine had a SSD, you meant the NFS client, right? If so, then it's not clear to me that messing around with mount options involving it will have any effect on NFS performance...

Comment 13 Ric Wheeler 2012-09-12 18:53:59 UTC
Good question - I had assumed that the problem was exporting via NFS the SSD file system.

Comment 14 Gerry Reno 2012-09-12 18:58:46 UTC
No.  It's the other way around.

The client laptop has the SSD drive.

The NAS box has regular HDD drives.

.

Comment 15 Jeff Layton 2012-09-12 19:09:19 UTC
Ok, thanks. So the problem is slow "ls" performance while running heavy I/O (mostly writes?) to the NAS. You also said:

   "On other non-F17 machines this works without any problem."

What other non-F17 machines seem to work better here?

Also:

How much RAM is in the client?

Does it work better if you use an unaliased ls command with no options? e.g.:

    $ /bin/ls /mnt/Backup_Filesets_?

Comment 16 Gerry Reno 2012-09-12 19:18:16 UTC
I have been using the same setup for NFS mounting the NAS box since F11 and everything worked normally until I hit F17.  F16, F15 were fine.

RAM:
# cat /proc/meminfo | head -1
MemTotal:        8075568 kB

ls is aliased as 'ls --color=auto'.

Have not seen any other problems at all with ls.

And it's not just ls.  It's Chrome weirdness, ps listings not returning.

.

Comment 17 Jeff Layton 2012-09-12 19:36:24 UTC
(In reply to comment #16)
> I have been using the same setup for NFS mounting the NAS box since F11 and
> everything worked normally until I hit F17.  F16, F15 were fine.
> 

Ok, so this is a regression -- good to know. Is there some reason you haven't updated to 3.5.z?


> RAM:
> # cat /proc/meminfo | head -1
> MemTotal:        8075568 kB
> 

Ok, so quite a bit of RAM. Bruce's theory is quite plausible in this case.


> ls is aliased as 'ls --color=auto'.
> 
> Have not seen any other problems at all with ls.
> 
> And it's not just ls.  It's Chrome weirdness, ps listings not returning.
> 
> .

Yep, that's one of the things you'll need to separate out. "ls --color=auto" means that ls has to first do a readdir() (or similar) to determine what files are in a directory and then stat() each one in order to colorize them properly.

Determining which of those operations is taking the bulk of the time would be helpful. Most likely, it's one of the stat() operations since, as Bruce points out, POSIX requires that any cached data be flushed to the server prior to returning the results of that operation. When a file is continually being written to, that can give you a livelock sort of situation...
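
One quick way to get a feel for where the time goes (just a suggestion) would be to run the ls under strace with per-syscall timings while the backup is running, e.g.:

    $ strace -T -e trace=getdents,getdents64,stat,lstat,newfstatat ls --color=auto /mnt/Backup_Filesets_A/

...or use "strace -c" instead for a summary of time spent per syscall. If nearly all of the time shows up in the stat-family calls, that would point at the flush-on-stat behavior described above.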

Comment 18 Gerry Reno 2012-09-12 19:50:20 UTC
I have upgraded to 3.5.1 but have just not had the opportunity to get back to the NAS box location to test it yet.

I understand the ls situation, but even without ls, lots of things are hanging.

I'll report back once I can test again.

.

Comment 19 Edouard Bourguignon 2012-10-21 10:46:57 UTC
I think I may have the same problem. When I uncompress a file on my NFS mount, iotop shows an 80 KB/s max write rate and 50% to 100% in the IO> column (iowait?) on the client side:

20715 be/4 edouard     0.00 B/s   70.99 K/s  0.00 % 17.83 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s   50.97 K/s  0.00 % 39.00 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s   58.78 K/s  0.00 % 37.35 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s   19.61 K/s  0.00 % 37.81 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s  117.63 K/s  0.00 % 36.78 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s   70.58 K/s  0.00 % 30.27 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s  176.51 K/s  0.00 % 38.20 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s   39.22 K/s  0.00 % 28.67 % tar zxf linux-2.6.32.9-apc.tar.gz
20715 be/4 edouard     0.00 B/s   31.37 K/s  0.00 % 28.69 % tar zxf linux-2.6.32.9-apc.tar.gz

On the server side, the nfsd threads do some I/O, but I see 100% I/O wait on jbd2/md2-8:

Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s    0.00 B/s  0.00 % 58.05 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s  276.69 K/s  0.00 %  0.00 % [nfsd]
Total DISK READ: 5.52 M/s | Total DISK WRITE: 499.34 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s    3.87 K/s  0.00 % 99.99 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s   23.23 K/s  0.00 % 13.14 % [nfsd]
  483 be/3 root        0.00 B/s    7.74 K/s  0.00 %  5.02 % [jbd2/dm-0-8]
Total DISK READ: 6.12 M/s | Total DISK WRITE: 464.52 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s    3.87 K/s  0.00 % 99.99 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s   30.97 K/s  0.00 % 14.22 % [nfsd]
Total DISK READ: 5.23 M/s | Total DISK WRITE: 537.80 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s   33.97 K/s  0.00 % 97.43 % [jbd2/md2-8]
 1708 be/4 root        0.00 B/s   86.80 K/s  0.00 % 21.05 % [nfsd]
Total DISK READ: 6.71 M/s | Total DISK WRITE: 576.86 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s   38.72 K/s  0.00 % 96.89 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s  104.53 K/s  0.00 %  7.19 % [nfsd]
 1708 be/4 root        0.00 B/s   19.36 K/s  0.00 %  5.84 % [nfsd]
Total DISK READ: 6.28 M/s | Total DISK WRITE: 488.07 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s   19.14 K/s  0.00 % 99.99 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s   45.94 K/s  0.00 % 12.44 % [nfsd]
Total DISK READ: 5.60 M/s | Total DISK WRITE: 642.64 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s   11.61 K/s  0.00 % 99.90 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s  147.11 K/s  0.00 % 11.63 % [nfsd]
 1018 be/3 root        0.00 B/s    3.87 K/s  0.00 %  3.28 % [jbd2/dm-2-8]
Total DISK READ: 6.76 M/s | Total DISK WRITE: 543.90 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s   15.48 K/s  0.00 % 95.35 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s  135.49 K/s  0.00 % 11.20 % [nfsd]
  483 be/3 root        0.00 B/s    3.87 K/s  0.00 %  2.11 % [jbd2/dm-0-8]
Total DISK READ: 5.85 M/s | Total DISK WRITE: 509.07 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s    7.74 K/s  0.00 % 99.99 % [jbd2/md2-8]
 1702 be/4 root        0.00 B/s   54.20 K/s  0.00 %  8.87 % [nfsd]
Total DISK READ: 6.52 M/s | Total DISK WRITE: 530.46 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1021 be/3 root        0.00 B/s    7.74 K/s  0.00 % 94.30 % [jbd2/md2-8]
 1704 be/4 root        0.00 B/s   85.18 K/s  0.00 %  7.22 % [nfsd]
 1702 be/4 root        0.00 B/s    7.74 K/s  0.00 %  5.92 % [nfsd]

== NFS Server ==

kernel 2.6.32-279.11.1.el6.x86_64

Here is my exports file:
/srv/stockage/devel	192.168.2.0/24(rw,insecure,fsid=0)

vm.dirty_background_ratio = 10
vm.dirty_background_bytes = 0
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000

A dd inside the exported directory shows ~100 MB/s.
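
(That number comes from a simple sequential write test along these lines; the file name and sizes are just examples:)

# dd if=/dev/zero of=/srv/stockage/devel/ddtest bs=1M count=1024 conv=fdatasync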

== NFS Client ==

kernel 3.6.2-4.fc17.x86_64

On my client, here are my mount options:
stockage:/ on /mnt/stockage type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.2.30,local_lock=none,addr=192.168.2.111)

vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500

Comment 20 Edouard Bourguignon 2012-10-21 12:11:39 UTC
With the async export option the write rate increases from 80 KB/s to 4 MB/s! It's more usable with this option.
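
(For reference, that just means adding async to the export options and re-exporting, roughly like this; note that with async the server acknowledges writes before they reach stable storage, so data can be lost if the server crashes:)

/srv/stockage/devel	192.168.2.0/24(rw,insecure,fsid=0,async)
# exportfs -ra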

Comment 21 J. Bruce Fields 2012-10-22 12:18:10 UTC
(In reply to comment #19)
> I think I may have the same problem.

I see no evidence that this is the same problem.  Please file a separate bug, and if it does turn out to have the same root cause then we'll mark it as a dup later.

Comment 22 Edouard Bourguignon 2012-10-23 12:27:49 UTC
Ok, thanks. Bug #869260 has been opened.

Comment 23 Fedora End Of Life 2013-07-04 01:18:26 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 24 Fedora End Of Life 2013-08-01 05:37:30 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.