Bug 118839 - (IT_53121) RPC: buffer allocation failures for NFS client
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Steve Dickson
Duplicates: 136423
Reported: 2004-03-21 07:58 EST by Tom Sightler
Modified: 2007-11-30 17:07 EST (History)
CC: 30 users

Doc Type: Bug Fix
Last Closed: 2004-12-20 15:54:57 EST
Attachments (Terms of Use)
- /proc/slabinfo output (9.93 KB, text/plain), 2004-03-23 09:44 EST, Brian Smith
- slabinfo from nfs client which locks up with ls -l (4.90 KB, text/plain), 2004-06-24 08:18 EDT, jason andrade
- patch to reduce memory requirements for NFS_ACL responses (513 bytes, patch), 2004-07-29 10:43 EDT, Neil Horman
- sysrq m output (2.39 KB, text/plain), 2004-07-29 16:25 EDT, Brian Smith
- patch to make NFS3_ACL_MAX_ENTRIES configurable (5.94 KB, patch), 2004-08-17 11:02 EDT, Neil Horman
- follow-on patch to add same functionality to nfsd module (2.57 KB, patch), 2004-08-17 13:25 EDT, Neil Horman
- enhancement on prior patch to display module acl option via proc file (11.96 KB, patch), 2004-08-31 14:31 EDT, Neil Horman
- new patch to add nfs_acl_max_entries module option to nfs.o (5.94 KB, patch), 2004-09-01 16:11 EDT, Neil Horman
- follow-on patch to add same functionality to nfsd.o (6.16 KB, patch), 2004-09-01 16:13 EDT, Neil Horman
- same patch with added nfs sysctl (6.40 KB, patch), 2004-09-07 15:32 EDT, Neil Horman
- follow-on nfsd patch for new kernel (4.86 KB, patch), 2004-09-07 15:34 EDT, Neil Horman

Description Tom Sightler 2004-03-21 07:58:16 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; Linux i686; U) Opera 7.22  [en]

Description of problem:
We have several systems experiencing this problem; it appears to be
related to memory pressure.  These systems all run Oracle, but NFS is
used only for home directories.

The original symptom is that a system will hang when doing a simple
'ls -l' on any NFS-mounted directory.  A standard 'ls' and even other
normal file operations seem to work fine.  You can even still umount
and mount the same NFS filesystem or other NFS filesystems, but the
'ls -l' command will always hang.  The only way to kill it is to do a
'kill -9' on the ls process; it will then hang in 'D' state until you
'kill -9' rpciod, at which point the process will release.

Whenever you get the hanging 'ls -l' commands you also get entries
like the following:

RPC: buffer allocation failed for task da8fbca8
RPC: buffer allocation failed for task da8fbca8
RPC: buffer allocation failed for task dc00fca8

This is happening on a 4-way Dell 8450 w/8GB of RAM, a 4-way Dell
6450 w/4GB, and a 2-way 2550 w/4GB RAM.  All of these systems are
under significant memory pressure but generally perform well.  This
problem has appeared since the recent upgrade of these systems from
AS 2.1 to AS 3.

The NFS server is a Dell 2650 running ES 3.

Please let me know what other information I can provide.


Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Mount a filesystem via NFS
2. Let the system run for a while until memory pressure occurs
3. Perform 'ls -l' until it hangs

Actual Results:  Hangs when trying to do 'ls -l' of an NFS-mounted
directory.

Expected Results:  Should list directories as normal.

Additional info:

These systems are connected to a Dell/EMC CX400 disk array.
Comment 1 Tom Sightler 2004-03-22 09:36:07 EST
This issue occurred again today on our production Oracle box.  That
system can't seem to make it more than 24 hours with a working NFS.

The hang today was worse than previous ones; pretty much any access
to the NFS share was completely broken.  The 8450 with 8GB of RAM
seems to have many more problems than the other systems with only
4GB.  Is this likely to be a low-mem issue?  Would running the
hugemem kernel be an improvement even though the system has only 8GB
of RAM?

Comment 2 Rik van Riel 2004-03-22 09:43:04 EST
If it's a VM problem with allocating contiguous buffers, then yes
switching to the hugemem kernel may help since the amount of available
kernel memory will increase by about 400%.

Assigning to SteveD in case it's an NFS problem anyway ;)
Comment 3 Brian Smith 2004-03-22 20:34:40 EST
Yow.  This happened once, was hoping it was a fluke...but twice now.

I'm also seeing this on a lightly loaded backup server, that doesn't
do a ton of NFS:

|eowyn# cp -a web web2
|eowyn# ls -l web web2
|total 2384
|-rw-r--r-- 1 brian support 2435809 Mar 18 18:28 apache_1.3.29.tar.gz
|-rw-r--r-- 1 brian support 4630171 Mar 22 20:32 php-4.3.4.tar.gz
|total 2384
|-rw------- 1 brian support 2435809 Mar 18 18:28 apache_1.3.29.tar.gz

This hangs, but can be interrupted.  Now that the machine is having
problems, there are some interesting things.  dmesg reports:
  RPC: buffer allocation failed for task d1fe5ca
for each attempt (with a different hex code each time).  I can ls -R
the whole dir, but try one ls -l and then it hangs again.  I can
rm -r it as well.

I have a machine in this state now, for which I could create a login
for a tech; I'll probably have to reboot the machine tomorrow before
an upgrade to RHEL 3 on my other servers.

More info:
NFS server: Red Hat 9 2.4.20-30.9smp (upgrading tomorrow to AS)
Client: Single processor Xeon, 1GB RAM 2.4.21-9.0.1.ELsmp

I cannot recall if I was seeing this with the old kernel.
Comment 4 Brian Smith 2004-03-22 20:46:31 EST
Ok.  This is creepy:
 cp -r web h
works fine.
 cp -rp web i
hangs.  dmesg shows:
 RPC: buffer allocation failed for task f2b51ca8
but the message only appears when you ^c the cp.
Comment 5 Brian Smith 2004-03-22 21:51:54 EST
Last one, I promise.  I was thinking kernel as well, but I note that
with the cp -a or -rp commands I have done, the owner and perms have
not been changed when I interrupt them.

So I tried some chown, chown -R, chmod, and chmod -R.  All worked
fine, verified with ls -l on another system.

I ran the stat command in a foreach loop with every individual format
option.  It did not hang on any of them.

I have no idea what this means, but I downloaded and compiled the
fileutils 4.1 package, and bingo, ls -l and cp -a work.  (I did have
to run configure on another RHEL 3 AS system, due to config.status
hanging on creating the makefiles.  But the compile ran on the
affected machine.)

That's the limit of my mojo tonight.
Comment 6 Rik van Riel 2004-03-23 08:40:13 EST
I suspect you'll be able to find more info in /proc/slabinfo ;)

In particular, after the first copy the dentry and inode caches should
still be at a more or less reasonable size, but after the second copy
they are probably really really big.

I have a suspicion on what your problem could be: if there is enough
memory free, we don't reclaim slab cache memory to fulfill higher
order memory allocations, but only normal user/cache pages. 

If the slab cache is really big, maybe we need to reclaim slab and
buffer headers from the defragmentation routine that's used when
higher order memory allocations can't be immediately satisfied...

Larry, does this make sense ?
Comment 7 Brian Smith 2004-03-23 09:44:35 EST
Created attachment 98780 [details]
/proc/slabinfo output

FYI: Attached is the /proc/slabinfo output of the affected machine, before and
during an ls -l.
Comment 8 Brian Smith 2004-03-24 09:52:07 EST
This may help: an strace of 'ls -l to_do' and of 'cp -p to_do a' both
hang at:
 getxattr("to_do", "system.posix_acl_access"
and print <unfinished ...> when interrupted with ^C.
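A hedged way to confirm that a stall like this is specific to the ACL
getxattr path (getfattr here comes from the attr package, and the path
is a placeholder for an affected NFS file, not one from this report):

```
# Query the exact attribute that strace shows ls -l blocking on:
getfattr -n system.posix_acl_access /path/on/nfs/to_do

# For comparison, a plain stat of the same file goes through GETATTR,
# not the ACL path, so it should return promptly:
stat /path/on/nfs/to_do
```

If the first command hangs while the second returns, the stall is in
the ACL/getxattr path rather than in ordinary NFS lookups.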
Comment 9 Eric Swenson 2004-04-23 17:36:52 EDT
Hi, the tech at redhat directed me to this bug report, which appears
to be the exact same behavior that I'm seeing; namely, 'ls' works fine
on my nfs-mounted home directories, but 'ls -l' just hangs, with a
corresponding "kernel: RPC: buffer allocation failed for task d4dc3ca8".

'ls -l' seems to be working fine on local directories.  'cp -rp'
exhibits the same hanging behavior that Brian mentioned, with that RPC
error popping up when you eventually ctrl-c it.

NFS server is RHEL ES 3.0, Dell PE4600 with 4GB ram and HT on.
NFS client is also RHEL ES 3.0, single-cpu Dell dimension with 1GB
ram, and kernel 2.4.21-9.0.1.ELsmp.  I'm trying to update the kernel
on the nfs client box to see if that will fix anything.
Comment 10 Brian Smith 2004-04-28 15:41:15 EDT
I upgraded to 2.4.21-9.0.3.ELsmp and this is still happening, after
about 5 days of uptime.  It doesn't seem to happen with a non-smp
kernel, as I ran one of those for 20 days without problems.
Comment 11 Eric Swenson 2004-04-28 18:36:42 EDT
After upgrading to 2.4.21-14.ELsmp (from the RHEL 3.0 beta channel), I
haven't yet seen the problem, but this is my secondary server and
isn't highly stressed.  I'll keep an eye on it, especially if this is
something that disappears for a while after rebooting (such as for
upgrading the kernel).  If it does work, that kernel is going on my
primary server.
Comment 12 Tom Sightler 2004-04-28 21:53:23 EDT
I also haven't been able to reproduce this problem since upgrading
all but two of my systems that were having it to 2.4.21-12.ELsmp
several weeks ago, even on the servers that regularly showed the
problem within only a few hours on 2.4.21-9.0.1.
Comment 13 Brian Smith 2004-04-28 22:08:27 EDT
I'll give that a shot, then!
Comment 14 Brian Smith 2004-05-07 14:18:25 EDT
Unfortunately, vmlinuz-2.4.21-14.ELsmp produced the error after about
5 days.  Back to 2.4.21-9.0.3.EL.img.
Comment 15 Brian Smith 2004-05-30 12:01:05 EDT
2.4.21-15.ELsmp is producing the error after about five days, now not
just on my backup server but on the web server as well.  Marvelous.

On the backup server, if I run the non-smp version the behavior does
not manifest (9 days and counting).  Both of these machines are
dual-Xeon motherboards.  The backup server has one hyperthreading
processor; the web server has two non-HT Xeons.  Another server I
have on the same motherboard as the backup server has two HT Xeons in
it and has not shown the problem.  Why does this seem to happen with
two procs on the smp kernel?

Any way to turn off ACLs on NFS shares?
Comment 16 Tom Sightler 2004-05-30 23:12:19 EDT
According to "man exports" you can add "no_acl" to the export line to
mask off ACL permissions on NFS shares.

Also, I think I've discovered that my problem went away when I
actually mounted the filesystem that I was exporting with the "acl"
option.  In other words, I was exporting the directory /mnt/misfiles,
a standard ext3 filesystem, but it was not mounted with ACL support.
Other systems accessing this filesystem via NFS would generate the
RPC errors on occasion.  However, once I modified the /mnt/misfiles
ext3 filesystem on the NFS server to enable ACL support, the problem
seems to have gone away on the clients.

I haven't seen this issue in months on my machines.  Maybe just luck,
but it was happening regularly before I made the changes to enable
ACLs on the underlying filesystems that were being exported.
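
The two knobs mentioned above would look roughly like this (the
device and hostname are hypothetical examples; only /mnt/misfiles is
taken from the report):

```
# /etc/exports on the NFS server: mask ACLs from clients entirely
/mnt/misfiles  client1.example.com(rw,no_acl)

# /etc/fstab on the NFS server: mount the exported ext3 filesystem
# with ACL support enabled (the change described above)
/dev/sdb1  /mnt/misfiles  ext3  defaults,acl  1 2
```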

Comment 17 Bill Soudan 2004-06-03 14:57:53 EDT
I'm having this issue too.  The box is a development server that 
supports a few developers and serves up an SVN repository and some 
development web pages/cgis.  Our home directories are NFS mounted 
from a Solaris box via the automounter, which in turn reads its 
configuration from an NIS domain hosted by a different Solaris box. 
Every few weeks, I'll run an 'ls -al' command and it will hang.
Nothing is dumped to the syslog until you try to kill the process, at
which point the syslog is spammed with the following:

Jun  3 13:54:34 roclindev1 kernel: RPC: buffer allocation failed for task c8b31ca8
Jun  3 13:54:54 roclindev1 last message repeated 331 times

Once a particular path is hung, any attempt to access it will hang,
even a plain ls.  However, other paths on the same NFS mount will be
fine.  In addition, attempting an strace on the hung process will
result in a hung strace.

I swear I was able to recover one time through a combination of
'kill -9 pid', 'umount -f /affected/mount', 'umount -l
/affected/mount', 'mount /affected/mount', and '/etc/init.d/autofs
restart'.  But I haven't been able to do that since; now I don't
waste my time, I just reboot the box.  I've spent a few hours
researching the problem on the web to the best of my ability, but
I've never been very good at NFS troubleshooting.  If I can provide
any more information, please let me know.
Comment 18 Brian Smith 2004-06-07 10:05:19 EDT
Sadly, adding acl to the mounts on the servers didn't fix the
problem, and combinations of no_acl don't seem to matter.  Thus,
maybe it's a kernel memory issue.

Before 2.4.21-15.ELsmp I was seeing this only on a machine with an
Intel SE7501BR2 7501-chipset motherboard; now I'm seeing it on Tyan
S2720GN 7500-chipset boards.

Interestingly, with 2.4.21-15.ELsmp on the initial machine the
problem seems to come and go now.  ls -l hangs, then I ^C it, and
sometime later for some unknown reason it works.
Comment 19 Bill Soudan 2004-06-07 11:41:33 EDT
Some more data:

It happened again just now.  I was doing an 'mv -f ' on a large
(581MB) backup file to an NFS mount and noticed it was taking longer
than expected.  Here are exactly the steps I followed:

1) log into the linux box
2) ls on the backup directory -- this worked fine
3) ls -al on the backup directory -- hang
4) log in again
5) ls on the backup directory -- hang

I'll see if I can reproduce tonight once no one needs the box any
longer -- this is the quickest the hang has happened to me after a
reboot.  Maybe the large file triggers something?
Comment 22 jason andrade 2004-06-24 07:47:46 EDT
i see exactly the same problem.  2.4.21-15ELsmp on both client and
server.  it's extremely annoying.  it also only seems to do it on 'high
load' clients.  on one client ls -l works fine on a directory.  on
another which has heavy network (nfs) load it doesn't.

Comment 23 jason andrade 2004-06-24 07:50:52 EDT
some additional information - we do a reasonable amount of NFS,
approximately 3-4Tbyte/day over nfs (the joys of a popular internet
archive).  i have the following in my sysctl.conf (if it makes any
difference):

# Performance tuning to increase fragment buffer memory
#net.ipv4.ipfrag_high_thresh = 4194304
#net.ipv4.ipfrag_low_thresh = 1048576
# Performance tuning to increase tcp read/write buffers
#net.ipv4.tcp_rmem = 4096 349520 1048576
#net.ipv4.tcp_wmem = 4096 131072 1048576

this bug is very reproducible and affects every nfs client we have
that does any kind of load. 

Comment 24 jason andrade 2004-06-24 08:18:37 EDT
Created attachment 101372 [details]
slabinfo from nfs client which locks up with ls -l

this is from an nfs client running 2.4.21-15.ELsmp.
Comment 27 jason andrade 2004-06-24 08:24:44 EDT
with some further testing, the bug is strange.

ftpd isn't affected (proftpd's internal ls doesn't appear to do an
ls -l).
httpd (apache 1.3.29) isn't affected; it seems able to browse
directory trees without issues.
rsyncd (2.6.2) isn't affected; i can rsync-list directories in long
form remotely (rsyncd) as well as over ssh.

so really i can only reproduce this by using the command line, i.e.
going to an NFS client and trying to ls -l in any nfs-mounted
directory.

this is particularly annoying for home directories of course..

other info - all the clients are mounted RO. except for one which is
RW (and interestingly it doesn't have the ls problems).  i suspect
this is because it doesn't do that much reading across nfs, just
relatively small (10-20G/day) writes (compared with 2000-3000G/day
reads on clients).
Comment 28 Juanjo Villaplana 2004-06-25 07:40:14 EDT

I have noticed the same problem.

The server is an L200 with 2GB RAM and the client is an N400 with 4GB
RAM. We are running RHEL 3 Update 2 (2.4.21-15.0.2.ELsmp) on both systems.

I have noticed that running (on the client) a program that
allocates/touches/frees a considerable amount of memory (1GB for
example), and hence shrinking buffers & cache, makes "ls -l" work
again for some time.  Maybe this helps to diagnose the problem; I can
attach some slabinfo if needed.

Comment 29 Tom Sightler 2004-06-25 09:37:48 EDT
Has anyone opened an official support case with Red Hat on this
issue?  This seems to be a pretty big issue which is starting to show
up for quite a few users; perhaps opening an official support case
would get this bug some attention.

I would do it, but even though I originally reported this bug, I'm
actually no longer having the issue, so I wouldn't be a good choice
to pursue a support case.

Comment 30 Neil Horman 2004-06-29 14:15:32 EDT
The issue has been opened with Red Hat support.

I've taken a brief look through the code.  It looks to me like the
problem comes down to two things:

1) RPC in Linux reserves response buffers before the RPC call is made.
2) The size of the buffer that Linux has to reserve for getacl
requests.

Since we reserve response buffers before we make an RPC call, and
since we have no way of knowing the maximum size of a response to a
getacl call (which is required for getxattr requests on systems that
support ACLs), we have to assume a worst-case scenario.  For the
getacl call this requires quite a bit of space (since 1024 ACE
objects are supported per response).  This means we need to make at
least an order-2 allocation for every getacl request sent.  I would
imagine that systems under a high level of memory pressure with
significant fragmentation would not be able to satisfy this request.
I believe the RPC code, in an effort to prevent all NFS activity from
blocking, simply fails the request rather than putting the calling
process to sleep.

As a side note, if you find yourself in this situation, it may be
helpful to mount your NFS shares with an rsize and a wsize of 4096.
This will decrease your performance somewhat, but doing so should
relieve some of the system's demand for order-2 allocations and may
prevent these errors from occurring.
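The worst-case sizing described above can be sketched with some
back-of-envelope arithmetic; the 12-bytes-per-ACE figure and the one
page of RPC/XDR header slack are illustrative assumptions, not the
kernel's exact constants:

```shell
#!/bin/sh
# Rough worst-case sizing for an NFSv3 GETACL reply buffer.
# Assumptions (for illustration only): 1024 ACL entries maximum,
# about 12 bytes of XDR per entry, plus one 4 KiB page of header slack.
ACES=1024
BYTES=$((ACES * 12 + 4096))          # worst-case reply buffer in bytes
PAGES=$(( (BYTES + 4095) / 4096 ))   # 4 KiB pages needed
ORDER=0; SPAN=1
while [ "$SPAN" -lt "$PAGES" ]; do   # smallest power-of-two page span
    ORDER=$((ORDER + 1))
    SPAN=$((SPAN * 2))
done
echo "reply buffer: $BYTES bytes, $PAGES pages, allocation order $ORDER"
# prints: reply buffer: 16384 bytes, 4 pages, allocation order 2
```

An order-2 allocation means four physically contiguous pages; under
heavy fragmentation the buddy allocator can fail such a request even
when plenty of individual free pages exist, which matches the symptom
reported here.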

Comment 34 Stephen Balukoff 2004-07-01 16:46:20 EDT
Just a ditto of the above.  I'm seeing the same problems on our
Oracle server, which does a trivial amount of NFS per day (<5GB).
Running kernel 2.4.21-9.0.1.ELsmp; I tried reducing rsize and wsize
to 4096 and still had the problem.  Only 'ls -l' appears to be
affected.
Comment 35 Sameh Attia 2004-07-05 08:59:30 EDT
We are running 15.0.3.ELsmp with NFS against a NAS server from Dell
running Win2k3 and we suffer the same symptoms.  The ls lockups
happen irregularly.

We are going to try another distro to check whether this is a RHEL
problem.
Comment 36 Brian Smith 2004-07-12 15:13:12 EDT
So far the only thing the new kernels have done for me is:

1) make more systems experience the bug
2) make the interval of useful uptime less

Has anyone seen this with a non-smp kernel?
Comment 37 Tom Sightler 2004-07-12 15:32:45 EDT
I have seen this issue on non-smp, but only on one server.  Once
again it was a server that was under significant memory pressure.
Actually, adding more memory seemed to resolve the issue.

Another thing I noted while trying to figure out why I no longer see
this problem while others still do: I remember that I made a change
to inactive_clean_percent to return it to the behavior of
2.4.21-4.EL, which I think was the original kernel.  I did this to
resolve another issue where the kernel goes way too far into swap
instead of reclaiming memory; however, maybe the extra memory reclaim
gives NFS a better chance of completing its order-2 allocations.  I
added the following to my sysctl.conf to make this change:

vm.inactive_clean_percent = 100

It probably doesn't have anything to do with it, but since I haven't
seen this issue in months, and this is one of the changes I made
during those months, I thought I'd mention it.
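
For reference, a tunable like this can be set persistently or applied
at runtime (the sysctl name is as given above; this knob belongs to
the RHEL 3 era 2.4 kernels discussed here, not to later kernels):

```
# /etc/sysctl.conf -- persistent across reboots
vm.inactive_clean_percent = 100

# or apply immediately at runtime:
#   sysctl -w vm.inactive_clean_percent=100
#   echo 100 > /proc/sys/vm/inactive_clean_percent
```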

Comment 38 Neil Horman 2004-07-12 16:07:38 EDT
You probably shouldn't set your inactive_clean_percent to 100.  That
implies that any time you have dirty memory pages, the VM is going to
try to clean them.  That's a lot of unneeded overhead to put on your
VM, and it might in fact make your system quite unresponsive.  It
would of course help increase the number of order-2 allocations you
had available, but at a pretty terrible cost.
Comment 39 Tom Sightler 2004-07-12 18:33:21 EDT
Actually, as far as I can tell, on any system under memory pressure
from applications doing lots of IO, not having this set causes the
system load to be much higher, because the system fails to reclaim
memory aggressively enough to keep it from going heavily into swap,
and using swap is a much higher cost than using CPU cycles to reclaim
memory.

In my experience, and that of many others if you search Bugzilla or
the Taroon or Oracle mailing lists, this setting actually has a
positive effect.  I have one server here where it is VERY noticeable.

Still, you don't have to set it to 100 (even though that was the
behaviour of the original RHEL 3 kernel); you can set it to 75 or 50
and see if it helps.  I'm just trying to offer some suggestions since
so many people are having this issue and there doesn't seem to be any
progress on fixing it.  If this setting kills performance then they
can change it back, but on the 15 servers I have, the performance
impact is negligible, and on a few memory-constrained servers it
actually improved performance by making the system use less swap.
Even if it has a small impact on performance, some people may be
willing to trade a few percent of their CPU cycles for a working NFS
client (assuming that actually helps with the problem).

Comment 40 Rik van Riel 2004-07-12 20:52:31 EDT
Neil, while setting inactive_clean_percent to 100 could result in bad
disk IO when doing swapping, it should be pretty harmless when doing
mmap()d IO over NFS.

In fact, starting the write out earlier should guarantee that there is
more free memory available - making it easier to allocate the RPC
buffers and reducing the chance of an allocation failure.

This is a special case, though. For normal disk bound systems it's
probably best to leave inactive_clean_percent at a lower value...
Comment 41 Brian Smith 2004-07-12 20:58:16 EDT
I've set it on my backup server (where performance won't matter a
whole lot) to see if it still hits the problem.  I'll report back in
5 days or so, by which time it would normally have hit it.

So what is the value set to in the current kernels, and what might be
a good value to avoid this problem, if it works?
Comment 42 Neil Horman 2004-07-13 08:08:59 EDT
I believe the default on the -15 kernels is 30.
Comment 43 Brad Dickerson 2004-07-16 10:36:12 EDT
We are running Red Hat Enterprise Linux ES release 3 with kernel
2.4.21-9.ELsmp, and once every day or two the 'mv' command hangs when
moving a file from a local directory to an NFS-mounted directory.
This is accompanied by repeated errors in the messages file of:

kernel: RPC: buffer allocation failed for task e0c43cb4

for different tasks.  The only fix we have found is to reboot.
Comment 44 Geoff Dolman 2004-07-17 10:13:40 EDT
Running kernel-2.4.21-15.0.3.EL and 2.4.21-15.0.2.ELsmp, I am getting
this exact same issue with Dell 2650s.  It is really, really annoying
and makes the machines virtually unusable at times.  The machines in
question are under fairly heavy CPU load from statistical analysis
programs, but not heavy NFS load.  I can't reboot without the stats
jobs failing, so this isn't an option, as they run for weeks at a
time.  Things that hang in these circumstances include vi, vim, rm,
and ls.  It's also happened on machines that are not heavily loaded.
I never got this problem with Red Hat 9.  If anyone got a fix from
Red Hat support, could they please mail it to me urgently?  Cheers
Comment 45 Brian Smith 2004-07-19 13:54:16 EDT
FYI:  Though it seemed to last a day longer, the backup server with
vm.inactive_clean_percent = 100 did experience the problem again. :(
Comment 46 George 2004-07-21 10:20:02 EDT
same problem here: 2GB of memory with kernel-smp-2.4.21-15.0.3.EL.
Typical paging and disk I/O info follows:

11:30:03 AM  pgpgin/s pgpgout/s  activepg  inadtypg  inaclnpg  inatarpg
11:40:03 AM    242.07   1659.09    235921    195099      7315     97402
11:50:01 AM      5.45    759.29    235434    196169      8065     98205
12:00:00 PM      0.52   1522.38    213007    213161      7239     98026
12:10:00 PM    215.22   1449.89    236690    196093      8192     98322
12:20:00 PM    413.29   1162.54    240586    192554      7018     98195
12:30:04 PM     28.32   1307.60    214242    210747      7385     97660
12:40:00 PM    424.62   1443.24    241216    192619      7463     98339
12:50:00 PM     12.09    928.98    228875    201079      7220     98179
01:00:01 PM    132.47   1861.38    221040    212119      4645     96681
01:10:03 PM    386.30   1753.47    247961    179431      6705     94704
01:20:03 PM     48.78    790.09    245355    173183      6429     92946
01:30:00 PM     99.52   1086.99    251283    180604     21462     95238

11:30:03 AM       DEV       tps    sect/s
11:40:03 AM    dev8-0    134.28   3802.32
11:50:01 AM    dev8-0     58.85   1529.47
12:00:00 PM    dev8-0    120.93   3045.80
12:10:00 PM    dev8-0    120.58   3330.22
12:20:00 PM    dev8-0    127.76   3151.68
12:30:04 PM    dev8-0    102.59   2671.85
12:40:00 PM    dev8-0    134.42   3735.71
12:50:00 PM    dev8-0     65.35   1882.15
01:00:01 PM    dev8-0    166.50   3987.71
01:10:03 PM    dev8-0    159.99   4279.54
01:20:03 PM    dev8-0     70.15   1677.75
01:30:00 PM    dev8-0     98.44   2373.01
Comment 47 George 2004-07-22 09:21:09 EDT
It happened again in less than 24 hours.

 13:20:36  up 23:15,  5 users,  load average: 5.04, 5.00, 3.98
119 processes: 118 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.4%    0.0%    1.4%   0.0%     0.0%    0.0%   98.0%
           cpu00    0.9%    0.0%    1.9%   0.0%     0.0%    0.0%   97.0%
           cpu01    0.0%    0.0%    0.9%   0.0%     0.0%    0.0%   99.0%
Mem:  2061636k av, 1934868k used,  126768k free,       0k shrd,   48852k buff
                   1046136k actv,  469100k in_d,   28304k in_c
Swap: 2096440k av,    2240k used, 2094200k free                 1736384k cached

[root@log]# cat /proc/slabinfo
slabinfo - version: 1.1 (SMP)
kmem_cache            96     96    244    6    6    1 : 1008  252
nfs_write_data        40     40    384    4    4    1 :  496  124
nfs_read_data        360    360    384   36   36    1 :  496  124
nfs_page            1020   1020    128   34   34    1 : 1008  252
ip_fib_hash           16    224     32    2    2    1 : 1008  252
ip_conntrack         260    260    384   26   26    1 :  496  124
urb_priv               0      0     64    0    0    1 : 1008  252
ext3_xattr             0      0     44    0    0    1 : 1008  252
journal_head        1241  13321     48   45  173    1 : 1008  252
revoke_table           5    250     12    1    1    1 : 1008  252
revoke_record        224    224     32    2    2    1 : 1008  252
clip_arp_cache         0      0    256    0    0    1 : 1008  252
ip_mrt_cache           0      0    128    0    0    1 : 1008  252
tcp_tw_bucket         23     30    128    1    1    1 : 1008  252
tcp_bind_bucket      224    224     32    2    2    1 : 1008  252
tcp_open_request      30     30    128    1    1    1 : 1008  252
inet_peer_cache        3    116     64    2    2    1 : 1008  252
secpath_cache          0      0    128    0    0    1 : 1008  252
xfrm_dst_cache         0      0    256    0    0    1 : 1008  252
ip_dst_cache         225    225    256   15   15    1 : 1008  252
arp_cache             30     30    256    2    2    1 : 1008  252
flow_cache             0      0    128    0    0    1 : 1008  252
blkdev_requests     3072   3360    128  112  112    1 : 1008  252
kioctx                 0      0    128    0    0    1 : 1008  252
kiocb                  0      0    128    0    0    1 : 1008  252
dnotify_cache          0      0     20    0    0    1 : 1008  252
file_lock_cache      120    120     96    3    3    1 : 1008  252
async_poll_table       0      0    140    0    0    1 : 1008  252
fasync_cache           0      0     16    0    0    1 : 1008  252
uid_cache            224    224     32    2    2    1 : 1008  252
skbuff_head_cache   1495   1495    168   65   65    1 : 1008  252
sock                 290    290   1408   58   58    2 :  240   60
sigqueue             261    261    132    9    9    1 : 1008  252
kiobuf                 0      0    128    0    0    1 : 1008  252
cdev_cache           269    290     64    5    5    1 : 1008  252
bdev_cache             7    116     64    2    2    1 : 1008  252
mnt_cache             27    116     64    2    2    1 : 1008  252
inode_cache         3205   4823    512  689  689    1 :  496  124
dentry_cache        1469   2550    128   85   85    1 : 1008  252
dquot                  0      0    128    0    0    1 : 1008  252
filp                1453   1470    128   49   49    1 : 1008  252
names_cache           44     44   4096   44   44    1 :  240   60
buffer_head       254395 273980    108 7827 7828    1 : 1008  252
mm_struct            250    250    384   25   25    1 :  496  124
vm_area_struct      4536   4536     68   81   81    1 : 1008  252
fs_cache             406    406     64    7    7    1 : 1008  252
files_cache          210    210    512   30   30    1 :  496  124
signal_cache         580    580     64   10   10    1 : 1008  252
sighand_cache        178    180   1408   36   36    2 :  240   60
pte_chain           5494  12150    128  284  405    1 : 1008  252
pae_pgd              406    406     64    7    7    1 : 1008  252
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            0      0 131072    0    0   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      0  32768    0    0    8 :    0    0
size-16384(DMA)        1      1  16384    1    1    4 :    0    0
size-16384            22     23  16384   22   23    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              6      8   8192    6    8    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :  240   60
size-4096            662    722   4096  662  722    1 :  240   60
size-2048(DMA)         0      0   2048    0    0    1 :  240   60
size-2048            350    350   2048  175  175    1 :  240   60
size-1024(DMA)         0      0   1024    0    0    1 :  496  124
size-1024            100    100   1024   25   25    1 :  496  124
size-512(DMA)          0      0    512    0    0    1 :  496  124
size-512             576    576    512   72   72    1 :  496  124
size-256(DMA)          0      0    256    0    0    1 : 1008  252
size-256            1080   1080    256   72   72    1 : 1008  252
size-128(DMA)          1     30    128    1    1    1 : 1008  252
size-128            2277   2400    128   80   80    1 : 1008  252
size-64(DMA)           0      0    128    0    0    1 : 1008  252
size-64              660    660    128   22   22    1 : 1008  252
size-32(DMA)          17     58     64    1    1    1 : 1008  252
size-32              883   1044     64   18   18    1 : 1008  252
Comment 48 George 2004-07-22 09:23:10 EDT
vm.inactive_clean_percent = 100 does not help 
Comment 49 Brad Dickerson 2004-07-23 11:41:46 EDT
We are running Red Hat Enterprise Linux ES release 3 with kernel
2.4.21-9.ELsmp, and once every day or two the 'mv' command hangs
when moving a file from a local directory to an NFS-mounted directory.
Currently, we just had a vi session hang accompanied by errors in the
/var/log/messages like
RPC: buffer allocation failed for task
The hanging of the NFS-mounted filesystem is affecting our
production. Can someone give me a time frame for when this bug may be
fixed? That way, we can decide whether it's worth the effort to find a
workaround. Thanks.
Comment 50 Brad Dickerson 2004-07-27 13:39:53 EDT
Would replacing the nfs system that comes with Red Hat Enterprise ES
with the nfs system that comes with Red Hat Enterprise 2.1 avoid the bug?
Comment 51 Neil Horman 2004-07-27 14:00:30 EDT
That's a pretty tall order.  Have you all tried using the hugemem
kernel?  It's not exactly a fix for the problem, but it will certainly
avoid the problem by potentially quadrupling the amount of lowmem that
you have to work with (assuming that you have 4GB of ram in these
Comment 52 Rex Dieter 2004-07-27 14:10:29 EDT
> Have you tried the hugemem kernel? ...
> assuming that you have 4GB...

What happens if one installs hugemem on a box with less than 4GB? 
Would that be bad?
Comment 53 Neil Horman 2004-07-27 14:25:40 EDT
Nope, nothing wrong with it, but it won't maximize the advantage that
the hugemem configuration provides.  Non-hugemem kernels have 1GB of
kernel address space (lowmem), while hugemem kernels have 4GB.  So if
you have less than 1GB of total RAM, hugemem isn't helpful.  From 1GB
to 4GB of RAM the advantage scales, by which I mean a system with 2GB
of RAM will have 2GB of lowmem, a system with 3GB of RAM has 3GB
of lowmem, up to 4GB, after which you're back into adding highmem.
Since RPC allocates kernel memory for response buffers, it uses
lowmem.  More lowmem means more memory for RPC to allocate if it needs it.
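The scaling rule in the comment above can be sketched with a tiny model (an illustrative simplification using only the 1GB/4GB caps quoted here, not anything derived from kernel source; `lowmem_gb` is a hypothetical helper name):

```python
def lowmem_gb(ram_gb, hugemem=False):
    """Approximate lowmem per the rule of thumb above: kernel address
    space caps lowmem at 1GB on standard kernels, 4GB on hugemem."""
    cap = 4 if hugemem else 1
    return min(ram_gb, cap)

# A 2GB box doubles its lowmem by moving to hugemem; past 4GB of RAM
# the additional memory is highmem on either kernel.
for ram_gb in (1, 2, 3, 8):
    print(ram_gb, lowmem_gb(ram_gb), lowmem_gb(ram_gb, hugemem=True))
```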
Comment 54 Chris Worley 2004-07-27 14:29:36 EDT
The above solution:

  you can 'kill -9 rpciod' and the process will release

Is true, but the system is hosed.  I cannot further nfs mount anything.

Not a Dell box, not running Oracle... just RHEL3 with the -15 kernel.

The client is mounting NFS partitions that are GFS file systems on the
Comment 55 Neil Horman 2004-07-27 14:32:50 EDT
Yeah, don't do that.  Killing your rpc/nfs tasks won't lead to
anything good.  This is a memory problem.  The best solution right now
that I can think of is moving to the hugemem kernel (at least for
those systems that have > 1GB of RAM).
Comment 56 Alexander Pertsemlidis 2004-07-27 15:18:25 EDT
Moving to hugemem only delays the inevitable.
Comment 57 Larry Woodman 2004-07-27 15:26:38 EDT
Can someone get "AltSysrq M" outputs for both smp and hugemem kernels
when this problem is happening so we can figure out if this is a
memory fragmentation issue or lowmem exhaustion issue?

Thanks, Larry Woodman
Comment 58 Jason Baron 2004-07-27 16:03:03 EDT
I think we need to backport the mempool_alloc infrastructure from 2.6;
that should solve this issue.
Comment 59 Kostas Georgiou 2004-07-27 16:49:53 EDT
I just got the error message on a machine (dual Xeon, 2GB) that was moved recently to
RHEL3 from 7.3. It managed to survive without a reboot for about a year with about 20
users running remote X sessions (Exceed) and heavy computational jobs (often
needing more than 1GB of memory). After the rebuild it was only used by one user for
about two weeks before nfs failed.

Until the problem is fixed (backporting mempool_alloc, whatever), what options are
there other than running the hugemem kernel?

Will it help to lower vm.max_map_count? It is listed as a possible solution when you run
out of lowmem pages in a Red Hat document. Is there a way to check how many VMAs are
used by each process, to see if it's going to cause problems?
Comment 60 George 2004-07-29 09:42:33 EDT
With the hugemem kernel, our system still behaves normally after 1 day
(24 hours). It acts up within one night with other kernels.
Comment 61 Neil Horman 2004-07-29 10:32:26 EDT
That's going to be the best solution, at least for now, if you
absolutely can't move to hugemem.  Alternatively, it may help to
reduce the NFS_ACL_MAX_ENTRIES definition from its current setting of
1024 to something smaller.  At a value of 1024 it implies an order-2
allocation (~12KB) for each NFS_ACL request, at 512 it requires an
order-1 allocation, and at 256 it requires an order-0 allocation.
This should help alleviate the need for large contiguous buffer
allocations in the RPC layer for NFS_ACL requests.
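The buffer-size arithmetic in the comment above can be sketched as follows (a rough model only: it assumes 4KB pages and the roughly 12-bytes-per-entry figure implied by the "1024 entries, ~12KB" numbers, not the exact XDR layout; `alloc_order` is a hypothetical helper name):

```python
def alloc_order(entries, bytes_per_entry=12, page_size=4096):
    """Buddy-allocator order (smallest power-of-two page count) needed
    to hold a contiguous NFS_ACL response buffer for `entries` ACEs."""
    pages = -(-(entries * bytes_per_entry) // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order

# 1024 entries -> ~12KB buffer -> order-2 (16KB) allocation;
# 512 -> order 1; 256 -> order 0, matching the figures above.
for n in (1024, 512, 256):
    print(n, alloc_order(n))
```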
Comment 62 Neil Horman 2004-07-29 10:43:38 EDT
Created attachment 102284 [details]
patch to reduce memory requirements for NFS_ACL responses

If someone wants to give it a try, here's a patch I think might help.  I haven't
been able to determine whether reducing the number of ACE objects we support in a
single response message violates any germane standards or RFCs, but it seems
at the least it should relieve some of the memory demand that NFS_ACL has.
Comment 63 Brian Smith 2004-07-29 10:52:17 EDT
Hugemem is an OK idea, but since this problem appears mainly on
clients, and most of my clients have only 1GB of memory, it's probably
not going to help the small ones.

>Can someone get "AltSysrq M" outputs for both smp and hugemem kernels

I can do it for smp, how exactly does one do this?

>patch to reduce memory requirements for NFS_ACL responses

Sure, I'll attempt a new kernel on my backup server.
Comment 64 Chris Worley 2004-07-29 11:00:16 EDT
Would lowering NFS_ACL_MAX_ENTRIES approximate the "noacl" client
mount option and cause the NFS servers to thrash (and slow NFS file
Comment 65 Neil Horman 2004-07-29 13:31:54 EDT
>I can do it for smp, how exactly does one do this?
echo 1 > /proc/sys/kernel/sysrq
press alt-printscreen-m

>Would lowering NFS_ACL_MAX_ENTRIES approximate the "noacl"
Performance will probably be degraded slightly, although I can't put a
number on how much.  Certainly it will be better than the performance
you get when you receive the out of memory errors documented above.
Comment 66 Brian Smith 2004-07-29 16:25:28 EDT
Created attachment 102301 [details]
sysrq m output

Okay, here's the sysrq m output from two machines with similar setups.  One that
is currently experiencing the bug, and another that will in the future.

They are both running 2.4.21-15.0.3.ELsmp.  I don't have any hugemem kernels
running at the moment.
Comment 67 Steve Dickson 2004-07-30 09:23:16 EDT
*** Bug 127830 has been marked as a duplicate of this bug. ***
Comment 68 George 2004-08-02 11:50:45 EDT
My system is still functioning normally with the hugemem kernel after
5 days. The physical memory size is 2GB.
Comment 69 Neil Horman 2004-08-02 12:16:52 EDT
That's good to hear. Any results yet from the patch I posted?
Comment 70 George 2004-08-02 12:41:31 EDT
Neil, I haven't had a chance to test your one-line patch yet. I use an
ICP RAID controller, and the driver is shipped as an unsupported
module. The config file for the unsupported hugemem kernel couldn't be
found in the src package. Do you know where I could find/download it?
Thanks.
Comment 71 Neil Horman 2004-08-02 12:53:58 EDT
In the configs directory, the hugemem config is the one you want.
Comment 72 Brian Smith 2004-08-03 14:22:06 EDT
I just rebooted my backup and web servers with the NFS_ACL_MAX_ENTRIES
1024 -> 256 change in nfs3.h.

I should know in a few days if it extends the lifetime.
Comment 73 George 2004-08-04 09:39:26 EDT
My system got a kernel panic after 6 days with the standard hugemem
kernel. The last two lines displayed on the console are:

Code: 8b 81 84 00 00 00 42 39 41 70 89 d9 0f 43 54 24 10 81 e1 00
Kernel Panic: Fatal Exception

Couldn't get sysrq+m output...
Comment 74 George 2004-08-06 09:35:48 EDT
Lowering NFS_ACL_MAX_ENTRIES from 1024 to 256 looks to be having a
positive impact. One of my systems couldn't live longer than 1 day
with the default kernel, but it is still behaving normally after 2
days with Neil's patch.
Comment 75 Tom Sightler 2004-08-09 10:29:25 EDT
OK, this issue just hit me again for the first time in months.  What's
the current consensus on the best solution?  It seems the options for
now are:

1.  Run hugemem -- live with the performance drop that comes with this.
2.  Try NFS patch and hope it helps.

The patch seems more attractive to me.  Do we know if it's really
helping to work around this issue?

Comment 76 Kevin Krafthefer 2004-08-09 11:08:19 EDT
We've been seeing positive results from people who've applied the patch.
Comment 77 Brian Smith 2004-08-09 12:42:29 EDT
I second that; I'm about 2 days past where I would normally get the
ls -l hang after changing to NFS_ACL_MAX_ENTRIES = 256.
Comment 78 Tom Sightler 2004-08-09 19:09:33 EDT
OK, I compiled a kernel with the patch and decided to try something a
little wild.  I had several systems with the NFS hang, I couldn't even
copy a file from NFS.  I really needed NFS on these systems to work,
but they also run production critical apps that I couldn't really reboot.

Since NFS client support is compiled as a module, I thought I could
unload the nfs.o module and replace it on the fly with a binary
compatible nfs.o module that includes the patch.  So I compiled an
nfs.o module for 15.0.3-ELsmp and 15.0.4-ELsmp that I could drop into
the module directory to replace the delivered nfs.o module from
Red Hat.  After testing on a non-production system I unmounted my NFS
mounts, did a 'rmmod nfs', copied my new custom nfs.o over the
existing one in the modules directory, ran depmod, and then remounted
my NFS exports.  Sure enough, after this NFS would work fine.  I could
even swap back and forth between the Red Hat nfs.o and the custom nfs.o
and switch between working and non-working NFS, so this patch
definitely seems to help.  Basically I can fix my critical systems on
the fly by simply replacing this one module.

My new question is: does this have any other side effects, and does
Red Hat consider this an official fix?  Certainly an official fix is

Comment 79 Neil Horman 2004-08-09 19:38:43 EDT
No, this is not an official fix.  Ideally we would like to fix the
problem without needing to reduce the number of ACE objects supported
in a single RPC response.  I expect we'll post the alternative here
just as soon as it's ready.
Comment 80 George 2004-08-10 08:58:46 EDT
Tom's nfs module switch looks like a good procedure to fix nfs on the
fly. My system is still good after 6 days with Neil's patch. Thanks.
Comment 84 Neil Horman 2004-08-17 11:02:09 EDT
Created attachment 102796 [details]
patch to make NFS3_ACL_MAX_ENTRIES configurable

This is a variation on my previous patch, which makes the number of ACE objects
supported by the nfs client configurable.  Could someone test this please and
confirm that it works as well as the previous patch?  Thanks!
Comment 85 Neil Horman 2004-08-17 13:25:12 EDT
Created attachment 102803 [details]
follow on patch to add same functionality to nfsd module

This patch adds on the same acl ace entry module option to the nfsd module for
those who are interested.
Comment 86 Thomas Fitzsimmons 2004-08-20 14:33:16 EDT
I've installed a kernel built with Neil's latest patch.  It seems to
be working, but I'll have to use the new kernel for about a week to be
really sure that this bug is fixed (since it usually takes a few days
to show up after a reboot).
Comment 87 Andrew Pitts 2004-08-22 13:52:33 EDT
I'm having exactly the same problem on my EL 3 web-server clients. The
NFS server is running Red Hat Linux 9.

I have 2 identical web servers mounted to a common nfs server. The ll
lockup condition can occur on one client while it is working correctly
on the other. The problem is fixable only by rebooting either the nfs
server or the troubled client.

The problem didn't appear until I upgraded the web servers from RH 7.3.

Now we're sitting on a time bomb. I've set the rsize and wsize to
4,096.  I don't have the option of setting noacl in my nfs mounts; it
is not recognized by the nfs clients in my install of EL 3, yet the
documentation says it should be.

I'm running kernel 2.4.21-15.0.3.ELsmp on the clients. Linux version
2.4.20-8smp on the server.

Don't really want to go and patch the source for the webservers. Will
probably have to uninstall EL 3 and rebuild the servers with RH 9.
I don't have any way of predicting the problem. They have been trouble
free at times anywhere from 2 weeks to just a few hours. These are
relatively low bandwidth web servers.

Comment 88 Neil Horman 2004-08-26 07:25:40 EDT
Thomas, how has the patch been running for you this week?
Comment 89 Thomas Fitzsimmons 2004-08-26 11:56:56 EDT
I've had no problems since I rebooted to the new kernel six days ago.
Comment 90 Chris Worley 2004-08-30 10:16:36 EDT
Note that the machine where this problem occurs hosts a cluster of
about 130 nodes.  I added 256 more nodes to the cluster, and the
problem went from about a two week cycle to a two day cycle.

What would change (more than linearly) with the additional nodes is
the amount of ssh and rsh'ing occurring from the host.  At least 4k of
each per hour, estimated.  

There would also be NIS pressure, but the load got so high on the host
(i.e. whenever 1000 processes would start simultaneously on the
nodes) that I turned NIS off in favor of local files... and the "RPC:
buffer allocation failed" still occurred.

Note that I turned off attribute caching altogether on the host
client mounts (where this host is an NFS client), using the NFS "noac"
option, and that did not help.
Comment 91 Neil Horman 2004-08-30 14:33:52 EDT
NIS and NFS are the only two things in your comments that would have
made any effect on the situation, since those are the only two
subsystems that you mentioned which use RPC.  Did you by any chance
try the latest patch that I have attached that allows you to reduce
the number of ACE objects that the nfs clients supports?
Comment 92 Neil Horman 2004-08-31 14:31:57 EDT
Created attachment 103306 [details]
enhancement on prior patch to display module acl option via proc file

Same patch as before (combined nfs/nfsd changes into one patch), and
exports nfsd_acl_max_entries as a read-only proc file.
Comment 95 Neil Horman 2004-09-01 16:11:13 EDT
Created attachment 103367 [details]
new patch to add nfs_acl_max_entries module option to nfs.o

This patch (split out from the last, larger patch) adds the
aforementioned nfs_acl_max_entries module option to nfs.o.
Comment 96 Neil Horman 2004-09-01 16:13:39 EDT
Created attachment 103368 [details]
follow-on patch to add same functionality to nfsd.o

This patch builds on the last patch, adding the same module parameter
functionality to nfsd.o, and includes the addition of a sysctl to report the
assigned value.
Comment 97 Neil Horman 2004-09-07 15:32:42 EDT
Created attachment 103553 [details]
same patch with added nfs sysctl

Same patch as before, but adds an nfs sysctl and is diffed against the latest kernel.
Comment 98 Neil Horman 2004-09-07 15:34:57 EDT
Created attachment 103555 [details]
follow on nfsd patch for new kernel

Same as the last nfsd patch, but diffed against the new kernel.
Comment 99 Matthew Davis 2004-09-16 09:26:51 EDT
The customer finally got around to installing the generated kernel
(kernel-2.4.21-20.EL.RPCTEST.i686.rpm).  The machine ran successfully
for 10 hours, then locked up.  No response from console or
anywhere.  Required a power off.

Any ideas?
Comment 100 Tom Sightler 2004-09-16 09:34:02 EDT
I'm seeing this hard lockup on multiple systems since upgrading to
2.4.21-20.EL, systems both with and without the patch, everything from
an 8-way Dell 8450 with 12GB of RAM and a QLogic 2300 FC adapter to my
700MHz 1-CPU Dell PowerApp.web 100 with standard IDE drives.  Have
also seen the lockup on an IBM HS20 dual 3GHz system.  I'm falling back
to 15.0.4 on everything but my test server until I get a better feel
for 20, but right now I think 20 is buggy.

Comment 101 Neil Horman 2004-09-16 09:48:18 EDT
The last two comments regarding lockups sound like they are related to
a different problem than the one being addressed here.  I'd open a
separate bugzilla.  Just out of curiosity though, are the affected
machines all running with more than 4GB of RAM?  If so, try booting
with less than 4GB of RAM (you can use the mem=4G kernel command line
option).  If that relieves the lock up it might give us a clue as to
where the problem lies.
Comment 102 Tom Sightler 2004-09-16 10:27:52 EDT
I agree they are likely a different problem.  I'm going to apply the
existing patch for 20 to my 15.0.4 kernels and see how that goes.

I'm still researching existing bugzilla entries on the hang to see if
they match my problem.  One of the systems that hard locked has only
512MB of RAM, but I think it may have been bitten by Bug 132547.  The
other two systems have 8GB and 12GB respectively, but booting with
less memory is not really possible on these systems as they run large,
active Oracle databases.

Comment 105 Ernie Petrides 2004-09-18 02:05:46 EDT
A fix for this problem has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-20.7.EL).
Comment 110 Kostas Georgiou 2004-10-21 08:40:13 EDT
One of my systems just got hit by the bug again under
kernel-smp-2.4.21-20.9.EL and fs.nfs.nfs3_acl_max_entries set to 256
after 22 days of uptime :(
Comment 111 Neil Horman 2004-10-21 08:58:26 EDT
Can you get a sysrq-m off the system during the failure and post it here?
Comment 112 Rex Dieter 2004-10-21 09:07:58 EDT
Regarding the latest set of patches (comment #96 and comment #97),
exactly what *is* the nfs module parameter name/syntax, and how to use
it in practice?  Put something in /etc/modules.conf? 
/etc/sysctl.conf?  It's not entirely clear to me.
Comment 113 Neil Horman 2004-10-21 09:31:25 EDT
there are two module options, one for the client and one for the
server (nfs3_acl_max_entries and nfsd3_acl_max_entries).  You specify
them with an options directive in modules.conf.  Both parameters take
integer values, and allow you to specify the number of acl entries
listed per NFSACL transaction.  By specifying a lower value, you save
memory when the RPC subsystem allocates buffers to store the
transaction response, thereby avoiding buffer allocation failures from
this particular problem.  Comment 61 in this bugzilla provides
interesting values for these options that correlate to allocation sizes.
Comment 114 Ken Snider 2004-10-26 00:15:27 EDT
In regards to comment #105, which patch was actually ported to U4, so
that those of us affected can patch now?
Comment 115 Neil Horman 2004-10-26 07:28:06 EDT
The last two on the attachment list below (the only two patches that
are not obsoleted), dated:
2004-09-07 15:32
2004-09-07 15:34
Comment 119 Ken Snider 2004-11-08 15:37:17 EST
FYI, this fix does *not* correct the bug described in bug 126598 or
bug 129861.
Comment 120 Rex Dieter 2004-11-08 16:06:01 EST
Re: comment #113, so I'd add the following to /etc/modules.conf, for 

options nfsd nfs3_acl_max_entries=256 nfsd3_acl_max_entries=256
Comment 121 Tom Sightler 2004-11-08 16:16:48 EST
Ken (comment #119), did you actually add the module parameters required
to actually implement this fix?  Just running the kernel isn't enough;
you have to actually add the options to your modules.conf.

Of course, you may still be correct, your problem might not even be
related to this bug, but I just wanted to make sure you actually
tested it with the proper changes to modules.conf as it seems there is
some confusion as to how to actually implement this fix.

Just to clear things up, my understanding is as follows:

1.  Just running the new kernel doesn't change anything

2.  To actually implement the fix you must load the modules with the
new options, either via manually loading the driver and explicitly
supplying the options, or by adding them to modules.conf.

3.  The /proc interface is a READ-ONLY interface so that you can see
what values the modules were loaded with, you cannot actually change
the values via this interface.

There seem to be multiple people confused about this (it was even
discussed on the Taroon list).  Could someone verify that my
understanding is correct?

Comment 122 Rex Dieter 2004-11-10 08:22:42 EST
Re: comment #120, here's the correct modules.conf syntax/usage after
some trial and error:
options nfsd nfsd3_acl_max_entries=256
options nfs nfs3_acl_max_entries=256
Comment 123 Neil Horman 2004-11-11 14:36:13 EST
*** Bug 133246 has been marked as a duplicate of this bug. ***
Comment 124 Neil Horman 2004-11-12 10:35:40 EST
In response to comment #119, Tom is correct.  The parameters which
this patchset adds to nfs and nfsd are settable once, via module
parameters at load time.  The proc interface is a read-only interface,
allowing you to see what the load-time settings were.  It was decided
some time ago that allowing dynamic resizing of the max acl message
size could be rather racy, so we decided to require one-time
initialization at module load time only.
Comment 125 Rex Dieter 2004-11-22 13:29:14 EST
kernel-2.4.21-20.EL with patch applied on a DELL 1GB RAM box using:
options nfsd nfsd3_acl_max_entries=256
options nfs nfs3_acl_max_entries=256

This box is an NFS server (and client, in this case against a
venerable redhat9 server) with random processes hanging twice in the
last 2 days... here are the latest seemingly relevant syslog entries:

Nov 22 12:07:59 x kernel: lockd: cannot monitor x.x.x.133
Nov 22 12:08:24 x kernel: lockd: cannot monitor x.x.x.133
Nov 22 12:09:07 x kernel: lockd: cannot unmonitor x.x.x.103
Nov 22 12:09:32 x kernel: lockd: cannot monitor x.x.x.133
Nov 22 12:09:57 x kernel: lockd: cannot monitor x.x.x.133
Nov 22 12:10:22 x kernel: lockd: cannot monitor x.x.x.133
Nov 22 12:10:47 x kernel: lockd: cannot monitor x.x.x.6
Nov 22 12:11:12 x kernel: lockd: cannot monitor x.x.x.133

Does this sound like something related to this bug, or should I open a
new one?
Comment 126 Neil Horman 2004-11-22 13:34:50 EST
It might be a similar type of issue, but it's probably unrelated.  I'd
open another bugzilla for it.
Comment 127 Rex Dieter 2004-11-22 13:49:25 EST
OK, submitted as bug #140385 "lockd: cannot monitor/unmonitor"
Comment 128 jason andrade 2004-11-30 10:18:15 EST

Exactly what is the current status of this? Should we get the test
kernel from RHEL ES3QU4 Beta and run it on.. the server? the client?
both?

If we do, are there any patches to apply, or are they already applied?

Are there fixes for the other nfs bugs in the beta kernel, or are there
separate patches that have to be applied for that?
Comment 129 Neil Horman 2004-11-30 10:23:53 EST
The fix as attached in the latest patch set on this bz is applied to
the U4 beta kernel.

Run it on the machine in which the log messages appeared.  This could
be the client or the server, but is in most cases the client.  Don't
forget to set the new nfs module options appropriately.

"Are there fixes for the other nfs bugs in the beta kernel"
What exactly do you mean here?  Are there other bugzillas you are
specifically concerned about?
Comment 130 Suzanne Hillman 2004-11-30 13:05:02 EST
*** Bug 139952 has been marked as a duplicate of this bug. ***
Comment 131 jason andrade 2004-11-30 21:29:48 EST
Thanks muchly, Neil - I've started the process of applying this to our
nfs clients, and I've fired it up on the server (rhel3qu4 beta kernel).

We currently do between 200-400Mbit/sec (around 3-4TB per day), so
nfs is hammered fairly hard.  This is expected to double within the
next 12 months.

The other bugs I was thinking of were the ones listed in comment #119.


Comment 132 Ernie Petrides 2004-12-02 22:11:06 EST
*** Bug 121803 has been marked as a duplicate of this bug. ***
Comment 133 John Flanagan 2004-12-20 15:54:57 EST
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.
Comment 134 Ernie Petrides 2005-03-14 16:24:24 EST
*** Bug 136423 has been marked as a duplicate of this bug. ***
