Bug 210306 - server hangs with RPC sendmsg returned error 12

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	i586
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Larry Woodman
QA Contact:	Brian Brock
Docs Contact:
URL:	http://www.nmt.edu/tcc
Whiteboard:
Depends On:
Blocks:

Reported:	2006-10-11 14:08 UTC by Michael Martinez
Modified:	2008-08-02 23:40 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-10-19 18:40:46 UTC
Target Upstream Version:
Embargoed:

Attachments
sysrq output during rcp errors (70.90 KB, text/plain) 2006-10-17 15:38 UTC, Michael Martinez
kernel 2.4.21-40.ELsmp sysrq memdump (2.01 KB, text/plain) 2006-10-23 17:11 UTC, Michael Martinez

Description Michael Martinez 2006-10-11 14:08:33 UTC

The server hangs at random times, approximately once a day, with "RPC: sendmsg
returned error 12". It requires a hard reset each time.
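
For reference, the numeric code in the message can be decoded with the standard errno table. The following is a minimal sketch, assuming a Linux host where errno 12 is ENOMEM; the helper name describe_errno is made up for illustration and is not part of any kernel or RPC tooling.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and message for a numeric errno value."""
    # errno.errorcode maps numbers to names on the platform running this script.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# On Linux, 12 is ENOMEM, i.e. a memory allocation failure.
print(describe_errno(12))
```

This only translates the number; it says nothing about which allocation in the RPC send path failed.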

We're running kernel-2.4.21-47.EL on AS 3 update 3. We set
/proc/sys/vm/vm-defragment to 100, but it did not alleviate the problem.
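
As background on why an allocation can fail while memory is still free: a buddy-style allocator tracks free blocks by order (a block of order k is 2^k contiguous pages), and a request of a given order can only be satisfied from a free block of that order or higher. The sketch below is a toy model of that rule under stated assumptions. The free-block counts are invented for illustration and are not taken from the attached sysrq dumps, and the 4 KB page size is assumed for i386.

```python
PAGE_SIZE = 4096  # bytes; assumed page size for i386


def total_free_pages(free_blocks):
    """Total free pages, where free_blocks[k] is the number of free order-k blocks."""
    return sum(count << order for order, count in enumerate(free_blocks))


def can_allocate(free_blocks, order):
    """True if some free block of the requested order or higher exists."""
    return any(count > 0 for count in free_blocks[order:])


# Hypothetical fragmented zone: many small blocks, nothing at order 3 or above.
free_blocks = [5000, 1200, 30, 0, 0, 0]

free_mb = total_free_pages(free_blocks) * PAGE_SIZE / (1024 * 1024)
print(f"{free_mb:.1f} MB free")      # 29.4 MB free
print(can_allocate(free_blocks, 2))  # True
print(can_allocate(free_blocks, 3))  # False: no contiguous 8-page block
```

In this made-up state roughly 29 MB is free, yet any request for 8 or more contiguous pages fails, which is the kind of situation a defragmentation tunable would be meant to relieve.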

This server is an email server that is an NFS client.

Opening a new ticket at the request of Larry Woodman. There is a related closed
bug report, #123226.

Thanks!

Michael Martinez

Comment 1 Michael Martinez 2006-10-12 19:33:17 UTC

An interesting side note: 

I tried out two new "defrag" kernels from Larry. In both cases, as soon as
sysrq is enabled, the server's load immediately climbs from its normal
operating level of 4 to 150. At one point yesterday the load reached 450 before
we disabled sysrq and rebooted!

I'd like to capture sysrq -m data to post, but I can't do it until the load
issue is fixed.

Another thing I'd like to note: in the previous bug report, many of the users
reporting the problem seemed to be running ProLiant servers. We're using one
too, so perhaps this is a ProLiant / Xeon / Intel issue?

Michael

Comment 2 Michael Martinez 2006-10-17 15:38:59 UTC

Created attachment 138691
sysrq output during rcp errors

sysrq output

Comment 3 Michael Martinez 2006-10-23 17:11:28 UTC

Created attachment 139147
kernel 2.4.21-40.ELsmp sysrq memdump

sysrq -m output captured during the RPC error 12 failures, running kernel 2.4.21-40.ELsmp.

Comment 4 Michael Martinez 2006-10-23 17:15:08 UTC

Larry,

We've got other sysrq output, not just mem, from this crash, if you need it.

Michael

Comment 5 Larry Woodman 2006-12-15 15:58:33 UTC

Michael, what are the NFS mount options?  Specifically, I'm looking for the MTU
size that RPC is using; that is the underlying cause of the memory allocation
failure.  The kernel/VM can only try to defragment memory once it has already
become highly fragmented.

Larry Woodman
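
One way to collect the requested mount options is to read them from the mount table. The sketch below is a minimal example, assuming the client lists NFS options (such as rsize, wsize, and the transport) in /proc/mounts-style text. The function name nfs_mount_options, the server name "filer", the mount point, and all option values in the sample are hypothetical and do not come from this report.

```python
def nfs_mount_options(mounts_text: str) -> dict:
    """Map each NFS mount point to a dict of its mount options."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            options[key] = value  # flag options such as "hard" get an empty value
        result[fields[1]] = options
    return result


# Hypothetical sample; on a live client one would pass open("/proc/mounts").read().
sample = (
    "/dev/sda1 / ext3 rw 0 0\n"
    "filer:/export/mail /var/spool/mail nfs "
    "rw,rsize=32768,wsize=32768,hard,intr,udp 0 0\n"
)

opts = nfs_mount_options(sample)
print(opts["/var/spool/mail"]["rsize"], opts["/var/spool/mail"]["wsize"])
```

The interface MTU itself is not a mount option, so it would have to be gathered separately from the network configuration.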

Comment 6 RHEL Program Management 2007-10-19 18:40:46 UTC

This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.
