Bug 79573 - system lockup with message "ENOMEM in do_get_write_access retrying"

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	kernel
Sub Component:
Version:	7.3
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Stephen Tweedie
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2002-12-13 16:31 UTC by Need Real Name
Modified:	2007-04-18 16:49 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-01-05 20:22:01 UTC
Embargoed:

Description Need Real Name 2002-12-13 16:31:18 UTC

From Bugzilla Helper:
User-Agent: Mozilla/4.75 [en] (X11; U; HP-UX B.11.11 9000/785)

Description of problem:
System locked up when using 'rdist' to copy many files onto an ext3 file system.

Neither console login nor ssh login responded.  We had to power the system
off to restart it.

Additional messages seen were:

ENOMEM in new_handle, retrying.
ENOMEM in journal_get_undo_access_Rsmp_df5dec49, retrying.

The system is running kernel 2.4.18-18.7.xbigmem.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Use the rdist utility to send a large number of files to the machine running
the new kernel.

Actual Results:  Machine locks up

Expected Results:  Normal operation

Additional info:

I ran a 'cat /proc/meminfo' shortly before the lockup and these were the
results:

        total:    used:    free:  shared: buffers:  cached:
Mem:  8132640768 8126660608  5980160        0   892928 7294271488
Swap: 2097434624   737280 2096697344
MemTotal:      7942032 kB
MemFree:          5840 kB
MemShared:           0 kB
Buffers:           872 kB
Cached:        7123240 kB
SwapCached:         72 kB
Active:        3710784 kB
Inact_dirty:   3157560 kB
Inact_clean:    267020 kB
Inact_target:  1427072 kB
HighTotal:     7143360 kB
HighFree:         1244 kB
LowTotal:       798672 kB
LowFree:          4596 kB
SwapTotal:     2048276 kB
SwapFree:      2047556 kB
Committed_AS:    22452 kB
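
For readers who want to pull the zone figures out of a dump like the one above, here is a minimal sketch in Python. It is not part of the original report: the helper names (parse_meminfo, free_percent) are hypothetical, and the sample text is simply a subset of the "kB" lines quoted above. It only computes how much of each zone was free; it does not establish what caused the lockup.

```python
import re

# A subset of the "kB" lines from the /proc/meminfo output quoted in the report.
MEMINFO_SAMPLE = """\
MemTotal:      7942032 kB
MemFree:          5840 kB
Cached:        7123240 kB
HighTotal:     7143360 kB
HighFree:         1244 kB
LowTotal:       798672 kB
LowFree:          4596 kB
"""


def parse_meminfo(text):
    """Return a dict mapping field name to its value in kB (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+) kB$", line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def free_percent(fields, zone):
    """Percentage of a zone ("Low" or "High") that is free (hypothetical helper)."""
    return 100.0 * fields[zone + "Free"] / fields[zone + "Total"]


fields = parse_meminfo(MEMINFO_SAMPLE)
low_pct = free_percent(fields, "Low")
high_pct = free_percent(fields, "High")
print("LowFree:  %.2f%% of LowTotal" % low_pct)
print("HighFree: %.3f%% of HighTotal" % high_pct)
```

On the figures in this report, less than 1% of either zone was free, and most of MemTotal was accounted for by Cached.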

Comment 1 Stephen Tweedie 2002-12-13 17:06:36 UTC

The "ENOMEM ..., retrying" messages are an indication that ext3 is experiencing
temporary memory allocation pressure, but they do happen under very high load
and are not a fault in themselves.  ext3 should continue quite happily under
those conditions (and indeed it does so under testing.)

So we need far more information to work out where the real lockup is --- the
presence of these messages does not in any way tell us that the lockup is due to
ext3.
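
As an illustration only of the "fail, log, retry" behaviour described above, here is a user-space sketch in Python. It is not the kernel's journaling code: alloc_with_retry and make_flaky_allocator are hypothetical names, and the allocator that fails a fixed number of times is an assumption made so the loop can be observed. The message text follows the format quoted in the report.

```python
def alloc_with_retry(try_alloc, caller, log):
    """Call try_alloc() until it succeeds, logging each failure.

    Hypothetical helper: returns the allocation and the number of attempts.
    """
    attempts = 0
    while True:
        attempts += 1
        buf = try_alloc()
        if buf is not None:
            return buf, attempts
        # A failed allocation is reported and retried, not treated as fatal.
        log("ENOMEM in %s, retrying." % caller)


def make_flaky_allocator(failures, size):
    """Build an allocator that fails `failures` times, then succeeds (assumption)."""
    remaining = [failures]

    def try_alloc():
        if remaining[0] > 0:
            remaining[0] -= 1
            return None
        return bytearray(size)

    return try_alloc


messages = []
buf, attempts = alloc_with_retry(
    make_flaky_allocator(failures=2, size=64), "new_handle", messages.append
)
print(attempts, messages)
```

The point of the sketch is that the messages mark temporary allocation failures that the caller rides out; seeing them in a log does not by itself show where a hang occurred.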

Comment 2 Norman Gaywood 2002-12-14 02:05:49 UTC

Is this the same as bug# 79257?

The message "ENOMEM in do_get_write_access retrying" occurs in the 79257 case if
the copy is left long enough.

Comment 3 Need Real Name 2002-12-16 18:42:29 UTC

It looks very similar to 79257.  It is likely to be caused by the same
problem.

Comment 4 Alan Cox 2003-06-08 15:30:48 UTC

Is this still occurring with 2.4.20-based errata?
