Red Hat Bugzilla – Bug 79573
system lockup with message "ENOMEM in do_get_write_access retrying"
Last modified: 2007-04-18 12:49:01 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.75 [en] (X11; U; HP-UX B.11.11 9000/785)
Description of problem:
System locked up when using 'rdist' to copy many files onto an ext3 file system.
Neither console login nor ssh login responded. We had to power the system
off to restart it.
Additional messages seen were:
ENOMEM in new_handle, retrying.
ENOMEM in journal_get_undo_access_Rsmp_df5dec49, retrying.
The system is running kernel 2.4.18-18.7.xbigmem.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Use the rdist utility to send a large number of files to the machine with the
Actual Results: Machine locks up
Expected Results: Normal operation
I ran a 'cat /proc/meminfo' shortly before the lockup and these were the
       total:      used:       free:       shared: buffers: cached:
Mem:   8132640768  8126660608  5980160     0       892928   7294271488
Swap:  2097434624  737280      2096697344
MemTotal: 7942032 kB
MemFree: 5840 kB
MemShared: 0 kB
Buffers: 872 kB
Cached: 7123240 kB
SwapCached: 72 kB
Active: 3710784 kB
Inact_dirty: 3157560 kB
Inact_clean: 267020 kB
Inact_target: 1427072 kB
HighTotal: 7143360 kB
HighFree: 1244 kB
LowTotal: 798672 kB
LowFree: 4596 kB
SwapTotal: 2048276 kB
SwapFree: 2047556 kB
Committed_AS: 22452 kB
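To make the figures above easier to compare, here is a minimal sketch that parses the "Name: value kB" lines of a /proc/meminfo dump and reports how much of each zone is free. The helper names (parse_meminfo, free_percent) are hypothetical and the sample is just a subset of the values pasted above. It only summarises the numbers and does not by itself say where the lockup comes from.

```python
def parse_meminfo(text):
    """Return {field: kB} for lines of the form 'Name: 123 kB'.

    Lines in other formats (for example the older byte-count
    'Mem:' and 'Swap:' rows) are skipped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def free_percent(values, zone):
    """Percentage of a zone that is free; zone is 'Mem', 'High' or 'Low'."""
    return 100.0 * values[zone + "Free"] / values[zone + "Total"]


# Subset of the values reported in this bug (assumed to be copied correctly).
SAMPLE = """\
MemTotal: 7942032 kB
MemFree: 5840 kB
HighTotal: 7143360 kB
HighFree: 1244 kB
LowTotal: 798672 kB
LowFree: 4596 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    for zone in ("Mem", "High", "Low"):
        print(f"{zone}: {free_percent(info, zone):.2f}% free")
```

Running the same helper on a few samples taken at intervals during the rdist run would show whether free memory drops steadily or collapses suddenly before the lockup.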
The "ENOMEM ..., retrying" messages are an indication that ext3 is experiencing
temporary memory allocation pressure. They do happen under very high load
and are not a fault in themselves. ext3 should continue quite happily under
those conditions (and indeed it does so under testing).
So we need far more information to work out where the real lockup is --- the
presence of these messages does not in any way tell us that the lockup is due to
Is this the same as bug# 79257?
The message "ENOMEM in do_get_write_access retrying" occurs in the 79257 case if
the copy is left long enough.
It looks very similar to 79257. It is likely to be caused by the same
Is this still occurring with 2.4.20-based errata?