Bug 99025

Summary: kernel: ENOMEM in journal_alloc_journal_head, retrying Dell 6650
Product: Red Hat Enterprise Linux 2.1
Reporter: Geoff Dolman <geoff.dolman>
Component: kernel
Assignee: Larry Woodman <lwoodman>
Status: CLOSED NOTABUG
Severity: high
Priority: medium
Version: 2.1
CC: kalyanj80, riel, sct, tao, tommy.yan
Hardware: i686   
OS: Linux   
Last Closed: 2004-12-13 20:06:28 UTC
Attachments:
/proc/slabcache (attachment 101956, from comment #23)

Description Geoff Dolman 2003-07-11 21:58:31 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225

Description of problem:
Under heavy nfs load (from several rsyncs) on a large 556GB filesystem the
following message appears in /var/log/messages:

Jul 11 14:44:00 srs kernel: ENOMEM in journal_get_undo_access_Rsmp_767cdac6,
retrying.
Jul 11 14:44:07 srs kernel: ENOMEM in journal_alloc_journal_head, retrying.
Jul 11 14:44:10 srs kernel: journal_write_metadata_buffer: ENOMEM at
get_unused_buffer_head, trying again.
Jul 11 14:44:18 srs kernel: ENOMEM in journal_get_undo_access_Rsmp_767cdac6,
retrying.
Jul 11 14:44:28 srs kernel: journal_write_metadata_buffer: ENOMEM at
get_unused_buffer_head, trying again.
Jul 11 14:45:30 srs kernel: journal_write_metadata_buffer: ENOMEM at
get_unused_buffer_head, trying again.
Jul 11 14:45:38 srs kernel: ENOMEM in do_get_write_access, retrying.

etc...

After an hour or more of these messages the machine hangs. It can still
respond to ping, but it isn't possible to log in on the console or via ssh. The
only resolution appears to be switching it off and on again.

I see a similar bug has been logged concerning the tg3 driver for the broadcom
NetXtreme ethernet card, but I am not using the tg3 driver - I am using the
bcm5700 driver...




Version-Release number of selected component (if applicable):
Linux 2.4.9-e.25enterprise

How reproducible:
Sometimes

Steps to Reproduce:
1. A heavy network load, typically produced by running several rsyncs.
2.
3.
    

Actual Results:  Machine hangs indefinitely

Expected Results:  Other possibly relevant information: the machine is 4
processor with hyperthreading enabled and has 32 GB of memory.

Additional info:

Comment 1 Stephen Tweedie 2003-07-11 23:52:44 UTC
The journal messages are symptoms of heavy memory pressure but do not, by
themselves, signal any failures.  There are plenty of other places in the kernel
where similar memory allocators simply retry silently under that pressure: the
journal code is just one of the very few places that logs this condition.

These messages are entirely harmless in themselves.  The hang will need more
information to diagnose.  Can you capture an alt-sysrq-t or alt-sysrq-p
backtrace when the hang occurs?
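
For reference, the usual way to capture that on a 2.4-era kernel (assuming the
magic SysRq key was built into your kernel; the sysctl below is standard but
worth double-checking) is something like:

  # enable the magic SysRq key
  echo 1 > /proc/sys/kernel/sysrq
  # then press Alt+SysRq+T (task backtraces) or Alt+SysRq+P (registers) on the
  # console; newer 2.4 kernels can also trigger it from /proc, if present:
  echo t > /proc/sysrq-trigger
  # the output lands in the kernel log
  dmesg | tail -n 200     # or /var/log/messages, if syslog is still alive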

Comment 2 Geoff Dolman 2003-07-31 13:36:27 UTC
The machine does not display /anything/ on its monitor after this hang has
occurred. The sysrq key combinations are working (sysrq-b works) but owing to
lack of display nothing can be seen from the sysrq-p etc.

I also now have a way of reproducing the error:

On six machines cat /dev/zero into a file on a filing system exported from the
machine that suffers the hang.
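
Roughly, something like this on each of the six clients (the export path and
mount point here are placeholders, not the real ones):

  mount srs:/export /mnt/srs
  cat /dev/zero > /mnt/srs/zero.$HOSTNAME &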

The load average climbs gradually, then suddenly jumps to 60 or more, and the
machine hangs. [This is not because the filing system has filled, though.]
When the /dev/zeros are stopped the machine does not recover. All logins on
terminals are unresponsive from then on.

The machine still responds to ping, and the LVS it is running to redirect http
requests continues to work. However, you cannot rsh or ssh in, get a login on
the console, or mount the exported filing system.

There is no change after several hours.

Comment 3 Stephen Tweedie 2003-08-04 12:01:43 UTC
Larry, looks like another VM exhaustion problem.

We really need the sysrq output to get much further with this.  Serial console
is the best way to capture it if /var/log/messages is wedged.
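
For the record, a typical serial console setup for this (device name and speed
are only examples, adjust for the actual hardware):

  # append to the kernel line in lilo.conf / grub.conf and reboot
  console=ttyS0,9600n8 console=tty0

  # on a second machine attached via a null-modem cable, log the port with e.g.
  minicom -D /dev/ttyS0          # or: cu -l /dev/ttyS0 -s 9600

The sysrq output will then appear on the serial line even when the VGA console
and syslog are dead.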

Comment 4 Geoff Dolman 2003-08-04 16:02:43 UTC
...think I've got to the bottom of it. Turns out to be a contention between a
tape drive and disks on the same bus. Could this be right?

Comment 5 Stephen Tweedie 2003-08-04 16:54:08 UTC
Depends: if it's SCSI, it shouldn't really matter unless the tape drive is doing
something really broken.  If it's an ATAPI tape drive on the same controller as
a disk being used heavily, then yes, it could easily cause the VM to get upset
as we can't swap things out fast enough. 

What bus and devices exactly were you using?

Comment 6 MF Lee 2003-08-26 15:34:20 UTC
Is this an mm-1.1.3-1 problem?

Comment 7 Geoff Dolman 2003-08-27 17:29:43 UTC
Okay, the scsi thing was a red herring. The machine still exhibits the same
behaviour as before, except that, with the tape drive moved to a different bus,
it (for obvious reasons) takes more machines writing to the nfs export to push
it over (10 instead of 6).

I have now captured some sysrq information (second URL). Hopefully this contains
a sysrq-t, a sysrq-p and a sysrq-m. The first URL is lspci -vv.

http://www-gene.cimr.cam.ac.uk/~geoff/lspci.tgz

http://www-gene.cimr.cam.ac.uk/~geoff/sysrq-output.tgz

The version of mm on the system is mm-1.1.3-8.

I have not upgraded to the 27 enterprise kernel yet.

Comment 8 Stephen Tweedie 2003-08-27 18:05:56 UTC
All the important tasks are stuck in the VM page reclaim code.  Looks like
you're just getting critically short of low memory.  We don't officially support
32GB on AS2.1, precisely because of the low memory shortage that you get on such
systems.  You really need the 4G/4G split from recent 2.6 and RHEL beta kernels
to make them work properly.
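
(One rough way to watch how close the box is getting to lowmem exhaustion,
assuming /proc/meminfo on this kernel exposes the LowTotal/LowFree fields:

  watch -n 5 'grep -E "LowTotal|LowFree" /proc/meminfo'

LowFree collapsing towards zero while HighFree stays huge is the signature of
this problem.)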

Comment 9 Jason Baron 2003-08-27 18:15:39 UTC
out on a limb here, but any scsi_malloc failures in the log?

Comment 10 Larry Woodman 2003-08-27 18:23:45 UTC
The problem here is that there is 550MB in the mem_map page array for the
32GB and 350MB in the slabcache used for kernel data structs.  Since
both of these are allocated out of lowmem, it is exhausted and subsequent
allocations are failing.  Please attach /proc/slabinfo output and I'll see if
we can do any tuning magic to prevent lowmem exhaustion.  Other than that I
would suggest booting with 16GB of memory.  16GB is the maximum AS2.1 supports
for this very reason.
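
(Back-of-the-envelope, assuming roughly 64-70 bytes per struct page on this
kernel:

  32GB / 4KB per page           = 8,388,608 pages
  8,388,608 pages x ~68 bytes   = ~545MB of mem_map

all of which has to live in the roughly 900MB lowmem region, before the
slabcache takes its ~350MB.)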

Larry Woodman

Comment 11 Geoff Dolman 2003-08-28 09:56:58 UTC
Many thanks. I have posted the /proc/slabinfo to 

http://www-gene.cimr.cam.ac.uk/~geoff/slabinfo.tgz

Comment 12 Graham Lenton 2004-07-05 11:09:03 UTC
I'm having the same problem on a pair of Dell 6650s with 16GB of
memory each, running 2.4.9-e.40enterprise #1 SMP.

Reproducible by copying a 9GB directory structure from one filesystem
to another. Around 7GB is successfully copied before the system locks.

Glad I found this before purchasing another 16GB for both servers..

Comment 13 Graham Lenton 2004-07-06 08:38:04 UTC
I have upgraded everything via up2date, and am now running kernel
2.4.9-e.43enterprise #1 SMP. I still have this problem.

Does upgrading to RHEL3 fix this one?

Comment 14 Geoff Dolman 2004-07-06 08:55:32 UTC
I found the problem was fixed by mirroring the memory, i.e. by reducing
from 32GB to 16GB. There has not been a crash since then.
I tried upgrading to RHEL 3.0 and everything seemed fine except the
alleged 'ISV support', as Veritas NetBackup 4.5 does not work under
RHEL 3.0.
With a lot of fiddling and wrapper scripts it will work, but not in any
way you could imagine Veritas supporting, so I reverted to 2.1.
There's no officially supported server version of NetBackup for RHEL
3.0 until 5.1 is released, so we're just twiddling our thumbs waiting...

There were no problems with the RAM under 3, though.

Comment 15 Graham Lenton 2004-07-06 09:35:53 UTC
thanks Geoff. So did you have 32Gb running under RHEL3 ?

Comment 16 Geoff Dolman 2004-07-06 09:52:05 UTC
Yes I tried it but I didn't put it into production for the reasons
outlined re Veritas :(

Comment 17 Graham Lenton 2004-07-06 10:11:43 UTC
thanks Geoff. So did you have 32Gb running under RHEL3 ?

Comment 18 Geoff Dolman 2004-07-06 10:14:06 UTC
Yes.

Comment 19 Graham Lenton 2004-07-06 10:50:02 UTC
Similar issue here with Legato Networker - version 7 does support
RHEL3, but we're on 6. Arghh !

Comment 20 thriller 2004-07-14 08:03:29 UTC
I'm also facing this problem on HP DL380/6GB mem, ML570/32GB mem, and
DL760/32GB mem. Just dd'ing a large file (over 10GB) hits the out-of-memory
error, and 'ENOMEM in journal_alloc_journal_head' appears on the server
screen. Switching the fs to ext2/ext3/reiserfs did not solve this problem.

Comment 21 thriller 2004-07-14 09:54:51 UTC
hum,

The system runs out of buffers in this case.

You can reproduce it with some simple dd or cp.
For me, on a 30GB mem config:
#dd if=/dev/zero of=/file1 bs=4096k count=2000 &
#dd if=/dev/zero of=/file2 bs=4096k count=2000 &
#dd if=/dev/zero of=/file3 bs=4096k count=2000 &

#vmstat 3
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 4  1      0 20456412  40108 11790712    0    0     3    26   13     5  0  0  4  0
 0  8      0 20223332  33984 12021504    0    0    11 76564  303  2987  0 30 31 39
 0  7      0 20108120  32844 12136352    0    0    19 38220  302  2007  0 14 33 53
 4  2      0 19831936  24152 12412832    0    0    15 91988  306  3947  0 38 33 29
 0  7      0 19634008  18396 12610232    0    0     3 65229  282  2759  0 27 17 56
 0  7      0 19471276  14028 12771068    0    0     1 55196  284  2565  0 22 12 66
 4  1      0 19309040   8736 12937064    0    0     1 54031  276  2075  0 25 14 61
 4  1      0 19126688   5452 13117664    0    0     4 61431  296  2788  0 25 19 56
 0  6      0 19065608   3440 13179820    0    0     1 19595  287  1286  0  8  8 84
 4  1      0 18854072   1404 13390412    0    0     4 72199  451  2500  0 30 34 36
[root@ml570-1 root]# vmstat 3 
Connection to ml570-1.test closed by remote host.

On the server screen, the 'ENOMEM in journal_alloc_journal_head' error
appears.

[root@ml570-1 root]# cat /proc/slabinfo
slabinfo - version: 1.1 (SMP)
kmem_cache           128    128    244    8    8    1 : 1008  252
nfs_write_data         0      0    384    0    0    1 :  496  124
nfs_read_data          0      0    384    0    0    1 :  496  124
nfs_page               0      0    128    0    0    1 : 1008  252
ip_fib_hash           11    448     32    4    4    1 : 1008  252
ext3_xattr             0      0     44    0    0    1 : 1008  252
journal_head       27874  27874     48  362  362    1 : 1008  252
revoke_table           1    250     12    1    1    1 : 1008  252
revoke_record          0      0     32    0    0    1 : 1008  252
clip_arp_cache         0      0    256    0    0    1 : 1008  252
ip_mrt_cache           0      0    128    0    0    1 : 1008  252
tcp_tw_bucket          0      0    128    0    0    1 : 1008  252
tcp_bind_bucket        7    336     32    3    3    1 : 1008  252
tcp_open_request       0      0    128    0    0    1 : 1008  252
inet_peer_cache        0      0     64    0    0    1 : 1008  252
secpath_cache          0      0    128    0    0    1 : 1008  252
xfrm_dst_cache         0      0    256    0    0    1 : 1008  252
ip_dst_cache          10     45    256    3    3    1 : 1008  252
arp_cache              2     30    256    2    2    1 : 1008  252
flow_cache             0      0    128    0    0    1 : 1008  252
blkdev_requests    12288  12600    128  420  420    1 : 1008  252
kioctx                 0      0    128    0    0    1 : 1008  252
kiocb                  0      0    128    0    0    1 : 1008  252
dnotify_cache          0      0     20    0    0    1 : 1008  252
file_lock_cache        3    120     96    3    3    1 : 1008  252
async_poll_table       0      0    140    0    0    1 : 1008  252
fasync_cache           0      0     16    0    0    1 : 1008  252
uid_cache              2    112     32    1    1    1 : 1008  252
skbuff_head_cache    453    598    168   26   26    1 : 1008  252
sock                  57    100   1408   20   20    2 :  240   60
sigqueue               1     29    132    1    1    1 : 1008  252
kiobuf                 1     30    128    1    1    1 : 1008  252
cdev_cache            13    406     64    7    7    1 : 1008  252
bdev_cache             7    290     64    5    5    1 : 1008  252
mnt_cache             15    232     64    4    4    1 : 1008  252
inode_cache          755   1064    512  152  152    1 :  496  124
dentry_cache         457   1200    128   40   40    1 : 1008  252
dquot                  0      0    128    0    0    1 : 1008  252
filp                 360    510    128   17   17    1 : 1008  252
names_cache            1      1   4096    1    1    1 :  240   60
buffer_head       3210480 3210480    108 91728 91728    1 : 1008  252
mm_struct             30     90    384    9    9    1 :  496  124
vm_area_struct       908   1512     68   27   27    1 : 1008  252
fs_cache              36    348     64    6    6    1 : 1008  252
files_cache           37     70    512   10   10    1 :  496  124
signal_cache          72    638     64   11   11    1 : 1008  252
sighand_cache         52    100   1408   20   20    2 :  240   60
pte_chain           1302   3000    128  100  100    1 : 1008  252
pae_pgd               30    464     64    8    8    1 : 1008  252
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            2      2 131072    2    2   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      0  32768    0    0    8 :    0    0
size-16384(DMA)        0      0  16384    0    0    4 :    0    0
size-16384            23     23  16384   23   23    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              6      6   8192    6    6    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :  240   60
size-4096            425    425   4096  425  425    1 :  240   60
size-2048(DMA)         0      0   2048    0    0    1 :  240   60
size-2048            157    184   2048   92   92    1 :  240   60
size-1024(DMA)         0      0   1024    0    0    1 :  496  124
size-1024            132    132   1024   33   33    1 :  496  124
size-512(DMA)          0      0    512    0    0    1 :  496  124
size-512             344    344    512   43   43    1 :  496  124
size-256(DMA)          0      0    256    0    0    1 : 1008  252
size-256            4245   4245    256  283  283    1 : 1008  252
size-128(DMA)          0      0    128    0    0    1 : 1008  252
size-128            1991   2040    128   68   68    1 : 1008  252
size-64(DMA)           0      0    128    0    0    1 : 1008  252
size-64              330    330    128   11   11    1 : 1008  252
size-32(DMA)           0      0     64    0    0    1 : 1008  252
size-32              966   1218     64   21   21    1 : 1008  252

Can somebody help explain?
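
(For reference, taking the object counts and sizes in the dump above at face
value, the big lowmem consumers are:

  buffer_head:  3,210,480 x 108 bytes  ~= 330MB
  journal_head:    27,874 x  48 bytes  ~= 1.3MB

which lines up with the ~350MB slabcache figure Larry quoted in comment #10.)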



Comment 23 Graham Lenton 2004-07-16 10:32:28 UTC
Created attachment 101956 [details]
/proc/slabcache

slabcache when machine is unresponsive

Comment 24 Graham Lenton 2004-07-16 10:34:34 UTC
re #20 thriller - this is on RHEL2.1 right ? I want to be sure that
RHEL3 doesn't have this bug as well before I upgrade ...

Comment 25 thriller 2004-07-19 01:44:37 UTC
I've tried this on rhel3 U1 & U2, both of them have this error. 

Comment 26 Larry Woodman 2004-07-19 19:08:27 UTC
Tommy, can you attach "cat /proc/meminfo" output from the rhel3 system
that is experiencing this problem?  AS2.1 does not support > 16GB of
memory and RHEL3 supports up to 16GB with the smp kernel and up to
32GB with the hugemem kernel.

Larry


Comment 27 thriller 2004-07-20 06:57:07 UTC
This is the smp kernel; the meminfo just before the process killer kicked out
my login is:

[root@ml570-1 root]# cat /proc/meminfo 
        total:    used:    free:  shared: buffers:  cached:
Mem:  33448136704 14163038208 19285098496        0  4214784 13736730624
Swap: 2097434624        0 2097434624
MemTotal:     32664196 kB
MemFree:      18833104 kB
MemShared:           0 kB
Buffers:          4116 kB
Cached:       13414776 kB
SwapCached:          0 kB
Active:          23716 kB
ActiveAnon:       6112 kB
ActiveCache:     17604 kB
Inact_dirty:  12763024 kB
Inact_laundry:  318872 kB
Inact_clean:    319384 kB
Inact_target:  2684996 kB
HighTotal:    32243676 kB
HighFree:     18817196 kB
LowTotal:       420520 kB
LowFree:         15908 kB
SwapTotal:     2048276 kB
SwapFree:      2048276 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB

Comment 28 Graham Lenton 2004-07-21 11:30:44 UTC
Tommy - have you tried the hugemem kernel? As Larry says, that one
should be ok for 32GB.
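
(On RHEL3 that is a separate kernel package; from memory -- worth
double-checking the exact name -- it is installed with something like:

  up2date kernel-hugemem
  # after a reboot, uname -r should report a ...hugemem kernel)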

Comment 29 thriller 2004-07-23 06:10:28 UTC
Yeah, we've tried the hugemem kernel and that issue did not happen.

I've read the RHEL3 release notes; they do say that systems with more than
16GB of memory should use the hugemem kernel. But they also indicate support
for up to 64GB of memory. Where does 'up to 32GB' come from?

Comment 30 Graham Lenton 2004-07-23 08:09:47 UTC
Thanks Tommy. Why didn't you stick with the hugemem kernel ?

Comment 32 John Flanagan 2004-12-13 20:06:28 UTC
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-505.html


Comment 34 Kalyan 2006-09-27 20:44:50 UTC
We have a 4 CPU dual-core DL580 G3 with 32GB of memory that is having similar
issues on and off. I don't have a decent way of reproducing the error. I am
running the 2.4.9-e.57enterprise kernel. This is our Oracle DB server.
Currently, we are running without BigPages enabled. I have read a lot online
saying that the memory needed to store the PTE entries will shrink by a factor
of 1024 as the page size goes from 4KB to 4MB (see the rough arithmetic after
the list below). To go with this theory and put this box into production, I
need to determine two things.
 
1. What is a precise way of reproducing the error?
Currently, the server hangs with ssh and the console unavailable, and there are
'ENOMEM in journal_alloc_journal_head' msgs on the console. This happens
when I create a high load by running 10 instances of dd if=/dev/zero
of=test1file21 bs=4096k count=2000 &, and with network copies like scp or
writes to an nfs-mounted file. However, I have been able to crash the server
only 2 out of some 40 times this way.

2. Once I have a precise way of crashing it, I need to see if enabling BigPages
will make it go away.
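
(Rough arithmetic behind that factor of 1024, with made-up numbers purely for
illustration: a hypothetical 10GB SGA mapped with 4KB pages needs
10GB / 4KB = 2,621,440 PTEs; at 8 bytes per PAE entry that is ~20MB of page
tables for every Oracle process that maps the SGA, so 50 server processes
would burn ~1GB of lowmem on page tables alone. With 4MB pages the same
mapping needs 1/1024 as many entries, i.e. about 20KB per process.)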

Thanks in advance. We are stuck with the 2.4 kernel because of Oracle 9i.

#cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  33773588480 31810072576 1963515904 18690871296 30732288 12716052480
Swap: 2147426304        0 2147426304
MemTotal:     32982020 kB
MemFree:       1917496 kB
MemShared:    18252804 kB
Buffers:         30012 kB
Cached:       12418020 kB
SwapCached:          0 kB
Active:          31560 kB
Inact_dirty:  18272676 kB
Inact_clean:  12396600 kB
Inact_target:  8245292 kB
HighTotal:    32635748 kB
HighFree:      1817460 kB
LowTotal:       346272 kB
LowFree:        100036 kB
SwapTotal:     2097096 kB
SwapFree:      2097096 kB
BigPagesFree:        0 kB
