Bug 51121 - networking crashes with mm: critical shortage of bounce buffers
Status: CLOSED ERRATA
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686 Linux
Priority: medium Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brock Organ
Reported: 2001-08-07 11:43 EDT by Neil Prockter
Modified: 2007-04-18 12:35 EDT (History)

Doc Type: Bug Fix
Last Closed: 2001-12-31 10:04:40 EST
Description Neil Prockter 2001-08-07 11:43:11 EDT
Description of Problem:

During an rsync, networking stops and the console shows:
mm: critical shortage of bounce buffers

How Reproducible:


Steps to Reproduce: 
1. use linux-2.4.3-12
2. rsync a lot of data
3. look at console
Actual Results:
networking stops

Expected Results:
none

Additional Information:
console shows
mm: critical shortage of bounce buffers.
eth0: can't fill rx buffer (force 0)!
eth0: can't fill rx buffer (force 1)!
eth0: can't fill rx buffer (force 1)!
eth0: can't fill rx buffer (force 1)!
eth0: can't fill rx buffer (force 0)!

/var/log/messages
Aug  7 14:19:09 localhost kernel: mm: critical shortage of bounce buffers.
Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.
Aug  7 14:46:24 localhost kernel: eth0: can't fill rx buffer (force 0)!
Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.
Aug  7 14:46:24 localhost last message repeated 7 times
Aug  7 14:46:24 localhost kernel: eth0: can't fill rx buffer (force 1)!
Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.
Aug  7 14:46:24 localhost last message repeated 3 times
Aug  7 14:46:24 localhost kernel: eth0: can't fill rx buffer (force 1)!
Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.
Aug  7 14:46:24 localhost last message repeated 3 times
Aug  7 14:46:24 localhost kernel: eth0: can't fill rx buffer (force 1)!
Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.
Aug  7 14:46:24 localhost last message repeated 3 times
Aug  7 14:46:24 localhost kernel: eth0: can't fill rx buffer (force 0)!
Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.
Aug  7 14:46:24 localhost last message repeated 4 times
Comment 1 Arjan van de Ven 2001-08-07 11:45:28 EDT
1) Which NIC is this?
2) Did the machine survive?
Comment 2 Neil Prockter 2001-08-07 14:15:16 EDT
The machine survived, and after ifdown eth0 followed by ifup eth0 networking
started working again (so far).

The NIC is, according to lspci, an Intel Corporation 82557 [Ethernet Pro 100] (rev 08).

The full lspci output (with a nice number of unknown devices!) is:

00:00.0 Host bridge: ServerWorks CNB20HE (rev 23)
00:00.1 Host bridge: ServerWorks CNB20HE (rev 01)
00:00.2 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:00.3 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: ServerWorks OSB4 (rev 50)
00:0f.1 IDE interface: ServerWorks: Unknown device 0211
00:0f.2 USB Controller: ServerWorks: Unknown device 0220 (rev 04)
01:02.0 PCI bridge: Intel Corporation: Unknown device 0962 (rev 01)
01:02.1 RAID bus controller: Dell Computer Corporation PowerEdge Expandable RAID Controller 3/Di (rev 01)
02:04.0 SCSI storage controller: Adaptec RAID subsystem HBA (rev 01)
02:04.1 SCSI storage controller: Adaptec 7899P (rev 01)
Comment 3 Neil Prockter 2001-08-07 14:25:42 EDT
BTW, the linux-kernel mailing list mentions this phenomenon as being due to
problems in 2.4.3-ac and fixed in 2.4.6?
Comment 4 Arjan van de Ven 2001-08-07 15:24:54 EDT
The "mm: critical shortage" message is harmless in itself; it's just an early
warning that you're getting low on memory. It seems the eepro100 driver doesn't
cope with that properly though, and that's not good.
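
To see how the memory warnings and the driver errors line up, the log entries can be tallied. The following is only a rough sketch: the function and pattern names are made up for illustration, it assumes each syslog entry sits on a single line, and it credits a "last message repeated N times" line to the event logged just before it.

```python
import re
from collections import Counter

# Substrings taken from the console and /var/log/messages output in this report.
# The dictionary keys are illustrative names, not kernel identifiers.
PATTERNS = {
    "bounce_shortage": "mm: critical shortage of bounce buffers",
    "alloc_failed": "__alloc_pages: 0-order allocation failed",
    "rx_fill_failed": "can't fill rx buffer",
}

REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def tally(lines):
    """Count each event; 'repeated N times' lines add N to the previous event."""
    counts = Counter()
    last = None
    for line in lines:
        m = REPEAT_RE.search(line)
        if m:
            if last is not None:
                counts[last] += int(m.group(1))
            continue
        last = None
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
                last = name
                break
    return counts


# A short excerpt in the same shape as the log in the description.
sample = [
    "Aug  7 14:19:09 localhost kernel: mm: critical shortage of bounce buffers.",
    "Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.",
    "Aug  7 14:46:24 localhost kernel: eth0: can't fill rx buffer (force 0)!",
    "Aug  7 14:46:24 localhost kernel: __alloc_pages: 0-order allocation failed.",
    "Aug  7 14:46:24 localhost last message repeated 7 times",
    "Aug  7 14:46:24 localhost kernel: eth0: can't fill rx buffer (force 1)!",
]
counts = tally(sample)
for name, n in counts.items():
    print(f"{name}: {n}")
```

On a real system the same function could be fed the lines of /var/log/messages instead of the sample list.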
Comment 5 Neil Prockter 2001-08-07 17:07:13 EDT
I have 1 GB of RAM and 2047 MB of swap. How can I be short on RAM when system
load is very low? rsync was the only major process running. I wasn't even
running X or Apache.

Anyway, what do I do next?

BTW, thanks for the help so far.
Comment 6 Arjan van de Ven 2001-08-07 17:12:59 EDT
"Low on memory" is a relative term. If the kernel hangs on to its disk cache too
firmly, it can actually run out of free space (or slightly start to do so; the
VM warning is printed well before REAL problems start).
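
One way to check whether the disk cache is what is holding the memory is to compare the free and cached figures in /proc/meminfo. The sketch below assumes the "Name: value kB" line format and the field names MemTotal, MemFree, Buffers and Cached; the sample numbers are hypothetical and are not taken from the reporter's machine.

```python
def parse_meminfo(text):
    """Return {field: kilobytes} for lines of the form 'Name:   12345 kB'.

    Lines in any other shape (for example summary header rows) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_share(fields):
    """Fraction of total RAM held by page cache and buffers."""
    return (fields["Cached"] + fields["Buffers"]) / fields["MemTotal"]


# Hypothetical numbers for a roughly 1 GB machine in the middle of a large copy.
sample = """\
MemTotal:      1030000 kB
MemFree:          4000 kB
Buffers:         60000 kB
Cached:         900000 kB
"""
info = parse_meminfo(sample)
share = cache_share(info)
print(f"free: {info['MemFree']} kB, cache+buffers: {share:.0%} of RAM")
```

To look at a live system, pass the contents of /proc/meminfo to parse_meminfo() instead of the sample string. A very small MemFree next to a very large Cached value would be consistent with the explanation above.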

There is an alternate driver for your network card. It's the e100 module. Some
people have great success with it, even under high load; other people are very
unhappy with it. It all depends on your hardware and luck. You might try it to
see if it fixes your problem.
(e.g., just change eepro100 to e100 in the /etc/modules.conf file)
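
The edit can also be scripted. This is a minimal sketch that assumes the driver is selected by a line such as "alias eth0 eepro100"; the sample file contents and the helper name switch_driver are hypothetical, since the actual modules.conf is not shown in this bug.

```python
import re


def switch_driver(conf_text, old="eepro100", new="e100"):
    """Replace `old` with `new` on alias lines only; other lines are left alone."""
    out = []
    for line in conf_text.splitlines(keepends=True):
        if line.lstrip().startswith("alias"):
            line = re.sub(rf"\b{re.escape(old)}\b", new, line)
        out.append(line)
    return "".join(out)


# Hypothetical /etc/modules.conf contents, for illustration only.
before = (
    "alias eth0 eepro100\n"
    "alias scsi_hostadapter aic7xxx\n"
    "# eepro100 was the default\n"
)
after = switch_driver(before)
print(after, end="")
```

The function only transforms text, so it is best tried on a copy of the file first. The old module still has to be unloaded (or the machine rebooted) before the new alias takes effect.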
Comment 7 Steve Kann 2001-08-13 14:21:31 EDT
We have seen this same error, also during an rsync.

The machine in question is a 440GX motherboard, with lspci showing the
following:
# lspci
00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge
00:0c.0 SCSI storage controller: Adaptec 7896
00:0c.1 SCSI storage controller: Adaptec 7896
00:0e.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:12.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
00:12.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01)
00:12.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01)
00:12.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
00:14.0 VGA compatible controller: Cirrus Logic GD 5480 (rev 23)
01:0f.0 PCI bridge: Digital Equipment Corporation DECchip 21150 (rev 06)

Also, on a _different_ machine, we've seen reboots during rsyncs, on ServerWorks
motherboard-based machines, with the following lspci output:

00:00.0 Host bridge: ServerWorks CNB20LE (rev 06)
00:00.1 Host bridge: ServerWorks CNB20LE (rev 06)
00:01.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:05.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:0f.0 ISA bridge: ServerWorks OSB4 (rev 50)
00:0f.1 IDE interface: ServerWorks: Unknown device 0211
00:0f.2 USB Controller: ServerWorks: Unknown device 0220 (rev 04)
01:06.0 SCSI storage controller: Symbios Logic Inc. (formerly NCR) 53c1010 Ultra3 SCSI Adapter (rev 01)
01:06.1 SCSI storage controller: Symbios Logic Inc. (formerly NCR) 53c1010 Ultra3 SCSI Adapter (rev 01)


Both of these machines have the same Ethernet controller, although different
SCSI subsystems, and have slightly different results.
Comment 8 Neil Prockter 2001-08-14 05:08:15 EDT
Hmmm, perhaps something is up with the Intel 82557?

I've used rsync for years with no trouble (numerous kernels) with other Intel
cards using the eepro100 driver.

I'll give the e100 driver a good hammering and see what happens.
Comment 9 Stephen Samuel 2001-12-31 10:04:35 EST
This appears to be a resurrection of bug 32758
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=32758

We are getting similar results on a box with an ether pro e100
00:11.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 0c)

Dec 30 09:39:54 web01 kernel: Out of Memory: Killed process 14095 (interchange).
Dec 30 09:43:21 web01 kernel: TCP: time wait bucket table overflow
Dec 30 09:45:36 web01 kernel: Out of Memory: Killed process 14038 (interchange).
Dec 30 09:48:21 web01 kernel: Out of Memory: Killed process 13879 (interchange).
Dec 30 09:51:23 web01 kernel: Out of Memory: Killed process 13968 (interchange).
Dec 30 09:54:50 web01 kernel: eth0: can't fill rx buffer (force 0)!
Dec 30 09:54:51 web01 kernel: eth0: can't fill rx buffer (force 1)!
Dec 30 09:54:51 web01 kernel: eth0: can't fill rx buffer (force 1)!
Dec 30 09:54:51 web01 kernel: eth0: card reports no resources.
Dec 30 09:54:58 web01 last message repeated 35 times
Dec 30 09:54:59 web01 kernel: Out of Memory: Killed process 14364 (interchange).
.....

Ours is a machine with 1.5 GB of RAM (700 MB currently used) and 2 GB of swap.

(Does this problem only occur on systems with large amounts of memory?)
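
As background rather than a diagnosis: bounce buffers come into play for pages in high memory, that is, RAM the 32-bit kernel does not map directly. The sketch below estimates how much RAM falls into that range. The 896 MB low-memory limit is an assumption (the usual figure for i686 with the default 3 GB/1 GB address split), and the function name highmem_mb is made up for illustration.

```python
# Assumption: about 896 MB of directly mapped low memory, the usual figure for
# 32-bit x86 with the default 3 GB user / 1 GB kernel address split.
LOWMEM_LIMIT_MB = 896


def highmem_mb(total_ram_mb, lowmem_limit_mb=LOWMEM_LIMIT_MB):
    """RAM above the low-memory limit, in MB (0 if everything fits below it)."""
    return max(0, total_ram_mb - lowmem_limit_mb)


# RAM sizes mentioned in this bug, plus a smaller hypothetical machine for contrast.
machines = {"1 GB": 1024, "1.5 GB": 1536, "512 MB (hypothetical)": 512}
report = {name: highmem_mb(ram) for name, ram in machines.items()}
for name, high in report.items():
    print(f"{name}: {high} MB of high memory")
```

Under that assumption, both machines reported here have some high memory while a 512 MB machine has none, which would fit the suspicion that only larger-memory systems are affected.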
Comment 10 Arjan van de Ven 2002-02-11 11:19:27 EST
2.4.9-21 has the VM under much better control and no longer has the critical
shortage problem.
