Bug 137270

Summary:

"eth0: memory shortage" abort truncates very large file xfer

Product:

Red Hat Enterprise Linux 3

Reporter:

Robert G. 'Doc' Savage <dsavage>

Component:

kernel

Assignee:

John W. Linville <linville>

Status:

CLOSED NOTABUG

QA Contact:

Severity:

medium

Docs Contact:

Priority:

medium

Version:

3.0

CC:

petrides, riel

Target Milestone:

---

Target Release:

---

Hardware:

athlon

OS:

Linux

URL:

http://seclists.org/lists/linux-kernel/2002/Jul/0395.html

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2004-11-22 20:52:42 UTC

Attachments:

Description	Flags
3c59x-napi.patch	none

Description Robert G. 'Doc' Savage 2004-10-27 03:55:46 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Gecko/20040922

Description of problem:
Using dd and nc to copy partition image files from FC1 laptop to
RHEL3/ES(u3) workstation with 1.1T RAID5 data space (18% fill):

listener:  # nc -l -p 30000 > /pub/images/hda5.img
source:    # dd if=/dev/hda5 bs=2048 | nc 192.168.1.2 30000 -w 3

Transfer of 35G partition image consistently fails at 9.8G point.
/var/log/messages on listener contains multiple reports of:

Oct 26 17:49:05 lion kernel: eth0: memory shortage
Oct 26 17:50:16 lion last message repeated 2 times
Oct 26 17:53:07 lion kernel: eth0: memory shortage
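
A minimal sketch for tallying these reports, assuming a syslog-style file in the format shown above. The function name count_shortage is hypothetical. It adds the N from a "last message repeated N times" line only when that line directly follows a matching report.

```shell
# Hypothetical helper: count "eth0: memory shortage" reports in a syslog
# file, including those collapsed into "last message repeated N times".
count_shortage() {
    awk '
        /kernel: eth0: memory shortage/ { n++; prev = 1; next }
        prev && /last message repeated [0-9]+ times/ {
            # The number follows the word "repeated".
            for (i = 1; i < NF; i++)
                if ($i == "repeated") n += $(i + 1)
            next
        }
        { prev = 0 }
        END { print n + 0 }
    ' "$@"
}
```

Typical use would be `count_shortage /var/log/messages`. For the three log lines quoted above it counts four reports.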

Listener system is Tyan S2468UGN w/dual 3c59x interfaces (only one
active), dual Athlon MP 2800+, 3G PC2100 registered ECC RAM, and 8G swap.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-20.EL

How reproducible:
Always

Steps to Reproduce:
1. Set up netcat listener with 3c59x 
2. Select >10G partition to image
3. Pipe dd thru nc to listener using 100Base-T Etherswitch
    

Actual Results:  dd process aborts when destination file reaches 9.8G

Expected Results:  Full partition image should have transferred
successfully.

Additional info:

URL indicates this bug has been around for some time and has probably
survived multiple beta cycles. The code fragment in 3c59x.c quoted in
the URL is identical to the code in the RHEL3 kernel source.

Oddly enough, a 10G image of /dev/hda2 transfers successfully without
errors. The larger 35G /dev/hda5 image fails before the destination
filesize reaches 10G (9.8G).

Source is relatively slow 5400 rpm 48G EIDE laptop drive in 1G P-III/M
IBM A22p ThinkPad. Destination is fast RAID array composed of Adaptec
2000S Zero Channel RAID controller, two on-board U160 channels, and
nine 146.8G Seagate 10krpm U320 Cheetah drives. Source should not be
able to overdrive the destination.
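
Because the sender is a pipeline, the shell normally reports only nc's exit status, so a failure in dd can go unnoticed. The following is a sketch in portable sh that records dd's status separately. The name send_image and the temporary status file are illustrative and not part of the procedure above. The sink command is passed as arguments.

```shell
# Hypothetical wrapper: pipe dd into an arbitrary sink command and return
# dd's own exit status rather than the sink's.
send_image() {
    src=$1
    shift
    status_file=$(mktemp) || return 1
    # A plain pipeline hides dd's status, so write it to a temp file.
    { dd if="$src" bs=2048; echo $? > "$status_file"; } | "$@"
    rc=$(cat "$status_file")
    rm -f "$status_file"
    return "$rc"
}
```

For the transfer described here the call would look like `send_image /dev/hda5 nc 192.168.1.2 30000 -w 3`, followed by a check of `$?`.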

Comment 1 John W. Linville 2004-10-28 20:47:07 UTC

In fact, the code in question remains all the way 'til today...

I quickly tried to reproduce this using FC3 Test 3 on an x86_64 w/ a
3c905B card, but it was successful.

Can you attach the output of running sysreport on the listener?  Do
you have any more specific recreation instructions?

I'll attach some information from my test in the next comment.  I'll
also install RHEL3 on this box and retry...

Comment 2 John W. Linville 2004-10-28 20:48:14 UTC

sender:

Disk /dev/hda: 40.0 GB, 40000000000 bytes
255 heads, 63 sectors/track, 4863 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          13      104391   83  Linux
/dev/hda2              14        4609    36917370   83  Linux
/dev/hda3            4610        4863     2040255   82  Linux swap

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
stepping        : 9
cpu MHz         : 2392.376
cache size      : 512 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
bogomips        : 4718.59

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
stepping        : 9
cpu MHz         : 2392.376
cache size      : 512 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
bogomips        : 4767.74

dd if=/dev/hda2 bs=2048 | nc 172.16.59.183 30000 -w 3
18458685+0 records in
18458685+0 records out

listener:

06:03.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX
[Cyclone] (rev 30)

06:03.0 Class 0200: 10b7:9055 (rev 30)
        Subsystem: 10b7:9055
        Flags: bus master, medium devsel, latency 32, IRQ 193
        I/O ports at bc00 [size=128]
        Memory at feaffc00 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at feac0000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 1

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 3
model name      :                    Genuine Intel(R) CPU 3.20GHz
stepping        : 4
cpu MHz         : 3194.190
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm pni monitor ds_cpl cid cx16 xtpr
bogomips        : 6307.84
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

nc -l -p 30000 > hd.img

-rw-r--r--  1 root root 37803386880 Oct 28 16:35 hd.img

Comment 3 John W. Linville 2004-10-29 14:37:04 UTC

OK, I am seeing the "eth0: memory shortage" messages under RHEL3 U4...

There is a patch pending that upgrades the RHEL3 3c59x to be almost
identical to the RHEL4 version of the driver.  It is associated w/ bug
133843 and will probably be in RHEL3 U5 (it was too late for U4).  I
will try that and see if the messages disappear...

Comment 4 John W. Linville 2004-10-29 18:13:33 UTC

Sadly, that did not seem to remove the problem...it would seem that
the problem is not totally inside the driver...

I'll keep thinking... :-)

Comment 5 Robert G. 'Doc' Savage 2004-10-30 02:33:36 UTC

My listener system has a dual Athlon (32-bit) motherboard with dual
on-board 3c920 NICs. For complete hardware details see:
ftp://ftp.tyan.com/datasheets/d_s2468_150.pdf

To reproduce the problem exactly...

The objective of the following is to take a safety snapshot of my
laptop hard drive's partitions prior to upgrading from FC1 to FC3t3.
By making bit-wise copies of the partitions and of the MBR, I can do a
bare-metal restoration should anything go wrong during the upgrade.

Source system:
$ cat hda_fdisk.txt
Disk /dev/hda: 48.0 GB, 48004669440 bytes
16 heads, 63 sectors/track, 93015 cylinders, total 93759120 sectors
Units = sectors of 1 * 512 = 512 bytes
FileSys    Device Boot    Start       End    Blocks   Id  System
/boot   /dev/hda1   *        63    211679    105808+  83  Linux
/       /dev/hda2        211680  20699279  10243800   83  Linux
swap    /dev/hda3      20699280  22785839   1043280   82  Linuxswap
--      /dev/hda4      22785840  93759119  35486640    f  Win95 Ext'd
(LBA)
/pub    /dev/hda5      22785903  93759119  35486608+  83  Linux

1. Set up listener:
   # nc -l -p 30000 > /pub/images/hda1.img

2. Boot laptop with Helix 1.5 forensic CD and dd the first partition
image to the listener. (Using CD so no source partitions have to be
mounted.)
   # dd if=/dev/hda1 bs=2048 | nc 192.168.1.2 30000 -w 3
   Result on listener:
   # ls -gG /pub/images/hda1.img
   -rw-rw-r--    1 108347904 Oct 24 18:20 hda1.img

3. Repeat 1 and 2 for /dev/hda2. Result on listener:
   # ls -gG /pub/images/hda2.img
   -rw-rw-r--     1 10489651200 Oct 24 18:42 hda2.img

4. Repeat 1 and 2 for /dev/hda3. Result on listener:
   # ls -gG /pub/images/hda3.img
   -rw-rw-r--     1 1068318720 Oct 24 19:03 hda3.img

5. Repeat 1 and 2 for /dev/hda4. Result on listener:
   # ls -gG /pub/images/hda4.img
   -rw-rw-r--     1     1024 Oct 24 19:04 hda4.img

6. Repeat 1 and 2 for /dev/hda5. Result on listener:
   # ls -gG /pub/images/hda5.img
   -rw-rw-r--     1 9882988544 Oct 24 22:03 hda5.img
   Note that the size of this image file should be 36338319360 bytes.
   The dd | nc session consistently fails with "eth0: memory shortage"
   errors at approximately 9883xxxxxx bytes. Note carefully that the
   hda2.img, which is larger than the partial hda5.img, is
   consistently created without incident or errors.
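
The expected image sizes follow from the sector ranges in the fdisk listing above: (End - Start + 1) * 512 bytes. A small sketch of that arithmetic follows. partition_bytes is a hypothetical helper, and it assumes the shell's arithmetic is 64-bit, since the values exceed 2^31.

```shell
# Hypothetical helper: size in bytes of a partition, given its first and
# last 512-byte sector as listed by fdisk in sector units.
partition_bytes() {
    echo $(( ($2 - $1 + 1) * 512 ))
}

partition_bytes 63 211679           # hda1
partition_bytes 211680 20699279     # hda2
partition_bytes 22785903 93759119   # hda5
```

The first two results, 108347904 and 10489651200, match the received hda1.img and hda2.img. For hda5 the same arithmetic gives 36338287104 bytes. The 36338319360 figure quoted in step 6 corresponds to the enclosing extended partition hda4, which starts 63 sectors earlier.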

Comment 7 John W. Linville 2004-11-09 15:35:19 UTC

I am still looking at this (although I will be out of town for the
rest of this week)...

The "eth0: memory shortage" message comes from the 3c59x driver when
it starts failing to allocate memory for incoming packets.
Presumably your netcat sessions time out when the delays caused by the
memory shortage get big enough to kill the session.

I repeated the test w/ output redirected to /dev/null.  I did not get
any of the "eth0: memory shortage" messages, and the test completed
about 10% faster.  To me, this suggested that at least some of the
congestion is caused by writing to disk.

Are there any netcat options (perhaps -i and/or -w) that can improve
netcat's tolerance?

Is it possible you could use a receiver system w/ a faster disk subsystem?

Please note that I never saw netcat fail.  My sender and receiver are
very "close" on the network.  Could you improve the network between
the sender and the receiver?

I have a preliminary NAPI patch for the 3c59x driver.  I didn't see
much difference with it, but it may be worth testing?  I'll attach it
in case you are feeling frisky... :-)  It is against RHEL3 U4, BTW...

Comment 8 John W. Linville 2004-11-09 15:36:39 UTC

Created attachment 106349
3c59x-napi.patch

NAPI may improve the driver's performance enough to make a difference...?

Comment 9 Robert G. 'Doc' Savage 2004-11-10 00:45:54 UTC

Got the patch, but not U4. Will it work against the U3 source?

I've been using netcat with '-w 3'. Maybe adding a few more seconds
will keep it from timing out. Haven't tried the '-i' option before.

It'd be rather hard to find a faster disk system. I invested quite a
lot of money to make this one very big and very fast :-). It has nine
10K rpm U320 drives in a hardware RAID5 array on a U160 channel. The
sender is a measly 5400 rpm EIDE notebook drive.

Like yours, my sender and receiver are also close -- on the same desk
with the Fast Etherswitch. Total patch cable length is about 10 feet.

Comment 10 Robert G. 'Doc' Savage 2004-11-12 10:37:34 UTC

Using the '-i' option slows things down to an impossible crawl.
Increasing the wait time interval to '-w 30' changes nothing. The
transfer still fails at exactly the same place in two attempts. This
is too precise to be explained by ring buffer exhaustion.

Despite the error message we're seeing, I'm now skeptical of the buggy
driver hypothesis. I now suspect we're seeing the end result of
something else entirely, but haven't a clue what it might be. What's
so special about a repeating file size of 9,883,033,600 bytes on a
disk with 142,794 logical 8,225,280 byte cylinders??

Comment 11 Robert G. 'Doc' Savage 2004-11-13 21:19:10 UTC

I re-ran the transfer sequence after first slowing the source NIC down
to 10Mbps. This time the transfer took more than 12 hours rather than
22 minutes, but the listener again aborted when the received file size
reached 9,883,033,600 bytes. There were no eth0:... errors in the log
file. The exact error reported on the source end was:

  # dd if=/dev/hda5 bs=2048 | nc 192.168.1.2 30000 -w 3
  dd: reading `/dev/hda5': Input/output error
  4825700+0 records in
  4825700+0 records out

Since slowing the transfer rate by 90% gives the same result, I think
it's unlikely that we're overdriving the listener's NIC or the large
drive array. It must be something that's speed-independent, like an
overflowing counter. In nc? In dd? Where?
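
The dd record count pins down how far the read got. Below is a sketch of the arithmetic, assuming the 2048-byte block size used above and the hda5 start sector (22785903) from the fdisk listing in comment 5. The helper names are hypothetical.

```shell
# Bytes copied: complete records times block size.
dd_bytes() {
    echo $(( $1 * $2 ))
}

# Absolute 512-byte sector on the disk where the next read would begin,
# given the partition's first sector and the byte offset reached.
next_sector() {
    echo $(( $1 + $2 / 512 ))
}

offset=$(dd_bytes 4825700 2048)
echo "$offset"
next_sector 22785903 "$offset"
```

The offset comes to 9883033600 bytes, the same file size reported in comment 10, and the next read would start at absolute sector 42088703. An identical stopping point at both link speeds is consistent with a speed-independent cause.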

Comment 12 John W. Linville 2004-11-15 18:54:14 UTC

Strange, as I was always able to complete my transfers.  Hmmm...

Have you tried this?

   dd if=/dev/hda5 bs=2048 > /dev/null

I'd be curious to know if the results are different.  Have you tried
the test w/ a different sender?
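
A sketch of that read-only test, wrapped so the outcome is explicit. read_check is a hypothetical name, and the argument would be /dev/hda5 in this case. It takes the network and the listener out of the picture entirely.

```shell
# Hypothetical helper: read a device or file end to end, discard the
# data, and report whether dd hit a read error.
read_check() {
    if dd if="$1" bs=2048 of=/dev/null; then
        echo "read OK: $1"
    else
        echo "read FAILED: $1" >&2
        return 1
    fi
}
```

If this fails at the same record count as the networked transfer, the problem lies on the source side.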

Comment 13 Robert G. 'Doc' Savage 2004-11-22 11:31:24 UTC

Sorry to be so tardy getting back to you, John. Your suggestion to
pipe to /dev/null ultimately led me to the answer: A consecutive
string of 160 bad sectors had formed on the source drive. It was
killing every dd transfer with an input/output error (at the same
place, of course). A friend turned me on to dd_rescue which has the
ability to survive those errors and keep imaging. (Red Hat should
consider adding dd_rescue to the RHEL bag of tools.)
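
Where dd_rescue is not available, plain dd can be told to continue past read errors. This is a different technique from dd_rescue and is shown only as a sketch. conv=noerror keeps going after a failed read, and conv=sync pads each short or failed input block with zeros so that later data stays at the correct offset. The function name and the 512-byte block size are assumptions. A small block size limits how much good data is zero-filled around each bad sector, at the cost of speed.

```shell
# Hypothetical helper: image a source to a file, continuing past read
# errors and zero-filling the blocks that could not be read.
image_skip_errors() {
    dd if="$1" of="$2" bs=512 conv=noerror,sync
}
```

Because conv=sync pads a final partial block, the output size is always a multiple of 512 bytes. For a disk partition that matches the source size.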

Certainly those "eth0: memory shortage" errors still need to be
tracked down and fixed, but in the end they turned out to be a red
herring and not the cause of this problem. 

I'm very sorry to have wasted your time on this wild goose chase.

Comment 14 Ernie Petrides 2004-11-22 20:52:42 UTC

Thanks for the update, Robert.  I'll close this as NOTABUG,
since John already explained the eth0 messages in comment #7.