716623 – Frequent page allocation failures - forcedeth-related?

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	15
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2011-06-25 16:57 UTC by Adam Huffman
Modified:	2012-06-06 15:27 UTC
CC List:	6 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2012-06-06 15:27:28 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
kernel error log (5.42 KB, text/plain) 2011-06-25 16:57 UTC, Adam Huffman

Description Adam Huffman 2011-06-25 16:57:44 UTC

Created attachment 509910
kernel error log

Description of problem:

I have a couple of Asrock ION boxes that use NFS quite heavily.  In F13 and F14 I saw lots of kernel errors that seemed to be correlated with significant network traffic.  I've just updated one of them to F15 and the error rate seems to have increased.

An example is attached.



Version-Release number of selected component (if applicable):
2.6.38.8-32.fc15.x86_64


Comment 1 Adam Huffman 2011-06-25 16:58:39 UTC

lspci:


00:00.0 Host bridge: nVidia Corporation MCP79 Host Bridge (rev b1)
00:00.1 RAM memory: nVidia Corporation MCP79 Memory Controller (rev b1)
00:03.0 ISA bridge: nVidia Corporation MCP79 LPC Bridge (rev b2)
00:03.1 RAM memory: nVidia Corporation MCP79 Memory Controller (rev b1)
00:03.2 SMBus: nVidia Corporation MCP79 SMBus (rev b1)
00:03.3 RAM memory: nVidia Corporation MCP79 Memory Controller (rev b1)
00:03.5 Co-processor: nVidia Corporation MCP79 Co-processor (rev b1)
00:04.0 USB Controller: nVidia Corporation MCP79 OHCI USB 1.1 Controller (rev b1)
00:04.1 USB Controller: nVidia Corporation MCP79 EHCI USB 2.0 Controller (rev b1)
00:08.0 Audio device: nVidia Corporation MCP79 High Definition Audio (rev b1)
00:09.0 PCI bridge: nVidia Corporation MCP79 PCI Bridge (rev b1)
00:0a.0 Ethernet controller: nVidia Corporation MCP79 Ethernet (rev b1)
00:0b.0 SATA controller: nVidia Corporation MCP79 AHCI Controller (rev b1)
00:10.0 PCI bridge: nVidia Corporation MCP79 PCI Express Bridge (rev b1)
00:15.0 PCI bridge: nVidia Corporation MCP79 PCI Express Bridge (rev b1)
01:00.0 VGA compatible controller: nVidia Corporation ION VGA (rev b1)
02:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01)

Comment 2 Chuck Ebbert 2011-06-27 05:11:18 UTC

It's trying to find 16k of contiguous space for packets. Are you using large packets?
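
As a rough illustration of where a 16k request can come from, here is a minimal sketch that rounds a receive buffer up to a power of two and converts it to a page-allocation order. It is a simplified model, not the forcedeth code: the 4 KiB page size is an assumption for x86_64, and the `overhead` value is a hypothetical allowance for headers and padding, since the driver's real figure is not given in this report.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86_64


def buffer_order(mtu, overhead=128):
    """Return (rounded_size, order) for a buffer holding one frame.

    `overhead` is a hypothetical per-buffer allowance; the real
    driver overhead is not stated in this bug report.
    """
    needed = mtu + overhead
    # Simplified model: round the request up to the next power of two.
    size = 1
    while size < needed:
        size *= 2
    # Number of contiguous pages backing that size (at least one).
    pages = max(1, size // PAGE_SIZE)
    # pages is a power of two, so its log2 is bit_length() - 1.
    order = pages.bit_length() - 1
    return size, order


for mtu in (1500, 9000):
    size, order = buffer_order(mtu)
    print(f"MTU {mtu}: {size} bytes, order {order}")
```

Under these assumptions an MTU of 9000 needs just over 8k, which rounds up to 16k, that is, four physically contiguous pages (order 2), while an MTU of 1500 fits inside a single page.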

Comment 3 Adam Huffman 2011-06-27 12:24:18 UTC

Yes, MTU is set to 9000 on that interface.  Is that not advised with forcedeth?  I tried something similar on a different machine on the same network, with a different chipset, and the driver in that case gives a warning when the MTU is set above 1500.  I didn't see a similar warning with the machine referred to here.

Is this the sort of thing that would be fixed by adding more RAM, or is it the result of memory fragmentation, which would happen regardless of the total size?

Comment 4 Adam Huffman 2011-06-27 22:19:01 UTC

I've switched back to MTU=1500 and there's no sign of those failures so far.

Comment 5 Chuck Ebbert 2011-06-29 14:16:50 UTC

(In reply to comment #3)
> Yes, MTU is set to 9000 on that interface.  Is that not advised with forcedeth?
>  I tried something similar on a different machine on the same network, with a
> different chipset, and the driver in that case gives a warning when the MTU is
> set above 1500.  I didn't see a similar warning with the machine referred to
> here.
> 
> Is this the sort of thing that would be fixed by adding more RAM, or is it the
> result of memory fragmentation, which would happen regardless of the total
> size?

It's due to fragmentation; you might be able to set MTU to something around 3500 bytes and still only use 1-page allocations. (I'm not sure what the overhead is.)
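
To see why fragmentation rather than total RAM matters here, one option is to look at /proc/buddyinfo, which lists the number of free blocks per allocation order (order 0 first) for each memory zone. The sketch below parses that format; the sample line is invented for illustration and is not taken from the affected machine, and the 4 KiB page size is an assumption.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Invented sample in /proc/buddyinfo format (not from the reporter's system).
SAMPLE = (
    "Node 0, zone   Normal   9000   1200      0      0"
    "      0      0      0      0      0      0      0\n"
)


def parse_buddyinfo(text):
    """Map (node, zone) to the list of free-block counts per order."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5 or parts[0] != "Node":
            continue
        node = parts[1].rstrip(",")
        zone = parts[3]
        zones[(node, zone)] = [int(n) for n in parts[4:]]
    return zones


def summarize(counts, min_order=2):
    """Return (free KiB in the zone, free blocks of at least min_order)."""
    free_bytes = sum(n * (PAGE_SIZE << order) for order, n in enumerate(counts))
    big_blocks = sum(counts[min_order:])
    return free_bytes // 1024, big_blocks


for key, counts in parse_buddyinfo(SAMPLE).items():
    free_kib, big_blocks = summarize(counts)
    print(key, f"{free_kib} KiB free, {big_blocks} blocks of order >= 2")
```

In the invented sample the zone has about 45 MB free, yet no free block of order 2 or larger, so a 16k contiguous request would fail there no matter how much memory is free in total. On a live system the same functions could be fed the contents of /proc/buddyinfo instead of `SAMPLE`.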

Comment 6 Sergio Basto 2011-07-14 16:19:05 UTC

Have you tried the new kernel update, kernel 2.6.38.8-35?

Comment 7 Sergio Basto 2011-07-14 16:22:45 UTC

My issue is fixed in kernel 2.6.38.8-35. I think it was this change:

* Wed Jul 06 2011 Chuck Ebbert <cebbert> 2.6.38.8-35
- Revert SCSI/block patches from 2.6.38.6 that caused more problems
