Created attachment 479686 [details]
kernel logs to /var/log/messages

Description of problem:
I am unable to do a full filesystem backup of a 3 TB RAID5 array using either XFS or EXT4 as the filesystem. The backup will run for a while, then kernel traces appear and the system eventually locks up. I had no trouble backing up this filesystem on RHEL5 on the same hardware.

Version-Release number of selected component (if applicable):
Fully updated RHEL6 Server, no fastrack. 2.6.32-71.14.1.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Run xfsdump or dump
2. System locks up

Actual results:
System lock up

Expected results:
I have backups and can sleep easier. thanks!
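For reference, the backup is kicked off with a plain level-0 dump along these lines (the paths and device below are illustrative examples, not the exact ones used):

# XFS filesystem, dumping to an NFS-mounted target:
xfsdump -l 0 -f /mnt/backup/host0.xfsdump /data
# EXT4 equivalent with dump:
dump -0u -f /mnt/backup/host0.dump /dev/md0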
Thanks for the report! While we look into this, you might consider using tar to back up your data.
Are you dumping over the network? Can you test your backup with local storage (assuming you have something large enough)? Thanks!
I am backing up over NFS. I don't have the ability to do local backups. I have tried a network rsync backup too; that also dumps kernel traces, but the system doesn't lock up. I figured this was all related to the kernel memory leaks reported on the rhel6 mailing list, but I am not sure. I have had nothing but problems with RHEL6 and NFSv3 and v4, so maybe that is related as well. Sorry that I am not of more help.
For some reason the free memory is so fragmented that the allocator can't satisfy an order-1 allocation. Since the GFP mask doesn't include GFP_FS or GFP_IO for sockets, it won't try to reclaim pagecache pages:

Node 0 DMA32: 26783*4kB 718*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 112876kB

I'm not sure why sk_prot_alloc() would result in an order-1 allocation; every socket-related slab on my RHEL6 system has one page per slab, which would result in an order-0 allocation. Daryl, can you get us a /proc/slabinfo output from your system?

Larry Woodman
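(Side note for anyone following along: that "Node 0 DMA32" line tells the story — nearly all the free memory is in 4 kB and 8 kB blocks, with nothing of 16 kB or larger. The same per-order free counts can be watched live; a minimal sketch:)

# Columns are free-block counts for order 0..10 (4 kB, 8 kB, ..., 4 MB):
watch -n1 cat /proc/buddyinfo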
Created attachment 479922 [details]
/proc/slabinfo

Thank you, attached is my /proc/slabinfo. I have 4 GB of memory and 6 GB of swap. The system is a Dell PowerEdge 1900 tower.
The problem is we are in an IRQ receiving a TCP packet, and sk_prot_alloc() calls the slab allocator for a struct that is allocated out of a slab with more than one page:

static struct sock *sk_prot_alloc(struct proto *prot, gfp_t priority, int family)
{
        struct sock *sk;
        struct kmem_cache *slab;

        slab = prot->slab;
        if (slab != NULL) {
                sk = kmem_cache_alloc(slab, priority & ~__GFP_ZERO);
                ...

Feb 18 21:15:10 xxx kernel: Call Trace:
Feb 18 21:15:10 xxx kernel: <IRQ> [<ffffffff8111ea06>] __alloc_pages_nodemask+0x706/0x850
Feb 18 21:15:10 xxx kernel: [<ffffffff811560e2>] kmem_getpages+0x62/0x170
Feb 18 21:15:10 xxx kernel: [<ffffffff81156cfa>] fallback_alloc+0x1ba/0x270
Feb 18 21:15:10 xxx kernel: [<ffffffff8115674f>] ? cache_grow+0x2cf/0x320
Feb 18 21:15:10 xxx kernel: [<ffffffff81156a79>] ____cache_alloc_node+0x99/0x160
Feb 18 21:15:10 xxx kernel: [<ffffffff811573db>] kmem_cache_alloc+0x11b/0x190
Feb 18 21:15:10 xxx kernel: [<ffffffff81403fc8>] sk_prot_alloc+0x48/0x180
Feb 18 21:15:10 xxx kernel: [<ffffffff81404212>] sk_clone+0x22/0x2a0
Feb 18 21:15:10 xxx kernel: [<ffffffff8144bd96>] inet_csk_clone+0x16/0xd0
Feb 18 21:15:10 xxx kernel: [<ffffffff81464ce3>] tcp_create_openreq_child+0x23/0x450
Feb 18 21:15:10 xxx kernel: [<ffffffff8146272d>] tcp_v4_syn_recv_sock+0x4d/0x280
Feb 18 21:15:10 xxx kernel: [<ffffffff81464aa1>] tcp_check_req+0x201/0x420
Feb 18 21:15:10 xxx kernel: [<ffffffff8146214b>] tcp_v4_do_rcv+0x35b/0x430
Feb 18 21:15:10 xxx kernel: [<ffffffff81463960>] tcp_v4_rcv+0x5b0/0x7e0
Feb 18 21:15:10 xxx kernel: [<ffffffff814418c0>] ? ip_local_deliver_finish+0x0/0x2d0
Feb 18 21:15:10 xxx kernel: [<ffffffff81436d74>] ? nf_hook_slow+0x74/0x100
Feb 18 21:15:10 xxx kernel: [<ffffffff814418c0>] ? ip_local_deliver_finish+0x0/0x2d0
Feb 18 21:15:10 xxx kernel: [<ffffffff8144199d>] ip_local_deliver_finish+0xdd/0x2d0
Feb 18 21:15:10 xxx kernel: [<ffffffff81441c28>] ip_local_deliver+0x98/0xa0
Feb 18 21:15:10 xxx kernel: [<ffffffff814410ed>] ip_rcv_finish+0x12d/0x440
Feb 18 21:15:10 xxx kernel: [<ffffffff81441675>] ip_rcv+0x275/0x350
Feb 18 21:15:10 xxx kernel: [<ffffffff8140fb4b>] netif_receive_skb+0x38b/0x670
Feb 18 21:15:10 xxx kernel: [<ffffffffa022b238>] bnx2_poll_work+0xd18/0x1240 [bnx2]
Feb 18 21:15:10 xxx kernel: [<ffffffff8126d470>] ? swiotlb_map_page+0x0/0x100
Feb 18 21:15:10 xxx kernel: [<ffffffffa022b889>] bnx2_poll+0x69/0x2d8 [bnx2]
Feb 18 21:15:10 xxx kernel: [<ffffffff814106d3>] net_rx_action+0x103/0x210
Feb 18 21:15:10 xxx kernel: [<ffffffff81095373>] ? hrtimer_get_next_event+0xc3/0x100
Feb 18 21:15:10 xxx kernel: [<ffffffff81073cd7>] __do_softirq+0xb7/0x1e0
Feb 18 21:15:10 xxx kernel: [<ffffffff810d8950>] ? handle_IRQ_event+0x60/0x170
Feb 18 21:15:10 xxx kernel: [<ffffffff810142cc>] call_softirq+0x1c/0x30
Feb 18 21:15:10 xxx kernel: [<ffffffff81015f35>] do_softirq+0x65/0xa0
Feb 18 21:15:10 xxx kernel: [<ffffffff81073ad5>] irq_exit+0x85/0x90
Feb 18 21:15:10 xxx kernel: [<ffffffff814d0425>] do_IRQ+0x75/0xf0
Feb 18 21:15:10 xxx kernel: [<ffffffff81013ad3>] ret_from_intr+0x0/0x11

When this happens the GFP mask is only __GFP_HIGH (0x20), so we will not try any page reclamation, even for defragmentation; therefore the allocation fails when there is not enough free contiguous memory, even though there is plenty free. We need to get someone more familiar with the network structure requirements to see why.

Larry Woodman
The exact cause of this failure: inet_init() calls proto_register(&tcp_prot, 1), which causes sk_alloc() to call kmem_cache_alloc() for a TCP struct. Since the TCP slab has 2 pages per slab, cache_grow() calls kmem_getpages() for an order-1 allocation. Since memory is so fragmented, the allocation fails.

cat /proc/slabinfo
slabinfo - version: 2.1
# name        <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
TCP                41           72       1664        4             2

Larry
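A quick way to spot every slab that, like TCP here, needs multi-page (order >= 1) backing allocations; a sketch that assumes the slabinfo version 2.1 column layout shown above:

# Field 6 is <pagesperslab>; the first two lines are headers.
awk 'NR > 2 && $6 > 1 { print $1 ": " $6 " pages per slab (order >= 1)" }' /proc/slabinfo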
Created attachment 480781 [details]
2.6.32-71.18.1 kernel logs

Thanks for the very kind help. For what it is worth, I tried the latest errata kernel and have attached the messages that appeared during the dump. I was watching closely and stopped the backup process when the messages appeared, so I think I may have saved the system from locking up so far :) I'd be happy to try any suggestions you may have.
I tried the RHEL6.1 beta kernel and was able to run dump over NFS for 12 hours without a kernel trace. Unfortunately, the dump was going so slowly (3 MB/s) that it would have taken a week to finish :(
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Hello, I am fine with closing this bugzilla. I have not seen any issues with RHEL6.1 kernels. NFS IO with Solaris stinks, but that is a different issue :) thanks! daryl
Thanks for the report and testing!
Hi,

Sorry about this, but I was confused yesterday; this issue still exists on the most recent 6.1 kernel. The system did not lock up, but it was dumping kernel traces that I had not noticed, sorry.

swapper: page allocation failure. order:1, mode:0x20
Pid: 0, comm: swapper Tainted: G ---------------- T 2.6.32-131.0.15.el6.x86_64 #1

daryl
Hi, I'm probably going to downgrade to RHEL5 this week so that I can have backups again :( Is there anything you wish me to test before doing so? thanks! daryl
We're seeing a similar problem on one of our FTP servers running RHEL 6.1, which has NFS directories mounted on it. We're not doing any dump/backup operation as such, just accessing files. Will attach a trace shortly.
Created attachment 517038 [details]
Stack trace from 2.6.32-71.el6.x86_64

This is the trace we're seeing on our end. Unfortunately, we had to downgrade one of the servers to RHEL 5.x to keep the cluster up, and that machine hasn't been seeing any errors whatsoever. I would be happy to provide more details, if needed. TIA for looking into this!
Is there interest in trying to resolve this? I'd be happy to do whatever is suggested.
Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
I haven't seen issues with RHEL6.2 2.6.32-220.el6.x86_64 yet.
I saw a similar problem with the RHEL6.2 kernel 2.6.32-220.4.1.el6.x86_64:

swapper: page allocation failure. order:2, mode:0x20
Pid: 0, comm: swapper Not tainted 2.6.32-220.4.1.el6.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81123daf>] ? __alloc_pages_nodemask+0x77f/0x940
[<ffffffff8115dc62>] ? kmem_getpages+0x62/0x170
[<ffffffff8115e87a>] ? fallback_alloc+0x1ba/0x270
[<ffffffff8115e2cf>] ? cache_grow+0x2cf/0x320
[<ffffffff8115e5f9>] ? ____cache_alloc_node+0x99/0x160
[<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
[<ffffffff8115f4bf>] ? kmem_cache_alloc_node_notrace+0x6f/0x130
[<ffffffff814620b0>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff8115f6fb>] ? __kmalloc_node+0x7b/0x100
[<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
[<ffffffff8142198d>] ? dev_alloc_skb+0x1d/0x40
[<ffffffffa01b36e7>] ? netxen_alloc_rx_skb+0x37/0x1d0 [netxen_nic]
[<ffffffffa01b3cde>] ? netxen_process_rcv_ring+0x32e/0xa30 [netxen_nic]
[<ffffffff81421267>] ? __kfree_skb+0x47/0xa0
[<ffffffff814212fb>] ? consume_skb+0x3b/0x80
[<ffffffffa01b69e5>] ? netxen_process_cmd_ring+0x185/0x2a0 [netxen_nic]
[<ffffffffa01b08d5>] ? netxen_nic_poll+0x45/0xc0 [netxen_nic]
[<ffffffff814307c3>] ? net_rx_action+0x103/0x2f0
[<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0
[<ffffffff810d93c2>] ? handle_IRQ_event+0x92/0x170
[<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
[<ffffffff8100de85>] ? do_softirq+0x65/0xa0
[<ffffffff81071de5>] ? irq_exit+0x85/0x90
[<ffffffff814f4c85>] ? do_IRQ+0x75/0xf0
[<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
<EOI> [<ffffffff812c49de>] ? intel_idle+0xde/0x170
[<ffffffff812c49c1>] ? intel_idle+0xc1/0x170
[<ffffffff813f9ef7>] ? cpuidle_idle_call+0xa7/0x140
[<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
[<ffffffff814e5ebc>] ? start_secondary+0x202/0x245
---------------- ~~~ ----------------
swapper: page allocation failure. order:2, mode:0x20
Pid: 0, comm: swapper Not tainted 2.6.32-220.4.1.el6.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81123daf>] ? __alloc_pages_nodemask+0x77f/0x940
[<ffffffff8115dc62>] ? kmem_getpages+0x62/0x170
[<ffffffff8115e87a>] ? fallback_alloc+0x1ba/0x270
[<ffffffff8115e2cf>] ? cache_grow+0x2cf/0x320
[<ffffffff8115e5f9>] ? ____cache_alloc_node+0x99/0x160
[<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
[<ffffffff8115f4bf>] ? kmem_cache_alloc_node_notrace+0x6f/0x130
[<ffffffff8115f6fb>] ? __kmalloc_node+0x7b/0x100
[<ffffffff8142186a>] ? __alloc_skb+0x7a/0x180
[<ffffffff8142198d>] ? dev_alloc_skb+0x1d/0x40
[<ffffffffa01b36e7>] ? netxen_alloc_rx_skb+0x37/0x1d0 [netxen_nic]
[<ffffffff8142e415>] ? __napi_gro_receive+0x55/0x100
[<ffffffffa01b3cde>] ? netxen_process_rcv_ring+0x32e/0xa30 [netxen_nic]
[<ffffffff8126c977>] ? kobject_put+0x27/0x60
[<ffffffff81341a47>] ? put_device+0x17/0x20
[<ffffffffa01b68a4>] ? netxen_process_cmd_ring+0x44/0x2a0 [netxen_nic]
[<ffffffffa01b08d5>] ? netxen_nic_poll+0x45/0xc0 [netxen_nic]
[<ffffffff814307c3>] ? net_rx_action+0x103/0x2f0
[<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0
[<ffffffff810d93c2>] ? handle_IRQ_event+0x92/0x170
[<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
[<ffffffff8100de85>] ? do_softirq+0x65/0xa0
[<ffffffff81071de5>] ? irq_exit+0x85/0x90
[<ffffffff814f4c85>] ? do_IRQ+0x75/0xf0
[<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
<EOI> [<ffffffff812c49de>] ? intel_idle+0xde/0x170
[<ffffffff812c49c1>] ? intel_idle+0xc1/0x170
[<ffffffff813f9ef7>] ? cpuidle_idle_call+0xa7/0x140
[<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
[<ffffffff814e5ebc>] ? start_secondary+0x202/0x245

A very frequent occurrence, more than 50 times over 30 minutes, but it only lasted for those 30 minutes.
It also has something to do with the network driver (netxen_nic here). Both the stock netxen_nic driver and the HP nx_nic driver suffer from the same issue. An earlier post in this thread, comment 7, also has a network driver module in its backtrace. Could it be some kind of attack?
Debian with a 2.6.32 kernel also has a similar network-related problem. http://lists.debian.org/debian-backports/2011/01/msg00037.html
I spoke too soon in Comment #20; I am still seeing this with 2.6.32-220.7.1.el6.x86_64:

swapper: page allocation failure. order:1, mode:0x20
Pid: 0, comm: swapper Not tainted 2.6.32-220.7.1.el6.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81123ebf>] ? __alloc_pages_nodemask+0x77f/0x940
[<ffffffff8142c800>] ? dev_hard_start_xmit+0x240/0x3f0
[<ffffffff8115dd72>] ? kmem_getpages+0x62/0x170
[<ffffffff8115e98a>] ? fallback_alloc+0x1ba/0x270
[<ffffffff8115e3df>] ? cache_grow+0x2cf/0x320
[<ffffffff8115e709>] ? ____cache_alloc_node+0x99/0x160
[<ffffffff8115f4eb>] ? kmem_cache_alloc+0x11b/0x190
[<ffffffff8141f5c8>] ? sk_prot_alloc+0x48/0x1c0
[<ffffffff8141f852>] ? sk_clone+0x22/0x2e0
[<ffffffff8146cac6>] ? inet_csk_clone+0x16/0xd0
[<ffffffff81485993>] ? tcp_create_openreq_child+0x23/0x450
[<ffffffff8148337d>] ? tcp_v4_syn_recv_sock+0x4d/0x2a0
[<ffffffff81485751>] ? tcp_check_req+0x201/0x420
[<ffffffff81482d9b>] ? tcp_v4_do_rcv+0x35b/0x430
[<ffffffffa03d1557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
[<ffffffff81484511>] ? tcp_v4_rcv+0x4e1/0x860
[<ffffffff814621c0>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff8146229d>] ? ip_local_deliver_finish+0xdd/0x2d0
[<ffffffff81462528>] ? ip_local_deliver+0x98/0xa0
[<ffffffff814619ed>] ? ip_rcv_finish+0x12d/0x440
[<ffffffff81461f75>] ? ip_rcv+0x275/0x350
[<ffffffff8142bf7b>] ? __netif_receive_skb+0x49b/0x6e0
[<ffffffff8142e028>] ? netif_receive_skb+0x58/0x60
[<ffffffff8142e130>] ? napi_skb_finish+0x50/0x70
[<ffffffff814307b9>] ? napi_gro_receive+0x39/0x50
[<ffffffffa0289d4f>] ? bnx2_poll_work+0xd4f/0x1270 [bnx2]
[<ffffffff8127fab0>] ? swiotlb_map_page+0x0/0x100
[<ffffffffa028a399>] ? bnx2_poll+0x69/0x2d8 [bnx2]
[<ffffffff814308d3>] ? net_rx_action+0x103/0x2f0
[<ffffffff81095963>] ? __hrtimer_start_range_ns+0x1a3/0x460
[<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0
[<ffffffff810d94a0>] ? handle_IRQ_event+0x60/0x170
[<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
[<ffffffff8100de85>] ? do_softirq+0x65/0xa0
[<ffffffff81071de5>] ? irq_exit+0x85/0x90
[<ffffffff814f4dc5>] ? do_IRQ+0x75/0xf0
[<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
<EOI> [<ffffffff81014687>] ? mwait_idle+0x77/0xd0
[<ffffffff814f265a>] ? atomic_notifier_call_chain+0x1a/0x20
[<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
[<ffffffff814e5ffc>] ? start_secondary+0x202/0x245

The machine does not lock up or crash, so that's good at least :)
A "me too" post ... Using 2.6.32-220.4.1 on Lustre storage nodes, connected via two gigE links, spewing this as soon as some real IO on Lustre comes up: kswapd0: page allocation failure. order:2, mode:0x20 Pid: 38, comm: kswapd0 Not tainted 2.6.32-220.4.2.el6_lustre.x86_64 #1 Call Trace: <IRQ> [<ffffffff81123daf>] ? __alloc_pages_nodemask+0x77f/0x940 [<ffffffff8115dc62>] ? kmem_getpages+0x62/0x170 [<ffffffff8115e87a>] ? fallback_alloc+0x1ba/0x270 [<ffffffff8115e2cf>] ? cache_grow+0x2cf/0x320 [<ffffffff8115e5f9>] ? ____cache_alloc_node+0x99/0x160 [<ffffffff81421bca>] ? __alloc_skb+0x7a/0x180 [<ffffffff8115f4bf>] ? kmem_cache_alloc_node_notrace+0x6f/0x130 [<ffffffff81462410>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff8115f6fb>] ? __kmalloc_node+0x7b/0x100 [<ffffffff814624ed>] ? ip_local_deliver_finish+0xdd/0x2d0 [<ffffffff81421bca>] ? __alloc_skb+0x7a/0x180 [<ffffffff81421d46>] ? __netdev_alloc_skb+0x36/0x60 [<ffffffffa03c0f5a>] ? tg3_alloc_rx_data+0x9a/0x250 [tg3] [<ffffffffa03c6308>] ? tg3_poll_work+0xdc8/0x1050 [tg3] [<ffffffffa03d4c81>] ? tg3_poll+0x71/0x450 [tg3] [<ffffffff81430b23>] ? net_rx_action+0x103/0x2f0 [<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0 [<ffffffff810d9390>] ? handle_IRQ_event+0x60/0x170 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0 [<ffffffff81071de5>] ? irq_exit+0x85/0x90 [<ffffffff814f4fe5>] ? do_IRQ+0x75/0xf0 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11 <EOI> [<ffffffff8112a4ee>] ? shrink_inactive_list+0x2de/0x740 [<ffffffff8100bdae>] ? call_function_single_interrupt+0xe/0x20 [<ffffffffa048e634>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs] [<ffffffffa0490522>] ? cfs_hash_lookup+0x82/0xa0 [libcfs] [<ffffffff8112b21f>] ? shrink_zone+0x38f/0x520 [<ffffffff81120f0d>] ? zone_watermark_ok_safe+0xad/0xc0 [<ffffffff8112c769>] ? balance_pgdat+0x6e9/0x7e0 [<ffffffff8112cc10>] ? isolate_pages_global+0x0/0x350 [<ffffffff81133486>] ? set_pgdat_percpu_threshold+0xa6/0xd0 [<ffffffff81133670>] ? calculate_pressure_threshold+0x0/0x50 [<ffffffff8112c996>] ? kswapd+0x136/0x3b0 [<ffffffff81090a90>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8112c860>] ? kswapd+0x0/0x3b0 [<ffffffff81090726>] ? kthread+0x96/0xa0 [<ffffffff8100c14a>] ? child_rip+0xa/0x20 [<ffffffff81090690>] ? kthread+0x0/0xa0 [<ffffffff8100c140>] ? child_rip+0x0/0x20 As I understand, this means dropped packets and if this is true, it is the likely reason for the sporiadic Lustre failures we're observing. Are there any tweaks to either decrease the likelihood of these events or get rid of them completely?
I tested the memory fragmentation hypothesis by dropping caches (echo 3 > /proc/sys/vm/drop_caches). So far it appears to help. I'm also raising vm.min_free_kbytes; I expect that to help too. Is there a way to figure out the largest contiguous chunk of memory available?
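/proc/buddyinfo answers that question: each column is the count of free blocks of order 0 through 10, so the rightmost non-zero column gives the largest free contiguous block per zone. A sketch assuming the usual 4 kB page size:

# Columns 5..NF are free-block counts for order 0..10; largest free
# contiguous block = 4 kB * 2^order of the last non-zero column.
awk '{ for (i = NF; i >= 5; i--) if ($i > 0) { print "node " $2 " zone " $4 ": largest free block = " 4 * 2^(i-5) " kB"; break } }' /proc/buddyinfo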
Sorry to ask, but is anyone still interested in this bug? It has been idle for more than 2 years now. Daryl?
I don't have any current reproducers, so I won't be of help with this.
Probably not this particular bug anymore, but I see that XFS in general still suffers from memory fragmentation. Even on machines with 512 GB of RAM, with about 90% of it in buffers/cache, I still see "XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)" on el7.1.
(In reply to Jure Pečar from comment #30)
> Probably not this particular bug anymore, but I see that XFS in general
> still suffers from memory fragmentation. Even on machines with 512 GB of
> RAM, with about 90% of it in buffers/cache, I still see "XFS: possible
> memory allocation deadlock in kmem_alloc (mode:0x250)" on el7.1.

Okay. Please file another bug for that, Jure. Thanks.
RHEL6 had known issues with memory compaction, which were probably one cause of this issue, if not the main one. As this bz unfortunately lost traction, and now with the lack of reproducers, we have no choice but to close this bug. Thank you all for the time and patience invested here.