Bug 842206

Summary: glusterfsd: page allocation failure
Product: [Community] GlusterFS
Reporter: Saurabh <saujain>
Component: core
Assignee: Raghavendra Bhat <rabhat>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: medium
Version: pre-release
CC: amarts, dchinner, esandeen, gluster-bugs, mzywusko, ryszard.lach
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 17:17:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Saurabh 2012-07-23 05:13:27 UTC
Description of problem:

glusterfsd: page allocation failure. order:1, mode:0x20
Pid: 16727, comm: glusterfsd Not tainted 2.6.32-220.23.1.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff8112415f>] ? __alloc_pages_nodemask+0x77f/0x940
 [<ffffffff8115e152>] ? kmem_getpages+0x62/0x170
 [<ffffffff8115ed6a>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff8115eae9>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff8115f8cb>] ? kmem_cache_alloc+0x11b/0x190
 [<ffffffff8141fcf8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8141ff82>] ? sk_clone+0x22/0x2e0
 [<ffffffff8146d256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff81486143>] ? tcp_create_openreq_child+0x23/0x450
 [<ffffffff81483b2d>] ? tcp_v4_syn_recv_sock+0x4d/0x2a0
 [<ffffffff81485f01>] ? tcp_check_req+0x201/0x420
 [<ffffffff8148354b>] ? tcp_v4_do_rcv+0x35b/0x430
 [<ffffffffa0384557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff81484cc1>] ? tcp_v4_rcv+0x4e1/0x860
 [<ffffffff81462940>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81462a1d>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81462ca8>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff8146216d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814626f5>] ? ip_rcv+0x275/0x350
 [<ffffffff8142c6ab>] ? __netif_receive_skb+0x49b/0x6f0
 [<ffffffff8142e768>] ? netif_receive_skb+0x58/0x60
 [<ffffffffa01553ad>] ? virtnet_poll+0x5dd/0x8d0 [virtio_net]
 [<ffffffff8142c6ab>] ? __netif_receive_skb+0x49b/0x6f0
 [<ffffffff81431013>] ? net_rx_action+0x103/0x2f0
 [<ffffffffa01541b9>] ? skb_recv_done+0x39/0x40 [virtio_net]
 [<ffffffff81072291>] ? __do_softirq+0xc1/0x1d0
 [<ffffffff810d9740>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81072075>] ? irq_exit+0x85/0x90
 [<ffffffff814f5515>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI> 

Version-Release number of selected component (if applicable):
[root@localhost ~]# glusterfs -V
glusterfs 3.3.0 built on Jul 19 2012 14:08:45
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.


How reproducible:

This happens regularly in my setup.

Steps to Reproduce:
1. Send REST API requests in parallel using swift.
Actual results:


Expected results:


Additional info:

Comment 1 Saurabh 2012-07-23 08:10:23 UTC
I am seeing a similar backtrace with swift-account-server and swift-proxy-server page faults.

Also worth mentioning: /var/log/messages and /var/log/glusterfs/mnt-gluster-AUTH_test.log are not getting updated, even though requests are being sent to this same machine.

Comment 2 Saurabh 2012-07-23 09:24:20 UTC
On a similar note, I found some page faults with a slightly different backtrace:

glusterfs: page allocation failure. order:1, mode:0x20
Pid: 14544, comm: glusterfs Not tainted 2.6.32-220.23.1.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff8112415f>] ? __alloc_pages_nodemask+0x77f/0x940
 [<ffffffffa0154600>] ? start_xmit+0x30/0x1d0 [virtio_net]
 [<ffffffff8115e152>] ? kmem_getpages+0x62/0x170
 [<ffffffff8115ed6a>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff8115eae9>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff8115f8cb>] ? kmem_cache_alloc+0x11b/0x190
 [<ffffffff8141fcf8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8141ff82>] ? sk_clone+0x22/0x2e0
 [<ffffffff8146d256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff81486143>] ? tcp_create_openreq_child+0x23/0x450
 [<ffffffff81483b2d>] ? tcp_v4_syn_recv_sock+0x4d/0x2a0
 [<ffffffff81485f01>] ? tcp_check_req+0x201/0x420
 [<ffffffff8148354b>] ? tcp_v4_do_rcv+0x35b/0x430
 [<ffffffffa0384557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff81484cc1>] ? tcp_v4_rcv+0x4e1/0x860
 [<ffffffff81462940>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81462a1d>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81462ca8>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff8146216d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814626f5>] ? ip_rcv+0x275/0x350
 [<ffffffff8142c6ab>] ? __netif_receive_skb+0x49b/0x6f0
 [<ffffffff8142e768>] ? netif_receive_skb+0x58/0x60
 [<ffffffffa01553ad>] ? virtnet_poll+0x5dd/0x8d0 [virtio_net]
 [<ffffffff81431013>] ? net_rx_action+0x103/0x2f0
 [<ffffffff81072291>] ? __do_softirq+0xc1/0x1d0
 [<ffffffff810958b0>] ? hrtimer_interrupt+0x140/0x250
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81072075>] ? irq_exit+0x85/0x90
 [<ffffffff814f5600>] ? smp_apic_timer_interrupt+0x70/0x9b
 [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
 <EOI> 




glusterfs: page allocation failure. order:1, mode:0x20
Pid: 14544, comm: glusterfs Not tainted 2.6.32-220.23.1.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff8112415f>] ? __alloc_pages_nodemask+0x77f/0x940
 [<ffffffff8115e152>] ? kmem_getpages+0x62/0x170
 [<ffffffff8115ed6a>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff8115eae9>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff8115f8cb>] ? kmem_cache_alloc+0x11b/0x190
 [<ffffffff8141fcf8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8141ff82>] ? sk_clone+0x22/0x2e0
 [<ffffffff8146d256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff81486143>] ? tcp_create_openreq_child+0x23/0x450
 [<ffffffff81483b2d>] ? tcp_v4_syn_recv_sock+0x4d/0x2a0
 [<ffffffff81485f01>] ? tcp_check_req+0x201/0x420
 [<ffffffff8148354b>] ? tcp_v4_do_rcv+0x35b/0x430
 [<ffffffffa0384557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff81484cc1>] ? tcp_v4_rcv+0x4e1/0x860
 [<ffffffff81462940>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81462a1d>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81462ca8>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff8146216d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814626f5>] ? ip_rcv+0x275/0x350
 [<ffffffff8142c6ab>] ? __netif_receive_skb+0x49b/0x6f0
 [<ffffffff8142e768>] ? netif_receive_skb+0x58/0x60
 [<ffffffffa01553ad>] ? virtnet_poll+0x5dd/0x8d0 [virtio_net]
 [<ffffffff8142c6ab>] ? __netif_receive_skb+0x49b/0x6f0
 [<ffffffff81431013>] ? net_rx_action+0x103/0x2f0
 [<ffffffffa01541b9>] ? skb_recv_done+0x39/0x40 [virtio_net]
 [<ffffffff81072291>] ? __do_softirq+0xc1/0x1d0
 [<ffffffff810d9740>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81072075>] ? irq_exit+0x85/0x90
 [<ffffffff814f5515>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI>  [<ffffffff810dde01>] ? rcu_sched_qs+0x1/0x30
 [<ffffffff814ece97>] ? schedule+0x47/0x3b2
 [<ffffffff8103758c>] ? kvm_clock_read+0x1c/0x20
 [<ffffffff81037599>] ? kvm_clock_get_cycles+0x9/0x10
 [<ffffffff811777f6>] ? vfs_writev+0x46/0x60
 [<ffffffff81177972>] ? sys_writev+0xa2/0xb0
 [<ffffffff8100b16a>] ? sysret_careful+0x14/0x17

Comment 3 Amar Tumballi 2012-07-25 04:49:04 UTC
The output of 'free -m' would help. Just from looking at the stack trace, this doesn't look like anything obvious in glusterfs. We will keep the bug open, work on fixing some memory leak issues, and revisit after some of the leaks are fixed.

Comment 4 Saurabh 2012-07-25 04:51:46 UTC
I had some issues on the machine, so it was rebooted. I will try to reproduce it and update the results.

Comment 5 Saurabh 2012-07-26 12:15:42 UTC
I have been able to reproduce the issue in a completely different setup.

This time I have four hardware machines, each with 50 GB of RAM.

[root@gqac028 ~]# glusterfs -V
glusterfs 3.3.0 built on Jul 19 2012 14:08:45
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@gqac028 ~]# 
[root@gqac028 ~]# 
[root@gqac028 ~]# rpm -qa | grep glusterfs
glusterfs-devel-3.3.0-23.el6rhs.x86_64
glusterfs-server-3.3.0-23.el6rhs.x86_64
glusterfs-rdma-3.3.0-23.el6rhs.x86_64
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-3.3.0-23.el6rhs.x86_64
glusterfs-fuse-3.3.0-23.el6rhs.x86_64
glusterfs-geo-replication-3.3.0-23.el6rhs.x86_64
[root@gqac028 ~]# 




=======================================================================



swift-container: page allocation failure. order:1, mode:0x20
Pid: 4768, comm: swift-container Not tainted 2.6.32-220.23.1.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff8112415f>] ? __alloc_pages_nodemask+0x77f/0x940
 [<ffffffff8115e152>] ? kmem_getpages+0x62/0x170
 [<ffffffff8115ed6a>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff8115eae9>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff8115f8cb>] ? kmem_cache_alloc+0x11b/0x190
 [<ffffffff8141fcf8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8141ff82>] ? sk_clone+0x22/0x2e0
 [<ffffffff8146d256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff81486143>] ? tcp_create_openreq_child+0x23/0x450
 [<ffffffff81483b2d>] ? tcp_v4_syn_recv_sock+0x4d/0x2a0
 [<ffffffff81485f01>] ? tcp_check_req+0x201/0x420
 [<ffffffff8148354b>] ? tcp_v4_do_rcv+0x35b/0x430
 [<ffffffffa031d557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff81484cc1>] ? tcp_v4_rcv+0x4e1/0x860
 [<ffffffff81462940>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81462a1d>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81462ca8>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff8146216d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814626f5>] ? ip_rcv+0x275/0x350
 [<ffffffff8142c6ab>] ? __netif_receive_skb+0x49b/0x6f0
 [<ffffffff8142e768>] ? netif_receive_skb+0x58/0x60
 [<ffffffff8142e870>] ? napi_skb_finish+0x50/0x70
 [<ffffffff81430ef9>] ? napi_gro_receive+0x39/0x50
 [<ffffffffa014ad4f>] ? bnx2_poll_work+0xd4f/0x1270 [bnx2]
 [<ffffffff81104f5b>] ? perf_pmu_enable+0x2b/0x40
 [<ffffffff8110a808>] ? perf_event_task_tick+0xa8/0x2f0
 [<ffffffffa014b2ad>] ? bnx2_poll_msix+0x3d/0xc0 [bnx2]
 [<ffffffff81431013>] ? net_rx_action+0x103/0x2f0
 [<ffffffff81072291>] ? __do_softirq+0xc1/0x1d0
 [<ffffffff810d9740>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810722ea>] ? __do_softirq+0x11a/0x1d0
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81072075>] ? irq_exit+0x85/0x90
 [<ffffffff814f5515>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI> 
swift-container: page allocation failure. order:1, mode:0x20
Pid: 4768, comm: swift-container Not tainted 2.6.32-220.23.1.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff8112415f>] ? __alloc_pages_nodemask+0x77f/0x940
 [<ffffffff81053400>] ? select_idle_sibling+0x40/0x150
 [<ffffffff8115e152>] ? kmem_getpages+0x62/0x170
 [<ffffffff8115ed6a>] ? fallback_alloc+0x1ba/0x270
 [<ffffffff8115eae9>] ? ____cache_alloc_node+0x99/0x160
 [<ffffffff8115f8cb>] ? kmem_cache_alloc+0x11b/0x190
 [<ffffffff8141fcf8>] ? sk_prot_alloc+0x48/0x1c0
 [<ffffffff8141ff82>] ? sk_clone+0x22/0x2e0
 [<ffffffff8146d256>] ? inet_csk_clone+0x16/0xd0
 [<ffffffff81486143>] ? tcp_create_openreq_child+0x23/0x450
 [<ffffffff81483b2d>] ? tcp_v4_syn_recv_sock+0x4d/0x2a0
 [<ffffffff81485f01>] ? tcp_check_req+0x201/0x420
 [<ffffffff8148354b>] ? tcp_v4_do_rcv+0x35b/0x430
 [<ffffffffa031d557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [<ffffffff81484cc1>] ? tcp_v4_rcv+0x4e1/0x860
 [<ffffffff81462940>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81462a1d>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81462ca8>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff8146216d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff814626f5>] ? ip_rcv+0x275/0x350
 [<ffffffff8142c6ab>] ? __netif_receive_skb+0x49b/0x6f0
 [<ffffffff8142e768>] ? netif_receive_skb+0x58/0x60
 [<ffffffff8142e870>] ? napi_skb_finish+0x50/0x70
 [<ffffffff81430ef9>] ? napi_gro_receive+0x39/0x50
 [<ffffffffa014ad4f>] ? bnx2_poll_work+0xd4f/0x1270 [bnx2]
 [<ffffffff81280110>] ? swiotlb_map_page+0x0/0x100
 [<ffffffffa014b2ad>] ? bnx2_poll_msix+0x3d/0xc0 [bnx2]
 [<ffffffff810de937>] ? cpu_quiet_msk+0x77/0x130
 [<ffffffff81431013>] ? net_rx_action+0x103/0x2f0
 [<ffffffff81072291>] ? __do_softirq+0xc1/0x1d0
 [<ffffffff810d9740>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810722ea>] ? __do_softirq+0x11a/0x1d0
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81072075>] ? irq_exit+0x85/0x90
 [<ffffffff814f5515>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI> 
[root@gqac028 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         48383      46827       1556          0        129      35999
-/+ buffers/cache:      10697      37686
Swap:        50431          0      50431
[root@gqac028 ~]#

Comment 6 Ryszard Łach 2012-08-01 09:47:33 UTC
Hi.
The output of free -m is not enough to explain this. Look at

http://utcc.utoronto.ca/~cks/space/blog/linux/WhyPageAllocFailure

and then you will notice that the important lines (from my logs) are:


glusterfsd: page allocation failure. order:4, mode:0xc0d0
(Pid: 14168, comm: glusterfsd Not tainted 2.6.32-5-xen-amd64 #1)

order:4 means that glusterfsd tried to allocate 2^4 * pageSize = 64 kB of contiguous data.

[...]

Node 0 DMA: 2*4kB 2*8kB 1*16kB 1*32kB 2*64kB 4*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 7880kB 
Node 0 DMA32: 15869*4kB 3667*8kB 614*16kB 40*32kB 21*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 106796kB
Node 0 Normal: 8448*4kB 61*8kB 52*16kB 22*32kB 10*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 37480kB

Here I see that my system has 21 contiguous 64 kB chunks in the DMA32 zone and 10 in the Normal zone, yet page allocation probably fails because memory is fragmented into many 4 kB chunks.

So, in short, I suppose it is a memory fragmentation problem. I have no idea how to deal with it; I'll try to reduce the memory assigned to the OS (xen DomU) and see what happens.

Cheers,

R.
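The per-order arithmetic in the comment above can be checked with a small script. This is only a sketch: the sample line mirrors the "Node 0 Normal" data quoted above, and an order-N chunk is 2^N contiguous 4 kB pages.

```shell
# Sketch: expand a /proc/buddyinfo-style line into per-order free-chunk
# sizes, matching the zone dumps quoted above.
line="Node 0, zone Normal 8448 61 52 22 10 0 0 0 1 0 0"
echo "$line" | awk '{
    kb = 4                              # order-0 chunk = one 4 kB page
    for (i = 5; i <= NF; i++) {         # free-chunk counts start at field 5
        printf "order %d: %d chunks of %d kB\n", i - 5, $i, kb
        kb *= 2                         # each order doubles the chunk size
    }
}'
```

On a live system the same awk can be pointed at /proc/buddyinfo to watch how many high-order chunks remain as fragmentation grows.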

Comment 7 Ryszard Łach 2012-08-03 07:57:40 UTC
Hi, again.

We've experimented with various vm settings (see http://community.gluster.org/a/linux-kernel-tuning-for-glusterfs/). It seems that the only parameter that changes anything is vm.vfs_cache_pressure. After increasing it to a huge value (10000) there are noticeably more contiguous pages available (hopefully we will not run into dentry/inode read performance problems). The page allocation error still happens, but not as frequently as before.
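For reference, a read-only sketch of inspecting the tunable mentioned above; the write commands are shown as comments because they require root, and the value 10000 is the reporter's experiment, not a general recommendation.

```shell
# Inspect the current value; the kernel default is 100. Higher values make
# the kernel reclaim dentry/inode caches more aggressively, which can leave
# more contiguous pages free at the cost of metadata cache hits.
cat /proc/sys/vm/vfs_cache_pressure
# To apply the experiment described above (requires root):
#   sysctl -w vm.vfs_cache_pressure=10000
# To persist across reboots, add to /etc/sysctl.conf:
#   vm.vfs_cache_pressure = 10000
```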

We've noticed one more interesting thing. We have a 4-brick setup (stripe + replicate). The page allocation failure happens almost every 5 minutes, at the same time, on almost all systems (1 brick = 1 OS). We also have a related error in the brick's log:

 [2012-08-03 09:48:36.898418] I [server3_1-fops.c:823:server_getxattr_cbk] 0-shared-server: 196890: GETXATTR / (system.posix_acl_access) ==> -1 (Cannot allocate memory)

We'll check whether it is related to ACLs (we're mounting gluster via the native gluster client with the acl mount option).

Does anybody have an idea why it happens every 5 minutes? 

R.

Comment 8 Ryszard Łach 2012-08-06 06:35:22 UTC
Hi.
Remounting without ACLs solved the problem. We now have 4 bricks, all without ACLs and with different kernel settings, and none of them has had a page allocation failure since the remount.

R.
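For clarity, a sketch of what the client-side remount looks like; the server, volume, and mountpoint names here are hypothetical, and the backend-filesystem side is discussed in the following comments.

```shell
# Mount with POSIX ACLs enabled (the configuration that showed failures):
#   mount -t glusterfs -o acl server1:/shared /mnt/gluster
# Mount without the acl option (no failures since this remount):
#   mount -t glusterfs server1:/shared /mnt/gluster
# Equivalent /etc/fstab entry for the working setup:
#   server1:/shared  /mnt/gluster  glusterfs  defaults,_netdev  0 0
```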

Comment 9 Amar Tumballi 2012-08-06 07:11:47 UTC
When you say 'remount without ACL', are you talking about -o acl for the gluster mount or for the backend filesystem? (which I assume is XFS)

Comment 10 Ryszard Łach 2012-08-06 07:13:13 UTC
Both.

Comment 11 Eric Sandeen 2012-08-06 16:15:49 UTC
"acl" is not a valid mount option for xfs.

SELinux: initialized (dev sdb4, type xfs), uses xattr
XFS (sdb4): unknown mount option [acl].

so I don't understand "both" - but maybe your backend isn't xfs?

Comment 12 Dave Chinner 2012-08-06 21:19:03 UTC
All the stack traces point to the ethernet driver stack failing order 1 GFP_ATOMIC allocations during interrupt. I can't see a connection between the failures and the reported filesystem ACL solution...

Comment 13 Ryszard Łach 2012-08-07 06:26:43 UTC
Sorry, I didn't comment on your 'xfs' suggestion (did I mention xfs somewhere?).

My FS is EXT4.

Dave: I'm trying to give you pure facts. This is the 6th day after the remount without failures. Before the remount there were failures every day, almost every 5 minutes (during the morning import jobs).

Cheers,

R.

Comment 14 Ryszard Łach 2012-08-07 11:22:17 UTC
I have a fresh setup: two replicated bricks + georeplication from one of them to another machine.

After the first georeplication connection (and the start of replication of all files to the third machine) I see some page allocation failures on the master node for georeplication:

Aug  7 13:06:08 kernel: [445091.204793] Pid: 15480, comm: glusterfsd Not tainted 2.6.32-5-xen-amd64 #1
Aug  7 13:06:08 kernel: [445091.204799] Call Trace:
Aug  7 13:06:08 kernel: [445091.204812]  [<ffffffff810bb986>] ? __alloc_pages_nodemask+0x59b/0x5fd
Aug  7 13:06:08 kernel: [445091.204820]  [<ffffffff810ba943>] ? __get_free_pages+0x9/0x46
Aug  7 13:06:08 kernel: [445091.204828]  [<ffffffff810e948d>] ? __kmalloc+0x3f/0x141
Aug  7 13:06:08 kernel: [445091.204837]  [<ffffffff81107d14>] ? getxattr+0x89/0x117
Aug  7 13:06:08 kernel: [445091.204847]  [<ffffffff8100ecdf>] ? xen_restore_fl_direct_end+0x0/0x1
Aug  7 13:06:08 kernel: [445091.204854]  [<ffffffff810e84e9>] ? kmem_cache_free+0x72/0xa3
Aug  7 13:06:08 kernel: [445091.204862]  [<ffffffff810fb1b4>] ? user_path_at+0x52/0x79
Aug  7 13:06:08 kernel: [445091.204870]  [<ffffffff8118f8cf>] ? _atomic_dec_and_lock+0x33/0x50
Aug  7 13:06:08 kernel: [445091.204878]  [<ffffffff81107e3d>] ? sys_lgetxattr+0x42/0x5c
Aug  7 13:06:08 kernel: [445091.204885]  [<ffffffff81011b42>] ? system_call_fastpath+0x16/0x1b
Aug  7 13:06:08 kernel: [445091.204891] Mem-Info: 
Aug  7 13:06:08 kernel: [445091.204899] Node 0 DMA per-cpu:
Aug  7 13:06:08 kernel: [445091.204905] CPU    0: hi:    0, btch:   1 usd:   0
Aug  7 13:06:08 kernel: [445091.204910] Node 0 DMA32 per-cpu:
Aug  7 13:06:08 kernel: [445091.204916] CPU    0: hi:  186, btch:  31 usd:   0
Aug  7 13:06:08 kernel: [445091.204924] active_anon:52709 inactive_anon:96198 isolated_anon:24
Aug  7 13:06:08 kernel: [445091.204925]  active_file:65598 inactive_file:121637 isolated_file:22
Aug  7 13:06:08 kernel: [445091.204927]  unevictable:777 dirty:539 writeback:496 unstable:0
Aug  7 13:06:08 kernel: [445091.204928]  free:16038 slab_reclaimable:21131 slab_unreclaimable:4726
Aug  7 13:06:08 kernel: [445091.204930]  mapped:2461 shmem:3 pagetables:1169 bounce:0
Aug  7 13:06:08 kernel: [445091.204948] Node 0 DMA free:6068kB min:40kB low:48kB high:60kB active_anon:0kB inactive_anon:176kB active_file:5616kB inactive_file:132kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:12824kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:732kB slab_unreclaimable:228kB kernel_stack:32kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Aug  7 13:06:08 kernel: [445091.204977] lowmem_reserve[]: 0 1499 1499 1499 
Aug  7 13:06:08 kernel: [445091.204987] Node 0 DMA32 free:58084kB min:4932kB low:6164kB high:7396kB active_anon:210836kB inactive_anon:384616kB active_file:256776kB inactive_file:486416kB unevictable:3108kB isolated(anon):96kB isolated(file):88kB present:1535200kB mlocked:3108kB dirty:2156kB writeback:1984kB mapped:9844kB shmem:12kB slab_reclaimable:83792kB slab_unreclaimable:18676kB kernel_stack:928kB pagetables:4676kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Aug  7 13:06:08 kernel: [445091.205018] lowmem_reserve[]: 0 0 0 0
Aug  7 13:06:08 kernel: [445091.205028] Node 0 DMA: 45*4kB 60*8kB 50*16kB 16*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 6068kB
Aug  7 13:06:08 kernel: [445091.205050] Node 0 DMA32: 13899*4kB 47*8kB 30*16kB 9*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 58084kB 

and in gluster.log:

[2012-08-07 13:06:08.036781] W [client3_1-fops.c:1059:client3_1_getxattr_cbk] 0-foto-client-1: remote operation failed: Cannot allocate memory. Path: /5/2613849468/3ea82d2b9e33fa7f809b6e2a3176ffc0/2613849468_5we.jpg (91986aa3-58ce-4f0a-99d3-628234ffb2fc). Key: trusted.glusterfs.f92a558a-6b55-4908-8542-990f017593e6.xtime

I just noticed that Saurabh had errors at 'order 1' (8 kB) while mine are at 'order 4' (64 kB). I don't know if it is the same issue.

R.
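The order-to-size arithmetic behind the comparison above is simple: an order-N allocation needs 2^N contiguous 4 kB pages.

```shell
# order-N allocation = 2^N contiguous 4 kB pages
echo "order 1: $(( (1 << 1) * 4 )) kB"   # prints "order 1: 8 kB"
echo "order 4: $(( (1 << 4) * 4 )) kB"   # prints "order 4: 64 kB"
```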

Comment 15 Dave Chinner 2012-08-09 00:38:04 UTC
(In reply to comment #13)
> Sorry, I didn't comment your 'xfs' suggestion (did I mention xfs somewhere?)
> 
> My FS is EXT4.

The current gluster storage product runs on XFS. Hence, when "glusterfsd" was seen in the stack traces, Eric assumed you were using XFS.

> Dave: I'm trying to give you pure facts. 6th day after remount without
> failures. Before remount - failures every day, almost every 5 minutes
> (during morning import jobs).

I'm not disputing that it made the warnings go away, just that I didn't see the connection. The trace in comment #14 points it out: the ACL code is doing high-order (order 4) memory allocations, and that is exhausting the machine's contiguous pages, leading to the ethernet driver failing its contiguous allocations.
That's a memory management problem, not a filesystem or ethernet driver problem...

Cheers,

Dave.

Comment 16 Amar Tumballi 2012-10-22 04:04:50 UTC
We will keep it open and see whether we can reproduce the issue with RHS 2.0 (updates) or RHS 2.1 testing. If it is not seen by the GA date of RHS 2.1, we will close the bug.

Comment 17 Amar Tumballi 2012-11-29 10:54:37 UTC
We haven't seen any issues in this regard; I will ask QE to check whether it happens again.

We had some fixes in getxattr()'s memory allocation path; hopefully they fixed the issue:

http://review.gluster.com/3640
http://review.gluster.com/3673
http://review.gluster.com/3681

Saurabh, please re-open if seen again.