Bug 981158 - OOM observed for the fuse client process (glusterfs) when one brick from each replica pair was offlined and heavy I/O was in progress from the client
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Assigned To: Ravishankar N
QA Contact: Rahul Hinduja
Depends On:
Blocks: 988182 1112844
Reported: 2013-07-04 03:05 EDT by Rahul Hinduja
Modified: 2014-06-24 15:45 EDT
CC List: 6 users

See Also:
Fixed In Version: glusterfs-3.4.0.15rhs
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Cloned to: 988182
Environment:
Last Closed: 2013-09-23 18:35:40 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Rahul Hinduja 2013-07-04 03:05:40 EDT
Description of problem:
=======================

glusterfs invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
glusterfs cpuset=/ mems_allowed=0
Pid: 29151, comm: glusterfs Not tainted 2.6.32-358.6.1.el6.x86_64 #1
Call Trace:
 [<ffffffff810cb5f1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
 [<ffffffff8111cdf0>] ? dump_header+0x90/0x1b0
 [<ffffffff8121d1fc>] ? security_real_capable_noaudit+0x3c/0x70
 [<ffffffff8111d272>] ? oom_kill_process+0x82/0x2a0
 [<ffffffff8111d1b1>] ? select_bad_process+0xe1/0x120
 [<ffffffff8111d6b0>] ? out_of_memory+0x220/0x3c0
 [<ffffffff8112c35c>] ? __alloc_pages_nodemask+0x8ac/0x8d0
 [<ffffffff8116095a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff8111a1d7>] ? __page_cache_alloc+0x87/0x90
 [<ffffffff81119bbe>] ? find_get_page+0x1e/0xa0
 [<ffffffff8111b197>] ? filemap_fault+0x1a7/0x500
 [<ffffffff81143194>] ? __do_fault+0x54/0x530
 [<ffffffff81143767>] ? handle_pte_fault+0xf7/0xb50
 [<ffffffff811443fa>] ? handle_mm_fault+0x23a/0x310
 [<ffffffff810474c9>] ? __do_page_fault+0x139/0x480
 [<ffffffff813230bf>] ? extract_entropy_user+0xbf/0x130
 [<ffffffff8103c7b8>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff8103b8ac>] ? kvm_clock_read+0x1c/0x20
 [<ffffffff8103b8b9>] ? kvm_clock_get_cycles+0x9/0x10
 [<ffffffff810a1420>] ? getnstimeofday+0x60/0xf0
 [<ffffffff815135ce>] ? do_page_fault+0x3e/0xa0
 [<ffffffff81510985>] ? page_fault+0x25/0x30
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:  30
CPU    1: hi:  186, btch:  31 usd: 179
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  30
CPU    1: hi:  186, btch:  31 usd:  78
active_anon:740421 inactive_anon:187455 isolated_anon:0
 active_file:62 inactive_file:81 isolated_file:0
 unevictable:0 dirty:0 writeback:1 unstable:0
 free:21244 slab_reclaimable:2805 slab_unreclaimable:13478
 mapped:94 shmem:11230 pagetables:4252 bounce:0
Node 0 DMA free:15724kB min:248kB low:308kB high:372kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15320kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 3512 4017 4017
Node 0 DMA32 free:60840kB min:58868kB low:73584kB high:88300kB active_anon:2759112kB inactive_anon:547136kB active_file:148kB inactive_file:364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596500kB mlocked:0kB dirty:4kB writeback:0kB mapped:308kB shmem:44908kB slab_reclaimable:3324kB slab_unreclaimable:2092kB kernel_stack:72kB pagetables:7940kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:460 all_unreclaimable? yes
lowmem_reserve[]: 0 0 505 505
Node 0 Normal free:8412kB min:8464kB low:10580kB high:12696kB active_anon:202572kB inactive_anon:202684kB active_file:100kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:517120kB mlocked:0kB dirty:0kB writeback:4kB mapped:68kB shmem:12kB slab_reclaimable:7896kB slab_unreclaimable:51820kB kernel_stack:1016kB pagetables:9068kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:158 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 3*4kB 2*8kB 1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15724kB
Node 0 DMA32: 808*4kB 421*8kB 282*16kB 242*32kB 154*64kB 97*128kB 45*256kB 8*512kB 2*1024kB 1*2048kB 0*4096kB = 60840kB
Node 0 Normal: 299*4kB 96*8kB 41*16kB 21*32kB 10*64kB 9*128kB 5*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 8412kB
14544 total pagecache pages
3136 pages in swap cache
Swap cache stats: add 1620224, delete 1617088, find 48938/52854
Free swap  = 0kB
Total swap = 4063224kB
1048575 pages RAM
67843 pages reserved
210 pages shared
955684 pages non-shared
[ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[  502]     0   502     2766        2   0     -17         -1000 udevd
[ 1157]     0  1157     2279        1   0       0             0 dhclient
[ 1201]     0  1201     6915       25   0     -17         -1000 auditd
[ 1226]     0  1226    62271       91   0       0             0 rsyslogd
[ 1255]     0  1255     2704       24   0       0             0 irqbalance
[ 1269]    32  1269     4743       16   0       0             0 rpcbind
[ 1287]    29  1287     5836        1   0       0             0 rpc.statd
[ 1320]     0  1320     6290        1   0       0             0 rpc.idmapd
[ 1414]    81  1414     7943        7   0       0             0 dbus-daemon
[ 1431]     0  1431    47336        1   1       0             0 cupsd
[ 1456]     0  1456     1019        0   1       0             0 acpid
[ 1465]    68  1465     6270       92   0       0             0 hald
[ 1466]     0  1466     4526        1   0       0             0 hald-runner
[ 1494]     0  1494     5055        1   1       0             0 hald-addon-inpu
[ 1505]    68  1505     4451        1   1       0             0 hald-addon-acpi
[ 1526]     0  1526    96425       31   1       0             0 automount
[ 1542]     0  1542     1691        0   0       0             0 mcelog
[ 1554]     0  1554    16029        1   0     -17         -1000 sshd
[ 1576]    38  1576     7540       14   1       0             0 ntpd
[ 1652]     0  1652    19682       23   0       0             0 master
[ 1659]    89  1659    19745       17   0       0             0 qmgr
[ 1676]     0  1676    27544        1   1       0             0 abrtd
[ 1684]     0  1684    29302       25   0       0             0 crond
[ 1695]     0  1695     5363        1   0       0             0 atd
[ 1706]     0  1706    25231       20   0       0             0 rhnsd
[ 1713]     0  1713    25972        1   0       0             0 rhsmcertd
[ 1728]     0  1728    15480       13   0       0             0 certmonger
[ 1746]     0  1746     1015        1   1       0             0 mingetty
[ 1748]     0  1748     1015        1   0       0             0 mingetty
[ 1750]     0  1750     1015        1   1       0             0 mingetty
[ 1752]     0  1752     1015        1   1       0             0 mingetty
[ 1755]     0  1755     1015        1   0       0             0 mingetty
[ 1763]     0  1763     3095        1   0     -17         -1000 udevd
[20478]     0 20478     2765        0   1     -17         -1000 udevd
[29143]     0 29143  1760204   912796   0       0             0 glusterfs
[32082]     0 32082    24466       36   0       0             0 sshd
[32086]     0 32086    27117       20   0       0             0 bash
[32107]    89 32107    19702       17   0       0             0 pickup
[32151]     0 32151    24466       51   1       0             0 sshd
[32152]     0 32152    24466       37   1       0             0 sshd
[32159]     0 32159    27117       62   0       0             0 bash
[32160]     0 32160    27117       86   1       0             0 bash
[ 6284]     0  6284    25234       23   0       0             0 tail
[17084]     0 17084    26523       79   0       0             0 script2.sh
[21651]     0 21651    26294       63   0       0             0 dd
Out of memory: Kill process 29143 (glusterfs) score 809 or sacrifice child
Killed process 29143, UID 0, (glusterfs) total-vm:7040816kB, anon-rss:3650900kB, file-rss:284kB
[root@darrel f]# 
[root@darrel f]# ls
ls: cannot open directory .: Transport endpoint is not connected
[root@darrel f]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_darrel-lv_root
                       50G   34G   13G  73% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/vda1             485M   63M  397M  14% /boot
/dev/mapper/vg_darrel-lv_home
                       16G  168M   15G   2% /home
df: `/mnt/vol-dr': Transport endpoint is not connected
10.70.36.35:/vol-dr    11T   16G   11T   1% /mnt/nvol-dr
[root@darrel f]# 



Version-Release number of selected component (if applicable):
=============================================================

[root@darrel n]# rpm -qa | grep gluster 
glusterfs-devel-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-rdma-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-debuginfo-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-fuse-3.4.0.12rhs.beta1-1.el6.x86_64
[root@darrel n]# 


Steps Carried:
==============
1. Created and started a 6x2 (distributed-replicate) setup from 4 server nodes. (A hedged, command-level sketch of these steps follows the list.)
2. Mounted the volume on client darrel (fuse and NFS).
3. Set the volume options:

cluster.background-self-heal-count: 0
cluster.self-heal-daemon: off

4. Created directories f and n from the fuse mount.
5. cd to f on the fuse mount.
6. cd to n on the NFS mount.
7. Ran script1.sh from the fuse (f) and NFS (n) mount directories.
8. Brought down all the bricks on server2 (kill -9).
9. Brought down server4 (powered off).
10. Ran script2.sh from the fuse (f) and NFS (n) mount directories.
11. It finished on the NFS (n) directory and was still running on the fuse (f) directory.
12. Reran script2.sh from the NFS (n) directory.
13. Meanwhile, the fuse client hit OOM.
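
A minimal, hedged sketch of steps 1-10 as concrete commands, for reference only: the server hostnames (server1-server4), brick paths (/rhs/brick1-3), and the dd loop standing in for script2.sh are assumptions, since the actual scripts are not attached to this bug. The volume name and mount points are taken from the df output above.

# Step 1: 6x2 distributed-replicate volume across 4 servers (12 bricks)
# Hostnames and brick paths below are assumptions, not from this report.
gluster volume create vol-dr replica 2 \
    server1:/rhs/brick1 server2:/rhs/brick1 \
    server3:/rhs/brick1 server4:/rhs/brick1 \
    server1:/rhs/brick2 server2:/rhs/brick2 \
    server3:/rhs/brick2 server4:/rhs/brick2 \
    server1:/rhs/brick3 server2:/rhs/brick3 \
    server3:/rhs/brick3 server4:/rhs/brick3
gluster volume start vol-dr

# Step 3: volume options
gluster volume set vol-dr cluster.background-self-heal-count 0
gluster volume set vol-dr cluster.self-heal-daemon off

# Steps 2 and 4-6: mounts and working directories on client darrel
mount -t glusterfs server1:/vol-dr /mnt/vol-dr            # fuse mount
mount -t nfs -o vers=3,tcp server1:/vol-dr /mnt/nvol-dr   # NFS mount
mkdir -p /mnt/vol-dr/f /mnt/nvol-dr/n

# Steps 8-9: take one brick of every replica pair offline
ssh server2 'pkill -9 glusterfsd'   # kill -9 all brick processes on server2
# server4 is powered off out of band

# Step 10: sustained I/O from the fuse mount (stand-in for script2.sh)
cd /mnt/vol-dr/f
for i in $(seq 1 1000); do
    dd if=/dev/zero of=file.$i bs=1M count=100
done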

Actual results:
===============

Out of memory: Kill process 29143 (glusterfs) score 809 or sacrifice child
Killed process 29143, UID 0, (glusterfs) total-vm:7040816kB, anon-rss:3650900kB, file-rss:284kB
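
Reading the numbers above: the process table reports 912796 resident pages for PID 29143; at 4 KiB per page that is 912796 x 4 = 3,651,184 kB, which matches anon-rss (3650900 kB) plus file-rss (284 kB) in the kill message, i.e. roughly 3.5 GiB resident for the fuse client on a box with about 4 GiB of RAM and swap already exhausted (Free swap = 0 kB).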


Expected results:
=================

The fuse client should not hit OOM.
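
For re-verification, one generic way to watch the client's memory growth before the OOM killer fires (not part of the original report) is to sample the fuse process RSS and request periodic statedumps; sending SIGUSR1 to a glusterfs process makes it write a statedump, by default under /var/run/gluster. The PID lookup pattern and sampling interval below are assumptions.

# Sketch: sample the fuse client's RSS and request a statedump every 5 minutes.
CLIENT_PID=$(pgrep -f 'glusterfs.*vol-dr' | head -n 1)
for i in $(seq 1 12); do
    echo "$(date) rss_kB=$(ps -o rss= -p "$CLIENT_PID")"
    kill -USR1 "$CLIENT_PID"    # SIGUSR1 asks glusterfs to write a statedump
    sleep 300
done
ls -lt /var/run/gluster/        # dumps include per-xlator mempool/iobuf usage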
Comment 6 Ravishankar N 2013-08-02 02:47:28 EDT
Patch review URL:
https://code.engineering.redhat.com/gerrit/11061
Comment 8 Scott Haines 2013-09-23 18:35:40 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html
