Bug 1381452
Summary: OOM kill of nfs-ganesha on one node while the fs-sanity test suite is executed.

| Field | Value |
|---|---|
| Product | [Red Hat Storage] Red Hat Gluster Storage |
| Component | distribute |
| Version | rhgs-3.2 |
| Target Release | RHGS 3.2.0 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | unspecified |
| Keywords | Triaged |
| Reporter | Shashank Raj <sraj> |
| Assignee | Jiffin <jthottan> |
| QA Contact | Arthy Loganathan <aloganat> |
| CC | aloganat, amukherj, jthottan, kkeithle, mzywusko, ndevos, rcyriac, rhs-bugs, sbhaloth, skoduri, storage-qa-internal, tdesala |
| Fixed In Version | glusterfs-3.8.4-7 |
| Cloned To | 1397052 (view as bug list) |
| Bug Blocks | 1351528, 1397052, 1401021, 1401023, 1401029, 1401032 |
| Type | Bug |
| Last Closed | 2017-03-23 06:07:42 UTC |
Description (Shashank Raj, 2016-10-04 07:01:04 UTC)
sosreports and logs can be accessed at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1381452

Shashank, please turn off features.cache-invalidation for that volume and re-run the tests. If the oom_score of the ganesha process is still high after the tests complete, tune the number of nfs-ganesha worker threads down to 16 using the config option below and re-try the tests.

    NFS_Core_Param {
        Nb_Worker = 16;
    }

Tried running posix_compliance again with features.cache-invalidation both on and off, and I am not able to reproduce this issue again.

So it seems some other test in fs-sanity is the culprit for this issue.

Will keep trying it and update the bug accordingly. For now, changing the bug title as appropriate.

(In reply to Shashank Raj from comment #4)
> Tried running posix_compliance again with features.cache-invalidation
> both on and off, and I am not able to reproduce this issue again.
>
> So it seems some other test in fs-sanity is the culprit for this issue.
>
> Will keep trying it and update the bug accordingly. For now, changing the
> bug title as appropriate.

Surabhi, could you please check the same and update the bug with the details of the test which may be causing this issue.

For a 6x2 volume, while the posix_compliance tests are executing, ganesha always gets OOM-killed on the node doing the mount once its oom_score reaches ~870. sosreports are at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1381452/

I have tried volumes with fewer bricks, such as a plain distribute volume with 2 bricks and a 1x2 volume, and the issue is not seen there. As Jiffin suggested, I executed the following test against the 6x2 volume:

    prove -vf /opt/qa/tools/posix-testsuite/tests/rename/00.t

The oom_score increases drastically while this test is running.
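For reference, a quick way to watch that growth from another shell while the test runs; this is a minimal sketch using standard procfs files, not part of the test suite (the one-second interval is arbitrary):

```sh
#!/bin/sh
# Poll nfs-ganesha's resident set size and kernel OOM score until the
# process dies; the kill above fired once oom_score reached ~870.
pid=$(pidof ganesha.nfsd)
while kill -0 "$pid" 2>/dev/null; do
    printf '%s rss_kb=%s oom_score=%s\n' "$(date +%T)" \
        "$(awk '/^VmRSS/ {print $2}' /proc/$pid/status)" \
        "$(cat /proc/$pid/oom_score)"
    sleep 1
done
echo "ganesha.nfsd ($pid) exited; check dmesg for the OOM kill record"
```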
dmesg output at the time of the kill:

[248560.640500] Call Trace:
[248560.640511] [<ffffffff81685eac>] dump_stack+0x19/0x1b
[248560.640516] [<ffffffff81680e57>] dump_header+0x8e/0x225
[248560.640523] [<ffffffff812ae71b>] ? cred_has_capability+0x6b/0x120
[248560.640530] [<ffffffff8113cb03>] ? delayacct_end+0x33/0xb0
[248560.640537] [<ffffffff8118460e>] oom_kill_process+0x24e/0x3c0
[248560.640542] [<ffffffff810936ce>] ? has_capability_noaudit+0x1e/0x30
[248560.640545] [<ffffffff81184e46>] out_of_memory+0x4b6/0x4f0
[248560.640548] [<ffffffff81681960>] __alloc_pages_slowpath+0x5d7/0x725
[248560.640552] [<ffffffff8118af55>] __alloc_pages_nodemask+0x405/0x420
[248560.640556] [<ffffffff811cf10a>] alloc_pages_current+0xaa/0x170
[248560.640563] [<ffffffff8106a587>] pte_alloc_one+0x17/0x40
[248560.640568] [<ffffffff811adb23>] __pte_alloc+0x23/0x170
[248560.640571] [<ffffffff811b1535>] handle_mm_fault+0xe25/0xfe0
[248560.640574] [<ffffffff811b76d5>] ? do_mmap_pgoff+0x305/0x3c0
[248560.640579] [<ffffffff81691994>] __do_page_fault+0x154/0x450
[248560.640581] [<ffffffff81691cc5>] do_page_fault+0x35/0x90
[248560.640584] [<ffffffff8168df88>] page_fault+0x28/0x30
[248560.640586] Mem-Info:
[248560.640591] active_anon:1620957 inactive_anon:292997 isolated_anon:0 active_file:0 inactive_file:974 isolated_file:0 unevictable:6562 dirty:0 writeback:0 unstable:0 slab_reclaimable:7116 slab_unreclaimable:13556 mapped:5683 shmem:8641 pagetables:7532 bounce:0 free:25150 free_pcp:474 free_cma:0
[248560.640595] Node 0 DMA free:15852kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15936kB managed:15852kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[248560.640602] lowmem_reserve[]: 0 3327 7805 7805
[248560.640605] Node 0 DMA32 free:46200kB min:28752kB low:35940kB high:43128kB active_anon:2727832kB inactive_anon:545796kB active_file:0kB inactive_file:2536kB unevictable:16040kB isolated(anon):0kB isolated(file):0kB present:3653620kB managed:3408880kB mlocked:16040kB dirty:0kB writeback:0kB mapped:16596kB shmem:12260kB slab_reclaimable:9708kB slab_unreclaimable:24740kB kernel_stack:4944kB pagetables:11448kB unstable:0kB bounce:0kB free_pcp:792kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:285 all_unreclaimable? yes
[248560.640611] lowmem_reserve[]: 0 0 4478 4478
[248560.640613] Node 0 Normal free:38548kB min:38696kB low:48368kB high:58044kB active_anon:3755996kB inactive_anon:626192kB active_file:0kB inactive_file:1360kB unevictable:10208kB isolated(anon):0kB isolated(file):0kB present:4718592kB managed:4585756kB mlocked:10208kB dirty:0kB writeback:0kB mapped:6136kB shmem:22304kB slab_reclaimable:18756kB slab_unreclaimable:29484kB kernel_stack:7872kB pagetables:18680kB unstable:0kB bounce:0kB free_pcp:1104kB local_pcp:160kB free_cma:0kB writeback_tmp:0kB pages_scanned:1049 all_unreclaimable? yes
[248560.640618] lowmem_reserve[]: 0 0 0 0
[248560.640620] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15852kB
[248560.640630] Node 0 DMA32: 1564*4kB (UE) 956*8kB (UE) 723*16kB (UEM) 388*32kB (UEM) 120*64kB (UEM) 5*128kB (EM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 46208kB
[248560.640639] Node 0 Normal: 1174*4kB (UEM) 1024*8kB (UEM) 691*16kB (UEM) 305*32kB (UEM) 53*64kB (UEM) 9*128kB (M) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 38504kB
[248560.640649] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[248560.640651] 18106 total pagecache pages
[248560.640653] 6146 pages in swap cache
[248560.640654] Swap cache stats: add 1107386, delete 1101240, find 294696/305552
[248560.640655] Free swap = 0kB
[248560.640656] Total swap = 2097148kB
[248560.640657] 2097037 pages RAM
[248560.640658] 0 pages HighMem/MovableOnly
[248560.640659] 94415 pages reserved
[248560.640660] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[248560.640666] [  685]   0   685    17664    2179   39     49     0 systemd-journal
[248560.640668] [  716]   0   716   220817     676   46   1476     0 lvmetad
[248560.640671] [  722]   0   722    11679     635   22    546 -1000 systemd-udevd
[248560.640675] [  881]   0   881   179084    6113   49      0 -1000 dmeventd
[248560.640685] [ 1273]   0  1273    13854     234   26     89 -1000 auditd
[248560.640688] [ 1292]   0  1292     4826     217   14     37     0 irqbalance
[248560.640690] [ 1293]  81  1293     8197     262   17     71  -900 dbus-daemon
[248560.640692] [ 1296]   0  1296     6156     261   15    138     0 systemd-logind
[248560.640695] [ 1299] 998  1299   132067     351   54   1894     0 polkitd
[248560.640697] [ 1310] 997  1310    28962     310   26     42     0 chronyd
[248560.640699] [ 1311]  32  1311    16237     175   34    104     0 rpcbind
[248560.640701] [ 1322]   0  1322    50303     142   40    114     0 gssproxy
[248560.640704] [ 1334]   0  1334    82865     469   84   5904     0 firewalld
[248560.640706] [ 1691]   0  1691    28206     115   52   3081     0 dhclient
[248560.640708] [ 1785]   0  1785    28335      98   12     37     0 rhsmcertd
[248560.640710] [ 1787]   0  1787   138291     385   87   2567     0 tuned
[248560.640712] [ 1798]   0  1798    20617      91   42    190 -1000 sshd
[248560.640715] [ 1916]   0  1916    22244     222   42    238     0 master
[248560.640717] [ 1918]  89  1918    22287     245   44    236     0 qmgr
[248560.640719] [ 2334]   0  2334    31556     209   17    133     0 crond
[248560.640721] [ 2385]   0  2385    26978     101    8     37     0 rhnsd
[248560.640723] [ 2388]   0  2388    27509     164   10     33     0 agetty
[248560.640726] [17375]  29 17375    10605     230   24    177     0 rpc.statd
[248560.640728] [16763]   0 16763    72838    1270   59    105     0 rsyslogd
[248560.640730] [16951]   0 16951   151619     470   86  12040     0 glusterd
[248560.640733] [27747]   0 27747   428530    2595  125  10071     0 glusterfsd
[248560.640735] [27962]   0 27962   226969    5025   89   6433     0 glusterfs
[248560.640737] [29536]   0 29536    49589    2611   63   2017     0 corosync
[248560.640739] [29552]   0 29552    33157     377   64   1026     0 pacemakerd
[248560.640741] [29554] 189 29554    35595    2224   72   1416     0 cib
[248560.640744] [29555]   0 29555    34361     885   69    479     0 stonithd
[248560.640746] [29556]   0 29556    26273     371   52    228     0 lrmd
[248560.640748] [29557] 189 29557    31731     940   64    345     0 attrd
[248560.640750] [29558] 189 29558    38963    2038   71    241     0 pengine
[248560.640752] [29559] 189 29559    47014    2147   79    880     0 crmd
[248560.640754] [29577]   0 29577   244360    8064   98   2064     0 pcsd
[248560.640757] [ 6278]   0  6278  3262386 1857506 4856 406101     0 ganesha.nfsd
[248560.640759] [22343]   0 22343    35726     306   72    290     0 sshd
[248560.640761] [22358]   0 22358    28879     278   14     48     0 bash
[248560.640764] [27763]   0 27763   330732    1777  113   8853     0 glusterfsd
[248560.640767] [27785]   0 27785   330733    2281  115   9062     0 glusterfsd
[248560.640769] [27804]   0 27804   314090    3314  110   8708     0 glusterfsd
[248560.640771] [27836]   0 27836   255249    6445  106  14561     0 glusterfs
[248560.640773] [22453]  89 22453    22270     479   42      0     0 pickup
[248560.640776] [ 5088]   0  5088    35726     635   71      0     0 sshd
[248560.640778] [ 5111]   0  5111    28879     319   15      0     0 bash
[248560.640780] [11672]   0 11672    26984     136   10      0     0 tail
[248560.640782] [14710]   0 14710    35726     581   72      0     0 sshd
[248560.640784] [14745]   0 14745    28879     311   14      0     0 bash
[248560.640787] [15813]   0 15813    28910     333   14      0     0 ganesha_grace
[248560.640789] [15819]   0 15819    28910     180   10      0     0 ganesha_grace
[248560.640791] [15820]   0 15820    30197     552   62      0     0 crm_attribute
[248560.640793] [15821]   0 15821    28877     274   14      0     0 portblock
[248560.640795] [15824]   0 15824    28811     185   14      0     0 ganesha_mon
[248560.640797] [15825]   0 15825    28877     125   10      0     0 portblock
[248560.640800] [15826]   0 15826    28811      98   11      0     0 ganesha_mon
[248560.640801] [15827]   0 15827    26974     127   10      0     0 basename
[248560.640803] Out of memory: Kill process 6278 (ganesha.nfsd) score 870 or sacrifice child
[248560.640886] Killed process 6278 (ganesha.nfsd) total-vm:13049544kB, anon-rss:7430024kB, file-rss:0kB, shmem-rss:0kB

Basically, the following part of the test hangs (a scripted sketch of these steps follows below):

1.) create a file with 0644 permissions
2.) rename the file as a non-root user; it fails
3.) delete the file
4.) create a directory with the same name
5.) rename the directory as a non-root user; it hangs

RCA: When a rename fails as a non-root user, the linkto file created by DHT is not removed properly and remains as a stale entry. On the next rename call, a lookup is performed; the lookup tries to remove the stale entry but fails, and the client keeps retrying the removal, which always returns EPERM. The linkto file (an mknod call) is always created as root, even on behalf of a non-root user, but the cleanup of this file runs as the original user, which is why it fails.

Patch posted upstream for review: http://review.gluster.org/#/c/15894/1
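A minimal shell sketch of those five steps, assuming an nfs-ganesha mount at /mnt/nfs and a non-root user named testuser (both hypothetical; the authoritative reproducer is the rename/00.t script itself):

```sh
#!/bin/sh
# Run as root against an NFS-Ganesha mount of the 6x2 volume.
cd /mnt/nfs || exit 1

touch testfile
chmod 0644 testfile                    # 1) create a file with 0644 permissions
sudo -u testuser mv testfile renamed   # 2) rename as non-root: fails (no write
                                       #    permission on the root-owned directory),
                                       #    leaving a stale dht linkto file behind
rm -f testfile                         # 3) delete the file
mkdir testfile                         # 4) create a directory with the same name
sudo -u testuser mv testfile renamed   # 5) rename as non-root: this call hangs while
                                       #    ganesha.nfsd memory climbs toward the OOM kill
```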
As per the triaging, we all agree that this BZ has to be fixed in rhgs-3.2.0. Providing qa_ack.

The above issue happens when the rename/00.t test is executed on nfs-ganesha clients. Steps executed in that script:

* create a file as root
* rename the file as a non-root user; it fails with EACCES
* delete the file
* create a directory with the same name as root
* rename the directory as a non-root user; the test hangs and slowly leads to the OOM kill of ganesha

RCA put forward by Du for the OOM kill of ganesha:

Note that when we hit this bug, we have a scenario of a dentry being present as:
* a linkto file on one subvol
* a directory on the rest of the subvols

When a lookup happens on the dentry in such a scenario, control flow goes into an infinite loop of:

dht_lookup_everywhere
dht_lookup_everywhere_cbk
dht_lookup_unlink_cbk
dht_lookup_everywhere_done
dht_lookup_directory (as local->dir_count > 0)
dht_lookup_dir_cbk (sets local->need_selfheal = 1, as the entry is a linkto file on one of the subvols)
dht_lookup_everywhere (as need_selfheal = 1)

This infinite loop causes increased memory consumption because:

1) dht_lookup_directory assigns a new layout to local->layout unconditionally
2) Most of the functions in this loop do a stack_wind of various fops, which grows the call stack (note that the call stack is destroyed only after the lookup response is received by fuse, which never happens in this case)

The OOM kill of nfs-ganesha is no longer seen in the latest build when the posix compliance tests are run:

nfs-ganesha-2.4.1-2.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-2.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-8.el7rhgs.x86_64

However, the following tests in the posix compliance test suite are failing, for which a different bug (1404367) has been raised.

Test Summary Report
-------------------
/opt/qa/tools/posix-testsuite/tests/chown/00.t (Wstat: 0 Tests: 171 Failed: 1)
  Failed test: 77
/opt/qa/tools/posix-testsuite/tests/link/00.t (Wstat: 0 Tests: 82 Failed: 1)
  Failed test: 77
/opt/qa/tools/posix-testsuite/tests/open/07.t (Wstat: 0 Tests: 23 Failed: 3)
  Failed tests: 5, 7, 9
Files=185, Tests=1962, 132 wallclock secs ( 1.57 usr 0.59 sys + 16.39 cusr 33.22 csys = 51.77 CPU)
Result: FAIL
end: 15:20:31

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
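Given the Fixed In Version noted above (glusterfs-3.8.4-7), a quick way to confirm that a node carries the fix (standard rpm usage; package names as they appear in this report):

```sh
rpm -q glusterfs glusterfs-ganesha nfs-ganesha nfs-ganesha-gluster
```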