Bug 1415632

Summary: Systemic setup brick process got killed due to Out Of Memory (OOM)
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: core
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Severity: urgent
Priority: unspecified
Status: CLOSED DUPLICATE
Reporter: Nag Pavan Chilakam <nchilaka>
Assignee: Mohit Agrawal <moagrawa>
QA Contact: Rahul Hinduja <rhinduja>
CC: amukherj, moagrawa, nchilaka, rcyriac, rhs-bugs, storage-qa-internal
Type: Bug
Last Closed: 2017-01-24 15:29:14 UTC

Description Nag Pavan Chilakam 2017-01-23 10:04:47 UTC
Description of problem:
========================
During my systemic testing, I hit issues that were reported separately as below:
https://bugzilla.redhat.com/show_bug.cgi?id=1415178 ===>systemic testing: seeing lot of ping time outs which would lead to splitbrains 
https://bugzilla.redhat.com/show_bug.cgi?id=1415101 ===> glustershd process crashed on systemic setup 


I left the setup untouched over the weekend for further debugging, but with the previously triggered IOs still running.

Today I saw that the brick#3 (dht-subvol 2) process got killed due to OOM.

One note: the shd crashed on node#4 (for which the above BZ was reported), and this brick is the replica pair of the node#4 brick.

Note: node2 (10.70.37.86) was hung with kernel hangs and hence had to be rebooted.
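
A quick, generic way to confirm which process the kernel's OOM killer reaped is to scan the kernel log; a rough sketch (the grep patterns and date window are only illustrative):

    # look for OOM killer activity in the kernel ring buffer
    dmesg -T | grep -iE 'out of memory|killed process'
    # or, on systemd hosts, the kernel messages from the journal for the same window
    journalctl -k --since '2017-01-20' | grep -iE 'out of memory|killed process'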



[root@dhcp35-156 bricks]# gluster v status
Status of volume: systemic
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.20:/rhs/brick1/systemic      49152     0          Y       4154 
Brick 10.70.37.86:/rhs/brick1/systemic      49152     0          Y       2285 
Brick 10.70.35.156:/rhs/brick1/systemic     N/A       N/A        N       N/A  
Brick 10.70.37.154:/rhs/brick1/systemic     N/A       N/A        N       N/A  
Brick 10.70.35.20:/rhs/brick2/systemic      N/A       N/A        N       N/A  
Brick 10.70.37.86:/rhs/brick2/systemic      49153     0          Y       2290 
Brick 10.70.35.156:/rhs/brick2/systemic     49153     0          Y       4045 
Brick 10.70.37.154:/rhs/brick2/systemic     49153     0          Y       3873 
Self-heal Daemon on localhost               N/A       N/A        Y       19630
Quota Daemon on localhost                   N/A       N/A        Y       19638
Self-heal Daemon on 10.70.37.86             N/A       N/A        Y       16060
Quota Daemon on 10.70.37.86                 N/A       N/A        Y       16068
Self-heal Daemon on dhcp35-20.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       19131
Quota Daemon on dhcp35-20.lab.eng.blr.redha
t.com                                       N/A       N/A        Y       19139
Self-heal Daemon on 10.70.37.154            N/A       N/A        Y       30598
Quota Daemon on 10.70.37.154                N/A       N/A        Y       30606
 
Task Status of Volume systemic
------------------------------------------------------------------------------
There are no active volume tasks







 free:43291 free_pcp:61 free_cma:0
[Sat Jan 21 07:28:01 2017] Node 0 DMA free:15908kB min:60kB low:72kB high:88kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Sat Jan 21 07:28:01 2017] lowmem_reserve[]: 0 2815 16475 16475
[Sat Jan 21 07:28:01 2017] Node 0 DMA32 free:85160kB min:11532kB low:14412kB high:17296kB active_anon:1944336kB inactive_anon:546052kB active_file:0kB inactive_file:68kB unevictable:4524kB isolated(anon):0kB isolated(file):0kB present:3129332kB managed:2883568kB mlocked:4524kB dirty:0kB writeback:0kB mapped:2120kB shmem:556kB slab_reclaimable:10696kB slab_unreclaimable:277408kB kernel_stack:2016kB pagetables:5736kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3 all_unreclaimable? yes
[Sat Jan 21 07:28:01 2017] lowmem_reserve[]: 0 0 13659 13659
[Sat Jan 21 07:28:01 2017] Node 0 Normal free:72096kB min:55984kB low:69980kB high:83976kB active_anon:12328876kB inactive_anon:1232716kB active_file:424kB inactive_file:352kB unevictable:27908kB isolated(anon):0kB isolated(file):0kB present:14262272kB managed:13987552kB mlocked:27908kB dirty:0kB writeback:44kB mapped:6116kB shmem:2528kB slab_reclaimable:33824kB slab_unreclaimable:68868kB kernel_stack:4176kB pagetables:50240kB unstable:0kB bounce:0kB free_pcp:240kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1219 all_unreclaimable? yes
[Sat Jan 21 07:28:01 2017] lowmem_reserve[]: 0 0 0 0
[Sat Jan 21 07:28:01 2017] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[Sat Jan 21 07:28:01 2017] Node 0 DMA32: 21331*4kB (UEM) 1*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 85332kB
[Sat Jan 21 07:28:01 2017] Node 0 Normal: 17899*4kB (UEM) 29*8kB (UEM) 23*16kB (M) 2*32kB (M) 1*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 72324kB
[Sat Jan 21 07:28:01 2017] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Jan 21 07:28:01 2017] 33005 total pagecache pages
[Sat Jan 21 07:28:01 2017] 29897 pages in swap cache
[Sat Jan 21 07:28:01 2017] Swap cache stats: add 295752345, delete 295722448, find 92579823/127325977
[Sat Jan 21 07:28:01 2017] Free swap  = 0kB
[Sat Jan 21 07:28:01 2017] Total swap = 8585212kB
[Sat Jan 21 07:28:01 2017] 4351899 pages RAM
[Sat Jan 21 07:28:01 2017] 0 pages HighMem/MovableOnly
[Sat Jan 21 07:28:01 2017] 130142 pages reserved
[Sat Jan 21 07:28:01 2017] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[Sat Jan 21 07:28:01 2017] [  561]     0   561     9204      277      22       47             0 systemd-journal
[Sat Jan 21 07:28:01 2017] [  580]     0   580    87696      321      35     1470             0 lvmetad
[Sat Jan 21 07:28:01 2017] [  609]     0   609    79648     1811      36        0         -1000 multipathd
[Sat Jan 21 07:28:01 2017] [  616]     0   616    11745      272      23      615         -1000 systemd-udevd
[Sat Jan 21 07:28:01 2017] [  787]     0   787    46853     7504      44        0         -1000 dmeventd
[Sat Jan 21 07:28:01 2017] [  889]     0   889    13854      214      27       90         -1000 auditd
[Sat Jan 21 07:28:01 2017] [  909]     0   909     4202      201      14       46             0 alsactl
[Sat Jan 21 07:28:01 2017] [  910]     0   910     1092      105       8       16             0 rngd
[Sat Jan 21 07:28:01 2017] [  911]     0   911    53103      262      57      384             0 abrtd
[Sat Jan 21 07:28:01 2017] [  912]    81   912    24620      330      17       85          -900 dbus-daemon
[Sat Jan 21 07:28:01 2017] [  914]   997   914    28962      282      28       72             0 chronyd
[Sat Jan 21 07:28:01 2017] [  925]     0   925     4826      212      13       42             0 irqbalance
[Sat Jan 21 07:28:01 2017] [  926]   998   926   132071      328      55     1668             0 polkitd
[Sat Jan 21 07:28:01 2017] [  934]     0   934    52548      238      53      326             0 abrt-watch-log
[Sat Jan 21 07:28:01 2017] [  939]     0   939     6070      190      16      124             0 smartd
[Sat Jan 21 07:28:01 2017] [  940]     0   940     6048      252      16       37             0 systemd-logind
[Sat Jan 21 07:28:01 2017] [  944]     0   944     1083       95       7       34             0 acpid
[Sat Jan 21 07:28:01 2017] [  957]    32   957    16237      134      34      123             0 rpcbind
[Sat Jan 21 07:28:01 2017] [  963]     0   963    82803      853      84     5442             0 firewalld
[Sat Jan 21 07:28:01 2017] [  964]     0   964    50303      144      38      114             0 gssproxy
[Sat Jan 21 07:28:01 2017] [  972]     0   972    31555      249      18      129             0 crond
[Sat Jan 21 07:28:01 2017] [  973]     0   973     6461      183      16       47             0 atd
[Sat Jan 21 07:28:01 2017] [  979]     0   979    27509      164      10       30             0 agetty
[Sat Jan 21 07:28:01 2017] [  981]     0   981   127850      539      67      268             0 NetworkManager
[Sat Jan 21 07:28:01 2017] [ 1084]     0  1084    28206      274      56     3076             0 dhclient
[Sat Jan 21 07:28:01 2017] [ 1309]     0  1309   158898      279     122      972             0 libvirtd
[Sat Jan 21 07:28:01 2017] [ 1311]   995  1311    11050      143      24      172             0 nrpe
[Sat Jan 21 07:28:01 2017] [ 1314]     0  1314    71328      230      41      316             0 rsyslogd
[Sat Jan 21 07:28:01 2017] [ 1316]     0  1316   138290      452      90     2576             0 tuned
[Sat Jan 21 07:28:01 2017] [ 1319]     0  1319   168455     1525      80     6170             0 glusterd
[Sat Jan 21 07:28:01 2017] [ 1324]     0  1324    20617       74      39      213         -1000 sshd
[Sat Jan 21 07:28:01 2017] [ 1325]     0  1325    28335      114      11       18             0 rhsmcertd
[Sat Jan 21 07:28:01 2017] [ 1421]     0  1421    22244      189      42      236             0 master
[Sat Jan 21 07:28:01 2017] [ 1429]    89  1429    22314      256      41      233             0 qmgr
[Sat Jan 21 07:28:01 2017] [ 1657]     0  1657    26978       99       9       11             0 rhnsd
[Sat Jan 21 07:28:01 2017] [ 4026]     0  4026  4351104  2481504    7524  1271633             0 glusterfsd
[Sat Jan 21 07:28:01 2017] [ 4045]     0  4045  2933885  1494036    4702   816818             0 glusterfsd
[Sat Jan 21 07:28:01 2017] [ 4066]     0  4066   643284     2959     289    16464             0 glusterfs
[Sat Jan 21 07:28:01 2017] [ 4097]     0  4097   135927      322      73    13924             0 glusterfs
[Sat Jan 21 07:28:01 2017] [ 6258]     0  6258    31927      122      18      129             0 screen
[Sat Jan 21 07:28:01 2017] [ 6259]     0  6259    29317      279      14      523             0 bash
[Sat Jan 21 07:28:01 2017] [14012]     0 14012    26974      136      10        1             0 sleep
[Sat Jan 21 07:28:01 2017] [14127]    89 14127    22270      470      43        0             0 pickup
[Sat Jan 21 07:28:01 2017] Out of memory: Kill process 4026 (glusterfsd) score 572 or sacrifice child
[Sat Jan 21 07:28:01 2017] Killed process 4026 (glusterfsd) total-vm:17404416kB, anon-rss:9926016kB, file-rss:0kB, shmem-rss:0kB

Comment 2 Nag Pavan Chilakam 2017-01-23 13:21:29 UTC
logs available @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1415632

Comment 3 Atin Mukherjee 2017-01-23 13:24:32 UTC
Can we please check whether this is a duplicate of BZ 1413351?

Comment 4 Mohit Agrawal 2017-01-24 04:08:42 UTC
Hi,

Is it possible to share a statedump of the glustershd process from the other nodes in the cluster, if it is up?

Regards
Mohit Agrawal

Comment 5 Mohit Agrawal 2017-01-24 05:47:17 UTC
(In reply to Mohit Agrawal from comment #4)
> Hi,
> 
> Is it possible to share a statedump of the glustershd process from the other
> nodes in the cluster, if it is up?
> 
> Regards
> Mohit Agrawal

Please also share the statedumps of the brick processes that are running.

Regards
Mohit Agrawal
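
For reference, a minimal sketch of how such statedumps are usually generated (standard gluster mechanisms; the volume name matches this setup, everything else is illustrative):

    # dump the state of all brick processes of the volume; files land under /var/run/gluster
    gluster volume statedump systemic
    # glustershd (like any glusterfs daemon) writes a statedump on SIGUSR1
    kill -USR1 $(pgrep -f glustershd)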

Comment 7 Nag Pavan Chilakam 2017-01-24 11:43:33 UTC
Hi Mohit, the statedumps of the bricks are available on the volume itself.
You can mount the volume and visit the path below:

/mnt/systemic/logs/rhs-client11.lab.eng.blr.redhat.com/statedumps

It has the dumps of all bricks from all nodes.

Given that this is systemic testing, it may be quite difficult to point to a single statedump.
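
For anyone else pulling these dumps, a sketch of getting at that path (the server and mount point below are illustrative; any node of the cluster should work):

    # fuse-mount the volume and browse the collected statedumps
    mount -t glusterfs 10.70.35.20:/systemic /mnt/systemic
    ls /mnt/systemic/logs/rhs-client11.lab.eng.blr.redhat.com/statedumps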

Comment 8 Mohit Agrawal 2017-01-24 12:30:01 UTC
Hi,

  Thank you for sharing the location of the statedumps. I have checked the current brick
  process statedump in /var/run/gluster; with respect to the dict leak it is similar to
  bug 1411329 (https://bugzilla.redhat.com/show_bug.cgi?id=1411329), and the current memory
  consumption of the brick process is also high:

   ps -aef | grep glusterfsd
root      4154     1 75 Jan18 ?        4-10:47:51 /usr/sbin/glusterfsd -s 10.70.35.20 --volfile-id systemic.10.70.35.20.rhs-brick1-systemic -p /var/lib/glusterd/vols/systemic/run/10.70.35.20-rhs-brick1-systemic.pid -S /var/run/gluster/ecb70f1c5d8ade863b15ff45abb9d46c.socket --brick-name /rhs/brick1/systemic -l /var/log/glusterfs/bricks/rhs-brick1-systemic.log --xlator-option *-posix.glusterd-uuid=c5954ab2-6283-4e02-a6f6-1fa6cd943f49 --brick-port 49152 --xlator-option systemic-server.listen-port=49152
root     24472 24391  0 17:56 pts/4    00:00:00 grep --color=auto glusterfsd
[root@dhcp35-20 gluster]# pmap -x 4154 | grep "total"


total kB         21724380 14717652 14714608
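
For reference, one way to pull the dict-related allocation counters out of such a statedump (a rough sketch; the dump file name is illustrative and the exact mem-type names can vary by version):

    # dict allocations appear in the per-translator memusage sections of the dump
    grep -E -A5 'usage-type gf_common_mt_(dict_t|data_t|data_pair_t) memusage' \
        /var/run/gluster/rhs-brick1-systemic.*.dump.* | grep -E 'usage-type|num_allocs'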

The dict leak issue is already fixed by the patch for bug 1411329.

Regards
Mohit Agrawal