Bug 1456231 - gluster-blockd gets OOM killed
Summary: gluster-blockd gets OOM killed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gluster-block
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Prasanna Kumar Kalever
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Depends On:
Blocks: 1417151
 
Reported: 2017-05-28 04:21 UTC by Pranith Kumar K
Modified: 2017-09-21 04:19 UTC (History)

Fixed In Version: gluster-block-0.2.1-1.el7rhgs
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-21 04:19:33 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2017:2773 | 0 | normal | SHIPPED_LIVE | new packages: gluster-block | 2017-09-21 08:16:22 UTC

Description Pranith Kumar K 2017-05-28 04:21:02 UTC
Description of problem:
Create two replica 3 volumes and enable the gluster-block profile on them.

On machine-1 execute:
for i in {1..200}; do gluster-block create vol1/block1 ha 3 192.168.122.61,192.168.122.123,192.168.122.113 1GiB && gluster-block delete vol1/block1; done

On machine-2 execute:
for i in {1..200}; do gluster-block create vol2/block2 ha 3 192.168.122.61,192.168.122.123,192.168.122.113 1GiB && gluster-block delete vol2/block2; done

After 40 minutes, gluster-blockd on two machines was killed by the OOM killer:

[ 5941.594297] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[ 5941.594302] [  520]     0   520    15454       46      31       3      100             0 systemd-journal
[ 5941.594303] [  550]     0   550    11854        0      26       3      574         -1000 systemd-udevd
[ 5941.594304] [  661]     0   661    13888        6      28       3      102         -1000 auditd
[ 5941.594305] [  671]     0   671    21136        0      13       3       54             0 audispd
[ 5941.594306] [  673]     0   673    10907        0      27       3       86             0 sedispatch
[ 5941.594307] [  686]    70   686    12547        0      29       3      104             0 avahi-daemon
[ 5941.594308] [  687]     0   687     6425        0      17       3       69             0 qemu-ga
[ 5941.594309] [  688]     0   688     1104        0       7       3       28             0 rngd
[ 5941.594310] [  689]     0   689    12545       19      28       3      414             0 systemd-logind
[ 5941.594311] [  691]     0   691     1642        1       9       3       29             0 mcelog
[ 5941.594312] [  692]   172   692    46710        0      28       4       86             0 rtkit-daemon
[ 5941.594313] [  693]    70   693    12547        0      27       3       85             0 avahi-daemon
[ 5941.594313] [  694]     0   694     4220        1      15       3       50             0 alsactl
[ 5941.594314] [  698]    81   698    14287        0      27       3      328          -900 dbus-daemon
[ 5941.594315] [  700]     0   700    51153        0      34       3      126             0 gssproxy
[ 5941.594316] [  709]     0   709   130329        2     102       4     5261             0 firewalld
[ 5941.594317] [  710]     0   710   103902        0      70       3      442             0 ModemManager
[ 5941.594318] [  712]     0   712    98766        0      44       4      212             0 accounts-daemon
[ 5941.594319] [  726]     0   726    10025        0      22       3       69             0 spice-vdagentd
[ 5941.594320] [  731]   998   731   133327        0      58       4     1407             0 polkitd
[ 5941.594321] [  739]     0   739   111762        0      66       3      358             0 abrtd
[ 5941.594322] [  741]   996   741    25760       12      21       3       86             0 chronyd
[ 5941.594322] [  761]     0   761   132237       22     163       3      312             0 abrt-dump-journ
[ 5941.594323] [  763]     0   763   132236       19     153       5      308             0 abrt-dump-journ
[ 5941.594324] [  774]     0   774   158221      240      86       4      378             0 NetworkManager
[ 5941.594325] [  797]     0   797    20785        1      43       4      212         -1000 sshd
[ 5941.594326] [  801]     0   801     6490        0      19       3       51             0 atd
[ 5941.594327] [  804]     0   804    33233       21      17       3      137             0 crond
[ 5941.594328] [  805]     0   805   102391        0      47       3      279             0 gdm
[ 5941.594329] [  899]     0   899    92978        1      64       3      302             0 gdm-session-wor
[ 5941.594330] [  913]    42   913    16494        1      36       3      257             0 systemd
[ 5941.594330] [  918]     0   918    21781        1      46       3     3074             0 dhclient
[ 5941.594331] [  932]    42   932    24738        0      49       3      748             0 (sd-pam)
[ 5941.594332] [  957]    42   957   112433        1      95       3      351             0 gdm-wayland-ses
[ 5941.594333] [  964]    42   964    14144        1      27       3      163             0 dbus-daemon
[ 5941.594334] [  967]    42   967   172779        0     113       4      466             0 gnome-session-b
[ 5941.594335] [  976]    42   976   407794     5908     338       4    22976             0 gnome-shell
[ 5941.594336] [  994]     0   994   107134        0      54       3      308             0 upowerd
[ 5941.594337] [ 1010]    42  1010    50712       18      97       3     1929             0 Xwayland
[ 5941.594338] [ 1012]    42  1012    86186        0      37       3      175             0 at-spi-bus-laun
[ 5941.594338] [ 1017]    42  1017    14074        0      27       3      109             0 dbus-daemon
[ 5941.594339] [ 1020]    42  1020    55841        0      44       3      195             0 at-spi2-registr
[ 5941.594340] [ 1026]    42  1026   164909        1      86       3      685             0 pulseaudio
[ 5941.594341] [ 1039]    42  1039   114957       22      42       5      578             0 ibus-daemon
[ 5941.594342] [ 1042]    42  1042    95605        0      36       3      163             0 ibus-dconf
[ 5941.594343] [ 1045]    42  1045   140030        1     152       3     2053             0 ibus-x11
[ 5941.594343] [ 1053]    42  1053   109779        0      61       3      326             0 xdg-permission-
[ 5941.594344] [ 1062]     0  1062    16544       13      34       3      158             0 wpa_supplicant
[ 5941.594345] [ 1063]     0  1063   244587       19     226       4    53465             0 packagekitd
[ 5941.594346] [ 1070]    42  1070   258935       69     185       4     1590             0 gnome-settings-
[ 5941.594347] [ 1072]    42  1072    10338        0      25       3       97             0 spice-vdagent
[ 5941.594348] [ 1090]   995  1090   103899        0      52       4      726             0 colord
[ 5941.594349] [ 1103]    42  1103    77156        0      35       3      161             0 ibus-engine-sim
[ 5941.594349] [ 1175]     0  1175    15963        1      34       3      238             0 systemd
[ 5941.594350] [ 1192]     0  1192    44699        0      51       3      775             0 (sd-pam)
[ 5941.594351] [ 5123]     0  5123    37727        1      74       3      320             0 sshd
[ 5941.594352] [ 5129]     0  5129    37727        0      72       3      337             0 sshd
[ 5941.594353] [ 5142]     0  5142    30958        1      15       3      531             0 bash
[ 5941.594354] [ 5680]     0  5680 5369979246    88629    1330     130    77195             0 tcmu-runner
[ 5941.594355] [ 6078]     0  6078 5368816741     1378     296     120    51789             0 glusterd
[ 5941.594356] [ 6210]    32  6210    14326        0      31       3      124             0 rpcbind
[ 5941.594356] [ 6212]     0  6212 5370501607   197078    2154     133   215162             0 gluster-blockd
[ 5941.594357] [ 6215]     0  6215    37727        6      76       3      313             0 sshd
[ 5941.594358] [ 6224]     0  6224    37727        3      75       3      319             0 sshd
[ 5941.594359] [ 6236]     0  6236    30731       30      15       3      273             0 bash
[ 5941.594360] [ 6275]     0  6275 5368838850    94596     469     126    28895             0 glusterfsd
[ 5941.594360] [ 6327]     0  6327 5368833902    76039     457     125    41244             0 glusterfsd
[ 5941.594361] [ 6351]     0  6351 5368803299     1193     185     103     5097             0 glusterfs
[ 5941.594362] Out of memory: Kill process 6212 (gluster-blockd) score 387 or sacrifice child
[ 5941.594371] Killed process 6212 (gluster-blockd) total-vm:21482006428kB, anon-rss:788312kB, file-rss:0kB, shmem-rss:0kB
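
The per-process table above reports total_vm and rss in pages, while the "Killed process" line reports kB. A minimal sketch of the conversion follows, assuming 4 KiB pages (typical for x86_64, but not stated in this report). The function name pages_to_kb is illustrative only.

```shell
#!/bin/sh
# Convert a page count from the OOM task dump to kB.
# Assumption: 4 KiB pages; pass another page size (in kB) as the second argument.
pages_to_kb() {
    pages=$1
    page_kb=${2:-4}
    echo $((pages * page_kb))
}

pages_to_kb 197078       # rss of gluster-blockd (pid 6212)      -> 788312
pages_to_kb 5370501607   # total_vm of gluster-blockd (pid 6212) -> 21482006428
```

Under that assumption, both results agree with the anon-rss:788312kB and total-vm:21482006428kB values in the kill message for pid 6212.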

Comment 2 Pranith Kumar K 2017-05-28 04:23:49 UTC
I think we need to get the caching part done for gluster-blockd as well to fix this. I was trying to see whether this test would expose any races, but it triggered the OOM killer instead. IMO we should implement caching soon.

Comment 6 Prasanna Kumar Kalever 2017-06-01 10:32:55 UTC
Here are the patches open for review:
https://review.gluster.org/#/c/17440/
https://review.gluster.org/#/c/17441/

Comment 10 Sweta Anandpara 2017-08-01 15:51:11 UTC
Cache functionality has already been validated in bz 1464421. I changed the cache size to a non-default value by setting it in /etc/sysconfig/gluster-blockd, and it showed the expected behaviour.

Also, the actual memory leak is being tracked separately in bz 1196020.

Moving this bug to verified in 3.3.0. Logs are pasted below.

[root@dhcp47-121 ~]# vim /etc/sysconfig/gluster-blockd         >> set the cache limit to '4'
[root@dhcp47-121 ~]# vim /etc/sysconfig/gluster-blockd 
[root@dhcp47-121 ~]# systemctl restart gluster-blockd
[root@dhcp47-121 ~]# 
[root@dhcp47-121 ~]# systemctl status gluster-blockd
● gluster-blockd.service - Gluster block storage utility
   Loaded: loaded (/usr/lib/systemd/system/gluster-blockd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-08-01 11:31:46 EDT; 4s ago
 Main PID: 18230 (gluster-blockd)
   CGroup: /system.slice/gluster-blockd.service
           └─18230 /usr/sbin/gluster-blockd --glfs-lru-count 4 --log-level INFO

Aug 01 11:31:46 dhcp47-121.lab.eng.blr.redhat.com systemd[1]: Started Gluster bl...
Aug 01 11:31:46 dhcp47-121.lab.eng.blr.redhat.com systemd[1]: Starting Gluster b...
Hint: Some lines were ellipsized, use -l to show in full.
[root@dhcp47-121 ~]# 
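
The status output above shows the daemon running with --glfs-lru-count 4. A small sketch of how that value could be pulled out of a command line string for a scripted check follows. The helper name lru_count_from_cmdline is hypothetical and not part of gluster-block.

```shell
#!/bin/sh
# Print the word that follows --glfs-lru-count in a command line string.
# Returns non-zero if the option is not present.
lru_count_from_cmdline() {
    prev=""
    for word in $1; do
        if [ "$prev" = "--glfs-lru-count" ]; then
            echo "$word"
            return 0
        fi
        prev=$word
    done
    return 1
}

# Command line copied from the systemctl status output above.
lru_count_from_cmdline "/usr/sbin/gluster-blockd --glfs-lru-count 4 --log-level INFO"
```

On a live node the argument could come from something like `ps -o args= -C gluster-blockd` (procps syntax), which is an assumption about the environment rather than something shown in this report.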

[root@dhcp47-121 ~]# gluster-block list vol1
bk1
bk2
bk3
bk4
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m4.235s
user	0m0.003s
sys	0m0.014s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.031s
user	0m0.006s
sys	0m0.001s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m3.117s
user	0m0.003s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m0.241s
user	0m0.002s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.032s
user	0m0.003s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.028s
user	0m0.002s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m3.504s
user	0m0.001s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m0.023s
user	0m0.003s
sys	0m0.001s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m0.030s
user	0m0.003s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.089s
user	0m0.002s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m0.027s
user	0m0.001s
sys	0m0.005s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m4.123s
user	0m0.004s
sys	0m0.006s
[root@dhcp47-121 ~]# time gluster-block list vol1
bk1
bk2
bk3
bk4

real	0m4.758s
user	0m0.004s
sys	0m0.004s
[root@dhcp47-121 ~]# time gluster-block list vol3
bk1
bk2

real	0m0.038s
user	0m0.004s
sys	0m0.010s
[root@dhcp47-121 ~]# time gluster-block list vol4
bk1
bk2

real	0m0.032s
user	0m0.001s
sys	0m0.006s
[root@dhcp47-121 ~]# time gluster-block list vol5
bk1
bk2

real	0m0.024s
user	0m0.001s
sys	0m0.003s
[root@dhcp47-121 ~]# time gluster-block list vol2
bk1
bk2

real	0m3.971s
user	0m0.003s
sys	0m0.008s
[root@dhcp47-121 ~]#
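
The timings above are consistent with a least-recently-used cache of four volumes: slow calls (3 to 5 seconds) look like misses and fast calls like hits. The sketch below is a toy simulation of that policy, not the gluster-blockd implementation; it assumes plain LRU eviction and that the untimed first `list vol1` populated the cache. The function name lru_trace is illustrative.

```shell
#!/bin/sh
# Toy LRU simulation; it does not talk to gluster-blockd.
# Usage: lru_trace CAPACITY NAME...
# Prints "NAME hit" or "NAME miss" for each access, in order.
lru_trace() {
    capacity=$1
    shift
    cache=""    # space-separated names, least recently used first
    for vol in "$@"; do
        hit=0
        kept=""
        n=0
        for entry in $cache; do
            if [ "$entry" = "$vol" ]; then
                hit=1
            else
                kept="$kept$entry "
                n=$((n + 1))
            fi
        done
        if [ "$hit" -eq 0 ] && [ "$n" -ge "$capacity" ]; then
            kept=${kept#* }    # cache full: evict the least recently used entry
        fi
        cache="$kept$vol"      # the accessed name becomes the most recent
        if [ "$hit" -eq 1 ]; then
            echo "$vol hit"
        else
            echo "$vol miss"
        fi
    done
}

# Access order taken from the log above, with the cache limit set to 4.
lru_trace 4 vol1 vol2 vol2 vol3 vol1 vol2 vol3 vol4 vol1 vol2 vol3 vol4 \
            vol5 vol1 vol3 vol4 vol5 vol2
```

In this simulation, listing vol5 evicts vol1, the following vol1 access evicts vol2, and the final vol2 access is a miss again, which lines up with the 4.758s and 3.971s calls in the log.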

Comment 13 errata-xmlrpc 2017-09-21 04:19:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2773

