Bug 1529249
Summary:            Brick Multiplexing: glusterfsd (brick) process leaks memory on an idle brick multiplex setup

Product:            [Red Hat Storage] Red Hat Gluster Storage
Component:          core
Status:             CLOSED ERRATA
Severity:           high
Priority:           high
Version:            rhgs-3.3
Target Milestone:   ---
Target Release:     RHGS 3.5.z Batch Update 3
Hardware:           Unspecified
OS:                 Unspecified
Whiteboard:         brick-multiplexing
Keywords:           ZStream
Fixed In Version:   glusterfs-6.0-38
Doc Type:           No Doc Update
Reporter:           Nag Pavan Chilakam <nchilaka>
Assignee:           Mohit Agrawal <moagrawa>
QA Contact:         Nag Pavan Chilakam <nchilaka>
CC:                 moagrawa, nchilaka, pasik, pprakash, puebele, rhs-bugs, rkothiya, sheggodu, storage-qa-internal, ubansal
Flags:              moagrawa: needinfo-
Type:               Bug
Last Closed:        2020-12-17 04:50:16 UTC
Description (Nag Pavan Chilakam, 2017-12-27 10:07:27 UTC)
Over a period of 1 hr:

3.3.1-async
===========

n1: ~6 MB

[root@dhcp35-205 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:26 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.9G     11M        401M       6.0G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
24834 root     20   0 9640864 872484  6080 S  0.0 10.9 34:33.02 glusterfsd
[root@dhcp35-205 ~]# sleep 10;date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:11 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.9G     11M        401M       6.0G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
24834 root     20   0 9714444 878944  6080 S  0.0 11.0 36:11.54 glusterfsd

n2:

[root@dhcp35-57 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:26 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.1G  6.1G     11M        393M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
19719 root     20   0 8155744 730816  6092 S  0.0  9.1 35:17.07 glusterfsd
[root@dhcp35-57 ~]# sleep 10;date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:11 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.1G  6.1G     11M        393M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
19719 root     20   0 8168224 723392  6092 S  0.0  9.0 36:55.94 glusterfsd

n3:

[root@dhcp35-169 ~]# sleep 1800;date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:40:55 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.1G  6.1G     11M        395M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
21637 root     20   0 8030728 722664  6092 S  0.0  9.0 28:57.93 glusterfsd
[root@dhcp35-169 ~]# date
Wed Dec 27 13:59:43 IST 2017
[root@dhcp35-169 ~]# sleep 10;date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:11 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.1G  6.1G     11M        395M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
21637 root     20   0 8032288 728540  6092 S  0.0  9.1 29:22.78 glusterfsd

n4: ~2 MB

[root@dhcp35-145 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:26 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.8G     11M        540M       5.9G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
 3659 root     20   0 8207900 738180  6092 S  0.0  9.2 33:21.81 glusterfsd
[root@dhcp35-145 ~]# sleep 10;date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:11 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.8G     11M        540M       5.9G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
 3659 root     20   0 8221420 739768  6092 S  0.0  9.2 34:56.23 glusterfsd

n5: ~1.5 MB

[root@dhcp35-177 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:26 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.1G  6.1G     11M        396M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
21642 root     20   0 8137948 730032  6068 S  0.0  9.1 27:49.56 glusterfsd
[root@dhcp35-177 ~]# sleep 10;date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:11 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.1G  6.1G     11M        396M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
21642 root     20   0 8145488 731984  6068 S  6.2  9.1 29:04.44 glusterfsd

n6: ~6 MB

[root@dhcp35-237 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:26 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.2G  6.1G     11M        394M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
19603 root     20   0 8385376 757876  6092 S  0.0  9.5 36:42.35 glusterfsd
[root@dhcp35-237 ~]# sleep 10;date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:11 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.2G  6.1G     11M        394M       6.2G
Swap:    7.9G    0B  7.9G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
19603 root     20   0 8409296 763800  6092 S  0.0  9.5 38:24.84 glusterfsd

##############################################################################

3.3.1
=====

n1: ~12 MB

[root@dhcp42-44 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:20 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.4G  5.4G     48M        891M       5.9G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
16614 root     20   0 9023600 816204  6024 S  0.0 10.2 39:51.15 glusterfsd
[root@dhcp42-44 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:52 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.4G  5.4G     48M        892M       5.9G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
16614 root     20   0 9159320 828104  6024 S  0.0 10.3 41:56.49 glusterfsd

n2: ~20 MB

[root@dhcp41-169 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:20 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.5G     48M        872M       6.0G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15283 root     20   0 9577948 844664  6032 S  0.0 10.5 38:13.33 glusterfsd
[root@dhcp41-169 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:52 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.5G     48M        873M       6.0G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15283 root     20   0 9747988 864216  6032 S  0.0 10.8 40:15.04 glusterfsd

n3: ~20 MB

[root@dhcp42-174 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:20 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.5G     48M        910M       6.0G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
 5832 root     20   0 8948936 805980  6032 S  0.0 10.1 38:58.78 glusterfsd
[root@dhcp42-174 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:52 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.3G  5.5G     48M        911M       6.0G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
 5832 root     20   0 9101036 826264  6032 S  0.0 10.3 41:02.06 glusterfsd

n4: ~16 MB

[root@dhcp41-216 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:20 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.4G  5.4G     48M        874M       5.9G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
16063 root     20   0 9982356 892360  6028 S  0.0 11.1 36:47.00 glusterfsd
[root@dhcp41-216 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:52 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.4G  5.4G     48M        875M       5.9G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
16063 root     20   0 9924.9m 908072  6028 S  0.0 11.3 38:44.79 glusterfsd

n5: ~7 MB

[root@dhcp41-230 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:20 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.2G  5.6G     46M        868M       6.1G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15024 root     20   0 8344840 760992  6024 S  0.0  9.5 39:03.90 glusterfsd
[root@dhcp41-230 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:52 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.2G  5.6G     46M        868M       6.1G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15024 root     20   0 8426740 767732  6024 S  5.9  9.6 41:07.52 glusterfsd

n6: ~4 MB

[root@dhcp42-150 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 13:00:20 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.2G  5.6G     46M        867M       6.1G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15091 root     20   0 8180088 741924  6024 S  0.0  9.3 37:49.72 glusterfsd
[root@dhcp42-150 ~]# date;free -h;top -n 1 -b|egrep "RES|glusterfsd"
Wed Dec 27 14:00:52 IST 2017
        total  used  free  shared  buff/cache  available
Mem:     7.6G  1.2G  5.6G     46M        868M       6.1G
Swap:    4.0G    0B  4.0G
  PID USER     PR  NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
15091 root     20   0 8261724 744476  6024 S  0.0  9.3 39:50.88 glusterfsd

From the periodic statedumps of the brick process:

[global.glusterfs - usage-type gf_common_mt_gf_timer_t memusage]
size=128
num_allocs=2
max_size=192
max_num_allocs=3
total_allocs=222

versus

[global.glusterfs - usage-type gf_common_mt_gf_timer_t memusage]
size=128
num_allocs=2
max_size=192
max_num_allocs=3
total_allocs=346

There's a suspected leak here.

Created attachment 1372786 [details]
331-ga mem difference in 1hr for glusterfsd
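For anyone reproducing these numbers, the ad-hoc date/free/top pipelines above can be wrapped into a small sampling loop. A minimal sketch (not part of the original report; the script itself, the default interval, and the log path are hypothetical, and it assumes a procps-ng ps):

#!/bin/bash
# Hypothetical helper: sample the RSS of every glusterfsd process at a fixed
# interval, one log line per PID, so per-hour growth can be read off the log.
INTERVAL=${1:-600}                      # seconds between samples (assumed default)
LOG=${2:-/var/tmp/glusterfsd-rss.log}   # assumed log location

while true; do
    ps -C glusterfsd -o pid=,rss= | while read -r pid rss; do
        # ps reports rss in KiB
        printf '%s pid=%s rss_kib=%s\n' "$(date -u +%FT%TZ)" "$pid" "$rss" >>"$LOG"
    done
    sleep "$INTERVAL"
done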
Created attachment 1372787 [details]
331-async mem difference in 1hr for glusterfsd
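The statedump comparison quoted in the description (total_allocs for gf_common_mt_gf_timer_t growing from 222 to 346 on an idle brick) can be reproduced roughly as follows. This is a sketch, not the reporter's exact procedure: gluster volume statedump is a real CLI command, but the volume name, dump directory, and filename pattern below are assumptions that vary by build and configuration:

# Capture two statedumps an hour apart, then pull the per-usage-type
# allocation counters out of each dump for a side-by-side diff.
VOL=testvol                       # hypothetical volume name
DUMPDIR=/var/run/gluster          # default dump directory (assumed)

gluster volume statedump "$VOL"
sleep 3600
gluster volume statedump "$VOL"

# Print "file  usage-type  counter" triples; total_allocs growing steadily
# on an idle brick is what this bug calls a suspected leak.
for f in "$DUMPDIR"/*.dump.*; do
    awk -v f="$f" '/usage-type/ {type=$0}
                   /^(num_allocs|total_allocs)=/ {print f, type, $0}' "$f"
done > /var/tmp/statedump-allocs.txt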
sosreports and 2 statedumps taken at a 1-hour interval are available at
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1529249/

*** Bug 1529563 has been marked as a duplicate of this bug. ***

So here is data on 3.12.2-24:

[root@dhcp35-140 ~]# date; top -n 1 -b|grep gluster
Wed Nov 14 14:57:51 IST 2018
12460 root     20   0   13.7g 403728  3688 S  5.0  5.0   1:03.91 glusterfsd
10673 root     20   0  607108  37372  4368 S  0.0  0.5   0:16.74 glusterd
18984 root     20   0   11.1g 269664  7472 S  0.0  3.4   0:12.48 glusterfs
[root@dhcp35-140 ~]# sleep 300;date; top -n 1 -b|grep gluster
^C
[root@dhcp35-140 ~]# sleep 300;date; top -n 1 -b|grep gluster; echo "##################";sleep 3600;date; top -n 1 -b|grep gluster; echo "##################"
Wed Nov 14 15:03:22 IST 2018
12460 root     20   0   13.8g 413728  3688 S  5.0  5.2   1:22.38 glusterfsd
10673 root     20   0  607108  37372  4368 S  0.0  0.5   0:17.12 glusterd
18984 root     20   0   11.1g 268428  7472 S  0.0  3.4   0:17.71 glusterfs
##################
Wed Nov 14 16:03:22 IST 2018
10673 root     20   0  607108  37372  4368 S  4.8  0.5   0:21.99 glusterd
12460 root     20   0   13.8g 419908  3688 S  0.0  5.2   4:27.84 glusterfsd
18984 root     20   0   11.1g 268792  7472 S  0.0  3.4   1:10.48 glusterfs
##################
[root@dhcp35-140 ~]# date; top -n 1 -b|grep gluster; echo "##################";sleep 3600;date; top -n 1 -b|grep gluster; echo "##################"
Wed Nov 14 18:23:03 IST 2018
10673 root     20   0  607108  37372  4368 S  0.0  0.5   0:33.89 glusterd
12460 root     20   0   13.8g 421536  3688 S  0.0  5.3  11:42.12 glusterfsd
18984 root     20   0   11.1g 268792  7472 S  0.0  3.4   3:13.39 glusterfs
##################

So there is still a leak of anywhere between 1-5 MB/hr.

Updating resource details: after another 3.5 hrs, a further 4 MB leaked.

[root@dhcp35-140 ~]# top -n 1 -b|grep gluster
10673 root     20   0  607108  37372  4368 S  0.0  0.5   0:51.77 glusterd
12460 root     20   0   13.8g 425616  3688 S  0.0  5.3  22:44.26 glusterfsd
18984 root     20   0   11.1g 268792  7472 S  0.0  3.4   6:21.55 glusterfs
[root@dhcp35-140 ~]# date
Wed Nov 14 21:53:24 IST 2018

It seems like the leak subsides over time.

Checked over a 12-hour difference (note: this is on 3.12.2-27, but the above observations on 3.12.2-24 were similar for 3.12.2-27).

Below is the output at 11:08 PM, Nov 14:

21511 root     20   0   13.9g 343336  3812 S  4.8  4.3   7:39.76 glusterfsd
21311 root     20   0  838484  52120  4400 S  0.0  0.7   0:26.25 glusterd
21501 root     20   0   20.7g 288988  7636 S  0.0  3.6   5:49.55 glusterfs

At 11:08 AM, Nov 15:

[root@dhcp35-140 ~]# top -n 1 -b|grep gluster
21311 root     20   0  838484  52120  4400 S  0.0  0.7   1:36.10 glusterd
21501 root     20   0   20.7g 288992  7636 S  0.0  3.6  55:21.63 glusterfs
21511 root     20   0   13.9g 347336  3812 S  0.0  4.3 113:30.57 glusterfsd

An increase of 4 MB.

In 3.5.0 we fixed the io-thread detach leak issue. Can we try to reproduce on the latest 3.5.0 release?

Thanks,
Mohit Agrawal

(In reply to Mohit Agrawal from comment #22)
> In 3.5.0 we fixed the io-thread detach leak issue.
> Can we try to reproduce on the latest 3.5.0 release?

Removing the needinfo, as QE had requested dev to move it to ON_QA in case it is fixed, since we would need to test it formally; refer to the above comments.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5603
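For anyone re-verifying RSS behaviour on the fixed build, a trivial helper (hypothetical, not from this report) to turn two RES readings from top (reported in KiB) into a leak rate; the input values below are the n2 samples from the 3.3.1 run quoted above:

# Convert an RSS delta between two samples into MiB and MiB/hr.
awk -v r1=844664 -v r2=864216 -v hours=1 'BEGIN {
    d = (r2 - r1) / 1024
    printf "RSS growth: %.1f MiB over %g h (%.1f MiB/h)\n", d, hours, d / hours
}'
# Prints: RSS growth: 19.1 MiB over 1 h (19.1 MiB/h),
# i.e. the ~20 MB noted for n2 in the description.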