Bug 1398311
Summary: | [compound FOPs]: in a replica pair with one brick down, the other brick process and the fuse client process consume high memory at an increasing pace | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> |
Component: | replicate | Assignee: | Krutika Dhananjay <kdhananj> |
Status: | CLOSED ERRATA | QA Contact: | Nag Pavan Chilakam <nchilaka> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | rhgs-3.2 | CC: | amukherj, kdhananj, mchangir, rhinduja, rhs-bugs, sasundar, storage-qa-internal |
Target Milestone: | --- | ||
Target Release: | RHGS 3.2.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-3.8.4-10 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-03-23 05:50:57 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1351528 |
Description
Nag Pavan Chilakam, 2016-11-24 12:24:51 UTC
Note: Bug 1398315 ("[compound FOPs]: Memory leak while doing FOPs with brick down") was raised for the memory leak; I would like to track them as two different issues. However, if on RCA we find that the root cause is the same, we can go ahead and dup one of them to the other. Hit this while validating RFE 1360978 ("[RFE] Reducing number of network round trips"). Requesting QE and PM ack on this BZ.

http://review.gluster.org/#/c/15965/ <--- patch posted in upstream master. Moving this bug to POST state.

NOTE: The fix for this bug is the same as the one for https://bugzilla.redhat.com/show_bug.cgi?id=1398315

-Krutika

ON_QATP: I have tried this scenario on 3.8.4-8; however, I still see this rapid memory consumption. May have to fail this bug.

Brick memory consumption:

```
  PID USER PR NI    VIRT    RES  SHR S %CPU %MEM    TIME+ COMMAND
(beginning of file create, post brick kill)
16200 root 20  0 1250076  53876 4052 S 33.3  0.7 10:36.66 gluster+
(end of file create, post brick kill)
16200 root 20  0 1578784 374512 3032 S  0.0  4.7 20:52.72 gluster+
```

Client memory consumption:

```
[root@rhs-client45 ~]# top -n 1 -b | egrep "RES|7699"
 PID USER PR NI   VIRT   RES  SHR S %CPU %MEM    TIME+ COMMAND
7699 root 20  0 708532 44260 3904 S 68.8  0.3 21:00.23 gluster+

(post brick kill)
 PID USER PR NI    VIRT    RES  SHR S %CPU %MEM    TIME+ COMMAND
7699 root 20  0 2021388 1.277g 4052 S  0.0  8.2 31:34.71 gluster+
```

This means I will have to fail bug 1398315 ("[compound FOPs]: Memory leak while doing FOPs with brick down") as well. I also saw a client error in the fuse log, as below (probably from when I killed the brick):

```
[2016-12-16 12:13:08.076069] E [MSGID: 114031] [client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-cfops-client-1: remote operation failed [Transport endpoint is not connected]
```

```
[root@rhs-client45 gluster]# grep -iarT '^num_allocs=' glusterdump.7699.dump.1481891567 | sort | uniq
num_allocs=1
num_allocs=1024
num_allocs=1025
num_allocs=13
num_allocs=14
num_allocs=16
num_allocs=2
num_allocs=21
num_allocs=25
num_allocs=26
num_allocs=3
num_allocs=4
num_allocs=5
num_allocs=59
num_allocs=6
num_allocs=7
num_allocs=8
num_allocs=9
```

The highest values of num_allocs are 1024 and 1025, and they come from the FUSE event history buffers. Same with all other statedumps of PID 7699. As for PID 3942:

```
[root@rhs-client45 gluster]# grep -iarT '^num_allocs=' glusterdump.3942.dump.1481889016 | sort | uniq
num_allocs=1
num_allocs=10
num_allocs=1024
num_allocs=1025
num_allocs=104
num_allocs=13
num_allocs=176
num_allocs=19
num_allocs=2
num_allocs=24
num_allocs=3
num_allocs=32
num_allocs=34
num_allocs=394
num_allocs=4
num_allocs=404
num_allocs=415
num_allocs=5
num_allocs=6
num_allocs=66
num_allocs=7
num_allocs=8
num_allocs=9
```

The highest values are again 1024 and 1025, and both of them are related to event history buffers. Coming to the brick's statedump:

```
[root@dhcp35-239 gluster]# grep -iarT '^num_allocs=' rhs-brick4-cfops.16200.dump.1481891916 | sort | uniq
num_allocs=1
num_allocs=10
num_allocs=104
num_allocs=115
num_allocs=12
num_allocs=13
num_allocs=14
num_allocs=19
num_allocs=2
num_allocs=21
num_allocs=3
num_allocs=36
num_allocs=4
num_allocs=5
num_allocs=59
num_allocs=6
num_allocs=66
num_allocs=7
num_allocs=8
num_allocs=9
num_allocs=97
```

The highest value is 115 and seems to be coming from:

```
[global.glusterfs - usage-type gf_common_mt_strdup memusage]
size=2342
num_allocs=115
max_size=3281
max_num_allocs=135
total_allocs=323
```

I am not sure about the source of these allocations. With all that, I am guessing 115 strings would not consume so much memory as to lead to an OOM kill. This looks similar to the issue you are working on with Csaba.
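For reference, statedumps like the ones analyzed above can be captured by sending SIGUSR1 to the gluster process (dumps land under /var/run/gluster by default, with the filenames seen above). A minimal sketch of capturing one and surfacing the sections with the highest num_allocs; the PID is the example fuse client from this run:

```bash
#!/bin/bash
# Sketch: capture a statedump from a running gluster process and list the
# memory-accounting sections with the largest num_allocs. PID 7699 is the
# fuse client from this run; substitute the process you are inspecting.
PID=7699
kill -USR1 "$PID"   # SIGUSR1 asks a gluster process to write a statedump
sleep 2             # give it a moment to finish writing

DUMP=$(ls -t /var/run/gluster/glusterdump.${PID}.dump.* | head -n 1)

# Pair each num_allocs value with the section header above it, then sort
# numerically to surface the biggest consumers (e.g. the FUSE event
# history buffers identified above).
awk -F= '/^\[/ { section = $0 } /^num_allocs=/ { print $2, section }' "$DUMP" \
    | sort -rn | head -n 10
```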
I cannot, of course, claim with 100% certainty that the leaks are NOT in compound FOPs unless we capture a statedump with the same test but without compound FOPs and the statedump numbers look identical. At least the gluster-managed data structures don't look to be consumed at a large enough rate to be causing an OOM kill, as per the statedump.

-Krutika

I reran the same case on a similar new volume but without enabling compound FOPs. In this case I hardly saw any memory spike (a spike was seen on the online brick once the other brick was down, but from then on the consumption was not increasing or spiking). Hence I have to fail this bug.

Upstream mainline patch http://review.gluster.org/#/c/16210/ posted for review.

Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/93488/

BZ added to erratum https://errata.devel.redhat.com/advisory/24866. Moving to ON_QA.

ON_QA validation:
=================
There was minimal memory consumption when compared to before the fix; hence moving the test case to pass, i.e. the BZ to VERIFIED.

Setup: a 1x2 volume with compound FOPs enabled, mounted on two clients, cli45 (el7.3) and cli24 (el6.7). From cli45, started to create a 10 GB file using dd; after about 3 GB was created, brought down b1. Kept tracking client-side and brick-process memory consumption in a loop every 30 seconds. The details are below.

cli45, before the file create started:

```
 PID USER PR NI   VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
2985 root 20  0 643012 41632 3196 S  0.0  0.3 0:00.05 glusterfs
```

After the create was issued (it shot up by 5 MB, but from there on it stayed constant or increased minimally):

```
 PID USER PR NI   VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
2985 root 20  0 643012 46764 3804 S 75.0  0.3 0:21.34 glusterfs
```

Before brick b1 was killed:

```
 PID USER PR NI   VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
2985 root 20  0 643012 46848 3824 S 66.7  0.3 4:49.47 glusterfs
2985 root 20  0 643012 46848 3824 S 73.3  0.3 5:11.78 glusterfs
2985 root 20  0 643012 46848 3824 S 62.5  0.3 5:33.70 glusterfs
```

After b1 was killed:

```
 PID USER PR NI   VIRT   RES  SHR S %CPU %MEM    TIME+ COMMAND
2985 root 20  0 643012 46848 3824 S 68.8  0.3  5:55.95 glusterfs
2985 root 20  0 645148 47264 3936 S 68.8  0.3  6:17.59 glusterfs
2985 root 20  0 645148 47268 3936 S 73.3  0.3  6:38.93 glusterfs
2985 root 20  0 645148 47268 3936 S 68.8  0.3  7:00.40 glusterfs
2985 root 20  0 645148 47268 3936 S 81.2  0.3  7:21.79 glusterfs
2985 root 20  0 645148 47268 3936 S 75.0  0.3  7:42.81 glusterfs
2985 root 20  0 645148 47328 3936 S 68.8  0.3 12:40.12 glusterfs
```

After b1 was brought back up using vol start force:

```
 PID USER PR NI   VIRT   RES  SHR S  %CPU %MEM    TIME+ COMMAND
2985 root 20  0 778272 49512 4028 S  93.8  0.3 13:55.20 glusterfs
2985 root 20  0 778272 49516 4032 S 106.2  0.3 14:26.07 glusterfs
2985 root 20  0 778272 49516 4032 S 100.0  0.3 14:56.36 glusterfs
```

At the end of the file create (complete 10 GB file):

```
 PID USER PR NI   VIRT   RES  SHR S %CPU %MEM    TIME+ COMMAND
2985 root 20  0 776220 49512 4040 S  0.0  0.3 20:48.97 glusterfs
2985 root 20  0 776220 49512 4040 S  0.0  0.3 20:48.97 glusterfs
2985 root 20  0 776220 49512 4040 S  0.0  0.3 20:48.97 glusterfs
```
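The 30-second samples above were collected with top; a minimal sketch of such a loop, reusing the egrep pattern from this run (the default PID is an example; pass the real one as the first argument):

```bash
#!/bin/bash
# Sketch: sample a gluster process's memory (VIRT/RES) every 30 seconds,
# as in the tracking above. Run on the client host against the fuse mount
# PID (2985 here) or on the server against the brick PID (28670 here).
PID=${1:-2985}

while true; do
    date
    top -n 1 -b | egrep "RES|${PID}"   # header line plus the process row
    sleep 30
done
```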
Brick process (the replica brick b2, which stayed alive):

```
  PID USER PR NI    VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
28670 root 20  0 1117984 48020 4024 S 37.5  0.6 0:01.36 glusterfsd
Tue Dec 27 12:53:45 IST 2016 ============> started file1 create
28670 root 20  0 1117984 48272 4164 S 43.8  0.6 0:11.84 glusterfsd
Tue Dec 27 12:54:15 IST 2016 ===> killing b1 after about 3 GB created
28670 root 20  0 1250084 48520 4260 S 37.5  0.6 2:44.61 glusterfsd
Tue Dec 27 13:01:48 IST 2016
28670 root 20  0 1250084 48784 4264 S 56.2  0.6 3:00.91 glusterfsd
Tue Dec 27 13:02:18 IST 2016
28670 root 20  0 1250084 48784 4264 S 62.5  0.6 3:19.55 glusterfsd
28670 root 20  0 1250084 48784 4264 S 56.2  0.6 7:58.98 glusterfsd
Tue Dec 27 13:10:21 IST 2016
28670 root 20  0 1250084 48784 4264 S 52.9  0.6 8:17.68 glusterfsd
Tue Dec 27 13:10:51 IST 2016 ===> no memory leak seen even after 6 GB completed, hence restarting the brick
28670 root 20  0 1250084 48636 4116 S 62.5  0.6 8:36.46 glusterfsd
Tue Dec 27 13:11:21 IST 2016 ====> started vol using force
28670 root 20  0 1316648 48848 4132 S 61.1  0.6 8:59.89 glusterfsd
Tue Dec 27 13:11:52 IST 2016
28670 root 20  0 1316648 48848 4132 S 25.0  0.6 9:20.06 glusterfsd
Tue Dec 27 13:12:22 IST 2016
28670 root 20  0 1316648 49072 4152 S 37.5  0.6 9:33.38 glusterfsd
Tue Dec 27 13:12:52 IST 2016
```

So, as noted above, there was minimal memory consumption compared to before the fix; moving the test case to pass, i.e. the BZ to VERIFIED.
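For completeness, a rough sketch of the scenario used for this validation, assembled from the setup described above (volume "comp" with the brick paths and hosts from the status output below); treat the mount point, dd parameters, and example PID as illustrative assumptions, not exact commands from the run:

```bash
#!/bin/bash
# Sketch of the validation scenario: a 1x2 replica volume with compound
# FOPs enabled, a 10 GB dd from the client, a brick kill at ~3 GB, then a
# brick restart via 'start force'. Hosts/paths mirror this bug's setup.

# On a server node: create and start the volume, enable compound FOPs.
gluster volume create comp replica 2 \
    10.70.35.37:/rhs/brick4/comp 10.70.35.116:/rhs/brick4/comp
gluster volume start comp
gluster volume set comp cluster.use-compound-fops on

# On the client: mount the volume and start creating a 10 GB file
# (/mnt/comp is an assumed mount point).
mount -t glusterfs 10.70.35.37:/comp /mnt/comp
mkdir -p /mnt/comp/dir1
dd if=/dev/zero of=/mnt/comp/dir1/file1 bs=1M count=10240 &

# On the b1 server, once ~3 GB is written: kill that brick's glusterfsd
# (PID taken from 'gluster volume status comp'; 23814 is the example here).
kill -9 23814

# Later: bring the brick back and watch self-heal drain the pending entry.
gluster volume start comp force
gluster volume heal comp info
```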
Heal and volume state at the time of verification:

```
[root@dhcp35-37 ~]# gluster v heal comp info
Brick 10.70.35.37:/rhs/brick4/comp
Status: Connected
Number of entries: 0

Brick 10.70.35.116:/rhs/brick4/comp
/dir1/file1
Status: Connected
Number of entries: 1

[root@dhcp35-37 ~]# gluster v heal comp info
Brick 10.70.35.37:/rhs/brick4/comp
Status: Connected
Number of entries: 0

Brick 10.70.35.116:/rhs/brick4/comp
Status: Connected
Number of entries: 0

[root@dhcp35-37 ~]# gluster v info comp
Volume Name: comp
Type: Replicate
Volume ID: 1dae5832-1625-4287-b478-3be79be62d68
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick4/comp
Brick2: 10.70.35.116:/rhs/brick4/comp
Options Reconfigured:
cluster.use-compound-fops: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable

[root@dhcp35-37 ~]# gluster v status comp
Status of volume: comp
Gluster process                                        TCP Port  RDMA Port  Online  Pid
----------------------------------------------------------------------------------------
Brick 10.70.35.37:/rhs/brick4/comp                     49155     0          Y       23814
Brick 10.70.35.116:/rhs/brick4/comp                    49155     0          Y       28670
Self-heal Daemon on localhost                          N/A       N/A        Y       23834
Self-heal Daemon on dhcp35-239.lab.eng.blr.redhat.com  N/A       N/A        Y       17583
Self-heal Daemon on dhcp35-196.lab.eng.blr.redhat.com  N/A       N/A        Y       22197
Self-heal Daemon on dhcp35-116.lab.eng.blr.redhat.com  N/A       N/A        Y       20028
Self-heal Daemon on dhcp35-135.lab.eng.blr.redhat.com  N/A       N/A        Y       16450
Self-heal Daemon on dhcp35-8.lab.eng.blr.redhat.com    N/A       N/A        Y       16470

Task Status of Volume comp
----------------------------------------------------------------------------------------
There are no active volume tasks
```

Refer to bug 1398315 (https://bugzilla.redhat.com/show_bug.cgi?id=1398315) for the validation log.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html