Bug 1327831 - Ganesha+Tiering: oom-kill of self-heal daemon observed while doing IO on tiered volume.
Summary: Ganesha+Tiering: oom-kill of self-heal daemon observed while doing IO on tiered volume.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: replicate
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Ashish Pandey
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On: 1342426
Blocks:
 
Reported: 2016-04-16 13:23 UTC by Shashank Raj
Modified: 2016-11-08 03:53 UTC
CC List: 14 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-22 06:15:17 UTC
Embargoed:



Description Shashank Raj 2016-04-16 13:23:57 UTC
Description of problem:
Ganesha+Tiering: ganesha_grace invoked oom-killer observed while removal of files is in progress on a tiered volume.

Version-Release number of selected component (if applicable):
glusterfs-3.7.9-1
nfs-ganesha-2.3.1-3

How reproducible:
Once

Steps to Reproduce:
1.Create a 4 node cluster and configure ganesha on the cluster
2.Create a tiered volume, attach tier, enable quota on the volume.

Volume Name: tiervolume
Type: Tier
Volume ID: 32d2eaf1-7a5b-4d39-8ec8-27bdb9bee4c1
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.174:/bricks/brick3/b3
Brick2: 10.70.37.127:/bricks/brick3/b3
Brick3: 10.70.37.158:/bricks/brick3/b3
Brick4: 10.70.37.180:/bricks/brick3/b3
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.37.180:/bricks/brick0/b0
Brick6: 10.70.37.158:/bricks/brick0/b0
Brick7: 10.70.37.127:/bricks/brick0/b0
Brick8: 10.70.37.174:/bricks/brick0/b0
Brick9: 10.70.37.180:/bricks/brick1/b1
Brick10: 10.70.37.158:/bricks/brick1/b1
Brick11: 10.70.37.127:/bricks/brick1/b1
Brick12: 10.70.37.174:/bricks/brick1/b1
Brick13: 10.70.37.180:/bricks/brick2/b2
Brick14: 10.70.37.158:/bricks/brick2/b2
Brick15: 10.70.37.127:/bricks/brick2/b2
Brick16: 10.70.37.174:/bricks/brick2/b2
Options Reconfigured:
cluster.watermark-hi: 40
cluster.watermark-low: 10
cluster.tier-mode: cache
features.ctr-enabled: on
ganesha.enable: on
features.cache-invalidation: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
nfs.disable: on
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
nfs-ganesha: enable

3.Enable ganesha on the volume and mount on 2 clients using vers=4.
4.Performed file creation (a large number of 100 KB files) from these mount points simultaneously.
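(For reference, a rough sketch of the commands behind steps 1-3 above; the brick layout is taken from the volume info, the ganesha VIP used in the mount command is a placeholder, and the exact tier/ganesha CLI syntax may differ slightly across 3.7.x builds.)

# Cold tier: 2 x (4 + 2) distributed-disperse
gluster volume create tiervolume disperse 6 redundancy 2 \
    10.70.37.180:/bricks/brick0/b0 10.70.37.158:/bricks/brick0/b0 \
    10.70.37.127:/bricks/brick0/b0 10.70.37.174:/bricks/brick0/b0 \
    10.70.37.180:/bricks/brick1/b1 10.70.37.158:/bricks/brick1/b1 \
    10.70.37.127:/bricks/brick1/b1 10.70.37.174:/bricks/brick1/b1 \
    10.70.37.180:/bricks/brick2/b2 10.70.37.158:/bricks/brick2/b2 \
    10.70.37.127:/bricks/brick2/b2 10.70.37.174:/bricks/brick2/b2
gluster volume start tiervolume

# Hot tier: 2 x 2 distributed-replicate
gluster volume attach-tier tiervolume replica 2 \
    10.70.37.174:/bricks/brick3/b3 10.70.37.127:/bricks/brick3/b3 \
    10.70.37.158:/bricks/brick3/b3 10.70.37.180:/bricks/brick3/b3

# Quota and NFS-Ganesha export
gluster volume quota tiervolume enable
gluster volume set tiervolume ganesha.enable on

# On each client (10.70.37.200 is a placeholder for the ganesha VIP)
mount -t nfs -o vers=4 10.70.37.200:/tiervolume /mnt/tiervolume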

During file creation, observed the below issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1327773

5.Restarted the ganesha service on the mounted node after the above issue was hit.

6.Started removing files from the mount points and, while doing ls simultaneously from the other mount point, observed the below issue (continuous cache_invalidation messages in ganesha-gfapi.log):

https://bugzilla.redhat.com/show_bug.cgi?id=1323424

7.After some time, observed "ganesha_grace invoked oom-killer" on one node of the cluster, with the below trace in dmesg:

[313396.095545] ganesha_grace invoked oom-killer: gfp_mask=0x3000d0, order=2, oom_score_adj=0
[313396.095551] ganesha_grace cpuset=/ mems_allowed=0
[313396.095554] CPU: 0 PID: 5124 Comm: ganesha_grace Not tainted 3.10.0-327.el7.x86_64 #1
[313396.095556] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[313396.095558]  ffff8800ca245080 00000000a3331627 ffff8800516d3b68 ffffffff816351f1
[313396.095562]  ffff8800516d3bf8 ffffffff81630191 ffff8800be262830 ffff8800be262848
[313396.095564]  ffffffff00000202 fffeefff00000000 0000000000000002 ffffffff81128803
[313396.095567] Call Trace:
[313396.095575]  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[313396.095578]  [<ffffffff81630191>] dump_header+0x8e/0x214
[313396.095582]  [<ffffffff81128803>] ? delayacct_end+0x63/0xb0
[313396.095586]  [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0
[313396.095590]  [<ffffffff81088dae>] ? has_capability_noaudit+0x1e/0x30
[313396.095593]  [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[313396.095597]  [<ffffffff811737f5>] __alloc_pages_nodemask+0xa95/0xb90
[313396.095601]  [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[313396.095604]  [<ffffffff81285ea6>] ? security_file_alloc+0x16/0x20
[313396.095608]  [<ffffffff811e07de>] ? alloc_file+0x1e/0xf0
[313396.095611]  [<ffffffff8107a401>] do_fork+0xe1/0x320
[313396.095613]  [<ffffffff81090731>] ? __set_task_blocked+0x41/0xa0
[313396.095616]  [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[313396.095620]  [<ffffffff81645c59>] stub_clone+0x69/0x90
[313396.095623]  [<ffffffff81645909>] ? system_call_fastpath+0x16/0x1b

lrmd invoked oom-killer on the other 2 nodes, with the below trace:

[127609.615506] lrmd invoked oom-killer: gfp_mask=0x3000d0, order=2, oom_score_adj=0
[127609.615513] lrmd cpuset=/ mems_allowed=0
[127609.615516] CPU: 3 PID: 12818 Comm: lrmd Not tainted 3.10.0-327.el7.x86_64 #1
[127609.615518] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[127609.615520]  ffff8800d3f03980 00000000db61a341 ffff880210e7fb68 ffffffff816351f1
[127609.615524]  ffff880210e7fbf8 ffffffff81630191 ffff8800363ba7c0 ffff8800363ba7d8
[127609.615526]  ffffffff00000202 fff6efff00000000 0000000000000001 ffffffff81128803
[127609.615529] Call Trace:
[127609.615539]  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[127609.615543]  [<ffffffff81630191>] dump_header+0x8e/0x214
[127609.615548]  [<ffffffff81128803>] ? delayacct_end+0x63/0xb0
[127609.615553]  [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0
[127609.615556]  [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[127609.615559]  [<ffffffff811737f5>] __alloc_pages_nodemask+0xa95/0xb90
[127609.615565]  [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[127609.615568]  [<ffffffff8107a401>] do_fork+0xe1/0x320
[127609.615574]  [<ffffffff811e3b5e>] ? SYSC_newstat+0x3e/0x60
[127609.615576]  [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[127609.615581]  [<ffffffff81645c59>] stub_clone+0x69/0x90
[127609.615584]  [<ffffffff81645909>] ? system_call_fastpath+0x16/0x1b


Actual results:

ganesha_grace invoked oom-killer observed while doing IO on tiered volume.


Expected results:

No OOM kill should be observed.

Additional info:

Comment 2 Shashank Raj 2016-04-16 13:25:30 UTC
Also, pcs status shows the below failed actions:

Failed actions:
    nfs-grace_monitor_5000 on dhcp37-180.lab.eng.blr.redhat.com 'unknown error' (1): call=98, status=Timed Out, exit-reason='none', last-rc-change='Fri Apr 15 18:47:44 2016', queued=0ms, exec=0ms
    nfs-mon_monitor_10000 on dhcp37-127.lab.eng.blr.redhat.com 'unknown error' (1): call=40, status=Timed Out, exit-reason='none', last-rc-change='Fri Apr 15 16:35:07 2016', queued=0ms, exec=0ms
    nfs-grace_monitor_5000 on dhcp37-127.lab.eng.blr.redhat.com 'unknown error' (1): call=43, status=Timed Out, exit-reason='none', last-rc-change='Fri Apr 15 16:35:07 2016', queued=0ms, exec=0ms
    nfs-grace_monitor_5000 on dhcp37-158.lab.eng.blr.redhat.com 'unknown error' (1): call=34, status=Timed Out, exit-reason='none', last-rc-change='Fri Apr 15 16:35:11 2016', queued=0ms, exec=0ms
    nfs-mon_monitor_10000 on dhcp37-158.lab.eng.blr.redhat.com 'unknown error' (1): call=33, status=Timed Out, exit-reason='none', last-rc-change='Fri Apr 15 16:35:11 2016', queued=0ms, exec=0ms

Comment 3 Shashank Raj 2016-04-16 13:37:17 UTC
sosreports and logs are placed at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1327831

Comment 4 Soumya Koduri 2016-04-17 12:25:58 UTC
Shashank,

The OOM killer gets invoked when the system is experiencing a memory crunch. Do you see the nfs-ganesha (or any other related) process being killed because of oom_kill?
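A minimal way to check that on each node, assuming standard RHEL 7 kernel logging, would be:

# Kernel's OOM victim-selection messages
dmesg | grep -iE "out of memory|killed process"
grep -i "killed process" /var/log/messages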

Comment 5 Shashank Raj 2016-04-19 07:35:32 UTC
The below messages are observed in dmesg:

on node1:

[313396.095816] Out of memory: Kill process 7431 (glusterfs) score 711 or sacrifice child
[313396.095880] Killed process 7431 (glusterfs) total-vm:132192500kB, anon-rss:5576680kB, file-rss:0k

on node2:

[127609.615788] Out of memory: Kill process 10162 (glusterfs) score 798 or sacrifice child
[127609.615857] Killed process 10162 (glusterfs) total-vm:151881424kB, anon-rss:6311664kB, file-rss:0kB

on node3:

[127218.901862] Out of memory: Kill process 11349 (glusterfs) score 754 or sacrifice child
[127218.901928] Killed process 11349 (glusterfs) total-vm:141319480kB, anon-rss:6073472kB, file-rss:780kB


Volume status shows the self-heal processes not running on 3 nodes:

Status of volume: tiervolume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick 10.70.37.174:/bricks/brick3/b3        49159     0          Y       12378
Brick 10.70.37.127:/bricks/brick3/b3        49160     0          Y       11328
Brick 10.70.37.158:/bricks/brick3/b3        49165     0          Y       10141
Brick 10.70.37.180:/bricks/brick3/b3        49160     0          Y       7410 
Cold Bricks:
Brick 10.70.37.180:/bricks/brick0/b0        49157     0          Y       32233
Brick 10.70.37.158:/bricks/brick0/b0        49162     0          Y       2675 
Brick 10.70.37.127:/bricks/brick0/b0        49157     0          Y       3909 
Brick 10.70.37.174:/bricks/brick0/b0        49156     0          Y       4960 
Brick 10.70.37.180:/bricks/brick1/b1        49158     0          Y       32252
Brick 10.70.37.158:/bricks/brick1/b1        49163     0          Y       2694 
Brick 10.70.37.127:/bricks/brick1/b1        49158     0          Y       3928 
Brick 10.70.37.174:/bricks/brick1/b1        49157     0          Y       4979 
Brick 10.70.37.180:/bricks/brick2/b2        49159     0          Y       32271
Brick 10.70.37.158:/bricks/brick2/b2        49164     0          Y       2713 
Brick 10.70.37.127:/bricks/brick2/b2        49159     0          Y       3947 
Brick 10.70.37.174:/bricks/brick2/b2        49158     0          Y       4998 
Self-heal Daemon on localhost               N/A       N/A        N       N/A  
Quota Daemon on localhost                   N/A       N/A        Y       7439 
Self-heal Daemon on dhcp37-158.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A  
Quota Daemon on dhcp37-158.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       10179
Self-heal Daemon on dhcp37-174.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       12399
Quota Daemon on dhcp37-174.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       12413
Self-heal Daemon on dhcp37-127.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A 
Quota Daemon on dhcp37-127.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       11363

Comment 6 Soumya Koduri 2016-04-19 08:40:46 UTC
Thanks, Shashank. Moving it to the AFR component to investigate the high memory usage by the self-heal daemon.

Comment 8 Shashank Raj 2016-04-21 12:59:37 UTC
This issue has been seen frequently while removing files from a ganesha mount on a tiered volume. The self-heal daemon gets killed on all the nodes, resulting in a hang of the ganesha mount.

Comment 13 Shashank Raj 2016-04-26 12:59:19 UTC
While trying to verify the bug with the build provided, below are the observations:

rpm versions:

[root@dhcp37-180 /]# rpm -qa|grep glusterfs
glusterfs-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-fuse-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-api-devel-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-rdma-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-libs-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-client-xlators-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-cli-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-devel-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-ganesha-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-resource-agents-3.7.9-1.el7rhgs.testing.bz1327831.noarch
glusterfs-debuginfo-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-api-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-server-3.7.9-1.el7rhgs.testing.bz1327831.x86_64
glusterfs-geo-replication-3.7.9-1.el7rhgs.testing.bz1327831.x86_64

[root@dhcp37-180 /]# rpm -qa|grep ganesha
nfs-ganesha-2.3.1-4.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-4.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-1.el7rhgs.testing.bz1327831.x86_64

While creating 100 KB files from 2 mount points (120000 from each), 2 of the 4 nodes had the below CPU and memory utilization for the SHD:

[root@dhcp37-158 exports]# ps aux|grep 25330
root     25330 25.6 47.5 108718472 3808860 ?   Ssl  01:57  68:13 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/1b764ddb77c8e070781bbf988d8cd97c.socket --xlator-option *replicate*.node-uuid=18fa3cca-c714-4c70-b227-cef260fffa27

[root@dhcp37-158 exports]# cat /proc/25330/oom_score
569

[root@dhcp37-174 exports]# ps aux|grep 26891

root     26891 28.2 61.1 127531980 4901076 ?   Ssl  01:57  78:45 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/209f5cf6a29e255887e0f676be136874.socket --xlator-option *replicate*.node-uuid=1a5a806a-ab58-462b-b939-0b8158a2d914

[root@dhcp37-174 exports]# cat /proc/26891/oom_score
668
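(A small loop along the following lines can track the SHD's RSS and oom_score over time; this is a sketch only, assuming the glustershd pidfile path shown in the ps output above.)

# Sample the self-heal daemon's memory footprint every 60 seconds
SHD_PID=$(cat /var/lib/glusterd/glustershd/run/glustershd.pid)
while kill -0 "$SHD_PID" 2>/dev/null; do
    rss_kb=$(awk '/^VmRSS:/ {print $2}' /proc/"$SHD_PID"/status)
    score=$(cat /proc/"$SHD_PID"/oom_score)
    echo "$(date '+%F %T') pid=$SHD_PID rss_kb=$rss_kb oom_score=$score"
    sleep 60
done
echo "$(date '+%F %T') self-heal daemon (pid $SHD_PID) is gone"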

At this point, IO hung, and after some time the OOM kill issue was seen on the dhcp37-174 node.

In dmesg on that node:

[1039955.322429] glusterfs invoked oom-killer: gfp_mask=0x42d0, order=3, oom_score_adj=0
[1039955.322436] glusterfs cpuset=/ mems_allowed=0
[1039955.322439] CPU: 1 PID: 26980 Comm: glusterfs Not tainted 3.10.0-327.el7.x86_64 #1
[1039955.322441] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[1039955.322443]  ffff8801d9541700 00000000868b512e ffff88018b0df840 ffffffff816351f1
[1039955.322447]  ffff88018b0df8d0 ffffffff81630191 ffff880212160520 ffff880212160538
[1039955.322450]  ffffffff00000202 fffeefff00000000 0000000000000004 ffffffff81128803
[1039955.322453] Call Trace:
[1039955.322463]  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[1039955.322467]  [<ffffffff81630191>] dump_header+0x8e/0x214
[1039955.322472]  [<ffffffff81128803>] ? delayacct_end+0x63/0xb0
[1039955.322477]  [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0
[1039955.322482]  [<ffffffff81088dae>] ? has_capability_noaudit+0x1e/0x30
[1039955.322485]  [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[1039955.322489]  [<ffffffff811737f5>] __alloc_pages_nodemask+0xa95/0xb90
[1039955.322494]  [<ffffffff811b43f9>] alloc_pages_current+0xa9/0x170
[1039955.322500]  [<ffffffff81514ad0>] sk_page_frag_refill+0x70/0x160
[1039955.322505]  [<ffffffff81576b73>] tcp_sendmsg+0x263/0xc20
[1039955.322511]  [<ffffffff815a0f44>] inet_sendmsg+0x64/0xb0
[1039955.322516]  [<ffffffff812889d3>] ? selinux_socket_sendmsg+0x23/0x30
[1039955.322519]  [<ffffffff8150fe47>] sock_aio_write+0x157/0x180
[1039955.322521]  [<ffffffff8116b5e8>] ? wait_on_page_bit_killable+0x88/0xb0
[1039955.322525]  [<ffffffff811dde69>] do_sync_readv_writev+0x79/0xd0
[1039955.322528]  [<ffffffff811df43e>] do_readv_writev+0xce/0x260
[1039955.322533]  [<ffffffff81197088>] ? handle_mm_fault+0x5b8/0xf50
[1039955.322539]  [<ffffffff81058aaf>] ? kvm_clock_get_cycles+0x1f/0x30
[1039955.322544]  [<ffffffff810d87ca>] ? __getnstimeofday64+0x3a/0xd0
[1039955.322546]  [<ffffffff811df665>] vfs_writev+0x35/0x60
[1039955.322548]  [<ffffffff811df81f>] SyS_writev+0x7f/0x110
[1039955.322554]  [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
[...]
[1039955.322797] Out of memory: Kill process 26891 (glusterfs) score 668 or sacrifice child
[1039955.323035] Killed process 26891 (glusterfs) total-vm:127531980kB, anon-rss:4900364kB, file-rss:376kB


The other node, dhcp37-158, is hung and it is not possible to log in to it. (Will update the bug once it comes back.)

No "client_cbk_cache_invalidation" messages are seen in the shd log.
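For example, the absence of those messages can be confirmed with a simple grep of the SHD log:

grep -c "client_cbk_cache_invalidation" /var/log/glusterfs/glustershd.log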

Setup details are below in case anyone wants to have a look:

[root@dhcp37-180 ~]# gluster peer status
Number of Peers: 3

Hostname: dhcp37-158.lab.eng.blr.redhat.com
Uuid: 18fa3cca-c714-4c70-b227-cef260fffa27
State: Peer in Cluster (Connected)

Hostname: dhcp37-127.lab.eng.blr.redhat.com
Uuid: 43649367-7f47-41cf-8d63-97896e3504d4
State: Peer in Cluster (Connected)

Hostname: dhcp37-174.lab.eng.blr.redhat.com
Uuid: 1a5a806a-ab58-462b-b939-0b8158a2d914
State: Peer in Cluster (Connected)

Comment 14 Shashank Raj 2016-04-27 18:25:19 UTC
Observed that some of the nodes of the cluster were not accessible at all because of high CPU usage, so all nodes had to be rebooted.

sosreports from the nodes are placed at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1327831/latest

Comment 15 Shashank Raj 2016-05-01 06:43:48 UTC
We are hitting this issue consistently on the tiered volume setup, and it makes the whole cluster unusable because of the high CPU and memory usage.

While running fssanity on a v3 ganesha mount on a tiered volume, after some of the test suites had executed, observed an OOM kill of the self-heal daemon on all the nodes; the mounted node is not accessible, and all the bricks residing on that node went down.

[root@dhcp37-127 ~]# gluster vol info tiervolume
 
Volume Name: tiervolume
Type: Tier
Volume ID: 45fd73f7-e8ed-43da-b9c6-79ae042cef12
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.174:/bricks/brick3/b3
Brick2: 10.70.37.127:/bricks/brick3/b3
Brick3: 10.70.37.158:/bricks/brick3/b3
Brick4: 10.70.37.180:/bricks/brick3/b3
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.37.180:/bricks/brick0/b0
Brick6: 10.70.37.158:/bricks/brick0/b0
Brick7: 10.70.37.127:/bricks/brick0/b0
Brick8: 10.70.37.174:/bricks/brick0/b0
Brick9: 10.70.37.180:/bricks/brick1/b1
Brick10: 10.70.37.158:/bricks/brick1/b1
Brick11: 10.70.37.127:/bricks/brick1/b1
Brick12: 10.70.37.174:/bricks/brick1/b1
Brick13: 10.70.37.180:/bricks/brick2/b2
Brick14: 10.70.37.158:/bricks/brick2/b2
Brick15: 10.70.37.127:/bricks/brick2/b2
Brick16: 10.70.37.174:/bricks/brick2/b2
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
cluster.tier-mode: cache
features.ctr-enabled: on
nfs.disable: on
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
nfs-ganesha: enable

[root@dhcp37-127 ~]# gluster vol status tiervolume
Status of volume: tiervolume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick 10.70.37.174:/bricks/brick3/b3        49206     0          Y       5472 
Brick 10.70.37.127:/bricks/brick3/b3        49211     0          Y       3970 
Brick 10.70.37.158:/bricks/brick3/b3        49212     0          Y       3708 
Cold Bricks:
Brick 10.70.37.158:/bricks/brick0/b0        49209     0          Y       3525 
Brick 10.70.37.127:/bricks/brick0/b0        49208     0          Y       3780 
Brick 10.70.37.174:/bricks/brick0/b0        49203     0          Y       5291 
Brick 10.70.37.158:/bricks/brick1/b1        49210     0          Y       3547 
Brick 10.70.37.127:/bricks/brick1/b1        49209     0          Y       3799 
Brick 10.70.37.174:/bricks/brick1/b1        49204     0          Y       5310 
Brick 10.70.37.158:/bricks/brick2/b2        49211     0          Y       3566 
Brick 10.70.37.127:/bricks/brick2/b2        49210     0          Y       3818 
Brick 10.70.37.174:/bricks/brick2/b2        49205     0          Y       5329 
Self-heal Daemon on localhost               N/A       N/A        N       N/A  
Quota Daemon on localhost                   N/A       N/A        Y       4323 
Self-heal Daemon on dhcp37-174.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A  
Quota Daemon on dhcp37-174.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       5828 
Self-heal Daemon on dhcp37-158.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A  
Quota Daemon on dhcp37-158.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       4055 
 
Task Status of Volume tiervolume
------------------------------------------------------------------------------
Task                 : Tier migration      
ID                   : bf23ff00-4a5f-4b30-a2f7-e942847d63a5
Status               : in progress

Comment 16 Krutika Dhananjay 2016-05-03 12:09:18 UTC
OK. The same daemon (Linux process) handles self-healing in both EC and AFR sub-volumes. We need to find out whether the leak is coming from AFR or EC.

Could you help us in isolating the translator that is leaking memory?

There are two ways you can do this:

With the same distributed-replicate + distributed-disperse tiered volume, when you run your IO and run self-heal and find that the memory consumed by the daemon is progressively rising, could you take the statedump of the SHD and attach it here?
Here's what you need to do to capture the statedump:
$ kill -USR1 <pid-of-self-heal-daemon>

It would be helpful if you capture the statedump of the shd at several different points in time; that would confirm that the memory consumed by the process is indeed progressively increasing and help us in isolating the data structure whose memory is being leaked.
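A minimal sketch for capturing such a series of statedumps, assuming the default statedump directory of /var/run/gluster:

# Take a statedump of the self-heal daemon every 10 minutes, 6 times
SHD_PID=$(cat /var/lib/glusterd/glustershd/run/glustershd.pid)
for i in $(seq 1 6); do
    kill -USR1 "$SHD_PID"     # each signal writes glusterdump.<pid>.dump.<timestamp>
    sleep 600
done
ls -l /var/run/gluster/glusterdump."$SHD_PID".dump.*   # assumes the default statedump path

Comparing the per-xlator memory accounting sections across the dumps should show which allocations keep growing.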

OR

Run the same test twice - once with no disperse in the tiered volume, and another time with no AFR in the tiered volume. One of them (or both if they both have leaks!) will be OOM-killed eventually.
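A rough sketch of the two variants, reusing the brick layout from the bug description (hypothetical volume name for the first variant; exact attach-tier syntax may vary across 3.7.x builds):

# Variant 1 (no disperse): distributed-replicate cold tier + replicate hot tier
gluster volume create tierafr replica 2 \
    10.70.37.180:/bricks/brick0/b0 10.70.37.158:/bricks/brick0/b0 \
    10.70.37.127:/bricks/brick0/b0 10.70.37.174:/bricks/brick0/b0
gluster volume start tierafr
gluster volume attach-tier tierafr replica 2 \
    10.70.37.174:/bricks/brick3/b3 10.70.37.127:/bricks/brick3/b3 \
    10.70.37.158:/bricks/brick3/b3 10.70.37.180:/bricks/brick3/b3

# Variant 2 (no AFR): keep the 2 x (4 + 2) disperse cold tier (detach the
# replicated hot tier first), then attach the hot tier bricks without
# "replica 2" so the hot tier stays plain distribute
gluster volume attach-tier tiervolume \
    10.70.37.174:/bricks/brick3/b3 10.70.37.127:/bricks/brick3/b3 \
    10.70.37.158:/bricks/brick3/b3 10.70.37.180:/bricks/brick3/b3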


-Krutika

Comment 23 Shashank Raj 2016-07-20 06:34:25 UTC
Just to confirm, I prefer running the steps once again with the 3.1.3 final build, and then based on the results we can close it.

Will update accordingly.

Comment 24 Nithya Balachandran 2016-07-20 06:42:01 UTC
(In reply to Shashank Raj from comment #23)
> Just to confirm, I prefer running the steps once again with the 3.1.3 final build,
> and then based on the results we can close it.
> 
> Will update accordingly.

Thanks. When is this planned?

Comment 25 Shashank Raj 2016-07-20 06:47:19 UTC
(In reply to Nithya Balachandran from comment #24)
> (In reply to Shashank Raj from comment #23)
> > Just to confirm, I prefer running the steps once again with the 3.1.3 final build,
> > and then based on the results we can close it.
> > 
> > Will update accordingly.
> 
> Thanks. When is this planned?

Will try to finish it by EOW

Comment 26 Shashank Raj 2016-07-22 06:00:36 UTC
Tried reproducing the issue with the latest 3.1.3 build on a tiered volume. During IO, the memory consumption of the shd now remains almost constant, whereas earlier the rising consumption was causing the OOM kills.

So this issue can be closed as it works fine with the latest builds.

