Bug 1342426 - self heal daemon killed due to oom kills on a dist-disperse volume using nfs ganesha
Summary: self heal daemon killed due to oom kills on a dist-disperse volume using nfs ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: disperse
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.3
Assignee: Ashish Pandey
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks: 1311817 1327831 1330997 1342796 1342954 1342964 1344836 1360576 1361402
 
Reported: 2016-06-03 09:02 UTC by Nag Pavan Chilakam
Modified: 2018-01-24 06:04 UTC (History)
7 users

Fixed In Version: glusterfs-3.7.9-9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1342796 (view as bug list)
Environment:
Last Closed: 2016-06-23 05:25:52 UTC
Embargoed:


Attachments (Terms of Use)
steps during raising bug (58.73 KB, text/html)
2016-06-03 09:08 UTC, Nag Pavan Chilakam
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1240 0 normal SHIPPED_LIVE Red Hat Gluster Storage 3.1 Update 3 2016-06-23 08:51:28 UTC

Description Nag Pavan Chilakam 2016-06-03 09:02:53 UTC
While testing bug 1330997 ([Disperse volume]: IO hang seen on mount with file ops verification), I hit OOM kills of the shd processes on all the nodes.

I am seeing OOM kills of the disperse self-heal daemons on a distributed-disperse volume.
[root@dhcp35-191 glusterfs]# rpm -qa|grep gluster
glusterfs-3.7.9-7.el7rhgs.x86_64
glusterfs-api-3.7.9-7.el7rhgs.x86_64
glusterfs-fuse-3.7.9-7.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-7.el7rhgs.x86_64
glusterfs-server-3.7.9-7.el7rhgs.x86_64
glusterfs-cli-3.7.9-7.el7rhgs.x86_64
python-gluster-3.7.9-7.el7rhgs.noarch
glusterfs-libs-3.7.9-7.el7rhgs.x86_64

Comment 3 Nag Pavan Chilakam 2016-06-03 09:08:11 UTC
Created attachment 1164394 [details]
steps during raising bug

Comment 4 Rahul Hinduja 2016-06-03 09:17:13 UTC
Putting all the details from comment 2 into the bugzilla:

    STEP#1. I have a 6-node RHEL 7.2 cluster (password for cluster nodes is cluster@123; password for clients is either "redhat" or "client@123").

    STEP#2. I created a 3x(4+2) EC volume as below (a hedged creation sketch follows the volume details):

    Vol details before triggering IOs:
    [root@dhcp35-191 glusterfs]# gluster v status ganesha-disperse
    Status of volume: ganesha-disperse
    Gluster process                             TCP Port  RDMA Port  Online  Pid
    ------------------------------------------------------------------------------
    Brick 10.70.35.191:/rhs/brick1/ganesha-disp
    erse                                        49152     0          Y       9412
    Brick 10.70.35.27:/rhs/brick1/ganesha-dispe
    rse                                         49152     0          Y       8656
    Brick 10.70.35.98:/rhs/brick1/ganesha-dispe
    rse                                         49152     0          Y       28176
    Brick 10.70.35.64:/rhs/brick1/ganesha-dispe
    rse                                         49152     0          Y       21563
    Brick 10.70.35.44:/rhs/brick1/ganesha-dispe
    rse                                         49152     0          Y       21769
    Brick 10.70.35.114:/rhs/brick1/ganesha-disp
    erse                                        49152     0          Y       21605
    Brick 10.70.35.191:/rhs/brick2/ganesha-disp
    erse                                        49153     0          Y       9431
    Brick 10.70.35.27:/rhs/brick2/ganesha-dispe
    rse                                         49153     0          Y       8675
    Brick 10.70.35.98:/rhs/brick2/ganesha-dispe
    rse                                         49153     0          Y       28195
    Brick 10.70.35.64:/rhs/brick2/ganesha-dispe
    rse                                         49153     0          Y       21582
    Brick 10.70.35.44:/rhs/brick2/ganesha-dispe
    rse                                         49153     0          Y       21788
    Brick 10.70.35.114:/rhs/brick2/ganesha-disp
    erse                                        49153     0          Y       21624
    Brick 10.70.35.191:/rhs/brick4/ganesha-disp
    erse                                        49154     0          Y       9450
    Brick 10.70.35.27:/rhs/brick4/ganesha-dispe
    rse                                         49154     0          Y       8694
    Brick 10.70.35.98:/rhs/brick4/ganesha-dispe
    rse                                         49154     0          Y       28214
    Brick 10.70.35.64:/rhs/brick4/ganesha-dispe
    rse                                         49154     0          Y       21601
    Brick 10.70.35.44:/rhs/brick4/ganesha-dispe
    rse                                         49154     0          Y       21807
    Brick 10.70.35.114:/rhs/brick4/ganesha-disp
    erse                                        49154     0          Y       21643
    Self-heal Daemon on localhost               N/A       N/A        Y       9470
    Self-heal Daemon on 10.70.35.98             N/A       N/A        Y       28242
    Self-heal Daemon on 10.70.35.44             N/A       N/A        Y       21835
    Self-heal Daemon on 10.70.35.114            N/A       N/A        Y       21663
    Self-heal Daemon on 10.70.35.64             N/A       N/A        Y       21621
    Self-heal Daemon on 10.70.35.27             N/A       N/A        Y       8714
     
    Task Status of Volume ganesha-disperse
    ------------------------------------------------------------------------------
    There are no active volume tasks
     
    Volume Name: ganesha-disperse
    Type: Distributed-Disperse
    Volume ID: 1174f400-712f-4f15-8227-3f0677f47e83
    Status: Started
    Number of Bricks: 3 x (4 + 2) = 18
    Transport-type: tcp
    Bricks:
    Brick1: 10.70.35.191:/rhs/brick1/ganesha-disperse
    Brick2: 10.70.35.27:/rhs/brick1/ganesha-disperse
    Brick3: 10.70.35.98:/rhs/brick1/ganesha-disperse
    Brick4: 10.70.35.64:/rhs/brick1/ganesha-disperse
    Brick5: 10.70.35.44:/rhs/brick1/ganesha-disperse
    Brick6: 10.70.35.114:/rhs/brick1/ganesha-disperse
    Brick7: 10.70.35.191:/rhs/brick2/ganesha-disperse
    Brick8: 10.70.35.27:/rhs/brick2/ganesha-disperse
    Brick9: 10.70.35.98:/rhs/brick2/ganesha-disperse
    Brick10: 10.70.35.64:/rhs/brick2/ganesha-disperse
    Brick11: 10.70.35.44:/rhs/brick2/ganesha-disperse
    Brick12: 10.70.35.114:/rhs/brick2/ganesha-disperse
    Brick13: 10.70.35.191:/rhs/brick4/ganesha-disperse
    Brick14: 10.70.35.27:/rhs/brick4/ganesha-disperse
    Brick15: 10.70.35.98:/rhs/brick4/ganesha-disperse
    Brick16: 10.70.35.64:/rhs/brick4/ganesha-disperse
    Brick17: 10.70.35.44:/rhs/brick4/ganesha-disperse
    Brick18: 10.70.35.114:/rhs/brick4/ganesha-disperse
    Options Reconfigured:
    ganesha.enable: on
    performance.readdir-ahead: on
    nfs.disable: on
    nfs-ganesha: enable
     
     
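    For reference, a volume of this shape can be created roughly as follows. This is only a hedged sketch (the exact commands used on this setup are not recorded in the bug); hosts and brick paths are taken from the volume info above, and the ganesha.enable step assumes the nfs-ganesha HA cluster is already configured:

    # Hedged sketch: create the 18-brick 3x(4+2) distributed-disperse volume.
    # Brick order matches the Bricks list above (brick1 on all six hosts, then brick2, then brick4).
    HOSTS="10.70.35.191 10.70.35.27 10.70.35.98 10.70.35.64 10.70.35.44 10.70.35.114"
    BRICKS=""
    for b in brick1 brick2 brick4; do
        for h in $HOSTS; do
            BRICKS="$BRICKS $h:/rhs/$b/ganesha-disperse"
        done
    done
    gluster volume create ganesha-disperse disperse 6 redundancy 2 $BRICKS force
    gluster volume start ganesha-disperse
    # Export the volume through NFS-Ganesha (per the "ganesha.enable: on" option above).
    gluster volume set ganesha-disperse ganesha.enable on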

    STEP#3. I have 4 VIPs for this nfs-ganesha setup: 10.70.35.219/220/221/222

    STEP#4: I mounted the volume using NFSv3 on 5 different clients, as below (a hedged mount example follows the client list):

        client1: 10.70.35.103 (RHEL 7.2): 10.70.35.219:/ganesha-disperse on /mnt/ganesha-disperse type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.70.35.219,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.70.35.219)

        client2: 10.70.35.126 (RHEL 7.2): 10.70.35.220:/ganesha-disperse on /mnt/ganesha-disperse type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.70.35.220,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.70.35.220)

        client3: 10.70.37.64 (RHEL 6.8): 10.70.35.219:/ganesha-disperse on /mnt/ganesha-disperse type nfs (rw,vers=3,addr=10.70.35.219)

        client4: 10.70.37.166 (RHEL 6.8): 10.70.35.222:/ganesha-disperse on /mnt/ganesha-disperse type nfs (rw,vers=3,addr=10.70.35.222)

        client5: 10.70.35.162 (RHEL 6.8): 10.70.35.222:/ganesha-disperse on /mnt/ganesha-disperse type nfs (rw,vers=3,addr=10.70.35.222)
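    A hedged example of the mount command that would produce the v3 mounts above (VIP and mount point substituted per client):

        # Hedged sketch: NFSv3 mount of the ganesha export via one of the VIPs.
        mkdir -p /mnt/ganesha-disperse
        mount -t nfs -o vers=3 10.70.35.219:/ganesha-disperse /mnt/ganesha-disperse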


    STEP#5: Started IOs in the following pattern:

    client1 (10.70.35.103): created a dir "103" and started the below IO:

    for i in {1..10000}; do mkdir -p dir-$i/d1/d2/d3/d4/d5/d6/d7/d8/d9/d10; done

    client2 (10.70.35.126): the following IO in a dir "126":

    for i in {1..100000}; do dd if=/dev/urandom of=file-$i bs=1k count=512; done

    client3 (10.70.37.64): Linux kernel tar/untar (a readable, reformatted version of the loop below appears after the client list)

    created a dir "64" and copied the latest kernel tarball into it,

    then  issued "for i in {1..10};do  echo "=============== LOOP $i  ================================"  |& tee -a tar.log ;date  |&  tee -a tar.log;echo "=============== LOOP $i  ================================" >>tar.err.log;date  >>tar.err.log;mkdir kern.$i |& tee -a tar.log;cp  linux-4.6.1.tar.xz kern.$i/ |& tee -a tar.log;tar -xvf  kern.$i/linux-4.6.1.tar.xz -C kern.$i/ >> tar.log  2>>tar.err.log; tar -cvf kern.$i/linux-4.6.1-mytar.tar  kern.$i/linux-4.6.1 >> tar.log 2>>tar.err.log;echo  "=======================" |& tee -a tar.log ;echo  "=======================" >> tar.err.log ;echo  "###########################################################" |& tee  -a tar.log ;echo  "###########################################################" |& tee  -a tar.log ;echo  "###########################################################" >>  tar.err.log;echo  "###########################################################" >>  tar.err.log;sleep 10;done"


    client4 (10.70.37.166): ls -lRt *

    client5 (10.70.35.162): no IOs; just mounted and then left it idle

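    For readability, the client3 one-liner quoted in STEP#5 is roughly equivalent to the following script (same commands, only reformatted; the duplicated "#" separator echoes are collapsed):

    # Hedged, reformatted version of the client3 kernel tar/untar loop quoted above.
    for i in {1..10}; do
        echo "=============== LOOP $i ================================" |& tee -a tar.log
        date |& tee -a tar.log
        echo "=============== LOOP $i ================================" >> tar.err.log
        date >> tar.err.log
        mkdir kern.$i |& tee -a tar.log
        cp linux-4.6.1.tar.xz kern.$i/ |& tee -a tar.log
        tar -xvf kern.$i/linux-4.6.1.tar.xz -C kern.$i/ >> tar.log 2>> tar.err.log
        tar -cvf kern.$i/linux-4.6.1-mytar.tar kern.$i/linux-4.6.1 >> tar.log 2>> tar.err.log
        echo "=======================" |& tee -a tar.log
        echo "=======================" >> tar.err.log
        echo "###########################################################" |& tee -a tar.log
        echo "###########################################################" >> tar.err.log
        sleep 10
    done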

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Left the IO running overnight.
Observations the next morning:

    client1: IO still going on

    client2: IO completed, no issues

    client3: Linux tar and untar completed for 4 loops, but then failed from the 5th loop onwards as below:

    IO error

    OOM kills

    #################

            bash: tar.log: Remote I/O error

            bash: tar.log: Remote I/O error

            tee: tar.log: Remote I/O error

            =======================

            bash: tar.err.log: Remote I/O error

            tee: tar.log: Remote I/O error

            ###########################################################

            tee: tar.log: Remote I/O error

            ###########################################################

            bash: tar.err.log: Remote I/O error

            bash: tar.err.log: Remote I/O error

            tee: tar.log: Remote I/O error

            =============== LOOP 9 ================================

            tee: tar.log: Remote I/O error

            Fri Jun  3 01:27:45 IST 2016

            bash: tar.err.log: Remote I/O error

            bash: tar.err.log: Remote I/O error

            tee: tar.log: Remote I/O error

            mkdir: cannot create directory `kern.9': Remote I/O error

            tee: tar.log: Remote I/O error

            cp: accessing `kern.9/': Remote I/O error

            bash: tar.log: Remote I/O error

            bash: tar.log: Remote I/O error

            tee: tar.log: Remote I/O error

            =======================

            bash: tar.err.log: Remote I/O error

            tee: tar.log: Remote I/O error

            ###########################################################

            tee: tar.log: Remote I/O error

            ###########################################################

            bash: tar.err.log: Remote I/O error

            bash: tar.err.log: Remote I/O error

            tee: tar.log: Remote I/O error

            =============== LOOP 10 ================================

            tee: tar.log: Remote I/O error

            Fri Jun  3 01:27:55 IST 2016

            bash: tar.err.log: Remote I/O error

            bash: tar.err.log: Remote I/O error

            tee: tar.log: Remote I/O error

            mkdir: cannot create directory `kern.10': Remote I/O error

            tee: tar.log: Remote I/O error

            cp: accessing `kern.10/': Remote I/O error

            bash: tar.log: Remote I/O error

            bash: tar.log: Remote I/O error

            tee: tar.log: Remote I/O error

            =======================

            bash: tar.err.log: Remote I/O error

            tee: tar.log: Remote I/O error

            ###########################################################

            tee: tar.log: Remote I/O error

            ###########################################################

            bash: tar.err.log: Remote I/O error

            bash: tar.err.log: Remote I/O error

            [root@dhcp37-64 64]# pwd




    dmesg log from client3:

            nfs: server 10.70.35.219 not responding, still trying

            nfs: server 10.70.35.219 not responding, still trying

            nfs: server 10.70.35.219 OK

            INFO: task tar:14146 blocked for more than 120 seconds.

                  Not tainted 2.6.32-642.el6.x86_64 #1

            "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

            tar           D 0000000000000001     0 14146   1519 0x00000080

             ffff8801186afc78 0000000000000082 0000000000000000 000507a3bae591dc

             ffff8801186afde8 ffff88011832d600 00024947aa559d76 ffffffffa8b94371

             0000000014c06a95 000000012656c9d5 ffff88011c8705f8 ffff8801186affd8

            Call Trace:

             [<ffffffff8112e610>] ? sync_page+0x0/0x50

             [<ffffffff81547cf3>] io_schedule+0x73/0xc0

             [<ffffffff8112e64d>] sync_page+0x3d/0x50

             [<ffffffff815487df>] __wait_on_bit+0x5f/0x90

             [<ffffffff8112e883>] wait_on_page_bit+0x73/0x80

             [<ffffffff810a6b40>] ? wake_bit_function+0x0/0x50

             [<ffffffff811449c5>] ? pagevec_lookup_tag+0x25/0x40

             [<ffffffff8112ecab>] wait_on_page_writeback_range+0xfb/0x190

             [<ffffffff81143a31>] ? do_writepages+0x21/0x40

             [<ffffffff8112edfb>] ? __filemap_fdatawrite_range+0x5b/0x60

             [<ffffffff8112ee78>] filemap_write_and_wait_range+0x78/0x90

             [<ffffffff811ccb4e>] vfs_fsync_range+0x7e/0x100

             [<ffffffff811ccc3d>] vfs_fsync+0x1d/0x20

             [<ffffffffa03449e0>] nfs_file_flush+0x70/0xa0 [nfs]

             [<ffffffff81196a1c>] filp_close+0x3c/0x90

             [<ffffffff81196b15>] sys_close+0xa5/0x100

             [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b

            INFO: task tar:14146 blocked for more than 120 seconds.

                  Not tainted 2.6.32-642.el6.x86_64 #1

            "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

            tar           D 0000000000000001     0 14146   1519 0x00000080

             ffff8801186afc78 0000000000000082 0000000000000000 000507a3bae591dc

             ffff8801186afde8 ffff88011832d600 00024947aa559d76 ffffffffa8b94371

             0000000014c06a95 000000012656c9d5 ffff88011c8705f8 ffff8801186affd8

            Call Trace:

             [<ffffffff8112e610>] ? sync_page+0x0/0x50

             [<ffffffff81547cf3>] io_schedule+0x73/0xc0

             [<ffffffff8112e64d>] sync_page+0x3d/0x50

             [<ffffffff815487df>] __wait_on_bit+0x5f/0x90

             [<ffffffff8112e883>] wait_on_page_bit+0x73/0x80

             [<ffffffff810a6b40>] ? wake_bit_function+0x0/0x50

             [<ffffffff811449c5>] ? pagevec_lookup_tag+0x25/0x40

             [<ffffffff8112ecab>] wait_on_page_writeback_range+0xfb/0x190

             [<ffffffff81143a31>] ? do_writepages+0x21/0x40

             [<ffffffff8112edfb>] ? __filemap_fdatawrite_range+0x5b/0x60

             [<ffffffff8112ee78>] filemap_write_and_wait_range+0x78/0x90

             [<ffffffff811ccb4e>] vfs_fsync_range+0x7e/0x100

             [<ffffffff811ccc3d>] vfs_fsync+0x1d/0x20

             [<ffffffffa03449e0>] nfs_file_flush+0x70/0xa0 [nfs]

             [<ffffffff81196a1c>] filp_close+0x3c/0x90

             [<ffffffff81196b15>] sys_close+0xa5/0x100

             [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b

            nfs: server 10.70.35.219 OK



    client4:

     ls -lRt IO still going on


    dmesg log from client4 ([root@dhcp37-166]):


    NFS: Registering the id_resolver key type

    Key type id_resolver registered

    FS-Cache: Netfs 'nfs' registered for caching

    SELinux: initialized (dev 0:16, type nfs4), uses genfs_contexts

    SELinux: initialized (dev 0:17, type nfs4), uses genfs_contexts

    SELinux: initialized (dev 0:16, type nfs), uses genfs_contexts

    nfs: server 10.70.35.222 not responding, still trying

    nfs: server 10.70.35.222 OK

    nfs: server 10.70.35.222 not responding, still trying

    nfs: server 10.70.35.222 OK

    nfs: server 10.70.35.222 not responding, still trying

    nfs: server 10.70.35.222 OK

    nfs: server 10.70.35.222 not responding, still trying

    nfs: server 10.70.35.222 OK

    nfs: server 10.70.35.222 not responding, still trying

    nfs: server 10.70.35.222 OK

    nfs: server 10.70.35.222 not responding, still trying

    nfs: server 10.70.35.222 OK

    nfs: server 10.70.35.222 not responding, still trying

    nfs: server 10.70.35.222 OK


    client5:

    [root@dhcp35-162 ~]# 

    nothing done







Server Side OOM Kills:
node#1:
[Thu Jun  2 10:39:20 2016] SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
[Thu Jun  2 10:45:11 2016] SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
[Thu Jun  2 10:57:37 2016] SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
[Thu Jun  2 13:17:37 2016] bash invoked oom-killer: gfp_mask=0x3000d0, order=2, oom_score_adj=0
[Thu Jun  2 13:17:37 2016] bash cpuset=/ mems_allowed=0
[Thu Jun  2 13:17:37 2016] CPU: 2 PID: 9931 Comm: bash Not tainted 3.10.0-327.13.1.el7.x86_64 #1
[Thu Jun  2 13:17:37 2016] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[Thu Jun  2 13:17:37 2016]  ffff88020c092280 00000000b8425347 ffff8800ae82fb68 ffffffff816356f4
[Thu Jun  2 13:17:37 2016]  ffff8800ae82fbf8 ffffffff8163068f ffff88023288d600 ffff88023288d618
[Thu Jun  2 13:17:37 2016]  0408000000000202 fffeffff00000000 0000000000000001 ffffffff81128903
[Thu Jun  2 13:17:37 2016] Call Trace:
[Thu Jun  2 13:17:37 2016]  [<ffffffff816356f4>] dump_stack+0x19/0x1b
[Thu Jun  2 13:17:37 2016]  [<ffffffff8163068f>] dump_header+0x8e/0x214
[Thu Jun  2 13:17:37 2016]  [<ffffffff81128903>] ? delayacct_end+0x53/0xb0
[Thu Jun  2 13:17:37 2016]  [<ffffffff8116ce7e>] oom_kill_process+0x24e/0x3b0
[Thu Jun  2 13:17:37 2016]  [<ffffffff81088d8e>] ? has_capability_noaudit+0x1e/0x30
[Thu Jun  2 13:17:37 2016]  [<ffffffff8116d6a6>] out_of_memory+0x4b6/0x4f0
[Thu Jun  2 13:17:37 2016]  [<ffffffff81173885>] __alloc_pages_nodemask+0xa95/0xb90
[Thu Jun  2 13:17:37 2016]  [<ffffffff81078d53>] copy_process.part.25+0x163/0x1610
[Thu Jun  2 13:17:37 2016]  [<ffffffff8107a3e1>] do_fork+0xe1/0x320
[Thu Jun  2 13:17:37 2016]  [<ffffffff81090711>] ? __set_task_blocked+0x41/0xa0
[Thu Jun  2 13:17:37 2016]  [<ffffffff8107a6a6>] SyS_clone+0x16/0x20
[Thu Jun  2 13:17:37 2016]  [<ffffffff816461d9>] stub_clone+0x69/0x90
[Thu Jun  2 13:17:37 2016]  [<ffffffff81645e89>] ? system_call_fastpath+0x16/0x1b
[Thu Jun  2 13:17:37 2016] Mem-Info:
[Thu Jun  2 13:17:37 2016] Node 0 DMA per-cpu:
[Thu Jun  2 13:17:37 2016] CPU    0: hi:    0, btch:   1 usd:   0
[Thu Jun  2 13:17:37 2016] CPU    1: hi:    0, btch:   1 usd:   0
[Thu Jun  2 13:17:37 2016] CPU    2: hi:    0, btch:   1 usd:   0
[Thu Jun  2 13:17:37 2016] CPU    3: hi:    0, btch:   1 usd:   0
[Thu Jun  2 13:17:37 2016] Node 0 DMA32 per-cpu:
[Thu Jun  2 13:17:37 2016] CPU    0: hi:  186, btch:  31 usd:  73
[Thu Jun  2 13:17:37 2016] CPU    1: hi:  186, btch:  31 usd:  49
[Thu Jun  2 13:17:37 2016] CPU    2: hi:  186, btch:  31 usd:   0
[Thu Jun  2 13:17:37 2016] CPU    3: hi:  186, btch:  31 usd:   0
[Thu Jun  2 13:17:37 2016] Node 0 Normal per-cpu:
[Thu Jun  2 13:17:37 2016] CPU    0: hi:  186, btch:  31 usd:  50
[Thu Jun  2 13:17:37 2016] CPU    1: hi:  186, btch:  31 usd:   0
[Thu Jun  2 13:17:37 2016] CPU    2: hi:  186, btch:  31 usd:   0
[Thu Jun  2 13:17:37 2016] CPU    3: hi:  186, btch:  31 usd:  10
[Thu Jun  2 13:17:37 2016] active_anon:1434044 inactive_anon:280519 isolated_anon:0
 active_file:0 inactive_file:575 isolated_file:0
 unevictable:6090 dirty:120 writeback:0 unstable:0
 free:31116 slab_reclaimable:108918 slab_unreclaimable:49624
 mapped:2354 shmem:565 pagetables:70547 bounce:0
 free_cma:0
[Thu  Jun  2 13:17:37 2016] Node 0 DMA free:15908kB min:132kB low:164kB  high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB  inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB  present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB  mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB  kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB  writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Thu Jun  2 13:17:37 2016] lowmem_reserve[]: 0 2815 7812 7812
[Thu  Jun  2 13:17:37 2016] Node 0 DMA32 free:54664kB min:24308kB low:30384kB  high:36460kB active_anon:1976708kB inactive_anon:494272kB  active_file:0kB inactive_file:2240kB unevictable:6724kB  isolated(anon):0kB isolated(file):0kB present:3129336kB  managed:2884968kB mlocked:6724kB dirty:224kB writeback:0kB mapped:4128kB  shmem:236kB slab_reclaimable:159000kB slab_unreclaimable:73784kB  kernel_stack:2000kB pagetables:101548kB unstable:0kB bounce:0kB  free_cma:0kB writeback_tmp:0kB pages_scanned:74 all_unreclaimable? yes
[Thu Jun  2 13:17:37 2016] lowmem_reserve[]: 0 0 4996 4996
[Thu  Jun  2 13:17:37 2016] Node 0 Normal free:53892kB min:43136kB  low:53920kB high:64704kB active_anon:3759468kB inactive_anon:627804kB  active_file:0kB inactive_file:208kB unevictable:17636kB  isolated(anon):0kB isolated(file):0kB present:5251072kB  managed:5116220kB mlocked:17636kB dirty:256kB writeback:0kB  mapped:5288kB shmem:2024kB slab_reclaimable:276672kB  slab_unreclaimable:124712kB kernel_stack:5280kB pagetables:180640kB  unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:144  all_unreclaimable? yes
[Thu Jun  2 13:17:37 2016] lowmem_reserve[]: 0 0 0 0
[Thu  Jun  2 13:17:37 2016] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U)  2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R)  3*4096kB (M) = 15908kB
[Thu  Jun  2 13:17:37 2016] Node 0 DMA32: 9569*4kB (UEM) 2048*8kB (UE) 0*16kB  0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB =  54660kB
[Thu  Jun  2 13:17:37 2016] Node 0 Normal: 13517*4kB (UE) 0*8kB 0*16kB 0*32kB  0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 54068kB
[Thu Jun  2 13:17:37 2016] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Thu Jun  2 13:17:37 2016] 21622 total pagecache pages
[Thu Jun  2 13:17:37 2016] 17936 pages in swap cache
[Thu Jun  2 13:17:37 2016] Swap cache stats: add 1155588, delete 1137652, find 71536/83011
[Thu Jun  2 13:17:37 2016] Free swap  = 0kB
[Thu Jun  2 13:17:37 2016] Total swap = 4194300kB
[Thu Jun  2 13:17:37 2016] 2099100 pages RAM
[Thu Jun  2 13:17:37 2016] 0 pages HighMem/MovableOnly
[Thu Jun  2 13:17:37 2016] 94826 pages reserved
[Thu Jun  2 13:17:37 2016] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[Thu Jun  2 13:17:37 2016] [  588]     0   588     9204      298      20       51             0 systemd-journal
[Thu Jun  2 13:17:37 2016] [  615]     0   615    79546     1709      36        0         -1000 multipathd
[Thu Jun  2 13:17:37 2016] [  620]     0   620   122399      613      38     1329             0 lvmetad
[Thu Jun  2 13:17:37 2016] [  627]     0   627    11474      601      22      462         -1000 systemd-udevd
[Thu Jun  2 13:17:37 2016] [  801]     0   801    82764     5460      42        0         -1000 dmeventd
[Thu Jun  2 13:17:37 2016] [  977]     0   977    12797      228      24       95         -1000 auditd
[Thu Jun  2 13:17:37 2016] [  999]     0   999     4189      200      14       44             0 alsactl
[Thu Jun  2 13:17:37 2016] [ 1000]     0  1000     6600      260      15       36             0 systemd-logind
[Thu Jun  2 13:17:37 2016] [ 1002]     0  1002    52567      273      55      400             0 abrtd
[Thu Jun  2 13:17:37 2016] [ 1004]   993  1004     2133      150       9       29             0 lsmd
[Thu Jun  2 13:17:37 2016] [ 1006]     0  1006    51996      238      52      329             0 abrt-watch-log
[Thu Jun  2 13:17:37 2016] [ 1007]     0  1007    81556      516      80     4130             0 firewalld
[Thu Jun  2 13:17:37 2016] [ 1015]     0  1015     1094      110       8       17             0 rngd
[Thu Jun  2 13:17:37 2016] [ 1018]    81  1018     9268      321      18       69          -900 dbus-daemon
[Thu Jun  2 13:17:37 2016] [ 1020]     0  1020    50842      140      41      115             0 gssproxy
[Thu Jun  2 13:17:37 2016] [ 1034]   990  1034    28961      273      26       51             0 chronyd
[Thu Jun  2 13:17:37 2016] [ 1048]     0  1048     6621      237      16      129             0 smartd
[Thu Jun  2 13:17:37 2016] [ 1060]     0  1060     1085       97       7       33             0 acpid
[Thu Jun  2 13:17:37 2016] [ 1086]     0  1086    31582      246      18      131             0 crond
[Thu Jun  2 13:17:37 2016] [ 1088]     0  1088     6491      173      17       51             0 atd
[Thu Jun  2 13:17:37 2016] [ 1092]     0  1092    28812      120      11       25             0 ksmtuned
[Thu Jun  2 13:17:37 2016] [ 1101]     0  1101    27509      165      10       28             0 agetty
[Thu Jun  2 13:17:37 2016] [ 1652]     0  1652    27631       18      50     3112             0 dhclient
[Thu Jun  2 13:17:37 2016] [ 1656]   996  1656   132533      365      57     1639             0 polkitd
[Thu Jun  2 13:17:37 2016] [ 1735]     0  1735    20640      228      44      213         -1000 sshd
[Thu Jun  2 13:17:37 2016] [ 1739]     0  1739    28335       92      11       39             0 rhsmcertd
[Thu Jun  2 13:17:37 2016] [ 1740]     0  1740   125957      356     118      929             0 libvirtd
[Thu Jun  2 13:17:37 2016] [ 1742]     0  1742   138265      412      90     3061             0 tuned
[Thu Jun  2 13:17:37 2016] [ 1743]   991  1743    11592      136      26      173             0 nrpe
[Thu Jun  2 13:17:37 2016] [ 1804]     0  1804    26978      101       8       36             0 rhnsd
[Thu Jun  2 13:17:37 2016] [ 3201]    32  3201    16227      164      34      127             0 rpcbind
[Thu Jun  2 13:17:37 2016] [ 5789]    29  5789    13222      178      29     1771             0 rpc.statd
[Thu Jun  2 13:17:37 2016] [ 6803]     0  6803    71323      197      39      144             0 rsyslogd
[Thu Jun  2 13:17:37 2016] [ 7397]     0  7397   151676      850      72     4487             0 glusterd
[Thu Jun  2 13:17:37 2016] [ 8493]     0  8493   207841     4280     131    30864             0 glusterfs
[Thu Jun  2 13:17:37 2016] [ 9365]     0  9365   292206      843     107     8524             0 glusterfsd
[Thu Jun  2 13:17:37 2016] [ 9412]     0  9412   524536    90199     290    10980             0 glusterfsd
[Thu Jun  2 13:17:37 2016] [ 9431]     0  9431   524536    86433     283     8368             0 glusterfsd
[Thu Jun  2 13:17:37 2016] [ 9450]     0  9450   557303   101991     347    26877             0 glusterfsd
[Thu Jun  2 13:17:37 2016] [ 9470]     0  9470 35437737  1410312   68246   935624             0 glusterfs
[Thu Jun  2 13:17:37 2016] [ 9891]     0  9891    31446      230      16       86             0 screen
[Thu Jun  2 13:17:37 2016] [ 9892]     0  9892    29176      247      14      429             0 bash
[Thu Jun  2 13:17:37 2016] [ 9931]     0  9931    30892      107      15     2047             0 bash
[Thu Jun  2 13:17:37 2016] [14768]     0 14768    26974       75      11        0             0 sleep
[Thu Jun  2 13:17:37 2016] Out of memory: Kill process 9470 (glusterfs) score 767 or sacrifice child
[Thu Jun  2 13:17:37 2016] Killed process 9470 (glusterfs) total-vm:141750972kB, anon-rss:5640732kB, file-rss:740kB
[root@dhcp35-191 ~]# 



Current vol status:
[root@dhcp35-191 glusterfs]# gluster v status ganesha-disperse
Status of volume: ganesha-disperse
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.191:/rhs/brick1/ganesha-disp
erse                                        49152     0          Y       9412 
Brick 10.70.35.27:/rhs/brick1/ganesha-dispe
rse                                         49152     0          Y       8656 
Brick 10.70.35.98:/rhs/brick1/ganesha-dispe
rse                                         49152     0          Y       28176
Brick 10.70.35.64:/rhs/brick1/ganesha-dispe
rse                                         49152     0          Y       21563
Brick 10.70.35.44:/rhs/brick1/ganesha-dispe
rse                                         49152     0          Y       21769
Brick 10.70.35.114:/rhs/brick1/ganesha-disp
erse                                        49152     0          Y       21605
Brick 10.70.35.191:/rhs/brick2/ganesha-disp
erse                                        49153     0          Y       9431 
Brick 10.70.35.27:/rhs/brick2/ganesha-dispe
rse                                         49153     0          Y       8675 
Brick 10.70.35.98:/rhs/brick2/ganesha-dispe
rse                                         49153     0          Y       28195
Brick 10.70.35.64:/rhs/brick2/ganesha-dispe
rse                                         49153     0          Y       21582
Brick 10.70.35.44:/rhs/brick2/ganesha-dispe
rse                                         49153     0          Y       21788
Brick 10.70.35.114:/rhs/brick2/ganesha-disp
erse                                        49153     0          Y       21624
Brick 10.70.35.191:/rhs/brick4/ganesha-disp
erse                                        49154     0          Y       9450 
Brick 10.70.35.27:/rhs/brick4/ganesha-dispe
rse                                         49154     0          Y       8694 
Brick 10.70.35.98:/rhs/brick4/ganesha-dispe
rse                                         49154     0          Y       28214
Brick 10.70.35.64:/rhs/brick4/ganesha-dispe
rse                                         49154     0          Y       21601
Brick 10.70.35.44:/rhs/brick4/ganesha-dispe
rse                                         49154     0          Y       21807
Brick 10.70.35.114:/rhs/brick4/ganesha-disp
erse                                        49154     0          Y       21643
Self-heal Daemon on localhost               N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.35.114            N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.35.98             N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.35.27             N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.35.64             N/A       N/A        N       N/A  
Self-heal Daemon on 10.70.35.44             N/A       N/A        N       N/A  
 
Task Status of Volume ganesha-disperse
------------------------------------------------------------------------------
There are no active volume tasks
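
Not part of the original report, but for anyone triaging a similar state: once the self-heal daemons have been OOM-killed, the kill can be confirmed from the kernel log and the missing daemons can normally be respawned with a forced volume start. A hedged sketch:

# Confirm the OOM kills on each node.
dmesg | grep -iE 'out of memory|killed process.*glusterfs'
# Respawn any brick/self-heal daemons that are down, then re-check status.
gluster volume start ganesha-disperse force
gluster volume status ganesha-disperse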

Comment 5 Nag Pavan Chilakam 2016-06-03 09:54:43 UTC
All sosreports and logs (including overnight CPU and memory consumption, along with dmesg logs) are available at the location below (a hedged sketch of how such sampling can be collected follows):
[nchilaka@rhsqe-repo bug.1342426]$ pwd
/home/repo/sosreports/nchilaka/bug.1342426
[nchilaka@rhsqe-repo bug.1342426]$ chmod -R 0777 *
[nchilaka@rhsqe-repo bug.1342426]$ chmod -R 0777 .
[nchilaka@rhsqe-repo bug.1342426]$ hostname
rhsqe-repo.lab.eng.blr.redhat.com
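
A hedged sketch of how the overnight CPU/memory numbers for shd could be collected (not necessarily the exact script used; the output file name is a placeholder):

# Sample the self-heal daemon's CPU and RSS once a minute during the overnight run.
SHD_PID=$(pgrep -f glustershd | head -n1)
while kill -0 "$SHD_PID" 2>/dev/null; do
    ps -o pid,pcpu,pmem,rss,vsz,etime -p "$SHD_PID" --no-headers >> /var/tmp/shd-usage.log
    sleep 60
done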

Comment 6 Pranith Kumar K 2016-06-04 11:52:19 UTC
Steps to re-create the issue without nfs-ganesha (the issue seems to be with cache-invalidation + EC; cache invalidation is enabled when nfs-ganesha is enabled):

On a single machine this issue can be re-created with the following steps:
1) glusterd && gluster v create ec2 redundancy 2 localhost.localdomain:/home/gfs/ec_{0..5} force && gluster v start ec2 && mount -t glusterfs localhost.localdomain:/ec2 /mnt/fuse1 && mount -t glusterfs localhost.localdomain:/ec2 /mnt/ec2

2) gluster volume set ec2 features.cache-invalidation on

3) On two different terminals, one cd'd into /mnt/ec2 and the other into /mnt/fuse1, execute:
while true; do echo abc > a; done

4) Execute "gluster volume heal ec2" in a loop 10 times; it may hang in the middle when we do this.

5) Keep observing the memory usage going up in shd (a combined sketch of steps 4 and 5 follows the list) by doing:
top -p <pid-of-shd>
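
A hedged sketch combining steps 4 and 5, polling the shd resident set from /proc instead of using an interactive top (the shd pid is found via the "glustershd" volfile-id on its command line; the 5-second pause is arbitrary):

# Run the heal command in a loop and watch the shd RSS grow.
SHD_PID=$(pgrep -f glustershd | head -n1)
for i in $(seq 1 10); do
    gluster volume heal ec2
    grep VmRSS /proc/"$SHD_PID"/status
    sleep 5
done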

Comment 7 Pranith Kumar K 2016-06-04 12:00:08 UTC
We will give out a scratch build, and we need confirmation that both this bug and https://bugzilla.redhat.com/show_bug.cgi?id=1327831 are caused by the same issue.

Comment 8 Atin Mukherjee 2016-06-06 08:51:11 UTC
Upstream mainline patch : http://review.gluster.org/14649

Comment 13 Nag Pavan Chilakam 2016-06-10 08:26:21 UTC
QATP:
====
1) 6-node cluster (RHEL 7.2): ran the same case for which I filed the bug, i.e. as below (3x(4+2)):
  1) mounted a dist-EC volume over nfs-ganesha.
  2) ran IOs for about 15 hours.
  Result: no self-heal OOM kills happened (before the fix we had hit them within 2 hours).
          Observation 1: saw an OOM kill of the ganesha process (will update an already existing bug).
          Observation 2: due to the OOM kill (probably), the VIP failed over to its HA node, after which the client that had mounted using that VIP sees "Remote I/O error" and "Stale file handle" on the mount.
          However, mounting the volume using this VIP from a new instance worked.

2) 2x(4+2), 4-node cluster (RHEL 6.8): ran the case mentioned by Pranith with cache-invalidation ON on a different setup with a FUSE mount --> no OOM kills seen.
3) (RHEL 6.8) Ran the same case I mentioned while filing the bug (i.e. with different IOs) with cache-invalidation ON and on a FUSE mount --> no OOM kills, and the IOs completed.


Hence, moving the bug to Verified.


[root@dhcp35-98 3.7.9-9]#  rpm -qa|grep ganesha
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
nfs-ganesha-2.3.1-8.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.3.1-8.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-9.el7rhgs.x86_64
[root@dhcp35-98 3.7.9-9]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.9-9.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
glusterfs-server-3.7.9-9.el7rhgs.x86_64
glusterfs-api-3.7.9-9.el7rhgs.x86_64
glusterfs-debuginfo-3.7.9-9.el7rhgs.x86_64
glusterfs-libs-3.7.9-9.el7rhgs.x86_64
glusterfs-fuse-3.7.9-9.el7rhgs.x86_64
glusterfs-cli-3.7.9-9.el7rhgs.x86_64
glusterfs-3.7.9-9.el7rhgs.x86_64
glusterfs-ganesha-3.7.9-9.el7rhgs.x86_64
python-gluster-3.7.9-9.el7rhgs.noarch

Comment 14 Nag Pavan Chilakam 2016-06-10 08:32:41 UTC
Also performed a replace-brick, which worked well (a hedged syntax sketch follows).
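
For reference, replace-brick on glusterfs 3.7 takes the "commit force" form; a hedged sketch with placeholder brick paths (the actual bricks used are not recorded in this comment):

gluster volume replace-brick ganesha-disperse \
    10.70.35.191:/rhs/brick1/ganesha-disperse \
    10.70.35.191:/rhs/brick3/ganesha-disperse-new \
    commit force
# Then monitor the heal of the replaced brick.
gluster volume heal ganesha-disperse info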

Comment 16 errata-xmlrpc 2016-06-23 05:25:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240

