Description of problem:
=======================
glusterfs invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
glusterfs cpuset=/ mems_allowed=0
Pid: 29151, comm: glusterfs Not tainted 2.6.32-358.6.1.el6.x86_64 #1
Call Trace:
 [<ffffffff810cb5f1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
 [<ffffffff8111cdf0>] ? dump_header+0x90/0x1b0
 [<ffffffff8121d1fc>] ? security_real_capable_noaudit+0x3c/0x70
 [<ffffffff8111d272>] ? oom_kill_process+0x82/0x2a0
 [<ffffffff8111d1b1>] ? select_bad_process+0xe1/0x120
 [<ffffffff8111d6b0>] ? out_of_memory+0x220/0x3c0
 [<ffffffff8112c35c>] ? __alloc_pages_nodemask+0x8ac/0x8d0
 [<ffffffff8116095a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff8111a1d7>] ? __page_cache_alloc+0x87/0x90
 [<ffffffff81119bbe>] ? find_get_page+0x1e/0xa0
 [<ffffffff8111b197>] ? filemap_fault+0x1a7/0x500
 [<ffffffff81143194>] ? __do_fault+0x54/0x530
 [<ffffffff81143767>] ? handle_pte_fault+0xf7/0xb50
 [<ffffffff811443fa>] ? handle_mm_fault+0x23a/0x310
 [<ffffffff810474c9>] ? __do_page_fault+0x139/0x480
 [<ffffffff813230bf>] ? extract_entropy_user+0xbf/0x130
 [<ffffffff8103c7b8>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff8103b8ac>] ? kvm_clock_read+0x1c/0x20
 [<ffffffff8103b8b9>] ? kvm_clock_get_cycles+0x9/0x10
 [<ffffffff810a1420>] ? getnstimeofday+0x60/0xf0
 [<ffffffff815135ce>] ? do_page_fault+0x3e/0xa0
 [<ffffffff81510985>] ? page_fault+0x25/0x30
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:  30
CPU    1: hi:  186, btch:  31 usd: 179
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  30
CPU    1: hi:  186, btch:  31 usd:  78
active_anon:740421 inactive_anon:187455 isolated_anon:0
 active_file:62 inactive_file:81 isolated_file:0
 unevictable:0 dirty:0 writeback:1 unstable:0
 free:21244 slab_reclaimable:2805 slab_unreclaimable:13478
 mapped:94 shmem:11230 pagetables:4252 bounce:0
Node 0 DMA free:15724kB min:248kB low:308kB high:372kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15320kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 3512 4017 4017
Node 0 DMA32 free:60840kB min:58868kB low:73584kB high:88300kB active_anon:2759112kB inactive_anon:547136kB active_file:148kB inactive_file:364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596500kB mlocked:0kB dirty:4kB writeback:0kB mapped:308kB shmem:44908kB slab_reclaimable:3324kB slab_unreclaimable:2092kB kernel_stack:72kB pagetables:7940kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:460 all_unreclaimable? yes
lowmem_reserve[]: 0 0 505 505
Node 0 Normal free:8412kB min:8464kB low:10580kB high:12696kB active_anon:202572kB inactive_anon:202684kB active_file:100kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:517120kB mlocked:0kB dirty:0kB writeback:4kB mapped:68kB shmem:12kB slab_reclaimable:7896kB slab_unreclaimable:51820kB kernel_stack:1016kB pagetables:9068kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:158 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 3*4kB 2*8kB 1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15724kB
Node 0 DMA32: 808*4kB 421*8kB 282*16kB 242*32kB 154*64kB 97*128kB 45*256kB 8*512kB 2*1024kB 1*2048kB 0*4096kB = 60840kB
Node 0 Normal: 299*4kB 96*8kB 41*16kB 21*32kB 10*64kB 9*128kB 5*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 8412kB
14544 total pagecache pages
3136 pages in swap cache
Swap cache stats: add 1620224, delete 1617088, find 48938/52854
Free swap = 0kB
Total swap = 4063224kB
1048575 pages RAM
67843 pages reserved
210 pages shared
955684 pages non-shared
[ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[  502]     0   502     2766        2   0     -17         -1000 udevd
[ 1157]     0  1157     2279        1   0       0             0 dhclient
[ 1201]     0  1201     6915       25   0     -17         -1000 auditd
[ 1226]     0  1226    62271       91   0       0             0 rsyslogd
[ 1255]     0  1255     2704       24   0       0             0 irqbalance
[ 1269]    32  1269     4743       16   0       0             0 rpcbind
[ 1287]    29  1287     5836        1   0       0             0 rpc.statd
[ 1320]     0  1320     6290        1   0       0             0 rpc.idmapd
[ 1414]    81  1414     7943        7   0       0             0 dbus-daemon
[ 1431]     0  1431    47336        1   1       0             0 cupsd
[ 1456]     0  1456     1019        0   1       0             0 acpid
[ 1465]    68  1465     6270       92   0       0             0 hald
[ 1466]     0  1466     4526        1   0       0             0 hald-runner
[ 1494]     0  1494     5055        1   1       0             0 hald-addon-inpu
[ 1505]    68  1505     4451        1   1       0             0 hald-addon-acpi
[ 1526]     0  1526    96425       31   1       0             0 automount
[ 1542]     0  1542     1691        0   0       0             0 mcelog
[ 1554]     0  1554    16029        1   0     -17         -1000 sshd
[ 1576]    38  1576     7540       14   1       0             0 ntpd
[ 1652]     0  1652    19682       23   0       0             0 master
[ 1659]    89  1659    19745       17   0       0             0 qmgr
[ 1676]     0  1676    27544        1   1       0             0 abrtd
[ 1684]     0  1684    29302       25   0       0             0 crond
[ 1695]     0  1695     5363        1   0       0             0 atd
[ 1706]     0  1706    25231       20   0       0             0 rhnsd
[ 1713]     0  1713    25972        1   0       0             0 rhsmcertd
[ 1728]     0  1728    15480       13   0       0             0 certmonger
[ 1746]     0  1746     1015        1   1       0             0 mingetty
[ 1748]     0  1748     1015        1   0       0             0 mingetty
[ 1750]     0  1750     1015        1   1       0             0 mingetty
[ 1752]     0  1752     1015        1   1       0             0 mingetty
[ 1755]     0  1755     1015        1   0       0             0 mingetty
[ 1763]     0  1763     3095        1   0     -17         -1000 udevd
[20478]     0 20478     2765        0   1     -17         -1000 udevd
[29143]     0 29143  1760204   912796   0       0             0 glusterfs
[32082]     0 32082    24466       36   0       0             0 sshd
[32086]     0 32086    27117       20   0       0             0 bash
[32107]    89 32107    19702       17   0       0             0 pickup
[32151]     0 32151    24466       51   1       0             0 sshd
[32152]     0 32152    24466       37   1       0             0 sshd
[32159]     0 32159    27117       62   0       0             0 bash
[32160]     0 32160    27117       86   1       0             0 bash
[ 6284]     0  6284    25234       23   0       0             0 tail
[17084]     0 17084    26523       79   0       0             0 script2.sh
[21651]     0 21651    26294       63   0       0             0 dd
Out of memory: Kill process 29143 (glusterfs) score 809 or sacrifice child
Killed process 29143, UID 0, (glusterfs) total-vm:7040816kB, anon-rss:3650900kB, file-rss:284kB

[root@darrel f]#
[root@darrel f]# ls
ls: cannot open directory .: Transport endpoint is not connected
[root@darrel f]# df -h
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/vg_darrel-lv_root   50G   34G   13G  73% /
tmpfs                          1.9G     0  1.9G   0% /dev/shm
/dev/vda1                      485M   63M  397M  14% /boot
/dev/mapper/vg_darrel-lv_home   16G  168M   15G   2% /home
df: `/mnt/vol-dr': Transport endpoint is not connected
10.70.36.35:/vol-dr             11T   16G   11T   1% /mnt/nvol-dr
[root@darrel f]#
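The process table above shows the glusterfs fuse client at total_vm 1760204 pages and rss 912796 pages (roughly 3.5 GB resident) on a machine with about 4 GB of RAM (1048575 pages) and swap fully consumed (Free swap = 0kB), which is why the OOM killer selected it. A minimal sketch for collecting more data on the next run and for recovering the dead mount point, assuming the pid pattern, mount point, and volfile server shown above; the statedump directory varies by build (commonly /var/run/gluster, sometimes /tmp):

# While the fuse client is still running, have it write a statedump so its
# memory pools/allocations can be inspected (glusterfs dumps state on SIGUSR1):
kill -USR1 "$(pgrep -f 'glusterfs.*vol-dr')"
ls -lrt /var/run/gluster/glusterdump.*

# After the OOM kill, the stale fuse mount returns "Transport endpoint is not
# connected"; detach it and remount:
umount -l /mnt/vol-dr
mount -t glusterfs 10.70.36.35:/vol-dr /mnt/vol-dr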
Version-Release number of selected component (if applicable):
=============================================================
[root@darrel n]# rpm -qa | grep gluster
glusterfs-devel-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-rdma-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-debuginfo-3.4.0.12rhs.beta1-1.el6.x86_64
glusterfs-fuse-3.4.0.12rhs.beta1-1.el6.x86_64
[root@darrel n]#

Steps Carried:
==============
1. Created and started a 6x2 (distribute x replica) volume across 4 server nodes
2. Mounted the volume on client darrel (fuse and NFS)
3. Set the volume options:
   cluster.background-self-heal-count: 0
   cluster.self-heal-daemon: off
4. Created directories f and n from the fuse mount
5. cd to f from the fuse mount
6. cd to n from the NFS mount
7. Ran script1.sh from the fuse (f) and NFS (n) mount directories
8. Brought down all the bricks on server2 (kill -9)
9. Brought down server4 (powered off)
10. Ran script2.sh from the fuse (f) and NFS (n) mount directories
11. It finished in the NFS (n) directory and was still running in the fuse (f) directory.
12. Reran script2.sh from the NFS (n) directory.
13. Meanwhile, the fuse client hit OOM.
(A hedged command-level sketch of these steps follows the Expected results below.)

Actual results:
===============
Out of memory: Kill process 29143 (glusterfs) score 809 or sacrifice child
Killed process 29143, UID 0, (glusterfs) total-vm:7040816kB, anon-rss:3650900kB, file-rss:284kB

Expected results:
=================
The fuse client should not hit OOM.
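As referenced at the end of the steps above, a rough sketch of the reproduction as CLI commands. The volume name (vol-dr), the volfile server address, and the two volume options come from this report; the server hostnames, brick paths, NFS mount options, and the contents of script1.sh/script2.sh are assumptions:

# Hypothetical 6x2 distributed-replicate layout across four servers;
# brick paths are placeholders.
gluster volume create vol-dr replica 2 \
    server1:/bricks/b1 server2:/bricks/b1 \
    server3:/bricks/b2 server4:/bricks/b2 \
    server1:/bricks/b3 server2:/bricks/b3 \
    server3:/bricks/b4 server4:/bricks/b4 \
    server1:/bricks/b5 server2:/bricks/b5 \
    server3:/bricks/b6 server4:/bricks/b6
gluster volume start vol-dr
gluster volume set vol-dr cluster.background-self-heal-count 0
gluster volume set vol-dr cluster.self-heal-daemon off

# On the client darrel: one fuse mount and one NFS (v3) mount of the volume,
# then the working directories f and n created via the fuse mount.
mount -t glusterfs 10.70.36.35:/vol-dr /mnt/vol-dr
mount -t nfs -o vers=3 10.70.36.35:/vol-dr /mnt/nvol-dr
mkdir /mnt/vol-dr/f /mnt/vol-dr/n

# Steps 8-9: take down one side of each replica pair.
pkill -9 glusterfsd    # run on server2 (kills all brick processes)
poweroff               # run on server4

# Steps 7 and 10-12 run script1.sh/script2.sh (contents not included in the
# report) from /mnt/vol-dr/f (fuse) and /mnt/nvol-dr/n (NFS).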
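To check the expected behaviour on a fixed build, the fuse client's memory can be sampled while the same workload runs. A small sketch, assuming the mount from this report; the process-match pattern, sample interval, and log path are arbitrary:

# Sample the fuse client's VSZ/RSS once a minute until the process exits.
CLIENT_PID=$(pgrep -f 'glusterfs.*vol-dr' | head -n1)
while kill -0 "$CLIENT_PID" 2>/dev/null; do
    ps -o pid=,vsz=,rss=,comm= -p "$CLIENT_PID" >> /var/tmp/glusterfs-rss.log
    sleep 60
done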
Patch review URL: https://code.engineering.redhat.com/gerrit/11061
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html