Bug 1655352 - [GSS] Gluster client process is crashing / getting killed by OOM killer.
Summary: [GSS] Gluster client process is crashing / getting killed by OOM killer.
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Sunny Kumar
QA Contact: Bala Konda Reddy M
URL:
Whiteboard:
Depends On: RHGS34MemoryLeak
Blocks:
 
Reported: 2018-12-03 00:38 UTC by Ben Turner
Modified: 2022-03-13 16:18 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 05:58:45 UTC
Embargoed:


Attachments:

Description Ben Turner 2018-12-03 00:38:33 UTC
Description of problem:

Every two weeks or so the gluster client process crashes or is killed by the OOM killer while running Commvault backups.

Version-Release number of selected component (if applicable):

glusterfs-3.8.4-18.el6rhs.x86_64

How reproducible:

Intermittent; happens about once every two weeks when running daily backups in which Commvault writes to the Gluster volume.

Steps to Reproduce:
1.  Run commvault backup SW
2.  Backup to a gluster FUSE mount
3.  Crash / OOM killer kills the FUSE mount and the backup fails.

Actual results:

Crash / OOM kill takes down the gluster mount.

Expected results:

Normal operation.

Additional info:

We have 2 app cores from the gluster client mount process that need analysis.

Comment 9 Amar Tumballi 2018-12-04 08:52:48 UTC
A few more details are required:

Size of RAM, and the output of:

$ grep 'lru' <statedump-file>
$ grep 'active' <statedump-file>

Comment 11 Nathan Barry 2018-12-04 21:11:39 UTC
RAM at the time of the coredump was 16GB; the client has since been upgraded to 32GB RAM.
Statedumps will be uploaded to the case.

Comment 12 Ben Turner 2018-12-10 15:28:27 UTC
It's been since 12/3 with no updates added and no owner assigned. What is the status of this bug?

Comment 17 Ben Turner 2018-12-10 16:44:30 UTC
Core was generated by `/usr/sbin/glusterfs --volfile-server=aclrhgs.noblehosted.com --volfile-server=a'.
Program terminated with signal 11, Segmentation fault.
#0  mem_get (mem_pool=0x7fcd5800f4e0) at mem-pool.c:523
523	        *pool_ptr = (struct mem_pool *)mem_pool;

(gdb) f 0
#0  mem_get (mem_pool=0x7fcd5800f4e0) at mem-pool.c:523
523	        *pool_ptr = (struct mem_pool *)mem_pool;

(gdb) p ptr
$1 = (void *) 0x0

(gdb) p pool_ptr
$2 = (struct mem_pool **) 0x10

