Bug 1883191

Summary: OOM: OSD crashed with error: thread_name:bstore_kv_sync
Product: [Red Hat Storage] Red Hat Ceph Storage    Reporter: Pawan <pdhiran>
Component: RADOS    Assignee: Neha Ojha <nojha>
Status: CLOSED NOTABUG    QA Contact: Manohar Murthy <mmurthy>
Severity: high    Docs Contact:
Priority: unspecified
Version: 4.2    CC: akupczyk, bhubbard, ceph-eng-bugs, dzafman, jdurgin, kchai, nojha, rzarzyns, sseshasa
Target Milestone: rc
Target Release: 5.*
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:    Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:    Environment:
Last Closed: 2020-09-30 15:21:46 UTC    Type: Bug
Regression: ---    Mount Type: ---
Documentation: ---    CRM:
Verified Versions:    Category: ---
oVirt Team: ---    RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---    Target Upstream Version:
Embargoed:

Comment 3 Brad Hubbard 2020-09-29 02:48:53 UTC
*** Bug 1883189 has been marked as a duplicate of this bug. ***

Comment 4 Brad Hubbard 2020-09-29 02:58:23 UTC
I've marked https://bugzilla.redhat.com/show_bug.cgi?id=1883189 as a dup of this bug since they both seem to have the same root cause (an out-of-memory condition). Did you gather any memory data at the time, such as sysstat output? Also, what restrictions (cgroups, etc.), if any, were placed on the daemons that failed? Since this appears to be an OOM condition, we need to focus on how much memory was available and what was consuming it. If you can reproduce this, it would be a good idea to monitor the daemons to see how their memory usage grows, using something like the commands in https://tracker.ceph.com/issues/46658 (comments 4 and 8, etc.) to get a better idea of how the memory is being consumed.
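
As a rough sketch (the exact tracker commands are not reproduced here; this assumes an OSD built with tcmalloc, admin-socket access, and a hypothetical daemon id osd.0), per-daemon and node-level memory use can be inspected with commands along these lines:

  # Per-pool memory accounting inside the OSD (osd.0 is a placeholder id):
  ceph daemon osd.0 dump_mempools

  # tcmalloc heap statistics for the same daemon (requires a tcmalloc build):
  ceph tell osd.0 heap stats

  # Node-level view of free memory and per-process RSS for the OSD daemons:
  free -h
  ps -C ceph-osd -o pid,rss,vsz,comm

Capturing these periodically while the workload runs should show which mempool or heap component is growing before the OOM kill.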

Comment 8 Josh Durgin 2020-09-30 15:21:46 UTC
Those OSD nodes have 4 GiB of RAM in total, and you're running 4 OSDs on each, every one with a 4 GiB osd_memory_target. Together the daemons will try to use roughly 16 GiB on a node that only has 4 GiB.

This is not enough memory, and it is nowhere close to a supported setup.
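
For illustration only (not part of the original report): osd_memory_target is a best-effort target rather than a hard limit, and it can be inspected and adjusted at runtime with ceph config; a hypothetical osd.0 id is used below.

  # Show the effective per-OSD memory target, in bytes:
  ceph config get osd.0 osd_memory_target

  # Lower the target for all OSDs, e.g. to 1 GiB:
  ceph config set osd osd_memory_target 1073741824

Even at 1 GiB each, four OSDs would consume the node's entire 4 GiB with nothing left for the OS, and recent releases enforce a minimum target (around 896 MiB), so the practical fix is more RAM or fewer OSDs per node rather than a smaller target.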