Bug 1784721

Summary: Volume's bricks go offline due to calloc failure
Product: [Community] GlusterFS
Reporter: guolei <guol-fnst>
Component: glusterd
Assignee: bugs <bugs>
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Version: 4.1
CC: bugs, pasik, srakonde
Hardware: x86_64
OS: Linux
Last Closed: 2020-01-02 06:26:18 UTC
Type: Bug
Attachments:
Description: brick logs
Flags: none

Description guolei 2019-12-18 07:35:33 UTC
Created attachment 1646024
brick logs

Description of problem:

glusterfs version 4.1.9
Volume info: replica 2

The volume has 34 bricks, and some of them suddenly went offline.


Here are some log entries from the bricks that went offline:

[2019-12-15 08:53:50.982431] E [marker-quota.c:1098:mq_synctask1] 0-pacs-marker: Failed to spawn new synctask
[2019-12-15 08:53:50.982864] A [MSGID: 0] [mem-pool.c:118:__gf_calloc] : no memory available for size (5336) [call stack follows]
/usr/lib64/libglusterfs.so.0(+0x25940)[0x7fe7a4271940]
/usr/lib64/libglusterfs.so.0(_gf_msg_nomem+0x3e1)[0x7fe7a4271f21]
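
A minimal sketch for pulling allocation-failure entries like the one above out of a brick log, so that their timestamps and requested sizes can be compared against memory statistics from the same moment. It assumes every such entry follows the line format shown in this excerpt; the names NOMEM_RE and find_alloc_failures are hypothetical and are not part of GlusterFS.

```python
import re

# Assumed format, taken from the excerpt above:
# [timestamp] LEVEL [MSGID: n] [file:line:function] ...: no memory available for size (N)
NOMEM_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+.*?"
    r"\[(?P<src>[\w.-]+):(?P<line>\d+):(?P<func>\w+)\].*?"
    r"no memory available for size \((?P<size>\d+)\)"
)


def find_alloc_failures(lines):
    """Return (timestamp, function, size_in_bytes) for each allocation-failure line."""
    hits = []
    for text in lines:
        m = NOMEM_RE.search(text)
        if m:
            hits.append((m.group("ts"), m.group("func"), int(m.group("size"))))
    return hits


# The two log lines quoted in this report.
sample = [
    "[2019-12-15 08:53:50.982431] E [marker-quota.c:1098:mq_synctask1] "
    "0-pacs-marker: Failed to spawn new synctask",
    "[2019-12-15 08:53:50.982864] A [MSGID: 0] [mem-pool.c:118:__gf_calloc] : "
    "no memory available for size (5336) [call stack follows]",
]

failures = find_alloc_failures(sample)
for ts, func, size in failures:
    print(f"{ts}  {func}  {size} bytes")
```

To scan a real log, pass an open file object instead of the sample list; the path to the brick log depends on the installation.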




The attachment contains the logs of two bricks (sda and sdc).

Comment 1 Sanju 2020-01-02 06:26:18 UTC
When there is no memory available, we cannot expect glusterfs processes to run without interruption. Please make sure there are sufficient resources to keep the cluster in a good state.

I'm closing this as not a bug. If bricks go offline when there are sufficient resources, please file a new bug.

Thanks,
Sanju

Comment 2 guolei 2020-01-02 06:29:07 UTC
When the bricks went down, there was still 50 GB of memory free.
I don't think a lack of memory is the real cause.