Bug 1342097

Summary: [RFE] Need way to display cache size of gluster native client
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Chris Blum <cblum>
Component: core
Assignee: Mohammed Rafi KC <rkavunga>
Status: CLOSED CANTFIX
QA Contact: Anoop <annair>
Severity: high
Docs Contact:
Priority: high
Version: rhgs-3.1
CC: atumball, bkunal, bturner, cblum, fabrice.dumay, rhs-bugs, rkavunga, sankarshan
Target Milestone: ---
Keywords: FutureFeature, ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-14 03:29:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1408949, 1468976
Attachments:
Description                       Flags
State dump RHGS1 from CLIENT15    none
State dump RHGS2 from CLIENT15    none

Description Chris Blum 2016-06-02 12:29:51 UTC
Description of problem:
The customer runs the native client inside a VM. The client process consumes 4 GB of memory, and this figure has been stable over a 6-month period.
Since the consumed memory is not marked as cache, the 'free' command reports it as 'used', and monitoring alerts the customer about memory filling up when usage reaches the 80% mark.
We need a simple CLI command that displays the currently allocated cache size of the client, so that we can subtract it from the used memory count.
As a workaround, it would also be acceptable to limit the total amount of memory available to the client (a per-file memory limit is not usable).
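
Until such a command exists, one rough approximation is to read the client's resident set size (RSS) from /proc and subtract it from the used-memory figure before applying the 80% threshold. The sketch below assumes a Linux /proc filesystem; the function names are hypothetical and not part of any gluster tool. Note that RSS covers all resident memory of the process, not only its caches, so it is an upper bound on the cache size rather than the value requested here.

```python
import re


def parse_vmrss_kb(status_text):
    """Return the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found in status text")
    return int(match.group(1))


def read_status(pid):
    """Read /proc/<pid>/status for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()


def adjusted_used_percent(total_kb, used_kb, client_rss_kb):
    """Percentage of total memory in use after excluding the client's RSS.

    used_kb should already exclude buffers/cache, i.e. the value that
    'free' prints on its '-/+ buffers/cache' line.
    """
    adjusted_kb = max(used_kb - client_rss_kb, 0)
    return 100.0 * adjusted_kb / total_kb
```

With the figures reported in comment 5 (7864 MB total, 4485 MB used after buffers/cache, client RSS of 4105776 kB), `adjusted_used_percent(7864 * 1024, 4485 * 1024, 4105776)` comes to roughly 6%, well below the 80% alarm threshold.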

Version-Release number of selected component (if applicable):
RHGS 3.1.2

Additional info:
Statedump of native client process will be uploaded by the customer directly at a later point in time.

Comment 3 fabrice.dumay 2016-06-15 08:07:19 UTC
Created attachment 1168239 [details]
State dump RHGS1 from CLIENT15

Comment 4 fabrice.dumay 2016-06-15 08:11:57 UTC
Created attachment 1168240 [details]
State dump RHGS2 from CLIENT15

Comment 5 fabrice.dumay 2016-06-15 08:16:40 UTC
Hello,

I've attached both state dumps from the two RHGS servers.

Below you can see the memory usage of the client.

The mountpoint is used by an FTP server and an Apache server to serve static content. We're talking about small files here.


CLIENT15# ps auxf | grep gluster
root     15015  0.0  0.0 103308   832 pts/0    S+   09:33   0:00                      \_ grep gluster
root     39521  1.2 50.9 5605112 4105776 ?     Ssl  Apr28 853:51 /usr/sbin/glusterfs --selinux --volfile-server=RHGS1 --volfile-server=RHGS2 --volfile-id=VOLUME1 /MOUNTPOINT

CLIENT15# free -m
             total       used       free     shared    buffers     cached
Mem:          7864       7725        139          0        127       3112
-/+ buffers/cache:       4485       3379
Swap:         4095         44       4051

CLIENT15# kill -SIGUSR1 $(pgrep glusterfs)


Thanks.

Regards,

Fab

Comment 12 Mohammed Rafi KC 2016-07-13 15:41:03 UTC
Hello,

The state dumps you provided are from the brick process (please see comment 8 and comment 11). The command you used to take the statedump is correct; you might have uploaded the wrong one.

If you can upload a statedump of the client process taken while the memory usage is high, that would really help to find the root cause.

Thanks
Rafi

Comment 13 fabrice.dumay 2016-07-13 17:25:09 UTC
Hello,

I did not find any other dump anywhere. No dump was found on the client (including /tmp and /var/run/).

How should the dump be taken, and where can I find it afterwards?

For the record, the memory consumption is always high. It suffices to access files on the filesystem to make it grow.

Thanks.

Regards,

Fabrice
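
One possible way to make sure the signal reaches the client process rather than a brick is sketched below. It relies on two assumptions that should be verified on the system in question: that brick processes run as `glusterfsd` while the native client runs as `glusterfs` with the mountpoint as its last argument (as in the `ps` output in comment 5), and that statedumps are written to /var/run/gluster with the PID in the file name. The directory must exist on the client for a dump to appear. All function names are hypothetical.

```python
import os
import signal

# Assumption: usual default statedump directory; verify on your system.
STATEDUMP_DIR = "/var/run/gluster"


def is_client_cmdline(raw_cmdline):
    """Heuristic: does a /proc/<pid>/cmdline blob look like a client mount?"""
    args = [arg for arg in raw_cmdline.split(b"\0") if arg]
    if not args:
        return False
    # Bricks run as "glusterfsd"; the native client runs as "glusterfs".
    if os.path.basename(args[0]) != b"glusterfs":
        return False
    # Assumption: a client mount ends with the mountpoint path.
    return args[-1].startswith(b"/")


def find_client_pids(proc_root="/proc"):
    """Return the PIDs of processes that look like native client mounts."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                if is_client_cmdline(handle.read()):
                    pids.append(int(entry))
        except OSError:
            continue  # process exited or is not readable
    return sorted(pids)


def request_statedumps(pids):
    """Send SIGUSR1 to each PID, which asks the process for a statedump."""
    for pid in pids:
        os.kill(pid, signal.SIGUSR1)


def list_dumps(pid, dump_dir=STATEDUMP_DIR):
    """List files in dump_dir whose name contains '.<pid>.dump'."""
    if not os.path.isdir(dump_dir):
        return []
    marker = f".{pid}.dump"
    return sorted(name for name in os.listdir(dump_dir) if marker in name)
```

Run as root on the client, `request_statedumps(find_client_pids())` followed by `list_dumps(pid)` for each PID would show whether a dump file was produced.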

Comment 15 Mohammed Rafi KC 2016-08-16 12:31:46 UTC
Upstream design discussion thread: https://www.mail-archive.com/gluster-users@gluster.org/msg25851.html

If you have any suggestions, feel free to reply to the thread.

Comment 34 Red Hat Bugzilla 2023-09-14 23:59:38 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days