Bug 1342097 - [RFE] Need way to display cache size of gluster native client [NEEDINFO]
Summary: [RFE] Need way to display cache size of gluster native client
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: core
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Mohammed Rafi KC
QA Contact: Anoop
Depends On:
Blocks: 1408949 RHGS-usability-bug-GSS
Reported: 2016-06-02 12:29 UTC by Chris Blum
Modified: 2019-10-22 08:10 UTC (History)
8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-11-14 03:29:38 UTC
Target Upstream Version:
rnalakka: needinfo? (rkavunga)

Attachments (Terms of Use)
State dump RHGS1 from CLIENT15 (2.05 MB, text/plain)
2016-06-15 08:07 UTC, fabrice.dumay
no flags
State dump RHGS2 from CLIENT15 (2.05 MB, text/plain)
2016-06-15 08:11 UTC, fabrice.dumay
no flags

Description Chris Blum 2016-06-02 12:29:51 UTC
Description of problem:
Customer has the native client running inside a VM. The process has consistently consumed 4 GB of memory over a six-month period.
Since the consumed memory is not marked as cache, the 'free' command shows it as 'used', and the customer's monitoring raises a memory alarm once usage crosses the 80% mark.
We need a simple CLI command that displays the client's currently allocated cache size, so that it can be subtracted from the used-memory count.
As a workaround, it would also be acceptable to cap the total amount of memory available to the client (a per-file memory limit is not usable here).
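For clarity on the 80% alarm: on older procps, free's "used" column still includes kernel page cache, which the "-/+ buffers/cache" row subtracts back out; the gluster client's internal cache, however, is ordinary process heap and stays inside "used" either way. A small sketch of that subtraction, using the numbers from the `free -m` output in comment 5 (values in MB):

```shell
# Kernel page cache is reclaimable, so monitoring usually subtracts it:
used=7725; buffers=127; cached=3112          # MB, from `free -m` in comment 5
echo "used minus kernel cache: $((used - buffers - cached)) MB"
# This matches free's "-/+ buffers/cache" used column (up to rounding),
# but the glusterfs client's own cache is heap memory and is NOT removed here.
```

This is exactly why a way to query the client's cache size is needed: the kernel already exposes its reclaimable cache, while the client's does not appear in any standard counter.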

Version-Release number of selected component (if applicable):
RHGS 3.1.2

Additional info:
A statedump of the native client process will be uploaded by the customer at a later point in time.
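For reference, a fuse-client statedump is typically triggered by sending SIGUSR1 to the glusterfs process; by default the dump lands in the statedump directory (commonly /var/run/gluster) as glusterdump.<pid>.dump.<timestamp>. A minimal sketch; the directory and file-name pattern are assumptions about a default build, and the layout is simulated in a temp dir here rather than touching a live client:

```shell
# On a live client (not executed here):
#   kill -SIGUSR1 "$(pgrep -x glusterfs)"
# Simulate the assumed dump layout to show how to pick the newest file:
dumpdir=$(mktemp -d)                          # stand-in for /var/run/gluster
touch "$dumpdir/glusterdump.39521.dump.1465977600"
touch "$dumpdir/glusterdump.39521.dump.1465981200"
# Unix timestamps of equal width sort lexicographically, so sort works here:
newest=$(ls "$dumpdir"/glusterdump.39521.dump.* | sort | tail -n 1)
echo "latest dump: $newest"
rm -r "$dumpdir"
```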

Comment 3 fabrice.dumay 2016-06-15 08:07:19 UTC
Created attachment 1168239 [details]
State dump RHGS1 from CLIENT15

Comment 4 fabrice.dumay 2016-06-15 08:11:57 UTC
Created attachment 1168240 [details]
State dump RHGS2 from CLIENT15

Comment 5 fabrice.dumay 2016-06-15 08:16:40 UTC

I've attached both state dumps from the two RHGS servers.

The client's memory usage is shown below.

The mountpoint is used by an FTP server and an Apache instance serving static content. We're dealing with small files here.

CLIENT15# ps auxf | grep gluster
root     15015  0.0  0.0 103308   832 pts/0    S+   09:33   0:00                      \_ grep gluster
root     39521  1.2 50.9 5605112 4105776 ?     Ssl  Apr28 853:51 /usr/sbin/glusterfs --selinux --volfile-server=RHGS1 --volfile-server=RHGS2 --volfile-id=VOLUME1 /MOUNTPOINT

CLIENT15# free -m
             total       used       free     shared    buffers     cached
Mem:          7864       7725        139          0        127       3112
-/+ buffers/cache:       4485       3379
Swap:         4095         44       4051


CLIENT15# kill -SIGUSR1 $(pgrep glusterfs)




Comment 12 Mohammed Rafi KC 2016-07-13 15:41:03 UTC

The statedumps you provided are from the brick processes (please see comment 8 and comment 11). The command you used to take the statedump is correct; you may have uploaded the wrong files.

If you can upload a statedump of the client process taken while memory usage is high, that would really help in finding the root cause.


Comment 13 fabrice.dumay 2016-07-13 17:25:09 UTC

I did not find any other dump anywhere. No dump was found on the client (including /tmp and /var/run/).

How should the statedump be taken, and where should I look for it?

For the record, the memory consumption is always high. Simply accessing files on the filesystem is enough to make it grow.
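Once a client-side statedump is obtained, its per-translator memory-accounting sections give an approximate breakdown of where the heap is going. A hedged sketch of summing them; the section headers and values below are fabricated for illustration, on the assumption that dumps contain "[... memusage]" sections with size= lines in bytes:

```shell
# Fabricated statedump fragment; real dumps contain many such sections.
cat > /tmp/sample.dump <<'EOF'
[mount/fuse.fuse - usage-type gf_common_mt_char memusage]
size=1048576
num_allocs=120
[performance/io-cache.VOLUME1-io-cache - usage-type gf_ioc_mt_ioc_inode_t memusage]
size=4194304
num_allocs=800
EOF
# Sum every size= field to estimate the bytes tracked by memory accounting:
total=$(awk -F= '/^size=/ {sum += $2} END {print sum}' /tmp/sample.dump)
echo "accounted bytes: $total"
rm -f /tmp/sample.dump
```

Grepping the per-section totals the same way (per "usage-type" header) can point at which translator's cache is holding the memory.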




Comment 15 Mohammed Rafi KC 2016-08-16 12:31:46 UTC
Upstream design discussion thread https://www.mail-archive.com/gluster-users@gluster.org/msg25851.html.

If you have any suggestions, feel free to reply to that thread.
