Description of problem:
CloudForms includes cached memory when computing the used memory of OpenShift 3.3 providers, so the used memory alerts it generates are not useful.
Version-Release number of selected component (if applicable):
How reproducible:
all the time
Steps to Reproduce:
1. Set up OpenShift.
2. Set up CloudForms to use the OpenShift provider.
3. Make OpenShift memory consumption grow (Java processes using cache memory).
Actual results:
CloudForms generates alerts without taking into account that cached memory is reclaimable.
Expected results:
CloudForms excludes cached memory when computing the memory usage of OpenShift providers.
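To illustrate why counting cache changes whether an alert fires, here is a minimal sketch. The node size, RSS, cache, and the 80% threshold are hypothetical numbers chosen for the example, and memory_used_percent is not a CloudForms function.

```python
GIB = 1024 ** 3


def memory_used_percent(rss_bytes, cache_bytes, total_bytes, include_cache):
    """Return used memory as a percentage of total node memory."""
    used = rss_bytes + (cache_bytes if include_cache else 0)
    return 100.0 * used / total_bytes


# Hypothetical node: 16 GiB total, 6 GiB RSS, 8 GiB reclaimable page cache.
total, rss, cache = 16 * GIB, 6 * GIB, 8 * GIB
ALERT_THRESHOLD = 80.0  # hypothetical alert threshold, in percent

with_cache = memory_used_percent(rss, cache, total, include_cache=True)
without_cache = memory_used_percent(rss, cache, total, include_cache=False)

print(with_cache, with_cache > ALERT_THRESHOLD)        # 87.5 True
print(without_cache, without_cache > ALERT_THRESHOLD)  # 37.5 False
```

With these numbers the alert fires only because reclaimable cache is counted as used memory.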
Planning to re-test against cfme-188.8.131.52.
Created attachment 1284157
The Heapster endpoints found (can't find cache or RSS)
To make it explicit: at the moment this CFME bug/RFE is blocked on OpenShift Heapster, which does not collect or expose used memory separately from cache.
Solving this issue has to be prioritized first in bug 1457933 (OpenShift Metrics).
Please assess the importance of this issue and update the priority accordingly. Somewhere it was missed in the bug triage process. Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#priority for a reminder on each priority's definition.
If it's something like a tracker bug where it doesn't matter, please set it to Low/Low.
This seems related to https://bugzilla.redhat.com/show_bug.cgi?id=1485504
This needs a PM's decision.
Please let me know what version you think we can do the change for.
For now pushing to 6.0 as this may require a DB schema change.
This is a wider issue than just collecting the RSS, because in CFME the used memory columns are always the sum of cached and RSS memory.
Prometheus reports all the memory data required, so we can collect the parameters separately and store them in separate columns. The question, however, is what we should do with those parameters other than expose them for reports, which may raise a different set of questions about what is used for chargeback.
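As a rough sketch of collecting the two values separately, the snippet below parses a small sample in the Prometheus text exposition format. container_memory_rss and container_memory_cache are the cAdvisor metric names. The sample values and the pod_name label are made up for illustration, and the total follows the RSS + cache model described in this comment, not necessarily what any collector reports as usage.

```python
# Hypothetical sample in Prometheus text exposition format (no timestamps).
SAMPLE = """\
# TYPE container_memory_rss gauge
container_memory_rss{pod_name="app-1"} 2147483648
# TYPE container_memory_cache gauge
container_memory_cache{pod_name="app-1"} 1073741824
"""

# Map cAdvisor metric names to the short keys used in the result.
WANTED = {"container_memory_rss": "rss", "container_memory_cache": "cache"}


def parse_memory_samples(text):
    """Sum RSS and cache values (bytes) found in exposition-format lines."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumes "<name>{labels} <value>" with no trailing timestamp.
        name_and_labels, value = line.rsplit(" ", 1)
        name = name_and_labels.split("{", 1)[0]
        if name in WANTED:
            key = WANTED[name]
            result[key] = result.get(key, 0.0) + float(value)
    return result


memory = parse_memory_samples(SAMPLE)
memory["total"] = memory["rss"] + memory["cache"]
print(memory)
```

Keeping "rss" and "cache" as separate keys leaves the decision about reports, alerts, and chargeback to whatever consumes the data.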
> I suggest adding a cached column to the metrics table (we could add 2
> columns, RSS and cache), or we can calculate the third (total = RSS + cache).
> The two customer tickets are closed; I am proposing to fix it with Prometheus.
The proposed solution is to add "cached" and "rss" columns to the metrics table and populate them only if we have Prometheus metrics.
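One way to picture the proposal is the sketch below. MetricRow and build_row are hypothetical stand-ins, not the actual CFME schema or code; the only assumption taken from the proposal is that the two new columns stay empty unless the collector is Prometheus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricRow:
    """Hypothetical stand-in for one row of the metrics table."""
    used_memory: float              # existing column: RSS + cache
    rss: Optional[float] = None     # proposed column, empty unless Prometheus
    cached: Optional[float] = None  # proposed column, empty unless Prometheus


def build_row(collector, used_memory, rss=None, cached=None):
    """Fill the new columns only when the data comes from Prometheus."""
    if collector == "prometheus":
        return MetricRow(used_memory, rss=rss, cached=cached)
    # Other collectors do not expose the split, so leave the columns empty.
    return MetricRow(used_memory)


prometheus_row = build_row("prometheus", 3.0, rss=2.0, cached=1.0)
heapster_row = build_row("heapster", 3.0, rss=2.0, cached=1.0)
print(prometheus_row)
print(heapster_row)
```

Existing consumers of used_memory are unaffected in this sketch, since that column keeps its current meaning.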
Closing as not a bug because showing "used_memory" in the "used memory" column is what is expected and documented. If we want to add new columns for providers using the Prometheus metrics collector, we need an RFE.