Bug 900047 (JBPAPP6-1533)

Summary: SystemMemoryUsageLoadMetric is not correct on Linux/Unix
Product: [JBoss] JBoss Enterprise Application Platform 6 Reporter: Michal Karm Babacek <mbabacek>
Component: mod_cluster Assignee: Radoslav Husar <rhusar>
Status: CLOSED WONTFIX QA Contact: Michal Karm Babacek <mbabacek>
Severity: high Docs Contact: Russell Dickenson <rdickens>
Priority: high    
Version: 6.0.0 CC: dimitris, lcosti, paul.ferraro, rhatlapa, rhusar, rsvoboda, smumford, twells
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
URL: http://jira.jboss.org/jira/browse/JBPAPP6-1533
Whiteboard: eap601candidate
Fixed In Version: Doc Type: Known Issue
Doc Text:
The `SystemMemoryUsageLoadMetric` does not show useful information on Linux or UNIX operating systems. For these systems, `HeapMemoryUsageLoadMetric` provides more useful information. The solution to this problem will be to change the algorithm of `SystemMemoryUsageLoadMetric` to subtract the `buffers/cache` value from the `used` number. The best method for doing this is under investigation.
Story Points: ---
Clone Of: Environment:
Linux/Unix operating systems with standard memory management
Last Closed: 2013-11-25 19:24:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Michal Karm Babacek 2012-03-14 14:51:04 UTC
Affects: Release Notes
project_key: JBPAPP6

I am convinced that the [SystemMemoryUsageLoadMetric.java|https://github.com/modcluster/mod_cluster/blob/master/core/src/main/java/org/jboss/modcluster/load/metric/impl/SystemMemoryUsageLoadMetric.java] is of no use to anyone.

I have tested it by allocating some memory outside the JVM control with [NativeMemoryStress.c|https://github.com/Karm/clusterbench/blob/system-memory-load/clusterbench-native/src/main/native/NativeMemoryStress.c].

The results from Mod_cluster show the system as almost fully loaded all the time, even though I am working with an 8GB RAM box with almost nothing running on it.
This [picture|http://hudson.qa.jboss.com/hudson/view/mod_cluster-QE/view/mod_cluster-QE-eap-6x/job/eap-6x-mod_cluster-rhel6-x86_64-load-mem/4/artifact/report/load.png] displays time on the X axis and (100-(Load from Mod_cluster)) on the Y axis. As you can see, all 4 nodes appear to be under full load all the time, and there is a spike on each of them. This spike represents the period of time when the aforementioned [NativeMemoryStress.c|https://github.com/Karm/clusterbench/blob/system-memory-load/clusterbench-native/src/main/native/NativeMemoryStress.c] was called with the following settings:
{noformat}
   megabytes    500;
   milliseconds 60000;
{noformat}

What happened is that the kernel cleared some cache and garbage data from RAM in order to accommodate the new demand, so by the time I freed the memory, there was all of a sudden actually more free RAM than before.

The actual problem with the current [SystemMemoryUsageLoadMetric.java|https://github.com/modcluster/mod_cluster/blob/master/core/src/main/java/org/jboss/modcluster/load/metric/impl/SystemMemoryUsageLoadMetric.java] implementation using [OperatingSystemMXBean|http://docs.oracle.com/javase/6/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html] is that this code:
{code:lang=java}
+++
  public static final String FREE_MEMORY = "FreePhysicalMemorySize";
  public static final String TOTAL_MEMORY = "TotalPhysicalMemorySize";
+++
  double free = ((Number) this.server.getAttribute(name, FREE_MEMORY)).doubleValue();
  double total = ((Number) this.server.getAttribute(name, TOTAL_MEMORY)).doubleValue();
  return (total - free) / total;
+++
{code}
is somewhat similar to this (on an 8GB RAM machine with almost nothing running on it at the moment):
{noformat}
[karm@killerbox karm]$ free -m
             total       used       free     shared    buffers     cached
Mem:          7933       6755       1177          0         44        141
-/+ buffers/cache:       6569       1363
Swap:        16383         58      16325
{noformat}
Obviously, this does not mean that I could get only 1177MB of RAM if I wanted to. Here, "free" actually means "wasted". The majority of RAM is being used as cache. If I ask for, e.g., 2GB, I am going to get it, because the kernel will simply sacrifice some cache, clear some garbage from RAM, and free up space to accommodate my demand.
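
To make the two rows of the `free -m` output concrete, here is a small arithmetic sketch. The class and method names are hypothetical and exist only for illustration. The numbers are copied from the output above (in MB), and the "adjusted" ratio simply mirrors what the `-/+ buffers/cache` row reports.

```java
import java.util.Locale;

// Illustrative only: hypothetical helper, values taken from the `free -m` output above (MB).
final class FreeOutputArithmetic {

    /** The ratio the metric computes: (total - free) / total. */
    static double naiveLoad(double total, double free) {
        return (total - free) / total;
    }

    /** The same ratio with buffers and page cache counted as available memory. */
    static double adjustedLoad(double total, double free, double buffers, double cached) {
        return (total - free - buffers - cached) / total;
    }

    public static void main(String[] args) {
        double total = 7933, free = 1177, buffers = 44, cached = 141;
        System.out.printf(Locale.ROOT, "naive load:    %.3f%n", naiveLoad(total, free));
        System.out.printf(Locale.ROOT, "adjusted load: %.3f%n",
                adjustedLoad(total, free, buffers, cached));
    }
}
```

The first method corresponds to the `Mem:` row, the second to the `-/+ buffers/cache` row.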

So, in my opinion, on a sane, well-behaved caching operating system, the [SystemMemoryUsageLoadMetric.java|https://github.com/modcluster/mod_cluster/blob/master/core/src/main/java/org/jboss/modcluster/load/metric/impl/SystemMemoryUsageLoadMetric.java] will always show ~98% RAM usage.
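
For anyone who wants to check the value on their own box, here is a minimal, self-contained sketch of the same calculation. It is not the mod_cluster class itself: `SystemMemoryLoadProbe` is a hypothetical name, and the two attributes are assumed to be exposed by the JVM's `com.sun.management` extension of the operating system MXBean (HotSpot/OpenJDK), so they may be missing on other JVMs.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical stand-alone probe; reproduces (total - free) / total via JMX attributes.
final class SystemMemoryLoadProbe {

    static double load() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME);
            // Assumption: these attributes exist only where the com.sun.management extension is present.
            double free = ((Number) server.getAttribute(name, "FreePhysicalMemorySize")).doubleValue();
            double total = ((Number) server.getAttribute(name, "TotalPhysicalMemorySize")).doubleValue();
            return (total - free) / total;
        } catch (JMException e) {
            throw new IllegalStateException("Physical memory attributes not available on this JVM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("system memory load: " + load());
    }
}
```

Comparing the printed value with `free -m` on the same machine shows which row of that output it tracks.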

Because of this, the metric is unlikely to be of any use to anyone on Linux/UNIX. What do you think?
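
As a rough illustration of what counting buffers and cache as available could look like, here is a Linux-only sketch that reads `/proc/meminfo`. This is an assumption on my part, not anything mod_cluster does: `MeminfoLoadSketch` and its methods are hypothetical, the approach is not portable to other UNIX systems, and treating all of `Buffers` and `Cached` as reclaimable is a simplification.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Linux-only sketch: derive a cache-adjusted load from /proc/meminfo lines.
final class MeminfoLoadSketch {

    /** Parses lines such as "MemTotal:  8123392 kB" into a map of field name to kB value. */
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> kb = new HashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "Key: value" line
            }
            String[] parts = line.substring(colon + 1).trim().split("\\s+");
            try {
                kb.put(line.substring(0, colon).trim(), Long.parseLong(parts[0]));
            } catch (NumberFormatException e) {
                // Skip fields whose value is not a plain number.
            }
        }
        return kb;
    }

    /** (MemTotal - MemFree - Buffers - Cached) / MemTotal; assumes MemTotal and MemFree are present. */
    static double load(Map<String, Long> kb) {
        double total = kb.get("MemTotal");
        double available = kb.get("MemFree")
                + kb.getOrDefault("Buffers", 0L)
                + kb.getOrDefault("Cached", 0L);
        return (total - available) / total;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/meminfo"));
        System.out.println("cache-adjusted load: " + load(parse(lines)));
    }
}
```

The parsing is kept separate from the file read so the calculation can be tried against a captured snapshot instead of the live system.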

Comment 1 Michal Karm Babacek 2012-03-14 14:51:06 UTC
Link: Added: This issue Cloned from MODCLUSTER-288


Comment 2 Michal Karm Babacek 2012-03-14 14:51:45 UTC
Security: Added: Public
Docs QE Status: Added: NEW


Comment 3 Rajesh Rajasekaran 2012-03-19 16:52:49 UTC
Labels: Added: eap6_need_triage


Comment 4 Dimitris Andreadis 2012-03-19 17:36:27 UTC
Paul, can you take a look please?

Comment 5 Paul Ferraro 2012-03-22 19:13:46 UTC
See comments for MODCLUSTER-288.

Comment 6 Michal Karm Babacek 2012-03-27 14:37:30 UTC
As Paul said, please follow [MODCLUSTER-288]. This issue will likely result in a documentation task...

Comment 7 Misty Stanley-Jones 2012-06-12 11:13:42 UTC
Release Notes Docs Status: Added: Documented as Known Issue
Release Notes Text: Added: The <code>SystemMemoryUsageLoadMetric</code> does not show useful information on Linux or UNIX operating systems. For these systems, <code>HeapMemoryUsageLoadMetric</code> provides more useful information.  The solution to this probiem will be to change the algorithm of <code>SystemMemoryUsageLoadMetric</code> to subtract the <code>buffers/cache</code> value from the <code>used</code> number. The best method for doing this is under investigation.
Affects: Added: Release Notes


Comment 8 Rajesh Rajasekaran 2012-07-11 19:48:20 UTC
Labels: Removed: eap6_need_triage Added: eap601candidate


Comment 9 Paul Ferraro 2012-08-08 20:24:02 UTC
Deferring to EAP 6.1/7 - the solution will probably involve dropping this metric altogether.

Comment 10 Dana Mison 2012-10-16 05:54:53 UTC
Writer: Added: mistysj


Comment 11 Dana Mison 2012-10-29 00:59:53 UTC
Release Notes Text: Removed: The <code>SystemMemoryUsageLoadMetric</code> does not show useful information on Linux or UNIX operating systems. For these systems, <code>HeapMemoryUsageLoadMetric</code> provides more useful information.  The solution to this probiem will be to change the algorithm of <code>SystemMemoryUsageLoadMetric</code> to subtract the <code>buffers/cache</code> value from the <code>used</code> number. The best method for doing this is under investigation. Added: The `SystemMemoryUsageLoadMetric` does not show useful information on Linux or UNIX operating systems. For these systems, `HeapMemoryUsageLoadMetric` provides more useful information.  The solution to this probiem will be to change the algorithm of `SystemMemoryUsageLoadMetric` to subtract the `buffers/cache` value from the `used` number. The best method for doing this is under investigation.


Comment 12 Michal Karm Babacek 2012-10-29 13:00:18 UTC
Release Notes Text: Removed: The `SystemMemoryUsageLoadMetric` does not show useful information on Linux or UNIX operating systems. For these systems, `HeapMemoryUsageLoadMetric` provides more useful information.  The solution to this probiem will be to change the algorithm of `SystemMemoryUsageLoadMetric` to subtract the `buffers/cache` value from the `used` number. The best method for doing this is under investigation. Added: The `SystemMemoryUsageLoadMetric` does not show useful information on Linux or UNIX operating systems. For these systems, `HeapMemoryUsageLoadMetric` provides more useful information.  The solution to this problem will be to change the algorithm of `SystemMemoryUsageLoadMetric` to subtract the `buffers/cache` value from the `used` number. The best method for doing this is under investigation.


Comment 13 Anne-Louise Tangring 2012-11-13 21:01:42 UTC
Release Notes Docs Status: Removed: Documented as Known Issue 
Writer: Removed: mistysj 
Release Notes Text: Removed: The `SystemMemoryUsageLoadMetric` does not show useful information on Linux or UNIX operating systems. For these systems, `HeapMemoryUsageLoadMetric` provides more useful information.  The solution to this problem will be to change the algorithm of `SystemMemoryUsageLoadMetric` to subtract the `buffers/cache` value from the `used` number. The best method for doing this is under investigation. 
Docs QE Status: Removed: NEW 


Comment 15 Michal Karm Babacek 2013-09-13 10:22:36 UTC
Guys, let's drop this feature.
Quite some time ago, we came to the conclusion that, without JNI, one can't do this properly from within the JDK...
https://issues.jboss.org/browse/JBPAPP6-1533

Comment 16 Radoslav Husar 2013-09-16 19:15:31 UTC
Yeah, let's...

Comment 17 Radoslav Husar 2013-11-25 19:24:41 UTC
This metric will be dropped starting from EAP 7. We shouldn't be dropping features in existing EAP.

However, the concern is valid, and we need to make sure that these limitations are documented in the EAP 6 documentation so that users aren't led to believe this does something else. Michal, please check that this is properly documented in the documentation, not just as part of the old 'known issues' document for previous EAP releases (a separate documentation Jira might be needed).

Comment 20 Michal Karm Babacek 2014-05-22 08:24:57 UTC
In the current documentation, Administration Guide:

mem
    The mem load metric uses free native memory as a load factor. Usage of this metric is discouraged because it provides a value that includes buffers and cache, so it is always a very low figure on every decent system with good memory management. 

So IMHO, Rado's concern is unfounded.

Comment 21 Lucas Costi 2015-01-27 05:27:10 UTC
Removed from EAP 6.4 beta release notes