Description of problem:
- Single VM with 8 GB of RAM, 1 node/1 master on the same VM
- After running 30 nodejs test pods, I started seeing exceptions in the hawkular-metrics log. I then deleted the hawkular-metrics and cassandra pods, but the new pods didn't fix the problem.

Version-Release number of selected component (if applicable):
- Origin: 1.4.1
- Metrics: 1.4.1
- CentOS 7
- Deployed on Red Hat public OS1, one single VM with 8 GB of RAM

How reproducible:
100%

Steps to Reproduce:
1. Install Metrics as instructed
2. Install Hawkular OpenShift Agent
3. Deploy some user pods
4. After several days of running and collecting metrics, the CPU, RAM, and network graphs are blank. I attempted to delete the Hawkular-Metrics and Cassandra pods, to no avail.

Actual results:
Exceptions in the hawkular-metrics pod. No metrics data is returned to a Python test client.

Expected results:
- UI displays metrics

Additional info:
Created attachment 1263754: log files
Do you have any resource limits applied to the Hawkular Metrics, Cassandra, and Heapster pods? If you are trying to do scalability testing, you need to run the infrastructure components on one OpenShift node and the pods to be monitored on another. Running everything on one node for testing purposes is not a good idea: all your pods will be competing for resources.
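One way to answer the resource-limit question is to inspect the pod specs directly. The following is a minimal sketch, not part of the original report: it assumes you have the JSON produced by `oc get pods -o json` and reports every container that lacks a CPU or memory limit. The function name `containers_without_limits` is hypothetical; the field paths (`items`, `spec.containers`, `resources.limits`) follow the standard Kubernetes pod list schema.

```python
def containers_without_limits(pod_list):
    """Return (pod name, container name) pairs lacking a CPU or memory limit.

    `pod_list` is the parsed JSON of a Kubernetes/OpenShift pod list,
    e.g. the output of `oc get pods -o json` loaded with json.load().
    """
    missing = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for container in pod["spec"]["containers"]:
            # "resources" and "limits" may be absent, empty, or null.
            limits = (container.get("resources") or {}).get("limits") or {}
            if "cpu" not in limits or "memory" not in limits:
                missing.append((pod_name, container["name"]))
    return missing
```

To use it, save the output of `oc get pods -n <metrics-namespace> -o json` to a file, load it with `json.load`, and pass the result to the function. The namespace is left as a placeholder because it depends on how Metrics was deployed. Any Hawkular Metrics, Cassandra, or Heapster container in the returned list is running without a complete limit and can compete freely with the test pods for the VM's 8 GB of RAM.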