Description of problem
======================

In the skyring.log file, there are multiple ERROR lines for a default configuration (based on the current documentation). This is totally unacceptable.

Version-Release number of selected component
============================================

osd machine:

ceph-0.94.5-9.el7cp.x86_64
ceph-common-0.94.5-9.el7cp.x86_64
ceph-osd-0.94.5-9.el7cp.x86_64
rhscon-agent-0.0.3-3.el7.noarch

mon machine:

ceph-0.94.5-9.el7cp.x86_64
ceph-common-0.94.5-9.el7cp.x86_64
ceph-mon-0.94.5-9.el7cp.x86_64
rhscon-agent-0.0.3-3.el7.noarch

usm server machine:

ceph-0.94.5-9.el7cp.x86_64
ceph-ansible-1.0.1-1.20160307gitb354445.el7.noarch
ceph-common-0.94.5-9.el7cp.x86_64
redhat-ceph-installer-0.2.3-1.20160304gitb3e3c68.el7.noarch
rhscon-ceph-0.0.6-14.el7.x86_64
rhscon-core-0.0.8-14.el7.x86_64
rhscon-ui-0.0.23-1.el7.noarch

How reproducible
================

100 %

Steps to Reproduce
==================

1. Install and configure USM based on the documentation
2. See the /var/log/skyring/skyring.log file (on the usm server machine)

Actual results
==============

See the beginning of the skyring.log file; there are multiple ERROR messages:

~~~
2016-03-17T09:45:19.463+01:00 INFO main.go:201 func·003] start listening on 0.0.0.0 : 8080
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1440 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1440 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1443 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1443 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1446 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1446 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1464 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing memory utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1464 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing memory utilization.Err dial tcp 127.0.0.1:2003: connection refused
~~~

Expected results
================

There are no *ERROR* lines in *any* USM log for a default installation.

Additional info
===============

If, after developer evaluation, it turns out that some of the issues are caused by incomplete documentation, create a *new doc BZ* for each such doc issue, with an explanation of what is missing and needs an update. This BZ should not be turned into a documentation one.
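For triage, every ERROR line in the excerpt ends in `dial tcp 127.0.0.1:2003: connection refused`, which usually means nothing was accepting connections on that port when skyring tried to push its metrics. A minimal sketch of a check that can be run on the usm server machine follows; the helper name `port_is_open` is made up for illustration, and which service is expected to listen on port 2003 in this setup is not stated in this report.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Address and port taken from the ERROR lines in skyring.log.
print("127.0.0.1:2003 reachable:", port_is_open("127.0.0.1", 2003))
```

If this prints `False` on a freshly installed server, the errors are probably a symptom of a missing or not yet started listener rather than of the summary computation itself.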
I would like to point out that this applies to all basic use cases such as:

* creating a cluster
* importing a cluster
* creating object storage
* creating rbd
* ... and so on ...

So far, this issue has been getting gradually worse over time, as it is harder and harder to find relevant details in the logs.
Need to provide an analysis based on the latest builds
(In reply to Nishanth Thomas from comment #4)
> Need to provide an analysis based on the latest builds

Based on the discussion during today's bug scrub meeting, I'm going to analyze the logs of the current build for crash reports, failures, error messages and other problems. This is important given the sheer number of issues reported in the logs. For the same reason, it will take some time.
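As a starting point for that kind of log pass, here is a minimal sketch that counts ERROR lines per source location. It assumes the line layout seen in the excerpt from the description (timestamp, level, `file.go:line`, function name followed by `]`, message); the regular expression and the `summarize_errors` helper are illustrative and not part of skyring, and other USM logs may use a different layout.

```python
import re
from collections import Counter

# Assumed layout: <timestamp> <LEVEL> <file.go:line> <function>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<src>\S+:\d+)\s+(?P<func>\S+)\]\s+(?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count ERROR lines per source location (e.g. 'monitoring.go:1440')."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[match.group("src")] += 1
    return counts


# A few lines copied from the skyring.log excerpt in the description.
SAMPLE = [
    "2016-03-17T09:45:19.463+01:00 INFO main.go:201 func·003] start listening on 0.0.0.0 : 8080",
    "2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1440 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused",
    "2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1440 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused",
    "2016-03-17T09:45:20.164+01:00 ERROR monitoring.go:1464 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing memory utilization.Err dial tcp 127.0.0.1:2003: connection refused",
]

for src, count in summarize_errors(SAMPLE).most_common():
    print(f"{count:4d}  {src}")
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `summarize_errors(open("/var/log/skyring/skyring.log", encoding="utf-8", errors="replace"))`. Grouping by source location keeps repeated messages from drowning out the less frequent ones.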