Bug 1319850

Summary: ERROR lines in skyring logs for a default installation
Product: [Red Hat Storage] Red Hat Storage Console Reporter: Martin Bukatovic <mbukatov>
Component: unclassified Assignee: Nishanth Thomas <nthomas>
Status: CLOSED WONTFIX QA Contact: Martin Bukatovic <mbukatov>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 2 CC: mbukatov, sankarshan
Target Milestone: ---   
Target Release: 3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-23 04:11:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1309640    
Bug Blocks:    

Description Martin Bukatovic 2016-03-21 16:22:22 UTC
Description of problem
======================

In the skyring.log file, there are multiple ERROR lines for a default
configuration (set up according to the current documentation). This is totally
unacceptable.

Version-Release number of selected component
============================================

osd machine:
ceph-0.94.5-9.el7cp.x86_64
ceph-common-0.94.5-9.el7cp.x86_64
ceph-osd-0.94.5-9.el7cp.x86_64
rhscon-agent-0.0.3-3.el7.noarch

mon machine:
ceph-0.94.5-9.el7cp.x86_64
ceph-common-0.94.5-9.el7cp.x86_64
ceph-mon-0.94.5-9.el7cp.x86_64
rhscon-agent-0.0.3-3.el7.noarch

usm server machine:
ceph-0.94.5-9.el7cp.x86_64
ceph-ansible-1.0.1-1.20160307gitb354445.el7.noarch
ceph-common-0.94.5-9.el7cp.x86_64
redhat-ceph-installer-0.2.3-1.20160304gitb3e3c68.el7.noarch
rhscon-ceph-0.0.6-14.el7.x86_64
rhscon-core-0.0.8-14.el7.x86_64
rhscon-ui-0.0.23-1.el7.noarch

How reproducible
================

100 %

Steps to Reproduce
==================

1. Install and configure USM based on the documentation
2. See /var/log/skyring/skyring.log file (on usm server machine)

Actual results
==============

The beginning of the skyring.log file contains multiple ERROR messages:

~~~
2016-03-17T09:45:19.463+01:00 INFO     main.go:201 func·003] start listening on 0.0.0.0 : 8080
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1440 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1440 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1443 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1443 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1446 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1446 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing cluster utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1464 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing memory utilization.Err dial tcp 127.0.0.1:2003: connection refused
2016-03-17T09:45:20.164+01:00 ERROR    monitoring.go:1464 Compute_System_Summary] skyring:cefe0e1e-cd7a-44c2-a3e2-faefdf58fa9f - Error pushing memory utilization.Err dial tcp 127.0.0.1:2003: connection refused
~~~
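
Every ERROR line above reports the same failure: `dial tcp 127.0.0.1:2003: connection refused`, which means nothing was accepting TCP connections on that address at the time. A minimal sketch of a probe that could be run on the usm server machine to confirm this follows. The helper name `can_connect` and the one-second timeout are illustrative assumptions, and the log does not say which service is expected to listen on port 2003, so the endpoint is taken from the log lines only.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it only tests reachability
    and does not send any data to the remote end.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def describe_endpoint(host="127.0.0.1", port=2003):
    """Return a one-line status for the endpoint seen in skyring.log."""
    state = "accepting connections" if can_connect(host, port) else "not reachable"
    return f"{host}:{port} is {state}"
```

Calling `describe_endpoint()` on the usm server machine would report whether the endpoint from the log is reachable at that moment. A "not reachable" result would be consistent with the errors above, but it does not by itself say whether the service is missing, stopped, or listening elsewhere.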

Expected results
================

There are no *ERROR* lines in *any* USM log for a default installation.
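
The expectation above can be checked mechanically. The following is a minimal sketch that counts ERROR lines per source location, assuming every log line follows the layout seen in the excerpt under "Actual results" (timestamp, level, `file.go:line`, function name followed by `]`, message). The regular expression and the function names are illustrative assumptions, not part of skyring.

```python
import re
from collections import Counter

# Layout assumed from the excerpt:
#   <timestamp> <LEVEL> <file.go:line> <function>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<loc>\S+:\d+)\s+(?P<func>\S+)\]\s+(?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count ERROR lines per source location (e.g. 'monitoring.go:1440')."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        # Lines that do not match the assumed layout are skipped.
        if match and match.group("level") == "ERROR":
            counts[match.group("loc")] += 1
    return counts


def summarize_file(path="/var/log/skyring/skyring.log"):
    """Summarize ERROR lines in a log file; the default path is the one from this report."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_errors(handle)
```

An empty result from `summarize_file()` would correspond to the expected state. For the excerpt in this report, the counter would instead list locations such as `monitoring.go:1440`, each with its number of occurrences.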

Additional info
===============

If developer evaluation shows that some of the issues are caused by incomplete
documentation, create a *new doc BZ* for each such doc issue, explaining what
is missing and needs an update. This BZ should not be turned into a
documentation one.

Comment 3 Martin Bukatovic 2016-06-03 08:46:55 UTC
I would like to point out that this applies to all basic use cases such as:

 * creating a cluster
 * importing a cluster
 * creating object storage
 * creating rbd
 * ... and so on ...

So far, this issue has been getting gradually worse over time, as it is harder
and harder to navigate the logs and find relevant details.

Comment 4 Nishanth Thomas 2016-06-22 12:36:54 UTC
Need to provide an analysis based on the latest builds

Comment 5 Martin Bukatovic 2016-06-22 17:18:08 UTC
(In reply to Nishanth Thomas from comment #4)
> Need to provide an analysis based on the latest builds

Based on the discussion during today's bug scrub meeting, I'm going to analyze
the logs of the current build for crash reports, failures, error messages and
other problems.

This is important given the sheer number of issues reported in the logs. And
for the same reason, it will take some time.