Bug 1647322 - WA should detect and report problems with carbon initialization
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: web-admin-tendrl-commons
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.0
Assignee: Timothy Asir
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Depends On:
Blocks: 1696807
Reported: 2018-11-07 08:23 UTC by Martin Bukatovic
Modified: 2019-10-30 12:23 UTC

Fixed In Version: tendrl-monitoring-integration-1.6.3-22.el7rhgs.noarch
Doc Type: Bug Fix
Doc Text:
Previously, tendrl did not set an owner for the /var/lib/carbon/whisper/tendrl directory. When the owner of this directory was not the 'carbon' user, carbon-cache could not create whisper files in this location. Tendrl now ensures the directory is owned by the 'carbon' user to ensure whisper files can be created.
Clone Of:
Environment:
Last Closed: 2019-10-30 12:23:13 UTC
Embargoed:




Links
| System | ID | Private | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|---|
| Github Tendrl monitoring-integration issues | 595 | 0 | None | closed | WA should detect and report problems with carbon initialization | 2020-02-27 12:07:11 UTC |
| Red Hat Bugzilla | 1589801 | 0 | unspecified | CLOSED | no error reported by WA ui when importing cluster without free disk space on /var/lib/carbon partition | 2021-02-22 00:41:40 UTC |
| Red Hat Bugzilla | 1647214 | 1 | None | None | None | 2024-09-18 00:49:29 UTC |
| Red Hat Product Errata | RHBA-2019:3251 | 0 | None | None | None | 2019-10-30 12:23:34 UTC |

Internal Links: 1589801 1647214

Description Martin Bukatovic 2018-11-07 08:23:41 UTC
Description of problem
======================

When carbon fails to create its database for some reason, WA doesn't notice
the problem and reports the cluster import as successful, even though no data
can be shown on any dashboard.

Version-Release number of selected component
============================================

# rpm -qa | grep tendrl | sort
tendrl-ansible-1.6.3-8.el7rhgs.noarch
tendrl-api-1.6.3-7.el7rhgs.noarch
tendrl-api-httpd-1.6.3-7.el7rhgs.noarch
tendrl-commons-1.6.3-13.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-14.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-14.el7rhgs.noarch
tendrl-node-agent-1.6.3-10.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-11.el7rhgs.noarch

[root@mbukatov-usm1-server ~]# rpm -qa | egrep '(carbon|grafana|collectd)' | grep -v tendrl | sort
carbon-selinux-1.5.4-2.el7rhgs.noarch
collectd-5.7.2-3.1.el7rhgs.x86_64
collectd-ping-5.7.2-3.1.el7rhgs.x86_64
grafana-4.3.2-3.el7rhgs.x86_64
libcollectdclient-5.7.2-3.1.el7rhgs.x86_64
python-carbon-0.9.15-2.1.el7rhgs.noarch

How reproducible
================

100 %

Steps to Reproduce
==================

1. Prepare a trusted storage pool with a few volumes to import
2. Install WA using tendrl-ansible
3. On the WA server, make sure that the directory
   /var/lib/carbon/whisper/tendrl exists and is empty, but that the
   carbon user can't write to it
   (e.g. run chown root /var/lib/carbon/whisper/tendrl)
4. Import cluster

Alternative reproducer when you have cluster already imported:

1. Unmanage the cluster
2. Wait for the cluster to show up again in the tendrl interface of WA
3. On WA server, run: chown root /var/lib/carbon/whisper/tendrl
4. On WA server, run: rmdir /var/lib/carbon/whisper/tendrl/clusters
5. Run import cluster again

Actual results
==============

The import task finishes with success, but there are no data points
in the dashboard, because carbon was unable to initialize its database.
This can be seen in /var/log/carbon/console.log, which contains
a large number of error messages like:

```
07/11/2018 09:16:27 :: 'Error creating /var/lib/carbon/whisper/tendrl/clusters/969a2d08-3e24-4f56-8a09-61575106f8b9/nodes/mbukatov-usm1-gl3/brick_count/up.wsp'
07/11/2018 09:16:27 :: "[Errno 13] Permission denied: '/var/lib/carbon/whisper/tendrl/clusters'"
```
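The errors above can be detected mechanically. The following is only a minimal sketch, not what WA or tendrl actually does: the function name `count_carbon_permission_errors` is hypothetical, and the default log path is the one quoted in this report.

```shell
# Hypothetical helper: count "Permission denied" lines in a carbon console log.
# The default path is the one quoted in this bug report.
count_carbon_permission_errors() {
    log_file="${1:-/var/log/carbon/console.log}"

    if [ ! -r "$log_file" ]; then
        echo "ERROR: cannot read $log_file" >&2
        return 2
    fi

    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function usable under "set -e".
    grep -c -F 'Permission denied' -- "$log_file" || true
}
```

Running `count_carbon_permission_errors` without arguments on the WA server reads the default log. A non-zero count right after an import would suggest that carbon could not create its whisper files.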

One can also directly check that the clusters directory (which normally
holds the carbon database) is missing:

```
# ls -l /var/lib/carbon/whisper/tendrl/
total 0
drwxr-xr-x. 3 root root 22 Nov  7 08:57 archive
drwxr-xr-x. 2 root root 26 Nov  7 09:14 names
# 
```

Expected results
================

WA performs some validation during the last phase of the cluster import and
notifies the user about the problem.
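One possible shape of such a validation, as a minimal sketch only: the function name `check_whisper_owner` is hypothetical and is not part of tendrl. It checks ownership only (not mode bits or free disk space) and relies on `stat -c` from GNU coreutils.

```shell
# Hypothetical check: succeed only if the directory exists and is owned by
# the expected user. Defaults mirror the path and user named in this report.
check_whisper_owner() {
    dir="${1:-/var/lib/carbon/whisper/tendrl}"
    expected_owner="${2:-carbon}"

    if [ ! -d "$dir" ]; then
        echo "ERROR: $dir does not exist" >&2
        return 2
    fi

    # stat -c %U prints the name of the owning user (GNU coreutils).
    actual_owner="$(stat -c %U -- "$dir")"
    if [ "$actual_owner" != "$expected_owner" ]; then
        echo "ERROR: $dir is owned by $actual_owner, expected $expected_owner" >&2
        return 1
    fi
    return 0
}
```

With the reproducer from this report applied, `check_whisper_owner` would return 1 and print the mismatching owner, which is the kind of message the import task could surface to the user.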

Additional info
===============

When the problem with access rights is fixed, carbon recovers and
the database and the dashboard are populated (data points appear in the
dashboard almost immediately).

```
# chown carbon /var/lib/carbon/whisper/tendrl
# ls -l /var/lib/carbon/whisper/tendrl/clusters/
total 0
drwxr-xr-x. 8 carbon carbon 124 Nov  7 09:20 969a2d08-3e24-4f56-8a09-61575106f8b9
#
```
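The recovery can also be confirmed by counting the whisper files that carbon has created. This is a minimal sketch under assumptions: the function name `count_whisper_files` is hypothetical, and the default directory is the clusters path shown above.

```shell
# Hypothetical helper: count whisper (*.wsp) files below a directory.
# Prints 0 when the directory does not exist, as in the failing case above.
count_whisper_files() {
    dir="${1:-/var/lib/carbon/whisper/tendrl/clusters}"

    if [ ! -d "$dir" ]; then
        echo 0
        return 0
    fi

    # wc -l may pad its output with spaces on some systems; tr strips them.
    find "$dir" -type f -name '*.wsp' | wc -l | tr -d ' '
}
```

A count of 0 after an import points at the failure described in this report, while a growing count indicates that carbon is writing data again.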

Comment 2 Martin Bukatovic 2018-11-07 08:30:08 UTC
There are other self-monitoring/error-reporting bugs reported for WA and carbon,
e.g. BZ 1589801.

Comment 3 gowtham 2019-04-01 16:43:36 UTC
PR: https://github.com/Tendrl/monitoring-integration/pull/596, assigning permission to the carbon user while creating an alias

Comment 20 errata-xmlrpc 2019-10-30 12:23:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3251

