Bug 1589801 - no error reported by WA ui when importing cluster without free disk space on /var/lib/carbon partition
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: web-admin-tendrl-node-agent
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: gowtham
QA Contact: sds-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-11 12:16 UTC by Martin Bukatovic
Modified: 2019-05-08 19:43 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-08 19:43:41 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1590443 1 None None None 2024-09-18 00:48:05 UTC
Red Hat Bugzilla 1647322 0 unspecified CLOSED WA should detect and report problems with carbon initialization 2021-02-22 00:41:40 UTC

Internal Links: 1590443 1647322

Description Martin Bukatovic 2018-06-11 12:16:11 UTC
Description of problem
======================

When there is no free disk space left on the /var/lib/carbon partition (e.g. because
we have a lot of archived data from previous unmanage tasks), the import
cluster task finishes with success and WA doesn't report any error directly
in the web ui or via alerts, but the Grafana dashboards don't show any data
(as no files can be written into the /var/lib/carbon/whisper/tendrl/clusters/<cluster-id> directory).

This is an edge case which we can address by a combination of:

 * increased error detection/reporting (e.g. monitoring free space on the
   /var/lib/carbon partition and reporting an alert)
 * a description of this case in the debugging guide
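
To illustrate the first point, here is a minimal sketch of a free-space check
for the carbon partition. It is not part of WA or tendrl: the function names
and the 10% threshold are hypothetical, and a real implementation would have
to hook into the existing alerting code instead of returning a string.

```python
import shutil

CARBON_DIR = "/var/lib/carbon"  # partition discussed in this report
MIN_FREE_FRACTION = 0.10  # hypothetical threshold, not an existing WA setting


def free_fraction(path):
    """Return the free fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def low_space_message(path, min_free=MIN_FREE_FRACTION):
    """Return an alert text if free space is below `min_free`, else None."""
    fraction = free_fraction(path)
    if fraction < min_free:
        return "low disk space on %s: %.1f%% free" % (path, fraction * 100)
    return None
```

Calling `low_space_message(CARBON_DIR)` periodically on the WA server would
return a message once the partition drops below the chosen threshold.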

Version-Release number
======================

RHGS WA components on tendrl server machine:

tendrl-ansible-1.6.3-4.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-6.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-4.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-4.el7rhgs.noarch
tendrl-node-agent-1.6.3-6.el7rhgs.noarch
tendrl-notifier-1.6.3-3.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-3.el7rhgs.noarch

Other WA components:

grafana-4.3.2-3.el7rhgs.x86_64
python-carbon-0.9.15-2.1.el7rhgs.noarch
python-whisper-0.9.15-1.1.el7rhgs.noarch

How reproducible
================

100%

Steps to Reproduce
==================

1. Prepare an RHGS trusted storage pool
2. Prepare separate partitions for etcd and graphite data on the RHGS WA server
3. Install RHGS WA using tendrl-ansible
4. Generate a large file in the /var/lib/carbon partition so that there is no
   free space left there
5. Import the cluster

Actual results
==============

The ImportCluster task finishes with success, and all components of the
trusted storage pool are shown in the WA interface.

There are no errors reported by WA directly (via ui, task details or alerts).

Grafana dashboard shows no data. Only an empty directory structure was created
in the /var/lib/carbon/whisper/tendrl/clusters/<cluster-id> directory, as any
attempt to write data there fails:

```
# pwd
/var/lib/carbon/whisper/tendrl/clusters/<cluster-id>
# find . -type d | wc -l
301
# find . -type f | wc -l
0
```
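
The same symptom (directories present, no whisper files) could be detected
programmatically. The following is only a sketch: `count_tree` and
`looks_empty` are hypothetical helper names, and the counts only approximate
the two `find` commands above (for example, symlinks are classified
differently).

```python
import os


def count_tree(root):
    """Count directories and files under `root`, roughly like the find calls."""
    dirs = 1  # `find . -type d` also counts the starting directory
    files = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files


def looks_empty(root):
    """Return True if `root` has subdirectories but no files at all."""
    dirs, files = count_tree(root)
    return dirs > 1 and files == 0
```

For the cluster directory shown above, `count_tree` would be expected to
report 301 directories and 0 files, so `looks_empty` would return True.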

The carbon log file, /var/log/carbon/console.log, contains error messages about
this:

```
11/06/2018 08:11:13 :: 'Error creating /var/lib/carbon/whisper/tendrl/clusters/84ffce52-031b-415f-a8a0-c878043dfd89/nodes/mbukatov-usm1-gl5/bricks/|mnt|brick_gama_disperse_2|2/device/vdc/mount_utilization/total.wsp'
```
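
Such messages could be picked up by a simple log scan. The sketch below is an
assumption-heavy illustration: the pattern is inferred from the single log line
quoted above, it assumes whisper paths contain no whitespace, and
`failed_whisper_paths` is a hypothetical helper, not something carbon or WA
provides.

```python
import re

# Pattern inferred from the one sample line in this report; other carbon
# versions may format the message differently.
ERROR_CREATING = re.compile(r"Error creating (?P<path>\S+\.wsp)")


def failed_whisper_paths(lines):
    """Yield whisper file paths that carbon reported it could not create."""
    for line in lines:
        match = ERROR_CREATING.search(line)
        if match:
            yield match.group("path")
```

It can be fed an open file, e.g. `failed_whisper_paths(open("/var/log/carbon/console.log"))`,
and any yielded path indicates that carbon is failing to write metrics.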

Expected results
================

This is an edge case. We may consider monitoring disk usage of the carbon partition
and reporting the problem via alerts if needed.

Comment 3 Martin Bukatovic 2018-06-11 12:28:18 UTC
Asking doc team: would this case be worth mentioning in the debugging/troubleshooting
guide for RHGS WA 3.4?

Comment 4 Martin Bukatovic 2018-06-11 12:43:33 UTC
Additional Information
----------------------

When I free some space in /var/lib/carbon, the data starts to gradually appear
on the dashboard. I haven't checked in detail how long it takes or whether all
data will appear eventually.

