Bug 1688630 - When tendrl-monitoring-integration is not running, Import flow failure related log messages should be more specific about what went wrong
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: web-admin-tendrl-commons
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.5.0
Assignee: Timothy Asir
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Depends On: 1686888
Blocks: 1696807
 
Reported: 2019-03-14 06:40 UTC by gowtham
Modified: 2019-10-30 12:23 UTC (History)

Fixed In Version: tendrl-commons-1.6.3-18.el7rhgs.noarch
Doc Type: Bug Fix
Doc Text:
Previously, errors that occurred because tendrl-monitoring-integration was not running were reported with generic error messages. More specific error messages about the status of tendrl-monitoring-integration are now logged in this situation.
Clone Of:
Environment:
Last Closed: 2019-10-30 12:23:13 UTC
Embargoed:


Attachments (Terms of Use)
Screenshot with correct message for unmanage cluster (84.37 KB, image/png)
2019-06-06 08:10 UTC, Sweta Anandpara
Screenshot with correct message for import cluster (87.04 KB, image/png)
2019-06-06 08:12 UTC, Sweta Anandpara


Links
System ID Private Priority Status Summary Last Updated
Github Tendrl commons issues 1080 0 None open Import flow failure related log messages should be more specific about what went wrong 2020-02-27 12:07:17 UTC
Github Tendrl monitoring-integration issues 593 0 None closed Updating "tendrl/integration/monitoring" from monitoring-integration sync 2020-02-27 12:07:16 UTC
Red Hat Product Errata RHBA-2019:3251 0 None None None 2019-10-30 12:23:34 UTC

Description gowtham 2019-03-14 06:40:45 UTC
Description of problem:

Most of the error log messages in the import flow are very generic. They display a big traceback with "atom failed" messages, but they do not say why the atom failed. With such an error message, the user is unable to pinpoint the cause of the failure.

## import failed

Failure in Job 0845ffb1-4d53-4a6c-9e18-3ed0a72c1ce5 Flow tendrl.flows.ImportCluster with error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py", line 240, in process_job
    the_flow.run()
  File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 131, in run
    exc_traceback)
FlowExecutionFailedError: ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 98, in run\n super(ImportCluster, self).run()\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/__init__.py", line 227, in run\n "Error executing post run function: %s" % atom_fqn\n', 'AtomExecutionFailedError: Atom Execution failed. Error: Error executing post run function: tendrl.objects.Cluster.atoms.SetupClusterAlias\n'

Failed post-run: tendrl.objects.Cluster.atoms.SetupClusterAlias for flow: Import existing Gluster Cluster

Version-Release number of selected component (if applicable):
tendrl-commons-1.6.3-17.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create gluster cluster
2. Install RHGSWA via tendrl-ansible
3. Stop the tendrl-monitoring-integration service on the server
4. Try to import the cluster

Actual results:
Import fails with a huge traceback that does not indicate the cause

Expected results:
A specific log message that shows why the import failed
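
For illustration only, a minimal Python sketch of the kind of message being asked for. It is not the tendrl-commons code: run_post_atom and the "Reason:" wording are hypothetical, and AtomExecutionFailedError here is a local stand-in for the exception named in the traceback above. The idea is that the wrapper keeps the underlying cause in the error text instead of reporting only the atom name.

```python
class AtomExecutionFailedError(Exception):
    """Local stand-in for the exception named in the traceback (assumption)."""


def run_post_atom(atom_fqn, atom_run):
    """Run a post-run atom; on failure, report the atom name and the reason.

    `atom_run` is a hypothetical zero-argument callable that returns True on
    success, returns False on failure, or raises an exception.
    """
    try:
        ok = atom_run()
    except Exception as exc:
        # Keep the original cause in the message so the log is specific.
        raise AtomExecutionFailedError(
            "Error executing post run function: %s. Reason: %s"
            % (atom_fqn, exc)
        )
    if not ok:
        raise AtomExecutionFailedError(
            "Error executing post run function: %s. Reason: atom returned False"
            % atom_fqn
        )
    return True
```

With a wrapper like this, a failure in SetupClusterAlias caused by an unreachable service would carry that cause in the job log rather than only the atom name.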

Additional info:

Comment 2 Martin Bukatovic 2019-03-14 09:19:32 UTC
Quick list of related bugs (this is not a complete list) based on the sheer
title of this bug (Import flow failure related log messages should be more
specific about what went wrong):

#1647322 WA should detect and report problems with carbon initialization
#1647909 Import fails when WA is not updated
#1616005 Repeated Import (and Unmanage) fails: Timing out import job, Cluster data still not fully updated
#1612096 Import cluster with bricks down failed
#1602858 Root cause of problem with import cluster job failure  needs to be identified
#1599375 Error executing pre run function: tendrl.objects.Cluster.atoms.Check Cluster Nodes Up
#1589820 Non descriptive Import Cluster failure: Atom Execution failed
#1589801 no error reported by WA ui when importing cluster without free disk space on /var/lib/carbon partition
#1583713 No dashboards when cluster is imported on second attempt
#1686888 import cluster fails after timeout without clear indication what went wrong
#1686855 Task messages are not informative

Comment 3 Martin Bukatovic 2019-03-14 09:20:44 UTC
The reproducer in this BZ is the same as in linked BZ 1686888. What is the purpose of this BZ?

Comment 6 gowtham 2019-04-01 16:47:52 UTC
Added a pre-atom to the import and unmanage cluster flows to check that all required services are running:
    https://github.com/Tendrl/commons/pull/1081
    https://github.com/Tendrl/commons/pull/1083
    https://github.com/Tendrl/monitoring-integration/pull/594

Assigning ownership for carbon user while creating an alias:
   PR: https://github.com/Tendrl/monitoring-integration/pull/596
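
A rough sketch of what such a pre-run service check could look like, for readers who do not want to open the PRs. This is not the code from the pull requests above: the function names, the REQUIRED_SERVICES list, and the error wording are assumptions. It relies only on `systemctl is-active --quiet <unit>`, which exits with 0 when the unit is active.

```python
import subprocess

# Hypothetical list; this bug only names tendrl-monitoring-integration.
REQUIRED_SERVICES = ["tendrl-monitoring-integration"]


def is_active(service, run=subprocess.call):
    """Return True if `systemctl is-active --quiet <service>` exits with 0."""
    return run(["systemctl", "is-active", "--quiet", service]) == 0


def find_stopped_services(services, run=subprocess.call):
    """Return the services from `services` that are not currently active."""
    return [s for s in services if not is_active(s, run)]


def check_required_services(services=REQUIRED_SERVICES, run=subprocess.call):
    """Fail early with a message that names every stopped service."""
    stopped = find_stopped_services(services, run)
    if stopped:
        raise RuntimeError(
            "Required services are not running: %s" % ", ".join(stopped)
        )
    return True
```

The `run` parameter exists only so the check can be exercised without systemd, by passing a fake that returns an exit code.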

Comment 12 Sweta Anandpara 2019-06-06 08:10:44 UTC
Created attachment 1577798 [details]
Screenshot with correct message for unmanage cluster

Comment 13 Sweta Anandpara 2019-06-06 08:12:20 UTC
Created attachment 1577799 [details]
Screenshot with correct message for import cluster

Comment 20 errata-xmlrpc 2019-10-30 12:23:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3251

