Bug 1516242 - When rpm package installation fails on Could not resolve host error during Import Cluster operation, information about this problem is not provided
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-node-agent
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Shubhendu Tripathi
QA Contact: Lubos Trilety
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-22 10:48 UTC by Martin Bukatovic
Modified: 2017-12-18 04:37 UTC (History)

Fixed In Version: tendrl-node-agent-1.5.4-8.el7rhgs.noarch
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-18 04:37:36 UTC
Target Upstream Version:


Attachments
screenshot of task details page with error messages (154.42 KB, image/png)
2017-11-22 10:48 UTC, Martin Bukatovic
response from ${TENDRL_SERVER}/api/1.0/jobs/${JOB_ID}/messages api call (45.92 KB, text/plain)
2017-11-22 10:49 UTC, Martin Bukatovic


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:3478 normal SHIPPED_LIVE RHGS Web Administration packages 2017-12-18 09:34:49 UTC

Description Martin Bukatovic 2017-11-22 10:48:46 UTC
Created attachment 1357364 [details]
screenshot of task details page with error messages

Description of problem
======================

When rpm package installation fails with a "Could not resolve host" error
during the Import Cluster operation, no information about this problem is
provided.

This is similar to a more serious upstream issue, reported (and now fixed) as:

https://github.com/Tendrl/node-agent/issues/627

but it differs in the reproducer and in the type of failure, which in the case
of this BZ is neither detected nor reported. Compared to the original upstream
issue Tendrl/node-agent/issues/627, this seems to have a lower priority.

Version-Release number
======================

tendrl-node-agent-1.5.4-3.el7rhgs.noarch

[root@usm1-gl1 ~]# rpm -qa | grep tendrl | sort
tendrl-collectd-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-3.el7rhgs.noarch
tendrl-node-agent-1.5.4-3.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch

[root@usm1-server ~]# rpm -qa | grep tendrl | sort
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.5.4-3.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-4.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-4.el7rhgs.noarch
tendrl-node-agent-1.5.4-3.el7rhgs.noarch
tendrl-notifier-1.5.4-2.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-ui-1.5.4-3.el7rhgs.noarch

Steps to Reproduce
==================

1. Prepare machines with GlusterFS storage pool, including gluster volume
2. Install RHGS WA using tendrl-ansible
3. Pick one storage server node and break the rpm repository of the RHGS WA
   channel by specifying an invalid hostname in the repo URL
   (e.g. download.example.com).
4. Verify that installation of tendrl-gluster-integration is not possible
   on the machine selected in the previous step.
   Yum install should fail with the error:
   > "Could not resolve host: download.example.com; Name or service not known"
5. Import gluster trusted storage pool with a volume (prepared in step #1).
6. Wait until the import task finishes (it's expected to fail).
7. Check the details in task details page.
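
Step 4 can also be cross-checked without yum by testing whether the hostname in
the repo URL resolves at all. The following is a minimal sketch and not part of
Tendrl; the function name `repo_host_resolves` and its injectable `resolver`
parameter are assumptions made for illustration.

```python
import socket
from urllib.parse import urlparse


def repo_host_resolves(baseurl, resolver=socket.getaddrinfo):
    """Return (hostname, resolved) for the host named in a repo URL.

    `resolver` defaults to a real DNS lookup; it is injectable so the check
    can be exercised without network access.
    """
    host = urlparse(baseurl).hostname
    if host is None:
        raise ValueError("no hostname in %r" % baseurl)
    try:
        # Port is irrelevant for a pure name lookup, so pass None.
        resolver(host, None)
    except socket.gaierror:
        # Same class of failure yum reports as "Could not resolve host".
        return host, False
    return host, True
```

Calling `repo_host_resolves("http://download.example.com/")` on the node broken
in step 3 should report the host as unresolvable, assuming that placeholder
name has no DNS record in the test environment.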

Actual results
==============

Even though the installation of the tendrl-gluster-integration package failed
on one storage server, there is no direct indication of this on the page.

Errors reported include:

* Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster
* Failed atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster
* Error: Cluster data sync still incomplete. Timing out
* Failed atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster

but not a single one mentions the problem with yum installation failure.

When I download the task messages via curl like this:

```
$ curl "${TENDRL_SERVER}/api/1.0/jobs/${JOB_ID}/messages" -H "Authorization: Bearer ${TENDRL_TOKEN}" | jq '.' > job.pretty.json
```
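
For scripted checks, the same request can be issued from Python. This is a
sketch under assumptions: the endpoint and the Bearer header are taken from the
curl call above, while the function names and the 30 second timeout are
illustrative choices, not anything defined by Tendrl.

```python
import json
import urllib.request


def build_messages_request(server, job_id, token):
    """Build the GET request for a job's messages (no network I/O here)."""
    url = "%s/api/1.0/jobs/%s/messages" % (server.rstrip("/"), job_id)
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer %s" % token}
    )


def fetch_messages(server, job_id, token, timeout=30):
    """Send the request and decode the JSON response body."""
    request = build_messages_request(server, job_id, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from the network call keeps the URL and header
logic checkable without a running Tendrl server.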

I can grep within the error messages, but I don't see anything interesting:

```
$ grep yum job.pretty.json
$ grep fail job.pretty.json
      "message": "Failure in Job 4cd6bfee-d3ef-49cf-8c24-b72827e47225 Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n    the_flow.run()\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n    raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster\n"
      "message": "Failure in Job 5c0c6cab-a4ea-4892-adaf-3001822d3efe Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n    the_flow.run()\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n    raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
      "message": "Failure in Job 38ae87de-d101-429f-a499-c84f18f93a1a Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n    the_flow.run()\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n    raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
      "message": "Failure in Job e282dca2-af32-49f5-bb87-a9610e9b4fef Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n    the_flow.run()\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n    raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
      "message": "Failure in Job bba9853d-3961-4bbe-87ab-8f02c5562a1c Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n    the_flow.run()\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n    raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
      "message": "Failure in Job b7680ecf-e0a8-4dc9-8cd6-538f6fe807f6 Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n    the_flow.run()\n  File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n    raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster\n"
```
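
Most of each matching message is an identical traceback, so the distinct
failure reasons are easier to see when only the last traceback line is kept.
The sketch below assumes nothing about the response layout beyond what the grep
output shows, namely that message texts are stored under keys named `message`;
the helper names are illustrative.

```python
import json


def iter_messages(node):
    """Yield every string stored under a "message" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "message" and isinstance(value, str):
                yield value
            else:
                yield from iter_messages(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_messages(item)


def failure_reasons(data):
    """Return distinct final lines of "Failure in Job" messages, first seen first."""
    reasons = []
    for text in iter_messages(data):
        if text.startswith("Failure in Job"):
            last_line = text.strip().splitlines()[-1]
            if last_line not in reasons:
                reasons.append(last_line)
    return reasons


def failure_reasons_from_file(path):
    """Load a saved API response (e.g. job.pretty.json) and summarize it."""
    with open(path) as handle:
        return failure_reasons(json.load(handle))
```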

```
$ grep gluster-integration job.pretty.json
      "message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl6.example.com"
      "message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl2.example.com"
      "message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl5.example.com"
      "message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl1.example.com"
      "message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl3.example.com"
      "message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl4.example.com"
      "message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl6.example.com"
      "message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl6.example.com"
      "message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl3.example.com"
      "message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl3.example.com"
      "message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl5.example.com"
      "message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl5.example.com"
      "message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl2.example.com"
      "message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl2.example.com"
      "message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl4.example.com"
      "message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl4.example.com"

```
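
The only indirect hint in this output is a node that has an "Installing"
message but never a "Running" one. A small sketch that finds such nodes
follows; it assumes the message wording shown above, and the function name is
illustrative.

```python
import re

# Matches the three message shapes seen in the grep output above.
_EVENT = re.compile(
    r"^(Installing|Generating configuration for|Running) "
    r"tendrl-gluster-integration on Node (\S+)$"
)


def nodes_never_started(messages):
    """Return nodes with an "Installing" message but no "Running" message."""
    installing, running = set(), set()
    for text in messages:
        match = _EVENT.match(text.strip())
        if match is None:
            continue
        event, node = match.groups()
        if event == "Installing":
            installing.add(node)
        elif event == "Running":
            running.add(node)
    return sorted(installing - running)
```

Fed the sixteen messages listed above, this returns only
mbukatov-usm1-gl1.example.com, which would be consistent with the installation
having failed there, although no message says so.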

Expected results
================

Tendrl should report an error stating that installation of the package failed,
with details indicating why (based on the yum error) if possible.

In this case, the reason is that yum can't resolve the hostname of the RHGS WA
repo.
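
To illustrate the kind of message meant here, the sketch below turns captured
yum error output into a one-line summary. It is not the fix shipped in
tendrl-node-agent; the function name and the wording of the resulting message
are hypothetical, and only the "Could not resolve host" text is taken from the
yum error quoted in the reproducer.

```python
import re

# Pattern based on the yum error quoted in step 4 of the reproducer.
_RESOLVE = re.compile(r"Could not resolve host: (?P<host>[^;\s]+)")


def summarize_install_failure(package, node, stderr_text):
    """Build a one-line, user-facing reason for a failed package install."""
    match = _RESOLVE.search(stderr_text)
    if match:
        reason = "could not resolve repository host %s" % match.group("host")
    else:
        # Fall back to the last non-empty line of the error output.
        lines = [line.strip() for line in stderr_text.splitlines() if line.strip()]
        reason = lines[-1] if lines else "unknown error"
    return "Failed to install %s on Node %s: %s" % (package, node, reason)
```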

Comment 2 Martin Bukatovic 2017-11-22 10:49:50 UTC
Created attachment 1357365 [details]
response from ${TENDRL_SERVER}/api/1.0/jobs/${JOB_ID}/messages api call

Comment 4 Lubos Trilety 2017-11-29 14:26:27 UTC
Tested with:
tendrl-node-agent-1.5.4-8.el7rhgs.noarch

There is now an error from which it is clear what happened and why the import
failed.

Comment 6 errata-xmlrpc 2017-12-18 04:37:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3478

