Created attachment 1357364 [details]
screenshot of task details page with error messages

Description of problem
======================

When rpm package installation fails with a "Could not resolve host" error
during an Import Cluster operation, no information about this problem is
provided. This is similar to a more serious issue reported (and since
fixed) upstream as:

https://github.com/Tendrl/node-agent/issues/627

but it differs in the reproducer and in the type of failure, which is not
detected and reported in the case of this BZ. Compared to the original
upstream issue Tendrl/node-agent/issues/627, this one seems to have a
lower priority.

Version-Release number
======================

tendrl-node-agent-1.5.4-3.el7rhgs.noarch

[root@usm1-gl1 ~]# rpm -qa | grep tendrl | sort
tendrl-collectd-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-3.el7rhgs.noarch
tendrl-node-agent-1.5.4-3.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch

[root@usm1-server ~]# rpm -qa | grep tendrl | sort
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.5.4-3.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-4.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-4.el7rhgs.noarch
tendrl-node-agent-1.5.4-3.el7rhgs.noarch
tendrl-notifier-1.5.4-2.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-ui-1.5.4-3.el7rhgs.noarch

Steps to Reproduce
==================

1. Prepare machines with a GlusterFS storage pool, including a gluster volume.
2. Install RHGS WA using tendrl-ansible.
3. Pick one storage server node and break the rpm repository of the RHGS WA
   channel by specifying an invalid hostname in its repo url
   (e.g. download.example.com); see the sketch after these steps.
4. Verify that installation of tendrl-gluster-integration is not possible
   on the machine selected in the previous step. Yum install should fail
   with the error:

   > "Could not resolve host: download.example.com; Name or service not known"

5. Import the gluster trusted storage pool with a volume (prepared in step #1).
6. Wait until the import task finishes (it is expected to fail).
7. Check the details in the task details page.
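To illustrate step 3, here is a minimal sketch of a broken repo file on the
selected node; the file name and repo id are made up for this example, and
the only part that matters is the unresolvable hostname in baseurl:

```
$ cat /etc/yum.repos.d/rhgs-wa.repo
# hypothetical repo file shown for illustration only;
# only the bogus hostname in baseurl matters for the reproducer
[rhgs-wa]
name=RHGS Web Administration
baseurl=http://download.example.com/rhgs-wa/
enabled=1
gpgcheck=0
```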
Actual results
==============

Even though the installation of the tendrl-gluster-integration package
failed on one storage server, there is no direct indication of this
happening on the page. Errors reported include:

* Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster
* Failed atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster
* Error: Cluster data sync still incomplete. Timing out
* Failed atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster

but not a single one mentions the yum installation failure.

When I download the task messages via curl like this:

```
$ curl "${TENDRL_SERVER}/api/1.0/jobs/${JOB_ID}/messages" -H "Authorization: Bearer ${TENDRL_TOKEN}" | jq '.' > job.pretty.json
```
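For completeness, the bearer token used above can be obtained from the
Tendrl API login call; this is a sketch assuming the /api/1.0/login
endpoint and placeholder credentials:

```
$ TENDRL_TOKEN=$(curl -s -X POST "${TENDRL_SERVER}/api/1.0/login" \
      -d '{"username": "admin", "password": "CHANGEME"}' | jq -r '.access_token')
```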
I can grep within the error messages, but I don't see anything interesting:

```
$ grep yum job.pretty.json
$ grep fail job.pretty.json
"message": "Failure in Job 4cd6bfee-d3ef-49cf-8c24-b72827e47225 Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n the_flow.run()\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster\n"
"message": "Failure in Job 5c0c6cab-a4ea-4892-adaf-3001822d3efe Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n the_flow.run()\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
"message": "Failure in Job 38ae87de-d101-429f-a499-c84f18f93a1a Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n the_flow.run()\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
"message": "Failure in Job e282dca2-af32-49f5-bb87-a9610e9b4fef Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n the_flow.run()\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
"message": "Failure in Job bba9853d-3961-4bbe-87ab-8f02c5562a1c Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n the_flow.run()\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Cluster data sync still incomplete. Timing out\n"
"message": "Failure in Job b7680ecf-e0a8-4dc9-8cd6-538f6fe807f6 Flow tendrl.flows.ImportCluster with error:\nTraceback (most recent call last):\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py\", line 218, in process_job\n the_flow.run()\n File \"/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py\", line 103, in run\n raise ex\nAtomExecutionFailedError: Atom Execution failed. Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster\n"
```

```
$ grep gluster-integration job.pretty.json
"message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl6.example.com"
"message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl2.example.com"
"message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl5.example.com"
"message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl1.example.com"
"message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl3.example.com"
"message": "Installing tendrl-gluster-integration on Node mbukatov-usm1-gl4.example.com"
"message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl6.example.com"
"message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl6.example.com"
"message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl3.example.com"
"message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl3.example.com"
"message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl5.example.com"
"message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl5.example.com"
"message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl2.example.com"
"message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl2.example.com"
"message": "Generating configuration for tendrl-gluster-integration on Node mbukatov-usm1-gl4.example.com"
"message": "Running tendrl-gluster-integration on Node mbukatov-usm1-gl4.example.com"
```

Note that mbukatov-usm1-gl1 (the node with the broken repository) is the
only node with no "Generating configuration" or "Running" message.

Expected results
================

Tendrl should report an error stating that the installation of the package
failed, with details indicating why (based on the yum error) when possible.
In this case: yum can't resolve the hostname of the RHGS WA repo.
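To make the expected behavior concrete, here is a minimal sketch of how a
package-install step could surface the yum error into the task messages.
This is not Tendrl's actual code; report_job_error is a hypothetical
callback, and the sketch is written for Python 2 to match the tracebacks
above:

```
import subprocess

def install_package(package, report_job_error):
    # Minimal sketch, not the actual node-agent implementation:
    # run yum and, on failure, forward yum's own stderr (e.g.
    # "Could not resolve host: download.example.com") so that the
    # task details page shows the root cause of the failed import.
    proc = subprocess.Popen(
        ["yum", "-y", "install", package],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    out, err = proc.communicate()
    if proc.returncode != 0:
        # report_job_error is a hypothetical callback that appends an
        # error message to the job's message list shown in the UI.
        report_job_error(
            "Installation of %s failed: %s" % (package, err.strip())
        )
        raise RuntimeError("yum install of %s failed" % package)
    return out
```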
Created attachment 1357365 [details]
response from ${TENDRL_SERVER}/api/1.0/jobs/${JOB_ID}/messages API call
Tested with: tendrl-node-agent-1.5.4-8.el7rhgs.noarch

There is now an error from which it is clear what happened and why the
import failed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3478