Description of problem:
=======================
While importing a 3 node cluster into a tendrl server already having a successfully imported (one) cluster, it failed with the message 'Could not find atom tendrl.objects.Cluster.atoms.ConfigureMonitoring'

Everything seemed to be going well with cluster import, until a missing 'service collectd running on node <>' message for one of the nodes of the 3-node-cluster. When checked on the backend, 'collectd' service is dead (inactive) on one of the nodes, but running on the other 2 nodes.

error Failed post-run: tendrl.objects.Cluster.atoms.ConfigureMonitoring for flow: Import existing Gluster Cluster 18 Nov 2017 06:59:23
info Running Flow monitoring.flows.NewClusterDashboard 18 Nov 2017 06:59:22
info Processing Job 90885685-5b3f-4fc2-9074-280294a47d57 18 Nov 2017 06:59:22
error Could not find atom tendrl.objects.Cluster.atoms.ConfigureMonitoring 18 Nov 2017 06:59:22
info Released lock (e6a30215-3261-475a-bcd9-78071d7ff3ae) for Node (dec9e261-e263-41c1-9324-fc9ced7d4d62) 18 Nov 2017 06:59:22
info Job (4b7d2892-fdce-41a2-8cbe-a34a5720c466): Finished Flow tendrl.flows.ImportCluster 18 Nov 2017 06:59:14
info Service collectd running on node dhcp42-243.lab.eng.blr.redhat.com 18 Nov 2017 06:59:14
info Job (85e39abf-c114-4f83-8cb5-0e44835a7f0d): Finished Flow tendrl.flows.ImportCluster 18 Nov 2017 06:59:14
info Service collectd running on node dhcp42-206.lab.eng.blr.redhat.com 18 Nov 2017 06:59:14

Screenshot of the tasks and var/log/messages have been copied to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>

Version-Release number of selected component (if applicable):
=============================================================
tendrl-grafana-plugins-1.5.4-3.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-node-agent-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-3.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-notifier-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.5.4-2.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-ui-1.5.4-2.el7rhgs.noarch

On storage node:
tendrl-collectd-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-2.el7rhgs.noarch
tendrl-node-agent-1.5.4-2.el7rhgs.noarch
tendrl-gluster-integration-1.5.4-2.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch

How reproducible:
=================
1:1

Additional info:
================
The setup is in the same state if it has to be looked at.
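
For anyone triaging a similar import, the missing 'Service collectd running on node' entry can be spotted mechanically from the task log. The following is only a rough sketch, not part of tendrl: the function names are made up for illustration, the log lines are the ones pasted in the description, and the third hostname is a hypothetical placeholder because the affected node's name is not given in this report.

```python
import re

# Task log excerpt copied from the description (newest entry first).
TASK_LOG = """\
info Job (4b7d2892-fdce-41a2-8cbe-a34a5720c466): Finished Flow tendrl.flows.ImportCluster 18 Nov 2017 06:59:14
info Service collectd running on node dhcp42-243.lab.eng.blr.redhat.com 18 Nov 2017 06:59:14
info Job (85e39abf-c114-4f83-8cb5-0e44835a7f0d): Finished Flow tendrl.flows.ImportCluster 18 Nov 2017 06:59:14
info Service collectd running on node dhcp42-206.lab.eng.blr.redhat.com 18 Nov 2017 06:59:14
"""

# Matches the informational message and captures the hostname that follows it.
RUNNING_RE = re.compile(r"Service collectd running on node (\S+)")


def nodes_reporting_collectd(log_text):
    """Return the set of hostnames that logged a 'collectd running' message."""
    return set(RUNNING_RE.findall(log_text))


def nodes_missing_collectd(expected_nodes, log_text):
    """Return expected hostnames with no 'collectd running' message, sorted."""
    return sorted(set(expected_nodes) - nodes_reporting_collectd(log_text))


# The first two hostnames come from the log above. The third is a
# HYPOTHETICAL placeholder: the real name of the affected node is not
# stated in this bug report.
EXPECTED_NODES = [
    "dhcp42-243.lab.eng.blr.redhat.com",
    "dhcp42-206.lab.eng.blr.redhat.com",
    "node3.example.com",
]

if __name__ == "__main__":
    for host in nodes_missing_collectd(EXPECTED_NODES, TASK_LOG):
        print("No 'collectd running' message for", host)
```

Any host this reports would still need to be checked on the backend (for example with `systemctl is-active collectd` on the node itself), since a missing log entry only hints that the service did not come up.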
Giving pm_ack and 3.3.z+ since both qa_ack and dev_ack are already given.
Validated the same on build tendrl-node-agent-1.5.4-5.el7rhgs.noarch. Cluster import succeeded without any issues, and the collectd service is up and running on every storage node. After discussing it with the developer (Nishanth): a fix that went into gluster/monitoring integration ended up fixing this issue as well. In other words, the issue described in this bugzilla is a one-off case that will not be easy to reproduce, and hence to verify. Moving the bug to its final state for RHGS 3.3.1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3478