Bug 1769868

Summary: OSP 16 | Service Assurance | qdr mesh patch broke qdr connector configuration. Seems like only puppet side of the change went to OSP16
Product: Red Hat OpenStack Reporter: Leonid Natapov <lnatapov>
Component: openstack-tripleo-heat-templates Assignee: Martin Magr <mmagr>
Status: CLOSED ERRATA QA Contact: Leonid Natapov <lnatapov>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 16.0 (Train) CC: amcleod, gregraka, jschluet, mburns, mmagr, mrunge, pkilambi, sclewis, scohen, shrjoshi
Target Milestone: z1 Keywords: Regression, Triaged
Target Release: 16.0 (Train on RHEL 8.1)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: puppet-tripleo-11.4.1-0.20200131.09fa984.el8ost openstack-tripleo-heat-templates-11.3.2-0.20200119215823.f90eb2c.el8ost Doc Type: Bug Fix
Doc Text:
Previously, the mesh network infrastructure was configured incorrectly for the message router, QDR, causing the AMQP-1.0 message bus on the Service Telemetry Framework (STF) client to malfunction. This fix corrects the configuration for the qdrouterd daemon on all overcloud nodes, and the STF client now functions properly.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-03-03 09:45:05 UTC Type: Bug

Description Leonid Natapov 2019-11-07 16:11:27 UTC
The qdr mesh patch broke the qdr connector configuration. It seems that only the puppet side of the change went into OSP16.

Currently the controller node thinks that it should operate as an edge node and tries to connect to itself.

2019-11-06 16:22:26.440406 +0000 SERVER (info) [2]: Connection to controller-0:5668 failed: proton:io Connection refused - disconnected controller-0:5668
2019-11-06 16:22:31.442049 +0000 SERVER (info) [3]: Connection to controller-0:5668 failed: proton:io Connection refused - disconnected controller-0:5668
2019-11-06 16:22:36.443500 +0000 SERVER (info) [4]: Connection to controller-0:5668 failed: proton:io Connection refused - disconnected controller-0:5668


Workaround: Add the ControllerExtraConfig option to the environment file.

ControllerExtraConfig:
        tripleo::profile::base::metrics::qdr::router_mode: interior
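
For context, a minimal sketch of how the workaround above might look as a complete environment file. It assumes the usual TripleO layout, in which role-specific hieradata overrides such as ControllerExtraConfig sit under parameter_defaults. The file name is hypothetical and not taken from this report.

```yaml
# qdr-workaround.yaml (hypothetical file name, for illustration only)
parameter_defaults:
  # Force the metrics QDR on controller nodes to run as an interior router
  # instead of an edge router that tries to connect to itself.
  ControllerExtraConfig:
    tripleo::profile::base::metrics::qdr::router_mode: interior
```

Such a file would typically be passed to the deploy command with an additional -e argument, after the other environment files.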

Comment 7 Leonid Natapov 2019-12-17 10:52:41 UTC
Failed QA. Moving back to ASSIGNED. We have to open the firewall for the internal ports.

Comment 17 Leonid Natapov 2020-02-20 05:22:55 UTC
Overcloud deployment in mesh mode fails; moving back to ASSIGNED.

<13>Feb 19 18:23:38 puppet-user: Error: Evaluation Error: Error while evaluating a Function Call, Value of interior_ip '192.168.24.16' is not member of interior_mesh_nodes ''. (file: /etc/puppet/modules/tripleo/manifests/profile/base/metrics/qdr.pp, line: 176, column: 9) on node controller-2.redhat.local", "+ rc=1", "+ '[' False = false ']'", "+ set -e", "+ '[' 1 -ne 2 -a 1 -ne 0 ']'", "+ exit 1", " attempt(s): 3", "2020-02-19 18:23:41,976 INFO: 52437 -- Removing container: container-puppet-metrics-qdr", "2020-02-19 18:23:42,207 WARNING: 52437 -- Retrying running container: metrics-qdr", "2020-02-19 18:23:42,207 ERROR: 52437 -- Failed running container for metrics-qdr"


puppet-tripleo-11.4.1-0.20200205150840.71ff36d.el8ost.noarch
openstack-tripleo-heat-templates-11.3.2-0.20200211065543.d3d6dc3.el8ost.noarch

Comment 18 Leonid Natapov 2020-02-21 00:14:37 UTC
Verified. The problem was that THT does not purge that hieradata during a stack update.

A clean deployment in mesh mode works.

Comment 20 errata-xmlrpc 2020-03-03 09:45:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0655