Bug 1950533
| Summary: | haproxy validator should not fail on compute nodes | | |
| --- | --- | --- | --- |
| Product: | Red Hat OpenStack | Reporter: | Chris Fields <cfields> |
| Component: | validations-common | Assignee: | Michele Baldessari <michele> |
| Status: | CLOSED ERRATA | QA Contact: | nlevinki <nlevinki> |
| Severity: | low | Priority: | low |
| Version: | 16.2 (Train) | CC: | bperkins, chjones, gchamoul, jjoyce, jpodivin, jschluet, lmiccini, michele, slinaber, tvignaud, uemit.seren |
| Target Milestone: | z2 | Keywords: | Triaged |
| Target Release: | 16.2 (Train on RHEL 8.4) | Type: | Bug |
| Hardware: | All | OS: | Linux |
| Fixed In Version: | openstack-tripleo-validations-11.6.1-2.20210713004808.f46d2bb.el8ost, validations-common-1.1.2-2.20210721144807.92f51ea.el8ost | Last Closed: | 2022-03-23 22:10:08 UTC |
**Description** (Chris Fields, 2021-04-16 19:43:33 UTC)
fwiw, `validator run` supports `--limit`, so operators can decide on which nodes to run this (useful for composable or non-standard roles as well):

```
$ openstack tripleo validator run --validation haproxy --limit Controller
Running Validations without Overcloud settings.
+--------------------------------------+-------------+--------+------------+------------------------------------------+-------------------+-------------+
| UUID                                 | Validations | Status | Host_Group | Status_by_Host                           | Unreachable_Hosts | Duration    |
+--------------------------------------+-------------+--------+------------+------------------------------------------+-------------------+-------------+
| fe3b1444-2bc5-47a9-8205-c528e48b71a7 | haproxy     | PASSED | all        | controller-0, controller-1, controller-2 |                   | 0:00:01.874 |
+--------------------------------------+-------------+--------+------------+------------------------------------------+-------------------+-------------+
```

@lmiccini The haproxy validation is hosted in validations-common and should be generic (not tripleo-centric). Moreover, the playbook targets all nodes by default [1], whereas it should target only the haproxy group from the inventory. The work to do here is:

- Make the haproxy validation more generic by removing the tripleo references [2] and using the traditional haproxy.cfg file path (not the one used in a tripleo deployment).
- Create a new validation called tripleo-haproxy in tripleo-validations that calls the haproxy role from validations-common with the haproxy.cfg path used by tripleo.

[1] https://opendev.org/openstack/validations-common/src/branch/master/validations_common/playbooks/haproxy.yaml#L2
[2] https://opendev.org/openstack/validations-common/src/branch/master/validations_common/playbooks/haproxy.yaml#L9

What is the prevailing behaviour for validations? There must be many of them that can only run successfully on a subset of machines in an OSP deployment.
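The split described above could be sketched as two playbooks sharing one role. This is only a sketch: the `haproxy_conf_path` variable name and the tripleo container config-data path are assumptions for illustration, not the actual validations-common interface.

```yaml
# validations-common: generic playbook, traditional haproxy.cfg path.
# (haproxy_conf_path is a hypothetical variable name.)
- hosts: all
  vars:
    haproxy_conf_path: /etc/haproxy/haproxy.cfg
  roles:
    - haproxy

# tripleo-validations: a new tripleo-haproxy playbook reusing the same
# role, self-filtered to controllers and pointing at the path a tripleo
# deployment uses (assumed here to be the container config-data copy).
- hosts: "{{ controller_rolename | default('Controller') }}"
  vars:
    haproxy_conf_path: /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg
  roles:
    - haproxy
```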
If this haproxy validation is the outlier and most of them self-filter, then I would agree we should modify it to behave more like the others; but if the common approach is to expect the operator, or the organisation of the validations, to dictate which ones run where, then I would suggest we behave in that way.

It's worth noting that PIDONE is not the only team that uses HAProxy; it is also used by the Network team for Octavia, so if this validation does become self-filtering, it would also need a way to identify whether it is running against an Octavia node. It's further worth noting that if validations are expected to self-filter based on roles, that is potentially quite fragile when customers have custom roles with names we can't predict (I'm assuming here that validations would self-filter based on the role name of a machine, rather than some more precise indicator).

Other validations are limited in scope; for example, rabbitmq-limits only runs on controllers:

```
(undercloud) [stack@undercloud-0 ~]$ openstack tripleo validator run --validation rabbitmq-limits
Running Validations without Overcloud settings.
+--------------------------------------+-----------------+--------+------------+------------------------------------------------------------------------+-------------------+
| UUID                                 | Validations     | Status | Host_Group | Status_by_Host                                                         | Unreachable_Hosts |
+--------------------------------------+-----------------+--------+------------+------------------------------------------------------------------------+-------------------+
| bd29b3df-5edf-42e6-a59e-48391bc7c3d0 | rabbitmq-limits | FAILED | Controller | overcloud-controller-0, overcloud-controller-1, overcloud-controller-2 |                   |
+--------------------------------------+-----------------+--------+------------+------------------------------------------------------------------------+-------------------+
```

One of the issues I see with not self-limiting is that the group validations become less usable. For example, if you want to run `--group post-deployment`, you have no way to tell it to run some validators only on controllers. In that case you are guaranteed to fail on the haproxy validator.

(In reply to Chris Fields from comment #6)

Hi Chris,

The haproxy validation was self-filtered [1] before I moved it to validations-common. That was a mistake, and the validation should be fixed to get the hosts key back to:

    - hosts: "{{ controller_rolename | default('Controller') }}"

instead of:

    - hosts: all

[1] https://opendev.org/openstack/tripleo-validations/src/commit/ec0465e481234da62d1ba673b4432e44c930f630/playbooks/haproxy.yaml

(In reply to Gaël Chamoulaud from comment #7)

Hi Gaël, a question.
Is there a way in validations to express "the nodes which have the service XYZ configured/installed"? That way this (and other) validations would just work with any composable roles. A bit like the *_node_names hiera keys do today:

```
[root@ctrl-2-0 hieradata]# hiera -c /etc/puppet/hiera.yaml haproxy_short_node_names
["ctrl-1-0", "ctrl-2-0", "ctrl-3-0"]
```

Is that doable today within the validation framework?

cheers,
Michele

(In reply to Michele Baldessari from comment #8)

Hi Michele,

We actually have this through the inventory (from the undercloud):

```
$ tripleo-ansible-inventory --list | jq .
```

In the validation roles, you can also query the hiera db through our custom Ansible module called hiera, to find out whether a specific service is enabled. This is especially interesting for optional services such as telemetry or cloudops. So yes, it is doable.
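As an illustration of the hiera-based approach mentioned above, a task list along these lines could gate a validation on the service actually being deployed on the node. This is a sketch only: the exact argument and return-value names of the custom hiera module are assumptions.

```yaml
# Sketch: run haproxy checks only on nodes in the haproxy node list.
# Assumes the custom `hiera` Ansible module accepts a `name` argument
# and returns the looked-up value as `value` (both are assumptions).
- name: Look up the haproxy node list from hiera
  hiera:
    name: haproxy_short_node_names
  register: haproxy_nodes

- name: Run the haproxy validation role only where haproxy is deployed
  include_role:
    name: haproxy
  when: ansible_hostname in (haproxy_nodes.value | default([]))
```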
*** Bug 2005904 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1001