Description of problem:
Instance-HA deployment uses the admin password to authenticate to nova and set evacuate on the impacted VMs. This works fine until the admin changes their password. We should be using a service id and password for achieving this action.

Version-Release number of selected component (if applicable):
RHOSP13z6

How reproducible:
Every time.

Steps to Reproduce:
1. Deploy overcloud with Instance-HA.
2. Enable fencing as per documentation.
3. Change the admin password via Horizon.
4. Test Instance HA.

Actual results:
Compute node is powered off. VMs are not evacuated.

Expected results:
VMs evacuated

Additional info:
The question is: what is the best way to update the password in the "nova-evacuate" pcs resource when AdminPassword is changed? The configuration below does not work.

ExtraConfig:
  tripleo::instanceha::no_shared_storage: false
  AdminPassword: new_pass
  pacemaker::resource::bundle::deep_compare: true
  pacemaker::resource::ip::deep_compare: true
  pacemaker::resource::ocf::deep_compare: true
(In reply to Sadique Puthen from comment #1)
> The question is what is the best way to update the password in
> "nova-evacuate" pcs resource when AdminPassword is changed. Below does not
> work.
>
> ExtraConfig:
>   tripleo::instanceha::no_shared_storage: false
>   AdminPassword: new_pass

Shouldn't AdminPassword be set outside of the ExtraConfig section, which sets the hiera keys only? The hiera key we're interested in is keystone::admin_password, which is set from the AdminPassword THT parameter. That is, AdminPassword should be directly under the parameter_defaults section.

Did you try that as well, and did it not work?

>   pacemaker::resource::bundle::deep_compare: true
>   pacemaker::resource::ip::deep_compare: true
>   pacemaker::resource::ocf::deep_compare: true
(In reply to Michele Baldessari from comment #2)
> I.e. AdminPassword should be directly under the parameter_defaults section.
>
> Did you try that as well and it did not work?

OK, so I tried this, and it only works halfway (once AdminPassword is moved under parameter_defaults:). That is because deep_compare is currently limited to resources, and we also store the password in a stonith device, which does not get updated. So we will need to fix https://bugzilla.redhat.com/show_bug.cgi?id=1647478 before this will work out of the box.

And I still think we should also explore using a service account for all this.
(In reply to Michele Baldessari from comment #3)
> Ok so I tried this and this works only half way (once AdminPassword is moved
> inside parameter_defaults:). Because deep_compare is currently limited to
> resources and we store the password also in a stonith device which does not
> get updated. So we will need to fix
> https://bugzilla.redhat.com/show_bug.cgi?id=1647478 before this will work
> out of the box.
>
> And I still think we should also explore using a service account for all
> this.

With the following, moving AdminPassword under parameter_defaults, I got it working:
parameter_defaults:
  AdminPassword: Spu23487DI
  ComputeInstanceHACount: 2
  CephCount: 3
  ExtraConfig:
    tripleo::instanceha::no_shared_storage: false
    pacemaker::resource::bundle::deep_compare: true
    pacemaker::resource::ip::deep_compare: true
    pacemaker::resource::ocf::deep_compare: true

Before:

# pcs resource show nova-evacuate
 Resource: nova-evacuate (class=ocf provider=openstack type=NovaEvacuate)
  Attributes: auth_url=https://overcloud.redhat.local:13000 no_shared_storage=false password=BWDh8Ye7HVhmeud7DPs4WNpGE project_domain=Default tenant_name=admin user_domain=Default username=admin
  Operations: monitor interval=10 timeout=600 (nova-evacuate-monitor-interval-10)
              start interval=0s timeout=20 (nova-evacuate-start-interval-0s)
              stop interval=0s timeout=20 (nova-evacuate-stop-interval-0s)

After:

# pcs resource show nova-evacuate
 Resource: nova-evacuate (class=ocf provider=openstack type=NovaEvacuate)
  Attributes: auth_url=https://overcloud.redhat.local:13000 no_shared_storage=false password=Spu23487DI project_domain=Default tenant_name=admin user_domain=Default username=admin
  Operations: monitor interval=10 timeout=600 (nova-evacuate-monitor-interval-10)
              start interval=0s timeout=20 (nova-evacuate-start-interval-0s)
              stop interval=0s timeout=20 (nova-evacuate-stop-interval-0s)

I had the errors below in pcs after the stack update.

Failed Actions:
* stonith-fence_compute-fence-nova_start_0 on controller-2 'unknown error' (1): call=155, status=Timed Out, exitreason='',
    last-rc-change='Fri May 31 17:01:46 2019', queued=0ms, exec=20009ms
* stonith-fence_compute-fence-nova_start_0 on controller-3 'unknown error' (1): call=155, status=Timed Out, exitreason='',
    last-rc-change='Fri May 31 17:02:15 2019', queued=1ms, exec=20012ms
* stonith-fence_compute-fence-nova_start_0 on controller-1 'unknown error' (1): call=161, status=Timed Out, exitreason='',
    last-rc-change='Fri May 31 17:01:14 2019', queued=0ms, exec=20018ms

pcs resource cleanup cleaned it up.

So it's working, and we support it, right?
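The before/after comparison above can also be scripted rather than read by eye. The following is only a minimal sketch and not part of the original report: `get_attr` is a hypothetical helper name, and it assumes that attribute values contain no whitespace, as in the output shown above.

```shell
#!/bin/bash
# get_attr KEY: read `pcs resource show <id>` output on stdin and print the
# value of the first KEY=value token found.
# Hypothetical helper; assumes values contain no whitespace.
get_attr() {
    local key="$1"
    # Put every space-separated token on its own line, then keep only the
    # token that starts with "KEY=" and strip that prefix.
    tr -s ' ' '\n' | sed -n "s/^${key}=//p" | head -n 1
}
```

For example, `pcs resource show nova-evacuate | get_attr password` would print the password currently stored in the resource, which can then be compared against the value of keystone::admin_password in hiera.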
Summary/recap. For this to work out of the box we need:

1. deep_compare for stonith resources (tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1647478)
2. the admin password to be changed via director or passed as a parameter, as in https://bugzilla.redhat.com/show_bug.cgi?id=1715372#c4

Until https://bugzilla.redhat.com/show_bug.cgi?id=1647478 is fixed, a potential workaround is to use a postconfig template and script similar to the following:

postconfig.yaml:
~~~
heat_template_version: 2014-10-16

description: >
  Update fence_compute-fence-nova

parameters:
  servers:
    type: json
  DeployIdentifier:
    type: string

resources:
  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      inputs:
        - name: deploy_identifier
      config: {get_file: ./postconfig.sh}

  ExtraDeployments:
    type: OS::Heat::SoftwareDeployments
    properties:
      servers: {get_param: servers}
      config: {get_resource: ExtraConfig}
      actions: ['CREATE', 'UPDATE']
      input_values:
        deploy_identifier: {get_param: DeployIdentifier}
~~~

postconfig.sh:
~~~
#!/bin/bash

# Run the update on a single controller only; every other node does nothing.
case $(hostname) in
  <CONTROLLER_0_HOSTNAME>*)
    # Read the current admin password from hiera and push it to the stonith device.
    NEWPWD=$(hiera -c /etc/puppet/hiera.yaml "keystone::admin_password")
    # Quote the value so that special characters in the password survive.
    pcs stonith update stonith-fence_compute-fence-nova passwd="$NEWPWD"
    ;;
  *)
    ;;
esac
~~~

0. Add the AdminPassword parameter to any existing templates, see above.
1. Swap <CONTROLLER_0_HOSTNAME> for the hostname of the controller that should be used to update the resource (keeping the '*' at the end).
2. Make sure that the path to the script (config: {get_file: ./postconfig.sh}) is correct.
3. Add postconfig.yaml to the list of templates in the deploy command.

There is another RFE to allow IHA to use a different user: https://bugzilla.redhat.com/show_bug.cgi?id=1719528 (using the nova service user by default).

I am closing this as a duplicate of 1647478. I suggest you track 1719528 too, as using a service user will be the preferred way forward.

Regards
Luca

*** This bug has been marked as a duplicate of bug 1647478 ***
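The update step in the workaround script above could be hardened a little. The sketch below is an illustration only, not the script that was tested in this bug: `update_fence_password` and the `PCS` override variable are hypothetical names, and the check for "nil" assumes that a failed hiera lookup prints either nothing or the string nil. The stonith device name and the passwd option are taken from the script above.

```shell
#!/bin/bash
# Hypothetical hardening of the postconfig.sh workaround.
# PCS may be overridden (for example PCS=echo) to print the command instead
# of running it; by default the real pcs binary is used.
PCS="${PCS:-pcs}"

# update_fence_password NEWPWD: update the stonith device only when the
# looked-up password is usable.
update_fence_password() {
    local newpwd="$1"
    # Assumption: a failed hiera lookup yields an empty string or "nil".
    if [ -z "$newpwd" ] || [ "$newpwd" = "nil" ]; then
        echo "no admin password found, leaving stonith device unchanged" >&2
        return 1
    fi
    # Quote the value so special characters are passed through intact.
    "$PCS" stonith update stonith-fence_compute-fence-nova "passwd=${newpwd}"
}
```

Inside the controller branch of the case statement it could be called as `update_fence_password "$(hiera -c /etc/puppet/hiera.yaml keystone::admin_password)"`. Refusing an empty or missing value avoids overwriting a working stonith password when the lookup fails.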
We are missing https://opendev.org/openstack/puppet-tripleo/commit/0f27a41b7bf053a4ee55e0de2615fb7d6a053d61 (introduces deep_compare) downstream. I am working on the backport.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0760