Bug 1715372 - Instance-HA fails to evacuate after admin changes their password.
Summary: Instance-HA fails to evacuate after admin changes their password.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z11
Target Release: 13.0 (Queens)
Assignee: Luca Miccini
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-30 08:30 UTC by Nick Satsia
Modified: 2023-12-15 16:31 UTC (History)

Fixed In Version: puppet-tripleo-8.5.1-6.el7ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1719528 (view as bug list)
Environment:
Last Closed: 2020-03-10 11:18:32 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-3288 | 0 | None | None | None | 2022-08-23 18:27:25 UTC

Description Nick Satsia 2019-05-30 08:30:06 UTC
Description of problem:
Instance-HA deployment uses the admin password to authenticate to nova and set evacuate on the impacted VMs. This works fine until the admin changes their password.

We should use a service account and its password for this action instead.


Version-Release number of selected component (if applicable):
RHOSP13z6

How reproducible:
Every time.

Steps to Reproduce:
1. Deploy overcloud with Instance-HA.
2. Enable fencing as per documentation.
3. Change the admin password via Horizon.
4. Test Instance HA.

Actual results:
Compute node is powered off.
VMs are not evacuated.

Expected results:
VMs evacuated

Additional info:

Comment 1 Sadique Puthen 2019-05-31 09:40:11 UTC
The question is: what is the best way to update the password in the "nova-evacuate" pcs resource when AdminPassword is changed? The following does not work:


  ExtraConfig:
    tripleo::instanceha::no_shared_storage: false
    AdminPassword: new_pass
    pacemaker::resource::bundle::deep_compare: true
    pacemaker::resource::ip::deep_compare: true
    pacemaker::resource::ocf::deep_compare: true

Comment 2 Michele Baldessari 2019-05-31 11:43:02 UTC
(In reply to Sadique Puthen from comment #1)
> The question is what is the best way to update the password in
> "nova-evacuate" pcs resource when AdminPassword is changed. Below does not
> work.
> 
> 
>   ExtraConfig:
>     tripleo::instanceha::no_shared_storage: false
>     AdminPassword: new_pass

Shouldn't AdminPassword be set outside of the ExtraConfig section, which sets hiera keys only? The hiera key we're interested in is keystone::admin_password, which gets its value from the AdminPassword THT parameter.
I.e. AdminPassword should be directly under the parameter_defaults section.

Did you try that as well and it did not work?

>     pacemaker::resource::bundle::deep_compare: true
>     pacemaker::resource::ip::deep_compare: true
>     pacemaker::resource::ocf::deep_compare: true

Comment 3 Michele Baldessari 2019-05-31 12:57:50 UTC
(In reply to Michele Baldessari from comment #2)

OK, so I tried this and it works only halfway (once AdminPassword is moved inside parameter_defaults:), because deep_compare is currently limited to resources and we also store the password in a stonith device, which does not get updated. So we will need to fix https://bugzilla.redhat.com/show_bug.cgi?id=1647478 before this works out of the box.

And I still think we should also explore using a service account for all this.

Comment 4 Sadique Puthen 2019-05-31 23:10:49 UTC
(In reply to Michele Baldessari from comment #3)

After moving AdminPassword under parameter_defaults, I got it working.

parameter_defaults:
  AdminPassword: Spu23487DI
  ComputeInstanceHACount: 2
  CephCount: 3

  ExtraConfig:
    tripleo::instanceha::no_shared_storage: false
    pacemaker::resource::bundle::deep_compare: true
    pacemaker::resource::ip::deep_compare: true
    pacemaker::resource::ocf::deep_compare: true

Before:

# pcs resource show nova-evacuate
 Resource: nova-evacuate (class=ocf provider=openstack type=NovaEvacuate)
  Attributes: auth_url=https://overcloud.redhat.local:13000 no_shared_storage=false password=BWDh8Ye7HVhmeud7DPs4WNpGE project_domain=Default tenant_name=admin user_domain=Default username=admin
  Operations: monitor interval=10 timeout=600 (nova-evacuate-monitor-interval-10)
              start interval=0s timeout=20 (nova-evacuate-start-interval-0s)
              stop interval=0s timeout=20 (nova-evacuate-stop-interval-0s)

After:

# pcs resource show nova-evacuate
 Resource: nova-evacuate (class=ocf provider=openstack type=NovaEvacuate)
  Attributes: auth_url=https://overcloud.redhat.local:13000 no_shared_storage=false password=Spu23487DI project_domain=Default tenant_name=admin user_domain=Default username=admin
  Operations: monitor interval=10 timeout=600 (nova-evacuate-monitor-interval-10)
              start interval=0s timeout=20 (nova-evacuate-start-interval-0s)
              stop interval=0s timeout=20 (nova-evacuate-stop-interval-0s)
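
The before/after comparison above can also be scripted. The following is only a minimal sketch: the helper name get_pcs_attr is hypothetical, and it assumes attribute values contain no spaces, as in the output shown here.

```shell
#!/bin/bash
# Hypothetical helper: print the value of one key=value attribute found in
# pcs "show" output read from stdin. Only the first match is printed.
# Assumption: attribute values do not contain whitespace.
get_pcs_attr() {
    local key="$1"
    awk -v k="$key" '{
        for (i = 1; i <= NF; i++) {
            # Match only fields that start with "<key>=".
            if (index($i, k "=") == 1) {
                print substr($i, length(k) + 2)
                exit
            }
        }
    }'
}
```

On a controller, the output of "pcs resource show nova-evacuate | get_pcs_attr password" could then be compared with the value returned by "hiera -c /etc/puppet/hiera.yaml keystone::admin_password" to confirm that the resource picked up the new password.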

I saw the following errors in pcs after the stack update:

Failed Actions:
* stonith-fence_compute-fence-nova_start_0 on controller-2 'unknown error' (1): call=155, status=Timed Out, exitreason='',
    last-rc-change='Fri May 31 17:01:46 2019', queued=0ms, exec=20009ms
* stonith-fence_compute-fence-nova_start_0 on controller-3 'unknown error' (1): call=155, status=Timed Out, exitreason='',
    last-rc-change='Fri May 31 17:02:15 2019', queued=1ms, exec=20012ms
* stonith-fence_compute-fence-nova_start_0 on controller-1 'unknown error' (1): call=161, status=Timed Out, exitreason='',
    last-rc-change='Fri May 31 17:01:14 2019', queued=0ms, exec=20018ms

Running pcs resource cleanup cleared them.

So it's working, and we support it, right?

Comment 10 Luca Miccini 2019-10-21 07:49:25 UTC
Summary/recap.

For this to work out of the box we need:

1. deep_compare for stonith resources (tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1647478)
2. admin password to be changed via director or passed as parameters as in https://bugzilla.redhat.com/show_bug.cgi?id=1715372#c4

Until https://bugzilla.redhat.com/show_bug.cgi?id=1647478 is fixed, a potential workaround is to use a postconfig template and script similar to the following:

~~~ postconfig.yaml:

heat_template_version: 2014-10-16

description: >
  Update fence_compute-fence-nova

parameters:
  servers:
    type: json
  DeployIdentifier:
    type: string

resources:
  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      inputs:
        - name: deploy_identifier
      config: {get_file: ./postconfig.sh}

  ExtraDeployments:
    type: OS::Heat::SoftwareDeployments
    properties:
      servers:  {get_param: servers}
      config: {get_resource: ExtraConfig}
      actions: ['CREATE', 'UPDATE']
      input_values:
        deploy_identifier: {get_param: DeployIdentifier}


~~~ postconfig.sh:

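#!/bin/bash
# Run the update from a single controller only; the cluster configuration is shared.
case $(hostname) in
    <CONTROLLER_0_HOSTNAME>*)
        # Read the current admin password from hiera.
        NEWPWD=$(hiera -c /etc/puppet/hiera.yaml "keystone::admin_password")
        # Quote the value so passwords with spaces or glob characters are passed intact.
        pcs stonith update stonith-fence_compute-fence-nova passwd="$NEWPWD"
    ;;
    *)
    ;;
esac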

~~~ 

0. Add the AdminPassword parameter to any existing templates (see above).
1. Swap <CONTROLLER_0_HOSTNAME> for the hostname of the controller that should update the resource, keeping the '*' at the end.
2. Make sure the path to the script (config: {get_file: ./postconfig.sh}) is correct.
3. Add postconfig.yaml to the list of templates in the deploy command.
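
A slightly more defensive variant of the script body might look like the sketch below. It is not part of the workaround above: the function name update_fence_password is hypothetical, and the check for the literal string nil assumes that the hiera command-line tool prints nil when a key is missing.

```shell
#!/bin/bash
# Hypothetical helper: update the passwd attribute of a stonith device from
# hiera, but leave the device untouched when no password can be looked up.
update_fence_password() {
    local device="$1" newpwd
    newpwd=$(hiera -c /etc/puppet/hiera.yaml "keystone::admin_password") || return 1
    # Assumption: the hiera CLI prints "nil" for a key it cannot find.
    if [ -z "$newpwd" ] || [ "$newpwd" = "nil" ]; then
        echo "keystone::admin_password not found in hiera; ${device} left unchanged" >&2
        return 1
    fi
    pcs stonith update "$device" passwd="$newpwd"
}
```

Inside the <CONTROLLER_0_HOSTNAME>*) branch of postconfig.sh, the call would be update_fence_password stonith-fence_compute-fence-nova. The point of the guard is to avoid overwriting a working password with an empty or placeholder value if the hiera lookup fails.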


There is another RFE to allow IHA to use a different user: https://bugzilla.redhat.com/show_bug.cgi?id=1719528 (using the nova service user by default).

I am closing this as a duplicate of 1647478. I suggest tracking 1719528 too, as using a service user will be the preferred way forward.

Regards
Luca

*** This bug has been marked as a duplicate of bug 1647478 ***

Comment 14 Luca Miccini 2019-11-22 10:21:14 UTC
We are missing https://opendev.org/openstack/puppet-tripleo/commit/0f27a41b7bf053a4ee55e0de2615fb7d6a053d61 (introduces deep_compare) downstream.
I am working on the backport.

Comment 24 errata-xmlrpc 2020-03-10 11:18:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0760

