Bug 1761546

Summary: [RHOSP 15] Redeployment of Overcloud Fails with "KeyError: 'passwords'" After Cancelling STACK_UPDATE Early
Product: Red Hat OpenStack
Reporter: Luke Short <lshort>
Component: python-tripleoclient
Assignee: Luke Short <lshort>
Status: CLOSED ERRATA
QA Contact: Sasha Smolyak <ssmolyak>
Severity: medium
Docs Contact:
Priority: medium
Version: 15.0 (Stein)
CC: hbrock, jslagle, mburns
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: DFG:DF
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-05 12:00:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Luke Short 2019-10-14 16:04:30 UTC
Description copied from original BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1753251

===============

Description of problem:

This is a very common issue: if the deployment is canceled for any reason (for example, with Ctrl+C), the next redeployment fails with the error below.

~~~
HTTP POST http://10.151.5.50:8989/v2/action_executions 201
Starting new HTTP connection (1): 10.151.5.50
http://10.151.5.50:8080 "GET /v1/AUTH_9ee3af68b5ec4bbd9fbfcfea344c67f2/overcloud/plan-environment.yaml HTTP/1.1" 200 233
'passwords'
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 400, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/tripleoclient/command.py", line 25, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python2.7/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 184, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python2.7/site-packages/tripleoclient/v1/overcloud_deploy.py", line 973, in take_action
    self._deploy_tripleo_heat_templates_tmpdir(stack, parsed_args)
  File "/usr/lib/python2.7/site-packages/tripleoclient/v1/overcloud_deploy.py", line 428, in _deploy_tripleo_heat_templates_tmpdir
    new_tht_root, tht_root)
  File "/usr/lib/python2.7/site-packages/tripleoclient/v1/overcloud_deploy.py", line 453, in _deploy_tripleo_heat_templates
    type(self)._keep_env_on_update)
  File "/usr/lib/python2.7/site-packages/tripleoclient/workflows/plan_management.py", line 170, in update_plan_from_templates
    passwords = _load_passwords(swift_client, name)
  File "/usr/lib/python2.7/site-packages/tripleoclient/workflows/plan_management.py", line 256, in _load_passwords
    return plan_env['passwords']
KeyError: 'passwords'
~~~
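The traceback ends in a plain dictionary lookup, `plan_env['passwords']`, so the failure is consistent with a plan-environment.yaml that was left without a `passwords` section. The following is a minimal sketch of that failure mode and of a check for missing sections. It is not the tripleoclient code or the fix shipped in the erratum: `missing_plan_sections`, `REQUIRED_SECTIONS`, the sample dictionary contents, and the `parameter_defaults` key name are assumptions made for illustration.

```python
# Hypothetical helper for illustration only; not part of python-tripleoclient.
# The section names checked here are assumptions based on this report.
REQUIRED_SECTIONS = ("passwords", "parameter_defaults")


def missing_plan_sections(plan_env, required=REQUIRED_SECTIONS):
    """Return the required top-level keys absent from a parsed plan environment."""
    if not isinstance(plan_env, dict):
        # An empty or unparsable file is treated as missing every section.
        return list(required)
    return [key for key in required if key not in plan_env]


# Stand-in for a parsed plan-environment.yaml that lost its 'passwords'
# section (contents assumed for illustration).
truncated_plan_env = {"name": "overcloud", "parameter_defaults": {}}

try:
    # The same kind of lookup as the last frame of the traceback.
    truncated_plan_env["passwords"]
except KeyError as exc:
    error_text = "KeyError: {}".format(exc)

print(error_text)
print(missing_plan_sections(truncated_plan_env))
```

A check of this kind, run against the parsed file before redeploying, would report which sections are absent instead of failing on the first missing key.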

I have tried a few workarounds, but none of them worked:

1) Deleting the swift container:
Following the KCS article [1] and bugzilla [2], we deleted the swift containers, but the redeployment then failed with the issue described in bugzilla [3].
[1] https://access.redhat.com/solutions/3714651
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1622725
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1638363

2) Tried upgrading the undercloud, but this did not change anything in the plans.

3) Using the command below, we were able to populate the passwords section in plan-environment.yaml, but the parameter_defaults section is still missing, so the deployment still failed:

~~~
openstack workflow execution create tripleo.plan_management.v1.update_deployment_plan '{"container": "overcloud"}'
~~~



Version-Release number of selected component (if applicable):


How reproducible:
Run the deployment command again and, when it starts updating the plan, press Ctrl+C.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:
We need steps to regenerate plan-environment.yaml, or whatever else was changed after pressing Ctrl+C.

Additional info:
We also tried updating the plan, but it did not help.
All of these workarounds were tried in our lab environment before being suggested to the customer.
Deleting the overcloud is not possible because VMs are running in the environment.
If any data is required, please let me know.

Comment 5 errata-xmlrpc 2020-03-05 12:00:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0643