Bug 1572257

Summary: openstack undercloud upgrade fails when overcloud has Failed state
Product: Red Hat OpenStack
Reporter: Sergii Golovatiuk <sgolovat>
Component: instack-undercloud
Assignee: Rabi Mishra <ramishra>
Status: CLOSED NOTABUG
QA Contact: Yurii Prokulevych <yprokule>
Severity: medium
Priority: medium
Version: 13.0 (Queens)
CC: dbecker, dmacpher, joflynn, jschluet, jstransk, mburns, morazi, nchandek, ramishra, slinaber, therve, vcojot, zbitter
Target Milestone: z2
Keywords: Rebase, Reopened, Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: All
OS: All
Fixed In Version: instack-undercloud-8.4.3-2.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, the Red Hat OpenStack undercloud upgrade failed when the overcloud was in a Failed state. It failed very late, with a cryptic error, while trying to migrate the overcloud stack to the convergence architecture in the post-configuration step of the upgrade process. Now the upgrade fails fast: the user receives an error at the beginning of the undercloud upgrade, and the upgrade does not proceed. The user must ensure that the overcloud is in a *_COMPLETE state before proceeding with the undercloud upgrade.
Last Closed: 2018-12-02 14:09:21 UTC
Type: Bug

Description Sergii Golovatiuk 2018-04-26 14:23:35 UTC
Description of problem:

openstack undercloud upgrade

fails when the overcloud stack is in a Failed state, with the error shown under "Actual results".

How reproducible:
All the time

Steps to Reproduce:
1. Install RHOSP 12.
2. Deploy the overcloud.
3. Put the overcloud stack into a Failed state.
4. Run: openstack undercloud upgrade

Actual results:

#############
TypeError: 'NoneType' object has no attribute '__getitem__'


2018-04-26 10:14:57,328 DEBUG: An exception occurred
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2336, in install
    _post_config(instack_env, upgrade)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2042, in _post_config
    _migrate_to_convergence(heat)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1982, in _migrate_to_convergence
    _run_command(args, name='heat-manage')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 642, in _run_command
    env=env).decode('utf-8')
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['sudo', '-E', 'heat-manage', 'migrate_convergence_1', u'1e0047ea-0462-43c9-a7a3-aea13aa49789']' returned non-zero exit status 1
2018-04-26 10:14:57,329 ERROR:
#############################################################################
Undercloud upgrade failed.

Reason: Command '['sudo', '-E', 'heat-manage', 'migrate_convergence_1', u'1e0047ea-0462-43c9-a7a3-aea13aa49789']' returned non-zero exit status 1

See the previous output for details about what went wrong.  The full install
log can be found at /home/stack/.instack/install-undercloud.log.

#############################################################################

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2336, in install
    _post_config(instack_env, upgrade)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2042, in _post_config
    _migrate_to_convergence(heat)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1982, in _migrate_to_convergence
    _run_command(args, name='heat-manage')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 642, in _run_command
    env=env).decode('utf-8')
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['sudo', '-E', 'heat-manage', 'migrate_convergence_1', u'1e0047ea-0462-43c9-a7a3-aea13aa49789']' returned non-zero exit status 1
Command 'instack-upgrade-undercloud' returned non-zero exit status 1
###############


undercloud upgrade failed.
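
The final error above reports only the exit status of heat-manage, while the underlying reason ends up elsewhere in the log. As an illustration only (this is not the instack-undercloud code, and run_command here is a hypothetical helper), a wrapper could attach the command's own output to the exception it raises:

```python
import subprocess
import sys


def run_command(args):
    """Run a command and return its decoded output.

    On failure, raise RuntimeError that carries the command's combined
    stdout/stderr, so the root cause is not hidden behind the exit status.
    """
    try:
        output = subprocess.check_output(args, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as exc:
        details = exc.output.decode('utf-8', errors='replace').strip()
        raise RuntimeError(
            'Command %r failed with exit status %d:\n%s'
            % (args, exc.returncode, details))
    return output.decode('utf-8')


# Hypothetical stand-in for a failing heat-manage call: a child Python
# process that writes to stderr and exits with status 1.
failing = [sys.executable, '-c',
           'import sys; sys.stderr.write("boom\\n"); sys.exit(1)']
try:
    run_command(failing)
except RuntimeError as err:
    message = str(err)
```

With this shape, the text written by the failing command appears in the same message as the exit status.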

Expected results:
Detect the state of the overcloud and refuse to run the upgrade. Ask the operator to fix the overcloud stack first.
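
A rough sketch of how an operator might detect this by hand before upgrading. It assumes the stack list is available as JSON shaped like the output of `openstack stack list -f json`, with "Stack Name" and "Stack Status" keys; those key names and the helper name are assumptions, not taken from this report:

```python
import json


def stacks_not_complete(stack_list_json):
    """Return (name, status) pairs for stacks whose status is not *_COMPLETE."""
    stacks = json.loads(stack_list_json)
    return [(s['Stack Name'], s['Stack Status'])
            for s in stacks
            if not s['Stack Status'].endswith('_COMPLETE')]


# Sample input; assumed to resemble the JSON stack listing on the undercloud.
sample = json.dumps([
    {'Stack Name': 'overcloud', 'Stack Status': 'UPDATE_FAILED'},
    {'Stack Name': 'other', 'Stack Status': 'CREATE_COMPLETE'},
])
blocked = stacks_not_complete(sample)
```

Any stack returned by this check would need to be brought back to a *_COMPLETE state before the undercloud upgrade is started.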

Additional info:

Comment 1 Sergii Golovatiuk 2018-04-26 14:28:33 UTC
*** Bug 1572259 has been marked as a duplicate of this bug. ***

Comment 2 Jiri Stransky 2018-04-27 14:06:04 UTC
Additional context: we hit this during FFWD upgrade testing. I'm not sure if we should be saying "all Heat stacks need to be *_COMPLETE rather than *_FAILED before we upgrade Heat" or if this is something that we need to fix.

Comment 4 Zane Bitter 2018-05-03 14:16:45 UTC
(In reply to Jiri Stransky from comment #2)
> Additional context: we hit this during FFWD upgrade testing. I'm not sure if
> we should be saying "all Heat stacks need to be *_COMPLETE rather than
> *_FAILED before we upgrade Heat" or if this is something that we need to fix.

Yes, we should say that, because we decided we wanted all overcloud stacks migrated to convergence in Queens, and that can only happen for stacks that are in a COMPLETE state.

Comment 5 Jiri Stransky 2018-05-04 12:38:38 UTC
Thanks, so sounds like we should transform it into a Doc BZ?

(CCing Dan)

Comment 6 Rabi Mishra 2018-05-09 06:43:18 UTC
> Thanks, so sounds like we should transform it into a Doc BZ?

We were trying to migrate the FAILED stacks, which would be fixed by https://review.openstack.org/#/c/566225/.

Though it seems a little weird that the undercloud upgrade would fail because an overcloud is in a FAILED state. If that's something we want, then we should fail fast rather than in post config[1].

[1] https://github.com/openstack/instack-undercloud/blob/master/instack_undercloud/undercloud.py#L2042

Comment 7 Zane Bitter 2018-05-10 14:19:43 UTC
(In reply to Rabi Mishra from comment #6)
> Though it seems a little weird that the undercloud upgrade would fail
> because an overcloud is in a FAILED state. If that's something we want,
> then we should fail fast rather than in post config[1].
> 
> [1] https://github.com/openstack/instack-undercloud/blob/master/instack_undercloud/undercloud.py#L2042

Agreed, it would be much better if we didn't let people start the undercloud upgrade until the overcloud was in a COMPLETE state. I'm pretty sure this is the procedure we follow in the field in practice, but customers who are updating on their own may not always be aware of it.
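
To make the fail-fast ordering concrete, here is a minimal sketch under stated assumptions: the exception class, function names, and step list are all hypothetical and do not reflect the actual instack-undercloud patch. The point is only that the stack-state check runs before any upgrade step, not in post config:

```python
class OvercloudNotReady(Exception):
    """Raised when the overcloud stack is not in a *_COMPLETE state."""


def check_overcloud_ready(stack_status):
    # Heat reports statuses such as CREATE_COMPLETE or UPDATE_FAILED.
    if not stack_status.endswith('_COMPLETE'):
        raise OvercloudNotReady(
            'Can not upgrade undercloud with %s overcloud'
            % stack_status.rsplit('_', 1)[-1])


def upgrade(stack_status, steps):
    """Run upgrade steps only after the pre-flight check has passed."""
    check_overcloud_ready(stack_status)  # fail fast, before any step runs
    executed = []
    for step in steps:
        step()
        executed.append(step.__name__)
    return executed


calls = []


def install_packages():
    # Hypothetical upgrade step; records that it was invoked.
    calls.append('install_packages')


try:
    upgrade('UPDATE_FAILED', [install_packages])
    error = None
except OvercloudNotReady as exc:
    error = str(exc)
```

In this sketch a FAILED overcloud stops the upgrade before install_packages is ever called, so nothing on the undercloud has been modified when the operator sees the error.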

Comment 10 Rabi Mishra 2018-07-10 06:58:26 UTC
*** Bug 1593741 has been marked as a duplicate of this bug. ***

Comment 21 Joanne O'Flynn 2018-08-15 07:39:24 UTC
This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible.

If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".


To add draft documentation text:

* Select the documentation type from the "Doc Type" drop down field.

* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

Comment 22 Yurii Prokulevych 2018-08-15 09:09:40 UTC
Verified with instack-undercloud-8.4.3-3.el7ost.noarch:

2018-08-15 05:03:21,460 ERROR: Can not upgrade undercloud with FAILED overcloud

Comment 24 errata-xmlrpc 2018-08-29 16:35:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2574