Bug 1485418 - [RFE] NFV specific Fast Forward Upgrade RHOSP10 to RHOSP13
Summary: [RFE] NFV specific Fast Forward Upgrade RHOSP10 to RHOSP13
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Yolanda Robla
QA Contact: Sanjay Upadhyay
URL:
Whiteboard:
Duplicates: 1502793 (view as bug list)
Depends On:
Blocks: 1560721
 
Reported: 2017-08-25 15:54 UTC by Maria Bracho
Modified: 2020-02-25 14:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-25 14:35:18 UTC
Target Upstream Version:
Embargoed:



Description Maria Bracho 2017-08-25 15:54:11 UTC
Description of problem:
Leverage NFV-specific procedures to perform skip-level upgrades from RHOSP10 to RHOSP13. Reduce downtime and risk while automating the steps needed to perform the upgrade and NFVM.

Comment 2 Sofer Athlan-Guyot 2017-11-13 13:55:31 UTC
From Yolanda

Description of problem:

For fast forward upgrades, telcos have some very specific needs. In particular, it is very important to consider the maintenance window. We have a limit of 4 hours per window. In this window we need to do several tasks:
- take backup
- start the upgrade task
- test
- rollback if something went wrong
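
The window constraint above can be sketched as a worst-case budget check. This is only an illustration, not part of director: the function name `fits_window` and every duration below are hypothetical, and it assumes the worst case is that all four tasks, including the rollback, run inside the same window.

```python
from datetime import timedelta

WINDOW = timedelta(hours=4)  # limit per maintenance window, as stated in this report


def fits_window(backup, upgrade, test, rollback, window=WINDOW):
    """Return True if the worst case (all four tasks run) fits in the window."""
    worst_case = backup + upgrade + test + rollback
    return worst_case <= window


# Hypothetical durations, for illustration only.
ok = fits_window(
    backup=timedelta(minutes=30),
    upgrade=timedelta(minutes=90),
    test=timedelta(minutes=30),
    rollback=timedelta(minutes=60),
)
```

Budgeting for the rollback up front is the conservative choice: a step that only fits when nothing goes wrong cannot be safely scheduled in a 4 hour window.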

This needs to be considered for every part of the process. In particular, upgrading the computes is an important need: computes must be upgraded in batches, and we need the ability to control the order of the upgrades and reboots.
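
As a rough sketch of what "in batches, in a controlled order" could mean, the snippet below splits an operator-ordered list of computes into fixed-size batches without reordering them. The helper `plan_batches` and the node names are hypothetical and are not an existing director interface.

```python
def plan_batches(ordered_nodes, batch_size):
    """Split nodes into consecutive batches, preserving the operator's order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        ordered_nodes[i:i + batch_size]
        for i in range(0, len(ordered_nodes), batch_size)
    ]


# Hypothetical node names; the list order is the order chosen by the operator.
computes = ["compute-0", "compute-1", "compute-2", "compute-3", "compute-4"]
batches = plan_batches(computes, batch_size=2)
```

Each batch could then be upgraded and rebooted in its own maintenance window, with the remaining computes keeping the data plane up.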

More requirements:
- data plane needs to be 100% up
- add the ability to control the order of reboots in the minor update step
- add the ability to provide hooks, after minor updates, to be able to deal with OVS version upgrade problems
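
One possible shape for the last two requirements, sketched under assumptions: nodes are processed strictly in an operator-supplied order, and user-provided hooks run after each node's update and before its reboot. The names `update_nodes` and `check_ovs`, and the placement of the hooks before the reboot, are hypothetical choices for illustration and do not describe an existing director feature.

```python
def update_nodes(reboot_order, update, reboot, post_update_hooks=()):
    """Update each node, run the hooks, then reboot, in the given order."""
    for node in reboot_order:
        update(node)
        for hook in post_update_hooks:
            hook(node)  # e.g. a check for OVS version problems
        reboot(node)


# Record the actions instead of touching real nodes (illustration only).
log = []


def check_ovs(node):
    log.append(("hook", node))


update_nodes(
    reboot_order=["compute-1", "compute-0"],
    update=lambda node: log.append(("update", node)),
    reboot=lambda node: log.append(("reboot", node)),
    post_update_hooks=[check_ovs],
)
```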

Comment 3 atelang 2017-11-13 14:23:40 UTC
*** Bug 1502793 has been marked as a duplicate of this bug. ***

Comment 4 Sofer Athlan-Guyot 2017-11-20 13:59:24 UTC
*** Bug 1508762 has been marked as a duplicate of this bug. ***

Comment 5 Yolanda Robla 2017-11-20 15:11:53 UTC
Another requirement we are hitting:
- as I explained, we need to do the fast forward upgrade in maintenance windows. These maintenance windows have a maximum of 4 hours, and each step must be self-contained and fit within that time, including the time for backup, testing and restore
- the system needs to be in a usable state after each of the maintenance window steps
- but we hit a problem with the undercloud upgrade. When we upgrade the undercloud to N+3, the overcloud is still on 10. This means the overcloud is still up, but we are not able to control its status via the undercloud. Things like removing a node, or creating a new one, fail because there is no API compatibility between N and N+3
- so between the undercloud maintenance window and the controllers one, the system is not in a stable state

What can we do to improve this? Ideally we would need API compatibility at least between undercloud and overcloud. Is that possible? What are the alternatives?

