Bug 1485418

Summary: [RFE] NFV specific Fast Forward Upgrade RHOSP10 to RHOSP13
Product: Red Hat OpenStack
Reporter: Maria Bracho <mbracho>
Component: rhosp-director
Assignee: Yolanda Robla <yroblamo>
Status: CLOSED WONTFIX
QA Contact: Sanjay Upadhyay <supadhya>
Severity: high
Docs Contact:
Priority: medium
Version: unspecified
CC: achernet, ccamacho, dbecker, fbaudin, mburns, morazi, pmorey, rhel-osp-director-maint, sathlang, srelf, yrachman, yroblamo, zshi
Target Milestone: ---
Keywords: FutureFeature, Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-02-25 14:35:18 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 1560721    

Description Maria Bracho 2017-08-25 15:54:11 UTC
Description of problem:
Leverage NFV-specific procedures to perform skip-level upgrades from RHOSP10 to RHOSP13. Reduce downtime and risk while automating the steps to perform the upgrade and NFVM.

Comment 2 Sofer Athlan-Guyot 2017-11-13 13:55:31 UTC
From Yolanda

Description of problem:

For fast forward upgrades, we have some very specific needs for telcos. In particular, the maintenance window is critical: each window is limited to 4 hours. Within that window we need to complete several tasks:
- take backup
- start the upgrade task
- test
- rollback if something went wrong
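
As a rough illustration of the constraint above, the sketch below checks whether a planned set of steps fits into a 4-hour window. The helper name `fits_window` and all step durations are hypothetical values chosen for the example, not measurements from a real upgrade.

```python
from datetime import timedelta

# Maximum length of one maintenance window, as stated in this report.
WINDOW = timedelta(hours=4)

def fits_window(steps, window=WINDOW):
    """Return (fits, total) for a list of (name, duration) steps.

    Hypothetical planning helper: it only sums the estimated durations
    and compares the total against the window length.
    """
    total = sum((duration for _, duration in steps), timedelta())
    return total <= window, total

# Illustrative estimates only; real durations depend on the deployment.
plan = [
    ("backup", timedelta(minutes=45)),
    ("upgrade", timedelta(minutes=120)),
    ("test", timedelta(minutes=30)),
    ("rollback reserve", timedelta(minutes=45)),
]

fits, total = fits_window(plan)
print(f"fits={fits} total={total}")
```

Reserving rollback time inside the plan, rather than treating it as extra, is what makes the 4-hour limit a hard budget for every step.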

This needs to be considered for every part of the process. In particular, upgrading the compute nodes is a major concern: computes need to be upgraded in batches, and we need the ability to control the order of the upgrades and reboots.
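
The batching and ordering requirement could be sketched as follows. This is a minimal planning example, not director functionality: the function `plan_batches`, its parameters, and the node names are all assumptions made for illustration.

```python
def plan_batches(nodes, batch_size, order=None):
    """Split compute nodes into upgrade batches (hypothetical helper).

    nodes: iterable of node names
    batch_size: maximum number of nodes upgraded and rebooted together
    order: optional list of node names that must go first, in that order;
           nodes not listed keep their input order after the listed ones
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    nodes = list(nodes)
    if order:
        rank = {name: i for i, name in enumerate(order)}
        # list.sort is stable, so unlisted nodes keep their relative order.
        nodes.sort(key=lambda name: rank.get(name, len(rank)))
    return [nodes[i:i + batch_size] for i in range(0, len(nodes), batch_size)]

# Illustrative node names only.
computes = ["compute-0", "compute-1", "compute-2", "compute-3", "compute-4"]
batches = plan_batches(computes, 2, order=["compute-3", "compute-1"])
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

Each batch would then be upgraded and rebooted as a unit, so the operator decides both how many computes are out of service at once and which ones go first.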

More requirements:
- the data plane needs to stay 100% up
- add the ability to control the order of reboots in the minor update step
- add the ability to provide hooks that run after minor updates, so that we can deal with OVS version upgrade problems

Comment 3 atelang 2017-11-13 14:23:40 UTC
*** Bug 1502793 has been marked as a duplicate of this bug. ***

Comment 4 Sofer Athlan-Guyot 2017-11-20 13:59:24 UTC
*** Bug 1508762 has been marked as a duplicate of this bug. ***

Comment 5 Yolanda Robla 2017-11-20 15:11:53 UTC
Another requirement we are hitting:
- as I explained, we need to do the fast forward upgrade in maintenance windows. Each window lasts at most 4 hours, and we need to be able to perform complete, isolated steps within that time, including the time for backup, testing and restore
- the system needs to be in a usable state after each maintenance window step
- but we hit a problem with the undercloud upgrade. When we upgrade the undercloud to N+3, the overcloud is still on RHOSP10. The overcloud stays up, but we cannot manage it through the undercloud: operations such as removing a node or creating a new one fail because there is no API compatibility between N and N+3
- so between the undercloud maintenance window and the controllers' window, the system is not in a stable state

What can we do to improve this? Ideally we would have API compatibility at least between the undercloud and the overcloud. Is that possible? What are the alternatives?