Description of problem:
Currently the fast forward upgrade workflow combines stack updates run through the overcloud deploy command with several ansible-playbook commands that must be run manually.
This process is fragile and error prone: the overcloud and its workloads can end up in an undesired state if a playbook is missed or run with the wrong options.
Moreover, the workflow requires the user to understand the internals of the upgrade mechanism in order to make sense of the steps, e.g. what each ansible-playbook command does.
We should simplify this process and hide the unnecessary details behind a CLI wrapper that exposes only the actions that are meaningful to the operator. For example, assuming a basic environment with 3 controllers and 2 computes, the end user should be able to do the upgrade in simple steps such as:
openstack ffwd-upgrade --role Controller ## upgrade controller nodes
openstack ffwd-upgrade --node compute-0 ## upgrade compute-0 node
openstack ffwd-upgrade --node compute-1 ## upgrade compute-1 node
As it stands, the procedure involves multiple steps, which makes the upgrade experience confusing and introduces multiple points of possible manual error.
The steps currently required for the upgrade are listed below:
1. Update stack outputs by appending environment files to the overcloud deploy command used for initial deployment
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
--control-scale 3 \
--control-flavor controller \
--compute-scale 2 \
--compute-flavor compute \
--ceph-storage-scale 3 \
--ceph-storage-flavor ceph \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e /home/stack/virt/internal.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/docker-images.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/fast-forward-upgrade.yaml \
-e /home/stack/ffu_repos.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/config-download-environment.yaml \
-e /home/stack/ceph-ansible-env.yaml
2. Download config and save the location to be later used with ansible-playbook commands:
openstack overcloud config download --name $stack_name
3. Generate inventory file
/usr/bin/tripleo-ansible-inventory \
--plan $stack_name \
--static-yaml-inventory /home/stack/tripleo-ansible-inventory-static.yaml
4. Run FFU playbook
ansible-playbook --module-path /usr/share/ansible-modules/ -i /home/stack/tripleo-ansible-inventory-static.yaml -b /home/stack/tripleo-p1woxQ-config/fast_forward_upgrade_playbook.yaml
5. Run FFU upgrade step on controller nodes
ansible-playbook --module-path /usr/share/ansible-modules/ -i /usr/bin/tripleo-ansible-inventory -b /home/stack/tripleo-p1woxQ-config/upgrade_steps_playbook.yaml --skip-tags=validation --limit=Controller
6. Run FFU deploy step on controller nodes
ansible-playbook --module-path /usr/share/ansible-modules/ -i /usr/bin/tripleo-ansible-inventory -b /home/stack/tripleo-UnqRNC-config/deploy_steps_playbook.yaml --limit=Controller
7. Run FFU upgrade step on compute-0 node
ansible-playbook --module-path /usr/share/ansible-modules/ -i /usr/bin/tripleo-ansible-inventory -b /home/stack/tripleo-p1woxQ-config/upgrade_steps_playbook.yaml --skip-tags=validation --limit=compute-0
8. Run FFU deploy step on compute-0 node
ansible-playbook --module-path /usr/share/ansible-modules/ -i /usr/bin/tripleo-ansible-inventory -b /home/stack/tripleo-UnqRNC-config/deploy_steps_playbook.yaml --limit=compute-0
9. Run FFU upgrade step on compute-1 node
ansible-playbook --module-path /usr/share/ansible-modules/ -i /usr/bin/tripleo-ansible-inventory -b /home/stack/tripleo-p1woxQ-config/upgrade_steps_playbook.yaml --skip-tags=validation --limit=compute-1
10. Run FFU deploy step on compute-1 node
ansible-playbook --module-path /usr/share/ansible-modules/ -i /usr/bin/tripleo-ansible-inventory -b /home/stack/tripleo-UnqRNC-config/deploy_steps_playbook.yaml --limit=compute-1
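To illustrate how steps 5 to 10 repeat the same pair of playbook runs for each role or node, here is a minimal shell sketch of what a wrapper could do for each target. It is not the requested CLI or its implementation. The function names (run, ffu_upgrade_target, ffu_upgrade_all) and the environment variables are hypothetical; the paths and ansible-playbook options are copied from the steps above. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, not the shipped CLI. Function and variable names
# are hypothetical; paths and ansible-playbook options are copied from the
# manual steps above.

# Directories produced by "openstack overcloud config download" (assumed
# defaults taken from the steps above; override through the environment).
UPGRADE_CONFIG_DIR="${UPGRADE_CONFIG_DIR:-/home/stack/tripleo-p1woxQ-config}"
DEPLOY_CONFIG_DIR="${DEPLOY_CONFIG_DIR:-/home/stack/tripleo-UnqRNC-config}"
INVENTORY="${INVENTORY:-/usr/bin/tripleo-ansible-inventory}"
MODULE_PATH="${MODULE_PATH:-/usr/share/ansible-modules/}"
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Upgrade one role or node: upgrade steps first, then deploy steps.
# Stops at the first failing playbook.
ffu_upgrade_target() {
    local target="$1"
    run ansible-playbook --module-path "$MODULE_PATH" -i "$INVENTORY" -b \
        "$UPGRADE_CONFIG_DIR/upgrade_steps_playbook.yaml" \
        --skip-tags=validation "--limit=$target" || return 1
    run ansible-playbook --module-path "$MODULE_PATH" -i "$INVENTORY" -b \
        "$DEPLOY_CONFIG_DIR/deploy_steps_playbook.yaml" \
        "--limit=$target" || return 1
}

# Upgrade several targets in the order given, stopping on the first failure.
ffu_upgrade_all() {
    local target
    for target in "$@"; do
        ffu_upgrade_target "$target" || return 1
    done
}
```

With the defaults, calling ffu_upgrade_all Controller compute-0 compute-1 prints the six commands from steps 5 to 10 in order. Setting DRY_RUN=0 would execute them and stop at the first playbook that fails.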
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2018:2086