So basically the best way to do it would be:
- remove the package install from the mistral task
- install the qcow2 image package on the undercloud IFF we enable Octavia (using the host_prep_tasks)
- bind-mount the images in the wanted mistral containers
- proceed to the image upload in the overcloud glance.

In order to have a "conditional bind-mount", we can make a simple test for the existence of some octavia-only variable, and voilà.

Note: this will require a reconfigure + restart of the mistral containers in order to pick up the new mount.
(In reply to Cédric Jeanneret from comment #3)
> So basically the best way to do it would be:
> - remove the package install from the mistral task
> - install the qcow2 image package on the undercloud IFF we enable Octavia
> (using the host_prep_tasks)

Agreed. host_prep_tasks is the way to go for accessing the undercloud host itself (not the mistral-engine container).

> - bind-mount the images in the wanted mistral containers
> - proceed to the image upload in the overcloud glance.
>
> In order to have a "conditional bind-mount", we can make a simple test for
> the existence of some octavia-only variable, and voilà.

Can't a service_config_settings be defined in THT/docker/services/octavia/octavia-deployment-config.yaml to set the bind mount in mistral_engine?

> Note: this will require a reconfigure + restart of the mistral containers in
> order to pick up the new mount

I think there is some restart signal instruction we can trigger, no?
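For illustration only, a minimal sketch of how service_config_settings could push a flag from the Octavia template into the mistral_engine scope; the hiera key name and the exact file layout here are assumptions, not the real template:

# Hypothetical excerpt from docker/services/octavia/octavia-deployment-config.yaml.
# service_config_settings lets one service expose hiera data to another
# service's configuration scope; the key below is made up for the sketch.
outputs:
  role_data:
    value:
      service_name: octavia_deployment_config
      service_config_settings:
        mistral_engine:
          # Flag the mistral_engine template could test to add the bind mount.
          tripleo::profile::base::mistral::engine::octavia_images_enabled: true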
To be clear, are we suggesting here that the decision to deploy Octavia is made when installing the *undercloud*?
No. When deploying the overcloud. It involves moving the package installation out of the octavia-undercloud role to host_prep_tasks in THT. For an example, see http://git.openstack.org/cgit/openstack/tripleo-heat-templates/tree/docker/services/aodh-api.yaml#n154
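Following the pattern of the aodh-api template linked above, a hedged sketch of what such a conditional host-level install might look like; the package name and the condition variable are assumptions:

# Hypothetical host_prep_tasks entry in a THT docker service template.
# These tasks run on the host itself, not inside any container.
host_prep_tasks:
  - name: Install the amphora qcow2 image package when Octavia is enabled
    yum:
      name: octavia-amphora-image
      state: installed
    when: octavia_enabled | default(false) | bool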
Well, in fact, doing so will create some nasty issues, after some more thought:
- as it will require reconfiguring the mistral-engine container, that container will need a restart
- this restart will more than probably crash the overcloud deploy.

So, nope. Bad idea. An overcloud deploy should not touch the undercloud services/integrity imho.

Another way would be: take advantage of the bootstrap_server_id fact in order to install the octavia image on only ONE controller, and upload from there.

In the CentOS case, I suspect no package is provided, and we must build the image using the script provided by openstack-octavia-diskimage-create, meaning we must either:
- build the image on the undercloud and upload it (but we will face some issues, as this package is not installed by default on the undercloud, and we will need to find a way to upload it once it's built, and so on)
- OR build the image on the bootstrap_server_id node and upload it into Glance from there as well

I tend to push for that second solution, as it should prevent any issue on the undercloud. In addition, the code change should not be that big, but I obviously don't know all the internals of the octavia activation.

How does that proposal sound?
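A rough illustration of the "only ONE controller" idea above, as an Ansible task; 'bootstrap_node' is a hypothetical variable standing in for whatever the bootstrap_server_id fact resolves to:

# Sketch only: gate the heavy image work on a single controller, so the
# other nodes in the group skip the task entirely.
- name: Install the amphora image package on the bootstrap controller only
  yum:
    name: octavia-amphora-image
    state: installed
  when: ansible_hostname == bootstrap_node  # hypothetical variable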
@Cedric, I agree on the first point - I don't think we could make this work. I think the only workable solution is to install the rpm onto a single node in the overcloud (assuming we have access to repos from the overcloud) and upload from there.

With respect to CentOS, we don't support building the image, but download it from a known location if available. We would simply download to a single overcloud node and upload from there, similar to the OSP case.
@Brent, OK for the CentOS case - I wasn't sure how it was done in that case. So, basically, the mistral workflow running https://github.com/openstack/tripleo-common/tree/master/playbooks/roles/octavia-undercloud/tasks should be dropped and replaced by proper things in tripleo-heat-templates, right?
> So, basically, the mistral workflow running
> https://github.com/openstack/tripleo-common/tree/master/playbooks/roles/
> octavia-undercloud/tasks should be dropped and replaced by proper things in
> tripleo-heat-templates, right?

No, but move the image handling logic from the octavia-undercloud role over to the octavia-overcloud-config role, which is run on only a single node (octavia_nodes[0]):
https://github.com/openstack/tripleo-common/blob/79a5455ffececa8239b6bed03edbc9dd8269d3dd/playbooks/octavia-files.yaml#L28-L53

I am also not sure we have the necessary repos enabled in the overcloud. At least on deployments via Infrared, repos are not configured:

[heat-admin@controller-0 ~]$ sudo yum repolist
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
repolist: 0
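For context, the single-node targeting in the linked playbook looks roughly like this; a simplified sketch based on the group name in the link, not the verbatim file:

# Simplified sketch of the play in octavia-files.yaml: only the first
# host in the octavia_nodes group runs the role.
- hosts: octavia_nodes[0]
  gather_facts: true
  become: true
  roles:
    - octavia-overcloud-config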
hmm ok. Well, at least, enabling a repository on that node is easier than doing stuff in a container. But indeed, it would require ensuring we actually have the right repository.
After some thought, here's what is, imho, the best way to support the custom images:
- create a location on the undercloud, and document it
- mount that location in the right container(s)
- upload the image to the overcloud glance, either directly from the undercloud, or using octavia_nodes[0] as "temporary host"

That way, we avoid the need to hot-bind-mount volumes into a running container on the undercloud (even if it's possible to do it), meaning we still don't modify any running container on the undercloud.

So, I imagine the right steps for the modification will be:

# tripleo-common
- ensure the correct repositor(ies|y) is|are deployed in order to fetch the octavia-amphora-image package (and other deps if needed)
- move the octavia-undercloud tasks to the octavia-overcloud-config role
- adapt the moved tasks to run on the overcloud node

# tripleo-heat-templates
- add the directory for custom images in the right service(s)
- add the bind-mount in the right service(s) container

Notes regarding the directory for custom images:
- a location in /usr/share would be great, as it means no selinux issue
- if possible, avoid a /home/stack location, as this will really create selinux issues for sure
- the directory in /usr/share might be owned by the "stack" user
- "stack user" means "deploy user" - usually "stack". There's a variable in t-h-t for that iirc.

Note for SELinux: with Podman support being added, we must take care of selinux issues, as we don't have an easy, global way to deactivate selinux separation like we can with the docker daemon. So yeah, please take selinux into consideration ;).

I might have some time for that, but I'll really need help, as I'm new to octavia. I'd propose to set up some trello board in order to get the tasks listed and properly addressed.

Does it sound like a realistic plan?
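To make the bind-mount and SELinux points concrete, a hedged THT-style sketch; the directory path, step number, and placement are assumptions for illustration:

# Hypothetical THT excerpt: bind-mounting a host image directory into
# the mistral_engine container.
docker_config:
  step_4:
    mistral_engine:
      volumes:
        # ':ro' keeps the images read-only inside the container; under
        # podman an additional ':z' relabel option may be needed, since
        # there is no daemon-wide SELinux-separation switch as with docker.
        - /usr/share/octavia-amphora-images:/usr/share/octavia-amphora-images:ro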
So we have the expectation (because of disconnected environments) that by default no rpms will be downloaded during the deployment (this is the default, and the point of overcloud-full). If the image needs to be customized, it seems the octavia image needs to be pulled down prior to deployment and essentially passed in as input to the deployment; we should not be fetching it during the deployment. Should this just be part of, or a dependency of, rhosp-director-images?

It would be a better idea to handle the image availability upfront, before the deployment, then use the deployment to upload it into swift for use later on the nodes. From there you could execute something on the controllers to load it into the overcloud glance after pulling it down from swift, or we could perhaps do a bind mount in the mistral container as a source to use ansible to push it to the overcloud nodes.
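A very rough sketch of the swift-as-intermediary idea, using plain openstack CLI calls from Ansible; the container name, object name, file path, and inventory host names are all made up, and credential handling is glossed over:

# Sketch only: stage the image in swift during the deploy, then load it
# into the overcloud glance from one controller.
- name: Stage amphora image in swift
  hosts: undercloud
  tasks:
    - name: Upload the image file as a swift object
      command: >
        openstack object create --name amphora.qcow2 octavia-images
        /usr/share/octavia-amphora-images/amphora.qcow2

- name: Import the image into the overcloud glance
  hosts: octavia_nodes[0]
  tasks:
    - name: Pull the object down and create the glance image
      shell: |
        openstack object save octavia-images amphora.qcow2
        openstack image create octavia-amphora --disk-format qcow2 \
          --container-format bare --file amphora.qcow2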
We don't know if Octavia should be enabled prior to overcloud deployment. Pulling down the amphora image (~750 MB RPM package) regardless, or making it a dependency of rhos-director-images, sounds unreasonable to me.

As for the operator passing it in as input, maybe it is worth noting that the directive for Octavia being fully supported was to make it a "1-click install" (rhbz #1414022) -- we might have to revisit that now, though...
If it needs to be a 1-click install, it should be pulled down with the rhosp-director-images. That's already 1.2G or something, but that's the way to ensure it will be available up front for enablement at a later date. Additionally, that'll make sure it's always up to date if they are following the OSP update/upgrade processes.
(In reply to Alex Schultz from comment #15)
> Additionally, that'll make sure it's always up to date if they are
> following the OSP update/upgrade processes.

TripleO already offers that starting from Queens: https://bugzilla.redhat.com/show_bug.cgi?id=1545151
No, I mean if a customer doesn't enable Octavia until well after the cloud has been in place. Anyway, I don't think we should by default require connectivity to pull down the image to the overcloud nodes.
It was agreed at today's Octavia squad meeting that, given all the options already presented, the way to move forward is by ensuring the octavia-amphora-image RPM is installed on the undercloud, by making it an RPM dependency of rhos-director-images. It will imply a penalty in disk usage, though.

The mistral_engine container will need to bind mount /usr/share/rhos-director-images/, and the yum install bits in the Ansible playbook in tripleo-common will be dropped.

Still to figure out is how we can ensure users get the latest amphora image uploaded to Glance on overcloud update. One option, which requires user intervention, is to run 'yum update octavia-amphora-image' on the undercloud before the overcloud update.
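A hedged sketch of the bind mount just described; the exact volume line in the merged patches may differ:

# Sketch: mistral_engine bind-mounts the directory owned by the
# rhos-director-images RPM so the existing workflow can read the image
# without any yum install at deploy time.
mistral_engine:
  volumes:
    - /usr/share/rhos-director-images:/usr/share/rhos-director-images:ro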
LGTM for the direct dependency - imho 800 MB isn't that much of a pain, seeing we have container images over 2 GB... ;)
Installed the octavia-amphora-image package in the undercloud base OS and applied the two patches upstreamed to an OSP14 environment provided by Alex. Result is positive :)

(overcloud) [stack@undercloud-0 ~]$ ./overcloud_deploy.sh
[...]
PLAY [External deployment Post Deploy tasks] ***********************************

PLAY RECAP *********************************************************************
compute-0                  : ok=162  changed=36   unreachable=0    failed=0
compute-1                  : ok=162  changed=36   unreachable=0    failed=0
compute-2                  : ok=162  changed=36   unreachable=0    failed=0
controller-0               : ok=212  changed=37   unreachable=0    failed=0
controller-1               : ok=205  changed=37   unreachable=0    failed=0
controller-2               : ok=205  changed=37   unreachable=0    failed=0
undercloud                 : ok=21   changed=9    unreachable=0    failed=0

Wednesday 31 October 2018  17:34:35 -0400 (0:00:00.291)       0:24:39.049 *****
===============================================================================
Ansible passed.
Overcloud configuration completed.
Waiting for messages on queue 'tripleo' with no timeout.
Overcloud Endpoint: http://10.0.0.115:5000
Overcloud Horizon Dashboard URL: http://10.0.0.115:80/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed

(overcloud) [stack@undercloud-0 ~]$ openstack image list
+--------------------------------------+----------------------------------------+--------+
| ID                                   | Name                                   | Status |
+--------------------------------------+----------------------------------------+--------+
| 58baebd0-8d0a-403a-a2c9-10505a66520a | octavia-amphora-14.0-20181019.1.x86_64 | active |
+--------------------------------------+----------------------------------------+--------+
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045