Making TripleO workflows use Ironic directly to provision nodes has quite a few benefits:
1. First and foremost, getting rid of the horrible "No valid host was found" exception. The scheduling will be much simpler and the errors will be clearer.
2. Also important for the generic provisioner case, we'll be able to get rid of Nova and Glance, reducing the memory footprint.
3. We'll get rid of pre-deploy validations that currently try to guess what the nova scheduler will expect.
4. We'll be in charge of building the configdrive, potentially putting more useful things there.
5. In the future we'll be able to integrate things like building RAID on demand much more easily.
6. Also in the future we might want to use introspection data in scheduling and provisioning decisions. Particularly, we can automate handling root device hints.
7. We'll probably have an easier time combining ironic with pre-deployed services or virtual solutions (e.g. putting controllers on oVirt, which is currently done via an unsupported ironic-staging-driver driver).
8. Hopefully, scale-up will be less error-prone.
The way I see it implemented is probably similar to the config-download work (and based on it). We can have a Heat flag to not provision instances, essentially stopping after creating ports and generating the ansible input. Then a mistral workflow will pick the required number of nodes, prepare them and provision them. Then the usual config-download will proceed.
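A rough sketch of what the "pick the required number of nodes" step could look like, only to illustrate why the errors can be clearer than what the scheduler gives us today. Everything here is an assumption for illustration: the Node class, the profile field, NotEnoughNodes and pick_nodes are hypothetical names, not existing TripleO or Ironic APIs. The only real detail is that "available" is the Ironic provision state of a node that is ready to deploy.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical, simplified view of a bare metal node.
    name: str
    provision_state: str  # Ironic provision state, e.g. "available" or "active"
    profile: str          # hypothetical role label, e.g. "compute"


class NotEnoughNodes(Exception):
    """Raised with an explicit reason instead of a generic scheduling failure."""


def pick_nodes(nodes, profile, count):
    """Return the first `count` nodes that are available and match `profile`."""
    candidates = [
        n for n in nodes
        if n.provision_state == "available" and n.profile == profile
    ]
    if len(candidates) < count:
        # Say exactly what was requested and what was found.
        raise NotEnoughNodes(
            f"requested {count} node(s) with profile {profile!r}, "
            f"but only {len(candidates)} are available"
        )
    return candidates[:count]
```

With a selection step like this, asking for three compute nodes when only two are available fails with a message that names the profile and both counts, so the operator does not have to dig through scheduler logs.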
From ML discussion, making this TechPreview for OSP-15.
I've attached the upstream docs bug. I think this could still be Tech Preview for 16, and ready to be the default for 16.1
Exit TP in 16.1.
Ramon - this isn't planned for testing in 16.1 so should probably stay in TechPreview.
According to our records, this should be resolved by openstack-tripleo-common-11.4.1-1.20200914165651.el8ost. This build is available now.
Moving to 17.0 since that is where downstream CI is testing it.
Taking this off the OSP17 planning as we already have a tracking BZ for it targeting 17, 1894547