Making TripleO workflows use Ironic directly to provision nodes has quite a few benefits:
1. First and foremost, getting rid of the horrible "no valid hosts found" exception. The scheduling will be much simpler and the errors will be clearer.
2. Also important for the generic provisioner case, we'll be able to get rid of Nova and Glance, reducing the memory footprint.
3. We'll get rid of pre-deploy validations that currently try to guess what the Nova scheduler will expect.
4. We'll be in charge of building the configdrive, potentially putting more useful things there.
5. In the future we'll be able to integrate things like building RAID on demand much more easily.
6. Also in the future we might want to use introspection data in scheduling and provisioning decisions. Particularly, we can automate handling root device hints.
7. We'll probably have an easier time combining Ironic with pre-deployed services or virtual solutions (e.g. putting controllers on oVirt, which is currently done via an unsupported ironic-staging-driver driver).
8. Hopefully, scale-up will be less error-prone.
I see the implementation as similar to the config-download work (and based on it). We can add a Heat flag that skips provisioning instances, essentially stopping after creating the ports and generating the Ansible input. A Mistral workflow will then pick the required number of nodes, prepare them and provision them. After that, the usual config-download will proceed.
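
To make the "pick the required number of nodes" step more concrete, here is a minimal sketch in Python of what such a selection could look like once Nova scheduling is out of the picture. Everything in it is an assumption for illustration: the `Node` class is a simplified stand-in for an Ironic node record (only the `provision_state` and `maintenance` fields mirror real Ironic node fields), and `pick_nodes`, `NotEnoughNodes` and the `profile` attribute are hypothetical names, not an existing TripleO or Ironic API. The point is only that a failure can name each rejected node and the reason, instead of a generic scheduling error.

```python
from dataclasses import dataclass
from typing import List, Optional


class NotEnoughNodes(Exception):
    """Raised when fewer suitable nodes exist than were requested."""


@dataclass
class Node:
    # Hypothetical, simplified stand-in for an Ironic node record.
    name: str
    provision_state: str
    maintenance: bool = False
    profile: Optional[str] = None


def pick_nodes(nodes: List[Node], count: int,
               profile: Optional[str] = None) -> List[Node]:
    """Return `count` deployable nodes, or raise with per-node reasons."""
    candidates = []
    rejected = {}
    for node in nodes:
        if node.maintenance:
            rejected[node.name] = "in maintenance"
        elif node.provision_state != "available":
            rejected[node.name] = (
                "provision state is %r" % node.provision_state)
        elif profile is not None and node.profile != profile:
            rejected[node.name] = (
                "profile is %r, wanted %r" % (node.profile, profile))
        else:
            candidates.append(node)

    if len(candidates) < count:
        # Spell out why each node was skipped so the operator can act on it.
        details = "; ".join(
            "%s: %s" % item for item in sorted(rejected.items()))
        raise NotEnoughNodes(
            "need %d node(s), found %d suitable. Rejected: %s"
            % (count, len(candidates), details))
    return candidates[:count]


# Example inventory (made-up data).
inventory = [
    Node("node-0", "available", profile="control"),
    Node("node-1", "active", profile="control"),
    Node("node-2", "available", maintenance=True, profile="control"),
    Node("node-3", "available", profile="compute"),
]
```

With this inventory, asking for one "control" node returns node-0, while asking for two raises `NotEnoughNodes` with a message listing node-1, node-2 and node-3 along with the reason each was skipped. A real workflow would fetch the node list from Ironic and then prepare and deploy the chosen nodes; that part is deliberately left out here.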