Red Hat Bugzilla – Bug 1270874
Ironic should not be enabling PXE Boot on all VM interfaces
Last modified: 2016-09-06 11:21:49 EDT
Description of problem:
RHOS only requires that the Compute nodes be bare metal. When creating VMs with multiple NICs, as is required for any node (one on the provisioning network and one public-facing), Ironic enables PXE boot for every single interface on the VM when trying to detect and provision it. It is entirely plausible that a DHCP server is available on the public network; if so, the installation can PXE boot to a network that was not intended and never reach Ironic.
I had properly configured my VM to PXE boot only from the provisioning network, and Ironic overrode this configuration.
Ironic should either not touch the boot configuration at all or intelligently configure PXE boot only for the correct interface.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install an undercloud/director
2. Create a VM with multiple interfaces, both with PXE boot available, make the provisioning network for RHOS the second interface, and properly configure the system to PXE boot from the provisioning network.
3. Try to detect the VM for installation as an overcloud host.
Actual results:
VM PXE boots to the wrong network.
Expected results:
VM PXE boots to the correct network.
"It is recommended to use bare metal systems for all nodes. At minimum, the Compute nodes require bare metal systems."
"Set all Overcloud systems to PXE boot off the Provisioning NIC and disable PXE boot on the External NIC and any other NICs on the system. Also ensure PXE boot for Provisioning NIC is at the top of the boot order, ahead of hard disks and CD/DVD drives."
Just for clarification: Ironic is deploying a VM, so it's using the pxe_ssh driver, right? Because this driver is just a testing driver for Ironic (not really meant for production).
But anyway, with pxe_ssh Ironic will change the virsh XML of that VM to boot it from "network"; it doesn't specify any MAC address or anything like that. All it does is create a "<boot dev='network'/>" in the "<os>" XML node, e.g.:

  <os>
    <type arch='x86_64' machine='pc-1.0'>hvm</type>
    <boot dev='network'/>
  </os>
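As an illustrative sketch of the behavior described above (stdlib code, not Ironic's actual pxe_ssh implementation), rewriting the <os> node to force network boot amounts to something like:

```python
import xml.etree.ElementTree as ET

def force_network_boot(domain_xml):
    """Illustrative sketch: make a libvirt domain boot from the network.

    Mirrors the behavior described in the comment above, NOT Ironic's
    real code: drop any existing <boot dev=.../> entries under <os> and
    append <boot dev='network'/>. No MAC or interface is selected, so
    the firmware is free to PXE from any NIC.
    """
    root = ET.fromstring(domain_xml)
    os_node = root.find('os')
    for boot in os_node.findall('boot'):
        os_node.remove(boot)
    os_node.append(ET.Element('boot', {'dev': 'network'}))
    return ET.tostring(root, encoding='unicode')

example = """<domain type='kvm'>
  <os>
    <type arch='x86_64' machine='pc-1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
</domain>"""
print(force_network_boot(example))
```

Note that this wipes the previous boot device (here, 'hd'), which is exactly why a carefully prepared per-NIC boot configuration gets clobbered.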
When you said you properly configured the VM to PXE boot only on the provisioning network, can you give me an example of that virsh XML, please?
When I add a single interface to the boot order (via virt-manager) it does so like this:

  <os>
    <type arch='x86_64' machine='pc-i440fx-2.3'>hvm</type>
  </os>
  ...
  <interface type='network'>
    ...
    <boot order='1'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>
  <interface type='network'>
    ...
    <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
  </interface>
The problem here is that the SSH drivers are not meant for production, so they don't cover all possible cases. To top it off, the SSH drivers are going away in the next release, so even if we fix this now, it will regress soon.
We will use the ipmitool drivers with a service called virtualbmc (https://github.com/openstack/virtualbmc), which translates the IPMI protocol into libvirt calls. You may want to verify that this project has the necessary fixes and open an upstream bug against it if not. The SSH driver is unlikely to receive any updates now.
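For reference, a rough sketch of the virtualbmc workflow (domain name, port, and credentials are placeholders; check `vbmc --help` for the exact CLI on your version):

```shell
# Expose an existing libvirt domain as an IPMI endpoint (illustrative values)
vbmc add overcloud-node0 --port 6230 --username admin --password password
vbmc start overcloud-node0
vbmc list

# Ironic's ipmitool driver (or you, manually) can then drive the VM over IPMI:
ipmitool -I lanplus -H 127.0.0.1 -p 6230 -U admin -P password chassis bootdev pxe
ipmitool -I lanplus -H 127.0.0.1 -p 6230 -U admin -P password power on
```

With this path the boot-device selection goes through `chassis bootdev`, so the per-interface concern raised in this bug would need to be checked against how virtualbmc maps that request onto the libvirt domain XML.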