Created attachment 1136821 [details] Screenshot of IPA log when inspector posting collected data is attempted

This is an OVB environment where nodes have a NIC for the provisioning network plus NICs for full multi-NIC network isolation. The isolated networks are not running DHCP yet, so IPA doesn't configure IPs for these NICs. The problem is that when collected data is posted to inspector, the MAC used is that of one of the unconfigured isolation NICs instead of the configured provisioning NIC. See the attached screenshot, which shows the MAC for the unconfigured eth1 instead of the provisioning NIC eth0.

This is in an OVB environment on the rhos-central-ci cloud, so it can be replicated on demand. This bug will prevent using OVB environments like rhos-central-ci to do CI testing of network isolation.
Setting needinfo for mburns to evaluate this for blocker
One guess: did you try it with the latest iPXE ROM available in poodles? Last time I saw it, we provided a wrong BOOTIF. Also in your case there may be a workaround: will it work if you set https://github.com/openstack/ironic-inspector/blob/master/example.conf#L602 to "active"?
This is with 8.0 puddle 20160311.1. Is the iPXE rom in latest poodles even newer? I'll try the workaround.
The following worked for me before doing introspection:

    openstack-config --set /etc/ironic-inspector/inspector.conf processing add_ports active
    systemctl restart openstack-ironic-inspector

I'm assigning this bug to instack-undercloud so they can evaluate whether add_ports should be set to active for all undercloud installs.
I don't think it should be set to "active". It has a high chance of breaking other use cases. I still wonder why it doesn't work in your case. Could you please get the ironic-inspector ramdisk logs for me? Please set https://github.com/openstack/ironic-inspector/blob/master/example.conf#L647 to true, restart ironic-inspector, restart introspection, and grab the tarball from /var/log/ironic-inspector/ramdisk.
Created attachment 1138862 [details] IPA journal from /var/log/ironic-inspector/ramdisk
The attached journal file shows that the incorrect MAC is being passed in as the BOOTIF kernel parameter, so iPXE is specifying this. My /httpboot/inspector.ipxe has BOOTIF=${mac}, but looking at the iPXE docs [1], the only examples explicitly specify the interface, such as ${net0/mac}.

Sure enough, setting BOOTIF=${net0/mac} in inspector.ipxe fixed this problem for me.

Is the inspector interface ever anything other than net0? I've tried in the past to make it later in the interface order and the result was no booting.

What I'm hoping is that BOOTIF=${net0/mac} can be proposed as a fix to ironic/drivers/modules/ipxe_config.template.

[1] http://ipxe.org/cfg/mac
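To make the symptom concrete, here is a minimal Python sketch of how a BOOTIF value from the kernel command line can be mapped back to a NIC. It is not IPA's actual implementation; the function names, the example MAC addresses and the shortened command line are assumptions made for illustration. It accepts both the colon-separated form that iPXE's ${mac} expands to and the PXELINUX form with a leading "01-" hardware type.

```python
import re


def normalize_bootif(value):
    """Return a BOOTIF or MAC value as a lowercase colon-separated MAC."""
    value = value.strip().lower()
    # PXELINUX style: hardware type "01" followed by six dash-separated octets.
    if re.fullmatch(r"01(-[0-9a-f]{2}){6}", value):
        value = value[3:]
    octets = re.split(r"[:-]", value)
    if len(octets) != 6 or not all(re.fullmatch(r"[0-9a-f]{2}", o) for o in octets):
        raise ValueError(f"not a MAC address: {value!r}")
    return ":".join(octets)


def bootif_from_cmdline(cmdline):
    """Extract and normalize BOOTIF from a kernel command line, or return None."""
    for token in cmdline.split():
        if token.startswith("BOOTIF="):
            return normalize_bootif(token[len("BOOTIF="):])
    return None


def find_boot_nic(cmdline, nics):
    """Return the name of the NIC whose MAC matches BOOTIF, or None.

    nics maps interface name to MAC address (hypothetical data in this sketch).
    """
    bootif = bootif_from_cmdline(cmdline)
    if bootif is None:
        return None
    for name, mac in nics.items():
        if normalize_bootif(mac) == bootif:
            return name
    return None


# Hypothetical two-NIC node: eth0 is provisioning, eth1 is an isolated network.
NICS = {"eth0": "52:54:00:aa:bb:01", "eth1": "52:54:00:aa:bb:02"}
```

With these made-up values, a command line carrying eth1's MAC in BOOTIF resolves to eth1, which mirrors the behaviour in the screenshot: whatever MAC iPXE substitutes for ${mac} decides which NIC is reported.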
... and there is a vaguely related upstream bug whose root cause was iPXE ${mac} not corresponding to the boot mac https://bugs.launchpad.net/ironic/+bug/1504482
Folks, please stop changing projects randomly :( There is nothing in ironic-inspector itself related to the discussion right now... Puppet is managing the iPXE setting for us. I was expecting the iPXE update to fix the issue, but that seems not to be the case. We probably need to get our iPXE experts involved again, as always assuming the 1st NIC (which is what net0/mac does) is not the way to go either.
The version of iPXE I'm seeing during introspection boot is c4bce43, which I believe is still the old one. This is with 20160318.2 puddle and images from http://rhos-release.virt.bos.redhat.com/mburns/latest-8.0-images/
Steve, what's your ipxe-bootimgs package version? I see 20160127-1.git6366fa7a.el7 with the latest poodle.
Dmitry, my undercloud has ipxe-bootimgs-20160127-1.git6366fa7a.el7.

However this is an openstack-virtual-baremetal environment and I suspect the iPXE being booted is the one that comes from ipxe-roms-qemu on the *host* cloud (in this case, the rhos-dev-ci cloud running 7.3).

I'd like to explore a couple of options for not having to upgrade ipxe-roms-qemu on the host cloud. Do you have any suggestions of how I might chain the first iPXE to boot the iPXE in undercloud /tftpboot?
(In reply to Steve Baker from comment #13)
> Dmitry, my undercloud has ipxe-bootimgs-20160127-1.git6366fa7a.el7.
>
> However this is an openstack-virtual-baremetal environment and I suspect the
> iPXE being booted is the one that comes from ipxe-roms-qemu on the *host*
> cloud (in this case, the rhos-dev-ci cloud running 7.3)
>
> I'd like to explore a couple of options for not having to upgrade
> ipxe-roms-qemu on the host cloud. Do you have any suggestions of how I might
> chain the first iPXE to boot the iPXE in undercloud /tftpboot?

Hi Steve,

Yes, this is a tricky one, because the VMs will not chainload the iPXE ROM from the /tftpboot directory since they are already booting from iPXE. This happens because the DHCP server has a simple conditional: if not booting from iPXE, then chainload; if booting from iPXE, fetch the iPXE script and continue with the boot process. See: https://github.com/openstack/ironic/blob/69c33f7ed5004afd4fd1589f1aed0e498845a952/ironic/common/pxe_utils.py#L316-L321

Now, this is even trickier for inspector. As you rightly pointed out in comment #9, we did have this problem in Ironic, and the way we solved it generically was by iterating over all NICs and trying to find the iPXE configuration in the /httpboot dir that matches the MAC address of that NIC. That works for Ironic because Ironic has the node's MAC address registered in the database, but that is not the case for inspector.

So there are two solutions here, but I don't think either of them should go upstream because they make the inspector.ipxe script rigid. The right solution upstream is to ask people to update their packages (unfortunately):

* Solution 1: Since it's a VM, you can edit the inspector.ipxe script and add the right NIC number to it, just like you did in comment #8. In VMs the order of the NICs is static, so you won't have a problem of net0 and net1 being switched between boots.

* Solution 2: Force a chainload to a newer iPXE ROM.
Apart from chainloading it using the DHCP server options, we can do it directly in the inspector.ipxe script. E.g. we could check which version of the iPXE ROM we are using and, if that does not match the one we expect, tell it to fetch the right one from the /tftpboot dir, e.g.:

    #!ipxe
    set EXPECTED_VERSION 1.0.0+ (abcdef)
    # Check if version is set and if the version matches the one we expect, if not chainload
    isset ${version} && iseq ${version} ${EXPECTED_VERSION} && goto boot_inspector || echo "Not the current version, upgrading"
    chain tftp://{{next-server}}/undionly.kpxe
    :boot_inspector
    <original inspector.ipxe content here>

ps*: I have not tested the script above yet.

...

I was looking for a way to tell QEMU to use standard PXE instead of iPXE for network boot so the chainload would happen automatically. But I couldn't find a way to do it.

Hope that helps,
Lucas
Thanks Lucas. I have nothing to add unfortunately.
Thanks Lucas, solution 2 sounds worth trying.

I was thinking of a solution 3: changing the ironic-inspector dnsmasq.conf tag filtering, which is currently:

    dhcp-boot=tag:!ipxe,undionly.kpxe,localhost.localdomain,192.0.2.1
    dhcp-boot=tag:ipxe,http://192.0.2.1:8088/inspector.ipxe

What I'm hoping is that there is some revealed difference between the old qemu iPXE and the one served by undionly.kpxe so that I can add a third dhcp-boot entry which boots undionly.kpxe. I'm not sure if there is enough information to do this, or how to discover what tags are available to filter on.
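The two dhcp-boot rules quoted above can be modelled with a small Python sketch to show why the old QEMU iPXE is never chainloaded. This is only an illustration of the tag conditional, not dnsmasq's own matching code; the function names are made up, and how a client acquires the ipxe tag is deliberately left outside the sketch.

```python
# The two rules from the quoted dnsmasq.conf, as (tag expression, boot file).
RULES = [
    ("!ipxe", "undionly.kpxe"),
    ("ipxe", "http://192.0.2.1:8088/inspector.ipxe"),
]


def tag_matches(expr, client_tags):
    """Return True if a tag expression holds; a leading '!' means the tag is not set."""
    if expr.startswith("!"):
        return expr[1:] not in client_tags
    return expr in client_tags


def select_boot_file(client_tags, rules=RULES):
    """Return the boot file of the first rule whose tag expression holds, else None."""
    for expr, boot_file in rules:
        if tag_matches(expr, client_tags):
            return boot_file
    return None
```

A plain PXE client (no tags) is offered undionly.kpxe, while any client tagged ipxe goes straight to inspector.ipxe. Since the old ROM and the newer build would both carry the same tag in this model, a third rule would need some additional distinguishing tag.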
I've confirmed that this isn't an issue when iPXE 6366fa7a is loaded, which I achieved using Lucas's solution 2. Below is my modified inspector.ipxe, which is populated with an appropriate git version by a handful of ansible tasks.

I do wonder if something like this should be contributed upstream (well, puppet-ironic). How often will ancient iPXE be running on flashed hardware rather than being loaded via tftp?

    #!ipxe
    dhcp
    set EXPECTED_VERSION 1.0.0+ (6366fa7a)
    # Check if version is set and if the version matches the one we expect, if not chainload
    isset ${EXPECTED_VERSION} || goto boot_inspector
    echo Expected iPXE version ${EXPECTED_VERSION}
    isset ${version} || goto boot_chained
    iseq ${version} ${EXPECTED_VERSION} && goto boot_inspector
    :boot_chained
    echo Booting chained iPXE
    chain undionly.kpxe
    :boot_inspector
    echo Booting inspector
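For readers less familiar with iPXE's isset/iseq/goto idiom, the control flow of the script above can be restated as a short Python sketch. The function name and the use of None for an unset variable are assumptions of this sketch; it only mirrors the branching and does not boot anything.

```python
def next_step(expected_version, running_version):
    """Mirror the branching of the modified inspector.ipxe.

    Returns the label the script would jump to: "boot_inspector" or
    "boot_chained". None (or an empty string) stands for an unset variable.
    """
    # isset ${EXPECTED_VERSION} || goto boot_inspector
    if not expected_version:
        return "boot_inspector"
    # isset ${version} || goto boot_chained
    if not running_version:
        return "boot_chained"
    # iseq ${version} ${EXPECTED_VERSION} && goto boot_inspector
    if running_version == expected_version:
        return "boot_inspector"
    # Fall through to :boot_chained
    return "boot_chained"
```

After the chained undionly.kpxe takes over and the script is evaluated again, the running version would be expected to equal the expected one, so the second pass should take the boot_inspector branch rather than loop.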