Description of problem:
OSPd7's bulk introspection times out during the PXE boot phase when the dnsmasq hash function produces a collision in the IP addresses initially offered to nodes (explained at http://permalink.gmane.org/gmane.network.dns.dnsmasq.general/3846). This can happen when the IP pool configured in dnsmasq is narrow enough and the MAC addresses are similar enough, so that dnsmasq initially sends DHCPOFFERs with the same proposed IP to nodes with different MAC addresses. One of the nodes gets the offered IP ACKed, but the other one does not: its DHCPREQUEST is NAKed 4 times ("address in use", see the log below). Unfortunately, PXE firmware usually does not tolerate this delay (it waits 10-15 seconds for a successful handshake), so PXE boot times out and introspection of the affected node fails.

Version-Release number of selected component (if applicable):
RHEL 7.2
dnsmasq-2.66-14.el7_1.x86_64
dnsmasq-utils-2.66-14.el7_1.x86_64
instack-undercloud-2.1.2-37.el7ost.noarch
python-rdomanager-oscplugin-0.0.10-26.el7ost.noarch
openstack-ironic-discoverd-1.1.0-8.el7ost.noarch
openstack-ironic-common-2015.1.2-2.el7ost.noarch
openstack-ironic-conductor-2015.1.2-2.el7ost.noarch
openstack-ironic-api-2015.1.2-2.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-5.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-10.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-112.el7ost.noarch
openstack-tripleo-common-0.0.1.dev6-5.git49b57eb.el7ost.noarch
openstack-tripleo-0.0.7-0.1.1664e566.el7ost.noarch

A) Steps to reproduce in baremetal environment:
1. Deploy OSPd7 with an instackenv.json containing e.g. the following MACs
    {
      "nodes": [
        {
          "pm_type": "pxe_ipmitool",
          "mac": [ "34:17:eb:e6:45:33" ],
          ...
        },
        {
          "pm_type": "pxe_ipmitool",
          "mac": [ "34:17:eb:e6:45:f0" ],
          ...
        },
        ...
      ]
    }
2. $ openstack baremetal import --json instackenv.json
   $ openstack baremetal configure boot
3. Check the dnsmasq configuration and adjust it if it is not configured like this already (to reproduce, it is important that the pool/range in the "dhcp-range" field has exactly 20 addresses available; the IPs themselves don't matter):
    $ cat /etc/ironic-discoverd/dnsmasq.conf
    port=0
    interface=br-ctlplane
    bind-interfaces
    dhcp-range=10.200.200.100,10.200.200.120,29
    enable-tftp
    tftp-root=/tftpboot
    dhcp-match=ipxe,175
    dhcp-boot=tag:!ipxe,undionly.kpxe,localhost.localdomain,10.200.200.1
    dhcp-boot=tag:ipxe,http://10.200.200.1:8088/discoverd.ipxe
4. $ openstack baremetal introspection bulk start
5. Discovery sees the following MACs broadcasting DHCPDISCOVERs on the isolated network
    34:17:eb:e6:45:32
    34:17:eb:e6:45:ef
6. The whole communication looks like
    $ sudo journalctl -f -u openstack-ironic-discoverd-dnsmasq
    Jan 22 23:50:55 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
    Jan 22 23:50:55 DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
    Jan 22 23:50:55 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
    Jan 22 23:50:56 DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
    Jan 22 23:50:56 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
    Jan 22 23:50:56 DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
    Jan 22 23:50:56 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
    Jan 22 23:51:04 DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
    Jan 22 23:51:04 DHCPACK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
    Jan 22 23:51:04 DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
    Jan 22 23:51:04 DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef address in use
    Jan 22 23:51:04 DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
    Jan 22 23:51:04 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
    Jan 22 23:51:04 DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
    Jan 22 23:51:04 DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef address in use
    Jan 22 23:51:05 DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
    Jan 22 23:51:05 DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef address in use
    Jan 22 23:51:07 DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
    Jan 22 23:51:07 DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef address in use
    Jan 22 23:51:08 DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
    Jan 22 23:51:08 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
    Jan 22 23:51:16 DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
    Jan 22 23:51:16 DHCPACK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
7. Introspection of node 34:17:eb:e6:45:ef failed because PXE boot couldn't get an IP address in time

B) Steps to reproduce in emulated environment (PXE using QEMU):
In my case this happens on baremetal machines where the MACs can't be "spoofed" before PXE boot (the BIOS doesn't support that, and without the proper MAC setup this can't be reproduced), so we can use an emulated QEMU PXE boot to reproduce instead.
1. Deploy OSPd7 manually up to the step "Introspect Nodes" on baremetal, cancel introspection while it is in progress ("waiting for nodes ..." stage) and from this point on work only on the instack node
2. Check that the iptables discovery chain doesn't block the spoofed MACs (listed below; iptables -L discovery); otherwise delete the DROP rules (iptables -D discovery XY) and stop the discoverd service (service openstack-ironic-discoverd stop) so it can't create the DROP rules again
3. Add two tap devices as connection points for the emulated PXE consoles and add them as ports to br-ctlplane; make sure all created interfaces are UP
    ip tuntap add tap0 mode tap && ip link set tap0 up
    ip tuntap add tap1 mode tap && ip link set tap1 up
    ovs-vsctl add-port br-ctlplane tap0
    ovs-vsctl add-port br-ctlplane tap1
4. Boot PXE virtually with MACs causing collisions in our IP range
    qemu-system-x86_64 -boot n -net nic,macaddr=34:17:eb:e6:45:32 -net tap,ifname=tap0,script=no
    qemu-system-x86_64 -boot n -net nic,macaddr=34:17:eb:e6:45:ef -net tap,ifname=tap1,script=no
5. Even in the emulated environment PXE boot fails for one of the nodes (whichever gets its DHCPACK first wins, the other one times out)

Actual results:
Introspection of the baremetal node with MAC 34:17:eb:e6:45:ef fails with a timeout
    $ baremetal introspection bulk status
    | Node UUID                            | Finished | Error                 |
    +--------------------------------------+----------+-----------------------+
    | 3e926bb2-4f30-4eef-9021-269752ae42f4 | True     | Introspection timeout |
    | 3217ecf2-b251-41f8-a23a-0d4d1a43b7aa | True     | None                  |
    ...

Expected results:
Introspection will pass

Additional info:
This issue might be resolved by https://review.openstack.org/#/c/203040/, which introduces an introspection delay and is referenced in https://bugs.launchpad.net/ironic-inspector/+bug/1473024

Workarounds:
Manual introspection (so one can add a delay between the introspection of each node), changing the IP pool size, or spoofing MAC addresses so that dnsmasq cannot cause collisions
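For anyone triaging a similar journal, the collision in step 6 of the description can be spotted mechanically. The following is only a minimal sketch, assuming the journal lines have the "DHCPOFFER(iface) IP MAC" shape shown in the description; the function name find_offer_collisions and the SAMPLE excerpt are made up for illustration and are not part of any attached script.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Jan 22 23:50:55 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
OFFER_RE = re.compile(
    r"DHCPOFFER\([^)]+\)\s+"
    r"(?P<ip>\d+(?:\.\d+){3})\s+"
    r"(?P<mac>[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})"
)


def find_offer_collisions(log_text):
    """Return {ip: [mac, ...]} for every IP that was offered to more than one MAC.

    Hypothetical helper: it only looks at DHCPOFFER lines and ignores timestamps,
    so it is meant for a short capture window, not for a long-running journal
    where an address may legitimately be re-offered after a lease expires.
    """
    offers = defaultdict(set)
    for match in OFFER_RE.finditer(log_text):
        offers[match.group("ip")].add(match.group("mac").lower())
    return {ip: sorted(macs) for ip, macs in offers.items() if len(macs) > 1}


# Excerpt copied from the journal output in the description.
SAMPLE = """\
Jan 22 23:50:55 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
Jan 22 23:50:55 DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
Jan 22 23:50:55 DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
"""

if __name__ == "__main__":
    for ip, macs in find_offer_collisions(SAMPLE).items():
        print(f"{ip} offered to: {', '.join(macs)}")
```

The text to analyse could come from something like "journalctl -u openstack-ironic-discoverd-dnsmasq" redirected to a file; an empty result means no address was offered to two different MACs in that capture.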
*** Bug 1301663 has been marked as a duplicate of this bug. ***
Is there any chance you could retry this with OSPd8?
*** Bug 1312020 has been marked as a duplicate of this bug. ***
The same issue shows up with OSPd8: in a virtual environment with 8 VMs and the default undercloud conf, one of the VMs times out while booting. I see the following in the openstack-ironic-inspector-dnsmasq.service journal:
DHCPNAK(br-ctlplane) 192.0.2.108 00:c7:f4:cf:58:7a address in use
DHCPNAK(br-ctlplane) 192.0.2.108 00:c7:f4:cf:58:7a address in use
DHCPNAK(br-ctlplane) 192.0.2.108 00:c7:f4:cf:58:7a address in use
DHCPNAK(br-ctlplane) 192.0.2.108 00:c7:f4:cf:58:7a address in use
which corresponds to the MAC of the VM that failed to boot.
FWIW I have seen the same happen in a recent RDO Mitaka. I have now switched to doing introspection node by node, thereby avoiding the bulk problem. If there are specific logs/tests that we want to have done, I can try to reproduce on Mitaka again, just let me know what you need.
*** Bug 1306417 has been marked as a duplicate of this bug. ***
Dnsmasq changed their hashing algorithm in version 2.53 to ameliorate the problem described in that link; AFAICT the new algorithm (https://github.com/guns/dnsmasq/blob/nerv/src/dhcp.c#L649) shouldn't produce the same output for two MACs which differ only in the final byte. So I think the problem is more to do with the tiny range of available IPs - though dnsmasq's stateless hash-of-MAC design does mean it will happen deterministically.
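To put a rough number on the "tiny range" point: the sketch below treats the initial offers as uniform, independent picks from the pool, which is an idealization (a birthday-problem estimate) and not dnsmasq's actual hash. The function name collision_probability is made up for illustration, and the pool size of 21 is simply the inclusive count of 10.200.200.100 through 10.200.200.120 from the description.

```python
from fractions import Fraction


def collision_probability(clients: int, pool_size: int) -> Fraction:
    """Chance that at least two of `clients` get the same initial offer.

    Assumption (not dnsmasq's real behaviour): each client's first offer is a
    uniform, independent pick from `pool_size` addresses.
    """
    if clients > pool_size:
        # More clients than addresses: a shared offer is unavoidable.
        return Fraction(1)
    no_collision = Fraction(1)
    for already_taken in range(clients):
        no_collision *= Fraction(pool_size - already_taken, pool_size)
    return 1 - no_collision


if __name__ == "__main__":
    for clients in (2, 8):
        p = collision_probability(clients, 21)
        print(f"{clients} clients, 21 addresses: {float(p):.3f}")
```

Because the real offer is a deterministic function of the MAC, a given pair of MACs either always collides or never does in a fixed range; the estimate only says how likely a freshly chosen set of MACs is to include such a pair.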
I'll try using --dhcp-sequential-ip as suggested by Milan in one ML thread.
Could someone please check whether adding dhcp-sequential-ip to their /etc/ironic-inspector/dnsmasq.conf fixes the issue?
Needinfo is still in effect, see comment 11
Created attachment 1132900 [details]
Reproducer python script

This is indeed a dnsmasq hash-collision issue. I'm attaching a reproducer script. It works by generating fake random DHCP discoveries in a burst, exercising the dnsmasq server. It requires scapy; pip install should resolve this dependency. Please adjust as needed (edit the conf.iface and address_count variables); the default values work OK on devstack. I've observed that some 15% of DHCP offers collide for me with the default settings.

As for working around the dnsmasq hash collision, running dnsmasq with the --dhcp-sequential-ip command line option solves it. Please note that dnsmasq doesn't rotate the address pool and will reject discoveries once the pool is exhausted. This workaround doesn't involve inspector.
Resetting needinfo flag based on Comment #13
The attachment #1132900 [details] was tested with scapy-2.2.0-5.fc22.noarch. See also Comment #13.
Created attachment 1132908 [details]
Reproducer python script

Update to Comment #13:
dnsmasq --dhcp-sequential-ip solves the issue
dnsmasq rotates the address pool with expiring leases

Attached a new version of the reproducer.

To reproduce the issue:
* use the default settings of dnsmasq and execute the script with sudo
* the script will exit once it detects a lease collision (usually after the first iteration)

To prove the workaround solves the issue:
* update the dhcp pool lease time to 2 minutes in /etc/ironic-inspector/dnsmasq.conf:
  dhcp-range=172.24.42.100,172.24.42.253,2m
* dnsmasq --dhcp-sequential-ip --config-file=/etc/ironic-inspector/dnsmasq.conf
* run the script through sudo
* the script will try to detect a collision 10 times
* the script will exit after ~22min with no collision detected
Thanks for the confirmation! The patches merged upstream, so we'll get the fix soon.
Hello,

reproduced and confirmed https://bugzilla.redhat.com/show_bug.cgi?id=1301659#c10 working on an OSP7 baremetal deployment with MACs emulated using QEMU (method B that I mentioned in the first post).

Output of $ journalctl -f -u openstack-ironic-discoverd-dnsmasq after the workaround:
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPREQUEST(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPACK(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPREQUEST(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPACK(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPOFFER(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPREQUEST(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPACK(br-ctlplane) 10.200.200.100 34:17:eb:e6:45:32
dnsmasq-dhcp[61273]: DHCPREQUEST(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef
dnsmasq-dhcp[61273]: DHCPACK(br-ctlplane) 10.200.200.101 34:17:eb:e6:45:ef

The same IP is no longer offered to MACs that are similar enough, so the NAKs are gone.

Without this workaround it is still:
dnsmasq-dhcp[61749]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPACK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32 address in use
dnsmasq-dhcp[61749]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32 address in use
dnsmasq-dhcp[61749]: DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32 address in use
dnsmasq-dhcp[61749]: DHCPREQUEST(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32
dnsmasq-dhcp[61749]: DHCPNAK(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:32 address in use
dnsmasq-dhcp[61749]: DHCPDISCOVER(br-ctlplane) 34:17:eb:e6:45:ef
dnsmasq-dhcp[61749]: DHCPOFFER(br-ctlplane) 10.200.200.115 34:17:eb:e6:45:ef

OSPD7 baremetal deployment:
dnsmasq-2.66-14.el7_1.x86_64
dnsmasq-utils-2.66-14.el7_1.x86_64
instack-undercloud-2.1.2-39.el7ost.noarch
python-rdomanager-oscplugin-0.0.10-28.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-10.el7ost.noarch
openstack-tripleo-0.0.7-0.1.1664e566.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-5.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-123.el7ost.noarch
openstack-tripleo-common-0.0.1.dev6-6.git49b57eb.el7ost.noarch
openstack-ironic-api-2015.1.2-2.el7ost.noarch
openstack-ironic-common-2015.1.2-2.el7ost.noarch
openstack-ironic-conductor-2015.1.2-2.el7ost.noarch
openstack-ironic-discoverd-1.1.0-8.el7ost.noarch

Note this is more of an OSPd7 issue than an OSPd8 one, as the latter has mechanisms to prevent these issues (e.g. one can add a delay to the introspection process).

Thanks.
Filip, would it be possible for you to run your reproducer with the suggested workaround (see Comment #11)? The workaround should resolve the issue regardless of product version; it's a dnsmasq configuration option. Thanks a lot! milan
(In reply to mkovacik from comment #19) ... Hello, adding "dhcp-sequential-ip" at the end of /etc/ironic-discoverd/dnsmasq.conf fixes the problem. Filip
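For anyone who wants to script that change, here is a minimal sketch of adding the option without duplicating it on repeated runs. The helper name ensure_option is made up for illustration; it works on the file's text only, so reading and writing /etc/ironic-discoverd/dnsmasq.conf (or /etc/ironic-inspector/dnsmasq.conf on later releases) and restarting the dnsmasq service are left to the caller.

```python
def ensure_option(conf_text: str, option: str = "dhcp-sequential-ip") -> str:
    """Return conf_text with `option` present on its own line.

    Hypothetical helper: an existing uncommented line equal to `option`
    leaves the text unchanged; otherwise the option is appended at the end.
    """
    if any(line.strip() == option for line in conf_text.splitlines()):
        return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + option + "\n"


if __name__ == "__main__":
    # First lines of the dnsmasq.conf shown in the bug description.
    original = "port=0\ninterface=br-ctlplane\nbind-interfaces\n"
    print(ensure_option(original), end="")
```

A commented-out "#dhcp-sequential-ip" line does not count as present, so the option is still appended in that case.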