Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1623706

Summary: autodiscovery doesn't work OOB - need to restart ironic_inspector_dnsmasq container.
Product: Red Hat OpenStack
Reporter: Alexander Chuzhoy <sasha>
Component: openstack-tripleo-heat-templates
Assignee: Harald Jensås <hjensas>
Status: CLOSED WORKSFORME
QA Contact: mlammon
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 14.0 (Rocky)
CC: bfournie, hjensas, mburns, mlammon, sasha, srevivo
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-20 15:07:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Consol screenshot of node being discovered. none

Description Alexander Chuzhoy 2018-08-29 23:17:55 UTC
Autodiscovery doesn't work out of the box (OOB); the ironic_inspector_dnsmasq container needs to be restarted first.


Environment:
python2-ironicclient-2.5.0-0.20180810135843.fb94fb8.el7ost.noarch
python2-ironic-inspector-client-3.3.0-0.20180810080932.53bf4e8.el7ost.noarch
puppet-ironic-13.3.1-0.20180822161555.5d7cfcf.el7ost.noarch
instack-undercloud-9.2.1-0.20180809233055.ed96987.el7ost.noarch


Steps to reproduce:
1. Deploy undercloud with autodiscovery enabled.
2. Try to boot nodes.

Result:
The nodes aren't discovered.


Workaround:

Restart the ironic_inspector_dnsmasq container.

Comment 1 Dmitry Tantsur 2018-08-30 08:57:30 UTC
Can we have some logs, please? Ideally also a tcpdump of the DHCP traffic before and after the restart.

Comment 4 Harald Jensås 2018-08-31 16:58:00 UTC
First I see the PXE client's DHCPREQUEST and a DHCPACK with the correct options, as far as I can tell:

Aug 31 12:42:49 dnsmasq-dhcp[1]: 667612435 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:42:49 dnsmasq-dhcp[1]: 667612435 vendor class: PXEClient:Arch:00000:UNDI:002001
Aug 31 12:42:49 dnsmasq-dhcp[1]: 667612435 user class: iPXE
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 DHCPDISCOVER(br-ctlplane) 52:54:00:73:4b:16 
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 tags: ctlplane-subnet, known, ipxe, br-ctlplane
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 DHCPOFFER(br-ctlplane) 192.168.24.100 52:54:00:73:4b:16 
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 requested options: 1:netmask, 3:router, 6:dns-server, 7:log-server, 
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 requested options: 12:hostname, 15:domain-name, 17:root-path, 
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 requested options: 43:vendor-encap, 60:vendor-class, 66:tftp-server, 
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 requested options: 67:bootfile-name, 119:domain-search, 128, 
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 requested options: 129, 130, 131, 132, 133, 134, 135, 175, 203
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 next server: 192.168.24.1
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  1 option: 53 message-type  2
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 54 server-identifier  192.168.24.1
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 51 lease-time  10m
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size: 40 option: 67 bootfile-name  http://192.168.24.1:8088/inspector.ipxe
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 58 T1  5m
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 59 T2  8m45s
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  4 option:  1 netmask  255.255.255.0
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 28 broadcast  192.168.24.255
Aug 31 12:42:52 dnsmasq-dhcp[1]: 667612435 sent size:  4 option:  3 router  192.168.24.1
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 vendor class: PXEClient:Arch:00000:UNDI:002001
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 user class: iPXE
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 DHCPDISCOVER(br-ctlplane) 52:54:00:73:4b:16 
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 tags: ctlplane-subnet, known, ipxe, br-ctlplane
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 DHCPOFFER(br-ctlplane) 192.168.24.100 52:54:00:73:4b:16 
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 requested options: 1:netmask, 3:router, 6:dns-server, 7:log-server, 
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 requested options: 12:hostname, 15:domain-name, 17:root-path, 
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 requested options: 43:vendor-encap, 60:vendor-class, 66:tftp-server, 
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 requested options: 67:bootfile-name, 119:domain-search, 128, 
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 requested options: 129, 130, 131, 132, 133, 134, 135, 175, 203
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 next server: 192.168.24.1
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  1 option: 53 message-type  2
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 54 server-identifier  192.168.24.1
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 51 lease-time  10m
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size: 40 option: 67 bootfile-name  http://192.168.24.1:8088/inspector.ipxe
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 58 T1  5m
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 59 T2  8m45s
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  4 option:  1 netmask  255.255.255.0
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 28 broadcast  192.168.24.255
Aug 31 12:42:53 dnsmasq-dhcp[1]: 667612435 sent size:  4 option:  3 router  192.168.24.1
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 vendor class: PXEClient:Arch:00000:UNDI:002001
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 user class: iPXE
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 DHCPREQUEST(br-ctlplane) 192.168.24.100 52:54:00:73:4b:16 
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 tags: ctlplane-subnet, known, ipxe, br-ctlplane
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 DHCPACK(br-ctlplane) 192.168.24.100 52:54:00:73:4b:16 
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 requested options: 1:netmask, 3:router, 6:dns-server, 7:log-server, 
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 requested options: 12:hostname, 15:domain-name, 17:root-path, 
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 requested options: 43:vendor-encap, 60:vendor-class, 66:tftp-server, 
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 requested options: 67:bootfile-name, 119:domain-search, 128, 
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 requested options: 129, 130, 131, 132, 133, 134, 135, 175, 203
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 next server: 192.168.24.1
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  1 option: 53 message-type  5
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 54 server-identifier  192.168.24.1
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 51 lease-time  10m
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size: 40 option: 67 bootfile-name  http://192.168.24.1:8088/inspector.ipxe
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 58 T1  5m
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 59 T2  8m45s
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  4 option:  1 netmask  255.255.255.0
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  4 option: 28 broadcast  192.168.24.255
Aug 31 12:43:01 dnsmasq-dhcp[1]: 667612435 sent size:  4 option:  3 router  192.168.24.1

Then there is a second DHCP exchange; this is the inspector image after it has booted. (It does not ask for the bootfile, TFTP server, etc. in its requested options.)

Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 DHCPDISCOVER(br-ctlplane) 52:54:00:73:4b:16 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 tags: ctlplane-subnet, known, br-ctlplane
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 DHCPOFFER(br-ctlplane) 192.168.24.100 52:54:00:73:4b:16 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 1:netmask, 28:broadcast, 2:time-offset, 121:classless-static-route, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 15:domain-name, 6:dns-server, 12:hostname, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 40:nis-domain, 41:nis-server, 42:ntp-server, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 26:mtu, 119:domain-search, 3:router, 121:classless-static-route, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 249, 33:static-route, 252, 42:ntp-server
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 bootfile name: undionly.kpxe
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 server name: localhost.localdomain
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 next server: 192.168.24.1
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  1 option: 53 message-type  2
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 54 server-identifier  192.168.24.1
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 51 lease-time  10m
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 58 T1  5m
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 59 T2  8m45s
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option:  1 netmask  255.255.255.0
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 28 broadcast  192.168.24.255
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option:  3 router  192.168.24.1
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 DHCPREQUEST(br-ctlplane) 192.168.24.100 52:54:00:73:4b:16 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 tags: ctlplane-subnet, known, br-ctlplane
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 DHCPACK(br-ctlplane) 192.168.24.100 52:54:00:73:4b:16 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 1:netmask, 28:broadcast, 2:time-offset, 121:classless-static-route, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 15:domain-name, 6:dns-server, 12:hostname, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 40:nis-domain, 41:nis-server, 42:ntp-server, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 26:mtu, 119:domain-search, 3:router, 121:classless-static-route, 
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 requested options: 249, 33:static-route, 252, 42:ntp-server
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 bootfile name: undionly.kpxe
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 server name: localhost.localdomain
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 next server: 192.168.24.1
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  1 option: 53 message-type  5
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 54 server-identifier  192.168.24.1
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 51 lease-time  10m
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 58 T1  5m
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 59 T2  8m45s
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option:  1 netmask  255.255.255.0
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option: 28 broadcast  192.168.24.255
Aug 31 12:43:09 dnsmasq-dhcp[1]: 932682500 sent size:  4 option:  3 router  192.168.24.1
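
To compare the two exchanges above without reading every line, the log can be grouped by the number that follows the dnsmasq-dhcp[1]: prefix. The Python sketch below does that under a few assumptions: the log lines have exactly the shape shown in this comment, and the names summarize, LINE_RE and MSG_RE are made up for illustration. It is a reading aid, not part of dnsmasq or ironic-inspector.

```python
import re
from collections import defaultdict

# Matches the "dnsmasq-dhcp[1]: <number> <rest>" part of each log line.
LINE_RE = re.compile(r"dnsmasq-dhcp\[\d+\]: (?P<xid>\d+) (?P<rest>.*)$")
# Matches a DHCP message type followed by the interface in parentheses.
MSG_RE = re.compile(r"^(DHCP[A-Z]+)\([^)]+\)")


def summarize(lines):
    """Group DHCP message types and requested option codes by the
    leading number on each dnsmasq-dhcp log line."""
    summary = defaultdict(lambda: {"messages": [], "requested": set()})
    for line in lines:
        match = LINE_RE.search(line.rstrip())
        if not match:
            continue
        xid, rest = match.group("xid"), match.group("rest")
        msg = MSG_RE.match(rest)
        if msg:
            summary[xid]["messages"].append(msg.group(1))
        elif rest.startswith("requested options:"):
            # Entries look like "67:bootfile-name" or a bare code like "128".
            for item in rest.split(":", 1)[1].split(","):
                item = item.strip()
                if item:
                    summary[xid]["requested"].add(int(item.split(":")[0]))
    return dict(summary)
```

Feeding the two blocks above into summarize() should show option 67 (bootfile-name) among the requested options of the first exchange only, which matches the observation that the booted inspector image no longer asks for boot parameters.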

The node is now in Ironic, and the PXE filter is reconfigured to block the new node.

Aug 31 12:44:17 dnsmasq[1]: inotify, new or changed file /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:73:4b:16
Aug 31 12:44:17 dnsmasq-dhcp[1]: read /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:73:4b:16

cat /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:73:4b:16
52:54:00:73:4b:16,ignore
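
The entry above is a one-line file named after the MAC address, containing the MAC followed by ",ignore". A minimal Python sketch of writing such an entry follows. The function name write_hostsdir_entry is hypothetical, and the allow branch (a bare MAC line) is an assumption about what an "allow" entry would look like; this is not the actual ironic-inspector code.

```python
import os
import tempfile


def write_hostsdir_entry(hostsdir, mac, allow):
    """Write one dhcp-hostsdir file named after the MAC address.

    With allow=False the file holds "<mac>,ignore", the form shown in
    the comment above. The allow=True form (bare MAC) is an assumption.
    """
    entry = mac if allow else "{},ignore".format(mac)
    path = os.path.join(hostsdir, mac)
    with open(path, "w") as handle:
        handle.write(entry + "\n")
    return path


if __name__ == "__main__":
    # Use a throwaway directory instead of /var/lib/ironic-inspector.
    demo_dir = tempfile.mkdtemp()
    demo_path = write_hostsdir_entry(demo_dir, "52:54:00:73:4b:16", allow=False)
    with open(demo_path) as handle:
        print(handle.read().strip())
```

Because dnsmasq watches the directory with inotify (as the log lines above show), a file written this way is picked up without restarting the service.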

After some time the inspector image tries to do DHCP again (see the attached screenshot).

The DHCP server logs:
Aug 31 12:50:50 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:51:05 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:51:20 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:51:28 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:51:40 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:51:49 dnsmasq-dhcp[1]: 932682500 available DHCP range: 192.168.24.100 -- 192.168.24.120

Then the DHCP client in the inspector image restarts and sends DHCP requests with a different transaction ID (the number at the start of each log message); these are ignored:

Aug 31 12:55:10 dnsmasq-dhcp[1]: 1397926474 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:55:10 dnsmasq-dhcp[1]: 1397926474 DHCPDISCOVER(br-ctlplane) 52:54:00:73:4b:16 ignored
Aug 31 12:55:15 dnsmasq-dhcp[1]: 1397926474 available DHCP range: 192.168.24.100 -- 192.168.24.120
Aug 31 12:55:15 dnsmasq-dhcp[1]: 1397926474 DHCPDISCOVER(br-ctlplane) 52:54:00:73:4b:16 ignored
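
When checking a longer log for nodes that the PXE filter is blocking, the "ignored" suffix on DHCPDISCOVER lines is the thing to look for. A small Python sketch, assuming the log format shown above (the names IGNORED_RE and ignored_macs are illustrative only):

```python
import re

# A DHCPDISCOVER line that dnsmasq dropped ends with "ignored".
IGNORED_RE = re.compile(
    r"DHCPDISCOVER\([^)]+\) (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) ignored",
    re.IGNORECASE,
)


def ignored_macs(lines):
    """Return the set of MAC addresses whose DHCPDISCOVER was ignored."""
    macs = set()
    for line in lines:
        match = IGNORED_RE.search(line)
        if match:
            macs.add(match.group("mac").lower())
    return macs
```

Any MAC returned here that also appears in the dhcp-hostsdir with ",ignore" is being blocked by the filter on purpose rather than by a dnsmasq fault.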





Comment 5 Harald Jensås 2018-08-31 16:59:46 UTC
Created attachment 1480124
Consol screenshot of node being discovered.

Comment 6 Harald Jensås 2018-08-31 17:02:00 UTC
The node is now in enroll state:

(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 9bc3ff59-e251-48fb-ae8b-16487cc704b3 | None | None          | power off   | enroll             | False       |
| 610eabf2-585b-459c-9644-2f51f338b437 | None | None          | power off   | enroll             | False       |
| 14912165-e71f-4f86-97b6-c40957513996 | None | None          | power off   | enroll             | False       |
| cac0e5f0-5c38-424b-aca0-a958991d0971 | None | None          | power off   | enroll             | False       |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
(undercloud) [stack@undercloud-0 ~]$ openstack baremetal port list --node cac0e5f0-5c38-424b-aca0-a958991d0971
+--------------------------------------+-------------------+
| UUID                                 | Address           |
+--------------------------------------+-------------------+
| 08fbe526-e38c-4153-85b3-1677070575be | 52:54:00:73:4b:16 |
+--------------------------------------+-------------------+


I am not sure what to make of the "No ipa-api-url" INFO log message on the console, though.

Comment 7 Dmitry Tantsur 2018-09-20 10:07:16 UTC
Harald, I did not quite get your last comment, did it work in the end? Sasha, do you still have problems with autodiscovery?

Comment 8 Alexander Chuzhoy 2018-09-20 12:07:39 UTC
Mike, does the issue reproduce for you?

Comment 9 mlammon 2018-09-20 14:38:32 UTC
I have not seen it. The only autodiscovery issue I know about right now is

https://bugzilla.redhat.com/show_bug.cgi?id=1627949

Comment 10 Harald Jensås 2018-09-20 14:46:03 UTC
(In reply to Dmitry Tantsur from comment #7)
> Harald, I did not quite get your last comment, did it work in the end?
> Sasha, do you still have problems with autodiscovery?

Yes, discovery worked and nodes introspected and ended up in 'enroll' state.

I think there may have been some confusion with https://bugzilla.redhat.com/show_bug.cgi?id=1625273 when I was looking into this, i.e. the issue I reproduced was 1625273, not this one.


Since Mike hasn't seen this again, maybe we should close it?