Created attachment 1058081 [details] log1 Discovery for Bare-metal environment takes longer time than it use to take, Waiting for discovery to finish... environment ( gold sign build for GA): ---------------------------------------- openstack-ironic-discoverd-1.1.0-5.el7ost.noarch openstack-ironic-conductor-2015.1.0-9.el7ost.noarch openstack-ironic-common-2015.1.0-9.el7ost.noarch python-ironic-discoverd-1.1.0-5.el7ost.noarch python-ironicclient-0.5.1-9.el7ost.noarch openstack-ironic-api-2015.1.0-9.el7ost.noarch python-rdomanager-oscplugin-0.0.8-43.el7ost.noarch Description: ------------- Eventually discovery work successfully (after ~+20 minutes) - but I suspect that the timing was much better and somehow the discovery process became longer . I'm using Bare-Metal environment sane as I used before : 7 X Dell Nodes. sudo journalctl -u openstack-ironic-discoverd -u openstack-ironic-discoverd-dnsmasq -u openstack-ironic-conducor: (Adding in Attachment .... log1 ) ---------------------------------------------------------------------------- [root@rhos-compute-node-18 /]# sudo journalctl -u openstack-ironic-discoverd -u openstack-ironic-discoverd-dnsmasq -u openstack-ironic-conducor -- Logs begin at Fri 2015-07-31 09:56:04 EDT, end at Fri 2015-07-31 11:15:40 EDT. -- Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Starting PXE boot dnsmasq service for ironic-discoverd... Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq[6687]: started, version 2.66 DNS disabled Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq[6687]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCP, IP range 192.168.0.100 -- 192.168.0.120, lease time 2m Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: TFTP root is /tftpboot Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Started PXE boot dnsmasq service for ironic-discoverd. Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Starting Hardware introspection service for OpenStack Ironic... Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Started Hardware introspection service for OpenStack Ironic. Jul 31 10:26:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.main:Enabled processing hooks: ['ramdisk_error' Jul 31 10:26:29 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug: * Running on http://0.0.0.0:5050/ Jul 31 10:35:32 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:32] "POST /v1/intro Jul 31 10:35:32 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:38 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:38] "POST /v1/intro Jul 31 10:35:38 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:43 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:43] "POST /v1/intro Jul 31 10:35:44 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:49 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:49] "POST /v1/intro Jul 31 10:35:49 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:55 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:55] "POST /v1/intro Jul 31 10:35:55 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:92 Jul 31 10:36:00 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:00] "POST /v1/intro Jul 31 10:36:00 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:36:06 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:06] "POST /v1/intro Jul 31 10:36:06 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:7c:29 Jul 31 10:36:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:11] "GET /v1/intros Jul 31 10:36:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:11] "GET /v1/intros Jul 31 10:36:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:11] "GET /v1/intros Jul 31 10:36:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:11] "GET /v1/intros Jul 31 10:36:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:11] "GET /v1/intros Jul 31 10:36:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:11] "GET /v1/intros Jul 31 10:36:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:11] "GET /v1/intros Jul 31 10:36:21 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:21] "GET /v1/intros Jul 31 10:36:21 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:21] "GET /v1/intros Jul 31 10:36:21 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:21] "GET /v1/intros Jul 31 10:36:21 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:21] "GET /v1/intros Jul 31 10:36:21 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:21] "GET /v1/intros Jul 31 10:36:21 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:21] "GET /v1/intros ...skipping... Jul 31 10:51:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCPREQUEST(br-ctlplane) 192.168.0.100 00:0a:f7:7c:29:28 Jul 31 10:51:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCPACK(br-ctlplane) 192.168.0.100 00:0a:f7:7c:29:28 Jul 31 10:52:14 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCPREQUEST(br-ctlplane) 192.168.0.100 00:0a:f7:7c:29:28 Jul 31 10:52:14 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCPACK(br-ctlplane) 192.168.0.100 00:0a:f7:7c:29:28 Jul 31 10:53:05 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCPREQUEST(br-ctlplane) 192.168.0.100 00:0a:f7:7c:29:28 Jul 31 10:53:05 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCPACK(br-ctlplane) 192.168.0.100 00:0a:f7:7c:29:28 Jul 31 10:53:16 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: ERROR:ironicclient.common.http:Error contacting Ironic server: A port Jul 31 10:53:16 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: WARNING:ironic_discoverd.process:MAC 00:0a:f7:7c:29:28 appeared in in Jul 31 10:53:18 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:192.168.0.100 - - [31/Jul/2015 10:53:18] "POST /v1/cont Jul 31 10:53:18 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:53:18] "GET /v1/intros Jul 31 10:53:18 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:53:18] "GET /v1/intros Jul 31 10:53:18 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:53:18] "GET /v1/intros Jul 31 10:53:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:53:28] "GET /v1/intros Jul 31 10:53:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:53:28] "GET /v1/intros Jul 31 10:53:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.process:Introspection finished successfully for Jul 31 10:53:38 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:53:38] "GET /v1/intros Jul 31 10:53:48 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:53:48] "GET /v1/intros Jul 31 11:04:48 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: error 0 TFTP Aborted received from 192.168.0.8 Jul 31 11:04:48 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: failed sending /tftpboot/undionly.kpxe to 192.168.0.8 Jul 31 11:04:48 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/undionly.kpxe to 192.168.0.8 Jul 31 11:04:58 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: error 0 TFTP Aborted received from 192.168.0.9 Jul 31 11:04:58 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: failed sending /tftpboot/undionly.kpxe to 192.168.0.9 Jul 31 11:04:58 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/undionly.kpxe to 192.168.0.9 Jul 31 11:05:01 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: error 0 TFTP Aborted received from 192.168.0.10 Jul 31 11:05:01 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: failed sending /tftpboot/undionly.kpxe to 192.168.0.10 Jul 31 11:05:01 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/undionly.kpxe to 192.168.0.10 Jul 31 11:05:09 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: error 0 TFTP Aborted received from 192.168.0.11 Jul 31 11:05:09 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: failed sending /tftpboot/undionly.kpxe to 192.168.0.11 Jul 31 11:05:09 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/undionly.kpxe to 192.168.0.11 Jul 31 11:05:19 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/token-bcbd7011-1a01-4176-aae3-8bd8d3c7abcb to 192.168.0.8 Jul 31 11:05:29 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/token-1dd1d164-49fc-4d30-ab69-ac0f2f27cca6 to 192.168.0.9 Jul 31 11:05:32 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/token-90c12579-80dc-46f0-9d33-a5522bf0c2b7 to 192.168.0.10 Jul 31 11:05:35 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: error 0 TFTP Aborted received from 192.168.0.7 Jul 31 11:05:35 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: failed sending /tftpboot/undionly.kpxe to 192.168.0.7 Jul 31 11:05:35 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/undionly.kpxe to 192.168.0.7 Jul 31 11:05:39 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/token-6102e61d-d11e-44d1-b4ee-b3de695cf78b to 192.168.0.11 Jul 31 11:06:11 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: sent /tftpboot/token-e1cbb4df-dbbd-4426-a577-b1396dece066 to 192.168.0.7 [root@rhos-compute-node-18 /]# date Fri Jul 31 11:15:47 EDT 2015 [root@rhos-compute-node-18 /]# sudo journalctl -u openstack-ironic-discoverd -u openstack-ironic-discoverd-dnsmasq -u openstack-ironic-conducor -- Logs begin at Fri 2015-07-31 09:56:04 EDT, end at Fri 2015-07-31 11:15:50 EDT. -- Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Starting PXE boot dnsmasq service for ironic-discoverd... Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq[6687]: started, version 2.66 DNS disabled Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq[6687]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCP, IP range 192.168.0.100 -- 192.168.0.120, lease time 2m Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-tftp[6687]: TFTP root is /tftpboot Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Started PXE boot dnsmasq service for ironic-discoverd. Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Starting Hardware introspection service for OpenStack Ironic... Jul 31 10:26:27 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com systemd[1]: Started Hardware introspection service for OpenStack Ironic. Jul 31 10:26:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.main:Enabled processing hooks: ['ramdisk_error' Jul 31 10:26:29 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug: * Running on http://0.0.0.0:5050/ Jul 31 10:35:32 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:32] "POST /v1/intro Jul 31 10:35:32 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:38 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:38] "POST /v1/intro Jul 31 10:35:38 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:43 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:43] "POST /v1/intro Jul 31 10:35:44 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:49 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:49] "POST /v1/intro Jul 31 10:35:49 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:35:55 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:35:55] "POST /v1/intro Jul 31 10:35:55 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:92 Jul 31 10:36:00 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:00] "POST /v1/intro Jul 31 10:36:00 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:79:93 Jul 31 10:36:06 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:werkzeug:10.19.184.15 - - [31/Jul/2015 10:36:06] "POST /v1/intro Jul 31 10:36:06 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:7c:29 Adding in Attachment .... log1
Created attachment 1058082 [details] ironic-api.log
Created attachment 1058083 [details] ironic-conductor
My first finding is this: Jul 31 10:36:06 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com ironic-discoverd[6693]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:0a:f7:7c:29 Jul 31 10:51:28 rhos-compute-node-18.osqe.lab.eng.bos.redhat.com dnsmasq-dhcp[6687]: DHCPREQUEST(br-ctlplane) 192.168.0.100 00:0a:f7:7c:29:28 15 (!!) minutes passed since ironic-discoverd requested power on till node tried to get its IP address? Anyway, I'm sorry I mistyped conductor unit when giving you a command line. Could you reattempt getting the full logs with the following command: sudo journalctl -l -u openstack-ironic-discoverd -u openstack-ironic-discoverd-dnsmasq -u openstack-ironic-conductor --since "10:35" Please don't skip anything from the output, just attach it as it is
I am also seeing very slow introspection. Checking my environment for networking changes. My installs are scripted and I use same servers every time. But since Friday introspection has gone from 27 minutes to 90+ minutes. I typically halt it and then introspect the remaining servers one at a time manually.
Please provide the same logs as in comment 5.
Created attachment 1059124 [details] log2 I re-provision the setup and re-discover the nodes again, so the timing might be different , but I'm adding the log file of according the request : "sudo journalctl -l -u openstack-ironic-discoverd -u openstack-ironic-discoverd-dnsmasq -u openstack-ironic-conductor "
Given the provided logs, what's your assessment at this point?
There are different components that add up to the problem: 1. 2 minutes retries on a duplicate MAC 2. Slow ramdisk (no guesses for now) 3. Concurrency problem reported by Ben (can't reproduce on git master, need to bisect). New Omri's log do not have such a huge gap between power on request and first DHCP, only 2 minutes which is fine for a bare metal IMO. Need to investigate more. How much of a priority is this issue?
Dmitry, Thanks - this is lower priority. Moving this to OSP 8 for further investigation at that time.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Dmitry, I think the "slowness" reported back in July '15 and the failure I'm seeing today (April, '16) are related. I googled for "Director OR ironic 'failed sending /tftpboot/undionly.kpxe'" and this is the first bug to come up. I see this in the logs: Apr 11 10:35:16 ops2 dnsmasq-dhcp[1490]: DHCPREQUEST(br-ctlplane) 10.3.3.70 ec:f4:bb:e7:06:cc Apr 11 10:35:16 ops2 dnsmasq-dhcp[1490]: DHCPACK(br-ctlplane) 10.3.3.70 ec:f4:bb:e7:06:cc Apr 11 10:35:16 ops2 dnsmasq-dhcp[1490]: DHCPREQUEST(br-ctlplane) 10.3.3.70 ec:f4:bb:e7:06:cc Apr 11 10:35:16 ops2 dnsmasq-dhcp[1490]: DHCPACK(br-ctlplane) 10.3.3.70 ec:f4:bb:e7:06:cc Apr 11 10:35:16 ops2 dnsmasq-tftp[1490]: error 0 TFTP Aborted received from 10.3.3.70 Apr 11 10:35:16 ops2 dnsmasq-tftp[1490]: failed sending /tftpboot/undionly.kpxe to 10.3.3.70 Apr 11 10:35:16 ops2 dnsmasq-tftp[1490]: sent /tftpboot/undionly.kpxe to 10.3.3.70 While logged into the console of the node, I was able to catch it fail during the pxe boot and ctl-b to the ipxe command prompt. I waited a couple minutes, entered 'autoboot' and it successfully downloaded the ram disk and introspection completed, however, I did see repeated messages like this in the logs after that: Apr 11 11:12:10 ops2 dnsmasq-dhcp[4254]: DHCPDISCOVER(tapfd74e556-1e) ec:f4:bb:e7:06:cc no address available Apr 11 11:12:13 ops2 dnsmasq-dhcp[1490]: DHCPDISCOVER(br-ctlplane) ec:f4:bb:e7:06:cc Apr 11 11:12:13 ops2 dnsmasq-dhcp[1490]: DHCPOFFER(br-ctlplane) 10.3.3.71 ec:f4:bb:e7:06:cc Apr 11 11:12:14 ops2 dnsmasq-dhcp[1490]: DHCPDISCOVER(br-ctlplane) ec:f4:bb:e7:06:cc Apr 11 11:12:14 ops2 dnsmasq-dhcp[4254]: DHCPDISCOVER(tapfd74e556-1e) ec:f4:bb:e7:06:cc no address available Apr 11 11:12:14 ops2 dnsmasq-dhcp[1490]: DHCPOFFER(br-ctlplane) 10.3.3.71 ec:f4:bb:e7:06:cc Apr 11 11:12:14 ops2 dnsmasq-dhcp[1490]: DHCPDISCOVER(br-ctlplane) ec:f4:bb:e7:06:cc Apr 11 11:12:14 ops2 dnsmasq-dhcp[1490]: DHCPOFFER(br-ctlplane) 10.3.3.71 ec:f4:bb:e7:06:cc Apr 11 11:12:14 ops2 dnsmasq-dhcp[4254]: DHCPDISCOVER(tapfd74e556-1e) ec:f4:bb:e7:06:cc no address available Why are there 2 dnsmasq-dhcp services running - PIDs 1490 and 4254 - one is listening on the br-ctlplane the other is listening on tapfd74e556-1e. Where did that tapfd74e556-1e device even come from??
Created attachment 1146040 [details] director ironic logs v7.3.1
> Apr 11 10:35:16 ops2 dnsmasq-tftp[1490]: failed sending /tftpboot/undionly.kpxe to 10.3.3.70 Note that this message alone is not an error yet. > Why are there 2 dnsmasq-dhcp services running - PIDs 1490 and 4254 - one is listening on the br-ctlplane the other is listening on tapfd74e556-1e. Where did that tapfd74e556-1e device even come from?? One on tap is from neutron, the other one is from inspector. This is fine. Overall I think you have a different bug, please report it separately. This one is really about slowness, and maybe it's actually fixed.
Hi! Can anyone still reproduce it on OSP10? Please make sure you don't have benchmarking enabled, when reproducing.