openstack-ironic: bulk introspection doesn't complete [virtualbmc]

Environment:
instack-undercloud-6.0.0-0.20170123140647.25ecc19.el7.centos.noarch
puppet-ironic-10.1.0-0.20170123133130.70e2f98.el7.centos.noarch
python-ironic-inspector-client-1.10.0-0.20161219133602.0eae82e.el7.centos.noarch
python2-ironicclient-1.10.0-0.20170120194459.808a4cb.el7.centos.noarch
openstack-ironic-api-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
python-ironic-tests-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
openstack-ironic-inspector-4.2.1-0.20170120195931.59a2009.el7.centos.noarch
openstack-ironic-common-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
openstack-ironic-conductor-6.2.1-0.20170120231147.eae8e07.el7.centos.noarch
python-ironic-lib-2.5.1-0.20170120192630.4e16718.el7.centos.noarch

Steps to reproduce:
1. Deploy the undercloud
2. Register nodes
3. Attempt introspection: openstack overcloud node introspect --all-manageable

Result:
All nodes but one complete introspection.

Note: I removed the problematic node from ironic and restarted introspection. Again, all nodes but one (a different one this time) completed introspection.

Expected result:
Bulk introspection completes for all nodes.
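For reference, a minimal sketch of the reproduction flow (assuming an Ocata-era TripleO undercloud; the exact CLI may differ by release):

openstack undercloud install                           # 1. deploy the undercloud
openstack overcloud node import instackenv.json        # 2. register nodes
openstack overcloud node introspect --all-manageable   # 3. bulk introspection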
Created attachment 1243739 [details]
ironic-inspector logs

Interesting: it looks like introspection actually succeeded (see the attached ironic-inspector logs). Will investigate further tomorrow.
I wanted to collect the ramdisk logs, but the ramdisk log for the very same node that is not passing introspection never gets created. The ramdisk logs for the rest of the nodes are created just fine (following the procedure at http://tripleo.org/troubleshooting/troubleshooting.html#accessing-logs-from-the-ramdisk).
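For anyone following along, a sketch of where the inspector keeps these ramdisk log archives on the undercloud (the path and archive naming depend on the ramdisk_logs_dir setting and the release, so treat this as an assumption):

sudo ls /var/log/ironic-inspector/ramdisk/
mkdir /tmp/ramdisk-logs && cd /tmp/ramdisk-logs
sudo tar -xf /var/log/ironic-inspector/ramdisk/<archive>.tar.gz   # <archive> is a placeholder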
Ok, it seems like one node times out for some reason. Could you please check the node's console while it's booting? Verify that it actually reboots, and see at which stage it fails. Thanks!
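Since the nodes here are backed by virtualbmc, the console can be watched from the virt host, for example (a sketch; <domain> is a placeholder for whatever libvirt calls the VM):

virsh list --all
virsh console <domain>   # watch the PXE boot and ramdisk stages; VNC/virt-manager works too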
Doing more introspection runs on that setup, I got into situations like this:

[stack@undercloud ~]$ openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| c0ed29df-1b05-4d50-a76b-777945b0ec4c | True     | None  |
| 5eb7a8be-9a01-47e9-8a7f-f6839449d4a3 | False    | None  |
| 409cd097-7d2a-4664-bef3-3073729cf597 | True     | None  |
| 5a7fa6e9-5a58-4bd7-bb19-357e421a0e82 | False    | None  |
| b5de4633-e097-4f87-9d2e-fe38a3351127 | False    | None  |
| ab60aeba-a868-40fb-92f2-92313f90b64b | False    | None  |
| b20caa86-51b4-40f8-83a5-53295da56bd7 | True     | None  |
+--------------------------------------+----------+-------+

Only a subset of the nodes passes introspection. On the console of one of the nodes that does not pass, I see what's in the attached file (console).
Created attachment 1244445 [details] console
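To retry just one stuck node instead of the whole bulk run, something like the following should work (a sketch, assuming the abort/start commands available in this inspector client version; the UUID is one of the unfinished nodes from the table above):

openstack baremetal introspection abort 5eb7a8be-9a01-47e9-8a7f-f6839449d4a3
openstack baremetal introspection start 5eb7a8be-9a01-47e9-8a7f-f6839449d4a3
openstack baremetal introspection status 5eb7a8be-9a01-47e9-8a7f-f6839449d4a3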
Just deploying the latest Ocata with quickstart looks as follows (after introspection):

[stack@undercloud ~]$ ironic node-list
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name      | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| ebf5678f-b69b-4f54-a773-8a3c92793d01 | control-0 | None          | power off   | manageable         | False       |
| d8def4da-1001-4fc9-b6ee-00566e3d6125 | control-1 | None          | power off   | manageable         | False       |
| d898ec08-0a1e-4e3b-a500-8e3af2966138 | control-2 | None          | power off   | manageable         | False       |
| 6f277ba3-3ddf-4edf-9722-e281f2f518bf | compute-0 | None          | power off   | manageable         | False       |
| 21d71ad9-3f60-4d47-8dd3-b00859c71ad9 | compute-1 | None          | power on    | manageable         | False       |
| 79aa0029-a1f2-4ed6-8369-51daa0c7c022 | compute-2 | None          | power off   | manageable         | False       |
| 73822262-f63c-4b04-8909-cc306e5ed127 | ceph-0    | None          | power off   | manageable         | False       |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+

[stack@undercloud ~]$ openstack baremetal introspection bulk status
+--------------------------------------+----------+-------+
| Node UUID                            | Finished | Error |
+--------------------------------------+----------+-------+
| ebf5678f-b69b-4f54-a773-8a3c92793d01 | True     | None  |
| d8def4da-1001-4fc9-b6ee-00566e3d6125 | True     | None  |
| d898ec08-0a1e-4e3b-a500-8e3af2966138 | True     | None  |
| 6f277ba3-3ddf-4edf-9722-e281f2f518bf | True     | None  |
| 21d71ad9-3f60-4d47-8dd3-b00859c71ad9 | False    | None  |
| 79aa0029-a1f2-4ed6-8369-51daa0c7c022 | True     | None  |
| 73822262-f63c-4b04-8909-cc306e5ed127 | True     | None  |
+--------------------------------------+----------+-------+
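A quick way to pull out just the unfinished nodes from that output (a sketch relying on the standard OSC -f/-c output options):

openstack baremetal introspection bulk status -f value -c 'Node UUID' -c Finished | awk '$2 == "False" { print $1 }'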
Try setting https://github.com/openstack/ironic-inspector/blob/master/example.conf#L63 to .* and see if it fixes your problem. If not, please paste your nodes.json.
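Concretely, that suggestion comes down to something like this on the undercloud (a sketch; I'm assuming the option lives in the [DEFAULT] section of /etc/ironic-inspector/inspector.conf in this release):

# /etc/ironic-inspector/inspector.conf
[DEFAULT]
introspection_delay_drivers = .*

sudo systemctl restart openstack-ironic-inspector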
Edited the file per comment #9:

[root@undercloud ~]# grep introspection_delay_drivers /etc/ironic-inspector/inspector.conf
introspection_delay_drivers = ^.*_ssh$

And bounced the openstack-ironic-inspector service:

systemctl restart openstack-ironic-inspector.service

Restarted bulk introspection. Same result.
[stack@undercloud ~]$ cat instackenv.json
{
  "nodes": [
    { "name": "control-0", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6230", "mac": ["00:50:d0:ad:dd:d9"], "cpu": "2", "memory": "16384", "disk": "50", "arch": "x86_64", "capabilities": "profile:control,boot_option:local" },
    { "name": "control-1", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6231", "mac": ["00:50:d0:ad:dd:dd"], "cpu": "2", "memory": "16384", "disk": "50", "arch": "x86_64", "capabilities": "profile:control,boot_option:local" },
    { "name": "control-2", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6232", "mac": ["00:50:d0:ad:dd:e1"], "cpu": "2", "memory": "16384", "disk": "50", "arch": "x86_64", "capabilities": "profile:control,boot_option:local" },
    { "name": "compute-0", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6233", "mac": ["00:50:d0:ad:dd:e5"], "cpu": "2", "memory": "8192", "disk": "50", "arch": "x86_64", "capabilities": "profile:compute,boot_option:local" },
    { "name": "compute-1", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6234", "mac": ["00:50:d0:ad:dd:e9"], "cpu": "2", "memory": "8192", "disk": "50", "arch": "x86_64", "capabilities": "profile:compute,boot_option:local" },
    { "name": "compute-2", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6237", "mac": ["00:50:d0:ad:dd:ed"], "cpu": "2", "memory": "8192", "disk": "50", "arch": "x86_64", "capabilities": "profile:compute,boot_option:local" },
    { "name": "ceph-0", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6235", "mac": ["00:50:d0:ad:dd:f1"], "cpu": "2", "memory": "8192", "disk": "50", "arch": "x86_64", "capabilities": "profile:ceph,boot_option:local" },
    { "name": "ceph-1", "pm_password": "password", "pm_type": "pxe_ipmitool", "pm_user": "admin", "pm_addr": "127.0.0.1", "pm_port": "6236", "mac": ["00:50:d0:ad:dd:f5"], "cpu": "2", "memory": "8192", "disk": "50", "arch": "x86_64", "capabilities": "profile:ceph,boot_option:local" }
  ]
}
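As a side note, a quick sanity check of that file (a sketch; duplicate vbmc ports are a common way for exactly one node to misbehave):

python -m json.tool instackenv.json > /dev/null && echo "valid JSON"
jq -r '.nodes[].pm_port' instackenv.json | sort | uniq -d   # prints nothing if all ports are unique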
> introspection_delay_drivers = ^.*_ssh$

Sorry, I asked to set it to .*, but it seems like you've left the default value, no?
Changed the line to read:

introspection_delay_drivers = ^.*$

The introspection passed successfully. Are we going to make this the default? Thanks.
Yeah, I'm on it.
Verified.

Environment:
openstack-ironic-inspector-5.0.1-0.20170214181727.babc2b6.el7ost.noarch

Added 9 nodes to ironic using vbmc and introspected them successfully, as shown below:

[stack@undercloud-0 ~]$ for i in `openstack baremetal node list -c UUID -f value`; do ironic node-show $i | grep driver_info -A3; done
| driver_info | {u'ipmi_port': u'6231', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6232', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6233', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6237', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6238', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6239', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6234', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6235', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |
| driver_info | {u'ipmi_port': u'6236', u'ipmi_username': u'admin', u'deploy_kernel':    |
|             | u'b4634d98-c988-49f4-b9de-6d46e24da8d4', u'ipmi_address': u'172.16.0.1', |
|             | u'deploy_ramdisk': u'4b00a5a2-c3f3-42be-aabd-07c36f49ae4b',              |
|             | u'ipmi_password': u'******'}                                             |

[stack@undercloud-0 ~]$ openstack baremetal introspection bulk start
This command is deprecated. Please use "openstack overcloud node introspect" to introspect manageable nodes instead.
Setting nodes for introspection to manageable...
Starting introspection of manageable nodes
Started Mistral Workflow tripleo.baremetal.v1.introspect_manageable_nodes. Execution ID: a9ca7e46-edf6-45c3-a191-b5d8e11f0ba1
Waiting for introspection to finish...
Waiting for messages on queue 'f0c21899-2305-4581-aea2-aa098d4cce90' with no timeout.
Introspection for UUID f01bf51d-52ec-4236-8b2f-0c8cd395a8d6 finished successfully.
Introspection for UUID 5ce84ff7-391e-4bdd-9131-f1e111d88bca finished successfully.
Introspection for UUID 7c1811c1-4dbe-4562-8b0b-629fdac26a6d finished successfully.
Introspection for UUID 6b631edb-a9ed-4cec-8756-1a5c26b6e320 finished successfully.
Introspection for UUID f0e347ca-9636-43b3-950e-df0abbabd708 finished successfully.
Introspection for UUID 267594c5-688f-4d9b-98d3-d7fa9a516083 finished successfully.
Introspection for UUID 2c25db0e-8c50-430b-a922-0d0528e42dba finished successfully.
Introspection for UUID e63c4901-3da4-41d1-9a2a-d24816873bd9 finished successfully.
Introspection for UUID 0c2d6bda-e863-4f39-9f5e-97e84b4df9df finished successfully.
Introspection completed.
Setting manageable nodes to available...
Started Mistral Workflow tripleo.baremetal.v1.provide_manageable_nodes. Execution ID: fd85187d-428a-428c-adaa-87229b166c08
Waiting for messages on queue 'f0c21899-2305-4581-aea2-aa098d4cce90' with no timeout.
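For completeness, the vbmc endpoints behind those driver_info entries are typically created along these lines (a sketch; the domain name and port are examples, not taken from this setup):

vbmc add control-0 --port 6231 --username admin --password password
vbmc start control-0
vbmc list   # confirm the BMC is "running" before introspection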
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245