Description of problem:
Overcloud nodes HW: IBM x3650 M4
Introspection of the overcloud nodes never completes.

[stack@undercloud ~]$ ironic node-list
+--------------------------------------+------------------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name                   | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------------------------+---------------+-------------+--------------------+-------------+
| 1e72cddd-1063-454d-b76c-82611dd840b9 | overcloud-controller-1 | None          | power off   | available          | False       |
| 96f3c503-59bd-4771-90ee-e3d74af40cbe | overcloud-controller-2 | None          | power off   | available          | False       |
| 70f884be-348c-42fa-9d3d-d5ba4b9a8f8c | overcloud-controller-3 | None          | power off   | available          | False       |
| 2912a594-ba73-4971-92c8-a9f7ea3e17a5 | overcloud-compute-1    | None          | power off   | available          | False       |
| e7e51952-f639-493b-abee-1fe6fc43e6ee | overcloud-compute-2    | None          | power off   | available          | False       |
| a48fc89b-9479-44fb-9bcc-a806da39b659 | overcloud-compute-3    | None          | power off   | available          | False       |
| 7c243947-37c9-4adc-a563-cfcb3ce459c5 | overcloud-storage-1    | None          | power off   | available          | False       |
| 9cb9d69a-7aba-4a91-8d05-0400a0992159 | overcloud-storage-2    | None          | power off   | available          | False       |
| 9f92ebac-b0d7-4a44-98f4-ec5b50fc38ef | overcloud-storage-3    | None          | power off   | available          | False       |
+--------------------------------------+------------------------+---------------+-------------+--------------------+-------------+

[stack@undercloud ~]$ openstack baremetal introspection bulk start
Setting nodes for introspection to manageable...
Starting introspection of node: 1e72cddd-1063-454d-b76c-82611dd840b9
Starting introspection of node: 96f3c503-59bd-4771-90ee-e3d74af40cbe
Starting introspection of node: 70f884be-348c-42fa-9d3d-d5ba4b9a8f8c
Starting introspection of node: 2912a594-ba73-4971-92c8-a9f7ea3e17a5
Starting introspection of node: e7e51952-f639-493b-abee-1fe6fc43e6ee
Starting introspection of node: a48fc89b-9479-44fb-9bcc-a806da39b659
Starting introspection of node: 7c243947-37c9-4adc-a563-cfcb3ce459c5
Starting introspection of node: 9cb9d69a-7aba-4a91-8d05-0400a0992159
Starting introspection of node: 9f92ebac-b0d7-4a44-98f4-ec5b50fc38ef
Waiting for introspection to finish...

**** It gets stuck here forever. ****

Version-Release number of selected component (if applicable):
RHEL OSP 9

How reproducible:

Steps to Reproduce:
1. Try to introspect overcloud nodes with "IBM x3650 M4" hardware.
2. Introspection never completes.
3.

Actual results:
Introspection never completes.

Expected results:
Introspection should complete successfully.

Additional info:
More information coming in internal comments.
Tweaked agent.ramdisk and added a root password to it, following the official Red Hat documentation. Confirmed that the HW is certified: https://access.redhat.com/ecosystem/hardware/956863
Hi! The screenshot was taken too late, when introspection was already finished. Could you please find the moment when it actually tries to post introspection data? Also please attach openstack-ironic-inspector logs.
Please tell me more about your environment. E.g. it seems like you're using PXE instead of iPXE, why is that? Please attach /etc/ironic-inspector/dnsmasq.conf and /tftpboot/pxelinux.cfg/default.
AFAIK, they changed from iPXE to PXE because of the following bug. Ironic was changed to use PXE instead of iPXE following the steps in ironic-ipxe-to-pxe: https://bugzilla.redhat.com/show_bug.cgi?id=1326086#c4
I don't think it is needed any more, especially for OSP 9. I'd like to know the details of the original issue then.
Created attachment 1216481 [details] ipxe wont boot
If we use ipxe instead of PXE, the server will not boot at all.
Hi! I see that the MAC used for iPXE (98:be:94:41:5d:bb) differs from the one used for PXE (98:be:94:41:5e:9b). Did you change the boot order when switching to PXE? Have you tried making iPXE boot from the latter NIC? Also please attach the files requested in comment 10.
Created attachment 1216511 [details] /var/log/message of the introspected node

The first NIC of the IBM x3650 M4 contains its own DHCP server and also shares its port with IPMI, so it cannot be used for PXE boot and cannot get an IP address via DHCP. The customer therefore disabled PXE on the first NIC and enabled it on the second NIC.

I followed this article: https://access.redhat.com/articles/2142881

The content of the default file /tftp/pxelinux.cfg/default:

default discovery

LABEL discovery
kernel agent.kernel
append initrd=agent.ramdisk inspector_callback_url=http://172.16.0.1:5050/v1/continue RUNBENCH=0
ipappend 3
inspector_callback_url should be inspection_callback_url :) Chances are high that this will fix your PXE problem. Could you please try? The customer portal documentation is plain wrong; I'll file a separate bug against it. As for iPXE, I wonder why it switches to the first NIC anyway.
Oh, sorry, it should be ipa-inspection-callback-url even
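For reference, a sketch of how the default file quoted earlier in this thread might look with only the parameter name changed to ipa-inspection-callback-url. This assumes every other line (label, kernel and ramdisk names, callback address, RUNBENCH=0, ipappend) stays exactly as reported; it is an illustration, not a verified configuration:

```
default discovery

LABEL discovery
kernel agent.kernel
append initrd=agent.ramdisk ipa-inspection-callback-url=http://172.16.0.1:5050/v1/continue RUNBENCH=0
ipappend 3
```

The only difference from the reported file is the kernel argument name on the append line.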
I've filed a documentation bug: https://bugzilla.redhat.com/show_bug.cgi?id=1391019. Please check if the recommendations there help your case.
Hey, thanks for the help. "ipa-inspection-callback-url" works for me. I followed the updated document, and introspection now finishes successfully. Thank you so much again.
Thanks for confirming!