Description of problem:

This is an OSP bug to track the ipmitool bug https://bugzilla.redhat.com/show_bug.cgi?id=1831158

With the version of ipmitool used in RHEL 8.2, introspection fails when using HP iLO BMCs. This was seen on an HP ProLiant DL360 Gen9.

Introspection fails with:

openstack overcloud node introspect hp-dl360-g9-02 --provide
Waiting for introspection to finish...
Waiting for messages on queue 'tripleo' with no timeout.
Introspection of node completed:ced799b5-6619-44db-90cd-71c3955e3043. Status:FAILED. Errors:Failed to set boot device to PXE: Timed out waiting for a reply to message ID a3c7ab7325004808b4ae6411dce0f2db (HTTP 500)
Retrying 1 nodes that failed introspection. Attempt 1 of 3
Introspection of node completed:ced799b5-6619-44db-90cd-71c3955e3043. Status:FAILED. Errors:Failed to set boot device to PXE: Timed out waiting for a reply to message ID ff20e76a05d444eabecf80031e9a518d (HTTP 500)
Retrying 1 nodes that failed introspection. Attempt 2 of 3
Introspection of node completed:ced799b5-6619-44db-90cd-71c3955e3043. Status:FAILED. Errors:Failed to set boot device to PXE: Timed out waiting for a reply to message ID adcc70c4f43d4d06818e31718f1882e2 (HTTP 500)
Retrying 1 nodes that failed introspection. Attempt 3 of 3
Introspection of node completed:ced799b5-6619-44db-90cd-71c3955e3043. Status:FAILED. Errors:Failed to set boot device to PXE: Timed out waiting for a reply to message ID 5c641cd7bf5847fb8d643ef4ad120243 (HTTP 500)
Retry limit reached with 1 nodes still failing introspection

In the logs we see:

containers/ironic/ironic-conductor.log.1:2020-05-04 23:55:41.385 7 DEBUG ironic.common.utils [req-eb49faaa-94bd-4f0e-badd-064272ba1ebc - - - - -] Command stderr is: "Unable to Get Channel Cipher Suites
containers/ironic/ironic-conductor.log.1:2020-05-04 23:57:52.657 7 DEBUG ironic.common.utils [req-eb49faaa-94bd-4f0e-badd-064272ba1ebc - - - - -] Command stderr is: "Unable to Get Channel Cipher Suites
containers/ironic/ironic-conductor.log.1:2020-05-05 00:00:03.935 7 DEBUG ironic.common.utils [req-eb49faaa-94bd-4f0e-badd-064272ba1ebc - - - - -] Command stderr is: "Unable to Get Channel Cipher Suites

Running the ipmitool command manually takes 2 minutes to complete:

()[ironic@hardprov-dl360-g9-01 /]$ time ipmitool -I lanplus -H 10.9.103.29 -U DMINISTRATOR -P XXX -v -R 12 -N 5 chassis status
...
real 2m6.271s
user 0m0.002s
sys 0m0.004s

This issue was also seen with vbmc, where it was resolved by a new version of pyghmi in https://bugzilla.redhat.com/show_bug.cgi?id=1813889. However, pyghmi is not used for baremetal BMC access.

Version-Release number of selected component (if applicable):

HP ProLiant DL360 Gen9 - iLO versions 2.54 (Jun 15 2017) and 2.60 (latest available, May 23 2018)
ipmitool-1.8.18-14.el8.x86_64

How reproducible:

Happens every time with this BMC. It works fine with the Dell systems that have been tested.
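As a cross-check of the "2 minutes" figure, the three ironic-conductor log entries quoted in the description can be compared by timestamp. The sketch below is only an illustration: the function name cipher_suite_error_gaps is hypothetical, and it assumes each log line carries a "YYYY-MM-DD HH:MM:SS.mmm" timestamp as in the excerpt.

```python
import re
from datetime import datetime

# The three conductor log lines quoted in the description.
LOG_LINES = [
    'containers/ironic/ironic-conductor.log.1:2020-05-04 23:55:41.385 7 DEBUG ironic.common.utils [req-eb49faaa-94bd-4f0e-badd-064272ba1ebc - - - - -] Command stderr is: "Unable to Get Channel Cipher Suites',
    'containers/ironic/ironic-conductor.log.1:2020-05-04 23:57:52.657 7 DEBUG ironic.common.utils [req-eb49faaa-94bd-4f0e-badd-064272ba1ebc - - - - -] Command stderr is: "Unable to Get Channel Cipher Suites',
    'containers/ironic/ironic-conductor.log.1:2020-05-05 00:00:03.935 7 DEBUG ironic.common.utils [req-eb49faaa-94bd-4f0e-badd-064272ba1ebc - - - - -] Command stderr is: "Unable to Get Channel Cipher Suites',
]

# Assumed timestamp layout, taken from the excerpt above.
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")
ERROR_TEXT = "Unable to Get Channel Cipher Suites"


def cipher_suite_error_gaps(lines):
    """Return the seconds elapsed between consecutive cipher-suite errors."""
    stamps = []
    for line in lines:
        if ERROR_TEXT not in line:
            continue
        match = STAMP.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S.%f"))
    return [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(cipher_suite_error_gaps(LOG_LINES))
```

The errors in the excerpt are spaced roughly 131 seconds apart, which is in line with the 2m6s measured for the manual ipmitool run.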
Package is in compose RHOS-16.1-RHEL-8-20200604.n.1.
Verified that ipmitool no longer takes 2 minutes to respond because of the Cipher Suites issue. ipmitool commands are now issued with "-R 1 -N 1", and retries are handled by ironic.

Running cmd (subprocess): ipmitool -I lanplus -H 172.16.0.28 -L ADMINISTRATOR -p 6230 -U admin -R 1 -N 1 -f /tmp/tmpebyzf379

ipmi.use_ipmitool_retries = False
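The general idea of keeping each ipmitool call short and retrying from the caller can be sketched as follows. This is not ironic's implementation: build_ipmitool_cmd, run_with_retries, and the attempts and delay values are hypothetical and chosen only for illustration. The only ipmitool options used are ones that appear in the commands quoted in this bug (-I, -H, -U, -R, -N, -f).

```python
import subprocess
import time


def build_ipmitool_cmd(host, user, password_file, command):
    """Build an ipmitool call limited to one retry with a 1 second timeout."""
    return [
        "ipmitool", "-I", "lanplus", "-H", host, "-U", user,
        "-R", "1", "-N", "1",   # ipmitool's own retry count and timeout
        "-f", password_file,    # read the password from a file
        *command,
    ]


def run_with_retries(cmd, attempts=3, delay=1.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run cmd up to `attempts` times; the caller, not ipmitool, does the retrying.

    `attempts` and `delay` are illustrative values, not ironic defaults.
    Returns the result of the first successful run, or of the last attempt.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        if attempt < attempts:
            sleep(delay)
    return result
```

With this split, a BMC that does not answer costs a short, bounded wait per attempt instead of one long ipmitool invocation, and the caller decides how many attempts are worthwhile.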
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text. If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3148