Description of problem:
When enrolling overcloud baremetal nodes, the process fails when the iDRAC power management driver (pm_type: idrac) is used.

Version-Release number of selected component (if applicable):
python3-ironic-inspector-client-3.7.0-0.20190923163033.d95a4cd.el8ost.noarch
puppet-ironic-15.4.1-0.20191022165413.8fe6978.el8ost.noarch

How reproducible:
Every time

Steps to Reproduce:
1. Set pm_type: idrac for the nodes in nodes.json (see the sketch at the end of this report).
2. Try to add the nodes with: openstack overcloud node import ~/nodes.json

Actual results:

[{'result': 'Node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1 did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1. Error: DRAC operation failed. Reason: WSMan request failed'},
 {'result': 'Node d6440e61-4a61-430a-ae4c-644dff6b0fef did not reach state "manageable", the state is "enroll", error: Failed to get power state for node d6440e61-4a61-430a-ae4c-644dff6b0fef. Error: DRAC operation failed. Reason: WSMan request failed'},
 {'result': 'Node 6f9fcedd-0711-494f-9017-db60f5e7147e did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 6f9fcedd-0711-494f-9017-db60f5e7147e. Error: DRAC operation failed. Reason: WSMan request failed'}]

{'result': 'Failure caused by error in tasks: send_message\n\n send_message [task_ex_id=1c6b2cc6-cf5e-40f1-954f-e1ef30377103] -> Workflow failed due to message status. Status:FAILED Message:({\'result\': \'Node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1 did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1. Error: DRAC operation failed. Reason: WSMan request failed\'}, {\'result\': \'Node d6440e61-4a61-430a-ae4c-644dff6b0fef did not reach state "manageable", the state is "enroll", error: Failed to get power state for node d6440e61-4a61-430a-ae4c-644dff6b0fef. Error: DRAC operation failed. Reason: WSMan request failed\'}, {\'result\': \'Node 6f9fcedd-0711-494f-9017-db60f5e7147e did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 6f9fcedd-0711-494f-9017-db60f5e7147e. Error: DRAC operation failed. Reason: WSMan request failed\'})\n
 [wf_ex_id=6b8e31a9-6e86-4184-be63-57a1915dc63d, idx=0]: Workflow failed due to message status. Status:FAILED Message:({\'result\': \'Node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1 did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1. Error: DRAC operation failed. Reason: WSMan request failed\'}, {\'result\': \'Node d6440e61-4a61-430a-ae4c-644dff6b0fef did not reach state "manageable", the state is "enroll", error: Failed to get power state for node d6440e61-4a61-430a-ae4c-644dff6b0fef. Error: DRAC operation failed. Reason: WSMan request failed\'}, {\'result\': \'Node 6f9fcedd-0711-494f-9017-db60f5e7147e did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 6f9fcedd-0711-494f-9017-db60f5e7147e. Error: DRAC operation failed. Reason: WSMan request failed\'})\n',
 'status': 'FAILED',
 'message': [{'result': 'Node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1 did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 2870b3e3-d654-4aac-8b1c-8f3d8120dbc1. Error: DRAC operation failed. Reason: WSMan request failed'}, {'result': 'Node d6440e61-4a61-430a-ae4c-644dff6b0fef did not reach state "manageable", the state is "enroll", error: Failed to get power state for node d6440e61-4a61-430a-ae4c-644dff6b0fef. Error: DRAC operation failed. Reason: WSMan request failed'}, {'result': 'Node 6f9fcedd-0711-494f-9017-db60f5e7147e did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 6f9fcedd-0711-494f-9017-db60f5e7147e. Error: DRAC operation failed. Reason: WSMan request failed'}]}

The main part being:

error: Failed to get power state for node

When pm_type is set to IPMI instead (pm_type: ipmi), it works:

3 node(s) successfully moved to the "manageable" state.
Successfully registered node UUID a7ab6673-e2d1-4f75-8d04-19210817e411
Successfully registered node UUID 84ecf90c-1b09-44b0-9459-8ae6ee3ddb58
Successfully registered node UUID bd4d53f6-ad0b-48f2-bfcf-f8ee136509be

The hardware in both cases is the same: blades in a Dell m1000e chassis.

Expected results:
Enrollment and introspection should work with iDRAC if idrac is still a supported pm_type.

Additional info:
If idrac is no longer supported for some reason, it should be removed from the sample undercloud.conf, and a warning about this problem should be added to the documentation.
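For reference, this is roughly what the power-management part of nodes.json looks like in the failing case; the address, credentials, and MAC below are placeholders, not the real values from this environment:

  {
    "nodes": [
      {
        "name": "blade-0",
        "pm_type": "idrac",
        "pm_addr": "192.0.2.10",
        "pm_user": "root",
        "pm_password": "********",
        "mac": ["aa:bb:cc:dd:ee:ff"]
      }
    ]
  }

The only change in the working case is pm_type set to "ipmi".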
Please provide an sosreport taken when the problem occurs. iDRAC is currently supported.
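For reference, an sosreport can typically be generated on the undercloud host along these lines (exact options depend on the installed sos version):

  sudo sosreport --all-logs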
Since this is easily reproducible I can do it right now. But before I do so, are there any other things you wish to collect (apart from the sosreport)? Would it be beneficial if I left the systems in a broken state for a day or two in case someone wants to log in and check it out? Unfortunately I cannot leave them like this for much longer than that, since we need the environment deployed... Thanks!
ironic-conductor.log.1:5331:2020-03-01 16:14:51.963 8 ERROR dracclient.wsman [req-481413fd-cb58-4c76-9223-c089032e586d 634ba25f2ea14d0bb00b43b517bb3740 a5ec179025db483ca19a2847dfeb6a9a - default default] A SSLError error occurred while communicating with 10.19.136.1, attempt 3 of 3: requests.exceptions.SSLError: HTTPSConnectionPool(host='10.19.136.1', port=443): Max retries exceeded with url: /wsman (Caused by SSLError(SSLError(1, '[SSL: DH_KEY_TOO_SMALL] dh key too small (_ssl.c:897)'),))

Note the "dh key too small". We have seen this with another vendor: apparently RHEL 8 no longer accepts weak certificates/keys (here, a DH key that is too small) that were previously accepted, and there is nothing we can do about it on our side. Could you try updating or regenerating the TLS certificate on the server side? You may also be able to set drac_protocol in the node's driver_info to "http" to use an insecure connection, but that depends on whether the server accepts it (it probably won't). As a last resort, switch to IPMI; if you don't need any advanced features, it should work for you just fine.
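For illustration, the suggested workaround would look roughly like this; <node-uuid> is a placeholder, drac_port may also need adjusting, and whether the iDRAC accepts plain HTTP at all depends on its configuration:

  # optional diagnostic: see what the iDRAC offers during the TLS handshake
  openssl s_client -connect 10.19.136.1:443 < /dev/null

  # fall back to HTTP for WSMan traffic, if the iDRAC allows it
  openstack baremetal node set <node-uuid> --driver-info drac_protocol=http --driver-info drac_port=80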
Hmm, that is a certificate on the Dell chassis, or rather on the blades. Those certs might be older ones, which means I would have to find a way to change/update the certificate on the blade(s). Since I only need this for a simple OC deployment, I will switch to IPMI; no exotic features are needed in my case. Thanks for debugging!
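For completeness, the switch amounts to changing pm_type for the nodes and re-running the import; a minimal sketch, assuming the rest of nodes.json stays the same:

  # in nodes.json: "pm_type": "idrac"  ->  "pm_type": "ipmi"
  openstack overcloud node import ~/nodes.json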