Bug 1713497 - Failed to set boot device to PXE: Redfish exception occurred - Error: ('Connection aborted.', BadStatusLine('\\n',)) (HTTP 500)
Summary: Failed to set boot device to PXE: Redfish exception occurred - Error: ('Conne...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-sushy
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: z8
: 13.0 (Queens)
Assignee: Ilya Etingof
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-05-23 21:03 UTC by Chris Janiszewski
Modified: 2019-09-03 16:58 UTC (History)
9 users (show)

Fixed In Version: python-sushy-1.3.3-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-03 16:58:10 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
output-from-console (60.71 KB, text/plain)
2019-05-23 21:03 UTC, Chris Janiszewski
no flags Details
all ironic nodes (10.37 KB, application/gzip)
2019-05-24 15:37 UTC, Chris Janiszewski
no flags Details
ironic-conductor debug enabled (50.82 KB, text/plain)
2019-05-28 13:41 UTC, Chris Janiszewski
no flags Details
latest-logs-osp-13 (6.85 KB, application/gzip)
2019-06-07 15:00 UTC, Chris Janiszewski
no flags Details


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 674684 0 'None' MERGED Disable HTTP connection pooling 2021-02-17 15:35:12 UTC
Red Hat Product Errata RHBA-2019:2623 0 None None None 2019-09-03 16:58:32 UTC

Description Chris Janiszewski 2019-05-23 21:03:42 UTC
Created attachment 1572690 [details]
output-from-console

Description of problem:
I am unable to introspect or boot up the node controlled by redfish driver in OSP14

Error:
Retrying 1 nodes that failed introspection. Attempt 1 of 3 
{u'body': {u'type': u'tripleo.baremetal.v1._introspect', u'payload': {u'status': u'FAILED', u'node_uuid': u'3217334c-ce31-4043-8aab-4cc4ffa68f51', u'introspected_node': {u'uuid': u'3217334c-ce31-4043-8aab-4cc4ffa68f51', u'links': [{u'href': u'http://172.31.8.10:13050/v1/introspection/3217334c-ce31-4043-8aab-4cc4ffa68f51', u'rel': u'self'}], u'finished_at': u'2019-05-23T20:35:53', u'finished': True, u'state': u'error', u'error': u"Failed to set boot device to PXE: Redfish exception occurred. Error: Redfish set boot device failed for node 3217334c-ce31-4043-8aab-4cc4ffa68f51. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1. Error: ('Connection aborted.', BadStatusLine('\\n',)) (HTTP 500)",


The same node works fine with IPMI driver

Version-Release number of selected component (if applicable):
OSP14

How reproducible:
Every time

Steps to Reproduce:
1. Inject a node into ironic with redfish driver
2. Introspect the node
3.

Actual results:
Error above

Expected results:
Success

Additional info:

(undercloud) [stack@undercloud-osp14 ~]$ cat instackenv-redfish.json 
{
    "nodes": [
        {
            "arch": "x86_64",
            "mac": [
                "0c:c4:7a:c8:16:8a"
            ],
            "pm_addr": "https://172.31.9.21",
            "pm_system_id": "/redfish/v1/Systems/1",
            "redfish_verify_ca": "False",
            "pm_password": "XXXX",
            "pm_type": "redfish",
            "name": "redfish-compute1",
            "cpu":"8",
            "memory":"65536",
            "disk":"110",
            "capabilities": "profile:compute,boot_option:local,boot_mode:uefi",
            "pm_user": "XXXX"
        }
    ]
}



Postman GET results for the node:

{
    "@odata.context": "/redfish/v1/$metadata#ComputerSystem.ComputerSystem",
    "@odata.type": "#ComputerSystem.v1_3_0.ComputerSystem",
    "@odata.id": "/redfish/v1/Systems/1",
    "Id": "1",
    "Name": "System",
    "Description": "Description of server",
    "Status": {
        "State": "Enabled",
        "Health": "OK"
    },
    "SerialNumber": "          ",
    "PartNumber": "",
    "SystemType": "Physical",
    "BiosVersion": "2.0a",
    "Manufacturer": "Supermicro",
    "Model": "Super Server",
    "SKU": "To be filled by O.E.M.",
    "UUID": "00000000-0000-0000-0000-0CC47AC8168A",
    "ProcessorSummary": {
        "Count": 4,
        "Model": "Intel(R) Xeon(R) processor",
        "Status": {
            "State": "Enabled",
            "Health": "OK"
        }
    },
    "MemorySummary": {
        "TotalSystemMemoryGiB": 64,
        "Status": {
            "State": "Enabled",
            "Health": "OK"
        }
    },
    "IndicatorLED": "Off",
    "PowerState": "Off",
    "Boot": {
        "BootSourceOverrideEnabled": "Once",
        "BootSourceOverrideTarget": "Pxe",
        "BootSourceOverrideTarget": [
            "None",
            "Pxe",
            "Hdd",
            "Diags",
            "CD/DVD",
            "BiosSetup",
            "FloppyRemovableMedia",
            "UsbKey",
            "UsbHdd",
            "UsbFloppy",
            "UsbCd",
            "UefiUsbKey",
            "UefiCd",
            "UefiHdd",
            "UefiUsbHdd",
            "UefiUsbCd"
        ]
    },
    "Processors": {
        "@odata.id": "/redfish/v1/Systems/1/Processors"
    },
    "Memory": {
        "@odata.id": "/redfish/v1/Systems/1/Memory"
    },
    "EthernetInterfaces": {
        "@odata.id": "/redfish/v1/Systems/1/EthernetInterfaces"
    },
    "SimpleStorage": {
        "@odata.id": "/redfish/v1/Systems/1/SimpleStorage"
    },
    "Storage": {
        "@odata.id": "/redfish/v1/Systems/1/Storage"
    },
    "LogServices": {
        "@odata.id": "/redfish/v1/Systems/1/LogServices"
    },
    "Links": {
        "Chassis": [
            {
                "@odata.id": "/redfish/v1/Chassis/1"
            }
        ],
        "ManagedBy": [
            {
                "@odata.id": "/redfish/v1/Managers/1"
            }
        ],
        "Oem": {}
    },
    "Actions": {
        "#ComputerSystem.Reset": {
            "target": "/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
            "ResetType": [
                "On",
                "ForceOff",
                "GracefulShutdown",
                "GracefulRestart",
                "ForceRestart",
                "Nmi",
                "ForceOn"
            ]
        }
    }
}

Comment 1 Bob Fournier 2019-05-24 13:08:47 UTC
Hi Chris - can you provide the rpm version of python-sushy that you are testing with?  Also can you provide all of the ironic logs? Thanks.

Comment 2 Chris Janiszewski 2019-05-24 15:34:19 UTC
()[root@undercloud-osp14 /]# rpm -qa | grep python2-sushy
python2-sushy-1.6.0-0.20180816092133.8f952f8.el7ost.noarch

Comment 3 Chris Janiszewski 2019-05-24 15:37:29 UTC
Created attachment 1572946 [details]
all ironic nodes

Comment 4 Ilya Etingof 2019-05-27 08:22:27 UTC
This seems to fail when ironic is attempting to change node's boot device from Pxe to Pxe. It's a good question if ironic should do this Pxe -> Pxe change, however BMC seems to announce this capability.

Is it that ironic is running against a real hardware BMC, not an emulator?

To troubleshoot this further, I'd like to suggest trying out exactly the same HTTP request as ironic does for setting boot device to see how it works in isolation -- is it BMC failing to process it or request itself is anyhow malformed.

The request in question should be in ironic logs, though the logs attached to this bug seem to be grep'ed out so that the request is not there. Could you please either Red/fish set boot device query out of logs or attach the whole ironic log?

Thanks!

Comment 5 Chris Janiszewski 2019-05-28 13:22:24 UTC
Hi Ilya,

I am attaching the only error that is being generated in the logs during this failed operation.
I am indeed running on a real hardware - Supermicro
Is there a command that you could provide for me to set the boot device in isolated way ?


 [root@undercloud-osp14 ironic]# tail -f *.log
==> app.log <==

==> ironic-conductor.log <==
2019-05-28 09:18:39.560 1 ERROR ironic.drivers.modules.redfish.management [req-6f3127e1-b651-46d6-848b-9886192b6d5d 60fdb8da9e6846cc8b6cb8d1099ab552 3c169191ab0b42c5a97364b943eb3b83 - default default] Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',)): ConnectionError: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server [req-6f3127e1-b651-46d6-848b-9886192b6d5d 60fdb8da9e6846cc8b6cb8d1099ab552 3c169191ab0b42c5a97364b943eb3b83 - default default] Exception during message handling: RedfishError: Redfish exception occurred. Error: Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     result = f(*args, **kwargs)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     return func(*args, **kwargs)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 2828, in set_boot_device
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     persistent=persistent)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/redfish/management.py", line 124, in set_boot_device
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server     raise exception.RedfishError(error=error_msg)
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server RedfishError: Redfish exception occurred. Error: Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))
2019-05-28 09:18:39.570 1 ERROR oslo_messaging.rpc.server 

==> app.log <==
2019-05-28 09:18:39.595 16 ERROR wsme.api [req-6f3127e1-b651-46d6-848b-9886192b6d5d 60fdb8da9e6846cc8b6cb8d1099ab552 3c169191ab0b42c5a97364b943eb3b83 - default default] Server-side error: "Redfish exception occurred. Error: Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
    res = self.dispatcher.dispatch(message)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
    result = func(ctxt, **new_args)

  File "/usr/lib/python2.7/site-packages/ironic_lib/metrics.py", line 60, in wrapped
    result = f(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 2828, in set_boot_device
    persistent=persistent)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper
    return f(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/redfish/management.py", line 124, in set_boot_device
    raise exception.RedfishError(error=error_msg)

RedfishError: Redfish exception occurred. Error: Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))
". Detail: 
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/wsmeext/pecan.py", line 85, in callfunction
    result = f(self, *args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/api/controllers/v1/node.py", line 255, in put
    topic=topic)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/rpcapi.py", line 617, in set_boot_device
    device=device, persistent=persistent)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 179, in call
    retry=self.retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 133, in _send
    retry=retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 584, in send
    call_monitor_timeout, retry=retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 575, in _send
    raise result

RedfishError_Remote: Redfish exception occurred. Error: Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
    res = self.dispatcher.dispatch(message)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
    result = func(ctxt, **new_args)

  File "/usr/lib/python2.7/site-packages/ironic_lib/metrics.py", line 60, in wrapped
    result = f(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 2828, in set_boot_device
    persistent=persistent)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper
    return f(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/redfish/management.py", line 124, in set_boot_device
    raise exception.RedfishError(error=error_msg)

RedfishError: Redfish exception occurred. Error: Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))

: RedfishError_Remote: Redfish exception occurred. Error: Redfish set boot device failed for node 1b1dee7e-dc99-4e90-9376-d10ba1a31002. Error: Unable to connect to https://172.31.9.21/redfish/v1/Systems/1/. Error: ('Connection aborted.', BadStatusLine('\n',))

Comment 6 Chris Janiszewski 2019-05-28 13:41:39 UTC
Created attachment 1574341 [details]
ironic-conductor debug enabled

Comment 7 Chris Janiszewski 2019-05-28 13:42:31 UTC
Attached above a log for ironic-conductor during the Error with debug enabled.

Comment 8 Chris Janiszewski 2019-05-28 13:53:23 UTC
Sorry for the flood of info. I was trying to read into the logs and see if I can manually GET API that sushy is failing at and was able to do it using "Postman"

https://172.31.9.21/redfish/v1/SessionService/

{
    "@odata.context": "/redfish/v1/$metadata#SessionService.SessionService",
    "@odata.type": "#SessionService.v1_0_0.SessionService",
    "@odata.id": "/redfish/v1/SessionService",
    "Id": "SessionService",
    "Name": "Session Service",
    "Description": "Session Service",
    "Status": {
        "State": "Enabled",
        "Health": "OK"
    },
    "ServiceEnabled": true,
    "SessionTimeout": 300,
    "Sessions": {
        "@odata.id": "/redfish/v1/SessionService/Sessions"
    }
}

Comment 9 Ilya Etingof 2019-06-03 13:01:49 UTC
Chris,

The isolated command to set boot mode I think is like this:

    curl -u admin:password \
         -X PATCH \
         -H "Content-Type: application/json" \
         -d "{'Boot': {'BootSourceOverrideTarget': 'Pxe', 'BootSourceOverrideEnabled': 'Once'}" \
         https://172.31.9.21/redfish/v1/Systems/1

You should have something similar in the conductor log.

I assume you use basic HTTP auth (so put correct user/password in there), if it's a session auth you'd need to obtain a session token first.

Comment 10 Ilya Etingof 2019-06-03 13:03:42 UTC
...to set boot device, not boot mode indeed.

Comment 11 Chris Janiszewski 2019-06-03 13:18:33 UTC
(undercloud) [stack@undercloud-osp14 ~]$ curl -u XXXX:XXXX -k -X PATCH -H "Content-Type: application/json" -d "{'Boot': {'BootSourceOverrideTarget': 'Pxe','BootSourceOverrideEnabled': 'Once'}"    https://172.31.9.21/redfish/v1/Systems/1 
{"error":{"code":"Base.v1_0_0.GeneralError","Message":"A general error has occurred. See ExtendedInfo for more information.","@Message.ExtendedInfo":[{"MessageId":"Base.v1_0_0.MalformedJSON","Severity":"Critical","Resolution":"Ensure that the request body is valid JSON and resubmit the request.","Message":"The request body submitted was malformed JSON and could not be parsed by the receiving service.","MessageArgs":[""],"RelatedProperties":[""]}]}}



curl -u XXXX:XXXX -k https://172.31.9.21/redfish/v1/Systems/1
{"@odata.context":"/redfish/v1/$metadata#ComputerSystem.ComputerSystem","@odata.type":"#ComputerSystem.v1_3_0.ComputerSystem","@odata.id":"/redfish/v1/Systems/1","Id":"1","Name":"System","Description":"Description of server","Status":{"State":"Enabled","Health":"OK"},"SerialNumber":"          ","PartNumber":"","SystemType":"Physical","BiosVersion":"2.0a","Manufacturer":"Supermicro","Model":"Super Server","SKU":"To be filled by O.E.M.","UUID":"00000000-0000-0000-0000-0CC47AC8168A","ProcessorSummary":{"Count":4,"Model":"Intel(R) Xeon(R) processor","Status":{"State":"Enabled","Health":"OK"}},"MemorySummary":{"TotalSystemMemoryGiB":64,"Status":{"State":"Enabled","Health":"OK"}},"IndicatorLED":"Off","PowerState":"Off","Boot":{"BootSourceOverrideEnabled":"Once","BootSourceOverrideTarget":"Pxe","BootSourceOverrideTarget":["None","Pxe","Hdd","Diags","CD/DVD","BiosSetup","FloppyRemovableMedia","UsbKey","UsbHdd","UsbFloppy","UsbCd","UefiUsbKey","UefiCd","UefiHdd","UefiUsbHdd","UefiUsbCd"]},"Processors":{"@odata.id":"/redfish/v1/Systems/1/Processors"},"Memory":{"@odata.id":"/redfish/v1/Systems/1/Memory"},"EthernetInterfaces":{"@odata.id":"/redfish/v1/Systems/1/EthernetInterfaces"},"SimpleStorage":{"@odata.id":"/redfish/v1/Systems/1/SimpleStorage"},"Storage":{"@odata.id":"/redfish/v1/Systems/1/Storage"},"LogServices":{"@odata.id":"/redfish/v1/Systems/1/LogServices"},"Links":{"Chassis":[{"@odata.id":"/redfish/v1/Chassis/1"}],"ManagedBy":[{"@odata.id":"/redfish/v1/Managers/1"}],"Oem":{}},"Actions":{"#ComputerSystem.Reset":{"target":"/redfish/v1/Systems/1/Actions/ComputerSystem.Reset","ResetType":["On","ForceOff","GracefulShutdown","GracefulRestart","ForceRestart","Nmi","ForceOn"]}}}

Comment 12 Ilya Etingof 2019-06-04 12:54:38 UTC
Ah, perhaps we need to use double quotes in JSON:

 curl -u admin:password \
         -X PATCH \
         -H "Content-Type: application/json" \
         -d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}' \
         https://172.31.9.21/redfish/v1/Systems/1

Sorry for the malformed command!

Comment 13 Chris Janiszewski 2019-06-06 14:54:09 UTC
Hi Ilya,

This still reports a malformed JSON.

Comment 14 Chris Janiszewski 2019-06-06 15:40:06 UTC
based on bfourier suggestion I tried the following:

curl -u XXXXX:XXXXX -k -X PATCH -H "Content-Typ application/json" -d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}'    https://172.31.9.21/redfish/v1/Systems/1                              

{"Success":{"code":"Base.v1_0_0.Success","Message":"Successfully Completed Request."}}

Comment 15 Ilya Etingof 2019-06-07 08:35:40 UTC
Interesting, let's may be also try the same command when the node is powered up?

Could I take a peek at the complete original ironic conductor log? The logs attached to this bug seem to be heavily grepped out. What I am trying to do is to track down and replay the sequence of Redfish requests/responses leading up to the failure at hand...

Comment 16 Chris Janiszewski 2019-06-07 15:00:30 UTC
So I have decided to scrap my OSP14 and install OSP13 instead .. added the redfish node to ironic and attempted to inspect and clean it up. 
The introspection went through without error but then ironic tried cleaning it up and failed with the same error as it has on OSP14.
Since this is brand new deployment there is very little logs. I am attaching latest ironic logs (OSP13) to this BZ

Comment 17 Chris Janiszewski 2019-06-07 15:00:58 UTC
Created attachment 1578371 [details]
latest-logs-osp-13

Comment 18 Chris Janiszewski 2019-06-07 15:05:27 UTC
in case we need it .. here is the entire sosreport:
http://chrisj.cloud/sosreport-undercloud-osp13-2019-06-07-owqfdtp.tar.xz

Comment 19 Chris Janiszewski 2019-06-07 15:13:00 UTC
Manual curl request seems to be working whether note is power up or off

Comment 20 Joe Antkowiak 2019-06-20 20:14:54 UTC
Seeing similar issue with OSP13 against HP G9 nodes:


2019-06-20 15:59:36.351 17992 ERROR ironic.drivers.modules.redfish.management [req-d0611a24-ee79-42b1-bd58-e35e5fccb794 8bd580e4cfed4f8688d56cab827d9632 db2fd0d7b4264be083b
0cb340fe16b8d - default default] Redfish set boot device failed for node 42bb11e7-4f92-4330-980b-a618bad3016a. Error: Unable to connect to https://172.27.247.73/redfish/v1/
Systems/1. Error: ('Connection aborted.', BadStatusLine("''",)): ConnectionError: Unable to connect to https://172.27.247.73/redfish/v1/Systems/1. Error: ('Connection abort
ed.', BadStatusLine("''",))
2019-06-20 15:59:36.359 17992 ERROR oslo_messaging.rpc.server [req-d0611a24-ee79-42b1-bd58-e35e5fccb794 8bd580e4cfed4f8688d56cab827d9632 db2fd0d7b4264be083b0cb340fe16b8d -
default default] Exception during message handling: RedfishError: Redfish exception occurred. Error: Redfish set boot device failed for node 42bb11e7-4f92-4330-980b-a618bad3016a. Error: Unable to connect to https://172.27.247.73/redfish/v1/Systems/1. Error: ('Connection aborted.', BadStatusLine("''",))
2019-06-20 15:59:36.359 17992 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/redfish/management.py", line 118, in set_boot_device
2019-06-20 15:59:36.359 17992 ERROR oslo_messaging.rpc.server     raise exception.RedfishError(error=error_msg)
2019-06-20 15:59:36.359 17992 ERROR oslo_messaging.rpc.server RedfishError: Redfish exception occurred. Error: Redfish set boot device failed for node 42bb11e7-4f92-4330-980b-a618bad3016a. Error: Unable to connect to https://172.27.247.73/redfish/v1/Systems/1. Error: ('Connection aborted.', BadStatusLine("''",))
2019-06-20 15:59:36.961 17992 DEBUG sushy.resources.base [req-233d4106-681d-47de-92cd-e941eedcb874 8bd580e4cfed4f8688d56cab827d9632 db2fd0d7b4264be083b0cb340fe16b8d - default default] Received representation of System /redfish/v1/Systems/1: {u'BiosVersion': u'P89 v1.52 (11/10/2015)', u'SKU': u'755258-B21', u'PowerState': u'Off', u'Processors': {u'Count': 2, u'Status': {u'HealthRollUp': u'OK'}, u'ProcessorFamily': u'Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz'}, u'links': {u'ManagedBy': [{u'href': u'/redfish/v1/Managers/1/'}], u'self': {u'href': u'/redfish/v1/Systems/1/'}, u'Chassis': [{u'href': u'/redfish/v1/Chassis/1/'}], u'Logs': {u'href': u'/redfish/v1/Systems/1/LogServices/'}}, u'SerialNumber': u'MXQ703001X', u'Boot': {u'BootSourceOverrideSupported': [u'None', u'Cd', u'Hdd', u'Usb', u'Utilities', u'Diags', u'BiosSetup', u'Pxe', u'UefiShell'], u'BootSourceOverrideTarget': u'Hdd', u'BootSourceOverrideEnabled': u'Continuous'}, u'Bios': {u'Current': {u'VersionString': u'P89 v1.52 (11/10/2015)'}}, u'Memory': {u'TotalSystemMemoryGB': 384}, u'ProcessorSummary': {u'Count': 2, u'Status': {u'HealthRollUp': u'OK'}, u'Model': u'Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz'}, u'@odata.type': u'#ComputerSystem.1.0.0.ComputerSystem', u'Description': u'Computer System View', u'Power': u'Off', u'Type': u'ComputerSystem.1.0.0', u'HostName': u'', u'@odata.context': u'/redfish/v1/$metadata#Systems/Members/$entity', u'Oem': {u'Hp': {u'@odata.type': u'#HpComputerSystemExt.1.1.0.HpComputerSystemExt', u'IntelligentProvisioningIndex': 3, u'PowerAutoOn': u'Restore', u'links': {u'SecureBoot': {u'href': u'/redfish/v1/Systems/1/SecureBoot/'}, u'SmartStorage': {u'href': u'/redfish/v1/Systems/1/SmartStorage/'}, u'PCISlots': {u'href': u'/redfish/v1/Systems/1/PCISlots/'}, u'FirmwareInventory': {u'href': u'/redfish/v1/Systems/1/FirmwareInventory/'}, u'BIOS': {u'href': u'/redfish/v1/systems/1/bios/'}, u'PCIDevices': {u'href': u'/redfish/v1/Systems/1/PCIDevices/'}, u'Memory': {u'href': u'/redfish/v1/Systems/1/Memory/'}, u'NetworkAdapters': {u'href': u'/redfish/v1/Systems/1/NetworkAdapters/'}}, u'PowerAllocationLimit': 1000, u'Battery': [{u'Index': 1, u'Condition': u'Ok', u'Present': u'Yes', u'SerialNumber': u'6EZBN0FB25067S', u'ProductName': u'HPE Smart Storage Battery ', u'ErrorCode': 0, u'Spare': u'815983-001', u'FirmwareVersion': u'1.1', u'Model': u'727258-B21', u'MaxCapWatts': 96}], u'Type': u'HpComputerSystemExt.1.1.0', u'PowerRegulatorModesSupported': [u'OSControl', u'Dynamic', u'Max', u'Min'], u'VirtualProfile': u'Inactive', u'Actions': {u'#HpComputerSystemExt.1.0.0.PowerButton': {u'PushType': [u'Press', u'PressAndHold'], u'target': u'/redfish/v1/Systems/1/Actions/Oem/Hp/ComputerSystemExt.PowerButton/'}, u'#HpComputerSystemExt.1.0.0.SystemReset': {u'ResetType': [u'ColdBoot'], u'target': u'/redfish/v1/Systems/1/Actions/Oem/Hp/ComputerSystemExt.SystemReset/'}, u'#HpComputerSystemExt.1.0.0.ServerSigRecompute': None}, u'PowerOnDelay': u'Minimum', u'Bios': {u'Current': {u'Date': u'11/10/2015', u'VersionString': u'P89 v1.52 (11/10/2015)', u'Family': u'P89'}, u'Backup': {u'Date': u'11/10/2015', u'VersionString': u'P89 v1.52 (11/10/2015)', u'Family': u'P89'}, u'UefiClass': 2}, u'IntelligentProvisioningVersion': u'N/A', u'DeviceDiscoveryComplete': {u'AMSDeviceDiscovery': u'NoAMS', u'DeviceDiscovery': u'vAuxDeviceDiscoveryComplete', u'SmartArrayDiscovery': u'Complete'}, u'AvailableActions': [{u'Action': u'PowerButton', u'Capabilities': [{u'AllowableValues': [u'Press', u'PressAndHold'], u'PropertyName': u'PushType'}, {u'AllowableValues': [u'/Oem/Hp'], u'PropertyName': u'Target'}]}, {u'Action': u'ServerSigRecompute', u'Capabilities': [{u'AllowableValues': [u'/Oem/Hp'], u'PropertyName': u'Target'}]}, {u'Action': u'SystemReset', u'Capabilities': [{u'AllowableValues': [u'ColdBoot'], u'PropertyName': u'ResetType'}, {u'AllowableValues': [u'/Oem/Hp'], u'PropertyName': u'Target'}]}], u'TrustedModules': [{u'Status': u'NotPresent'}], u'PostState': u'PowerOff', u'IntelligentProvisioningLocation': u'System Board', u'PowerRegulatorMode': u'Dynamic'}}, u'AvailableActions': [{u'Action': u'Reset', u'Capabilities': [{u'AllowableValues': [u'On', u'ForceOff', u'ForceRestart', u'Nmi', u'PushPowerButton'], u'PropertyName': u'ResetType'}]}], u'Manufacturer': u'HP', u'Status': {u'State': u'Disabled', u'Health': u'OK'}, u'Name': u'Computer System', u'AssetTag': u'', u'@odata.id': u'/redfish/v1/Systems/1/', u'IndicatorLED': u'Off', u'LogServices': {u'@odata.id': u'/redfish/v1/Systems/1/LogServices/'}, u'MemorySummary': {u'TotalSystemMemoryGiB': 384}, u'Model': u'ProLiant DL360 Gen9', u'UUID': u'32353537-3835-584D-5137-303330303158', u'HostCorrelation': {u'HostName': u'', u'IPAddress': [u'', u'', u'', u'', u'', u'', u'', u'', u'', u'', u''], u'HostMACAddress': [u'94:18:82:77:49:b4', u'94:18:82:77:49:b5', u'94:18:82:77:49:b6', u'94:18:82:77:49:b7', u'48:df:37:68:b3:5c', u'48:df:37:68:b3:5d', u'48:df:37:7a:e7:90', u'48:df:37:7a:e7:98', u'00:00:00:00:00:00', u'9c:dc:71:c0:7c:6e', u'9c:dc:71:c0:7c:6f']}, u'Actions': {u'#ComputerSystem.Reset': {u'ResetType': [u'On', u'ForceOff', u'ForceRestart', u'Nmi', u'PushPowerButton'], u'target': u'/redfish/v1/Systems/1/Actions/ComputerSystem.Reset/'}}, u'SystemType': u'Physical', u'Id': u'1'} refresh /usr/lib/python2.7/site-packages/sushy/resources/base.py:272


Comment 22 Bob Fournier 2019-07-26 15:33:20 UTC
Hi Chris - any more info on duplicating this issue?

Comment 23 Bob Fournier 2019-07-26 15:39:34 UTC
Maybe need to disable Connection pooling on the request, the thought is that the previous connection is being reused.

Comment 24 Chris Janiszewski 2019-07-26 16:59:35 UTC
I can duplicate on demand and we didn't not really get any further in troubleshooting. How do I disable the connection pooling on the request ?

Comment 25 Ilya Etingof 2019-07-29 12:16:16 UTC
Chris, I can come up with a patch disabling connection pooling for the sake of this experiment. So if you can build a reproducer environment, we could probably proceed from there.

WDYT?

Comment 26 Ilya Etingof 2019-07-29 15:31:42 UTC
I've proposed sushy patch [1] to work around this problem. Chris, if you could try it out - woudl be great!

1. https://review.opendev.org/673312

Comment 27 Ilya Etingof 2019-07-31 11:02:32 UTC
Update: the match [1] got merged upstream, you can just run it off sushy master.


1. https://review.opendev.org/673312

Comment 28 Chris Janiszewski 2019-08-02 14:31:27 UTC
Thanks Ilya,

This seems to be working now after applying https://review.opendev.org/673312 and restarting ironic-conductor.
This BZ can be now closed.

Comment 29 Bob Fournier 2019-08-02 15:12:02 UTC
That's great Ilya and Chris.

Since there is a case for this issue in OSP-13 can we backport the fix to Queens?

Comment 32 Bob Fournier 2019-08-06 18:49:24 UTC
Chris - if you have a chance would you be able to test this python-sushy package that will be in 13z8 - https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=945048, just to make sure the backports were successful?  Thanks.

Comment 39 Chris Janiszewski 2019-08-23 14:49:12 UTC
Bob,

I was able to test the new sushy package from https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=945048 .. I only have OSP13 available at the moment where I can test it and the provided package seems to be working fine for me. I am able to add and provision a BM using redfish driver.

Comment 41 Bob Fournier 2019-08-28 17:03:35 UTC
Moving to VERIFIED as Chris has tested the new package.  Thanks Chris.

Comment 44 errata-xmlrpc 2019-09-03 16:58:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2623


Note You need to log in before you can comment on or make changes to this bug.