Bug 1804127

Summary: Importing results in "Error: Cannot mix str and non-str arguments"
Product: Red Hat OpenStack Reporter: Christopher Brown <chris.brown>
Component: python-sushy Assignee: Ilya Etingof <ietingof>
Status: CLOSED ERRATA QA Contact: Arik Chernetsky <achernet>
Severity: medium Docs Contact:
Priority: medium    
Version: 16.0 (Train) CC: achernet, astupnik, athomas, bfournie, dsneddon, dtantsur, ietingof, mkrcmari
Target Milestone: beta Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: python-sushy-2.0.3-0.20200522054330.0241cd9.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-07-29 07:50:34 UTC Type: Bug
Attachments:
Description Flags
conductor log from Director none

Description Christopher Brown 2020-02-18 09:48:04 UTC
Description of problem:

Importing using redfish fails with:


{'result': 'Node 9aa41594-3f41-44d8-8594-b5d032b439bc did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 9aa41594-3f41-44d8-8594-b5d032b439bc. Error: Cannot mix str and non-str arguments'}]}

Version-Release number of selected component (if applicable):

 [root@director instackenv]# podman ps | grep -i ironic
f8623b8f3239  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-ironic-pxe:16.0-77                 kolla_start           46 minutes ago  Up 46 minutes ago         ironic_pxe_http
201398ebd179  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-ironic-pxe:16.0-77                 /bin/bash -c BIND...  46 minutes ago  Up 46 minutes ago         ironic_pxe_tftp
cc32799d2655  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-ironic-conductor:16.0-77           kolla_start           46 minutes ago  Up 46 minutes ago         ironic_conductor
7e20f6cb8cc3  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-nova-compute-ironic:16.0-77        kolla_start           16 hours ago    Up 10 hours ago           nova_compute
95e06f40c985  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-ironic-inspector:16.0-80           kolla_start           16 hours ago    Up 10 hours ago           ironic_inspector_dnsmasq
5d6c8c0a71fd  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-ironic-inspector:16.0-80           kolla_start           16 hours ago    Up 10 hours ago           ironic_inspector
86200ec5c38f  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-ironic-neutron-agent:16.0-80       kolla_start           16 hours ago    Up 10 hours ago           ironic_neutron_agent
a947185f9db3  director.ctlplane.localdomain:8787/rhosp-rhel8/openstack-ironic-api:16.0-77                 kolla_start           16 hours ago    Up 10 hours ago           ironic_api


How reproducible:

Always.

Additional info:

JSON for one node:

        {
            "mac":[
                "d8:9d:67:23:95:50"
            ],
            "name":"compute-2",
            "cpu":"4",
            "memory":"6144",
            "disk":"40",
            "arch":"x86_64",
            "pm_type":"redfish",
            "pm_user":"Administrator",
            "pm_password":"XXXXXXX",
            "pm_addr":"10.50.101.21",
            "pm_system_id":"/redfish/v1/Systems/1",
            "redfish_verify_ca":"false"
        }

The hardware is HP Gen 8 with iLO 4 firmware 2.70 and 2.73.

Comment 1 Christopher Brown 2020-02-18 09:54:15 UTC
Created attachment 1663714 [details]
conductor log from Director

Comment 2 Christopher Brown 2020-02-18 10:26:05 UTC
Works fine with idrac:


(undercloud) [stack@director ~]$ openstack baremetal node list | grep compute-2
| dc53e65c-3c27-4aec-85b0-560b5233e391 | compute-2    | None          | power on    | manageable         | False       |


 [root@director stack]# grep -i dc53e65c-3c27-4aec-85b0-560b5233e391 /var/log/containers/ironic/ironic-conductor.log | grep -i manageable
2020-02-18 10:15:49.242 7 INFO ironic.conductor.task_manager [req-4ba01e06-22f8-4197-be7d-9dd449d0bb35 79d78d3c6b414169bac36cd622bbbd96 697fff159c20499c92fd2491aecd55d5 - default default] Node dc53e65c-3c27-4aec-85b0-560b5233e391 moved to provision state "verifying" from state "enroll"; target provision state is "manageable"
2020-02-18 10:15:50.278 7 INFO ironic.conductor.task_manager [req-4ba01e06-22f8-4197-be7d-9dd449d0bb35 79d78d3c6b414169bac36cd622bbbd96 697fff159c20499c92fd2491aecd55d5 - default default] Node dc53e65c-3c27-4aec-85b0-560b5233e391 moved to provision state "manageable" from state "verifying"; target provision state is "None"

Comment 3 Ilya Etingof 2020-02-18 14:41:54 UTC
Hi Chris,

In the attached conductor log I can see this error message:

2019-09-24 08:09:48.572 6106 ERROR ironic.conductor.manager [req-d096e35f-89ba-4203-853e-22d2193f55ca f6238b2a329a450292fe51f6ddf7ffcb 2b84130e6f704a779616ad9197022fd0 - default default] Failed to get power state for node e2c73cab-58f0-45ef-9f10-50c5afeeeb16. Error: HTTP GET https://10.52.45.17/redfish/v1/Systems/System.Embedded.1 returned code 401. unknown error: AccessError: HTTP GET https://10.52.45.17/redfish/v1/Systems/System.Embedded.1 returned code 401. unknown error

Could we make sure that the BMC credentials are valid? Maybe trying to `curl` that URL against the BMC by hand, to make sure the credentials work, would be a way to go?

Comment 4 Christopher Brown 2020-02-18 15:04:21 UTC
Hi Ilya,

(In reply to Ilya Etingof from comment #3)

> Could we make sure that BMC credentials are valid? Maybe trying to `curl`
> that URL against BMC by hand to make sure credentials work would be a way to
> go?

per #c2 it works fine with ilo driver (I meant this, not idrac :/ )

so the credentials should be fine. I imagine it's more a problem with the path.

Any suggestions?

Comment 5 Ilya Etingof 2020-02-18 15:35:57 UTC
Well, the URL that ironic uses [1] seems valid, or, at least, conventional. But you never know before you try. The only standardized part of a Redfish URL is the `/redfish/v1` location, which is the "service root".

Besides the common (?) credentials for iLO/Redfish, can the BMC impose some additional access rights on the user for a specific protocol or document tree?

So my suggestion is to try browsing the Redfish service by hand and, if possible, to check the BMC logs.

1. https://10.52.45.17/redfish/v1/Systems/System.Embedded.1

Comment 6 Christopher Brown 2020-02-18 21:02:47 UTC
(In reply to Ilya Etingof from comment #5)

> 
> 1. https://10.52.45.17/redfish/v1/Systems/System.Embedded.1

This isn't an IP I recognise, and the timestamps in your comment are wrong, as this node only came up yesterday.

Comment 7 Christopher Brown 2020-02-18 22:14:58 UTC
I can browse the redfish service by hand:

(undercloud) [stack@director ~]$ curl --user Administrator:XXXXXXX --request GET --header "OData-Version: 4.0" https://10.50.101.21/redfish/v1/SessionService/Sessions/ --insecure --location | jq .Members                                 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   599  100   599    0     0   1468      0 --:--:-- --:--:-- --:--:--  1468
[
  {
    "@odata.id": "/redfish/v1/SessionService/Sessions/administrator5e4c50abb74bc6a8/"
  },
  {
    "@odata.id": "/redfish/v1/SessionService/Sessions/administrator5e4c58daee147ae1/"
  }
]

Comment 8 Christopher Brown 2020-02-20 11:17:08 UTC
Marian (on CC) reports that this works with the latest iDRAC 9 firmware. Not sure what to make of this. We might have to engage HPE; however, this worked without issue on OSP 15.

Comment 9 Marian Krcmarik 2020-02-21 09:12:59 UTC
(In reply to Christopher Brown from comment #8)
> Marian on CC reports this is working with latest iDrac 9 firmware. Not sure
> what to make of this. Might have to engage HPE however this worked without
> issue on OSP 15.

To provide a few more details: it works on a Dell FC430 machine with iDRAC 8 after upgrading the firmware from 2.30.30.30 to 2.70.70.70.

Comment 10 Ilya Etingof 2020-02-21 16:06:44 UTC
Ah, sorry! I somehow looked into some other conductor log!

In the right log I can see two kinds of errors:

2020-02-18 09:20:19.975 7 ERROR ironic.conductor.manager [req-06ff7117-bb5a-44e0-922d-cd0e764a0326 79d78d3c6b414169bac36cd622bbbd96 697fff159c20499c92fd2491aecd55d5 - default default] Failed to get power state for node 64c09103-a338-44f8-8aa9-b3e2c71819ca. Error: VM with name controller-0 was not found

This one is probably irrelevant (?): I see the nodes were deleted shortly after.

The second error:

2020-02-18 09:25:09.308 7 ERROR ironic.conductor.manager [req-3ce6d253-c43f-400c-a2fd-d74dae937e44 79d78d3c6b414169bac36cd622bbbd96 697fff159c20499c92fd2491aecd55d5 - default default] Failed to get power state for node 16954521-7204-40d9-89b2-f5abf204cbac. Error: Cannot mix str and non-str arguments

Can possibly be triggered by this Redfish message registry:

    GET https://10.50.101.97/redfish/v1/Registries/iLOEvents

Response document looks reasonable:

Received representation of MessageRegistryFile /redfish/v1/Registries/iLOEvents: 

{
    "@odata.context": "/redfish/v1/$metadata#Registries/Members/$entity",
    "@odata.id": "/redfish/v1/Registries/iLOEvents/",
    "@odata.type": "#MessageRegistryFile.1.0.0.MessageRegistryFile",
    "Description": "Registry Definition File for iLOEvents",
    "Id": "iLOEvents",
    "Languages": [
        "en"
    ],
    "Location": [
        {
            "Language": "en",
            "Uri": {
                "extref": "/redfish/v1/RegistryStore/registries/en/iLOEvents.json/"
            }
        }
    ],
    "Name": "iLOEvents Message Registry File",
    "Registry": "iLOEvents.0.9.7"
}

But maybe sushy chokes on it for some reason...? Let me play with it locally and get back to you.

Comment 11 Ilya Etingof 2020-03-09 18:00:11 UTC
I've done some testing/coding, it seems that this piece:

    "Uri": {
        "extref": "/redfish/v1/RegistryStore/registries/en/iLOEvents.json/"
    }

Goes against Redfish schema spec [1]:

     "Uri": {
      "description": "The link to locally available URI for the Message Registry.",
      "format": "uri-reference",
      "longDescription": "This property shall contain a URI colocated with the Redfish Service that specifies the location of the Message Registry file, which can be retrieved using the Redfish protocol and authentication methods.  This property shall be used for only individual Message Registry files.  The file name portion of the URI shall conform to Redfish Specification-specified syntax.",
      "readonly": true,
      "type": "string"
     }

In other words, "Uri" must be a string, not an object. That might be the cause of sushy crash.
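
As a minimal sketch of why an object-valued "Uri" could produce this exact message: Python's `urllib.parse.urljoin` raises a TypeError with the same text when a string base URL is joined with a non-string value. That sushy passes the "Uri" value to a `urljoin`-style call is an assumption here, not something this report confirms; `try_join` is a hypothetical helper, and the base URL is the BMC address from comment 10.

```python
from urllib.parse import urljoin

# BMC address from comment 10.
base = "https://10.50.101.97"

# "Uri" as iLO returned it: an object instead of the string the schema requires.
bad_uri = {"extref": "/redfish/v1/RegistryStore/registries/en/iLOEvents.json/"}


def try_join(base_url, uri):
    """Hypothetical helper: return the joined URL, or the error text if
    urljoin rejects the input."""
    try:
        return urljoin(base_url, uri)
    except TypeError as exc:
        return "Error: %s" % exc


# Joining a str base with a dict fails; joining with the inner string works.
print(try_join(base, bad_uri))
print(try_join(base, bad_uri["extref"]))
```

The first call reports "Cannot mix str and non-str arguments", the same text as in the conductor log, while the second call returns a normal absolute URL.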

The only cure I can propose is to harden sushy against a malformed URI: we can make it ignore the entire message registry and carry on.

1. https://redfish.dmtf.org/schemas/v1/MessageRegistryFile.v1_1_3.json
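
The hardening idea above could look roughly like the following sketch. It is not the actual sushy patch: `usable_registry_uris` is a hypothetical name, and the sample document is a trimmed copy of the iLO response from comment 10. The only rule it applies is the one from the schema, namely that "Uri" must be a string.

```python
import logging

LOG = logging.getLogger(__name__)


def usable_registry_uris(registry_file):
    """Collect string "Uri" values from a MessageRegistryFile document.

    Entries whose "Uri" is missing, empty, or not a string are logged and
    skipped, so one malformed registry does not abort processing.
    (Hypothetical helper, not part of sushy.)
    """
    uris = []
    for location in registry_file.get("Location", []):
        uri = location.get("Uri")
        if isinstance(uri, str) and uri:
            uris.append(uri)
        else:
            LOG.warning("Ignoring malformed Uri %r in registry %s",
                        uri, registry_file.get("Id"))
    return uris


# Trimmed copy of the iLO response from comment 10.
ilo_events = {
    "Id": "iLOEvents",
    "Location": [
        {"Language": "en",
         "Uri": {"extref": "/redfish/v1/RegistryStore/registries/en/iLOEvents.json/"}},
    ],
}

# The object-valued "Uri" is skipped, so no usable location remains.
print(usable_registry_uris(ilo_events))
```

With the iLO document the helper returns an empty list, so a caller could treat the registry as unavailable instead of failing the power state query.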

Comment 12 Ilya Etingof 2020-03-12 15:43:46 UTC
It came up upstream that the client should include the OData-Version header [1] in all queries. We have had this in place in sushy since Train (sushy > 2.0.0), so if you are using sushy > 2.0.0, then iLO is to blame and the linked patch might help. Otherwise you need to upgrade sushy and try again.

1. https://github.com/openstack/sushy/blob/stable/train/sushy/connector.py#L83
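
For checking by hand what a request with this header looks like, here is a small standard-library sketch. It does not show how sushy sets the header; `build_redfish_request` is a hypothetical helper, the address and system path come from the node JSON in the description, and the password is the redacted placeholder used there. The request is only built, not sent.

```python
import base64
import urllib.request


def build_redfish_request(url, user, password):
    """Build (but do not send) a GET request carrying OData-Version: 4.0.

    Hypothetical helper for manual checks, not part of sushy.
    """
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    headers = {
        "OData-Version": "4.0",
        "Authorization": "Basic " + token,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


req = build_redfish_request(
    "https://10.50.101.21/redfish/v1/Systems/1", "Administrator", "XXXXXXX")

# urllib stores header names capitalized, hence "Odata-version" here.
print(req.get_method(), req.full_url, req.get_header("Odata-version"))
```

Sending the request with `urllib.request.urlopen` against a BMC that uses a self-signed certificate would also need an SSL context with verification disabled, comparable to `--insecure` in the curl call from comment 7.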

Comment 13 Bob Fournier 2020-03-13 15:07:05 UTC
Chris - can you try the referenced patch?  Thanks.

Comment 20 errata-xmlrpc 2020-07-29 07:50:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148

Comment 21 Christopher Brown 2020-07-30 08:23:26 UTC
Thanks for fixing. Was unable to test myself in the end.