Bug 1820689

Summary: HTTPError should raise the content of @Message.ExtendedInfo when the code is iLO.0.10.ExtendedInfo
Product: Red Hat OpenStack
Reporter: David Vallee Delisle <dvd>
Component: python-sushy
Assignee: RHOS Maint <rhos-maint>
Status: CLOSED NOTABUG
QA Contact: Arik Chernetsky <achernet>
Severity: medium
Docs Contact:
Priority: low
Version: 16.0 (Train)
CC: achernet, athomas, bfournie, dsneddon, eduen, hbrock, ietingof, jkreger, jslagle, mburns, mivollme, stendulker
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-08-24 17:41:12 UTC
Type: Bug

Description David Vallee Delisle 2020-04-03 15:36:09 UTC
Description of problem:
With proliantutils at least, the error message that gets raised is often of no value at all [1].
For example:
iLO.0.10.ExtendedInfo: See @Message.ExtendedInfo for more information.

The useful message is actually in the ExtendedInfo:
~~~
{'error': {'code': 'iLO.0.10.ExtendedInfo', 'message': 'See @Message.ExtendedInfo for more information.', '@Message.ExtendedInfo': [{'MessageArgs': ['BootSourceOverrideTarget'], 'MessageId': 'iLO.2.13.UnableToModifyDuringSystemPOST'}]}}
~~~

Version-Release number of selected component (if applicable):
master

How reproducible:
All the time

Steps to Reproduce:
1. Trigger a failed introspection
2. Look at the ironic-conductor logs

Actual results:
The logged error contains nothing clear or useful

Expected results:
The logged error should make it clear why the operation is failing



Additional info:

This also hides another issue, for which I'll open a BZ for HardProv: after a first introspection failure, ironic should probably shut down the node before attempting to set the BootOverride.

I believe the change belongs here [a]. I wonder whether the right approach would be to look up the code and fetch the extended info only when necessary.
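
To illustrate the idea, here is a minimal sketch of that lookup, not the actual sushy code. The function name describe_redfish_error is hypothetical, and the assumption that a code ending in ".ExtendedInfo" is the right trigger comes only from the payload quoted above.

```python
def describe_redfish_error(payload):
    """Build a one-line description of a Redfish error payload.

    Hypothetical helper for illustration; not part of sushy or proliantutils.
    """
    error = payload.get('error', {})
    code = error.get('code', '')
    message = error.get('message', '')
    summary = '%s: %s' % (code, message)

    # Assumption: only codes ending in '.ExtendedInfo' need the extra lookup.
    if not code.endswith('.ExtendedInfo'):
        return summary

    details = []
    for entry in error.get('@Message.ExtendedInfo', []):
        message_id = entry.get('MessageId', 'unknown')
        args = entry.get('MessageArgs') or []
        if args:
            joined = ', '.join(str(arg) for arg in args)
            details.append('%s (args: %s)' % (message_id, joined))
        else:
            details.append(message_id)

    if not details:
        return summary
    return '%s Extended info: %s' % (summary, '; '.join(details))


# The payload from the description above.
payload = {'error': {
    'code': 'iLO.0.10.ExtendedInfo',
    'message': 'See @Message.ExtendedInfo for more information.',
    '@Message.ExtendedInfo': [{
        'MessageArgs': ['BootSourceOverrideTarget'],
        'MessageId': 'iLO.2.13.UnableToModifyDuringSystemPOST'}]}}

print(describe_redfish_error(payload))
```

With something like this, the generic message stays untouched for ordinary codes and the MessageId and MessageArgs are appended only when the code points at the extended info.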



[a] https://github.com/openstack/sushy/blob/master/sushy/exceptions.py#L103-L109
[1]
~~~
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/ilo/management.py", line 279, in set_boot_device
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     ilo_object.set_one_time_boot(boot_device)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/proliantutils/ilo/client.py", line 459, in set_one_time_boot
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     return self._call_method('set_one_time_boot', value)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/proliantutils/ilo/client.py", line 341, in _call_method
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     return method(*args, **kwargs)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/proliantutils/redfish/redfish.py", line 610, in set_one_time_boot
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     raise exception.IloError(msg)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server proliantutils.exception.IloError: [iLO xxx] The Redfish controller failed to set one time boot device NETWORK. Error: HTTP PATCH https://xxx/redfish/v1/Systems/1 returned code 400. iLO.0.10.ExtendedInfo: See @Message.ExtendedInfo for more information.
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     result = f(*args, **kwargs)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 235, in inner
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     return func(*args, **kwargs)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/ironic/conductor/manager.py", line 3034, in set_boot_device
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     persistent=persistent)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     result = f(*args, **kwargs)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/ilo/management.py", line 286, in set_boot_device
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server     error=ilo_exception)
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server ironic.common.exception.IloOperationError: Setting pxe as boot device failed, error: [iLO xxx] The Redfish controller failed to set one time boot device NETWORK. Error: HTTP PATCH https://xxx/redfish/v1/Systems/1 returned code 400. iLO.0.10.ExtendedInfo: See @Message.ExtendedInfo for more information.
ironic/ironic-conductor.log:2020-04-01 17:38:06.494 7 ERROR oslo_messaging.rpc.server
~~~

Comment 1 Bob Fournier 2020-04-03 15:51:14 UTC
Seems to be a proliantutils-specific issue; changing component.

Comment 2 Bob Fournier 2020-04-03 16:27:36 UTC
Sushy is a requirement for the proliantutils Redfish implementation [1]. The useful error probably resides in an OEM object (iLO.0.10.ExtendedInfo) that sushy has no clue about, so it would be up to proliantutils to pull the error message from sushy and re-raise it properly.
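
A rough sketch of what such a re-raise could look like, under stated assumptions. FakeHTTPError and FakeIloError are stand-ins written for this example, not the real sushy or proliantutils classes, and the assumption that the lower-level exception exposes the parsed error body as a "body" attribute is mine.

```python
class FakeHTTPError(Exception):
    """Stand-in for the lower-level HTTP error (hypothetical)."""

    def __init__(self, message, body):
        super().__init__(message)
        # Assumed attribute: the dict found under the 'error' key.
        self.body = body


class FakeIloError(Exception):
    """Stand-in for the error shown to the operator (hypothetical)."""


def message_ids(body):
    """Collect the MessageId values from the extended info, if any."""
    entries = body.get('@Message.ExtendedInfo', [])
    return [entry['MessageId'] for entry in entries if 'MessageId' in entry]


def set_one_time_boot(patch, device):
    """Call patch() and re-raise failures with the extended info appended."""
    try:
        patch({'Boot': {'BootSourceOverrideTarget': device}})
    except FakeHTTPError as exc:
        ids = message_ids(exc.body)
        suffix = ' Extended info: %s' % ', '.join(ids) if ids else ''
        raise FakeIloError(
            'Failed to set one time boot device %s. Error: %s%s'
            % (device, exc, suffix)) from exc


def failing_patch(data):
    """Simulate the 400 response quoted in the description."""
    raise FakeHTTPError(
        'iLO.0.10.ExtendedInfo: See @Message.ExtendedInfo for more information.',
        {'code': 'iLO.0.10.ExtendedInfo',
         '@Message.ExtendedInfo': [{
             'MessageArgs': ['BootSourceOverrideTarget'],
             'MessageId': 'iLO.2.13.UnableToModifyDuringSystemPOST'}]})


try:
    set_one_time_boot(failing_patch, 'NETWORK')
except FakeIloError as exc:
    print(exc)
```

Chaining with "from exc" keeps the original exception in the traceback, so the generic message is still visible next to the more specific one.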


[1] https://opendev.org/x/proliantutils/src/branch/master/requirements.txt

Comment 5 Julia Kreger 2020-08-12 18:28:40 UTC
I was doing some soul searching on this item earlier and I think https://review.opendev.org/745994 might work to raise such information for operators. If it works, it should at least make it easier for operators to identify when errors are occurring and what the possible cause or remedy is, since this can also happen with non-OEM wrapped interfaces. As such, I'm going to move this back to sushy and associate it with the bug fix targeting 17. Ideally, it should merge by then.

Comment 7 Julia Kreger 2020-08-24 17:41:12 UTC
The upstream change has merged. Given the failure case and information, I don't think there is a need to backport the change at this time. As such, I'm going to close this item as not a bug, since it is really not a defect in the software so much as a deficiency in surfacing what the issue actually is. If there is a need for this to be backported, please re-open this item. Thanks!