Bug 1910739 - Redfish-virtualmedia (idrac) deploy fails on "The Virtual Media image server is already connected"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Derek Higgins
QA Contact: Lubov
URL:
Whiteboard:
Depends On:
Blocks: dit
Reported: 2020-12-24 11:23 UTC by Lubov
Modified: 2021-03-23 20:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when using virtual media on a Dell system, the deployment would fail if virtual media was already attached before the deployment commenced. Ironic now retries the virtual media insert if this occurs.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:48:29 UTC
Target Upstream Version:


Attachments
VirtMediaAutoAttach.png (40.67 KB, image/png)
2020-12-24 11:23 UTC, Lubov
ironic conductor log (358.50 KB, text/plain)
2020-12-24 11:24 UTC, Lubov


Links
System ID Private Priority Status Summary Last Updated
Github openshift ironic-image pull 143 0 None closed Bug 1910739: Update ironic version to fix idrac bug 2021-02-15 10:33:11 UTC
OpenStack gerrit 770270 0 None MERGED Add a delay/retry is vmedia insert fails 2021-02-15 10:33:10 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:48:48 UTC

Description Lubov 2020-12-24 11:23:47 UTC
Created attachment 1741710 [details]
VirtMediaAutoAttach.png

Description of problem:
Re-deploying on a Dell setup using virtualmedia fails after a previous deployment attempt using virtualmedia failed.

The deployment fails with error 
Error: could not inspect: could not inspect node, node is currently 'inspect failed', last error was 'Failed to inspect hardware. Reason: unable to start inspection: HTTP POST https://10.46.61.40/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD/Actions/VirtualMedia.InsertMedia returned code 500. Base.1.5.GeneralError: The Virtual Media image server is already connected. Extended information: [{'Message': 'The Virtual Media image server is already connected.', 'MessageArgs': [], 'MessageArgs@odata.count': 0, 'MessageId': 'IDRAC.2.1.VRM0012', 'RelatedProperties': [], 'RelatedProperties@odata.count': 0, 'Resolution': 'No response action is required.', 'Severity': 'Informational'}]
 

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-12-21-131655

How reproducible:
100%

Steps to Reproduce:
1. Ensure virtualmedia is already attached
2. Run deployment using virtualmedia

Actual results:
deployment fails

Expected results:
ironic should disconnect the existing image before connecting a new one

Additional info:

I've seen https://bugzilla.redhat.com/show_bug.cgi?id=1861025 for 4.6 and the solution provided there, but on our setup Attach Mode is already set to AutoAttach (see attached png)

Comment 1 Lubov 2020-12-24 11:24:58 UTC
Created attachment 1741711 [details]
ironic conductor log

Comment 2 rlopez 2021-01-06 15:37:27 UTC
Hi, Lubov,

What Dell hardware is it and what firmware version are you running?

Comment 3 Lubov 2021-01-07 20:01:56 UTC
(In reply to rlopez from comment #2)
> Hi, Lubov,
> 
> What Dell hardware is it and what firmware version are you running?

Model: PowerEdge R740
BIOS Version: 2.8.1
iDRAC Firmware Version: 4.22.00.00

Comment 4 rlopez 2021-01-07 20:27:07 UTC
Lubov,

I went ahead and compared my virtual media configuration to yours. The only difference is that Floppy Emulation is set to Enabled on mine.

Yesterday I tested nightly 4.7.0-0.nightly-2020-12-21-131655 and it installed successfully using virtual media.

My config was:

PowerEdge R640 
BIOS: 2.8.1
iDRAC FW: 4.20.20.20

Options to try:

1) The latest iDRAC driver is now 4.40.00.00, available here: https://www.dell.com/support/home/en-us/drivers/driversdetails?driverid=62gw1&oscode=rhel8&productcode=poweredge-r740 -- maybe worth trying this latest iDRAC FW to see if you get the same result?

2) Downgrade to 4.20.20.20 and see if that works (since it works for me): https://www.dell.com/support/home/en-us/drivers/driversdetails?driverid=369m3&oscode=rhel8&productcode=poweredge-r740

Comment 5 rlopez 2021-01-07 20:34:07 UTC
To add, I'm quickly going to update my R640s to the latest 4.40.00.00 and report back.

Comment 8 rlopez 2021-01-08 19:40:28 UTC
So 4.40.00.00 indeed has issues. Seems like you indeed need specifically version 4.20.20.20 to proceed.

Comment 9 rlopez 2021-01-09 04:23:37 UTC
Went ahead and tested the following firmware versions and these all worked for me. 

4.32.10.00
4.22.00.53
4.22.00.00

Comment 12 Derek Higgins 2021-01-11 14:57:37 UTC
The problem here seems to have nothing to do with the redfish driver, only that vmedia is left attached after the redfish driver job fails.

The ironic code does a vmedia eject immediately followed by an insert.
This fails because the eject does not appear to be synchronous.
Waiting 2 seconds between the eject and the insert is enough for it to succeed. A retry might also be an option. I don't see any callback URL after the eject to check whether it's done.

[root@localhost html]# bash -x eject_insert.sh
+ curl -v -u XXX -k -H 'Content-Type: application/json' -H 'OData-Version: 4.0' https://10.46.61.41/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD/Actions/VirtualMedia.EjectMedia -d '{}'
> POST /redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD/Actions/VirtualMedia.EjectMedia HTTP/1.1
> Host: 10.46.61.41
> Content-Type: application/json
> OData-Version: 4.0
>
< HTTP/1.1 204 No Content
< Date: Mon, 11 Jan 2021 20:55:05 GMT
< Server: Apache
< X-Frame-Options: DENY
< Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
<
* Connection #0 to host 10.46.61.41 left intact
+ curl -s -u XXX -k -H 'Content-Type: application/json' -H 'OData-Version: 4.0' https://10.46.61.41/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD/Actions/VirtualMedia.InsertMedia -d '{"Image": "http://10.46.59.242:80/image.iso?filename=tmp4iq8w0u.iso", "Inserted": true, "WriteProtected": true}'
{"error":{"@Message.ExtendedInfo":[{"Message":"The Virtual Media image server is already connected.","MessageArgs":[],"MessageArgs@odata.count":0,"MessageId":"IDRAC.2.1.VRM0012","RelatedProperties":[],"RelatedProperties@odata.count":0,"Resolution":"No response action is required.","Severity":"Informational"}],"code":"Base.1.5.GeneralError","message":"A general error has occurred. See ExtendedInfo for more information"}}
+ sleep 2
+ curl -s -u XXX -k -H 'Content-Type: application/json' -H 'OData-Version: 4.0' https://10.46.61.41/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD/Actions/VirtualMedia.InsertMedia -d '{"Image": "http://10.46.59.242:80/image.iso?filename=tmp4iq8w0u.iso", "Inserted": true, "WriteProtected": true}'
[root@localhost html]#
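
Editor's sketch (not the upstream Ironic change): the transcript above suggests that retrying the insert after a short delay is enough to get past the "already connected" error. The shell functions below show one way that could look when driving the BMC by hand. The names retry_with_delay and vmedia_action, and the variables BMC and CREDS, are hypothetical placeholders and do not appear in the original script. curl's -f flag is added so that an HTTP 500 produces a non-zero exit status, which the retry loop depends on.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping
# DELAY seconds between failed attempts.
# Usage: retry_with_delay ATTEMPTS DELAY COMMAND [ARGS...]
retry_with_delay() {
    local attempts="$1"
    local delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # No point sleeping after the last failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Hypothetical wrapper around the Redfish calls from the transcript.
# BMC (address) and CREDS (user:password) are assumed to be set by the caller.
# $1 = EjectMedia or InsertMedia, $2 = JSON request body
vmedia_action() {
    curl -sS -f -k -u "$CREDS" \
        -H 'Content-Type: application/json' -H 'OData-Version: 4.0' \
        "https://$BMC/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD/Actions/VirtualMedia.$1" \
        -d "$2"
}
```

With BMC, CREDS and an image URL set, the eject/insert sequence could then be run as `vmedia_action EjectMedia '{}'` followed by `retry_with_delay 5 2 vmedia_action InsertMedia '{"Image": "<image url>", "Inserted": true, "WriteProtected": true}'`. The attempt count of 5 is an arbitrary choice; the 2 second delay is the value that worked in the transcript.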

Comment 14 Derek Higgins 2021-01-27 14:24:44 UTC
(In reply to Derek Higgins from comment #12)
> the problem here seems to be nothing to do with the redfish driver, only
> that vmedia is left attached after the redfish driver job fails
> 
> the ironic code is doing an eject vmedia immediately followed by an insert
> this fails because it looks like the eject isn't synchronous
> waiting 2 seconds between the eject/insert is enough for it to succeed, a
> retry might also be an option, I don't see any callback url after the eject
> to check if its done

Removing the blocker flag as removing any vmedia before starting the deployment will prevent this occurring.
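
Editor's sketch of that workaround, under assumptions: the standard Redfish VirtualMedia resource exposes an "Inserted" property, so one could read the CD resource and eject only when something is attached. Whether this iDRAC firmware reports "Inserted" reliably for every kind of attachment has not been verified here. The function names and the BMC and CREDS variables are hypothetical placeholders.

```shell
# Reads a Redfish VirtualMedia resource (JSON) on stdin and succeeds
# if it reports "Inserted": true. A plain grep is used instead of a JSON
# parser to keep the sketch dependency-free.
vmedia_is_inserted() {
    grep -Eq '"Inserted"[[:space:]]*:[[:space:]]*true'
}

# Ejects the virtual CD only if the BMC reports media as inserted.
# BMC (address) and CREDS (user:password) are assumed to be set by the caller.
eject_if_inserted() {
    local url="https://$BMC/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD"
    if curl -sS -f -k -u "$CREDS" "$url" | vmedia_is_inserted; then
        curl -sS -f -k -u "$CREDS" \
            -H 'Content-Type: application/json' -H 'OData-Version: 4.0' \
            "$url/Actions/VirtualMedia.EjectMedia" -d '{}'
    fi
}
```

Since the eject does not appear to be synchronous (comment #12), it is probably wise to wait a few seconds after eject_if_inserted before starting the deployment.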

Comment 16 Lubov 2021-02-01 11:17:03 UTC
Verified on 4.7.0-0.nightly-2021-01-31-031653 by running the scenario mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1910739#c10

Comment 19 errata-xmlrpc 2021-02-24 15:48:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

