Bug 1379010
| Summary: | Cisco UCS 200-M3 blades will not properly boot over the network from IPMI without skipping setting boot device with the IPMI driver | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Andreas Karis <akaris> |
| Component: | openstack-ironic | Assignee: | Dmitry Tantsur <dtantsur> |
| Status: | CLOSED ERRATA | QA Contact: | Alistair Tonner <atonner> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.0 (Liberty) | CC: | akaris, asoni, astupnik, atonner, bfournie, dbecker, dtantsur, jraju, lmartins, mburns, racedoro, rhel-osp-director-maint, srevivo, tony.pearce |
| Target Milestone: | Upstream M1 | Keywords: | Triaged |
| Target Release: | 15.0 (Stein) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-ironic-12.1.1-0.20190427020357.d537f13.el8ost | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1627041 | Environment: | |
| Last Closed: | 2019-09-21 11:15:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1627041, 1627043 | | |
Description
Andreas Karis
2016-09-23 22:22:06 UTC
We are going to contact Cisco and figure out why 'chassis bootdev pxe' is not working on the M3
==> is there a workaround / different IPMI command that we need to send?

*) with the conclusions from Cisco, we can hopefully make changes to the ipmi_pxetool:
==> either give users a choice not to force the `bootdev pxe`
==> detect the vendor / chassis version and do not force pxe boot if it's an M3 / respectively use the correct command as provided by Cisco

(In reply to Andreas Karis from comment #1)
> We are going to contact Cisco and figure out why 'chassis bootdev pxe' is
> not working on the M3
> ==> is there a workaround / different IPMI command that we need to send?
> *) with the conclusions from Cisco, we can hopefully make changes to the
> ipmi_pxetool:

There's a specific driver for Cisco UCS [0], have you tried to use it?

> ==> either give users a choice not to force the `bootdev pxe`

If we don't set the boot device, we would have to ask the operator to change the boot device manually before starting deployment. That would be a manual step, right? By your report it seems to be a bug in the M3 model; it should not overwrite the boot order in the profile.

> ==> detect the vendor / chassis version and do not force pxe boot if it's an
> M3 / respectively use the correct command as provided by Cisco

So, the ipmitool driver is supposed to be a generic driver. We shouldn't add commands for specific hardware models; that's why I think you should try the UCS driver for Ironic [0], because that's a specific driver for Cisco hardware and we could have this type of code there.

...

Please lemme know if using the UCS driver [0] works for you.

[0] http://docs.openstack.org/developer/ironic/drivers/ucs.html

Also, can you please check if the firmware is up to date? This behavior/bug might have been fixed in newer releases.

We are trying to get clarification from Cisco as well. About the UCS driver .. we are a bit reluctant to use it, because the UCS driver tends to generate a lot of other issues (call it bad experience or "the burnt child dreads the fire"). We will talk this through and may try to use it.

I will get back to you as soon as I have further information.

(In reply to Andreas Karis from comment #4)
> We are trying to get clarification from Cisco as well. About the UCS driver
> .. we are a bit reluctant to use it, because the UCS driver tends to
> generate a lot of other issues (call it bad experience or "the burnt child
> dreads the fire"). We will talk this through and may try to use it.
>
> I will get back to you as soon as I have further information.

Thank you, I will wait for an update on this then (will leave the NEEDINFO flag active).

Also, good feedback on the UCS driver. If you can elaborate on it a bit more, I can raise the problem upstream and fix it, or at least make the Ironic Cisco developers aware of it.

Thanks again,
Lucas

From one of our consultants about the UCS driver:

"From my experience, you can't import nodes with the UCS driver on OSPd 7/8. I went through a rotation of parameters it wanted / wanted removed, until I got back to where I started from. I believe the parsing util for the syntax looks for both pm attributes and the ucs attributes, and complains if they are both there or missing. I know that sounds confusing, but I left the exercise with a headache, and not able to import a node."

Still trying to get hands on that info from Cisco.

(In reply to Andreas Karis from comment #6)
> From one of our consultants about the UCS driver:
>
> "From my experience, you can't import nodes with the UCS driver on OSPd 7/8.
> I went through a rotation of parameters it wanted / wanted removed, until I
> got back to where I started from. I believe the parsing util for the syntax
> looks for both pm attributes and the ucs attributes, and complains if they
> are both there or missing. I know that sounds confusing, but I left the
> exercise with a headache, and not able to import a node."
>
> Still trying to get hands on that info from Cisco.

Oh I see, I think he's talking about https://bugzilla.redhat.com/show_bug.cgi?id=1290338

Coming back to this:
~~~
> ==> either give users a choice not to force the `bootdev pxe`
If we don't set the boot device, we would have to ask the operator to change the boot device manually before starting deployment. That would be a manual step, right?
~~~
Would it be possible to give users a choice when using the ipmi_pxe driver to simply power on/off the machines, and _not_ set the boot order? In that case, users would be required to manually set their machines to pxe boot, and the ipmi_pxe driver would only take care of power on/off, and it wouldn't touch user defined settings. I can create an RFE for that, as it looks like a newish feature.
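
The request above (keep power control, but leave the boot order alone) can be sketched as a small toy model. This is only an illustration under stated assumptions, not Ironic code: `FakeBmc`, `IpmitoolManagement`, `NoopManagement`, and `deploy` are hypothetical names, and the BMC is a stand-in that merely records the ipmitool subcommands a driver would send. The `NoopManagement` name echoes the `noop` management interface that shows up in the verification output later in this report; how that interface is actually implemented is not shown here.

```python
"""Toy model of a deploy flow with a pluggable management interface.

Every class and function name here is hypothetical; none of this is
Ironic's real API.
"""
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeBmc:
    """Stand-in for a node's BMC: records the commands it is sent."""
    commands: List[str] = field(default_factory=list)

    def send(self, command: str) -> None:
        self.commands.append(command)


class IpmitoolManagement:
    """Forces the boot device before power-on (the behavior reported here)."""

    def set_boot_device(self, bmc: FakeBmc, device: str) -> None:
        bmc.send(f"chassis bootdev {device}")


class NoopManagement:
    """Leaves the operator-configured boot order untouched."""

    def set_boot_device(self, bmc: FakeBmc, device: str) -> None:
        pass  # intentionally does nothing


def deploy(bmc: FakeBmc, management) -> None:
    """Ask the management interface for PXE boot, then power the node on."""
    management.set_boot_device(bmc, "pxe")
    bmc.send("chassis power on")


forced, untouched = FakeBmc(), FakeBmc()
deploy(forced, IpmitoolManagement())
deploy(untouched, NoopManagement())
print(forced.commands)     # ['chassis bootdev pxe', 'chassis power on']
print(untouched.commands)  # ['chassis power on']
```

With the no-op variant the node only ever receives the power command, so the operator has to configure PXE boot on the machine beforehand, which is the manual step discussed earlier in this thread.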
Hi all!

The bootdev behavior is likely a bug in the firmware, I'd suggest reaching out to Cisco with it. I'm converting this bug to an RFE, but I'm not sure it will be accepted upstream, as it's going to be a workaround for a firmware bug.

P.S. Please do not set NEEDINFO to get attention to your bug, this is not what this field is for.

Hi,

No problem, thank you, and thanks for the advice!

- Andreas

Proposing upstream in https://storyboard.openstack.org/#!/story/2003203, will report back when it's accepted or rejected.

Per https://bugzilla.redhat.com/show_bug.cgi?id=1627041#c7

core_puddle_version: RHOS_TRUNK-15.0-RHEL-8-20190722.n.1
python3-tripleoclient.noarch 11.5.1-0.20190719020420.bffda01.el8ost

openstack baremetal node list --fields name management_interface
+--------------+----------------------+
| Name         | Management Interface |
+--------------+----------------------+
| ceph-0       | noop                 |
| ceph-1       | ipmitool             |
| ceph-2       | ipmitool             |
| compute-0    | noop                 |
| compute-1    | ipmitool             |
| controller-0 | noop                 |
| controller-1 | ipmitool             |
| controller-2 | ipmitool             |
| ironic-0     | ipmitool             |
| ironic-1     | ipmitool             |
+--------------+----------------------+

for i in compute controller ceph ; do for j in 0 1 2; do echo "$i-$j"; virsh dumpxml $i-$j |grep -A 4 '<os>'; echo; done; done

compute-0
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='network'/>
    <boot dev='hd'/>
  </os>

compute-1
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>

controller-0
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='network'/>
    <boot dev='hd'/>
  </os>

controller-1
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>

controller-2
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>

ceph-0
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='network'/>
    <boot dev='hd'/>
  </os>

ceph-1
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>

ceph-2
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>

overcloud_depoy.sh ran successfully and instances were deployed to the overcloud successfully:

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811