Bug 1653788

Summary: Ironic should not treat cpu_arch as mandatory
Product: Red Hat OpenStack
Reporter: Dmitry Tantsur <dtantsur>
Component: openstack-ironic
Assignee: Riccardo Pittau <rpittau>
Status: CLOSED ERRATA
QA Contact: mlammon
Severity: medium
Priority: medium
Version: 14.0 (Rocky)
CC: aarapov, bfournie, mariel, mburns, mlammon, rpittau, tonyb
Target Milestone: z1
Keywords: Triaged, ZStream
Target Release: 14.0 (Rocky)
Flags: mlammon: needinfo-
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-ironic-11.1.2-0.20190112034459.4279258.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, the `cpu_arch` property in bare metal nodes was mandatory. This caused iSCSI deployments to fail if the property was not specified. This fix changes the property to optional, and now you can deploy iSCSI nodes without setting this property.
Story Points: ---
Clone Of:
: 1688838 (view as bug list) Environment:
Last Closed: 2019-03-18 12:56:44 UTC
Type: Bug
Bug Blocks: 1688838

Description Dmitry Tantsur 2018-11-27 16:17:17 UTC
Starting with OSP 14, Ironic fails iSCSI deployments late in the process if cpu_arch is not specified on a node. This is not intentional, and is probably a side effect of https://github.com/openstack/ironic/commit/8f89954f9a3edab71db1157578e9d029a395d2d9. We need to make it optional (again) for the simplicity of single-arch cases.
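To illustrate what "optional" means here, the following is a minimal sketch of a property check that never reports cpu_arch as missing. It is not Ironic's actual validation code: the function name `missing_properties` and its `required` and `optional` parameters are hypothetical, and the property values are only sample data.

```python
def missing_properties(properties, required=(), optional=("cpu_arch",)):
    """Return the required keys that are absent or empty in `properties`.

    Keys listed in `optional` are skipped, so a node without cpu_arch
    is not rejected. All names here are hypothetical, for illustration.
    """
    missing = []
    for key in required:
        if key in optional:
            continue
        if not properties.get(key):
            missing.append(key)
    return missing


# Sample node properties without cpu_arch.
node = {"memory_mb": "32768", "local_gb": "39", "cpus": "8"}

# cpu_arch is requested but optional, so nothing is reported as missing.
print(missing_properties(node, required=("cpu_arch", "local_gb")))
```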

Comment 4 mlammon 2019-03-05 19:55:29 UTC
Failed_QA:

Environment:
(undercloud) [stack@undercloud-0 ~]$ yum info openstack-ironic-conductor
Loaded plugins: search-disabled-repos
Available Packages
Name        : openstack-ironic-conductor
Arch        : noarch
Epoch       : 1
Version     : 11.1.2
Release     : 0.20190112034459.4279258.el7ost
Size        : 5.4 k
Repo        : rhelosp-14.0-puddle/x86_64
Summary     : The Ironic Conductor
URL         : http://www.openstack.org
License     : ASL 2.0
Description : Ironic Conductor for management and provisioning of physical machines

In a virtual deployment, removed cpu_arch from both the controller and compute nodes.

Example:

| properties                 | capabilities:boot_option='local', capabilities:profile='controller', resources:CUSTOM_CONTROLLER='1', resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0' |


openstack baremetal node show controller-1 -f value -c properties
{u'memory_mb': u'32768', u'local_gb': u'39', u'cpus': u'8', u'capabilities': u'profile:controller,boot_option:local'}
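A quick way to confirm that cpu_arch is really absent is to parse the printed properties and inspect the keys. This is only a sketch: it assumes the CLI output is a Python-style dict literal as shown above, and the variable names are made up for the example.

```python
import ast

# The properties value exactly as printed by the command above.
raw = ("{u'memory_mb': u'32768', u'local_gb': u'39', u'cpus': u'8', "
       "u'capabilities': u'profile:controller,boot_option:local'}")

# literal_eval safely parses the dict literal (the u'' prefix is accepted).
props = ast.literal_eval(raw)

# Capabilities are a comma-separated list of key:value pairs.
capabilities = dict(item.split(":", 1) for item in props["capabilities"].split(","))

print("cpu_arch" in props)
print(capabilities)
```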


See the following error in /var/log/containers/nova/nova-compute.log

2019-03-05 14:39:21.899 1 WARNING nova.virt.ironic.driver [req-dca133d4-5358-4801-9018-885613f3f6ad - - - - -] cpu_arch not defined for node '8bf495e2-d763-4ef2-9b96-8e186e33f0

Comment 5 Riccardo Pittau 2019-03-06 10:03:32 UTC
this looks like it may require a patch on nova

Comment 6 Tony Breeds 2019-03-07 22:49:04 UTC
(In reply to Riccardo Pittau from comment #5)
> this looks like it may require a patch on nova

What needs to happen in nova?

Comment 7 Riccardo Pittau 2019-03-08 10:59:14 UTC
Hey Tony, apologies, I should have elaborated more.
I'm not an expert on how nova works under the hood, but having a look at the code in the nova ironic driver, my understanding is that this is the expected behavior when the cpu_arch is not set, so it looks like this is working as intended.
If the cpu_arch is not set, nova will return an empty list of supported instances for a node, and the deploy will not happen.
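The behavior described above can be sketched roughly as follows. This is a simplified illustration, not the nova ironic driver itself: the function names, the tuple layout, and the "baremetal"/"hvm" strings are assumptions made for the example.

```python
def supported_instances(cpu_arch=None):
    """Return the (arch, hypervisor type, VM mode) tuples a node supports.

    Hypothetical stand-in for the driver logic: with no cpu_arch the
    list is empty.
    """
    if not cpu_arch:
        return []
    return [(cpu_arch, "baremetal", "hvm")]


def can_deploy(node_properties):
    """Simplified assumption: a node is usable only if it supports an instance."""
    return bool(supported_instances(node_properties.get("cpu_arch")))


with_arch = {"cpu_arch": "x86_64", "cpus": "8"}
without_arch = {"cpus": "8"}

print(supported_instances(with_arch.get("cpu_arch")))
print(can_deploy(without_arch))
```

Under these assumptions, a node without cpu_arch advertises nothing it can run, which would be consistent with the deploy not happening.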

Comment 8 Dmitry Tantsur 2019-03-11 16:43:58 UTC
Mike, did the deployment actually fail? Also this issue is about ironic validation, not about nova behavior.

Comment 9 mlammon 2019-03-11 19:40:52 UTC
Dmitry,
Yes, the deployment failed when we removed the cpu_arch.  Let me know if we need to re-create an environment?

Comment 10 Dmitry Tantsur 2019-03-12 09:01:53 UTC
Then could you paste the actual failure please? It will be interesting to see the cause.

Comment 11 mlammon 2019-03-12 14:27:29 UTC
openstack stack failures list overcloud
overcloud.Controller.1.Controller:
  resource_type: OS::TripleO::ControllerServer
  physical_resource_id: 65456b08-d180-4cdd-8aa8-e7bac455a44a
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
overcloud.Controller.0.Controller:
  resource_type: OS::TripleO::ControllerServer
  physical_resource_id: f7a0d60c-cb48-4783-87c8-b53222eeb66d
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
overcloud.Controller.2.Controller:
  resource_type: OS::TripleO::ControllerServer
  physical_resource_id: db25708e-d848-444d-87d1-ec0fd9c2e328
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
overcloud.Compute.0.NovaCompute:
  resource_type: OS::TripleO::ComputeServer
  physical_resource_id: 85607a65-e299-42f1-afb6-7a2cea8fa4ce
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

Comment 14 errata-xmlrpc 2019-03-18 12:56:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0587