Bug 1421837 - rhos-director: deployment with custom ironic role: cleaning fails KeystoneFailure: 'NoneType' object has no attribute 'find'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 11.0 (Ocata)
Assignee: Steven Hardy
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-13 19:40 UTC by Alexander Chuzhoy
Modified: 2017-05-17 19:59 UTC
CC List: 12 users

Fixed In Version: openstack-tripleo-heat-templates-6.0.0-0.20170222195630.46117f4.el7ost.noarch.rpm
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-17 19:59:09 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1664524 | 0 | None | None | None | 2017-02-14 11:26:18 UTC
OpenStack gerrit | 431425 | 0 | None | MERGED | Make the DB URIs host-independent for all services | 2021-02-19 18:32:32 UTC
OpenStack gerrit | 436192 | 0 | None | MERGED | Make the DB URIs host-independent for all services | 2021-02-19 18:32:32 UTC
Red Hat Product Errata | RHEA-2017:1245 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory | 2017-05-17 23:01:50 UTC

Description Alexander Chuzhoy 2017-02-13 19:40:46 UTC
rhos-director: deployment with custom ironic role: cleaning fails KeystoneFailure: 'NoneType' object has no attribute 'find'


Environment:
openstack-ironic-conductor-6.2.1-0.20170126214641.2427c7b.el7ost.noarch
openstack-ironic-inspector-4.2.1-0.20170126202204.d557080.el7ost.noarch
instack-undercloud-6.0.0-0.20170127055514.317db76.el7ost.noarch
puppet-ironic-10.1.0-0.20170126121442.d991f28.el7ost.noarch
openstack-tripleo-heat-templates-6.0.0-0.20170127041112.ce54697.el7ost.1.noarch
openstack-ironic-common-6.2.1-0.20170126214641.2427c7b.el7ost.noarch
python-ironic-lib-2.5.2-0.20170123115307.ace87b6.el7ost.noarch
openstack-ironic-api-6.2.1-0.20170126214641.2427c7b.el7ost.noarch
python-ironicclient-1.10.0-0.20170120194459.808a4cb.el7ost.noarch
python-ironic-inspector-client-1.10.0-0.20161219133602.0eae82e.el7ost.noarch



Steps to reproduce:
1) Successfully deploy the overcloud with a custom role for ironic:

- name: Ironic
  CountDefault: 1
  HostnameFormatDefault: '%stackname%-ironic-%index%'
  disable_upgrade_deployment: True
  ServicesDefault:
    - OS::TripleO::Services::IronicConductor
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::FluentdClient



This was the deployment command:
openstack overcloud deploy --templates -r /home/stack/roles_data.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e virt/ceph.yaml -e /home/stack/network-isolation.yaml -e virt/network/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml -e ironic.yaml -e flat_networks.yaml




[stack@undercloud-0 ~]$ cat ironic.yaml 
parameter_defaults:
  IronicEnabledDrivers:
      - pxe_ssh
  NovaSchedulerDefaultFilters:
      - RetryFilter
      - AggregateInstanceExtraSpecsFilter
      - AvailabilityZoneFilter
      - RamFilter
      - DiskFilter
      - ComputeFilter
      - ComputeCapabilitiesFilter
      - ImagePropertiesFilter
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  ControllerExtraConfig:
      ironic::drivers::ssh::libvirt_uri: 'qemu:///system'



[stack@undercloud-0 ~]$ cat flat_networks.yaml
parameter_defaults:
  NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
  NeutronFlatNetworks: datacentre,baremetal




[stack@undercloud-0 ~]$ heat stack-list && nova list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+--------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time |
+--------------------------------------+------------+-----------------+----------------------+--------------+
| 904bca90-52f8-40ad-a259-061d6d5f34d8 | overcloud  | CREATE_COMPLETE | 2017-02-11T22:07:45Z | None         |
+--------------------------------------+------------+-----------------+----------------------+--------------+
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks               |
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+
| 5c3cbc5d-b888-4510-8f39-29edd91f9f8d | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.7  |
| 00a2760b-453d-48e8-8a6e-ba101585c338 | overcloud-cephstorage-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.10 |
| 599ad661-bb1a-43a8-b987-07e0ca7a1601 | overcloud-cephstorage-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
| 4b05c551-049e-489d-967f-bc15b96f11a7 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.17 |
| 96b4afed-977a-4cbb-b64c-cc2f062ca321 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
| 9c38e69b-2074-451e-b711-81c7df048b47 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.168.24.14 |
| 3a74c89f-51ce-40e1-a2f3-f09da4309d34 | overcloud-ironic-0      | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
| b96cd88e-8e96-4978-850f-5f805f3436c5 | overcloud-ironic-1      | ACTIVE | -          | Running     | ctlplane=192.168.24.8  |
| ba6a1cac-7c20-4629-9d65-7e8e4c8c1433 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+


2) Attempt to create BM instances in the overcloud following this guide:
http://docs.openstack.org/developer/tripleo-docs/advanced_deployment/baremetal_overcloud.html



"Note: The baremetal node provide command makes a node go through cleaning procedure, so it might take some time depending on the configuration."


Result:
The cleaning fails right away - the node doesn't even boot.
[stack@undercloud-0 ~]$ ironic node-list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| f0f253ef-1644-4116-b5e4-bc483256b81b | bm-0 | None          | power off   | clean failed       | True        |
| 7c835f19-deab-4b5d-aef0-d38463672f46 | bm-1 | None          | power off   | clean failed       | True        |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
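
To pick the failed nodes out of output like the table above, a small filter can help. This is only a sketch: the function name failed_clean_nodes is hypothetical, and it assumes the column layout shown above (Name in the second column, Provisioning State in the fifth).

```shell
# Hypothetical helper: read an `ironic node-list` table on stdin and print
# the names of nodes whose provisioning state is "clean failed".
failed_clean_nodes() {
    awk -F'|' '
        # Data and header rows contain "|" separators; border rows do not.
        NF >= 7 {
            name = $3
            state = $6
            gsub(/^ +| +$/, "", name)    # trim padding around the cell
            gsub(/^ +| +$/, "", state)
            if (state == "clean failed") print name
        }'
}
```

It would be used as: ironic node-list | failed_clean_nodes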



Note:
This works on the same release without moving "OS::TripleO::Services::IronicConductor" to a separate role.

Do we need to move more services to ironic for this to work?

Comment 1 Red Hat Bugzilla Rules Engine 2017-02-13 19:41:23 UTC
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 5 Steven Hardy 2017-02-14 08:16:49 UTC
I think this may be because of the hiera interpolation here:

https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ironic-base.yaml#L64

To confirm, please can you provide the output from the following command, run on one of the controllers, and on the custom ironic role:

sudo hiera tripleo::profile::base::database::mysql::client_bind_address

I suspect this is nil on the ironic role node(s), so we need to define that data differently in the heat template (e.g via the EndpointMap not hiera interpolation).

Comment 6 Steven Hardy 2017-02-14 08:18:12 UTC
You could also collect "sudo hiera ironic::database_connection" in the failing and working cases, and/or inspect the related ironic.conf entry to see if it's malformed.

Comment 7 Ramon Acedo 2017-02-14 09:08:16 UTC
You are right, Steven: the connection is malformed because tripleo::profile::base::database::mysql::client_bind_address is nil in hiera.

[heat-admin@overcloud-ironic-0 ~]$ sudo grep -w ^connection /etc/ironic/ironic.conf
connection = mysql+pymysql://ironic:u6EECMV8Uv2NT4qWnTkx698TE.1.20/ironic?bind_address=
[heat-admin@overcloud-ironic-0 ~]$ sudo hiera ironic::database_connection
mysql+pymysql://ironic:u6EECMV8Uv2NT4qWnTkx698TE.1.20/ironic?bind_address=
[heat-admin@overcloud-ironic-0 ~]$ sudo hiera tripleo::profile::base::database::mysql::client_bind_address
nil

On a controller it is set correctly:

[root@overcloud-controller-1 ~]# sudo hiera tripleo::profile::base::database::mysql::client_bind_address
172.17.1.19
[root@overcloud-controller-1 ~]# grep ^connection /etc/ironic/ironic.conf
connection = mysql+pymysql://ironic:u6EECMV8Uv2NT4qWnTkx698TE.1.20/ironic?bind_address=172.17.1.19
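
The difference between the two nodes comes down to whether the bind_address parameter has a value. A minimal sketch of that check follows. has_empty_bind_address is a hypothetical helper, not part of hiera or ironic, and it only inspects the text of the URI.

```shell
# Hypothetical helper: exit 0 when a connection URI carries a bind_address
# query parameter with no value, exit 1 otherwise.
has_empty_bind_address() {
    case "$1" in
        # Empty value at the very end of the URI.
        *'?bind_address=' | *'&bind_address=')
            return 0 ;;
        # Empty value followed by another query parameter.
        *'?bind_address=&'* | *'&bind_address=&'*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

On a node it could be fed the value from comment #6, for example: has_empty_bind_address "$(sudo hiera ironic::database_connection)" && echo malformed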

Comment 8 Steven Hardy 2017-02-14 10:57:55 UTC
OK, so we need to replace the

"%{hiera('tripleo::profile::base::database::mysql::client_bind_address')}" line with a reference to the local bind IP from the ServiceNetMap.

So it should probably look like:
  
  {get_param: [ServiceNetMap, MysqlNetwork]}

Same as in tht/puppet/services/database/mysql.yaml

I'm not quite sure why this was done using hiera interpolation, but the same problem exists in various services, which will all break in the same way:

$ grep -R client_bind_address ./* | grep hiera | cut -d: -f1 | sort | uniq
./aodh-base.yaml
./barbican-api.yaml
./ceilometer-base.yaml
./cinder-base.yaml
./ec2-api.yaml
./glance-api.yaml
./glance-registry.yaml.bak
./gnocchi-base.yaml
./heat-engine.yaml
./ironic-base.yaml
./keystone.yaml
./manila-base.yaml
./mistral-base.yaml
./neutron-api.yaml
./neutron-plugin-plumgrid.yaml
./nova-base.yaml
./octavia-api.yaml
./panko-base.yaml
./sahara-base.yaml
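
The same search can be wrapped in a function so it is easy to re-run after a fix. This is a sketch under assumptions: files_with_hiera_bind_address is a hypothetical name, and like the pipeline above it only matches lines that mention both hiera and client_bind_address.

```shell
# Hypothetical helper: list files under a directory (default: current
# directory) where a single line mentions both hiera and client_bind_address.
files_with_hiera_bind_address() {
    # -r recurse, -l print each matching file name once, -E extended regex
    grep -rlE 'hiera.*client_bind_address|client_bind_address.*hiera' "${1:-.}" | sort
}
```

Run from the services directory of the templates, an empty result would suggest no service still builds its URI this way.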

Comment 9 Steven Hardy 2017-02-14 10:58:33 UTC
Clearing needinfo as comment #7 provides the information needed to proceed with a fix.

Comment 10 Steven Hardy 2017-02-14 11:26:19 UTC
Upstream bug raised:

https://bugs.launchpad.net/tripleo/+bug/1664524

Comment 11 Steven Hardy 2017-02-14 11:55:33 UTC
Upstream patch pushed, needs further testing/review:

https://review.openstack.org/433607

Comment 12 Alex Schultz 2017-02-23 16:11:44 UTC
This should have been resolved with https://review.openstack.org/#/c/431425/, as we changed this configuration to read the settings from a file instead of passing the IP address as parameters.

Comment 13 Steven Hardy 2017-03-02 15:56:44 UTC
https://review.openstack.org/#/c/431425/ landed; I verified by inspection that it is in openstack-tripleo-heat-templates-6.0.0-0.20170222195630.46117f4.el7ost.noarch.rpm

Comment 14 Alexander Chuzhoy 2017-03-03 21:44:51 UTC
Environment:
openstack-tripleo-heat-templates-6.0.0-0.20170222195630.46117f4.el7ost.noarch

What happens now is that the node changes status to "cleaning":
[stack@undercloud-0 ~]$ ironic node-list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 0c471098-3720-4e3f-91af-2c7c5f21ad34 | bm-0 | None          | power off   | cleaning           | False       |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+



But cleaning doesn't start, and if I check the node:

[stack@undercloud-0 ~]$ ironic node-show bm-0
+------------------------+--------------------------------------------------------------------------+
| Property               | Value                                                                    |
+------------------------+--------------------------------------------------------------------------+
| chassis_uuid           | None                                                                     |
| clean_step             | {}                                                                       |
| console_enabled        | False                                                                    |
| created_at             | 2017-03-03T21:38:21+00:00                                                |
| driver                 | pxe_ssh                                                                  |
| driver_info            | {u'ssh_username': u'stack', u'deploy_kernel': u'71884d90-73ff-4e5d-bf3f- |
|                        | 7152af9c95ea', u'deploy_ramdisk': u'e9ca53aa-c54b-                       |
|                        | 4b63-a579-11da687ddcfa', u'ssh_key_contents': u'******',                 |
|                        | u'ssh_virt_type': u'virsh', u'ssh_address': u'172.16.0.1'}               |
| driver_internal_info   | {}                                                                       |
| extra                  | {}                                                                       |
| inspection_finished_at | None                                                                     |
| inspection_started_at  | None                                                                     |
| instance_info          | {}                                                                       |
| instance_uuid          | None                                                                     |
| last_error             | Async execution of _do_node_clean failed with error: 'NoneType' object   |
|                        | has no attribute 'find'                                                  |
| maintenance            | False                                                                    |
| maintenance_reason     | None                                                                     |
| name                   | bm-0                                                                     |
| network_interface      |                                                                          |
| power_state            | power off                                                                |
| properties             | {u'memory_mb': u'7822', u'cpu_arch': u'x86_64', u'local_gb': u'41',      |
|                        | u'cpus': u'1'}                                                           |
| provision_state        | cleaning                                                                 |
| provision_updated_at   | 2017-03-03T21:38:35+00:00                                                |
| raid_config            |                                                                          |
| reservation            | None                                                                     |
| resource_class         |                                                                          |
| target_power_state     | None                                                                     |
| target_provision_state | available                                                                |
| target_raid_config     |                                                                          |
| updated_at             | 2017-03-03T21:38:52+00:00                                                |
| uuid                   | 0c471098-3720-4e3f-91af-2c7c5f21ad34                                     |
+------------------------+--------------------------------------------------------------------------+


Note the failure:
Async execution of _do_node_clean failed with error: 'NoneType' object has no attribute 'find'

Comment 15 Ramon Acedo 2017-03-06 13:01:16 UTC
Hi Sasha, could you do the checks done in comment #7 to see if we are still missing the bind address in the database connection information?

Comment 16 Alexander Chuzhoy 2017-03-06 22:12:18 UTC
Seems like it:

[stack@undercloud-0 ~]$ nova list


+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks               |
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+
| cdd0655b-4f6e-4e88-be5b-5ee8390430d3 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.9  |
| bbbfe9e2-97de-4c10-b6b1-0bc0293aa5e6 | overcloud-cephstorage-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 3a7af452-f386-432d-aeb5-96f0e0e14be2 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.19 |
| 569b1b9d-dedf-434b-a33b-bca90c92215f | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.10 |
| 5fae9050-afa0-4e98-b5a5-9767047b293d | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.168.24.18 |
| a33390dd-5cdd-4b69-a0ed-532fc37ded2f | overcloud-ironic-0      | ACTIVE | -          | Running     | ctlplane=192.168.24.22 |
| 943b2aaa-9d77-400f-a250-2202133b4b00 | overcloud-ironic-1      | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
| a094afb3-5c6c-4d23-9182-98554ea2e8c7 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+
[stack@undercloud-0 ~]$ ssh heat-admin@192.168.24.10
[heat-admin@overcloud-controller-1 ~]$ sudo -i
[root@overcloud-controller-1 ~]# sudo hiera tripleo::profile::base::database::mysql::client_bind_address
172.17.1.13

[root@overcloud-controller-1 ~]# grep ^connection /etc/ironic/ironic.conf
connection = mysql+pymysql://ironic:9dkU8tBq4XAFydN4tpRaZFVvr.1.14/ironic?read_default_file=/etc/my.cnf.d/tripleo.cnf&read_default_group=tripleo


[stack@undercloud-0 ~]$ ssh heat-admin@192.168.24.22
[heat-admin@overcloud-ironic-0 ~]$ sudo hiera tripleo::profile::base::database::mysql::client_bind_address
nil
[heat-admin@overcloud-ironic-0 ~]$  sudo hiera ironic::database_connection
mysql+pymysql://ironic:9dkU8tBq4XAFydN4tpRaZFVvr.1.14/ironic?read_default_file=/etc/my.cnf.d/tripleo.cnf&read_default_group=tripleo
[heat-admin@overcloud-ironic-0 ~]$ sudo grep -w ^connection /etc/ironic/ironic.conf
connection = mysql+pymysql://ironic:9dkU8tBq4XAFydN4tpRaZFVvr.1.14/ironic?read_default_file=/etc/my.cnf.d/tripleo.cnf&read_default_group=tripleo

Comment 17 Alexander Chuzhoy 2017-03-16 01:27:37 UTC
Environment:
instack-undercloud-6.0.0-2.el7ost.noarch
openstack-puppet-modules-10.0.0-0.20170307021643.0333c73.el7ost.noarch
openstack-tripleo-heat-templates-6.0.0-0.20170307170102.3134785.0rc2.el7ost.noarch


Was able to deploy and run the clean successfully.

nova list:
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks               |
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+
| a3c941a1-dac0-455c-a9ff-3140d2fa0530 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 2591fe42-8491-4bd6-b824-3ed91fa5ca17 | overcloud-cephstorage-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
| 25ab1cc1-ab8c-424e-aa7e-f84d1888c7a8 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
| b30b51ed-2e71-4fd3-af5a-3c0abc9bae05 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.17 |
| c38cf70a-e39e-47c3-a745-3eb03ea82cb8 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
| cca7b9d0-f40a-4776-9cf7-731ba142fe3f | overcloud-ironic-0      | ACTIVE | -          | Running     | ctlplane=192.168.24.9  |
| 197867aa-aad7-4e48-8b9c-b5b20c5fd763 | overcloud-ironic-1      | ACTIVE | -          | Running     | ctlplane=192.168.24.7  |
| 4793374e-549f-4a63-b2c0-e025c0f43f0c | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.10 |
+--------------------------------------+-------------------------+--------+------------+-------------+------------------------+




- name: Ironic
  CountDefault: 1
  HostnameFormatDefault: '%stackname%-ironic-%index%'
  disable_upgrade_deployment: True
  ServicesDefault:
    - OS::TripleO::Services::IronicConductor
    - OS::TripleO::Services::IronicApi
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::FluentdClient
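
The role definition above lists OS::TripleO::Services::IronicApi, which the role in the original description did not. To compare service lists between two roles files, something like the following sketch may help. role_services is a hypothetical helper, and it assumes the flat layout shown above (a top-level "- name:" entry followed by indented service lines); it is not a general YAML parser.

```shell
# Hypothetical helper: print the OS::TripleO::Services entries of one role.
#   $1: path to a roles file laid out like the snippet above
#   $2: role name, e.g. Ironic
role_services() {
    awk -v role="$2" '
        # A new role starts; remember whether it is the one we want.
        /^- name:/ { in_role = ($3 == role); next }
        # Inside the wanted role, strip the list marker and print the service.
        in_role && /^ +- OS::TripleO::Services::/ { sub(/^ +- /, ""); print }
    ' "$1"
}
```

Saving the output for each roles file and running diff on the two results shows which services were added or removed.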

Comment 18 Steven Hardy 2017-03-16 10:29:11 UTC
The read_default_group=tripleo in comment #16 shows that https://review.openstack.org/#/c/431425 landed. I don't think the ironic issues were related to the bugfix under test here, so moving back to ON_QA.
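
Checking that a connection string points at a client defaults file can be sketched as follows. uses_mysql_defaults_file is a hypothetical name; the two parameter names are taken from the output in comment #16, and the function only tests for their presence in the URI text.

```shell
# Hypothetical helper: exit 0 when the URI names both a MySQL client
# defaults file and a defaults group, exit 1 otherwise.
uses_mysql_defaults_file() {
    case "$1" in
        *read_default_file=*) ;;      # parameter present, keep checking
        *) return 1 ;;
    esac
    case "$1" in
        *read_default_group=*) ;;
        *) return 1 ;;
    esac
    return 0
}
```

For example: uses_mysql_defaults_file "$(sudo hiera ironic::database_connection)" && echo "defaults file in use"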

Comment 19 Alexander Chuzhoy 2017-03-16 13:24:09 UTC
Verified based on Comment #17.

Comment 21 errata-xmlrpc 2017-05-17 19:59:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

