Bug 1624643 - Ironic Overcloud deployment: physical_network and version in ports table are NULL for all the ports
Keywords:
Status: CLOSED DUPLICATE of bug 1624899
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: async
Target Release: 13.0 (Queens)
Assignee: Jiri Stransky
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-09-02 18:51 UTC by David Vallee Delisle
Modified: 2021-12-10 17:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-04 12:08:28 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID         Private  Priority  Status  Summary  Last Updated
OpenStack gerrit        599350     0        None      None    None     2018-09-04 12:18:55 UTC
OpenStack gerrit        601244     0        None      None    None     2018-09-10 22:35:59 UTC
Red Hat Issue Tracker   OSP-11602  0        None      None    None     2021-12-10 17:24:47 UTC
Red Hat Issue Tracker   UPG-4866   0        None      None    None     2021-12-10 17:24:57 UTC

Description David Vallee Delisle 2018-09-02 18:51:35 UTC
Description of problem:
- This is an upgrade from Ocata to Queens.
- DB migration fails to go beyond 868cb606a74a because ports.version is NULL on all ports. physical_network is also NULL, but it should be set to the "baremetal" network.
- They might have hit this bug [1] during the update process.
- [2] shows all the ironic tables with a summary of their version columns; conductor_hardware_interfaces and ports have NULL in their version columns.
- From the debug logs [3], the conductor_hardware_interfaces table is not subject to this validation.
- A quick solution would probably be to set the port version to 1.7 and physical_network to "baremetal", but we chose to validate with engineering first to make sure nothing breaks.
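
For reference, the quick solution mentioned in the last point would amount to a backfill like the sketch below. It runs against a throwaway in-memory SQLite table, not the real ironic MySQL schema; the table layout and the helper name backfill_ports are assumptions for illustration only. It has not been validated by engineering and is not a recommendation to edit the production database by hand.

```python
import sqlite3

# Throwaway stand-in for the ironic "ports" table (assumed, simplified layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ports ("
    "id INTEGER PRIMARY KEY, version TEXT, physical_network TEXT)"
)
# Reproduce the reported state: 8 ports, version and physical_network both NULL.
conn.executemany(
    "INSERT INTO ports (version, physical_network) VALUES (?, ?)",
    [(None, None)] * 8,
)


def backfill_ports(conn, version="1.7", physical_network="baremetal"):
    """Hypothetical helper: set version and physical_network on unversioned ports."""
    cur = conn.execute(
        "UPDATE ports SET version = ?, physical_network = ? "
        "WHERE version IS NULL",
        (version, physical_network),
    )
    conn.commit()
    return cur.rowcount


updated = backfill_ports(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM ports WHERE version IS NULL"
).fetchone()[0]
print(updated, remaining)
```

The values 1.7 and "baremetal" are taken from the description above; whether they are the right values for a given deployment is exactly what needs confirming.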


[1] https://bugs.launchpad.net/ironic/+bug/1715190
[2]
~~~
$ for t in $(mysql -h 127.0.0.1 -N -s -D ironic -e "show tables;" | grep -v alembic); do echo $t;mysql -h 127.0.0.1 -D ironic -e "select version,count(*) from $t group by version;";done
chassis
conductor_hardware_interfaces
+---------+----------+
| version | count(*) |
+---------+----------+
| NULL    |       90 |
+---------+----------+
conductors
+---------+----------+
| version | count(*) |
+---------+----------+
| 1.2     |        3 |
+---------+----------+
node_tags
nodes
+---------+----------+
| version | count(*) |
+---------+----------+
| 1.21    |        8 |
+---------+----------+
portgroups
ports
+---------+----------+
| version | count(*) |
+---------+----------+
| NULL    |        8 |
+---------+----------+
volume_connectors
volume_targets
~~~

[3]
~~~

[root@overcloud-controller-1 ironic]#  ironic-dbsync -d upgrade 2>&1 | less
Option "rpc_backend" from group "DEFAULT" is deprecated for removal (Replaced by [DEFAULT]/transport_url).  Its value may be silently ignored in the future.
2018-09-02 17:59:20.469 417673 DEBUG oslo_db.api [-] Loading backend 'sqlalchemy' from 'ironic.db.sqlalchemy.api' _load_backend /usr/lib/python2.7/site-packages/oslo_db/api.py:234
2018-09-02 17:59:20.744 417673 DEBUG oslo_db.sqlalchemy.engines [-] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:290
2018-09-02 17:59:20.748 417673 INFO alembic.runtime.migration [-] Context impl MySQLImpl.
2018-09-02 17:59:20.748 417673 INFO alembic.runtime.migration [-] Will assume non-transactional DDL.
2018-09-02 17:59:20.753 417673 INFO ironic.db.sqlalchemy.api [-] DVD checking Chassis Model Version Chassis.version in {'Node': set(['1.23', '1.22', '1.21']), 'Conductor': set(['1.2']), 'Trait': set(['1.0']), 'VolumeTarget': set(['1.0']), 'Port': set(['1.6', '1.7']), 'VolumeConnector': set(['1.0']), 'Chassis': set(['1.3']), 'Portgroup': set(['1.3']), 'TraitList': set(['1.0'])}
2018-09-02 17:59:20.759 417673 INFO ironic.db.sqlalchemy.api [-] DVD query out: SELECT chassis.version AS chassis_version
FROM chassis
WHERE chassis.version IS NULL OR chassis.version NOT IN (%(version_1)s) COUNT 0
2018-09-02 17:59:20.763 417673 INFO ironic.db.sqlalchemy.api [-] DVD checking Conductor Model Version Conductor.version in {'Node': set(['1.23', '1.22', '1.21']), 'Conductor': set(['1.2']), 'Trait': set(['1.0']), 'VolumeTarget': set(['1.0']), 'Port': set(['1.6', '1.7']), 'VolumeConnector': set(['1.0']), 'Chassis': set(['1.3']), 'Portgroup': set(['1.3']), 'TraitList': set(['1.0'])}
2018-09-02 17:59:20.767 417673 INFO ironic.db.sqlalchemy.api [-] DVD checking ConductorHardwareInterfaces Model Version ConductorHardwareInterfaces.version in {'Node': set(['1.23', '1.22', '1.21']), 'Conductor': set(['1.2']), 'Trait': set(['1.0']), 'VolumeTarget': set(['1.0']), 'Port': set(['1.6', '1.7']), 'VolumeConnector': set(['1.0']), 'Chassis': set(['1.3']), 'Portgroup': set(['1.3']), 'TraitList': set(['1.0'])}
2018-09-02 17:59:20.768 417673 INFO ironic.db.sqlalchemy.api [-] DVD checking Node Model Version Node.version in {'Node': set(['1.23', '1.22', '1.21']), 'Conductor': set(['1.2']), 'Trait': set(['1.0']), 'VolumeTarget': set(['1.0']), 'Port': set(['1.6', '1.7']), 'VolumeConnector': set(['1.0']), 'Chassis': set(['1.3']), 'Portgroup': set(['1.3']), 'TraitList': set(['1.0'])}
2018-09-02 17:59:20.773 417673 INFO ironic.db.sqlalchemy.api [-] DVD query out: SELECT nodes.version AS nodes_version
FROM nodes
WHERE nodes.version IS NULL OR nodes.version NOT IN (%(version_1)s, %(version_2)s, %(version_3)s) COUNT 0
2018-09-02 17:59:20.777 417673 INFO ironic.db.sqlalchemy.api [-] DVD checking Port Model Version Port.version in {'Node': set(['1.23', '1.22', '1.21']), 'Conductor': set(['1.2']), 'Trait': set(['1.0']), 'VolumeTarget': set(['1.0']), 'Port': set(['1.6', '1.7']), 'VolumeConnector': set(['1.0']), 'Chassis': set(['1.3']), 'Portgroup': set(['1.3']), 'TraitList': set(['1.0'])}
2018-09-02 17:59:20.783 417673 INFO ironic.db.sqlalchemy.api [-] DVD query out: SELECT ports.version AS ports_version
FROM ports
WHERE ports.version IS NULL OR ports.version NOT IN (%(version_1)s, %(version_2)s) COUNT 8
The database is not compatible with this release of ironic (10.1.2). Please run "ironic-dbsync online_data_migrations" using the previous release.
~~~
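
The check visible in the log above can be approximated as follows. This is a minimal sketch, not ironic's actual code: it uses an in-memory SQLite database instead of MySQL, the helper names are made up, and only the version sets for Conductor, Node and Port are copied from the log. It shows why any row with a NULL version makes the release refuse the database.

```python
import sqlite3

# Supported object versions per table, copied from the debug log above.
SUPPORTED = {
    "conductors": ["1.2"],
    "nodes": ["1.21", "1.22", "1.23"],
    "ports": ["1.6", "1.7"],
}


def count_incompatible(conn, table, versions):
    """Count rows whose version is NULL or outside the supported set."""
    placeholders = ", ".join("?" for _ in versions)
    sql = (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE version IS NULL OR version NOT IN ({placeholders})"
    )
    return conn.execute(sql, versions).fetchone()[0]


def build_demo_db():
    """Build a toy database mirroring the row counts shown in [2]."""
    conn = sqlite3.connect(":memory:")
    for table in SUPPORTED:
        conn.execute(
            f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, version TEXT)"
        )
    conn.executemany("INSERT INTO conductors (version) VALUES (?)", [("1.2",)] * 3)
    conn.executemany("INSERT INTO nodes (version) VALUES (?)", [("1.21",)] * 8)
    conn.executemany("INSERT INTO ports (version) VALUES (?)", [(None,)] * 8)
    return conn


conn = build_demo_db()
failures = {t: count_incompatible(conn, t, v) for t, v in SUPPORTED.items()}
print(failures)  # {'conductors': 0, 'nodes': 0, 'ports': 8}
```

A non-zero count for any table corresponds to the "COUNT 8" line for ports in the log, after which ironic-dbsync stops with the compatibility error.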

Version-Release number of selected component (if applicable):
openstack-ironic-api-10.1.2-4.el7ost.noarch                 Mon Aug 27 09:57:30 2018
openstack-ironic-common-10.1.2-4.el7ost.noarch              Mon Aug 27 09:56:48 2018
openstack-ironic-conductor-10.1.2-4.el7ost.noarch           Mon Aug 27 09:57:30 2018
puppet-ironic-12.4.0-0.20180329034302.8285d85.el7ost.noarch Mon Aug 27 09:55:32 2018
python-ironic-inspector-client-3.1.1-1.el7ost.noarch        Mon Aug 27 09:56:22 2018
python-ironic-lib-2.12.1-1.el7ost.noarch                    Mon Aug 27 09:56:39 2018
python2-ironicclient-2.2.0-1.el7ost.noarch                  Mon Aug 27 09:56:23 2018


Actual result:
Because of the missing column, the ironic-conductor container fails to start on the overcloud controllers:

~~~
2018-09-01 03:23:40.837 1 ERROR oslo_service.service DBError: (pymysql.err.InternalError) (1054, u"Unknown column 'nodes.rescue_interface' in 'field list'") 
~~~

Comment 2 Angus Thomas 2018-09-02 22:16:27 UTC
This looks like Ironic is complaining because the upgrade to Queens is being attempted without Pike's data migrations having completed. 

Was the original bug encountered during an attempt to upgrade directly from Ocata to Queens? If so, this would be the expected result, since skip-level upgrades aren't supported. 

If the intention was to upgrade through Pike in the supported manner, there's a bigger problem with the upgrade which isn't specific to Ironic, so that should be investigated, rather than the proposed solution of manually updating the DB schema.

Comment 5 Steve Bar Yakov Gindi 2018-09-03 06:20:09 UTC
The upgrade was done the correct way, from 11 to 12 to 13; we didn't skip a release.

Comment 6 Bob Fournier 2018-09-04 12:18:55 UTC
Related bug is https://bugzilla.redhat.com/show_bug.cgi?id=1624899.

Proposed fix is https://review.openstack.org/#/c/599350/.

Including Upgrades DFG.

Comment 8 Jiri Stransky 2019-04-04 12:08:28 UTC
The online data migration issue has been fixed as bug 1624899 and bug 1627680, and there has been no recent activity on this BZ, so I'll close it as a duplicate of bug 1624899. Please comment if the issue is still being observed.

*** This bug has been marked as a duplicate of bug 1624899 ***

