Description of problem:
-----------------------
Attempt to upgrade UC after FFWD fails:

"INFO:nova_statedir:Nova statedir ownership complete",
"stdout: efca5abba7c5a82d0111758b17b0ae5d3e405958d622dd3aa1121cb3bdee1766",
"stdout: Cell0 is already setup",
"stdout: da74a3024e136851cd056015e864d74e247df07845996d27986a96fe6a265536",
"Error running ['docker', 'run', '--name', 'ironic_db_sync', '--label', 'config_id=tripleo_step3', '--label', 'container_name=ironic_db_sync', '--label', 'managed_by=paunch', '--label', 'config_data={\"start_order\": 1, \"image\": \"192.168.24.1:8787/rhosp14/openstack-ironic-api:2018-10-10.3\", \"command\": \"/usr/bin/bootstrap_host_exec ironic_api su ironic -s /bin/bash -c \\'ironic-dbsync --config-file /etc/ironic/ironic.conf\\'\", \"user\": \"root\", \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/config-data/ironic_api/etc/ironic:/etc/ironic:ro\", \"/var/log/containers/ironic:/var/log/ironic\", \"/var/log/containers/httpd/ironic-api:/var/log/httpd\"], \"net\": \"host\", \"detach\": false, \"privileged\": false}', '--net=host', '--privileged=false', '--user=root', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/config-data/ironic_api/etc/ironic:/etc/ironic:ro', '--volume=/var/log/containers/ironic:/var/log/ironic', '--volume=/var/log/containers/httpd/ironic-api:/var/log/httpd', '192.168.24.1:8787/rhosp14/openstack-ironic-api:2018-10-10.3', '/usr/bin/bootstrap_host_exec', 'ironic_api', 'su', 'ironic', '-s', '/bin/bash', '-c', \"'ironic-dbsync\", '--config-file', \"/etc/ironic/ironic.conf'\"]. [2]",
"stderr: The database is not compatible with this release of ironic (11.1.1.dev8). Please run \"ironic-dbsync online_data_migrations\" using the previous release.",
"stdout: (cellv2) Updating default cell_v2 cell 30c11559-302d-4feb-bb44-338fab6e915e",
"INFO [alembic.runtime.migration] Running upgrade 18440d0834af -> 2970d2d44edc, Add manage_boot to nodes",

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-tripleo-heat-templates-9.0.0-0.20181001174822.90afd18.0rc2.el7ost.noarch
python2-ironicclient-2.5.0-0.20180810135843.fb94fb8.el7ost.noarch
python2-ironic-neutron-agent-1.2.1-0.20180831194318.af95c72.el7ost.noarch
puppet-ironic-13.3.1-0.20180911185738.317a1e5.el7ost.noarch
openstack-ironic-staging-drivers-0.10.1-0.20180820161038.39c4e93.el7ost.noarch
openstack-ironic-api-11.1.1-0.20181001152939.4167083.el7ost.noarch
openstack-ironic-common-11.1.1-0.20181001152939.4167083.el7ost.noarch
openstack-ironic-conductor-11.1.1-0.20181001152939.4167083.el7ost.noarch
openstack-ironic-inspector-8.0.1-0.20180924215820.e89450c.el7ost.noarch
python2-ironic-inspector-client-3.3.0-0.20180810080932.53bf4e8.el7ost.noarch
python-ironic-lib-2.14.0-0.20180810074837.344161b.el7ost.noarch
python-tripleoclient-10.6.1-0.20180929200237.1d8dcb6.el7ost.noarch
python-tripleoclient-heat-installer-10.6.1-0.20180929200237.1d8dcb6.el7ost.noarch

Steps to Reproduce:
-------------------
1. Perform FFWD of RHOS-10 env
2. Set up repos and prepare container-image-prepare file for UC upgrade
3. Follow upgrade doc for undercloud upgrade

Actual results:
---------------
Undercloud upgrade failed

Expected results:
-----------------
Undercloud upgrade succeeded

Additional info:
----------------
Virtual environment 3controllers + 2computes + 3ceph
This is likely similar to bug 1624899, but it's on the *undercloud*, so the fix needs to go into a different repo. The 13->14 upgrade that failed uses the new t-h-t approach to deploying/upgrading the UC, but it looks like the migration(s) were missed in earlier UC upgrades (12->13, I guess?). So the breakage needs to be fixed in instack-undercloud.
Looking at instack-undercloud for Queens: https://github.com/openstack/instack-undercloud/blob/stable/queens/elements/puppet-stack-config/puppet-stack-config.pp Maybe it's just missing an include for this class? https://github.com/openstack/puppet-ironic/blob/stable/queens/manifests/db/online_data_migrations.pp
I don't think that's the cause. The class is included indirectly via the root manifest: https://github.com/openstack/puppet-ironic/blob/stable/queens/manifests/init.pp#L416 if this value is set to true: https://github.com/openstack/instack-undercloud/blob/stable/queens/elements/puppet-stack-config/puppet-stack-config.yaml.template#L495.
In the sosreport I see signs of online_data_migrations running on Queens:

2018-10-16 11:48:22.645 8618 INFO ironic.db.sqlalchemy.api [req-2dbfe38b-353b-4ece-949a-b84578c9074d - - - - -] Migrating nodes with driver pxe_ipmitool to {'management_interface': 'ipmitool', 'inspect_interface': 'inspector', 'raid_interface': 'no-raid', 'power_interface': 'ipmitool', 'driver': 'ipmi', 'deploy_interface': 'iscsi', 'boot_interface': 'pxe', 'console_interface': 'no-console', 'rescue_interface': 'no-rescue', 'vendor_interface': 'ipmitool'}

This migration only existed in that release.
However, judging by the nodes table, the online data migrations never ran.

MariaDB [ironic]> select version from nodes;
+---------+
| version |
+---------+
| 1.21    |
| 1.21    |
| 1.21    |
| 1.21    |
| 1.21    |
| 1.21    |
| 1.21    |
| 1.21    |
+---------+

1.21 corresponds to Pike.
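The check on the nodes table can be sketched as a small tally of object versions per row. This is only an illustration: the function name and the EXPECTED_VERSION placeholder are hypothetical, and the only data used is the eight 1.21 rows from the query output in this comment. The expected version for the target release is deliberately left as a placeholder because this report does not state it.

```python
from collections import Counter


def summarize_node_versions(versions, expected):
    """Count node rows per object version and list versions that differ from `expected`."""
    counts = Counter(versions)
    stale = sorted(v for v in counts if v != expected)
    return counts, stale


# Values taken from the query output in this comment: eight rows, all at 1.21 (Pike).
rows = ["1.21"] * 8

# Placeholder, NOT a value from this report: substitute the Node object
# version of the release the undercloud was upgraded to.
EXPECTED_VERSION = "X.YY"

counts, stale = summarize_node_versions(rows, EXPECTED_VERSION)
for version, n in sorted(counts.items()):
    print(f"{version}: {n} node(s)")
if stale:
    print("rows not at the expected version:", ", ".join(stale))
```

Any version listed as stale after the previous release's online data migrations have supposedly run points at the same symptom as above.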
Still, the logs insist everything was done by puppet:

$ grep -rn ironic-db undercloud_upgrade_1*.log
undercloud_upgrade_11.log:2647:2018-10-16 11:16:39,245 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]/returns: executed successfully
undercloud_upgrade_11.log:2648:2018-10-16 11:16:40,417 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]: Triggered 'refresh' from 1 events
undercloud_upgrade_12.log:3276:2018-10-16 11:31:37,222 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]/returns: executed successfully
undercloud_upgrade_12.log:3277:2018-10-16 11:31:38,461 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]: Triggered 'refresh' from 1 events
undercloud_upgrade_12.log:3280:2018-10-16 11:31:39,726 INFO: Notice: /Stage[main]/Ironic::Db::Online_data_migrations/Exec[ironic-db-online-data-migrations]/returns: executed successfully
undercloud_upgrade_12.log:3281:2018-10-16 11:31:40,937 INFO: Notice: /Stage[main]/Ironic::Db::Online_data_migrations/Exec[ironic-db-online-data-migrations]: Triggered 'refresh' from 3 events
undercloud_upgrade_13.log:4274:2018-10-16 11:48:18,246 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]/returns: executed successfully
undercloud_upgrade_13.log:4275:2018-10-16 11:48:19,665 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]: Triggered 'refresh' from 2 events
undercloud_upgrade_13.log:4278:2018-10-16 11:48:23,255 INFO: Notice: /Stage[main]/Ironic::Db::Online_data_migrations/Exec[ironic-db-online-data-migrations]/returns: executed successfully
undercloud_upgrade_13.log:4279:2018-10-16 11:48:26,553 INFO: Notice: /Stage[main]/Ironic::Db::Online_data_migrations/Exec[ironic-db-online-data-migrations]: Triggered 'refresh' from 4 events
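To make the grep output easier to compare across upgrade steps, here is a rough sketch that groups the "executed successfully" notices by log file. The regular expression and the function name are my own and assume the exact line layout shown in this comment; the sample lines are copied from it (only the 11 and 12 logs, for brevity).

```python
import re
from collections import defaultdict

# Matches puppet Notice lines as they appear in the grep -rn output:
#   <file>:<lineno>:<timestamp> INFO: Notice: .../Exec[<name>]/returns: executed successfully
EXEC_OK = re.compile(
    r"^(?P<log>[^:]+):\d+:.*Exec\[(?P<name>[^\]]+)\]/returns: executed successfully"
)


def execs_per_log(lines):
    """Map each log file to the set of Exec resources reported as executed successfully."""
    result = defaultdict(set)
    for line in lines:
        m = EXEC_OK.match(line)
        if m:
            result[m.group("log")].add(m.group("name"))
    return dict(result)


sample = [
    "undercloud_upgrade_11.log:2647:2018-10-16 11:16:39,245 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]/returns: executed successfully",
    "undercloud_upgrade_11.log:2648:2018-10-16 11:16:40,417 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]: Triggered 'refresh' from 1 events",
    "undercloud_upgrade_12.log:3276:2018-10-16 11:31:37,222 INFO: Notice: /Stage[main]/Ironic::Db::Sync/Exec[ironic-dbsync]/returns: executed successfully",
    "undercloud_upgrade_12.log:3280:2018-10-16 11:31:39,726 INFO: Notice: /Stage[main]/Ironic::Db::Online_data_migrations/Exec[ironic-db-online-data-migrations]/returns: executed successfully",
]

summary = execs_per_log(sample)
for log, names in sorted(summary.items()):
    ran = "ironic-db-online-data-migrations" in names
    print(f"{log}: online data migrations reported = {ran}")
```

On the lines above, only undercloud_upgrade_11.log has no online data migrations entry. Note that this only reflects what puppet reported, which, as the nodes table shows, does not prove the data was actually migrated.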
Yurii, could you run the same procedure, but please stop before trying an upgrade to 14? Essentially, just do FFU and leave the environment for me to investigate?
Discussed upstream. Apparently we have an incomplete upgrade procedure in ironic :( Affects all versions from Queens to master. Affects all upgrades, not only FFU.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045