Description of problem:
Installing bits from the latest poodle/puddle and deploying the overcloud fails due to the 'Node xxx is locked by host' error:

>> source /home/stack/stackrc; if [ -f "/home/stack/deploy-overcloudrc" ]; then source /home/stack/deploy-overcloudrc; fi; openstack overcloud deploy --plan-uuid 1803970e-ddf9-41d9-a101-bd67afe667a9 --control-scale $CONTROLSCALE --compute-scale $COMPUTESCALE --ceph-storage-scale $CEPHSTORAGESCALE
19:26:18 failed: [undercloud] => {"changed": true, "cmd": "source /home/stack/stackrc; if [ -f \"/home/stack/deploy-overcloudrc\" ]; then\n source /home/stack/deploy-overcloudrc;\n fi; openstack overcloud deploy --plan-uuid 1803970e-ddf9-41d9-a101-bd67afe667a9 --control-scale $CONTROLSCALE --compute-scale $COMPUTESCALE --ceph-storage-scale $CEPHSTORAGESCALE #Both swift and blockstorage are not supported downstream right now #--swift-storage-scale $SWIFTSTORAGESCALE #--block-storage-scale $BLOCKSTORAGESCALE;", "delta": "0:00:16.466670", "end": "2015-06-18 19:26:18.978551", "rc": 1, "start": "2015-06-18 19:26:02.511881", "warnings": []}
19:26:18 stderr: WARNING: ironicclient.common.http Request returned failure status.
19:26:18 WARNING: ironicclient.common.http Error contacting Ironic server: Node 62e2f991-c2fe-4d5b-9b9e-cb8fecf1fbed is locked by host host15.beaker.tripleo.lab.eng.rdu2.redhat.com, please retry after the current operation is completed. (HTTP 409). Attempt 1 of 6
19:26:18 WARNING: ironicclient.common.http Request returned failure status.
19:26:18 WARNING: ironicclient.common.http Error contacting Ironic server: Node 62e2f991-c2fe-4d5b-9b9e-cb8fecf1fbed is locked by host host15.beaker.tripleo.lab.eng.rdu2.redhat.com, please retry after the current operation is completed. (HTTP 409). Attempt 2 of 6
19:26:18 WARNING: ironicclient.common.http Request returned failure status.
19:26:18 WARNING: ironicclient.common.http Error contacting Ironic server: Node 62e2f991-c2fe-4d5b-9b9e-cb8fecf1fbed is locked by host host15.beaker.tripleo.lab.eng.rdu2.redhat.com, please retry after the current operation is completed. (HTTP 409). Attempt 3 of 6
19:26:18 WARNING: ironicclient.common.http Request returned failure status.
19:26:18 WARNING: ironicclient.common.http Error contacting Ironic server: Node 62e2f991-c2fe-4d5b-9b9e-cb8fecf1fbed is locked by host host15.beaker.tripleo.lab.eng.rdu2.redhat.com, please retry after the current operation is completed. (HTTP 409). Attempt 4 of 6
19:26:18 WARNING: ironicclient.common.http Request returned failure status.
19:26:18 WARNING: ironicclient.common.http Error contacting Ironic server: Node 62e2f991-c2fe-4d5b-9b9e-cb8fecf1fbed is locked by host host15.beaker.tripleo.lab.eng.rdu2.redhat.com, please retry after the current operation is completed. (HTTP 409). Attempt 5 of 6
19:26:18 WARNING: ironicclient.common.http Request returned failure status.
19:26:18 ERROR: ironicclient.common.http Error contacting Ironic server: Node 62e2f991-c2fe-4d5b-9b9e-cb8fecf1fbed is locked by host host15.beaker.tripleo.lab.eng.rdu2.redhat.com, please retry after the current operation is completed. (HTTP 409). Attempt 6 of 6
19:26:18 ERROR: openstack Node 62e2f991-c2fe-4d5b-9b9e-cb8fecf1fbed is locked by host host15.beaker.tripleo.lab.eng.rdu2.redhat.com, please retry after the current operation is completed. (HTTP 409)
19:26:18 stdout: The following templates will be written:
19:26:18 /tmp/tmpMG0VyV/puppet/manifests/overcloud_volume.pp
19:26:18 /tmp/tmpMG0VyV/hieradata/object.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/hieradata/common.yaml
19:26:18 /tmp/tmpMG0VyV/provider-Swift-Storage-1.yaml
19:26:18 /tmp/tmpMG0VyV/network/ports/net_ip_map.yaml
19:26:18 /tmp/tmpMG0VyV/provider-Cinder-Storage-1.yaml
19:26:18 /tmp/tmpMG0VyV/provider-Compute-1.yaml
19:26:18 /tmp/tmpMG0VyV/network/noop.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/bootstrap-config.yaml
19:26:18 /tmp/tmpMG0VyV/net-config-bridge.yaml
19:26:18 /tmp/tmpMG0VyV/provider-Ceph-Storage-1.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/controller-post-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/cinder-storage-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/manifests/overcloud_cephstorage.pp
19:26:18 /tmp/tmpMG0VyV/puppet/hieradata/object.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/controller-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/manifests/overcloud_compute.pp
19:26:18 /tmp/tmpMG0VyV/puppet/cinder-storage-post.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/swift-storage-post.yaml
19:26:18 /tmp/tmpMG0VyV/provider-Controller-1.yaml
19:26:18 /tmp/tmpMG0VyV/network/networks.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/manifests/overcloud_object.pp
19:26:18 /tmp/tmpMG0VyV/hieradata/controller.yaml
19:26:18 /tmp/tmpMG0VyV/network/ports/ctlplane_vip.yaml
19:26:18 /tmp/tmpMG0VyV/hieradata/volume.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/compute-post-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/extraconfig/tasks/yum_update.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/swift-storage-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/extraconfig/tasks/yum_update.sh
19:26:18 /tmp/tmpMG0VyV/puppet/swift-devices-and-proxy-config.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/controller-config-pacemaker.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/compute-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/hieradata/volume.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/ceph-storage-post-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/extraconfig/controller/noop.yaml
19:26:18 /tmp/tmpMG0VyV/network/ports/noop.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/ceph-storage-puppet.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/hieradata/ceph.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/vip-config.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/hieradata/controller.yaml
19:26:18 /tmp/tmpMG0VyV/plan.yaml
19:26:18 /tmp/tmpMG0VyV/environment.yaml
19:26:18 /tmp/tmpMG0VyV/network/ports/net_ip_list_map.yaml
19:26:18 /tmp/tmpMG0VyV/hieradata/compute.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/hieradata/compute.yaml
19:26:18 /tmp/tmpMG0VyV/hieradata/ceph.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/manifests/overcloud_controller_pacemaker.pp
19:26:18 /tmp/tmpMG0VyV/hieradata/common.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/manifests/ringbuilder.pp
19:26:18 /tmp/tmpMG0VyV/extraconfig/post_deploy/default.yaml
19:26:18 /tmp/tmpMG0VyV/net-config-noop.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/ceph-cluster-config.yaml
19:26:18 /tmp/tmpMG0VyV/firstboot/userdata_default.yaml
19:26:18 /tmp/tmpMG0VyV/puppet/all-nodes-config.yaml

This error was logged previously during node discovery: https://bugzilla.redhat.com/show_bug.cgi?id=1212134

Version-Release number of selected component (if applicable):
[root@host15 ~]# rpm -qa | grep openstack
openstack-tripleo-common-0.0.1.dev6-0.git49b57eb.el7ost.noarch
openstack-ceilometer-alarm-2015.1.0-2.el7ost.noarch
openstack-swift-account-2.3.0-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-2.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.0-3.el7ost.noarch
openstack-tripleo-0.0.6-0.1.git812abe0.el7ost.noarch
openstack-tuskar-0.4.18-2.el7ost.noarch
openstack-swift-2.3.0-1.el7ost.noarch
openstack-nova-novncproxy-2015.1.0-10.el7ost.noarch
openstack-swift-plugin-swift3-1.7-3.el7ost.noarch
redhat-access-plugin-openstack-7.0.0-0.el7ost.noarch
openstack-heat-api-2015.1.0-3.el7ost.noarch
openstack-ceilometer-central-2015.1.0-2.el7ost.noarch
openstack-nova-scheduler-2015.1.0-10.el7ost.noarch
openstack-nova-cert-2015.1.0-10.el7ost.noarch
openstack-nova-common-2015.1.0-10.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-1.el7ost.noarch
openstack-ceilometer-notification-2015.1.0-2.el7ost.noarch
openstack-ceilometer-collector-2015.1.0-2.el7ost.noarch
openstack-ironic-common-2015.1.0-4.el7ost.noarch
openstack-nova-compute-2015.1.0-10.el7ost.noarch
openstack-nova-conductor-2015.1.0-10.el7ost.noarch
openstack-neutron-openvswitch-2015.1.0-7.el7ost.noarch
openstack-swift-container-2.3.0-1.el7ost.noarch
openstack-nova-api-2015.1.0-10.el7ost.noarch
openstack-dashboard-theme-2015.1.0-10.el7ost.noarch
openstack-tuskar-ui-extras-0.0.4-1.el7ost.noarch
openstack-nova-console-2015.1.0-10.el7ost.noarch
openstack-neutron-common-2015.1.0-7.el7ost.noarch
openstack-neutron-2015.1.0-7.el7ost.noarch
openstack-heat-engine-2015.1.0-3.el7ost.noarch
openstack-ceilometer-common-2015.1.0-2.el7ost.noarch
openstack-heat-api-cfn-2015.1.0-3.el7ost.noarch
openstack-ironic-conductor-2015.1.0-4.el7ost.noarch
openstack-ceilometer-api-2015.1.0-2.el7ost.noarch
openstack-ironic-api-2015.1.0-4.el7ost.noarch
openstack-swift-proxy-2.3.0-1.el7ost.noarch
openstack-puppet-modules-2015.1.5-1.el7ost.noarch
openstack-dashboard-2015.1.0-10.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-selinux-0.6.32-1.el7ost.noarch
openstack-tempest-kilo-20150507.2.el7ost.noarch
openstack-neutron-ml2-2015.1.0-7.el7ost.noarch
openstack-keystone-2015.1.0-1.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-9.el7ost.noarch
openstack-glance-2015.1.0-6.el7ost.noarch
python-openstackclient-1.0.3-2.el7ost.noarch
openstack-ironic-discoverd-1.1.0-3.el7ost.noarch
openstack-swift-object-2.3.0-1.el7ost.noarch
python-django-openstack-auth-1.2.0-2.el7ost.noarch
openstack-tuskar-ui-0.3.0-2.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-heat-common-2015.1.0-3.el7ost.noarch

How reproducible:
Often but not at every deploy. Redeploying the overcloud gets by the error (but CI fails out)

Steps to Reproduce:
1. Install bits from latest poodle/puddle
2. openstack overcloud deploy --plan-uuid $ID --control-scale $CONTROLSCALE --compute-scale $COMPUTESCALE --ceph-storage-scale $CEPHSTORAGESCALE
3. See warnings and errors

Actual results:
Deploying overcloud fails

Expected results:
Overcloud deployed

Additional info:
Oh... please provide ironic conductor and API logs around failure time (sudo journalctl -u openstack-ironic-api -u openstack-ironic-conductor)
I've started an rdo-list thread to discuss the issue: https://www.redhat.com/archives/rdo-list/2015-June/msg00149.html
Will copy logs and journalctl output when we hit the error again - it's sporadic.
https://review.gerrithub.io/#/c/237471/ is an instack-undercloud patch to bump retry interval for Ironic globally. I'm still interested in logs, however.
This occurred in CI again on Dell BM. Will pull the logs from the job and post them here.
Created attachment 1043486 [details] undercloud logs
Created attachment 1043487 [details] host0 logs
Upstream patch to bump retry interval: https://review.openstack.org/#/c/196020/ I intend to backport it asap.
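For readers unfamiliar with what bumping the retry interval changes, here is a minimal sketch of the general pattern: retry a call that fails with a conflict (such as the HTTP 409 "Node ... is locked by host" above), waiting a fixed interval between attempts so the lock has time to be released. This is not the ironicclient code or the content of the patch; `ConflictError`, `call_with_retries`, `max_attempts`, and `retry_interval` are hypothetical names chosen for illustration.

```python
import time


class ConflictError(Exception):
    """Hypothetical stand-in for an HTTP 409 'node is locked' response."""


def call_with_retries(operation, max_attempts=6, retry_interval=2.0,
                      sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    After each ConflictError except the last, wait retry_interval seconds
    before trying again. The final ConflictError is re-raised to the caller.
    The sleep function is injectable so the behaviour can be checked
    without actually waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConflictError:
            if attempt == max_attempts:
                # Out of attempts: surface the conflict to the caller.
                raise
            # Give the holder of the lock time to finish its operation.
            sleep(retry_interval)
```

With a longer `retry_interval`, the same number of attempts covers a longer window, which is why a larger interval can get past a lock that is only held briefly.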
I also suggest backporting https://review.openstack.org/#/c/194619/, for GA or later, to make debugging such problems simpler.
I couldn't reproduce this on either a virtual or a baremetal environment.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2015:1548