Description of problem:
Before the overcloud is deployed on the HP environment, the CI jobs run 'openstack baremetal introspection bulk start'. The logs show traces of the power off command being executed; however, before deploy, ironic node-list shows:

$ ironic node-list
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provision State | Maintenance |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| 40da75b9-be0d-41aa-a5f3-4218c002c78c | None | None          | power on    | available       | False       |
| f6629bf7-e55f-4837-ab18-85f0667a097a | None | None          | power off   | available       | False       |
| 30a16686-6786-4f93-b04c-843b1a36f121 | None | None          | power on    | available       | False       |
| 9ec7a021-943a-4d56-bd7c-f277af4710bd | None | None          | power on    | available       | False       |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+

The problem is that nodes left powered on from a previous deploy could result in overlapping IP addresses.

Version-Release number of selected component (if applicable):

$ rpm -qa | grep openstack
openstack-heat-api-2015.1.0-4.el7ost.noarch
openstack-ceilometer-central-2015.1.0-10.el7ost.noarch
openstack-tuskar-0.4.18-3.el7ost.noarch
openstack-swift-2.3.0-1.el7ost.noarch
openstack-nova-novncproxy-2015.1.0-16.el7ost.noarch
openstack-swift-object-2.3.0-1.el7ost.noarch
redhat-access-plugin-openstack-7.0.0-0.el7ost.noarch
openstack-ceilometer-collector-2015.1.0-10.el7ost.noarch
openstack-tripleo-common-0.0.1.dev6-1.git49b57eb.el7ost.noarch
openstack-neutron-openvswitch-2015.1.0-12.el7ost.noarch
openstack-nova-api-2015.1.0-16.el7ost.noarch
python-django-openstack-auth-1.2.0-3.el7ost.noarch
openstack-nova-common-2015.1.0-16.el7ost.noarch
openstack-tripleo-0.0.7-0.1.1664e566.el7ost.noarch
python-openstackclient-1.0.3-2.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-4.el7ost.noarch
openstack-neutron-common-2015.1.0-12.el7ost.noarch
openstack-neutron-2015.1.0-12.el7ost.noarch
openstack-heat-engine-2015.1.0-4.el7ost.noarch
openstack-ceilometer-common-2015.1.0-10.el7ost.noarch
openstack-ironic-common-2015.1.0-9.el7ost.noarch
openstack-nova-compute-2015.1.0-16.el7ost.noarch
openstack-nova-conductor-2015.1.0-16.el7ost.noarch
openstack-swift-account-2.3.0-1.el7ost.noarch
openstack-swift-proxy-2.3.0-1.el7ost.noarch
openstack-dashboard-theme-2015.1.0-10.el7ost.noarch
openstack-tuskar-ui-extras-0.0.4-1.el7ost.noarch
openstack-nova-console-2015.1.0-16.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-6.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-45.el7ost.noarch
openstack-heat-common-2015.1.0-4.el7ost.noarch
openstack-heat-api-cfn-2015.1.0-4.el7ost.noarch
openstack-ironic-conductor-2015.1.0-9.el7ost.noarch
openstack-ceilometer-api-2015.1.0-10.el7ost.noarch
openstack-ceilometer-alarm-2015.1.0-10.el7ost.noarch
openstack-ironic-api-2015.1.0-9.el7ost.noarch
openstack-keystone-2015.1.0-4.el7ost.noarch
openstack-swift-plugin-swift3-1.7-3.el7ost.noarch
openstack-puppet-modules-2015.1.8-8.el7ost.noarch
openstack-dashboard-2015.1.0-10.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-tempest-kilo-20150708.2.el7ost.noarch
openstack-neutron-ml2-2015.1.0-12.el7ost.noarch
openstack-nova-scheduler-2015.1.0-16.el7ost.noarch
openstack-nova-cert-2015.1.0-16.el7ost.noarch
openstack-glance-2015.1.0-6.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.0-4.el7ost.noarch
openstack-ceilometer-notification-2015.1.0-10.el7ost.noarch
openstack-ironic-discoverd-1.1.0-5.el7ost.noarch
openstack-selinux-0.6.37-1.el7ost.noarch
openstack-swift-container-2.3.0-1.el7ost.noarch
openstack-tuskar-ui-0.3.0-13.el7ost.noarch

$ rpm -qa | grep osc
python-rdomanager-oscplugin-0.0.8-43.el7ost.noarch

How reproducible:
Mostly

Steps to Reproduce:
1. Install undercloud on HP hardware
2. Execute openstack baremetal introspection bulk start
3. Check node status

Actual results:
Some nodes are powered on before deploy

Expected results:
All nodes should be off

Additional info:
This may be an issue of the environment and not the product
Created attachment 1055864
sudo journalctl -u openstack-ironic-conductor -l --no-pager | grep 40da75b9-be0d-41aa-a5f3-4218c002c78c
The full ironic-conductor log for the first node is attached.

Some additional context: this power state change is not coming from Ironic. We see the node get powered off after the discovery ramdisk completes:

Jul 24 12:29:10 virtblade11.virt.lab.eng.bos.redhat.com ironic-conductor[12221]: 2015-07-24 12:29:10.394 12221 DEBUG ironic.conductor.manager [-] RPC change_node_power_state called for node 40da75b9-be0d-41aa-a5f3-4218c002c78c. The desired new state is power off. change_node_power_state /usr/lib/python2.7/site-packages/ironic/conductor/manager.py:431
Jul 24 12:29:20 virtblade11.virt.lab.eng.bos.redhat.com ironic-conductor[12221]: 2015-07-24 12:29:20.963 12221 INFO ironic.conductor.utils [-] Successfully set node 40da75b9-be0d-41aa-a5f3-4218c002c78c power state to power off.

Then, about a minute later, Ironic finds the node powered on:

Jul 24 12:30:16 virtblade11.virt.lab.eng.bos.redhat.com ironic-conductor[12221]: 2015-07-24 12:30:16.414 12221 WARNING ironic.conductor.manager [-] During sync_power_state, node 40da75b9-be0d-41aa-a5f3-4218c002c78c state does not match expected state 'power off'. Updating recorded state to 'power on'.

Note that there are no logs in between showing an RPC call to change the power state. My suspicion is that something outside of the deployment is powering on the node, but I am not sure how to confirm that.
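
One way to check the rest of the conductor log for the same pattern is to look for sync_power_state warnings whose new state was never requested over RPC. The following is only a rough sketch: the function name and regular expressions are hypothetical helpers written against the three log lines quoted above (with the node UUID written without any stray space), not part of Ironic, and they assume the message wording stays as shown.

```python
import re

# Patterns derived from the log lines quoted in this comment (assumed format).
RPC_RE = re.compile(
    r"RPC change_node_power_state called for node (?P<uuid>[0-9a-f-]{36})\. "
    r"The desired new state is (?P<state>power (?:on|off))\."
)
SYNC_RE = re.compile(
    r"During sync_power_state, node (?P<uuid>[0-9a-f-]{36}) state does not match "
    r"expected state '(?P<expected>[^']+)'\. "
    r"Updating recorded state to '(?P<actual>[^']+)'\."
)


def unexplained_power_changes(lines):
    """Return (uuid, expected, actual) for each sync_power_state mismatch
    where the last RPC request for that node did not ask for `actual`."""
    last_requested = {}  # node UUID -> most recent state requested via RPC
    findings = []
    for line in lines:
        rpc = RPC_RE.search(line)
        if rpc:
            last_requested[rpc["uuid"]] = rpc["state"]
            continue
        sync = SYNC_RE.search(line)
        if sync and last_requested.get(sync["uuid"]) != sync["actual"]:
            findings.append((sync["uuid"], sync["expected"], sync["actual"]))
    return findings


# Shortened copies of the log lines from this comment.
SAMPLE_LOG = [
    "DEBUG ironic.conductor.manager [-] RPC change_node_power_state called for "
    "node 40da75b9-be0d-41aa-a5f3-4218c002c78c. The desired new state is power off. "
    "change_node_power_state /usr/lib/python2.7/site-packages/ironic/conductor/manager.py:431",
    "INFO ironic.conductor.utils [-] Successfully set node "
    "40da75b9-be0d-41aa-a5f3-4218c002c78c power state to power off.",
    "WARNING ironic.conductor.manager [-] During sync_power_state, node "
    "40da75b9-be0d-41aa-a5f3-4218c002c78c state does not match expected state "
    "'power off'. Updating recorded state to 'power on'.",
]

if __name__ == "__main__":
    for uuid, expected, actual in unexplained_power_changes(SAMPLE_LOG):
        print(f"{uuid}: expected {expected!r}, found {actual!r} with no matching RPC")
```

A hit from this scan only shows that Ironic did not request the change; it does not identify what did power the node on.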
Is this still reproducing?
Hard to tell. In CI (to keep testing operational) we are working around this by turning the nodes off via Ironic before deploy.
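
For reference, the workaround can be sketched roughly as follows. This is not the actual CI code: the helper names are hypothetical, the table is the `ironic node-list` output from the description, and the script only builds `ironic node-set-power-state <uuid> off` command lines rather than running them. It assumes the table layout shown in the description.

```python
def parse_node_list(table_text):
    """Parse the ASCII table printed by `ironic node-list` into dicts."""
    header = None
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the prompt line and the +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first table row holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def power_off_commands(table_text):
    """Build one power-off command per node not already reported as off.

    Nodes in any state other than 'power off' are included.
    """
    return [
        ["ironic", "node-set-power-state", node["UUID"], "off"]
        for node in parse_node_list(table_text)
        if node["Power State"] != "power off"
    ]


SAMPLE_TABLE = """\
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provision State | Maintenance |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
| 40da75b9-be0d-41aa-a5f3-4218c002c78c | None | None          | power on    | available       | False       |
| f6629bf7-e55f-4837-ab18-85f0667a097a | None | None          | power off   | available       | False       |
| 30a16686-6786-4f93-b04c-843b1a36f121 | None | None          | power on    | available       | False       |
| 9ec7a021-943a-4d56-bd7c-f277af4710bd | None | None          | power on    | available       | False       |
+--------------------------------------+------+---------------+-------------+-----------------+-------------+
"""

if __name__ == "__main__":
    for command in power_off_commands(SAMPLE_TABLE):
        print(" ".join(command))
```

In a real job the command lists could be passed to subprocess.check_call after capturing the live `ironic node-list` output; that step is left out here so the sketch stays runnable without an undercloud.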