rhel-osp-director: 8.0 deployment fails with "Error: Could not find class ::tripleo::firewall for overcloud-controller-2.localdomain"

Environment:
openstack-tripleo-heat-templates-0.8.8-2.el7ost.noarch
instack-undercloud-2.2.3-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.2-3.el7ost.noarch
openstack-puppet-modules-7.0.9-1.el7ost.noarch

Steps to reproduce:
Attempt to deploy an overcloud on BM with:

export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud deploy --templates $THT \
  -e $THT/environments/storage-environment.yaml \
  -e $THT/environments/network-isolation.yaml \
  -e /home/stack/network-environment.yaml \
  --control-scale 3 \
  --ceph-storage-scale 3 \
  --compute-scale 2 \
  --neutron-disable-tunneling \
  --neutron-network-type vlan \
  --neutron-network-vlan-ranges tenantvlan:18:43 \
  --neutron-bridge-mappings datacentre:br-ex,tenantvlan:br-nic4 \
  --ntp-server clock.redhat.com \
  --timeout 180

Result:
The deployment fails.

[stack@undercloud ~]$ heat resource-list -n5 overcloud | grep -v COMPLE
| resource_name                           | physical_resource_id                 | resource_type                         | resource_status | updated_time        | stack_name |
| CephStorageAllNodesValidationDeployment | 6e1fa4d7-e975-4b44-b8df-02bb95998c64 | OS::Heat::StructuredDeployments       | CREATE_FAILED   | 2016-02-27T04:52:25 | overcloud |
| ComputeNodesPostDeployment              | 67627839-e962-41aa-be91-6655a8d01158 | OS::TripleO::ComputePostDeployment    | CREATE_FAILED   | 2016-02-27T04:52:26 | overcloud |
| ControllerNodesPostDeployment           | 0fea6772-7a65-4c3b-8fd3-99c1205ca5c4 | OS::TripleO::ControllerPostDeployment | CREATE_FAILED   | 2016-02-27T04:52:26 | overcloud |
| 0                                       | 33190079-6aa2-4baa-a21b-ab652de4d2f9 | OS::Heat::StructuredDeployment        | CREATE_FAILED   | 2016-02-27T05:37:45 | overcloud-CephStorageAllNodesValidationDeployment-g76mbxc4vxw4 |
| ComputePuppetDeployment                 | df1f22f5-3986-4781-95a7-aa8e0a775516 | OS::Heat::StructuredDeployments       | CREATE_FAILED   | 2016-02-27T05:37:57 | overcloud-ComputeNodesPostDeployment-4as4cpmwxywq |
| 1                                       | 77b41b6e-2dbc-4aa2-b73b-97f24d0957bd | OS::Heat::StructuredDeployment        | CREATE_FAILED   | 2016-02-27T05:38:01 | overcloud-ComputeNodesPostDeployment-4as4cpmwxywq-ComputePuppetDeployment-naa47n3rvsma |
| 0                                       | fbb569e9-fe38-4628-bfca-f0e9aef7b824 | OS::Heat::StructuredDeployment        | CREATE_FAILED   | 2016-02-27T05:38:02 | overcloud-ComputeNodesPostDeployment-4as4cpmwxywq-ComputePuppetDeployment-naa47n3rvsma |
| ControllerLoadBalancerDeployment_Step1  | 2d5010ff-ce1e-40f1-93a0-f04b01afb758 | OS::Heat::StructuredDeployments       | CREATE_FAILED   | 2016-02-27T05:38:11 | overcloud-ControllerNodesPostDeployment-5by2nnzblmxy |
| 1                                       | c2b012c0-599f-470c-be31-91c3b4d05152 | OS::Heat::StructuredDeployment        | CREATE_FAILED   | 2016-02-27T05:39:24 | overcloud-ControllerNodesPostDeployment-5by2nnzblmxy-ControllerLoadBalancerDeployment_Step1-p5peyf3x7sof |
| 0                                       | c81dc204-8249-4bdb-85d2-7d5dbc8a313f | OS::Heat::StructuredDeployment        | CREATE_FAILED   | 2016-02-27T05:39:26 | overcloud-ControllerNodesPostDeployment-5by2nnzblmxy-ControllerLoadBalancerDeployment_Step1-p5peyf3x7sof |
| 2                                       | b8b614c2-deea-4738-a4fc-30ae0eef3383 | OS::Heat::StructuredDeployment        | CREATE_FAILED   | 2016-02-27T05:39:27 | overcloud-ControllerNodesPostDeployment-5by2nnzblmxy-ControllerLoadBalancerDeployment_Step1-p5peyf3x7sof |

[stack@undercloud ~]$ heat deployment-show b8b614c2-deea-4738-a4fc-30ae0eef3383
{
  "status": "FAILED",
  "server_id": "1c2f8d4d-e31e-4bbc-bc2f-d71f4c79981e",
  "config_id": "04d24542-3a6e-488b-ae43-34554e780743",
  "output_values": {
    "deploy_stdout": "",
    "deploy_stderr": "Device \"br_ex\" does not exist.\nDevice \"br_nic2\" does not exist.\nDevice \"br_nic4\" does not exist.\nDevice \"ovs_system\" does not exist.\n\u001b[1;31mError: Could not find class ::tripleo::firewall for overcloud-controller-2.localdomain on node overcloud-controller-2.localdomain\u001b[0m\n\u001b[1;31mError: Could not find class ::tripleo::firewall for overcloud-controller-2.localdomain on node overcloud-controller-2.localdomain\u001b[0m\n",
    "deploy_status_code": 1
  },
  "creation_time": "2016-02-27T05:39:28",
  "updated_time": "2016-02-27T05:40:26",
  "input_values": {},
  "action": "CREATE",
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
  "id": "b8b614c2-deea-4738-a4fc-30ae0eef3383"
}
If this is happening in a baremetal environment but not a virtual one, my only suggestion is that maybe the disk on the baremetal still has the image from an older install, or the overcloud image is out of date for some other reason.
Indeed, the opm version differs among the OC nodes. Introspection completed successfully on all of them, and the assigned deploy image is the same for all of them in the ironic DB. So, apparently, some nodes still booted with the old image.
(In reply to Steve Baker from comment #4) > If this is happening in a baremetal environment but not a virtual one, my > only suggestion is that maybe the disk on the baremetal still has the image > from an older install, or the overcloud image is out of date for some other > reason. @Steve, where does your suggestion come from? I'd appreciate it if you could elaborate. I'm quite interested in the difference in behaviour between BMs and VMs.
Baremetal has a real disk which the OS image gets copied to on each run, whereas a VM has a virtual disk which always starts out empty. If the image copy fails, the baremetal node risks booting a previously copied image, whereas the VM won't boot at all. Also, image copying is more likely to fail on baremetal, since there are any number of storage configurations which haven't had as much testing as the simple single disk you would get with a VM.
Sasha, I think you're working with Lucas on this one, so I'm assigning it to him to see if there's a BM provisioning issue here.
Ironic does have a mechanism called "cleaning" which can erase the disks before a node becomes available to nova and after an instance is torn down. Unfortunately, cleaning only arrived in the iSCSI drivers (pxe_*) in Mitaka [0], and since it's a feature it wasn't backported to stable/liberty upstream. [0] https://review.openstack.org/#/c/220898/
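For reference, on Mitaka and later, automated cleaning is toggled in the conductor section of ironic.conf along these lines (a sketch; verify the option name and default against your release's documentation):

```
# /etc/ironic/ironic.conf
[conductor]
# Run cleaning steps (including disk erase) between deployments
automated_clean = True
```

With this enabled, a node passes through the "cleaning" state after teardown, so a later deployment cannot accidentally boot leftover data from an earlier install.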
The issue reproduces on nodes with more than 1 disk (4 disks).
So I initialized all disks on the nodes with multiple disks, and upon re-deployment these nodes failed to boot due to a missing bootloader, prompting to choose another boot method.
To overcome the issue:

1. Add a property to the node specifying the disk size:

ironic node-update <node_ID> add properties/root_device='{"size": <size>}'

Note: A node's 'local_gb' property is often set to a value 1 GiB less than the actual disk size to account for partitioning. However, in this case 'size' should be the actual size. For example, for a 128 GiB disk, 'local_gb' will be 127, but the 'size' hint will be 128.

2. In the BIOS or controller configuration menu, select the same disk to boot from.

http://docs.openstack.org/developer/ironic/deploy/install-guide.html?highlight=wwn#specifying-the-disk-for-deployment
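The local_gb/size relationship above can be captured in a tiny hypothetical helper, so the hint is always derived the same way (a sketch; `root_size_hint` is not an ironic command, just a local convenience function):

```shell
# Hypothetical helper: ironic records local_gb as 1 GiB less than the
# real disk size, while the root_device hint wants the real size.
root_size_hint() {
  local local_gb="$1"
  echo $(( local_gb + 1 ))
}

# Example usage for a node whose local_gb is 127 (i.e. a 128 GiB disk):
#   ironic node-update <node_ID> add \
#     properties/root_device="{\"size\": $(root_size_hint 127)}"
```

Using `wwn` or `serial` instead of `size` (see the linked install guide) is more robust when several disks happen to be the same size.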
Related to this issue with Ceph: https://bugzilla.redhat.com/show_bug.cgi?id=1282897
I can reproduce it now with 7.3 too. The weird part is that I'm using the same setup where I used to deploy 7.3 without any issue.
Hi Dan,

Suggestions:

1) Instead of:

export IRONIC_DISCOVERD_PASSWORD=`sudo grep admin_password /etc/ironic-inspector/inspector.conf | egrep -v '^#' | awk '{print $NF}'`

use:

export IRONIC_DISCOVERD_PASSWORD=`sudo grep admin_password /etc/ironic-inspector/inspector.conf | awk '! /^#/ {print $NF}'`

2) Similarly, instead of:

for node in $(ironic node-list | grep -v UUID | awk '{print $2}');

use:

for node in $(ironic node-list | awk '!/UUID/ {print $2}');

3) Add a note to configure the BIOS to boot from the respective disk or controller.
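To show the two forms of suggestion 1 are equivalent, here's a quick check against a mocked-up inspector.conf (the file contents are invented for the demonstration; the real value lives in /etc/ironic-inspector/inspector.conf):

```shell
# Build a throwaway config file with a commented-out and a live setting.
conf=$(mktemp)
cat > "$conf" <<'EOF'
#admin_password = commented-out-value
admin_password = s3cret
EOF

# Original three-stage pipeline: grep, then egrep to drop comments, then awk.
old=$(grep admin_password "$conf" | egrep -v '^#' | awk '{print $NF}')

# Simplified form: let awk both skip comment lines and print the last field.
new=$(grep admin_password "$conf" | awk '! /^#/ {print $NF}')

echo "old=$old new=$new"
rm -f "$conf"
```

Both pipelines pick out the uncommented value; the awk-only filter just saves a process per invocation.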
Sasha, the feedback from comment #18 is now live: https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/single/director-installation-and-usage/#sect-Defining_the_Root_Disk_for_Nodes How does it look? Anything further we should add/modify?
Verified: Looks good. Thanks.
Cool. Closing.