Description of problem:
Overcloud fails to deploy with HA controllers (non-HA not tested) due to a dependency on

/usr/lib/rabbitmq/bin/rabbitmq-plugins list -E -m

This command fails and returns no output with rabbitmq-server version 3.6.5-1.el7.
If rabbitmq-server 3.3.5-28.el7 is installed from EPEL, the puppet commands succeed.

Version-Release number of selected component (if applicable):
rabbitmq-server version 3.6.5-1.el7
Mitaka
Newton

How reproducible:
100%

Steps to Reproduce:
1. Deploy undercloud
2. Deploy overcloud
3. Run puppet apply -e 'include rabbitmq' --debug --verbose --detailed-exitcodes on the failed node to reveal the command that is failing

Actual results:
Overcloud deploy fails

| ControllerNodesPostDeployment | 77540034-fd02-4682-96ef-9633528efea7 | OS::TripleO::ControllerPostDeployment | CREATE_FAILED | 2016-11-23T04:26:12 | overcloud |
| ControllerServicesBaseDeployment_Step2 | 5554d089-f352-4274-a572-15b6350c054f | OS::Heat::StructuredDeployments | CREATE_FAILED | 2016-11-23T05:31:00 | overcloud-ControllerNodesPostDeployment-crvexxdlqiio |
| 0 | a29475fc-2c65-4abd-a316-6c13ce4707b2 | OS::Heat::StructuredDeployment | CREATE_FAILED | 2016-11-23T05:58:20 | overcloud-ControllerNodesPostDeployment-crvexxdlqiio-ControllerServicesBaseDeployment_Step2-ng7pkurce7pp |
| 2 | 16614478-6575-44b8-9a0a-ccde4748bf70 | OS::Heat::StructuredDeployment | CREATE_FAILED | 2016-11-23T05:58:20 | overcloud-ControllerNodesPostDeployment-crvexxdlqiio-ControllerServicesBaseDeployment_Step2-ng7pkurce7pp |
| ComputeNodesPostDeployment | a120a8c4-38bc-4cd2-9944-8989ebabb5da | OS::TripleO::ComputePostDeployment | CREATE_FAILED | 2016-11-23T04:26:11 | overcloud |
| ComputePuppetDeployment | 488edbf4-d85b-4588-b7a4-87aa49593901 | OS::Heat::StructuredDeployments | CREATE_FAILED | 2016-11-23T05:30:57 | overcloud-ComputeNodesPostDeployment-juxi6oeglyi6 |

Expected results:
Overcloud deploys

Additional info:
This is a regression: I deployed this exact same configuration last month at production scale without issue. This is also not an issue with RHOSP, and appears to be linked to the upstream puppet modules.

Here is the debug output from a failed node with rabbitmq-server from the Mitaka or Newton repos.

Debug: Executing '/usr/lib/rabbitmq/bin/rabbitmq-plugins list -E -m'
Debug: Command failed, retrying

Here is the debug output from a node with rabbitmq-server from the EPEL repo.

Debug: Executing '/usr/lib/rabbitmq/bin/rabbitmq-plugins list -E -m'
Debug: Command succeeded
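To check the same command outside of Puppet, it can be run by hand on the affected node. The following is only a minimal sketch: `probe_cmd` is a hypothetical helper name, not part of any package, and treating empty output as a failure is an assumption based on the behaviour described above (failure with no output), not on how the puppet provider actually decides.

```shell
#!/bin/bash
# probe_cmd (hypothetical helper): run a command, capture its stdout,
# and report the exit status and the number of characters of output.
# Empty output is treated as a failure (assumption taken from this report).
probe_cmd() {
    local out rc
    out="$("$@" 2>/dev/null)" && rc=0 || rc=$?
    if [ "$rc" -ne 0 ] || [ -z "$out" ]; then
        echo "FAIL rc=$rc output_bytes=${#out}"
        return 1
    fi
    echo "OK rc=$rc output_bytes=${#out}"
}

# On an affected controller, something like:
#   rpm -q rabbitmq-server
#   probe_cmd /usr/lib/rabbitmq/bin/rabbitmq-plugins list -E -m
```

Comparing the result on a node with the 3.6.5 package against a node with the 3.3.5 package should show whether the difference is in the command itself rather than in the puppet module.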
I suspect the openstack-puppet modules will need to be updated to work with the latest version of rabbitmq-server.

WORKAROUND

#create a firstboot config file
mkdir -p templates/firstboot/

cat << EOF >> templates/firstboot/firstboot-config.yaml
heat_template_version: 2014-10-16

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: repo_config}

  repo_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        yum -y remove rabbitmq-server
        yum -y install http://mirror.centos.org/centos/7/cloud/x86_64/openstack-mitaka/common/rabbitmq-server-3.3.5-17.el7.noarch.rpm

outputs:
  OS::stack_id:
    value: {get_resource: userdata}
EOF

cat << EOF >> templates/firstboot-environment.yaml
resource_registry:
  OS::TripleO::NodeUserData: /home/stack/templates/firstboot/firstboot-config.yaml
EOF

#add this template to the deployment
openstack overcloud deploy --templates \
  -e /home/stack/templates/firstboot-environment.yaml \
  --control-scale 3 \
  --compute-scale 9 \
  --ceph-storage-scale 0 \
  --control-flavor control \
  --compute-flavor compute \
  --neutron-network-type vxlan \
  --neutron-tunnel-type vxlan \
  --ntp-server time-a.nist.gov

I am not sure of the implications of using an older version of rabbitmq-server, so implement this workaround at your own risk.
Sorry you didn't get a response to this. I can't replicate this, and as it is against an EOL product I'm closing it as such, but please re-open if it is still a problem with current packages.