Description of problem:
The installation of a cloud using the standalone deployment method fails because the galera resource stays in the Slave role even though the cluster has only one node.

Version-Release number of selected component (if applicable):
ansible-pacemaker-1.0.4-0.20180827141254.0e4d7c0.el7ost.noarch
ansible-role-tripleo-modify-image-1.0.1-0.20181011160036.48a56c1.el7ost.noarch
ansible-tripleo-ipsec-9.0.1-0.20181012162415.8b37e93.el7ost.noarch
openstack-tripleo-common-9.4.1-0.20181012010886.el7ost.noarch
openstack-tripleo-common-containers-9.4.1-0.20181012010886.el7ost.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060908.el7ost.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200835.el7ost.noarch
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7ost.noarch
openstack-tripleo-validations-9.3.1-0.20181008110759.4064fb7.el7ost.noarch
pacemaker-1.1.19-8.el7_6.2.x86_64
pacemaker-cli-1.1.19-8.el7_6.2.x86_64
pacemaker-cluster-libs-1.1.19-8.el7_6.2.x86_64
pacemaker-libs-1.1.19-8.el7_6.2.x86_64
puppet-pacemaker-0.7.2-0.20181008172520.9a4bc2d.el7ost.noarch
puppet-tripleo-9.3.1-0.20181010034754.157eaab.el7ost.noarch
python2-tripleo-common-9.4.1-0.20181012010886.el7ost.noarch
python-tripleoclient-10.6.1-0.20181010222413.8c8f259.el7ost.noarch
python-tripleoclient-heat-installer-10.6.1-0.20181010222413.8c8f259.el7ost.noarch

How reproducible:

Steps to Reproduce:
1. Configure system with OSP14 repos + HA repo

2. sudo yum install -y python-tripleoclient

3. openstack tripleo container image prepare default \
     --output-env-file $HOME/containers-prepare-parameters.yaml

4. Configure standalone_parameters.yaml

   export IP=192.168.24.2
   export NETMASK=24
   export INTERFACE=eth1

   cat <<EOF > $HOME/standalone_parameters.yaml
   parameter_defaults:
     CloudName: $IP
     ControlPlaneStaticRoutes: []
     Debug: true
     DeploymentUser: $USER
     DnsServers:
       - 1.1.1.1
       - 8.8.8.8
     DockerInsecureRegistryAddress:
       - $IP:8787
     NeutronPublicInterface: $INTERFACE
     # domain name used by the host
     NeutronDnsDomain: localdomain
     # re-use ctlplane bridge for public net, defined in the standalone
     # net config (do not change unless you know what you're doing)
     NeutronBridgeMappings: datacentre:br-ctlplane
     NeutronPhysicalBridge: br-ctlplane
     # enable to force metadata for public net
     #NeutronEnableForceMetadata: true
     StandaloneEnableRoutedNetworks: false
     StandaloneHomeDir: $HOME
     StandaloneLocalMtu: 1500
     # Needed if running in a VM, not needed if on baremetal
     NovaComputeLibvirtType: qemu
   EOF

5. Run deployment:

   sudo openstack tripleo deploy \
     --templates \
     --local-ip=$IP/$NETMASK \
     -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
     -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
     -e $HOME/containers-prepare-parameters.yaml \
     -e $HOME/standalone_parameters.yaml \
     --output-dir $HOME \
     --standalone

Actual results:
The deployment fails with errors on Galera:

"Error: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]",
"Error: /Stage[main]/Tripleo::Profile::Pacemaker::Database::Mysql_bundle/Exec[galera-ready]/returns: change from notrun to 0 failed: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]",
"Error: Failed to apply catalog: Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe SELECT CONCAT(User, '@',Host) AS User FROM mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111 \"Connection refused\")",

Expected results:
Deployment should be successful.

Additional info:
[rhel@localhost ~]$ sudo pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: localhost (version 1.1.19-8.el7_6.2-c3c624ea3d) - partition with quorum
Last updated: Sat Jan 12 00:30:08 2019
Last change: Fri Jan 11 23:24:11 2019 by root via cibadmin on localhost

4 nodes configured
9 resources configured

Online: [ localhost ]
GuestOnline: [ galera-bundle-0@localhost rabbitmq-bundle-0@localhost redis-bundle-0@localhost ]

Full list of resources:

 Docker container: rabbitmq-bundle [registry.access.redhat.com/rhosp14/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started localhost
 Docker container: galera-bundle [registry.access.redhat.com/rhosp14/openstack-mariadb:pcmklatest]
   galera-bundle-0 (ocf::heartbeat:galera): Slave localhost
 Docker container: redis-bundle [registry.access.redhat.com/rhosp14/openstack-redis:pcmklatest]
   redis-bundle-0 (ocf::heartbeat:redis): Slave localhost

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
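To spot the stuck resources quickly in output like the above, a small filter can help. This is only a sketch: the function name slave_resources is made up for illustration, and it assumes the resource lines keep the "name (ocf::provider:agent): Role node" layout shown in this report.

```shell
#!/bin/sh
# slave_resources (hypothetical helper): read "pcs status" text on stdin
# and print the name of every resource whose role column is "Slave".
# Assumes resource lines look like:
#   galera-bundle-0 (ocf::heartbeat:galera): Slave localhost
slave_resources() {
    awk '$2 ~ /^\(ocf::.*\):$/ && $3 == "Slave" { print $1 }'
}
```

It could be used as: sudo pcs status | slave_resources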
It appears that this might be hostname related. I reran it after setting the hostname and it succeeded.

sudo hostnamectl set-hostname standalone.localhost
sudo hostnamectl set-hostname standalone.localhost --transient

Will need to look into this further.
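Until this is understood, a pre-flight check before running the deploy may save a failed run. The following is a minimal sketch, not the tripleoclient code: the function name hostnames_match is hypothetical, and the rule it applies (static and transient hostnames must be non-empty and identical) is an assumption based on the workaround above.

```shell
#!/bin/sh
# hostnames_match STATIC TRANSIENT (hypothetical helper)
# Returns 0 only when both names are non-empty and identical;
# otherwise prints a reason to stderr and returns 1.
hostnames_match() {
    static_name=$1
    transient_name=$2
    if [ -z "$static_name" ] || [ -z "$transient_name" ]; then
        echo "static or transient hostname is empty" >&2
        return 1
    fi
    if [ "$static_name" != "$transient_name" ]; then
        echo "static hostname \"$static_name\" does not match transient hostname \"$transient_name\"" >&2
        return 1
    fi
    return 0
}
```

It could be called as: hostnames_match "$(hostnamectl --static)" "$(hostnamectl --transient)" and the deploy started only if it returns 0.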
VERIFIED

[stack@standalone ~]$ rpm -qa python-tripleoclient
python-tripleoclient-10.6.1-0.20190117233304.e780899.el7ost.noarch

[stack@undercloud-0 ~]$ hostname -f
undercloud-0
[stack@undercloud-0 ~]$ hostnamectl --static
localhost.localdomain
[stack@undercloud-0 ~]$ hostnamectl --transient
undercloud-0

[stack@undercloud-0 ~]$ sudo openstack tripleo deploy \
>   --templates \
>   --local-ip=$IP/$NETMASK \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
>   -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
>   -e $HOME/containers-prepare-parameters.yaml \
>   -e $HOME/standalone_parameters.yaml \
>   --output-dir $HOME \
>   --standalone
[sudo] password for stack:
[EXPERIMENTAL] The tripleo deploy interface is an experimental interface. It may change in the next release.
Static hostname "localhost.localdomain" does not match transient hostname "undercloud-0".
Use hostnamectl to set matching hostnames.
Static and transient hostnames do not match

[stack@undercloud-0 ~]$ hostnamectl --static
standalone.localdomain
[stack@undercloud-0 ~]$ hostnamectl --transient
standalone.localdomain

....
PLAY [External deployment Post Deploy tasks] ****************************

PLAY RECAP **************************************************************
standalone                 : ok=254  changed=110  unreachable=0    failed=0
undercloud                 : ok=10   changed=7    unreachable=0    failed=0

Not cleaning working directory /home/stack/tripleo-heat-installer-templates
Not cleaning ansible directory /home/stack/undercloud-ansible-Dg7rNT
Install artifact is located at /home/stack/undercloud-install-20190227161508.tar.bzip2

########################################################

Deployment successfull!

########################################################

Writing the stack virtual update mark file /var/lib/tripleo-heat-installer/update_mark_standalone
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0446