Bug 1665613 - standalone deployment fails on galera setup due to being a slave
Summary: standalone deployment fails on galera setup due to being a slave
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: 14.0 (Rocky)
Assignee: Alex Schultz
QA Contact: Artem Hrechanychenko
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-01-12 00:19 UTC by Alex Schultz
Modified: 2019-03-18 13:03 UTC (History)

Fixed In Version: python-tripleoclient-10.6.1-0.20181010222416.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-18 13:03:28 UTC
Target Upstream Version:
Embargoed:


Links
System                   ID              Private  Priority  Status  Summary                                                  Last Updated
Launchpad                1814564         0        None      None    None                                                     2019-02-04 16:46:21 UTC
OpenStack gerrit         634803          0        None      MERGED  Run hostname check for standalone                        2020-09-25 03:11:29 UTC
OpenStack gerrit         634836          0        None      MERGED  Specify hostname for standalone                          2020-09-25 03:11:29 UTC
OpenStack gerrit         634971          0        None      MERGED  Document the requirement for the hostname on standalone  2020-09-25 03:11:34 UTC
OpenStack gerrit         637200          0        None      MERGED  Run hostname check for standalone                        2020-09-25 03:11:30 UTC
Red Hat Product Errata   RHBA-2019:0446  0        None      None    None                                                     2019-03-18 13:03:37 UTC

Description Alex Schultz 2019-01-12 00:19:29 UTC
Description of problem:
The installation of a cloud using the standalone deployment method is failing because galera thinks it is a slave despite being a cluster of size 1.

Version-Release number of selected component (if applicable):
ansible-pacemaker-1.0.4-0.20180827141254.0e4d7c0.el7ost.noarch
ansible-role-tripleo-modify-image-1.0.1-0.20181011160036.48a56c1.el7ost.noarch
ansible-tripleo-ipsec-9.0.1-0.20181012162415.8b37e93.el7ost.noarch
openstack-tripleo-common-9.4.1-0.20181012010886.el7ost.noarch
openstack-tripleo-common-containers-9.4.1-0.20181012010886.el7ost.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060908.el7ost.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200835.el7ost.noarch
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7ost.noarch
openstack-tripleo-validations-9.3.1-0.20181008110759.4064fb7.el7ost.noarch
pacemaker-1.1.19-8.el7_6.2.x86_64
pacemaker-cli-1.1.19-8.el7_6.2.x86_64
pacemaker-cluster-libs-1.1.19-8.el7_6.2.x86_64
pacemaker-libs-1.1.19-8.el7_6.2.x86_64
puppet-pacemaker-0.7.2-0.20181008172520.9a4bc2d.el7ost.noarch
puppet-tripleo-9.3.1-0.20181010034754.157eaab.el7ost.noarch
python2-tripleo-common-9.4.1-0.20181012010886.el7ost.noarch
python-tripleoclient-10.6.1-0.20181010222413.8c8f259.el7ost.noarch
python-tripleoclient-heat-installer-10.6.1-0.20181010222413.8c8f259.el7ost.noarch


How reproducible:


Steps to Reproduce:
1. configure system with OSP14 repos + HA repo
2. sudo yum install -y python-tripleoclient
3. openstack tripleo container image prepare default \
  --output-env-file $HOME/containers-prepare-parameters.yaml
4. Configure standalone_parameters.yaml

export IP=192.168.24.2
export NETMASK=24
export INTERFACE=eth1

cat <<EOF > $HOME/standalone_parameters.yaml
parameter_defaults:
  CloudName: $IP
  ControlPlaneStaticRoutes: []
  Debug: true
  DeploymentUser: $USER
  DnsServers:
    - 1.1.1.1
    - 8.8.8.8
  DockerInsecureRegistryAddress:
    - $IP:8787
  NeutronPublicInterface: $INTERFACE
  # domain name used by the host
  NeutronDnsDomain: localdomain
  # re-use ctlplane bridge for public net, defined in the standalone
  # net config (do not change unless you know what you're doing)
  NeutronBridgeMappings: datacentre:br-ctlplane
  NeutronPhysicalBridge: br-ctlplane
  # enable to force metadata for public net
  #NeutronEnableForceMetadata: true
  StandaloneEnableRoutedNetworks: false
  StandaloneHomeDir: $HOME
  StandaloneLocalMtu: 1500
  # Needed if running in a VM, not needed if on baremetal
  NovaComputeLibvirtType: qemu
EOF

5. Run deployment:

sudo openstack tripleo deploy \
  --templates \
  --local-ip=$IP/$NETMASK \
  -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
  -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
  -e $HOME/containers-prepare-parameters.yaml \
  -e $HOME/standalone_parameters.yaml \
  --output-dir $HOME \
  --standalone

Actual results:
The deployment fails with Galera errors:

        "Error: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]",
        "Error: /Stage[main]/Tripleo::Profile::Pacemaker::Database::Mysql_bundle/Exec[galera-ready]/returns: change from notrun to 0 failed: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]",
        "Error: Failed to apply catalog: Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe SELECT CONCAT(User, '@',Host) AS User FROM mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111 \"Connection refused\")",


Expected results:
Deployment should be successful.

Additional info:

Comment 1 Alex Schultz 2019-01-12 00:30:35 UTC
[rhel@localhost ~]$ sudo pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: localhost (version 1.1.19-8.el7_6.2-c3c624ea3d) - partition with quorum
Last updated: Sat Jan 12 00:30:08 2019
Last change: Fri Jan 11 23:24:11 2019 by root via cibadmin on localhost

4 nodes configured
9 resources configured

Online: [ localhost ]
GuestOnline: [ galera-bundle-0@localhost rabbitmq-bundle-0@localhost redis-bundle-0@localhost ]

Full list of resources:

 Docker container: rabbitmq-bundle [registry.access.redhat.com/rhosp14/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0	(ocf::heartbeat:rabbitmq-cluster):	Started localhost
 Docker container: galera-bundle [registry.access.redhat.com/rhosp14/openstack-mariadb:pcmklatest]
   galera-bundle-0	(ocf::heartbeat:galera):	Slave localhost
 Docker container: redis-bundle [registry.access.redhat.com/rhosp14/openstack-redis:pcmklatest]
   redis-bundle-0	(ocf::heartbeat:redis):	Slave localhost

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Comment 2 Alex Schultz 2019-01-12 03:56:18 UTC
It appears that this might be hostname related. I reran it after setting the hostname and it succeeded. 

sudo hostnamectl set-hostname standalone.localhost
sudo hostnamectl set-hostname standalone.localhost --transient

Will need to look into this further.
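
Until a fixed package is installed, a pre-flight check along these lines can catch the hostname problem before the deployment starts. This is a minimal sketch, assuming that the relevant condition is a mismatch between the static and transient hostnames; hostnames_match and check_host are hypothetical names, and this is not the check implemented in python-tripleoclient.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers (assumption, not tripleoclient code).

# Return 0 when both names are non-empty and identical, 1 otherwise.
hostnames_match() {
    static_name=$1
    transient_name=$2
    [ -n "$static_name" ] && [ "$static_name" = "$transient_name" ]
}

# Compare the live static and transient hostnames from hostnamectl.
check_host() {
    if hostnames_match "$(hostnamectl --static)" "$(hostnamectl --transient)"; then
        echo "hostnames match"
    else
        echo "static and transient hostnames differ; fix with hostnamectl" >&2
        return 1
    fi
}
```

Usage would be `check_host && sudo openstack tripleo deploy ...`, so that the deployment only starts when both hostnames agree.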

Comment 4 Mikey Ariel 2019-02-20 12:44:27 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Comment 5 Artem Hrechanychenko 2019-02-27 16:19:10 UTC
VERIFIED

[stack@standalone ~]$ rpm -qa python-tripleoclient
python-tripleoclient-10.6.1-0.20190117233304.e780899.el7ost.noarch

    [stack@undercloud-0 ~]$ hostname -f
    undercloud-0
    [stack@undercloud-0 ~]$ hostnamectl --static
    localhost.localdomain
    [stack@undercloud-0 ~]$ hostnamectl --transient
    undercloud-0
     
     
    [stack@undercloud-0 ~]$ sudo openstack tripleo deploy \
    >   --templates \
    >   --local-ip=$IP/$NETMASK \
    >   -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
    >   -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
    >   -e $HOME/containers-prepare-parameters.yaml \
    >   -e $HOME/standalone_parameters.yaml \
    >   --output-dir $HOME \
    >   --standalone
    [sudo] password for stack:
    [EXPERIMENTAL] The tripleo deploy interface is an experimental interface. It may change in the next release.
    Static hostname "localhost.localdomain" does not match transient hostname "undercloud-0".
    Use hostnamectl to set matching hostnames.
    Static and transient hostnames do not match
     
     
    [stack@undercloud-0 ~]$ hostnamectl --static
    standalone.localdomain
    [stack@undercloud-0 ~]$ hostnamectl --transient
    standalone.localdomain
     
     
     
    ....
    PLAY [External deployment Post Deploy tasks] **********************************************************************************************************************************************************************
     
    PLAY RECAP ********************************************************************************************************************************************************************************************************
    standalone                 : ok=254  changed=110  unreachable=0    failed=0
    undercloud                 : ok=10   changed=7    unreachable=0    failed=0
     
    Not cleaning working directory /home/stack/tripleo-heat-installer-templates
    Not cleaning ansible directory /home/stack/undercloud-ansible-Dg7rNT
    Install artifact is located at /home/stack/undercloud-install-20190227161508.tar.bzip2
     
    ########################################################
     
    Deployment successfull!
     
    ########################################################
     
    Writing the stack virtual update mark file /var/lib/tripleo-heat-installer/update_mark_standalone

Comment 7 errata-xmlrpc 2019-03-18 13:03:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0446

