Bug 1665613

Summary: standalone deployment fails on galera setup due to being a slave
Product: Red Hat OpenStack
Reporter: Alex Schultz <aschultz>
Component: python-tripleoclient
Assignee: Alex Schultz <aschultz>
Status: CLOSED ERRATA
QA Contact: Artem Hrechanychenko <ahrechan>
Severity: high
Priority: high
Version: 14.0 (Rocky)
CC: amcleod, hbrock, jslagle, mburns, mcornea
Target Milestone: z1
Keywords: Triaged, ZStream
Target Release: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: python-tripleoclient-10.6.1-0.20181010222416.el7ost
Last Closed: 2019-03-18 13:03:28 UTC
Type: Bug

Description Alex Schultz 2019-01-12 00:19:29 UTC
Description of problem:
Installing a cloud with the standalone deployment method fails because the galera resource stays in the Slave state, even though the cluster has only one node.

Version-Release number of selected component (if applicable):
ansible-pacemaker-1.0.4-0.20180827141254.0e4d7c0.el7ost.noarch
ansible-role-tripleo-modify-image-1.0.1-0.20181011160036.48a56c1.el7ost.noarch
ansible-tripleo-ipsec-9.0.1-0.20181012162415.8b37e93.el7ost.noarch
openstack-tripleo-common-9.4.1-0.20181012010886.el7ost.noarch
openstack-tripleo-common-containers-9.4.1-0.20181012010886.el7ost.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060908.el7ost.noarch
openstack-tripleo-image-elements-9.0.1-0.20181007200835.el7ost.noarch
openstack-tripleo-puppet-elements-9.0.0-0.20181007201103.daf9069.el7ost.noarch
openstack-tripleo-validations-9.3.1-0.20181008110759.4064fb7.el7ost.noarch
pacemaker-1.1.19-8.el7_6.2.x86_64
pacemaker-cli-1.1.19-8.el7_6.2.x86_64
pacemaker-cluster-libs-1.1.19-8.el7_6.2.x86_64
pacemaker-libs-1.1.19-8.el7_6.2.x86_64
puppet-pacemaker-0.7.2-0.20181008172520.9a4bc2d.el7ost.noarch
puppet-tripleo-9.3.1-0.20181010034754.157eaab.el7ost.noarch
python2-tripleo-common-9.4.1-0.20181012010886.el7ost.noarch
python-tripleoclient-10.6.1-0.20181010222413.8c8f259.el7ost.noarch
python-tripleoclient-heat-installer-10.6.1-0.20181010222413.8c8f259.el7ost.noarch


How reproducible:


Steps to Reproduce:
1. Configure the system with the OSP14 repos and the HA repo
2. sudo yum install -y python-tripleoclient
3. openstack tripleo container image prepare default \
  --output-env-file $HOME/containers-prepare-parameters.yaml
4. Configure standalone_parameters.yaml

export IP=192.168.24.2
export NETMASK=24
export INTERFACE=eth1

cat <<EOF > $HOME/standalone_parameters.yaml
parameter_defaults:
  CloudName: $IP
  ControlPlaneStaticRoutes: []
  Debug: true
  DeploymentUser: $USER
  DnsServers:
    - 1.1.1.1
    - 8.8.8.8
  DockerInsecureRegistryAddress:
    - $IP:8787
  NeutronPublicInterface: $INTERFACE
  # domain name used by the host
  NeutronDnsDomain: localdomain
  # re-use ctlplane bridge for public net, defined in the standalone
  # net config (do not change unless you know what you're doing)
  NeutronBridgeMappings: datacentre:br-ctlplane
  NeutronPhysicalBridge: br-ctlplane
  # enable to force metadata for public net
  #NeutronEnableForceMetadata: true
  StandaloneEnableRoutedNetworks: false
  StandaloneHomeDir: $HOME
  StandaloneLocalMtu: 1500
  # Needed if running in a VM, not needed if on baremetal
  NovaComputeLibvirtType: qemu
EOF

5. Run deployment:

sudo openstack tripleo deploy \
  --templates \
  --local-ip=$IP/$NETMASK \
  -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
  -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
  -e $HOME/containers-prepare-parameters.yaml \
  -e $HOME/standalone_parameters.yaml \
  --output-dir $HOME \
  --standalone

Actual results:
The deployment fails with Galera errors:

        "Error: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]",
        "Error: /Stage[main]/Tripleo::Profile::Pacemaker::Database::Mysql_bundle/Exec[galera-ready]/returns: change from notrun to 0 failed: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0]",
        "Error: Failed to apply catalog: Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe SELECT CONCAT(User, '@',Host) AS User FROM mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111 \"Connection refused\")",


Expected results:
Deployment should be successful.

Additional info:
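The failing galera-ready exec in the errors above runs /usr/bin/clustercheck and expects exit code 0. When debugging by hand, it can help to poll that check for a while instead of running it once. The following is a minimal sketch of such a retry loop, not something shipped with TripleO: wait_until is a hypothetical helper name, and the attempt count and delay in the usage comment are arbitrary.

```shell
# wait_until ATTEMPTS DELAY CMD [ARGS...]
# Hypothetical helper (not part of TripleO): reruns CMD until it exits 0
# or until ATTEMPTS runs have failed. Returns 0 on success, 1 on timeout.
wait_until() {
    _wu_attempts=$1
    _wu_delay=$2
    shift 2
    _wu_i=1
    while [ "$_wu_i" -le "$_wu_attempts" ]; do
        # Discard output, as the puppet exec does with >/dev/null.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$_wu_i" -lt "$_wu_attempts" ]; then
            sleep "$_wu_delay"
        fi
        _wu_i=$((_wu_i + 1))
    done
    return 1
}

# Example (assumption: run where clustercheck is installed, for instance
# inside the galera container):
#   wait_until 10 5 /usr/bin/clustercheck || echo "galera never became ready"
```

If the check never succeeds, as in this bug, the loop only confirms that the failure is persistent rather than a startup race.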

Comment 1 Alex Schultz 2019-01-12 00:30:35 UTC
[rhel@localhost ~]$ sudo pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: localhost (version 1.1.19-8.el7_6.2-c3c624ea3d) - partition with quorum
Last updated: Sat Jan 12 00:30:08 2019
Last change: Fri Jan 11 23:24:11 2019 by root via cibadmin on localhost

4 nodes configured
9 resources configured

Online: [ localhost ]
GuestOnline: [ galera-bundle-0@localhost rabbitmq-bundle-0@localhost redis-bundle-0@localhost ]

Full list of resources:

 Docker container: rabbitmq-bundle [registry.access.redhat.com/rhosp14/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0	(ocf::heartbeat:rabbitmq-cluster):	Started localhost
 Docker container: galera-bundle [registry.access.redhat.com/rhosp14/openstack-mariadb:pcmklatest]
   galera-bundle-0	(ocf::heartbeat:galera):	Slave localhost
 Docker container: redis-bundle [registry.access.redhat.com/rhosp14/openstack-redis:pcmklatest]
   redis-bundle-0	(ocf::heartbeat:redis):	Slave localhost

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
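
In the output above, galera-bundle-0 is reported as Slave instead of being promoted. A small sketch for pulling such resources out of pcs status text follows. list_slave_resources is a hypothetical helper name, and the awk field positions assume the resource-line layout shown above (name, agent in parentheses, state, node).

```shell
# list_slave_resources: reads `pcs status` text on stdin and prints the name
# of every resource line whose state column is "Slave".
# Assumes resource lines look like:
#   galera-bundle-0 (ocf::heartbeat:galera): Slave localhost
list_slave_resources() {
    awk '$2 ~ /^\(ocf::/ && $3 == "Slave" { print $1 }'
}

# Example:
#   sudo pcs status | list_slave_resources
```

Run against the status output in this comment, the filter would list both the galera and the redis bundle replicas, since both are shown in the Slave state.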

Comment 2 Alex Schultz 2019-01-12 03:56:18 UTC
This appears to be hostname related. I reran the deployment after setting the hostname and it succeeded.

sudo hostnamectl set-hostname standalone.localhost
sudo hostnamectl set-hostname standalone.localhost --transient

Will need to look into this further.
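
The workaround above sets the static and the transient hostname to the same value. A minimal sketch for checking whether the two already match before starting a deployment is shown below. hostnames_match is a hypothetical helper, and treating an empty name as a mismatch is an assumption of this sketch, not documented client behavior.

```shell
# hostnames_match STATIC TRANSIENT
# Returns 0 when both names are non-empty and identical, 1 otherwise.
hostnames_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example (requires systemd's hostnamectl):
#   if ! hostnames_match "$(hostnamectl --static)" "$(hostnamectl --transient)"; then
#       echo "Static and transient hostnames do not match" >&2
#   fi
```

Taking the two names as arguments keeps the comparison separate from the hostnamectl calls, so it can be tried without touching the system configuration.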

Comment 4 Mikey Ariel 2019-02-20 12:44:27 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Comment 5 Artem Hrechanychenko 2019-02-27 16:19:10 UTC
VERIFIED

[stack@standalone ~]$ rpm -qa python-tripleoclient
python-tripleoclient-10.6.1-0.20190117233304.e780899.el7ost.noarch

    [stack@undercloud-0 ~]$ hostname -f
    undercloud-0
    [stack@undercloud-0 ~]$ hostnamectl --static
    localhost.localdomain
    [stack@undercloud-0 ~]$ hostnamectl --transient
    undercloud-0
     
     
    [stack@undercloud-0 ~]$ sudo openstack tripleo deploy \
    >   --templates \
    >   --local-ip=$IP/$NETMASK \
    >   -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
    >   -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
    >   -e $HOME/containers-prepare-parameters.yaml \
    >   -e $HOME/standalone_parameters.yaml \
    >   --output-dir $HOME \
    >   --standalone
    [sudo] password for stack:
    [EXPERIMENTAL] The tripleo deploy interface is an experimental interface. It may change in the next release.
    Static hostname "localhost.localdomain" does not match transient hostname "undercloud-0".
    Use hostnamectl to set matching hostnames.
    Static and transient hostnames do not match
     
     
    [stack@undercloud-0 ~]$ hostnamectl --static
    standalone.localdomain
    [stack@undercloud-0 ~]$ hostnamectl --transient
    standalone.localdomain
     
     
     
    ....
    PLAY [External deployment Post Deploy tasks] **********************************************************************************************************************************************************************
     
    PLAY RECAP ********************************************************************************************************************************************************************************************************
    standalone                 : ok=254  changed=110  unreachable=0    failed=0
    undercloud                 : ok=10   changed=7    unreachable=0    failed=0
     
    Not cleaning working directory /home/stack/tripleo-heat-installer-templates
    Not cleaning ansible directory /home/stack/undercloud-ansible-Dg7rNT
    Install artifact is located at /home/stack/undercloud-install-20190227161508.tar.bzip2
     
    ########################################################
     
    Deployment successfull!
     
    ########################################################
     
    Writing the stack virtual update mark file /var/lib/tripleo-heat-installer/update_mark_standalone

Comment 7 errata-xmlrpc 2019-03-18 13:03:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0446