Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1609022

Summary: Unable to log in to Galera in RHOSP 13 with the root account after changing the MySQL root password
Product: Red Hat OpenStack
Reporter: David Hill <dhill>
Component: openstack-tripleo-heat-templates
Assignee: Damien Ciabrini <dciabrin>
Status: CLOSED ERRATA
QA Contact: pkomarov
Severity: urgent
Docs Contact:
Priority: high
Version: 13.0 (Queens)
CC: abays, astupnik, brault, dcain, dciabrin, ebarrera, emacchi, jschluet, jslagle, mburns, michele, pgambard, pkomarov, rcernin, wlehman
Target Milestone: zstream
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.3.1-2.el7ost puppet-tripleo-8.4.1-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1674070
Environment:
Last Closed: 2019-04-30 17:27:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1674070    

Description David Hill 2018-07-26 18:25:52 UTC
Description of problem:
Unable to log in to Galera in RHOSP 13 with the root account (this might happen after a scale-up operation).


source stackrc
openstack overcloud plan export overcloud
tar xf overcloud.tar.gz
grep MysqlRootPassword *
 MysqlRootPassword: password




[root@overcloud-controller-0 ~]# mysql -p -u root -D mysql
Enter password: password
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
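
The manual check above can be scripted. This is a minimal sketch, not part of the original report: the function name mysql_root_password_from_plan is hypothetical, and it assumes the parameter appears on one line as `MysqlRootPassword: value`, optionally quoted.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not from the report): read YAML text on
# stdin and print the first MysqlRootPassword value, without quotes.
mysql_root_password_from_plan() {
    # Keep only the text after "MysqlRootPassword:", take the first match,
    # then strip trailing whitespace and one pair of surrounding quotes.
    sed -n 's/^[[:space:]]*MysqlRootPassword:[[:space:]]*//p' \
        | head -n 1 \
        | sed -e 's/[[:space:]]*$//' -e "s/^['\"]//" -e "s/['\"]\$//"
}

# Usage, assuming the exported plan was unpacked into the current directory:
#   pw="$(grep -h MysqlRootPassword * | mysql_root_password_from_plan)"
#   mysql -u root -p"$pw" -D mysql -e 'select 1;'
```

If the login still fails with the value printed here, the password stored in the plan and the one the database accepts have diverged, which is the symptom described in this report.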

Comment 1 David Hill 2018-07-26 18:36:22 UTC
This won't work either:

[root@overcloud-controller-0 etc]# docker exec -it 7d1805bab23d mysql
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

Comment 2 David Hill 2018-07-26 18:36:33 UTC
+ echo 'Running command: '\''/usr/sbin/pacemaker_remoted'\'''
+ exec /usr/sbin/pacemaker_remoted
Running command: '/usr/sbin/pacemaker_remoted'
  notice: crm_add_logfile:      Additional logging available in /var/log/pacemaker.log
    info: crm_log_init: Changed active directory to /var/lib/pacemaker/cores
    info: qb_ipcs_us_publish:   server name: lrmd
  notice: lrmd_init_remote_tls_server:  Starting TLS listener on port 3123
  notice: bind_and_listen:      Listening on address ::
    info: qb_ipcs_us_publish:   server name: cib_ro
    info: qb_ipcs_us_publish:   server name: cib_rw
    info: qb_ipcs_us_publish:   server name: cib_shm
    info: qb_ipcs_us_publish:   server name: attrd
    info: qb_ipcs_us_publish:   server name: stonith-ng
    info: qb_ipcs_us_publish:   server name: crmd
    info: main: Starting
    info: crm_remote_accept:    New remote connection from ::ffff:192.168.0.20
  notice: lrmd_remote_listen:   LRMD client connection established. 0x5578858e1fa0 id: b83739b2-63c2-4056-bd80-6d5d2a7a6922
    info: process_lrmd_get_rsc_info:    Resource 'galera' not found (0 active resources)
    info: process_lrmd_get_rsc_info:    Resource 'galera:0' not found (0 active resources)
    info: process_lrmd_rsc_register:    Added 'galera' to the rsc list (1 active resources)
    info: log_execute:  executing - rsc:galera action:start call_id:13
    info: log_finished: finished - rsc:galera action:start call_id:13 pid:91 exit-code:0 exec-time:3184ms queue-time:0ms
    info: cancel_recurring_action:      Cancelling ocf operation galera_monitor_20000
    info: cancel_recurring_action:      Cancelling ocf operation galera_monitor_30000
    info: log_execute:  executing - rsc:galera action:promote call_id:149
  notice: operation_finished:   galera_promote_0:510:stderr [ ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES) ]
  notice: operation_finished:   galera_promote_0:510:stderr [ ocf-exit-reason:Unable to retrieve wsrep_cluster_status, verify check_user 'root' has permissions to view status ]
  notice: operation_finished:   galera_promote_0:510:stderr [ ocf-exit-reason:local node <overcloud-controller-0> is started, but not in primary mode. Unknown state. ]
  notice: operation_finished:   galera_promote_0:510:stderr [ ocf-exit-reason:Failed initial monitor action ]
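
To pull only the failure reasons out of a log like the one above, a small filter is enough. This is a sketch under assumptions: the function name ocf_exit_reasons is hypothetical, and it expects each reason in the form `[ ocf-exit-reason:<text> ]` at the end of a line, as in the excerpt.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not from the report): read log text on
# stdin and print the message of every "ocf-exit-reason:" entry.
ocf_exit_reasons() {
    # Capture the text between "ocf-exit-reason:" and the closing " ]".
    sed -n 's/.*ocf-exit-reason:\(.*\) ][[:space:]]*$/\1/p'
}

# Usage, with the log saved to a file (file name is illustrative):
#   ocf_exit_reasons < pacemaker_remoted.log
```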

Comment 3 David Hill 2018-07-26 18:52:52 UTC
[root@overcloud-controller-0 etc]# paunch list | grep step2 | grep galera
| tripleo_step2 | mysql_init_bundle             | 192.168.12.16:8787/rhosp13/openstack-mariadb:13.0-47                   | /docker_puppet_apply.sh 2 file,file_line,concat,augeas,pacemaker::resource::bundle,pacemaker::property,pacemaker::resource::ocf,pacemaker::constraint::order,pacemaker::constraint::colocation,galera_ready,mysql_database,mysql_grant,mysql_user include ::tripleo::profile::base::pacemaker;include ::tripleo::profile::pacemaker::database::mysql_bundle  | running |
| tripleo_step2 | mysql_restart_bundle          | 192.168.12.16:8787/rhosp13/openstack-mariadb:13.0-47                   | /usr/bin/bootstrap_host_exec mysql if /usr/sbin/pcs resource show galera-bundle; then /usr/sbin/pcs resource restart --wait=600 galera-bundle; echo "galera-bundle restart invoked"; fi                                                                                                                                                                      | exited  |

Comment 4 David Hill 2018-07-26 18:56:48 UTC
So basically, this happens when we scale up the overcloud by one HCI compute node. It looks like MySQL won't allow root to connect. We can reset the password, and it will work again until the next scale-up.

Comment 5 David Hill 2018-07-26 18:59:16 UTC
By the way, we can trigger this bug by simply re-running the "openstack overcloud deploy" command without changing anything.

Comment 11 Alex Stupnikov 2018-08-23 10:55:38 UTC
We have hit the same issue on the latest RHOSP 13 environment. It is simple to reproduce: deploy the environment and configure fencing according to the document [1].

This looks like a general issue to me that will affect many customers who modify their RHOSP 13 environments, so I would like to ask you to prioritize it accordingly.

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/advanced_overcloud_customization/sect-fencing_the_controller_nodes

BR, Alex.

Comment 21 Andrew Bays 2018-08-31 11:49:08 UTC
I am able to reproduce this in my development environment by simply re-running the overcloud deploy command after successfully deploying a cloud (that is, I make no changes to the templates).

Templates used are here: 

https://github.com/atyronesmith/ha-director-install/tree/ir-osp13-hci-sriov

Comment 23 Dave Cain 2019-01-11 14:27:31 UTC
(In reply to Michele Baldessari from comment #22)
> Thanks Andrew and Alex. I sat down and finally reproduced the password
> changing (aka point a) from c#13). I filed
> https://bugzilla.redhat.com/show_bug.cgi?id=1624462 as a separate bug to
> track this.

BZ 1624462 was closed as CLOSED ERRATA and resolved, yet this BZ, 1609022 remains open.  

Can we clarify if the issue of scaleout with HCI changing the Galera PW has been addressed and fixed?

Comment 24 Michele Baldessari 2019-01-11 14:46:40 UTC
(In reply to Dave Cain from comment #23)
> (In reply to Michele Baldessari from comment #22)
> > Thanks Andrew and Alex. I sat down and finally reproduced the password
> > changing (aka point a) from c#13). I filed
> > https://bugzilla.redhat.com/show_bug.cgi?id=1624462 as a separate bug to
> > track this.
> 
> BZ 1624462 was closed as CLOSED ERRATA and resolved, yet this BZ, 1609022
> remains open.  
> 
> Can we clarify if the issue of scaleout with HCI changing the Galera PW has
> been addressed and fixed?

If you don't change the password explicitly, there should be no issue. Changing the MySQL root password is currently problematic because of our switch to containers.

If something other than custom_plans (which was fixed in BZ 1624462) is changing the MySQL root password without the operator knowing, we need to investigate it, because that is a separate issue.
That is, we'd like to see the content of the Swift plan before and after the problematic deploy to better understand what is going on.
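
One way to compare the two plan snapshots requested here is sketched below. It is an illustration only: both function names are hypothetical, and it assumes each plan export was unpacked into its own directory.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not from the report): print the distinct
# MysqlRootPassword lines found anywhere under the directory given as $1.
plan_root_password_lines() {
    grep -rh 'MysqlRootPassword' "$1" \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | sort -u
}

# Hypothetical helper: succeed (exit 0) when the two unpacked plans differ
# in their MysqlRootPassword lines, fail (exit 1) when they are identical.
plan_root_password_changed() {
    [ "$(plan_root_password_lines "$1")" != "$(plan_root_password_lines "$2")" ]
}

# Usage, with one export unpacked into before/ and one into after/
# (directory names are illustrative):
#   plan_root_password_changed before after && echo "MysqlRootPassword changed"
```

A difference between the two snapshots would point to something rewriting the parameter in the plan, rather than to the database itself.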

Comment 25 Damien Ciabrini 2019-01-30 21:05:48 UTC
Two reviews have merged upstream [1,2] to enable root password updates in containerized environments.
I'm still working on a final review [3] to polish the support before starting the backport.

[1] https://review.openstack.org/#/c/602499/
[2] https://review.openstack.org/#/c/602969/
[3] https://review.openstack.org/#/c/633768/

Comment 26 Damien Ciabrini 2019-02-08 14:38:51 UTC
All reviews have merged upstream; starting the backport to upstream stable releases and downstream.

Comment 27 Damien Ciabrini 2019-02-23 15:14:48 UTC
Fixed in upstream Queens with the 5 Gerrit reviews attached to the BZ.

Comment 35 pkomarov 2019-04-16 08:59:20 UTC
Verified.

[stack@undercloud-0 ~]$  rhos-release -L
Installed repositories (rhel-7.6):
  13
  ceph-3
  ceph-osd-3
  rhel-7.6
[stack@undercloud-0 ~]$ cat core_puddle_version 
2019-04-10.1

Verification as per https://bugzilla.redhat.com/show_bug.cgi?id=1674070#c4:
HA Overcloud:
1)

cat >password.yaml<<EOF
parameter_defaults:
    MysqlRootPassword: 'anewpassword'
EOF

sed -i 's/--log.*/-e \/home\/stack\/password.yaml /g' ./overcloud_deploy.sh

./overcloud_deploy.sh |& tee deploy.out

Ansible passed.
Overcloud configuration completed.
Overcloud Endpoint: https://10.0.0.101:13000
Overcloud Horizon Dashboard URL: https://10.0.0.101:443/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"mysql -uroot -panewpassword -e 'select 1;'"
 [WARNING]: Found both group and host with same name: undercloud

controller-2 | SUCCESS | rc=0 >>
1
1

controller-0 | SUCCESS | rc=0 >>
1
1

controller-1 | SUCCESS | rc=0 >>
1
1

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"grep -re mysql::server::root_password /etc/puppet/hieradata"
 [WARNING]: Found both group and host with same name: undercloud

controller-1 | SUCCESS | rc=0 >>
/etc/puppet/hieradata/service_configs.json:    "mysql::server::root_password": "anewpassword",

controller-2 | SUCCESS | rc=0 >>
/etc/puppet/hieradata/service_configs.json:    "mysql::server::root_password": "anewpassword",

controller-0 | SUCCESS | rc=0 >>
/etc/puppet/hieradata/service_configs.json:    "mysql::server::root_password": "anewpassword",

2)

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"docker ps|grep galera"


 [WARNING]: Found both group and host with same name: undercloud

controller-2 | SUCCESS | rc=0 >>
b71a4dfeef64        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   35 minutes ago      Up 35 minutes                                galera-bundle-docker-2

controller-1 | SUCCESS | rc=0 >>
4f3ec9660e54        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   35 minutes ago      Up 35 minutes                                galera-bundle-docker-1

controller-0 | SUCCESS | rc=0 >>
9407b9ab2470        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   35 minutes ago      Up 35 minutes                                galera-bundle-docker-0

. stackrc ; ./overcloud_deploy.sh |& tee deploy.out

Ansible passed.
Overcloud configuration completed.
Overcloud Endpoint: https://10.0.0.101:13000
Overcloud Horizon Dashboard URL: https://10.0.0.101:443/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"docker ps|grep galera"
 [WARNING]: Found both group and host with same name: undercloud

controller-1 | SUCCESS | rc=0 >>
4f3ec9660e54        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   About an hour ago   Up About an hour                             galera-bundle-docker-1

controller-2 | SUCCESS | rc=0 >>
b71a4dfeef64        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   About an hour ago   Up About an hour                             galera-bundle-docker-2

controller-0 | SUCCESS | rc=0 >>
9407b9ab2470        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   About an hour ago   Up About an hour                             galera-bundle-docker-0


Undercloud:

1) 

sed -i 's/undercloud_db_password=.*/undercloud_db_password=anewpassword/g'  undercloud-passwords.conf

openstack undercloud install |& tee uc.out

#############################################################################
Undercloud install complete.

(undercloud) [stack@undercloud-0 ~]$ mysql -uroot -panewpassword -e 'select 1;'
+---+
| 1 |
+---+
| 1 |
+---+

2)

sed -i 's/undercloud_db_password=.*/undercloud_db_password=yetanotherone/g'  undercloud-passwords.conf

openstack undercloud upgrade |& tee uc_upgrade.out

#############################################################################
Undercloud upgrade complete.
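
The hieradata check in step 1 of the overcloud verification can be reduced to a single comparison. This is a minimal sketch: the function name hiera_root_passwords is hypothetical, and it assumes the grep output format shown in that step.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not from the verification steps): read
# the grep/ansible output on stdin and print each distinct value of
# mysql::server::root_password once.
hiera_root_passwords() {
    sed -n 's/.*"mysql::server::root_password":[[:space:]]*"\([^"]*\)".*/\1/p' \
        | sort -u
}

# Usage on the undercloud; note that this prints the password in clear text:
#   ansible controller -b -mshell \
#     -a"grep -re mysql::server::root_password /etc/puppet/hieradata" \
#     | hiera_root_passwords
```

A single line of output means all controllers agree on the value. More than one line means at least one controller holds a different password in its hieradata.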

Comment 38 errata-xmlrpc 2019-04-30 17:27:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0939

Comment 39 David Hill 2019-05-02 13:11:59 UTC
*** Bug 1705523 has been marked as a duplicate of this bug. ***