Bug 1609022 - unable to login galera in RHOSP13 with root account after changing the mysql root password
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: zstream
Target Release: 13.0 (Queens)
Assignee: Damien Ciabrini
QA Contact: pkomarov
URL:
Whiteboard:
Duplicates: 1705523
Depends On:
Blocks: 1674070
 
Reported: 2018-07-26 18:25 UTC by David Hill
Modified: 2023-09-07 19:16 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.3.1-2.el7ost puppet-tripleo-8.4.1-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1674070 (view as bug list)
Environment:
Last Closed: 2019-04-30 17:27:35 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1792416 | 0 | None | None | None | 2018-09-13 16:56:51 UTC
Launchpad | 1809145 | 0 | None | None | None | 2019-02-18 16:38:24 UTC
Launchpad | 1814514 | 0 | None | None | None | 2019-02-18 16:38:24 UTC
OpenStack gerrit | 637525 | 0 | None | MERGED | mysql: use clustercheck credentials to poll galera state | 2021-02-10 14:47:26 UTC
OpenStack gerrit | 637539 | 0 | None | MERGED | mysql: do not overwrite password file during docker-puppet | 2021-02-10 14:47:26 UTC
OpenStack gerrit | 637559 | 0 | None | MERGED | Fix generation of configs that contain password files | 2021-01-12 15:44:13 UTC
OpenStack gerrit | 637577 | 0 | None | MERGED | mysql: fix root password update for containerized mysql | 2021-01-12 15:44:13 UTC
OpenStack gerrit | 637581 | 0 | None | MERGED | mysql: sync credentials in running container on password change | 2021-01-12 15:44:13 UTC
Red Hat Issue Tracker | OSP-13618 | 0 | None | None | None | 2022-03-13 15:26:31 UTC
Red Hat Knowledge Base (Solution) | 4101391 | 0 | None | None | None | 2019-05-02 13:24:58 UTC
Red Hat Product Errata | RHBA-2019:0939 | 0 | None | None | None | 2019-04-30 17:27:45 UTC

Description David Hill 2018-07-26 18:25:52 UTC
Description of problem:
Unable to log in to Galera in RHOSP13 with the root account (possibly after a scale-up operation).


source stackrc
openstack overcloud plan export overcloud
tar xf overcloud.tar.gz
grep MysqlRootPassword *
 MysqlRootPassword: password




[root@overcloud-controller-0 ~]# mysql -p -u root -D mysql
Enter password: password
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 David Hill 2018-07-26 18:36:22 UTC
This won't work either:

[root@overcloud-controller-0 etc]# docker exec -it 7d1805bab23d mysql
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

Comment 2 David Hill 2018-07-26 18:36:33 UTC
+ echo 'Running command: '\''/usr/sbin/pacemaker_remoted'\'''
+ exec /usr/sbin/pacemaker_remoted
Running command: '/usr/sbin/pacemaker_remoted'
  notice: crm_add_logfile:      Additional logging available in /var/log/pacemaker.log
    info: crm_log_init: Changed active directory to /var/lib/pacemaker/cores
    info: qb_ipcs_us_publish:   server name: lrmd
  notice: lrmd_init_remote_tls_server:  Starting TLS listener on port 3123
  notice: bind_and_listen:      Listening on address ::
    info: qb_ipcs_us_publish:   server name: cib_ro
    info: qb_ipcs_us_publish:   server name: cib_rw
    info: qb_ipcs_us_publish:   server name: cib_shm
    info: qb_ipcs_us_publish:   server name: attrd
    info: qb_ipcs_us_publish:   server name: stonith-ng
    info: qb_ipcs_us_publish:   server name: crmd
    info: main: Starting
    info: crm_remote_accept:    New remote connection from ::ffff:192.168.0.20
  notice: lrmd_remote_listen:   LRMD client connection established. 0x5578858e1fa0 id: b83739b2-63c2-4056-bd80-6d5d2a7a6922
    info: process_lrmd_get_rsc_info:    Resource 'galera' not found (0 active resources)
    info: process_lrmd_get_rsc_info:    Resource 'galera:0' not found (0 active resources)
    info: process_lrmd_rsc_register:    Added 'galera' to the rsc list (1 active resources)
    info: log_execute:  executing - rsc:galera action:start call_id:13
    info: log_finished: finished - rsc:galera action:start call_id:13 pid:91 exit-code:0 exec-time:3184ms queue-time:0ms
    info: cancel_recurring_action:      Cancelling ocf operation galera_monitor_20000
    info: cancel_recurring_action:      Cancelling ocf operation galera_monitor_30000
    info: log_execute:  executing - rsc:galera action:promote call_id:149
  notice: operation_finished:   galera_promote_0:510:stderr [ ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES) ]
  notice: operation_finished:   galera_promote_0:510:stderr [ ocf-exit-reason:Unable to retrieve wsrep_cluster_status, verify check_user 'root' has permissions to view status ]
  notice: operation_finished:   galera_promote_0:510:stderr [ ocf-exit-reason:local node <overcloud-controller-0> is started, but not in primary mode. Unknown state. ]
  notice: operation_finished:   galera_promote_0:510:stderr [ ocf-exit-reason:Failed initial monitor action ]

Comment 3 David Hill 2018-07-26 18:52:52 UTC
[root@overcloud-controller-0 etc]# paunch list | grep step2 | grep galera
| tripleo_step2 | mysql_init_bundle             | 192.168.12.16:8787/rhosp13/openstack-mariadb:13.0-47                   | /docker_puppet_apply.sh 2 file,file_line,concat,augeas,pacemaker::resource::bundle,pacemaker::property,pacemaker::resource::ocf,pacemaker::constraint::order,pacemaker::constraint::colocation,galera_ready,mysql_database,mysql_grant,mysql_user include ::tripleo::profile::base::pacemaker;include ::tripleo::profile::pacemaker::database::mysql_bundle  | running |
| tripleo_step2 | mysql_restart_bundle          | 192.168.12.16:8787/rhosp13/openstack-mariadb:13.0-47                   | /usr/bin/bootstrap_host_exec mysql if /usr/sbin/pcs resource show galera-bundle; then /usr/sbin/pcs resource restart --wait=600 galera-bundle; echo "galera-bundle restart invoked"; fi                                                                                                                                                                      | exited  |

Comment 4 David Hill 2018-07-26 18:56:48 UTC
So basically, this happens when we scale up the overcloud by one HCI compute node. It looks like mysql won't allow root to connect. We can reset the password, and it will work again until the next scale-up.

Comment 5 David Hill 2018-07-26 18:59:16 UTC
By the way, we can trigger this bug by simply re-running the "openstack overcloud deploy" command without changing anything.

Comment 11 Alex Stupnikov 2018-08-23 10:55:38 UTC
We hit the same issue on the latest RHOSP 13 environment. It is simple to reproduce: deploy the environment and configure fencing according to document [1].

This looks to me like a general issue that will affect many customers who modify their RHOSP 13 environments, so I would like to ask you to prioritize it accordingly.

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/advanced_overcloud_customization/sect-fencing_the_controller_nodes

BR, Alex.

Comment 21 Andrew Bays 2018-08-31 11:49:08 UTC
I am able to reproduce this in my development environment by simply re-running the overcloud deploy command after successfully deploying a cloud (that is, I make no changes to the templates).

Templates used are here: 

https://github.com/atyronesmith/ha-director-install/tree/ir-osp13-hci-sriov

Comment 23 Dave Cain 2019-01-11 14:27:31 UTC
(In reply to Michele Baldessari from comment #22)
> Thanks Andrew and Alex. I sat down and finally reproduced the password
> changing (aka point a) from c#13). I filed
> https://bugzilla.redhat.com/show_bug.cgi?id=1624462 as a separate bug to
> track this.

BZ 1624462 was closed as CLOSED ERRATA and resolved, yet this BZ, 1609022 remains open.  

Can we clarify if the issue of scaleout with HCI changing the Galera PW has been addressed and fixed?

Comment 24 Michele Baldessari 2019-01-11 14:46:40 UTC
(In reply to Dave Cain from comment #23)
> (In reply to Michele Baldessari from comment #22)
> > Thanks Andrew and Alex. I sat down and finally reproduced the password
> > changing (aka point a) from c#13). I filed
> > https://bugzilla.redhat.com/show_bug.cgi?id=1624462 as a separate bug to
> > track this.
> 
> BZ 1624462 was closed as CLOSED ERRATA and resolved, yet this BZ, 1609022
> remains open.  
> 
> Can we clarify if the issue of scaleout with HCI changing the Galera PW has
> been addressed and fixed?

If you don't change the password explicitly, there should be no issue. Changing the root password of mysql is currently problematic due to our switch to containers.

If something other than using custom_plans (which was fixed in BZ1624462) is changing the mysql root password without the operator knowing, we need to investigate it because it is a separate issue.
That is, we'd like to have the content of the swift plan before and after the problematic deploy to understand a bit more about what is going on.
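
The before/after comparison requested here could be sketched as follows. This is only an illustration, not an official tool: it assumes each plan was exported and unpacked into its own directory with the commands from the description (openstack overcloud plan export overcloud; tar xf overcloud.tar.gz), and the function names plan_root_passwords and plan_password_unchanged are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print every non-empty MysqlRootPassword value found
# in an unpacked plan directory, one per line, sorted and de-duplicated.
plan_root_passwords() {
    local plan_dir="$1"
    grep -rhE '^[[:space:]]*MysqlRootPassword:[[:space:]]*[^[:space:]]' "$plan_dir" \
        | sed -E "s/^[[:space:]]*MysqlRootPassword:[[:space:]]*//; s/[[:space:]]+\$//; s/^['\"](.*)['\"]\$/\1/" \
        | sort -u
}

# Hypothetical helper: exit 0 when both plan directories carry the same
# MysqlRootPassword value(s), non-zero otherwise.
plan_password_unchanged() {
    [ "$(plan_root_passwords "$1")" = "$(plan_root_passwords "$2")" ]
}

# Usage sketch (directory names are assumptions):
#   plan_password_unchanged plan-before plan-after \
#       || echo "MysqlRootPassword differs between the two plans"
```

A difference between the two directories would point at something rewriting the password in the plan; identical values would suggest looking elsewhere, for example at the credentials inside the running container.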

Comment 25 Damien Ciabrini 2019-01-30 21:05:48 UTC
Two reviews merged upstream [1,2] to enable root password updates in a containerized environment.
I'm still working on a last review [3] to polish the support before starting the backport.

[1] https://review.openstack.org/#/c/602499/
[2] https://review.openstack.org/#/c/602969/
[3] https://review.openstack.org/#/c/633768/

Comment 26 Damien Ciabrini 2019-02-08 14:38:51 UTC
All reviews merged upstream; starting the backport to upstream stable releases and downstream.

Comment 27 Damien Ciabrini 2019-02-23 15:14:48 UTC
Fixed in upstream Queens with the 5 gerrit reviews attached to this BZ.

Comment 35 pkomarov 2019-04-16 08:59:20 UTC
Verified, 

[stack@undercloud-0 ~]$  rhos-release -L
Installed repositories (rhel-7.6):
  13
  ceph-3
  ceph-osd-3
  rhel-7.6
[stack@undercloud-0 ~]$ cat core_puddle_version 
2019-04-10.1

verification as per https://bugzilla.redhat.com/show_bug.cgi?id=1674070#c4 :
HA Overcloud:
1)

cat >password.yaml<<EOF
parameter_defaults:
    MysqlRootPassword: 'anewpassword'
EOF

sed -i 's/--log.*/-e \/home\/stack\/password.yaml /g' ./overcloud_deploy.sh

./overcloud_deploy.sh |& tee deploy.out

Ansible passed.
Overcloud configuration completed.
Overcloud Endpoint: https://10.0.0.101:13000
Overcloud Horizon Dashboard URL: https://10.0.0.101:443/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"mysql -uroot -panewpassword -e 'select 1;'"
 [WARNING]: Found both group and host with same name: undercloud

controller-2 | SUCCESS | rc=0 >>
1
1

controller-0 | SUCCESS | rc=0 >>
1
1

controller-1 | SUCCESS | rc=0 >>
1
1

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"grep -re mysql::server::root_password /etc/puppet/hieradata"
 [WARNING]: Found both group and host with same name: undercloud

controller-1 | SUCCESS | rc=0 >>
/etc/puppet/hieradata/service_configs.json:    "mysql::server::root_password": "anewpassword",

controller-2 | SUCCESS | rc=0 >>
/etc/puppet/hieradata/service_configs.json:    "mysql::server::root_password": "anewpassword",

controller-0 | SUCCESS | rc=0 >>
/etc/puppet/hieradata/service_configs.json:    "mysql::server::root_password": "anewpassword",

2)

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"docker ps|grep galera"


 [WARNING]: Found both group and host with same name: undercloud

controller-2 | SUCCESS | rc=0 >>
b71a4dfeef64        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   35 minutes ago      Up 35 minutes                                galera-bundle-docker-2

controller-1 | SUCCESS | rc=0 >>
4f3ec9660e54        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   35 minutes ago      Up 35 minutes                                galera-bundle-docker-1

controller-0 | SUCCESS | rc=0 >>
9407b9ab2470        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   35 minutes ago      Up 35 minutes                                galera-bundle-docker-0

. stackrc ; ./overcloud_deploy.sh |& tee deploy.out

Ansible passed.
Overcloud configuration completed.
Overcloud Endpoint: https://10.0.0.101:13000
Overcloud Horizon Dashboard URL: https://10.0.0.101:443/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed

(undercloud) [stack@undercloud-0 ~]$ ansible controller -b -mshell -a"docker ps|grep galera"
 [WARNING]: Found both group and host with same name: undercloud

controller-1 | SUCCESS | rc=0 >>
4f3ec9660e54        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   About an hour ago   Up About an hour                             galera-bundle-docker-1

controller-2 | SUCCESS | rc=0 >>
b71a4dfeef64        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   About an hour ago   Up About an hour                             galera-bundle-docker-2

controller-0 | SUCCESS | rc=0 >>
9407b9ab2470        192.168.24.1:8787/rhosp14/openstack-mariadb:pcmklatest                       "/bin/bash /usr/lo..."   About an hour ago   Up About an hour                             galera-bundle-docker-0
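
The manual comparison in step 2 (same galera container IDs before and after the second deploy) could be automated roughly as below. This is a sketch under assumptions: the helper name galera_container_ids is hypothetical, and it only relies on the `docker ps` column layout shown above (ID in the first column, container name in the last).

```shell
#!/bin/bash
# Hypothetical helper (not part of TripleO): read `docker ps` output on stdin,
# optionally wrapped in ansible's per-host banners, and print
# "<container name> <container id>" for every galera bundle container.
# Sorting makes the result independent of the order in which hosts reply.
galera_container_ids() {
    awk '/galera-bundle/ { print $NF, $1 }' | sort
}

# Usage sketch, reusing the commands from the verification above:
#   ansible controller -b -mshell -a"docker ps|grep galera" > before.txt
#   ./overcloud_deploy.sh |& tee deploy.out
#   ansible controller -b -mshell -a"docker ps|grep galera" > after.txt
#   [ "$(galera_container_ids < before.txt)" = "$(galera_container_ids < after.txt)" ] \
#       && echo "galera containers kept their IDs"
```

Unchanged IDs only show that the containers were not recreated by the redeploy; the "Up About an hour" status column in the output above is what indicates they were not restarted either.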


Undercloud:

1) 

sed -i 's/undercloud_db_password=.*/undercloud_db_password=anewpassword/g'  undercloud-passwords.conf

 openstack undercloud install |& tee uc.out

#############################################################################
Undercloud install complete.

(undercloud) [stack@undercloud-0 ~]$ mysql -uroot -panewpassword -e 'select 1;'
+---+
| 1 |
+---+
| 1 |
+---+

2)

sed -i 's/undercloud_db_password=.*/undercloud_db_password=yetanotherone/g'  undercloud-passwords.conf

openstack undercloud upgrade |& tee uc_upgrade.out

#############################################################################
Undercloud upgrade complete.

Comment 38 errata-xmlrpc 2019-04-30 17:27:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0939

Comment 39 David Hill 2019-05-02 13:11:59 UTC
*** Bug 1705523 has been marked as a duplicate of this bug. ***

