| Summary: | Galera resource agent cannot connect to custom host/port | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Damien Ciabrini <dciabrin> | |
| Component: | resource-agents | Assignee: | Damien Ciabrini <dciabrin> | |
| Status: | CLOSED ERRATA | QA Contact: | Asaf Hirshberg <ahirshbe> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 7.3 | CC: | agk, ahirshbe, cfeist, cluster-maint, dciabrin, fdinitto, mbayer, mkolaja, oalbrigt, royoung, snagar | |
| Target Milestone: | rc | Keywords: | ZStream | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | resource-agents-3.9.5-56.el7 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1304711 (view as bug list) | Environment: | ||
| Last Closed: | 2016-11-04 00:00:47 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1304711 | |||
|
Description
Damien Ciabrini
2016-01-18 10:05:04 UTC
fixed upstream in https://github.com/ClusterLabs/resource-agents/pull/680
Hi
We need instructions on how to test it in OSPd
Thanks
Ofer
Instruction for test
While galera cluster is up, create a new user for the test
create user 'testuser'@'%' identified by 'test';
Allow user to log in to a tcp socket
grant all privileges on *.* to 'testuser'@'%';
Stop galera.
pcs resource disable galera
Once stopped, change /etc/my.cnf.d/galera.cnf to have mysql listen to
port 4242 on an available interface:
bind_address=0.0.0.0
port=4242
Then configure /etc/sysconfig/clustercheck to make the resource
agent contact mysqld on the tcp port above. Modify the entries so
that the file looks like:
MYSQL_USERNAME="testuser"
MYSQL_PASSWORD="test"
MYSQL_HOST="127.0.0.1"
MYSQL_PORT="4242"
(don't set MYSQL_HOST to "localhost" otherwise mysql will try to
connect via UNIX socket)
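As an illustration of how the four entries above could translate into a TCP connection, here is a minimal sketch that turns a clustercheck-style file into mysql client options. The function name `mysql_options_from` is hypothetical and this is not the resource agent's actual code; only the variable names come from the file shown above, and `--user`, `--password`, `--host` and `--port` are standard mysql client options.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: read a clustercheck-style
# file and print the matching mysql client options.
# Assumes values contain no whitespace.
mysql_options_from() {
    conf=$1
    # Subshell: variables sourced from the file do not leak into the caller.
    (
        unset MYSQL_USERNAME MYSQL_PASSWORD MYSQL_HOST MYSQL_PORT
        if [ -r "$conf" ]; then
            . "$conf"
        fi
        opts=""
        if [ -n "$MYSQL_USERNAME" ]; then opts="$opts --user=$MYSQL_USERNAME"; fi
        if [ -n "$MYSQL_PASSWORD" ]; then opts="$opts --password=$MYSQL_PASSWORD"; fi
        if [ -n "$MYSQL_HOST" ]; then opts="$opts --host=$MYSQL_HOST"; fi
        if [ -n "$MYSQL_PORT" ]; then opts="$opts --port=$MYSQL_PORT"; fi
        # Drop the leading space left by the first appended option.
        echo "${opts# }"
    )
}
```

With the test values, `mysql $(mysql_options_from /etc/sysconfig/clustercheck) -e 'select 1'` would connect to 127.0.0.1 on port 4242 over TCP. Passing the password on the command line is acceptable only for a throwaway test user like this one.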
Start galera again.
pcs resource enable galera
After all machines are marked as master, verify that mysqld is listening
on port 4242.
netstat -tnlp | grep mysqld
If so, the test is working as expected since the resource agent polls
mysqld regularly to ensure that galera has started properly and is
still running.
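The listening check can also be scripted. The sketch below is a hypothetical helper (`mysqld_listens_on` is not part of any package) that reads `netstat -tnlp` style output on stdin and succeeds only if a mysqld socket is in LISTEN state on the given port. It assumes the usual column layout, with the local address in field 4, the state in field 6 and PID/program in field 7.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Usage: netstat -tnlp | mysqld_listens_on 4242
mysqld_listens_on() {
    port=$1
    awk -v port="$port" '
        # Match a LISTEN socket whose local address ends in ":<port>"
        # and whose owning program is mysqld.
        $6 == "LISTEN" && $4 ~ (":" port "$") && $7 ~ /mysqld$/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `netstat -tnlp | mysqld_listens_on 4242 && echo "mysqld is on 4242"` prints the message only when the listener is present.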
For cleaning up, stop galera, reset the two config files to their
original values, and restart galera. Once started, you can delete the
test user
drop user 'testuser'@'%';
After following Damien Ciabrini's instructions in comment 9, the end result is that mysql is listening on the given port:
[root@overcloud-controller-0 ~]# netstat -tnlp | grep mysqld
tcp 0 0 0.0.0.0:4242 0.0.0.0:* LISTEN 20232/mysqld
All machines were marked as masters, but all openstack resources managed by pacemaker are in a failed state:
Master/Slave Set: galera-master [galera]
Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Failed Actions:
* openstack-nova-scheduler_start_0 on overcloud-controller-0 'not running' (7): call=493, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:18 2016', queued=0ms, exec=134380ms
* openstack-heat-engine_start_0 on overcloud-controller-0 'not running' (7): call=507, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:36 2016', queued=0ms, exec=2108ms
* openstack-cinder-scheduler_monitor_60000 on overcloud-controller-0 'not running' (7): call=483, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:12:11 2016', queued=0ms, exec=0ms
* openstack-cinder-api_monitor_60000 on overcloud-controller-0 'not running' (7): call=476, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:12:09 2016', queued=0ms, exec=0ms
* neutron-server_start_0 on overcloud-controller-0 'not running' (7): call=472, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:06 2016', queued=0ms, exec=120568ms
* openstack-nova-scheduler_start_0 on overcloud-controller-1 'not running' (7): call=483, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:19 2016', queued=0ms, exec=134432ms
* openstack-heat-engine_start_0 on overcloud-controller-1 'not running' (7): call=497, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:36 2016', queued=0ms, exec=2122ms
* openstack-cinder-scheduler_monitor_60000 on overcloud-controller-1 'not running' (7): call=475, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:12:11 2016', queued=0ms, exec=0ms
* openstack-cinder-api_monitor_60000 on overcloud-controller-1 'not running' (7): call=468, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:12:09 2016', queued=0ms, exec=0ms
* neutron-server_start_0 on overcloud-controller-1 'not running' (7): call=464, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:06 2016', queued=0ms, exec=120717ms
* openstack-nova-scheduler_start_0 on overcloud-controller-2 'not running' (7): call=474, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:19 2016', queued=0ms, exec=134380ms
* openstack-heat-engine_start_0 on overcloud-controller-2 'not running' (7): call=488, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:36 2016', queued=0ms, exec=2108ms
* openstack-cinder-scheduler_monitor_60000 on overcloud-controller-2 'not running' (7): call=466, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:12:11 2016', queued=0ms, exec=0ms
* openstack-cinder-api_monitor_60000 on overcloud-controller-2 'not running' (7): call=459, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:12:09 2016', queued=0ms, exec=0ms
* neutron-server_start_0 on overcloud-controller-2 'not running' (7): call=455, status=complete, exitreason='none', last-rc-change='Tue Mar 8 10:09:06 2016', queued=0ms, exec=120576ms
After deleting the changes, all resources started. So I want to make sure that this is the expected result of the changes made.
System info:
RHEL-OSP director 8.0 puddle - 2016-03-03.1
[root@overcloud-controller-0 ~]# rpm -qa|grep resource-agents
resource-agents-3.9.5-54.el7_2.6.x86_64
Asaf, I think some steps are missing in #c9. Could you please check whether haproxy is listening as expected on port 3306 on the VIP, and that the haproxy.cfg setting forwards requests to port 4242, i.e. the config looks similar to the lines below:
server overcloud-controller-0 192.0.2.21:4242 backup check fall 5 inter 2000 on-marked-down shutdown-sessions port 9200 rise 2
server overcloud-controller-1 192.0.2.20:4242 backup check fall 5 inter 2000 on-marked-down shutdown-sessions port 9200 rise 2
server overcloud-controller-2 192.0.2.22:4242 backup check fall 5 inter 2000 on-marked-down shutdown-sessions port 9200 rise 2
Verified on RHEL-OSP director 8.0 puddle - 2016-03-03.1
[root@overcloud-controller-0 ~]# rpm -qa|grep resource-agents
resource-agents-3.9.5-54.el7_2.6.x86_64
1) Create a new user in mysql and grant it all privileges:
i) create user 'testuser'@'%' identified by 'test';
ii) grant all privileges on *.* to 'testuser'@'%';
2) Stop galera cluster using pacemaker:
i) pcs resource disable galera
3) After galera is down, change the port mysql listens on in /etc/my.cnf.d/galera.cnf:
i) vi /etc/my.cnf.d/galera.cnf :
bind_address=0.0.0.0
port=4242
4) Then configure /etc/sysconfig/clustercheck to make the resource
agent contact mysqld on the tcp port above. Modify the entries so
that the file looks like:
i) vi /etc/sysconfig/clustercheck
MYSQL_USERNAME="testuser"
MYSQL_PASSWORD="test"
MYSQL_HOST="127.0.0.1"
MYSQL_PORT="4242"
5) Also change the ports haproxy forwards to in the mysql section of /etc/haproxy/haproxy.cfg:
i) vi /etc/haproxy/haproxy.cfg
listen mysql
bind 172.17.0.11:3306 transparent
option tcpka
option httpchk
stick on dst
stick-table type ip size 1000
timeout client 90m
timeout server 90m
server overcloud-controller-0 172.17.0.16:4242 backup check fall 5 inter 2000 on-marked-down shutdown-sessions port 9200 rise 2
server overcloud-controller-1 172.17.0.17:4242 backup check fall 5 inter 2000 on-marked-down shutdown-sessions port 9200 rise 2
server overcloud-controller-2 172.17.0.15:4242 backup check fall 5 inter 2000 on-marked-down shutdown-sessions port 9200 rise 2
6) Restart haproxy service:
i) systemctl restart haproxy
ii) systemctl status haproxy
7) Start galera cluster using pacemaker:
i) pcs resource enable galera
8) Verify that the galera cluster is up and machines are marked as master, also make sure that all openstack-resources are started.
i) pcs status
ii) netstat -tupln |grep haproxy # haproxy should listen to the old port
iii) clustercheck
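Because the earlier failed run came down to haproxy still forwarding to the old port, it may help to check the backend ports in step 5 before restarting haproxy. The sketch below is a hypothetical helper (`mysql_backend_ports` is not an haproxy tool) that prints the port of every `server` line in the `listen mysql` section of an haproxy.cfg-style file. It assumes the simple layout shown in step 5, with the address:port in the third field.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Usage: mysql_backend_ports /etc/haproxy/haproxy.cfg
mysql_backend_ports() {
    awk '
        # Any section keyword starts a new section; remember whether
        # the current one is "listen mysql".
        $1 ~ /^(listen|frontend|backend|defaults|global)$/ {
            in_mysql = ($1 == "listen" && $2 == "mysql")
            next
        }
        # Inside the mysql section, print the port of each server line.
        in_mysql && $1 == "server" {
            n = split($3, parts, ":")
            print parts[n]
        }
    ' "$1"
}
```

With the configuration from step 5, `mysql_backend_ports /etc/haproxy/haproxy.cfg | sort -u` should print a single line, `4242`. Any other value points at a backend that still uses the old port.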
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2174.html |