Bug 1235458
Summary: [mariadb-galera]: powering off a node corrupts the DB; the corruption prevents Galera from starting, so the node will not rejoin the cluster after it is powered back on.
Product: Red Hat OpenStack
Component: mariadb-galera
Version: 7.0 (Kilo)
Target Release: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Status: CLOSED NOTABUG
Severity: high
Priority: high
Keywords: Reopened
Reporter: Omri Hochman <ohochman>
Assignee: Michael Bayer <mbayer>
QA Contact: yeylon <yeylon>
CC: jschluet, lhh, mcornea, rohara, srevivo, yeylon
Type: Bug
Doc Type: Bug Fix
Last Closed: 2015-06-25 17:18:49 UTC
Description
Omri Hochman
2015-06-24 21:09:27 UTC
Created attachment 1042832 [details]: mariadb.log (full mariadb.log)
My initial take on this, as discussed on IRC, is that the power-off caused a single mariadb node to become corrupted in some way. The established approach to re-introducing a failed node into the cluster is to repair the corruption on the failed node first. In this case, the affected tables are the MyISAM tables "user" and "db", which are not handled by Galera. The operator should log into the console of this database specifically and run the "REPAIR TABLE" command, documented at https://dev.mysql.com/doc/refman/5.1/en/repair-table.html. In other words, this corruption is not a bug; it is a known behavior of MySQL/MariaDB with an established solution. At that point, the node should be able to rejoin the cluster, perhaps after a restart. Deleting grastate.dat will ensure the node does a full SST when it rejoins. From my POV this is a "worksforme".

To confirm that the mysql.user and mysql.db tables are in fact MyISAM, and that they are not replicated by Galera, see https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/:

"Currently replication works only with the InnoDB storage engine. Any writes to tables of other types, including system (mysql.*) tables are not replicated (this limitation excludes DDL statements such as CREATE USER, which implicitly modify the mysql.* tables — those are replicated). There is however experimental support for MyISAM - see the wsrep_replicate_myisam system variable."

This only matters because we can confirm that these two tables are corrupted in an ordinary way: just run "REPAIR TABLE" on them and restart.

Please reopen if the resolution doesn't work for you or there are other compounding factors. Thanks!

I tried to recover the broken node from this condition and encountered some factors that required workarounds. The end result was that the node rejoined the cluster, but can you confirm that the steps I followed are correct?
I couldn't log in to the console, since the server was only accepting connections from localhost via the file socket:

```
[root@overcloud-controller-0 ~]# mysql
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
[root@overcloud-controller-0 ~]# ps axu | grep sql
mysql     1342  0.0  0.0  115348   1700 ?      Ss  08:35  0:00 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
mysql     3583  0.1  1.2 1490632 104136 ?      Sl  08:35  0:01 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --wsrep-provider=/usr/lib64/galera/libgalera_smm.so --log-error=/var/log/mariadb/mariadb.log --open-files-limit=-1 --pid-file=/var/run/mariadb/mariadb.pid --socket=/var/lib/mysql/mysql.sock --port=3306 --wsrep_start_position=f66ffed3-1b32-11e5-a6b0-b25242c6d09d:4365
root     25412  0.0  0.0  112644    928 pts/0  S+  08:50  0:00 grep --color=auto sql
```

To work around this I added 'innodb_force_recovery = 1' to /etc/my.cnf.d/galera.cnf and restarted the mariadb service. After this I could access the console and run the repair steps:

```
SET wsrep_on=OFF;
REPAIR TABLE mysql.user;
REPAIR TABLE mysql.db;
```

Next I ran:

```
systemctl stop mariadb
rm /var/lib/mysql/grastate.dat
```

removed 'innodb_force_recovery = 1' from /etc/my.cnf.d/galera.cnf again, and then:

```
crm_resource -C -r galera-master
```

After running these steps the node rejoined the cluster. Can you please confirm that the steps I followed are correct? Thanks.

Absolutely. Once a *single* mariadb node is corrupted or otherwise unable to start, the general steps are:

1. Get that mariadb node to start as an independent database service first. Any special commands or startup recovery flags needed to make this happen are OK. It is usually even OK to rebuild the data directory of this node completely from scratch, because it will get all the current data and user accounts from the other nodes when it rejoins the cluster in any case. In other words, there's nothing you need to preserve there; a brand-new MariaDB service could join your cluster just as easily (*as long as the rest of the cluster is still running fine* — if you've lost all nodes, or corruption has hit all or most of them, that's a different ballgame).

2. Delete grastate.dat on the node that had the problem. This effectively means it will unconditionally have all of its data replaced by the other nodes when it rejoins the cluster. That is, all the InnoDB datafiles we might have been worried about in step 1 are going to be overwritten via an rsync in any case.

3. Restart the node again; it should join the cluster and be synchronized.

In an HA environment, there is no guarantee that a failed node is clean and able to rejoin the cluster (or even boot) — the HA software is there to provide continuity of service from the other hosts.

We should probably note these recovery steps somewhere.
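For reference, the recovery discussed in this report can be consolidated into one sketch. This is not an official procedure from this bug's resolution, just the steps described above in order; it assumes a systemd-managed mariadb service, the datadir /var/lib/mysql, the config file /etc/my.cnf.d/galera.cnf, and the pacemaker resource name galera-master, all as shown in the comments. Recovery flags and resource names will vary per deployment, and the surviving cluster members must be healthy.

```
# Sketch of single-node Galera recovery, assembled from the steps above.

# 1. Get mysqld on the failed node to start standalone. If InnoDB refuses
#    to start, a temporary recovery flag (as used in this report) may be
#    needed in /etc/my.cnf.d/galera.cnf:
#      [mysqld]
#      innodb_force_recovery = 1
systemctl restart mariadb

# Repair the corrupted MyISAM system tables locally. SET wsrep_on=OFF keeps
# the repair from being replicated (mysql.* tables are outside Galera anyway).
mysql <<'SQL'
SET wsrep_on=OFF;
REPAIR TABLE mysql.user;
REPAIR TABLE mysql.db;
SQL

# 2. Stop the service and delete grastate.dat so the node unconditionally
#    takes a full state snapshot transfer (SST) from a donor on rejoin.
systemctl stop mariadb
rm -f /var/lib/mysql/grastate.dat
# (Remove innodb_force_recovery from galera.cnf again before restarting.)

# 3. Rejoin the cluster. Under pacemaker, clear the resource's failure
#    state so it is restarted on this node.
crm_resource --cleanup --resource galera-master
```

The last command is the long-option form of the `crm_resource -C -r galera-master` used above.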