Rubygem-staypuft: HA: mariadb will fail to start because puppet is not adding "op start timeout=120s" to the configuration.

Try "pcs cluster stop && sleep 5 && pcs cluster start" on any node and mariadb will fail to start, because puppet is not adding "op start timeout=120s" to the configuration. galera/mariadb can take a decent amount of time to sync. This problem is solved by switching to the galera resource agent. We must switch galera to use the new how-to and resource agent.

See: http://rhel-ha.etherpad.corp.redhat.com/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-on-rhel7-db
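For reference, the missing operation timeout can also be added by hand to an already-deployed cluster. A minimal sketch, assuming the resource is named "mysqld" (check the actual name with `pcs resource` first):

```shell
# Add an explicit start timeout to the existing mysqld resource so that
# pacemaker allows galera/mariadb enough time to finish syncing on start.
# (The resource name "mysqld" is an assumption; adjust to your deployment.)
pcs resource update mysqld op start timeout=120s

# Verify the operation timeout took effect.
pcs resource show mysqld
```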
(In reply to Leonid Natapov from comment #0)
> Rubygem-staypuft: HA: mariadb will fail to start because puppet are not
> adding "op start timeout=120s" to the configuration.

Provide logs to support this. Specifically, show that pacemaker tried to recover because a node took longer than 60s to join the cluster.

> Try to pcs cluster stop && sleep 5 && pcs cluster start on any node and
> mariadb will fail to start because puppet are not adding "op start
> timeout=120s" to the configuration.

Running "pcs cluster stop && sleep 5 && pcs cluster start" worked fine for me.

> galera/mariadb can take a decent amount of time to sync. This problem is
> solved by switching to the galera resource agent.

Did you try this?

> must switch galera to use the new how-to and resource-agent
>
> See:
> http://rhel-ha.etherpad.corp.redhat.com/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-
> on-rhel7-db
(In reply to Ryan O'Hara from comment #4)
> (In reply to Leonid Natapov from comment #0)
> > Rubygem-staypuft: HA: mariadb will fail to start because puppet are not
> > adding "op start timeout=120s" to the configuration.
>
> Provide logs to support this. Specifically, show that pacemaker tried to
> recover because a node took longer than 60s to join the cluster.
>
> > Try to pcs cluster stop && sleep 5 && pcs cluster start on any node and
> > mariadb will fail to start because puppet are not adding "op start
> > timeout=120s" to the configuration.
>
> Running "pcs cluster stop && sleep 5 && pcs cluster start" worked fine for
> me.

There are too many variables involved here to say a simple stop/sleep/start will trigger it. It all depends on whether state snapshot transfer (SST) is occurring during the start and how large that transfer is. It also depends on whether galera instances on other nodes are attempting to sync with a donor node at the same time. I believe only one node can sync from a donor at a time, which might mean there is a period of time a galera instance blocks waiting to sync during the start operation. This would increase the chances of timing out during the start operation as well.

To be safe we should definitely set the timeout to at least 2 minutes. For larger databases the sync might take even longer. Even a 5 minute timeout, just to be safe, would not be unreasonable.

It is possible that managing galera with systemd forces us into a 60 second start window. I've seen instances where systemd enforces its own timeout value, which could conflict with pacemaker's timeout if pacemaker's timeout is longer.

If we are stuck supporting systemd management for now, increase the timeout to >= 120 seconds and make sure to also set the ordered=true metadata option.
Example:

pcs resource create db systemd:mariadb op start timeout=300s meta ordered=true --clone

The 'ordered=true' option guarantees pacemaker starts the galera instances in series instead of in parallel. This prevents the condition where two nodes attempt to SST from the same donor at the same time.

-- Vossel
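The systemd-side timeout concern above can be addressed independently of pacemaker with a drop-in unit, so systemd's own start timeout cannot fire before pacemaker's longer op start timeout. A sketch, assuming the unit is mariadb.service (the verification output later in this bug uses mysqld, so adjust the unit name accordingly):

```shell
# Raise systemd's own start timeout for mariadb via a drop-in, so it does
# not conflict with a longer pacemaker "op start" timeout.
# (Unit name "mariadb.service" is an assumption; it may be mysqld.service.)
mkdir -p /etc/systemd/system/mariadb.service.d
cat > /etc/systemd/system/mariadb.service.d/timeout.conf <<'EOF'
[Service]
TimeoutStartSec=300
EOF
systemctl daemon-reload
```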
(In reply to David Vossel from comment #5)
> (In reply to Ryan O'Hara from comment #4)
> > (In reply to Leonid Natapov from comment #0)
> > > Rubygem-staypuft: HA: mariadb will fail to start because puppet are not
> > > adding "op start timeout=120s" to the configuration.
> >
> > Provide logs to support this. Specifically, show that pacemaker tried to
> > recover because a node took longer than 60s to join the cluster.
> >
> > > Try to pcs cluster stop && sleep 5 && pcs cluster start on any node and
> > > mariadb will fail to start because puppet are not adding "op start
> > > timeout=120s" to the configuration.
> >
> > Running "pcs cluster stop && sleep 5 && pcs cluster start" worked fine for
> > me.
>
> There are too many variables involved with this to say a simple
> stop/sleep/start will trigger it. It all depends on if during the start
> syncing (SST) is occurring and how large the transfer is during the sync.
> It also depends if galera instances on other nodes are attempting to sync
> with a donor node at the same time. I believe only one node can sync from a
> donor at a time, which might mean there's a period of time a galera instance
> is blocking waiting to sync during the start operation. This would increase
> the chances of timing out during the start operation as well.

Right. But if we assume that the nodes are sync'd when the cluster is stopped, there will be no SST when the nodes rejoin on cluster start. So there isn't enough information in this bug to say that mariadb failed due to the start timeout being too short, etc.

You're right that a node can only be a donor for one joiner at a time, and yes, this could delay things if your bootstrap node is sync-ing node #2 (while node #3 waits). I am not convinced this is the case here.

> To be safe we should definitely set the timeout to at least 2 minutes. For
> larger databases the sync might even take longer. Even a 5 minute timeout
> just to be safe wouldn't be unreasonable.
I don't disagree, but I was also under the impression that an excessively long start delay is bad.

> It is possible that managing galera with systemd forces us into a 60 start
> window. I've see instances where systemd enforces its own timeout value,
> which could conflict with pacemakers timeout if pacemaker's timeout is
> longer.
>
> If we are stuck trying to support the systemd management for now. Increase
> the timeout to >= 120 seconds and make sure to also set the ordered=true
> metadata option.
>
> Example:
> pcs resource create db systemd:mariadb op start timeout=300s meta
> ordered=true --clone
>
> The 'ordered=true' option will guarantee pacemaker starts the galera
> instances in series instead of parallel. This will prevent the condition
> were two nodes are attempting to SST the same donor at the same time.

OK, that might be useful.
(In reply to Ryan O'Hara from comment #6)
> (In reply to David Vossel from comment #5)
> > (In reply to Ryan O'Hara from comment #4)
> > > (In reply to Leonid Natapov from comment #0)
> > > > Rubygem-staypuft: HA: mariadb will fail to start because puppet are not
> > > > adding "op start timeout=120s" to the configuration.
> > >
> > > Provide logs to support this. Specifically, show that pacemaker tried to
> > > recover because a node took longer than 60s to join the cluster.
> > >
> > > > Try to pcs cluster stop && sleep 5 && pcs cluster start on any node and
> > > > mariadb will fail to start because puppet are not adding "op start
> > > > timeout=120s" to the configuration.
> > >
> > > Running "pcs cluster stop && sleep 5 && pcs cluster start" worked fine for
> > > me.
> >
> > There are too many variables involved with this to say a simple
> > stop/sleep/start will trigger it. It all depends on if during the start
> > syncing (SST) is occurring and how large the transfer is during the sync.
> > It also depends if galera instances on other nodes are attempting to sync
> > with a donor node at the same time. I believe only one node can sync from a
> > donor at a time, which might mean there's a period of time a galera instance
> > is blocking waiting to sync during the start operation. This would increase
> > the chances of timing out during the start operation as well.
>
> Right. But if we assume that the nodes are sync'd when the cluster is
> stopped, there will be no SST when the nodes rejoin on cluster start. So
> there isn't enough information in this bug to say that mariadb failed due to
> start delay being too short, etc.

yep, we're running on theoretical assumptions here.

> You're right that a node can only be a donor for one joiner at a time, and
> yes this could delay things if your bootstrap node is sync-ing node #2
> (while node #3 waits). I am not convinced this is the case here.
>
> > To be safe we should definitely set the timeout to at least 2 minutes. For
> > larger databases the sync might even take longer. Even a 5 minute timeout
> > just to be safe wouldn't be unreasonable.
>
> I don't disagree, but I was also under the impression that excessively long
> start delay is bad.

Just to be clear, we're not talking about start-delay here. Everyone collectively erase start-delay from your memory. It was only a poor workaround for an issue with Pacemaker management of systemd... that's all behind us now :D

Long start timeouts in this case should be fine. Start will return when mariadb finishes the sync, so ideally the start timeout should never be observed except in failure conditions.

> > It is possible that managing galera with systemd forces us into a 60 start
> > window. I've see instances where systemd enforces its own timeout value,
> > which could conflict with pacemakers timeout if pacemaker's timeout is
> > longer.
> >
> > If we are stuck trying to support the systemd management for now. Increase
> > the timeout to >= 120 seconds and make sure to also set the ordered=true
> > metadata option.
> >
> > Example:
> > pcs resource create db systemd:mariadb op start timeout=300s meta
> > ordered=true --clone
> >
> > The 'ordered=true' option will guarantee pacemaker starts the galera
> > instances in series instead of parallel. This will prevent the condition
> > were two nodes are attempting to SST the same donor at the same time.
>
> OK that might be useful.
Patch posted: https://github.com/redhat-openstack/astapor/pull/347
Merged
openstack-foreman-installer-2.0.19-1.el6ost

[root@mac047d7b627d5a ~]# pcs resource show mysqld-clone
 Clone: mysqld-clone
  Resource: mysqld (class=systemd type=mysqld)
   Attributes: timeout=300s
   Meta Attrs: ordered=true
   Operations: monitor interval=30s (mysqld-monitor-interval-30s)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-1090.html