Red Hat Bugzilla – Bug 1462985
Cassandra pods may enter the ready state prematurely when multiple pods are restarted with new ip addresses
Last modified: 2017-10-05 14:51:51 EDT
Description of problem:
In certain situations, when we scale down Cassandra and bring it back up again, the cluster will not properly form.
The pods in the cluster try to join the older, no longer valid IP addresses instead of connecting to the pods in the newly sized cluster.
This problem occurs when you scale Cassandra up and back down (e.g. to two pods and then back to one) and when you scale the Cassandra pods to zero and then create new ones.
It is very difficult to reproduce on OpenShift, but it seems to be related to a bug that was logged earlier (https://bugzilla.redhat.com/show_bug.cgi?id=1459345).
(In reply to Guilherme Baufaker Rêgo from comment #2)
> This problem occurs when you scale up cassandra (eg: Two pods and then scale
> it back to one) and when you scale cassandra pods to zero and then create a
> new ones.
This would be another situation. We had an issue opened to clarify in the docs what you need to do to scale Cassandra pods up and down. You can't scale down without encountering problems.
> It is very difficult to reproduce it on Openshift, but it seems to be
> related to the bug that logged
It's not related to this.
I am lowering the priority of this issue.
When you bring the Cassandra pods back up, they will for a moment try to connect to the old IP addresses, but they will eventually resolve the proper IP addresses of the new pods.
There are a few things we need to figure out here:
1) Whether there is some option we can give to Cassandra so that it only uses the seed list when determining what else is in the cluster and ignores any IP addresses it knew about in the past.
2) We should update our readiness probes to take this situation into account. Currently the probes go into the ready state while the cluster is still trying to figure things out.
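For point 2, a stricter readiness check could look something like the sketch below. This is only an illustration, not what the probe currently does: it assumes the probe can capture the text output of `nodetool status` and knows the pod's own IP address. The names `parse_status` and `is_ready` and the sample addresses are hypothetical.

```python
import re

# Node rows in `nodetool status` output start with a two-letter code:
# first letter U(p)/D(own), second N(ormal)/L(eaving)/J(oining)/M(oving).
NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def parse_status(output):
    """Return {address: state} for every node row in the given output."""
    nodes = {}
    for line in output.splitlines():
        match = NODE_ROW.match(line.strip())
        if match:
            state, address = match.groups()
            nodes[address] = state
    return nodes


def is_ready(output, own_address):
    """Hypothetical readiness rule: this node is Up/Normal and it does not
    see any peer in another state (e.g. a stale entry for an old pod IP)."""
    nodes = parse_status(output)
    if nodes.get(own_address) != "UN":
        return False
    return all(state == "UN" for state in nodes.values())


# Illustrative output only (made-up addresses, columns abbreviated).
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID  Rack
UN  10.1.2.3   1.2 MiB   256     ?     aaaa     rack1
DN  10.1.0.9   ?         256     ?     bbbb     rack1
"""

# The stale DN entry keeps the pod out of the ready state.
assert is_ready(SAMPLE, "10.1.2.3") is False
```

One trade-off with this rule: a peer that is genuinely down would also keep every other pod out of the ready state, so a real probe would probably need a timeout or a way to tell stale addresses from failed nodes.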
This will be resolved when we move over to stateful sets for our Cassandra pods.
Deferring as this should be resolved when we move to stateful sets.