Red Hat Bugzilla – Bug 249509
ring id not stored properly in all cases
Last modified: 2016-04-26 10:18:14 EDT
Description of problem:
In some cases (crash or kill of a node) the ring id for a node that has reached
consensus is not stored in the commit phase before the commit token is sent.
Instead it is saved when the operational state is entered. This leaves a race
where the new ring id for the processor could be lost between the sending of the
commit token and the entering of the operational phase.
Also, the ring id is not fsync'ed, so it may not yet have reached stable
storage when the commit token is originated.
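To illustrate the durability point, here is a minimal sketch of writing a ring id and forcing it to disk before anything else happens. It is not taken from the attached patch: the function names ring_id_store and ring_id_load, the file layout (a single 64-bit sequence number) and the path are all assumptions made for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): write the ring sequence
 * number to `path` and force it to stable storage.
 * Returns 0 on success, -1 on failure. */
int ring_id_store(const char *path, uint64_t ring_seq)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    ssize_t written = write(fd, &ring_seq, sizeof(ring_seq));
    if (written != (ssize_t)sizeof(ring_seq)) {
        close(fd);
        return -1;
    }

    /* Without fsync() the data may still sit in the page cache and be
     * lost if the node crashes right after the commit token is sent. */
    if (fsync(fd) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Hypothetical counterpart: read the stored ring sequence number back.
 * Returns 0 on success, -1 on failure. */
int ring_id_load(const char *path, uint64_t *ring_seq)
{
    assert(path != NULL && ring_seq != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    ssize_t got = read(fd, ring_seq, sizeof(*ring_seq));
    close(fd);
    return got == (ssize_t)sizeof(*ring_seq) ? 0 : -1;
}
```

A caller would invoke ring_id_store() and check its return value before originating the commit token. A more careful version would probably also fsync the containing directory so that a newly created file survives a crash.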
Version-Release number of selected component (if applicable):
How reproducible:
Very difficult to reproduce; it took 15+ hours of mp5 run time.
Steps to Reproduce:
1. Run mp5
Actual results:
Ring id is stored when entering operational phase.
Expected results:
Ring id should be stored when the commit token is forwarded for the first time.
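The expected ordering can be sketched as follows. This is an illustration only, not the attached patch: struct member, forward_commit_token and the step log are hypothetical names, and the store and send steps are stand-ins that merely record the order in which they would run.

```c
#include <stdbool.h>
#include <stdint.h>

/* Steps recorded by the sketch so the ordering can be inspected. */
enum step { STEP_STORE = 1, STEP_SEND = 2 };

#define MAX_STEPS 8

/* Hypothetical per-node state for the illustration. */
struct member {
    uint64_t ring_seq;        /* ring id sequence number */
    bool ring_id_stored;      /* set once the new ring id has been saved */
    enum step log[MAX_STEPS]; /* order in which steps were executed */
    int log_len;
};

static void record(struct member *m, enum step s)
{
    if (m->log_len < MAX_STEPS)
        m->log[m->log_len++] = s;
}

/* Store the new ring id the first time the commit token is forwarded,
 * and only then send the token on. */
void forward_commit_token(struct member *m, uint64_t new_ring_seq)
{
    if (!m->ring_id_stored) {
        m->ring_seq = new_ring_seq;
        record(m, STEP_STORE); /* stand-in for a durable write plus fsync */
        m->ring_id_stored = true;
    }
    record(m, STEP_SEND);      /* stand-in for sending the commit token */
}
```

With this ordering, a crash after the token has left the node can no longer lose the new ring id, because the store step always precedes the first send.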
Additional info:
Patch attached to fix the problem.
Created attachment 159909
Patch to fix the problem.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
release.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.