Description of problem:
During major upgrades of the overcloud from OSP11 to OSP12, one of the upgrade tasks for HA services deletes the existing pacemaker resources (e.g. galera-master). We have noticed that in some cases (e.g. an overcloud with services split across dedicated servers) the resource deletion task is triggered and returns a successful rc, but the resource is not deleted from the CIB. The logs show that this happens when a concurrent operation, for instance a resource cleanup, is scheduled in pacemaker at the same time as the deletion. This is because "pcs delete" is not an atomic action, so any concurrent action on the resource can affect whether the deletion succeeds.

Version-Release number of selected component (if applicable):

How reproducible:
Randomly

Steps to Reproduce:
1. Deploy OSP11 on composable HA (split services on specific nodes)
2. Upgrade to OSP12
3.

Actual results:
Sometimes old OSP11 resources are not deleted and this breaks the creation of new containerized resources, so the OSP12 upgrade fails.

Expected results:
OSP12 upgrade should succeed

Additional info:
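Since the deletion can return a successful rc while the resource is still in the CIB, one possible mitigation is to verify the result and retry. The following is only a minimal sketch of that idea, not the fix that was merged. The helper names (`run`, `resource_in_cib`, `delete_resource`), the retry count, the delay, and the XPath used for the CIB lookup are all assumptions made for illustration. It relies on `pcs resource delete` and on `cibadmin --query --xpath` returning a non-zero exit code when nothing matches.

```python
import subprocess
import time


def run(cmd):
    """Run a command quietly and return its exit code."""
    return subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def resource_in_cib(resource_id, runner=run):
    """Return True if a resource with this id is still defined in the CIB.

    Assumption: a non-zero exit code from cibadmin means "no match".
    A cluster that is down also yields non-zero, so real code should
    distinguish those cases.
    """
    xpath = "//resources//*[@id='{}']".format(resource_id)
    return runner(["cibadmin", "--query", "--xpath", xpath]) == 0


def delete_resource(resource_id, retries=5, delay=10,
                    runner=run, sleep=time.sleep):
    """Delete a resource and confirm it is gone, retrying if needed.

    The rc of the delete command is deliberately ignored: per this bug,
    it can be 0 even though the resource is still present.
    """
    for _ in range(retries):
        runner(["pcs", "resource", "delete", resource_id])
        if not resource_in_cib(resource_id, runner):
            return True
        sleep(delay)
    return False
```

For example, `delete_resource("galera-master")` would return True only once the resource no longer appears in the CIB, and False if it is still present after all retries. The `runner` and `sleep` parameters exist so the logic can be exercised without a running cluster.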
Fix [1] proposed and merged upstream.

[1] https://review.gerrithub.io/#/c/382117/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3462