Description of problem: In OSP16 HA, one rabbitmq bundle is in stopped after disruption : simultaneous controller hard reset system does not recover on it's own Version-Release number of selected component (if applicable): RHOS_TRUNK-16.0-RHEL-8-20200213.n.1 Steps to Reproduce: 1. simultaneous controller hard reset (echo b>/proc/sysrq-trigger) 2. check all pacemaker for rabbitmq resource 3. Actual results: one rabbitmq resource is stopped Expected results: rabbitmq resource cluster is all up Additional info:
sosreports and all overcloud /var/log at : http://rhos-release.virt.bos.redhat.com/log/pkomarov_sosreports/BZ_1806248/
Bz discovered by Tobiko: tobiko.tests.faults.ha.test_cloud_recovery.RebootNodesTest.test_reboot_controllers_recovery Tobiko: https://tobiko.readthedocs.io/en/latest/