Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1351204

Summary: Migration to AODH fails, leaving unmanaged pacemaker resources
Product: Red Hat OpenStack
Reporter: Jiri Stransky <jstransk>
Component: openstack-tripleo-heat-templates
Assignee: Jiri Stransky <jstransk>
Status: CLOSED NOTABUG
QA Contact: Arik Chernetsky <achernet>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 9.0 (Mitaka)
CC: jason.dobies, jstransk, mburns, mcornea, morazi, rhel-osp-director-maint, tvignaud
Target Milestone: ---
Target Release: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-27 14:57:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1333977
Attachments (description, flags):
heat software deployment output, none
pcs status, none
corosync.log, none

Description Jiri Stransky 2016-06-29 13:05:27 UTC
Description of problem:

On the first attempt to run the AODH migration script, the migration fails and leaves all pacemaker resources unmanaged. (Would this result in fencing instead if fencing were enabled?)


Version-Release number of selected component (if applicable):

openstack-tripleo-heat-templates-2.0.0-12.el7ost.noarch

resource-agents-3.9.5-54.el7_2.10.x86_64
pacemaker-cluster-libs-1.1.13-10.el7_2.2.x86_64
pacemaker-1.1.13-10.el7_2.2.x86_64
pacemaker-remote-1.1.13-10.el7_2.2.x86_64
pacemaker-cli-1.1.13-10.el7_2.2.x86_64
pacemaker-libs-1.1.13-10.el7_2.2.x86_64
corosync-2.3.4-7.el7_2.1.x86_64
corosynclib-2.3.4-7.el7_2.1.x86_64


Workaround:

On one of the controllers, run `pcs property set maintenance-mode=false`, wait for the cluster to stabilize, and then re-run the AODH migration, which then succeeds.


I have hit the issue on every migration attempt so far, and the workaround has also worked every time.
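
The workaround above could be scripted roughly as follows. This is only a minimal sketch: it assumes that `pcs status` marks unmanaged resources with the word "unmanaged", and it treats the absence of that marker as "stable", which is a simplification. The function names, the attempt count, and the delay are hypothetical and are not part of the migration tooling.

```shell
#!/bin/sh
# Sketch of the workaround: leave maintenance mode, then wait until
# 'pcs status' no longer reports unmanaged resources.
# Assumption: unmanaged resources show up as "(unmanaged)" in 'pcs status'.

# Succeeds when the given status text mentions no unmanaged resource.
no_unmanaged_resources() {
    ! printf '%s\n' "$1" | grep -qi 'unmanaged'
}

# Hypothetical helper: clear maintenance mode and poll the cluster status.
# $1 = maximum number of attempts (default 30)
# $2 = seconds to sleep between attempts (default 10)
leave_maintenance_and_wait() {
    attempts=${1:-30}
    delay=${2:-10}

    # The command from the workaround; run it on one controller only.
    pcs property set maintenance-mode=false || return 1

    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Require 'pcs status' itself to succeed before trusting its output.
        if status=$(pcs status 2>&1) && no_unmanaged_resources "$status"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

After sourcing this on a controller, `leave_maintenance_and_wait && echo "cluster ready"` would indicate when it is reasonable to re-run the AODH migration; a non-zero return means the resources were still unmanaged after the last attempt.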

Comment 2 Jiri Stransky 2016-06-29 13:06:04 UTC
Created attachment 1173890: heat software deployment output

Comment 3 Jiri Stransky 2016-06-29 13:06:29 UTC
Created attachment 1173891: pcs status

Comment 4 Jiri Stransky 2016-06-29 13:06:48 UTC
Created attachment 1173892: corosync.log

Comment 5 Jiri Stransky 2016-07-11 14:45:06 UTC
I saw people report on the upgrades WIP etherpad that the AODH migration went through successfully for them, so this issue may have been specific to my environment.

Comment 6 Jay Dobies 2016-07-12 13:06:58 UTC
Jirka - Given that it may only affect your environment, I'm going to leave this open until the triage on 7/14. If you're not seeing it reproduced regularly by then, I'm going to close it out as cannot reproduce.

Comment 7 Jiri Stransky 2016-07-12 14:35:17 UTC
Ack, that makes sense. I haven't heard of anyone being able to reproduce it to date.

Comment 8 Jay Dobies 2016-07-27 14:57:31 UTC
Closing this out based on Jiri's comment.