Bug 1237329

Summary: Overcloud: HA: Pacemaker and ironic fighting for control causing fencing to fail when rebooting/powering off the node.
Product: Red Hat OpenStack Reporter: Leonid Natapov <lnatapov>
Component: rhosp-director Assignee: Lucas Alvares Gomes <lmartins>
Status: CLOSED ERRATA QA Contact: Leonid Natapov <lnatapov>
Severity: high Docs Contact:
Priority: high    
Version: Director CC: dmacpher, gfidente, jslagle, jtrowbri, mburns, oblaut, ohochman, rhel-osp-director-maint, rrosa, sclewis
Target Milestone: ga   
Target Release: Director   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: instack-undercloud-2.1.2-14 Doc Type: Bug Fix
Doc Text:
Pacemaker and ironic fought for control over power management, which caused fencing to fail. This fix sets force_power_state_during_sync=False in /etc/ironic/ironic.conf by default, which stops ironic from automatically restoring the power state of the node during its synchronization. Pacemaker can now successfully fence the node.
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-08-05 13:57:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Leonid Natapov 2015-06-30 20:05:40 UTC
Overcloud: HA: Pacemaker and ironic fighting for control, causing fencing to fail when using ironic to power off the node.

Pacemaker is trying to turn the node off (or on) using fencing while ironic is trying to return the node to the power state it considers correct.

We have to configure Ironic to *not* try to restore the node to that power state and let Pacemaker do the job.

Comment 2 Leonid Natapov 2015-07-01 13:20:27 UTC
The workaround:
1. On the instack node, edit the /etc/ironic/ironic.conf file.
2. Uncomment force_power_state_during_sync and set it to false.
3. Restart the openstack-ironic-conductor service.
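
The edit in step 2 can be sketched in Python as below. This is only an illustration, not part of the shipped fix: the helper name disable_power_sync_enforcement is hypothetical, and it assumes the option appears in ironic.conf on a single line of its own, either commented out with a leading "#" or already set. It works on the file contents as text, so the other comments in the file are preserved.

```python
import re

OPTION = "force_power_state_during_sync"

# Matches the option on a line of its own, with or without a leading "#".
# [ \t]* is used instead of \s* so the match never spans blank lines.
_OPTION_LINE = re.compile(
    r"^[ \t]*#?[ \t]*" + re.escape(OPTION) + r"[ \t]*=.*$",
    re.MULTILINE,
)


def disable_power_sync_enforcement(conf_text: str) -> str:
    """Return conf_text with the option uncommented and set to False.

    Hypothetical helper for illustration; raises ValueError if the
    option line is not present at all.
    """
    new_text, count = _OPTION_LINE.subn(OPTION + "=False", conf_text)
    if count == 0:
        raise ValueError(OPTION + " not found in configuration text")
    return new_text


if __name__ == "__main__":
    sample = (
        "[conductor]\n"
        "# Some help text\n"
        "#force_power_state_during_sync=true\n"
    )
    print(disable_power_sync_enforcement(sample))
```

To apply it for real, the text would be read from and written back to /etc/ironic/ironic.conf on the instack node, followed by the service restart in step 3.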

Comment 3 chris alfonso 2015-07-01 17:23:47 UTC
Lucas, can you make sure force_power_state_during_sync defaults to false, and that there are docs describing how to turn it on if needed?

Comment 4 Lucas Alvares Gomes 2015-07-02 13:01:30 UTC
@Chris, will do!

@Leonid, that's not a workaround, that's the right way to do it. Ironic by default will try to make sure that the machines are in sync with the database state. But we made it configurable for this kind of situation.
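
The behaviour described in this comment can be illustrated with a toy model. This is not Ironic's code: the Node class, the sync_power_state function, and the returned strings are all hypothetical, and the "record updated" branch reflects the assumption that, with the option disabled, the database is brought in line with the hardware rather than the reverse.

```python
from dataclasses import dataclass


@dataclass
class Node:
    recorded_power_state: str  # what the database says
    actual_power_state: str    # what the hardware reports


def sync_power_state(node: Node, force_power_state_during_sync: bool) -> str:
    """Toy model of one periodic power-state sync pass (hypothetical)."""
    if node.actual_power_state == node.recorded_power_state:
        return "in sync"
    if force_power_state_during_sync:
        # Hardware is pushed back to the recorded state. This is the
        # branch that undoes a fencing action performed by Pacemaker.
        node.actual_power_state = node.recorded_power_state
        return "hardware forced to recorded state"
    # Assumed behaviour with the option disabled: accept what the
    # hardware reports and update the record instead.
    node.recorded_power_state = node.actual_power_state
    return "record updated to match hardware"


if __name__ == "__main__":
    fenced = Node(recorded_power_state="power on", actual_power_state="power off")
    print(sync_power_state(fenced, force_power_state_during_sync=False))
```

With the option set to True, a node that Pacemaker has just powered off would be powered back on at the next sync, which is the conflict this bug describes.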

Comment 5 Lucas Alvares Gomes 2015-07-02 14:04:48 UTC
Btw, we don't have any fencing agent for Ironic in Pacemaker, right?

If Pacemaker used the Ironic interface to power the nodes it's fencing on and off, this problem wouldn't happen.

Comment 6 James Slagle 2015-07-02 16:44:15 UTC
Lucas, there's a question on the gerrithub review about the use of tabs there. If that needs to be fixed, can you update the patch?

Once it's in shape, please submit to code.engineering as well.

Comment 8 James Slagle 2015-07-07 11:08:00 UTC
Upstream and downstream patches merged.

Comment 10 Leonid Natapov 2015-07-20 14:33:55 UTC
ironic.conf includes force_power_state_during_sync=False

instack-undercloud-2.1.2-21.el7ost.noarch

Comment 12 errata-xmlrpc 2015-08-05 13:57:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549