I deploy RHOSP13.
I am using IPMI as the fencing mechanism for the overcloud's controllers and as a way for TripleO to power the nodes on and off.
I import the overcloud nodes with:
openstack overcloud node import ~/instackenv.json
Excerpt from the instackenv.json:
The node can be powered on and off when needed during introspection and deployment.
In order for a controller node to be able to fence the other controller nodes, I use a fencing.yaml that has been created with:
openstack overcloud generate fencing instackenv.json --output fencing.yaml
Excerpt from the file:
- agent: fence_ipmilan
Now, when the overcloud is deployed, "pcs status" shows that the fencing devices are stopped:
stonith-fence_ipmilan-000077880001 (stonith:fence_ipmilan): Stopped
stonith-fence_ipmilan-000077880002 (stonith:fence_ipmilan): Stopped
stonith-fence_ipmilan-000077880003 (stonith:fence_ipmilan): Stopped
It also shows this kind of error after the cleanup fails:
* stonith-fence_ipmilan-000077880001_start_0 on overcloud-controller-2 'unknown error' (1): call=254, status=Error, exitreason='',
last-rc-change='Tue Jul 30 08:25:18 2019', queued=0ms, exec=21309ms
This is how Director configured the fencing:
[heat-admin overcloud-controller-1 ~]$ sudo pcs stonith show stonith-fence_ipmilan-000077880001
Resource: stonith-fence_ipmilan-000077880001 (class=stonith type=fence_ipmilan)
Attributes: ipaddr=192.168.104.1 ipport=6233 login=admin passwd=redhat pcmk_host_list=overcloud-controller-0 power_timeout=60
Operations: monitor interval=60s (stonith-fence_ipmilan-000077880001-monitor-interval-60s)
This is how I can actually fence a node:
ipmitool -I lanplus -U admin -P redhat -H 192.168.104.1 -p 6233 power off
Note that ipmitool only succeeds here with "-I lanplus", while the pcs configuration is lacking the lanplus parameter.
This fixes the problem:
sudo pcs stonith update stonith-fence_ipmilan-000077880001 lanplus=true
sudo pcs stonith update stonith-fence_ipmilan-000077880002 lanplus=true
sudo pcs stonith update stonith-fence_ipmilan-000077880003 lanplus=true
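The three updates above can also be written as a loop. This is only a sketch: set_lanplus and PCS_PREFIX are hypothetical names introduced here, not part of pcs or Director. By default it only prints the commands it would run. Set PCS_PREFIX=sudo on a controller to actually apply them.

```shell
# Hypothetical helper: add lanplus=true to each stonith device given as argument.
# PCS_PREFIX is an assumption of this sketch: unset means dry run (echo),
# PCS_PREFIX=sudo runs the real "pcs stonith update" commands.
set_lanplus() {
  for id in "$@"; do
    ${PCS_PREFIX:-echo} pcs stonith update "$id" lanplus=true
  done
}

# Dry run for the three devices from this report.
set_lanplus \
  stonith-fence_ipmilan-000077880001 \
  stonith-fence_ipmilan-000077880002 \
  stonith-fence_ipmilan-000077880003
```

After applying the change for real, "sudo pcs stonith show <device>" should list lanplus=true among the attributes.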
Chances are that the way the fence_ipmilan template is used can be improved to automatically add the lanplus parameter.
Note that you can just add --ipmi-lanplus to the generate fencing command.
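For example, the generate command from the report would become "openstack overcloud generate fencing --ipmi-lanplus --output fencing.yaml instackenv.json". A rough way to confirm the result before deploying is sketched below. It assumes the generated file carries a "lanplus: true" line for each fence_ipmilan device, which I have not checked against every release. check_lanplus is a hypothetical helper name.

```shell
# Hypothetical check: succeed only if the file has at least one fence_ipmilan
# device and as many "lanplus: true" lines as fence_ipmilan devices.
# Assumption: lanplus is written as "lanplus: true" in the generated YAML.
check_lanplus() {
  file=$1
  agents=$(grep -c 'agent: fence_ipmilan' "$file")
  lanplus=$(grep -c 'lanplus: true' "$file")
  [ "$agents" -gt 0 ] && [ "$agents" -eq "$lanplus" ]
}
```

Usage would be something like: check_lanplus fencing.yaml && echo "lanplus set on all devices".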
We switched to that as the default in Rocky. I will use this BZ to track the backport.
Verified on 2019-10-23.1:
help=_('DEPRECATED: This is the default.'))
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.