I am deploying RHOSP 13.
I am using IPMI as the fencing mechanism for the overcloud's controllers and as the way for TripleO to power the nodes on and off.
I import the overcloud nodes with:
openstack overcloud node import ~/instackenv.json
Excerpt from the instackenv.json:
{
  "mac": [
    "00:00:77:88:00:02"
  ],
  "name": "controller01",
  "cpu": "4",
  "memory": "32768",
  "disk": "40",
  "arch": "x86_64",
  "pm_type": "ipmi",
  "pm_user": "admin",
  "pm_password": "redhat",
  "pm_addr": "192.168.104.1",
  "pm_port": "6234"
},
The node can be powered on and off when needed during introspection and deployment.
So that each controller node can fence the other controller nodes, I use a fencing.yaml that was created with:
openstack overcloud generate fencing instackenv.json --output fencing.yaml
Excerpt from the file:
EnableFencing: true
FencingConfig:
  devices:
  - agent: fence_ipmilan
    host_mac: 00:00:77:88:00:01
    params:
      ipaddr: 192.168.104.1
      ipport: '6233'
      login: admin
      passwd: redhat
Now, once the overcloud is deployed, "pcs status" shows that the fencing devices are stopped:
stonith-fence_ipmilan-000077880001 (stonith:fence_ipmilan): Stopped
stonith-fence_ipmilan-000077880002 (stonith:fence_ipmilan): Stopped
stonith-fence_ipmilan-000077880003 (stonith:fence_ipmilan): Stopped
It also shows this kind of error after the cleanup fails:
* stonith-fence_ipmilan-000077880001_start_0 on overcloud-controller-2 'unknown error' (1): call=254, status=Error, exitreason='',
last-rc-change='Tue Jul 30 08:25:18 2019', queued=0ms, exec=21309ms
This is how Director configured the fencing:
[heat-admin@overcloud-controller-1 ~]$ sudo pcs stonith show stonith-fence_ipmilan-000077880001
Resource: stonith-fence_ipmilan-000077880001 (class=stonith type=fence_ipmilan)
Attributes: ipaddr=192.168.104.1 ipport=6233 login=admin passwd=redhat pcmk_host_list=overcloud-controller-0 power_timeout=60
Operations: monitor interval=60s (stonith-fence_ipmilan-000077880001-monitor-interval-60s)
This is how I can actually fence a node by hand:
ipmitool -I lanplus -U admin -P redhat -H 192.168.104.1 -p 6233 power off
Note that the pcs configuration lacks the lanplus parameter, whereas the working ipmitool call uses "-I lanplus".
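The missing attribute can also be checked mechanically. The following is only a minimal sketch: it parses the "Attributes:" line exactly as pasted above and reports whether lanplus is set. The helper name parse_attributes is made up for illustration, and the layout of the "pcs stonith show" output may differ between pcs versions.

```python
def parse_attributes(line):
    """Turn an 'Attributes: k=v k=v ...' line into a dict (hypothetical helper)."""
    # Keep only what follows the 'Attributes:' label.
    _, _, rest = line.partition("Attributes:")
    # Each whitespace-separated token is expected to look like key=value.
    return dict(item.split("=", 1) for item in rest.split())


# Line copied from the pcs output in this report.
attrs_line = (
    "Attributes: ipaddr=192.168.104.1 ipport=6233 login=admin passwd=redhat "
    "pcmk_host_list=overcloud-controller-0 power_timeout=60"
)
attrs = parse_attributes(attrs_line)

if __name__ == "__main__":
    print("lanplus set:", "lanplus" in attrs)
```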
This fixes the problem:
sudo pcs stonith update stonith-fence_ipmilan-000077880001 lanplus=true
sudo pcs stonith update stonith-fence_ipmilan-000077880002 lanplus=true
sudo pcs stonith update stonith-fence_ipmilan-000077880003 lanplus=true
Chances are that the way the fence_ipmilan template is used can be improved to automatically add the lanplus parameter.
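For clusters with more controllers, the update commands above could be generated rather than typed. This is a rough sketch only: the resource naming pattern (stonith-<agent>-<MAC without colons>) is inferred from the pcs output in this report and is not taken from the Director code, and the function names are hypothetical.

```python
import shlex


def stonith_resource_name(host_mac, agent="fence_ipmilan"):
    # Assumed pattern, inferred from the pcs output in this report:
    # stonith-<agent>-<host_mac with the colons removed>
    return "stonith-{}-{}".format(agent, host_mac.replace(":", "").lower())


def lanplus_update_commands(host_macs):
    # Build one 'pcs stonith update' command per fencing device.
    return [
        "sudo pcs stonith update {} lanplus=true".format(
            shlex.quote(stonith_resource_name(mac))
        )
        for mac in host_macs
    ]


if __name__ == "__main__":
    macs = ["00:00:77:88:00:01", "00:00:77:88:00:02", "00:00:77:88:00:03"]
    for cmd in lanplus_update_commands(macs):
        print(cmd)
```

The script only prints the commands; they still have to be run on one of the controllers.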
Comment 1 Michele Baldessari
2019-08-01 08:28:06 UTC
Note that you can just add --ipmi-lanplus to the generate fencing command.
We switched to that as the default in Rocky. I will use this BZ to track the backport of that change.
verified on 2019-10-23.1:
parser.add_argument('--ipmi-lanplus',
                    dest='ipmi_lanplus',
                    default=True,
                    action='store_true',
                    help=_('DEPRECATED: This is the default.'))
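To illustrate why the flag is now marked as deprecated: with default=True and action='store_true', the parsed value is True whether or not the option is passed. The standalone sketch below reproduces only that one argument with plain argparse; the parser name is made up, and the _() translation wrapper from the real client is left out.

```python
import argparse


def build_parser():
    # Stand-in parser for illustration, not the real tripleoclient command.
    parser = argparse.ArgumentParser(prog="generate-fencing-sketch")
    parser.add_argument('--ipmi-lanplus',
                        dest='ipmi_lanplus',
                        default=True,
                        action='store_true',
                        help='DEPRECATED: This is the default.')
    return parser


# Parse once without the option and once with it.
without_flag = build_parser().parse_args([])
with_flag = build_parser().parse_args(['--ipmi-lanplus'])

if __name__ == "__main__":
    print(without_flag.ipmi_lanplus, with_flag.ipmi_lanplus)
```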
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:3794