Bug 1483920
| Summary: | Deployment of native fencing occasionally fails | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Tomas Jamrisko <tjamrisk> |
| Component: | puppet-tripleo | Assignee: | Chris Jones <chjones> |
| Status: | CLOSED ERRATA | QA Contact: | pkomarov |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 11.0 (Ocata) | CC: | aschultz, ccollett, chjones, fdinitto, jjoyce, jschluet, mburns, michele, rhel-osp-director-maint, slinaber, tvignaud, ushkalim |
| Target Milestone: | z3 | Keywords: | Triaged, ZStream |
| Target Release: | 11.0 (Ocata) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | puppet-tripleo-6.5.1-1.el7ost | Doc Type: | Bug Fix |
| Doc Text: | In the release version of OSP11, there was a bug that caused the generation of overcloud fencing configuration to occasionally fail. This update includes improvements to the generator so that overcloud fencing configuration generation is now reliable. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-10-31 17:37:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1444621 | ||
Description
Tomas Jamrisko
2017-08-22 09:24:42 UTC
Linking only the stable/ocata review, as the master one has merged.
Verified. Controller fencing was deployed using overcloud deploy.
Verified the initial Pacemaker setup:
[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.16-12.el7_4.2-94ff4df) - partition with quorum
Last updated: Tue Oct 3 10:50:28 2017
Last change: Tue Oct 3 09:54:07 2017 by root via cibadmin on controller-2
3 nodes configured
19 resources configured
Online: [ controller-0 controller-1 controller-2 ]
Full list of resources:
Master/Slave Set: galera-master [galera]
Masters: [ controller-0 controller-1 controller-2 ]
Clone Set: rabbitmq-clone [rabbitmq]
Started: [ controller-0 controller-1 controller-2 ]
Master/Slave Set: redis-master [redis]
Masters: [ controller-2 ]
Slaves: [ controller-0 controller-1 ]
ip-192.168.24.8 (ocf::heartbeat:IPaddr2): Started controller-0
ip-10.35.180.18 (ocf::heartbeat:IPaddr2): Started controller-1
ip-172.17.0.18 (ocf::heartbeat:IPaddr2): Started controller-2
ip-172.17.0.14 (ocf::heartbeat:IPaddr2): Started controller-0
ip-172.18.0.16 (ocf::heartbeat:IPaddr2): Started controller-1
ip-172.19.0.19 (ocf::heartbeat:IPaddr2): Started controller-2
Clone Set: haproxy-clone [haproxy]
Started: [ controller-0 controller-1 controller-2 ]
openstack-cinder-volume (systemd:openstack-cinder-volume): Started controller-0
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Verify STONITH is disabled:
[root@controller-2 ~]# sudo pcs property show
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: tripleo_cluster
dc-version: 1.1.16-12.el7_4.2-94ff4df
have-watchdog: false
last-lrm-refresh: 1507024323
maintenance-mode: false
redis_REPL_INFO: controller-2
stonith-enabled: false
Node Attributes:
controller-0: cinder-volume-role=true galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@controller-0
controller-1: cinder-volume-role=true galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@controller-1
controller-2: cinder-volume-role=true galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@controller-2
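The check above can also be scripted. The following is a minimal sketch, assuming the output of `pcs property show` has been captured as text in the layout shown above; the function names and the shortened sample are hypothetical and are not part of the fix. It treats a missing `stonith-enabled` entry as enabled, on the assumption that Pacemaker's default applies when the property is unset.

```python
# Shortened, hypothetical capture of `pcs property show` in the layout shown above.
PROPERTY_OUTPUT = """Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: tripleo_cluster
 have-watchdog: false
 maintenance-mode: false
 redis_REPL_INFO: controller-2
 stonith-enabled: false
Node Attributes:
 controller-0: galera-role=true haproxy-role=true
"""


def parse_cluster_properties(text):
    """Return the 'Cluster Properties' section as a dict of strings."""
    props = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Cluster Properties:":
            in_section = True
            continue
        if line == "Node Attributes:":
            break  # node attributes are not cluster properties
        if in_section and ":" in line:
            key, _, value = line.partition(":")
            props[key.strip()] = value.strip()
    return props


def stonith_enabled(text):
    """True unless stonith-enabled is explicitly set to something other than true."""
    # Assumption: an unset property falls back to Pacemaker's default (enabled).
    value = parse_cluster_properties(text).get("stonith-enabled", "true")
    return value.lower() == "true"


print("STONITH enabled:", stonith_enabled(PROPERTY_OUTPUT))
```

On a controller, the text could be captured with `subprocess.run(["pcs", "property", "show"], capture_output=True, text=True).stdout` before being passed to `stonith_enabled`.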
Generate the fencing.yaml file:
openstack overcloud generate fencing --ipmi-lanplus --ipmi-level administrator --output fencing.yaml instackenv.json
Update the overcloud with the fencing configuration:
openstack overcloud deploy \
--templates /usr/share/openstack-tripleo-heat-templates \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/deployment_files/network/network-environment.yaml \
-e /home/stack/deployment_files/hostnames.yml \
-e /home/stack/deployment_files/nodes_data.yaml \
-e /home/stack/deployment_files/debug.yaml \
-e /home/stack/deployment_files/docker-images.yaml \
-e /home/stack/deployment_files/workaround_params.yaml \
-e /home/stack/fencing.yaml \
--log-file overcloud_deployment_95.log
...
Output:
Stack overcloud UPDATE_COMPLETE
Check that the new STONITH resources were created:
[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.16-12.el7_4.2-94ff4df) - partition with quorum
Last updated: Tue Oct 3 11:55:05 2017
Last change: Tue Oct 3 11:35:15 2017 by root via cibadmin on controller-0
3 nodes configured
22 resources configured
Online: [ controller-0 controller-1 controller-2 ]
Full list of resources:
Master/Slave Set: galera-master [galera]
Masters: [ controller-0 controller-1 controller-2 ]
Clone Set: rabbitmq-clone [rabbitmq]
Started: [ controller-0 controller-1 controller-2 ]
Master/Slave Set: redis-master [redis]
Masters: [ controller-2 ]
Slaves: [ controller-0 controller-1 ]
ip-192.168.24.8 (ocf::heartbeat:IPaddr2): Started controller-0
ip-10.35.180.18 (ocf::heartbeat:IPaddr2): Started controller-1
ip-172.17.0.18 (ocf::heartbeat:IPaddr2): Started controller-2
ip-172.17.0.14 (ocf::heartbeat:IPaddr2): Started controller-0
ip-172.18.0.16 (ocf::heartbeat:IPaddr2): Started controller-1
ip-172.19.0.19 (ocf::heartbeat:IPaddr2): Started controller-2
Clone Set: haproxy-clone [haproxy]
Started: [ controller-0 controller-1 controller-2 ]
openstack-cinder-volume (systemd:openstack-cinder-volume): Started controller-0
stonith-fence_ipmilan-441ea173385f (stonith:fence_ipmilan): Started controller-2
stonith-fence_ipmilan-441ea1733d43 (stonith:fence_ipmilan): Started controller-1
stonith-fence_ipmilan-441ea1733991 (stonith:fence_ipmilan): Started controller-1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
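The resource check above compares the `pcs status` output by eye (19 resources before the update, 22 after, with three new `fence_ipmilan` entries). As a rough sketch, the same check could be automated by extracting the STONITH lines from the captured output. The helper names and the shortened sample below are hypothetical, and the expectation of one fencing device per controller is an assumption drawn from this output, not a documented rule.

```python
import re

# Shortened, hypothetical capture of `pcs status` in the layout shown above.
STATUS_OUTPUT = """Full list of resources:
 ip-192.168.24.8 (ocf::heartbeat:IPaddr2): Started controller-0
 openstack-cinder-volume (systemd:openstack-cinder-volume): Started controller-0
 stonith-fence_ipmilan-441ea173385f (stonith:fence_ipmilan): Started controller-2
 stonith-fence_ipmilan-441ea1733d43 (stonith:fence_ipmilan): Started controller-1
 stonith-fence_ipmilan-441ea1733991 (stonith:fence_ipmilan): Started controller-1
"""

# Matches: <name> (stonith:<agent>): <state> [<node>]
STONITH_LINE = re.compile(
    r"^\s*(\S+)\s+\(stonith:([^)]+)\):\s+(\S+)(?:\s+(\S+))?\s*$"
)


def stonith_resources(status_text):
    """Return one dict per STONITH resource line found in the status text."""
    resources = []
    for line in status_text.splitlines():
        match = STONITH_LINE.match(line)
        if match:
            name, agent, state, node = match.groups()
            resources.append(
                {"name": name, "agent": agent, "state": state, "node": node}
            )
    return resources


def fencing_ready(status_text, expected_devices):
    """True if exactly the expected number of devices exist and all are Started."""
    found = stonith_resources(status_text)
    return len(found) == expected_devices and all(
        r["state"] == "Started" for r in found
    )


for res in stonith_resources(STATUS_OUTPUT):
    print(res["name"], res["state"], res["node"])
print("Fencing ready:", fencing_ready(STATUS_OUTPUT, expected_devices=3))
```

Note that this only confirms the resources exist and are started; it does not exercise the fencing devices themselves.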
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:3098