Bug 1346164
| Summary: | Incompatibility with pacemaker version 1.1.13-10.el7_2.2 | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Kari Hautio <kari.hautio> |
| Component: | openstack-foreman-installer | Assignee: | Jason Guiditta <jguiditt> |
| Status: | CLOSED NOTABUG | QA Contact: | Shai Revivo <srevivo> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.0 (Juno) | CC: | mburns, morazi, rhos-maint, ryan.andrew.baker, sknauss, srevivo |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-06-22 18:17:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
(In reply to Kari Hautio from comment #0)
> Created attachment 1167789 [details]
> proposed patch
>
> Description of problem:
>
> When installing with updated pacemaker ha-all-in-one-util goes into endless
> loop because of changes in output format of pcs command.
>
>
> Version-Release number of selected component (if applicable):
> openstack-foreman-installer-3.0.27-1.el7ost.noarch
> pacemaker-1.1.13-10.el7_2.2.x86_64
>
> How reproducible:
> always
>
> Steps to Reproduce:
> 1. Perform foreman installation with new pacemaker version.
>
> Actual results:
> Installation stuck
>
> Expected results:
> Installation is successful
>
> Additional info:
>
> With old pacemaker
>
> [root@controller-1 ~]# rpm -q pacemaker
> pacemaker-1.1.13-10.el7.x86_64
> [root@controller-1 ~]# /usr/sbin/pcs property show pcmk-controller-1
> Cluster Properties:
> pcmk-controller-1:
> memcached,haproxy,mysqlinit,rabbitmq,keystone,glance,nova,cinder,neutron,
> heat,horizon,nosql,ceilometer
> [root@controller-1 ~]#
>
> With updated pacemaker:
>
> [root@controller-1 ~]# /usr/sbin/pcs property show pcmk-controller-1
> Cluster Properties:
> pcmk-controller-1: memcached,haproxy,mysqlinit,keystone
> Node Attributes:
> pcmk-controller-1:
> rmq-node-attr-last-known-rabbitmq-server=rabbit@lb-backend-controller-1
> pcmk-controller-2:
> rmq-node-attr-last-known-rabbitmq-server=rabbit@lb-backend-controller-2
> pcmk-controller-3:
> rmq-node-attr-last-known-rabbitmq-server=rabbit@lb-backend-controller-3
>
> Proposed patch for ha-all-in-one-util attached.

Are you sure there is not just a problem with rabbit here? That is what it looks like from the 'updated pacemaker' output, specifically all the 'rmq-node-attr-last-known-rabbitmq-server' entries in the node attributes. Was this a clean installation? I would suggest looking at the pacemaker and rabbit logs, as this version of ofi was tested successfully with bug #1290684.

I think searching for 'Node Attributes' would simply mask this issue and leave you with a non-working deployment at the end.

Hmm, I will check. The way it fails should be corrected in any case (ha-all-in-one-util goes into an endless loop trying to modify the property string).

Briefly checked: yes, pcs property show also shows node attributes, if present, with the older version, so the problem is somewhere else. A graceful failure would be beneficial in any case. I guess this BZ can be closed.

Not a bug with the component described; a BZ is possibly needed elsewhere.

I don't think this BZ was triaged appropriately, and I think this actually is a bug in the ha-all-in-one.bash script.

Comment #2 suggests that this version of OFI has already been tested, per BZ 1290684. However, that BZ was opened in 2015.

Based on:
https://rhn.redhat.com/errata/RHBA-2016-0556.html

it looks like the following code was added to the resource-agents package in March of 2016:
https://git.centos.org/blob/rpms!resource-agents/f784e8cb080c453bc9c1cafa447fb125da652761/SOURCES!bz1311180-rabbitmq-cluster-forget-stopped-cluster-nodes.patch;jsessionid=osbxg57k6dho155jb94ooijjw#L15

which adds the attribute:

+# this attr represents the current active local rmq node name.
+# when rmq stops or the node is fenced, this attr disappears
 RMQ_CRM_ATTR_COOKIE="rmq-node-attr-${OCF_RESOURCE_INSTANCE}"
+# this attr represents the last known active local rmq node name
+# when rmp stops or the node is fenced, the attr stays forever so
+# we can continue to map an offline pcmk node to it's rmq node name
+# equivalent.
+RMQ_CRM_ATTR_COOKIE_LAST_KNOWN="rmq-node-attr-last-known-${OCF_RESOURCE_INSTANCE}"

As I read it, this attribute is added to the cluster so that pacemaker still knows the rmq node name if the node goes offline. So the presence of this attribute in the cluster is not an indication of a problem, just the steady state.

To me, that makes the suggested fix appropriate, as these node attributes would not have been there in 2015 when the OFI was tested.
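On the "graceful fail" point raised above: a minimal sketch of a bounded retry loop in bash, assuming the script repeatedly runs some command until it succeeds. The function name `retry_bounded` and its arguments are hypothetical and are not taken from ha-all-in-one-util; the sketch only illustrates giving up with an error instead of looping forever.

```shell
#!/bin/bash
# Hypothetical sketch, not the actual ha-all-in-one-util code.
# Usage: retry_bounded MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, and gives up with a non-zero status
# once MAX_ATTEMPTS attempts have failed.
retry_bounded() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while true; do
        # Success ends the loop immediately.
        if "$@"; then
            return 0
        fi
        # Fail gracefully instead of retrying forever.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

A caller could wrap whatever step updates the property string, for example `retry_bounded 30 2 update_property_string` (where `update_property_string` is a placeholder name), and abort the installation with a clear message when it returns non-zero.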
Created attachment 1167789 [details]
proposed patch

Description of problem:

When installing with updated pacemaker ha-all-in-one-util goes into endless loop because of changes in output format of pcs command.

Version-Release number of selected component (if applicable):
openstack-foreman-installer-3.0.27-1.el7ost.noarch
pacemaker-1.1.13-10.el7_2.2.x86_64

How reproducible:
always

Steps to Reproduce:
1. Perform foreman installation with new pacemaker version.

Actual results:
Installation stuck

Expected results:
Installation is successful

Additional info:

With old pacemaker

[root@controller-1 ~]# rpm -q pacemaker
pacemaker-1.1.13-10.el7.x86_64
[root@controller-1 ~]# /usr/sbin/pcs property show pcmk-controller-1
Cluster Properties:
pcmk-controller-1: memcached,haproxy,mysqlinit,rabbitmq,keystone,glance,nova,cinder,neutron,heat,horizon,nosql,ceilometer
[root@controller-1 ~]#

With updated pacemaker:

[root@controller-1 ~]# /usr/sbin/pcs property show pcmk-controller-1
Cluster Properties:
pcmk-controller-1: memcached,haproxy,mysqlinit,keystone
Node Attributes:
pcmk-controller-1: rmq-node-attr-last-known-rabbitmq-server=rabbit@lb-backend-controller-1
pcmk-controller-2: rmq-node-attr-last-known-rabbitmq-server=rabbit@lb-backend-controller-2
pcmk-controller-3: rmq-node-attr-last-known-rabbitmq-server=rabbit@lb-backend-controller-3

Proposed patch for ha-all-in-one-util attached.
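For illustration only, and not the attached patch: a minimal bash sketch of reading a property value from output shaped like the above while ignoring the "Node Attributes:" section. The function name `get_cluster_property` is hypothetical. The sketch assumes each property sits on a single `name: value` line, with or without leading indentation, and that section headers consist only of letters and spaces followed by a colon.

```shell
#!/bin/bash
# Hypothetical helper, not the actual ha-all-in-one-util code.
# Reads `pcs property show`-style output on stdin and prints the value of
# the named property, looking only inside the "Cluster Properties:" section.
get_cluster_property() {
    local name="$1"
    awk -v name="$name" '
        # Enter the section we care about.
        /^Cluster Properties:/ { in_props = 1; next }
        # Assumption: any other header made of letters and spaces
        # (for example "Node Attributes:") ends the section.
        /^[A-Za-z][A-Za-z ]*:[ \t]*$/ { in_props = 0; next }
        in_props {
            line = $0
            sub(/^[ \t]+/, "", line)               # drop indentation
            if (index(line, name ":") == 1) {      # literal prefix match
                value = substr(line, length(name) + 2)
                sub(/^[ \t]+/, "", value)          # drop space after colon
                print value
                exit
            }
        }
    '
}
```

It could be used as `/usr/sbin/pcs property show pcmk-controller-1 | get_cluster_property pcmk-controller-1`. A name that appears only under "Node Attributes:" yields empty output, so the caller never mistakes a node attribute line for the property string.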