Bug 1288481 - pacemaker master HA cannot be set up because the "Set the cluster user password" step is skipped.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Andrew Butcher
QA Contact: Ma xiaoqiang
URL:
Whiteboard:
Duplicates: 1290312 (view as bug list)
Depends On:
Blocks:
 
Reported: 2015-12-04 11:25 UTC by Johnny Liu
Modified: 2016-07-04 00:46 UTC
CC List: 7 users

Fixed In Version: openshift-ansible-3.0.19-1.git.2.530aaf8
Doc Type: Bug Fix
Doc Text:
When deploying pacemaker-based HA masters, the cluster password was not set properly. This error has been corrected.
Clone Of:
Environment:
Last Closed: 2015-12-17 21:19:55 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHBA-2015:2667 (normal, SHIPPED_LIVE): Red Hat OpenShift Enterprise bug fix update, last updated 2015-12-18 02:18:50 UTC

Description Johnny Liu 2015-12-04 11:25:11 UTC
Description of problem:
When using the openshift-ansible master branch to install pacemaker master HA, the "Set the cluster user password" step is skipped, which leads to installation failure.

Version-Release number of selected component (if applicable):
https://github.com/openshift/openshift-ansible.git -b master

How reproducible:
Always

Steps to Reproduce:
1. Use openshift-ansible master branch to install pacemaker master HA
openshift_master_ha=True
openshift_master_cluster_method=pacemaker
openshift_master_cluster_password=openshift_cluster
openshift_master_cluster_vip=10.66.79.250
openshift_master_cluster_public_vip=10.66.79.250
openshift_master_cluster_hostname=master.cluster.local
openshift_master_cluster_public_hostname=master.cluster.local

Actual results:
Installation failed at the following step:
TASK: [openshift_master_cluster | Authenticate to the cluster] **************** 
failed: [test1.cluster.local] => {"changed": true, "cmd": ["pcs", "cluster", "auth", "-u", "hacluster", "-p", "openshift_cluster", "test1.cluster.local", "test2.cluster.local"], "delta": "0:00:02.604792", "end": "2015-12-04 19:07:36.007250", "rc": 1, "start": "2015-12-04 19:07:33.402458", "warnings": []}
stderr: Error: test1.cluster.local: Username and/or password is incorrect
Error: test2.cluster.local: Username and/or password is incorrect

FATAL: all hosts have already failed -- aborting



Reviewing the earlier logs shows that the following step was skipped:
TASK: [openshift_master | Set the cluster user password] ********************** 
skipping: [test2.cluster.local]



Expected results:
Pacemaker master HA installation succeeds.

Additional info:
The prod branch does not have this issue.

Comment 1 Andrew Butcher 2015-12-04 22:41:09 UTC
Was pacemaker already installed on this system? In the ansible tasks it looks like we only set the password when 'pcs' was installed during that run.

- name: Install cluster packages
  yum: pkg=pcs state=present
  when: (ansible_pkg_mgr == "yum") and openshift_master_ha | bool and openshift.master.cluster_method == 'pacemaker'
  register: install_result

...

- name: Set the cluster user password
  shell: echo {{ openshift_master_cluster_password | quote }} | passwd --stdin hacluster
  when: install_result | changed

If 'pcs' already being installed is the issue, we may need to mention this in the Known Issues section of the installation docs. Alternatively, we could attempt to authenticate to the cluster with the provided password and only set the password when authentication fails.
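
For illustration, a rough sketch of that second approach might look like the following (cluster_master_hosts is a placeholder for the list of master hostnames, not a real variable in the playbooks, and this is not necessarily how the eventual fix will look):

- name: Check whether the cluster password already authenticates
  # Probe auth non-interactively; record the result without failing the play.
  # cluster_master_hosts is a placeholder for the list of master hostnames.
  command: pcs cluster auth -u hacluster -p {{ openshift_master_cluster_password }} {{ cluster_master_hosts | join(' ') }}
  register: pcs_auth_check
  failed_when: false
  changed_when: false
  when: openshift_master_ha | bool and openshift.master.cluster_method == 'pacemaker'

- name: Set the cluster user password
  # Only (re)set the password when the probe above could not authenticate.
  shell: echo {{ openshift_master_cluster_password | quote }} | passwd --stdin hacluster
  when: openshift_master_ha | bool and openshift.master.cluster_method == 'pacemaker' and (pcs_auth_check.rc | default(1)) != 0

That would make the password task independent of whether 'pcs' happened to be installed during the current run.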

Comment 2 Johnny Liu 2015-12-07 03:39:21 UTC
> If 'pcs' already being installed is the issue, we may need to mention this
> in the Known Issues section of the installation docs. Alternatively, we
> could attempt to authenticate to the cluster with the provided password and
> only set the password when authentication fails.

Before starting the installation, the system is a clean RHEL 7 system with no 'pcs' installed.
After the installation process exits with a failure, logging into the 1st master shows 'pcs' is already installed:
[root@test1 ~]# rpm -q pcs
pcs-0.9.143-15.el7.x86_64
[root@test1 ~]# pcs status
Error: cluster is not currently running on this node
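
As a side note, a manual workaround on each master would presumably be to set the hacluster password by hand (using the value from the inventory above) and re-run the auth command that failed, before re-running the playbook:

# run as root on each master; 'openshift_cluster' is the cluster password from the inventory
echo 'openshift_cluster' | passwd --stdin hacluster
pcs cluster auth -u hacluster -p openshift_cluster test1.cluster.local test2.cluster.local

But the real fix should of course land in the playbooks.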

One more thing needs to be highlighted: with the "prod" branch everything works well; only the "master" branch has this issue.

Comment 6 Johnny Liu 2015-12-10 10:29:11 UTC
Verified this bug with the latest openshift-ansible master branch and openshift-ansible-playbooks-3.0.19-1.git.9.b1d3049.el7aos.noarch. PASS.

Comment 8 errata-xmlrpc 2015-12-17 21:19:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:2667

Comment 9 Andrew Butcher 2016-01-05 14:34:55 UTC
*** Bug 1290312 has been marked as a duplicate of this bug. ***

