Bug 2178614
| Summary: | "Create Cluster tripleo_cluster" fails if it's the second attempt | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | David Hill <dhill> |
| Component: | puppet-pacemaker | Assignee: | Luca Miccini <lmiccini> |
| Status: | NEW --- | QA Contact: | Nobody <nobody> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 16.2 (Train) | CC: | jjoyce, jmarcian, jschluet, lmiccini, slinaber, tvignaud |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description of problem: "Create Cluster tripleo_cluster" fails if it's the second attempt. In this customer case (not the first time we see this), the authentication failed for some reasons (MTU size, etc) and then, second deployment fails with : ~~~ <13>Mar 13 14:39:12 puppet-user: Notice: /Stage[main]/Pacemaker::Corosync/Exec[Create Cluster tripleo_cluster]/returns: Error: Hosts 'overcloud-controller-1', 'overcloud-controller-2' are not known to pcs, try to authenticate the hosts using 'pcs host auth overcloud-controller-1 overcloud-controller-2' command ~~~ Exec <|tag == 'pacemaker-auth'|> -> exec {"Create Cluster ${cluster_name}": creates => '/etc/cluster/cluster.conf', command => $cluster_setup_cmd, timeout => $cluster_start_timeout, tries => $cluster_start_tries, try_sleep => $cluster_start_try_sleep, unless => '/usr/bin/test -f /etc/corosync/corosync.conf', require => Class['pacemaker::install'], } -> Version-Release number of selected component (if applicable): All How reproducible: If the first "Create Cluster tripleo_cluster" wasn't executed for some reasons. Steps to Reproduce: 1. idk exactly what happened but it happened and hacluster password was set, then auth happened (probably) and "Create Cluster tripleo_cluster" didn't complete or wasn't even executed 2. Retry deployment 3. Actual results: Fails because pcsd is not authenticated to all hosts Expected results: It should authenticate if it's not authenticated Additional info: It's not the first time we see this behavior but it's the first time we open a BZ for this.