Bug 1138407
| Summary: | Rubygem-Staypuft: HA-nova deployment fails upon installing the computes: puppet error: change from absent to present failed: Execution of '/usr/bin/nova-manage network create novanetwork 192.168.32.0/21 6 --vlan_start 10' returned 1: Command failed | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Alexander Chuzhoy <sasha> | ||||
| Component: | rubygem-staypuft | Assignee: | Mike Burns <mburns> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Alexander Chuzhoy <sasha> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 5.0 (RHEL 7) | CC: | cwolfe, mburns, yeylon | ||||
| Target Milestone: | ga | Keywords: | TestOnly | ||||
| Target Release: | Installer | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2015-02-09 15:15:06 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Tried to run exactly the same deployment - it completed successfully.

I noticed the following (reproduced 2 times): while not affecting the deployment result, there's a failed puppet report on one controller (out of 3). The failure is:

"Could not restart Service[galera]: Execution of '/usr/bin/systemctl restart mariadb' returned 1: Job for mariadb.service canceled. Wrapped exception: Execution of '/usr/bin/systemctl restart mariadb' returned 1: Job for mariadb.service canceled."

There's only one failed puppet run in the reports for that controller - the next report for the same host is successful, although no changes were initiated by me.

As discussed on IRC with eck and rohara, puppet restarting galera on a controller during a post-install puppet check-in was the culprit. Specifically, galera.cnf was set to non-bootstrap mode (the relevant line being "wsrep_cluster_address=...") at the beginning of the post-install puppet run, and this caused a refresh of the galera service. The timing just happened to coincide with when the compute node attempted to use the database, hence this BZ.

This should be fixed here:
https://github.com/redhat-openstack/astapor/blob/42c1fc6795a4cc2934878ea47760b1f38b7d702c/puppet/modules/quickstack/manifests/pacemaker/galera.pp#L128
which is included in:
https://bugzilla.redhat.com/show_bug.cgi?id=1132155

Now, when galera is set up on the bootstrap node during install, the last thing that happens is that galera.cnf is set to non-bootstrap mode by use of puppet's file_line. So, when the puppet agent checks in again, it will determine that *no changes* are needed to galera.cnf and hence galera will not be restarted.
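To illustrate why the fix avoids the restart, here is a minimal Python sketch of the idempotent "ensure this line is present" behaviour that the comment attributes to puppet's file_line. It is not the astapor code: `ensure_line`, `demo`, the temporary galera.cnf and the node names in the address are all hypothetical, chosen only to show that a second pass reports no change and therefore gives a dependent service nothing to refresh on.

```python
import re
import tempfile
from pathlib import Path


def ensure_line(path, line, match):
    """Ensure `line` is in the file, replacing any line matching `match`.

    Returns True if the file had to be modified, False if it was
    already in the desired state (hypothetical helper, not a puppet API).
    """
    pattern = re.compile(match)
    original = path.read_text().splitlines()
    updated = []
    found = False
    for existing in original:
        if pattern.search(existing):
            updated.append(line)   # replace the matching line
            found = True
        else:
            updated.append(existing)
    if not found:
        updated.append(line)       # no match: append the line
    if updated == original:
        return False               # nothing to do, so nothing to notify
    path.write_text("\n".join(updated) + "\n")
    return True


def demo():
    """Apply the same setting twice and report whether each pass changed the file."""
    with tempfile.TemporaryDirectory() as tmp:
        cnf = Path(tmp) / "galera.cnf"
        # Assumed starting state: bootstrap-style address (placeholder values).
        cnf.write_text("[mysqld]\nwsrep_cluster_address=gcomm://\n")
        target = "wsrep_cluster_address=gcomm://node1,node2,node3"
        first = ensure_line(cnf, target, r"^wsrep_cluster_address=")
        second = ensure_line(cnf, target, r"^wsrep_cluster_address=")
        return [first, second]


if __name__ == "__main__":
    print(demo())
```

The first pass corresponds to the change made at the end of the install; the second corresponds to the later agent check-in, which sees the file already in the desired state. In the failing case the first change happened during the post-install check-in instead, so the galera refresh landed while the compute node was using the database.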
Verified:

Environment:
ruby193-rubygem-foreman_openstack_simplify-0.0.6-8.el7ost.noarch
openstack-foreman-installer-3.0.8-1.el7ost.noarch
ruby193-rubygem-staypuft-0.5.9-1.el7ost.noarch
rhel-osp-installer-client-0.5.4-1.el7ost.noarch
openstack-puppet-modules-2014.2.8-1.el7ost.noarch
rhel-osp-installer-0.5.4-1.el7ost.noarch

The reported issue doesn't reproduce.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-0156.html
Created attachment 934567
Logs collected from machines - named respectively

Rubygem-Staypuft: HA-nova deployment fails upon installing the computes: puppet error: change from absent to present failed: Execution of '/usr/bin/nova-manage network create novanetwork 192.168.32.0/21 6 --vlan_start 10' returned 1: Command failed, please check log for more info

Environment:
rhel-osp-installer-0.1.10-2.el6ost.noarch
openstack-foreman-installer-2.0.22-1.el6ost.noarch
ruby193-rubygem-foreman_openstack_simplify-0.0.6-8.el6ost.noarch
openstack-puppet-modules-2014.1-21.7.el6ost.noarch

Steps to reproduce:
1. Install rhel-osp-installer.
2. Configure/run an HANova deployment with vlan network type (3 controllers + 2 computes).

Result:
The deployment gets paused with an error upon installing one of the compute nodes. Checking the puppet report, I found this:
change from absent to present failed: Execution of '/usr/bin/nova-manage network create novanetwork 192.168.32.0/21 6 --vlan_start 10' returned 1: Command failed, please check log for more info

A subsequent run of the puppet agent on the same node had no issues, so I resumed the deployment and it completed successfully.

Expected results:
The deployment should complete with no issues.