| Summary: | For Self Hosted RHV Deployment changing data center and cluster name causes deployment to fail | |||
|---|---|---|---|---|
| Product: | Red Hat Quickstart Cloud Installer | Reporter: | James Olin Oden <joden> | |
| Component: | Installation - RHEV | Assignee: | John Matthews <jmatthew> | |
| Status: | CLOSED EOL | QA Contact: | Tasos Papaioannou <tpapaioa> | |
| Severity: | high | Docs Contact: | Dan Macpherson <dmacpher> | |
| Priority: | unspecified | |||
| Version: | 1.0 | CC: | bthurber, fabian, tpapaioa | |
| Target Milestone: | --- | Keywords: | Triaged | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1367897 | Environment: | ||
| Last Closed: | 2018-02-26 19:58:09 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1367897 | |||
This is outside the scope of GA; for GA we will disable the ability to configure the data center/cluster name with self-hosted.

Disabled datacenter/cluster configuration for self-hosted: https://github.com/fusor/fusor/pull/1165

Verified on QCI-1.0-RHEL-7-20160819.t.0.

Description of problem:

I have seen this three times (well, Fabian did it once and I twice). What originally happened was that I was doing a RHV self-hosted deployment with four hosts, and I had changed the data center and cluster names to be:

0123456789112345678921234567893123456789

which is exactly 40 characters long, the maximum length of a data center or cluster name. As it was deploying the first host for the engine to run on, it died with the following error:

===== Puppet run for the host hyperviso14.b.b status reported as Error ======

On the host that had the puppet failure, /var/log/messages had the following concerning puppet:

Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ ERROR ] Failed to execute stage 'Closing up': Specified cluster does not exist: 1111
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO ] Stage: Clean up
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160816211614.conf'
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO ] Stage: Pre-termination
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO ] Stage: Termination
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue,fix and redeploy
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160816204743-iylgq9.log
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: hosted-engine --deploy --config-append=/etc/qci/answers returned 1 instead of one of [0]
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) change from notrun to 0 failed: hosted-engine --deploy --config-append=/etc/qci/answers returned 1 instead of one of [0]
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Done]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Done]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[Datacenter is not in upstatus, going over configuration]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[Datacenter is not in upstatus, going over configuration]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/File[/etc/qci/engine-DC-config.py]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/File[/etc/qci/engine-DC-config.py]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Exec[engine_dc_config]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Exec[engine_dc_config]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Starting]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Starting]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: Finished catalog run in 2140.07 seconds

When you look in the log, /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160817004633-g16j41.log, pointed to by the error above, you find this:

Aug 17 00:45:25 hypervisor14.b.b vdsm[13818]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (1)
Aug 17 00:45:25 hypervisor14.b.b vdsm[13818]: vdsm root ERROR failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 232, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)

This error is listed several times in that log.

Version-Release number of selected component (if applicable):
QCI-1.0-RHEL-7-20160815.t.0

How reproducible:
Every time

Steps to Reproduce:
1. Do a RHV deployment
2. Change the data center and cluster names
3. Continue with the deployment

Actual results:
It will fail with the puppet error, which seems to be due to vdsm not running.

Expected results:
No errors.
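
For anyone triaging a similar failure, the puppet-agent output in /var/log/messages is mostly dependency noise, and only the `[ ERROR ]` lines from hosted-engine-setup carry the root cause. The following is a minimal sketch, not part of QCI or fusor, that pulls those lines out. The function name `extract_setup_errors` and the regular expression are assumptions based only on the log format quoted in this report.

```python
import re

# Assumed line shape, taken from the excerpt in this report:
# "<Mon DD HH:MM:SS> <host> puppet-agent[<pid>]: (<resource>) [ ERROR ] <message>"
ERROR_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) puppet-agent\[\d+\]: "
    r"\((?P<resource>.*?)\) \[ ERROR \] (?P<message>.*)$"
)


def extract_setup_errors(lines):
    """Return (timestamp, resource, message) for each puppet-agent ERROR line."""
    errors = []
    for line in lines:
        match = ERROR_LINE.match(line.rstrip("\n"))
        if match:
            errors.append(
                (
                    match.group("timestamp"),
                    match.group("resource"),
                    match.group("message"),
                )
            )
    return errors


# Two lines copied from the report: one ERROR, one INFO.
SAMPLE = [
    "Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: "
    "(/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) "
    "[ ERROR ] Failed to execute stage 'Closing up': "
    "Specified cluster does not exist: 1111",
    "Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: "
    "(/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) "
    "[ INFO ] Stage: Clean up",
]

for timestamp, resource, message in extract_setup_errors(SAMPLE):
    print(timestamp, message)
```

To run it against a real host, pass an open file handle for /var/log/messages instead of `SAMPLE`; the function only iterates over lines, so any iterable of strings works.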