Bug 1746976 - Problem installing Hosted Engine in disconnected with latest installation media
Summary: Problem installing Hosted Engine in disconnected with latest installation media
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-metrics
Version: 4.3.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.3.7
Target Release: 4.3.7
Assignee: Shirly Radco
QA Contact: Guilherme Santos
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-29 16:35 UTC by Robert McSwain
Modified: 2020-02-25 15:47 UTC

Fixed In Version: ovirt-engine-metrics-1.3.5.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-12-12 10:36:34 UTC
oVirt Team: Metrics
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2019:4229 | 0 | None | None | None | 2019-12-12 10:37:02 UTC
oVirt gerrit | 104428 | 0 | 'None' | MERGED | Remove yum from initial validations | 2020-07-17 15:53:23 UTC
oVirt gerrit | 104523 | 0 | 'None' | MERGED | Remove yum from initial validations | 2020-07-17 15:53:23 UTC

Description Robert McSwain 2019-08-29 16:35:08 UTC
RHV-M 4.3/RHHI 1.6 

We are completely disconnected. We only used the install media from the customer portal to build our environment. We started with the Async #4 ISOs where we had several successful builds (after struggling for a number of weeks) and were able to then use Async #5 using the same (or slightly modified) build procedure. We did not create any repos. We did have to download and hand carry the rhvm-appliance to our disconnected environment. 

We are unsure how to complete an install without the rhvm-appliance (or the ovirt-engine-appliance). When my install gets to the play (state: present) for the ovirt-engine-appliance (/usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/install_appliance.yml), the playbook terminates with a 'cannot find RPM' message. If I change the play to reference the rhvm-appliance but do not yum localinstall that RPM before launching the setup playbook (from cockpit), it says that it cannot find that RPM either. If I install the rhvm-appliance RPM *and* change the play to reference rhvm-appliance rather than ovirt-engine-appliance, it will make it past that particular hurdle. In fact, I can end with a working installation (using software RAID5 for my gluster disks).

We are not using hardware RAID. We initially (Async #4) used a RAID5 software device (assembled using the Cockpit storage widget after the base RHVH image had been installed). When doing this, we had to modify the Gluster playbook to accommodate a problem with the granular-entry-heal enable play. We do not believe that this is an issue with the installer but rather an issue with us not using hardware RAID. As an FYI, software RAID6 devices to back our bricks did not work at all. We also had successful installs using a software RAID0 device and then selecting JBOD during our gluster install. We know that it is not supported, but it did work. In fact, it is our preferred method at the moment. We are going to buy some RAID cards (with flash backed write cache) in the future. We will likely not use a hot spare.

Observations:
Because we did not create any repos in our disconnected environment (we will eventually have them through a satellite that we will build on our RHHI-V lab build), we needed to hand carry the rhvm-appliance rpm from the rhel-7-server-rhvh-4-rpms channel. However, our installer would still die when trying to run the ovirt-engine-appliance play in the /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/install_appliance.yml playbook. Even if connected, I could not see how this was going to work, as the ovirt-engine-appliance rpm that is referenced is not part of the repo. As mentioned elsewhere in this case, we could yum localinstall the rhvm-appliance rpm and modify the play to refer to rhvm-appliance (rather than ovirt-engine-appliance) to get past this issue.

The ova for the rhvm-appliance that is available from the Download area (https://access.redhat.com/downloads/content/415/ver=4.3/rhel---7/4.3/x86_64/product-software) is not referenced anywhere in the installation documentation. Looking through the playbooks tells me that this could be useful in a command line execution of the install, but I am not sure that the playbooks are currently in a state to run successfully from the command line.

Expected Outcome:
The playbook can use the .ova from multiple locations without requiring a modification to the playbook to allow for disconnected installations

Actual Outcome:
The play fails due to only recognizing ovirt-engine-appliance and not rhvm-appliance 
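
The workaround described above (install rhvm-appliance locally, then point the play at it) amounts to picking whichever appliance package is already present on the host. The following is a minimal Python sketch of that selection logic only; it is an illustration, not code from ovirt.hosted_engine_setup. The function name pick_appliance and the is_installed callback are hypothetical, and the order of preference is an assumption. The only facts taken from this report are the two package names.

```python
from typing import Callable, Iterable, Optional

# Package names taken from this report; the order of preference is an assumption.
APPLIANCE_CANDIDATES = ("ovirt-engine-appliance", "rhvm-appliance")


def pick_appliance(
    is_installed: Callable[[str], bool],
    candidates: Iterable[str] = APPLIANCE_CANDIDATES,
) -> Optional[str]:
    """Return the first candidate package that is already installed, or None."""
    for name in candidates:
        if is_installed(name):
            return name
    return None


# Hypothetical host where only rhvm-appliance was installed from a hand-carried RPM.
installed_on_host = {"rhvm-appliance"}
chosen = pick_appliance(lambda name: name in installed_on_host)
print(chosen)
```

A play that accepted either name in this way would not need to be edited by hand on a disconnected host.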


Additional notes:
Documentation for performing an offline/disconnected RHHI 1.6 install should be added to the RHHI guides.

Comment 1 Simone Tiraboschi 2019-09-05 11:03:42 UTC
On a disconnected environment host-deployment currently fails with:
2019-09-05 12:43:03,641 p=1974 u=ovirt |  TASK [oVirt.metrics/roles/oVirt.initial-validations : Check Rsyslog packages are available] ***
2019-09-05 12:43:17,294 p=1974 u=ovirt |  failed: [c76he20190905h1.localdomain] (item=rsyslog) => {
    "ansible_loop_var": "item",
    "changed": false,
    "item": "rsyslog"
}

MSG:

Error from repoquery: ['/usr/bin/repoquery', '--show-duplicates', '--plugins', '--quiet', '--disablerepo', '', '--enablerepo', '', '--qf', '%{name}|%{epoch}|%{version}|%{release}|%{arch}|%{repoid}', 'rsyslog']: Repository ovirt-4.3-epel is listed more than once in the configuration
Repository ovirt-4.3-centos-gluster6 is listed more than once in the configuration
Repository ovirt-4.3-virtio-win-latest is listed more than once in the configuration
Repository ovirt-4.3-centos-qemu-ev is listed more than once in the configuration
Repository ovirt-4.3-centos-ovirt43 is listed more than once in the configuration
Repository ovirt-4.3-centos-opstools is listed more than once in the configuration
Repository centos-sclo-rh-release is listed more than once in the configuration
Repository sac-gluster-ansible is listed more than once in the configuration
Repository ovirt-4.3 is listed more than once in the configuration
Could not match packages: Cannot find a valid baseurl for repo: base/7/x86_64

Comment 2 Simone Tiraboschi 2019-09-05 11:14:51 UTC
Please note that rsyslog was already installed:

[root@c76he20190905h1 ~]# yum list installed rsyslog
Loaded plugins: enabled_repos_upload, fastestmirror, package_upload, product-id, search-disabled-repos, subscription-manager, vdsmupgrade
This system is not registered with an entitlement server. You can use subscription-manager to register.
Loading mirror speeds from cached hostfile
 * base: mirrors.prometeus.net
 * epel: mirror.imt-systems.com
 * extras: mirrors.prometeus.net
 * ovirt-4.3: ovirt.repo.nfrance.com
 * ovirt-4.3-epel: mirror.imt-systems.com
 * updates: mirrors.prometeus.net
Installed Packages
rsyslog.x86_64                                                                                                                             8.24.0-34.el7                                                                                                                             @base
Uploading Enabled Repositories Report

Comment 3 Simone Tiraboschi 2019-09-05 11:48:00 UTC
in:
https://github.com/oVirt/ovirt-engine-metrics/blob/master/roles/oVirt.initial-validations/tasks/check_logging_collectors.yml#L30

we have:
      yum:
        list: 

so the host must at least be able to reach the yum repos for this check to pass.
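
The yum list task above needs reachable repositories even when the package is already on the host. One way to avoid that dependency is to ask the local RPM database instead. The sketch below is an assumption about how such a check could look, not the content of the merged gerrit patches: it runs `rpm -q <package>`, which consults only the local database and exits non-zero when the package is not installed. The names is_installed_locally and missing_packages are hypothetical, as is the runner parameter, which exists only so the function can be exercised without rpm. With the default runner, rpm is assumed to be on PATH.

```python
import subprocess
from typing import Callable, Sequence


def is_installed_locally(
    package: str,
    runner: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> bool:
    """Return True if `rpm -q` reports the package as installed.

    `rpm -q` reads the local RPM database only, so no repository
    has to be reachable for this check.
    """
    result = runner(
        ["rpm", "-q", package],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


def missing_packages(packages: Sequence[str], check=is_installed_locally) -> list:
    """Return the subset of `packages` that the check reports as not installed."""
    return [name for name in packages if not check(name)]
```

On a host like the one in Comment 2, where rsyslog is already installed, `missing_packages(["rsyslog"])` should come back empty without any repository access.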

Comment 5 Guilherme Santos 2019-11-15 13:18:59 UTC
Verified on:
ovirt-engine-4.3.7.1-0.1.el7.noarch
ovirt-engine-metrics-1.3.5.1-1.el7ev.noarch

Host environment:
rsyslog-8.24.0-41.el7_7.2.x86_64 installed and up to date
collectd-5.8.1-3.el7ost.x86_64 installed and up to date

Steps:
1. Installed rsyslog and collectd packages on host.
2. Edited the ansible host-deploy playbook, adding the following lines at the very beginning of ovirt-host-deploy-facts/tasks/main.yml:

#---
# - name: Break connection
#   shell: /sbin/route del default gw $(/sbin/route -n | awk '$1=="0.0.0.0" {print $2; exit}')
#(...)

This was done to remove any route to the internet and simulate a disconnected environment.

3. Logged in to the engine and ran the installation of the host (host-deploy).

Results:
Completed successfully
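
The shell task in step 2 extracts the default gateway from `route -n` output with awk and then deletes that route. For readers less familiar with awk, here is a rough Python equivalent of the parsing step only. default_gateway is a hypothetical helper, and the sample routing table is made up for illustration; it is not output from the test host.

```python
from typing import Optional


def default_gateway(route_output: str) -> Optional[str]:
    """Mimic awk '$1=="0.0.0.0" {print $2; exit}' on `route -n` output.

    Returns the Gateway column of the first row whose Destination
    is 0.0.0.0, or None if there is no such row.
    """
    for line in route_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "0.0.0.0":
            return fields[1]
    return None


# Made-up sample in the column layout printed by `route -n`.
SAMPLE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    100    0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     100    0        0 eth0
"""

print(default_gateway(SAMPLE))
```

Once that route is deleted, the host has no default route left, so the repositories can no longer be reached during host-deploy.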

Comment 7 errata-xmlrpc 2019-12-12 10:36:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4229

