Bug 1746585

Summary: cockpit Hosted Engine install looping during installation
Product: [oVirt] ovirt-hosted-engine-setup
Reporter: Travis <Travis.Ross>
Component: General
Assignee: Gal Zaidman <gzaidman>
Status: CLOSED DUPLICATE
QA Contact: meital avital <mavital>
Severity: high
Docs Contact:
Priority: high
Version: 2.3.11
CC: bugs, henry, irosenzw
Target Milestone: ovirt-4.3.7
Keywords: ZStream
Target Release: ---
Flags: sbonazzo: ovirt-4.3?
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-10-02 07:06:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Setup Logs none

Description Travis 2019-08-28 20:51:26 UTC
Created attachment 1609127
Setup Logs

Description of problem:

During installation of Hosted Engine through cockpit, the Prepare VM step gets into a loop and won't proceed.



Version-Release number of selected component (if applicable):


How reproducible:





Steps to Reproduce:
1. Fresh Install - ovirtnode 4.3.5-2019080513
2. Enable/Start cockpit
3. Launch installer

Actual results:

[ INFO ] TASK [ovirt.hosted_engine_setup : Define 3rd chunk]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Set 3rd chunk]
[ INFO ] skipping: [localhost]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Get ip route]
[ INFO ] skipping: [localhost]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Fail if can't find an available subnet]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Set new IPv4 subnet prefix]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Search again with another prefix]
[ INFO ] skipping: [localhost]

Expected results:

Continues through to Storage Configuration step

Additional info:

Comment 1 Travis 2019-08-30 03:52:13 UTC
Looks like I may have had some DNS issues, as I am now passing this step successfully after a few hours.

However, I'm now running into an issue with the virtio-win repository.

[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 10, "changed": false, "msg": "Failure talking to yum: failure: repodata/repomd.xml from ovirt-4.3-virtio-win-latest: [Errno 256] No more mirrors to try.\nhttp://fedorapeople.org/groups/virt/virtio-win/repo/latest/repodata/repomd.xml: [Errno 14] HTTPS Error 302 - Found"}


Looks like the repo has been relocated:

[root@kbukdl-ovirt2 ovirt-hosted-engine-setup]# curl http://fedorapeople.org/groups/virt/virtio-win/repo/stable/repodata/repomd.xml
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://fedorapeople.org/groups/virt/virtio-win/repo/stable/repodata/repomd.xml">here</a>.</p>
</body></html>


The default repository information in "ovirt-4.3-dependencies.repo" on the ovirtnode 4.3.5-2019080513 ISO installation media needs to be updated.

Replaced:
 
[ovirt-4.3-virtio-win-latest]
enabled=1
name = virtio-win builds roughly matching what will be shipped in upcoming RHEL
baseurl = http://fedorapeople.org/groups/virt/virtio-win/repo/latest
enabled = 0
gpgcheck = 0
includepkgs = ovirt-node-ng-image-update ovirt-node-ng-image ovirt-engine-appliance

With: 

[ovirt-4.3-virtio-win-stable]
enabled=1
name = virtio-win builds roughly matching what will be shipped in upcoming RHEL
baseurl = http://fedorapeople.org/groups/virt/virtio-win/repo/stable
enabled = 0
gpgcheck = 0
includepkgs = ovirt-node-ng-image-update ovirt-node-ng-image ovirt-engine-appliance

Now the system builds flawlessly.

So the bug seems to be in the virtio-win repository configuration: replace "latest" with "stable".
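
For anyone applying the same change by script, the edit above can be sketched in Python. This is only a minimal sketch and not part of any oVirt tooling: the function name switch_virtio_win_to_stable is hypothetical, and it does a plain text substitution on the contents of the .repo file, leaving reading and writing the file to the caller.

```python
def switch_virtio_win_to_stable(repo_text: str) -> str:
    """Return repo_text with the virtio-win 'latest' repo pointed at 'stable'.

    Hypothetical helper: a plain text substitution, so every other line
    (including the duplicated 'enabled' keys) is left untouched.
    """
    old_url = "http://fedorapeople.org/groups/virt/virtio-win/repo/latest"
    new_url = "http://fedorapeople.org/groups/virt/virtio-win/repo/stable"
    return (
        repo_text
        # Rename the section header quoted in this comment.
        .replace("[ovirt-4.3-virtio-win-latest]", "[ovirt-4.3-virtio-win-stable]")
        # Point baseurl at the 'stable' tree instead of 'latest'.
        .replace(old_url, new_url)
    )


if __name__ == "__main__":
    sample = (
        "[ovirt-4.3-virtio-win-latest]\n"
        "baseurl = http://fedorapeople.org/groups/virt/virtio-win/repo/latest\n"
        "gpgcheck = 0\n"
    )
    print(switch_virtio_win_to_stable(sample))
```

Keeping this as a pure string function makes it easy to check the result before writing anything back to the repo file.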

Comment 2 Ido Rosenzwig 2019-09-01 08:48:14 UTC
Even though you had a problem with your DNS servers, the deployment shouldn't loop for so long.

Second, the issue you described in https://bugzilla.redhat.com/show_bug.cgi?id=1746585#c0 occurs because the deployment tries to find an available subnet to use for a local libvirt network.
The problem is that it searches for an IPv6 network while your setup is IPv4 only. It therefore can't find one and keeps retrying the search many times.

The way I see it we have 2 issues here:
1. The search for available subnets can take a lot of time and should be limited.
2. The deployment was identified as an IPv6 deployment but it wasn't.
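
To illustrate what limiting the search in point 1 could look like, here is a rough Python sketch. It is not the code used by the ovirt.hosted_engine_setup role: the function name find_free_subnet, the candidate pool 192.168.0.0/16, the /24 prefix and the max_attempts cap are all assumptions made for the example.

```python
import ipaddress
from typing import Iterable, Optional, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def find_free_subnet(
    in_use: Iterable[str],
    pool: str = "192.168.0.0/16",  # assumed candidate pool, illustration only
    new_prefix: int = 24,          # assumed size of the subnet to hand out
    max_attempts: int = 50,        # hard cap so the search cannot run for hours
) -> Optional[Network]:
    """Return the first subnet of pool that overlaps none of in_use, or None."""
    pool_net = ipaddress.ip_network(pool)
    # strict=False accepts host addresses such as "192.168.1.5/24".
    used = [ipaddress.ip_network(n, strict=False) for n in in_use]
    # Compare only networks of the same IP version as the pool;
    # overlaps() raises TypeError when the versions differ.
    used = [u for u in used if u.version == pool_net.version]
    for attempt, candidate in enumerate(pool_net.subnets(new_prefix=new_prefix)):
        if attempt >= max_attempts:
            break  # give up instead of looping over the whole pool
        if not any(candidate.overlaps(u) for u in used):
            return candidate
    return None


if __name__ == "__main__":
    print(find_free_subnet(["192.168.0.0/24", "192.168.1.5/24", "fe80::/64"]))
```

Returning None once the cap is reached lets the caller fail with a clear message rather than appear to hang.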

Comment 4 Henry 2019-09-02 10:17:59 UTC
I'm also experiencing the issue with what appears to be a looping subnet search.

Is there any workaround to implement other than waiting a few hours?

Comment 5 Ido Rosenzwig 2019-09-02 12:10:20 UTC
If you are using an IPv4 setup, you can edit the default variables file of the hosted-engine ansible role to use only IPv4 addresses, so that the deployment ignores all IPv6 addresses.

The defaults file is located here:
/usr/share/ansible/roles/ovirt.hosted-engine-setup/defaults/main.yml

Change the variable 'he_force_ip4' to true as follows:
he_force_ip4: true

If you are using an IPv6 setup, you can do the same with the variable 'he_force_ip6'.

NOTE: If you change both he_force_ip4 and he_force_ip6 to 'true' the setup will fail.
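
The edit and the warning in the NOTE can be sketched as a small Python check. This is a hedged illustration only: set_he_flag and both_forced are hypothetical helpers, they assume the variable appears as a top-level "name: value" line in the defaults file, and they work on the file's text without parsing YAML.

```python
import re


def set_he_flag(defaults_text: str, name: str, value: bool) -> str:
    """Set a top-level 'name: value' line in YAML text (hypothetical helper)."""
    pattern = re.compile(rf"^{re.escape(name)}:.*$", flags=re.MULTILINE)
    new_line = f"{name}: {'true' if value else 'false'}"
    if pattern.search(defaults_text):
        # Replace the first existing definition of the variable.
        return pattern.sub(new_line, defaults_text, count=1)
    # Variable not present: append it on its own line.
    sep = "" if not defaults_text or defaults_text.endswith("\n") else "\n"
    return f"{defaults_text}{sep}{new_line}\n"


def both_forced(defaults_text: str) -> bool:
    """True if he_force_ip4 and he_force_ip6 are both literally 'true'.

    Simplification: other YAML spellings of true (such as 'yes') are ignored.
    """
    def is_true(name: str) -> bool:
        match = re.search(
            rf"^{re.escape(name)}:\s*(\S+)", defaults_text, flags=re.MULTILINE
        )
        return match is not None and match.group(1).lower() == "true"

    return is_true("he_force_ip4") and is_true("he_force_ip6")


if __name__ == "__main__":
    text = "he_force_ip4: false\nhe_force_ip6: false\n"
    updated = set_he_flag(text, "he_force_ip4", True)
    print(updated, end="")
    print("conflict:", both_forced(updated))
```

Running both_forced on the edited text before starting the deployment catches the combination that the NOTE warns will make the setup fail.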

Comment 6 Henry 2019-09-02 16:14:59 UTC
Hi Ido,

Thanks, I can confirm setting `he_force_ip4` resolved the issue for me.

Comment 7 Sandro Bonazzola 2019-10-02 07:06:34 UTC
Closing as duplicate of bug #1756244

*** This bug has been marked as a duplicate of bug 1756244 ***