Bug 1498855 - Installation of host to a cluster using Linux bridge switch type fails with "Failed to extract OVN central IP..."
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Host-Deploy
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ovirt-4.2.0
Target Release: ---
Assignee: Dominik Holler
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On: 1501798
Blocks:
Reported: 2017-10-05 12:26 UTC by Jiri Belka
Modified: 2017-10-18 11:44 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-10-18 11:44:16 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.2+



Description Jiri Belka 2017-10-05 12:26:37 UTC
Description of problem:

My engine VM changed its hostname, so I used ovirt-engine-rename to rename it and re-ran engine-setup.

After this step, I tried to add a host into a cluster with Linux bridge switch type and it failed with:

2017-10-05 14:00:34,891+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-123) [1d69b573] EVENT_ID: VDS_INSTALL_FAILED(505), Host 10-37-138-193 installation failed. Failed to extract OVN central IP from https://10-34-61-152.example.com:9696.

1. Why is OVN involved if the switch type of the cluster is Linux bridge?
2. Why did the ovirt-engine-rename tool not update the OVN provider hostname?

Ad 2: I can imagine this could be complicated. Maybe the rename tool could ask the user to confirm renaming all old hostnames to the new one, like:

- "Do you want to rename OVN from $old to $new? confirm: [yes|no]"


I corrected the FQDN in the OVN provider and ran the 'Reinstall' action for the host in the engine:

~~~
Error while executing action: 

10-37-138-193:
Cannot edit Host. The provider type should be 'OpenStack Networking'.
~~~

'Network Provider' in the Host settings was empty. <-- Is this a leftover from the failed installation?

Setting Network Provider in the Host settings to 'ovirt-provider-ovn' causes the same error message as above.

Solution: Remove the host and add it again after correcting the OVN FQDN in the providers.

Version-Release number of selected component (if applicable):


How reproducible:
1 in 1

Steps to Reproduce:
1. have 4.2 engine
2. change hostname of engine OS
3. ovirt-engine-rename & engine-setup
4. try to add host

Actual results:
installation fails with
2017-10-05 14:00:34,891+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-123) [1d69b573] EVENT_ID: VDS_INSTALL_FAILED(505), Host 10-37-138-193 installation failed. Failed to extract OVN central IP from https://10-34-61-152.example.com:9696.


Expected results:
Installation should work; our cluster uses the Linux bridge switch type anyway.

Additional info:

Comment 2 Dominik Holler 2017-10-06 15:18:20 UTC
Currently, this behavior is intended: the host is configured to provide OVN networking (not yet visible in the UI and REST, because the changes are not yet merged), but the configuration of ovirt-provider-ovn is broken. As a result, the newly added host would not be able to provide OVN networking, which would be an invalid state.

Any suggestions how to improve the current behavior?
Would a more detailed error message be helpful?

Comment 3 Jiri Belka 2017-10-06 15:44:56 UTC
(In reply to Dominik Holler from comment #2)
> Currently, this behavior is intended, because the host is configured to
> provide OVN networking (not yet visible in UI and REST, because the changes
> are not yet merged), but the configuration of the ovirt-ovn-provider is
> broken. This way the new host added host would be not able to provide OVN
> networking, what would be an invalid state.
> 
> Any suggestions how to improve the current behavior?
> Would a more detailed error message be helpful?

What is the host here, the engine VM? Anyway, why is the provider checked if the cluster is Linux bridge?

Does it mean that if I put www.example.com as the OVN provider, this provider is checked whenever a host is added to a Linux bridge switch type cluster? It seems so, see below:

2017-10-06 17:40:41,929+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-415) [799f4af7] EVENT_ID: VDS_ANSIBLE_INSTALL_STARTED(560), Ansible host-deploy playbook execution has started on host 10-37-138-193.rhev.lab.eng.brq.redhat.com.
2017-10-06 17:40:41,932+02 INFO  [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-415) [799f4af7] Executing Ansible command: /usr/bin/ansible-playbook --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=/tmp/ansible-inventory6689691072862403033 --extra-vars=host_deploy_cluster_version=4.2 --extra-vars=host_deploy_gluster_enabled=false --extra-vars=host_deploy_virt_enabled=true --extra-vars=host_deploy_vdsm_port=54321 --extra-vars=host_deploy_override_firewall=true --extra-vars=host_deploy_firewall_type=IPTABLES --extra-vars=ansible_port=22 --extra-vars=host_deploy_post_tasks=/etc/ovirt-engine/../ovirt-ansible-roles/ovirt-host-deploy-post-tasks.yml --extra-vars=host_deploy_ovn_tunneling_network=ovirtmgmt --extra-vars=host_deploy_ovn_central=93.184.216.34 /usr/share/ovirt-engine/../ovirt-ansible-roles/playbooks/ovirt-host-deploy.yml

$ dig +short www.example.com
93.184.216.34
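
To make the comparison between the logged command and the dig output easier to repeat, here is a minimal Python sketch that pulls the `--extra-vars` values out of a logged ansible-playbook command line. The helper name `extra_vars` is hypothetical and not part of ovirt-engine; the command string is a shortened copy of the log line above.

```python
import shlex


def extra_vars(command):
    """Collect --extra-vars=key=value arguments from a logged command line."""
    prefix = "--extra-vars="
    result = {}
    for token in shlex.split(command):
        if token.startswith(prefix):
            # Drop the flag, then split key from value at the first '='.
            key, _, value = token[len(prefix):].partition("=")
            result[key] = value
    return result


# Shortened copy of the command from the engine log (assumption: the
# remaining arguments are irrelevant for this check).
logged = (
    "/usr/bin/ansible-playbook "
    "--extra-vars=host_deploy_cluster_version=4.2 "
    "--extra-vars=host_deploy_ovn_tunneling_network=ovirtmgmt "
    "--extra-vars=host_deploy_ovn_central=93.184.216.34 "
    "/usr/share/ovirt-engine/../ovirt-ansible-roles/playbooks/ovirt-host-deploy.yml"
)

print(extra_vars(logged)["host_deploy_ovn_central"])  # prints 93.184.216.34
```

The printed value can then be compared by hand with the address that dig reports for the provider hostname.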

IMO this is wrong. Either the provider should not be checked for this cluster, or adding a provider which does not work should not be accepted (nothing prevents me from putting 'http://i-love-sushi' there as the hostname for the provider :)
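
As an illustration of the kind of check meant here, the following Python sketch rejects a provider URL whose hostname does not resolve. It is not how ovirt-engine validates providers; the function `provider_url_problems` and the stand-in resolver `fake_resolve` are hypothetical names. A resolvable name alone does not prove that a working provider is listening there (www.example.com resolves fine), so this would only catch obviously bogus entries.

```python
import socket
from urllib.parse import urlsplit


def provider_url_problems(url, resolve=socket.gethostbyname):
    """Return a list of problems found in a provider URL (empty if none).

    Hypothetical helper, not part of ovirt-engine. `resolve` maps a hostname
    to an IPv4 address string and raises OSError when the lookup fails.
    """
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parts.hostname:
        problems.append("URL has no hostname")
        return problems
    try:
        resolve(parts.hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        problems.append(f"hostname {parts.hostname!r} does not resolve: {exc}")
    return problems


def fake_resolve(name):
    # Stand-in resolver so the example runs without network access.
    known = {"www.example.com": "93.184.216.34"}
    if name not in known:
        raise OSError("unknown host")
    return known[name]


print(provider_url_problems("https://www.example.com:9696", fake_resolve))
print(provider_url_problems("http://i-love-sushi", fake_resolve))
```

With the stand-in resolver, the first call reports no problems and the second reports that 'i-love-sushi' does not resolve. Calling the function without the second argument uses the system resolver instead.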

Comment 4 Dominik Holler 2017-10-06 16:05:26 UTC
> what is host here, the engine VM? 

The host to be added into a cluster.

> anyway, why is the provider checked if cluster is linux bridge?
> 

Because the cluster is marked to provide OVN networking.
This is a new feature which is not yet completely merged. The network
provider for the cluster will be visible in UI if
https://gerrit.ovirt.org/#/c/82237/ is merged.

> does it mean if i put www.example.com as OVN provider this provider
> is checked during adding every host into linux bridge switch type
> cluster? it seems to, see below:
> 

Yes, because the host will be configured to provide the networks of
this provider to the VMs running on this host.

> 2017-10-06 17:40:41,929+02 INFO
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-415)
> [799f4af7] EVENT_ID: VDS_ANSIBLE_INSTALL_STARTED(560), Ansible
> host-deploy playbook execution has started on host
> 10-37-138-193.rhev.lab.eng.brq.redhat.com. 2017-10-06 17:40:41,932+02
> INFO  [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor]
> (org.ovirt.thread.EE-ManagedThreadFactory-engine-Thread-415)
> [799f4af7] Executing Ansible command: /usr/bin/ansible-playbook
> --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa
> --inventory=/tmp/ansible-inventory6689691072862403033
> --extra-vars=host_deploy_cluster_version=4.2
> --extra-vars=host_deploy_gluster_enabled=false
> --extra-vars=host_deploy_virt_enabled=true
> --extra-vars=host_deploy_vdsm_port=54321
> --extra-vars=host_deploy_override_firewall=true
> --extra-vars=host_deploy_firewall_type=IPTABLES
> --extra-vars=ansible_port=22
> --extra-vars=host_deploy_post_tasks=/etc/ovirt-engine/../ovirt-ansible-roles/ovirt-host-deploy-post-tasks.yml
> --extra-vars=host_deploy_ovn_tunneling_network=ovirtmgmt
> --extra-vars=host_deploy_ovn_central=93.184.216.34 /usr/share/ovirt-engine/../ovirt-ansible-roles/playbooks/ovirt-host-deploy.yml
> 
> $ dig +short www.example.com
> 93.184.216.34
> 
> imo this is wrong. either provider should not be checked for this

https://gerrit.ovirt.org/#/c/82122/ will provide the possibility to
add hosts without supporting the provider.
But if the host is added and marked to be configured to provide the
networks of the provider, the provider is checked.

> cluster, or adding provider which does not work should be not
> accepted 

I will check if it is possible to use the validation in the UI to
provide a more interactive feedback.

> (nothing prevents me putting there 'http://i-love-sushi' as
> hostname for the provider :)

Using the Test button in the Add/Edit Provider dialog would give a
hint.

Comment 5 Dan Kenigsberg 2017-10-18 11:44:16 UTC
I understand that the root cause of this is bug 1501798. Once that is solved, we will no longer have a default provider with a bogus name.

If the user ends up with a bogus URL, they should go and fix it, or set the host's provider to None (the latter option is not yet available in the GUI; see https://gerrit.ovirt.org/#/c/82910/ ).

Thus I believe there is nothing here to be fixed. Please reopen if I am mistaken.

