Description of problem:

An IPv4 deployment of Hosted-Engine sets up NAT on the libvirt network, like this:

# virsh -r net-dumpxml default
<network>
  <name>default</name>
  <uuid>1cb6f807-270f-497f-b3f8-172260946b47</uuid>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:9c:99:8f'/>
  <ip address='192.168.222.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.222.2' end='192.168.222.254'/>
    </dhcp>
  </ip>
</network>

But an IPv6 deployment sets up an isolated network, so HostedEngineLocal cannot talk to any host except the one running the deployment:

# virsh -r net-dumpxml default
<network>
  <name>default</name>
  <uuid>d074ab29-4e1d-4bbd-b471-e0f8d6a2776b</uuid>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:36:aa:e5'/>
  <ip family='ipv6' address='fd00:1234:5678:900::1' prefix='64'>
    <dhcp>
      <range start='fd00:1234:5678:900::10' end='fd00:1234:5678:900::ff'/>
    </dhcp>
  </ip>
</network>

This is a problem because the HostedEngineLocal VM cannot reach anything outside the host. During an upgrade from RHV 4.3 it therefore cannot reach the SPM host, which is functioning fine on the network. Without the SPM, the Data Center is down, so the new hosted_storage for the RHV 4.4 Hosted-Engine cannot be set up and the deployment fails:

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[Failed to attach Storage due to an error on the Data Center master Storage Domain.\n-Please activate the master Storage Domain first.]\". HTTP response code is 409."}

The isolated network is set up here:
https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml#L17

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.4.9-2.el8ev.noarch
vdsm-4.40.40-1.el8ev.x86_64
ovirt-ansible-collection-1.2.4-1.el8ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy IPv6 HostedEngine
2. Try to ping outside from HostedEngineLocal (before adding storage details)

Additional details:
* Since 6.5.0, libvirt accepts IPv6 NAT just as it does for IPv4:
  https://libvirt.org/formatnetwork.html#examplesNATv6

Workaround:
I had to do two things to make this work in a test.

1) Enable accept_ra on ovirtmgmt (otherwise libvirt fails to bring up the default network with NAT):

   $ echo 2 > /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra

2) Modify the ansible role to set up NAT for IPv6 (diff below).
$ git diff
diff --git a/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml b/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml
index ef813c3..ce7423f 100644
--- a/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml
+++ b/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml
@@ -14,15 +14,29 @@
         xpath: /network/ip
         state: absent
       register: editednet_noipv4
-    - name: Configure it as an isolated network
+    - name: Configure it as a forward network
       xml:
         xmlstring: "{{ editednet_noipv4.xmlstring }}"
         xpath: /network/forward
-        state: absent
-      register: editednet_isolated
+        state: present
+      register: editednet_forward
+    - name: Edit libvirt default network configuration, enable NAT
+      xml:
+        xmlstring: "{{ editednet_forward.xmlstring }}"
+        xpath: /network/forward
+        attribute: mode
+        value: "nat"
+      register: editednet_nat
+    - name: Edit libvirt default network configuration, enable NAT IPv6
+      xml:
+        xmlstring: "{{ editednet_nat.xmlstring }}"
+        xpath: /network/forward/nat
+        attribute: ipv6
+        value: "yes"
+      register: editednet_natipv6
     - name: Edit libvirt default network configuration, set IPv6 address
       xml:
-        xmlstring: "{{ editednet_isolated.xmlstring }}"
+        xmlstring: "{{ editednet_natipv6.xmlstring }}"
         xpath: /network/ip[@family='ipv6']
         attribute: address
         value: "{{ he_ipv6_subnet_prefix + '::1' }}"
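For reference, this is roughly the default network definition the patched role should produce. It is a sketch assembled from the IPv4 example above and the libvirt NATv6 documentation linked under "Additional details", reusing the fd00:1234:5678:900::/64 prefix from my test deployment; uuid and mac are omitted since libvirt generates them:

<network>
  <name>default</name>
  <forward mode='nat'>
    <nat ipv6='yes'/>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip family='ipv6' address='fd00:1234:5678:900::1' prefix='64'>
    <dhcp>
      <range start='fd00:1234:5678:900::10' end='fd00:1234:5678:900::ff'/>
    </dhcp>
  </ip>
</network>

Note the <nat ipv6='yes'/> attribute: it is exactly what the new "enable NAT IPv6" task in the diff sets, and libvirt only accepts it from 6.5.0 onwards.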
Test with latest 4.4.8 build, I detected a similar bug.

Test Version:
RHVH-4.4-20210818.0-RHVH-x86_64-dvd1.iso
cockpit-ws-238.2-1.el8.x86_64
cockpit-bridge-238.2-1.el8.x86_64
cockpit-ovirt-dashboard-0.15.1-2.el8ev.noarch
cockpit-system-238.2-1.el8.noarch
cockpit-storaged-238.2-1.el8.noarch
cockpit-238.2-1.el8.x86_64
subscription-manager-cockpit-1.28.13-3.el8_4.noarch
ovirt-hosted-engine-ha-2.4.8-1.el8ev.noarch
ovirt-hosted-engine-setup-2.5.3-1.el8ev.noarch
rhvm-appliance-4.4-20210715.0.el8ev.x86_64
ovirt-ansible-collection-1.6.0-1.el8ev.noarch

Test Steps:
1. Deploy hosted engine in a pure IPv6 environment.

Test Result:
Hosted engine deployment failed with "/proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra: No such file or directory".

ovirt-hosted-engine-setup-ansible-bootstrap_local_vm-20210820131658-v01vav.log:

2021-08-20 13:18:05,349+0800 ERROR ansible failed {
    "ansible_host": "localhost",
    "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
    "ansible_result": {
        "_ansible_no_log": false,
        "changed": true,
        "cmd": "echo 2 > /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra",
        "delta": "0:00:00.003019",
        "end": "2021-08-20 13:18:05.152754",
        "invocation": {
            "module_args": {
                "_raw_params": "echo 2 > /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra",
                "_uses_shell": true,
                "argv": null,
                "chdir": null,
                "creates": null,
                "executable": null,
                "removes": null,
                "stdin": null,
                "stdin_add_newline": true,
                "strip_empty_ends": true,
                "warn": true
            }
        },
        "msg": "non-zero return code",
        "rc": 1,
        "start": "2021-08-20 13:18:05.149735",
        "stderr": "/bin/sh: /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra: No such file or directory",
        "stderr_lines": [
            "/bin/sh: /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra: No such file or directory"
        ],
        "stdout": "",
        "stdout_lines": []
    },
    "ansible_task": "Accept IPv6 Router Advertisements",
    "ansible_type": "task",
    "status": "FAILED",
    "task_duration": 0
}

Attaching screenshot and log files in attachment.
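Side note: the failing task shells out to write a sysctl for an interface that apparently does not exist yet at that point in the deployment. A minimal defensive variant — purely a sketch, not necessarily the fix that actually landed — would check for the path first and skip the write when it is absent:

    # Hypothetical hardening of the "Accept IPv6 Router Advertisements" task
    - name: Check whether ovirtmgmt exposes IPv6 sysctls
      ansible.builtin.stat:
        path: /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra
      register: accept_ra_path

    - name: Accept IPv6 Router Advertisements
      ansible.posix.sysctl:
        name: net.ipv6.conf.ovirtmgmt.accept_ra
        value: '2'
        sysctl_set: true    # apply immediately with sysctl -w
      when: accept_ra_path.stat.exists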
(In reply to Wei Wang from comment #18)
> Test with latest 4.4.8 build, I detected a similar bug.
> [...]
> Attaching screenshot and log files in attachment.

Tested with RHVH-4.4-20210829.0-RHVH-x86_64-dvd1.iso and rhvm-appliance-4.4-20210827.0.el8ev.rpm: the hosted engine can now be deployed successfully in a pure IPv6 environment. This issue is gone.
Further to comment #29, moving to verified.
(In reply to Nikolai Sednev from comment #30)
> Further to comment #29, moving to verified.

Comment 18 mentions that HE deployment failed due to the change we added in order to fix this bug.
Comment 29 mentions that the issue described in comment 18 is now fixed (after an additional fix was made), but AFAIU the original bug is not verified.
We still need to verify that the bootstrap engine VM, during HE deployment, can reach IPv6 networks outside the host (verification steps are described in comment 6).
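For anyone picking up the verification: comment 6 is not quoted here, but the essence of the check is that the default libvirt network is NATed and the bootstrap VM has IPv6 connectivity beyond the host. Something along these lines (the addresses are placeholders, to be taken from the actual environment):

# virsh -r net-dumpxml default
  (expect <forward mode='nat'> with <nat ipv6='yes'/>)
# virsh -r domifaddr HostedEngineLocal
  (note the VM's IPv6 address from the dnsmasq lease)
# ssh root@<vm-ipv6-address> 'ping6 -c3 <ipv6-host-outside-the-deployment-host>'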
(In reply to Asaf Rachmani from comment #31)
> (In reply to Nikolai Sednev from comment #30)
> > Further to comment #29, moving to verified.
>
> Comment 18 mentions that HE deployment failed due to the change we added
> in order to fix this bug.
> Comment 29 mentions that the issue described in comment 18 is now fixed
> (after an additional fix was made), but AFAIU the original bug is not
> verified.

Wei, can you provide your input on this?

> We still need to verify that the bootstrap engine VM, during HE deployment,
> can reach IPv6 networks outside the host (verification steps are described
> in comment 6).
(In reply to Nikolai Sednev from comment #32)
> (In reply to Asaf Rachmani from comment #31)
> > (In reply to Nikolai Sednev from comment #30)
> > > Further to comment #29, moving to verified.
> >
> > Comment 18 mentions that HE deployment failed due to the change we added
> > in order to fix this bug.
> > Comment 29 mentions that the issue described in comment 18 is now fixed
> > (after an additional fix was made), but AFAIU the original bug is not
> > verified.
> Wei, can you provide your input on this?

Nikolai, the IPv6 HE environment has been provided to you via chat. Hope it helps you verify this. :)

> > We still need to verify that the bootstrap engine VM, during HE deployment,
> > can reach IPv6 networks outside the host (verification steps are described
> > in comment 6).
Moving to 4.4.9 due to QE capacity.
Thanks to Germano Veit Michel and his findings in comment #46, we can move this bug to verified. If there are still any issues related to this bug, please reopen.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: RHV Engine and Host Common Packages security update [ovirt-4.4.9]), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:4703