Bug 1947709 - [IPv6] HostedEngineLocal is an isolated libvirt network, breaking upgrades from 4.3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-ansible-collection
Version: 4.4.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ovirt-4.4.9
Target Release: ---
Assignee: Asaf Rachmani
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-09 01:45 UTC by Germano Veit Michel
Modified: 2021-11-16 14:45 UTC (History)

Fixed In Version: ovirt-ansible-collection-1.6.1-1.el8ev
Doc Type: Bug Fix
Doc Text:
Previously, upgrading from Red Hat Virtualization 4.3 failed when using an isolated network during IPv6 deployment. In this release, a forward network is used instead of an isolated network during an IPv6 deployment. As a result, upgrade from Red Hat Virtualization 4.3 using IPv6 now succeeds.
Clone Of:
Environment:
Last Closed: 2021-11-16 14:45:46 UTC
oVirt Team: Integration
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Github oVirt ovirt-ansible-collection pull 315 0 None closed roles: hosted_engine_setup: Use forward network during an IPv6 deployment 2021-07-06 10:42:56 UTC
Github oVirt ovirt-ansible-collection pull 331 0 None None None 2021-08-24 10:27:12 UTC
Red Hat Knowledge Base (Solution) 5950521 0 None None None 2021-04-09 03:23:22 UTC
Red Hat Product Errata RHSA-2021:4703 0 None None None 2021-11-16 14:45:57 UTC

Description Germano Veit Michel 2021-04-09 01:45:15 UTC
Description of problem:

An IPv4 deployment of Hosted-Engine sets up NAT on the libvirt network, like this:

# virsh -r net-dumpxml default
<network>
  <name>default</name>
  <uuid>1cb6f807-270f-497f-b3f8-172260946b47</uuid>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:9c:99:8f'/>
  <ip address='192.168.222.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.222.2' end='192.168.222.254'/>
    </dhcp>
  </ip>
</network>

But an IPv6 deployment sets up an isolated network, so HostedEngineLocal cannot talk to any host other than the deployment host.

# virsh -r net-dumpxml default
<network>
  <name>default</name>
  <uuid>d074ab29-4e1d-4bbd-b471-e0f8d6a2776b</uuid>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:36:aa:e5'/>
  <ip family='ipv6' address='fd00:1234:5678:900::1' prefix='64'>
    <dhcp>
      <range start='fd00:1234:5678:900::10' end='fd00:1234:5678:900::ff'/>
    </dhcp>
  </ip>
</network>
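
The difference between the two dumps is whether a <forward> element is present. As a minimal sketch (not part of the deployment code), the check can be written in Python. The function name forward_mode and the trimmed XML constants are illustrative assumptions; the fallback to 'nat' when <forward> has no mode attribute follows the libvirt network format documentation.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two dumps above (uuid, mac and dhcp omitted).
IPV4_NET = """
<network>
  <name>default</name>
  <forward mode='nat'>
    <nat><port start='1024' end='65535'/></nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip address='192.168.222.1' netmask='255.255.255.0'/>
</network>
"""

IPV6_NET = """
<network>
  <name>default</name>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip family='ipv6' address='fd00:1234:5678:900::1' prefix='64'/>
</network>
"""


def forward_mode(network_xml: str) -> str:
    """Return the forward mode of a libvirt network, or 'isolated' if it has no <forward>."""
    root = ET.fromstring(network_xml)
    forward = root.find("forward")
    if forward is None:
        # No <forward> element: guests cannot reach anything beyond the host.
        return "isolated"
    # libvirt assumes mode='nat' when <forward> carries no mode attribute.
    return forward.get("mode", "nat")


if __name__ == "__main__":
    print("IPv4 deployment:", forward_mode(IPV4_NET))
    print("IPv6 deployment:", forward_mode(IPV6_NET))
```

On a real host the input would be the output of `virsh -r net-dumpxml default` rather than the inline constants.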

This is a problem because the HostedEngineLocal VM cannot reach anything outside the host. During an upgrade from RHV 4.3 it therefore cannot reach the SPM host, which is functioning fine on the network. Without access to the SPM, the Data Center is down, so the engine cannot set up the new hosted_storage for the RHV 4.4 Hosted-Engine and the deployment fails:

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[Failed to attach Storage due to an error on the Data Center master Storage Domain.\n-Please activate the master Storage Domain first.]\". HTTP response code is 409."}

The isolated network is configured here: https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml#L17

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.4.9-2.el8ev.noarch
vdsm-4.40.40-1.el8ev.x86_64
ovirt-ansible-collection-1.2.4-1.el8ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy IPv6 HostedEngine
2. Try to ping outside from HostedEngineLocal (before adding storage details)

Additional details:
* Libvirt since 6.5.0 accepts IPv6 NAT, like we do for IPv4: https://libvirt.org/formatnetwork.html#examplesNATv6

Workaround:

I had to do two things to make this work in a test:
1) Enable accept_ra on ovirtmgmt (otherwise libvirt fails to bring up the default network with NAT):
$ echo 2 > /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra

2) Modify the Ansible role to set up NAT for IPv6:
$ git diff
diff --git a/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml b/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml
index ef813c3..ce7423f 100644
--- a/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml
+++ b/roles/hosted_engine_setup/tasks/alter_libvirt_default_net_configuration.yml
@@ -14,15 +14,29 @@
         xpath: /network/ip
         state: absent
       register: editednet_noipv4
-    - name: Configure it as an isolated network
+    - name: Configure it as an forward network
       xml:
         xmlstring: "{{ editednet_noipv4.xmlstring }}"
         xpath: /network/forward
-        state: absent
-      register: editednet_isolated
+        state: present
+      register: editednet_forward
+    - name: Edit libvirt default network configuration, enable NAT
+      xml:
+        xmlstring: "{{ editednet_forward.xmlstring }}"
+        xpath: /network/forward
+        attribute: mode
+        value: "nat"
+      register: editednet_nat
+    - name: Edit libvirt default network configuration, enable NAT IPv6
+      xml:
+        xmlstring: "{{ editednet_nat.xmlstring }}"
+        xpath: /network/forward/nat
+        attribute: ipv6
+        value: "yes"
+      register: editednet_natipv6
     - name: Edit libvirt default network configuration, set IPv6 address
       xml:
-        xmlstring: "{{ editednet_isolated.xmlstring }}"
+        xmlstring: "{{ editednet_natipv6.xmlstring }}"
         xpath: /network/ip[@family='ipv6']
         attribute: address
         value: "{{ he_ipv6_subnet_prefix + '::1' }}"
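
The net effect of the added tasks on the network XML can be sketched outside Ansible. The following Python is only an illustration of the transformation (ensure <forward> exists, set mode='nat', set ipv6='yes' on <nat>). The function name enable_ipv6_nat and the sample input are assumptions, and the new element is simply appended rather than placed at a particular position.

```python
import xml.etree.ElementTree as ET

# Isolated IPv6 network, trimmed from the dump in the description.
ISOLATED_NET = """
<network>
  <name>default</name>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip family='ipv6' address='fd00:1234:5678:900::1' prefix='64'/>
</network>
"""


def enable_ipv6_nat(network_xml: str) -> str:
    """Return the network XML with <forward mode='nat'><nat ipv6='yes'/></forward> ensured."""
    root = ET.fromstring(network_xml)

    # Step 1: make sure /network/forward exists (state: present).
    forward = root.find("forward")
    if forward is None:
        forward = ET.SubElement(root, "forward")

    # Step 2: set the forward mode to NAT.
    forward.set("mode", "nat")

    # Step 3: enable NAT for IPv6 (accepted by libvirt since 6.5.0).
    nat = forward.find("nat")
    if nat is None:
        nat = ET.SubElement(forward, "nat")
    nat.set("ipv6", "yes")

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(enable_ipv6_nat(ISOLATED_NET))
```

The function is idempotent: running it on its own output leaves a single <forward> element with the same attributes.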

Comment 18 Wei Wang 2021-08-23 09:23:43 UTC
Testing with the latest 4.4.8 build, I detected a similar bug.

Test Version
RHVH-4.4-20210818.0-RHVH-x86_64-dvd1.iso
cockpit-ws-238.2-1.el8.x86_64
cockpit-bridge-238.2-1.el8.x86_64
cockpit-ovirt-dashboard-0.15.1-2.el8ev.noarch
cockpit-system-238.2-1.el8.noarch
cockpit-storaged-238.2-1.el8.noarch
cockpit-238.2-1.el8.x86_64
subscription-manager-cockpit-1.28.13-3.el8_4.noarch
ovirt-hosted-engine-ha-2.4.8-1.el8ev.noarch
ovirt-hosted-engine-setup-2.5.3-1.el8ev.noarch
rhvm-appliance-4.4-20210715.0.el8ev.x86_64
ovirt-ansible-collection-1.6.0-1.el8ev.noarch

Test Steps:
1. Deploy hosted engine in a pure IPv6 environment.

Test Result:
Hosted engine deployment failed with "/proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra: No such file or directory"

ovirt-hosted-engine-setup-ansible-bootstrap_local_vm-20210820131658-v01vav.log
2021-08-20 13:18:05,349+0800 ERROR ansible failed {
    "ansible_host": "localhost",
    "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
    "ansible_result": {
        "_ansible_no_log": false,
        "changed": true,
        "cmd": "echo 2 > /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra",
        "delta": "0:00:00.003019",
        "end": "2021-08-20 13:18:05.152754",
        "invocation": {
            "module_args": {
                "_raw_params": "echo 2 > /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra",
                "_uses_shell": true,
                "argv": null,
                "chdir": null,
                "creates": null,
                "executable": null,
                "removes": null,
                "stdin": null,
                "stdin_add_newline": true,
                "strip_empty_ends": true,
                "warn": true
            }
        },
        "msg": "non-zero return code",
        "rc": 1,
        "start": "2021-08-20 13:18:05.149735",
        "stderr": "/bin/sh: /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra: No such file or directory",
        "stderr_lines": [
            "/bin/sh: /proc/sys/net/ipv6/conf/ovirtmgmt/accept_ra: No such file or directory"
        ],
        "stdout": "",
        "stdout_lines": []
    },
    "ansible_task": "Accept IPv6 Router Advertisements",
    "ansible_type": "task",
    "status": "FAILED",
    "task_duration": 0
}
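
The task fails because the write targets a sysctl path for an interface that does not exist at that point in the deployment. As a hedged illustration only (the log above does not show how the follow-up fix handles this), a guard that checks for the interface before writing could look like the sketch below. The function name set_accept_ra and the proc_root parameter are hypothetical; proc_root exists only so the function can be exercised against a temporary directory.

```python
from pathlib import Path


def set_accept_ra(iface: str, value: int = 2,
                  proc_root: str = "/proc/sys/net/ipv6/conf") -> bool:
    """Write accept_ra for iface if the interface exists; return True if written.

    A value of 2 tells the kernel to accept Router Advertisements even when
    forwarding is enabled on the interface.
    """
    target = Path(proc_root) / iface / "accept_ra"
    if not target.exists():
        # Interface (for example ovirtmgmt) is not present yet: skip instead of failing.
        return False
    target.write_text(f"{value}\n")
    return True


if __name__ == "__main__":
    # Needs root on a real host; prints False when ovirtmgmt is absent.
    print(set_accept_ra("ovirtmgmt"))
```

Whether skipping is the right behaviour depends on the deployment flow; an alternative would be to apply the setting to whichever interface actually carries the management network.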

Screenshot and log files are attached.

Comment 29 Wei Wang 2021-08-31 04:31:29 UTC
(In reply to Wei Wang from comment #18)

Tested with RHVH-4.4-20210829.0-RHVH-x86_64-dvd1.iso and rhvm-appliance-4.4-20210827.0.el8ev.rpm: hosted engine can be deployed successfully in an IPv6 environment.

This issue is gone.

Comment 30 Nikolai Sednev 2021-08-31 06:47:19 UTC
Further to comment #29, moving to verified.

Comment 31 Asaf Rachmani 2021-08-31 09:17:29 UTC
(In reply to Nikolai Sednev from comment #30)
> Forth to comment #29, moving to verified.

Comment 18 mentions that HE deployment failed due to the change we have added in order to fix this bug.
Comment 29 mentions that the issue described in comment 18 is fixed now (after an additional fix has been made), but AFAIU the original bug is not verified.
We still need to verify that the bootstrap engine VM, during HE deployment, can reach IPv6 networks outside the host (verification steps are described in comment 6).

Comment 32 Nikolai Sednev 2021-08-31 11:55:29 UTC
(In reply to Asaf Rachmani from comment #31)
> (In reply to Nikolai Sednev from comment #30)
> > Forth to comment #29, moving to verified.
> 
> Comment 18 mentions that HE deployment failed due to the change we have
> added in order to fix this bug.
> Comment 29 mentions that the issue described in comment 18 is fixed now
> (after an additional fix has been made), but AFAIU the original bug is not
> verified.
Wei, can you provide your input on this?
> We still need to verify that the bootstrap engine VM, during HE deployment,
> can reach IPv6 networks outside the host (verification steps are described
> in comment 6).

Comment 33 Wei Wang 2021-09-01 00:22:27 UTC
(In reply to Nikolai Sednev from comment #32)
> (In reply to Asaf Rachmani from comment #31)
> > (In reply to Nikolai Sednev from comment #30)
> > > Forth to comment #29, moving to verified.
> > 
> > Comment 18 mentions that HE deployment failed due to the change we have
> > added in order to fix this bug.
> > Comment 29 mentions that the issue described in comment 18 is fixed now
> > (after an additional fix has been made), but AFAIU the original bug is not
> > verified.
> Wei, can you provide your input on this?
Nikolai, the IPv6 HE environment has been provided to you via chat. I hope it helps you verify it. :)

> > We still need to verify that the bootstrap engine VM, during HE deployment,
> > can reach IPv6 networks outside the host (verification steps are described
> > in comment 6).

Comment 40 Sandro Bonazzola 2021-09-06 07:08:16 UTC
Moving to 4.4.9 due to QE capacity.

Comment 47 Nikolai Sednev 2021-10-07 06:15:28 UTC
Thanks to Germano Veit Michel and his findings in comment #46, we can move this bug to verified.
If there are still any issues related to this bug, please reopen it.

Comment 51 errata-xmlrpc 2021-11-16 14:45:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: RHV Engine and Host Common Packages security update [ovirt-4.4.9]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4703

