Bug 1642491

Summary: [Tracker for bug 1107803] Second VLAN on same bond gets disabled during hosted engine install
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: rhhi
Version: rhhiv-1.5
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Keywords: Tracking
Reporter: Krist van Besien <kvanbesi>
Assignee: Gobinda Das <godas>
QA Contact: SATHEESARAN <sasundar>
CC: danken, rhs-bugs, sabose
Target Milestone: ---
Target Release: ---
Last Closed: 2020-11-17 12:31:05 UTC
Type: Bug
Bug Depends On: 1107803

Description Krist van Besien 2018-10-24 14:11:20 UTC
Description of problem:

When two VLAN interfaces are defined on the same bond and one of them is marked for use by the ovirtmgmt bridge, the other VLAN interface gets disabled as soon as vdsm takes control of the network.
If this second VLAN is used as the backend network for storage, the hosted engine deploy fails, as the gluster volumes are no longer reachable.

Version-Release number of selected component (if applicable):
RHHI 2
RHVH 4.2.6


How reproducible:

Always.

Steps to Reproduce:
1. Install 3 hosts with RHVH 4.2.6
2. Set up networking as follows (a rough sketch of the commands is given after these steps):

Enslave all interfaces to bond0.
Create bond0.1 and bond0.2 as two VLAN interfaces. bond0.1 will be used for ovirtmgmt, bond0.2 for storage.
3. Do the gluster part of the hosted engine deploy, using the IPs on bond0.2. This should succeed.
4. Do the hosted engine part of the deploy. This fails.
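
For reference, the layout in step 2 could be prepared with NetworkManager roughly as below. This is only a minimal sketch, not the exact commands used: the NIC names (eno1, eno2), the bond mode, and the VLAN IDs and addresses are placeholders, and nmcli syntax varies slightly between releases.

  # bond0 with the physical NICs enslaved (names and mode are placeholders)
  nmcli con add type bond con-name bond0 ifname bond0 mode active-backup
  nmcli con add type bond-slave con-name bond0-eno1 ifname eno1 master bond0
  nmcli con add type bond-slave con-name bond0-eno2 ifname eno2 master bond0
  # two VLANs on the same bond: bond0.1 for ovirtmgmt, bond0.2 for gluster storage
  nmcli con add type vlan con-name bond0.1 ifname bond0.1 dev bond0 id 1 ip4 192.0.2.11/24
  nmcli con add type vlan con-name bond0.2 ifname bond0.2 dev bond0 id 2 ip4 198.51.100.11/24
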
Actual results:


Expected results:

Hosted engine deploy should succeed.

Additional info:

vdsm does not appear to use NetworkManager for network configuration. What we observed is that when vdsm takes control of the network, it moves bond0, bond0.1, and all the slaves out of NM control and adds the ovirtmgmt bridge. As a side effect, the bond0.2 VLAN interface goes down and is not brought back up.
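
The effect can be seen on the host with standard tools (a quick sanity check, nothing specific to this reproducer):

  # after vdsm acquires the network, bond0, bond0.1 and the slaves show as unmanaged
  nmcli -f DEVICE,TYPE,STATE device status
  # bond0.2 is left down and no longer carries its storage address
  ip -d link show bond0.2
  ip addr show bond0.2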

As a workaround, we re-enabled bond0.2 as a non-NetworkManager-controlled interface, using an ansible playbook that we ran in between the two phases of the hosted engine deploy (sketched below).
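
The playbook itself is not attached; the manual equivalent of what it does, assuming an initscripts-style ifcfg file (all values below are illustrative), would be:

  # /etc/sysconfig/network-scripts/ifcfg-bond0.2
  # placeholder storage address and prefix
  DEVICE=bond0.2
  VLAN=yes
  ONBOOT=yes
  BOOTPROTO=none
  IPADDR=198.51.100.11
  PREFIX=24
  NM_CONTROLLED=no

  # run between the gluster and hosted-engine phases of the deploy
  ifup bond0.2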

Comment 2 Krist van Besien 2018-10-24 14:13:30 UTC
Seems related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=1107803

Comment 3 Sahina Bose 2018-10-30 06:03:55 UTC
Dan, can you review whether this bug is a known limitation of vdsm networking?

Comment 4 Dan Kenigsberg 2018-11-12 14:04:20 UTC
Yes, this is a Vdsm limitation: Vdsm acquires NM-generated NICs when an oVirt network is defined on top of them.

The issue described here is a particularly painful effect of this limitation. We would be able to overcome it only with bug 1107803.

Comment 5 Gobinda Das 2020-11-17 12:31:05 UTC
This issue was already fixed a while back, as per comment #4, so closing this.