Bug 1650650

Summary: Fail to activate Host with fcoe devices after upgrade
Product: Red Hat Enterprise Virtualization Manager
Reporter: Javier Coscia <jcoscia>
Component: vdsm
Assignee: Edward Haas <edwardh>
Status: CLOSED DEFERRED
QA Contact: Michael Burman <mburman>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.2.5
CC: danken, dholler, edwardh, jcoscia, lsurette, mburman, mkalinin, mtessun, srevivo, ycui
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-12-21 15:08:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Javier Coscia 2018-11-16 17:55:48 UTC
Description of problem:

Host activation fails and the host remains in a non-operational state because the engine cannot set up the `rhevm` bridge, reporting that it is already in use.


Version-Release number of selected component (if applicable):

ovirt-engine-4.2.6.4-0.1.el7ev.noarch


Problematic version

Red Hat Virtualization Host 4.2.5 (el7.5)
redhat-virtualization-host-image-update-4.2-20180813.0.el7_5.noarch
rhvh-4.2.5.2-0.20180813.0+1
vdsm-4.20.35-1.el7ev.x86_64


Working version

Red Hat Virtualization Host 4.1 (el7.4)
rhvh-4.1-0.20180102.0+1
vdsm-4.19.43-3.el7ev.x86_64


How reproducible:
Always in the user's setup

Steps to Reproduce:
1. Have an active RHV hypervisor on version 4.1 with FCoE NIC devices
2. Upgrade the host from rhvh-4.1.x to 4.2.x
3. Installation succeeds, but activation fails and the host is placed into a non-operational state due to missing networks



Actual results:
event: Host rhvh1.example.com installation failed. Failed to configure management network on the host. >> Non-operational

Failed to find a valid interface for the management network of host rhvh1.example.com. If the interface rhevm is a bridge, it should be torn-down manually.
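
For reference, a manual teardown of a leftover `rhevm` bridge would look roughly like the sketch below. This is only an illustration, assuming `rhevm` is an ordinary Linux bridge handled by the legacy network scripts; it is not a recovery procedure confirmed in this case.

ip link set rhevm down    # take the bridge down
brctl delbr rhevm         # or: ip link delete rhevm type bridge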



Expected results:

Host should be activated after upgrade

Additional info:

Host has fcoe devices configured.

Comment 17 Dan Kenigsberg 2018-12-11 11:25:07 UTC
ovirt-4.2.8 is going to ship with a vdsm configurable, lldp_enable, which can be set to False in order to avoid the adverse effects of lldpad on this hardware. This is the best workaround that we can ship now. If this is not enough, please reopen.
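
For illustration only, the setting would presumably be applied like the sketch below; the [vars] section is an assumption based on the hidden_vlans example later in this bug, and the option name is taken from the comment above.

# /etc/vdsm/vdsm.conf (assumed location of the new configurable)
[vars]
lldp_enable = false

# restart vdsm so the option is picked up
systemctl restart vdsmd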

Comment 19 Dan Kenigsberg 2018-12-11 11:27:25 UTC
Sorry, wrong bug.

Here our problem seems more complex; it is as if fcoe automatically creates a configuration that vdsm does not support.

Comment 25 Michael Burman 2018-12-18 18:56:10 UTC
Dan, see Javier's needinfo in the previous comment.

Comment 28 Michael Burman 2018-12-19 07:19:31 UTC
hidden_vlans in /etc/vdsm/vdsm.conf 

Update from QE - we have tested the hidden_vlans option in vdsm.conf and it is working.


I have tested the following scenario -
- Create bond0 mode=4 via RHV manager

ens1f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000

- Manually create VLAN-tagged interfaces (on the bond's slaves) using ifcfg-* files (not NM) on the host; RHV is not involved at this point. A sketch of such an ifcfg file follows the output below.

ens1f0.155@ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
ens1f1.156@ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
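
For illustration, a minimal ifcfg file for one of these VLAN interfaces might look like the sketch below; the exact contents used in this test are not included in the report, so the options shown are assumptions.

# /etc/sysconfig/network-scripts/ifcfg-ens1f0.155 (assumed contents)
DEVICE=ens1f0.155
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no

Bring the interface up with `ifup ens1f0.155` once the file is in place.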

- Try to attach a logical network to bond0 via RHV manager - Failed as expected with:
VDSM <hostname> command HostSetupNetworksVDS failed: nic ens1f0 already used by (155,)

- Edit /etc/vdsm/vdsm.conf with
[vars]
hidden_vlans = ens1f0.155,ens1f1.156   (note that a space after the comma does not work)
- Restart vdsmd
- Attach network(s) to bond0 - SUCCEEDED
brctl show 
test-net                8000.0015173dcdce       no              bond

The workaround is working.
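
For reference, the workaround collected in one place (the restart command assumes a systemd-based host, which RHV-H 4.2 is):

# /etc/vdsm/vdsm.conf
[vars]
hidden_vlans = ens1f0.155,ens1f1.156

# restart vdsm so the new value is read
systemctl restart vdsmd

Then attach the logical network(s) to bond0 from the RHV Manager as usual.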