Bug 742047 - Ability to add a RHEL host to RHEV with a preconfigured bond
Summary: Ability to add a RHEL host to RHEV with a preconfigured bond
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-host-deploy
Classification: oVirt
Component: Plugins.VDSM
Version: ---
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 1.0.0
Assignee: Alon Bar-Lev
QA Contact: Martin Pavlik
URL:
Whiteboard: network
Duplicates: 531390
Depends On: 761411 bootstrap-rewrite
Blocks: 731146
Reported: 2011-09-28 20:51 UTC by John Brier
Modified: 2019-04-28 09:58 UTC (History)

Fixed In Version: sf2
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Network
Embargoed:
danken: devel_ack+


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 63343 0 None None None 2012-09-26 19:25:32 UTC

Description John Brier 2011-09-28 20:51:44 UTC
Description of problem:

# Who is the customer behind the request?
US Courts System Deploy Sppt Div

Account name:
448897

Customer segment:
2

TAM/SRM customer yes/no: 
yes

VHT score: 
2

# What is the nature and description of the request?
Ability to add a RHEL host to RHEV that has a preconfigured bond

# Why does the customer need this? (List the business requirements here)
US courts uses cobbler to provision all the hosts in their data center
The administrators connect to the hosts from their workstations over a network segment which is connected to eth0 on the hosts in the data center.

The hosts are only able to connect to RHEV Manager via a bond on eth8/eth9

Cobbler is not able to configure a bridge on a host (RFE is open on this https://fedorahosted.org/cobbler/ticket/670)

Currently they cannot use cobbler to set up the required 'rhevm' bridge, so the host fails to add to RHEV with the following error in the Event log:

Host <hostname or name> is missing the following networks: 'rhevm'

/tmp/vds_bootstrap.<number>.log shows the following as well:


(note these logs and above event log message are from a RHEV 2.2 setup, but this issue is reproducible on RHEV 3.0 beta2 as well, see below)
Tue, 27 Sep 2011 15:17:54 DEBUG    Bridge rhevm not found, need to create it.
Tue, 27 Sep 2011 15:17:54 DEBUG    getAddress Entry. url=http://rhev5-m.gsslab.rdu.redhat.com/components/4/5/vds
Tue, 27 Sep 2011 15:17:54 DEBUG    getAddress return. address=rhev5-m.gsslab.rdu.redhat.com port=None
Tue, 27 Sep 2011 15:17:54 DEBUG    makeBridge begin.
Tue, 27 Sep 2011 15:17:54 DEBUG    ['/bin/rpm', '-q', 'ovirt-node']
Tue, 27 Sep 2011 15:17:54 DEBUG    package ovirt-node is not installed

Tue, 27 Sep 2011 15:17:54 DEBUG    
Tue, 27 Sep 2011 15:17:54 DEBUG    _getMGTIface: read vdc_host_name: rhev5-m.gsslab.rdu.redhat.com
Tue, 27 Sep 2011 15:17:54 DEBUG    _getMGTIface: using vdc_host_name rhev5-m.gsslab.rdu.redhat.com strVDCIP= 10.12.57.96
Tue, 27 Sep 2011 15:17:54 DEBUG    _getMGTIface VDC IP=10.12.57.96 strIface=bond0
Tue, 27 Sep 2011 15:17:54 DEBUG    makeBridge found the following bridge paramaters: ['ONBOOT=yes', 'BOOTPROTO=dhcp', 'USERCTL=no', 'BONDING_OPTS="mode=active-backup miimon=100"']
Tue, 27 Sep 2011 15:17:54 ERROR    _getRHELBridgeParams Found bonding: bond0. This network configuration is not supported! Please configure rhevm bridge manually and re-install.
Tue, 27 Sep 2011 15:17:54 ERROR    makeBridge errored:  out=
err=None
ret=None
Tue, 27 Sep 2011 15:17:54 DEBUG    makeBridge return.
Tue, 27 Sep 2011 15:17:54 ERROR    addNetwork errored trying to add rhevm bridge
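
When several hosts need to be triaged, the unsupported-bond failure can be picked out of a bootstrap log mechanically. The following is a minimal sketch and not part of vds_bootstrap: the function name find_unsupported_bond and the regular expression are assumptions derived only from the ERROR line quoted above, and the message wording may differ between vdsm versions.

```python
import re

# Pattern derived from the ERROR line quoted in this report (assumption:
# other vdsm versions may word the message differently).
_BOND_ERROR = re.compile(
    r"ERROR\s+_getRHELBridgeParams Found bonding: (\S+?)\.\s"
)


def find_unsupported_bond(log_text):
    """Return the bond name reported as unsupported, or None if absent."""
    match = _BOND_ERROR.search(log_text)
    return match.group(1) if match else None


sample = (
    "Tue, 27 Sep 2011 15:17:54 DEBUG    makeBridge begin.\n"
    "Tue, 27 Sep 2011 15:17:54 ERROR    _getRHELBridgeParams Found bonding: "
    "bond0. This network configuration is not supported! Please configure "
    "rhevm bridge manually and re-install.\n"
)
print(find_unsupported_bond(sample))
```

The bond name appears before the point where long log lines tend to wrap, so the pattern should also match logs that were hard-wrapped on collection.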

NOTE: The customer found an odd workaround to add a host with a preconfigured bond. 

In RHEV Admin portal 2.2.4.5290 I can reproduce their workaround (I no longer have the vdsm version available):

After host fails to add with 'Host <hostname or name> is missing the following networks: 'rhevm'' and goes non-operational do the following:

Resolution 1

   1. Put host in maintenance mode
   2. In Details pane click Network interfaces
   3. Click bond0
   4. Click 'Edit/Add VLAN'
   5. Make sure network is set to 'rhevm'
   6. Click Ok
After a few minutes the host should switch to Up status.

The workaround above is documented with screenshots at the following kbase:

https://access.redhat.com/kb/docs/DOC-63358 (use redhat.com user/pass)
https://access.redhat.com/kb/docs/DOC-63343 (use kerberos user/pass)

Because of this workaround, to the customer, this limitation seems arbitrary. 

Note that this workaround does not work in RHEV 3.0 external beta 2. When you click "Ok" in the Edit Network Interface dialog you get:

Operation Canceled
rhevh-5.gsslab.rdu.redhat.com
* Specified network doesn't exist

in /var/log/rhevm/rhevm.log you get:

2011-09-28 16:38:57,134 WARN  [org.nogah.bll.UpdateNetworkToVdsInterfaceCommand] (http-0.0.0.0-8443-3) CanDoAction of action UpdateNetworkToVdsInterface failed. Reasons:NETWORK_NETWORK_NOT_EXISTS

Because this workaround doesn't work in RHEV 3.0 I'm sure this RFE will be more important to US Courts once they start needing to upgrade to RHEV 3.0 for other reasons.

# How would the customer like to achieve this? (List the functional
requirements here)
RHEV can attach a 'rhevm' bridge to a preexisting bond device on a RHEL host.

# For each functional requirement listed in question 4, specify how Red Hat
and the customer can test to confirm the requirement is successfully
implemented.

On a RHEL host
configure eth0 with an IP
configure eth1/eth2 or some higher interfaces than eth0 in a bond
add the IP of the bond to RHEV Manager
host installs, reboots and switches to Up status.

# Is there already an existing RFE upstream or in Red Hat bugzilla?
no

# How quickly does this need resolved? (desired target release)
At least in RHEV 3.x

# Does this request meet the RHEL Bug and Feature Inclusion Criteria
N/A

# List the affected packages
RHEV M and vdsm potentially since vdsm_bootstrap complains bonding is not supported (is logic on RHEV M and vdsm bootstrap??)

# Would the customer be able to assist in testing this functionality if
implemented?
yes

Version-Release number of selected component (if applicable):

RHEV Manager for Servers and Desktops: 3.0.0_0001-41.el6

Red Hat Enterprise Virtualization Hypervisor Hosts:
	rhcx-1.gsslab.rdu.redhat.com:	OS Version - RHEL - 6Server - 6.1.0.2.el6	VDSM Version - 3.0.0.91 
	rhevh-5.gsslab.rdu.redhat.com:	OS Version - RHEL - 6Server - 6.1.0.2.el6	VDSM Version - 3.0.96.1 
	unused-57-122.gsslab.rdu.redhat.com:	OS Version - RHEL - 6Server - 6.1.0.2.el6	VDSM Version - 3.0.0.91 

^in the above only rhevh-5.gsslab.rdu.redhat.com was used to test for this RFE

How reproducible:
100%

Since I showed RHEV 2.2 logs in the top of this description, here are RHEV 3.0 logs:

Steps to Reproduce:
1. On a RHEL host
2. configure eth0 with an IP
3. configure eth1/eth2 or some higher interfaces than eth0 in a bond
4. add the host to RHEV Manager using the IP of the bond

  
Actual results:
host installs, reboots and switches to Non Operational status

 Event log shows:
Host rhevh-5.gsslab.rdu.redhat.com does not comply with the cluster Default networks, the following networks are missing on host: 'rhevm' 

/var/log/rhevm/rhevm.log shows

2011-09-28 15:37:12,419 INFO  [org.nogah.bll.VdsCommand] (pool-11-thread-3) Waiting 300 seconds, for server to finish reboot process.
2011-09-28 15:42:12,421 INFO  [org.nogah.vdsbroker.SetVdsStatusVDSCommand] (pool-11-thread-3) START, SetVdsStatusVDSCommand(vdsId = acd8dfd0-ea02-11e0-8fbf-00163e748d8b, status=NonResponsive, nonOperationalReason=NONE), log id: 50f5418d
2011-09-28 15:42:12,440 INFO  [org.nogah.vdsbroker.SetVdsStatusVDSCommand] (pool-11-thread-3) FINISH, SetVdsStatusVDSCommand, log id: 50f5418d
2011-09-28 15:42:14,247 INFO  [org.nogah.bll.SetNonOperationalVdsCommand] (QuartzScheduler_Worker-74) Running command: SetNonOperationalVdsCommand internal: true. Entities affected :  ID: acd8dfd0-ea02-11e0-8fbf-00163e748d8b Type: VDS
2011-09-28 15:42:14,249 INFO  [org.nogah.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-74) START, SetVdsStatusVDSCommand(vdsId = acd8dfd0-ea02-11e0-8fbf-00163e748d8b, status=NonOperational, nonOperationalReason=NETWORK_UNREACHABLE), log id: 566c5f4
2011-09-28 15:42:14,263 INFO  [org.nogah.vdsbroker.SetVdsStatusVDSCommand] (QuartzScheduler_Worker-74) FINISH, SetVdsStatusVDSCommand, log id: 566c5f4
2011-09-28 15:42:14,269 INFO  [org.nogah.bll.SetNonOperationalVdsCommand] (QuartzScheduler_Worker-74) Host rhevh-5.gsslab.rdu.redhat.com is set to Non-Operational, it is missing the following networks: rhevm, 

/var/log/vdsm/vdsm.log shows:

Wed, 28 Sep 2011 15:35:34 DEBUG    Bridge rhevm not found, need to create it.
Wed, 28 Sep 2011 15:35:34 DEBUG    getAddress Entry. url=http://rhev3test1.rhev.gsslab.rdu.redhat.com:8080/Components/vds/
Wed, 28 Sep 2011 15:35:34 DEBUG    getAddress return. address=rhev3test1.rhev.gsslab.rdu.redhat.com port=8080
Wed, 28 Sep 2011 15:35:34 DEBUG    makeBridge begin.
Wed, 28 Sep 2011 15:35:34 DEBUG    _getMGTIface: read vdc_host_name: rhev3test1.rhev.gsslab.rdu.redhat.com
Wed, 28 Sep 2011 15:35:34 DEBUG    _getMGTIface: using vdc_host_name rhev3test1.rhev.gsslab.rdu.redhat.com strVDCIP= 10.12.57.131
Wed, 28 Sep 2011 15:35:34 DEBUG    _getMGTIface VDC IP=10.12.57.131 strIface=bond0
Wed, 28 Sep 2011 15:35:34 DEBUG    makeBridge found the following bridge paramaters: ['ONBOOT=yes', 'BOOTPROTO=dhcp', 'USERCTL=no', 'BONDING_OPTS=mode=active-backup miimon=100']
Wed, 28 Sep 2011 15:35:34 ERROR    _getRHELBridgeParams Found bonding: bond0. This network configuration is not supported! Please configure rhevm bridge manually and re-install.
Wed, 28 Sep 2011 15:35:34 ERROR    makeBridge errored:  out=
err=None
ret=None
Wed, 28 Sep 2011 15:35:34 DEBUG    makeBridge return.
Wed, 28 Sep 2011 15:35:34 ERROR    addNetwork error trying to add rhevm bridge
Wed, 28 Sep 2011 15:35:34 DEBUG    <BSTRAP component='SetNetworking' status='FAIL' message='addNetwork error trying to add rhevm bridge'/>
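
The last line of the log above carries the bootstrap result as a small XML element. As a hedged illustration only, such elements could be collected from a log like this; the helper name bstrap_failures is hypothetical, and the sketch assumes each BSTRAP element sits on one line and contains no '>' inside its attributes.

```python
import re
import xml.etree.ElementTree as ET


def bstrap_failures(log_text):
    """Return a list of (component, message) for BSTRAP elements with status FAIL."""
    failures = []
    # Assumption: every BSTRAP element is self-closing and has no '>' in
    # its attribute values, as in the log line quoted in this report.
    for fragment in re.findall(r"<BSTRAP\b[^>]*/>", log_text):
        element = ET.fromstring(fragment)
        if element.get("status") == "FAIL":
            failures.append((element.get("component"), element.get("message")))
    return failures


sample = (
    "Wed, 28 Sep 2011 15:35:34 ERROR    addNetwork error trying to add rhevm bridge\n"
    "Wed, 28 Sep 2011 15:35:34 DEBUG    <BSTRAP component='SetNetworking' "
    "status='FAIL' message='addNetwork error trying to add rhevm bridge'/>\n"
)
for component, message in bstrap_failures(sample):
    print(component, "->", message)
```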

Expected results:

host installs, reboots and switches to Up status.


Additional info:

Please see the above "NOTE: The customer found an odd workaround to add a host with a preconfigured bond." 

I think this complicates this RFE since this probably shouldn't be allowed at all. It is probably a separate bug but because it is allowed it makes the customer think this is an arbitrary limitation (is it?)

Comment 1 John Brier 2011-09-28 20:55:07 UTC
(In reply to comment #0)
> /var/log/vdsm/vdsm.log shows:
> 
> Wed, 28 Sep 2011 15:35:34 DEBUG    Bridge rhevm not found, need to create it.

That should be /tmp/vds_bootstrap.457582.log

Comment 2 John Brier 2011-11-17 15:22:00 UTC
=== In Red Hat Customer Portal Case 00473041 ===
--- Comment by Brier, John on 11/17/2011 10:21 AM ---

This RFE might not be as important anymore for US Courts..

I was just checking up on the status of this and I was surprised to see the RFE for cobbler has apparently been fulfilled:

https://fedorahosted.org/cobbler/ticket/670

==
10/16/11 08:12:01 changed by unbeliever

    * status changed from new to closed.
    * state changed from Under review to Released.
    * version set to 2.1 devel.
    * resolution set to fixed.

this feature is available in cobbler 2.2.x
==


https://fedorahosted.org/pipermail/cobbler/2011-October/006751.html

==
Cobbler 2.2 has been tagged and released, with RPMS built and
available in EPEL-testing now. This release incorporates a ton of new
features:

* Import modules, which allowed easy integration of...
* Ubuntu and Debian support again!
* Better support for SuSE
* Support for FreeBSD
* Support for ESX 4+/ESXi
* Integration with the python TFTP server pytftpd
* "fetchable files" and "boot files" support for distros that need to
get more files from the TFTP server
* Faster sync using link cache
* Support for EFI grub booting
* Support for bridged interfaces <-------------------------------------------BRIDGE support!
* WSGI instead of mod_python for the web interface.
* Lots of Web UI improvements
* A lot more I'm sure I missed when going through the change log

Bugfixes:

* Seriously way too many to list individually. Read the change log,
there were almost 1000 commits since the last release!

This will also start the new development period, in which we will
target a major release every 6 months. That means we should release
2.4 in April of 2012, with periodic updates to 2.2.x monthly for bug
fixes. February 6th of 2012 will mark the end of the development
period for 2.4, after which only bug fixes will be applied to the
master branch until 2.4 is released.
==

I have asked the customer to try this.

Comment 3 Bryan Yount 2012-03-21 16:32:21 UTC
=== In Red Hat Customer Portal Case 00599673 ===
--- Comment by Yount, Bryan on 3/21/2012 12:32 PM ---

+1 for Qualcomm wanting this

They have a workaround for now but they want the ability to do this in a future version of RHEV. Perhaps 3.1? The customer commented that he "actually wrote a script that will pre-bond the interfaces, then create the rhevm bridge and migrate the configuration to it. After RHEV installs the host, it uses the existing rhevm bridge without an issue.  (If only that would work for the hosts' logical networks!). So while this isn't a super-high priority, it'd be nice to have fixed. Medium is fine."
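
The customer's script is not attached to this bug. Purely as an illustration of the layout it describes (bonded NICs, with the rhevm bridge on top of the bond and the IP configuration moved to the bridge), the sketch below renders RHEL-style ifcfg file contents as strings. Everything here is an assumption: the function names are hypothetical, the NIC names and bonding options are taken from the description above, and nothing is written to disk.

```python
def render_ifcfg(settings):
    """Render a dict of ifcfg keys as KEY=value lines."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def bridged_bond_configs(bridge, bond, nics, bonding_opts):
    """Return {file name: file content} for a bridge on top of a bond.

    Assumption: the IP configuration (DHCP here) lives on the bridge,
    while the bond and its slaves carry no address of their own.
    """
    configs = {
        f"ifcfg-{bridge}": {
            "DEVICE": bridge,
            "TYPE": "Bridge",
            "ONBOOT": "yes",
            "BOOTPROTO": "dhcp",
        },
        f"ifcfg-{bond}": {
            "DEVICE": bond,
            "ONBOOT": "yes",
            "BOOTPROTO": "none",
            "BRIDGE": bridge,
            "BONDING_OPTS": f'"{bonding_opts}"',
        },
    }
    for nic in nics:
        configs[f"ifcfg-{nic}"] = {
            "DEVICE": nic,
            "ONBOOT": "yes",
            "BOOTPROTO": "none",
            "MASTER": bond,
            "SLAVE": "yes",
        }
    return {name: render_ifcfg(settings) for name, settings in configs.items()}


configs = bridged_bond_configs(
    "rhevm", "bond0", ["eth8", "eth9"], "mode=active-backup miimon=100"
)
for name, content in configs.items():
    print(f"# {name}\n{content}")
```

On a real host these files would go under /etc/sysconfig/network-scripts/ and the network service would need a restart; review them against the site's own provisioning templates before use.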

Comment 4 Simon Grinberg 2012-04-22 09:50:17 UTC
(In reply to comment #3)
> === In Red Hat Customer Portal Case 00599673 ===
> --- Comment by Yount, Bryan on 3/21/2012 12:32 PM ---
> 
> +1 for Qualcomm wanting this
> 
> They have a workaround for now but they want the ability to do this in a future
> version of RHEV. Perhaps 3.1? The customer commented that he "actually wrote a
> script that will pre-bond the interfaces, then create the rhevm bridge and
> migrate the configuration to it. After RHEV installs the host, it uses the
> existing rhevm bridge without an issue.  (If only that would work for the
> hosts' logical networks!). So while this isn't a super-high priority, it'd be
> nice to have fixed. Medium is fine."

Bryan, this looks like a different issue. In the original BZ the problem is that their cobbler setup enforced them to start up with a bond. Here you are describing how to bond rhevm. So if the problem is bonding rhevm network after initial installation - this is already solved in 3.0.3

Comment 5 Bryan Yount 2012-04-25 20:13:59 UTC
(In reply to comment #4)
> Bryan, this looks like a different issue. In the original BZ the problem is
> that their cobbler setup enforced them to start up with a bond. Here you are
> describing how to bond rhevm. So if the problem is bonding rhevm network after
> initial installation - this is already solved in 3.0.3

Maybe my description of it was lacking. Basically, here's what Qualcomm just said about it: "This actually was in the event our install process creates the bond0 interface for us - there will be an existing bond0 interface and rhev-m won't create the 'rhevm' bridge on an already bonded adapter."

Comment 6 Itamar Heim 2012-05-30 11:43:20 UTC
this is not trivial to solve.
requesting this will be added to a future PRD in coordination between GSS and PM

Comment 7 Josep 'Pep' Turro Mauri 2012-06-07 10:37:25 UTC
British Airways (TAM, strategic, segment 1 customer) are also interested in this for the same reasons: all their hosts get provisioned with bonded interfaces. The outlook is that if they finally fully adopt RHEV they will be adding lots of hosts to it, so they'd like the process to be as straightforward as possible.

From the original description:

> Note this workaround does not work in RHEV 3.0 external beta 2,
> When you click "Ok" in the Edit Network Interface dialog you get 
> 
> Operation Canceled
> rhevh-5.gsslab.rdu.redhat.com
> * Specified network doesn't exist

This is bug 761411, and once addressed they will hopefully have a reasonable workaround, but still the interest of being able to do this without extra steps remains.

Comment 8 lpeer 2012-07-09 12:25:05 UTC
*** Bug 531390 has been marked as a duplicate of this bug. ***

Comment 9 Itamar Heim 2012-09-12 19:59:14 UTC
livnat - what is the gap from the current implementation?
is this a network management issue or more of a bootstrap issue?

Comment 10 Alon Bar-Lev 2012-11-28 08:43:54 UTC
new bootstrap does not support this either.

I guess bug#761411 should be validated first.

Comment 11 Alon Bar-Lev 2012-12-04 20:42:19 UTC
Dan,

So can I use addNetwork on bond interface in 3.2?

What is the exact syntax?

Thanks!

Comment 12 Dan Kenigsberg 2012-12-05 08:55:50 UTC
Usage:

/usr/share/vdsm/addNetwork bridge {vlan-id|''} {bonding|''} nic[,nic] [option=value]

For example: /usr/share/vdsm/addNetwork ovirtmgmt '' bond0 em1,em2 MTU=9000
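
For callers that build this command line programmatically, the positional layout in the usage string can be sketched as follows. This is an illustration only, not code from vdsm or ovirt-host-deploy: build_add_network_argv is a hypothetical helper, and it only assembles the argument list without running anything. It needs Python 3.8 or later for shlex.join.

```python
import shlex

ADD_NETWORK = "/usr/share/vdsm/addNetwork"  # path taken from the usage above


def build_add_network_argv(bridge, nics, vlan_id="", bonding="", options=None):
    """Assemble argv as: bridge {vlan-id|''} {bonding|''} nic[,nic] [option=value]."""
    if not nics:
        raise ValueError("at least one nic is required")
    # Empty strings stand in for the '' placeholders of the usage string.
    argv = [ADD_NETWORK, bridge, str(vlan_id), bonding, ",".join(nics)]
    for key, value in (options or {}).items():
        argv.append(f"{key}={value}")
    return argv


argv = build_add_network_argv(
    "ovirtmgmt", ["em1", "em2"], bonding="bond0", options={"MTU": 9000}
)
print(shlex.join(argv))
```

Passing the list to subprocess.run without a shell keeps the empty vlan-id argument intact, which is easy to lose when the command is built by string concatenation.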

Comment 13 Alon Bar-Lev 2012-12-05 11:34:08 UTC
commit aef14b90363ebacdba27940ffa6826ba5e1062a5
Author: Alon Bar-Lev <alonbl>
Date:   Wed Dec 5 13:25:40 2012 +0200

    vdsm: bridge: support bridge over bond interface
    
    Change-Id: Iace6ca78e494811f91b9df94b5312a0518cfcf78
    Signed-off-by: Alon Bar-Lev <alonbl>

http://gerrit.ovirt.org/#/c/9741/

NOTE: it will probably fail when trying to deploy vdsm of 3.0, as bug#761411 was not fixed in 3.0.z.

Comment 14 Alon Bar-Lev 2012-12-05 11:35:56 UTC
Needs to be tested within a specific rhev-h/bonding setup.

Comment 15 Alon Bar-Lev 2012-12-10 07:09:06 UTC
Modified per future rebase for 3.2.

Comment 16 Dan Yasny 2013-02-20 16:46:02 UTC
Alon, I see an upstream commit, where are we with this downstream?

Comment 17 Alon Bar-Lev 2013-02-20 21:31:05 UTC
(In reply to comment #16)
> Alon, I see an upstream commit, where are we with this downstream?

Untested, but should work in 3.2. If someone can provide feedback it would be great, should be trivial to fix any issue.

Comment 18 Dan Yasny 2013-02-21 09:23:38 UTC
Setting on 3.2 then, feel free to postpone to 3.3 if not feasible

Comment 19 Yaniv Kaul 2013-02-21 10:21:57 UTC
(In reply to comment #18)
> Setting on 3.2 then, feel free to postpone to 3.3 if not feasible

Pushing. We don't add features now.

I do wonder how the BZ is in MODIFIED state already, though.

Comment 20 Dan Yasny 2013-02-21 10:39:42 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > Setting on 3.2 then, feel free to postpone to 3.3 if not feasible
> 
> Pushing. We don't add features now.
> 
> I do wonder how the BZ is in MODIFIED state already, though.

I'm not even sure why this should be a feature and not a bugfix, but lets leave it on 3.3 anyway

Comment 21 Alon Bar-Lev 2013-02-21 14:55:52 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > Setting on 3.2 then, feel free to postpone to 3.3 if not feasible
> 
> Pushing. We don't add features now.
> 
> I do wonder how the BZ is in MODIFIED state already, though.

Because it already implemented upstream.

Comment 22 Yaniv Kaul 2013-02-21 14:58:28 UTC
(In reply to comment #21)
> (In reply to comment #19)
> > (In reply to comment #18)
> > > Setting on 3.2 then, feel free to postpone to 3.3 if not feasible
> > 
> > Pushing. We don't add features now.
> > 
> > I do wonder how the BZ is in MODIFIED state already, though.
> 
> Because it already implemented upstream.

Move to POST, then. Unless it was merged upstream before the last rebase.
In that case, it should be in ON_QA already.

Comment 23 Alon Bar-Lev 2013-02-21 15:07:43 UTC
(In reply to comment #22)
> (In reply to comment #21)
> > (In reply to comment #19)
> > > (In reply to comment #18)
> > > > Setting on 3.2 then, feel free to postpone to 3.3 if not feasible
> > > 
> > > Pushing. We don't add features now.
> > > 
> > > I do wonder how the BZ is in MODIFIED state already, though.
> > 
> > Because it already implemented upstream.
> 
> Move to POST, then. Unless it was merged upstream before the last rebase.
> In that case, it should be in ON_QA already.

I think I learned the game ball rules...
This is MODIFIED as ovirt-host-deploy-1.0.0 already contains this.
You can ignore it and/or skip testing but it is there.

Comment 24 Bryan Yount 2013-03-15 23:57:15 UTC
(In reply to comment #20)
> I'm not even sure why this should be a feature and not a bugfix, but lets
> leave it on 3.3 anyway

So are we considering this an RFE or a bugfix now? I want to make sure my ticket is tagged appropriately as I just picked back up Qualcomm now that Kevin Masaryk is no longer with Red Hat.

Comment 25 Alon Bar-Lev 2013-03-16 08:14:59 UTC
(In reply to comment #24)
> (In reply to comment #20)
> > I'm not even sure why this should be a feature and not a bugfix, but lets
> > leave it on 3.3 anyway
> 
> So are we considering this an RFE or a bugfix now? I want to make sure my
> ticket is tagged appropriately as I just picked back up Qualcomm now that
> Kevin Masaryk is no longer with Red Hat.

Hmmm... I really consider this as bugfix as vdsm should have supported it in the past... there is the cmdline option of addNetwork and such...

Comment 26 Alon Bar-Lev 2013-04-21 09:15:13 UTC
On the ovirt-host-deploy side, everything is set up.

if there is an issue with addNetwork of vdsm, please discuss it at bug#761411

Comment 27 Martin Pavlik 2013-04-22 09:10:26 UTC
In sf13.1, a host with a preconfigured bond is added properly; the rhevm bridge interface is created on top of the bond.

Comment 28 Itamar Heim 2013-06-11 09:38:36 UTC
3.2 has been released

Comment 29 Itamar Heim 2013-06-11 09:38:37 UTC
3.2 has been released

Comment 30 Itamar Heim 2013-06-11 09:52:44 UTC
3.2 has been released

