Bug 1271273 - Failed to add RHEV-H 3.5.z host to RHEV-M 3.6 with 3.5 cluster due to missing ovirtmgmt network.
Summary: Failed to add RHEV-H 3.5.z host to RHEV-M 3.6 with 3.5 cluster due to missing...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.5.4
Hardware: Unspecified
OS: Unspecified
urgent
high
Target Milestone: ovirt-3.6.2
Target Release: 3.6.2
Assignee: Yevgeny Zaspitsky
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: 1206139 RHEV3.6Upgrade
 
Reported: 2015-10-13 14:07 UTC by cshao
Modified: 2016-04-20 01:32 UTC
CC List: 21 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Adding a rhev-h-3.5 to a fresh rhev-m-3.6 installation requires a preparation stage: one should define a 3.5 cluster in Engine, define a "rhevm" network, and set "rhevm" as the management network of the 3.5 cluster. The ovirtmgmt network should be removed from the 3.5 cluster, or at the very least - defined as non-required, to avoid confusion.
Clone Of:
Environment:
Last Closed: 2016-04-20 01:32:15 UTC
oVirt Team: Network
Target Upstream Version:
Embargoed:
yzaspits: needinfo-
ylavi: Triaged+


Attachments
failed to approved 7.1 (25.46 KB, image/png), 2015-10-13 14:07 UTC, cshao
/var/log/*.* + sosreport + engine.log (5.59 MB, application/x-gzip), 2015-10-13 14:18 UTC, cshao
management_network.png (15.94 KB, image/png), 2015-10-23 03:49 UTC, cshao
network-rhevm.png (15.10 KB, image/png), 2015-10-23 03:50 UTC, cshao
error.png (8.02 KB, image/png), 2015-10-23 03:50 UTC, cshao
logs+screenshots (734.55 KB, application/x-gzip), 2015-12-06 10:00 UTC, Michael Burman
vlan-network (29.62 KB, image/png), 2016-01-20 02:34 UTC, cshao
vlan-failed (61.80 KB, image/png), 2016-01-20 02:35 UTC, cshao
bond_vlan_36 (40.21 KB, image/png), 2016-01-20 05:46 UTC, cshao
screenshot (30.40 KB, image/png), 2016-01-20 06:58 UTC, Michael Burman
screenshot2 (37.54 KB, image/png), 2016-01-20 06:59 UTC, Michael Burman
screenshot3 (36.50 KB, image/png), 2016-01-20 06:59 UTC, Michael Burman
bond_vlan_all_logs (1.23 MB, application/x-gzip), 2016-01-20 10:42 UTC, cshao


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 47794 0 master MERGED engine: allow adding a RHEV-H host with predefined rhevm mgmt network Never
oVirt gerrit 47854 0 ovirt-engine-3.6 MERGED engine: allow adding a RHEV-H host with predefined rhevm mgmt network Never
oVirt gerrit 50037 0 master MERGED engine: prevent NPE in PersistentHostSetupNetworksCommand Never
oVirt gerrit 50038 0 ovirt-engine-3.6 MERGED engine: prevent NPE in PersistentHostSetupNetworksCommand Never
oVirt gerrit 50102 0 ovirt-engine-3.6.1 MERGED engine: prevent NPE in PersistentHostSetupNetworksCommand Never
oVirt gerrit 50205 0 master MERGED engine: CommitNetworkChanges only in case that it's known as dirty Never
oVirt gerrit 50206 0 master MERGED engine: Prevent running SetupNetworks as part of ApproveHost flow Never
oVirt gerrit 50207 0 ovirt-engine-3.6 MERGED engine: CommitNetworkChanges only in case that it's known as dirty Never
oVirt gerrit 50208 0 ovirt-engine-3.6 MERGED engine: Prevent running SetupNetworks as part of ApproveHost flow Never

Description cshao 2015-10-13 14:07:21 UTC
Created attachment 1082460 [details]
failed to approved 7.1

Description of problem:
Failed to approve RHEV-H 7.1 to RHEV-M 3.6 with a 3.5 cluster due to missing ovirtmgmt network.

Version-Release number of selected component (if applicable):
rhev-hypervisor-7-7.1-20150911.0
ovirt-node-3.2.3-20.el7.noarch
vdsm-4.16.26-1.el7ev.x86_64
RHEV-M 3.6.0-0.18.el6

How reproducible:
100%


Steps to Reproduce:
1. Install RHEV-H 7.1-20150911.0.
2. Register RHEV-H 7.1 to RHEV-M 3.6 with 3.5 cluster.
3. Approve the host.

Actual results:
Failed to approve RHEV-H 7.1 to RHEV-M 3.6 with a 3.5 cluster. It reports "failed to configure management network on the host, the following networks are missing on host: 'ovirtmgmt'".

Expected results:
The host is approved successfully.

Additional info:

Comment 1 cshao 2015-10-13 14:18:38 UTC
Created attachment 1082462 [details]
/var/log/*.* + sosreport + engine.log

Comment 2 Fabian Deutsch 2015-10-14 06:21:20 UTC
This bug affects registering newly installed RHEV-H 3.5.z (on 7.1) to RHEV-M 3.6 in 3.5 cluster mode. This flow is not so common, so targeting this for 3.6.1 for now.

Comment 6 Fabian Deutsch 2015-10-20 10:58:23 UTC
Dan, the problem seems to be the change of the bridge name between 3.5.z and 3.6.
I don't see what RHEV-H is doing differently from RHEL-H, and I wonder how RHEL-H is expected to work in the upgrade scenario.

Comment 7 Fabian Deutsch 2015-10-21 19:43:17 UTC
Simone, do you have an insight what is going wrong here?

Comment 8 Simone Tiraboschi 2015-10-22 08:02:19 UTC
Dan, is vdsm-4.16.26-1.el7ev.x86_64 from downstream 3.5 able to work with a differently named management bridge?

Comment 9 Dan Kenigsberg 2015-10-22 14:36:14 UTC
Unlike RHEL-H, RHEV-H creates a management bridge BEFORE being added to a cluster. This makes RHEV-H-3.5.z incompatible with the default name of the management network on rhev-m-3.6. RHEL-H waits for Engine to initiate creation of the management network. Engine should choose the correct name.

What is the management network of your 3.5 cluster?

If it is "ovirtmgmt", it is not a bug. Please create a cluster with management network "rhevm", and add 3.5 hosts to it.

Comment 10 cshao 2015-10-23 03:48:23 UTC
(In reply to Dan Kenigsberg from comment #9)
> Unlike RHEL-H, RHEV-H creates a management bridge BEFORE being added to a
> cluster. This makes RHEV-H-3.5.z incompatible with the default name of the
> management network on rhev-m-3.6. RHEL-H waits for Engine to initiate
> creation of the management network. Engine should choose the correct name.
> 
> What is the management network on you 3.5 cluster?

I used the default value "ovirtmgmt".
> 
> If it is "ovirtmgmt", it is not a bug. Please create a cluster with
> management network "rhevm", and add 3.5 hosts to it.

But after creating a cluster with the "rhevm" management network, it reports "Cannot edit host, Changing management network in a non-empty cluster is not allowed" while approving the RHEV-H host.

Please see attachment for more details.

Comment 11 cshao 2015-10-23 03:49:20 UTC
Created attachment 1085695 [details]
management_network.png

Comment 12 cshao 2015-10-23 03:50:03 UTC
Created attachment 1085696 [details]
network-rhevm.png

Comment 13 cshao 2015-10-23 03:50:48 UTC
Created attachment 1085697 [details]
error.png

Comment 14 Yevgeny Zaspitsky 2015-10-26 15:54:34 UTC
shaochen,

I succeeded in reproducing the error message seen in the "error.png" attachment by trying to move a host to a cluster with a different mgmt network. In that flow the message seems erroneous and misleading. I have opened a bug about that (https://bugzilla.redhat.com/show_bug.cgi?id=1275337) and submitted a patch for it. Please confirm that this is the flow, or describe how to reproduce the message.

I succeeded in adding a host with a pre-configured rhevm network to a cluster whose management network has the same name, with both vdsm versions (3.5 and 3.6).
RHEV-M tries to avoid changing the management network on a host, as that might lead to losing connectivity with the host. Thus adding a host to a cluster with a different mgmt network does not work.

Comment 15 cshao 2015-10-27 08:44:52 UTC
(In reply to Yevgeny Zaspitsky from comment #14)
> shaochen,
> 
> I succeed reproducing the error message that we can see in "error.png"
> attachment by trying to move a host to a cluster with different mgmt
> network. In that flow the message seems erroneous and misleading. I have
> opened a bug about that
> (https://bugzilla.redhat.com/show_bug.cgi?id=1275337) and submitted the
> patch for that. Please approve that's the flow or please describe the how to
> reproduce the message.

Updated detailed steps here.
Test version:
rhev-hypervisor-7-7.1-20150911.0
ovirt-node-3.2.3-20.el7.noarch
vdsm-4.16.26-1.el7ev.x86_64
RHEV-M 3.6.0-0.18.el6

How reproducible:
100%

Test steps:
1. Install RHEV-H 7.1-20150911.0.
2. RHEV-M 3.6: Create data center with 3.5 compatibility version.
3. Edit management network and rename "ovirtmgmt" to "rhevm".
4. Create a cluster with management network "rhevm", 
5. Register RHEV-H 7.1 to RHEV-M 3.6.
6. Approve the host.

Test result:
It reports "Cannot edit host, Changing management network in a non-empty cluster is not allowed" while approving the RHEV-H host.

Comment 17 Yaniv Lavi 2015-11-01 09:55:42 UTC
Should this be on MODIFIED?

Comment 18 Ying Cui 2015-11-19 07:36:22 UTC
Adding the Testblocker keyword; see comment 5. This blocks the RHEV-H upgrade flow from 7.1 to 7.2 via RHEV-M.

Comment 19 Michael Burman 2015-12-06 09:54:47 UTC
Tested and failed QA; this bug is still relevant on 3.6.1.1-0.1.el6.

Failed to add Red Hat Enterprise Virtualization Hypervisor release 7.2 (20151129.1.el7ev) to 3.5 cluster in rhev-m 3.6.1.1 with error:
- "Host orchid-vds2.qa.lab.tlv.redhat.com installation failed. Failed to configure management network on the host."

- "Host orchid-vds2.qa.lab.tlv.redhat.com does not comply with the cluster dfs networks, the following networks are missing on host: 'ovirtmgmt'"

Host was installed and configured via TUI to register to rhev-m.

ovirt-node-3.2.3-29.el7.noarch
vdsm-4.16.30-1.el7ev

Engine should choose the correct name, but it fails.

Comment 20 Michael Burman 2015-12-06 10:00:16 UTC
Created attachment 1102715 [details]
logs+screenshots

Comment 21 Michael Burman 2015-12-06 12:18:31 UTC
(In reply to Ying Cui from comment #18)
> Adding Testblocker keyword, see comment 5, that blocked RHEV-H upgrade flow
> from 7.1 to 7.2 via rhevm.

And indeed this blocks the upgrade flow to the latest 3.6, for example to Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151201.2.el7ev), after manually handling the management network on the host (removing the unmanaged 'rhevm' and attaching 'ovirtmgmt').

supervdsm fails to restore-net because it tries to restore the 'rhevm' network:

 [root@localhost ~]# tree /var/lib/vdsm/persistence/netconf/nets/
/var/lib/vdsm/persistence/netconf/nets/
└── rhevm

0 directories, 1 file

[root@localhost ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
ovirtmgmt               8000.001a647a9462       no              enp4s0


Dec  6 12:04:17 localhost vdsm-tool: Traceback (most recent call last):
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 425, in <module>
Dec  6 12:04:17 localhost vdsm-tool: restore(args)
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 388, in restore
Dec  6 12:04:17 localhost vdsm-tool: unified_restoration()
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 110, in unified_restoration
Dec  6 12:04:17 localhost vdsm-tool: _remove_networks_in_running_config()
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 195, in _remove_networks_in_running_config
Dec  6 12:04:17 localhost vdsm-tool: _inRollback=True)
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/network/api.py", line 932, in setupNetworks
Dec  6 12:04:17 localhost vdsm-tool: "system" % network)
Dec  6 12:04:17 localhost vdsm-tool: network.errors.ConfigNetworkError: (27, "Cannot delete network 'rhevm': It doesn't exist in the system")
Dec  6 12:04:17 localhost vdsm-tool: Traceback (most recent call last):
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/bin/vdsm-tool", line 219, in main
Dec  6 12:04:17 localhost vdsm-tool: return tool_command[cmd]["command"](*args)
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 41, in restore_command
Dec  6 12:04:17 localhost vdsm-tool: exec_restore(cmd)
Dec  6 12:04:17 localhost vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 54, in exec_restore
Dec  6 12:04:17 localhost vdsm-tool: raise EnvironmentError('Failed to restore the persisted networks')
Dec  6 12:04:17 localhost vdsm-tool: EnvironmentError: Failed to restore the persisted networks

After upgrade, host ends up in non-responsive state.
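
For reference, a small diagnostic sketch of the state described above (a stand-alone Python snippet, not part of vdsm): it compares the networks persisted under /var/lib/vdsm/persistence/netconf/nets/ with the bridges that actually exist on the host, which in this case would flag the stale 'rhevm' entry that the restore flow trips over while only the 'ovirtmgmt' bridge is running.

import os

PERSIST_DIR = '/var/lib/vdsm/persistence/netconf/nets'

# Networks that vdsm will try to restore on boot (one entry per network).
persisted = set(os.listdir(PERSIST_DIR)) if os.path.isdir(PERSIST_DIR) else set()

# Bridges that currently exist on the host (what `brctl show` lists).
bridges = set(dev for dev in os.listdir('/sys/class/net')
              if os.path.isdir('/sys/class/net/%s/bridge' % dev))

for net in sorted(persisted - bridges):
    print('persisted network %r has no matching bridge; '
          'restore-net-config may fail on it' % net)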

Comment 22 Yevgeny Zaspitsky 2015-12-06 19:05:27 UTC
Looks like the following NPE (from the attached log) is related to the problem:

java.lang.NullPointerException
	at org.ovirt.engine.core.bll.network.host.PersistentHostSetupNetworksCommand.checkForChanges(PersistentHostSetupNetworksCommand.java:69) [bll.jar:]
	at org.ovirt.engine.core.bll.network.host.PersistentHostSetupNetworksCommand.executeCommand(PersistentHostSetupNetworksCommand.java:52) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1215) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1359) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1983) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1396) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:378) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:475) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:457) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:667) [bll.jar:]
	at sun.reflect.GeneratedMethodAccessor162.invoke(Unknown Source) [:1.8.0_51]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_51]
	at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_51]
	at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:374) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:74) [jboss-as-weld.jar:7.5.4.Final-redhat-4]
	at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:84) [jboss-as-weld.jar:7.5.4.Final-redhat-4]
	at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:97) [jboss-as-weld.jar:7.5.4.Final-redhat-4]
	at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:93) [jboss-as-weld.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:53) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:52) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:266) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:377) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:246) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:43) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:185) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.2.Final-redhat-1]
	at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:73) [jboss-as-ee.jar:7.5.4.Final-redhat-4]
	at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view2.runInternalAction(Unknown Source) [bll.jar:]
	at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source) [:1.8.0_51]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_51]
	at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_51]
	at org.jboss.weld.util.reflection.SecureReflections$13.work(SecureReflections.java:267) [weld-core.jar:1.1.31.Final-redhat-1]
	at org.jboss.weld.util.reflection.SecureReflectionAccess.run(SecureReflectionAccess.java:52) [weld-core.jar:1.1.31.Final-redhat-1]
	at org.jboss.weld.util.reflection.SecureReflectionAccess.runAsInvocation(SecureReflectionAccess.java:137) [weld-core.jar:1.1.31.Final-redhat-1]
	at org.jboss.weld.util.reflection.SecureReflections.invoke(SecureReflections.java:263) [weld-core.jar:1.1.31.Final-redhat-1]
	at org.jboss.weld.bean.proxy.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:115) [weld-core.jar:1.1.31.Final-redhat-1]
	at org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56) [weld-core.jar:1.1.31.Final-redhat-1]
	at org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:105) [weld-core.jar:1.1.31.Final-redhat-1]
	at org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal$142795634$Proxy$_$$_Weld$Proxy$.runInternalAction(BackendCommandObjectsHandler$BackendInternal$BackendLocal$142795634$Proxy$_$$_Weld$Proxy$.java) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInternalAction(CommandBase.java:2346) [bll.jar:]
	at org.ovirt.engine.core.bll.ChangeVDSClusterCommand$2.run(ChangeVDSClusterCommand.java:380) [bll.jar:]
	at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:92) [utils.jar:]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_51]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_51]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_51]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_51]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_51]

Comment 23 Fabian Deutsch 2015-12-06 19:50:26 UTC
Raising the priority according to comment 21.

Comment 24 Michael Burman 2015-12-07 14:16:53 UTC
Dan, the suggested workaround (a new 3.5 cluster with the rhevm network as the management one) is not working. The null pointer exception is still there in the log.

"Host orchid-vds2.qa.lab.tlv.redhat.com does not comply with the cluster dfs networks, the following networks are missing on host: 'ovirtmgmt'"

- The workaround of renaming the management network of the cluster from 'ovirtmgmt' to 'rhevm' is not acceptable at all.
What if we have more than one cluster? Do I need to go to all my clusters and rename the management networks?

This must be handled by the engine in a better way, while the host is waiting in the 'Approve' state in the engine.

Note that one of the side effects of this is:
if we manually remove the 'rhevm' network from the host via setup networks, attach the 'ovirtmgmt' network to the host and approve setup networks, the persist command is not sent by the engine to vdsm, so 'ovirtmgmt' is not persisted on the host as it should be.

Comment 25 Dan Kenigsberg 2015-12-08 12:13:27 UTC
Let me repeat. In order to allow registering rhev-h-3.5 to a fresh rhev-m-3.6 installation, the user must first create a "rhevm" network, create a 3.5 cluster, and set "rhevm" as its management network.

I understand that this currently results in an NPE, which is fixed by http://gerrit.ovirt.org/50037 .
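
For illustration, a minimal sketch of that preparation stage using the Python SDK (ovirt-engine-sdk-python 3.x). The URL, credentials, data-center name, cluster name and CPU type are placeholders, and passing management_network when creating the cluster assumes the 3.6 cluster-level management-network API; the same steps can of course be done in the Administration Portal.

from ovirtsdk.api import API
from ovirtsdk.xml import params

# Placeholder URL/credentials - adjust to your environment.
api = API(url='https://rhevm.example.com/ovirt-engine/api',
          username='admin@internal', password='secret', insecure=True)

dc = api.datacenters.get(name='Default')

# Define the "rhevm" network in the data center.
rhevm_net = api.networks.add(params.Network(name='rhevm', data_center=dc))

# Create a 3.5 cluster that uses "rhevm" as its management network.
api.clusters.add(params.Cluster(
    name='cluster35',                              # placeholder name
    data_center=dc,
    version=params.Version(major=3, minor=5),
    cpu=params.CPU(id='Intel SandyBridge Family'), # placeholder CPU type
    management_network=rhevm_net))                 # assumes the 3.6 API field

api.disconnect()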

Comment 26 Dan Kenigsberg 2015-12-09 14:14:55 UTC
50133 should be a better fix, but even without it this bug should not block 3.6.1.

Comment 27 Michael Burman 2016-01-18 09:40:41 UTC
Verified on - 3.6.2.5-0.1.el6

RHEV-H 3.5.7 installed successfully in a 3.6 engine after the proper workaround described above.
Tested with:
rhev-hypervisor7-7.2-20160105.1
4.16.32-1.el7ev
ovirt-node-3.2.3-30.el7.noarch
3.6.2.5-0.1.el6


Also, after installing the server successfully in a 3.6 engine, I tested an upgrade with success from:
rhev-hypervisor7-7.2-20160105.1 > RHEV Hypervisor - 7.2 - 20160113.0.el7ev
vdsm-4.17.17-0.el7ev
ovirt-node-3.6.1-3.0.el7ev.noarch

Comment 28 cshao 2016-01-19 08:27:41 UTC
> Adding a rhev-h-3.5 to a fresh rhev-m-3.6 installation requires a preparation stage: one should define a 3.5 cluster in Engine, define a "rhevm" network, and set "rhevm" as the management network of the 3.5 cluster. The ovirtmgmt network should be removed from the 3.5 cluster, or at the very least - defined as non-required, to avoid confusion.


It seems the preparation stage does not cover a VLAN test environment.

I still hit the bug in a VLAN environment.

Test version:
rhev-hypervisor7-7.2-20151129.1
ovirt-node-3.2.3-29.el7.noarch


Test steps:
1. define a 3.5 cluster in Engine
2. define a "rhevm" network with vlan tag 
3. set "rhevm" as the management network of the 3.5 cluster
4. Install rhev-hypervisor7-7.2-20151129.1
5. configure network with vlan tag
6. register rhevh to rhevm

Test result:
Host xxx does not comply with the cluster xxx networks, the following networks are missing on the host "rhevm".
Actually, I already added the rhevm network in step 2.

Comment 29 Michael Burman 2016-01-19 12:37:41 UTC
Hi,

Did you remove the 'ovirtmgmt' network or set it as non-required in this cluster?

Comment 30 cshao 2016-01-20 02:33:21 UTC
(In reply to Michael Burman from comment #29)
> Hi 
>  
> Did you removed or set 'ovirtmgmt' network as non required in this cluster?

Yes, I did it.

Please see the attachment for more details.

Comment 31 cshao 2016-01-20 02:34:41 UTC
Created attachment 1116450 [details]
vlan-network

Comment 32 cshao 2016-01-20 02:35:27 UTC
Created attachment 1116451 [details]
vlan-failed

Comment 33 Yevgeny Zaspitsky 2016-01-20 04:25:24 UTC
Just to make sure…
Is the actual connection between the engine and rhev-h over a VLAN network in your environment?
Please excuse me for the silly question.

Comment 34 cshao 2016-01-20 05:45:59 UTC
(In reply to Yevgeny Zaspitsky from comment #33)
> Just to make sure…
> Is the actual connection between the engine and rhev-h over a VLAN network
> in your environement?

No, and I can assure you that our VLAN env works well, because a RHEV-H 7.2 for RHEV 3.6 (rhev-hypervisor7-7.2-20160113.0) build registered to the same engine (3.6 cluster) with bond+vlan networking (ovirtmgmt) can come up.

Please see my new attachment "bond_vlan_36.png"
> Pls excuse me for the silly question.

It doesn't matter :)

Comment 35 cshao 2016-01-20 05:46:47 UTC
Created attachment 1116478 [details]
bond_vlan_36

Comment 36 Michael Burman 2016-01-20 05:50:36 UTC
I will try to reproduce it over a VLAN network.

Comment 37 Michael Burman 2016-01-20 06:55:43 UTC
Installed and added rhev-h 3.5 (RHEV Hypervisor - 7.2 - 20160105.1.el7ev) successfully to a 3.6.2.5-0.1.el6 engine with the workaround above, over a vlan NIC.

Comment 38 Michael Burman 2016-01-20 06:58:03 UTC
Created attachment 1116508 [details]
screenshot

Comment 39 Michael Burman 2016-01-20 06:59:08 UTC
Created attachment 1116510 [details]
screenshot2

Comment 40 Michael Burman 2016-01-20 06:59:51 UTC
Created attachment 1116511 [details]
screenshot3

Comment 41 cshao 2016-01-20 08:02:21 UTC
(In reply to Michael Burman from comment #36)
> I will try to reproduce it over vlan network

Do you mind giving it a try with a bond+vlan test env?
Thanks!

Comment 42 cshao 2016-01-20 09:38:04 UTC
I tested it again with a VLAN network; the host can come up.

But with a bond+vlan network I still hit the error from comment 28:

Host xxx does not comply with the cluster xxx networks, the following networks are missing on the host "rhevm".

Comment 43 Michael Burman 2016-01-20 09:48:52 UTC
Hi Saochen,

I tested it with a vlan-tagged bond and succeeded :)

Everything is working as expected; the host was added successfully over a vlan bond.

Do you see anything in the logs? What is the bond mode?

Yevgeny, any idea what is going on? Please take a look at Saochen's env.

Comment 44 cshao 2016-01-20 10:42:42 UTC
Created attachment 1116579 [details]
bond_vlan_all_logs

(In reply to Michael Burman from comment #43)
> Hi Saochen 
> 
> I tested it with vlan tagged bond and succeeded :)
> 
> Everything working as expected, host added with success over a vlan bond.
> 
> Do you see something in the logs? what is the bond mode? 

I used mode 0.

> 
> Yevgeny, any idea what is going on? please take a look on Saochen's env..

Comment 45 Michael Burman 2016-01-20 11:29:55 UTC
And this is the problem --> on 3.6, VM networks cannot be attached to bonds in mode 0, 5 or 6.
See BZ 1094842.

Because the default bond mode on 3.5 is balance-rr (mode 0), you are failing (at least I believe that this is the reason); please try it with active-backup (mode 1), for example.

This is why we requested changing the default bond mode to a safe one (active-backup) for 3.5.z as well.
See BZ 1294340.
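
To illustrate the check, a short stand-alone sketch (the bond name 'bond1' is just an example, taken from the failing setup) that reads the bond mode from sysfs and flags the modes 3.6 rejects for VM networks:

# Bond modes 3.6 does not allow for VM networks: balance-rr (0),
# balance-tlb (5) and balance-alb (6).
UNSUPPORTED_FOR_VM_NETWORKS = {'0', '5', '6'}

def bond_mode(bond):
    # /sys/class/net/<bond>/bonding/mode contains e.g. "balance-rr 0"
    with open('/sys/class/net/%s/bonding/mode' % bond) as f:
        name, number = f.read().split()
    return name, number

name, number = bond_mode('bond1')
if number in UNSUPPORTED_FOR_VM_NETWORKS:
    print('bond1 uses %s (mode %s); switch to e.g. active-backup (mode 1) '
          'before attaching the management VLAN' % (name, number))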

Comment 46 Yevgeny Zaspitsky 2016-01-20 12:29:56 UTC
shaochen, would you provide the output of `vdsClient -s 0 getVdsCaps` right after you configure networking on your rhevh, and then again, after host approval? vdsm.log would help, too.

Comment 47 Yevgeny Zaspitsky 2016-01-20 12:36:30 UTC
Please also include engine.log - we'd like to verify that the bond mode issue spotted by Michael is logged there.

Comment 48 cshao 2016-01-21 02:25:03 UTC
(In reply to Yevgeny Zaspitsky from comment #46)
> shaochen, would you provide the output of `vdsClient -s 0 getVdsCaps` right
> after you configure networking on your rhevh, and then again, after host
> approval? vdsm.log would help, too.

All log info was already provided in comment 44.

After configuring networking on the rhevh:
# vdsClient -s 0 getVdsCaps
Connection to 0.0.0.0:54321 refused

After host approval (failed):
# vdsClient -s 0 getVdsCaps
	HBAInventory = {'FC': [], 'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:bc6bfca72792'}]}
	ISCSIInitiatorName = 'iqn.1994-05.com.redhat:bc6bfca72792'
	autoNumaBalancing = 0
	bondings = {'bond1': {'addr': '',
	                      'cfg': {'BONDING_OPTS': 'mode=balance-rr miimon=100',
	                              'DEVICE': 'bond1',
	                              'NM_CONTROLLED': 'no',
	                              'ONBOOT': 'yes',
	                              'TYPE': 'Bond'},
	                      'hwaddr': '00:1b:21:27:47:0b',
	                      'ipv4addrs': [],
	                      'ipv6addrs': ['fe80::21b:21ff:fe27:470b/64'],
	                      'mtu': '1500',
	                      'netmask': '',
	                      'opts': {'miimon': '100'},
	                      'slaves': ['p3p1', 'p4p1']}}
	bridges = {}
	clusterLevels = ['3.4', '3.5']
	cpuCores = '4'
	cpuFlags = 'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,nopl,xtopology,nonstop_tsc,aperfmperf,eagerfpu,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,lahf_lm,ida,arat,epb,pln,pts,dtherm,tpr_shadow,vnmi,flexpriority,ept,vpid,xsaveopt,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270,model_SandyBridge'
	cpuModel = 'Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz'
	cpuSockets = '1'
	cpuSpeed = '3696.703'
	cpuThreads = '8'
	emulatedMachines = ['pc-i440fx-rhel7.2.0',
	                    'pc',
	                    'pc-i440fx-rhel7.0.0',
	                    'pc-q35-rhel7.1.0',
	                    'rhel6.3.0',
	                    'pc-q35-rhel7.2.0',
	                    'q35',
	                    'rhel6.4.0',
	                    'rhel6.0.0',
	                    'pc-i440fx-rhel7.1.0',
	                    'rhel6.5.0',
	                    'rhel6.6.0',
	                    'rhel6.1.0',
	                    'pc-q35-rhel7.0.0',
	                    'rhel6.2.0']
	guestOverhead = '65'
	hooks = {'after_network_setup': {'30_ethtool_options': {'md5': '2f6fe7f77eb498fb9493ef9ea5ad8705'}},
	         'after_vm_destroy': {'50_vhostmd': {'md5': '92498ed80c0219749829ecce2813fc7c'}},
	         'before_vm_dehibernate': {'50_vhostmd': {'md5': 'f3ee6dbf6fbd01333bd3e32afec4fbba'}},
	         'before_vm_migrate_destination': {'50_vhostmd': {'md5': 'f3ee6dbf6fbd01333bd3e32afec4fbba'}},
	         'before_vm_start': {'50_hostedengine': {'md5': '45dde62155b5412eafbfff5ef265acc2'},
	                             '50_vhostmd': {'md5': 'f3ee6dbf6fbd01333bd3e32afec4fbba'}}}
	kdumpStatus = 0
	kvmEnabled = 'true'
	lastClient = '127.0.0.1'
	lastClientIface = 'lo'
	liveMerge = 'true'
	liveSnapshot = 'true'
	memSize = '7607'
	netConfigDirty = 'False'
	networks = {}
	nics = {'em1': {'addr': '',
	                'cfg': {},
	                'hwaddr': 'd4:be:d9:95:61:ca',
	                'ipv4addrs': [],
	                'ipv6addrs': [],
	                'mtu': '1500',
	                'netmask': '',
	                'speed': 0},
	        'p3p1': {'addr': '',
	                 'cfg': {'DEVICE': 'p3p1',
	                         'HWADDR': '00:1b:21:27:47:0b',
	                         'MASTER': 'bond1',
	                         'NM_CONTROLLED': 'no',
	                         'ONBOOT': 'yes',
	                         'SLAVE': 'yes'},
	                 'hwaddr': '00:1b:21:27:47:0b',
	                 'ipv4addrs': [],
	                 'ipv6addrs': [],
	                 'mtu': '1500',
	                 'netmask': '',
	                 'permhwaddr': '00:1b:21:27:47:0b',
	                 'speed': 1000},
	        'p4p1': {'addr': '',
	                 'cfg': {'DEVICE': 'p4p1',
	                         'HWADDR': '00:10:18:81:a4:a0',
	                         'MASTER': 'bond1',
	                         'NM_CONTROLLED': 'no',
	                         'ONBOOT': 'yes',
	                         'SLAVE': 'yes'},
	                 'hwaddr': '00:1b:21:27:47:0b',
	                 'ipv4addrs': [],
	                 'ipv6addrs': [],
	                 'mtu': '1500',
	                 'netmask': '',
	                 'permhwaddr': '00:10:18:81:a4:a0',
	                 'speed': 1000},
	        'p4p2': {'addr': '',
	                 'cfg': {},
	                 'hwaddr': '00:10:18:81:a4:a2',
	                 'ipv4addrs': [],
	                 'ipv6addrs': [],
	                 'mtu': '1500',
	                 'netmask': '',
	                 'speed': 0}}
	numaNodeDistance = {'0': [10]}
	numaNodes = {'0': {'cpus': [0, 1, 2, 3, 4, 5, 6, 7], 'totalMemory': '7607'}}
	onlineCpus = '0,1,2,3,4,5,6,7'
	operatingSystem = {'name': 'RHEV Hypervisor', 'release': '20151129.1.el7ev', 'version': '7.2'}
	packages2 = {'kernel': {'buildtime': 1446139769.0,
	                        'release': '327.el7.x86_64',
	                        'version': '3.10.0'},
	             'libvirt': {'buildtime': 1444310232,
	                         'release': '13.el7',
	                         'version': '1.2.17'},
	             'mom': {'buildtime': 1431358543, 'release': '5.el7ev', 'version': '0.4.1'},
	             'qemu-img': {'buildtime': 1444825406,
	                          'release': '31.el7',
	                          'version': '2.3.0'},
	             'qemu-kvm': {'buildtime': 1444825406,
	                          'release': '31.el7',
	                          'version': '2.3.0'},
	             'spice-server': {'buildtime': 1443519054,
	                              'release': '15.el7',
	                              'version': '0.12.4'},
	             'vdsm': {'buildtime': 1448308136, 'release': '1.el7ev', 'version': '4.16.30'}}
	reservedMem = '321'
	rngSources = ['random']
	selinux = {'mode': '1'}
	software_revision = '1'
	software_version = '4.16'
	supportedENGINEs = ['3.4', '3.5']
	supportedProtocols = ['2.2', '2.3']
	uuid = '4C4C4544-0050-4310-8039-B6C04F423358'
	version_name = 'Snow Man'
	vlans = {'bond1.20': {'addr': '192.168.20.129',
	                      'cfg': {'BOOTPROTO': 'dhcp',
	                              'DEVICE': 'bond1.20',
	                              'IPV6INIT': 'no',
	                              'IPV6_AUTOCONF': 'no',
	                              'NM_CONTROLLED': 'no',
	                              'ONBOOT': 'yes',
	                              'PEERNTP': 'yes',
	                              'VLAN': 'yes'},
	                      'iface': 'bond1',
	                      'ipv4addrs': ['192.168.20.129/24'],
	                      'ipv6addrs': ['fe80::21b:21ff:fe27:470b/64'],
	                      'mtu': '1500',
	                      'netmask': '255.255.255.0',
	                      'vlanid': 20}}
	vmTypes = ['kvm']

Comment 49 cshao 2016-01-21 02:41:28 UTC
(In reply to Michael Burman from comment #45)
> And this is the problem --> on 3.6 VM networks cannot be attached to bonds
> in mode 0, 5 or 6. 
> See BZ 1094842
> 
> Because the default bond mode on 3.5 is balance-rr(mode 0) you are
> failing(at least i believe that this is the reason), please try it with
> active-backup(mode 1 ) for example. 
> 
> This is why we requested to fix the default bond mode to safe
> one(active-backup) as well for 3.5.z .
> See BZ 1294340

After setting the bond mode to active-backup (mode 1), adding RHEV-H to the engine with a bond+vlan network succeeds.

Thank you for the explanation.

