Bug 1271273

Summary: Failed to add RHEV-H 3.5.z host to RHEV-M 3.6 with 3.5 cluster due to missing ovirtmgmt network.

| Field | Value |
|---|---|
| Product: | Red Hat Enterprise Virtualization Manager |
| Component: | ovirt-engine |
| Status: | CLOSED CURRENTRELEASE |
| Severity: | high |
| Priority: | urgent |
| Version: | 3.5.4 |
| Reporter: | cshao <cshao> |
| Assignee: | Yevgeny Zaspitsky <yzaspits> |
| QA Contact: | Michael Burman <mburman> |
| Docs Contact: | |
| CC: | bazulay, cshao, cwu, danken, fdeutsch, gklein, huiwa, huzhao, leiwang, lsurette, mburman, rbalakri, Rhev-m-bugs, sbonazzo, srevivo, stirabos, yaniwang, ycui, ykaul, ylavi, yzaspits |
| Target Milestone: | ovirt-3.6.2 |
| Target Release: | 3.6.2 |
| Keywords: | TestBlocker |
| Flags: | yzaspits: needinfo-; ylavi: Triaged+ |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Doc Text: | Cause: Adding a rhev-h-3.5 to a fresh rhev-m-3.6 installation requires a preparation stage: one should define a 3.5 cluster in Engine, define a "rhevm" network, and set "rhevm" as the management network of the 3.5 cluster. The ovirtmgmt network should be removed from the 3.5 cluster, or at the very least defined as non-required, to avoid confusion. Consequence: Fix: Result: |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2016-04-20 01:32:15 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | Network |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Bug Depends On: | |
| Bug Blocks: | 1206139, 1285700 |
| Attachments: | |
Created attachment 1082462 [details]
/var/log/*.* + sosreport + engine.log
This bug affects registering a newly installed RHEV-H 3.5.z (on 7.1) to RHEV-M 3.6 in 3.5 cluster mode. This flow is not very common, so targeting this for 3.6.1 for now.

Dan, the problem seems to be the change of the bridge name between 3.5.z and 3.6. I don't see what RHEV-H is doing differently from RHEL-H, and I wonder how RHEL-H is expected to work in the upgrade scenario. Simone, do you have any insight into what is going wrong here?

Dan, is vdsm-4.16.26-1.el7ev.x86_64 from downstream 3.5 able to work with a differently named management bridge?

Unlike RHEL-H, RHEV-H creates a management bridge BEFORE being added to a cluster. This makes RHEV-H 3.5.z incompatible with the default name of the management network on rhev-m-3.6. RHEL-H waits for Engine to initiate creation of the management network, and Engine should choose the correct name. What is the management network on your 3.5 cluster? If it is "ovirtmgmt", it is not a bug. Please create a cluster with management network "rhevm", and add 3.5 hosts to it.

(In reply to Dan Kenigsberg from comment #9)
> What is the management network on your 3.5 cluster?

I used the default value "ovirtmgmt".

> If it is "ovirtmgmt", it is not a bug. Please create a cluster with
> management network "rhevm", and add 3.5 hosts to it.

But after creating a cluster with the "rhevm" management network, it reports "Cannot edit host, Changing management network in a non-empty cluster is not allowed" while approving the RHEV-H host. Please see the attachment for more details.

Created attachment 1085695 [details]
management_network.png
Created attachment 1085696 [details]
network-rhevm.png
Created attachment 1085697 [details]
error.png
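For context, the name mismatch described above can be confirmed from the host side before approval. This is only a minimal inspection sketch, using commands that appear elsewhere in this bug (brctl, vdsClient, and vdsm's persistence directory); the "rhevm" entries it mentions assume a RHEV-H 3.5.z host installed with its default management bridge:

```sh
# On the RHEV-H 3.5.z host, before approving it in the engine:

# Which management bridge did RHEV-H create at install time?
brctl show                                   # RHEV-H 3.5.z pre-creates the bridge (named "rhevm")

# Which networks does vdsm keep in its persisted configuration?
ls /var/lib/vdsm/persistence/netconf/nets/   # expect a "rhevm" entry, while RHEV-M 3.6 defaults to "ovirtmgmt"

# What does vdsm report to the engine?
vdsClient -s 0 getVdsCaps | grep -E "networks|bridges"
```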
shaochen, I succeeded in reproducing the error message shown in the "error.png" attachment by trying to move a host to a cluster with a different management network. In that flow the message seems erroneous and misleading. I have opened a bug about that (https://bugzilla.redhat.com/show_bug.cgi?id=1275337) and submitted a patch for it. Please confirm that this is the flow, or describe how to reproduce the message.

I succeeded in adding a host with a pre-configured rhevm network to a cluster with an identically named network, with both vdsm versions (3.5 and 3.6). RHEV-M tries to avoid changing the management network on a host, as that might lead to connectivity loss with the host. Thus adding a host to a cluster with a different management network does not work.

(In reply to Yevgeny Zaspitsky from comment #14)
> Please confirm that this is the flow, or describe how to reproduce the message.

Detailed steps:

Test version:
rhev-hypervisor-7-7.1-20150911.0
ovirt-node-3.2.3-20.el7.noarch
vdsm-4.16.26-1.el7ev.x86_64
RHEV-M 3.6.0-0.18.el6

How reproducible: 100%

Test steps:
1. Install RHEV-H 7.1-20150911.0.
2. RHEV-M 3.6: Create a data center with 3.5 compatibility version.
3. Edit the management network and rename "ovirtmgmt" to "rhevm".
4. Create a cluster with management network "rhevm".
5. Register RHEV-H 7.1 to RHEV-M 3.6.
6. Approve the host.

Test result:
It reports "Cannot edit host, Changing management network in a non-empty cluster is not allowed" while approving the RHEV-H host.

Should this be on MODIFIED?

Adding the TestBlocker keyword; see comment 5. This blocked the RHEV-H upgrade flow from 7.1 to 7.2 via RHEV-M.

Tested and failed QA; this bug is still relevant on 3.6.1.1-0.1.el6. Failed to add Red Hat Enterprise Virtualization Hypervisor release 7.2 (20151129.1.el7ev) to a 3.5 cluster in rhev-m 3.6.1.1 with the errors:
- "Host orchid-vds2.qa.lab.tlv.redhat.com installation failed. Failed to configure management network on the host."
- "Host orchid-vds2.qa.lab.tlv.redhat.com does not comply with the cluster dfs networks, the following networks are missing on host: 'ovirtmgmt'"

The host was installed and configured via the TUI to register to rhev-m.
ovirt-node-3.2.3-29.el7.noarch
vdsm-4.16.30-1.el7ev

Engine should choose the correct name, but it fails.

Created attachment 1102715 [details]
logs+screenshots
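When this failure reproduces, the relevant messages can usually be pulled out of the attached logs. A small grep sketch, assuming the default log locations for the engine and for vdsm on the host:

```sh
# On the RHEV-M machine: why did installation / management-network setup fail?
grep -iE "management network|does not comply" /var/log/ovirt-engine/engine.log | tail -n 20

# On the RHEV-H host: what did vdsm do during setupNetworks and network restore?
grep -i "setupNetworks" /var/log/vdsm/vdsm.log | tail -n 20
grep -i "restore" /var/log/vdsm/supervdsm.log | tail -n 20
```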
(In reply to Ying Cui from comment #18)
> Adding Testblocker keyword, see comment 5, that blocked RHEV-H upgrade flow
> from 7.1 to 7.2 via rhevm.

And indeed this is blocking the upgrade flow to the latest 3.6, for example Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151201.2.el7ev).

After manually handling the management network on the host (removing the unmanaged 'rhevm' and attaching 'ovirtmgmt'), supervdsm fails to restore-net, because it tries to restore the 'rhevm' network:

[root@localhost ~]# tree /var/lib/vdsm/persistence/netconf/nets/
/var/lib/vdsm/persistence/netconf/nets/
└── rhevm
0 directories, 1 file

[root@localhost ~]# brctl show
bridge name     bridge id           STP enabled     interfaces
ovirtmgmt       8000.001a647a9462   no              enp4s0

Dec 6 12:04:17 localhost vdsm-tool: Traceback (most recent call last):
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 425, in <module>
Dec 6 12:04:17 localhost vdsm-tool: restore(args)
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 388, in restore
Dec 6 12:04:17 localhost vdsm-tool: unified_restoration()
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 110, in unified_restoration
Dec 6 12:04:17 localhost vdsm-tool: _remove_networks_in_running_config()
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/vdsm-restore-net-config", line 195, in _remove_networks_in_running_config
Dec 6 12:04:17 localhost vdsm-tool: _inRollback=True)
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/share/vdsm/network/api.py", line 932, in setupNetworks
Dec 6 12:04:17 localhost vdsm-tool: "system" % network)
Dec 6 12:04:17 localhost vdsm-tool: network.errors.ConfigNetworkError: (27, "Cannot delete network 'rhevm': It doesn't exist in the system")
Dec 6 12:04:17 localhost vdsm-tool: Traceback (most recent call last):
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/bin/vdsm-tool", line 219, in main
Dec 6 12:04:17 localhost vdsm-tool: return tool_command[cmd]["command"](*args)
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 41, in restore_command
Dec 6 12:04:17 localhost vdsm-tool: exec_restore(cmd)
Dec 6 12:04:17 localhost vdsm-tool: File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 54, in exec_restore
Dec 6 12:04:17 localhost vdsm-tool: raise EnvironmentError('Failed to restore the persisted networks')
Dec 6 12:04:17 localhost vdsm-tool: EnvironmentError: Failed to restore the persisted networks

After upgrade, the host ends up in a non-responsive state.
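When the host ends up non-responsive like this, the stale persisted definition that trips restore-nets can be inspected directly. A minimal inspection sketch based on the paths and commands in the log above; it does not modify anything (the proper fix is on the engine side):

```sh
# What will vdsm try to restore on boot?
tree /var/lib/vdsm/persistence/netconf/nets/        # shows the stale 'rhevm' entry
cat /var/lib/vdsm/persistence/netconf/nets/rhevm    # the persisted network definition

# What actually exists on the host right now?
brctl show

# Re-run the restore step that failed in the log above
vdsm-tool restore-nets
```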
Looks like the following NPE (from the attached log) is related to the problem: java.lang.NullPointerException at org.ovirt.engine.core.bll.network.host.PersistentHostSetupNetworksCommand.checkForChanges(PersistentHostSetupNetworksCommand.java:69) [bll.jar:] at org.ovirt.engine.core.bll.network.host.PersistentHostSetupNetworksCommand.executeCommand(PersistentHostSetupNetworksCommand.java:52) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1215) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1359) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1983) [bll.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:] at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1396) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:378) [bll.jar:] at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:475) [bll.jar:] at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:457) [bll.jar:] at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:667) [bll.jar:] at sun.reflect.GeneratedMethodAccessor162.invoke(Unknown Source) [:1.8.0_51] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_51] at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_51] at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:374) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:74) [jboss-as-weld.jar:7.5.4.Final-redhat-4] at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:84) [jboss-as-weld.jar:7.5.4.Final-redhat-4] at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:97) [jboss-as-weld.jar:7.5.4.Final-redhat-4] at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) 
[jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:93) [jboss-as-weld.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:53) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:52) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:266) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:377) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:246) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:43) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) 
[jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:185) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.2.Final-redhat-1] at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:73) [jboss-as-ee.jar:7.5.4.Final-redhat-4] at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view2.runInternalAction(Unknown Source) [bll.jar:] at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source) [:1.8.0_51] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_51] at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_51] at org.jboss.weld.util.reflection.SecureReflections$13.work(SecureReflections.java:267) [weld-core.jar:1.1.31.Final-redhat-1] at org.jboss.weld.util.reflection.SecureReflectionAccess.run(SecureReflectionAccess.java:52) [weld-core.jar:1.1.31.Final-redhat-1] at org.jboss.weld.util.reflection.SecureReflectionAccess.runAsInvocation(SecureReflectionAccess.java:137) [weld-core.jar:1.1.31.Final-redhat-1] at org.jboss.weld.util.reflection.SecureReflections.invoke(SecureReflections.java:263) [weld-core.jar:1.1.31.Final-redhat-1] at org.jboss.weld.bean.proxy.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:115) [weld-core.jar:1.1.31.Final-redhat-1] at org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56) [weld-core.jar:1.1.31.Final-redhat-1] at org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:105) [weld-core.jar:1.1.31.Final-redhat-1] at org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal$142795634$Proxy$_$$_Weld$Proxy$.runInternalAction(BackendCommandObjectsHandler$BackendInternal$BackendLocal$142795634$Proxy$_$$_Weld$Proxy$.java) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInternalAction(CommandBase.java:2346) [bll.jar:] at org.ovirt.engine.core.bll.ChangeVDSClusterCommand$2.run(ChangeVDSClusterCommand.java:380) [bll.jar:] at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:92) [utils.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_51] at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_51] Raising the priority according to comment 21. Dan, The workaround suggested is not working. (new 3.5 cluster with rhevm network as a management one). null point exception is still here in the log. 
"Host orchid-vds2.qa.lab.tlv.redhat.com does not comply with the cluster dfs networks, the following networks are missing on host: 'ovirtmgmt'" - The workaround for renaming the management network of the cluster from 'ovirtmgmt' to 'rhevm' is not acceptable at all. What if we have more then one cluster? i need to go to all my clusters and rename the management networks?? This must be taken care by engine in a better way. while the host waiting in 'Approve' state in engine. Note, that one of the side effects of this is: If we manually removing the 'rhevm' network from the host via setup networks and attaching 'ovirtmgmt' network to host and approving setup networks. The persist is not send by engine to vdsm and 'ovirtmgmt' is not persistent on the host as if should. Let me repeat. In order to allow registering rhev-h-3.5 to a fresh rhev-m-3.6 installation, the user must first create a "rhevm" network, create an 3.5 cluster, and set "rhevm" as its management network. I understand that this currently result in a NPE which is fixed by http://gerrit.ovirt.org/50037 . 50133 is should be a better fix, but even without it, this bug should not block 3.6.1. Verified on - 3.6.2.5-0.1.el6 RHEV-H 3.5.7 installed with success in a 3.6 engine after a proper WA ^^ Tested with: rhev-hypervisor7-7.2-20160105.1 4.16.32-1.el7ev ovirt-node-3.2.3-30.el7.noarch 3.6.2.5-0.1.el6 Also, on the way, after installing the server with success in a 3.6 engine, i tested with success an upgrade from : rhev-hypervisor7-7.2-20160105.1 > RHEV Hypervisor - 7.2 - 20160113.0.el7ev vdsm-4.17.17-0.el7ev ovirt-node-3.6.1-3.0.el7ev.noarch > Adding a rhev-h-3.5 to a fresh rhev-m-3.6 installation requires a preparation > stage: one should define a 3.5 cluster in Engine, define a "rhevm" network, and > set "rhevm" as the management network of the 3.5 cluster. The ovirtmgmt network > should be removed from the 3.5 cluster, or at the very least - defined as
> non-required, to avoid confusion.
It seems the preparation stage does not cover a VLAN test environment.
I still hit the bug in a VLAN environment.
Test version:
rhev-hypervisor7-7.2-20151129.1
ovirt-node-3.2.3-29.el7.noarch
Test steps:
1. Define a 3.5 cluster in Engine.
2. Define a "rhevm" network with a VLAN tag.
3. Set "rhevm" as the management network of the 3.5 cluster.
4. Install rhev-hypervisor7-7.2-20151129.1.
5. Configure the network with a VLAN tag.
6. Register the RHEV-H host to RHEV-M.
Test result:
Host xxx does not comply with the cluster xxx networks, the following networks are missing on the host "rhevm".
Actually, I already added the "rhevm" network in step 2.
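For reference, the engine-side preparation in steps 1-3 above can also be scripted against the RHEV-M 3.6 REST API instead of the Admin Portal. This is only a rough sketch: the engine URL, credentials, and IDs are placeholders, and the exact payload elements (in particular the <management_network> element for the cluster) are assumptions that should be checked against the 3.6 REST API documentation:

```sh
ENGINE=https://rhevm.example.com/ovirt-engine/api   # placeholder engine URL
AUTH='admin@internal:password'                      # placeholder credentials

# 1. Create the "rhevm" logical network in the data center (DC_ID is a placeholder)
curl -k -u "$AUTH" -H 'Content-Type: application/xml' -X POST \
  -d '<network><name>rhevm</name><data_center id="DC_ID"/></network>' \
  "$ENGINE/networks"

# 2. Create a 3.5 cluster and (assumption) point its management network at "rhevm"
curl -k -u "$AUTH" -H 'Content-Type: application/xml' -X POST \
  -d '<cluster><name>cluster35</name><cpu id="Intel SandyBridge Family"/>
        <data_center id="DC_ID"/><version major="3" minor="5"/>
        <management_network id="RHEVM_NET_ID"/></cluster>' \
  "$ENGINE/clusters"
```

As noted in the doc text above, the ovirtmgmt network should then be removed from the 3.5 cluster, or at least made non-required, before adding the 3.5 hosts.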
Hi,

Did you remove the 'ovirtmgmt' network, or set it as non-required, in this cluster?

(In reply to Michael Burman from comment #29)
> Did you remove the 'ovirtmgmt' network, or set it as non-required, in this cluster?

Yes, I did. Please see the attachment for more details.

Created attachment 1116450 [details]
vlan-network
Created attachment 1116451 [details]
vlan-failed
Just to make sure… Is the actual connection between the engine and rhev-h over a VLAN network in your environment? Please excuse me for the silly question.

(In reply to Yevgeny Zaspitsky from comment #33)
> Is the actual connection between the engine and rhev-h over a VLAN network
> in your environment?

No. I am sure that our VLAN environment works, because an RHEV-H 7.2 for RHEV 3.6 (rhev-hypervisor7-7.2-20160113.0) build registered to the same engine (3.6 cluster) with bond+vlan networking (ovirtmgmt) comes up fine. Please see my new attachment "bond_vlan_36.png".

> Pls excuse me for the silly question.

It doesn't matter :)

Created attachment 1116478 [details]
bond_vlan_36
I will try to reproduce it over a VLAN network.

Installed and added rhev-h 3.5 (RHEV Hypervisor - 7.2 - 20160105.1.el7ev) successfully to a 3.6.2.5-0.1.el6 engine with the WA ^^ over a VLAN NIC.

Created attachment 1116508 [details]
screenshot
Created attachment 1116510 [details]
screenshot2
Created attachment 1116511 [details]
screenshot3
(In reply to Michael Burman from comment #36)
> I will try to reproduce it over vlan network

Do you mind trying with a bond+vlan test environment? Thanks!

I tested it again with a VLAN network, and the host comes up. But with a bond+vlan network I still hit #c28:
Host xxx does not comply with the cluster xxx networks, the following networks are missing on the host "rhevm".

Hi Saochen,

I tested it with a VLAN-tagged bond and succeeded :) Everything works as expected; the host was added successfully over a VLAN bond. Do you see anything in the logs? What is the bond mode?

Yevgeny, any idea what is going on? Please take a look at Saochen's environment.

Created attachment 1116579 [details] bond_vlan_all_logs

(In reply to Michael Burman from comment #43)
> Do you see anything in the logs? What is the bond mode?

I used mode 0.

And this is the problem --> on 3.6, VM networks cannot be attached to bonds in mode 0, 5 or 6. See BZ 1094842.

Because the default bond mode on 3.5 is balance-rr (mode 0), you are failing (at least I believe that this is the reason). Please try it with active-backup (mode 1), for example. This is why we requested fixing the default bond mode to a safe one (active-backup) for 3.5.z as well. See BZ 1294340.

shaochen, would you provide the output of `vdsClient -s 0 getVdsCaps` right after you configure networking on your rhevh, and then again after host approval? vdsm.log would help, too. Please also include engine.log - we'd like to verify that the bond mode issue spotted by Michael is logged there.

(In reply to Yevgeny Zaspitsky from comment #46)
> would you provide the output of `vdsClient -s 0 getVdsCaps` right after you
> configure networking on your rhevh, and then again after host approval?
all log info already provide in #c44 after configure networking on your rhevh # vdsClient -s 0 getVdsCaps Connection to 0.0.0.0:54321 refused after host approval(failed) # vdsClient -s 0 getVdsCaps HBAInventory = {'FC': [], 'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:bc6bfca72792'}]} ISCSIInitiatorName = 'iqn.1994-05.com.redhat:bc6bfca72792' autoNumaBalancing = 0 bondings = {'bond1': {'addr': '', 'cfg': {'BONDING_OPTS': 'mode=balance-rr miimon=100', 'DEVICE': 'bond1', 'NM_CONTROLLED': 'no', 'ONBOOT': 'yes', 'TYPE': 'Bond'}, 'hwaddr': '00:1b:21:27:47:0b', 'ipv4addrs': [], 'ipv6addrs': ['fe80::21b:21ff:fe27:470b/64'], 'mtu': '1500', 'netmask': '', 'opts': {'miimon': '100'}, 'slaves': ['p3p1', 'p4p1']}} bridges = {} clusterLevels = ['3.4', '3.5'] cpuCores = '4' cpuFlags = 'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,nopl,xtopology,nonstop_tsc,aperfmperf,eagerfpu,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,lahf_lm,ida,arat,epb,pln,pts,dtherm,tpr_shadow,vnmi,flexpriority,ept,vpid,xsaveopt,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270,model_SandyBridge' cpuModel = 'Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz' cpuSockets = '1' cpuSpeed = '3696.703' cpuThreads = '8' emulatedMachines = ['pc-i440fx-rhel7.2.0', 'pc', 'pc-i440fx-rhel7.0.0', 'pc-q35-rhel7.1.0', 'rhel6.3.0', 'pc-q35-rhel7.2.0', 'q35', 'rhel6.4.0', 'rhel6.0.0', 'pc-i440fx-rhel7.1.0', 'rhel6.5.0', 'rhel6.6.0', 'rhel6.1.0', 'pc-q35-rhel7.0.0', 'rhel6.2.0'] guestOverhead = '65' hooks = {'after_network_setup': {'30_ethtool_options': {'md5': '2f6fe7f77eb498fb9493ef9ea5ad8705'}}, 'after_vm_destroy': {'50_vhostmd': {'md5': '92498ed80c0219749829ecce2813fc7c'}}, 'before_vm_dehibernate': {'50_vhostmd': {'md5': 'f3ee6dbf6fbd01333bd3e32afec4fbba'}}, 'before_vm_migrate_destination': {'50_vhostmd': {'md5': 'f3ee6dbf6fbd01333bd3e32afec4fbba'}}, 'before_vm_start': {'50_hostedengine': {'md5': '45dde62155b5412eafbfff5ef265acc2'}, '50_vhostmd': {'md5': 'f3ee6dbf6fbd01333bd3e32afec4fbba'}}} kdumpStatus = 0 kvmEnabled = 'true' lastClient = '127.0.0.1' lastClientIface = 'lo' liveMerge = 'true' liveSnapshot = 'true' memSize = '7607' netConfigDirty = 'False' networks = {} nics = {'em1': {'addr': '', 'cfg': {}, 'hwaddr': 'd4:be:d9:95:61:ca', 'ipv4addrs': [], 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'speed': 0}, 'p3p1': {'addr': '', 'cfg': {'DEVICE': 'p3p1', 'HWADDR': '00:1b:21:27:47:0b', 'MASTER': 'bond1', 'NM_CONTROLLED': 'no', 'ONBOOT': 'yes', 'SLAVE': 'yes'}, 'hwaddr': '00:1b:21:27:47:0b', 'ipv4addrs': [], 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'permhwaddr': '00:1b:21:27:47:0b', 'speed': 1000}, 'p4p1': {'addr': '', 'cfg': {'DEVICE': 'p4p1', 'HWADDR': '00:10:18:81:a4:a0', 'MASTER': 'bond1', 'NM_CONTROLLED': 'no', 'ONBOOT': 'yes', 'SLAVE': 'yes'}, 'hwaddr': '00:1b:21:27:47:0b', 'ipv4addrs': [], 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'permhwaddr': '00:10:18:81:a4:a0', 'speed': 1000}, 'p4p2': {'addr': '', 'cfg': {}, 'hwaddr': '00:10:18:81:a4:a2', 'ipv4addrs': [], 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'speed': 0}} numaNodeDistance = {'0': [10]} numaNodes = {'0': {'cpus': [0, 1, 2, 3, 4, 5, 6, 7], 'totalMemory': '7607'}} onlineCpus = '0,1,2,3,4,5,6,7' operatingSystem = {'name': 'RHEV Hypervisor', 'release': '20151129.1.el7ev', 'version': '7.2'} 
packages2 = {'kernel': {'buildtime': 1446139769.0, 'release': '327.el7.x86_64', 'version': '3.10.0'}, 'libvirt': {'buildtime': 1444310232, 'release': '13.el7', 'version': '1.2.17'}, 'mom': {'buildtime': 1431358543, 'release': '5.el7ev', 'version': '0.4.1'}, 'qemu-img': {'buildtime': 1444825406, 'release': '31.el7', 'version': '2.3.0'}, 'qemu-kvm': {'buildtime': 1444825406, 'release': '31.el7', 'version': '2.3.0'}, 'spice-server': {'buildtime': 1443519054, 'release': '15.el7', 'version': '0.12.4'}, 'vdsm': {'buildtime': 1448308136, 'release': '1.el7ev', 'version': '4.16.30'}} reservedMem = '321' rngSources = ['random'] selinux = {'mode': '1'} software_revision = '1' software_version = '4.16' supportedENGINEs = ['3.4', '3.5'] supportedProtocols = ['2.2', '2.3'] uuid = '4C4C4544-0050-4310-8039-B6C04F423358' version_name = 'Snow Man' vlans = {'bond1.20': {'addr': '192.168.20.129', 'cfg': {'BOOTPROTO': 'dhcp', 'DEVICE': 'bond1.20', 'IPV6INIT': 'no', 'IPV6_AUTOCONF': 'no', 'NM_CONTROLLED': 'no', 'ONBOOT': 'yes', 'PEERNTP': 'yes', 'VLAN': 'yes'}, 'iface': 'bond1', 'ipv4addrs': ['192.168.20.129/24'], 'ipv6addrs': ['fe80::21b:21ff:fe27:470b/64'], 'mtu': '1500', 'netmask': '255.255.255.0', 'vlanid': 20}} vmTypes = ['kvm'] (In reply to Michael Burman from comment #45) > And this is the problem --> on 3.6 VM networks cannot be attached to bonds > in mode 0, 5 or 6. > See BZ 1094842 > > Because the default bond mode on 3.5 is balance-rr(mode 0) you are > failing(at least i believe that this is the reason), please try it with > active-backup(mode 1 ) for example. > > This is why we requested to fix the default bond mode to safe > one(active-backup) as well for 3.5.z . > See BZ 1294340 After set bond mode as active-backup(mode 1), add RHEV-H to engine with bond_vlan network can succeed. Thank you for the explanation. |
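Since the root cause here turned out to be the bond mode rather than the management-network name, a quick host-side check may help others hitting the same symptom. A minimal sketch, assuming the bond is named bond1 (as in the getVdsCaps output above) and uses the standard Linux bonding sysfs/ifcfg interfaces; changing the bond mode will briefly disrupt connectivity:

```sh
# Check the current bonding mode (balance-rr / mode 0 is the problematic 3.5 default)
cat /sys/class/net/bond1/bonding/mode        # e.g. "balance-rr 0"
head /proc/net/bonding/bond1                 # human-readable bond status

# Switch the bond to active-backup (mode 1) in its ifcfg file, then restart networking
sed -i 's/mode=balance-rr/mode=active-backup/' /etc/sysconfig/network-scripts/ifcfg-bond1
grep BONDING_OPTS /etc/sysconfig/network-scripts/ifcfg-bond1
systemctl restart network
```

On RHEV-H, keep in mind that /etc lives on a mostly read-only image, so such an edit normally also has to be persisted through ovirt-node's persistence mechanism (or done via the TUI) to survive a reboot.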
Created attachment 1082460 [details]
failed to approve 7.1

Description of problem:
Failed to approve RHEV-H 7.1 to RHEV-M 3.6 with a 3.5 cluster due to a missing ovirtmgmt network.

Version-Release number of selected component (if applicable):
rhev-hypervisor-7-7.1-20150911.0
ovirt-node-3.2.3-20.el7.noarch
vdsm-4.16.26-1.el7ev.x86_64
RHEV-M 3.6.0-0.18.el6

How reproducible:
100%

Steps to Reproduce:
1. Install RHEV-H 7.1-20150911.0.
2. Register RHEV-H 7.1 to RHEV-M 3.6 with a 3.5 cluster.
3. Approve the host.

Actual results:
Failed to approve RHEV-H 7.1 to RHEV-M 3.6 with a 3.5 cluster.

Expected results:
It will report "failed to configure management network on the host, the following network are missing on host "ovirtmgmt"".

Additional info: