Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1052314

Summary: [RHEVM] cannot put 3.3 host into maintenance after failed addition to 3.4 cluster
Product: Red Hat Enterprise Virtualization Manager Reporter: Martin Pavlik <mpavlik>
Component: ovirt-engine Assignee: Liran Zelkha <lzelkha>
Status: CLOSED CURRENTRELEASE QA Contact: Tareq Alayan <talayan>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.4.0 CC: acathrow, bazulay, emesika, gklein, iheim, lpeer, lzelkha, mpavlik, Rhev-m-bugs, sbonazzo, talayan, yeylon
Target Milestone: ---   
Target Release: 3.4.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: infra
Fixed In Version: ovirt-3.4.0-beta3 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
screenshot 1 none

Description Martin Pavlik 2014-01-13 15:45:45 UTC
Description of problem:
If a user adds a host with 3.3 VDSM to a RHEV-M that has a 3.4 Data Center and 3.4 Cluster, the installation fails and the host becomes non-operational:

Host dell-05 is installed with VDSM version (Non interactive user) and cannot join cluster Default which is compatible with VDSM versions [4.13, 4.9, 4.11, 4.12, 4.10].

After this it is not possible to put the host into maintenance mode in order to reassign it to another cluster; the host stays stuck forever in the 'Preparing for Maintenance' state, so it is impossible to fix the host or get rid of it.
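For illustration, this is roughly how the stuck state shows up through the oVirt 3.x REST API; the engine FQDN and credentials below are placeholders, and the host ID is the one from the installation log further down (a sketch, not output captured from this environment):

# Ask the engine for the host's status; the <status><state> element in the
# XML response keeps reporting the "preparing for maintenance" state
# indefinitely instead of ever reaching maintenance.
curl -k -u admin@internal:PASSWORD \
     "https://rhevm.example.com/api/hosts/4f2bbf7d-3338-49dc-aca4-178bbd5deca6"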

Version-Release number of selected component (if applicable):
RHEV-M:  oVirt Engine Version: 3.4.0-0.2.master.20140106180914.el6 
hypervisor: vdsm-4.13.2-0.6.el6ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install a host with 3.3-compatible VDSM into RHEV-M 3.4 with a 3.4 DC/Cluster.
2. After the host becomes non-operational, try to put it into maintenance mode (see the sketch below).
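
For reference, step 2 corresponds to the deactivate action of the oVirt 3.x REST API; a minimal sketch (engine FQDN and credentials are placeholders):

# Request maintenance mode for the non-operational host.
curl -k -u admin@internal:PASSWORD \
     -H "Content-Type: application/xml" \
     -d "<action/>" \
     "https://rhevm.example.com/api/hosts/4f2bbf7d-3338-49dc-aca4-178bbd5deca6/deactivate"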

Actual results:
The host stays stuck forever in the 'Preparing for Maintenance' state.

Expected results:
The host is switched to maintenance.

Additional info:

It seems we had a similar bug, https://bugzilla.redhat.com/show_bug.cgi?id=1031536, when going from RHEV-M 3.2 to RHEV-M 3.3.


2014-01-13 08:18:00,711 ERROR [org.ovirt.engine.core.bll.InstallVdsCommand] (org.ovirt.thread.pool-6-thread-15) [4cac7eb1] Host installation failed for host 4f2bbf7d-3338-49dc-aca4-178bbd5deca6, dell-05.: org.ovirt.engine.core.bll.InstallVdsCommand$VdsInstallException: Failed to configure management network on the host
	at org.ovirt.engine.core.bll.InstallVdsCommand.configureManagementNetwork(InstallVdsCommand.java:294) [bll.jar:]
	at org.ovirt.engine.core.bll.InstallVdsCommand.installHost(InstallVdsCommand.java:203) [bll.jar:]
	at org.ovirt.engine.core.bll.InstallVdsCommand.executeCommand(InstallVdsCommand.java:105) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1114) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1199) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1875) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1219) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:351) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:414) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:393) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:624) [bll.jar:]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_45]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.7.0_45]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_45]
	at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_45]
	at org.jboss.as.ee.component.ManagedReferenceMethodInterceptorFactory$ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptorFactory.java:72) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final]

Comment 1 Liran Zelkha 2014-01-14 14:17:21 UTC
Can you describe which networks exist for the host?

Comment 2 Liran Zelkha 2014-01-14 14:21:15 UTC
Also, can you send engine and vdsm host information (ssh username/password)?

Comment 3 Martin Pavlik 2014-01-14 16:07:54 UTC
I used a clean RHEL 6.5 host with the 3.3 VDSM repo enabled.

I installed it via RHEV-M 3.4 into a 3.4 Data Center/Cluster.

At the end the host was non-operational; in Setup Networks the host appears as if it has no ovirtmgmt assigned to any interface (see screenshot).

However, on the host itself the network seems to be configured correctly (see the outputs below; a sketch for cross-checking the two views follows the configs):

[root@localhost ~]# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1a:4a:9c:6e:92 brd ff:ff:ff:ff:ff:ff
4: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN 
    link/ether 4a:7b:d3:28:34:f9 brd ff:ff:ff:ff:ff:ff
5: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
6: bond4: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
7: bond1: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
8: bond2: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
9: bond3: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN 
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
10: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 00:1a:4a:9c:6e:92 brd ff:ff:ff:ff:ff:ff
    inet 10.34.66.2/24 brd 10.34.66.255 scope global ovirtmgmt

[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0 
# Generated by VDSM version 4.13.2-0.6.el6ev
DEVICE=eth0
ONBOOT=yes
HWADDR=00:1a:4a:9c:6e:92
BRIDGE=ovirtmgmt
NM_CONTROLLED=no
STP=no

[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt 
# Generated by VDSM version 4.13.2-0.6.el6ev
DEVICE=ovirtmgmt
ONBOOT=yes
TYPE=Bridge
DELAY=0
BOOTPROTO=dhcp
DEFROUTE=no
NM_CONTROLLED=no
STP=no
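
A rough way to cross-check the discrepancy between what the engine shows and what the host actually has, assuming the stock VDSM CLI and REST API of this release (commands are illustrative; FQDN and credentials are placeholders):

# On the host: dump what VDSM reports to the engine, which is where the
# engine gets its view of the host's networks and bridges.
vdsClient -s 0 getVdsCaps | grep -A 10 -E "networks|bridges"

# On the engine: list the host's NICs as the engine sees them over REST.
curl -k -u admin@internal:PASSWORD \
     "https://rhevm.example.com/api/hosts/4f2bbf7d-3338-49dc-aca4-178bbd5deca6/nics"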

Comment 4 Martin Pavlik 2014-01-14 16:08:24 UTC
Created attachment 850022 [details]
screenshot 1

Comment 6 Liran Zelkha 2014-01-20 19:49:25 UTC
Hi Martin - is it ok if I restart the engine so I can connect to it with a debugger? I can see the error, but can't understand what's causing it.

Comment 7 Martin Pavlik 2014-02-03 12:43:47 UTC
(In reply to Liran Zelkha from comment #6)
> Hi Martin - is it ok if I restart the engine so I can connect to it with a
> debugger? I can see the error, but can't understand what's causing it.

yes, you can restart, go ahead please

Comment 8 Liran Zelkha 2014-02-04 09:51:20 UTC
Thanks. I connected, found the bug, and disconnected. You'll see the patch soon.

Comment 10 Tareq Alayan 2014-02-25 17:11:36 UTC
1. Added a 3.3 host to a 3.4 Data Center/cluster.
2. The host was stuck on Initializing for ~3 minutes, then became non-operational.
3. Successfully put it into Maintenance.

Verified on ovirt-engine-3.4.0-0.11.beta3.el6.noarch.

Comment 11 Itamar Heim 2014-06-12 14:07:51 UTC
Closing as part of 3.4.0