Bug 1570388

Summary: Add host failed if cluster has a required network
Product: [oVirt] ovirt-engine
Reporter: Michael Burman <mburman>
Component: BLL.Network
Assignee: Alona Kaplan <alkaplan>
Status: CLOSED CURRENTRELEASE
QA Contact: Michael Burman <mburman>
Severity: high
Docs Contact:
Priority: high
Version: 4.2.3.2
CC: alkaplan, bugs, eraviv, lveyde
Target Milestone: ovirt-4.2.3
Keywords: Regression
Target Release: ---
Flags: rule-engine: ovirt-4.2+, rule-engine: blocker+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: ovirt-engine-4.2.3.4
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-10 06:28:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs none

Description Michael Burman 2018-04-22 13:58:10 UTC
Created attachment 1425340 [details]
logs

Description of problem:
Add host failed if cluster has a required network.

If the cluster has a required network (other than the management network) that is attached to the cluster's existing hosts (if there are any), adding a host fails and the host becomes non-operational until the network's "required" flag is unchecked.

2018-04-22 16:43:34,620+03 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-819) [758db545] FINISH, SetVdsStatusVDSCommand, log id: 3b11b634
2018-04-22 16:43:34,791+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-24) [] EVENT_ID: VDS_DETECTED(13), Status of host orchid-vds1.qa.lab.tlv.redhat.com was set to NonOperational.
2018-04-22 16:43:34,805+03 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engine-Thread-819) [758db545] Host 'orchid-vds1.qa.lab.tlv.redhat.com' is set to Non-Operational, it is missing the following networks: 'req-net'
2018-04-22 16:43:34,928+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-819) [758db545] EVENT_ID: VDS_SET_NONOPERATIONAL_NETWORK(519), Host orchid-vds1.qa.lab.tlv.redhat.com does not comply with the cluster Cluster1 networks, the following networks are missing on host: 'req-net'
2018-04-22 16:43:34,930+03 ERROR [org.ovirt.engine.core.bll.job.ExecutionHandler] (EE-ManagedThreadFactory-engine-Thread-819) [758db545] Exception: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@7af89b[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5b8510d5 connection handles=1 lastReturned=1524404614851 lastValidated=1524404026942 lastCheckedOut=1524404614851 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@3d718bcd mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@5034ce22[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@5ea2d360[connectionListener=7af89b connectionManager=17bff327 warned=false currentXid=null productName=PostgreSQL productVersion=9.5.9 jndiName=java:/ENGINEDataSource] txSync=null]

Host orchid-vds1.qa.lab.tlv.redhat.com does not comply with the cluster Cluster1 networks, the following networks are missing on host: 'req-net'
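
To find this condition in engine.log, the host, cluster, and missing networks can be pulled out of the audit message quoted above. This is a minimal sketch, assuming each log entry sits on a single line and uses the wording shown in this report. The function name, the regular expression, and the comma-separated handling of multiple networks are illustrative assumptions, not part of oVirt.

```python
import re

# Pattern modelled on the audit message quoted in this report. Assumption:
# other engine versions may word the message differently.
_MISSING_NET_RE = re.compile(
    r"Host (?P<host>\S+) does not comply with the cluster (?P<cluster>\S+) networks, "
    r"the following networks are missing on host: '(?P<networks>[^']*)'"
)


def find_missing_networks(log_text):
    """Return a list of (host, cluster, [networks]) for each matching message."""
    results = []
    for match in _MISSING_NET_RE.finditer(log_text):
        # Assumption: several missing networks would be comma-separated.
        networks = [n.strip() for n in match.group("networks").split(",") if n.strip()]
        results.append((match.group("host"), match.group("cluster"), networks))
    return results
```

Calling `find_missing_networks(open("engine.log").read())` on a log with unwrapped entries would list every host reported as missing a required network.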

Version-Release number of selected component (if applicable):
4.2.3.2-0.1.el7
vdsm-4.20.26-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a required network in the cluster (in addition to the ovirtmgmt network)
2. If the cluster already has hosts, attach this network to all of them
3. Add a new host to this cluster

Actual results:
Add host fails: the host becomes non-operational
Engine complains that the required network is missing on the host

Expected results:
Add host shouldn't fail in such a case

Comment 1 Michael Burman 2018-04-22 14:08:54 UTC
The bug exists in 4.1 as well

Comment 2 Alona Kaplan 2018-04-23 12:55:14 UTC
(In reply to Michael Burman from comment #1)
> The bug exists in 4.1 as well

It is OK for the host to be non-operational until the required network is attached to it. Therefore, there is no bug in 4.1.

The bug in 4.2 is that the engine fails to get the host's capabilities. The engine cannot see the host's NICs, and the host gets stuck in an activating/non-operational loop.

Another way to reproduce the bug is to remove a required network from an active host.
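
The expected behaviour described in this comment amounts to a set difference: a host complies only when every network marked as required in the cluster is attached to it. The following is a simplified Python model of that rule, not the engine's actual Java code. All function names and the dictionary layout are hypothetical.

```python
def missing_required_networks(cluster_networks, host_networks):
    """Return the required cluster networks not attached to the host, sorted.

    cluster_networks: dict mapping network name -> True if the network is required
    host_networks: iterable of network names attached to the host
    """
    attached = set(host_networks)
    return sorted(
        name
        for name, required in cluster_networks.items()
        if required and name not in attached
    )


def expected_host_status(cluster_networks, host_networks):
    # The host stays non-operational while any required network is missing.
    if missing_required_networks(cluster_networks, host_networks):
        return "NonOperational"
    return "Up"
```

In this model, a newly added host that has only ovirtmgmt is non-operational until 'req-net' is attached, which matches the behaviour described as acceptable for 4.1.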

Comment 3 Michael Burman 2018-04-23 14:12:57 UTC
(In reply to Alona Kaplan from comment #2)
> (In reply to Michael Burman from comment #1)
> > The bug exists in 4.1 as well
> 
> It is OK for the host to be non-operational until the required network is
> attached to it. Therefore, there is no bug in 4.1.
> 
> The bug in 4.2 is that the engine fails to get the host's capabilities. The
> engine cannot see the host's NICs, and the host gets stuck in an
> activating/non-operational loop.
> 
> Another way to reproduce the bug is to remove a required network from an
> active host.

Yes. Based on this, I am setting the bug as a regression.

Comment 4 Red Hat Bugzilla Rules Engine 2018-04-23 14:13:04 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 5 Alona Kaplan 2018-04-25 07:23:23 UTC
*** Bug 1569487 has been marked as a duplicate of this bug. ***

Comment 6 Michael Burman 2018-05-03 15:54:43 UTC
Verified on - 4.2.3.4-0.1.el7

Comment 7 Sandro Bonazzola 2018-05-10 06:28:54 UTC
This bugzilla is included in oVirt 4.2.3 release, published on May 4th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.