Bug 1569487

Summary: Failed to add new host to cluster VDS_SET_NONOPERATIONAL_NETWORK, Failed to configure management network on the host.
Product: [oVirt] ovirt-engine Reporter: Sergii Melnyk <melnyksergii>
Component: BLL.Network Assignee: Dominik Holler <dholler>
Status: CLOSED DUPLICATE QA Contact: Meni Yakove <myakove>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.2.2.5 CC: alkaplan, bugs, emesika, melnyksergii
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-25 07:23:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Network RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                          Flags
engine deploy log                    none
supervsdm log from deployed host     none
complete engine log                  none
vdsm log                             none

Description Sergii Melnyk 2018-04-19 11:42:20 UTC
Created attachment 1424019 [details]
engine deploy log

Description of problem:
Hello,
After updating oVirt from 4.2.1 to 4.2.2, I tried to add a new host to the cluster, but the host deployment fails with the error: Host trk-ovhv-4 installation failed. Failed to configure management network on the host.

Version-Release number of selected component (if applicable):
On ovirt-engine node:
ovirt-engine-tools-backup-4.2.2.5-1.el7.centos.noarch
ovirt-engine-tools-4.2.2.5-1.el7.centos.noarch
ovirt-engine-dwh-4.2.2.2-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-4.2.2.6-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.2.2.6-1.el7.centos.noarch
ovirt-ansible-cluster-upgrade-1.1.6-1.el7.centos.noarch
ovirt-ansible-manageiq-1.1.6-1.el7.centos.noarch
ovirt-engine-wildfly-11.0.0-1.el7.centos.x86_64
ovirt-engine-websocket-proxy-4.2.2.6-1.el7.centos.noarch
ovirt-engine-cli-3.6.9.2-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-4.2.2.6-1.el7.centos.noarch
ovirt-ansible-repositories-1.1.0-1.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-engine-dwh-setup-4.2.2.2-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.1.7-1.el7.centos.noarch
ovirt-js-dependencies-1.2.0-3.1.el7.centos.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
ovirt-host-deploy-java-1.7.3-1.el7.centos.noarch
ovirt-iso-uploader-4.2.0-1.el7.centos.noarch
ovirt-engine-dbscripts-4.2.2.5-1.el7.centos.noarch
ovirt-engine-webadmin-portal-4.2.2.5-1.el7.centos.noarch
ovirt-engine-restapi-4.2.2.5-1.el7.centos.noarch
ovirt-engine-backend-4.2.2.5-1.el7.centos.noarch
ovirt-engine-lib-4.2.2.6-1.el7.centos.noarch
ovirt-engine-setup-base-4.2.2.6-1.el7.centos.noarch
ovirt-imageio-proxy-setup-1.2.2-0.el7.centos.noarch
ovirt-provider-ovn-1.2.9-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.2.2.6-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-4.2.2.6-1.el7.centos.noarch
ovirt-ansible-vm-infra-1.1.5-1.el7.centos.noarch
ovirt-ansible-infra-1.1.4-1.el7.centos.noarch
ovirt-engine-metrics-1.1.3.4-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-11.0.1-1.el7.centos.noarch
ovirt-imageio-proxy-1.2.2-0.el7.centos.noarch
ovirt-release42-4.2.2-3.el7.centos.noarch
ovirt-web-ui-1.3.7-2.el7.centos.noarch
ovirt-ansible-engine-setup-1.1.0-1.el7.centos.noarch
ovirt-ansible-image-template-1.1.5-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.4-1.el7.noarch
ovirt-cockpit-sso-0.0.4-1.el7.noarch
ovirt-engine-api-explorer-0.0.2-1.el7.centos.noarch
ovirt-host-deploy-1.7.3-1.el7.centos.noarch
ovirt-engine-dashboard-1.2.2-3.el7.centos.noarch
ovirt-engine-4.2.2.5-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.2.2.6-1.el7.centos.noarch
ovirt-imageio-common-1.2.2-0.el7.centos.noarch
ovirt-engine-setup-4.2.2.6-1.el7.centos.noarch
ovirt-ansible-disaster-recovery-0.3-1.el7.centos.noarch
ovirt-ansible-roles-1.1.3-1.el7.centos.noarch

On the new host being added to the cluster:
ovirt-vmconsole-1.0.4-1.el7.noarch
ovirt-imageio-common-1.2.2-0.el7.centos.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
cockpit-ovirt-dashboard-0.11.20-1.el7.centos.noarch
ovirt-release42-4.2.2-3.el7.centos.noarch
ovirt-imageio-daemon-1.2.2-0.el7.centos.noarch
ovirt-host-4.2.2-2.el7.centos.x86_64
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-host-deploy-1.7.3-1.el7.centos.noarch
ovirt-host-dependencies-4.2.2-2.el7.centos.x86_64
ovirt-hosted-engine-ha-2.2.10-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.4-1.el7.noarch
ovirt-hosted-engine-setup-2.2.16-1.el7.centos.noarch
ovirt-provider-ovn-driver-1.2.9-1.el7.centos.noarch
vdsm-hook-vmfex-dev-4.20.23-1.el7.centos.noarch
vdsm-hook-vfio-mdev-4.20.23-1.el7.centos.noarch
vdsm-hook-ethtool-options-4.20.23-1.el7.centos.noarch
vdsm-client-4.20.23-1.el7.centos.noarch
vdsm-python-4.20.23-1.el7.centos.noarch
vdsm-hook-openstacknet-4.20.23-1.el7.centos.noarch
vdsm-network-4.20.23-1.el7.centos.x86_64
vdsm-jsonrpc-4.20.23-1.el7.centos.noarch
vdsm-api-4.20.23-1.el7.centos.noarch
vdsm-hook-fcoe-4.20.23-1.el7.centos.noarch
vdsm-yajsonrpc-4.20.23-1.el7.centos.noarch
vdsm-common-4.20.23-1.el7.centos.noarch
vdsm-hook-vhostmd-4.20.23-1.el7.centos.noarch
vdsm-http-4.20.23-1.el7.centos.noarch
vdsm-4.20.23-1.el7.centos.x86_64


How reproducible:
Try to add a new host to the cluster from the ovirt-engine web UI using the standard method.

Steps to Reproduce:
1. Install CentOS 7 on the new host
2. Add the repo: http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm
3. Prepare the network interfaces and bond for the oVirt host setup:
eth2:
DEVICE=eth2
NAME=eth2
TYPE=Ethernet
BOOTPROTO=none
NM_CONTROLLER=no
DEFROUTE=no
PEERDNS=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
eth3:
DEVICE=eth3
NAME=eth3
TYPE=Ethernet
BOOTPROTO=none
NM_CONTROLLER=no
DEFROUTE=no
PEERDNS=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
bond0:
DEVICE=bond0
NAME=bond0
BONDING_MASTER=yes
DEFROUTE=yes
IPADDR=10.0.55.10
PREFIX=24
GATEWAY=10.0.55.1
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=4 miimon=100 lacp_rate=0 xmit_hash_policy=layer2+3"
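For anyone reproducing this, the bond layout from step 3 can be sanity-checked before deploying. The following is only a minimal sketch: `parse_ifcfg` and `bond_members` are hypothetical helper names, the sample strings are abbreviated copies of the configuration above, and real ifcfg files can contain shell syntax this simple parser does not handle.

```python
def parse_ifcfg(text):
    """Parse ifcfg-style KEY=VALUE lines into a dict, stripping surrounding quotes."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip().strip("\"'")
    return config


def bond_members(configs):
    """Map each bond name to the sorted devices that declare it as MASTER."""
    members = {}
    for cfg in configs:
        master = cfg.get("MASTER")
        if master and cfg.get("SLAVE", "").lower() == "yes":
            members.setdefault(master, []).append(cfg.get("DEVICE"))
    return {bond: sorted(devs) for bond, devs in members.items()}


# Abbreviated copies of the files listed in step 3.
ETH2 = "DEVICE=eth2\nTYPE=Ethernet\nONBOOT=yes\nMASTER=bond0\nSLAVE=yes\n"
ETH3 = "DEVICE=eth3\nTYPE=Ethernet\nONBOOT=yes\nMASTER=bond0\nSLAVE=yes\n"
BOND0 = (
    "DEVICE=bond0\nBONDING_MASTER=yes\nIPADDR=10.0.55.10\nPREFIX=24\n"
    'BONDING_OPTS="mode=4 miimon=100 lacp_rate=0 xmit_hash_policy=layer2+3"\n'
)

configs = [parse_ifcfg(t) for t in (ETH2, ETH3, BOND0)]
print(bond_members(configs))
print(parse_ifcfg(BOND0)["BONDING_OPTS"])
```

On a real host the same functions could be fed the contents of the ifcfg files instead of the inline strings.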

Actual results:
After trying to deploy the host to the cluster, I see these errors in the log:
2018-04-19 13:50:35,448+03 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Host 'trk-ovhv-4' is set to Non-Operational, it is missing the following networks: 'vlan10,vlan11'
2018-04-19 13:50:35,505+03 ERROR [org.ovirt.engine.core.bll.job.ExecutionHandler] (EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Exception: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@fa6e6e5[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5b050fc3 connection handles=0 lastReturned=1524135035492 lastValidated=1524134463813 lastCheckedOut=1524135035481 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@123909eb mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@62d662ff[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@1a6a5b19[connectionListener=fa6e6e5 connectionManager=653865e8 warned=false currentXid=null productName=PostgreSQL productVersion=9.5.9 jndiName=java:/ENGINEDataSource] txSync=null]
2018-04-19 13:50:35,536+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Failed in 'CollectVdsNetworkDataAfterInstallationVDS' method, for vds: 'trk-ovhv-4'; host: 'trk-ovhv-4.kv.in.trkua.net': Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@fa6e6e5[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5b050fc3 connection handles=0 lastReturned=1524135035527 lastValidated=1524134463813 lastCheckedOut=1524135035518 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@123909eb mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@62d662ff[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@1a6a5b19[connectionListener=fa6e6e5 connectionManager=653865e8 warned=false currentXid=null productName=PostgreSQL productVersion=9.5.9 jndiName=java:/ENGINEDataSource] txSync=null]
2018-04-19 13:50:35,537+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Command 'CollectVdsNetworkDataAfterInstallationVDSCommand(HostName = trk-ovhv-4, CollectHostNetworkDataVdsCommandParameters:{hostId='ff2dbd5a-908c-4120-856d-45a85cccdaba', vds='Host[trk-ovhv-4,ff2dbd5a-908c-4120-856d-45a85cccdaba]'})' execution failed: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@fa6e6e5[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5b050fc3 connection handles=0 lastReturned=1524135035527 lastValidated=1524134463813 lastCheckedOut=1524135035518 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@123909eb mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@62d662ff[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@1a6a5b19[connectionListener=fa6e6e5 connectionManager=653865e8 warned=false currentXid=null productName=PostgreSQL productVersion=9.5.9 jndiName=java:/ENGINEDataSource] txSync=null]
2018-04-19 13:50:35,537+03 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000457: Unchecked throwable in managedConnectionReconnected() cl=org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@fa6e6e5[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5b050fc3 connection handles=0 lastReturned=1524135035527 lastValidated=1524134463813 lastCheckedOut=1524135035518 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@123909eb mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@62d662ff[pool=ENGINEDataSource] xaResource=LocalXAResourceImpl@1a6a5b19[connectionListener=fa6e6e5 connectionManager=653865e8 warned=false currentXid=null productName=PostgreSQL productVersion=9.5.9 jndiName=java:/ENGINEDataSource] txSync=null] (Failed with error ENGINE and code 5001)
2018-04-19 13:50:35,550+03 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Host installation failed for host 'ff2dbd5a-908c-4120-856d-45a85cccdaba', 'trk-ovhv-4': Failed to configure management network on the host
2018-04-19 13:50:35,610+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] EVENT_ID: VDS_INSTALL_FAILED(505), Host trk-ovhv-4 installation failed. Failed to configure management network on the host.

and the host ends up in status='NonOperational', nonOperationalReason='NETWORK_UNREACHABLE'
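Since the long JDBC stack text makes the log excerpt above hard to scan, a short filter can reduce it to one line per error for a given correlation ID. This is a rough sketch, not an oVirt tool: the regular expression is inferred only from the lines quoted above, and `errors_for` is a hypothetical name.

```python
import re

# Layout inferred from the excerpt: timestamp, level, [logger], (thread), [correlation id], message.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


def errors_for(lines, correlation_id, max_len=120):
    """Return (timestamp, short logger name, truncated message) for matching ERROR lines."""
    found = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m["level"] != "ERROR" or m["corr"] != correlation_id:
            continue
        short_logger = m["logger"].rsplit(".", 1)[-1]
        found.append((m["ts"], short_logger, m["msg"][:max_len]))
    return found


# Two of the lines from the excerpt above, copied verbatim.
SAMPLE = [
    "2018-04-19 13:50:35,448+03 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] "
    "(EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Host 'trk-ovhv-4' is set to "
    "Non-Operational, it is missing the following networks: 'vlan10,vlan11'",
    "2018-04-19 13:50:35,550+03 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] "
    "(EE-ManagedThreadFactory-engine-Thread-2068) [3b423c4c] Host installation failed for host "
    "'ff2dbd5a-908c-4120-856d-45a85cccdaba', 'trk-ovhv-4': Failed to configure management network on the host",
]

for ts, logger, msg in errors_for(SAMPLE, "3b423c4c"):
    print(ts, logger, msg)
```

To run it against a full engine.log, pass the open file object as `lines`; lines that do not match the assumed layout (for example stack trace continuations) are skipped.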

Expected results:


Additional info:
vdsm did not create the interfaces:
tree /var/lib/vdsm/persistence/
/var/lib/vdsm/persistence/
├── netconf -> /var/lib/vdsm/persistence/netconf.TAv2Xhos
└── netconf.TAv2Xhos
    ├── bonds
    └── nets

Comment 1 Sergii Melnyk 2018-04-19 11:43:34 UTC
Created attachment 1424027 [details]
supervsdm log from deployed host

Comment 2 Dominik Holler 2018-04-20 10:36:13 UTC
The JDBC errors seem more dangerous than the problem with adding a host.
When did they start?

Regarding the problem with adding a host, I am missing a line in engine.log that explains the reason. Can you please share a larger snippet of engine.log?
The vdsm.log from the host might be helpful, too. Would you share this file?
What is the switch type of the cluster you are trying to add the host to?

Comment 3 Sergii Melnyk 2018-04-20 14:32:06 UTC
Hi Dominik,
I'm not sure, but the JDBC errors in the engine log seem to have started after the upgrade of oVirt from 4.2.1 to 4.2.2 in April; I did not see this error in the log before the upgrade.

The vdsm.log and the full engine.log are attached.
I'm new to oVirt and can't give a clear answer about the switch type of the cluster; I used the default network settings from the web UI.

Comment 4 Sergii Melnyk 2018-04-20 14:32:49 UTC
Created attachment 1424525 [details]
complete engine log

Comment 5 Sergii Melnyk 2018-04-20 14:33:34 UTC
Created attachment 1424526 [details]
vdsm log

Comment 6 Sergii Melnyk 2018-04-20 14:36:11 UTC
On a working node that I set up with oVirt 4.2.1, I have this network tree:
tree /var/lib/vdsm/persistence/
/var/lib/vdsm/persistence/
├── netconf -> /var/lib/vdsm/persistence/netconf.B1ilzZvK
└── netconf.B1ilzZvK
    ├── bonds
    │   ├── bond0
    │   └── bond1
    └── nets
        ├── ovirtmgmt
        ├── vlan10
        └── vlan11
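The difference between this tree and the empty one on the failing host can be checked with a few lines of code. This is a minimal sketch under assumptions: it only looks at entry names under the `nets` and `bonds` directories shown in the trees, and `persisted_names` and `missing_entries` are hypothetical helpers, not part of vdsm.

```python
from pathlib import Path


def persisted_names(netconf_dir, kind):
    """Return sorted entry names under <netconf_dir>/<kind>, where kind is 'nets' or 'bonds'."""
    path = Path(netconf_dir) / kind
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.iterdir())


def missing_entries(netconf_dir, required_nets, required_bonds=()):
    """List the required networks and bonds that are not persisted under netconf_dir."""
    nets = set(persisted_names(netconf_dir, "nets"))
    bonds = set(persisted_names(netconf_dir, "bonds"))
    return {
        "nets": sorted(set(required_nets) - nets),
        "bonds": sorted(set(required_bonds) - bonds),
    }


if __name__ == "__main__":
    # Names taken from the working node's tree above; the path is the netconf symlink.
    report = missing_entries(
        "/var/lib/vdsm/persistence/netconf",
        required_nets=["ovirtmgmt", "vlan10", "vlan11"],
        required_bonds=["bond0", "bond1"],
    )
    print(report)
```

On the failing host, where `nets` and `bonds` are empty, every required name would be reported as missing; on the working node both lists would be empty.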

Comment 7 Eli Mesika 2018-04-22 10:04:08 UTC
Please add the relevant PG log as well.

Comment 8 Sergii Melnyk 2018-04-23 14:40:49 UTC
Hi, I can't find the PostgreSQL logs on the engine host.
How can I enable this log file?

Comment 9 Alona Kaplan 2018-04-25 07:23:23 UTC

*** This bug has been marked as a duplicate of bug 1570388 ***