
Bug 1320606

Summary: Host deploy fails - ping flood issue on VDSM side
Product: [oVirt] ovirt-engine
Reporter: Fabrice Bacchella <fabrice.bacchella>
Component: BLL.Network
Assignee: Martin Mucha <mmucha>
Status: CLOSED CURRENTRELEASE
QA Contact: Meni Yakove <myakove>
Severity: high
Docs Contact:
Priority: medium
Version: 3.6.3
CC: bugs, danken, jcoscia, mburman, mkalinin, mmucha, oourfali, phoracek, s.kieske, ylavi
Target Milestone: ovirt-3.6.6
Flags: rule-engine: ovirt-3.6.z+, ylavi: planning_ack+, danken: devel_ack+, rule-engine: testing_ack+
Target Release: 3.6.6
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1334862
Environment:
Last Closed: 2016-05-30 10:51:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1334862
Attachments:
Description                                        Flags
log files from the host                            none
the networks                                       none
locked hosts                                       none
the failed network configuration for this host     none
the engine.log                                     none
New Logs                                           none

Description Fabrice Bacchella 2016-03-23 15:48:45 UTC
Description of problem:

I have hosts with a native VLAN and two tagged VLANs.

Each host runs CentOS 7.2 and has two interfaces.

The network configuration was done using kickstart, so only eth0 is configured, with DHCP.

No other configuration is done on the host; only remote access from ovirt-engine is allowed.

I configured the tagged VLANs in the oVirt engine.

But when I add the new host, the configuration fails and the host gets stuck. Worse, I can't do anything in the admin GUI: it keeps saying there is another operation in progress on the host, and I need to restart ovirt-engine to unlock it.

Version-Release number of selected component (if applicable):
Everything is brand new, so I am using the latest version of ovirt-engine (ovirt-engine-3.6.3.4-1.el7.centos.noarch), and I reinstalled the host from scratch.

Comment 1 Fabrice Bacchella 2016-03-23 15:51:05 UTC
Created attachment 1139644 [details]
log files from the host

Some logs from the host, the network configuration, and the host-deploy logs from the engine.

Comment 2 Fabrice Bacchella 2016-03-23 15:55:19 UTC
Created attachment 1139646 [details]
the networks

Comment 3 Fabrice Bacchella 2016-03-23 16:06:43 UTC
Created attachment 1139650 [details]
locked hosts

What I get when I want to do operations on the hosts, even after I rebooted it using the GUI.

Comment 4 Fabrice Bacchella 2016-03-23 16:07:53 UTC
Created attachment 1139651 [details]
the failed network configuration for this host

Comment 5 Fabrice Bacchella 2016-03-23 16:18:13 UTC
Created attachment 1139656 [details]
the engine.log

Comment 6 Dan Kenigsberg 2016-03-24 08:21:32 UTC
jsonrpc.Executor/0::DEBUG::2016-03-23 14:51:45,528::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/0::DEBUG::2016-03-23 14:51:45,529::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/1::DEBUG::2016-03-23 14:51:45,532::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/1::DEBUG::2016-03-23 14:51:45,532::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True

vdsm.log ends with this tight loop, while engine.log sees a network exception:

2016-03-23 14:17:10,080 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (default task-27) [279c2be] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Client close

Martin, could the network be flooded and disconnected due to the bug fixed by your https://gerrit.ovirt.org/#/c/54644/ ?

Comment 7 Red Hat Bugzilla Rules Engine 2016-03-24 08:21:38 UTC
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 8 Martin Mucha 2016-03-30 06:22:03 UTC
(In reply to Dan Kenigsberg from comment #6)
> jsonrpc.Executor/0::DEBUG::2016-03-23
> 14:51:45,528::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'Host.ping' in bridge with {}
> jsonrpc.Executor/0::DEBUG::2016-03-23
> 14:51:45,529::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> 'Host.ping' in bridge with True
> jsonrpc.Executor/1::DEBUG::2016-03-23
> 14:51:45,532::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'Host.ping' in bridge with {}
> jsonrpc.Executor/1::DEBUG::2016-03-23
> 14:51:45,532::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> 'Host.ping' in bridge with True
> 
> vdsm.log ends with this tight loop, while engine.log sees an network
> exception
> 
> 2016-03-23 14:17:10,080 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (default task-27)
> [279c2be] Exception:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
> VDSGenericException: VDSNetworkException: Client close
> 
> Martin, could the network be flooded and disconnected due to the bug fixed
> by your https://gerrit.ovirt.org/#/c/54644/ ?

Hardly. This patch blocks the flood instead, and it had not been merged at the time of your question or of the reported issue.

Comment 9 Dan Kenigsberg 2016-03-30 09:55:57 UTC
> > Martin, could the network be flooded and disconnected due to the bug fixed
> > by your https://gerrit.ovirt.org/#/c/54644/ ?
> 
> Hardly — this patch blocks flood instead, and it wasn't merged at the
> timestamp of your question or reported issues.

Martin, let me rephrase my question. Could it be that the connection is broken due to the ping flood, which is solved by your patch?

Comment 10 Martin Mucha 2016-03-30 10:25:58 UTC
Sorry, I was reading too quickly. I'm not sure, but it might be the case. In engine.log there is:

org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed
	at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.connect(ReactorClient.java:157) [vdsm-jsonrpc-java-client.jar:]

So vdsm-jsonrpc is trying to establish a connection, but it fails. It might be that vds is clogged by that ping DoS. But I hadn't noticed the described behavior even before fixing this bug: even then I was able to add a host despite the flood.

Comment 11 Sven Kieske 2016-04-21 15:54:30 UTC
Is this a duplicate of BZ 1329317?

Comment 12 Michael Burman 2016-04-28 07:23:44 UTC
Add host got stuck in the installing state with the ping flood issue; it failed with these symptoms in 1 of 3 attempts. Not sure if this can be verified; attaching logs.

Tested on 3.6.6-0.1.el6 and vdsm-4.17.27-0.el7ev.noarch

Comment 13 Michael Burman 2016-04-28 07:24:20 UTC
Created attachment 1151744 [details]
New Logs

Comment 14 Dan Kenigsberg 2016-04-28 13:25:53 UTC
Pings in your log are received roughly every 1.5 seconds, so the fix for the tight loop we saw before is verified.

However, bug 1329317 is still unsolved. Which version of vdsm-jsonrpc-java have you used?

jsonrpc.Executor/0::DEBUG::2016-04-28 09:10:13,641::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/0::DEBUG::2016-04-28 09:10:13,641::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/1::DEBUG::2016-04-28 09:10:15,146::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/1::DEBUG::2016-04-28 09:10:15,147::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/2::DEBUG::2016-04-28 09:10:16,652::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/2::DEBUG::2016-04-28 09:10:16,652::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/3::DEBUG::2016-04-28 09:10:17,953::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/3::DEBUG::2016-04-28 09:10:17,954::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
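
For anyone checking their own vdsm.log, the spacing between pings can be measured with a short script rather than by eye. The following is a minimal sketch, assuming the log line format quoted in this bug; the function names and the 0.1 second flood threshold are illustrative assumptions, not part of VDSM or the engine.

```python
import re
from datetime import datetime

# Timestamp of a "Calling 'Host.ping'" line, in the vdsm.log format quoted in this bug.
PING_CALL = re.compile(
    r"::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::.*Calling 'Host\.ping'"
)


def ping_intervals(lines):
    """Return the gaps, in seconds, between consecutive Host.ping calls."""
    stamps = []
    for line in lines:
        match = PING_CALL.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


def looks_like_flood(intervals, threshold=0.1):
    """Hypothetical check: True if any two pings are closer than `threshold` seconds."""
    return bool(intervals) and min(intervals) < threshold


# The "Calling" lines from the log excerpt above.
SAMPLE_LOG = [
    "jsonrpc.Executor/0::DEBUG::2016-04-28 09:10:13,641::__init__::503::"
    "jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}",
    "jsonrpc.Executor/1::DEBUG::2016-04-28 09:10:15,146::__init__::503::"
    "jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}",
    "jsonrpc.Executor/2::DEBUG::2016-04-28 09:10:16,652::__init__::503::"
    "jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}",
    "jsonrpc.Executor/3::DEBUG::2016-04-28 09:10:17,953::__init__::503::"
    "jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}",
]

if __name__ == "__main__":
    gaps = ping_intervals(SAMPLE_LOG)
    print([round(gap, 3) for gap in gaps])
    print("flood suspected:", looks_like_flood(gaps))
```

Run against the excerpt above, the gaps are between 1.3 and 1.5 seconds. The two calls quoted in comment 6 (14:51:45,528 and 14:51:45,532) are only 4 ms apart, which the same check would flag.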

Comment 15 Michael Burman 2016-05-01 05:36:33 UTC
vdsm-jsonrpc-java-1.1.9-1.el6ev.noarch

Comment 16 Michael Burman 2016-05-01 08:52:56 UTC
Verified on - 3.6.6-0.1.el6 and vdsm-4.17.27-0.el7ev.noarch

Comment 17 Dan Kenigsberg 2016-05-01 10:24:32 UTC
Verification of bug 1329317 must take place with vdsm-jsonrpc-java >= 1.1.10 and ovirt-engine >= 3.6.0_alpha1-2572-g16e91cf.
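
Note that the vdsm-jsonrpc-java 1.1.9 reported in comment 15 is older than 1.1.10 when the components are compared numerically, even though a plain string comparison says the opposite. A minimal sketch of the numeric check follows; the helper names are hypothetical, and it assumes purely numeric dotted versions, so it does not handle RPM release suffixes or git-describe strings such as 3.6.0_alpha1-2572-g16e91cf.

```python
def version_tuple(version):
    """Split a purely numeric dotted version such as '1.1.10' into (1, 1, 10)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum`, comparing component by component."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Version from comment 15 against the minimum stated above.
    print(meets_minimum("1.1.9", "1.1.10"))
    # A naive string comparison gets this wrong because '9' sorts after '1'.
    print("1.1.9" >= "1.1.10")
```

For real package checks, the distribution's own version comparison should be preferred over a hand-rolled one like this.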