Bug 1320606 - Host deploys fails - ping flood issue on VDSM side
Summary: Host deploys fails - ping flood issue on VDSM side
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 3.6.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ovirt-3.6.6
Target Release: 3.6.6
Assignee: Martin Mucha
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On:
Blocks: 1334862
Reported: 2016-03-23 15:48 UTC by Fabrice Bacchella
Modified: 2016-05-30 10:51 UTC

Fixed In Version:
Clone Of:
Clones: 1334862
Environment:
Last Closed: 2016-05-30 10:51:38 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-3.6.z+
ylavi: planning_ack+
danken: devel_ack+
rule-engine: testing_ack+


Attachments
log files from the host (293.96 KB, application/x-gzip), 2016-03-23 15:51 UTC, Fabrice Bacchella
the networks (41.31 KB, image/png), 2016-03-23 15:55 UTC, Fabrice Bacchella
locked hosts (26.14 KB, image/png), 2016-03-23 16:06 UTC, Fabrice Bacchella
the failed network configuration for this host (63.44 KB, image/png), 2016-03-23 16:07 UTC, Fabrice Bacchella
the engine.log (35.51 KB, application/x-gzip), 2016-03-23 16:18 UTC, Fabrice Bacchella
New Logs (211.57 KB, application/x-gzip), 2016-04-28 07:24 UTC, Michael Burman


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 55482 0 ovirt-engine-3.6 MERGED core: do not ping too often 2016-04-05 12:11:09 UTC
oVirt gerrit 55524 0 ovirt-engine-3.6.5 ABANDONED core: do not ping too often 2016-04-11 13:43:25 UTC

Description Fabrice Bacchella 2016-03-23 15:48:45 UTC
Description of problem:

The hosts have a native VLAN and two tagged VLANs.

Each host runs CentOS 7.2 and has two interfaces.

The network configuration was done using kickstart, so only eth0 is configured, with DHCP.

No other configuration is done on the host; only remote access from ovirt-engine is allowed.

I configured the tagged VLANs in the oVirt engine.

But when I add the new host, the configuration fails and the host gets stuck. Worse, I can't do anything in the admin GUI: it keeps saying that another operation is in progress on the host, and I need to restart ovirt-engine to unlock it.

Version-Release number of selected component (if applicable):
Everything is brand new, so I am using the latest version of ovirt-engine (ovirt-engine-3.6.3.4-1.el7.centos.noarch), and I reinstalled the host from scratch.

Comment 1 Fabrice Bacchella 2016-03-23 15:51:05 UTC
Created attachment 1139644
log files from the host

Some logs from the host, the network configuration, and the host-deploy logs from the engine.

Comment 2 Fabrice Bacchella 2016-03-23 15:55:19 UTC
Created attachment 1139646
the networks

Comment 3 Fabrice Bacchella 2016-03-23 16:06:43 UTC
Created attachment 1139650
locked hosts

What I get when I try to perform operations on the hosts, even after rebooting the host from the GUI.

Comment 4 Fabrice Bacchella 2016-03-23 16:07:53 UTC
Created attachment 1139651
the failed network configuration for this host

Comment 5 Fabrice Bacchella 2016-03-23 16:18:13 UTC
Created attachment 1139656
the engine.log

Comment 6 Dan Kenigsberg 2016-03-24 08:21:32 UTC
jsonrpc.Executor/0::DEBUG::2016-03-23 14:51:45,528::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/0::DEBUG::2016-03-23 14:51:45,529::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/1::DEBUG::2016-03-23 14:51:45,532::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/1::DEBUG::2016-03-23 14:51:45,532::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True

vdsm.log ends with this tight loop, while engine.log shows a network exception:

2016-03-23 14:17:10,080 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (default task-27) [279c2be] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Client close

Martin, could the network be flooded and disconnected due to the bug fixed by your https://gerrit.ovirt.org/#/c/54644/ ?

Comment 7 Red Hat Bugzilla Rules Engine 2016-03-24 08:21:38 UTC
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 8 Martin Mucha 2016-03-30 06:22:03 UTC
(In reply to Dan Kenigsberg from comment #6)
> jsonrpc.Executor/0::DEBUG::2016-03-23
> 14:51:45,528::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'Host.ping' in bridge with {}
> jsonrpc.Executor/0::DEBUG::2016-03-23
> 14:51:45,529::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> 'Host.ping' in bridge with True
> jsonrpc.Executor/1::DEBUG::2016-03-23
> 14:51:45,532::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'Host.ping' in bridge with {}
> jsonrpc.Executor/1::DEBUG::2016-03-23
> 14:51:45,532::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> 'Host.ping' in bridge with True
> 
> vdsm.log ends with this tight loop, while engine.log shows a network
> exception:
> 
> 2016-03-23 14:17:10,080 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (default task-27)
> [279c2be] Exception:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
> VDSGenericException: VDSNetworkException: Client close
> 
> Martin, could the network be flooded and disconnected due to the bug fixed
> by your https://gerrit.ovirt.org/#/c/54644/ ?

Hardly. This patch blocks the flood instead, and it had not been merged at the time of your question or of the reported issue.

Comment 9 Dan Kenigsberg 2016-03-30 09:55:57 UTC
> > Martin, could the network be flooded and disconnected due to the bug fixed
> > by your https://gerrit.ovirt.org/#/c/54644/ ?
> 
> Hardly. This patch blocks the flood instead, and it had not been merged at
> the time of your question or of the reported issue.

Martin, let me rephrase my question. Could it be that the connection is broken due to the ping flood, which is solved by your patch?

Comment 10 Martin Mucha 2016-03-30 10:25:58 UTC
Sorry, I was reading too quickly. I'm not sure, but it might be the case. In engine.log there is:

org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed
	at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.connect(ReactorClient.java:157) [vdsm-jsonrpc-java-client.jar:]

So vdsm-jsonrpc is trying to establish a connection, but it fails. It might be that the VDS is clogged by that ping DoS. However, I had not noticed the described behavior even before fixing this bug; even then I was able to add a host despite the flood.
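
For readers unfamiliar with the idea behind a change titled "core: do not ping too often", the following is a minimal Python sketch of throttling a polling action. It is not the engine's code (the engine is written in Java and the merged patch is not reproduced in this report); the class name `PingThrottle`, the `min_interval` parameter, and the injectable `clock` are all hypothetical and exist only for illustration.

```python
import time


class PingThrottle:
    """Allow an action at most once per `min_interval` seconds.

    Illustrative only: names and behavior are assumptions, not oVirt code.
    """

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self._last = None           # time of the last allowed action

    def allow(self):
        """Return True if enough time has passed since the last allowed call."""
        now = self.clock()
        if self._last is not None and now - self._last < self.min_interval:
            return False            # too soon: skip this ping
        self._last = now
        return True
```

A caller would check `throttle.allow()` before each ping and skip the ping when it returns False, so that a retry loop cannot issue requests a few milliseconds apart.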

Comment 11 Sven Kieske 2016-04-21 15:54:30 UTC
Is this a duplicate of BZ 1329317?

Comment 12 Michael Burman 2016-04-28 07:23:44 UTC
Adding a host got stuck in the Installing state with the ping flood issue; I hit these symptoms in 1 of 3 attempts. I am not sure whether this can be verified, so I am attaching logs.

Tested on 3.6.6-0.1.el6 and vdsm-4.17.27-0.el7ev.noarch

Comment 13 Michael Burman 2016-04-28 07:24:20 UTC
Created attachment 1151744
New Logs

Comment 14 Dan Kenigsberg 2016-04-28 13:25:53 UTC
Pings in your log are received roughly every 1.5 seconds, so the tight loop we saw before is gone and this fix is verified.

However, bug 1329317 is still unsolved. Which version of vdsm-jsonrpc-java have you used?

jsonrpc.Executor/0::DEBUG::2016-04-28 09:10:13,641::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/0::DEBUG::2016-04-28 09:10:13,641::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/1::DEBUG::2016-04-28 09:10:15,146::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/1::DEBUG::2016-04-28 09:10:15,147::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/2::DEBUG::2016-04-28 09:10:16,652::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/2::DEBUG::2016-04-28 09:10:16,652::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
jsonrpc.Executor/3::DEBUG::2016-04-28 09:10:17,953::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}
jsonrpc.Executor/3::DEBUG::2016-04-28 09:10:17,954::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'Host.ping' in bridge with True
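
The interval between pings can be checked mechanically instead of by eye. The sketch below parses lines in the vdsm.log format quoted in this report and returns the gaps between consecutive "Calling 'Host.ping'" entries. The function names and the 0.1-second flood threshold are hypothetical choices made for illustration; they are not part of VDSM.

```python
import re
from datetime import datetime

# Timestamp of a "Calling 'Host.ping'" line in the vdsm.log format quoted above.
PING_CALL = re.compile(
    r"::DEBUG::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::.*Calling 'Host\.ping'"
)


def ping_intervals(lines):
    """Return the gaps, in seconds, between consecutive Host.ping calls."""
    stamps = []
    for line in lines:
        match = PING_CALL.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f"))
    return [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]


def looks_like_flood(intervals, threshold=0.1):
    """Hypothetical heuristic: every gap is shorter than `threshold` seconds."""
    return bool(intervals) and all(gap < threshold for gap in intervals)


# Two "Calling" lines taken from comment 6 of this report.
SAMPLE = [
    "jsonrpc.Executor/0::DEBUG::2016-03-23 14:51:45,528::__init__::503::"
    "jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}",
    "jsonrpc.Executor/1::DEBUG::2016-03-23 14:51:45,532::__init__::503::"
    "jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Host.ping' in bridge with {}",
]

if __name__ == "__main__":
    gaps = ping_intervals(SAMPLE)
    print(gaps, looks_like_flood(gaps))
```

Run against the lines in this comment, the same function yields gaps of about 1.3 to 1.5 seconds, which the heuristic does not classify as a flood.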

Comment 15 Michael Burman 2016-05-01 05:36:33 UTC
vdsm-jsonrpc-java-1.1.9-1.el6ev.noarch

Comment 16 Michael Burman 2016-05-01 08:52:56 UTC
Verified on - 3.6.6-0.1.el6 and vdsm-4.17.27-0.el7ev.noarch

Comment 17 Dan Kenigsberg 2016-05-01 10:24:32 UTC
Verification of bug 1329317 must take place with vdsm-jsonrpc-java >= 1.1.10 and ovirt-engine >= 3.6.0_alpha1-2572-g16e91cf.
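
Note that 1.1.9 (the version reported in comment 15) is lower than 1.1.10, even though a plain string comparison says the opposite. A minimal sketch of a numeric comparison follows; the helper names are hypothetical, and it handles only the dotted numeric part of a version, not RPM release suffixes such as "-1.el6ev" or git-describe style strings like the ovirt-engine version above.

```python
def version_tuple(version):
    """Turn a dotted numeric version such as '1.1.10' into (1, 1, 10)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Compare versions component by component as integers, not as text."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    print(meets_minimum("1.1.9", "1.1.10"))   # numeric comparison
    print("1.1.9" >= "1.1.10")                # misleading string comparison
```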

