Bug 1053114

Summary:

[vdsm] [Scalability] When a host is loaded with many networks, addNetwork and getVdsCaps take a long time to return.

Product:

Red Hat Enterprise Virtualization Manager

Reporter:

pagupta

Component:

vdsm

Assignee:

Barak <bazulay>

Status:

CLOSED ERRATA

QA Contact:

Yuri Obshansky <yobshans>

Severity:

high

Docs Contact:

Priority:

high

Version:

3.2.0

CC:

bazulay, danken, dnaori, iheim, juwu, lpeer, mavital, mgoldboi, nyechiel, pagupta, srevivo, tdosek, yeylon

Target Milestone:

---

Target Release:

3.5.0

Hardware:

x86_64

OS:

Linux

Whiteboard:

network

Fixed In Version:

vt1.3, 4.16.0-1.el6_5

Doc Type:

Bug Fix

Doc Text:

Previously, extracting information on networks took a long time when there were multiple networks defined on the host. Using a host with 200+ networks was very slow or impossible. Now, the code has been refactored with attention to asymptotic time efficiency, so that 1000 networks are workable.

Story Points:

---

Clone Of:

714421

Environment:

Last Closed:

2015-02-11 21:10:03 UTC

Type:

---

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

Network

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

714421

Bug Blocks:

612978, 1142923, 1156165

Attachments:

Description	Flags
getVdsCaps Graph	none
RT_AddNetwork graph	none

Comment 2 pagupta 2014-01-15 08:44:14 UTC

*** Bug 1052976 has been marked as a duplicate of this bug. ***

Comment 3 Dan Kenigsberg 2014-03-04 22:03:54 UTC

Toni has made significant progress, and all his patches are merged into the master branch. However, they are quite intrusive, and I would prefer not to backport them to the 3.4 branch.

Unless this issue is extremely urgent for a customer, I suggest waiting for rhev-3.5.

Comment 6 Yuri Obshansky 2015-01-04 09:59:48 UTC

I verified this bug on RHEV-M 3.5.0-0.27.el6ev (build vt13.5):
RHEL - 6Server - 6.6.0.2.el6
KVM - 0.12.1.2 - 2.448.el6
libvirt-0.10.2-46.el6_6.1
vdsm-4.16.8.1-4.el6ev

First, I changed the default oVirt configuration value vdsHeartbeatInSeconds from 10 sec to 60 sec.
Otherwise, the host became non-functional (stuck forever in the Connecting state).

I created 1000 VLANs, attached them to the host, and ran a simple script that calls getVdsCaps 100 times and measures the time of each call:
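#!/bin/bash
# Time 100 consecutive getVdsCaps calls and append each duration (in seconds) to a CSV file.
for x in {1..100}; do
        STARTTIME=$(date +%s.%N)
        vdsClient -s 0 getVdsCaps
        ENDTIME=$(date +%s.%N)
        # Elapsed time, truncated to millisecond precision.
        TIMEDIFF=$(echo "$ENDTIME - $STARTTIME" | bc | awk -F"." '{print $1"."substr($2,1,3)}')
        echo "$TIMEDIFF" >> getVdsCaps_1000_networks.csv
done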

I got the following results:
- min: 21.76 sec
- average: 59.28 sec
- max: 3268.48 sec
While the script was running, two calls were very slow: 340.66 sec and 3268.48 sec.
See attached graph: getVdsCaps_graph.jpg
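
For anyone repeating the measurement, here is a minimal sketch of how min, average, and max could be derived from the CSV written by the script above. The function name summarize_times is hypothetical and is not part of vdsm or of the original test; it only assumes one elapsed time in seconds per line.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): summarize a file that
# holds one elapsed time in seconds per line. Blank lines are ignored.
summarize_times() {
    awk '
        NF == 0 { next }                       # skip blank lines
        {
            v = $1 + 0                         # force numeric interpretation
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            sum += v
            n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "min: %.2f sec\naverage: %.2f sec\nmax: %.2f sec\n", min, sum / n, max
        }
    ' "$1"
}

# Example call, using the file name from the script above:
# summarize_times getVdsCaps_1000_networks.csv
```

With only 100 samples, a single outlier such as the 3268.48 sec call dominates the average, so reporting a median or percentile alongside it may be more informative.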

I also measured the response time of the REST API calls used while populating the VLANs:
- Create Networks: /api/networks/
- Attach Network to Cluster: /api/clusters/${cl_id}/networks
- Attach Network to Host NIC: /api/hosts/${host_id}/nics/${dummy_id}/attach
Here are the results (msec):
Operation                     Count   Average   90%     Min   Max
Create Networks               1000    45        55      29    1132
Attach Network to Cluster     1000    51        63      33    190
Attach Network to Host NIC    1000    18116     35448   808   41075
See attached graph: RT_AddNetwork.png
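
The report does not say how the 90% column was computed. As a rough sketch only, assuming the nearest-rank definition and a file with one response time in msec per line, a percentile could be computed as follows. The function name percentile and the input file name are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): print the p-th
# percentile of a file with one numeric sample per line, using the
# nearest-rank method. Usage: percentile <p> <file>
percentile() {
    local p=$1 file=$2
    sort -n "$file" | awk -v p="$p" '
        NF { v[++n] = $1 }                     # collect non-blank samples in sorted order
        END {
            if (n == 0) exit 1
            # Nearest rank: smallest index k such that k >= p/100 * n.
            k = int(p * n / 100)
            if (k * 100 < p * n) k++
            if (k < 1) k = 1
            print v[k]
        }
    '
}

# Example call with an assumed file of per-request times in msec:
# percentile 90 attach_to_host_nic_msec.txt
```

Other percentile definitions (for example, ones that interpolate between neighbouring samples) can give slightly different values, so the output may not match the table exactly.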

Comment 7 Yuri Obshansky 2015-01-04 10:00:46 UTC

Created attachment 975935
getVdsCaps Graph

Comment 8 Yuri Obshansky 2015-01-04 10:01:24 UTC

Created attachment 975936
RT_AddNetwork graph

Comment 10 errata-xmlrpc 2015-02-11 21:10:03 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0159.html