Bug 1179696 - [scale] Reduce time spent in processing of guest agent messages
Summary: [scale] Reduce time spent in processing of guest agent messages
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Vinzenz Feenstra [evilissimo]
QA Contact: Eldad Marciano
Depends On:
Blocks: 1177634 1265144
Reported: 2015-01-07 11:18 UTC by Vinzenz Feenstra [evilissimo]
Modified: 2016-02-17 07:46 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1265144
Last Closed: 2016-02-17 07:46:45 UTC
oVirt Team: Virt
Target Upstream Version:

System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:0362 | normal | SHIPPED_LIVE | vdsm 3.6.0 bug fix and enhancement update | 2016-03-09 23:49:32 UTC
oVirt gerrit | 36630 | master | MERGED | virt: Optimize guest agent string filtering | Never
oVirt gerrit | 36949 | master | ABANDONED | virt: Only filter guest agent data, only on XMLRPC | 2016-01-06 12:14:35 UTC
oVirt gerrit | 36952 | master | MERGED | virt test: check the full range of replaceable chars in filtering | Never

Description Vinzenz Feenstra [evilissimo] 2015-01-07 11:18:50 UTC
Description of problem:

The guest agent message handling currently contains inefficiencies that cause a significant amount of time to be spent handling messages sent by the guest agents of VMs.
The two most significant contributors have been identified as the filtering of bad characters and the dispatching of each message to the appropriate handler.

This bug shall address these issues.
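
As a rough illustration of the two hot spots named above (this is not the vdsm code), the sketch below contrasts a per-character filter with a precomputed translation table, and shows handler dispatch as a single dictionary lookup. The set of "bad" characters is an assumption here (NUL plus the restricted characters of XML 1.1), and the handler names and message names are hypothetical.

```python
# Illustrative sketch only; names and the filtered character set are assumptions.
REPLACEMENT = "\ufffd"  # Unicode replacement character

# Assumed set of characters to replace: NUL plus the XML 1.1 restricted ranges.
_BAD = (
    [0x00]
    + list(range(0x01, 0x09))
    + [0x0B, 0x0C]
    + list(range(0x0E, 0x20))
    + list(range(0x7F, 0x85))
    + list(range(0x86, 0xA0))
)
_BAD_SET = frozenset(_BAD)
_TABLE = dict.fromkeys(_BAD, REPLACEMENT)


def filter_per_char(s):
    """Naive variant: test every character in Python code."""
    return "".join(REPLACEMENT if ord(c) in _BAD_SET else c for c in s)


def filter_translate(s):
    """Table-driven variant: the per-character loop runs inside str.translate."""
    return s.translate(_TABLE)


# Hypothetical handlers, one per message name.
def on_heartbeat(args):
    return ("heartbeat", args)


def on_applications(args):
    return ("applications", len(args.get("applications", [])))


HANDLERS = {
    "heartbeat": on_heartbeat,
    "applications": on_applications,
}


def dispatch(name, args):
    """Look up the handler once; unknown message names are ignored."""
    handler = HANDLERS.get(name)
    if handler is None:
        return None
    return handler(args)
```

Both filter functions are meant to return the same result for any input; which one is faster on real guest agent traffic would have to be measured.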

Comment 1 Michal Skrivanek 2015-01-07 11:39:18 UTC
Might want to consider backporting, depending on profiling results with the patch/solution.

Comment 2 Vinzenz Feenstra [evilissimo] 2015-01-15 08:09:11 UTC
I have made some investigations:

I have produced a 25 MiB data capture of the messages sent by the guest agent, including the larger application lists reported by Windows guest agents.

I wrote a little script which reads each message from the data capture and passes it as a string to the line processing, which decodes the UTF-8 data, parses the JSON into a Python object and then applies the filtering.
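
A minimal sketch of such a replay script could look like the following. It is not the script used for the measurements in this comment: the capture is assumed to hold one JSON message per line, `filter_strings` is a hypothetical stand-in for the real filtering, and the code targets Python 3.

```python
import json
import time

# Hypothetical stand-in filter: replace C0 control characters
# (except tab, LF and CR) with U+FFFD.
_TABLE = {c: "\ufffd" for c in range(0x20) if c not in (0x09, 0x0A, 0x0D)}


def filter_strings(obj):
    """Recursively apply the character filter to every string in obj."""
    if isinstance(obj, str):
        return obj.translate(_TABLE)
    if isinstance(obj, dict):
        return {filter_strings(k): filter_strings(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [filter_strings(v) for v in obj]
    return obj


def process_line(raw):
    """Decode the UTF-8 bytes, parse the JSON, then filter the result."""
    text = raw.decode("utf-8")
    message = json.loads(text)
    return filter_strings(message)


def replay(lines):
    """Process every non-empty line; return (message count, elapsed seconds)."""
    start = time.perf_counter()
    count = 0
    for raw in lines:
        if raw.strip():
            process_line(raw)
            count += 1
    return count, time.perf_counter() - start
```

With a capture file opened in binary mode, `replay(open(path, "rb"))` would time one full pass, where `path` is whatever file holds the captured messages.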

The current solution takes around 20-22 seconds on my machine.
Another solution, suggested by Nir Sofer (it can be seen in the attached gerrit patch), takes around 14-16 seconds on each run.

The pure Python approach suggested by Nir gives a performance gain of roughly 30%.

For the sake of completeness, I also tried a non-optimized C++ version of the same processing (parsing the JSON, decoding the object, etc.); it takes around 800 ms on my computer for the same 25 MiB. This just shows that there is plenty of room for improvement. We'll have to see how we would do this, however, and it is not something that would go into a backport.

As the discussion came up, we are now investigating postponing the filtering to a later point: filtering the data only for XMLRPC requests, right before it is passed along, and parsing only certain fields.
This discussion has to be continued on the mailing list to see whether it is relevant.

Comment 3 Eyal Edri 2015-02-25 08:43:40 UTC
3.5.1 is already full with bugs (over 80), and since none of these bugs were added as urgent for 3.5.1 release in the tracker bug, moving to 3.5.2

Comment 4 Michal Skrivanek 2015-05-05 08:17:53 UTC
postponing to 3.5.4
acks would be nice...

Comment 5 Scott Herold 2015-06-02 13:54:01 UTC
Removing from 3.5.z.  No customer case.  Impact insufficient for 3.5.z.

Comment 6 Vinzenz Feenstra [evilissimo] 2015-09-22 08:29:21 UTC
Moved the pending patch out of scope of this BZ and created a new BZ#1265144 for it.

Comment 8 Gil Klein 2016-02-17 07:46:45 UTC
This bug was fixed and is slated to be in the upcoming version. As we
are focusing our testing at this phase on severe bugs, this bug was
closed without going through its verification step. If you think this
bug should be verified by QE, please set its severity to high and move
it back to ON_QA
