Bug 1592847 - [v2v] ovirt-imageio-daemon: unreasonable memory growth during disk transfer
Summary: [v2v] ovirt-imageio-daemon: unreasonable memory growth during disk transfer
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-imageio-daemon
Version: 4.2.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.6
Target Release: ---
Assignee: Nir Soffer
QA Contact: guy chen
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-06-19 12:35 UTC by guy chen
Modified: 2018-11-28 12:40 UTC
CC List: 6 users

Fixed In Version: ovirt-imageio-{common,daemon}-1.4.3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-28 12:40:37 UTC
oVirt Team: Scale
Target Upstream Version:
Embargoed:
derez: needinfo-


Attachments
Updated graph of 4 VMs transfer (27.72 KB, image/png), 2018-06-27 10:33 UTC, guy chen
RES of imageio build 1.4.3 (30.51 KB, image/png), 2018-08-28 08:21 UTC, guy chen
1.4.2 imageio rate (31.21 KB, image/png), 2018-08-28 09:50 UTC, guy chen


Links
oVirt gerrit 93222 (master, MERGED): daemon: Optimize transferred calculation (last updated 2018-08-11 00:08:06 UTC)

Description guy chen 2018-06-19 12:35:44 UTC
Description of problem:
During a V2V transfer, the ovirt-imageio process's memory continues to grow unreasonably.
All requests are kept historically, and every 10 seconds the stats are reported to the engine for the UI progress bar.
This is aggravated by the large number of requests issued by v2v; a sketch of the problematic pattern is shown below.
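
The following is a minimal, hypothetical sketch of that pattern (the class and method names are illustrative, not the actual ovirt-imageio code): completed operations are never removed from the ticket, so memory grows with the number of requests, and every stats report re-walks the whole history.

    class Operation:
        def __init__(self, offset, size):
            self.offset = offset
            self.size = size
            self.done = 0  # bytes transferred so far

    class Ticket:
        def __init__(self):
            # Grows without bound: one entry per request, kept forever.
            self.operations = []

        def add(self, op):
            self.operations.append(op)

        def transferred(self):
            # Called every stats cycle; walks the entire history and
            # merges overlapping ranges, so the cost of each report
            # grows with the total number of requests served so far.
            ranges = sorted((op.offset, op.offset + op.done)
                            for op in self.operations)
            total = 0
            end = -1
            for start, stop in ranges:
                if start > end:
                    total += stop - start
                elif stop > end:
                    total += stop - end
                end = max(end, stop)
            return total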

Version-Release number of selected component (if applicable):
ovirt-imageio-common-1.3.0-0.el7ev.noarch
ovirt-imageio-daemon-1.3.0-0.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Transfer a VM from VMware to a RHV system
2. On the host, monitor the ovirt-imageio RES utilization (a monitoring sketch follows this list)
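
A minimal sketch of one way to watch the daemon's RES from the host, assuming the process name is ovirt-imageio-daemon (this helper is illustrative, not part of any reproduction tooling):

    import subprocess
    import time

    def rss_kib(pid):
        # Read VmRSS (resident set size) from /proc/<pid>/status.
        with open("/proc/%d/status" % pid) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])  # value is in KiB
        return 0

    # Assumes the daemon runs under this process name.
    pid = int(subprocess.check_output(
        ["pidof", "ovirt-imageio-daemon"]).split()[0])

    while True:
        print(time.strftime("%H:%M:%S"), "VmRSS KiB:", rss_kib(pid))
        time.sleep(10)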

Actual results:
Continuous linear memory growth

Expected results:
No continuous linear memory growth

Additional info:
Full logs will be attached.

Comment 3 guy chen 2018-06-27 10:33:26 UTC
Created attachment 1454997 [details]
Updated graph of 4 VMs transfer

Comment 4 Nir Soffer 2018-07-20 23:36:37 UTC
Guy, this issue should be fixed by the attached patch. Can you check that it solves
the issue you see?

Comment 5 Nir Soffer 2018-08-12 15:13:03 UTC
I tested this in a scale setup as well. Before this change we had 2.5G RES (in top)
after 20-30 100G virt-v2v imports. With this change, we have a constant 100M RES
after 10 100G virt-v2v imports.

Fixed in:

commit b8e83ef58d3b8c57191cd5913fc348153aace094
Author: Nir Soffer <nsoffer>
Date:   Sat Jul 21 01:15:40 2018 +0300

    daemon: Optimize transferred calculation
    
    When uploading or downloading big images using tiny chunks, the
    operations list becomes huge, taking a lot of memory and consuming
    significant CPU time.
    
    Optimize the calculation by removing completed operations from the
    ticket, and keeping a sorted list of transferred ranges. When an
    operation is removed, the operation range is merged into the completed
    ranges list, which should be small in typical download or upload flows.
    
    When calculating total transferred bytes, we create a list of ranges from
    the ongoing operations, which should be small, and merge the completed
    and ongoing ranges.
    
    Change-Id: Ib33967d9e858353037542eedbb3c68d350bf3ad4
    Bug-Url: https://bugzilla.redhat.com/1592847
    Signed-off-by: Nir Soffer <nsoffer>
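
To illustrate the approach the commit describes, here is a minimal, self-contained sketch of merging a completed operation's range into a sorted list of non-overlapping completed ranges (illustrative only; the function and variable names are not from the real ovirt-imageio code):

    def merge_range(ranges, new):
        # Fold (start, stop) into a sorted list of non-overlapping
        # (start, stop) ranges, absorbing anything it overlaps or touches.
        start, stop = new
        kept = []
        for r_start, r_stop in ranges:
            if r_stop < start or r_start > stop:
                kept.append((r_start, r_stop))   # disjoint, keep as is
            else:
                start = min(start, r_start)      # absorb the overlap
                stop = max(stop, r_stop)
        kept.append((start, stop))
        kept.sort()
        return kept

    # Completed 4 KiB chunks: the first two merge, the third stays apart.
    completed = []
    for chunk in [(0, 4096), (4096, 8192), (16384, 20480)]:
        completed = merge_range(completed, chunk)
    print(completed)  # [(0, 8192), (16384, 20480)]
    # Total transferred bytes is now a sum over a tiny list:
    print(sum(stop - start for start, stop in completed))  # 12288

Because completed operations are dropped as soon as their range is merged, the ticket only ever holds the small set of in-flight operations plus this compact range list.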

Comment 6 Sandro Bonazzola 2018-08-14 19:14:43 UTC
We're releasing 4.2.6 RC2 today, including v1.4.3, which references this bug. Can you please check this bug's status?

Comment 7 Nir Soffer 2018-08-14 19:32:30 UTC
(In reply to Sandro Bonazzola from comment #6)
1.4.3 should fix this, but it has not yet been tested by QE.

We can list this bug among the fixes included in this version, but we cannot mark it
as verified without testing.

Comment 8 Nir Soffer 2018-08-14 23:43:35 UTC
We have a downstream build; moving to ON_QA.

Comment 9 guy chen 2018-08-28 08:21:44 UTC
Created attachment 1479169 [details]
RES of imageio build 1.4.3

Comment 10 guy chen 2018-08-28 08:22:20 UTC
From a load run on 19.8 with the ovirt-imageio-daemon-1.4.3 version, doing a V2V migration of 10 VMs of 100GB each to an FC disk 66% full, we see the RES is indeed stable at 95G and not increasing as before, thus verifying the bug. The RES image is attached.

Comment 11 Nir Soffer 2018-08-28 08:36:50 UTC
Guy, can you add more details on how the behavior changed compared with previous versions
during the same flows (e.g. a 10x100G import)?

- On which versions did we test the same flow?
- What was the memory usage seen in each tested version?

Comment 12 guy chen 2018-08-28 09:50:16 UTC
Created attachment 1479201 [details]
1.4.2 imageio rate

Comment 13 guy chen 2018-08-28 09:57:38 UTC
Comment 10 above got mixed up; the figure should be MB, not GB.
I added the RES of the previous ovirt-imageio-daemon 1.4.2 version with the same scenario.
In 1.4.2, the RES increases throughout the test, up to 3.6 GB.
In 1.4.3, the RES stabilizes around 95 MB and does not continue to increase.

Comment 14 Nir Soffer 2018-08-28 10:01:03 UTC
Thanks! I did not know it was so bad.
Any chance to get a graph from the 1.4.3 test run?

Comment 15 guy chen 2018-08-28 10:20:32 UTC
The 1.4.3 test run graph is attached: "RES of imageio build 1.4.3".

Comment 16 Nir Soffer 2018-08-28 19:34:43 UTC
Thanks Guy, I think we have all the data.

