Bug 1264008

Summary: libvirt take too much time to redefine a guest when set a big iothreads
Product: Red Hat Enterprise Linux 7 Reporter: Luyao Huang <lhuang>
Component: libvirtAssignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: low Docs Contact:
Priority: low    
Version: 7.2CC: dyuan, hliu, jdenemar, jferlan, mzhan, rbalakri, yalzhang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-1.3.3-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-03 18:24:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Luyao Huang 2015-09-17 10:11:12 UTC
Description of problem:
libvirt take too much time to redefine a guest when set a big iothreads

find by dice:
(https://github.com/code-dice/dice)

Version-Release number of selected component (if applicable):
libvirt-1.2.17-9.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1.

edit a guest and add this in guest xml:

  <iothreads>1000000</iothreads>


# time virsh edit test4
Domain test4 XML configuration edited.


real	89m52.184s
user	0m0.172s
sys	0m0.132s

2.
3.

Actual results:
libvirt use 89m to parse the xml

Expected results:
improve the time used to define a guest or limit the iothreads number in a small range

Additional info:

Comment 1 John Ferlan 2015-10-15 21:29:50 UTC
Posted some patches upstream...

v1: http://www.redhat.com/archives/libvir-list/2015-October/msg00417.html
(patches 3 & 4)

v2:  http://www.redhat.com/archives/libvir-list/2015-October/msg00550.html
(patch 4 only - patch 3 from v1 was ACK'd and pushed already)

Comment 2 John Ferlan 2015-10-16 11:11:34 UTC
Patch pushed upstream:

$ git describe bb02d4c429673b9ee6d979a8bdf6eda54f62fe54
v1.2.20-113-gbb02d4c
$

Comment 4 yalzhang@redhat.com 2016-03-14 05:41:27 UTC
I have tried on libvirt-1.3.2-1.el7.x86_64, it still need a long time when I edit a guest with adding " <iothreads>1000000</iothreads> " in the xml. Please help to confirm, thank you!

# rpm -q libvirt
libvirt-1.3.2-1.el7.x86_64

# virsh destroy r7.1
Domain r7.1 destroyed

# time virsh edit r7.1
Domain r7.1 XML configuration edited.


real	129m23.899s
user	0m0.138s
sys	0m0.201s

Comment 5 John Ferlan 2016-03-16 16:05:54 UTC
Ugh... Well that fix *did* work at the time it was generated. 

Now if all I do is modify a guest to have 100000 iothreads, there's a noticeable pause to edit the domain *and* to dumpxml (which gives me a clue). Note that there's neither iothreadid's nor iothreadsched's defined. 

So it's off to git bisection land to find the culprit that causes the regression...  Starting with poor man's bisection of primary libvirt upstream releases

v1.2.20 - good
v1.2.21 - good
v1.3.0  - good
v1.3.1  - good

So something in v1.3.2 has caused the regression apparently....

I have a feeling I know what, so continuing with poor man's bisection by picking and choosing a specific commit or two...


Up through commit id '99c5fe0' we're good; however, commit id 'e992aa21' adds virDomainIOThreadIDMap which commit id '6dfb4507' starts to use.

But alone those aren't the problem...  So again poor man's debugging... Using this commit as my debugging point (and not top of tree)...

Added some debug messages at the start and end of format shows:

2016-03-16 14:56:28.339+0000: 6947: warning : virDomainDefFormatInternal:21711 : Start format
2016-03-16 14:56:57.152+0000: 6947: warning : virDomainDefFormatInternal:22445 : End format

30 seconds... Hmm...

Now to narrow things down a bit...

2016-03-16 15:03:52.143+0000: 30545: warning : virDomainDefFormatInternal:21711 : Start format
2016-03-16 15:03:52.144+0000: 30545: warning : virDomainDefFormatInternal:21898 : Start cputune format
2016-03-16 15:04:21.685+0000: 30545: warning : virDomainDefFormatInternal:21901 : End cputune format
2016-03-16 15:04:21.685+0000: 30545: warning : virDomainDefFormatInternal:22448 : End format


Will sir, I believe I have a smoking gun:

2016-03-16 15:10:13.204+0000: 22008: warning : virDomainDefFormatInternal:21712 : Start format
2016-03-16 15:10:13.205+0000: 22008: warning : virDomainDefFormatInternal:21899 : Start cputune format
2016-03-16 15:10:13.205+0000: 22008: warning : virDomainFormatIOThreadSchedDef:21575 : def->niothreadids=100000 def->iothreads=100000
2016-03-16 15:10:13.205+0000: 22008: warning : virDomainIOThreadIDMap:18311 : def->niothreadids=100000 def->iothreads=100000
2016-03-16 15:10:13.206+0000: 22008: warning : virDomainFormatIOThreadSchedDef:21581 : done allthreadmap
2016-03-16 15:10:41.798+0000: 22008: warning : virDomainFormatIOThreadSchedDef:21585 : done format sched def
2016-03-16 15:10:41.798+0000: 22008: warning : virDomainDefFormatInternal:21902 : End cputune format
2016-03-16 15:10:41.798+0000: 22008: warning : virDomainDefFormatInternal:22448 : End format

An inordinate amount of time spent in virDomainFormatSchedDef.

While yes, using 100,000 iothreads is not really useful, I get that; however, it's possible to define that many. Having virsh edit or virsh dumpxml spin it's wheels to find nothing defined, probably not expected.

If I modify another domain I have that has some <iothreadid> and <iothreadsched> entries defined using that same 100000 value - I get the same pauses.  I even get the same pause if changing *back* from 100000 to a more realistic number such as 4.

Adding more debugging, I find that the loop in virDomainFormatSchedDef will be executed once for each of the (currently) 4 virProcessSchedPolicy modes (BATCH, IDLE, FIFO, RR)... The first while loop then occurs for as many iothreadid's that are available.

Something just doesn't seem efficient about this.  The Parse code seems to be fine, but the Format code is dreadful.

Perhaps an important note... When we enter virDomainFormatIOThreadSchedDef the 'niothreadids' will be the same as the 'iothreads' - that's because virDomainIOThreadIDDefArrayInit initializes things for us.  That is what was "fixed" by this bz originally.

Since Peter Krempa wrote and understands that code - I'm reassigning this to him.

Comment 6 John Ferlan 2016-03-16 18:07:35 UTC
FWIW: In some limiting testing after digging and thinking about it a bit more the following "resolves" the issue - whether it's right depending on design I'm not sure (I'll leave that decision to Peter):

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 7d68096..51392df 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -18556,7 +18556,9 @@ virDomainIOThreadIDMap(virDomainDefPtr def)
         return NULL;
 
     for (i = 0; i < def->niothreadids; i++)
-        ignore_value(virBitmapSetBit(ret, def->iothreadids[i]->iothread_id));
+        if (def->iothreadids[i]->sched.policy > VIR_PROC_POLICY_NONE &&
+            def->iothreadids[i]->sched.policy < VIR_PROC_POLICY_LAST)
+            ignore_value(virBitmapSetBit(ret, def->iothreadids[i]->iothread_id));
 
     return ret;
 }

The key being - the map being returned only sets bits for those iothread_id's for which a policy was parsed (and I would assume/hope) assigned in a live manner rather than a fully populated map that eventually finds nothing but causes us to spin in virDomainFormatSchedDef.

I'll also note that the comment " * Returns a map of active iothreads for @def.
" isn't quite correct... 

Active IOThreads would be determined if the iothreadids[i]->thread_id had a value. With the above change this function would return a map of iothreadids which have a sched.policy defined.

And if someone wants to define 100,000 of those - well it'll take a while, but it shouldn't take any time if none are defined.

I assume similar checks would be done for vcpus, but I didn't chase those changes...  They have the same problem, but on a smaller scale. The problem being the needless spin through the code even if there are no <vcpusched>'s ever parsed (or created in a live system).

Comment 7 Peter Krempa 2016-03-29 19:34:12 UTC
Recent regression fixed by:

commit 0b4b58690d61e5b23cca24eec9912791cf4c21a3
Author: Peter Krempa <pkrempa>
Date:   Mon Mar 21 17:04:35 2016 +0100

    conf: decrease iterations complexity when formatting iothreads
    
    Create a bitmap of iothreads that have scheduler info set so that the
    transformation algorithm does not have to iterate the empty bitmap many
    times. By reusing self-expanding bitmaps the bitmap size does not need
    to be pre-calculated.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1264008

commit 917426c8d7dd26f13142fc4c5c1a8a19137ac647
Author: Peter Krempa <pkrempa>
Date:   Fri Mar 18 15:41:59 2016 +0100

    util: bitmap: Introduce self-expanding bitmap APIs
    
    In some cases it's impractical to use the regular APIs as the bitmap
    size needs to be pre-declared. These new APIs allow to use bitmaps that
    self expand.
    
    The new code adds a property to the bitmap to track the allocation of
    memory so that VIR_RESIZE_N can be used.

Comment 8 yalzhang@redhat.com 2016-04-14 03:31:20 UTC
Verified on libvirt-1.3.3-1.el7.x86_64, the results is as expected.

Edit the guest to adding " <iothreads>1000000</iothreads> " in the xml, the time is acceptable.
# time virsh edit rhel7.2
Domain rhel7.2 XML configuration edited.


real	0m12.919s
user	0m0.011s
sys	0m0.013s

Comment 10 errata-xmlrpc 2016-11-03 18:24:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html