Bug 1262143 - VM startup is very slow with large amounts of hotpluggable memory
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: ppc64le Linux
Priority: unspecified Severity: medium
Target Milestone: rc
Target Release: 7.3
Assigned To: David Gibson
QA Contact: Virtualization Bugs
Depends On: 1263039
Blocks: RHEV3.6PPC 1261812 1277183 1277184 1277186 RHEV3.6PPC_PCI_Passthrough
Reported: 2015-09-10 19:47 EDT by David Gibson
Modified: 2016-02-18 10:47 EST

See Also:
Fixed In Version: qemu-kvm-rhev-2.3.0-24.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-04 11:57:01 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 133677 | None | None | None | Never

Description David Gibson 2015-09-10 19:47:50 EDT
Description of problem:

If qemu is configured with large amounts of hotpluggable memory (say > 256G) it takes a very long time to start the guest.  

Version-Release number of selected component (if applicable):

qemu-kvm-rhev-2.3.0-22.el7.ppc64le

How reproducible:

100%

Steps to Reproduce:
1. Start qemu with
      /usr/libexec/qemu-kvm -M pseries -m 512M,maxmem=XXXXX ...
   using maxmem=512G, 1024G and 2048G in turn
2. Time how long until qemu starts booting the guest
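
One way step 2 could be automated is sketched below. It is only a sketch under stated assumptions: the helper name time_to_first_output is made up here, and "first byte of output" is used as a stand-in for "qemu starts booting the guest", which only makes sense if the guest console is routed to stdout (for example with -nographic). The rest of the qemu command line is left to the caller, since the report elides it.

```python
import subprocess
import sys
import time


def time_to_first_output(argv):
    """Launch argv and return seconds until it writes its first byte.

    Hypothetical helper: treats the first byte on stdout/stderr as the
    "guest has started booting" signal, which is an assumption.
    """
    start = time.monotonic()
    with subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    ) as proc:
        first = proc.stdout.read(1)  # blocks until one byte arrives or EOF
        elapsed = time.monotonic() - start
        proc.kill()  # the measurement is done; stop the process
    if not first:
        raise RuntimeError("process exited without producing any output")
    return elapsed
```

Call it with the full argv from step 1 once for each maxmem value and compare the returned times.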

Actual results:

512G: ~17s
1024G: ~2m30s
2048G: ~20m

Expected results:

<10s startup in all cases.

Additional info:

The root cause is that qemu constructs a DR connector object internally for every 256M of hotpluggable memory.  The current algorithm for doing so is O(n^3).
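
As a rough sanity check of the O(n^3) estimate, the sketch below counts the connectors implied by one per 256M and extrapolates the 512G timing cubically. The function names are made up for illustration. The model assumes the connector count tracks maxmem (ignoring the 512M of boot memory) and ignores any fixed startup overhead.

```python
CONNECTOR_GRANULARITY_MB = 256  # one DR connector per 256M, per the description


def connector_count(maxmem_gb):
    """Approximate number of DR connectors for a given maxmem in GiB."""
    return maxmem_gb * 1024 // CONNECTOR_GRANULARITY_MB


def cubic_estimate(base_seconds, base_gb, target_gb):
    """Scale a measured time assuming cost grows with the cube of the count."""
    ratio = connector_count(target_gb) / connector_count(base_gb)
    return base_seconds * ratio ** 3


for gb in (512, 1024, 2048):
    # maxmem, connector count, cubic extrapolation from 17s at 512G
    print(gb, connector_count(gb), round(cubic_estimate(17, 512, gb)))
```

This gives 2048, 4096 and 8192 connectors, with extrapolated times of 136s at 1024G and 1088s at 2048G, which is the same order as the observed ~2m30s and ~20m.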
Comment 1 David Gibson 2015-09-10 19:51:52 EDT
I've posted a patch upstream which improves adding the connectors to O(n^2).  This brings startup with 2048G down to an acceptable 4s; unfortunately, 4096G (RHEV's usual limit) still takes ~35s.
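
To illustrate the difference between the two complexity classes, here is a toy model. It is NOT qemu's code, and this report does not say where the cubic cost actually comes from. The model assumes a list-backed name table with a linear duplicate check, and compares probing for a free indexed name from 0 on every add (cubic in total) with supplying the index directly (quadratic in total). All names are hypothetical.

```python
def scan(props, name):
    """Linear duplicate check; returns (found, number of comparisons)."""
    for i, existing in enumerate(props):
        if existing == name:
            return True, i + 1
    return False, len(props)


def add_probing(props, base):
    """Try base[0], base[1], ... until a free name is found."""
    cost = 0
    index = 0
    while True:
        name = f"{base}[{index}]"
        found, comparisons = scan(props, name)
        cost += comparisons
        if not found:
            props.append(name)
            return cost
        index += 1


def add_direct(props, base, index):
    """Caller supplies the index, so only one duplicate check is needed."""
    name = f"{base}[{index}]"
    found, comparisons = scan(props, name)
    if found:
        raise ValueError(f"duplicate name {name}")
    props.append(name)
    return comparisons


def total_cost_probing(n):
    props = []
    return sum(add_probing(props, "dr") for _ in range(n))


def total_cost_direct(n):
    props = []
    return sum(add_direct(props, "dr", i) for i in range(n))
```

In this model, adding 4 entries costs 16 comparisons with probing and 6 with direct indices; doubling the entry count multiplies the former by roughly 8 and the latter by roughly 4.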
Comment 4 David Gibson 2015-09-14 20:41:55 EDT
I have a patch for this in the works, but I'll need a bunch of ack flags ASAP if I'm to get it in for the next snapshot.
Comment 5 David Gibson 2015-09-14 21:12:54 EDT
I've made a draft brew build with the partial fix for slow startup at:

https://brewweb.devel.redhat.com/taskinfo?taskID=9835517
Comment 8 Qunfang Zhang 2015-09-15 03:07:39 EDT
(In reply to David Gibson from comment #5)
> I've made a draft brew build with the partial fix for slow startup at:
> 
> https://brewweb.devel.redhat.com/taskinfo?taskID=9835517


Tested with the above build, booting the guest with the following maxmem values:

Maxmem:
256G: ~10s
512G: ~10s
1024G: ~15s
2048G: after 5s, hit the issue in bug 1263039 comment 4.

The bug is reproducible on the official build qemu-kvm-rhev-2.3.0-22.el7.ppc64le:

Maxmem:
256G: ~10s
512G: ~17s
1024G: 2-3 min
2048G: after about 27 min, hit bug 1263039.

Anyway, this qemu-kvm build fixes the original issue in this bug.
Comment 9 Miroslav Rezanina 2015-09-18 07:54:29 EDT
Fix included in qemu-kvm-rhev-2.3.0-24.el7
Comment 10 Qunfang Zhang 2015-09-24 01:23:32 EDT
Verified this bug with the following version:

kernel-3.10.0-316.el7.ppc64le
qemu-kvm-rhev-2.3.0-25.el7.ppc64le
SLOF-20150313-5.gitc89b0df.el7.noarch

Tested with the same command line as in comment 2; the results are:

Maxmem:
256G: ~10s
512G: ~10s without guest desktop (with guest desktop, it takes 20+ seconds)
1024G: ~10s without guest desktop (with guest desktop, it takes 20+ seconds)

According to bug 1263039 comment 6, maxmem is limited to 1T.

Based on comment 8 and the results above, I am setting the status to VERIFIED.
Comment 11 IBM Bug Proxy 2015-11-30 10:20:36 EST
------- Comment From fnovak@us.ibm.com 2015-11-30 15:17 EDT-------
reverse mirror of RHBZ 1262143 - VM startup is very slow with large amounts of hotpluggable memory
Comment 13 errata-xmlrpc 2015-12-04 11:57:01 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html
