Bug 1026808 - qemu-kvm slow RHEL-7 guest
Summary: qemu-kvm slow RHEL-7 guest
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 7.0
Assignee: Vlad Yasevich
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1002621
Blocks:
 
Reported: 2013-11-05 13:30 UTC by Heinz Mauelshagen
Modified: 2017-01-30 17:04 UTC
CC List: 28 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1002621
Environment:
Last Closed: 2017-01-30 17:04:10 UTC
Target Upstream Version:
Embargoed:



Description Heinz Mauelshagen 2013-11-05 13:30:35 UTC
Description of problem:
Running a "make -j25" upstream kernel compilation takes about factor 10 more time in a RHEL-7 guest than on its Fedora-19 host on an Intel Hexacore i7-3930K.
Running 12 processes with busy loops (ie. no IO) saturates the guests VCPUs but
not the hosts coresm which are still idling at ~50%.

Version-Release number of selected component (if applicable):
qemu-kvm-1.4.2-12.fc19.x86_64

How reproducible:
Create a RHEL-7 guest on an Intel hexa-core host with 12 VCPUs pinned to 12 hyperthreads (mapped to 6 cores in one socket), 16G of memory for the guest (32G physical RAM on the host), and a qcow2 backing file as the guest's virtio system disk.
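
The report does not say how the pinning was done; with libvirt it could look roughly like this (a hypothetical sketch, with "rhel7-guest" as a placeholder domain name):

# Hypothetical sketch only -- the actual tool and commands are not given in the report.
# Pin the guest's 12 vCPUs 1:1 onto host CPUs 0-11 (the 12 hyperthreads):
for i in $(seq 0 11); do
    virsh vcpupin rhel7-guest $i $i
done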

Steps to Reproduce:
1. Make the upstream kernel source tree accessible to the guest
   (NFS or copy to /tmp; it doesn't matter in my case!)
2. In the kernel source top directory: make distclean; make localmodconfig; make -j 25
3. Check with top in the guest and on the host (see the sketch below)
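
A rough command sketch of these steps, run in the guest; the NFS export path is a hypothetical placeholder (the report only says "NFS or copy to /tmp"):

# Hypothetical paths -- copying the tree to /tmp works as well.
mount -t nfs host:/export/linux /mnt/linux
cd /mnt/linux
make distclean
make localmodconfig
make -j 25
# While the build runs, watch CPU utilization with top in the guest and on the host.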

Actual results:
The build is slow, with the guest idling at ~80% and the host at ~77%.

Expected results:
Guest VCPUs _and_ host cores all saturated; the build only a single-digit percentage slower than on the host.

Additional info:

Comment 1 Gleb Natapov 2013-11-05 14:11:42 UTC
There was a lot of investigation on the original BZ 1002621. Can you update this one with the results? After all, we found that the original bug report is misleading (the reason for the tmpfs compile result was a ccache misconfiguration) and that the network is the real bottleneck, but you copied the original bug report verbatim.

Comment 2 Heinz Mauelshagen 2013-11-05 14:37:45 UTC
(In reply to Gleb Natapov from comment #1)
> There was a lot of investigation on the original BZ 1002621. Can you update
> this one with the results? After all, we found that the original bug report is
> misleading (the reason for the tmpfs compile result was a ccache misconfiguration)
> and that the network is the real bottleneck, but you copied the original bug report verbatim.


https://bugzilla.redhat.com/show_bug.cgi?id=1002621#c44 shows this best:

The evidence is that networking constraints are at the core of the issue:

I disabled ccache completely in a test series:

1. On the host, with the kernel build tree on local ext4
2. Build tree shared via NFS locally (i.e. NFS client and server both on the host)
3. Build tree shared via NFS to the VM (NFS server on the host as before)

The NFSv4 mount options were the same in cases 2 and 3.
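
The exact option string is not quoted here; mounts of this general shape would match the description (export path and host address are hypothetical placeholders):

# Hypothetical examples only -- the real export path and options are not recorded in this comment.
# Case 2: NFS client and server both on the host
mount -t nfs4 localhost:/export/build /mnt/build
# Case 3: the same export mounted from inside the VM (host acting as NFS server)
mount -t nfs4 192.168.122.1:/export/build /mnt/build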

Results of a "make clean ; make -j 25":

1. 2m8s
2. 2m38s
3. 6m39s

BTW: putting the working set in tmpfs on the VM: 2m24s (close to case 1 above)

So virtio networking (with 64K jumbo frames) seems to be the bottleneck, with a factor of _10_ difference in ops as reported by nfsiostat between case 2 and case 3.

How can the virtio bottleneck be analyzed further and eased?
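
As a possible starting point (a suggestion, not something recorded in this bug), one could compare per-mount NFS statistics and verify the MTU actually in effect on the virtio path; the interface name eth0 is a placeholder:

nfsiostat 5            # per-mount NFS ops/s and latency, run in the guest and on the host
ip link show eth0      # confirm the jumbo MTU is really in effect inside the guest
sar -n DEV 5           # per-interface throughput while the build is running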

Comment 3 Ronen Hod 2014-02-25 17:43:24 UTC
Postponed to 7.1, as this is not a regression and we are too late.

Comment 6 Vlad Yasevich 2015-08-27 00:06:59 UTC
Moving to 7.3. I don't think I'll get to it in the 7.2 timeframe.

Comment 13 Yanhui Ma 2016-07-05 10:01:35 UTC
Here are the latest RHEL 7.3 results and test steps:

Host:
qemu-kvm-rhev-2.6.0-10.el7.x86_64
kernel-3.10.0-461.el7.x86_64

MemTotal:       32727644 kB

CPU(s):                64
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             4
Vendor ID:             GenuineIntel
Model name:            Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz

guest:
kernel-3.10.0-461.el7.x86_64
16G memory
16 VCPUS

Steps:
1. Boot a RHEL 7.3 guest:
/usr/libexec/qemu-kvm -machine accel=kvm -name RHEL-7.3 -S \
    -machine pc,accel=kvm,usb=off -cpu SandyBridge -m 16384 \
    -smp 16,maxcpus=16,sockets=16,cores=1,threads=1 \
    -uuid 3bf7f3da-8b6f-eb49-6086-f63952b29ed1 \
    -no-user-config -nodefaults -rtc base=utc -no-shutdown \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -drive file=/home/RHEL-Server-7.3-64-virtio.raw,if=none,id=drive-virtio-disk0,format=raw \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:19:af:71,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
    -vnc :0 -vga cirrus \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
    -monitor stdio

2. Git clone the RHEL 7.3 kernel into /home in the guest:
git clone git://git.app.eng.bos.redhat.com/rhel7.git 
git checkout kernel-3.10.0-461.el7

3. Copy /boot/config-3.10.0-461.el7.x86_64 from the host to the kernel source directory
4. make distclean
   time make -j 16 (and, in a second run, -j 64)
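
For reference, idle figures like those in the tables below could be captured non-interactively, e.g. with top in batch mode (the comment does not state how they were collected):

# Hypothetical way to record CPU idle during the build (not from the comment):
top -b -d 5 -n 120 | grep '%Cpu' > cpu-idle.log   # run in the guest and on the host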

Results:
time make -j 16 in guest and host respectively
------------------------------------------------
      | time   | guest cpu idle | host cpu idle
------------------------------------------------
guest | 10m34s | ~1%            | ~75%
------------------------------------------------
host  | 8m45s  | x              | ~71%


time make -j 64 in guest and host respectively
------------------------------------------------
      | time   | guest cpu idle | host cpu idle
------------------------------------------------
guest | 11m13s | ~0%            | ~75%
------------------------------------------------
host  | 4m21s  | x              | ~0%

Comment 15 Vlad Yasevich 2016-07-06 17:01:17 UTC
The issue was that an NFS-mounted source tree was slow to build.

The original bug describes a host system acting as an NFS server to the guest running on it.  The guest then mounts the source tree and builds it.

A suggested config (the one I used to reproduce) was a private bridge with a jumbo MTU configured on it as well as on the tap and virtio devices.
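
A setup along those lines might look like this (hypothetical bridge/tap/NIC names; the 9000-byte MTU is only a placeholder value, and comment 2 mentions 64K jumbo frames):

# Hypothetical sketch of the private bridge with jumbo MTU:
ip link add br-priv type bridge
ip link set br-priv mtu 9000 up
ip link set tap0 mtu 9000 master br-priv up
# ...and inside the guest, raise the MTU on the virtio NIC as well:
ip link set eth0 mtu 9000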

What I just thought of and didn't consider before is whether NFSv4 or NFSv3 was used, and whether the transport was TCP or UDP.
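
Either question can be answered on the NFS client, for example (not something that was checked in this bug):

# Check which NFS version and transport the mount negotiated (vers= and proto= fields):
nfsstat -m
# or, more coarsely:
mount | grep nfs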

-vlad

