| Summary: | qemu-kvm slow RHEL-7 guest | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Heinz Mauelshagen <heinzm> |
| Component: | qemu-kvm | Assignee: | Vlad Yasevich <vyasevic> |
| Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.0 | CC: | ailan, amit.shah, berrange, bmr, cfergeau, crobinso, dwmw2, hhuang, huding, itamar, jasowang, juzhang, knoel, michen, mkarg, mst, mtosatti, pbonzini, rbalakri, rjones, scottt.tw, todayyang, virt-bugs, virt-maint, virt-maint, vyasevic, wquan, yama |
| Target Milestone: | rc | Keywords: | Reopened |
| Target Release: | 7.0 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1002621 | Environment: | |
| Last Closed: | 2017-01-30 17:04:10 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | 1002621 | ||
| Bug Blocks: | |||
|
Description
Heinz Mauelshagen
2013-11-05 13:30:35 UTC
There was a lot of investigation on the original BZ1002621. Can you update this one with the result? After all, we found that the original bug report is misleading (the reason for the tmpfs compile is a ccache misconfiguration) and the network is the real bottleneck, but you copied the original bug report verbatim.

(In reply to Gleb Natapov from comment #1)
> There was a lot of investigation on original BZ1002621. Can you update this
> one with the result? After all we found that original bug report is
> misleading (the reason for tmpfs compile is ccache misconfiguration) and
> network is the real bottleneck, but you copied original bug report verbatim.

https://bugzilla.redhat.com/show_bug.cgi?id=1002621#c44 shows this best:

The evidence is that networking constraints are the core of the issue. I disabled ccache completely in a test series:
1. on the host, with a kernel build tree on local ext4
2. build tree shared via NFS locally (i.e. NFS client and server both on the host)
3. build tree shared via NFS on the VM (NFS server on the host as before)

The NFSv4 mount options were the same in cases 2 and 3.

Results of a "make clean ; make -j 25":
1. 2m8s
2. 2m38s
3. 6m39s

BTW: putting the working set in tmpfs on the VM: 2m24s (good vs. 1 above)

So virtio networking (with 64K jumbo frames) seems to be the bottleneck, with a factor of _10_ difference in ops as per nfsiostat in case 2 compared to case 3.

How can the virtio bottleneck be analyzed further and eased?

Postponed to 7.1, as this is not about a regression, and we are too late.

Moving to 7.3. I don't think I'll get to it in the 7.2 timeframe.

Here are the latest RHEL7.3 results and test steps:
Host:
qemu-kvm-rhev-2.6.0-10.el7.x86_64
kernel-3.10.0-461.el7.x86_64
MemTotal: 32727644 kB
CPU(s): 64
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 4
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz
Guest:
kernel-3.10.0-461.el7.x86_64
16G memory
16 vCPUs
Steps:
1. Boot a RHEL7.3 guest:
/usr/libexec/qemu-kvm -machine accel=kvm -name RHEL-7.3 -S -machine pc,accel=kvm,usb=off -cpu SandyBridge -m 16384 -smp 16,maxcpus=16,sockets=16,cores=1,threads=1 -uuid 3bf7f3da-8b6f-eb49-6086-f63952b29ed1 -no-user-config -nodefaults -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/RHEL-Server-7.3-64-virtio.raw,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:19:af:71,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc :0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -monitor stdio
2. git clone RHEL7.3 kernel to /home in guest
git clone git://git.app.eng.bos.redhat.com/rhel7.git
git checkout kernel-3.10.0-461.el7
3. Copy /boot/config-3.10.0-461.el7.x86_64 from the host to the kernel source directory
4. make distclean
time make -j 16 (and again with -j 64)
Results:
time make -j 16 in guest and host respectively
-----------------------------------------------------
       | time   | guest cpu idle | host cpu idle
-----------------------------------------------------
guest  | 10m34s | ~1%            | ~75%
-----------------------------------------------------
host   | 8m45s  | x              | ~71%
-----------------------------------------------------
time make -j 64 in guest and host respectively
-----------------------------------------------------
       | time   | guest cpu idle | host cpu idle
-----------------------------------------------------
guest  | 11m13s | ~0%            | ~75%
-----------------------------------------------------
host   | 4m21s  | x              | ~0%
-----------------------------------------------------
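To make the timings in this report easier to compare, the following minimal Python sketch converts the reported wall-clock times to seconds and prints slowdown ratios. The helper names (`to_seconds`, `slowdown`) and the dictionary layout are illustrative assumptions; only the timing strings are taken from the comments in this bug.

```python
import re


def to_seconds(text):
    """Convert a duration such as '10m34s' or '45s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


def slowdown(slow, fast):
    """Return how many times longer `slow` took than `fast`."""
    return to_seconds(slow) / to_seconds(fast)


# Local-disk kernel builds: job count -> (guest time, host time).
BUILD_TIMES = {16: ("10m34s", "8m45s"), 64: ("11m13s", "4m21s")}

# Earlier NFS test series ("make clean ; make -j 25"), ccache disabled.
NFS_SERIES = {
    "local ext4 on host": "2m8s",
    "NFS on host": "2m38s",
    "NFS in VM": "6m39s",
}

if __name__ == "__main__":
    for jobs, (guest, host) in BUILD_TIMES.items():
        print(f"make -j {jobs}: guest/host = {slowdown(guest, host):.2f}x")
    baseline = NFS_SERIES["local ext4 on host"]
    for label, duration in NFS_SERIES.items():
        print(f"{label}: {slowdown(duration, baseline):.2f}x of local ext4")
```

With the numbers above, the guest build takes about 1.2 times as long as the host build at -j 16 and about 2.6 times as long at -j 64, while the NFS build in the VM takes about 2.5 times as long as the same NFS build run on the host.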
The issue was that an NFS-mounted source tree was slow to build. The original bug describes a host system acting as an NFS server to the guest running on it. The guest then mounts the source tree and builds it. A suggested config (the one I used to reproduce) was a private bridge with jumbo MTU configured on it as well as on the tap devices and virtio devices.

What I just thought of and didn't consider before was whether NFSv4 or v3 was used and whether the TCP or UDP protocol was used.

-vlad
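One way to answer the NFS version and transport question on a reproducer is to read the mount options the guest kernel reports in /proc/mounts. The sketch below is a hedged illustration rather than part of the original investigation: the function name `nfs_mount_options` and the sample line (server name, export path, address) are hypothetical, and it assumes the usual `vers=` and `proto=` option names that the Linux NFS client lists for its mounts.

```python
import os


def nfs_mount_options(mounts_text):
    """Map each NFS mount point to its filesystem type, version and transport.

    `mounts_text` is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and non-NFS filesystems (including "nfsd").
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for item in fields[3].split(","):
            key, _, value = item.partition("=")
            options[key] = value
        result[fields[1]] = {
            "fstype": fields[2],
            "vers": options.get("vers"),    # e.g. "3" or "4.0"; None if absent
            "proto": options.get("proto"),  # e.g. "tcp" or "udp"; None if absent
        }
    return result


# Hypothetical sample line, for illustration only.
SAMPLE = (
    "host:/export/build /mnt/build nfs4 "
    "rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,proto=tcp,addr=192.0.2.1 0 0\n"
    "tmpfs /tmp tmpfs rw,nosuid,nodev 0 0\n"
)

if __name__ == "__main__":
    text = SAMPLE
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            text = handle.read()
    for mount_point, info in nfs_mount_options(text).items():
        print(mount_point, info)
```

Running this in the guest while the source tree is mounted would show whether the slow case used NFSv3 or NFSv4 and whether it ran over TCP or UDP, which can then be recorded alongside the timings.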