Bug 1275144 - There is about a 20% throughput regression between qemu-kvm-1.5.3-94 and qemu-kvm-1.5.3-95 on fusion-io and ramdisk backends
Status: CLOSED DUPLICATE of bug 1251353
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.2
Hardware: x86_64 Linux
Priority: high   Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Paolo Bonzini
QA Contact: Virtualization Bugs
Keywords: Regression
Depends On:
Blocks:
 
Reported: 2015-10-26 01:48 EDT by Yanhui Ma
Modified: 2015-12-11 09:45 EST
CC: 11 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-11 09:45:32 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Yanhui Ma 2015-10-26 01:48:07 EDT
Description of problem:

Here are the regression test results for qemu-kvm-1.5.3-86.el7.x86_64 + 7.1 guest + 7.1 host vs. qemu-kvm-1.5.3-104.el7.x86_64 + 7.2 guest + 7.2 host:

https://mojo.redhat.com/docs/DOC-1049185

For the fusion-io and ramdisk backends, there is about a 20% throughput degradation at iodepth 8 and iodepth 64 with small block sizes.

By bisecting the qemu-kvm builds, I identified qemu-kvm-1.5.3-95 as the first version showing the regression.

Version-Release number of selected component (if applicable):

qemu-kvm-1.5.3-104.el7.x86_64
kernel-3.10.0-320.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Generate a ramdisk on the host:
# modprobe brd rd_size=1048576 rd_nr=1
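(rd_size is given in KiB, so this should create a 1 GiB ramdisk. Assuming the brd module loads as expected, it can be verified with:
# lsblk -b /dev/ram0
which should show a 1073741824-byte /dev/ram0.)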
2. Boot the guest with the ramdisk attached as a second virtio-blk disk:
numactl \
    -m 1 /usr/libexec/qemu-kvm  \
    -S \
    -name 'virt-tests-vm1' \
    -nodefaults \
    -chardev socket,id=hmp_id_humanmonitor1,path=/tmp/monitor-humanmonitor1-20150930-010040-dDtHkBI5,server,nowait \
    -mon chardev=hmp_id_humanmonitor1,mode=readline \
    -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20150930-010040-dDtHkBI5,server,nowait \
    -device isa-serial,chardev=serial_id_serial1 \
    -chardev socket,id=seabioslog_id_20150930-010040-dDtHkBI5,path=/tmp/seabios-20150930-010040-dDtHkBI5,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20150930-010040-dDtHkBI5,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=0x3 \
    -drive file='/usr/local/autotest/tests/virt/shared/data/images/RHEL-Server-7.2-64.raw',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,snapshot=off,format=raw,aio=native \
    -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,bootindex=0 \
    -drive file='/dev/ram0',index=2,if=none,id=drive-virtio-disk2,media=disk,cache=none,snapshot=off,format=raw,aio=native \
    -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk2,bootindex=1 \
    -device virtio-net-pci,netdev=idRejOFD,mac='9a:37:37:37:37:8e',bus=pci.0,addr=0x6,id='id2CKRmT' \
    -netdev tap,id=idRejOFD,fd=25 \
    -m 4096 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
    -cpu 'Westmere' \
    -M pc \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -vga cirrus \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off   \
    -device sga \
    -enable-kvm
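(Note: the guest is started with -S, i.e. paused, presumably so that the vCPU threads can be pinned before the benchmark runs; once the pinning in step 3 is done, the guest would be resumed by sending "cont" over the HMP monitor socket, e.g. something like:
# echo cont | nc -U /tmp/monitor-humanmonitor1-20150930-010040-dDtHkBI5
where the socket path is the one given to -chardev above.)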
3. On the host:
# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 8175 MB
node 0 free: 7551 MB
node 1 cpus: 4 5 6 7
node 1 size: 8192 MB
node 1 free: 6810 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

Pin the two vCPU threads to CPU 4 and CPU 5 (both on node 1), respectively.
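One way to do this (hypothetical commands; not necessarily how the automated test does it) is to read the vCPU thread IDs from the HMP monitor with "info cpus" and pin them with taskset:
# taskset -pc 4 <thread_id of vCPU 0>
# taskset -pc 5 <thread_id of vCPU 1>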

4. In the guest, run fio against the ramdisk-backed disk:

fio --rw=read --bs=4k --iodepth=8 --runtime=1m --direct=1 --filename=/dev/vdb --name=job1 --ioengine=libaio --thread --group_reporting --numjobs=16 --time_based --output=/tmp/fio_result
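The iodepth 64 case is presumably the same command with --iodepth=64, and the other small block sizes are covered by varying --bs, e.g.:

fio --rw=read --bs=4k --iodepth=64 --runtime=1m --direct=1 --filename=/dev/vdb --name=job1 --ioengine=libaio --thread --group_reporting --numjobs=16 --time_based --output=/tmp/fio_result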

Actual results:

For the fusion-io and ramdisk backends, there is about a 20% throughput degradation at iodepth 8 and iodepth 64 with small block sizes.

Expected results:

No throughput regression on the fusion-io and ramdisk backends.

Additional info:
Comment 2 Markus Armbruster 2015-10-26 03:19:10 EDT
Related: bug 1251353 "Throughput degrades when doing I/O on multiple disks with tcmalloc optimization", filed against gperftools.
Comment 4 Stefan Hajnoczi 2015-10-27 12:35:57 EDT
I looked at the commits between qemu-kvm-1.5.3-94 and qemu-kvm-1.5.3-95.  The significant change was that tcmalloc was introduced.  Markus has already pointed to another tcmalloc-related performance regression (comment 2).
Comment 5 Paolo Bonzini 2015-11-02 04:11:31 EST
yama, can you repeat the benchmark with G_SLICE=always-malloc set in the environment?
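(For reference, one way to do that, assuming the same command line as in the description, is to prefix the invocation with the variable so it ends up in the QEMU process environment, roughly:

G_SLICE=always-malloc numactl -m 1 /usr/libexec/qemu-kvm ...

with the rest of the options unchanged.)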
Comment 8 Paolo Bonzini 2015-11-04 08:05:29 EST
Note that this is without dataplane.
Comment 9 Paolo Bonzini 2015-12-11 09:45:32 EST

*** This bug has been marked as a duplicate of bug 1251353 ***
