Bug 1275144 - There is about 20% throughput regression between qemu-kvm-1.5.3-94 and qemu-kvm-1.5.3-95 on fusion-io and ramdisk backend
Summary: There is about 20% throughput regression between qemu-kvm-1.5.3-94 and qemu-kvm-1.5.3-95 on fusion-io and ramdisk backend
Status: CLOSED DUPLICATE of bug 1251353
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.2
Hardware: x86_64
OS: Linux
Target Milestone: rc
Assignee: Paolo Bonzini
QA Contact: Virtualization Bugs
Depends On:
Reported: 2015-10-26 05:48 UTC by Yanhui Ma
Modified: 2015-12-11 14:45 UTC
CC: 11 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2015-12-11 14:45:32 UTC
Target Upstream Version:


Description Yanhui Ma 2015-10-26 05:48:07 UTC
Description of problem:

Here are the regression test results for qemu-kvm-1.5.3-86.el7.x86_64 (7.1 guest + 7.1 host) vs qemu-kvm-1.5.3-104.el7.x86_64 (7.2 guest + 7.2 host):


For the fusion-io and ramdisk backends, there is about 20% throughput degradation at iodepth 8 and iodepth 64 with small block sizes.

By bisecting qemu-kvm builds, I identified qemu-kvm-1.5.3-95 as the first version showing the regression.

Version-Release number of selected component (if applicable):


How reproducible:

Steps to Reproduce:
1. Generate a ramdisk on the host:
#modprobe brd rd_size=1048576 rd_nr=1
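Before booting the guest, it can help to confirm the ramdisk came up with the expected size (rd_size is given in KiB, so 1048576 KiB should yield a 1 GiB block device); this check is a suggestion, not part of the original reproducer:

```shell
# rd_size is in KiB: 1048576 KiB * 1024 = 1073741824 bytes (1 GiB)
expected=$((1048576 * 1024))
actual=$(blockdev --getsize64 /dev/ram0)
if [ "$actual" -eq "$expected" ]; then
    echo "ramdisk OK: $actual bytes"
else
    echo "unexpected ramdisk size: $actual bytes" >&2
fi
```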
2. Boot the guest with the ramdisk attached as a second virtio disk:
numactl \
    -m 1 /usr/libexec/qemu-kvm  \
    -S \
    -name 'virt-tests-vm1' \
    -nodefaults \
    -chardev socket,id=hmp_id_humanmonitor1,path=/tmp/monitor-humanmonitor1-20150930-010040-dDtHkBI5,server,nowait \
    -mon chardev=hmp_id_humanmonitor1,mode=readline \
    -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20150930-010040-dDtHkBI5,server,nowait \
    -device isa-serial,chardev=serial_id_serial1 \
    -chardev socket,id=seabioslog_id_20150930-010040-dDtHkBI5,path=/tmp/seabios-20150930-010040-dDtHkBI5,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20150930-010040-dDtHkBI5,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=0x3 \
    -drive file='/usr/local/autotest/tests/virt/shared/data/images/RHEL-Server-7.2-64.raw',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,snapshot=off,format=raw,aio=native \
    -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,bootindex=0 \
    -drive file='/dev/ram0',index=2,if=none,id=drive-virtio-disk2,media=disk,cache=none,snapshot=off,format=raw,aio=native \
    -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk2,bootindex=1 \
    -device virtio-net-pci,netdev=idRejOFD,mac='9a:37:37:37:37:8e',bus=pci.0,addr=0x6,id='id2CKRmT' \
    -netdev tap,id=idRejOFD,fd=25 \
    -m 4096 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
    -cpu 'Westmere' \
    -M pc \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -vga cirrus \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off   \
    -device sga \
3. On the host, check the NUMA topology:
# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 8175 MB
node 0 free: 7551 MB
node 1 cpus: 4 5 6 7
node 1 size: 8192 MB
node 1 free: 6810 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

Pin the two vCPU threads to host CPUs 4 and 5 respectively (both on node 1, matching the numactl -m 1 memory binding).
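One way to do the pinning (a sketch, not the exact commands used in the test run): ask the HMP monitor for each vCPU's thread_id via "info cpus" and bind the threads with taskset. The monitor socket path is the one from the command line above; this assumes an nc that supports UNIX sockets (e.g. ncat on RHEL 7).

```shell
# Query the vCPU thread IDs from the guest's HMP monitor socket.
MON=/tmp/monitor-humanmonitor1-20150930-010040-dDtHkBI5
TIDS=$(echo 'info cpus' | nc -U "$MON" | grep -o 'thread_id=[0-9]*' | cut -d= -f2)

# Bind the first vCPU thread to host CPU 4, the second to CPU 5.
set -- 4 5
for tid in $TIDS; do
    taskset -pc "$1" "$tid"
    shift
done
```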

4. In the guest, run fio against the ramdisk-backed disk (/dev/vdb):

fio --rw=read --bs=4k --iodepth=8 --runtime=1m --direct=1 --filename=/dev/vdb --name=job1 --ioengine=libaio --thread --group_reporting --numjobs=16 --time_based --output=/tmp/fio_result

Actual results:

For the fusion-io and ramdisk backends, there is about 20% throughput degradation at iodepth 8 and iodepth 64 with small block sizes.

Expected results:

No regression on the fusion-io and ramdisk backends.

Additional info:

Comment 2 Markus Armbruster 2015-10-26 07:19:10 UTC
Related: bug 1251353 "Throughput degrades when doing I/O on multiple disks with tcmalloc optimization" against gperftools.

Comment 4 Stefan Hajnoczi 2015-10-27 16:35:57 UTC
I looked at the commits between qemu-kvm-1.5.3-94 and qemu-kvm-1.5.3-95.  The significant change was that tcmalloc was introduced.  Markus has already reported another tcmalloc-related performance regression (bug 1251353).
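A quick way to confirm which build links against tcmalloc is to inspect the binary's shared-library dependencies (this is a suggested check, not something from the original comment):

```shell
# A tcmalloc-linked qemu-kvm will list libtcmalloc among its dependencies.
ldd /usr/libexec/qemu-kvm | grep -i tcmalloc
```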

Comment 5 Paolo Bonzini 2015-11-02 09:11:31 UTC
yama, can you repeat the benchmark with G_SLICE=always-malloc set in the environment?
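G_SLICE=always-malloc tells GLib's slice allocator to forward every g_slice allocation straight to malloc (here, tcmalloc), which isolates the allocator's role in the regression. Prepending it to the reproducer's launch command is enough; the trailing "..." stands for the rest of the command line quoted above:

```shell
# Re-run the benchmark with GLib slices bypassed in favor of malloc.
G_SLICE=always-malloc numactl -m 1 /usr/libexec/qemu-kvm -S -name 'virt-tests-vm1' ...
```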

Comment 8 Paolo Bonzini 2015-11-04 13:05:29 UTC
Note that this is without dataplane.

Comment 9 Paolo Bonzini 2015-12-11 14:45:32 UTC

*** This bug has been marked as a duplicate of bug 1251353 ***
