Red Hat Bugzilla – Bug 1031987
Randrw KVM exits are higher on virtio_blk driver with native gluster backend
Last modified: 2016-10-26 17:18:18 EDT
You said "KVM_Exits is much higher (~70%-95%) on latest version compared to old version". Could you tell us which older version behaved well? And do you have test numbers from that old version, so that I can compare them?
Created attachment 827005 [details]
Created attachment 827006 [details]
(In reply to Fam Zheng from comment #8)
> It will be helpful if we can also compare the exit reasons, Xigao, could you
> run both tests again and collect the two traces with:
> 1. Before starting test, run as root on host:
> # mount -t debugfs none /sys/kernel/debug
> # echo 1 >/sys/kernel/debug/tracing/events/kvm/enable
> # cat /sys/kernel/debug/tracing/trace_pipe | grep kvm_exit >
> 2. Leave the above command running, and run the fio test in the guest
> 3. When the test finishes, copy out /tmp/kvm_exit_trace.
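The step-1 filter can be exercised offline on a saved trace. A minimal sketch, using hypothetical trace lines that mimic the ftrace kvm tracepoint output (the real trace_pipe format may differ slightly):

```shell
# Offline check of the step-1 filter: only kvm_exit events survive the grep.
# The sample lines below are hypothetical, mimicking ftrace kvm tracepoint output.
cat > /tmp/sample_trace <<'EOF'
 qemu-kvm-1234 [000] 100.000001: kvm_entry: vcpu 0
 qemu-kvm-1234 [000] 100.000002: kvm_exit: reason HLT rip 0xffffffff8103eaca info 0 0
 qemu-kvm-1234 [000] 100.000003: kvm_entry: vcpu 0
EOF
grep kvm_exit /tmp/sample_trace > /tmp/kvm_exit_trace
wc -l < /tmp/kvm_exit_trace
```

Only the kvm_exit line survives, so the line count is 1.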
- Run the following fio command in the guest:
# fio --rw=randrw --bs=4k --iodepth=8 --runtime=1m --direct=1 --filename=/mnt/randrw_4k_8 --name=job1 --ioengine=libaio --thread --group_reporting --numjobs=16 --size=512MB --time_based --ioscheduler=deadline
- Results on qemu-kvm-tools-0.12.1.2-2.415.el6.x86_64
BW:2.825MB/s IOPS:723 Kvm_Exits: 201574
* Kvm_exit trace is in comment #10, "qemu_kvm_415.el6_exit_trace"
- Results on qemu-kvm-0.12.1.2-2.415.el6_5.3.x86_64
BW:3.281MB/s IOPS:839 Kvm_Exits: 663718
* Kvm_exit trace is in comment #11, "qemu_kvm_415.el6_5.3_exit_trace"
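A back-of-envelope normalization of the numbers above (runtime is 60 s from --runtime=1m, so total IOs is roughly IOPS * 60) makes the regression easier to see as exits per IO:

```shell
# Exits per completed IO, from the figures reported above.
# Assumes total IOs ~= IOPS * 60 s runtime.
awk 'BEGIN {
    t = 60
    printf "415.el6:     %.2f exits/IO\n", 201574 / (723 * t)
    printf "415.el6_5.3: %.2f exits/IO\n", 663718 / (839 * t)
}'
```

That works out to roughly 4.65 vs 13.18 exits per request, close to 3x more exits per IO on the newer build.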
(In reply to Fam Zheng from comment #14)
> Xiaomei, please see if using no-op scheduler + setting nomerges [*] can
> reproduce this. We expect the numbers to be linear in that case.
> (Or alternatively, setting option "use_bio=true" to virtio_blk module
> parameter should have similar effects.)
We could still reproduce the issue with the noop scheduler and nomerges set:
# echo noop > /sys/block/vdb/queue/scheduler
# echo 2 > /sys/block/vdb/queue/nomerges
# fio --rw=randrw --bs=4k --iodepth=8 --runtime=1m --direct=1 --filename=/mnt/randrw_4k_8 --name=job1 --ioengine=libaio --thread --group_reporting --numjobs=16 --size=512MB --time_based --ioscheduler=noop
- Results on qemu-kvm-tools-0.12.1.2-2.415.el6.x86_64
BW=2868.3KB/s IOPS=717 KVM_Exits=205470
- Results on qemu-kvm-0.12.1.2-2.415.el6_5.3.x86_64
BW=3440.4KB/s IOPS=860 KVM_Exits=705056
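The same normalization on these noop+nomerges runs (a sketch; assuming the two result lines correspond to 415.el6 and 415.el6_5.3 respectively, matching the attachment names below) gives per-IO exit counts close to the deadline run, which is consistent with request merging not being the cause:

```shell
# Exits per completed IO for the noop+nomerges runs (60 s runtime assumed).
# Version attribution is inferred from the attachment names, not stated inline.
awk 'BEGIN {
    t = 60
    printf "415.el6:     %.2f exits/IO\n", 205470 / (717 * t)
    printf "415.el6_5.3: %.2f exits/IO\n", 705056 / (860 * t)
}'
```

About 4.78 vs 13.66 exits per IO, nearly identical to the ~4.65 vs ~13.18 seen with the deadline scheduler.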
Please check the attachments for the KVM_Exit traces.
Created attachment 827662 [details]
noop + nomerges + kvm_exit_trace_415.el6
Created attachment 827663 [details]
noop + nomerges + kvm_exit_trace_415.el6_5.3
260175 MSR_WRITE 0xffffffff8103ec08 native_write_msr_safe
107396 IO_INSTRUCTION 0xffffffff81293431 iowrite16
101080 HLT 0xffffffff8103eaca native_safe_halt
88341 EXCEPTION_NMI 0xffffffff8100c644 __math_state_restore
88341 CR_ACCESS 0xffffffff81009777 __switch_to
24427 EXCEPTION_NMI 0x37dba845fb unknown
9298 EXCEPTION_NMI 0xffffffff8128d39e copy_user_generic_unrolled
3462 PENDING_INTERRUPT 0xffffffff8103eacb native_safe_halt
3207 EXCEPTION_NMI 0xffffffff8103f6e7 native_set_pte_at
2401 EXCEPTION_NMI 0x37de28020b unknown
2011 EXCEPTION_NMI 0x418cf6 unknown
1508 INVLPG 0xffffffff8104fc2d flush_tlb_page
1486 CR_ACCESS 0xffffffff8103efbc native_flush_tlb
1486 CR_ACCESS 0xffffffff8103efb9 native_flush_tlb
1479 EXCEPTION_NMI 0x37dba79ddd unknown
No difference with previous results.
Jeff, do you think there's anything special about gluster backend?
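The per-reason summary above can be regenerated from a saved trace with a short awk pass. A sketch, assuming the kvm_exit lines carry "reason <NAME> rip <ADDR>" fields and the /tmp/kvm_exit_trace path from step 3:

```shell
# Count (exit reason, guest RIP) pairs in the saved kvm_exit trace,
# highest counts first. Fields are located by name rather than by
# position, so minor format differences don't break the parse.
awk '/kvm_exit/ {
    for (i = 1; i <= NF; i++) {
        if ($i == "reason") r = $(i + 1)
        if ($i == "rip")    a = $(i + 1)
    }
    count[r " " a]++
}
END { for (k in count) print count[k], k }' /tmp/kvm_exit_trace | sort -rn
```

Resolving each RIP to a symbol name (native_write_msr_safe, etc.) would take a separate pass against the guest kernel's System.map.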
(In reply to Fam Zheng from comment #18)
> 260175 MSR_WRITE 0xffffffff8103ec08 native_write_msr_safe
> 107396 IO_INSTRUCTION 0xffffffff81293431 iowrite16
> 101080 HLT 0xffffffff8103eaca native_safe_halt
> 88341 EXCEPTION_NMI 0xffffffff8100c644 __math_state_restore
> 88341 CR_ACCESS 0xffffffff81009777 __switch_to
> 24427 EXCEPTION_NMI 0x37dba845fb unknown
> 9298 EXCEPTION_NMI 0xffffffff8128d39e copy_user_generic_unrolled
> 3462 PENDING_INTERRUPT 0xffffffff8103eacb native_safe_halt
> 3207 EXCEPTION_NMI 0xffffffff8103f6e7 native_set_pte_at
> 2401 EXCEPTION_NMI 0x37de28020b unknown
> 2011 EXCEPTION_NMI 0x418cf6 unknown
> 1508 INVLPG 0xffffffff8104fc2d flush_tlb_page
> 1486 CR_ACCESS 0xffffffff8103efbc native_flush_tlb
> 1486 CR_ACCESS 0xffffffff8103efb9 native_flush_tlb
> 1479 EXCEPTION_NMI 0x37dba79ddd unknown
> No difference with previous results.
> Jeff, do you think there's anything special about gluster backend?
This is really odd: Fam noticed that my previous comment in this bug has disappeared. Here is what it said:
That is a good question. I am honestly not sure if there is something specific to gluster that would cause this. Do we see increased exit events on other network block drivers?
I do wonder if this is somehow similar or related to BZ 1010638. In that bug, when running fio continuously in a guest that has a data drive that is using the qemu native gluster driver, memory usage continues to increase until we hit the kernel OOM killer. Do you want to reassign this BZ to me, Fam?
(In reply to Jeff Cody from comment #19)
> This is really odd - Fam noticed that my previous comment in this bug has
> disappeared. Here is what it said:
> That is a good question. I am honestly not sure if there is something
> specific to gluster that would cause this. Do we see increased exit events
> on other network block drivers?
On Netapp NFS backend, we could not see increased exit events.
(In reply to Jeff Cody from comment #20)
> I do wonder if this is somehow similar or related to BZ 1010638. In that
> bug, when running fio continuously in a guest that has a data drive that is
> using the qemu native gluster driver, memory usage continues to increase
> until we hit the kernel OOM killer. Do you want to reassign this BZ to me,
> Fam?
OK. I don't know whether this is closely related to BZ 1010638, but since this BZ has only been reproduced with QEMU's gluster driver so far, I'm reassigning it to Jeff.
Removing regression keyword and the z-stream flag, as this is not a general problem (there was no gluster in RHEL6.4, so it can't be a regression).