Bug 1425700

Summary: virtio-scsi data plane takes 100% host CPU with polling
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.4
Reporter: Fam Zheng <famz>
Assignee: Fam Zheng <famz>
QA Contact: CongLi <coli>
CC: aliang, chayang, coli, juzhang, knoel, kuwei, michen, virt-maint, wquan, xuwei, yama
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Keywords: Regression
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qemu-kvm-rhev-2.8.0-6.el7
Doc Type: If docs needed, set a value
Last Closed: 2017-08-01 23:44:45 UTC
Type: Bug
Bug Blocks: 1404303

Description Fam Zheng 2017-02-22 07:21:21 UTC
This regression was introduced by the polling feature and should be fixed by the following upstream commit:

commit 0793169870f376bc9959b7d81df48ab4a90dcceb
Author: Fam Zheng <famz>
Date:   Thu Feb 9 16:40:47 2017 +0800

    virtio: Report real progress in VQ aio poll handler
    
    In virtio_queue_host_notifier_aio_poll, not all "!virtio_queue_empty()"
    cases are making true progress.
    
    Currently the offending one is virtio-scsi event queue, whose handler
    does nothing if no event is pending. As a result aio_poll() will spin on
    the "non-empty" VQ and take 100% host CPU.
    
    Fix this by reporting actual progress from virtio queue aio handlers.
    
    Reported-by: Ed Swierk <eswierk>
    Signed-off-by: Fam Zheng <famz>
    Tested-by: Ed Swierk <eswierk>
    Reviewed-by: Stefan Hajnoczi <stefanha>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>

For reproducer see also:

https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg01703.html
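
To illustrate the mechanism, below is a self-contained toy model of the bug and the fix. It is not QEMU source; all names and types are invented for illustration. The idea: the poll loop keeps polling as long as a poll callback reports progress. Before the fix, any non-empty virtqueue counted as progress, so the virtio-scsi event queue handler, which consumes nothing when no event is pending, made the loop spin at 100% host CPU. The fix makes handlers report whether they actually did work, and the poll callback propagates that result.

    /* Toy model only -- not QEMU source. Compile with: gcc -o poll poll.c */
    #include <stdbool.h>
    #include <stdio.h>

    struct vq {
        int pending;        /* descriptors sitting in the queue */
        bool event_pending; /* does the handler actually have work? */
    };

    /* Mirrors the virtio-scsi event queue handler: it does nothing
     * unless an event is pending, and (after the fix) says so. */
    static bool handle_event_vq(struct vq *q)
    {
        if (q->event_pending) {
            q->event_pending = false;
            return true;    /* real progress */
        }
        return false;       /* nothing consumed */
    }

    /* Before the fix: a non-empty queue was itself treated as progress,
     * so the poll loop spins while q->pending > 0. */
    static bool poll_buggy(struct vq *q)
    {
        if (q->pending == 0) {
            return false;
        }
        handle_event_vq(q);
        return true;
    }

    /* After the fix: report the handler's actual progress. */
    static bool poll_fixed(struct vq *q)
    {
        if (q->pending == 0) {
            return false;
        }
        return handle_event_vq(q);
    }

    int main(void)
    {
        struct vq q = { .pending = 1, .event_pending = false };
        long spins = 0;

        /* The poll loop keeps going while the callback claims progress;
         * the cap stands in for "burning 100% CPU forever". */
        while (poll_fixed(&q) && spins < 1000000) {
            spins++;
        }
        printf("poll loop exited after %ld iterations\n", spins);
        /* Swapping in poll_buggy() makes the loop hit the cap: that is
         * the spin reported in this bug. */
        return 0;
    }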

Comment 3 CongLi 2017-02-22 14:41:40 UTC
Reproduced this bug on:
qemu-kvm-rhev-2.8.0-4.el7.x86_64
kernel-3.10.0-552.el7.x86_64

1. With data plane:
1.1 poll-max-ns=32768:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,iothread=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -object iothread,id=iothread0,poll-max-ns=32768 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
29240 root      20   0 16.606g 1.502g  11984 S 100.3  4.8   2:31.31 qemu-kvm

1.2 poll-max-ns=0:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,iothread=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -object iothread,id=iothread0,poll-max-ns=0 \


  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
29408 root      20   0 16.583g 1.500g  12008 S 100.0  4.8   2:19.84 qemu-kvm

1.3 without poll-max-ns:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,iothread=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -object iothread,id=iothread0 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
29580 root      20   0 16.602g 1.521g  11980 S 100.7  4.9   3:49.51 qemu-kvm


2. Without data plane:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
29752 root      20   0 16.573g 1.505g  12008 S   0.7  4.8   0:48.02 qemu-kvm

Comment 4 CongLi 2017-02-22 14:48:20 UTC
For comparison, the problem does not occur on qemu-kvm-rhev-2.8.0-3.el7.x86_64:

    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,iothread=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -object iothread,id=iothread0 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
30143 root      20   0 16.588g 1.505g  12020 S   0.3  4.8   0:43.59 qemu-kvm

Comment 6 Miroslav Rezanina 2017-03-08 10:55:44 UTC
Fix included in qemu-kvm-rhev-2.8.0-6.el7

Comment 7 CongLi 2017-03-09 02:50:51 UTC
Verified this bug on:
qemu-kvm-rhev-2.8.0-6.el7.x86_64

1. With data plane:
1.1 poll-max-ns=32768:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,iothread=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -object iothread,id=iothread0,poll-max-ns=32768 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
18761 root      20   0 16.695g 1.700g  11992 S   0.3  5.4   1:07.90 qemu-kvm 


1.2 poll-max-ns=0:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,iothread=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -object iothread,id=iothread0,poll-max-ns=0 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
19178 root      20   0 16.586g 1.469g  11972 S   0.7  4.7   0:46.14 qemu-kvm


1.3 without poll-max-ns:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,iothread=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -object iothread,id=iothread0 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND               
19316 root      20   0 16.583g 1.492g  11984 S   0.7  4.8   0:46.43 qemu-kvm


2. Without data plane:
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                      
19454 root      20   0 16.580g 1.494g  11968 S   0.3  4.8   0:45.43 qemu-kvm

Comment 8 Richard W.M. Jones 2017-03-10 12:15:18 UTC
*** Bug 1430287 has been marked as a duplicate of this bug. ***

Comment 11 errata-xmlrpc 2017-08-01 23:44:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
