Bug 1902548 - When there are more than 7 virtio-blk devices hot-inserted, HMP reports an error: qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
Summary: When there are more than 7 virtio-blk devices hot-inserted, HMP reports an error: qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.4
Assignee: Greg Kurz
QA Contact: Xujun Ma
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-11-30 03:02 UTC by Zhenyu Zhang
Modified: 2020-12-30 08:12 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-15 08:41:18 UTC
Type: ---
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments
lsof log (111.81 KB, text/plain)
2020-12-07 07:05 UTC, Xujun Ma

Description Zhenyu Zhang 2020-11-30 03:02:32 UTC
Description of problem:
When there are more than 7 virtio-blk devices hot-inserted, 
HMP reports an error: 
qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
virtio-blk failed to set host notifier (-24)
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
qemu-kvm: virtio-blk failed to set guest notifier (-24), ensure -accel kvm is set.
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).


Version-Release number of selected component (if applicable):
Host Kernel: 4.18.0-250.el8.dt4.ppc64le
Guest Kernel: 4.18.0-254.el8.ppc64le
Qemu-kvm: qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9
SLOF: SLOF-20191022-3.git899d9883.module+el8.3.0+6423+e4cb6418.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot guest:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-machine pseries  \
-nodefaults \
-device VGA,bus=pci.0,addr=0x2 \
-m 248832  \
-smp 60,maxcpus=60,cores=30,threads=1,sockets=2  \
-cpu 'host' \
-chardev socket,server,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,nowait  \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,server,id=chardev_serial0,path=/var/tmp/serial-serial0,nowait \
-device spapr-vty,id=serial0,reg=0x30000000,chardev=chardev_serial0 \
-device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kar/vt_test_images/rhel840-ppc64le-virtio.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pci.0,addr=0x4 \
-device virtio-net-pci,mac=9a:ac:2f:f2:95:8d,id=idT9IhMN,netdev=idpSaD7d,bus=pci.0,addr=0x5  \
-netdev tap,id=idpSaD7d,vhost=on \
-vnc :20  \
-rtc base=utc,clock=host  \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm 

2. Run the hotplug and unplug commands (an example unplug sequence is sketched after the hotplug commands below):
{"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_0.qcow2", "node-name": "file_block-id4fcjgc"}, "id": "DGW91isx"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id4fcjgc", "file": "file_block-id4fcjgc"}, "id": "fnTQd7g4"}
{"execute": "device_add", "arguments": {"id": "block-id4fcjgc", "driver": "virtio-blk-pci", "drive": "block-id4fcjgc"}, "id": "zw5pdBkE"}

{"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_1.qcow2", "node-name": "file_block-idlxYVyb"}, "id": "MqnzZxWZ"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idlxYVyb", "file": "file_block-idlxYVyb"}, "id": "zR71NhES"}
{"execute": "device_add", "arguments": {"id": "block-idlxYVyb", "driver": "virtio-blk-pci", "drive": "block-idlxYVyb"}, "id": "kYsG5im6"}

{"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_2.qcow2", "node-name": "file_block-id1l86Qo"}, "id": "noa1ORcI"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id1l86Qo", "file": "file_block-id1l86Qo"}, "id": "eLRzZjRq"}
{"execute": "device_add", "arguments": {"id": "block-id1l86Qo", "driver": "virtio-blk-pci", "drive": "block-id1l86Qo"}, "id": "sPK74ki2"}

{"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_3.qcow2", "node-name": "file_block-iddYeNtw"}, "id": "Kcm9lCyp"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-iddYeNtw", "file": "file_block-iddYeNtw"}, "id": "ld8F8l1T"}
{"execute": "device_add", "arguments": {"id": "block-iddYeNtw", "driver": "virtio-blk-pci", "drive": "block-iddYeNtw"}, "id": "SWBGpgPc"}

{"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_4.qcow2", "node-name": "file_block-idxDBhNZ"}, "id": "xJgSCb90"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idxDBhNZ", "file": "file_block-idxDBhNZ"}, "id": "I51sqQWV"}
{"execute": "device_add", "arguments": {"id": "block-idxDBhNZ", "driver": "virtio-blk-pci", "drive": "block-idxDBhNZ"}, "id": "rr6RTpJJ"}

 {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_5.qcow2", "node-name": "file_block-iddGGL3b"}, "id": "8iMWOGKi"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-iddGGL3b", "file": "file_block-iddGGL3b"}, "id": "vKpwTYMr"}
{"execute": "device_add", "arguments": {"id": "block-iddGGL3b", "driver": "virtio-blk-pci", "drive": "block-iddGGL3b"}, "id": "X7DMQcji"}

{"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_6.qcow2", "node-name": "file_block-idAMfZD0"}, "id": "enmXS1pU"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idAMfZD0", "file": "file_block-idAMfZD0"}, "id": "1IRPcfKE"}
{"execute": "device_add", "arguments": {"id": "block-idAMfZD0", "driver": "virtio-blk-pci", "drive": "block-idAMfZD0"}, "id": "KfvoCWwf"}  ==============> HMP reports an error

{"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/var/lib/avocado/data/avocado-vt/storage_7.qcow2", "node-name": "file_block-id3WjYqy"}, "id": "svGjzLQU"}
{"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id3WjYqy", "file": "file_block-id3WjYqy"}, "id": "oqKsFhNv"}
{"execute": "device_add", "arguments": {"id": "block-id3WjYqy", "driver": "virtio-blk-pci", "drive": "block-id3WjYqy"}, "id": "5NgfMCsL"}  ==============> HMP reports an error


Actual results:
qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
virtio-blk failed to set host notifier (-24)
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
qemu-kvm: virtio-blk failed to set guest notifier (-24), ensure -accel kvm is set.
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).

Expected results:
No error

Additional info:

Comment 1 Zhenyu Zhang 2020-11-30 03:05:04 UTC
Hi Qingwang,

Could you try it on the x86 platform?

Comment 2 qing.wang 2020-11-30 07:48:45 UTC
Tested on

Red Hat Enterprise Linux release 8.4 Beta (Ootpa)
4.18.0-250.el8.dt4.x86_64
qemu-kvm-common-4.2.0-37.module+el8.4.0+8837+c89bcfe6.x86_64

and

Red Hat Enterprise Linux release 8.3 (Ootpa)
4.18.0-240.el8.x86_64
qemu-kvm-common-5.1.0-14.module+el8.3.0+8790+80f9c6d8.1.x86_64


No issue found.

Maybe this issue is due to a similar cause as Bug 1897550 - qemu crashed when hotplugging many disks with error virtio_scsi_data_plane_handle_ctrl: Assertion `s->ctx && s->dataplane_started' failed.

There may be a different threshold value on the ppc platform.




Steps:

1. Create images
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_0.qcow2 1G
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_1.qcow2 2G
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_2.qcow2 3G
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_3.qcow2 4G
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_4.qcow2 5G
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_5.qcow2 6G
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_6.qcow2 7G
  qemu-img create -f qcow2 /home/kvm_autotest_root/images/storage_7.qcow2 8G

2. Boot VM
/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -sandbox on \
  -machine pc \
  -nodefaults \
  -device VGA,bus=pci.0,addr=0x2 \
  -m 8096 \
  -chardev socket,server,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,nowait \
  -mon chardev=qmp_id_qmpmonitor1,mode=control \
  -chardev socket,server,id=chardev_serial0,path=/var/tmp/serial-serial0,nowait \
  -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
  -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pci.0,addr=0x4 \
  -device virtio-net-pci,mac=9a:ac:2f:f2:95:8d,id=idT9IhMN,netdev=idpSaD7d,bus=pci.0,addr=0x5 \
  -netdev tap,id=idpSaD7d,vhost=on \
  -rtc base=utc,clock=host \
  -boot menu=off,order=cdn,once=c,strict=off \
  -enable-kvm \
  -vnc :5 \
  -rtc base=localtime,clock=host,driftfix=slew \
  -boot order=cdn,once=c,menu=off,strict=off \
  -device pcie-root-port,id=pcie_extra_root_port_0,bus=pci.0 \
  -monitor stdio \
  -qmp tcp:0:5955,server,nowait \
  -chardev file,path=/var/tmp/monitor-serialdbg.log,id=serial_id_serial0 \
  -device isa-serial,chardev=serial_id_serial0

3. Hotplug disks
{"execute":"qmp_capabilities"}
  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_0.qcow2", "node-name": "file_block-id4fcjgc"}, "id": "DGW91isx"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id4fcjgc", "file": "file_block-id4fcjgc"}, "id": "fnTQd7g4"}
  {"execute": "device_add", "arguments": {"id": "block-id4fcjgc", "driver": "virtio-blk-pci", "drive": "block-id4fcjgc"}, "id": "zw5pdBkE"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_1.qcow2", "node-name": "file_block-idlxYVyb"}, "id": "MqnzZxWZ"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idlxYVyb", "file": "file_block-idlxYVyb"}, "id": "zR71NhES"}
  {"execute": "device_add", "arguments": {"id": "block-idlxYVyb", "driver": "virtio-blk-pci", "drive": "block-idlxYVyb"}, "id": "kYsG5im6"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_2.qcow2", "node-name": "file_block-id1l86Qo"}, "id": "noa1ORcI"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id1l86Qo", "file": "file_block-id1l86Qo"}, "id": "eLRzZjRq"}
  {"execute": "device_add", "arguments": {"id": "block-id1l86Qo", "driver": "virtio-blk-pci", "drive": "block-id1l86Qo"}, "id": "sPK74ki2"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_3.qcow2", "node-name": "file_block-iddYeNtw"}, "id": "Kcm9lCyp"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-iddYeNtw", "file": "file_block-iddYeNtw"}, "id": "ld8F8l1T"}
  {"execute": "device_add", "arguments": {"id": "block-iddYeNtw", "driver": "virtio-blk-pci", "drive": "block-iddYeNtw"}, "id": "SWBGpgPc"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_4.qcow2", "node-name": "file_block-idxDBhNZ"}, "id": "xJgSCb90"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idxDBhNZ", "file": "file_block-idxDBhNZ"}, "id": "I51sqQWV"}
  {"execute": "device_add", "arguments": {"id": "block-idxDBhNZ", "driver": "virtio-blk-pci", "drive": "block-idxDBhNZ"}, "id": "rr6RTpJJ"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_5.qcow2", "node-name": "file_block-iddGGL3b"}, "id": "8iMWOGKi"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-iddGGL3b", "file": "file_block-iddGGL3b"}, "id": "vKpwTYMr"}
  {"execute": "device_add", "arguments": {"id": "block-iddGGL3b", "driver": "virtio-blk-pci", "drive": "block-iddGGL3b"}, "id": "X7DMQcji"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_6.qcow2", "node-name": "file_block-idAMfZD0"}, "id": "enmXS1pU"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idAMfZD0", "file": "file_block-idAMfZD0"}, "id": "1IRPcfKE"}
  {"execute": "device_add", "arguments": {"id": "block-idAMfZD0", "driver": "virtio-blk-pci", "drive": "block-idAMfZD0"}, "id": "KfvoCWwf"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_7.qcow2", "node-name": "file_block-id3WjYqy"}, "id": "svGjzLQU"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id3WjYqy", "file": "file_block-id3WjYqy"}, "id": "oqKsFhNv"}
  {"execute": "device_add", "arguments": {"id": "block-id3WjYqy", "driver": "virtio-blk-pci", "drive": "block-id3WjYqy"}, "id": "5NgfMCsL"}

Comment 3 Zhenyu Zhang 2020-11-30 09:18:24 UTC
Hi Qingwang,

Thanks for the feedback.
Tested 5 times with "qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9" on x86 and did not hit this issue,
so setting this to ppc only.

Comment 4 Xujun Ma 2020-12-01 08:52:36 UTC
Reproduced this problem with qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9.ppc64le when hotplugging the seventh virtio-blk disk. The error is as follows:
qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
virtio-blk failed to set host notifier (-24)
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).

Comment 5 Greg Kurz 2020-12-03 12:05:30 UTC
(In reply to Xujun Ma from comment #4)
> Reproduced this problem with
> qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9.ppc64le when hotplug the
> seventh virtio_blk disk.error as following:
> qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too
> many open files (-24)
> virtio-blk failed to set host notifier (-24)
> qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).

It seems that the QEMU process has reached RLIMIT_NOFILE.

Can you check the limit?

$ grep files /proc/$(pgrep qemu)/limits
Max open files            1024                 262144               files     

As well as the number of already open descriptors before adding
the disks?

$ ls /proc/$(pgrep qemu)/fd | wc -l
85

FYI this is what I get after the 8 disks were added:

$ ls /proc/$(pgrep qemu)/fd | wc -l
107

Comment 6 Xujun Ma 2020-12-04 07:54:55 UTC
(In reply to Greg Kurz from comment #5)
> (In reply to Xujun Ma from comment #4)
> > Reproduced this problem with
> > qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9.ppc64le when hotplug the
> > seventh virtio_blk disk.error as following:
> > qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too
> > many open files (-24)
> > virtio-blk failed to set host notifier (-24)
> > qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
> 
> It seems that the QEMU process has reached RLIMIT_NOFILE.
> 
> Can you check the limit ?
> 
> $ grep files /proc/$(pgrep qemu)/limits
> Max open files            1024                 262144               files   
# grep files /proc/$(pgrep qemu)/limits
Max open files            1024                 8192                 files   
> 
> 
> As well as the number of already open descriptors before adding the
> the disks ?
# ls /proc/$(pgrep qemu)/fd | wc -l
211
> 
> $ ls /proc/$(pgrep qemu)/fd | wc -l
> 85
> 
> FYI this is what I get after the 8 disks were added:
# ls /proc/$(pgrep qemu)/fd | wc -l
999
> 
> $ ls /proc/$(pgrep qemu)/fd | wc -l
> 107

Comment 7 Greg Kurz 2020-12-04 14:46:00 UTC
(In reply to Xujun Ma from comment #6)
> (In reply to Greg Kurz from comment #5)
> > (In reply to Xujun Ma from comment #4)
> > > Reproduced this problem with
> > > qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9.ppc64le when hotplug the
> > > seventh virtio_blk disk.error as following:
> > > qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too
> > > many open files (-24)
> > > virtio-blk failed to set host notifier (-24)
> > > qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
> > 
> > It seems that the QEMU process has reached RLIMIT_NOFILE.
> > 
> > Can you check the limit ?
> > 
> > $ grep files /proc/$(pgrep qemu)/limits
> > Max open files            1024                 262144               files   
> # grep files /proc/$(pgrep qemu)/limits
> Max open files            1024                 8192                 files   
> > 
> > 
> > As well as the number of already open descriptors before adding the
> > the disks ?
> # ls /proc/$(pgrep qemu)/fd | wc -l
> 211

This number of open file descriptors seems very high for
the QEMU command line provided in the description...

> > 
> > $ ls /proc/$(pgrep qemu)/fd | wc -l
> > 85
> > 
> > FYI this is what I get after the 8 disks were added:
> # ls /proc/$(pgrep qemu)/fd | wc -l
> 999

... and this one is pretty much insane but it is very close
to 1024. It is thus consistent with QEMU hitting the "too
many open files" error.

I can't think of a way for 8 disks to open nearly 800 file
descriptors... I need some more details. Can you provide
the output of 'lsof -p $(pgrep qemu)' before and after
adding the disks?

> > 
> > $ ls /proc/$(pgrep qemu)/fd | wc -l
> > 107

Comment 8 David Gibson 2020-12-07 03:39:17 UTC
I wonder if we could have an fd leak in qemu.

Comment 9 David Gibson 2020-12-07 03:51:38 UTC
Xujun, could you also check if this is a regression?

Comment 10 Xujun Ma 2020-12-07 07:05:47 UTC
Created attachment 1737204 [details]
lsof log

Comment 11 Greg Kurz 2020-12-07 15:19:43 UTC
(In reply to David Gibson from comment #8)
> I wonder if we could have an fd leak in qemu.

I've bisected down to:

commit 9445e1e15e66c19e42bea942ba810db28052cd05
Author: Stefan Hajnoczi <stefanha>
Date:   Tue Aug 18 15:33:47 2020 +0100

    virtio-blk-pci: default num_queues to -smp N



This explains the inflation of event fds observed in the
lsof output.

This wasn't reproduced on x86 because the QEMU cmdline in
comment #2 creates a single-vCPU guest.

Qingwang,

Please retry on x86 with the following on the QEMU cmdline:

-smp 60,maxcpus=60,cores=30,threads=1,sockets=2
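
For reference, a quick way to watch the fd inflation while the disks are being hot-plugged is to repeat the count from comment #5 around each device_add (this assumes a single qemu-kvm process on the host, so pgrep matches only the guest under test):

$ ls /proc/$(pgrep qemu)/fd | wc -l     # before the hotplug
  ... blockdev-add + device_add for one disk ...
$ ls /proc/$(pgrep qemu)/fd | wc -l     # after; expect roughly 2 extra fds per queue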

Comment 12 qing.wang 2020-12-08 02:16:48 UTC
Test on 
Red Hat Enterprise Linux release 8.3 (Ootpa)
4.18.0-240.el8.x86_64
qemu-kvm-common-5.1.0-14.module+el8.3.0+8790+80f9c6d8.1.x86_64

Boot vm :
/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -sandbox on \
  -machine pc \
  -nodefaults \
  -device VGA,bus=pci.0,addr=0x2 \
  -m 8096 \
  -smp 60,maxcpus=60,cores=30,threads=1,sockets=2  \
  -cpu 'host' \
  -chardev socket,server,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,nowait \
  -mon chardev=qmp_id_qmpmonitor1,mode=control \
  -chardev socket,server,id=chardev_serial0,path=/var/tmp/serial-serial0,nowait \
  -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
  -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pci.0,addr=0x4 \
  -device virtio-net-pci,mac=9a:ac:2f:f2:95:8d,id=idT9IhMN,netdev=idpSaD7d,bus=pci.0,addr=0x5 \
  -netdev tap,id=idpSaD7d,vhost=on \
  -rtc base=utc,clock=host \
  -boot menu=off,order=cdn,once=c,strict=off \
  -enable-kvm \
  -vnc :5 \
  -rtc base=localtime,clock=host,driftfix=slew \
  -boot order=cdn,once=c,menu=off,strict=off \
  -device pcie-root-port,id=pcie_extra_root_port_0,bus=pci.0 \
  -monitor stdio \
  -qmp tcp:0:5955,server,nowait \
  -chardev file,path=/var/tmp/monitor-serialdbg.log,id=serial_id_serial0 \
  -device isa-serial,chardev=serial_id_serial0

hotplug disks
{"execute":"qmp_capabilities"}
  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_0.qcow2", "node-name": "file_block-id4fcjgc"}, "id": "DGW91isx"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id4fcjgc", "file": "file_block-id4fcjgc"}, "id": "fnTQd7g4"}
  {"execute": "device_add", "arguments": {"id": "block-id4fcjgc", "driver": "virtio-blk-pci", "drive": "block-id4fcjgc"}, "id": "zw5pdBkE"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_1.qcow2", "node-name": "file_block-idlxYVyb"}, "id": "MqnzZxWZ"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idlxYVyb", "file": "file_block-idlxYVyb"}, "id": "zR71NhES"}
  {"execute": "device_add", "arguments": {"id": "block-idlxYVyb", "driver": "virtio-blk-pci", "drive": "block-idlxYVyb"}, "id": "kYsG5im6"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_2.qcow2", "node-name": "file_block-id1l86Qo"}, "id": "noa1ORcI"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id1l86Qo", "file": "file_block-id1l86Qo"}, "id": "eLRzZjRq"}
  {"execute": "device_add", "arguments": {"id": "block-id1l86Qo", "driver": "virtio-blk-pci", "drive": "block-id1l86Qo"}, "id": "sPK74ki2"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_3.qcow2", "node-name": "file_block-iddYeNtw"}, "id": "Kcm9lCyp"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-iddYeNtw", "file": "file_block-iddYeNtw"}, "id": "ld8F8l1T"}
  {"execute": "device_add", "arguments": {"id": "block-iddYeNtw", "driver": "virtio-blk-pci", "drive": "block-iddYeNtw"}, "id": "SWBGpgPc"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_4.qcow2", "node-name": "file_block-idxDBhNZ"}, "id": "xJgSCb90"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idxDBhNZ", "file": "file_block-idxDBhNZ"}, "id": "I51sqQWV"}
  {"execute": "device_add", "arguments": {"id": "block-idxDBhNZ", "driver": "virtio-blk-pci", "drive": "block-idxDBhNZ"}, "id": "rr6RTpJJ"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_5.qcow2", "node-name": "file_block-iddGGL3b"}, "id": "8iMWOGKi"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-iddGGL3b", "file": "file_block-iddGGL3b"}, "id": "vKpwTYMr"}
  {"execute": "device_add", "arguments": {"id": "block-iddGGL3b", "driver": "virtio-blk-pci", "drive": "block-iddGGL3b"}, "id": "X7DMQcji"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_6.qcow2", "node-name": "file_block-idAMfZD0"}, "id": "enmXS1pU"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-idAMfZD0", "file": "file_block-idAMfZD0"}, "id": "1IRPcfKE"}
  {"execute": "device_add", "arguments": {"id": "block-idAMfZD0", "driver": "virtio-blk-pci", "drive": "block-idAMfZD0"}, "id": "KfvoCWwf"}

  {"execute": "blockdev-add", "arguments": {"driver": "file", "filename": "/home/kvm_autotest_root/images/storage_7.qcow2", "node-name": "file_block-id3WjYqy"}, "id": "svGjzLQU"}
  {"execute": "blockdev-add", "arguments": {"driver": "qcow2", "node-name": "block-id3WjYqy", "file": "file_block-id3WjYqy"}, "id": "oqKsFhNv"}
  {"execute": "device_add", "arguments": {"id": "block-id3WjYqy", "driver": "virtio-blk-pci", "drive": "block-id3WjYqy"}, "id": "5NgfMCsL"}


No issue found.

Comment 13 Greg Kurz 2020-12-08 10:48:16 UTC
(In reply to qing.wang from comment #12)
> Test on 
> Red Hat Enterprise Linux release 8.3 (Ootpa)
> 4.18.0-240.el8.x86_64
> qemu-kvm-common-5.1.0-14.module+el8.3.0+8790+80f9c6d8.1.x86_64
> 

This QEMU is too old. It doesn't have the change mentioned in
comment #11.

Please re-test using the same QEMU version as used on ppc :

http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/qemu-kvm/5.2.0/0.module+el8.4.0+8855+a9e237a9/

You should then hit the issue since I'm observing it on my laptop
using upstream QEMU (5.2-rc4).

Comment 14 qing.wang 2020-12-09 06:16:59 UTC
Reproduced the issue as in comment 0 on
Red Hat Enterprise Linux release 8.4 Beta (Ootpa)
4.18.0-252.el8.dt4.x86_64
qemu-kvm-common-5.2.0-0.module+el8.4.0+8855+a9e237a9.x86_64

before hotplug
root@dell-per440-10 ~ # lsof -p $(pgrep qemu)|wc -l
270

after hotplug
root@dell-per440-10 ~ # lsof -p $(pgrep qemu)|wc -l
1059

root@dell-per440-10 ~ # grep files /proc/$(pgrep qemu)/limits
Max open files            1024                 262144               files     

root@dell-per440-10 ~ # ls /proc/$(pgrep qemu)/fd | wc -l
998

Comment 15 Xujun Ma 2020-12-09 07:38:53 UTC
(In reply to David Gibson from comment #9)
> Xujun, could you also check if this is a regression?

OK

Comment 16 Greg Kurz 2020-12-09 10:13:43 UTC
(In reply to Xujun Ma from comment #15)
> (In reply to David Gibson from comment #9)
> > Xujun, could you also check if this is a regression?
> 
> OK

No need to re-check. This was checked already by Qinwang in comment #12
on x86 with QEMU 5.1. It is a regression introduced by QEMU 5.2 as
explained in comment #11.

First thing that comes to mind is that QEMU should be started with a higher
ulimit for open file descriptors.

If QEMU is run manually from the shell, this can be achieved by setting
a high enough 'nofile' value in a conf file in /etc/security/limits.d/.

If QEMU is run from libvirtd, this is achieved similarly by setting
'max_files' in /etc/libvirt/qemu.conf.

This is a bit sub-optimal because we don't really have metrics for
the number of descriptors QEMU will need. With the setups described
in this BZ, it seems to be slightly more than 1024, so if you use
a much higher limit, e.g. 2048, you shouldn't hit the issue.
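
As a minimal sketch of the manual-shell option (the drop-in file name, the user name and the 2048 value are only examples; adjust them to your setup):

# /etc/security/limits.d/99-qemu-nofile.conf
#<domain>     <type>   <item>    <value>
your_user      soft     nofile    2048
your_user      hard     nofile    2048

The libvirt route ('max_files') is shown with a qemu.conf excerpt in comment #20.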

This is only a band-aid though. This scalability issue should be
addressed with the QEMU community, initially through documentation.
Ultimately we might want QEMU to advertise the expected number
of descriptors for a given setup, and teach libvirt to exploit it
when calling setrlimit().

Anyway, this affects all architectures so I'm updating this BZ
accordingly.

Comment 17 qing.wang 2020-12-10 02:19:22 UTC
The increment is 121 fds per disk in my test. Is that the expected result?

root@dell-per440-10 ~ # ls /proc/$(pgrep qemu)/fd | wc -l;lsof -p $(pgrep qemu)|wc -l
210
271
root@dell-per440-10 ~ # ls /proc/$(pgrep qemu)/fd | wc -l;lsof -p $(pgrep qemu)|wc -l
331
392
root@dell-per440-10 ~ # ls /proc/$(pgrep qemu)/fd | wc -l;lsof -p $(pgrep qemu)|wc -l
452
513

Comment 18 Greg Kurz 2020-12-10 09:27:55 UTC
(In reply to qing.wang from comment #17)
> The increment is 121 in my test for each disk. Is it expected result? 
> 

A virtio blk or scsi device now creates 2 fds per queue, and ${num_queues} now
defaults to the number of vCPUs, so, yes, with a command line like the one in
comment #12 (60 vCPUs), you get at least 120 descriptors per added disk. This
definitely causes a scalability concern regarding RLIMIT_NOFILE, which seems
to default to 1024 on RHEL (systemd?).
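
Worked out for that setup, assuming one ioeventfd plus one irqfd per virtqueue, plus one fd for the backing image (this last item is an assumption, not something measured here):

  num_queues = num_vcpus      = 60
  2 fds per queue * 60 queues = 120
  + 1 fd for the image file   = 121

which matches the per-disk increment of 121 observed in comment #17.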

I've talked with Stefan Hajnoczi (Cc'd) about the issue, and aside from
reverting the optimization, which would be very unfortunate, there isn't
much that we can do at the QEMU level in the short term. The best option
for now would be to  document the issue so that users can increase the
NOFILE limit for the QEMU process.

Comment 19 qing.wang 2020-12-10 09:43:12 UTC
I got it, so the question is whether this feature brings no performance improvement or just a little improvement.
Is it valuable to enable it? Will it consume many system resources like fds?

Comment 20 Greg Kurz 2020-12-15 08:41:18 UTC
(In reply to qing.wang from comment #19)
> I got it, so the question is if this feature can not bring performance
> improvement or just little improvement.
> Is is valueable to enable it? will it cousume many system resource like fds?

The feature mentioned in comment #11 does bring a substantial performance improvement.
This is why it is now enabled by default for newer machine types and we definitely
want to keep it as such. And, yes, it will consume 2 * ${num_queues} file descriptors
per virtio blk/scsi disk, with ${num_queues} defaulting to the number of vCPUs.

The right way to cope with this at the current time is to raise the RLIMIT_NOFILE
for the QEMU process. This can be achieved at the libvirt level by setting 'max_files'
in /etc/libvirt/qemu.conf:

# If max_processes is set to a positive integer, libvirt will use
# it to set the maximum number of processes that can be run by qemu
# user. This can be used to override default value set by host OS.
# The same applies to max_files which sets the limit on the maximum
# number of opened files.
#
#max_processes = 0
max_files = 2048

If QEMU is started manually from the command line, this can be achieved through
the PAM /etc/security/limits.conf file. See the limits.conf(5) manual page.
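
Whichever route is used, the change can be verified with the same check as in comment #5 once the guest has been restarted; the soft limit reported for the qemu-kvm process should show the new value instead of the default 1024:

$ grep files /proc/$(pgrep qemu)/limits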

