Bug 1911581
| Summary: | Core dump when hitting file descriptor limit | | |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Xujun Ma <xuma> |
| Component: | qemu-kvm | Assignee: | Greg Kurz <gkurz> |
| qemu-kvm sub component: | General | QA Contact: | Xujun Ma <xuma> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | ailan, chayang, ddepaula, gkurz, jinzhao, juzhang, nilal, pbonzini, qzhang, smitterl, virt-maint, yama, yuhuang |
| Version: | 8.4 | Keywords: | TestOnly, Triaged |
| Target Milestone: | rc | | |
| Target Release: | 8.5 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-6.0.0-17.module+el8.5.0+11173+c9fce0bb | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-11-16 07:51:11 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1788991 | | |
Description (Xujun Ma, 2020-12-30 06:31:41 UTC)
Xujun, can you confirm the hardware field?

I'm pretty sure this is the same as the problem Greg looked at a while back. Some changes have meant that qemu consumes more file descriptors per vcpu, which means we can now run into the RHEL default fd limits.

(In reply to Qunfang Zhang from comment #1)
> Xujun, can you confirm the hardware field?

I have tested it; x86 has this problem too.

(In reply to David Gibson from comment #2)
> I'm pretty sure this is the same as the problem Greg looked at a while back.
> Some changes have meant that qemu consumes more file descriptors per vcpu,
> which means we can now run into the RHEL default fd limits.

Hmm... indeed we do see a "Too many open files" error that is likely the same as in bug #1902548, but here we also have a QEMU crash. I'd prefer to have a look before marking this bug as a duplicate of the other bug.

(In reply to Greg Kurz from comment #4)
> Hmm... indeed we do see a "Too many open files" error that is likely the
> same as in bug #1902548, but here we also have a QEMU crash. I'd prefer
> to have a look before marking this bug as a duplicate of the other bug.

The QEMU crash happens because the rollback path does:

    fail_vrings:
        aio_wait_bh_oneshot(s->ctx, virtio_scsi_dataplane_stop_bh, s);

virtio_scsi_dataplane_stop_bh() clears the host notifiers and causes the vq handlers to be invoked. This triggers the assertion in virtio_scsi_data_plane_handle_ctrl() because s->dataplane_started hasn't been set to true yet.

So even if the root cause is the same (we ran into fd limits), this isn't a duplicate of bug #1902548: virtio-scsi should have a working fallback like virtio-blk for this case, or at least print an error plus a hint to raise the fd limit and exit gracefully instead of aborting.
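To make the failure sequence above easier to follow, here is a minimal, self-contained toy model of the path Greg describes (plain C, not QEMU source; the function and field names are stand-ins for the ones mentioned above):

```c
/* Toy model of the qemu-5.2 crash path described in the comment above.
 * Not QEMU source: types and functions are stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct scsi_dev {
    bool dataplane_started;
};

/* Stand-in for virtio_scsi_data_plane_handle_cmd()/_ctrl(): the real
 * handlers assert that dataplane startup has already completed. */
static void handle_cmd(struct scsi_dev *s)
{
    assert(s->dataplane_started);   /* this is the assertion that fires */
    puts("request handled in dataplane");
}

/* Stand-in for virtio_scsi_dataplane_stop_bh(): clearing the host
 * notifiers kicks the still-registered queue handlers one last time. */
static void stop_bh(struct scsi_dev *s)
{
    handle_cmd(s);
}

/* Stand-in for virtio_scsi_dataplane_start(): if setting up an event
 * notifier fails ("Too many open files"), the fail_vrings rollback runs
 * stop_bh() before dataplane_started was ever set, so handle_cmd() aborts. */
static int dataplane_start(struct scsi_dev *s, bool notifier_setup_fails)
{
    if (notifier_setup_fails) {
        stop_bh(s);     /* rollback -> handler runs -> assertion failure */
        return -1;
    }
    s->dataplane_started = true;
    return 0;
}

int main(void)
{
    struct scsi_dev s = { .dataplane_started = false };
    dataplane_start(&s, true);   /* simulate hitting the fd limit */
    return 0;
}
```

Running this aborts on the assert, mirroring the `virtio_scsi_data_plane_handle_cmd: Assertion ... failed` line in the x86 log that follows.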
Good analysis, thanks Greg. Can you work with Igor to figure out how to fix that?

virtio-scsi is Paolo's domain, adding him to CC.

(In reply to Xujun Ma from comment #3)
> (In reply to Qunfang Zhang from comment #1)
> > Xujun, can you confirm the hardware field?
>
> I have tested it; x86 has this problem too.

The results on x86 are as follows:

# ./cmd.bak
QEMU 5.2.0 monitor - type 'help' for more information
(qemu) qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
virtio-scsi: Failed to set host notifier (-24)
qemu-kvm: ../hw/scsi/virtio-scsi-dataplane.c:59: virtio_scsi_data_plane_handle_cmd: Assertion `s->ctx && s->dataplane_started' failed.
./cmd.bak: line 35: 109355 Aborted (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' -sandbox on -machine q35,kernel-irqchip=split -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 -nodefaults -device VGA,bus=pcie.0,addr=0x2 -m 30720 -smp 384 -device intel-iommu,intremap=on,eim=on -cpu 'IvyBridge',+kvm_pv_unhalt -chardev socket,nowait,path=/var/tmp/monitor-qmpmonitor1-20201130-083617-UdoMuUZg,server,id=qmp_id_qmpmonitor1 -mon chardev=qmp_id_qmpmonitor1,mode=control -chardev socket,nowait,path=/var/tmp/monitor-catch_monitor-20201130-083617-UdoMuUZg,server,id=qmp_id_catch_monitor -mon chardev=qmp_id_catch_monitor,mode=control -device pvpanic,ioport=0x505,id=idUR0xIV -chardev socket,nowait,path=/var/tmp/serial-serial0-20201130-083617-UdoMuUZg,server,id=chardev_serial0 -device isa-serial,id=serial0,chardev=chardev_serial0 -chardev socket,id=seabioslog_id_20201130-083617-UdoMuUZg,path=/var/tmp/seabios-20201130-083617-UdoMuUZg,server,nowait -device isa-debugcon,chardev=seabioslog_id_20201130-083617-UdoMuUZg,iobase=0x402 -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/rhel840-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 -device scsi-hd,id=image1,drive=drive_image1,write-cache=on -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 -vnc :0 -rtc base=utc -boot menu=off,order=cdn,once=c,strict=off -enable-kvm -monitor stdio -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5

(In reply to Xujun Ma from comment #0)
> Can solve this issue by increasing file limit,but there will be another
> problem:guest stop at "Trying to load: from:
> /pci@800000020000000/scsi@4/disk@100000000000000 ... Successfully loaded"
> about 3:18 minutes,I think it's not acceptable.and please help add friendly
> message if need more file limit.

Having to increase the file descriptor limit for large VMs is not necessarily a bug.

If you are seeing additional problems after you increase the limit, this might be a bug that's more serious than this one and it needs a separate BZ. Please open a BZ with more details.

(In reply to Eduardo Habkost from comment #9)
> Having to increase the file descriptor limit for large VMs is not
> necessarily a bug.
>
> If you are seeing additional problems after you increase the limit, this
> might be a bug that's more serious than this one and it needs a separate BZ.
> Please open a BZ with more details.

I have filed a new bug, https://bugzilla.redhat.com/show_bug.cgi?id=1927108, for that problem.

Do we need to add a friendly warning instead of a core dump for this situation?

(In reply to Xujun Ma from comment #10)
> I have filed a new bug, https://bugzilla.redhat.com/show_bug.cgi?id=1927108,
> for that problem.

Thanks!

> Do we need to add a friendly warning instead of a core dump for this
> situation?

Yes. Crashing instead of printing a more friendly error message after hitting the limit is a bug, but not a major one. It is certainly not a regression.

Note that a newer machine requiring more resources than older machines is not a regression. "pseries-rhel8.4.0" and "pc-q35-rhel8.4.0" are expected to increase the number of virtio queues depending on the number of VCPUs, and will require higher open file limits.
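As a rough illustration of the numbers involved (assumed figures, not measurements from this report): on these machine types the virtio-scsi controller's num_queues follows the vCPU count, each queue wants its own ioeventfd, and each vCPU has its own KVM fd, so a -smp 384 guest gets close to the default RHEL soft limit of 1024 open files before block and character devices are even counted. A sketch of checking and raising the limit for a manual qemu-kvm run such as the reproducer above (libvirt or systemd deployments would configure this through their own settings instead):

```sh
# Sketch only; values are illustrative, not taken from the bug report.
ulimit -n              # current soft limit (1024 by default on RHEL)
ulimit -Hn             # hard limit

# Raise the soft limit in this shell, then launch the guest as before
ulimit -n 65536
./cmd.bak

# Inspect the limit of an already-running QEMU process
prlimit --nofile --pid "$(pidof qemu-kvm)"
```

Alternatively, the descriptor count can be reduced by explicitly capping the controller's queues (for example num_queues=4 on the virtio-scsi-pci device), at the cost of multiqueue scalability.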
(In reply to Xujun Ma from comment #0)
> Expected results:
> Boot up guest successfully.
> Additional info:
> Can solve this issue by increasing file limit,but there will be another
> problem:guest stop at "Trying to load: from:
> /pci@800000020000000/scsi@4/disk@100000000000000 ... Successfully loaded"
> about 3:18 minutes,I think it's not acceptable.and please help add friendly
> message if need more file limit.

There are three different problems described in the paragraph above:
1) The default configuration needs to be manually changed to run larger VMs.
2) A request to print a more friendly error message if the limit is too low.
3) A report that boot is stuck (or slow) after manually increasing the file descriptor limit.

The scope of this BZ needs to be clearly defined. Please clarify which of the problems above is being tracked by this BZ.

If I understand correctly, item #3 is being tracked at bug 1927108 and is out of the scope of this BZ.

(In reply to Eduardo Habkost from comment #12)
> The scope of this BZ needs to be clearly defined. Please clarify which of
> the problems above is being tracked by this BZ.
>
> If I understand correctly, item #3 is being tracked at bug 1927108 and is
> out of the scope of this BZ.

Yes, you are right. I think we should at least add a friendly error message so that users know how to handle this kind of situation.
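Purely to illustrate the kind of friendlier failure being requested here, a standalone sketch (not an actual or proposed QEMU patch; the message text and helper name are made up for the example):

```c
/* Illustration only: turn EMFILE (-24, "Too many open files") into an
 * actionable message instead of an abort. Not QEMU source. */
#include <errno.h>
#include <stdio.h>

static void report_notifier_failure(int err)
{
    fprintf(stderr, "virtio-scsi: failed to set host notifier (%d)\n", err);
    if (err == -EMFILE) {
        fprintf(stderr, "hint: the process hit its open file limit; raise it "
                        "(e.g. ulimit -n) or reduce the number of virtqueues\n");
    }
}

int main(void)
{
    report_notifier_failure(-EMFILE);
    return 0;
}
```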
(In reply to Xujun Ma from comment #13)
> Yes, you are right. I think we should at least add a friendly error message
> so that users know how to handle this kind of situation.

If this BZ is just about making the error message friendlier (#2), it is not a regression and priority/severity shouldn't be high.

Item #1 above could be tracked in a separate BZ, but I don't believe it is a bug (a new machine type requiring more resources to run is not a regression).

The following upstream change fixes the crash:

commit 6f1a5c37db5a6fc7c5c44b3e45cee6e33df31e9d
Author: Maxim Levitsky <mlevitsk>
Date:   Thu Dec 17 17:00:38 2020 +0200

    virtio-scsi: don't process IO on fenced dataplane

    If virtio_scsi_dataplane_start fails, there is a small window when it drops the
    aio lock (in aio_wait_bh_oneshot) and the dataplane's AIO handler can
    still run during that window.

    This is done after the dataplane was marked as fenced, thus we use this flag
    to avoid it doing any IO.

    Signed-off-by: Maxim Levitsky <mlevitsk>
    Message-Id: <20201217150040.906961-2-mlevitsk>
    Signed-off-by: Paolo Bonzini <pbonzini>

QEMU now falls back to running in a degraded (slower) mode instead.

This is indicated by the following warning:

virtio-scsi: Failed to set host notifier (-24)
qemu-system-ppc64: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).

Unfortunately, the same warning is printed for each queue and floods the monitor. I'll post a patch for that.

(In reply to Greg Kurz from comment #15)
> Unfortunately, the same warning is printed for each queue and floods the
> monitor. I'll post a patch for that.

Reducing the flood isn't that trivial and it is just a nice-to-have. The above commit is enough to fix the current bug. Let's move forward.
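To make the effect of that commit concrete, here is a self-contained model of the idea in the same style as the earlier sketch (the fenced flag mirrors the commit message; this is a simplification, not the upstream diff, and the real handlers also deal with AioContext locking, omitted here):

```c
/* Simplified model of "virtio-scsi: don't process IO on fenced dataplane".
 * Not QEMU source: types and functions are stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct scsi_dev {
    bool dataplane_started;
    bool dataplane_fenced;   /* set when dataplane startup failed */
};

static void handle_cmd(struct scsi_dev *s)
{
    /* The fix: a fenced dataplane ignores the spurious notification that
     * the rollback path generates instead of asserting dataplane_started. */
    if (s->dataplane_fenced) {
        return;
    }
    assert(s->dataplane_started);
    puts("request handled in dataplane");
}

static int dataplane_start(struct scsi_dev *s, bool notifier_setup_fails)
{
    if (notifier_setup_fails) {
        s->dataplane_fenced = true;
        handle_cmd(s);    /* rollback still kicks the handler; now a no-op */
        return -1;        /* caller falls back to userspace emulation */
    }
    s->dataplane_started = true;
    return 0;
}

int main(void)
{
    struct scsi_dev s = { 0 };
    dataplane_start(&s, true);   /* hitting the fd limit no longer aborts */
    return 0;
}
```

With the handler fenced, a failed dataplane start degrades into the "Fallback to userspace (slower)" path seen in the verification log below instead of an abort.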
Upstream fix already present in qemu-6.0. Marked as TestOnly and moved directly to ON_QA.

Boot up guest successfully with 384 vCPUs. The bug has been fixed in this build. Booting log:

Trying to load:  from: /pci@800000020000000/scsi@4/disk@100000000000000 ...
qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
virtio-scsi: Failed to set host notifier (-24)
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
Successfully loaded
qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
virtio-scsi: Failed to set host notifier (-24)
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).
qemu-kvm: virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
virtio-scsi: Failed to set host notifier (-24)
qemu-kvm: virtio_bus_start_ioeventfd: failed. Fallback to userspace (slower).

Based on the test result above, setting the bug to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684