Bug 1370356

Summary: [ppc64le][data-plane] qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24

Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.3
Hardware: ppc64le
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Reporter: Zhengtong <zhengtli>
Assignee: Thomas Huth <thuth>
QA Contact: Virtualization Bugs <virt-bugs>
CC: knoel, mdeng, qzhang, thuth, virt-maint, zhengtli
Type: Bug
Last Closed: 2016-09-06 06:52:02 UTC

Description Zhengtong 2016-08-26 03:15:50 UTC
Description of problem:
Boot a guest with data-plane enabled for a large number of virtio-scsi devices with multifunction=on. During boot, QEMU prints many error messages:

"""
...
qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24
virtio-scsi: Failed to set guest notifiers (-24), ensure -enable-kvm is set
qemu-kvm: virtio_pci_start_ioeventfd: failed. Fallback to a userspace (slower).
qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24
virtio-scsi: Failed to set guest notifiers (-24), ensure -enable-kvm is set
qemu-kvm: virtio_pci_start_ioeventfd: failed. Fallback to a userspace (slower).
qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24
virtio-scsi: Failed to set guest notifiers (-24), ensure -enable-kvm is set
qemu-kvm: virtio_pci_start_ioeventfd: failed. Fallback to a userspace (slower).
qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24
virtio-scsi: Failed to set guest notifiers (-24), ensure -enable-kvm is set
...
"""
Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.6.0-22.el7
Host kernel: 3.10.0-493.el7.ppc64le
Both RHEL 7 and RHEL 6 guests hit the issue.

How reproducible:
3/3

Steps to Reproduce:
1. Boot the guest with data-plane and virtio-scsi-pci devices; the full command line is in the "Additional info" section.

2. Watch the qemu-kvm output.


Actual results:
Many error messages:
"""
qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24
virtio-scsi: Failed to set guest notifiers (-24), ensure -enable-kvm is set
qemu-kvm: virtio_pci_start_ioeventfd: failed. Fallback to a userspace (slower).
qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24
virtio-scsi: Failed to set guest notifiers (-24), ensure -enable-kvm is set
qemu-kvm: virtio_pci_start_ioeventfd: failed. Fallback to a userspace (slower).
qemu-kvm: virtio_pci_set_host_notifier_internal: unable to init event notifier: -24
virtio-scsi: Failed to set guest notifiers (-24), ensure -enable-kvm is set
"""

Expected results:
No such error messages.

Additional info:
1. The guest boots successfully, but ioeventfd falls back to the slower userspace path, as stated in the error messages.
2. The problem did not occur with virtio-blk-pci devices.
3. These error messages were not seen on x86_64, though the guest sometimes got stuck during boot there; that hang did not appear to be in qemu.

boot guest cmd:

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -enable-kvm \
    -sandbox off  \
    -machine pseries  \
    -nodefaults  \
    -vga std  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20160823-051250-dVXrvtSr,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20160823-051250-dVXrvtSr,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control  \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20160823-051250-dVXrvtSr,server,nowait \
    -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 \
    -device pci-ohci,id=usb1,bus=pci.0,addr=03 \
    -device virtio-net-pci,mac=9a:80:81:82:83:84,id=idza3wdN,vectors=4,netdev=idR2v8Fj,bus=pci.0,addr=04 \
    -netdev tap,id=idR2v8Fj,vhost=on \
    -m 8192  \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -device usb-kbd \
    -device usb-mouse \
    -vnc :0  \
    -rtc base=utc,clock=host  \
    -qmp tcp:0:4445,server,nowait \
    -object iothread,id=iothread0 \
    -device virtio-scsi-pci,iothread=iothread0,id=virtio_scsi_pci0,bus=pci.0,addr=0x1f \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,werror=stop,rerror=stop,file=/home/RHEL-Server-7.3-ppc64-virtio.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0 \
    -object iothread,id=iothread1 \
    -monitor stdio \
    -device virtio-scsi-pci,iothread=iothread1,id=virtio_scsi_pci1,bus=pci.0,addr=06 \
    -drive id=drive_image2,if=none,snapshot=off,aio=native,cache=none,format=qcow2,werror=stop,rerror=stop,file=/home/data1.qcow2 \
    -device scsi-hd,id=image2,drive=drive_image2,bus=virtio_scsi_pci1.0 \
    -drive file=/home/images/u_1,if=none,id=drive_1,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_1,iothread=iothread1,multifunction=on,addr=7.1 -device scsi-hd,id=scsi_test_1,drive=drive_1,bus=virt_1.0 \
    -drive file=/home/images/u_2,if=none,id=drive_2,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_2,iothread=iothread1,multifunction=on,addr=7.2 -device scsi-hd,id=scsi_test_2,drive=drive_2,bus=virt_2.0 \
    -drive file=/home/images/u_3,if=none,id=drive_3,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_3,iothread=iothread1,multifunction=on,addr=7.3 -device scsi-hd,id=scsi_test_3,drive=drive_3,bus=virt_3.0 \
    -drive file=/home/images/u_4,if=none,id=drive_4,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_4,iothread=iothread1,multifunction=on,addr=7.4 -device scsi-hd,id=scsi_test_4,drive=drive_4,bus=virt_4.0 \
...
-drive file=/home/images/u_164,if=none,id=drive_164,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_164,iothread=iothread1,multifunction=on,addr=1e.4 -device scsi-hd,id=scsi_test_164,drive=drive_164,bus=virt_164.0 \
    -drive file=/home/images/u_165,if=none,id=drive_165,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_165,iothread=iothread1,multifunction=on,addr=1e.5 -device scsi-hd,id=scsi_test_165,drive=drive_165,bus=virt_165.0 \
    -drive file=/home/images/u_166,if=none,id=drive_166,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_166,iothread=iothread1,multifunction=on,addr=1e.6 -device scsi-hd,id=scsi_test_166,drive=drive_166,bus=virt_166.0 \
    -drive file=/home/images/u_167,if=none,id=drive_167,format=raw,cache=none,aio=native -device virtio-scsi-pci,id=virt_167,iothread=iothread1,multifunction=on,addr=1e.7 -device scsi-hd,id=scsi_test_167,drive=drive_167,bus=virt_167.0 \


PS: I had to set the address of the boot device's controller to "0x1f"; otherwise the guest would not boot because it could not find the boot device...

Comment 4 Thomas Huth 2016-08-30 07:24:02 UTC
This sounds like a duplicate of BZ 1271060 - please use "ulimit -n xxx" to increase the number of file descriptors. However, I wonder why x86 works differently here ... do we need more file descriptors on ppc than on x86?
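
For context, error code -24 on Linux is EMFILE ("Too many open files"); every ioeventfd is backed by an eventfd file descriptor, so a guest with roughly 167 virtio-scsi controllers can easily exhaust the default per-process limit of 1024. A minimal sketch of checking and raising the soft limit in the same shell that launches QEMU (the value 8192 is illustrative):

    # Show the current soft limit on open file descriptors (commonly 1024)
    ulimit -n
    # Raise the limit for this shell; qemu-kvm inherits it as a child process
    ulimit -n 8192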

Comment 5 Zhengtong 2016-08-30 08:44:30 UTC
hi Thomas, 

Do you have a suggested value for ulimit?

I set the fd limit to 81920 with "ulimit -n 81920" and then ran the guest boot command, but the error messages still appeared.

Comment 6 Thomas Huth 2016-08-30 09:31:17 UTC
81920 sounds like plenty, so it is strange that it is still not working with this value. When I get some spare minutes, I'll try to take a closer look at why this is not sufficient in this case...

Comment 7 Thomas Huth 2016-09-02 21:03:59 UTC
FWIW, I can reproduce the problem with the following "shortened" command line:

/usr/libexec/qemu-kvm -nographic -vga none -enable-kvm -object iothread,id=iothread1 `for ((x=1;x<160;x++)); do echo " -drive file=/scratch/disk$x.qcow2,if=none,id=drive_$x,format=qcow2,cache=none,aio=native -device virtio-scsi-pci,id=virt_$x,iothread=iothread1,multifunction=on,addr=\`printf "%X.%X" $(($x / 8)) $(($x % 8))\` -device scsi-hd,id=scsi_test_$x,drive=drive_$x,bus=virt_$x.0" ; done`
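
For readability, here is the same reproducer expanded into a script (a sketch equivalent to the one-liner above, using the same /scratch/disk$x.qcow2 image paths):

    #!/bin/sh
    # Attach 159 virtio-scsi-pci controllers (multifunction=on), each with
    # one scsi-hd disk, to a single qemu-kvm invocation.
    args=""
    for x in $(seq 1 159); do
        addr=$(printf "%X.%X" $((x / 8)) $((x % 8)))
        args="$args -drive file=/scratch/disk$x.qcow2,if=none,id=drive_$x,format=qcow2,cache=none,aio=native"
        args="$args -device virtio-scsi-pci,id=virt_$x,iothread=iothread1,multifunction=on,addr=$addr"
        args="$args -device scsi-hd,id=scsi_test_$x,drive=drive_$x,bus=virt_$x.0"
    done
    exec /usr/libexec/qemu-kvm -nographic -vga none -enable-kvm \
        -object iothread,id=iothread1 $args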

Comment 8 Thomas Huth 2016-09-02 21:30:01 UTC
Zhengtong, for me, the error messages go away when I use "ulimit -n 81920" ... did you by any chance run the qemu-kvm program under a different user ID than the one where you set the ulimit? If not, could you please attach the whole command line that you use to run qemu, in case I missed something in my "shortened" version? Thanks!
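
One way to see which limit is actually applied is to inspect the running qemu-kvm process directly (a sketch; assumes a single qemu-kvm process is running):

    # Show the fd limits the kernel enforces for the running qemu-kvm
    pid=$(pgrep -o qemu-kvm)
    grep 'open files' /proc/$pid/limits
    # Count the fds the process currently holds open
    ls /proc/$pid/fd | wc -l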

Comment 9 Zhengtong 2016-09-05 03:09:42 UTC
Thomas, I always run the qemu-kvm program as "root". The full command is in comment #c0 of this bug.

I will run the test again to confirm, and will update here if the result differs from my previous test.

Comment 10 Thomas Huth 2016-09-05 08:57:13 UTC
Ok, thanks for confirming that you're running everything as "root" (I thought you might be using libvirt here, since libvirt runs the qemu binary as a different user, but it sounds like you're running qemu-kvm directly, without libvirt, right?).

Another thing to check: what is your global maximum number of file handles? Could you please run the following command and post the result here:

 sysctl fs.file-max

If that value is very low, please try to increase it with

 sysctl -w fs.file-max=...

and run the test again.
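
If the value does turn out to be low, a sketch of raising it persistently (the value 500000 is illustrative):

    # Raise the system-wide open-file limit immediately
    sysctl -w fs.file-max=500000
    # Persist the setting across reboots
    echo 'fs.file-max = 500000' >> /etc/sysctl.conf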

Comment 11 Zhengtong 2016-09-06 02:42:10 UTC
Thomas, the host I used to reproduce the issue has been released, so I reserved another host yesterday. On this host I can reproduce the issue with the default ulimit value, but it disappears after I set "ulimit -n 81920". The issue did not come back after several retries, so there may have been some wrong configuration in my previous test.

All in all, I think your original analysis is correct. Please change the bug status or provide the resolution as you see fit, thanks.

Yes, I always boot the guest by running the qemu-kvm program directly, without libvirt.

Comment 12 Thomas Huth 2016-09-06 06:52:02 UTC
OK, thanks for checking again! I assume that fs.file-max was set to a low value on your original host for some reason, and was back to a proper value on the second host you tried. It now sounds like this is the very same problem as in BZ 1271060, so I'm closing this one as a duplicate.

*** This bug has been marked as a duplicate of bug 1271060 ***