Bug 1450960
Summary: | Libvirt's nodedev udev backend reports 'udev_monitor_receive_device returned NULL' when a great amount of device 'add' events are emitted by udev
---|---
Product: | Red Hat Enterprise Linux 7
Reporter: | Ning Bo <ning.bo9>
Component: | libvirt
Assignee: | Erik Skultety <eskultet>
Status: | CLOSED ERRATA
QA Contact: | yafu <yafu>
Severity: | unspecified
Docs Contact: |
Priority: | unspecified
Version: | 7.3
CC: | dyuan, eskultet, libvirt-maint, mzhan, ning.bo9, rbalakri, xuzhang, yafu, yalzhang
Target Milestone: | rc
Target Release: | ---
Hardware: | x86_64
OS: | Linux
Whiteboard: |
Fixed In Version: | libvirt-3.2.0-7.el7
Doc Type: | If docs needed, set a value
Doc Text: |
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2017-08-02 00:08:25 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Attachments: |
Description
Ning Bo
2017-05-15 13:05:10 UTC
Created attachment 1278999 [details]
a simple fix to change the buffer size of udev monitor
The reason is that the udev monitor's default buffer size cannot hold all the
udev events reported by the kernel.
So we need to change the buffer size so that we can receive as many events as
possible within a short time.
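
As a rough illustration of the buffer-size point above, the following Python sketch requests a receive buffer size on a socket and reads back what the kernel actually granted. It is not the attached patch, which changes the udev monitor in libvirt's C code. The UDP socket here is only a stand-in for the udev netlink socket, and `effective_rcvbuf` is a hypothetical helper name. On Linux, an unprivileged `SO_RCVBUF` request is capped by `net.core.rmem_max`, and the value read back is double the granted size because the kernel reserves room for bookkeeping.

```python
import socket


def effective_rcvbuf(requested: int) -> int:
    """Request a receive buffer of `requested` bytes and return what was granted.

    Assumption: a plain UDP socket stands in for the udev netlink socket;
    the SO_RCVBUF option behaves the same way for both on Linux.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # The kernel may silently clamp this request to its configured maximum.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        # Read back the size the kernel actually applied.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    for size in (64 * 1024, 1024 * 1024):
        print(f"requested {size} bytes, kernel reports {effective_rcvbuf(size)}")
```

If the reported size stops growing as the request grows, the request has hit the system limit, which is the situation in which a burst of events can overflow the buffer.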
(In reply to Ning Bo from comment #1)
> Created attachment 1278999 [details]
> a simple fix to change the buffer size of udev monitor
>
> The reason is that the udev monitor's default buffer size cannot hold all the
> udev events reported by the kernel.
> So we need to change the buffer size so that we can receive as many events as
> possible within a short time.

There was an upstream discussion of your patch [1]. Would you send a v2 containing a check for privileges, or shall I do that? I'm trying to reproduce it in the meantime and get more information out of libudev. Although I can reproduce it in only about 75-80% of cases, I haven't had much luck with libudev so far.

[1] https://www.redhat.com/archives/libvir-list/2017-May/msg00264.html

(In reply to Erik Skultety from comment #3)
> There was an upstream discussion of your patch [1]. Would you send a v2
> containing a check for privileges, or shall I do that? I'm trying to
> reproduce it in the meantime and get more information out of libudev.
> Although I can reproduce it in only about 75-80% of cases, I haven't had
> much luck with libudev so far.
>
> [1] https://www.redhat.com/archives/libvir-list/2017-May/msg00264.html

I'm not familiar with privileges, sorry about that. Would you help me improve the patch?

I confirmed with GDB that we indeed get ENOBUFS from libudev, and I sent a v2 of your patch to the list [1].

[1] https://www.redhat.com/archives/libvir-list/2017-May/msg00732.html

Fixed upstream by:

commit d1eea6c12aad5cb503562a52915138bf0d0a70a2
Refs: v3.4.0-rc1-2-gd1eea6c12
Author:     ning.bo <ning.bo9.cn>
AuthorDate: Tue May 9 10:09:07 2017 +0800
Commit:     Erik Skultety <eskultet>
CommitDate: Mon May 29 15:57:04 2017 +0200

    nodedev: Increase the netlink socket buffer size to the one used by udev

    When a number of SRIOV VFs (up to 128 on Intel XL710) is created:

        for i in `seq 0 1`; do
          echo 63 > /sys/class/net/<interface>/device/sriov_numvfs
        done

    libvirtd will then report "udev_monitor_receive_device returned NULL"
    error because the netlink socket buffer is not big enough (using GDB on
    libudev confirmed this with ENOBUFS) and thus some udev events were
    dropped. This results in some devices being missing in the nodedev-list
    output. This patch overrides the system's rmem_max limit but for that,
    we need to make sure we've got root privileges.

    https://bugzilla.redhat.com/show_bug.cgi?id=1450960

    Signed-off-by: ning.bo <ning.bo9.cn>
    Signed-off-by: Erik Skultety <eskultet>

Reproduced with libvirt-3.2.0-5.el7.

Verified with:
libvirt-3.2.0-13.el7.x86_64
systemd-219-41.el7.x86_64
kernel-3.10.0-682.el7.x86_64

Test steps (tested on the ixgbe and i40e drivers):
1. Create 126 VFs in a short time:
# for i in `seq 1 2`; do echo 63 > /sys/class/net/p7p$i/device/sriov_numvfs; done

2. Check the syslog (libvirtd warnings and errors go to the syslog by default):
# cat /var/log/messages | grep "udev_monitor"
No errors are output.

3. Because the netlink socket buffer size is increased to the one used by udev only for a privileged user, a libvirtd started by a non-privileged user still logs these errors, which is the expected result.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
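
The log check in test step 2 above can also be sketched in Python, which may be handy when the check is repeated across several test runs. This is only a sketch: `count_udev_monitor_errors` and `scan_log` are hypothetical helper names, and the default path simply reuses the `/var/log/messages` location from the test steps. It searches for the full error text from the bug summary, whereas the `grep` in step 2 matches the shorter `udev_monitor` pattern.

```python
from typing import Iterable

# Error text taken from the bug summary.
ERROR_TEXT = "udev_monitor_receive_device returned NULL"


def count_udev_monitor_errors(lines: Iterable[str], needle: str = ERROR_TEXT) -> int:
    """Count the lines that contain the udev monitor error message."""
    return sum(1 for line in lines if needle in line)


def scan_log(path: str = "/var/log/messages") -> int:
    """Count matching lines in a log file (reading it usually needs root)."""
    # errors="replace" keeps the scan going if the log holds undecodable bytes.
    with open(path, errors="replace") as log:
        return count_udev_monitor_errors(log)


if __name__ == "__main__":
    print(f"{scan_log()} matching line(s) found")
```

A count of zero after creating the VFs corresponds to the "no errors" result expected for a libvirtd started by a privileged user.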