Bug 1429551
Summary: | Default FD limit prevents running more than ~470 VMs | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Eldad Marciano <emarcian> | |
Component: | libvirt | Assignee: | Laine Stump <laine> | |
Status: | CLOSED ERRATA | QA Contact: | chhu | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 7.3 | CC: | berrange, dyuan, emarcian, jsuchane, mgoldboi, mtessun, nsoffer, pkrempa, rbalakri, xuzhang, yalzhang, ykaul | |
Target Milestone: | rc | Keywords: | Performance, Upstream, ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-3.2.0-1.el7 | Doc Type: | Bug Fix | |
Doc Text: |
On very large installations, libvirt could fail to start some guests, giving a "Too many open files" error. The default open file handle limits have been substantially increased for the libvirtd, virtlockd, and virtlogd processes, allowing for several thousand guests on a host. If these generous limits are exceeded, they can be increased further in the systemd service files for the processes.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1442043 | Environment: | |
Last Closed: | 2017-08-01 17:24:15 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1442043 |
Description
Eldad Marciano
2017-03-06 15:21:17 UTC
adding kernel version:
3.10.0-514.10.2.el7.x86_64

RHEL version:
Red Hat Enterprise Linux Server release 7.3 (Maipo)

Our configuration file states the following on increasing the file limit:

from /etc/sysconfig/libvirtd:
# Override the maximum number of opened files.
# This only works with traditional init scripts.
# In the systemd world, the limit can only be changed by overriding
# LimitNOFILE for libvirtd.service. To do that, just create a *.conf
# file in /etc/systemd/system/libvirtd.service.d/ (for example
# /etc/systemd/system/libvirtd.service.d/openfiles.conf) and write
# the following two lines in it:
# [Service]
# LimitNOFILE=2048

Is this what you are looking for?

There's probably an argument to be made that the default libvirtd.service script should raise the number of FDs to a much larger number. E.g. I could see us raising it to something like 65536 by default - even if libvirtd misbehaves and mistakenly creates that many FDs when it shouldn't, the resource usage associated with it is negligible compared to the resource usage of even a single VM we launch.

(In reply to Peter Krempa from comment #3)
> Our configuration file states the following on increasing the file limit:
>
> from /etc/sysconfig/libvirtd:
> # Override the maximum number of opened files.
> # This only works with traditional init scripts.
> # In the systemd world, the limit can only be changed by overriding
> # LimitNOFILE for libvirtd.service. To do that, just create a *.conf
> # file in /etc/systemd/system/libvirtd.service.d/ (for example
> # /etc/systemd/system/libvirtd.service.d/openfiles.conf) and write
> # the following two lines in it:
> # [Service]
> # LimitNOFILE=2048
>
>
> Is this what you are looking for?

Probably yes, and per comment #4, raising that number by default could be a good and quick solution. I just wonder if we should invest more time and effort in understanding whether it is an FD leak or not.
Please advise.

oVirt configures libvirt, so we can add our own systemd libvirt drop-in to raise the number of open files, but I think the right place to fix this is in the libvirt defaults, so all users enjoy a better default.

(In reply to Eldad Marciano from comment #5)
> Probably yes,
> and per to comment #4 raise that number by default could be a good and quick
> solution, I just wonder if we should invest more time and effort for better
> understanding whether it is a FD's leak or not.

I don't think there's any leak there - libvirtd requires at least two file descriptors per VM & you have 470 VMs, so that'll be getting very close to the default 1024 limit.

(In reply to Daniel Berrange from comment #7)
> (In reply to Eldad Marciano from comment #5)
> > Probably yes,
> > and per to comment #4 raise that number by default could be a good and quick
> > solution, I just wonder if we should invest more time and effort for better
> > understanding whether it is a FD's leak or not.
>
> I don't think there's any leak there - libvirtd requires at least two file
> descriptors per VM & you have 470 VMs, so that'll be getting very close to
> the default 1024 limit.

Hmm, makes sense. Nir, do we already know that each VM takes at least 2 file descriptors?
Either way, as Nir suggested, let's increase that default number in libvirt.

Can you provide 'lsof -p $(pgrep libvirtd)' for a machine with many RHV guests, so we can confirm that the FDs libvirt has open match expectations (e.g. 1 for the QEMU monitor and 1 for the QEMU agent).

This patch proposes setting limits to allow for approx 4096 guests to run per host, with a "typical" file handle usage scenario:

https://www.redhat.com/archives/libvir-list/2017-March/msg00678.html

The commit was pushed upstream:

commit 27cd76350021d36b9bd8b187ce5c8919659e3806
Author: Daniel P. Berrange <berrange>
Date: Wed Mar 15 16:51:51 2017 +0000

Increase default file handle limits for daemons

Linux still defaults to a 1024 open file handle limit.
This causes scalability problems for libvirtd / virtlockd / virtlogd on large hosts which might want > 1024 guest to be running. In fact if each guest needs > 1 FD, we can't even get to 500 guests. This is not good enough when we see machines with 100's of physical cores and TBs of RAM.

In comparison to other memory requirements of libvirtd & related daemons, the resource usage associated with open file handles is essentially line noise. It is thus reasonable to increase the limits unconditionally for all installs.

Verified on packages:
qemu-kvm-rhev-2.9.0-3.el7.x86_64
libvirt-3.2.0-4.el7.x86_64
kernel-3.10.0-663.el7.x86_64

1. Check the configuration files: PASS
#cat /usr/lib/systemd/system/libvirtd.service| grep LimitNOFILE
LimitNOFILE=8192
#cat /usr/lib/systemd/system/virtlogd.service| grep LimitNOFILE
LimitNOFILE=8192
#cat /usr/lib/systemd/system/virtlockd.service| grep LimitNOFILE
LimitNOFILE=16384

2. Check the service limit settings: PASS
#cat /proc/`pidof libvirtd`/limits| grep open
Max open files 8192 8192 files
#cat /proc/`pidof virtlogd`/limits| grep open
Max open files 8192 8192 files
#cat /proc/`pidof virtlockd`/limits| grep open
Max open files 16384 16384 files

3. Check the virtlockd limit: 16384: PASS
Try to start 4100 guests with 4 disks in each guest, with the virtlockd service enabled.
Hit error "Too many open files" when near the virtlockd limit: 16384

Settings:
1) Edit /etc/libvirt/qemu.conf
lock_manager = "lockd"
max_processes = 65535
max_files = 65535
2) Edit /etc/libvirt/qemu-lockd.conf
auto_disk_leases = 1
file_lockspace_dir = "/var/lib/libvirt/lockd/files"
3) Edit max_clients in /etc/libvirt/virtlockd.conf
max_clients = 16385
4) #ulimit -n 65535
5) Restart service
#systemctl start virtlockd
#systemctl restart libvirtd

Steps:
1) Try to define and start 4100 guests.
Hit error "Too many open files" when near the virtlockd limit: 16384
#virsh start r7_test3275
error: Failed to start domain r7_test3275
error: Unable to open/create resource /var/lib/libvirt/lockd/files/403e582c39cc88f45b039d796a9cc4ae9175dce9b5323667562d14543436cbac: Too many open files

#ls /proc/`pidof libvirtd`/fd/ | wc -l
3296
#ls /proc/`pidof virtlogd`/fd/ | wc -l
6558
#ls /proc/`pidof virtlockd`/fd/ | wc -l
16381

4. Check the virtlogd limit: 8192: PASS
Try to start 2058 guests with the xml below, with 1 disk in each guest; hit error in libvirtd.log when near the virtlogd limit 8192.
"error : virNetClientProgramDispatchError:177 : Unable to duplicate FD 8190: Too many open files"

1) xml with serial log and guest agent.
<serial type='file'>
<source path='/var/log/libvirt/serial_##.log' append='off'/>
<target port='0'/>
</serial>
<serial type='pty'>
<target port='0'/>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
<channel type='unix'>
<target type='virtio' name='org.qemu.guest_agent.0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>

#virsh start r7_test2045
error: Failed to start domain r7_test2045
error: Unable to duplicate FD 8190: Too many open files

2) check the limit, near virtlogd limit: 8192
#ls /proc/`pidof virtlogd`/fd/ | wc -l
8186
#ls /proc/`pidof libvirtd`/fd/ | wc -l
2067
#ls /proc/`pidof virtlockd`/fd/ | wc -l
4099

5. Check libvirtd limit: 8192: PASS
Try to start 8192 guests, with 1 disk in each guest; hit error in libvirtd.log when near the libvirtd limit: 8192.
"error : getDevNull:401 : cannot open /dev/null: Too many open files"

Settings:
1) Edit the virtlogd limit to 65535
#cat /usr/lib/systemd/system/virtlogd.service| grep LimitNOFILE
#LimitNOFILE=8192
LimitNOFILE=65535
2) check the limits:
#cat /proc/`pidof virtlockd`/limits| grep open
Max open files 16384 16384 files
#cat /proc/`pidof virtlogd`/limits| grep open
Max open files 65535 65535 files
#cat /proc/`pidof libvirtd`/limits| grep open
Max open files 8192 8192 files

Steps:
1) Try to start the 8192 guests with a script. Failed to start the VM, and hit error "Too many open files" in libvirtd.log when near the libvirtd limit: 8192
--------------------------------------------------------------------
error : getDevNull:401 : cannot open /dev/null: Too many open files
debug : virCommandRunAsync:2451 : Command result -1, with PID -1
debug : virFileClose:110 : Closed fd 8189
debug : virFileClose:110 : Closed fd 8190
debug : qemuProcessLaunch:5700 : QEMU vm=0x7fd689753a70 name=r7_test8161 failed to spawn
debug : qemuProcessLaunch:5703 : Writing early domain status to disk
debug : virFileMakePathHelper:2912 : path=/var/run/libvirt/qemu mode=0777
debug : virFileClose:110 : Closed fd 8189
debug : qemuProcessLaunch:5707 : Waiting for handshake from child
error : virCommandHandshakeWait:2672 : internal error: invalid use of command API
---------------------------------------------------------------
2) check the libvirtd open files near the limit 8192.
#ls /proc/`pidof libvirtd`/fd/ |wc -l
8183
#ls /proc/`pidof virtlockd`/fd/ |wc -l
16331
#ls /proc/`pidof virtlogd`/fd/ |wc -l
16330
3) virsh list --all
check 8160 guests are running.
No error in: /var/log/libvirt/qemu/r7_test8161.log

According to comments 20, 21 and 22, set the bug status to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
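
The verification comments in this report compare each daemon's open FD count against its limit by hand, using /proc/PID/fd and /proc/PID/limits. A minimal sketch of the same check as a script follows. It assumes a Linux /proc filesystem and that `pidof` is installed; the function name `fd_headroom` is made up for illustration and is not part of libvirt.

```shell
#!/bin/sh
# Hypothetical helper (not part of libvirt): print the open-FD count and the
# soft "Max open files" limit for one PID, read from /proc as in the
# verification steps in this report.
fd_headroom() {
    pid=$1
    # Each entry in /proc/PID/fd is one open file descriptor.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # In /proc/PID/limits the soft limit is the 4th field of the
    # "Max open files" row (the 5th field is the hard limit).
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$pid open=$open soft_limit=$soft"
}

# Report on each daemon that is currently running; daemons that are not
# running produce no output.
for daemon in libvirtd virtlogd virtlockd; do
    for pid in $(pidof "$daemon" 2>/dev/null); do
        printf '%s ' "$daemon"
        fd_headroom "$pid"
    done
done
```

Reading /proc/PID/fd for a daemon owned by root generally requires running the script as root; without permission the open count is reported as 0.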