Bug 1429880
Summary: | keepalived high number of close syscalls | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Jaroslav Reznik <jreznik> |
Component: | keepalived | Assignee: | Ryan O'Hara <rohara> |
Status: | CLOSED ERRATA | QA Contact: | Brandon Perkins <bperkins> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.2 | CC: | aos-bugs, bmchugh, cluster-maint, csochin, dlbewley, erich, jruemker, mnavrati, nkim, rhowe, rmanes, rohara, yann.morice |
Target Milestone: | rc | Keywords: | ZStream |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | keepalived-1.2.13-9.el7_3 | Doc Type: | Bug Fix |
Doc Text: | Previously, the keepalived utility attempted to close a large number of file descriptors each time a notification script was invoked. As a consequence, keepalived generated many unnecessary close() system calls in an attempt to close file descriptors that were not open. This bug has been fixed by using the SOCK_CLOEXEC flag when opening all sockets, and the FD_CLOEXEC flag when opening all file descriptors. As a result, the number of close() system calls is no longer excessive. |
Story Points: | --- |
Clone Of: | 1324594 | Environment: | |
Last Closed: | 2017-05-25 15:37:28 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1324594 | ||
Bug Blocks: |
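
The Doc Text above describes the fix in terms of close-on-exec flags. The following is a minimal sketch of that technique, not the actual keepalived patch: the helper names (open_cloexec_socket, set_cloexec, fd_is_cloexec) are hypothetical, and the socket type is chosen only for illustration. It shows a socket opened with SOCK_CLOEXEC and an existing descriptor marked with FD_CLOEXEC, so that the kernel closes both on exec() and the parent does not need to loop over descriptor numbers calling close().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* ask glibc for Linux extensions such as SOCK_CLOEXEC */
#endif
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd has FD_CLOEXEC set, 0 if not, -1 on error. */
int fd_is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Hypothetical helper: open a socket that is close-on-exec from creation.
 * AF_UNIX/SOCK_DGRAM is an arbitrary choice for this sketch. */
int open_cloexec_socket(void)
{
    return socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
}

/* Hypothetical helper: mark an already-open descriptor close-on-exec.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```

A caller would use open_cloexec_socket() in place of a plain socket() call, and set_cloexec() on descriptors obtained from calls that cannot take a close-on-exec flag at creation time.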
Description
Jaroslav Reznik
2017-03-07 11:35:54 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1305

Hi,

It seems that this patch of keepalived (1.2.13-9.el7_3) has introduced a regression. In an OpenStack cloud, keepalived instances of HA routers began to create a lot of pipes until they ran out of file descriptors, e.g.:

    Keepalived_vrrp[xxx]: Netlink: Cannot open netlink socket : (Too many open files).

This was not the case in the previous version (1.2.13-8.el7) => only two pipes...

I suspect this is caused by the SIGHUP signal sent on configuration changes (adding a floating IP), as a similar router with no config changes stays OK.

In fact, I think the bug was introduced in lib/signals.c. I don't see any mechanism that replaces signal_handler_destroy for closing these two pipes. This piece of code is still in the trunk (and latest versions) of keepalived. I think these four lines should not be deleted:

    - close(signal_pipe[1]);
    - close(signal_pipe[0]);
    - signal_pipe[1] = -1;
    - signal_pipe[0] = -1;

For information (but not directly linked), we never use the code in #ifdef HAVE_PIPE2 because the variable set in configure is not propagated to the Makefiles*. In fact, if we run

    strings /usr/sbin/keepalived |grep pipe

on the installed keepalived, we get pipe (instead of pipe2). I suspect we should have something like

    DEFS = @DFLAGS@ -D@SNMP_SUPPORT@ @DEFS@

in lib/Makefile.in to do this.

(In reply to yann.morice from comment #7)
> Hi,
>
> It seems that this patch of keepalived (1.2.13-9.el7_3) has introduced a
> regression. In an OpenStack cloud, keepalived instances of HA routers began
> to create a lot of pipes until they ran out of file descriptors, e.g.:
> Keepalived_vrrp[xxx]: Netlink: Cannot open netlink socket : (Too many open
> files).
> This was not the case in the previous version (1.2.13-8.el7) => only two
> pipes...
>
> I suspect this is caused by the SIGHUP signal sent on configuration changes
> (adding a floating IP), as a similar router with no config changes stays OK.
>
> In fact, I think the bug was introduced in lib/signals.c. I don't see any
> mechanism that replaces signal_handler_destroy for closing these two pipes.
> This piece of code is still in the trunk (and latest versions) of
> keepalived. I think these four lines should not be deleted:
>
> - close(signal_pipe[1]);
> - close(signal_pipe[0]);
> - signal_pipe[1] = -1;
> - signal_pipe[0] = -1;
>
> For information (but not directly linked), we never use the code in #ifdef
> HAVE_PIPE2 because the variable set in configure is not propagated to the
> Makefiles*. In fact, if we run strings /usr/sbin/keepalived |grep pipe on
> the installed keepalived, we get pipe (instead of pipe2). I suspect we should
> have something like DEFS = @DFLAGS@ -D@SNMP_SUPPORT@ @DEFS@ in
> lib/Makefile.in to do this.

Can you please provide some details regarding how you are seeing an increased number of pipes? This will help reproduce and fix any regression that was introduced by the patch. Thanks.

* If we run a simple config using vrrp only:

    # keepalived -P -f /etc/keepalived/keepalived.conf

    vrrp_instance VR_1 {
        state BACKUP
        interface eth0
        virtual_router_id 1
        priority 50
        garp_master_delay 60
        nopreempt
        advert_int 2
        track_interface {
            eth0
        }
        virtual_ipaddress {
            169.254.0.2/24 dev eth0
        }
    }

* We then have two new processes:

    # ps -eafwww |grep keepalived
    root 13722     1 0 09:00 ? 00:00:00 keepalived -P -f /etc/keepalived/keepalived-simple.conf
    root 13723 13722 0 09:00 ? 00:00:00 keepalived -P -f /etc/keepalived/keepalived-simple.conf

* The first one has two pipes:

    # lsof |grep 13722|grep pipe|wc -l
    2

* The second one has four pipes at the beginning:

    # lsof |grep 13723|grep pipe|wc -l
    4

* If we send SIGHUP to the main process to live update the configuration:

    # kill -HUP 13722

* and run again:

    # lsof |grep 13723|grep pipe|wc -l
    6

We now have six pipes... (+2 pipes at each SIGHUP, in fact.)

With version 1.2.13-8.el7, doing the same, the second process stays at only two pipes across all SIGHUPs.

The problem is that this is widely used by OpenStack to live update the configuration of routers (floating IPs, etc.), and the router goes off-line after too many open files...

Thanks.

Please open a new bugzilla for this issue.

Ok. Done => Bug #1464869
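
The leak reported in the comments above (two more pipe descriptors after every SIGHUP) can be illustrated with a small sketch. This is not the keepalived source: only the name signal_pipe comes from the comment, while sketch_pipe_open, sketch_pipe_close, count_open_fds and fds_leaked_after_reloads are hypothetical helpers, and the assumption that each reload re-creates the pipe is taken from the reporter's analysis rather than verified against lib/signals.c. The sketch compares re-creating the pipe with and without closing the previous pair first.

```c
#include <fcntl.h>
#include <unistd.h>

/* Same name as in the comment; everything else here is hypothetical. */
static int signal_pipe[2] = { -1, -1 };

/* Create the pipe pair, overwriting whatever signal_pipe held before. */
int sketch_pipe_open(void)
{
    return pipe(signal_pipe);
}

/* The four lines the reporter wants kept, guarded against unset ends. */
void sketch_pipe_close(void)
{
    if (signal_pipe[1] != -1)
        close(signal_pipe[1]);
    if (signal_pipe[0] != -1)
        close(signal_pipe[0]);
    signal_pipe[1] = -1;
    signal_pipe[0] = -1;
}

/* Count open descriptors in the range 0..1023 by probing each with fcntl(). */
int count_open_fds(void)
{
    int n = 0;
    for (int fd = 0; fd < 1024; fd++)
        if (fcntl(fd, F_GETFD) != -1)
            n++;
    return n;
}

/* Simulate `reloads` SIGHUP-style reloads and return how many descriptors
 * were gained. With close_on_reload == 0 the old pair is never closed. */
int fds_leaked_after_reloads(int reloads, int close_on_reload)
{
    int before, after;

    if (sketch_pipe_open() != 0)
        return -1;
    before = count_open_fds();

    for (int i = 0; i < reloads; i++) {
        if (close_on_reload)
            sketch_pipe_close();
        if (sketch_pipe_open() != 0)
            return -1;
    }

    after = count_open_fds();
    sketch_pipe_close(); /* closes only the most recent pair */
    return after - before;
}
```

Under these assumptions, fds_leaked_after_reloads(3, 0) gains two descriptors per reload, which matches the lsof counts going from 4 to 6 after one SIGHUP, while fds_leaked_after_reloads(3, 1) gains none.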