Bug 1429880 - keepalived high number of close syscalls
Summary: keepalived high number of close syscalls
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: keepalived
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Ryan O'Hara
QA Contact: Brandon Perkins
URL:
Whiteboard:
Depends On: 1324594
Blocks:
Reported: 2017-03-07 11:35 UTC by Jaroslav Reznik
Modified: 2020-08-13 08:55 UTC (History)
CC List: 13 users

Fixed In Version: keepalived-1.2.13-9.el7_3
Doc Type: Bug Fix
Doc Text:
Previously, the keepalived utility attempted to close a large number of file descriptors each time a notification script was invoked. As a consequence, keepalived generated many unnecessary close() system calls in an attempt to close file descriptors that were not open. This bug has been fixed by using the SOCK_CLOEXEC flag when opening all sockets, and the FD_CLOEXEC flag when opening all file descriptors. As a result, the number of close() system calls is no longer excessive.
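The two flags named in the Doc Text can be illustrated with a minimal sketch. This is not the keepalived patch itself: the helper names `set_cloexec`, `open_cloexec_socket`, and `is_cloexec` are hypothetical, and the `AF_UNIX` datagram socket is chosen only to keep the example self-contained. The idea is that a descriptor marked close-on-exec is closed by the kernel when a notification script is exec'd, so the parent does not need to loop over every possible descriptor number calling close().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: mark an already-open descriptor close-on-exec.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Hypothetical helper: create a socket that is close-on-exec from the
 * moment it exists, so no fork+exec can inherit it in between.
 * AF_UNIX/SOCK_DGRAM is an arbitrary choice for illustration. */
int open_cloexec_socket(void)
{
    return socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
}

/* Returns 1 if fd has FD_CLOEXEC set, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Passing `SOCK_CLOEXEC` at creation time avoids the short window that exists when the flag is applied afterwards with a separate `fcntl()` call; `set_cloexec` covers descriptors that come from calls without such a flag.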
Clone Of: 1324594
Environment:
Last Closed: 2017-05-25 15:37:28 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1323526 | 0 | unspecified | CLOSED | keepalived-1.2.20 is available | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHBA-2017:1305 | 0 | normal | SHIPPED_LIVE | keepalived bug fix update | 2017-05-25 19:32:21 UTC

Description Jaroslav Reznik 2017-03-07 11:35:54 UTC
This bug has been copied from bug #1324594 and has been proposed
to be backported to 7.3 z-stream (EUS).

Comment 6 errata-xmlrpc 2017-05-25 15:37:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1305

Comment 7 yann.morice 2017-06-20 13:46:57 UTC
Hi,

It seems that this patch of keepalived (1.2.13-9.el7_3) has introduced a regression. In an OpenStack cloud, the keepalived instances of HA routers began to create a lot of pipes until they ran into "too many open files", e.g. Keepalived_vrrp[xxx]: Netlink: Cannot open netlink socket : (Too many open files). This was not the case in the previous version (1.2.13-8.el7), which kept only two pipes.

I suspect this is caused by the SIGHUP signal sent on configuration changes (adding a floating IP), as a similar router with no config changes stays OK.

In fact, I think the bug was introduced in lib/signals.c. I don't see any mechanism that replaces signal_handler_destroy for closing these two pipes. This piece of code is still in the trunk (and latest versions) of keepalived. I think these four lines should not have been deleted:

-       close(signal_pipe[1]);
-       close(signal_pipe[0]);
-       signal_pipe[1] = -1;
-       signal_pipe[0] = -1;

For information (but not directly linked), the code under #ifdef HAVE_PIPE2 is never used, because the variable set by configure is not propagated to the Makefiles. In fact, running strings /usr/sbin/keepalived | grep pipe on the installed keepalived shows pipe (instead of pipe2). I suspect we need something like DEFS = @DFLAGS@ -D@SNMP_SUPPORT@ @DEFS@ in lib/Makefile.in to fix this.
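The leak described in this comment can be sketched in isolation. The following is a minimal illustration, not the keepalived source: `signal_pipe_open` and `signal_pipe_close` are hypothetical names, and only the four teardown statements follow the lines quoted above. The assumption is that the signal setup runs again on every reload, so unless the old pipe is closed first, each reload leaves two more descriptors behind.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Both ends of the self-pipe; -1 means "not open". */
static int signal_pipe[2] = { -1, -1 };

/* Hypothetical teardown: close both ends and mark them unused.
 * The four core statements mirror the lines quoted in the comment. */
void signal_pipe_close(void)
{
    if (signal_pipe[1] != -1)
        close(signal_pipe[1]);
    if (signal_pipe[0] != -1)
        close(signal_pipe[0]);
    signal_pipe[1] = -1;
    signal_pipe[0] = -1;
}

/* Hypothetical setup: release any previous pipe, then create a fresh
 * close-on-exec, non-blocking one. Returns 0 on success, -1 on error. */
int signal_pipe_open(void)
{
    signal_pipe_close();            /* without this, every call leaks 2 fds */

    if (pipe(signal_pipe) == -1) {
        signal_pipe[0] = signal_pipe[1] = -1;
        return -1;
    }

    for (int i = 0; i < 2; i++) {
        int fdflags = fcntl(signal_pipe[i], F_GETFD);
        int flflags = fcntl(signal_pipe[i], F_GETFL);
        if (fdflags == -1 || flflags == -1 ||
            fcntl(signal_pipe[i], F_SETFD, fdflags | FD_CLOEXEC) == -1 ||
            fcntl(signal_pipe[i], F_SETFL, flflags | O_NONBLOCK) == -1) {
            signal_pipe_close();
            return -1;
        }
    }
    return 0;
}
```

With the teardown in place, calling `signal_pipe_open()` repeatedly (as a stand-in for repeated reloads) keeps the number of open descriptors constant; removing the `signal_pipe_close()` call at the top reproduces the "+2 pipes per reload" pattern in miniature.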

Comment 8 Ryan O'Hara 2017-06-21 15:41:06 UTC
(In reply to yann.morice from comment #7)

Can you please provide some details regarding how you are seeing an increased number of pipes? This will help reproduce and fix any regression that was introduced by the patch. Thanks.

Comment 9 yann.morice 2017-06-23 07:14:47 UTC
* If we run a simple config using VRRP only:

# keepalived -P -f /etc/keepalived/keepalived-simple.conf

vrrp_instance VR_1 {
    state BACKUP
    interface eth0
    virtual_router_id 1
    priority 50
    garp_master_delay 60
    nopreempt
    advert_int 2
    track_interface {
        eth0
    }
    virtual_ipaddress {
        169.254.0.2/24 dev eth0
    }
}

* We then have two new processes:

# ps -eafwww |grep keepalived
root     13722     1  0 09:00 ?        00:00:00 keepalived -P -f /etc/keepalived/keepalived-simple.conf
root     13723 13722  0 09:00 ?        00:00:00 keepalived -P -f /etc/keepalived/keepalived-simple.conf

* The first one has two pipes:

# lsof |grep 13722|grep pipe|wc -l
2

* The second one has four pipes at the beginning:
# lsof |grep 13723|grep pipe|wc -l
4

* If we send SIGHUP to the main process to live-update the configuration:
# kill -HUP 13722

* and count again:
# lsof |grep 13723|grep pipe|wc -l
6

We now have six pipes (in fact, +2 pipes at each SIGHUP).

With version 1.2.13-8.el7, doing the same, the second process stays at only two pipes across all SIGHUPs.

The problem is that OpenStack relies heavily on this to live-update the configuration of routers (floating IPs, etc.), and the router goes offline once it hits "too many open files".

Comment 10 Ryan O'Hara 2017-06-23 15:19:11 UTC
Thanks. Please open a new bugzilla for this issue.

Comment 11 yann.morice 2017-06-26 06:47:45 UTC
Ok. done => Bug #1464869

