Bug 1508435 - keepalived v1.3.5 segfaults with older versions of selinux-policy and when running in a container or Linode VM
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: keepalived
Version: 7.4
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ryan O'Hara
QA Contact: Brandon Perkins
URL:
Whiteboard:
Duplicates: 1492827
Depends On:
Blocks:
Reported: 2017-11-01 12:29 UTC by Quentin Armitage
Modified: 2018-04-10 18:15 UTC (History)

Fixed In Version: keepalived-1.3.5-4.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-10 18:15:45 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:0972  0        None      None    None     2018-04-10 18:15:57 UTC

Description Quentin Armitage 2017-11-01 12:29:07 UTC
Description of problem:
keepalived segfaults

Version-Release number of selected component (if applicable):
keepalived-1.3.5-1

How reproducible:
Always

The following problems have been reported when using keepalived-1.3.5-1 on CentOS 7.4, and it is assumed that they also apply to RHEL 7.4.

1. keepalived-1.3.5-1 requires selinux-policy 3.13.1-151 or later, and preferably 3.13.1-158 or later.
   3.13.1-149 added: Allow keepalived_t domain read usermodehelper_t
   3.13.1-151 added: Allow keepalived_t domain creating netlink_netfilter_socket
   3.13.1-158 added: Allow keepalived domain connect to squid tcp port

   We are seeing users installing keepalived-1.3.5-1 on systems with selinux-policy earlier than 3.13.1-151 and keepalived then segfaults when it cannot open the netlink netfilter socket or /proc/sys/kernel/modprobe.

See report at https://github.com/acassen/keepalived/issues/650#issuecomment-331119608 for example.

   Can a
      Requires: selinux-policy >= 3.13.1-158
   be added to keepalived.spec?

2. Due to a bug in keepalived v1.3.5 it would segfault if it was running in a container and the host system did not have the iptables or ipset modules loaded. keepalived commit f7cd991 resolved this issue (see https://github.com/acassen/keepalived/commit/f7cd991). Commit dac727e resolved a similar issue if the ip_vs module is not loaded in the host.

3. Due to another bug in keepalived v1.3.5 it would segfault when running in a Linode VM, because that kernel has the iptables and ipset code built in rather than available as loadable modules. Commits f7cd991 and dac727e also appear to have resolved this issue.

Keepalived commit 47b5e44 is also worth incorporating since it cleanly handles the case when keepalived isn't able to initialise ipsets due to being in a container.

(see comment at https://github.com/acassen/keepalived/issues/650#issuecomment-331929769)
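
As a rough pre-flight check for items 2 and 3 above, one could look for the relevant modules before starting keepalived. This is only a sketch: the helper name module_loaded and its optional file argument are made up for illustration, and ip_tables, ip_set and ip_vs are assumed to be the module names involved. Code that is built into the kernel (the Linode case) does not appear in /proc/modules, so a "not listed" result does not necessarily mean the functionality is missing.

```shell
#!/bin/sh
# Hypothetical helper: succeed if module $1 is listed in a /proc/modules-style
# file ($2, defaulting to /proc/modules). The module name is the first field.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}" 2>/dev/null
}

# Assumed module names for iptables, ipset and IPVS support.
for mod in ip_tables ip_set ip_vs; do
    if module_loaded "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: not listed in /proc/modules (may be built in or unavailable)"
    fi
done
```

Inside a container the listing reflects the host kernel, so this says nothing about whether the container itself is allowed to load a missing module.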

Comment 2 Ryan O'Hara 2017-11-08 14:31:50 UTC
I ran some tests over the past couple days and here is what I've been able to reproduce so far:

keepalived-1.3.9-1.el7 (the rebase done in RHEL7.4) will definitely segfault at startup in a container, although the log messages don't give a reason. I'm assuming that this is the issue with the iptables or ipset modules described in comment #0. Here is the output:

# keepalived -PRln
Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Keepalived[268]: Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Unable to resolve default script username 'keepalived_script' - ignoring
Keepalived[268]: Unable to resolve default script username 'keepalived_script' - ignoring
Opening file '/etc/keepalived/keepalived.conf'.
Keepalived[268]: Opening file '/etc/keepalived/keepalived.conf'.
Starting VRRP child process, pid=269
Keepalived[268]: Starting VRRP child process, pid=269
Registering Kernel netlink reflector
Keepalived_vrrp[269]: Registering Kernel netlink reflector
Registering Kernel netlink command channel
Keepalived_vrrp[269]: Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Keepalived_vrrp[269]: Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
Keepalived_vrrp[269]: Opening file '/etc/keepalived/keepalived.conf'.
Keepalived_vrrp exited due to segmentation fault (SIGSEGV).
Keepalived[268]: Keepalived_vrrp exited due to segmentation fault (SIGSEGV).

A quick test with the latest upstream keepalived (1.3.9) worked without a segfault.

Comment 3 Ryan O'Hara 2017-11-08 14:37:43 UTC
Quentin,

Although keepalived-1.3.9 worked without a segfault when run inside a container, I did run into this message:

Keepalived_vrrp[278]: Netlink: error: Operation not permitted, type=(20), seq=1510151532, pid=0

Any idea what this is about? There is no message in the logs indicating what operation was attempted.

Comment 4 Quentin Armitage 2017-11-08 14:50:53 UTC
Ryan,

I presume the beginning of the second paragraph of comment 2 should start "keepalived-1.3.5-1.el7 ..." and not "keepalived-1.3.9-1.el7 ..."; when I first read it as 1.3.9 I was somewhat worried!

Re comment 3, the type in the (somewhat unhelpful) Netlink error message is the type of message, as defined in /usr/include/linux/rtnetlink.h, and in this case, 20 == RTM_NEWADDR, so for some reason keepalived doesn't have permission within the container to add an address to an interface.

I must have a look to see if those Netlink error messages can give some more useful information, such as the address and the interface.
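
For reading messages like the one in comment 3, a small lookup of common rtnetlink message types can help. The sketch below hard-codes a few values from linux/rtnetlink.h; the function name rtm_name is hypothetical, and the table is deliberately incomplete.

```shell
#!/bin/sh
# Hypothetical helper: map a numeric rtnetlink message type to its name.
# Values are taken from linux/rtnetlink.h; only a handful are listed here.
rtm_name() {
    case "$1" in
        16) echo RTM_NEWLINK ;;
        17) echo RTM_DELLINK ;;
        20) echo RTM_NEWADDR ;;
        21) echo RTM_DELADDR ;;
        24) echo RTM_NEWROUTE ;;
        25) echo RTM_DELROUTE ;;
        *)  echo "unknown($1)" ;;
    esac
}

# The type reported in comment 3; prints RTM_NEWADDR.
rtm_name 20
```

Where the kernel headers are installed, searching /usr/include/linux/rtnetlink.h for the name gives the authoritative definition.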

Comment 5 Ryan O'Hara 2017-11-08 16:08:27 UTC
(In reply to Quentin Armitage from comment #4)
> Ryan,
> 
> I presume the beginning of the second paragraph of comment 2 should start
> "keepalived-1.3.5-1.el7 ..." and not "keepalived-1.3.9-1.el7 ..."; when I
> first read it as 1.3.9 I was somewhat worried!

Yes, you are correct. That is a typo.

> Re comment 3, the type in the (somewhat unhelpful) Netlink error message is
> the type of message, as defined in /usr/include/linux/rtnetlink.h, and in
> this case, 20 == RTM_NEWADDR, so for some reason keepalived doesn't have
> permission within the container to add an address to an interface.

Ah ok. I thought 20 was indicative of "operation not permitted", but I misread the netlink code. Anyway, that confirms my suspicion that the netlink error was a result of failing to add the virtual IP address to the interface.

> I must have a look to see if those Netlink error messages can give some more
> useful information, such as the address and the interface.

Yes, I am looking at this as well. In the meantime I will get the patches you referenced backported to 1.3.5 since it is too late to rebase to 1.3.9 in RHEL7.5.

Comment 6 Quentin Armitage 2017-11-08 16:14:18 UTC
Ryan,

Are you also able to add the BuildRequires for selinux-policy at this stage?

Comment 7 Ryan O'Hara 2017-11-08 17:16:44 UTC
(In reply to Quentin Armitage from comment #6)
> Ryan,
> 
> Are you also able to add the BuildRequires for selinux-policy at this stage?

Yes. Not a problem at all.

By the way, my scratch build of 1.3.9 was failing because README was being installed twice: once to /usr/share/doc/keepalived/ and once to /usr/share/keepalived-1.3.9/. I had to change the RHEL7 spec file to remove /usr/share/doc/keepalived/README, since the README file is handled by the %doc line under %files -- that is what ultimately gets the README into the doc directory.

Comment 8 Ryan O'Hara 2017-11-08 18:19:36 UTC
(In reply to Ryan O'Hara from comment #7)
> (In reply to Quentin Armitage from comment #6)
> > Ryan,
> > 
> > Are you also able to add the BuildRequires for selinux-policy at this stage?
> 
> Yes. Not a problem at all.

Actually this might be a problem. Consider a docker container where selinux-policy is not installed and is not required. I need to investigate whether we can use an RPM macro to check if selinux is enabled before requiring the policy.

Comment 9 Ryan O'Hara 2017-11-09 23:14:06 UTC
A couple comments/questions:

1. I don't think we will add "Requires: selinux-policy >= 3.13.1-158". I talked to a few people about this and they advised against it, and I tend to agree. This would force the selinux-policy package to be installed anywhere keepalived is installed. That is not always desirable. First, consider containers. There is no need for it there. Also, how would we handle policy types (e.g. mls, targeted)?

Most importantly, we require complete upgrades. For example, when 7.4 is released you should not simply update keepalived -- you should update everything. That is supported. Ad-hoc updates are not. I think this problem was observed because a user updated individual package(s), not the entire system.

2. Are we sure that keepalived even runs correctly in docker containers on RHEL/CentOS? I was able to reproduce the segfault, but even with that fixed keepalived running inside a container cannot create the virtual IP address on the interface (netlink reports operation not permitted). It seems like a very minor victory to fix the segfault, but in the end keepalived doesn't work in a container.

Comment 10 Quentin Armitage 2017-11-10 16:18:37 UTC
Re 1 above, could you add "Conflicts: selinux-policy < 3.13.1-158"? I think this deals with the situation of selinux-policy not being installed.

Whilst I see the point of complete upgrades, if someone does a "yum install keepalived" on a system that hasn't been upgraded, then I think there would be a problem.

Re 2, we had a number of issue reports where people were running keepalived inside containers, so we know it is being done. It might be that they are using the healthchecker (IPVS) parts of keepalived rather than the VRRP part.
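
The difference between the Requires and Conflicts approaches can be sketched in shell. With a Conflicts-style rule, a missing selinux-policy is acceptable and only an installed, older policy is flagged. The helper names below are hypothetical, and GNU sort -V is used as a stand-in for RPM's own version comparison, which it only approximates.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 sorts strictly before version $2.
# sort -V only approximates RPM's version ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: succeed (report a conflict) only when a policy version
# is present ($1 non-empty) and older than the minimum ($2).
policy_conflicts() {
    [ -n "$1" ] && version_lt "$1" "$2"
}

min="3.13.1-158"
# Left empty when selinux-policy (or rpm itself) is not installed.
installed="$(rpm -q --qf '%{VERSION}-%{RELEASE}' selinux-policy 2>/dev/null)" || installed=""

if policy_conflicts "$installed" "$min"; then
    echo "selinux-policy $installed is older than $min"
else
    echo "no conflict: selinux-policy is ${installed:-not installed}"
fi
```

A Requires-style check would instead treat the empty case as a failure, which is the container concern raised in comment 9.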

Comment 11 Ryan O'Hara 2017-11-16 19:06:09 UTC
Here are some test results/notes:

Using the base RHEL7 docker image (registry.access.redhat.com/rhel7/rhel), start a container and connect to it with:

# docker exec -it --privileged <ID> /bin/bash

Once in the container, I had to set up yum repos to grab keepalived and write a simple config file. Since systemd is not used in a docker container, start keepalived from the command line and dump the output to the console:

# keepalived -DRln
Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Unable to resolve default script username 'keepalived_script' - ignoring
Opening file '/etc/keepalived/keepalived.conf'.
Starting Healthcheck child process, pid=461
Initializing ipvs
Starting VRRP child process, pid=462
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
(VI_1): Cannot start in MASTER state if not address owner
VRRP_Instance(VI_1) removing protocol VIPs.
VRRP_Instance(VI_1) removing protocol iptable drop rule
IPVS: Can't initialize ipvs: Protocol not available
Keepalived_vrrp exited due to segmentation fault (SIGSEGV).

Note that the above segfault occurred with keepalived-1.3.5-1.el7.x86_64 from RHEL7.4.

With the new patched version of keepalived, run the same test. The ip_vs module may fail to load (different issue), but you should not get a segfault:

# keepalived -DRln
Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Opening file '/etc/keepalived/keepalived.conf'.
Starting Healthcheck child process, pid=554
Initializing ipvs
Starting VRRP child process, pid=555
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
VRRP_Instance(VRRP) removing protocol VIPs.
Using LinkWatch kernel netlink reflector...
VRRP sockpool: [ifindex(9), proto(112), unicast(0), fd(9,10)]
IPVS: Can't initialize ipvs: Protocol not available
Stopped
Keepalived_healthcheckers exited with permanent error FATAL. Terminating
Stopping
Stopped
Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2

On the host itself, you can run 'modprobe ip_vs' and run keepalived again and it will work. There is a better way to do this (so the container can load the module itself, by way of keepalived). I think we would need extra capabilities (--cap-add option) given when the container is started (docker run). But the three patches mentioned in comment #0 do fix segfaults.

Comment 12 Ryan O'Hara 2017-11-16 20:26:21 UTC
docker run -it --privileged --cap-add=ALL -v /lib/modules:/lib/modules registry.access.redhat.com/rhel7/rhel

This allowed keepalived to load ip_vs module (which resides on the host) from the container. I'm not sure it is ideal, but it works for testing.
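
To see which capabilities a container actually has, the effective capability mask can be read from /proc/self/status. This sketch assumes that adding an address relates to CAP_NET_ADMIN (bit 12) and module loading to CAP_SYS_MODULE (bit 16); the thread itself does not confirm that a missing capability was the cause of the errors seen, and has_cap is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if bit $2 is set in the hex capability mask $1.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# CapEff is the effective capability set of the current process.
capeff="$(awk '/^CapEff:/ { print $2 }' /proc/self/status 2>/dev/null)"

if [ -n "$capeff" ]; then
    has_cap "$capeff" 12 && echo "CAP_NET_ADMIN present"  || echo "CAP_NET_ADMIN missing"
    has_cap "$capeff" 16 && echo "CAP_SYS_MODULE present" || echo "CAP_SYS_MODULE missing"
fi
```

Running this inside the container before starting keepalived gives a quick hint whether a narrower --cap-add set might be enough instead of --cap-add=ALL.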

Comment 13 Ryan O'Hara 2017-12-03 22:10:44 UTC
*** Bug 1492827 has been marked as a duplicate of this bug. ***

Comment 18 errata-xmlrpc 2018-04-10 18:15:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0972

