Bug 1718049 - Need SCTP module in RHEL CoreOS by default
Summary: Need SCTP module in RHEL CoreOS by default
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Steve Milner
QA Contact: Micah Abbott
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-06-06 19:15 UTC by Weibin Liang
Modified: 2019-11-07 21:52 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:31:35 UTC
Target Upstream Version:
Embargoed:
weliang: needinfo-


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2019:2922 | 0 | None | None | None | 2019-10-16 06:31:52 UTC

Description Weibin Liang 2019-06-06 19:15:38 UTC
Description of problem:
OpenShift SDN needs to support SCTP, as defined in https://jira.coreos.com/browse/SDN-137, but in the current OCP v4.1 code, SCTP is not supported on AWS nodes running RHEL CoreOS.

Version-Release number of selected component (if applicable):
kernel: 4.18.0-80.1.2.el8_0.x86_64
RHEL CoreOS: Red Hat Enterprise Linux CoreOS release 4.1
OCP: 4.1.0-0.ci-2019-06-06-145709

How reproducible:
Always

Steps to Reproduce:
[root@dhcp-41-193 ~]# oc debug node/ip-10-0-131-6.us-east-2.compute.internal
Starting pod/ip-10-0-131-6us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# lsmod | grep sctp
sh-4.4# modprobe sctp
modprobe: FATAL: Module sctp not found in directory /lib/modules/4.18.0-80.1.2.el8_0.x86_64
sh-4.4# uname -r
4.18.0-80.1.2.el8_0.x86_64
sh-4.4# cat /etc/redhat-release
Red Hat Enterprise Linux CoreOS release 4.1
sh-4.4# 


Actual results:
sh-4.4# lsmod | grep sctp

Expected results:
sh-4.4# lsmod | grep sctp
sctp                  385024  42
libcrc32c              16384  5 nf_conntrack,nf_nat,openvswitch,xfs,sctp


Additional info:

Comment 1 Colin Walters 2019-06-06 20:05:26 UTC
It's part of kernel-modules-extra which IMO was a well-intentioned but ultimately ill-advised attempt to support smaller installations.  (Anyways at this point RHCOS is huge due to kubelet and kernel-modules-extra weighs in at 600k, a whole 0.3% of the size of hyperkube).

Comment 3 Josh Boyer 2019-06-06 21:04:15 UTC
(In reply to Colin Walters from comment #1)
> It's part of kernel-modules-extra which IMO was a well-intentioned but
> ultimately ill-advised attempt to support smaller installations.  (Anyways
> at this point RHCOS is huge due to kubelet and kernel-modules-extra weighs
> in at 600k, a whole 0.3% of the size of hyperkube).

To be clear, kernel-modules-extra was introduced in Fedora as a compromise for modules we really wanted to disable entirely but didn't have the gumption to follow through on.  It is one of my biggest regrets from when I did the kernel repackaging, and I would completely delete it if I could go back in time.

It got carried into RHEL 8 from there and adjusted.  SCTP probably doesn't belong there.

The overall kernel-core/kernel-modules split actually does help in many cases, but clearly it depends on what each environment needs.  The Fedora version was fine for VMs to only have kernel-core.  I would think something similar could work for CoreOS, but I haven't followed what RHEL did.

Comment 4 Colin Walters 2019-06-07 00:04:26 UTC
> It got carried into RHEL 8 from there and adjusted.  SCTP probably doesn't belong there.

Yeah, probably.  But SCTP is in an interesting space adoption wise; feels ultimately like HTTP2 is taking over a lot of that space, and ultimately HTTP3 will take that farther.

>   The Fedora version was fine for VMs to only have kernel-core.  I would think something similar could work for CoreOS

We don't want to have separate builds for physical/virtual.

Comment 5 Casey Callendrello 2019-06-07 13:48:54 UTC
(In reply to Colin Walters from comment #4)
> Yeah, probably.  But SCTP is in an interesting space adoption wise; feels
> ultimately like HTTP2 is taking over a lot of that space, and ultimately
> HTTP3 will take that farther.

For sure, the influence of SCTP on QUIC is unmistakable. Regardless, SCTP is widely used in the telco space, and that won't be changing anytime soon.

Comment 7 Dan Winship 2019-06-07 14:19:47 UTC
I think sctp.ko is considered by some to be in the set of kernel modules that haven't had enough eyes on them and so are more likely to have undiscovered security holes. It might be good if admins had to explicitly enable it.

Comment 10 Dan Winship 2019-06-24 12:22:30 UTC
So the linked CoreOS MR brings in the kernel-modules-extra package, but that package includes a file, /etc/modprobe.d/sctp-blacklist.conf, that blacklists the module. Plus, our SELinux policy blocks containers from causing modules to be autoloaded even if they aren't blacklisted:

    danw@p50:~> oc exec test-pod -- nc -l --sctp -p 9999
    Ncat: Unable to open any listening sockets. QUITTING.
    command terminated with exit code 2

with the kernel logging:

    [ 1957.432504] audit: type=1400 audit(1561323434.328:7): avc:  denied  { module_request } for  pid=68007 comm="nc" kmod="net-pf-10-proto-132-type-1" scontext=system_u:system_r:container_t:s0:c519,c876 tcontext=system_u:system_r:kernel_t:s0 tclass=system permissive=0
    [ 1957.449268] audit: type=1400 audit(1561323434.328:8): avc:  denied  { module_request } for  pid=68007 comm="nc" kmod="net-pf-10-proto-132" scontext=system_u:system_r:container_t:s0:c519,c876 tcontext=system_u:system_r:kernel_t:s0 tclass=system permissive=0


So to enable SCTP, an administrator will need to unblacklist it and explicitly modprobe it. What is the right way for them to do that in RHCOS?
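
For anyone checking a node by hand, here is a minimal sketch of a blacklist check. It is not from the thread. The function name `is_blacklisted` is hypothetical, and it scans only one directory (default `/etc/modprobe.d`), whereas modprobe also reads other configuration directories. Note that a `blacklist` entry only stops alias-based autoloading; an explicit `modprobe sctp` still loads the module.

```shell
# is_blacklisted MODULE [DIR]
# Succeeds (exit 0) if some *.conf file in DIR contains a line of the
# form "blacklist MODULE". DIR defaults to /etc/modprobe.d.
# Simplification: other modprobe.d search directories are not scanned.
is_blacklisted() {
  module="$1"
  dir="${2:-/etc/modprobe.d}"
  # -E: extended regex, -q: quiet, -s: ignore missing/unreadable files
  grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$dir"/*.conf
}
```

Usage on the host (after `chroot /host`): `is_blacklisted sctp && echo "sctp is blacklisted"`.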

Comment 11 Colin Walters 2019-06-24 12:59:45 UTC
> So to enable SCTP, an administrator will need to unblacklist it and explicitly modprobe it. What is the right way for them to do that in RHCOS?

We will definitely need a KCS entry for this.  The general mechanism to manage RHCOS is MachineConfig:
https://github.com/openshift/machine-config-operator/#applying-configuration-changes-to-the-cluster

In this case it'd be a MachineConfig file that replaced /etc/modprobe.d/sctp-blacklist.conf with an empty file,
and also a systemd unit that runs `modprobe sctp` and is Before=kubelet.service.
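
A rough sketch of what such a MachineConfig might look like, written as a shell script that only generates the manifest locally. It is an illustration under assumptions, not a tested or supported procedure: the object name `99-worker-enable-sctp`, the unit name `load-sctp.service`, the `worker` role label, and the Ignition version `2.2.0` are all placeholders chosen for the example.

```shell
#!/bin/sh
# Sketch only: writes a MachineConfig manifest to a local file.
# It does not contact the cluster. Names and versions are assumptions.
out="sctp-machineconfig.yaml"

cat > "$out" <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-enable-sctp        # hypothetical name
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 2.2.0                 # assumed; match what your cluster expects
    storage:
      files:
        # Replace the blacklist file with an empty file.
        - path: /etc/modprobe.d/sctp-blacklist.conf
          filesystem: root
          mode: 420                  # decimal for 0644
          contents:
            source: "data:,"
    systemd:
      units:
        # Load the module before the kubelet starts.
        - name: load-sctp.service    # hypothetical unit name
          enabled: true
          contents: |
            [Unit]
            Description=Load the sctp kernel module
            Before=kubelet.service

            [Service]
            Type=oneshot
            ExecStart=/usr/sbin/modprobe sctp
            RemainAfterExit=yes

            [Install]
            WantedBy=multi-user.target
EOF

echo "wrote $out"
```

After reviewing the generated file, it could be applied with `oc create -f sctp-machineconfig.yaml`; the machine-config-operator would then roll the change out to the nodes matching the role label.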

Comment 13 Micah Abbott 2019-06-26 18:35:30 UTC
Using release 4.2.0-0.nightly-2019-06-26-081712, I was able to confirm that the sctp module is available in the OS and can be loaded.

```
$ oc get clusterversion                                                                                           
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-06-26-081712   True        False         3h15m   Cluster version is 4.2.0-0.nightly-2019-06-26-081712                                                                                

$ oc get node                                                                                                     
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-133-132.us-west-2.compute.internal   Ready    master   3h32m   v1.14.0+04ae0f405
ip-10-0-136-252.us-west-2.compute.internal   Ready    worker   3h24m   v1.14.0+04ae0f405
ip-10-0-150-76.us-west-2.compute.internal    Ready    master   3h32m   v1.14.0+04ae0f405
ip-10-0-151-188.us-west-2.compute.internal   Ready    worker   3h24m   v1.14.0+04ae0f405
ip-10-0-160-45.us-west-2.compute.internal    Ready    master   3h32m   v1.14.0+04ae0f405
ip-10-0-174-25.us-west-2.compute.internal    Ready    worker   3h24m   v1.14.0+04ae0f405

$ oc debug node/ip-10-0-136-252.us-west-2.compute.internal                                                        
Starting pod/ip-10-0-136-252us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c4a8d13f7d441df335ee2b434887ab4ae0b3cf1bbd3ac29d70297779cd900e01                                                                                    
              CustomOrigin: Managed by pivot tool
                   Version: 420.8.20190625.0 (2019-06-25T20:28:05Z)

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dd3bd07c9a4fd39d8039b137fa161c3a5b38a2855e8df9f6c16946ccdf7b3f31                                                                                    
              CustomOrigin: Provisioned from oscontainer
                   Version: 420.8.20190624.0 (2019-06-24T00:25:32Z)

sh-4.4# rpm -q kernel-modules-extra
kernel-modules-extra-4.18.0-80.4.2.el8_0.x86_64
sh-4.4# rpm -ql kernel-modules-extra | grep sctp
/etc/modprobe.d/sctp-blacklist.conf
/etc/modprobe.d/sctp_diag-blacklist.conf
/lib/modules/4.18.0-80.4.2.el8_0.x86_64/kernel/net/sctp/sctp.ko.xz
/lib/modules/4.18.0-80.4.2.el8_0.x86_64/kernel/net/sctp/sctp_diag.ko.xz
sh-4.4# modprobe sctp
sh-4.4# lsmod sctp
Usage: lsmod
sh-4.4# lsmod | grep sctp
sctp                  385024  8
libcrc32c              16384  5 nf_conntrack,nf_nat,openvswitch,xfs,sctp
```
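
Note that `lsmod | grep sctp` also matches lines for other modules that merely list sctp as a user (the libcrc32c line above), so it can give a false positive when scripted. A stricter check, sketched here with a hypothetical helper name `is_module_loaded`, compares only the first column of the `lsmod` output:

```shell
# is_module_loaded MODULE
# Reads lsmod-style text on stdin and succeeds (exit 0) only if MODULE
# appears in the first column, i.e. the module itself is loaded.
is_module_loaded() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

Usage: `lsmod | is_module_loaded sctp && echo "sctp is loaded"`.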

Comment 15 errata-xmlrpc 2019-10-16 06:31:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

