Bug 1373119
Summary: | [infrastructure_public_178]Pod with 'tcp_max_syn_backlog' and 'tcp_syncookies' sysctls always Failed | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | DeShuai Ma <dma> | |
Component: | Node | Assignee: | Stefan Schimanski <sttts> | |
Status: | CLOSED ERRATA | QA Contact: | DeShuai Ma <dma> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 3.3.1 | CC: | agoldste, aos-bugs, jeder, jokerman, mmccomas, tdawson, wmeng, xtian | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Release Note | ||
Doc Text: |
Part of https://github.com/openshift/openshift-docs/commit/5a532e7b9d2795e31c18c169078f037e68a0afdf
|
Story Points: | --- | |
Clone Of: | ||||
: | 1390706 (view as bug list) | Environment: | ||
Last Closed: | 2017-01-18 12:53:19 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: |
Description
DeShuai Ma
2016-09-05 09:28:21 UTC
net.ipv4.tcp_syncookies works fine here on the Fedora kernel. Will check on the RHEL kernel.

Double-checked with a recent CentOS VM: net.ipv4.tcp_syncookies is not namespaced.

I will check what we can do about better error behavior in the docker runtime. I guess (without a lot of kernel knowledge in the kubelet code) the best we can do is to fail hard with a good error message. Right now, the kubelet retries a number of times when a container creation fails. That's not a good user experience.

Here is the kernel patch for tcp_syncookies: https://github.com/torvalds/linux/commit/12ed8244ed8b31b023ea6d2851fd8b15f2999e9b
Summary: available in >=4.6

net.ipv4.tcp_max_syn_backlog does not belong on the whitelist as it is not namespaced in any of today's kernels. Having it whitelisted does no harm though, because the sysctl is just not available under /proc/sys in the container. Here is the upstream fix: https://github.com/kubernetes/kubernetes/pull/32072

About better error behavior: the error message about the failed sysctl is part of the stdout/err of the container in Docker 1.10, not of the actual container status. Compare:

$ docker -l=debug run -d --sysctl=kernel.shm_rmid_forced=hello ubuntu /bin/bash -c "sysctl kernel.shm_rmid_forced"
09813c9c38cf071e0581c38acccce149331d88729bdb0a47f586ca73a92c7221
docker: Error response from daemon: Cannot start container 09813c9c38cf071e0581c38acccce149331d88729bdb0a47f586ca73a92c7221: [9] System error: could not synchronise with container process.
$ docker logs 09813c9c38cf071e0581c38acccce149331d88729bdb0a47f586ca73a92c7221
write /proc/sys/kernel/shm_rmid_forced: invalid argument

With Docker 1.12 the situation is better:

$ docker -l=debug run -d --sysctl=kernel.shm_rmid_forced=hello busybox /bi>
9483cb34b2559e26a96db01c29727c11afd7d5461588c607992119355507f541
docker: Error response from daemon: oci runtime error: write /proc/sys/kernel/shm_rmid_forced: invalid argument.
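
The remark that a non-namespaced sysctl "is just not available under /proc/sys in the container" can be checked with a small script. The following is a minimal sketch, not part of the kubelet or Docker: the helper names `sysctl_path` and `sysctl_available` are hypothetical, and the dotted-name-to-path mapping assumes plain names without dots inside a single component.

```python
import os


def sysctl_path(name, proc_sys="/proc/sys"):
    """Map a dotted sysctl name to its file path under proc_sys.

    Example: 'net.ipv4.tcp_syncookies' -> '/proc/sys/net/ipv4/tcp_syncookies'.
    Assumption: no path component of the name contains a dot itself.
    """
    return os.path.join(proc_sys, *name.split("."))


def sysctl_available(name, proc_sys="/proc/sys"):
    """Return True if the sysctl is exposed as a file under proc_sys."""
    return os.path.isfile(sysctl_path(name, proc_sys))


if __name__ == "__main__":
    # Run inside a container to see which sysctls the kernel exposes there.
    for name in (
        "kernel.shm_rmid_forced",
        "net.ipv4.tcp_syncookies",
        "net.ipv4.tcp_max_syn_backlog",
    ):
        state = "available" if sysctl_available(name) else "missing"
        print(name, state)
```

Run inside a container's own network and IPC namespaces, a "missing" result for a sysctl that exists on the host is a hint that the kernel does not namespace it.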
Bad news (even with Docker 1.12) is that we don't have any mechanism right now to react to specific container creation errors and fail a pod.

net.ipv4.tcp_syncookies has the same issue, could you double check?

Summing up all of the above in one comment:

The following applies to all failing sysctls:
- If the kernel does not namespace a sysctl ("no such file or directory") or rejects a sysctl value ("invalid argument"), Docker 1.10 will show this in the container logs only. Kubernetes/OpenShift will not see this output. Moreover, it will continue to launch the container (with a backoff), creating multiple events.
- With Docker 1.12 the situation is a bit better as Docker reports the very sysctl error in the container launch error message. Then at least the user can see that it was a sysctl error.

The actual failure reason for all the sysctls mentioned is slightly different:
- net.ipv4.tcp_max_syn_backlog is not namespaced on any kernel and should be removed from the whitelist (https://github.com/kubernetes/kubernetes/pull/32072).
- net.ipv4.tcp_syncookies is namespaced in kernels >= 4.6, but not in RHEL's enterprise kernel. If customers request this, here is the kernel patch for the kernel team to backport: https://github.com/torvalds/linux/commit/12ed8244ed8b31b023ea6d2851fd8b15f2999e9b
- kernel.shm_rmid_forced=hello fails because the kernel validates the value "hello" and rejects it.

For net.ipv4.tcp_syncookies, openshift-docs needs to document that support requires kernels >= 4.6.

The issue still exists. I don't know why this was changed to ON_QA.

Backport of the net.ipv4.tcp_max_syn_backlog removal to 3.3.x: https://github.com/openshift/ose/pull/441

https://github.com/openshift/openshift-docs/pull/3144 is the docs PR that will soon contain the release note that net.ipv4.tcp_syncookies is not namespaced in the RHEL kernel. The ose PR will fix net.ipv4.tcp_max_syn_backlog in 3.3.x. It is already fixed in 3.4.

Moving to MODIFIED as the fix for net.ipv4.tcp_max_syn_backlog is in 3.4 already.
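
The two failure signatures named in the summary ("no such file or directory" and "invalid argument") could in principle be told apart from the runtime's error text. The sketch below is illustrative only: `classify_sysctl_error` and its category labels are hypothetical, no such mechanism exists in the kubelet according to this thread, and the "no such file or directory" sample line is an assumed message shape rather than captured output.

```python
def classify_sysctl_error(message):
    """Classify a container runtime error message by sysctl failure cause.

    Returns one of:
      'not-namespaced'       - the sysctl file does not exist in the container
      'invalid-value'        - the kernel rejected the requested value
      'unknown-sysctl-error' - mentions /proc/sys but matches neither case
      'unrelated'            - does not mention /proc/sys at all
    """
    lowered = message.lower()
    if "/proc/sys/" not in lowered:
        return "unrelated"
    if "no such file or directory" in lowered:
        return "not-namespaced"
    if "invalid argument" in lowered:
        return "invalid-value"
    return "unknown-sysctl-error"


if __name__ == "__main__":
    samples = [
        # Docker 1.12 launch error quoted in this bug.
        "docker: Error response from daemon: oci runtime error: "
        "write /proc/sys/kernel/shm_rmid_forced: invalid argument.",
        # Docker 1.10 launch error quoted in this bug (no sysctl detail).
        "[9] System error: could not synchronise with container process.",
        # Assumed shape of a message for a sysctl that is not namespaced.
        "open /proc/sys/net/ipv4/tcp_max_syn_backlog: no such file or directory",
    ]
    for sample in samples:
        print(classify_sysctl_error(sample))
```

With Docker 1.10 only the container logs carry the sysctl detail, so the launch error itself falls into the "unrelated" bucket, which matches the behavior described above.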
As described above, we will add a release note about net.ipv4.tcp_syncookies.

DeShuai Ma - when verifying this bz, please only verify that net.ipv4.tcp_max_syn_backlog is no longer in the whitelist.

This has been merged into ose and is in OSE v3.4.0.19 or newer.

Move to ON_QA according to comment 17

Test on openshift v3.4.0.21+ca4702d
Verify the bug. "net.ipv4.tcp_max_syn_backlog" is not whitelisted

[root@dhcp-128-7 dma]# oc get pod
NAME READY STATUS RESTARTS AGE
hello-pod 0/1 SysctlForbidden 0 <invalid>
[root@dhcp-128-7 dma]# oc describe pod hello-pod
Name: hello-pod
Namespace: dma
Security Policy: restricted
Node: weshi-3.centralus.cloudapp.azure.com/
Start Time: Fri, 04 Nov 2016 15:15:37 +0800
Labels: name=hello-pod
Status: Failed
Reason: SysctlForbidden
Message: Pod forbidden sysctl: "net.ipv4.tcp_max_syn_backlog" not whitelisted
IP:
Controllers: <none>
Containers:
  hello-pod:
    Image: docker.io/deshuai/hello-pod:latest
    Port: 8080/TCP
    Volume Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jakux (ro)
    Environment Variables: <none>
Volumes:
  tmp:
    Type: EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-jakux:
    Type: Secret (a volume populated by a Secret)
    SecretName: default-token-jakux
QoS Class: BestEffort
Tolerations: <none>
Events:
  FirstSeen LastSeen Count From SubobjectPath Type Reason Message
  --------- -------- ----- ---- ------------- -------- ------ -------
  <invalid> <invalid> 1 {kubelet weshi-3.centralus.cloudapp.azure.com} Warning SysctlForbidden forbidden sysctl: "net.ipv4.tcp_max_syn_backlog" not whitelisted

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066
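
The SysctlForbidden result in the verification output above comes from checking the pod's requested sysctls against a whitelist. A rough sketch of that kind of check follows. It is not the kubelet's Go implementation: the function name `check_sysctls` is hypothetical, and the whitelist contents are an assumption limited to the two sysctls this bug treats as still whitelisted after the fix.

```python
# Assumed, illustrative whitelist; the real list lives in the kubelet source.
SAFE_SYSCTLS = frozenset({
    "kernel.shm_rmid_forced",
    "net.ipv4.tcp_syncookies",
})


def check_sysctls(requested, whitelist=SAFE_SYSCTLS):
    """Reject the pod on the first sysctl that is not whitelisted.

    Returns (allowed, reason, message). The message format mirrors the
    event text shown in the 'oc describe pod' output in this bug.
    """
    for name in requested:
        if name not in whitelist:
            message = 'forbidden sysctl: "%s" not whitelisted' % name
            return False, "SysctlForbidden", message
    return True, "", ""


if __name__ == "__main__":
    print(check_sysctls(["net.ipv4.tcp_max_syn_backlog"]))
    print(check_sysctls(["kernel.shm_rmid_forced"]))
```

Failing at admission time gives the pod a terminal Failed status with a clear reason, instead of the repeated container launch attempts seen when the runtime rejects the sysctl later.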