Bug 1372658
Summary: | [infrastructure_public_178] Using an invalid sysctl value causes the pod to stay in ContainerCreating | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | DeShuai Ma <dma> |
Component: | Node | Assignee: | Stefan Schimanski <sttts> |
Status: | CLOSED NOTABUG | QA Contact: | DeShuai Ma <dma> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.3.1 | CC: | aos-bugs, jokerman, mmccomas, sttts, wmeng |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2016-10-31 15:52:49 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1386569 | ||
Bug Blocks: |
Description
DeShuai Ma
2016-09-02 09:56:26 UTC
This behaves as expected. There is no (and cannot be) verification of sysctl values. Values are partially checked by the kernel, and Docker will fail to start a container if the kernel rejects the value. The only thing we can do here is make Docker return better error messages, or pass them through to the user. In fact, it looks like the second is what we are missing:

```
docker run -it --sysctl=kernel.shm_rmid_forced=hello ubuntu /bin/bash -c "sysctl kernel.shm_rmid_forced"
write /proc/sys/kernel/shm_rmid_forced: invalid argument
docker: Error response from daemon: Cannot start container a1bc7c9dee79e00732ebf7d48a7eaa1e79232a700115476fba72425371532238: [9] System error: could not synchronise with container process.
```

To get this sorted out we need a better error message in Docker. Here is a BZ issue for that: https://bugzilla.redhat.com/show_bug.cgi?id=1386569

Will you fix this bug in the 3.3.1 release or in 3.4?

We depend on the Docker fix for https://bugzilla.redhat.com/show_bug.cgi?id=1386569 to fix this in origin/kubernetes. As the 3.3.1 deadline is at the end of this week, this looks improbable.

After discussing on IRC, this does not sound like a blocker for 3.3.1, so I'm moving the target to 3.4 and attaching it to the 3.4 release, so that the items that can be fixed are fixed there.

In 3.4 the pod is still in 'ContainerCreating'.
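The point about kernel-side checking can be illustrated with a small sketch. `kernel.shm_rmid_forced` is a boolean sysctl, so the kernel accepts only `0` or `1`; a hypothetical client-side pre-check for such boolean sysctls (not part of Docker or Kubernetes — the function name is invented for illustration) would look like this:

```shell
# Hypothetical pre-check for boolean sysctls such as kernel.shm_rmid_forced.
# Anything other than "0" or "1" (e.g. "hello") is rejected by the kernel
# with EINVAL when written to /proc/sys/kernel/shm_rmid_forced, which is
# exactly the "invalid argument" error seen in the docker run output above.
validate_bool_sysctl() {
  case "$1" in
    0|1) echo "valid" ;;
    *)   echo "invalid" ;;
  esac
}

validate_bool_sysctl 1       # prints "valid"
validate_bool_sysctl hello   # prints "invalid"
```

This kind of check only works for the handful of sysctls with a known value domain; it cannot be done generically, which is why the comment above says sysctl values cannot be fully verified outside the kernel.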
Don't know why this was changed to ON_QA.

openshift v3.4.0.16+cc70b72

```
[root@ip-172-18-7-27 ~]# oc get pod hello-pod
NAME        READY     STATUS              RESTARTS   AGE
hello-pod   0/1       ContainerCreating   0          1m
[root@ip-172-18-7-27 ~]# oc describe pod hello-pod
Name:            hello-pod
Namespace:       default
Security Policy: anyuid
Node:            ip-172-18-2-117.ec2.internal/172.18.2.117
Start Time:      Thu, 27 Oct 2016 22:01:03 -0400
Labels:          name=hello-pod
Status:          Pending
IP:
Controllers:     <none>
Containers:
  hello-pod:
    Container ID:
    Image:          docker.io/deshuai/hello-pod:latest
    Image ID:
    Port:           8080/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Volume Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-khrla (ro)
    Environment Variables: <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  tmp:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-khrla:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-khrla
QoS Class:   BestEffort
Tolerations: <none>
Events:
  FirstSeen  LastSeen  Count  From                                    SubobjectPath  Type     Reason      Message
  ---------  --------  -----  ----                                    -------------  ----     ------      -------
  1m         1m        1     {default-scheduler }                                    Normal   Scheduled   Successfully assigned hello-pod to ip-172-18-2-117.ec2.internal
  1m         12s       9     {kubelet ip-172-18-2-117.ec2.internal}                  Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: Error response from daemon: {\"message\":\"oci runtime error: write /proc/sys/kernel/shm_rmid_forced: invalid argument\"}"
```

As https://bugzilla.redhat.com/show_bug.cgi?id=1386569 seems to be fixed in Docker upstream, I will implement the corresponding Kubernetes error check upstream. If we decide to backport the Docker patch, we can also backport the Kubernetes patch later.
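For reference, a pod like the one described above could be reproduced with a manifest along these lines. This is a sketch, not taken from the report: the annotation key assumes the Kubernetes 1.4-era alpha sysctl feature as shipped in OpenShift 3.4, and the container details are inferred from the `oc describe` output.

```yaml
# Sketch of a pod reproducing the report. The sysctl annotation key is an
# assumption (alpha sysctl support in Kubernetes 1.4 / OpenShift 3.4).
# The invalid value "hello" is not validated at creation time, so the pod
# is accepted by the API server and then stays in ContainerCreating.
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    name: hello-pod
  annotations:
    security.alpha.kubernetes.io/sysctls: kernel.shm_rmid_forced=hello
spec:
  containers:
  - name: hello-pod
    image: docker.io/deshuai/hello-pod:latest
    ports:
    - containerPort: 8080
```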
While we have the proper error message with Docker 1.12, the kubelet currently does not allow a pod to fail immediately when pod initialization fails. This is by design and not related to sysctls.