Description of problem:
After upgrading to docker-1.13.1-56.git6c336e4.fc28.x86_64, I am unable to run `oc cluster up` and have a complete openshift cluster.

Version-Release number of selected component (if applicable):
docker-1.13.1-56.git6c336e4.fc28.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install docker-1.13.1-56.git6c336e4.fc28.x86_64 and oc-3.7.46
2. Run `oc cluster up`
3. Run `docker ps -a`

Actual results:
Tons of containers in Created state, but failing. Logs show:
container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""

Expected results:
A running origin cluster

Additional info:
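For anyone reproducing this, one quick way to list the stuck containers and surface the OCI runtime error they failed with (illustrative commands; <container-id> is a placeholder for one of the Created containers):

$ docker ps -a --filter status=created
$ docker inspect --format '{{.State.Error}}' <container-id>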
Any chance this is an SELinux issue?
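If SELinux were the culprit, there would normally be matching AVC denials in the audit log. A quick way to check (assuming auditd is running, which is the Fedora default):

$ getenforce
$ sudo ausearch -m AVC,USER_AVC -ts recent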
While it doesn't appear to be an SELinux error, I'm getting additional output. See below for my output with setenforce=1 and setenforce=0:

$ oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.7 --host-data-dir=/home/blentz/.oc/openshift.local.data --use-existing-config=true
Starting OpenShift using registry.access.redhat.com/openshift3/ose:v3.7 ...
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for registry.access.redhat.com/openshift3/ose:v3.7 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ...
   WARNING: Binding DNS on port 8053 instead of 53, which may not be resolvable from all clients.
-- Checking type of volume mount ...
   Using nsenter mounter for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ...
   Using 127.0.0.1 as the server IP
-- Starting OpenShift container ...
   Starting OpenShift using container 'origin'
   Waiting for API server to start listening
FAIL
   Error: cannot access master readiness URL https://127.0.0.1:8443/healthz/ready
   Details:
     Last 10 lines of "origin" container log:
     E0604 21:09:44.095751 19781 reflector.go:216] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:72: Failed to list *rbac.ClusterRoleBinding: no kind "ClusterRoleBinding" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:09:44.109782 19781 status.go:62] apiserver received an error that is not an metav1.Status: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:09:44.111169 19781 reflector.go:216] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:72: Failed to list *rbac.ClusterRole: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:09:44.646567 19781 controllers.go:118] Server isn't healthy yet. Waiting a little while.
     E0604 21:09:44.651282 19781 reflector.go:216] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:72: Failed to list *api.Endpoints: User "system:node:localhost" cannot list endpoints at the cluster scope: User "system:node:localhost" cannot list all endpoints in the cluster (get endpoints)
     E0604 21:09:44.861087 19781 status.go:62] apiserver received an error that is not an metav1.Status: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:09:44.861719 19781 storage_rbac.go:166] unable to initialize clusterroles: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:09:44.864708 19781 status.go:62] apiserver received an error that is not an metav1.Status: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:09:44.865419 19781 storage_rbac.go:166] unable to initialize clusterroles: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     F0604 21:09:44.865824 19781 hooks.go:133] PostStartHook "authorization.openshift.io-bootstrapclusterroles" failed: unable to initialize roles: timed out waiting for the condition
   Caused By:
     Error: Get https://127.0.0.1:8443/healthz/ready: dial tcp 127.0.0.1:8443: getsockopt: connection refused

$ oc cluster down
$ sudo setenforce 0
$ oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.7 --host-data-dir=/home/blentz/.oc/openshift.local.data --use-existing-config=true
Starting OpenShift using registry.access.redhat.com/openshift3/ose:v3.7 ...
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for registry.access.redhat.com/openshift3/ose:v3.7 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ...
   WARNING: Binding DNS on port 8053 instead of 53, which may not be resolvable from all clients.
-- Checking type of volume mount ...
   Using nsenter mounter for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ...
   Using 127.0.0.1 as the server IP
-- Starting OpenShift container ...
   Starting OpenShift using container 'origin'
   Waiting for API server to start listening
FAIL
   Error: cannot access master readiness URL https://127.0.0.1:8443/healthz/ready
   Details:
     Last 10 lines of "origin" container log:
     E0604 21:10:49.341243 25230 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "persistent-volume-setup-vpxs6_default(f2d29cfc-6801-11e8-a25f-c85b76f3eace)" failed: rpc error: code = 2 desc = failed to start sandbox container for pod "persistent-volume-setup-vpxs6": Error response from daemon: {"message":"oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:258: applying cgroup configuration for process caused \\\"No such device or address\\\"\"\n"}
     E0604 21:10:49.341267 25230 kuberuntime_manager.go:622] createPodSandbox for pod "persistent-volume-setup-vpxs6_default(f2d29cfc-6801-11e8-a25f-c85b76f3eace)" failed: rpc error: code = 2 desc = failed to start sandbox container for pod "persistent-volume-setup-vpxs6": Error response from daemon: {"message":"oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:258: applying cgroup configuration for process caused \\\"No such device or address\\\"\"\n"}
     E0604 21:10:49.341364 25230 pod_workers.go:186] Error syncing pod f2d29cfc-6801-11e8-a25f-c85b76f3eace ("persistent-volume-setup-vpxs6_default(f2d29cfc-6801-11e8-a25f-c85b76f3eace)"), skipping: failed to "CreatePodSandbox" for "persistent-volume-setup-vpxs6_default(f2d29cfc-6801-11e8-a25f-c85b76f3eace)" with CreatePodSandboxError: "CreatePodSandbox for pod \"persistent-volume-setup-vpxs6_default(f2d29cfc-6801-11e8-a25f-c85b76f3eace)\" failed: rpc error: code = 2 desc = failed to start sandbox container for pod \"persistent-volume-setup-vpxs6\": Error response from daemon: {\"message\":\"oci runtime error: container_linux.go:247: starting container process caused \\\"process_linux.go:258: applying cgroup configuration for process caused \\\\\\\"No such device or address\\\\\\\"\\\"\\n\"}"
     W0604 21:10:49.392559 25230 pod_container_deletor.go:77] Container "c9db9acf9ed59d4c6839864468622e32c3984a9be848fe3f84123503c016ba6c" not found in pod's containers
     E0604 21:10:49.662676 25230 reflector.go:216] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:72: Failed to list *api.Endpoints: User "system:node:localhost" cannot list endpoints at the cluster scope: User "system:node:localhost" cannot list all endpoints in the cluster (get endpoints)
     E0604 21:10:49.838991 25230 status.go:62] apiserver received an error that is not an metav1.Status: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:10:49.839534 25230 storage_rbac.go:166] unable to initialize clusterroles: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:10:49.842465 25230 status.go:62] apiserver received an error that is not an metav1.Status: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     E0604 21:10:49.843252 25230 storage_rbac.go:166] unable to initialize clusterroles: no kind "ClusterRole" is registered for version "rbac.authorization.k8s.io/v1"
     F0604 21:10:49.843559 25230 hooks.go:133] PostStartHook "authorization.openshift.io-bootstrapclusterroles" failed: unable to initialize roles: timed out waiting for the condition
   Caused By:
     Error: Get https://127.0.0.1:8443/healthz/ready: dial tcp 127.0.0.1:8443: getsockopt: connection refused
Antonio, another healthcheck error? Also related to runc?
(In reply to Daniel Walsh from comment #3)
> Antonio, another healthcheck error? Also related to runc?

I believe so. Mrunal?
For the record, I have seen a similar issue on f26 while trying to run osbs-box with origin 3.9 (docker-1.13.1-44.git584d391.fc26.x86_64). I tried disabling SELinux (set to disabled in /etc/selinux/config; even in permissive mode the error appeared) and that worked around the issue for me.

Also, after hitting this issue the system's SELinux seemed to be in an inconsistent state, i.e. audit2allow failed with really weird errors. I will try to get hold of it again and reproduce with a more recent version of Fedora.

Does disabling SELinux work around the issue for you?
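For reference, what I did amounts roughly to the following (a sketch assuming the stock /etc/selinux/config layout; disabling via the config file only takes effect after a reboot):

$ sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
$ sudo reboot
(after the reboot)
$ getenforce
Disabled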
(In reply to Jakub Čajka from comment #5)
> For the record, I have seen a similar issue on f26 while trying to run
> osbs-box with origin 3.9 (docker-1.13.1-44.git584d391.fc26.x86_64). I
> tried disabling SELinux (set to disabled in /etc/selinux/config; even in
> permissive mode the error appeared) and that worked around the issue for
> me.
>
> Also, after hitting this issue the system's SELinux seemed to be in an
> inconsistent state, i.e. audit2allow failed with really weird errors. I
> will try to get hold of it again and reproduce with a more recent version
> of Fedora.
>
> Does disabling SELinux work around the issue for you?

Retracting my previous statement. It also affects f27 with docker-1.13.1-54.git6c336e4.fc27.x86_64, and it seems to be unrelated to whether SELinux is enforcing or not. I see these popping up in the log:

Jun 05 17:08:53 osbs dockerd-current[808]: E0605 15:08:53.644311 1895 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "router-1-deploy_default(b3486eea-68ca-11e8-9c55-5254002bdf1c)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "router-1-deploy": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
Jun 05 17:08:53 osbs dockerd-current[808]: E0605 15:08:53.644323 1895 kuberuntime_manager.go:647] createPodSandbox for pod "router-1-deploy_default(b3486eea-68ca-11e8-9c55-5254002bdf1c)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "router-1-deploy": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
Jun 05 17:08:53 osbs dockerd-current[808]: E0605 15:08:53.644376 1895 pod_workers.go:186] Error syncing pod b3486eea-68ca-11e8-9c55-5254002bdf1c ("router-1-deploy_default(b3486eea-68ca-11e8-9c55-5254002bdf1c)"), skipping: failed to "CreatePodSandbox" for "router-1-deploy_default(b3486eea-68ca-11e8-9c55-5254002bdf1c)" with CreatePodSandboxError: "CreatePodSandbox for pod \"router-1-deploy_default(b3486eea-68ca-11e8-9c55-5254002bdf1c)\" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod \"router-1-deploy\": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:258: applying cgroup configuration for process caused \\\"No such device or address\\\"\""
BTW, BZ1586107 and BZ1584909 seem to be duplicates. My workaround for getting `oc cluster up` working is to downgrade docker to docker-1.13.1-51.git4032bd5.fc28.x86_64.
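In case it helps anyone else, the downgrade is just (assuming the -51 build is still available in the Fedora 28 repos):

$ sudo dnf downgrade docker-1.13.1-51.git4032bd5.fc28

and then holding docker back from updates until a fixed build lands.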
For the record, this is not architecture dependent. I'm hitting the same issue on ppc64le, and docker-1.13.1-51.git4032bd5.fc27.x86_64/ppc64le is also a workaround for the issue for me.
Seems to be caused by https://github.com/projectatomic/runc/pull/8
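The runc change in question ships as part of the docker build here, so checking the installed docker version is enough to tell whether you are on an affected build (-54/-56 per the reports above) or a working one (-51). Illustrative, with the affected version from this bug shown as sample output:

$ rpm -q docker
docker-1.13.1-56.git6c336e4.fc28.x86_64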
Lokesh, can you please share some details about https://github.com/projectatomic/runc/pull/8#issuecomment-381171089 ?
docker-1.13.1-58.git6c336e4.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-bace62295c
docker-1.13.1-58.git6c336e4.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-b6ba25b167
docker-1.13.1-59.gitaf6b32b.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-c2e93d5623
docker-1.13.1-59.gitaf6b32b.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-993659ebfd
docker-1.13.1-59.gitaf6b32b.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-c2e93d5623
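For anyone wanting to test before this goes stable, something along these lines should pull in exactly this advisory from updates-testing (illustrative; uses the stock Fedora updates-testing repo definition):

$ sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2018-c2e93d5623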
docker-1.13.1-59.gitaf6b32b.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.
docker-1.13.1-59.gitaf6b32b.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-993659ebfd
docker-1.13.1-59.gitaf6b32b.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.