Description of problem:
oc cluster up fails using docker-2:1.13.1-56.git6c336e4.fc28.x86_64

Version-Release number of selected component (if applicable):
docker-2:1.13.1-56.git6c336e4.fc28.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Run oc cluster up on Fedora 28 with docker-2:1.13.1-56.git6c336e4.fc28.x86_64

Actual results:
The one container that comes up shows failed actions in its logs.

Expected results:
oc cluster up works.

Additional info:
dnf downgrade docker brings the system down to docker-2:1.13.1-51.git4032bd5.fc28.x86_64. After restarting the service it works properly.

oc version
oc v3.10.0-alpha.0+a861408-1354
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Using an /etc/systemd/system/docker.service.d/override.conf to change the cgroup driver gets the newest version working:

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd-current \
          --add-runtime oci=/usr/libexec/docker/docker-runc-current \
          --default-runtime=oci \
          --authorization-plugin=rhel-push-plugin \
          --containerd /run/containerd.sock \
          --exec-opt native.cgroupdriver=cgroupfs \
          --userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
          --init-path=/usr/libexec/docker/docker-init-current \
          --seccomp-profile=/etc/docker/seccomp.json \
          $OPTIONS \
          $DOCKER_STORAGE_OPTIONS \
          $DOCKER_NETWORK_OPTIONS \
          $ADD_REGISTRY \
          $BLOCK_REGISTRY \
          $INSECURE_REGISTRY \
          $REGISTRIES

Eventually the first container starts repeating:

E0531 22:02:00.112824 20802 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-scheduler-localhost": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""

Logs from the container (docker logs -f 4c7e8cc33aef):

Flag --address has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --allow-privileged has been deprecated, will be removed in a future version
[identical deprecation warnings follow for --anonymous-auth, --authentication-token-webhook, --authentication-token-webhook-cache-ttl, --authorization-mode, --authorization-webhook-cache-authorized-ttl, --authorization-webhook-cache-unauthorized-ttl, --cadvisor-port, --cgroup-driver, --client-ca-file, --cluster-domain, --fail-swap-on, --file-check-frequency, --healthz-bind-address, --healthz-port, --host-ipc-sources (x2), --host-network-sources (x2), --host-pid-sources (x2), --http-check-frequency, --iptables-masquerade-bit, --max-pods, --port, --read-only-port, --tls-cert-file, --tls-cipher-suites (x17), --tls-min-version, --tls-private-key-file, --pod-manifest-path and --cluster-dns]
I0531 22:01:51.495324 20802 feature_gate.go:226] feature gates: &{{} map[]}
W0531 22:01:51.509511 20802 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
I0531 22:01:51.515382 20802 server.go:383] Version: v1.10.0+b81c8f8
I0531 22:01:51.515451 20802 feature_gate.go:226] feature gates: &{{} map[]}
I0531 22:01:51.515603 20802 plugins.go:89] No cloud provider specified.
I0531 22:01:52.166465 20802 server.go:621] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
I0531 22:01:52.167596 20802 container_manager_linux.go:242] container manager verified user specified cgroup-root exists: /
I0531 22:01:52.167623 20802 container_manager_linux.go:247] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/tmp/openshift.local.clusterup/openshift.local.volumes ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>}]} ExperimentalQOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true}
I0531 22:01:52.167843 20802 container_manager_linux.go:266] Creating device plugin manager: true
I0531 22:01:52.168178 20802 state_mem.go:36] [cpumanager] initializing new in-memory state store
I0531 22:01:52.168956 20802 state_file.go:82] [cpumanager] state file: created new state file "/tmp/openshift.local.clusterup/openshift.local.volumes/cpu_manager_state"
I0531 22:01:52.169207 20802 kubelet.go:273] Adding pod path: /var/lib/origin/pod-manifests
I0531 22:01:52.169256 20802 kubelet.go:298] Watching apiserver
E0531 22:01:52.171967 20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:52.173509 20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:52.173548 20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
W0531 22:01:52.175867 20802 kubelet_network.go:139] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0531 22:01:52.175913 20802 kubelet.go:565] Hairpin mode set to "hairpin-veth"
I0531 22:01:52.177877 20802 client.go:75] Connecting to docker on unix:///var/run/docker.sock
I0531 22:01:52.177917 20802 client.go:104] Start docker client with request timeout=2m0s
W0531 22:01:52.182236 20802 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
I0531 22:01:52.187186 20802 docker_service.go:244] Docker cri networking managed by kubernetes.io/no-op
I0531 22:01:52.492151 20802 docker_service.go:249] Docker Info: &{ID:4S3O:NG4F:OEN3:CNV7:IOVE:AYAD:DY45:RTMT:AN6J:VJQY:OBRW:5R4O Containers:172 ContainersRunning:1 ContainersPaused:0 ContainersStopped:171 Images:1173 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type false] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[rhel-push-plugin] Log:[]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:31 OomKillDisable:true NGoroutines:31 SystemTime:2018-05-31T18:01:52.481010732-04:00 LoggingDriver:journald CgroupDriver:systemd NEventsListener:0 KernelVersion:4.16.12-300.fc28.x86_64 OperatingSystem:Fedora 28 (Twenty Eight) OSType:linux Architecture:x86_64 IndexServerAddress:https://registry.fedoraproject.org/v1/ RegistryConfig:0xc42152e850 NCPU:16 MemTotal:135117225984 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:jmontleo.usersys.redhat.com Labels:[] ExperimentalBuild:false ServerVersion:1.13.1 ClusterStore: ClusterAdvertise: Runtimes:map[oci:{Path:/usr/libexec/docker/docker-runc-current Args:[]} runc:{Path:docker-runc Args:[]}] DefaultRuntime:oci Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:0xc420985540} LiveRestoreEnabled:true Isolation: InitBinary:/usr/libexec/docker/docker-init-current ContainerdCommit:{ID:c301b045f9faddcf7693229601303639af6b0885 Expected:aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1} RuncCommit:{ID:1ab62f1ec5429ccf03e8dfcc2110bc665ec9e308-dirty Expected:9df8b306d01f59d3a8029be411de015b7304dd8f} InitCommit:{ID:N/A Expected:949e6facb77383876aeff8a6944dde66b3089574} SecurityOptions:[name=seccomp,profile=/etc/docker/seccomp.json name=selinux]}
I0531 22:01:52.492316 20802 docker_service.go:262] Setting cgroupDriver to systemd
I0531 22:01:52.794304 20802 remote_runtime.go:43] Connecting to runtime service unix:///var/run/dockershim.sock
I0531 22:01:52.797431 20802 kuberuntime_manager.go:186] Container runtime docker initialized, version: 1.13.1, apiVersion: 1.26.0
W0531 22:01:52.799268 20802 probe.go:215] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ does not exist. Recreating.
I0531 22:01:52.800143 20802 csi_plugin.go:61] kubernetes.io/csi: plugin initializing...
E0531 22:01:52.802293 20802 kubelet.go:1299] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data for container /
I0531 22:01:52.802295 20802 server.go:129] Starting to listen on 0.0.0.0:10250
I0531 22:01:52.802389 20802 server.go:952] Started kubelet
E0531 22:01:52.803362 20802 event.go:209] Unable to write event: 'Post https://localhost:8443/api/v1/namespaces/default/events: dial tcp 127.0.0.1:8443: getsockopt: connection refused' (may retry after sleeping)
I0531 22:01:52.803699 20802 server.go:303] Adding debug handlers to kubelet server.
I0531 22:01:52.804019 20802 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
I0531 22:01:52.804075 20802 status_manager.go:140] Starting to sync pod status with apiserver
I0531 22:01:52.804097 20802 volume_manager.go:247] Starting Kubelet Volume Manager
I0531 22:01:52.804117 20802 kubelet.go:1799] Starting kubelet main sync loop.
I0531 22:01:52.804207 20802 desired_state_of_world_populator.go:129] Desired state populator starts to run
I0531 22:01:52.804167 20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
I0531 22:01:52.904326 20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:52.904356 20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down]
I0531 22:01:52.910487 20802 kubelet_node_status.go:82] Attempting to register node localhost
E0531 22:01:52.911502 20802 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
[the "skipping pod synchronization", node-registration and reflector.go "Failed to list" connection-refused messages above keep repeating roughly once per second through 22:01:59 while the API server is still unreachable]
I0531 22:01:56.114536 20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:56.119335 20802 cpu_manager.go:155] [cpumanager] starting with none policy
I0531 22:01:56.119358 20802 cpu_manager.go:156] [cpumanager] reconciling every 10s
I0531 22:01:56.119373 20802 policy_none.go:42] [cpumanager] none policy: Start
E0531 22:01:56.119417 20802 container_manager_linux.go:544] failed to get rootfs info, cannot set ephemeral storage capacity: failed to get device for dir "/tmp/openshift.local.clusterup/openshift.local.volumes": could not find device with major: 0, minor: 45 in cached partitions map
Starting Device Plugin manager
E0531 22:01:56.327769 20802 event.go:209] Unable to write event: 'Post https://localhost:8443/api/v1/namespaces/default/events: dial tcp 127.0.0.1:8443: getsockopt: connection refused' (may retry after sleeping)
W0531 22:01:59.135632 20802 status_manager.go:461] Failed to get status for pod "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)": Get https://localhost:8443/api/v1/namespaces/kube-system/pods/kube-scheduler-localhost: dial tcp 127.0.0.1:8443: getsockopt: connection refused
W0531 22:01:59.148217 20802 status_manager.go:461] Failed to get status for pod "master-api-localhost_kube-system(8364885b3a3efddca13f5a6ada480812)": Get https://localhost:8443/api/v1/namespaces/kube-system/pods/master-api-localhost: dial tcp 127.0.0.1:8443: getsockopt: connection refused
W0531 22:01:59.164447 20802 status_manager.go:461] Failed to get status for pod "master-etcd-localhost_kube-system(4bb0d3d3b26a7aa3dc2bd08a4b4326a5)": Get https://localhost:8443/api/v1/namespaces/kube-system/pods/master-etcd-localhost: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:02:09.284089 20802 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-scheduler-localhost": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
E0531 22:02:09.284126 20802 kuberuntime_manager.go:646] createPodSandbox for pod "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-scheduler-localhost": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
E0531 22:02:09.284228 20802 pod_workers.go:186] Error syncing pod 4699e4e1c8a7146d6acd92baf39234ef ("kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)"), skipping: failed to "CreatePodSandbox" for "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)\" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod \"kube-scheduler-localhost\": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:258: applying cgroup configuration for process caused \\\"No such device or address\\\"\""
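For anyone hitting this before a fixed build lands: applying the cgroup-driver drop-in from the description is the usual systemd override dance. A minimal sketch (the override contents are from the report above; the grep verification step is my own suggestion, not from the report):

```
# After saving the [Service] block above as
# /etc/systemd/system/docker.service.d/override.conf
# (for example via `sudo systemctl edit docker`), reload units and
# restart the daemon so the new ExecStart takes effect:
sudo systemctl daemon-reload
sudo systemctl restart docker

# Verify the daemon picked up the new driver; the Docker Info dump above
# shows CgroupDriver:systemd with the stock unit, and the override should
# make it report cgroupfs instead:
docker info 2>/dev/null | grep -i cgroup
```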
The bug usually starts at the point where origin needs to pull down the openshift/origin-web-console image:

Events:
  Type     Reason                  Age                From                Message
  ----     ------                  ----               ----                -------
  Normal   Scheduled               1m                 default-scheduler   Successfully assigned webconsole-7dfbffd44d-bz44s to localhost
  Normal   SuccessfulMountVolume   1m                 kubelet, localhost  MountVolume.SetUp succeeded for volume "webconsole-config"
  Normal   SuccessfulMountVolume   1m                 kubelet, localhost  MountVolume.SetUp succeeded for volume "serving-cert"
  Normal   SuccessfulMountVolume   1m                 kubelet, localhost  MountVolume.SetUp succeeded for volume "webconsole-token-zxjsp"
  Warning  FailedCreatePodSandBox  13s (x2 over 36s)  kubelet, localhost  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "webconsole-7dfbffd44d-bz44s": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
  Normal   SandboxChanged          10s (x2 over 36s)  kubelet, localhost  Pod sandbox changed, it will be killed and re-created.

Full logs:
https://gitlab.com/tom81094/bugs/raw/master/f28/docker-2:1.13.1-56.git6c336e4/openshift-web-console-logs
https://gitlab.com/tom81094/bugs/raw/master/f28/docker-2:1.13.1-56.git6c336e4/oc-cluster-logs
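For reference, the events above look like `oc describe` output for the web-console pod; something along these lines should reproduce the listing (the openshift-web-console namespace is an assumption on my part, not stated in the comment):

```
# Find the failing web-console pod and dump its events (namespace assumed):
oc get pods -n openshift-web-console
oc describe pod webconsole-7dfbffd44d-bz44s -n openshift-web-console
```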
Ditto here; it can be reproduced with:

docker run --rm --cpu-shares=128 fedora:28 bash
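Spelling that out as a self-contained check (the error text in the comments is the daemon response quoted earlier in this bug; the contrast with a plain run is my assumption for illustration):

```
# Any container that sets a cpu cgroup value trips the bug; on
# docker-2:1.13.1-56 this fails with the daemon error quoted above:
#   oci runtime error: container_linux.go:247: starting container process
#   caused "process_linux.go:258: applying cgroup configuration for
#   process caused \"No such device or address\""
docker run --rm --cpu-shares=128 fedora:28 bash -c 'echo ok'

# A run without --cpu-shares should start fine, pointing at the cgroup
# configuration path in runc rather than container startup in general.
docker run --rm fedora:28 bash -c 'echo ok'
```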
Please see https://github.com/projectatomic/runc/pull/10, which fixes this problem.

(NOTE: While the backport of that single commit/PR seems to be enough, it's probably best to look at backporting more, since there were other changes around that code. Perhaps a whole refresh of upstream "runc" would be good there.)

Cheers,
Filipe
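If it helps triage, the runc build a given docker package carries can be read straight out of `docker info` (a sketch; the RuncCommit values below are the ones from the Docker Info dump earlier in this bug, and I'm assuming `--format` works on this 1.13 build as it does upstream):

```
# Earlier in this bug the daemon reported:
#   RuncCommit:{ID:1ab62f1ec5429ccf03e8dfcc2110bc665ec9e308-dirty
#               Expected:9df8b306d01f59d3a8029be411de015b7304dd8f}
# Query the same fields directly:
docker info --format 'runc: {{.RuncCommit.ID}} (expected {{.RuncCommit.Expected}})'
```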
I'm seeing this when installing/running OpenShift Origin 3.9.0 on a Fedora Atomic Host release candidate. This is blocking future releases of FAH. I tracked down the problem to this change:

```
# rpm-ostree db diff a5f1234a302fb064f67f09afe8ddd9cbac524a406a257a562fd18000dac99ba8 cefc79e6ea4d7e5eec51a32c00e1ecd6ca678d322406fecd347bc9c49e5d5255
ostree diff commit old: a5f1234a302fb064f67f09afe8ddd9cbac524a406a257a562fd18000dac99ba8
ostree diff commit new: cefc79e6ea4d7e5eec51a32c00e1ecd6ca678d322406fecd347bc9c49e5d5255
Upgraded:
  docker 2:1.13.1-51.git4032bd5.fc28 -> 2:1.13.1-56.git6c336e4.fc28
  docker-common 2:1.13.1-51.git4032bd5.fc28 -> 2:1.13.1-56.git6c336e4.fc28
  docker-rhel-push-plugin 2:1.13.1-51.git4032bd5.fc28 -> 2:1.13.1-56.git6c336e4.fc28
  quota 1:4.04-5.fc28 -> 1:4.04-6.fc28
  quota-nls 1:4.04-5.fc28 -> 1:4.04-6.fc28
  selinux-policy 3.14.1-29.fc28 -> 3.14.1-30.fc28
  selinux-policy-targeted 3.14.1-29.fc28 -> 3.14.1-30.fc28
Removed:
  oci-register-machine-0-6.1.git66fa845.fc28.x86_64
  systemd-container-238-8.git0e0aa59.fc28.x86_64
```

An example of a container not getting started is one of the glusterfs daemonset containers. Here is a snippet from oc describe:

```
Warning  FailedCreatePodSandBox  7m (x16287 over 5h)  kubelet, 10.0.12.155  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "glusterfs-storage-mlpdl": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
Normal   SandboxChanged          2m (x16532 over 5h)  kubelet, 10.0.12.155  Pod sandbox changed, it will be killed and re-created.
```
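Until a fixed docker build is available, rolling back to the pre-upgrade deployment is one way to confirm the regression tracks that package set (a sketch; assumes the previous deployment is still on disk):

```
# Boot back into the previous deployment (docker 2:1.13.1-51 per the
# db diff above); rpm-ostree keeps it on disk until it is cleaned up.
rpm-ostree rollback
systemctl reboot
```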
Any chance this is SELinux related?
Mrunal, this is another bz about the cgroup fix that went into runc :/ https://github.com/projectatomic/runc/commit/99a2d0844a013541744154a07380422a073c4926
docker-1.13.1-59.gitaf6b32b.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-c2e93d5623
docker-1.13.1-59.gitaf6b32b.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-993659ebfd
Ran an openshift cluster on top of docker-1.13.1-59.gitaf6b32b.fc28 using ostree ref `fedora/28/x86_64/atomic-host` in repo `https://dustymabe.fedorapeople.org/repo/`; that fixes it for me.
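For anyone else wanting to test that build, rebasing onto the ref would look roughly like this (a sketch; the remote name `dusty` and the --no-gpg-verify flag are my assumptions for a scratch repo, not from the comment):

```
# Add the scratch repo as an ostree remote, rebase onto the test ref,
# and reboot into the new deployment:
sudo ostree remote add --no-gpg-verify dusty https://dustymabe.fedorapeople.org/repo/
sudo rpm-ostree rebase dusty:fedora/28/x86_64/atomic-host
sudo systemctl reboot
```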
docker-1.13.1-59.gitaf6b32b.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-c2e93d5623
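For testers, the update can be pulled in directly from updates-testing using the advisory ID from the Bodhi link above (standard dnf usage; restarting docker afterwards is my suggestion):

```
# Install the fix from updates-testing on Fedora 28 and restart docker:
sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2018-c2e93d5623
sudo systemctl restart docker
```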
docker-1.13.1-59.gitaf6b32b.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.
docker-1.13.1-59.gitaf6b32b.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-993659ebfd
docker-1.13.1-59.gitaf6b32b.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.