Bug 1584909 - oc cluster up does not work on docker-2:1.13.1-56.git6c336e4.fc28.x86_64
Summary: oc cluster up does not work on docker-2:1.13.1-56.git6c336e4.fc28.x86_64
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: docker
Version: 28
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Daniel Walsh
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-05-31 22:23 UTC by Jason Montleon
Modified: 2018-08-15 13:26 UTC
CC: 21 users

Fixed In Version: docker-1.13.1-59.gitaf6b32b.fc28 docker-1.13.1-59.gitaf6b32b.fc27
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-13 15:18:25 UTC
Type: Bug
Embargoed:



Description Jason Montleon 2018-05-31 22:23:54 UTC
Description of problem:
oc cluster up fails using docker-2:1.13.1-56.git6c336e4.fc28.x86_64

Version-Release number of selected component (if applicable):
docker-2:1.13.1-56.git6c336e4.fc28.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Run oc cluster up on Fedora 28 with docker-2:1.13.1-56.git6c336e4.fc28.x86_64

Actual results:
Only one container comes up, and its logs show repeated failed actions.


Expected results:
oc cluster up works.

Additional info:
Running dnf downgrade docker brings the system back to docker-2:1.13.1-51.git4032bd5.fc28.x86_64. After restarting the service, oc cluster up works properly.
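
That is, the workaround amounts to (assuming the older build is still available in the repositories):

sudo dnf downgrade docker
sudo systemctl restart docker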

oc version
oc v3.10.0-alpha.0+a861408-1354
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Alternatively, using an /etc/systemd/system/docker.service.d/override.conf drop-in to switch the cgroup driver to cgroupfs gets the newest version working:

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd-current \
          --add-runtime oci=/usr/libexec/docker/docker-runc-current \
          --default-runtime=oci \
          --authorization-plugin=rhel-push-plugin \
          --containerd /run/containerd.sock \
          --exec-opt native.cgroupdriver=cgroupfs \
          --userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
          --init-path=/usr/libexec/docker/docker-init-current \
          --seccomp-profile=/etc/docker/seccomp.json \
          $OPTIONS \
          $DOCKER_STORAGE_OPTIONS \
          $DOCKER_NETWORK_OPTIONS \
          $ADD_REGISTRY \
          $BLOCK_REGISTRY \
          $INSECURE_REGISTRY \
          $REGISTRIES
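
After creating the drop-in, the usual systemd reload and service restart should be needed for the change to take effect:

sudo systemctl daemon-reload
sudo systemctl restart docker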


Without the workaround, you'll eventually see the first container start repeating:
E0531 22:02:00.112824   20802 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-scheduler-localhost": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""

Logs from the container.
docker logs -f 4c7e8cc33aef
Flag --address has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --allow-privileged has been deprecated, will be removed in a future version
Flag --anonymous-auth has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --authentication-token-webhook has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --authentication-token-webhook-cache-ttl has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --authorization-mode has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --authorization-webhook-cache-authorized-ttl has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --authorization-webhook-cache-unauthorized-ttl has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --cadvisor-port has been deprecated, The default will change to 0 (disabled) in 1.12, and the cadvisor port will be removed entirely in 1.13
Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --client-ca-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --cluster-domain has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --fail-swap-on has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --file-check-frequency has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --healthz-bind-address has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --healthz-port has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --host-ipc-sources has been deprecated, will be removed in a future version
Flag --host-ipc-sources has been deprecated, will be removed in a future version
Flag --host-network-sources has been deprecated, will be removed in a future version
Flag --host-network-sources has been deprecated, will be removed in a future version
Flag --host-pid-sources has been deprecated, will be removed in a future version
Flag --host-pid-sources has been deprecated, will be removed in a future version
Flag --http-check-frequency has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --iptables-masquerade-bit has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --max-pods has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --port has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --read-only-port has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cert-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-min-version has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --tls-private-key-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --cluster-dns has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
I0531 22:01:51.495324   20802 feature_gate.go:226] feature gates: &{{} map[]}
W0531 22:01:51.509511   20802 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
I0531 22:01:51.515382   20802 server.go:383] Version: v1.10.0+b81c8f8
I0531 22:01:51.515451   20802 feature_gate.go:226] feature gates: &{{} map[]}
I0531 22:01:51.515603   20802 plugins.go:89] No cloud provider specified.
I0531 22:01:52.166465   20802 server.go:621] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
I0531 22:01:52.167596   20802 container_manager_linux.go:242] container manager verified user specified cgroup-root exists: /
I0531 22:01:52.167623   20802 container_manager_linux.go:247] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/tmp/openshift.local.clusterup/openshift.local.volumes ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>}]} ExperimentalQOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true}
I0531 22:01:52.167843   20802 container_manager_linux.go:266] Creating device plugin manager: true
I0531 22:01:52.168178   20802 state_mem.go:36] [cpumanager] initializing new in-memory state store
I0531 22:01:52.168956   20802 state_file.go:82] [cpumanager] state file: created new state file "/tmp/openshift.local.clusterup/openshift.local.volumes/cpu_manager_state"
I0531 22:01:52.169207   20802 kubelet.go:273] Adding pod path: /var/lib/origin/pod-manifests
I0531 22:01:52.169256   20802 kubelet.go:298] Watching apiserver
E0531 22:01:52.171967   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:52.173509   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:52.173548   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
W0531 22:01:52.175867   20802 kubelet_network.go:139] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0531 22:01:52.175913   20802 kubelet.go:565] Hairpin mode set to "hairpin-veth"
I0531 22:01:52.177877   20802 client.go:75] Connecting to docker on unix:///var/run/docker.sock
I0531 22:01:52.177917   20802 client.go:104] Start docker client with request timeout=2m0s
W0531 22:01:52.182236   20802 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
I0531 22:01:52.187186   20802 docker_service.go:244] Docker cri networking managed by kubernetes.io/no-op
I0531 22:01:52.492151   20802 docker_service.go:249] Docker Info: &{ID:4S3O:NG4F:OEN3:CNV7:IOVE:AYAD:DY45:RTMT:AN6J:VJQY:OBRW:5R4O Containers:172 ContainersRunning:1 ContainersPaused:0 ContainersStopped:171 Images:1173 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type false] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[rhel-push-plugin] Log:[]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:31 OomKillDisable:true NGoroutines:31 SystemTime:2018-05-31T18:01:52.481010732-04:00 LoggingDriver:journald CgroupDriver:systemd NEventsListener:0 KernelVersion:4.16.12-300.fc28.x86_64 OperatingSystem:Fedora 28 (Twenty Eight) OSType:linux Architecture:x86_64 IndexServerAddress:https://registry.fedoraproject.org/v1/ RegistryConfig:0xc42152e850 NCPU:16 MemTotal:135117225984 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:jmontleo.usersys.redhat.com Labels:[] ExperimentalBuild:false ServerVersion:1.13.1 ClusterStore: ClusterAdvertise: Runtimes:map[oci:{Path:/usr/libexec/docker/docker-runc-current Args:[]} runc:{Path:docker-runc Args:[]}] DefaultRuntime:oci Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:0xc420985540} LiveRestoreEnabled:true Isolation: InitBinary:/usr/libexec/docker/docker-init-current ContainerdCommit:{ID:c301b045f9faddcf7693229601303639af6b0885 Expected:aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1} RuncCommit:{ID:1ab62f1ec5429ccf03e8dfcc2110bc665ec9e308-dirty Expected:9df8b306d01f59d3a8029be411de015b7304dd8f} InitCommit:{ID:N/A Expected:949e6facb77383876aeff8a6944dde66b3089574} SecurityOptions:[name=seccomp,profile=/etc/docker/seccomp.json name=selinux]}
I0531 22:01:52.492316   20802 docker_service.go:262] Setting cgroupDriver to systemd
I0531 22:01:52.794304   20802 remote_runtime.go:43] Connecting to runtime service unix:///var/run/dockershim.sock
I0531 22:01:52.797431   20802 kuberuntime_manager.go:186] Container runtime docker initialized, version: 1.13.1, apiVersion: 1.26.0
W0531 22:01:52.799268   20802 probe.go:215] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ does not exist. Recreating.
I0531 22:01:52.800143   20802 csi_plugin.go:61] kubernetes.io/csi: plugin initializing...
E0531 22:01:52.802293   20802 kubelet.go:1299] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data for container /
I0531 22:01:52.802295   20802 server.go:129] Starting to listen on 0.0.0.0:10250
I0531 22:01:52.802389   20802 server.go:952] Started kubelet
E0531 22:01:52.803362   20802 event.go:209] Unable to write event: 'Post https://localhost:8443/api/v1/namespaces/default/events: dial tcp 127.0.0.1:8443: getsockopt: connection refused' (may retry after sleeping)
I0531 22:01:52.803699   20802 server.go:303] Adding debug handlers to kubelet server.
I0531 22:01:52.804019   20802 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
I0531 22:01:52.804075   20802 status_manager.go:140] Starting to sync pod status with apiserver
I0531 22:01:52.804097   20802 volume_manager.go:247] Starting Kubelet Volume Manager
I0531 22:01:52.804117   20802 kubelet.go:1799] Starting kubelet main sync loop.
I0531 22:01:52.804207   20802 desired_state_of_world_populator.go:129] Desired state populator starts to run
I0531 22:01:52.804167   20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
I0531 22:01:52.904326   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:52.904356   20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down]
I0531 22:01:52.910487   20802 kubelet_node_status.go:82] Attempting to register node localhost
E0531 22:01:52.911502   20802 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:53.104499   20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down]
I0531 22:01:53.111720   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:53.122791   20802 kubelet_node_status.go:82] Attempting to register node localhost
E0531 22:01:53.123590   20802 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:53.173167   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:53.174609   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:53.175612   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:53.504664   20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down]
I0531 22:01:53.523839   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:53.531345   20802 kubelet_node_status.go:82] Attempting to register node localhost
E0531 22:01:53.532278   20802 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:54.174504   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:54.176349   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:54.176654   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:54.304816   20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down]
I0531 22:01:54.332526   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:54.339529   20802 kubelet_node_status.go:82] Attempting to register node localhost
E0531 22:01:54.340458   20802 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:55.175704   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:55.177501   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:55.178495   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:55.905084   20802 kubelet.go:1816] skipping pod synchronization - [container runtime is down]
I0531 22:01:55.940661   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:55.948096   20802 kubelet_node_status.go:82] Attempting to register node localhost
E0531 22:01:55.949007   20802 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:56.114536   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:56.119335   20802 cpu_manager.go:155] [cpumanager] starting with none policy
I0531 22:01:56.119358   20802 cpu_manager.go:156] [cpumanager] reconciling every 10s
I0531 22:01:56.119373   20802 policy_none.go:42] [cpumanager] none policy: Start
E0531 22:01:56.119417   20802 container_manager_linux.go:544] failed to get rootfs info,  cannot set ephemeral storage capacity: failed to get device for dir "/tmp/openshift.local.clusterup/openshift.local.volumes": could not find device with major: 0, minor: 45 in cached partitions map
Starting Device Plugin manager
E0531 22:01:56.176908   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:56.178576   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:56.179418   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:56.327769   20802 event.go:209] Unable to write event: 'Post https://localhost:8443/api/v1/namespaces/default/events: dial tcp 127.0.0.1:8443: getsockopt: connection refused' (may retry after sleeping)
E0531 22:01:57.178448   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:57.179605   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:57.180773   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:58.180001   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:58.180332   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:01:58.181724   20802 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:59.105503   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:59.127892   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:59.128051   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
W0531 22:01:59.135632   20802 status_manager.go:461] Failed to get status for pod "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)": Get https://localhost:8443/api/v1/namespaces/kube-system/pods/kube-scheduler-localhost: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:59.141711   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:59.141844   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
W0531 22:01:59.148217   20802 status_manager.go:461] Failed to get status for pod "master-api-localhost_kube-system(8364885b3a3efddca13f5a6ada480812)": Get https://localhost:8443/api/v1/namespaces/kube-system/pods/master-api-localhost: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0531 22:01:59.149165   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:59.154798   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0531 22:01:59.154924   20802 kubelet_node_status.go:82] Attempting to register node localhost
I0531 22:01:59.154959   20802 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
E0531 22:01:59.157234   20802 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
W0531 22:01:59.164447   20802 status_manager.go:461] Failed to get status for pod "master-etcd-localhost_kube-system(4bb0d3d3b26a7aa3dc2bd08a4b4326a5)": Get https://localhost:8443/api/v1/namespaces/kube-system/pods/master-etcd-localhost: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0531 22:02:09.284089   20802 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-scheduler-localhost": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
E0531 22:02:09.284126   20802 kuberuntime_manager.go:646] createPodSandbox for pod "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-scheduler-localhost": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
E0531 22:02:09.284228   20802 pod_workers.go:186] Error syncing pod 4699e4e1c8a7146d6acd92baf39234ef ("kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)"), skipping: failed to "CreatePodSandbox" for "kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)" with CreatePodSandboxError: "CreatePodSandbox for pod \"kube-scheduler-localhost_kube-system(4699e4e1c8a7146d6acd92baf39234ef)\" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod \"kube-scheduler-localhost\": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:258: applying cgroup configuration for process caused \\\"No such device or address\\\"\""

Comment 1 Tom Nguyen 2018-06-02 19:18:02 UTC
The bug usually shows up at the point where origin needs to pull down the openshift/origin-web-console image:

Events:
  Type     Reason                  Age                From                Message
  ----     ------                  ----               ----                -------
  Normal   Scheduled               1m                 default-scheduler   Successfully assigned webconsole-7dfbffd44d-bz44s to localhost
  Normal   SuccessfulMountVolume   1m                 kubelet, localhost  MountVolume.SetUp succeeded for volume "webconsole-config"
  Normal   SuccessfulMountVolume   1m                 kubelet, localhost  MountVolume.SetUp succeeded for volume "serving-cert"
  Normal   SuccessfulMountVolume   1m                 kubelet, localhost  MountVolume.SetUp succeeded for volume "webconsole-token-zxjsp"
  Warning  FailedCreatePodSandBox  13s (x2 over 36s)  kubelet, localhost  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "webconsole-7dfbffd44d-bz44s": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
  Normal   SandboxChanged          10s (x2 over 36s)  kubelet, localhost  Pod sandbox changed, it will be killed and re-created.

https://gitlab.com/tom81094/bugs/raw/master/f28/docker-2:1.13.1-56.git6c336e4/openshift-web-console-logs
https://gitlab.com/tom81094/bugs/raw/master/f28/docker-2:1.13.1-56.git6c336e4/oc-cluster-logs

Comment 2 Enrico Scholz 2018-06-04 11:34:07 UTC
Ditto here; it can be reproduced with:

docker run --rm  --cpu-shares=128  fedora:28 bash
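
A quick sanity check for a candidate build (the echo is just an assumed success marker): on a broken docker the command above fails with the oci runtime "applying cgroup configuration" error, while on a fixed build it should exit cleanly:

docker run --rm --cpu-shares=128 fedora:28 true && echo "cgroup cpu-shares OK"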

Comment 3 Filipe Brandenburger 2018-06-04 18:01:10 UTC
Please see https://github.com/projectatomic/runc/pull/10, which fixes this problem.

(NOTE: While the backport of the single commit/PR seems to be enough, it's probably best to look at backporting more, since there were other changes around that code. Perhaps a whole refresh of upstream "runc" would be good there.)

Cheers,
Filipe

Comment 4 Dusty Mabe 2018-06-11 20:56:21 UTC
I'm seeing this when installing/running OpenShift Origin 3.9.0 on a Fedora Atomic Host release candidate. This is blocking future releases of FAH.

I tracked down the problem to this change:

```
# rpm-ostree db diff a5f1234a302fb064f67f09afe8ddd9cbac524a406a257a562fd18000dac99ba8 cefc79e6ea4d7e5eec51a32c00e1ecd6ca678d322406fecd347bc9c49e5d5255 
ostree diff commit old: a5f1234a302fb064f67f09afe8ddd9cbac524a406a257a562fd18000dac99ba8
ostree diff commit new: cefc79e6ea4d7e5eec51a32c00e1ecd6ca678d322406fecd347bc9c49e5d5255
Upgraded:
  docker 2:1.13.1-51.git4032bd5.fc28 -> 2:1.13.1-56.git6c336e4.fc28
  docker-common 2:1.13.1-51.git4032bd5.fc28 -> 2:1.13.1-56.git6c336e4.fc28
  docker-rhel-push-plugin 2:1.13.1-51.git4032bd5.fc28 -> 2:1.13.1-56.git6c336e4.fc28
  quota 1:4.04-5.fc28 -> 1:4.04-6.fc28
  quota-nls 1:4.04-5.fc28 -> 1:4.04-6.fc28
  selinux-policy 3.14.1-29.fc28 -> 3.14.1-30.fc28
  selinux-policy-targeted 3.14.1-29.fc28 -> 3.14.1-30.fc28
Removed:
  oci-register-machine-0-6.1.git66fa845.fc28.x86_64
  systemd-container-238-8.git0e0aa59.fc28.x86_64
```

An example of a container not getting started is one of the glusterfs daemonset containers. Here is a snippet from oc describe:

```
  Warning  FailedCreatePodSandBox  7m (x16287 over 5h)  kubelet, 10.0.12.155  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "glusterfs-storage-mlpdl": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:258: applying cgroup configuration for process caused \"No such device or address\""
  Normal   SandboxChanged          2m (x16532 over 5h)  kubelet, 10.0.12.155  Pod sandbox changed, it will be killed and re-created.
```
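
For Atomic Host, the interim mitigation is presumably just rolling back to the previous (working) deployment:

```
# revert to the prior ostree deployment and reboot into it
sudo rpm-ostree rollback -r
```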

Comment 5 Daniel Walsh 2018-06-12 12:57:40 UTC
Any chance this is SELinux related?

Comment 6 Antonio Murdaca 2018-06-12 12:59:05 UTC
Mrunal, this is another bz about the cgroup fix that went into runc :/ https://github.com/projectatomic/runc/commit/99a2d0844a013541744154a07380422a073c4926

Comment 7 Fedora Update System 2018-06-12 19:26:05 UTC
docker-1.13.1-59.gitaf6b32b.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-c2e93d5623

Comment 8 Fedora Update System 2018-06-12 20:16:44 UTC
docker-1.13.1-59.gitaf6b32b.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-993659ebfd

Comment 9 Dusty Mabe 2018-06-12 21:31:21 UTC
Running an openshift cluster on top of docker-1.13.1-59.gitaf6b32b.fc28, using ostree ref `fedora/28/x86_64/atomic-host` in repo `https://dustymabe.fedorapeople.org/repo/`, fixes it for me.
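
To reproduce that test, something along these lines should work (the remote name is arbitrary, and --no-gpg-verify is assumed for the scratch repo):

sudo ostree remote add --no-gpg-verify dustymabe https://dustymabe.fedorapeople.org/repo/
sudo rpm-ostree rebase dustymabe:fedora/28/x86_64/atomic-host
sudo systemctl reboot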

Comment 10 Fedora Update System 2018-06-13 04:31:54 UTC
docker-1.13.1-59.gitaf6b32b.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-c2e93d5623
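
Until it goes stable, the testing build can presumably be pulled directly with something like:

sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2018-c2e93d5623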

Comment 11 Fedora Update System 2018-06-13 15:18:25 UTC
docker-1.13.1-59.gitaf6b32b.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2018-06-14 13:47:26 UTC
docker-1.13.1-59.gitaf6b32b.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-993659ebfd

Comment 13 Fedora Update System 2018-06-17 19:44:16 UTC
docker-1.13.1-59.gitaf6b32b.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

