1698251 – Bind: Address already in use for openshift-kube-scheduler/openshift-kube-scheduler

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	4.1.0
Assignee:	ravig
QA Contact:	Weinan Liu
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-04-09 22:34 UTC by W. Trevor King
Modified:	2019-06-04 10:47 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:47:18 UTC
Target Upstream Version:
Embargoed:

Attachments
Occurrences of this error in CI from 2019-04-08T21:15 to 2019-04-09T20:50 UTC (258.49 KB, image/svg+xml), 2019-04-09 22:36 UTC, W. Trevor King

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0758	0	None	None	None	2019-06-04 10:47:25 UTC

Internal Links: 1691055

Description W. Trevor King 2019-04-09 22:34:04 UTC

Description of problem:

$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6581/artifacts/e2e-aws/pods/openshift-kube-scheduler_openshift-kube-scheduler-ip-10-0-134-174.ec2.internal_scheduler_previous.log.gz | gunzip | tail -n1
failed to create listener: failed to listen on 0.0.0.0:10251: listen tcp 0.0.0.0:10251: bind: address already in use

Just like bug 1691055, but for a different operator.
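To spot this failure in other pod logs, a small filter along these lines may help. It is only a sketch: `bind_conflict_addr` is a hypothetical helper name, not part of any OpenShift tooling, and it assumes the log line has the same shape as the one quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a log on stdin and print the address from the
# last "bind: address already in use" failure, or nothing if there is none.
bind_conflict_addr() {
  grep -E 'listen tcp [^ ]+: bind: address already in use' \
    | tail -n 1 \
    | sed -E 's/.*listen tcp ([^ ]+): bind: address already in use.*/\1/'
}

# Example using the log line quoted in this report.
printf '%s\n' 'failed to create listener: failed to listen on 0.0.0.0:10251: listen tcp 0.0.0.0:10251: bind: address already in use' \
  | bind_conflict_addr
```

Piping the output of the `curl ... | gunzip` command above into `bind_conflict_addr` would print the conflicting address, if any.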

Version-Release number of selected component (if applicable):

$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6581/artifacts/release-images-latest/release-images-latest | jq -r '.spec.tags[] | select(.name == "cluster-kube-scheduler-operator").annotations'
{
  "io.openshift.build.commit.id": "299320e432339747de3c7048d680ae9e22a5af7f",
  "io.openshift.build.commit.ref": "master",
  "io.openshift.build.source-location": "https://github.com/openshift/cluster-kube-scheduler-operator"
}

Comment 1 W. Trevor King 2019-04-09 22:36:38 UTC

Created attachment 1553955
Occurrences of this error in CI from 2019-04-08T21:15 to 2019-04-09T20:50 UTC

This occurred in 36 of our 355 failures (10%) in *-e2e-aws* jobs across the whole CI system over the past 23 hours.  Generated with [1]:

  $ deck-build-log-plot 'kube-scheduler.*listen tcp 0.0.0.0:10251: bind: address already in use'

[1]: https://github.com/wking/openshift-release/tree/debug-scripts/deck-build-log

Comment 2 Greg Blomquist 2019-04-10 12:59:48 UTC

Maciej has a fix in Trevor's linked bug (bug 1691055). Might a similar fix work here?

Comment 3 Seth Jennings 2019-04-10 13:08:00 UTC

kube-controller-manager-operator addresses this with an init container that waits for the port to become free:
https://github.com/openshift/cluster-kube-controller-manager-operator/blob/master/bindata/v3.11.0/kube-controller-manager/pod.yaml#L12-L22
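The idea of waiting for the port can be sketched as a polling loop. This is an illustration only, not the contents of the linked pod.yaml: the function name `wait_for_port_free`, its arguments, and the default timeout are assumptions, and it probes by attempting a local TCP connection through bash's `/dev/tcp` redirection, so it only detects a process that is still accepting connections on the port.

```shell
#!/usr/bin/env bash
# Hypothetical helper: block until nothing accepts TCP connections on
# 127.0.0.1:PORT, or fail once TIMEOUT seconds have passed.
# Usage: wait_for_port_free PORT [TIMEOUT_SECONDS]
wait_for_port_free() {
  local port="$1" timeout_secs="${2:-180}" waited=0
  # The subshell opens a connection and drops it on exit; success means
  # a listener is still bound to the port.
  while (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; do
    if [ "$waited" -ge "$timeout_secs" ]; then
      echo "port ${port} still in use after ${timeout_secs}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

For the scheduler port in this report, an init step could call `wait_for_port_free 10251 180` and start the main container only if it returns 0.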

Comment 4 Seth Jennings 2019-04-10 13:09:58 UTC

Ravi already has a PR open:
https://github.com/openshift/cluster-kube-scheduler-operator/pull/90

Comment 8 errata-xmlrpc 2019-06-04 10:47:18 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
