Bug 1822417 - "bind: address already in use" masked by usage spew in pod logs
Summary: "bind: address already in use" masked by usage spew in pod logs
Keywords:
Status: CLOSED DUPLICATE of bug 1831533
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.5
Hardware: Unspecified
OS: Unspecified
medium
low
Target Milestone: ---
: 4.6.0
Assignee: David Eads
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-04-08 23:54 UTC by W. Trevor King
Modified: 2020-05-21 04:24 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-20 14:54:04 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description W. Trevor King 2020-04-08 23:54:15 UTC
For example [1]:

...
I0408 16:44:23.374217       1 server.go:194] Version: v1.18.0-rc.1
Error: failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use
Usage:
  kube-apiserver [flags]
...

But the usage information is so verbose that the bind error doesn't show up in lastState:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1630/pull-ci-openshift-machine-config-operator-master-e2e-gcp-upgrade/1627/artifacts/e2e-gcp-upgrade/pods.json | jq '.items[] | select(.metadata.name == "kube-apiserver-ci-op-kf7v5-m-0.c.openshift-gce-devel-ci.internal").status.containerStatuses[].lastState.terminated | select(. != null)'
{
  "containerID": "cri-o://fb487ce3249e53080a3f1eecebe269747ca859f3f90bbd066943c574215a87a7",
  "exitCode": 1,
  "finishedAt": "2020-04-08T16:44:23Z",
  "message": "t-signing-key-file string   Path to the file that contains the current private key of the service account token issuer. The issuer will sign issued ID tokens with this private key. (Requires the 'TokenRequest' feature gate.)\n      --service-cluster-ip-range string           A CIDR notation IP range from which to assign service cluster IPs. This must not overlap with any IP ranges assigned to nodes for pods.\n      --service-node-port-range portRange         A port range to reserve for services with NodePort visibility. Example: '30000-32767'. Inclusive at both ends of the range. (default 30000-32767)\n\nGlobal flags:\n\n      --add-dir-header                   If true, adds the file directory to the header\n      --alsologtostderr                  log to standard error as well as files\n  -h, --help                             help for kube-apiserver\n      --log-backtrace-at traceLocation   when logging hits line file:N, emit a stack trace (default :0)\n      --log-dir string                   If non-empty, write log files in this directory\n      --log-file string                  If non-empty, use this log file\n      --log-file-max-size uint           Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)\n      --log-flush-frequency duration     Maximum number of seconds between log flushes (default 5s)\n      --logtostderr                      log to standard error instead of files (default true)\n      --skip-headers                     If true, avoid header prefixes in the log messages\n      --skip-log-headers                 If true, avoid headers when opening log files\n      --stderrthreshold severity         logs at or above this threshold go to stderr (default 2)\n  -v, --v Level                          number for the log level verbosity (default 0)\n      --version version[=true]           Print version information and quit\n      --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging\n\n",
  "reason": "Error",
  "startedAt": "2020-04-08T16:44:23Z"
}

We should reserve the usage dump for invocations that use an unrecognized option, not for invocations where we understand the request but are unable to deliver.

[1]: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1630/pull-ci-openshift-machine-config-operator-master-e2e-gcp-upgrade/1627/artifacts/e2e-gcp-upgrade/pods/openshift-kube-apiserver_kube-apiserver-ci-op-kf7v5-m-0.c.openshift-gce-devel-ci.internal_kube-apiserver_previous.log

Comment 1 W. Trevor King 2020-04-27 21:05:14 UTC
Seen again on [1], where it's now fatal for 4.5 update CI (via bug 1828427).  Should I split off a separate bug for the "bind: address already in use" crashlooping, so this one stays focused on log spew?

[1]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/27314

Comment 2 W. Trevor King 2020-04-27 23:13:38 UTC
Seems like this got a 'SilenceUsage' upstream in [1].  Not sure when that will land in OpenShift, but that should address the usage-spew portion of this.

[1]: https://github.com/kubernetes/kubernetes/pull/90297

Comment 4 David Eads 2020-05-20 14:54:04 UTC
fixed by https://github.com/openshift/origin/pull/24945

*** This bug has been marked as a duplicate of bug 1831533 ***

Comment 5 W. Trevor King 2020-05-21 04:24:35 UTC
Looks like bug 1831533 addressed the usage spew.  Did we have a separate ticket open for the bind contention that was being masked?


Note You need to log in before you can comment on or make changes to this bug.