Pick upstream PR https://github.com/kubernetes/kubernetes/pull/96901. We want it in 4.7: when debugging customer escalations, we run into scenarios where request activity lasts longer than 60s. This PR wires the request context with an appropriate timeout immediately after the request is received, so that the authentication, authorization, and aggregation filters all use a deadline-bound request context. This lays the groundwork for better management of request deadlines. We will inspect the authentication and authorization filters and the aggregation layer to ensure that the wired context is used, so we may pick more fixes along this line in the future.
Not a 4.7 blocker, but something we'd like to merge before code freeze.
Tested in 4.7.0-0.nightly-2021-02-09-024347.

When the timeout is invalid, the request returns 400:

$ curl -XDELETE -ksSH "Authorization: Bearer $TOKEN" https://...:6443/api/v1/namespaces/xxia-proj/pods/node-hello-854495b46-2vmkd?timeout=aaaa
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
  },
  "status": "Failure",
  "message": "invalid timeout specified in the request URL - time: invalid duration \"aaaa\"",
  "reason": "BadRequest",
  "code": 400
}

When the timeout is valid, i.e. ?timeout= or ?timeout=${NUM}s or ?timeout=0s, the requests show no regression.

Per the PR's test code, a complete check would need to verify whether hasDeadlineExpected is set and what deadlineExpected is, but there seems to be no way to check that via e2e. Since the PR's unit tests already cover it, I will not spend more time working out an e2e approach.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633