Bug 2030726 - checkProxyConfig generating high number of requests sent through the proxy endpoints.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Krzysztof Ostrowski
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-12-09 15:19 UTC by German Parente
Modified: 2022-08-24 14:23 UTC
CC: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-24 14:23:47 UTC
Target Upstream Version:
Embargoed:



Description German Parente 2021-12-09 15:19:12 UTC
Description of problem:

A customer observes a high number of requests going through the proxy even though the endpoint is defined in the noProxy settings.

The function checkProxyConfig should send a request once every 5 minutes.

Instead, it is reported that the oauth route:

oauth-openshift.apps.<domain>:443

is going through the proxy roughly 4,000 times an hour.
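
For context, here is a rough Go sketch of the expected behavior; the function name and structure are illustrative, not the actual cluster-authentication-operator code:

```go
package proxycheck

import (
	"log"
	"net/http"
	"time"
)

// checkProxyLoop probes the OAuth route once every 5 minutes.
// http.ProxyFromEnvironment honors NO_PROXY, so a host listed there
// should never be dialed through the proxy endpoint.
func checkProxyLoop(oauthURL string, stop <-chan struct{}) {
	client := &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyFromEnvironment},
		Timeout:   30 * time.Second,
	}

	ticker := time.NewTicker(5 * time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			resp, err := client.Get(oauthURL)
			if err != nil {
				// A failure here is the kind of event one would
				// expect to surface in the operator logs.
				log.Printf("proxy connectivity check failed: %v", err)
				continue
			}
			resp.Body.Close()
		}
	}
}
```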


Version-Release number of selected component (if applicable): 4.8.5

I will give more details in private notes.

Comment 3 Krzysztof Ostrowski 2022-01-30 15:31:45 UTC
I’m adding UpcomingSprint, because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug next sprint.

Comment 7 Krzysztof Ostrowski 2022-06-22 12:45:43 UTC
Hi,

I am starting to work on this bug. Our team has been short on capacity.

Thank you for your patience.

Comment 8 Krzysztof Ostrowski 2022-06-22 14:53:01 UTC
Hey,


It would be lovely if I could get a pcap alongside a must-gather, plus any additional information that could help me understand who is who.

From the pcap we can see that a client is asking an intermediate server to proxy a connection, so the proxy is indeed involved. First they establish the TCP connection, then the client sends an HTTP CONNECT request. The proxy agrees and reports that it has established the connection to the target location, the oauth-openshift.apps domain. But once the client starts the TLS handshake, which the proxy should simply pass through, the proxy kills the connection with a reset. This dance happens roughly once a second.
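
For reference, a minimal Go sketch of the client side of that dance (proxyAddr and target are placeholders; this is illustrative, not the actual client code):

```go
package proxytrace

import (
	"bufio"
	"crypto/tls"
	"fmt"
	"net"
	"net/http"
)

// connectThroughProxy replays the sequence seen in the pcap: TCP to the
// proxy, HTTP CONNECT for the target, then a TLS handshake over the
// tunnel. In the capture the handshake never completes because the
// proxy resets the connection at that point.
func connectThroughProxy(proxyAddr, target string) error {
	// Step 1: plain TCP connection to the proxy endpoint.
	conn, err := net.Dial("tcp", proxyAddr)
	if err != nil {
		return err
	}
	defer conn.Close()

	// Step 2: HTTP CONNECT asking the proxy to tunnel to the target.
	fmt.Fprintf(conn, "CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n", target, target)
	resp, err := http.ReadResponse(bufio.NewReader(conn), &http.Request{Method: http.MethodConnect})
	if err != nil {
		return err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("proxy refused CONNECT: %s", resp.Status)
	}

	// Step 3: TLS handshake over the tunnel; this is where the capture
	// shows the proxy killing the connection with a reset.
	host, _, err := net.SplitHostPort(target)
	if err != nil {
		return err
	}
	return tls.Client(conn, &tls.Config{ServerName: host}).Handshake()
}
```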

Without the appropriate logs it is hard for me to identify who is calling whom, and why the callee is killing the connection.

I checked `checkProxyConfig` in cluster-authentication-operator and it could have been the root cause, but the attached must-gather logs show no trace of that behavior (no error messages indicating connection errors and retries). So either the must-gather does not capture the same issue as the pcap, or it is not `checkProxyConfig` causing all that traffic.

I am also not completely sure whether this is really an auth problem. Still looking into it; it is quite interesting.

Comment 9 German Parente 2022-06-22 15:54:58 UTC
Hi Krzysztof,

thanks a lot for taking care of this bug. I will ask the customer for a new set of data (pcap + must-gather with oauth operator pod logs).

Comment 21 Krzysztof Ostrowski 2022-08-15 14:43:31 UTC
There is a related bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111670

It seems like the router is making "health probe checks" by opening connections and then resetting them.
The proxy in front of the app seems to log them.

Usually this happens at a 5-second interval, but it can happen for every pod behind a route, leading to more than one request per interval.
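
A minimal Go sketch of why such probes show up as resets (illustrative only, not the router's actual implementation): an abortive close after a successful dial sends a RST instead of a normal FIN.

```go
package probe

import (
	"net"
	"time"
)

// tcpProbe opens a TCP connection to a backend and aborts it
// immediately. SetLinger(0) makes Close send a RST rather than a FIN,
// which is why the proxy in front of the app logs these probes as
// connection resets.
func tcpProbe(backend string) error {
	conn, err := net.DialTimeout("tcp", backend, 2*time.Second)
	if err != nil {
		return err
	}
	tcpConn := conn.(*net.TCPConn)
	tcpConn.SetLinger(0) // abortive close: RST on Close
	return tcpConn.Close()
}
```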

There are currently two possible solutions:

- increase the health-check interval, as described here: https://docs.openshift.com/container-platform/4.10/networking/routes/route-configuration.html#nw-route-specific-annotations_route-configuration (see the sketch after this list)

- tell the proxy not to log empty requests.
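
A minimal sketch of the first option, assuming the documented router.openshift.io/haproxy.health.check.interval annotation and the usual OAuth route location (namespace openshift-authentication, route oauth-openshift); the interval value is an example:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	routes := schema.GroupVersionResource{
		Group: "route.openshift.io", Version: "v1", Resource: "routes",
	}
	// Merge-patch the annotation so HAProxy probes the route's
	// backends less often.
	patch := []byte(`{"metadata":{"annotations":{` +
		`"router.openshift.io/haproxy.health.check.interval":"30s"}}}`)

	_, err = client.Resource(routes).Namespace("openshift-authentication").
		Patch(context.TODO(), "oauth-openshift", types.MergePatchType, patch, metav1.PatchOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("health check interval annotation applied")
}
```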

The latter would need to be added to the upstream proxy as a feature and then carried down to our downstream fork.
This might take some time.

Is this helpful, @gparente?

Kudos to @mmasters, who described that behavior in one of the chats.

Comment 22 Krzysztof Ostrowski 2022-08-15 14:44:22 UTC
The above is the current working assumption.

Comment 24 Krzysztof Ostrowski 2022-08-24 14:23:47 UTC
Ok, so I am closing this bug as of now. Thanks all for your support.

