Bug 1939640

Summary: The server has asked for the client to provide credentials
Product: OpenShift Container Platform Reporter: Sonigra Saurab <ssonigra>
Component: Insights Operator Assignee: Tomas Remes <tremes>
Status: CLOSED ERRATA QA Contact: Pavel Šimovec <psimovec>
Severity: high Docs Contact:
Priority: high    
Version: 4.6.z CC: aos-bugs, cpassare, gmeghnag, inecas, jdelft, jortizpa, lmohanty, mfojtik, mklika, scuppett, skanakal, sttts, swasthan, tremes, vlours, wking, xxia
Target Milestone: --- Keywords: Reopened, Upgrades
Target Release: 4.6.z   
Hardware: All   
OS: All   
Last Closed: 2021-04-21 18:31:24 UTC Type: Bug
Bug Depends On: 1925659    
Bug Blocks:    

Description Sonigra Saurab 2021-03-16 17:52:58 UTC
Description of problem:

During an upgrade from OCP 4.6.17 to OCP 4.6.19, the Insights Operator goes into a degraded state due to the error "The server has asked for the client to provide credentials" (pods/log sdn-xxxxx).

Version-Release number of selected component (if applicable):


Actual results:

The upgrade is stuck and the Insights Operator is in a degraded state.

Expected results:

The upgrade completes without any issues.


Additional info:

The MCP is not degraded. Tried regenerating the kubelet CSR for the affected node. There are many errors of the form "Unable to authenticate the request due to an error: x509: certificate signed by unknown authority".

Comment 3 Tomas Remes 2021-03-18 15:05:56 UTC
I created a PR fixing the IO issue. I would suggest creating a separate issue for Node-auth, MCO.

Comment 7 Pavel Šimovec 2021-03-24 13:52:11 UTC
Started at this commit: https://github.com/openshift/insights-operator/commit/5b8e5dce854bfc96a5c1b53a1e2d25346f476639
Changed the IO code of the CSR gatherer to always return an error.
Built and replaced IO on the cluster:

insights-operator-688645c897-f9zs6   1/1       Running   0          42s

Added a CSR resource.

Checked the IO log. It contains the nonsense error I added to the code:
I0324 13:45:50.047785       1 status.go:248] The operator has some internal errors: Source clusterconfig could not be retrieved: Too many requests: brrrrrr
I0324 13:45:50.047998       1 status.go:300] The operator has some internal errors: Source clusterconfig could not be retrieved: Too many requests: brrrrrr

IO is not degraded.

Comment 14 errata-xmlrpc 2021-04-20 19:27:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.25 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1153

Comment 21 Lalatendu Mohanty 2021-04-21 18:55:13 UTC
If the cluster is stuck between two 4.6.z versions, i.e. the Insights Operator is in a degraded state because of this bug, then the safest way to recover is to update to 4.6.25 or a later version. You can update directly from e.g. 4.6.19 to 4.6.25 if that is a recommended edge available in the channel (fast or stable) you are subscribed to. Please note that you do not need to force the update.
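
As an illustration of checking for a recommended edge, here is a minimal Python sketch. It assumes you have already downloaded the update graph for your channel as JSON, and that the JSON has a `nodes` list (each entry carrying a `version`) and an `edges` list of `[from_index, to_index]` pairs. The function name `has_update_edge` and the sample graph are hypothetical and made up for this example. They are not real channel data.

```python
def has_update_edge(graph: dict, current: str, target: str) -> bool:
    """Return True if the graph contains a direct edge from current to target."""
    # Map each version string to its position in the "nodes" list,
    # because edges refer to nodes by index.
    index_by_version = {
        node["version"]: i for i, node in enumerate(graph.get("nodes", []))
    }
    src = index_by_version.get(current)
    dst = index_by_version.get(target)
    if src is None or dst is None:
        return False
    return any(
        edge[0] == src and edge[1] == dst for edge in graph.get("edges", [])
    )


# Made-up sample graph for illustration only (assumed layout, not real data).
sample_graph = {
    "nodes": [{"version": "4.6.17"}, {"version": "4.6.19"}, {"version": "4.6.25"}],
    "edges": [[0, 1], [1, 2]],
}

print(has_update_edge(sample_graph, "4.6.19", "4.6.25"))  # True
print(has_update_edge(sample_graph, "4.6.17", "4.6.25"))  # False
```

This only looks for a direct edge. Whether an edge is actually offered to your cluster is still decided by what `oc adm upgrade` reports.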

Comment 22 Lalatendu Mohanty 2021-04-21 20:28:50 UTC
oc has a client-side guard against starting a new update when the cluster is already in the middle of one. So when you run:
  $ oc adm upgrade --to 4.6.25

(or any version later than 4.6.25) and it cannot trigger the update, it will list any warnings that apply to your cluster.

Review the warnings listed. If the warnings are caused only by the current bug, i.e. the insights cluster operator is degraded, you can add --allow-upgrade-with-warnings to the command (oc adm upgrade --allow-upgrade-with-warnings --to 4.6.25) to trigger the update.
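
To help confirm that insights is the only degraded cluster operator, here is a minimal Python sketch. It assumes the output of `oc get clusteroperators -o json` has been loaded into a dict, and that each item reports a `Degraded` condition under `status.conditions`. The helper names and the sample data are hypothetical. The sketch covers only the Degraded condition, so it does not replace reading the warnings that oc prints.

```python
from typing import List


def degraded_operators(cluster_operators: dict) -> List[str]:
    """Return names of cluster operators whose Degraded condition is "True"."""
    names = []
    for item in cluster_operators.get("items", []):
        for cond in item.get("status", {}).get("conditions", []):
            if cond.get("type") == "Degraded" and cond.get("status") == "True":
                names.append(item["metadata"]["name"])
    return names


def only_insights_degraded(cluster_operators: dict) -> bool:
    """True only when insights is degraded and no other operator is."""
    return degraded_operators(cluster_operators) == ["insights"]


# Made-up sample shaped like `oc get clusteroperators -o json` output.
sample = {
    "items": [
        {
            "metadata": {"name": "insights"},
            "status": {"conditions": [{"type": "Degraded", "status": "True"}]},
        },
        {
            "metadata": {"name": "network"},
            "status": {"conditions": [{"type": "Degraded", "status": "False"}]},
        },
    ]
}

print(degraded_operators(sample))      # ['insights']
print(only_insights_degraded(sample))  # True
```

If the check returns False, some other operator is degraded (or insights is not), and the warnings deserve a closer look before adding --allow-upgrade-with-warnings.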

Comment 23 Tomas Remes 2021-04-22 08:32:51 UTC
Lalatendu already provided the information. Thanks.

Comment 26 W. Trevor King 2021-04-23 04:16:09 UTC
4.6.25 went into stable channels at 2021-04-22 20:36Z [1].

[1]: https://github.com/openshift/cincinnati-graph-data/pull/758#event-4633317668