Bug 1623108 - [free-stg] scale group ami pull keys expire for operations registry
Summary: [free-stg] scale group ami pull keys expire for operations registry
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Unknown
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.x
Assignee: Justin Pierce
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-08-28 13:31 UTC by Justin Pierce
Modified: 2018-10-24 13:16 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-24 13:16:27 UTC
Target Upstream Version:
Embargoed:



Description Justin Pierce 2018-08-28 13:31:38 UTC
Description of problem:
268 projects are currently stuck in the Terminating state. Cluster upgrades are blocked.

Version-Release number of selected component (if applicable):
v3.11.0-0.21.0

How reproducible:
Unknown - current state of the cluster

Comment 4 Xingxing Xia 2018-08-29 02:45:39 UTC
Your attachment shows the error message "unable to retrieve the complete list of server APIs".
FYI, this message shows up in many bugs; here is a search list: https://url.corp.redhat.com/unable-to-retrieve-the-complete-list-of-server-APIs . In short: most of them appear to stem from server-side problems, and some from client-side problems (like bug 1623195).

Comment 5 Michal Fojtik 2018-08-29 08:58:35 UTC
From the logs, it looks like the metrics API server is stuck in ContainerCreating. We cannot finalize the namespace until we can reach that server and ensure that all resources created by that server have been removed.

If we ignore this error, the danger is that when the aggregated API server (metrics) comes back and the deleted namespace is recreated, you might gain access to resources that were part of the deleted namespace.

The easiest way to fix this is to figure out why the metrics API server is stuck in ContainerCreating and get it up and running, so the namespace finalizer can function properly.
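
To find which aggregated API is blocking namespace finalization, one option is to look at the Available condition on each APIService object. The following is only a minimal sketch in Python, not something taken from this bug: the function name, the sample data, and the service name `v1beta1.metrics.k8s.io` are illustrative assumptions, and the input is assumed to have the shape of `oc get apiservices -o json` output.

```python
import json


def unavailable_apiservices(apiservice_list):
    """Return (name, reason, message) for each APIService not reporting Available=True."""
    problems = []
    for item in apiservice_list.get("items", []):
        name = item.get("metadata", {}).get("name", "<unknown>")
        conditions = item.get("status", {}).get("conditions", [])
        # Each APIService is expected to carry one condition of type "Available".
        available = next(
            (c for c in conditions if c.get("type") == "Available"), None
        )
        if available is None:
            problems.append((name, "NoAvailableCondition", ""))
        elif available.get("status") != "True":
            problems.append(
                (name, available.get("reason", ""), available.get("message", ""))
            )
    return problems


# Hypothetical sample data, shaped like an APIServiceList; not taken from this cluster.
SAMPLE = json.loads("""
{
  "kind": "APIServiceList",
  "items": [
    {
      "metadata": {"name": "v1.apps"},
      "status": {"conditions": [
        {"type": "Available", "status": "True", "reason": "Local"}
      ]}
    },
    {
      "metadata": {"name": "v1beta1.metrics.k8s.io"},
      "status": {"conditions": [
        {"type": "Available", "status": "False",
         "reason": "MissingEndpoints",
         "message": "backing service has no ready endpoints"}
      ]}
    }
  ]
}
""")

if __name__ == "__main__":
    for name, reason, message in unavailable_apiservices(SAMPLE):
        print(f"{name}: {reason}: {message}")
```

Against a live cluster, the same function could be fed the parsed output of `oc get apiservices -o json` instead of the sample. Any APIService it reports is a candidate for the one the namespace controller cannot reach.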

Comment 6 Michal Fojtik 2018-08-29 09:01:53 UTC
Alternatively, you can disable the metrics API service (backup && oc delete?), which will unblock the namespace controller; however, the metrics API server might be needed in the next step.

Comment 8 Justin Pierce 2018-08-29 13:57:04 UTC
Moving to online component as this is presently specific to the online environment and the means by which docker pull secrets are managed.

