Description of problem:
Seeing errors in logs for pruning such as:

E0306 16:03:15.892155 1 prune_controller.go:303] key failed with : unable to set pruner pod ownerrefs: configmap "revision-status-0" not found

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Check logs of any static pod operator (e.g., kube-apiserver-operator)
2.
3.

Actual results:
The error message appears.

Expected results:
The error does not appear.

Additional info:
This may be related to revisions failing to prune; previous investigation suggested the failure could be getting stuck in backoff/retry logic that prevents future pruning attempts from running. Regardless, there is never a "revision-status-0" configmap and the pruner should not be looking for one, since there is no explicit revision "0".
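For context, here is a minimal sketch of how a pruner pod's ownerRef is derived from the per-revision status configmap. This is not the actual library-go code; setPrunerPodOwnerRefs and getConfigMap are hypothetical names. The point is that the lookup is keyed by the revision number, so a revision of 0 (which never has a status configmap) can only ever produce the "not found" error above:

// A minimal sketch (not the actual library-go code) of deriving the pruner
// pod's ownerRef from the revision-status configmap for a given revision.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// getConfigMap stands in for a client-go ConfigMap GET; it returns an error
// when the named configmap does not exist.
type getConfigMap func(name string) (*corev1.ConfigMap, error)

// setPrunerPodOwnerRefs points the pruner pod at the revision-status
// configmap for the given revision, so the pod is garbage-collected with it.
func setPrunerPodOwnerRefs(get getConfigMap, revision int, pod *corev1.Pod) error {
	name := fmt.Sprintf("revision-status-%d", revision)
	cm, err := get(name)
	if err != nil {
		// With revision == 0 this is where the reported
		// `configmap "revision-status-0" not found` error surfaces.
		return fmt.Errorf("unable to set pruner pod ownerrefs: %v", err)
	}
	pod.OwnerReferences = append(pod.OwnerReferences, metav1.OwnerReference{
		APIVersion: "v1",
		Kind:       "ConfigMap",
		Name:       cm.Name,
		UID:        cm.UID,
	})
	return nil
}

func main() {
	// Simulate the failure: no revision-status-0 configmap exists.
	notFound := func(name string) (*corev1.ConfigMap, error) {
		return nil, fmt.Errorf("configmap %q not found", name)
	}
	err := setPrunerPodOwnerRefs(notFound, 0, &corev1.Pod{})
	fmt.Println(err)
}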
Previous attempt to fix this for reference: https://github.com/openshift/library-go/pull/291
Opened https://github.com/openshift/library-go/pull/303 to try to address this again. On closer inspection, though, the errors mentioned above should not be coming from line 303 in prune_controller.go: after the earlier fix (https://github.com/openshift/library-go/pull/291) that error would have been bumped down two lines, and in the most recent code it is still on line 305. This leads me to believe the payload image may not have been updated; the vendored library-go code in kube-apiserver-operator on master is up to date with the fix.

@David, could you try to verify again whether you can still get this message from where it currently is in the code? I don't know any other way to verify the running version of the operator itself, but this is definitely a signal.
Release image building was fixed last week, and I confirmed that this is still an issue. Testing https://github.com/openshift/cluster-kube-apiserver-operator/pull/335 (which includes the library-go fix above) cleared up the problem, so once that merges this can hopefully move to QA.
The PR has merged, so this should be addressed now.
Confirmed with OCP; still seeing errors like:

[root@192 ~]# oc version --short
Client Version: v4.0.22
Server Version: v1.12.4+befe71b
Payload: 4.0.0-0.nightly-2019-03-19-004004

[root@192 ~]# oc logs -f po/kube-apiserver-operator-6764545587-qhbft -n openshift-kube-apiserver-operator | grep ownerrefs
I0319 09:21:21.254250 1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"b75e0411-49f4-11e9-a3be-02b842adcc06", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'InstallerPodFailed' Failed to create installer pod for revision 19 on node "ip-172-31-131-4.ap-northeast-2.compute.internal": unable to set installer pod ownerrefs: configmap "revision-status-19" not found
E0319 09:21:21.315681 1 installer_controller.go:762] key failed with : unable to set installer pod ownerrefs: configmap "revision-status-19" not found
E0319 09:21:21.317350 1 installer_controller.go:762] key failed with : unable to set installer pod ownerrefs: configmap "revision-status-19" not found
Thanks for verifying. I believe the issue you're seeing is actually a separate bug (see https://bugzilla.redhat.com/show_bug.cgi?id=1686070). We saw the "revision-status-0" errors frequently and regularly, and the problem in this bug was that revision-status-0 should never exist. The issue you're seeing is more likely that configmaps which *should* exist (for other revision numbers) are being deleted. That issue also involves the installer pod, whereas the problem in this bug was with the pruner pod. Could you please verify that the "revision-status-0" issue is resolved, and if necessary add this new information to the bug mentioned above? Thanks!
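To make the distinction concrete, here is a small extension of the sketch from the description. The ensureOwnerRef helper and isPruner flag are hypothetical and not the actual library-go fix: for the pruner, revision 0 means "no revision yet" and the configmap lookup should be skipped entirely, whereas the installer only runs for revisions that were actually created, so a missing revision-status-N configmap there suggests a configmap that should exist was deleted.

// A hypothetical guard illustrating the two failure modes; reuses the
// getConfigMap and setPrunerPodOwnerRefs names from the earlier sketch.
func ensureOwnerRef(get getConfigMap, revision int, pod *corev1.Pod, isPruner bool) error {
	if isPruner && revision == 0 {
		// Revision 0 never has a status configmap; nothing can own the pruner pod yet.
		return nil
	}
	// For the installer (revision >= 1), a "not found" here is a genuine
	// problem: the revision-status configmap should exist but has gone missing.
	return setPrunerPodOwnerRefs(get, revision, pod)
}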
Since I could no longer see the error about revision-status-0, I will verify this.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758