Bug 1941224

Summary: Serial e2e should not complain about the authentication operator going Progressing=True during the "test RequestHeaders IdP" test-case
Product: OpenShift Container Platform Reporter: W. Trevor King <wking>
Component: Test Framework Assignee: W. Trevor King <wking>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: low Docs Contact:
Priority: low    
Version: 4.8 CC: aconstan, bleanhar, ricarril, zzhao
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1939744 Environment:
Last Closed: 2021-10-06 20:51:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description W. Trevor King 2021-03-20 22:17:18 UTC
+++ This bug was initially created as a clone of Bug #1939744 +++

Origin recently began watching ClusterOperator conditions for surprising behavior [1].  That's turned up things like [2,3]:

  [bz-apiserver-auth] clusteroperator/authentication should not change condition/Progressing
    Run #0: Failed	0s
      4 unexpected clusteroperator state transitions during e2e test run 

      authentication was Progressing=false, but became Progressing=true at 2021-03-16 18:37:52.415503966 +0000 UTC -- OAuthServerDeploymentProgressing: Waiting for OAuth server observed generation 3 to match expected generation 4
      authentication was Progressing=true, but became Progressing=false at 2021-03-16 18:38:31.871845671 +0000 UTC -- All is well
      authentication was Progressing=false, but became Progressing=true at 2021-03-16 18:38:46.819597482 +0000 UTC -- OAuthServerDeploymentProgressing: Waiting for OAuth server observed generation 6 to match expected generation 7
      authentication was Progressing=true, but became Progressing=false at 2021-03-16 18:39:16.842078089 +0000 UTC -- All is well
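
For anyone wanting to post-process these transition lines, here is a minimal Python sketch that turns one line into a structured record. The regular expression is an assumption fitted to the four lines quoted above (it expects the "+0000 UTC" suffix Go prints for UTC timestamps); `Transition` and `parse_transition` are hypothetical names, not anything from origin.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Assumed pattern, fitted only to the lines quoted in this bug.
TRANSITION_RE = re.compile(
    r"^\s*(?P<operator>\S+) was (?P<condition>\w+)=(?P<old>\w+), "
    r"but became (?P=condition)=(?P<new>\w+) "
    r"at (?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})"
    r"(?:\.(?P<frac>\d+))? \+0000 UTC -- (?P<message>.*)$"
)


class Transition(NamedTuple):
    operator: str
    condition: str
    old: str
    new: str
    at: datetime
    message: str


def parse_transition(line: str) -> Optional[Transition]:
    """Parse one 'X was C=a, but became C=b at T -- msg' line, or return None."""
    m = TRANSITION_RE.match(line)
    if m is None:
        return None
    # The log carries nanoseconds; datetime holds microseconds, so truncate.
    micros = int((m["frac"] or "0")[:6].ljust(6, "0"))
    at = datetime.strptime(
        f"{m['date']} {m['time']}", "%Y-%m-%d %H:%M:%S"
    ).replace(microsecond=micros, tzinfo=timezone.utc)
    return Transition(
        m["operator"], m["condition"], m["old"], m["new"], at, m["message"]
    )
```

Lines that do not match the pattern (for example the "Run #0" header) come back as None, so a caller can feed the whole block through and keep only the records.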

Per the API docs, Progressing is for [4]:

  Progressing indicates that the operator is actively rolling out new code, propagating config changes, or otherwise moving from one steady state to another.  Operators should not report progressing when they are reconciling a previously known state.

In this case, the 18:37 disruption seems to have been an actual config change (which makes sense, given that the auth operator was mentioning generation bumps):

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-serial-4.8/1371878957158240256/artifacts/e2e-aws-serial/e2e.log | grep '18:37:4'
  ...
  Mar 16 18:37:42.861 I e2e-test/"[Serial] [sig-auth][Feature:OAuthServer] [RequestHeaders] [IdP] test RequestHeaders IdP [Suite:openshift/conformance/serial]" started
  ...
  Mar 16 18:37:46.991 I ns/openshift-authentication-operator deployment/authentication-operator reason/ObserveIdentityProviders identity providers changed to [map["challenge":%!q(bool=true) "login":%!q(bool=true) "mappingMethod":"claim" "name":"test-request-header" "provider":map["apiVersion":"osin.config.openshift.io/v1" "challengeURL":"https://dontcare.com/challenges/oauth/authorize?${query}" "clientCA":"/var/config/user/idp/0/configMap/v4-0-config-user-idp-0-ca/ca.crt" "clientCommonNames":["A good cert" "testing-client-cert" "Some other cert"] "emailHeaders":[] "headers":["X-Remote-User"] "kind":"RequestHeaderIdentityProvider" "loginURL":"https://dontcare.com/web-login/oauth/authorize?${query}" "nameHeaders":[] "preferredUsernameHeaders":[]]]]
  ...

The test suite should disable the Progressing invariant for the authentication operator while the "test RequestHeaders IdP" test-case is running.
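
A rough sketch of that idea, in Python for brevity (origin's real check is Go, and this is not its code): drop a transition from the failure list when an allow-listed test was running at the time. `EXPECTED_DURING`, `RunningTest`, `Transition`, `unexpected_transitions`, and the `grace` parameter are all hypothetical names. The grace period is an assumption of mine, on the theory that a rollout may settle shortly after the test-case itself finishes.

```python
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple


class Transition(NamedTuple):
    operator: str
    condition: str
    at: datetime


class RunningTest(NamedTuple):
    name: str
    start: datetime
    end: datetime


# Hypothetical allow-list: (operator, condition) -> substrings of the names
# of tests that legitimately cause that condition to change.
EXPECTED_DURING = {
    ("authentication", "Progressing"): ["test RequestHeaders IdP"],
}


def unexpected_transitions(
    transitions: List[Transition],
    running_tests: List[RunningTest],
    grace: timedelta = timedelta(0),
) -> List[Transition]:
    """Return the transitions not covered by an allow-listed running test."""
    flagged = []
    for tr in transitions:
        allowed = EXPECTED_DURING.get((tr.operator, tr.condition), [])
        covered = any(
            rt.start <= tr.at <= rt.end + grace
            and any(name in rt.name for name in allowed)
            for rt in running_tests
        )
        if not covered:
            flagged.append(tr)
    return flagged
```

With the 18:37:42.861 start time from the e2e.log excerpt above and a test end time you supply yourself (the excerpt does not show one), the 18:37:52 authentication transition would be filtered out. Transitions for other operators, or outside the window, would still be reported.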

[1]: https://github.com/openshift/origin/pull/25918#event-4423357757
[2]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-blocking#release-openshift-ocp-installer-e2e-aws-serial-4.8
[3]: https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-serial-4.8/1371878957158240256
[4]: https://github.com/openshift/api/blob/8356aa4d4afb94790d3ad58c4debe0e1bdabcbe9/config/v1/types_cluster_operator.go#L147-L151

Comment 1 W. Trevor King 2021-03-20 22:27:32 UTC
Also, if Test Infrastructure is the wrong component for origin, let me know (and probably get a 'component' entry into [1]).

[1]: https://github.com/openshift/origin/blob/1bbd924fd280f360873d466fee68ce5c77c0f5ee/OWNERS

Comment 2 Steve Kuznetsov 2021-05-17 19:21:53 UTC
@wking shouldn't the auth team own this?

Comment 3 W. Trevor King 2021-05-17 19:46:45 UTC
It's not the auth folks' fault.  Their operator behavior is appropriate.  It's Clayton/David/me in [1] that are complaining about that valid operator behavior, because the new origin test code is too generic and doesn't realize that the 'test RequestHeaders IdP' is actually bumping an auth config object.  But there's no Bugzilla component for "Clayton/David/me in origin", or even "origin's test suite".  So I just picked the component that had "Test" in the name :/.  Would growing a real Bugzilla component for origin be hard?

[1]: https://github.com/openshift/origin/blob/4fb407c2e22faa27276287f411829fb84a22545a/pkg/synthetictests/operators.go#L34

Comment 4 W. Trevor King 2021-08-03 01:28:40 UTC
We have a Test Framework component now, so moving this one over there.

Comment 5 W. Trevor King 2021-08-03 19:11:43 UTC
[1] is dropping all:

  clusteroperator/... should not change condition/Progressing

until we have time to focus on them.

[1]: https://github.com/openshift/origin/pull/26374

Comment 8 Brenton Leanhardt 2021-09-27 13:19:10 UTC
Hi Trevor, can this bug be closed?

Comment 9 W. Trevor King 2021-10-06 20:51:51 UTC
Yup, we don't watch for Progressing changes at all since origin#26374.