Description of problem:
The customer has connected Azure AD via the OpenID Connect setup and occasionally sees a user who is unable to authenticate. They have verified that this user can log in to Azure without issue, but when the user tries to log in to OpenShift they see only the error message "State is Invalid". Increasing the log level on OCP does not change the message, because the actual state is overwritten in the code [0].

[0] https://github.com/openshift/origin/blob/master/pkg/auth/oauth/external/handler.go#L162

Version-Release number of selected component (if applicable):
3.4.1.18

Additional info:
The "problem user" is now reporting success via Chrome and Firefox, though he still sees the same problem when logging in with Internet Explorer. The issue appears to have first been identified when trying to log in via Firefox.
Is this an HA master setup? If so, do all the masters specify an oauthConfig.sessionSecretsFile with identical authentication/encryption values? If not, the secure session which stores the state may not be readable by all masters.
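One way to run this comparison without pasting the secret values into the ticket is to compare digests of each master's file. The following is a minimal sketch, assuming the file contents have already been collected from each master; the function names `secrets_digest` and `masters_agree` are hypothetical and not part of OpenShift.

```python
import hashlib


def secrets_digest(text: str) -> str:
    """Return a SHA-256 digest of a session-secrets file's content.

    Trailing whitespace and blank lines are ignored so that cosmetic
    differences do not show up as a mismatch. Leading indentation is
    kept because it is significant in YAML.
    """
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    normalized = "\n".join(lines)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def masters_agree(contents_by_master: dict) -> bool:
    """Return True if every master's file normalizes to the same digest."""
    digests = {secrets_digest(text) for text in contents_by_master.values()}
    return len(digests) <= 1
```

Only the digests need to be shared; if `masters_agree` returns False, printing `secrets_digest` per master shows which one differs.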
The customer indicated that this is a multi-master setup. They logged on to all three masters in the cluster and confirmed that the session secrets file (/etc/origin/master/session-secrets.yaml) is the same on all servers. They also collected the browser versions to provide better context: Google Chrome (Version 58.0.3029.96 (64-bit)), Firefox (Version 52.1.0 (32-bit)) and IE (Version 11.0).
Can they check for time skew between the API servers, and between the API servers and the client running the browser? This might be an issue with the browser discarding the cookie that stores the state because the client clock is more than 5 minutes behind the server (see https://bugzilla.redhat.com/show_bug.cgi?id=1270436).
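A rough way to check the client side is to compare the `Date` header of any HTTP response from the master with the client's own clock. The sketch below assumes the header value has already been captured (for example from the browser's network tools); the names `client_lag`, `client_too_far_behind` and `MAX_SKEW` are hypothetical, and the 5-minute limit is simply the figure mentioned in this comment.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Threshold taken from the comment above; treat it as an assumption.
MAX_SKEW = timedelta(minutes=5)


def client_lag(server_date_header: str, client_now: datetime) -> timedelta:
    """Return how far the client clock is behind the server clock.

    server_date_header is an HTTP Date header value such as
    "Mon, 15 May 2017 12:00:00 GMT"; client_now must be timezone-aware.
    A positive result means the client is behind the server.
    """
    server_now = parsedate_to_datetime(server_date_header)
    return server_now - client_now


def client_too_far_behind(server_date_header: str, client_now: datetime) -> bool:
    """Return True if the client lags the server by more than MAX_SKEW."""
    return client_lag(server_date_header, client_now) > MAX_SKEW
```

On the client, `client_now` would typically be `datetime.now(timezone.utc)` taken at the moment the response is received; network latency adds a small error, which is negligible against a 5-minute threshold.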
Logging issue is fixed in https://github.com/openshift/origin/pull/14692
Verified with the latest OCP 3.6 build; the issue did not reproduce, so setting the status to VERIFIED. Verification steps:
1. Set up OpenShift with Azure AD OpenID Connect.
2. Log in to OpenShift with IE (versions 10 and 11), Firefox (52, 54) and Chrome (59); all logins succeed.
3. Repeat step 2 many times; all attempts succeed.

# openshift version
openshift v3.6.133
kubernetes v1.6.1+5115d708d7
etcd 3.2.1
@Jordan, apologies for the delay. The customer uses NTP servers for all of their systems and confirmed that the affected user was in sync with the NTP server as well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188