Bug 1282718
Summary: | Login is failed with Unauthorized error sometimes on ha etcd environment | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | DeShuai Ma <dma> | |
Component: | apiserver-auth | Assignee: | Jordan Liggitt <jliggitt> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | weiwei jiang <wjiang> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | unspecified | CC: | aos-bugs, ccoleman, dma, jliggitt, mmccomas, wsun | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1289603 | Environment: | |
Last Closed: | 2016-05-12 17:11:26 UTC | Type: | Bug | |
Bug Blocks: | 1289603 |
Description
DeShuai Ma
2015-11-17 08:52:14 UTC
Can you provide the following information?

- What is the HA setup? (how many masters, how many etcd servers)
- How is the https://openshift-162.lab.eng.nay.redhat.com:8443 URL pointing to multiple masters?
- Do you have server logs from the master API?
- Can you include the master configurations from all the HA masters?

Pretty sure this is a stale read issue when using an etcd cluster. I see this in the master configs:

    etcdClientInfo:
      ca: master.etcd-ca.crt
      certFile: master.etcd-client.crt
      keyFile: master.etcd-client.key
      urls:
      - https://openshift-159.lab.eng.nay.redhat.com:2379
      - https://openshift-138.lab.eng.nay.redhat.com:2379
      - https://openshift-155.lab.eng.nay.redhat.com:2379

So there are at least three etcd servers in place, right?

1. The token is created, written to etcd, and returned to the client.
2. The client then uses the token against the users/~ API.
3. The authentication layer attempts to verify the token exists in etcd.

There is no guarantee the same etcd server is queried for the token. In this case, I think a quorum read may be needed when the token is not found.
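The three steps and the proposed fallback can be sketched as a small local simulation. This is only an illustration of the idea, not the code from the actual fix: the two temporary directories stand in for an up-to-date etcd member and a lagging one, and every function name (write_token, replicate, read_local, read_quorum, lookup_token) is hypothetical.

```shell
# Simulation only: each directory stands in for one etcd member's data.
leader_dir=$(mktemp -d)     # member that has applied every write
follower_dir=$(mktemp -d)   # member that may lag behind

# A write lands on the leader first; the follower catches up only when
# replicate runs, which models replication lag.
write_token() { printf '%s\n' "$2" > "$leader_dir/$1"; }
replicate()   { cp "$leader_dir"/* "$follower_dir"/ 2>/dev/null || true; }

# Non-quorum read: served from the (possibly stale) follower.
read_local()  { [ -f "$follower_dir/$1" ] && cat "$follower_dir/$1"; }

# Quorum read: modeled here as a read from the up-to-date leader.
read_quorum() { [ -f "$leader_dir/$1" ] && cat "$leader_dir/$1"; }

# Fast path first; pay for the quorum read only when the token is missing.
lookup_token() { read_local "$1" || read_quorum "$1"; }

# Steps 1-3: the token is written, then looked up before replication.
write_token tok123 alice
lookup_token tok123   # the follower misses; the quorum fallback finds it
```

Retrying with a quorum read only on a miss keeps the common case (token already visible on the queried member) as cheap as before, while a token that truly does not exist still fails after the fallback.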
Fix pending in https://github.com/openshift/origin/pull/6530

Tested with an etcd cluster:

    ip=192.168.99.100
    count=3
    cluster_members=()
    for i in `seq 1 $count`; do
      cluster_members+=("etcd${i}=http://${ip}:700${i}")
    done
    IFS=',' eval 'initial_cluster="${cluster_members[*]}"'
    for i in `seq 1 $count`; do
      docker run -d -p 400${i}:400${i} -p 700${i}:700${i} \
        --name "etcd${i}" quay.io/coreos/etcd:latest \
        -name "etcd${i}" \
        -advertise-client-urls "http://${ip}:400${i}" \
        -listen-client-urls "http://0.0.0.0:400${i}" \
        -initial-advertise-peer-urls "http://${ip}:700${i}" \
        -listen-peer-urls "http://0.0.0.0:700${i}" \
        -initial-cluster-token "my-etcd-cluster" \
        -initial-cluster "${initial_cluster}" \
        -initial-cluster-state "new"
    done

Started from master-config file with:

    etcdClientInfo:
      urls:
      - http://192.168.99.100:4001
      - http://192.168.99.100:4002
      - http://192.168.99.100:4003

https://github.com/openshift/origin/pull/6530 is in the merge queue.

Verified on the latest origin environment; this bug is fixed.
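One way to observe the stale-read behaviour on a test cluster like the one above is to read the same key from every member, once as a plain read and once as a quorum read. The sketch below assumes the members expose the etcd v2 keys API, where appending quorum=true to a GET requests a linearized read. The helper names and the key passed to compare_reads are hypothetical; the key is not the path OpenShift actually uses for tokens.

```shell
ip=192.168.99.100   # same address and member count as the test cluster
count=3

# Print one client URL per member: http://<ip>:4001 ... http://<ip>:400<count>
client_urls() {
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'http://%s:400%s\n' "$ip" "$i"
    i=$((i + 1))
  done
}

# Read a key from every member, first as a plain read and then with
# quorum=true (etcd v2 keys API). A lagging member can answer the plain
# read with "key not found" while the quorum read succeeds.
compare_reads() {
  key=$1   # hypothetical key name, supplied by the caller
  for url in $(client_urls); do
    echo "== ${url}"
    curl -s "${url}/v2/keys/${key}"
    curl -s "${url}/v2/keys/${key}?quorum=true"
  done
}

# Usage (requires the cluster to be running): compare_reads some/test/key
```

The curl calls sit inside a function so the snippet can be sourced without a running cluster; only compare_reads contacts the members.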