Red Hat Bugzilla – Bug 1465022
Kibana login loop with Openshift custom CA
Last modified: 2017-09-13 14:07:40 EDT
Description of problem:
Unable to log in to Kibana when Openshift was deployed using a provided CA to generate all the certificates.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Configure inventory with custom CA.
2. Regenerate CA and Certificates using ansible
3. Regenerate Certificates using ansible
4. Deploy logging using default values
5. Access Kibana page and try to log in
Kibana keeps redirecting to Openshift login page and doesn't show any error
Access Kibana Discovery page
Comparing with a working environment, I have noticed that after providing the credentials, the following request:
does not include a response cookie like the following:
From the kibana-proxy logs (debug enabled):
in callback handler for req path /auth/openshift/callback
10.129.0.1 - - [26/Jun/2017:12:43:53 +0000] "GET /auth/openshift/callback?code=0yddkJVT2bzVanPF7tEzbbyT7b6uygzo7b2i-zFY5DE&state= HTTP/1.1" 302 58 "https://openshift.internal.rromerocerts.quicklab.pnq2.cee.redhat.com/login?then=%2Foauth%2Fauthorize%3Fresponse_type%3Dcode%26redirect_uri%3Dhttps%253A%252F%252Fkibana.apps.rromerocerts.quicklab.pnq2.cee.redhat.com%252Fauth%252Fopenshift%252Fcallback%26scope%3Duser%253Ainfo%2520user%253Acheck-access%2520user%253Alist-projects%26client_id%3Dkibana-proxy" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:54.0) Gecko/20100101 Firefox/54.0"
in passport.ensureAuthenticated for req path /
not authenticated by request session.
10.129.0.1 - - [26/Jun/2017:12:43:53 +0000] "GET / HTTP/1.1" 302 0 "https://openshift.internal.rromerocerts.quicklab.pnq2.cee.redhat.com/login?then=%2Foauth%2Fauthorize%3Fresponse_type%3Dcode%26redirect_uri%3Dhttps%253A%252F%252Fkibana.apps.rromerocerts.quicklab.pnq2.cee.redhat.com%252Fauth%252Fopenshift%252Fcallback%26scope%3Duser%253Ainfo%2520user%253Acheck-access%2520user%253Alist-projects%26client_id%3Dkibana-proxy" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:54.0) Gecko/20100101 Firefox/54.0"
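The referer in the log lines above is percent-encoded twice, which makes the OAuth parameters hard to read. A minimal Python sketch for decoding it follows; the helper name authorize_params is hypothetical and is not part of kibana-proxy.

```python
from urllib.parse import parse_qs, urlsplit

# Referer copied from the proxy log lines above.
REFERER = (
    "https://openshift.internal.rromerocerts.quicklab.pnq2.cee.redhat.com/login"
    "?then=%2Foauth%2Fauthorize%3Fresponse_type%3Dcode"
    "%26redirect_uri%3Dhttps%253A%252F%252Fkibana.apps.rromerocerts.quicklab.pnq2.cee.redhat.com"
    "%252Fauth%252Fopenshift%252Fcallback"
    "%26scope%3Duser%253Ainfo%2520user%253Acheck-access%2520user%253Alist-projects"
    "%26client_id%3Dkibana-proxy"
)


def authorize_params(referer):
    """Decode the nested `then` parameter into the OAuth authorize parameters."""
    outer = parse_qs(urlsplit(referer).query)
    then = outer["then"][0]                 # first level of decoding
    inner = parse_qs(urlsplit(then).query)  # second level of decoding
    return {key: values[0] for key, values in inner.items()}


params = authorize_params(REFERER)
for key in sorted(params):
    print(f"{key} = {params[key]}")
```

The decoded redirect_uri is the kibana-proxy callback path that appears in the first log line.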
Reassigning as this is not a logging problem. The customer does not appear to want to use the recommended method of using separate certificates for external and internal services.
The kibana-proxy talks to the master at https://kubernetes.default.svc.cluster.local which is a cluster-internal address resolved by SkyDNS to an internal IP. It uses the /var/run/secrets/kubernetes.io/serviceaccount/ca.crt supplied to every pod by kubernetes for the purpose of talking to the master at its internal address.
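As a rough illustration of the behaviour described above (kibana-proxy itself is Node.js, so this is not its implementation), the following Python sketch reads the master URL and CA file from OAP_MASTER_URL and OAP_MASTER_CA_FILE and falls back to the internal address and the service account CA. The function names are hypothetical, and the fallback values are assumptions taken from this discussion.

```python
import os
import ssl

# Fallbacks taken from the discussion in this bug; treat them as assumptions.
DEFAULT_MASTER_URL = "https://kubernetes.default.svc.cluster.local"
DEFAULT_CA_FILE = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def master_settings(env=None):
    """Return (master_url, ca_file) from the environment, with fallbacks."""
    env = os.environ if env is None else env
    url = env.get("OAP_MASTER_URL", DEFAULT_MASTER_URL)
    ca_file = env.get("OAP_MASTER_CA_FILE", DEFAULT_CA_FILE)
    return url, ca_file


def build_tls_context(ca_file):
    """Create a verifying TLS context that trusts only the given CA bundle."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # verification is on by default
    ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

If the certificate served at the master address is not signed by a CA in that file, the TLS handshake fails before any token check can happen.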
I'm actually still a little confused about why the solution described here doesn't work for the customer:
Even if they're wanting to use a single address for the external master API and for the nodes and other infrastructure to talk to, because kibana-proxy is making the request against kubernetes.default.svc.cluster.local the master should still be able to serve that address separately with the OpenShift-generated internal cert validated by the internal CA. So if their master config ends up like the following I don't see the problem:
- certFile: custom.crt
I'm also surprised there isn't similarly a problem with the router which I believe also watches the master at this internal address to build its routing tables (I could be off though).
But, let's suppose we wanted to reconfigure kibana-proxy to make the master request using the same address as everything else. This is possible with some reconfiguration to have kibana-proxy use a different address and CA.
The address is actually the easiest thing to change; there's a parameter OAP_MASTER_URL in the deployer kibana-proxy template (see https://github.com/openshift/origin-aggregated-logging/blob/release-1.4/deployer/templates/kibana.yaml#L112) which follows the deployer parameter MASTER_URL (see https://github.com/openshift/origin-aggregated-logging/blob/release-1.4/deployer/deployer.yaml#L154).
The CA is a bit more tricky. I suspect that updating the master-config.yaml to replace ca-bundle.crt with their custom CA chain should work, but I've never tried it (and don't know if there's an ansible variable to achieve this). But that would probably be the most thorough solution if it worked, as other pods would have access to the correct CA not just kibana-proxy.
To update the kibana-proxy specifically, you would need to add a secret (or an entry to an existing secret) to provide the CA bundle. Then you would need to update the environment variable OAP_MASTER_CA_FILE on the kibana-proxy container template (you can see where it is defined now at https://github.com/openshift/origin-aggregated-logging/blob/release-1.4/deployer/templates/kibana.yaml#L121) to point to where the secret is mounted. There is definitely no deployer/ansible parameter for doing this; it would be an after-deployment modification.
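The environment variable change can be sketched as a transformation on the deployment config object. This is an illustration only: set_container_env is a hypothetical helper, the dict is a trimmed stand-in for the real object, and the mount path /etc/custom-ca/ca.crt is an assumption that must match wherever the secret is actually mounted.

```python
import copy


def set_container_env(dc, container_name, name, value):
    """Return a copy of a DeploymentConfig-like dict with one env var set."""
    patched = copy.deepcopy(dc)
    containers = patched["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] != container_name:
            continue
        env = container.setdefault("env", [])
        for entry in env:
            if entry["name"] == name:
                entry["value"] = value  # overwrite the existing entry
                break
        else:
            env.append({"name": name, "value": value})  # add a new entry
        return patched
    raise KeyError(f"container {container_name!r} not found")


# Trimmed stand-in for the real deployment config.
example_dc = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "kibana", "env": []},
        {"name": "kibana-proxy", "env": [
            {"name": "OAP_MASTER_CA_FILE",
             "value": "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"},
        ]},
    ]}}}
}

# Hypothetical mount path for the secret holding the custom CA bundle.
patched_dc = set_container_env(
    example_dc, "kibana-proxy", "OAP_MASTER_CA_FILE", "/etc/custom-ca/ca.crt"
)
```

The secret volume and its volumeMount still have to be added to the same container for the new path to exist.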
Having done all that, it occurs to me that Elasticsearch will need the exact same treatment because it too queries the master with the auth token that it gets. It is not at all apparent to me how or if configuration there is even possible; it's using kubernetes-client which seems to have the values hardcoded. https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/Config.java#L110
I have tested with 6.11.0 and I don't have this problem. Also tested with tag 0.1.0 using nodejs 4.7.2 but it didn't work.
What inputs were provided to the ansible configuration?
1. Did they override the signing CA?
2. Did they provide a custom CA bundle?
3. Were certificates for internal hostnames regenerated as well?
(In reply to Jordan Liggitt from comment #7)
> What inputs were provided to the ansible configuration?
> 1. Did they override the signing CA?
> 2. Did they provide a custom CA bundle?
> 3. Were certificates for internal hostnames regenerated as well?
They redeployed certificates using a custom CA.
I honestly don't know the exact reason but it has been confirmed to be solved after upgrading to nodeJS 4.8.1
Not auth related, moving to Kibana (Logging)
Setting the following env variable on kibana-proxy is enough to avoid the login loop:
NODE_TLS_REJECT_UNAUTHORIZED: 0
If I have read the documentation correctly, this value will ignore self-signed certificates.
What I don't understand is why the default value for OAP_MASTER_CA_FILE, which is /var/run/secrets/kubernetes.io/serviceaccount/ca.crt, is not working.
Thank you in advance for your comments.
> NODE_TLS_REJECT_UNAUTHORIZED: 0
> If I have read the documentation correctly, this value will ignore self-signed certificates.
No! This value will ignore certificate validity entirely, accepting anything.
You should *never* use this option except for some debugging/development reason.
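To make the distinction concrete, here is a Python analogue (not Node.js code, and not what kibana-proxy does) contrasting a context that skips validation entirely with one that keeps validation on and trusts a specific CA bundle. The function names are hypothetical.

```python
import ssl


def insecure_context():
    """Rough analogue of NODE_TLS_REJECT_UNAUTHORIZED=0: accept any certificate."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before changing verify_mode
    ctx.verify_mode = ssl.CERT_NONE  # no chain validation at all
    return ctx


def verifying_context(ca_file=None):
    """Keep verification on; optionally trust a specific CA bundle file."""
    return ssl.create_default_context(cafile=ca_file)
```

The second form is the one to aim for: a custom CA is handled by supplying the right bundle, not by turning verification off.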
We clearly need a CI job that deploys and uses an "external" CA, so we can validate these code paths.