Bug 1465022 - Kibana login loop with Openshift custom CA
Status: NEW
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.4.1
Severity: high
Target Release: 3.8.0
Assigned To: Jeff Cantrill
QA Contact: Xia Zhao
Type: Bug
Reported: 2017-06-26 08:44 EDT by Ruben Romero Montes
Modified: 2017-09-13 14:07 EDT

External Trackers:
GitHub: fabric8io/openshift-auth-proxy/pull/21 (last updated 2017-09-11 05:51 EDT)

Description Ruben Romero Montes 2017-06-26 08:44:36 EDT
Description of problem:
Unable to log in to Kibana when OpenShift was deployed using a provided CA to generate all the certificates.

Version-Release number of selected component (if applicable):
3.4.1

How reproducible:
Always

Steps to Reproduce:
1. Configure the inventory with a custom CA (see the inventory sketch after these steps).

2. Regenerate the CA and certificates using Ansible:
ansible-playbook playbooks/byo/openshift-cluster/redeploy-openshift-ca.yml

3. Regenerate the certificates using Ansible:
ansible-playbook playbooks/byo/openshift-cluster/redeploy-certificates.yml

4. Deploy logging using the default values.
5. Access the Kibana page and try to log in.
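
A minimal sketch of what step 1 might look like in the Ansible inventory, assuming the openshift_master_ca_certificate variable used for the custom-CA redeploy (the file paths are placeholders):

[OSEv3:vars]
# Custom CA consumed by redeploy-openshift-ca.yml in step 2; paths are placeholders.
openshift_master_ca_certificate={'certfile': '/path/to/custom-ca.crt', 'keyfile': '/path/to/custom-ca.key'}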

Actual results:
Kibana keeps redirecting to the OpenShift login page and doesn't show any error.

Expected results:
Access the Kibana Discover page.

Additional info:
Comparing with a working environment, I have noticed that after providing the credentials, the following request:
https://kibana.apps.rromeromlogging34.quicklab.pnq2.cee.redhat.com/auth/openshift/callback?code=Cplg-_4vohCLJpmzH5YBT04EjiFm2qbzwyCAGFlteM4&state=

does not include a response cookie like the following:
openshift-auth-proxy-session
  value:    "l8WjW84RkaRhJIYtNk5qVw.TLxYps…BZfgDAe-ciOmvHTdS44BUkK3RDow"
  path:     "/"
  secure:   true
  httpOnly: true

From the kibana-proxy logs (debug enabled):
in callback handler for req path /auth/openshift/callback
10.129.0.1 - - [26/Jun/2017:12:43:53 +0000] "GET /auth/openshift/callback?code=0yddkJVT2bzVanPF7tEzbbyT7b6uygzo7b2i-zFY5DE&state= HTTP/1.1" 302 58 "https://openshift.internal.rromerocerts.quicklab.pnq2.cee.redhat.com/login?then=%2Foauth%2Fauthorize%3Fresponse_type%3Dcode%26redirect_uri%3Dhttps%253A%252F%252Fkibana.apps.rromerocerts.quicklab.pnq2.cee.redhat.com%252Fauth%252Fopenshift%252Fcallback%26scope%3Duser%253Ainfo%2520user%253Acheck-access%2520user%253Alist-projects%26client_id%3Dkibana-proxy" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:54.0) Gecko/20100101 Firefox/54.0"
in passport.ensureAuthenticated for req path /
not authenticated by request session.
10.129.0.1 - - [26/Jun/2017:12:43:53 +0000] "GET / HTTP/1.1" 302 0 "https://openshift.internal.rromerocerts.quicklab.pnq2.cee.redhat.com/login?then=%2Foauth%2Fauthorize%3Fresponse_type%3Dcode%26redirect_uri%3Dhttps%253A%252F%252Fkibana.apps.rromerocerts.quicklab.pnq2.cee.redhat.com%252Fauth%252Fopenshift%252Fcallback%26scope%3Duser%253Ainfo%2520user%253Acheck-access%2520user%253Alist-projects%26client_id%3Dkibana-proxy" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:54.0) Gecko/20100101 Firefox/54.0"
Comment 2 Peter Portante 2017-06-28 10:44:50 EDT
Reassigning as this is not a logging problem.  The customer does not appear to want to use the recommended method of using separate certificates for external and internal services.
Comment 3 Luke Meyer 2017-06-28 14:24:12 EDT
The kibana-proxy talks to the master at https://kubernetes.default.svc.cluster.local, which is a cluster-internal address resolved by SkyDNS to an internal IP. It uses the /var/run/secrets/kubernetes.io/serviceaccount/ca.crt supplied to every pod by Kubernetes when talking to the master at its internal address.
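
A quick way to check this from inside the kibana-proxy container (assuming curl is available in the image; the pod name is a placeholder):

oc exec logging-kibana-<pod-id> -c kibana-proxy -- \
  curl -sS --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  https://kubernetes.default.svc.cluster.local/healthz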

I'm actually still a little confused about why the solution described here doesn't work for the customer:
https://docs.openshift.com/container-platform/3.5/install_config/certificate_customization.html#configuring-custom-certificates

Even if they want to use a single address for the external master API and for the nodes and other infrastructure to talk to, because kibana-proxy makes its request against kubernetes.default.svc.cluster.local, the master should still be able to serve that address separately with the OpenShift-generated internal cert validated by the internal CA. So if their master config ends up like the following, I don't see the problem:

servingInfo:
  bindAddress: 0.0.0.0:8443
  bindNetwork: tcp4
  certFile: master.server.crt
  clientCA: ca-bundle.crt
  keyFile: master.server.key
  maxRequestsInFlight: 500
  requestTimeoutSeconds: 3600
  namedCertificates:
  - certFile: custom.crt
    keyFile: custom.key
    names:
    - "master.customer.com"

I'm also surprised there isn't a similar problem with the router, which I believe also watches the master at this internal address to build its routing tables (I could be off, though).

But, let's suppose we wanted to reconfigure kibana-proxy to make the master request using the same address as everything else. This is possible with some reconfiguration to have kibana-proxy use a different address and CA.

The address is actually the easiest thing to change; there's a parameter OAP_MASTER_URL in the deployer kibana-proxy template (see https://github.com/openshift/origin-aggregated-logging/blob/release-1.4/deployer/templates/kibana.yaml#L112), which follows the deployer parameter MASTER_URL (see https://github.com/openshift/origin-aggregated-logging/blob/release-1.4/deployer/deployer.yaml#L154). A post-deployment override is sketched below.
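
A minimal sketch of overriding that address after deployment, assuming the 3.4-era logging-kibana DeploymentConfig and kibana-proxy container names (the master URL is a placeholder):

oc set env dc/logging-kibana -c kibana-proxy \
  OAP_MASTER_URL=https://master.customer.com:8443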

The CA is a bit trickier. I suspect that updating the master-config.yaml to replace ca-bundle.crt with their custom CA chain should work, but I've never tried it (and I don't know if there's an ansible variable to achieve this). But that would probably be the most thorough solution if it worked, as other pods would then have access to the correct CA, not just kibana-proxy.

To update the kibana-proxy specifically, you would need to add a secret (or an entry in an existing secret) to provide the CA bundle. Then you would need to update the environment variable OAP_MASTER_CA_FILE on the kibana-proxy container template (you can see where it is defined now at https://github.com/openshift/origin-aggregated-logging/blob/release-1.4/deployer/templates/kibana.yaml#L121) to point to where the secret is mounted. There is definitely no deployer/ansible parameter for doing this; it would be an after-deployment modification (see the sketch below).
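
A minimal sketch of that after-deployment modification, assuming the logging-kibana DeploymentConfig and kibana-proxy container names; the secret name, mount path, and CA bundle path are placeholders:

# Provide the custom CA bundle as a secret.
oc create secret generic kibana-proxy-custom-ca --from-file=ca.crt=/path/to/custom-ca-bundle.crt

# Mount it into the kibana-proxy container.
oc set volume dc/logging-kibana -c kibana-proxy --add --name=custom-ca \
  --type=secret --secret-name=kibana-proxy-custom-ca --mount-path=/custom-ca

# Point the proxy at the mounted bundle.
oc set env dc/logging-kibana -c kibana-proxy OAP_MASTER_CA_FILE=/custom-ca/ca.crt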

Having done all that, it occurs to me that Elasticsearch will need the exact same treatment, because it too queries the master with the auth token it receives. It is not at all apparent to me how, or whether, configuration there is even possible; it's using kubernetes-client, which seems to have the values hardcoded: https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/Config.java#L110
Comment 4 Ruben Romero Montes 2017-06-28 16:31:38 EDT
I have tested with 6.11.0 [1] and I don't see this problem. I also tested tag 0.1.0 using Node.js 4.7.2 [2], but it didn't work.

[1] https://github.com/ruromero/openshift-auth-proxy/tree/update-pkg
[2] https://github.com/fabric8io/openshift-auth-proxy/tree/0.1.0
Comment 7 Jordan Liggitt 2017-07-12 11:42:53 EDT
What inputs were provided to the ansible configuration?

1. Did they override the signing CA?
2. Did they provide a custom CA bundle?
3. Were certificates for internal hostnames regenerated as well?
Comment 9 Ruben Romero Montes 2017-09-11 05:54:33 EDT
(In reply to Jordan Liggitt from comment #7)
> What inputs were provided to the ansible configuration?
> 
> 1. Did they override the signing CA?
> 2. Did they provide a custom CA bundle?
> 3. Were certificates for internal hostnames regenerated as well?

They redeployed certificates using a custom CA.

https://docs.openshift.com/container-platform/3.5/install_config/redeploying_certificates.html#redeploying-new-custom-ca

I honestly don't know the exact reason, but it has been confirmed to be solved after upgrading to Node.js 4.8.1:

https://github.com/fabric8io/openshift-auth-proxy/pull/21
Comment 10 Simo Sorce 2017-09-11 09:59:19 EDT
Not auth related, moving to Kibana (Logging)
Comment 11 Ruben Romero Montes 2017-09-12 10:20:10 EDT
Setting the following env variable on kibana-proxy is enough to avoid the login loop:

NODE_TLS_REJECT_UNAUTHORIZED: 0

If I have read the documentation correctly, this value will ignore self-signed certificates.
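
A minimal sketch of how that workaround might be applied, assuming the logging-kibana DeploymentConfig and kibana-proxy container names (debugging only; see the warning in comment 12):

# Disables TLS certificate validation entirely for the proxy; debugging/development only.
oc set env dc/logging-kibana -c kibana-proxy NODE_TLS_REJECT_UNAUTHORIZED=0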

What I don't understand is why it does not work with the default value of OAP_MASTER_CA_FILE, which is /var/run/secrets/kubernetes.io/serviceaccount/ca.crt:
  https://github.com/fabric8io/openshift-auth-proxy/blob/master/lib/config.js#L77

Thank you in advance for your comments.
Comment 12 Simo Sorce 2017-09-12 14:56:46 EDT
> NODE_TLS_REJECT_UNAUTHORIZED: 0

> If I have read the documentation correctly, this value will ignore self-signed
> certificates.

No! This value will ignore certificate validity entirely, accepting anything.
You should *never* use this option except for debugging/development purposes.
Comment 16 Simo Sorce 2017-09-13 14:07:40 EDT
We clearly need a CI job that deploys and uses an "external" CA, so we can validate these code paths.
