Bug 1880784 - [3.11] - If cafile is not defined in named certificates - components like web console, prometheus will not trust the masterPublicURL
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 3.11.z
Assignee: Russell Teague
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-20 00:04 UTC by Vladislav Walek
Modified: 2020-12-16 12:35 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-16 12:35:06 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift openshift-ansible pull 12273 0 None closed Bug 1880784: playbooks/init: Validate openshift_master_named_certificates 2021-02-14 13:24:39 UTC
Github openshift openshift-docs pull 27684 0 None open Bug 1880784 - adding note about cafile 2021-02-14 13:24:39 UTC
Red Hat Product Errata RHSA-2020:5363 0 None None None 2020-12-16 12:35:50 UTC

Description Vladislav Walek 2020-09-20 00:04:59 UTC
Description of problem:

There is no playbook that redeploys only the named certificates. To deploy NEW named certificates, openshift-master/redeploy-certificates.yml must be run.


Code references:
https://github.com/openshift/openshift-ansible/blob/release-3.11/playbooks/openshift-master/redeploy-certificates.yml#L4
https://github.com/openshift/openshift-ansible/blob/release-3.11/playbooks/openshift-master/private/redeploy-certificates.yml#L4
https://github.com/openshift/openshift-ansible/blob/release-3.11/playbooks/openshift-master/private/certificates.yml#L6

The tasks that redeploy the named certificates are here:
https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_named_certificates/tasks/main.yml

The 'cafile' from the inventory is added to ca-bundle.crt by the task below:
https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_master_certificates/tasks/main.yml#L53-L73

~~~
master_named_certificates: "{{ openshift.master.named_certificates | default([]) | lib_utils_oo_collect('cafile') }}"
...
- name: Create the master server certificate
  command: >
    {{ hostvars[openshift_ca_host]['first_master_client_binary'] }} adm ca create-server-cert
    {% for named_ca_certificate in master_named_certificates %}
    --certificate-authority {{ named_ca_certificate }}
    {% endfor %}
    {% for legacy_ca_certificate in master_legacy_ca_files %}
    --certificate-authority {{ legacy_ca_certificate }}
    {% endfor %}
    --hostnames={{ hostvars[item].openshift.common.all_hostnames | join(',') }}
    --cert={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.crt
    --key={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.key
    --expire-days={{ openshift_master_cert_expire_days }}
    --signer-cert={{ openshift_ca_cert }}
    --signer-key={{ openshift_ca_key }}
    --signer-serial={{ openshift_ca_serial }}
    --overwrite=false
  when: item != openshift_ca_host
  with_items: "{{ masters_missing_certs }}"
  delegate_to: "{{ openshift_ca_host }}"
  run_once: true
~~~

The 'cafile' for the named certificates is then automatically added to ca-bundle.crt, but only if 'cafile' is defined in the inventory.

If 'cafile' is not defined, none of the components will trust the masterPublicURL, so the Prometheus proxy, the web console and others will not work.
This affects basically any component that uses the OAuth URL.
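
To illustrate why an entry without 'cafile' contributes nothing to the list of CA files, here is a minimal sketch. It is not the openshift-ansible code: it uses the built-in Jinja2 filters selectattr and map as a stand-in for lib_utils_oo_collect (assumed here to gather only the 'cafile' values that are present), and the playbook, the example_named_certificates variable and its paths are hypothetical.

```yaml
# Sketch only: standalone playbook that runs against localhost.
- hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical inventory value: one entry that does not define 'cafile'.
    example_named_certificates:
      - certfile: /files/to/custom_hostname.pem
        keyfile: /files/to/custom_hostname.key.pem
    # Stand-in for lib_utils_oo_collect('cafile'):
    # keep only the entries that define 'cafile', then take that value.
    collected_cafiles: >-
      {{ example_named_certificates
         | selectattr('cafile', 'defined')
         | map(attribute='cafile')
         | list }}
  tasks:
    - name: Show the CA files that would be passed on
      debug:
        var: collected_cafiles
```

For this input the collected list is empty, so a loop over it would emit no CA at all, and nothing reports that anything is missing.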


Version-Release number of the following components:
openshift-ansible - release-3.11


How reproducible:
- Do not define 'cafile' in the inventory and rerun the openshift-master/redeploy-certificates.yml playbook.


Steps to Reproduce:
1.
2.
3.

Actual results:
- when the playbook finishes, the issue is not visible at first sight, because the playbook succeeds and the services are up and running


Expected results:
- the playbook should warn or fail if 'cafile' is not provided by the administrator
- change the documentation to mark 'cafile' as a required field


Additional info:

Comment 2 Vladislav Walek 2020-09-21 02:16:12 UTC
To correct the description:

the CA is added to the bundle with:

https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_master_certificates/tasks/main.yml#L75-L97

Comment 5 Abhinav Dahiya 2020-10-02 16:36:35 UTC
There is no customer case attached to this bug. Since 3.11 is in the maintenance phase, high priority is not accurate. Lowering.

Comment 11 Russell Teague 2020-11-05 13:43:01 UTC
If this is going to fail an install, it may be prudent to do this validation much sooner in the process to prevent late failures.  Also, this needs to be clearly documented if installs that would normally succeed will now fail.

Comment 12 Russell Teague 2020-11-05 14:51:04 UTC
This is not a 3.11 release blocker.

Comment 13 Vladislav Walek 2020-11-05 22:40:08 UTC
Hey Russell,

>> If this is going to fail an install, it may be prudent to do this validation much sooner in the process to prevent late failures.  
>> Also, this needs to be clearly documented if installs that would normally succeed will now fail.

Yeah, I agree that it would prevent the installation. My idea was only to inform the admin that, if this is missed, the cluster will have problems after installation.

Maybe a debug message without a fail would be a better option; however, the message is then shown only as 'ok' and not as a warning.

Thinking about it, failing is not a good option, and I will rework the PR.
It may also need better documentation.

Comment 14 Vladislav Walek 2020-11-07 01:37:43 UTC
Hey Russell,

I will rework that and create a draft in a different branch of my fork. I have a better idea of how to do it.
I will let you know when it is ready.

Comment 24 Gaoyun Pei 2020-11-26 07:37:56 UTC
Verified this bug with openshift-ansible-3.11.322-1.git.0.ef8d7eb.el7.noarch.rpm.

1. When no "cafile" parameter is set in openshift_master_named_certificates.

openshift_master_named_certificates=[{"certfile": "/files/to/custom_hostname.pem", "keyfile": "/files/to/custom_hostname.key.pem"}]


A fresh install fails at the pre-check step.

TASK [Fail if the cafile is not configured when using openshift_master_named_certificates] ***
Thursday 26 November 2020  11:39:25 +0800 (0:00:00.900)       0:01:13.786 ***** 
fatal: [ci-vm-10-0-151-6.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "msg": "The cafile is not configured in openshift_named_certificates. The cafile must be configured for the cluster's components to trust the named certificate signer. Set 'openshift_named_certificate_omit_cafile=true' to skip this error.\n"}


Setting "openshift_named_certificate_omit_cafile=true" in the Ansible inventory file lets the installation bypass this check.
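
A check with the behaviour described above could look roughly like the following. This is a minimal sketch and not the task from the openshift-ansible pull request: the variable names come from this report, while the sample values, the task name and the message wording are assumptions for illustration.

```yaml
# Sketch only: standalone playbook, not the code merged in openshift-ansible.
- hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical values standing in for the inventory variables.
    openshift_master_named_certificates:
      - certfile: /files/to/custom_hostname.pem
        keyfile: /files/to/custom_hostname.key.pem
    openshift_named_certificate_omit_cafile: false
  tasks:
    - name: Fail if any named certificate entry lacks cafile
      fail:
        msg: >-
          cafile is missing from at least one entry in
          openshift_master_named_certificates. Set
          openshift_named_certificate_omit_cafile=true to skip this check.
      when:
        # Skip the check when the administrator opted out explicitly.
        - not (openshift_named_certificate_omit_cafile | bool)
        # Trigger when at least one entry has no 'cafile' key.
        - openshift_master_named_certificates | selectattr('cafile', 'undefined') | list | length > 0
```

Running the check before any certificate work starts keeps the failure early, which is the concern raised in Comment 11.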


2. With the "cafile" parameter set in openshift_master_named_certificates.

openshift_master_named_certificates=[{"certfile": "/files/to/custom_hostname.pem", "keyfile": "/files/to/custom_hostname.key.pem", "cafile": "/files/to/custom_hostname_ca.pem"}]

After installation, checking the /etc/origin/master/ca-bundle.crt file shows that "custom_hostname_ca.pem" was added to it.

Select one Prometheus pod:
# oc -n openshift-monitoring rsh prometheus-k8s-0
sh-4.2$ cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
....
The full content of the ca-bundle.crt file was present in the serviceaccount ca.crt.

Moving this bug to verified.
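
The bundle check above can also be scripted. The sketch below assumes a 'masters' inventory group, openssl installed on the master, and a hypothetical named_certfile variable pointing at wherever the named certificate is readable on that host; none of these come from the fix itself.

```yaml
# Sketch only: assumes a 'masters' inventory group and openssl on the master.
- hosts: masters
  gather_facts: false
  vars:
    ca_bundle: /etc/origin/master/ca-bundle.crt
    # Hypothetical: path of the named certificate as readable on the master.
    named_certfile: /files/to/custom_hostname.pem
  tasks:
    - name: Verify the named certificate against the CA bundle
      # openssl exits non-zero when verification fails, which fails the task.
      command: openssl verify -CAfile {{ ca_bundle }} {{ named_certfile }}
      register: verify_result
      changed_when: false

    - name: Show the openssl result
      debug:
        var: verify_result.stdout
```

Note that openssl may also consult the system trust store, and intermediate certificates may need to be supplied separately, so a passing result is a hint rather than proof that the custom CA is in the bundle.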

Comment 28 errata-xmlrpc 2020-12-16 12:35:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 3.11.343 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5363

