Description of problem:
See the following details.

Version-Release number of the following components:
openshift-ansible-3.7.0-0.191.0.git.0.bc2ff60.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set router shards like the following:
openshift_hosted_routers=[{"name": "router", "replicas": 2, "serviceaccount": "router", "namespace": "default", "stats_port": 1936, "ports": ["80:80", "443:443"], "images": "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}", "certificate": {"certfile": "{{ lookup(\"env\", \"WORKSPACE\") }}/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1.crt", "keyfile": "{{ lookup(\"env\", \"WORKSPACE\") }}/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1.key", "cafile": "{{ lookup(\"env\", \"WORKSPACE\") }}/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1_rootca.crt"}, "selector": "role=node,router=enabled", "edits": [{"action": "append", "key": "spec.template.spec.containers[0].env", "value": {"name": "ROUTE_LABELS", "value": "route=external"}}]}, {"name": "router1", "replicas": 1, "serviceaccount": "router", "namespace": "default", "stats_port": 1937, "ports": ["7080:7080", "7443:7443"], "images": "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}", "certificate": {}, "selector": "role=node,router=enabled", "edits": [{"action": "update", "curr_value": {"name": "ROUTER_SERVICE_HTTPS_PORT", "value": "443"}, "key": "spec.template.spec.containers[0].env", "value": {"name": "ROUTER_SERVICE_HTTPS_PORT", "value": "7443"}}, {"action": "update", "curr_value": {"name": "ROUTER_SERVICE_HTTP_PORT", "value": "80"}, "key": "spec.template.spec.containers[0].env", "value": {"name": "ROUTER_SERVICE_HTTP_PORT", "value": "7080"}}, {"action": "append", "key": "spec.template.spec.containers[0].env", "value": {"name": "NAMESPACE_LABELS", "value": "n=install-test"}}, {"action": "append", "key": "spec.template.spec.containers[0].env", "value": {"name": "ROUTER_SERVICE_SNI_PORT", "value": "10445"}}, {"action": "append", "key": "spec.template.spec.containers[0].env", "value": {"name": "ROUTER_SERVICE_NO_SNI_PORT", "value": "10442"}}]}]
2. After installation, multiple router shards are created successfully.
3. But the user-customized router certificates are not used when creating the routers.

Actual results:
# openssl s_client -servername www.example.com -connect 52.90.104.51:443 | grep 'subject\|issuer'
depth=1 CN = openshift-signer@1509703691
verify error:num=19:self signed certificate in certificate chain
verify return:0
subject=/CN=*.apps.1103-err.qe.rhcloud.com
issuer=/CN=openshift-signer@1509703691

# curl --cacert custom_router_1_rootca.crt --resolve testjialiu.com:443:52.90.104.51 https://testjialiu.com
curl: (60) Peer's certificate issuer has been marked as not trusted by the user.

Checking the installation log, I found that the following task was skipped:

TASK [openshift_hosted : Get the certificate contents for router] **************
Friday 03 November 2017 10:23:06 +0000 (0:00:00.057) 0:21:31.881 *******
skipping: [ec2-54-89-152-147.compute-1.amazonaws.com] => (item=/home/slave3/workspace/Launch Environment Flexy/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1.key) => {"changed": false, "item": "/home/slave3/workspace/Launch Environment Flexy/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1.key", "skip_reason": "Conditional result was False", "skipped": true}
skipping: [ec2-54-89-152-147.compute-1.amazonaws.com] => (item=/home/slave3/workspace/Launch Environment Flexy/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1.crt) => {"changed": false, "item": "/home/slave3/workspace/Launch Environment Flexy/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1.crt", "skip_reason": "Conditional result was False", "skipped": true}
skipping: [ec2-54-89-152-147.compute-1.amazonaws.com] => (item=/home/slave3/workspace/Launch Environment Flexy/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1_rootca.crt) => {"changed": false, "item": "/home/slave3/workspace/Launch Environment Flexy/private-openshift-misc/v3-launch-templates/functionality-testing/aos-37/extra-ansible/files/custom_router_1_rootca.crt", "skip_reason": "Conditional result was False", "skipped": true}

Expected results:
The customized router certificate files defined in openshift_hosted_routers should be uploaded to the master for router creation.

Additional info:
Looking at the openshift-ansible code:

    - name: Get the certificate contents for router
      copy:
        backup: True
        dest: "/etc/origin/master/{{ item | basename }}"
        src: "{{ item }}"
      with_items: "{{ openshift_hosted_routers | oo_collect(attribute='certificate') | oo_select_keys_from_list(['keyfile', 'certfile', 'cafile']) }}"
      when: ( not openshift_hosted_router_create_certificate | bool ) or openshift_hosted_router_certificate != {}

It seems the when condition is wrong.
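
To see why the task is skipped, the reported when expression can be evaluated by hand. The following is a minimal Python sketch, not installer code. It assumes the defaults are openshift_hosted_router_create_certificate=True and openshift_hosted_router_certificate={}, and the function name copy_task_runs is hypothetical.

```python
def copy_task_runs(create_certificate: bool, router_certificate: dict) -> bool:
    """Mirror of the reported condition:

    ( not openshift_hosted_router_create_certificate | bool )
        or openshift_hosted_router_certificate != {}
    """
    return (not create_certificate) or router_certificate != {}


# Assumed defaults: certificate generation enabled, no global certificate set.
# Certificates listed only inside openshift_hosted_routers never enter the
# expression, so they cannot change its result.
print(copy_task_runs(True, {}))   # False: the copy task is skipped
print(copy_task_runs(False, {}))  # True: the copy task runs
```

Under these assumptions the expression is False regardless of what openshift_hosted_routers contains, which matches the "Conditional result was False" lines in the log.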
Johnny Liu,

After reviewing the code, I do recall that a change went in a while back that requires users to pass in a flag if they do not want the routers to generate default certificates. That PR changed the fundamental behavior: it defaults to creating certificates instead of using the ones passed in. The PR is here:
https://github.com/openshift/openshift-ansible/pull/4693

Do you have this option in your inventory?
openshift_hosted_router_create_certificate: False

Please try the creation again with the above option set in your inventory and let me know if that works.

Thanks
No, I was not asking you to revert your change; I was saying that your PR seems to have introduced a new bug. I think it is reasonable for the default to be "openshift_hosted_router_create_certificate: True", so that the installer generates default certificates *WHEN* no router certificates are passed in.

The main issue in the current openshift-ansible code is that the when condition is incorrect: it does NOT take the router certificates defined in openshift_hosted_routers into consideration.

    - name: Get the certificate contents for router
    <--snip-->
      when: ( not openshift_hosted_router_create_certificate | bool ) or openshift_hosted_router_certificate != {}

    - block:
      - name: generate a default wildcard router certificate
        oc_adm_ca_server_cert:
          signer_cert: "{{ openshift_master_config_dir }}/ca.crt"
          signer_key: "{{ openshift_master_config_dir }}/ca.key"
          signer_serial: "{{ openshift_master_config_dir }}/ca.serial.txt"
          hostnames:
          - "{{ openshift_master_default_subdomain | default('router.default.svc.cluster.local') }}"
          - "*.{{ openshift_master_default_subdomain | default('router.default.svc.cluster.local') }}"
          cert: "{{ ('/etc/origin/master/' ~ (item.certificate.certfile | basename)) if 'certfile' in item.certificate else ((openshift_master_config_dir) ~ '/openshift-router.crt') }}"
          key: "{{ ('/etc/origin/master/' ~ (item.certificate.keyfile | basename)) if 'keyfile' in item.certificate else ((openshift_master_config_dir) ~ '/openshift-router.key') }}"
        with_items: "{{ openshift_hosted_routers }}"

      - name: set the openshift_hosted_router_certificate
        set_fact:
          openshift_hosted_router_certificate:
            certfile: "{{ openshift_master_config_dir ~ '/openshift-router.crt' }}"
            keyfile: "{{ openshift_master_config_dir ~ '/openshift-router.key' }}"
            cafile: "{{ openshift_master_config_dir ~ '/ca.crt' }}"
      when:
      - openshift_hosted_router_create_certificate | bool
      - openshift_hosted_router_certificate == {}

We have to consider the following scenarios:
1. The user does not pass in any router cert; the installer should generate a default cert.
2. The user passes in a router cert via openshift_hosted_router_certificate; the installer should use the passed one.
3. The user passes in a router cert via openshift_hosted_routers; the installer should use the passed one.

Judging from the when conditions, only #1 and #2 are taken into consideration; #3 is not.

Another minor issue is in the "generate a default wildcard router certificate" step: it reads the cert paths passed in openshift_hosted_routers and, when scenario #1 applies, generates certs with those same names. As I understand it, when the installer generates default certs it should use the default openshift-router.{crt,key}, and it never needs to read anything from openshift_hosted_routers.

The above is only my understanding; if I am wrong, please correct me.
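
The three scenarios can be written down as a small decision function. This is only a sketch of the expected behaviour described above, not the installer's code and not the eventual fix. The function name, the returned labels, and the order in which scenarios #3 and #2 are checked are all assumptions made for illustration.

```python
def certificate_source(router_certificate: dict, routers: list) -> str:
    """Return which certificate the installer is expected to use."""
    # Scenario 3: at least one router entry carries a non-empty certificate dict.
    if any(router.get("certificate") for router in routers):
        return "openshift_hosted_routers"
    # Scenario 2: a certificate was passed via openshift_hosted_router_certificate.
    if router_certificate:
        return "openshift_hosted_router_certificate"
    # Scenario 1: nothing was passed in, so a default cert should be generated.
    return "generated default"


# The inventory from the initial report, reduced to the relevant keys.
routers = [
    {"name": "router", "certificate": {"certfile": "custom_router_1.crt"}},
    {"name": "router1", "certificate": {}},
]
print(certificate_source({}, routers))
```

With the reported inventory this returns the scenario #3 label, which is the case the existing when conditions do not cover.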
Johnny Liu,

I came up with a fix that checks openshift_hosted_routers for certificates. I will test these changes tomorrow. If you have a chance to test against this PR, please do:
https://github.com/openshift/openshift-ansible/pull/6040

The "generate a default wildcard router certificate" step uses openshift_hosted_routers because the default router is stored in the default variables in openshift_hosted/defaults/main.yml. For this reason we loop over openshift_hosted_routers: whether or not the user specifies routers, we use the same variable. You are correct that this step should never read any passed-in cert, because the certificates are empty here and it should default to the 'openshift_master_config_dir' + '/openshift-router.key' and cert files. This should be fixed in the PR.

Thanks.
I tested with your PR, and it seems to be working well now. But it may need some further improvement. I came up with two scenarios:
1. openshift_hosted_routers=[{router1 with empty certificates},{router2 with empty certificates}]
2. openshift_hosted_routers=[{router1 with certificates},{router2 with empty certificates}]

Based on your PR, these scenarios would behave as follows:

For #1, the installer would generate a default wildcard router certificate, and in the later router creation step router1 and router2 would both be created with that newly generated default certificate.

For #2 (the same scenario as the initial report), the installer would skip the "generate a default wildcard router certificate" step and upload the passed certificates to the master. In the later router creation step, router1 would be created with the passed certificate and router2 would be created with an empty certificate.

Comparing #1 and #2: router2 has a certificate in #1, but an empty certificate in #2. If possible, in #2 the "generate a default wildcard router certificate" step should also run, so that router1 uses the passed certificate and router2 uses the newly generated default certificate.

This is only a minor issue; the PR is almost perfect. If my proposed improvement risks introducing a new bug, it could be done in a future release.
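
The improvement proposed above amounts to choosing the certificate per router rather than once for all routers. A minimal Python sketch of that idea follows. It is not the PR's code: the function name resolve_certificates is hypothetical, and the default config_dir of /etc/origin/master is an assumption standing in for openshift_master_config_dir.

```python
import posixpath


def resolve_certificates(routers: list, config_dir: str = "/etc/origin/master") -> dict:
    """Map each router name to the certificate dict it should be created with."""
    # Default certificate paths, following the openshift-router.{crt,key} names
    # used by the "generate a default wildcard router certificate" step.
    default = {
        "certfile": posixpath.join(config_dir, "openshift-router.crt"),
        "keyfile": posixpath.join(config_dir, "openshift-router.key"),
        "cafile": posixpath.join(config_dir, "ca.crt"),
    }
    resolved = {}
    for router in routers:
        passed = router.get("certificate") or {}
        # A router with its own certificate keeps it; an empty one falls back
        # to the generated default instead of staying empty.
        resolved[router["name"]] = passed if passed else default
    return resolved


# Scenario 2: router1 passes certificates, router2 has an empty certificate.
scenario_2 = [
    {"name": "router1", "certificate": {"certfile": "custom.crt", "keyfile": "custom.key"}},
    {"name": "router2", "certificate": {}},
]
result = resolve_certificates(scenario_2)
print(result["router1"]["certfile"])
print(result["router2"]["certfile"])
```

In this sketch the default certificate would need to be generated whenever at least one router resolves to the default paths.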
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/bfbafebc2e4268088536bbfed2c62bf5855719d0
[Bug 1509354] Check if routers have certificates and use them

https://github.com/openshift/openshift-ansible/commit/c42ef6a6963cd4337f858bf25187d0b94018e927
Merge pull request #6040 from kwoodson/router_shard_custom_cert_fix

[Bug 1509354] Check if routers have certificates and use them
Verified this bug with openshift-ansible-3.7.4-1.git.0.254e849.el7.noarch, and it PASSES.

[jialiu@dhcp-141-223 ~]$ openssl s_client -servername www.example.com -connect 54.173.x.x:443 | grep 'subject\|issuer'
depth=1 C = XX, L = Default City, O = Default Company Ltd, OU = BJ, CN = jialiu-qe-testing
verify error:num=19:self signed certificate in certificate chain
subject=/C=XX/L=Default City/O=Default Company Ltd/CN=testjialiu.com
issuer=/C=XX/L=Default City/O=Default Company Ltd/OU=BJ/CN=jialiu-qe-testing
^C
[jialiu@dhcp-141-223 ~]$ openssl s_client -servername www.example.com -connect 54.174.y.y:443 | grep 'subject\|issuer'
depth=1 C = XX, L = Default City, O = Default Company Ltd, OU = BJ, CN = jialiu-qe-testing
verify error:num=19:self signed certificate in certificate chain
subject=/C=XX/L=Default City/O=Default Company Ltd/CN=testjialiu.com
issuer=/C=XX/L=Default City/O=Default Company Ltd/OU=BJ/CN=jialiu-qe-testing

I will later clone this bug for the 3.6 backport, and clone another one to track the remaining issue described in comment 6.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188