Red Hat Bugzilla – Bug 1466233
catalog apiserver fail to be started due to a non-existing etcd ca file path when cluster is using embedded etcd.
Last modified: 2017-08-16 15:51 EDT
Description of problem:
See the following details.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.126.3-1.git.0.178cea4.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Add the following lines to the inventory host file to deploy the catalog service:

openshift_hosted_etcd_storage_kind=nfs
openshift_hosted_etcd_storage_nfs_options="*(rw,root_squash,sync,no_wdelay)"
openshift_hosted_etcd_storage_nfs_directory=/exports
openshift_hosted_etcd_storage_volume_name=etcd
openshift_hosted_etcd_storage_access_modes=["ReadWriteOnce"]
openshift_hosted_etcd_storage_volume_size=10G
openshift_hosted_etcd_storage_labels={'storage': 'etcd'}
openshift_enable_service_catalog=true
openshift_service_catalog_image_prefix=docker.io/openshift/origin-
openshift_service_catalog_image_version=latest
ansible_service_broker_image_prefix=ansibleplaybookbundle/
ansible_service_broker_image_tag=latest
ansible_service_broker_etcd_image_prefix=quay.io/coreos/
ansible_service_broker_etcd_image_tag=latest

2. Trigger installation.
3.

Actual results:
The installation failed at the following task:

TASK [openshift_service_catalog : wait for api server to be ready] *************
Thursday 29 June 2017  07:16:38 +0000 (0:00:01.511)       0:32:12.027 *********
FAILED - RETRYING: TASK: openshift_service_catalog : wait for api server to be ready (120 retries left).
<--snip-->
FAILED - RETRYING: TASK: openshift_service_catalog : wait for api server to be ready (1 retries left).
fatal: [openshift-143.lab.sjc.redhat.com]: FAILED! => {
    "attempts": 120,
    "changed": false,
    "cmd": [
        "curl",
        "-k",
        "https://apiserver.kube-service-catalog.svc/healthz"
    ],
    "delta": "0:00:01.264269",
    "end": "2017-06-29 03:22:04.963677",
    "failed": true,
    "rc": 7,
    "start": "2017-06-29 03:22:03.699408",
    "warnings": []
}

STDERR:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0curl: (7) Failed connect to apiserver.kube-service-catalog.svc:443; Connection refused

After logging into the cluster and checking the apiserver logs, I found the following:

# oc get po
NAME                       READY     STATUS             RESTARTS   AGE
apiserver-7xgxn            0/1       CrashLoopBackOff   6          11m
controller-manager-3ljw3   0/1       CrashLoopBackOff   15         1h

# oc logs apiserver-7xgxn
<--snip-->
F0629 08:11:41.686596 1 storage_decorator.go:61] Unable to create storage backend: config (&{ /k8s.io/service-catalog [https://192.168.2.41:4001 https://192.168.2.41:4001] /etc/origin/master/master.etcd-client.key /etc/origin/master/master.etcd-client.crt /etc/origin/master/master.etcd-ca.crt false 0 {0xc4204498c0 0xc420449950} 0xc420377a00}), err (open /etc/origin/master/master.etcd-ca.crt: no such file or directory)
<--snip-->

On the host there is no /etc/origin/master/master.etcd-ca.crt; only /etc/origin/master/ca-bundle.crt is available for embedded etcd.

After updating the daemonset to use /etc/origin/master/ca-bundle.crt, the apiserver pod runs.

Expected results:
The api server pod is running well.

Additional info:
We should probably check whether /etc/origin/master/master.etcd-ca.crt exists and, if it does not, fall back to /etc/origin/master/ca-bundle.crt.
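
The check suggested in this comment can be sketched roughly as follows. This is only an illustration of the fallback logic, not the installer's actual code: the function name pick_etcd_ca and the master_dir parameter are hypothetical, and only the two file names come from this report.

```python
import os

# File names taken from the report. The function name and the fallback order
# are illustrative assumptions, not the installer's actual implementation.
DEDICATED_ETCD_CA = "master.etcd-ca.crt"   # reported missing with embedded etcd
EMBEDDED_ETCD_CA = "ca-bundle.crt"         # reported present with embedded etcd


def pick_etcd_ca(master_dir="/etc/origin/master"):
    """Return the CA file path to hand to the catalog apiserver."""
    dedicated = os.path.join(master_dir, DEDICATED_ETCD_CA)
    if os.path.isfile(dedicated):
        return dedicated

    # Fall back to the bundle that embedded etcd uses.
    fallback = os.path.join(master_dir, EMBEDDED_ETCD_CA)
    if os.path.isfile(fallback):
        return fallback

    raise FileNotFoundError(
        "neither %s nor %s found in %s"
        % (DEDICATED_ETCD_CA, EMBEDDED_ETCD_CA, master_dir)
    )
```

Raising when neither file exists makes the deployment fail early with a clear message, instead of leaving the pod in CrashLoopBackOff.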
Verified this bug with openshift-ansible-3.6.132-1.git.0.0d0f54a.el7.noarch, and PASS.

# oc edit ds apiserver -n kube-service-catalog
<--snip-->
      - args:
        - --storage-type
        - etcd
        - --secure-port
        - "6443"
        - --etcd-servers
        - https://openshift-136.lab.sjc.redhat.com:4001
        - --etcd-cafile
        - /etc/origin/master/ca-bundle.crt
        - --etcd-certfile
        - /etc/origin/master/master.etcd-client.crt
        - --etcd-keyfile
        - /etc/origin/master/master.etcd-client.key
<--snip-->

# cat /etc/origin/master/master-config.yaml
<--snip-->
etcdClientInfo:
  ca: ca-bundle.crt
  certFile: master.etcd-client.crt
  keyFile: master.etcd-client.key
  urls:
  - https://openshift-136.lab.sjc.redhat.com:4001
etcdConfig:
  address: openshift-136.lab.sjc.redhat.com:4001
  peerAddress: openshift-136.lab.sjc.redhat.com:7001
  peerServingInfo:
    bindAddress: 0.0.0.0:7001
    certFile: etcd.server.crt
    clientCA: ca-bundle.crt
    keyFile: etcd.server.key
  servingInfo:
    bindAddress: 0.0.0.0:4001
    certFile: etcd.server.crt
    clientCA: ca-bundle.crt
    keyFile: etcd.server.key
  storageDirectory: /var/lib/origin/openshift.local.etcd
<--snip-->
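
For anyone repeating this verification, the manual check (confirming that each etcd TLS path in the daemonset args exists on the host) could be scripted along these lines. This is a sketch under assumptions: the helper names flag_values and missing_etcd_files are hypothetical, and the args list has to be copied from the daemonset by hand or by whatever tooling is available.

```python
import os


def flag_values(args):
    """Map each '--flag' in a container args list to the value that follows it."""
    values = {}
    for i, arg in enumerate(args[:-1]):
        nxt = args[i + 1]
        if arg.startswith("--") and not nxt.startswith("--"):
            values[arg] = nxt
    return values


def missing_etcd_files(args, exists=os.path.isfile):
    """Return the etcd TLS flags whose file paths do not exist on this host.

    'exists' is injectable so the check can be exercised without real files.
    """
    wanted = ("--etcd-cafile", "--etcd-certfile", "--etcd-keyfile")
    values = flag_values(args)
    return {
        flag: values[flag]
        for flag in wanted
        if flag in values and not exists(values[flag])
    }


if __name__ == "__main__":
    # Args as shown in the verified daemonset above.
    args = [
        "--storage-type", "etcd",
        "--secure-port", "6443",
        "--etcd-servers", "https://openshift-136.lab.sjc.redhat.com:4001",
        "--etcd-cafile", "/etc/origin/master/ca-bundle.crt",
        "--etcd-certfile", "/etc/origin/master/master.etcd-client.crt",
        "--etcd-keyfile", "/etc/origin/master/master.etcd-client.key",
    ]
    # An empty dict means every referenced file is present on this host.
    print(missing_etcd_files(args))
```

Run on the master host, an empty result corresponds to the fixed state; on the originally reported setup the --etcd-cafile entry would be expected to show up as missing.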
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716