Description of problem:
Hi, when creating a custom router with the command:

    oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true --replicas=3 --labels='router=internal' -n default

the resulting dc contains the following entry:

    volumeMounts:
    - mountPath: /etc/pki/tls/private
      name: server-certificate
      readOnly: true

Unfortunately, the volume is not mounted and the deployment fails with:

    Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "int-router-1-9pr9l"/"default". list of unattached/unmounted volumes=[server-certificate]

In the normal router, the volume is not there.

Version-Release number of selected component (if applicable):
OpenShift Container Platform 3.3.1

How reproducible:
Start a new router deployment in default and check the dc (compare it to the normal router):

    oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true --replicas=3 --labels='router=internal' -n default

Actual results:
The deployment fails.

Expected results:
An explanation of what exactly the volume is for and whether it is needed. The workaround is to remove the volume entry from the dc; the deployment then works.

Additional info:
The mount should be there. In 3.3 we changed the router so that it gets a unique default certificate (if one is not provided) from a service.

Please look at the service definition associated with that router and make sure that it has annotations like:

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.alpha.openshift.io/serving-cert-secret-name: router-certs
        service.alpha.openshift.io/serving-cert-signed-by: openshift-service-serving-signer@1478546568

Then do 'oc get secrets' and check that one with the name referenced by the service.alpha.openshift.io/serving-cert-secret-name annotation exists. In my case it is named 'router-certs' and is there.

If it was not created, please check that the master-config.yaml has:

    controllerConfig:
      serviceServingCert:
        signer:
          certFile: service-signer.crt
          keyFile: service-signer.key

If it doesn't, then you can add that and restart the master.

How did you perform the upgrade?
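The two checks described above (read the annotation, then look for the secret it names) could be scripted roughly as follows. This is only a sketch: the function name `check_serving_cert_secret` is made up for illustration, and it assumes your `oc` client accepts backslash-escaped dots inside a `jsonpath` expression.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): report whether the secret named by a
# service's serving-cert-secret-name annotation exists.
# Usage: check_serving_cert_secret <service> <namespace>
check_serving_cert_secret() {
    svc=$1
    ns=$2
    # Assumption: dots in the annotation key can be escaped with a backslash
    # in the jsonpath expression.
    secret=$(oc get service "$svc" -n "$ns" \
        -o 'jsonpath={.metadata.annotations.service\.alpha\.openshift\.io/serving-cert-secret-name}')
    if [ -z "$secret" ]; then
        echo "no serving-cert-secret-name annotation on service $svc"
        return 1
    fi
    # The secret must live in the same namespace as the service.
    if oc get secret "$secret" -n "$ns" >/dev/null 2>&1; then
        echo "secret $secret exists"
    else
        echo "secret $secret is missing"
        return 2
    fi
}
```

For the router in this report it would be called as `check_serving_cert_secret int-router default`. A "missing" result points at the master configuration check described above.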
Hello, this is the comment from the customer:

1. We made an automated in-place upgrade as documented here: https://docs.openshift.com/container-platform/3.3/install_config/upgrading/automated_upgrades.html

2. The following is present in master-config.yaml:

    controllerConfig:
      serviceServingCert:
        signer:
          certFile: service-signer.crt
          keyFile: service-signer.key

3. The service looks as follows:

    {
      "kind": "Service",
      "apiVersion": "v1",
      "metadata": {
        "name": "tools-prod-internal",
        "namespace": "openpaas-router-ingress-internal",
        ...
        "labels": {
          "router": "tools-prod-internal"
        },
        "annotations": {
          "service.alpha.openshift.io/serving-cert-secret-name": "tools-prod-internal-certs"
        }
      },
      "spec": {
        "ports": [

-> The second annotation service.alpha.openshift.io/serving-cert-signed-by: openshift-service-serving-signer@1478546568 is missing

4. -> A secret with the name tools-prod-internal-certs is missing

What is this new feature used for? Is there any documentation?
vwalek: The router must always have a default cert. When one is not provided by the user in "oadm router --default-cert=...", one is automatically provided through an annotation on the router's service.
Vladislav: Do the service-signer.crt and service-signer.key files exist in the same directory as master-config.yaml? Is there anything in the openshift logs about those files?
Vladislav: Since those files are there, can you look at the log messages from the master to see if there's anything funny:

    journalctl -lu atomic-openshift-master | grep -i sign
(In reply to Ben Bennett from comment #7)
> Vladislav: Since those files are there, can you look at the log messages
> from the master to see if there's anything funny:
> journalctl -lu atomic-openshift-master | grep -i sign

Hello Ben, the customer checked; there is nothing in the logs.
This seems to be a problem with the secret generator... passing it off to the Kubernetes team.
I also hit this issue when trying out router sharding. A named router pod is stuck in the ContainerCreating state because the cert secret is missing.
*** Bug 1410757 has been marked as a duplicate of this bug. ***
Just FYI, we just fixed a bug where, if you deleted the secret that was automatically created when you annotated the svc manually, it would not be generated again. I can confirm that the secret is automatically created after annotation.
Hello Maciej, here is the reply from the customer:

----------
Hello Vladislav

I just tried to create a router again in our 3.4.1.7 maint environment, then checked the logs on the 3 masters:

    journalctl -ru atomic-openshift-master-api | grep 'service serving cert controller failed'
    journalctl -ru atomic-openshift-master-controllers | grep 'service serving cert controller failed'

-> no result
----------

So if the controller is not failing, where else could the error occur? Thank you
Hello, as the customer reported, the issue is within the upgrade from 3.2 to 3.3: https://github.com/openshift/openshift-ansible/search?utf8=%E2%9C%93&q=servicesServingCert&type= Thx
Based on the information from the customer I'm moving this to on-qa. The error was a misspelled option in master-config.yaml: servicesServingCert instead of serviceServingCert.
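A quick way to tell which spelling a given master has is to grep the config for both keys. The sketch below is an illustration only: the function name `check_signer_key` is hypothetical, and the default path is the one used in the reproduction steps in this bug.

```shell
#!/bin/sh
# Hypothetical helper: classify the serving-cert signer key in master-config.yaml.
# Prints one of: ok, misspelled, absent.
check_signer_key() {
    # Default path assumed from the reproduction steps in this bug.
    config=${1:-/etc/origin/master/master-config.yaml}
    if grep -Eq '^[[:space:]]*serviceServingCert:' "$config"; then
        # Correct spelling: the signer configuration is under the expected key.
        echo ok
    elif grep -Eq '^[[:space:]]*servicesServingCert:' "$config"; then
        # Extra "s": the misspelling reported in this bug.
        echo misspelled
    else
        echo absent
    fi
}
```

Running `check_signer_key` on each master should print `ok`. If it prints `misspelled`, correcting the key name and restarting the master is presumably enough, as with the missing-stanza case mentioned earlier in this thread.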
@scott, this was pulled in by the certificate changes. I think we should update our upgrade playbooks.
Proposed: https://github.com/openshift/openshift-ansible/pull/4201
Reproduced successfully.

Version:
atomic-openshift-utils-3.3.68-1.git.0.3792453.el7.noarch
openshift v3.3.1.17

Steps:
1. Upgrade OCP 3.2 to OCP 3.3.

2. After the upgrade succeeds, create a new router:

    # oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true

3. The int-router deployment fails waiting for volumes to attach/mount for pod "int-router-1-75l11"/"default". list of unattached/unmounted volumes=[server-certificate]

4. Check that no secret was created for the new router's service:

    # oc get svc/int-router -o json |grep -A 5 annotation
    "annotations": {
        "service.alpha.openshift.io/serving-cert-secret-name": "int-router-certs"
    }
    # oc get secrets|grep int-router-certs
    #

5. Check master-config.yaml:

    # cat /etc/origin/master/master-config.yaml | grep -A 5 "controllerConfig"
    controllerConfig:
      servicesServingCert:
        signer:
          certFile: service-signer.crt
          keyFile: service-signer.key

The bug can be reproduced.
Version:
atomic-openshift-utils-3.3.84-1.git.0.4104d2d.el7.noarch
openshift v3.3.1.34

Steps:
1. Upgrade OCP 3.2 to OCP 3.3.

2. After the upgrade succeeds, create a new router:

    # oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true

3. Check that the new router was created successfully:

    # oc get po|grep int-router
    int-router-1-tb227   1/1   Running   0   11m

4. Verify that the secret was created correctly:

    # oc get svc/int-router -o json |grep -A 5 annotation
    "annotations": {
        "service.alpha.openshift.io/serving-cert-secret-name": "int-router-certs",
        "service.alpha.openshift.io/serving-cert-signed-by": "/tmp/openshift-ansible-O8iNQTY/openshift-service-serving-signer"
    }
    # oc get secrets|grep int-router-certs
    int-router-certs   kubernetes.io/tls   2   13m

5. Verify that master-config.yaml is configured correctly:

    # cat /etc/origin/master/master-config.yaml | grep -A 5 "controllerConfig"
    controllerConfig:
      serviceServingCert:
        signer:
          certFile: service-signer.crt
          keyFile: service-signer.key

Changing bug status to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1429