| Summary: | [3.3] Deploy of custom router fails on "list of unattached/unmounted volumes=[server-certificate]" | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Vladislav Walek <vwalek> |
| Component: | Cluster Version Operator | Assignee: | Andrew Butcher <abutcher> |
| Status: | CLOSED ERRATA | QA Contact: | liujia <jiajliu> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.3.1 | CC: | abutcher, aloughla, aos-bugs, chuyu, decarr, erich, jiajliu, jkaur, jokerman, maszulik, mfojtik, michael.voegele, mmccomas, pweil, sdodson, tkimura, vwalek |
| Target Milestone: | --- | ||
| Target Release: | 3.3.1 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | It was a typo in the master-config.yaml. No doc update needed. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-06-12 15:40:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
The mount should be there. In 3.3 we changed the router so that it gets a unique default certificate (if one is not provided) from a service.
Please look at the service definition associated with that router and make sure that it has annotations like:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.openshift.io/serving-cert-secret-name: router-certs
    service.alpha.openshift.io/serving-cert-signed-by: openshift-service-serving-signer@1478546568
Then do 'oc get secrets' and check that one with the name referenced by the service.alpha.openshift.io/serving-cert-secret-name annotation exists. In my case it is named 'router-certs' and is there.
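The check described above can be sketched as a small shell helper. This is only an illustration: the function name `serving_cert_secret_name` is hypothetical, and it assumes the service definition is piped in as JSON from `oc get svc/<name> -o json`.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): print the secret name requested by
# the serving-cert-secret-name annotation of a service.
# Reads the service JSON on stdin; prints nothing if the annotation is absent.
serving_cert_secret_name() {
    sed -n 's|.*"service\.alpha\.openshift\.io/serving-cert-secret-name": *"\([^"]*\)".*|\1|p' |
        head -n 1
}

# Usage against a live cluster (shown as comments, not executed here):
#   name=$(oc get svc/router -o json | serving_cert_secret_name)
#   oc get secret "$name" || echo "secret $name is missing"
```

If the helper prints a name but `oc get secret` cannot find it, the secret was never generated and the master configuration is the next thing to look at.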
If it was not created, please check that the master-config.yaml has:
controllerConfig:
  serviceServingCert:
    signer:
      certFile: service-signer.crt
      keyFile: service-signer.key
If it doesn't, then you can add that and restart the master. How did you perform the upgrade?
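A quick way to check the key name in master-config.yaml, including the misspelled variant (servicesServingCert) that this bug was eventually traced to. This is a minimal sketch: the function name `check_signer_key` is hypothetical, and it only looks at the key name, not at the rest of the YAML structure.

```shell
#!/bin/sh
# Hypothetical helper: report how the serving-cert signer key is spelled
# in the given master-config.yaml.
# Prints "ok", "misspelled", or "missing".
check_signer_key() {
    config=$1
    if grep -Eq '^[[:space:]]*serviceServingCert:' "$config"; then
        echo ok
    elif grep -Eq '^[[:space:]]*servicesServingCert:' "$config"; then
        echo misspelled
    else
        echo missing
    fi
}

# Usage (path as used elsewhere in this report):
#   check_signer_key /etc/origin/master/master-config.yaml
```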
Hello, this is a comment from the customer:
1. We made an automated in-place upgrade as documented here: https://docs.openshift.com/container-platform/3.3/install_config/upgrading/automated_upgrades.html
2. The following is present in master-config.yaml:
controllerConfig:
  serviceServingCert:
    signer:
      certFile: service-signer.crt
      keyFile: service-signer.key
3. The service looks as follows:
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "tools-prod-internal",
    "namespace": "openpaas-router-ingress-internal",
    ...
    "labels": {
      "router": "tools-prod-internal"
    },
    "annotations": {
      "service.alpha.openshift.io/serving-cert-secret-name": "tools-prod-internal-certs"
    }
  },
  "spec": {
    "ports": [
-> The second annotation, service.alpha.openshift.io/serving-cert-signed-by: openshift-service-serving-signer@1478546568, is missing.
4. -> A secret with the name tools-prod-internal-certs is missing.
What is this new feature used for? Is there any documentation?

vwalek: The router must always have a default cert. When one is not provided by the user in "oadm router --default-cert=...", one is automatically provided through an annotation in the router's service.

Vladislav: Do the service-signer.crt and service-signer.key files exist in the same directory as master-config.yaml? Is there anything in the openshift logs about those files?

Vladislav: Since those files are there, can you look at the log messages from the master to see if there's anything funny:
journalctl -lu atomic-openshift-master | grep -i sign

(In reply to Ben Bennett from comment #7)
> Vladislav: Since those files are there, can you look at the log messages
> from the master to see if there's anything funny:
> journalctl -lu atomic-openshift-master | grep -i sign
Hello Ben, the customer checked; there is nothing in the logs.

This seems to be a problem with the secret generator... passing it off to the Kubernetes team.

I also got this issue when trying out router sharding. A named router pod was stuck in the ContainerCreating state because of the missing cert secret.
*** Bug 1410757 has been marked as a duplicate of this bug. ***

Just FYI, we just fixed a bug where if you delete the secret that was automatically created when you annotate the svc manually, it won't be generated again.

I can confirm that the secret is automatically created after annotation.

Hello Maciej, here is the reply from the customer:
----------
Hello Vladislav
I just tried to create a router again in our 3.4.1.7 maint environment, then checked the logs on the 3 masters:
journalctl -ru atomic-openshift-master-api | grep 'service serving cert controller failed'
journalctl -ru atomic-openshift-master-controllers | grep 'service serving cert controller failed'
-> no result
----------
So if the controller is not failing, where else could the error occur? Thank you

Hello, as the customer reported, the issue is within the upgrade from 3.2 to 3.3:
https://github.com/openshift/openshift-ansible/search?utf8=%E2%9C%93&q=servicesServingCert&type=
Thx

Based on the information from the customer I'm moving this to on-qa. The error was a misspelled option in master-config.yaml: servicesServingCert vs serviceServingCert.

@scott, This be pull in by certificated changes. I think we should update our upgrade playbooks.

Reproduced successfully.
Version:
atomic-openshift-utils-3.3.68-1.git.0.3792453.el7.noarch
openshift v3.3.1.17
Steps:
1. Upgrade ocp3.2 to ocp3.3
2. After the upgrade succeeds, create a new router:
# oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true
3. Deployment of the int-router router fails while waiting for volumes to attach/mount for pod "int-router-1-75l11"/"default". list of unattached/unmounted volumes=[server-certificate]
4. Check that no secret was created for the new router's service:
# oc get svc/int-router -o json |grep -A 5 annotation
"annotations": {
    "service.alpha.openshift.io/serving-cert-secret-name": "int-router-certs"
}
# oc get secrets|grep int-router-certs
#
5. Check master-config.yaml:
# cat /etc/origin/master/master-config.yaml | grep -A 5 "controllerConfig"
controllerConfig:
  servicesServingCert:
    signer:
      certFile: service-signer.crt
      keyFile: service-signer.key
The bug can be reproduced.
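The reproduction above shows the misspelled servicesServingCert key left behind by the upgrade. A minimal sketch of correcting it by hand follows. The function name `fix_signer_key` is hypothetical, the `.bak` backup suffix is an arbitrary choice, and this is not the change that was shipped in the upgrade playbooks.

```shell
#!/bin/sh
# Hypothetical helper: rename the misspelled servicesServingCert key to
# serviceServingCert in the given master-config.yaml.
# The original file is kept next to it with a .bak suffix.
fix_signer_key() {
    config=$1
    cp -p "$config" "$config.bak" &&
        sed 's/^\([[:space:]]*\)servicesServingCert:/\1serviceServingCert:/' \
            "$config.bak" > "$config"
}

# Usage (not executed here). The master has to be restarted afterwards;
# the unit names below are the ones that appear in the journalctl commands
# in this report, so adjust them to the actual installation:
#   fix_signer_key /etc/origin/master/master-config.yaml
#   systemctl restart atomic-openshift-master
#   # or, with separate units: atomic-openshift-master-api and
#   # atomic-openshift-master-controllers
```

After the restart, annotated router services should get their secret generated, which can be confirmed with `oc get secrets`.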
Version:
atomic-openshift-utils-3.3.84-1.git.0.4104d2d.el7.noarch
openshift v3.3.1.34
Steps:
1. Upgrade ocp3.2 to ocp3.3
2. After the upgrade succeeds, create a new router:
# oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true
3. Check that the new router was created successfully:
# oc get po|grep int-router
int-router-1-tb227 1/1 Running 0 11m
4. Verify that the secret was created correctly:
# oc get svc/int-router -o json |grep -A 5 annotation
"annotations": {
    "service.alpha.openshift.io/serving-cert-secret-name": "int-router-certs",
    "service.alpha.openshift.io/serving-cert-signed-by": "/tmp/openshift-ansible-O8iNQTY/openshift-service-serving-signer"
}
# oc get secrets|grep int-router-certs
int-router-certs kubernetes.io/tls 2 13m
5. Verify that master-config.yaml is configured correctly:
# cat /etc/origin/master/master-config.yaml | grep -A 5 "controllerConfig"
controllerConfig:
  serviceServingCert:
    signer:
      certFile: service-signer.crt
      keyFile: service-signer.key
Changing bug status to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:1429
Description of problem:
Hi, while running the custom router with the command:
oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true --replicas=3 --labels='router=internal' -n default
the following entry is created in the dc:
volumeMounts:
- mountPath: /etc/pki/tls/private
  name: server-certificate
  readOnly: true
Unfortunately, the volume is not mounted and the deployment fails with:
Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "int-router-1-9pr9l"/"default". list of unattached/unmounted volumes=[server-certificate]
In the normal router, the volume is not there.

Version-Release number of selected component (if applicable):
OpenShift Container Platform 3.3.1

How reproducible:
Start a new deployment of a router in default and check the dc (compare it to the normal router):
oc adm router int-router --stats-port=1940 --ports=12080:12080,12443:12443 --service-account=router --host-network=true --host-ports=true --replicas=3 --labels='router=internal' -n default

Actual results:
The deployment fails.

Expected results:
What exactly is the volume for, and is it needed? The workaround is to remove the volume entry from the dc; the deployment then works.

Additional info: