Bug 1720770 - Instructions for adding default API server default certificate fails to configure serving cert [NEEDINFO]
Keywords:
Status: CLOSED EOL
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
: ---
Assignee: Vikram Goyal
QA Contact: scheng
Vikram Goyal
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-06-14 21:33 UTC by rvanderp
Modified: 2020-05-18 06:56 UTC
CC: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-18 06:56:53 UTC
Target Upstream Version:
ansverma: needinfo? (deads)


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github openshift api pull 372 0 'None' closed Bug 1720770: remove broken default serving cert setting 2020-04-15 18:58:52 UTC
Github openshift cluster-kube-apiserver-operator pull 518 0 'None' closed Bug 1720770: remove broken default serving cert 2020-04-15 18:58:52 UTC

Internal Links: 1719989

Description rvanderp 2019-06-14 21:33:58 UTC
Document URL: https://docs.openshift.com/container-platform/4.1/authentication/certificates/api-server.html#add-default-api-server_api-server-certificates

Section Number and Name: 

Describe the issue: 

The customer followed the instructions for setting the default API server serving certificate. The API server does not present the certificate, and one of the kube-apiserver pods is stuck in CrashLoopBackOff.

Suggestions for improvement: 

Additional information: 

The customer's cluster has 3 masters. While the change is being applied, the first kube-apiserver pod gets stuck in CrashLoopBackOff. The logs of the kube-apiserver container reveal that it is unable to find the serving cert; see the stack trace below. I can reproduce the same problem on my own AWS cluster as well.

	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/cmd/hypershift/main.go:45 +0x2b8
E0614 15:04:19.542054       1 pathrecorder.go:107] registered "/readyz/crd-informer-synced" from goroutine 1 [running]:
runtime/debug.Stack(0x4e8a780, 0xc008619fb0, 0xc002860260)
	/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/debug/stack.go:24 +0xa7
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).trackCallers(0xc001ece690, 0xc002860260, 0x1b)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:109 +0x89
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).Handle(0xc001ece690, 0xc002860260, 0x1b, 0xaa2ed40, 0xc00b8c3880)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/mux/pathrecorder.go:173 +0x86
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/healthz.InstallPathHandler(0xaa27f80, 0xc001ece690, 0x5969c5c, 0x7, 0xc0056ede00, 0x1d, 0x1e)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/healthz/healthz.go:120 +0x3c1
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server.(*GenericAPIServer).installReadyz(0xc00638edc0, 0xc0003bc6c0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/healthz.go:55 +0x1a8
github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server.preparedGenericAPIServer.Run(0xc00638edc0, 0xc0003bc6c0, 0xc00638edc0, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/server/genericapiserver.go:290 +0x88
github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app.Run(0xc0001ff8c0, 0xc0003bc6c0, 0x0, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:153 +0x145
github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app.NewAPIServerCommand.func1(0xc001228780, 0x0, 0x0, 0x0, 0x1, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:115 +0x109
github.com/openshift/origin/pkg/cmd/openshift-kube-apiserver.RunOpenShiftKubeAPIServerServer(0xc001334800, 0xc0003bc6c0, 0x24, 0x7ffda2d38c27)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/pkg/cmd/openshift-kube-apiserver/server.go:70 +0x57e
github.com/openshift/origin/pkg/cmd/openshift-kube-apiserver.(*OpenShiftKubeAPIServerServer).RunAPIServer(0xc0003dde60, 0xc0003bc6c0, 0xc000aa38c0, 0xc000519800)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/pkg/cmd/openshift-kube-apiserver/cmd.go:129 +0x817
github.com/openshift/origin/pkg/cmd/openshift-kube-apiserver.NewOpenShiftKubeAPIServerServerCommand.func1(0xc00074b900, 0xc000394380, 0x0, 0x2)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/pkg/cmd/openshift-kube-apiserver/cmd.go:61 +0x10c
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).execute(0xc00074b900, 0xc0003942a0, 0x2, 0x2, 0xc00074b900, 0xc0003942a0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:760 +0x2cc
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc00074b180, 0xc001228000, 0xc00074bb80, 0xc00074b900)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:846 +0x2fd
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).Execute(0xc00074b180, 0xc00074b180, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:794 +0x2b
main.main()
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/cmd/hypershift/main.go:45 +0x2b8
F0614 15:04:19.543371       1 cmd.go:71] open /etc/kubernetes/static-pod-certs/secrets/user-serving-cert/tls.crt: no such file or directory
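
For reference, the (since removed) documented procedure had the user create a TLS secret in the openshift-config namespace and reference it from the cluster APIServer resource. A minimal sketch of that configuration, with an illustrative secret name, looks like:

```yaml
# Sketch of the setting the removed doc described; the secret name is illustrative.
# The referenced TLS secret lives in the openshift-config namespace.
apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec:
  servingCerts:
    defaultServingCertificate:
      name: my-default-api-cert
```

Once this is set, the observed servingInfo points at cert/key paths under /etc/kubernetes/static-pod-certs/secrets/user-serving-cert/ (see comment 1), but no file exists at that path on the node, matching the fatal `no such file or directory` error above.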

Comment 1 Ryan Howe 2019-06-15 00:45:26 UTC
Do we even do anything with this secret except take the name and set the observed servingInfo fields for the kube-apiserver CR?

      servingInfo:
        certFile: /etc/kubernetes/static-pod-certs/secrets/user-serving-cert/tls.crt
        keyFile: /etc/kubernetes/static-pod-certs/secrets/user-serving-cert/tls.key


The only function I see that does anything with the secret set for defaultServingCertificate in the apiserver CR is "observeDefaultUserServingCertificate".

https://github.com/openshift/cluster-kube-apiserver-operator
cluster-kube-apiserver-operator -- pkg/operator/configobservation/apiserver/observe_apiserver.go

Comment 2 Christian Huffman 2019-06-18 15:37:14 UTC
After discussion with engineering this is not a docs bug. Reassigning to Master.

Comment 4 David Eads 2019-07-10 15:32:43 UTC
This setting is inherently dangerous because we rely on the default serving cert to include critical IPs in its valid names so that the service network continues to function. Customers should instead use our SNI capabilities to provide their own certificates.

Because it didn't work and carries a lot of risk, we are going to remove this setting entirely, starting in https://github.com/openshift/cluster-kube-apiserver-operator/pull/518 .  You should use the SNI configuration instead.
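
The SNI-based alternative mentioned here uses namedCertificates, which serve a custom certificate only for the hostnames it lists. A minimal sketch, with illustrative hostname and secret names, might be:

```yaml
# Illustrative SNI configuration: the custom cert is served only for the
# listed hostname; internal service-network names keep the operator cert.
apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec:
  servingCerts:
    namedCertificates:
    - names:
      - api.example.com             # external API hostname (illustrative)
      servingCertificate:
        name: my-api-cert           # TLS secret in openshift-config (illustrative)
```

Because the certificate is selected by SNI, requests addressed to the critical internal IPs and service names continue to get the operator-managed default cert, avoiding the risk described above.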

Comment 6 David Eads 2019-07-16 11:47:31 UTC
@Anshul Verma You seem to have crossed bugs.  This bug is about the default serving cert.  Do you have a bug open for setting SNI certificates that we've missed?  Also, if you do, be sure to add must-gather information.

Comment 8 David Eads 2019-07-17 18:57:19 UTC
As noted in comment 4, you should use SNI configuration.  If that fails, please open a different bug and attach must-gather output.

You should remove your default serving certificate configuration entirely.
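
The removal advised here can be sketched as the resulting APIServer spec, assuming the setting was applied under spec.servingCerts as the removed doc described:

```yaml
# Delete the defaultServingCertificate stanza entirely rather than leaving it
# pointing at a secret; add namedCertificates here if SNI certs are needed.
apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec:
  servingCerts: {}
```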

Comment 12 David Eads 2019-07-24 12:05:27 UTC
Can you write up a bug for the SNI configuration issue and attach must-gather output to it? This definitely fixed the default serving cert problem; if there is a different issue, we need the configuration, the various configmaps, etc., to chase it down.

Comment 15 David Eads 2019-08-21 17:25:18 UTC
This bug is about " Instructions for adding default API server default certificate fails to configure serving cert".  We fixed the bug in https://github.com/openshift/openshift-docs/pull/15642 by removing that doc entirely and in https://github.com/openshift/cluster-kube-apiserver-operator/pull/518 by not honoring the setting.

For SNI, we'll want the requested gather on a new bug.

Comment 16 David Eads 2019-08-21 18:57:06 UTC
For completeness, the must-gather linked in the case shows a healthy clusteroperator/kube-apiserver and three healthy kube-apiserver pods.

Comment 19 Michal Fojtik 2019-08-26 10:26:51 UTC
This bug was fixed in the 4.1 documentation; no code changes were merged into the payload.

