Description of problem:
The heapster pod logs the error "x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs".

Version-Release number of selected component (if applicable):
devenv_rhel7_2504
openshift v1.0.6-801-ge74f43a
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
100%

Steps to Reproduce:
1. Build the related images on an AWS instance:
   $ git clone https://github.com/openshift/origin-metrics.git
   $ cd hack
   $ ./build-images.sh --prefix=openshift/origin- --version=3.1.0
2. Log into the OpenShift server and create a project named "chunpj".
3. Create the deployer service account:
   $ oc create -f https://raw.githubusercontent.com/openshift/origin-metrics/master/metrics-deployer-setup.yaml
4. Add permissions for the service accounts:
   $ oadm policy add-role-to-user edit system:serviceaccount:chunpj:metrics-deployer
   $ oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:chunpj:heapster
5. Create the Hawkular deployer secret:
   $ oc secrets new metrics-deployer nothing=/dev/null
6. Deploy the heapster pod via the template:
   $ oc process -f https://raw.githubusercontent.com/openshift/origin-metrics/master/metrics.yaml -v HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.example.com,IMAGE_PREFIX=openshift/origin-,IMAGE_VERSION=3.1.0,USE_PERSISTENT_STORAGE=false | oc create -f -
7. After the deployment finishes, check the heapster pod's log:
   $ oc logs heapster-lsx6l

Actual results:
<--------snip---------->
I1021 09:07:31.800312 1 external.go:116] no timeseries data between 0001-01-01 00:00:00 +0000 UTC and 0001-01-01 00:00:00 +0000 UTC
I1021 09:07:31.800430 1 external.go:138] Storing Timeseries to "Hawkular-Metrics Sink"
I1021 09:07:31.800455 1 external.go:142] Storing Events to "Hawkular-Metrics Sink"
I1021 09:07:35.000178 1 manager.go:162] starting to scrape data from sources start: 2015-10-21 09:07:30 +0000 UTC end: 2015-10-21 09:07:35 +0000 UTC
I1021 09:07:35.000225 1 manager.go:103] attempting to get data from source "Kube Pods Source"
I1021 09:07:35.000292 1 pods.go:147] selected pods from api server [{pod:0xc20823bfc0 nodeInfo:0xc20828ed00 namespace:0xc20832d400} {pod:0xc20823c1b0 nodeInfo:0xc20828ed40 namespace:0xc20832d4e8} {pod:0xc20823b800 nodeInfo:0xc20828ed80 namespace:0xc20832d400} {pod:0xc20823b9f0 nodeInfo:0xc20828edc0 namespace:0xc20832d400} {pod:0xc20823bbe0 nodeInfo:0xc20828ee00 namespace:0xc20832d400} {pod:0xc20823bdd0 nodeInfo:0xc20828ee40 namespace:0xc20832d400}]
I1021 09:07:35.000394 1 manager.go:103] attempting to get data from source "Kube Node Metrics Source"
I1021 09:07:35.000421 1 kube_nodes.go:123] Fetched list of nodes from the master
I1021 09:07:35.000431 1 manager.go:103] attempting to get data from source "kube-events"
I1021 09:07:35.000444 1 kube_events.go:213] Fetched list of events from the master
I1021 09:07:35.000449 1 kube_events.go:214] []
I1021 09:07:35.000481 1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/metrics-deployer-al9ap/86152640-77d2-11e5-b314-22000bdd4192/deployer"
I1021 09:07:35.000560 1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/default/docker-registry-1-dhinq/b400f2e9-77cd-11e5-b314-22000bdd4192/registry"
I1021 09:07:35.000617 1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/hawkular-cassandra-1-hixsa/981a3d90-77d2-11e5-b314-22000bdd4192/hawkular-cassandra-1"
I1021 09:07:35.000667 1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-98bad/ed4f89c9-77d0-11e5-b314-22000bdd4192/hawkular-metrics"
I1021 09:07:35.000711 1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-jd1eg/a943f6a1-77d2-11e5-b314-22000bdd4192/hawkular-metrics"
I1021 09:07:35.000757 1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/heapster-lsx6l/a95b78d2-77d2-11e5-b314-22000bdd4192/heapster"
I1021 09:07:35.007430 1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/metrics-deployer-al9ap/86152640-77d2-11e5-b314-22000bdd4192/deployer - Get https://10.109.187.185:10250/stats/chunpj/metrics-deployer-al9ap/86152640-77d2-11e5-b314-22000bdd4192/deployer: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.007458 1 kube_pods.go:108] failed to get stats for container "deployer" in pod "chunpj"/"metrics-deployer-al9ap"
I1021 09:07:35.023079 1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/default/docker-registry-1-dhinq/b400f2e9-77cd-11e5-b314-22000bdd4192/registry - Get https://10.109.187.185:10250/stats/default/docker-registry-1-dhinq/b400f2e9-77cd-11e5-b314-22000bdd4192/registry: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.023097 1 kube_pods.go:108] failed to get stats for container "registry" in pod "default"/"docker-registry-1-dhinq"
I1021 09:07:35.138177 1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/hawkular-cassandra-1-hixsa/981a3d90-77d2-11e5-b314-22000bdd4192/hawkular-cassandra-1 - Get https://10.109.187.185:10250/stats/chunpj/hawkular-cassandra-1-hixsa/981a3d90-77d2-11e5-b314-22000bdd4192/hawkular-cassandra-1: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.138206 1 kube_pods.go:108] failed to get stats for container "hawkular-cassandra-1" in pod "chunpj"/"hawkular-cassandra-1-hixsa"
I1021 09:07:35.152113 1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-98bad/ed4f89c9-77d0-11e5-b314-22000bdd4192/hawkular-metrics - Get https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-98bad/ed4f89c9-77d0-11e5-b314-22000bdd4192/hawkular-metrics: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.152131 1 kube_pods.go:108] failed to get stats for container "hawkular-metrics" in pod "chunpj"/"hawkular-metrics-98bad"
I1021 09:07:35.171184 1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-jd1eg/a943f6a1-77d2-11e5-b314-22000bdd4192/hawkular-metrics - Get https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-jd1eg/a943f6a1-77d2-11e5-b314-22000bdd4192/hawkular-metrics: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.171200 1 kube_pods.go:108] failed to get stats for container "hawkular-metrics" in pod "chunpj"/"hawkular-metrics-jd1eg"
I1021 09:07:35.185075 1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/heapster-lsx6l/a95b78d2-77d2-11e5-b314-22000bdd4192/heapster - Get https://10.109.187.185:10250/stats/chunpj/heapster-lsx6l/a95b78d2-77d2-11e5-b314-22000bdd4192/heapster: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.185092 1 kube_pods.go:108] failed to get stats for container "heapster" in pod "chunpj"/"heapster-lsx6l"
I1021 09:07:35.210060 1 kube_nodes.go:56] Failed to get container stats from Kubelet on node "ip-10-109-187-185.ec2.internal"
I1021 09:07:35.210083 1 manager.go:175] completed scraping data from sources. Errors: []

Expected results:
The heapster pod should not log such certificate validation errors.

Additional info:
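The error comes from Go's crypto/x509 validation: when a client dials a bare IP address, the server certificate must list that IP in its Subject Alternative Name extension; a hostname-only CN is not enough. A minimal local sketch of the failing condition, using a self-signed throwaway certificate (the CN below mirrors this node's EC2 hostname and is purely illustrative):

```shell
# Create a self-signed cert whose subject is only a hostname CN,
# with no subjectAltName extension at all -- mimicking the node
# certificate that triggers this bug.
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout /tmp/kubelet.key -out /tmp/kubelet.crt -days 1 \
    -subj "/CN=ip-10-109-187-185.ec2.internal"

# Inspect it: there is no SAN extension, so a client connecting to
# https://10.109.187.185:10250 has no IP entry to match against.
openssl x509 -in /tmp/kubelet.crt -noout -text \
    | grep "Subject Alternative Name" || echo "no SANs in certificate"
```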
I was finally able to reproduce the problem locally on my machine, and I now understand why it worked for me locally but would cause problems for almost anyone else. It's currently being tracked upstream at https://github.com/kubernetes/heapster/issues/663

There should be a simple workaround for this (at least for now), but that workaround doesn't apply in this situation: https://github.com/kubernetes/heapster/issues/662

There are a couple of more elaborate workarounds:
1) Set the hostname of the node to its IP address. This is why it was working for me locally without problems.
2) Reconfigure your system to use the read-only kubelet endpoint. This gets around the issue, but it requires extra steps when configuring your OpenShift instance and is not a setup we should be using.
This appears to be an issue with how OpenShift generates its certificates: https://github.com/openshift/origin/issues/5294

The easy workaround for now is to set --hostname to the IP address of the host, instead of the actual hostname, when generating your configuration.
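As an illustration of the workaround above, something along these lines should regenerate the configuration with an IP-based identity, so the node's serving certificate carries an IP SAN that matches what heapster dials (the IP here is this reporter's node address; substitute your own, and check `openshift start --help` on your build for the exact flags):

```
openshift start --hostname=10.109.187.185 \
    --write-config=/tmp/openshift.local.config
```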
*** Bug 1272976 has been marked as a duplicate of this bug. ***
(In reply to Matt Wringe from comment #2)
> This appears to be an issue with how OpenShift generates its certificates:
> https://github.com/openshift/origin/issues/5294
>
> The easy work around for now is to set the --hostname to the IP address of
> the host instead of the actual hostname when generating your configuration.

Yes, it works for me when I set --hostname to the IP address of the host.
Duplicate GitHub issue: https://github.com/openshift/origin/issues/5294
Reassigning to liggitt since he owns the GitHub issue.
Fixed in https://github.com/openshift/origin/pull/5510
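For reference, a certificate that validates against a bare IP must list that IP in its Subject Alternative Name extension, which is what the corrected certificate generation produces. A local sketch of a well-formed serving certificate (hostname and IP taken from the verification environment below; requires OpenSSL >= 1.1.1 for -addext):

```shell
# Self-signed cert carrying both a DNS SAN and an IP SAN, so clients
# can validate it whether they dial the hostname or the IP.
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout /tmp/node.key -out /tmp/node.crt -days 1 \
    -subj "/CN=ip-172-18-3-105.ec2.internal" \
    -addext "subjectAltName=DNS:ip-172-18-3-105.ec2.internal,IP:172.18.3.105"

# The SAN extension now lists the IP entry Go's validator looks for.
openssl x509 -in /tmp/node.crt -noout -text | grep "IP Address"
```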
It's fixed; verified on devenv_rhel7_2619. Please refer to the messages below:

[root@ip-172-18-3-105 hack]# ps -ef | grep open
root 18023 17995 8 21:16 pts/0 00:01:27 openshift start --public-master=ec2-52-91-81-137.compute-1.amazonaws.com --latest-images=true
root 22387 17995 0 21:33 pts/0 00:00:00 grep --color=auto open

[chunchen@F17-CCY daily]$ oc logs heapster-s4d21
<---------------snip------------------>
I1102 02:37:40.029869 1 kubelet.go:99] url: "https://172.18.3.105:10250/stats/default/docker-registry-1-f5yg7/f640ce06-8107-11e5-92d2-0ecb98477595/registry", body: "{\"num_stats\":60,\"start\":\"2015-11-02T02:37:35Z\",\"end\":\"2015-11-02T02:37:40Z\"}", data: {ContainerReference:{Name:/system.slice/docker-94352b158cbdce23a6c5b996028a9f33b55284b9cff6d5c062b51b3fa36e213a.scope Aliases:[k8s_registry.99d9127d_docker-registry-1-f5yg7_default_f640ce06-8107-11e5-92d2-0ecb98477595_e5a37810 94352b158cbdce23a6c5b996028a9f33b55284b9cff6d5c062b51b3fa36e213a] Namespace:docker} Subcontainers:[] Spec:{CreationTime:2015-11-02 02:19:02.804101333 +0000 UTC Labels:map[License:GPLv2 Vendor:CentOS io.kubernetes.pod.name:default/docker-registry-1-f5yg7 io.kubernetes.pod.terminationGracePeriod:30] HasCpu:true Cpu:{Limit:2 MaxLimit:0 Mask:0-1} HasMemory:true Memory:{Limit:18446744073709551615 Reservation:0 SwapLimit:18446744073709551615} HasNetwork:false HasFilesystem:false HasDiskIo:true HasCustomMetrics:false CustomMetrics:[]} Stats:[0xc20867fa00 0xc20867fc00 0xc2082be000]}
I am a bit confused about why bug 1277842 was created and why it blocks this issue. Is 1277842 not a duplicate of this issue? Or does it need to exist because of the specific version number?
Sorry for the confusion. This issue is fixed on Origin, but bug 1277842 was reproduced against an OSE environment, so I am removing the blocks link.