Bug 1273818 - [intservice_public_121] Met "x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs" in heapster Pod
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Jordan Liggitt
QA Contact: chunchen
URL:
Whiteboard:
Duplicates: 1272976
Depends On:
Blocks:
Reported: 2015-10-21 10:18 UTC by chunchen
Modified: 2018-07-26 19:09 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1277842
Environment:
Last Closed: 2015-11-23 21:16:45 UTC
Target Upstream Version:
Embargoed:



Description chunchen 2015-10-21 10:18:40 UTC
Description of problem:
The heapster pod logs "x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs" whenever it queries the kubelet for stats.

Version-Release number of selected component (if applicable):
devenv_rhel7_2504
openshift v1.0.6-801-ge74f43a
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
100%

Steps to Reproduce:
1. Build related images on AWS instance
$ git clone https://github.com/openshift/origin-metrics.git
$ cd origin-metrics/hack
$ ./build-images.sh --prefix=openshift/origin- --version=3.1.0

2. Log into openshift server and create a project named "chunpj"

3. Create the Deployer Service Account
$ oc create -f https://raw.githubusercontent.com/openshift/origin-metrics/master/metrics-deployer-setup.yaml

4. Add permissions for service account
$ oadm policy add-role-to-user edit system:serviceaccount:chunpj:metrics-deployer
$ oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:chunpj:heapster

5. Create the Hawkular Deployer Secret
$ oc secrets new metrics-deployer nothing=/dev/null

6. Deploy heapster pod via template
$ oc process -f https://raw.githubusercontent.com/openshift/origin-metrics/master/metrics.yaml -v HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.example.com,IMAGE_PREFIX=openshift/origin-,IMAGE_VERSION=3.1.0,USE_PERSISTENT_STORAGE=false | oc create -f - 

7. After deployment is finished, check the heapster pod's log
$ oc logs heapster-lsx6l

Actual results:
<--------snip---------->
I1021 09:07:31.800312       1 external.go:116] no timeseries data between 0001-01-01 00:00:00 +0000 UTC and 0001-01-01 00:00:00 +0000 UTC
I1021 09:07:31.800430       1 external.go:138] Storing Timeseries to "Hawkular-Metrics Sink"
I1021 09:07:31.800455       1 external.go:142] Storing Events to "Hawkular-Metrics Sink"
I1021 09:07:35.000178       1 manager.go:162] starting to scrape data from sources start: 2015-10-21 09:07:30 +0000 UTC end: 2015-10-21 09:07:35 +0000 UTC
I1021 09:07:35.000225       1 manager.go:103] attempting to get data from source "Kube Pods Source"
I1021 09:07:35.000292       1 pods.go:147] selected pods from api server [{pod:0xc20823bfc0 nodeInfo:0xc20828ed00 namespace:0xc20832d400} {pod:0xc20823c1b0 nodeInfo:0xc20828ed40 namespace:0xc20832d4e8} {pod:0xc20823b800 nodeInfo:0xc20828ed80 namespace:0xc20832d400} {pod:0xc20823b9f0 nodeInfo:0xc20828edc0 namespace:0xc20832d400} {pod:0xc20823bbe0 nodeInfo:0xc20828ee00 namespace:0xc20832d400} {pod:0xc20823bdd0 nodeInfo:0xc20828ee40 namespace:0xc20832d400}]
I1021 09:07:35.000394       1 manager.go:103] attempting to get data from source "Kube Node Metrics Source"
I1021 09:07:35.000421       1 kube_nodes.go:123] Fetched list of nodes from the master
I1021 09:07:35.000431       1 manager.go:103] attempting to get data from source "kube-events"
I1021 09:07:35.000444       1 kube_events.go:213] Fetched list of events from the master
I1021 09:07:35.000449       1 kube_events.go:214] []
I1021 09:07:35.000481       1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/metrics-deployer-al9ap/86152640-77d2-11e5-b314-22000bdd4192/deployer"
I1021 09:07:35.000560       1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/default/docker-registry-1-dhinq/b400f2e9-77cd-11e5-b314-22000bdd4192/registry"
I1021 09:07:35.000617       1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/hawkular-cassandra-1-hixsa/981a3d90-77d2-11e5-b314-22000bdd4192/hawkular-cassandra-1"
I1021 09:07:35.000667       1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-98bad/ed4f89c9-77d0-11e5-b314-22000bdd4192/hawkular-metrics"
I1021 09:07:35.000711       1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-jd1eg/a943f6a1-77d2-11e5-b314-22000bdd4192/hawkular-metrics"
I1021 09:07:35.000757       1 kubelet.go:110] about to query kubelet using url: "https://10.109.187.185:10250/stats/chunpj/heapster-lsx6l/a95b78d2-77d2-11e5-b314-22000bdd4192/heapster"
I1021 09:07:35.007430       1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/metrics-deployer-al9ap/86152640-77d2-11e5-b314-22000bdd4192/deployer - Get https://10.109.187.185:10250/stats/chunpj/metrics-deployer-al9ap/86152640-77d2-11e5-b314-22000bdd4192/deployer: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.007458       1 kube_pods.go:108] failed to get stats for container "deployer" in pod "chunpj"/"metrics-deployer-al9ap"
I1021 09:07:35.023079       1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/default/docker-registry-1-dhinq/b400f2e9-77cd-11e5-b314-22000bdd4192/registry - Get https://10.109.187.185:10250/stats/default/docker-registry-1-dhinq/b400f2e9-77cd-11e5-b314-22000bdd4192/registry: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.023097       1 kube_pods.go:108] failed to get stats for container "registry" in pod "default"/"docker-registry-1-dhinq"
I1021 09:07:35.138177       1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/hawkular-cassandra-1-hixsa/981a3d90-77d2-11e5-b314-22000bdd4192/hawkular-cassandra-1 - Get https://10.109.187.185:10250/stats/chunpj/hawkular-cassandra-1-hixsa/981a3d90-77d2-11e5-b314-22000bdd4192/hawkular-cassandra-1: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.138206       1 kube_pods.go:108] failed to get stats for container "hawkular-cassandra-1" in pod "chunpj"/"hawkular-cassandra-1-hixsa"
I1021 09:07:35.152113       1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-98bad/ed4f89c9-77d0-11e5-b314-22000bdd4192/hawkular-metrics - Get https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-98bad/ed4f89c9-77d0-11e5-b314-22000bdd4192/hawkular-metrics: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.152131       1 kube_pods.go:108] failed to get stats for container "hawkular-metrics" in pod "chunpj"/"hawkular-metrics-98bad"
I1021 09:07:35.171184       1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-jd1eg/a943f6a1-77d2-11e5-b314-22000bdd4192/hawkular-metrics - Get https://10.109.187.185:10250/stats/chunpj/hawkular-metrics-jd1eg/a943f6a1-77d2-11e5-b314-22000bdd4192/hawkular-metrics: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.171200       1 kube_pods.go:108] failed to get stats for container "hawkular-metrics" in pod "chunpj"/"hawkular-metrics-jd1eg"
I1021 09:07:35.185075       1 kubelet.go:96] failed to get stats from kubelet url: https://10.109.187.185:10250/stats/chunpj/heapster-lsx6l/a95b78d2-77d2-11e5-b314-22000bdd4192/heapster - Get https://10.109.187.185:10250/stats/chunpj/heapster-lsx6l/a95b78d2-77d2-11e5-b314-22000bdd4192/heapster: x509: cannot validate certificate for 10.109.187.185 because it doesn't contain any IP SANs
I1021 09:07:35.185092       1 kube_pods.go:108] failed to get stats for container "heapster" in pod "chunpj"/"heapster-lsx6l"
I1021 09:07:35.210060       1 kube_nodes.go:56] Failed to get container stats from Kubelet on node "ip-10-109-187-185.ec2.internal"
I1021 09:07:35.210083       1 manager.go:175] completed scraping data from sources. Errors: []

Expected results:
The heapster pod should be able to query the kubelet without this certificate error.

Additional info:

Comment 1 Matt Wringe 2015-10-21 13:54:43 UTC
I was finally able to reproduce the problem locally, and I now understand why it worked for me but would fail for almost anyone else.

It's currently being tracked upstream at https://github.com/kubernetes/heapster/issues/663

There should be a simple workaround for this (at least for now), but it doesn't work in this situation: https://github.com/kubernetes/heapster/issues/662


There are a couple of more elaborate workarounds:

1) Set the hostname of the node to its IP address. This is why it worked for me locally without any problem.

2) Reconfigure your system to use the read-only (RO) endpoint. This gets around the issue, but it requires extra steps when configuring your OpenShift instance and is not a setup we should be using.

Comment 2 Matt Wringe 2015-10-21 21:51:27 UTC
This appears to be an issue with how OpenShift generates its certificates: https://github.com/openshift/origin/issues/5294

The easy workaround for now is to set --hostname to the IP address of the host, instead of the actual hostname, when generating your configuration.
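
To illustrate why the hostname-versus-IP distinction matters, here is a simplified Python model of the host check that produces the error above. It is only a sketch, not the Go crypto/x509 code that heapster actually runs: the function name verify_host and its parameters are made up for this example, wildcard DNS matching is left out, and the second error string is only an approximation of the real one. The rule it models is that a client connecting by IP address compares that address only against the certificate's IP SANs and never against its DNS names.

```python
import ipaddress


def verify_host(host, dns_sans=(), ip_sans=()):
    """Simplified model of TLS host verification (hypothetical helper).

    Returns None when the certificate is acceptable for `host`,
    otherwise an error string.
    """
    try:
        target_ip = ipaddress.ip_address(host)
    except ValueError:
        target_ip = None  # not an IP literal, treat it as a DNS name

    if target_ip is not None:
        # An IP target is compared only against IP SANs, never DNS names.
        if any(target_ip == ipaddress.ip_address(s) for s in ip_sans):
            return None
        if not ip_sans:
            return ("x509: cannot validate certificate for %s because it "
                    "doesn't contain any IP SANs" % host)
        return "x509: certificate is valid for %s, not %s" % (
            ", ".join(ip_sans), host)

    # A DNS target is compared only against DNS SANs (exact match only here).
    if host.lower() in (name.lower() for name in dns_sans):
        return None
    return "x509: certificate is valid for %s, not %s" % (
        ", ".join(dns_sans), host)


# Certificate issued for the node's hostname only: the failing case in this bug.
print(verify_host("10.109.187.185",
                  dns_sans=["ip-10-109-187-185.ec2.internal"]))

# Certificate that also lists the address as an IP SAN: the check passes.
print(verify_host("10.109.187.185",
                  dns_sans=["ip-10-109-187-185.ec2.internal"],
                  ip_sans=["10.109.187.185"]))
```

Under this model, the --hostname workaround presumably helps because the generated certificate then carries the node's address as an IP SAN, which is what heapster needs when it reaches the kubelet at https://10.109.187.185:10250.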

Comment 3 Matt Wringe 2015-10-21 21:53:02 UTC
*** Bug 1272976 has been marked as a duplicate of this bug. ***

Comment 4 chunchen 2015-10-22 05:33:06 UTC
(In reply to Matt Wringe from comment #2)
> This appears to be an issue with how OpenShift generates its certificates:
> https://github.com/openshift/origin/issues/5294
> 
> The easy work around for now is to set the --hostname to the IP address of
> the host instead of the actual hostname when generating your configuration.

Yes, it works for me when --hostname is set to the IP address of the host.

Comment 5 Jeff Cantrill 2015-10-29 17:32:17 UTC
Duplicate Github issue: https://github.com/openshift/origin/issues/5294

Comment 6 Jeff Cantrill 2015-10-29 17:33:26 UTC
Reassigning to liggitt as he owned the github issue

Comment 7 Jordan Liggitt 2015-10-31 18:38:30 UTC
Fixed in https://github.com/openshift/origin/pull/5510

Comment 8 chunchen 2015-11-02 02:53:10 UTC
It's fixed; verified on devenv_rhel7_2619. See the output below:

[root@ip-172-18-3-105 hack]# ps -ef |grep open
root     18023 17995  8 21:16 pts/0    00:01:27 openshift start --public-master=ec2-52-91-81-137.compute-1.amazonaws.com --latest-images=true
root     22387 17995  0 21:33 pts/0    00:00:00 grep --color=auto open

[chunchen@F17-CCY daily]$ oc logs heapster-s4d21
<---------------snip------------------>
I1102 02:37:40.029869       1 kubelet.go:99] url: "https://172.18.3.105:10250/stats/default/docker-registry-1-f5yg7/f640ce06-8107-11e5-92d2-0ecb98477595/registry", body: "{\"num_stats\":60,\"start\":\"2015-11-02T02:37:35Z\",\"end\":\"2015-11-02T02:37:40Z\"}", data: {ContainerReference:{Name:/system.slice/docker-94352b158cbdce23a6c5b996028a9f33b55284b9cff6d5c062b51b3fa36e213a.scope Aliases:[k8s_registry.99d9127d_docker-registry-1-f5yg7_default_f640ce06-8107-11e5-92d2-0ecb98477595_e5a37810 94352b158cbdce23a6c5b996028a9f33b55284b9cff6d5c062b51b3fa36e213a] Namespace:docker} Subcontainers:[] Spec:{CreationTime:2015-11-02 02:19:02.804101333 +0000 UTC Labels:map[License:GPLv2 Vendor:CentOS io.kubernetes.pod.name:default/docker-registry-1-f5yg7 io.kubernetes.pod.terminationGracePeriod:30] HasCpu:true Cpu:{Limit:2 MaxLimit:0 Mask:0-1} HasMemory:true Memory:{Limit:18446744073709551615 Reservation:0 SwapLimit:18446744073709551615} HasNetwork:false HasFilesystem:false HasDiskIo:true HasCustomMetrics:false CustomMetrics:[]} Stats:[0xc20867fa00 0xc20867fc00 0xc2082be000]}

Comment 9 Matt Wringe 2015-11-04 14:05:04 UTC
I am a bit confused about why bug 1277842 was created and why it blocks this issue. Is 1277842 not a duplicate of this issue? Or does it need to exist because of the specific version number?

Comment 10 chunchen 2015-11-05 07:44:05 UTC
Sorry for the confusion. This issue is fixed in Origin, but bug 1277842 was reproduced against an OSE environment. I am removing the "blocks" relationship.

