Bug 1416240

Summary: oc cluster up
Product: OpenShift Container Platform Reporter: Veer Muchandi <veer>
Component: oc Assignee: Steve Kuznetsov <skuznets>
Status: CLOSED CURRENTRELEASE QA Contact: Wenjing Zheng <wzheng>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.4.0 CC: aos-bugs, dyan, jokerman, mmccomas, mwringe, nschuetz, ramon.gordillo, rsoares, ssadeghi, tdawson, xxia
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
`oc cluster up` will no longer choose an incorrect image for the Aggregated Logging and Metrics deployer containers.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-03-02 21:04:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Veer Muchandi 2017-01-25 02:46:05 UTC
Description of problem:
oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.4.0.40-1 --metrics=true

does not bring up metrics 

Version-Release number of selected component (if applicable):
version=v3.4.0.40-1

How reproducible:
Easy to reproduce if you have docker on your box and run oc cluster up.

Steps to Reproduce:
1. Run oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.4.0.40-1 --metrics=true
2. The cluster comes up. However, when you click on the metrics URL, it shows that metrics is not available.
3. oc login -u system:admin 
and 
oc project openshift-infra
and
oc get events
it shows
FailedSync   {kubelet 192.168.64.3}   Error syncing pod, skipping: failed to "StartContainer" for "deployer" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/openshift3/ose-metrics-deployer:v3.4.0.40-1\""

This is because there is no image named ose-metrics-deployer. The image name is metrics-deployer.
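
For illustration, the failing pull is consistent with the image reference being built as <value of --image>-<component>:<version>. The sketch below only reproduces that string; the naming rule is an assumption inferred from the event above, and image_ref is a hypothetical helper, not part of oc.

```shell
#!/bin/bash
# Hypothetical helper: build an image reference as <prefix>-<component>:<version>.
# The naming rule is an assumption inferred from the ImagePullBackOff event.
image_ref() {
    local prefix=$1 component=$2 version=$3
    printf '%s-%s:%s\n' "$prefix" "$component" "$version"
}

# Reproduces the image name seen in the failing event.
image_ref registry.access.redhat.com/openshift3/ose metrics-deployer v3.4.0.40-1
```

Under that rule, the "ose" suffix of the --image value ends up in the component image name, while the registry only carries openshift3/metrics-deployer.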

Actual results:

Metrics doesn't work

Expected results:
Metrics is supposed to get deployed


Additional info:

The easiest fix is to publish the image as ose-metrics-deployer in the registry.
If that is not possible, the code should be changed.

Comment 2 Matt Wringe 2017-01-25 14:14:38 UTC
Moving this to the installer component and not metrics.

Comment 3 Veer Muchandi 2017-01-31 20:53:13 UTC
This seems like an image naming issue: there is no image named ose-metrics-deployer. Currently the image name is metrics-deployer.

Comment 4 Steve Kuznetsov 2017-01-31 21:54:25 UTC
Yes. For the time being `oc cluster up` is being changed to accommodate this naming error. I believe (but do not hold me to this) that we intend to re-publish the images as openshift3/ose-metrics-deployer, etc. in the OCP 3.5 timeframe.

Comment 5 Siamak Sadeghianfar 2017-02-01 10:00:40 UTC
Can't we have the image re-tagged in RH registry as openshift3/ose-metrics-deployer:v3.4 so that "oc cluster" works for now till it gets fixed properly in OCP 3.5?

That would allow this to work:
oc cluster up --version=v3.4 --metrics

Comment 6 Steve Kuznetsov 2017-02-01 12:30:09 UTC
We are looking into publishing the image under both names to allow for this. There were more comments and concerns about the image re-naming and/or re-publishing steps than I had anticipated. The changes to `cluster up` in the linked PR may be premature. I'll update here when we have a clearer approach. Sorry for the noise.

Comment 7 Troy Dawson 2017-02-03 22:43:22 UTC
Moving to assigned. Modified is for when everything is ready; we are waiting for rpms and/or images to be built.

Comment 8 Ramon Gordillo 2017-02-09 10:48:36 UTC
Hi.

A workaround (until this bug is fixed) can be to re-tag the images in the docker daemon.

A simple bash script like:

-----
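#!/bin/bash

# Tag to pull and registry namespace that holds the metrics images.
version=v3.4
repo=registry.access.redhat.com/openshift3

# Pull each image under its published name, re-tag it with the "ose-" prefix
# that `oc cluster up` asks for, then drop the original tag.
for i in metrics-deployer metrics-cassandra metrics-heapster metrics-hawkular-metrics
do
    docker pull "$repo/$i:$version"
    docker tag "$repo/$i:$version" "$repo/ose-$i:$version"
    docker rmi "$repo/$i:$version"
done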
-----

Can do the trick. But I have also found another bug, related to an extra param in the templates that I guess is not included in the metrics bootstrap. There is another bugzilla for it, https://bugzilla.redhat.com/show_bug.cgi?id=141268.

Thanks

Comment 9 Ramon Gordillo 2017-02-09 10:50:45 UTC
Sorry, C&P typo. https://bugzilla.redhat.com/show_bug.cgi?id=1415268

Comment 10 Rafael Soares (Tuelho) 2017-02-10 17:34:35 UTC
The re-tag workaround is not enough. The metrics deployer fails during the hawkular build with the following error log:

Deploying Hawkular Metrics & Cassandra Components
scripts/hawkular.sh: line 200: STARTUP_TIMEOUT: unbound variable
error: no objects passed to create

see https://github.com/openshift/origin/issues/11965

Comment 11 Ramon Gordillo 2017-02-10 19:18:29 UTC
That is exactly what the previous bugzilla (1415268) is about. It seems the template has changed and the deployer does not use a default value.

As an additional step, run the following with admin privileges (use redeploy, because the previous deploy failed at some step after others had already completed):

oc process -f https://raw.githubusercontent.com/openshift/openshift-ansible/master/roles/openshift_hosted_templates/files/v1.4/enterprise/metrics-deployer.yaml \
    -p IMAGE_PREFIX=openshift3/ose- \
    -p IMAGE_VERSION=v3.4 \
    -p HAWKULAR_METRICS_HOSTNAME=<PUT_YOUR_METRICS_URL> \
    -p USE_PERSISTENT_STORAGE=false \
    -p MODE=redeploy \
    -p METRIC_RESOLUTION=10s | oc create -n openshift-infra -f -

Comment 12 Steve Kuznetsov 2017-02-22 12:58:20 UTC
Troy, does this need to be moved to a different status, as we're waiting on the ticket now?

Comment 13 Troy Dawson 2017-02-23 21:11:52 UTC
With the latest errata, this should be fixed now for 3.2, 3.3 and 3.4. The images that get pushed to registry.access.redhat.com are now both openshift3/{metrics,logging}-<name> and openshift3/ose-{metrics,logging}-<name>.

Example:
If we curl registry.access.redhat.com for both openshift3/metrics-deployer and openshift3/ose-metrics-deployer we see that they both have the same docker sum.

openshift3/metrics-deployer:v3.4.1.7-4
    "3.4.1": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "latest": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "v3.4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "v3.4.1.7": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "v3.4.1.7-4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708"

openshift3/ose-metrics-deployer:v3.4.1.7-4
    "3.4.1": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "latest": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "v3.4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "v3.4.1.7": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
    "v3.4.1.7-4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708"
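
A comparison like the one above can be scripted once the two listings are saved to files. This is a minimal sketch, assuming each file holds lines in the "tag": "id" form shown above; the function names and file names are hypothetical, and fetching the listings from the registry is left out.

```shell
#!/bin/bash
# Normalize a saved tag listing: strip whitespace and trailing commas,
# drop empty lines, then sort so that line order does not matter.
normalize_tags() {
    sed -e 's/[[:space:]]//g' -e 's/,$//' -e '/^$/d' "$1" | sort
}

# Succeeds (exit status 0) when both listings map the same tags to the same IDs.
same_tags() {
    [ "$(normalize_tags "$1")" = "$(normalize_tags "$2")" ]
}
```

Usage, with hypothetical file names: same_tags metrics-deployer.tags ose-metrics-deployer.tags && echo "identical"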

Comment 14 Veer Muchandi 2017-02-23 21:29:52 UTC
Tried testing 3.4.1.7 on a Mac. Here is the issue now.


$ oc cluster up --host-data-dir=/Users/veer/occlusterdata/hostdata --host-config-dir=/Users/veer/occlusterdata/hostconfig --image=registry.access.redhat.com/openshift3/ose --version=v3.4.1.7 --metrics=true --routing-suffix apps.10.61.135.205.xip.io


$ oc logs -f heapster-mmv9u -n openshift-infra
Endpoint Check in effect. Checking https://hawkular-metrics:443/hawkular/metrics/status
Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. Curl exit code: 28. Status Code 000
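
For reference when reading the log above: curl exit code 28 means the operation timed out, and status code 000 means no HTTP response was received. A small sketch that maps a few common curl exit codes to descriptions follows; describe_curl_exit is a hypothetical helper, and the meanings are taken from the EXIT CODES section of the curl man page.

```shell
#!/bin/bash
# Map a few common curl exit codes to short descriptions.
# Hypothetical helper; meanings follow the curl man page (EXIT CODES).
describe_curl_exit() {
    case "$1" in
        0)  echo "success" ;;
        6)  echo "could not resolve host" ;;
        7)  echo "failed to connect to host" ;;
        28) echo "operation timed out" ;;
        35) echo "SSL/TLS handshake failed" ;;
        60) echo "server certificate could not be verified" ;;
        *)  echo "other error (see the curl man page)" ;;
    esac
}

# Exit code reported in the heapster log.
describe_curl_exit 28
```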

Comment 15 Steve Kuznetsov 2017-02-24 13:06:40 UTC
Veer, 

That seems like an unrelated problem to this BZ -- we have chosen the correct images, at least.

Comment 16 Dongbo Yan 2017-02-27 06:54:06 UTC
Verified
oc v3.5.0.33
kubernetes v1.5.2+43a9be4

$ sudo oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.4 --metrics=true 

$ sudo oc get pod -n openshift-infra
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-ymn2g   1/1       Running     0          3m
hawkular-metrics-w5fz3       1/1       Running     0          3m
heapster-39tqg               1/1       Running     0          3m
metrics-deployer-pod-iu724   0/1       Completed   0          4m

The metrics images can now be pulled and the pods deploy successfully.
However, the metrics route URL cannot be reached with curl; a separate bug needs to be reported for that.