Description of problem:
oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.4.0.40-1 --metrics=true does not bring up metrics.

Version-Release number of selected component (if applicable):
version=v3.4.0.40-1

How reproducible:
Easy to reproduce if you have docker on your box and run oc cluster up.

Steps to Reproduce:
1. Run oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.4.0.40-1 --metrics=true
2. The cluster comes up. However, when you click on the metrics URL, it shows that metrics is not available.
3. After oc login -u system:admin and oc project openshift-infra, oc get events shows:

   FailedSync {kubelet 192.168.64.3} Error syncing pod, skipping: failed to "StartContainer" for "deployer" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/openshift3/ose-metrics-deployer:v3.4.0.40-1\""

   This is because there is no image with name ose-metrics-deployer. The image name is metrics-deployer.

Actual results:
Metrics doesn't work.

Expected results:
Metrics is supposed to get deployed.

Additional info:
The easiest way is to upload the image as ose-metrics-deployer into the registry. If not, the code should be changed.
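The image reference in the FailedSync event can be reproduced by plain string concatenation. The sketch below is only an assumption about how the name is derived (the --image value, a "-metrics-deployer" suffix, then the --version tag), inferred from the event message alone; metrics_deployer_ref is a hypothetical helper and not part of oc.

```shell
#!/bin/bash
# Hypothetical helper: build the deployer image reference from the
# --image and --version values, as inferred from the FailedSync event.
# This is an assumption about the naming scheme, not oc's actual code.
metrics_deployer_ref() {
    local image="$1" version="$2"
    printf '%s-metrics-deployer:%s\n' "$image" "$version"
}

# Prints the reference that the kubelet could not pull.
metrics_deployer_ref registry.access.redhat.com/openshift3/ose v3.4.0.40-1
```

With the values from this report, the result contains openshift3/ose-metrics-deployer, which is the name that does not exist in the registry.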
Moving this to the installer component and not metrics.
This seems like an image naming issue: there is no image named ose-metrics-deployer. Currently the image name is metrics-deployer.
Yes. For the time being `oc cluster up` is being changed to accommodate this naming error. I believe (but do not hold me to this) that we intend to re-publish the images as openshift3/ose-metrics-deployer, etc in the OCP 3.5 timeframe.
Can't we have the image re-tagged in RH registry as openshift3/ose-metrics-deployer:v3.4 so that "oc cluster" works for now till it gets fixed properly in OCP 3.5? That would allow this to work: oc cluster up --version=v3.4 --metrics
We are looking into publishing the image under both names to allow for this. There were more comments and concerns about the image re-naming and/or re-publishing steps than I had anticipated. The changes to `cluster up` in the linked PR may be premature. I'll update here when we have a more clear approach. Sorry for the noise.
Moving to ASSIGNED. MODIFIED is for when everything is ready and we are waiting for RPMs and/or images to be built.
Hi. A workaround (until this bug is fixed) is to re-tag the images in the docker daemon. A simple bash script like the following can do the trick:
-----
#!/bin/bash
version=v3.4
repo=registry.access.redhat.com/openshift3
for i in metrics-deployer metrics-cassandra metrics-heapster metrics-hawkular-metrics
do
    # Pull the image under its published name
    docker pull "$repo/$i:$version"
    # Add the ose- prefixed name that oc cluster up expects
    docker tag "$repo/$i:$version" "$repo/ose-$i:$version"
    # Drop the original tag; the image itself stays under the new name
    docker rmi "$repo/$i:$version"
done
-----
However, I have also found another bug, related to an extra parameter in the templates that I guess is not included in the metrics bootstrap. There is another bugzilla for it, https://bugzilla.redhat.com/show_bug.cgi?id=141268. Thanks
Sorry, C&P typo. https://bugzilla.redhat.com/show_bug.cgi?id=1415268
The re-tag workaround is not enough. The metrics deployer fails during the hawkular build with the following error log:

Deploying Hawkular Metrics & Cassandra Components
scripts/hawkular.sh: line 200: STARTUP_TIMEOUT: unbound variable
error: no objects passed to create

See https://github.com/openshift/origin/issues/11965
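For context, "unbound variable" is the message bash prints when a script running under set -u expands a variable that was never set. The sketch below only illustrates that behavior and the usual ${VAR:-default} guard; it is not the content of scripts/hawkular.sh, the function names are hypothetical, and the value 500 is an arbitrary placeholder rather than the deployer's real default.

```shell
#!/bin/bash
# Illustration only: this is not the content of scripts/hawkular.sh.

# Under `set -u`, expanding an unset variable aborts the (sub)shell
# with a non-zero status and an "unbound variable" style message.
strict_lookup() {
    ( set -u; unset STARTUP_TIMEOUT; echo "timeout=${STARTUP_TIMEOUT}" ) 2>&1
}

# ${VAR:-default} falls back to a default instead of aborting.
# The value 500 is an arbitrary placeholder, not the deployer's default.
defaulted_lookup() {
    ( set -u; unset STARTUP_TIMEOUT; echo "timeout=${STARTUP_TIMEOUT:-500}" )
}

strict_lookup || echo "strict lookup failed, as expected"
defaulted_lookup
```

Passing the parameter explicitly when processing the template, or giving it a default inside the script, would both avoid the abort.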
That is exactly what the previous bugzilla (1415268) is about. It seems the template has changed and the deployer does not use a default value. As an additional step, run the following with admin privileges (MODE=redeploy, because the previous deploy failed at some step after some others had already completed):

oc process -f https://raw.githubusercontent.com/openshift/openshift-ansible/master/roles/openshift_hosted_templates/files/v1.4/enterprise/metrics-deployer.yaml \
    -p IMAGE_PREFIX=openshift3/ose- \
    -p IMAGE_VERSION=v3.4 \
    -p HAWKULAR_METRICS_HOSTNAME=<PUT_YOUR_METRICS_URL> \
    -p USE_PERSISTENT_STORAGE=false \
    -p MODE=redeploy \
    -p METRIC_RESOLUTION=10s | oc create -n openshift-infra -f -
Troy, does this need to be moved to a different status, as we're waiting on the ticket now?
With the latest errata, this should be fixed now for 3.2, 3.3 and 3.4. The images that get pushed to registry.access.redhat.com are now both openshift3/{metrics,logging}-<name> and openshift3/ose-{metrics,logging}-<name>

Example: If we curl registry.access.redhat.com for both openshift3/metrics-deployer and openshift3/ose-metrics-deployer we see that they both have the same docker sum.

openshift3/metrics-deployer:v3.4.1.7-4
"3.4.1": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"latest": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4.1.7": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4.1.7-4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708"

openshift3/ose-metrics-deployer:v3.4.1.7-4
"3.4.1": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"latest": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4.1.7": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4.1.7-4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708"
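A comparison like the one in this comment can also be done mechanically. The sketch below assumes the two tag listings have already been saved to files in the "tag": "sum" form shown here; it does not query the registry, and same_tags and the file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if two saved tag listings map every
# tag to the same sum. Line order, spaces and trailing commas are ignored.
same_tags() {
    local a b
    a=$(tr -d ' ,' < "$1" | sort)
    b=$(tr -d ' ,' < "$2" | sort)
    [ "$a" = "$b" ]
}

# Sample data copied from this comment (two tags shown for brevity).
workdir=$(mktemp -d)
cat > "$workdir/metrics-deployer.txt" <<'EOF'
"v3.4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4.1.7-4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708"
EOF
cat > "$workdir/ose-metrics-deployer.txt" <<'EOF'
"v3.4.1.7-4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708",
"v3.4": "35ada240b57ec38deb15d9e29b8f6bdac4588f5af8b71de65fa056b7dda12708"
EOF

if same_tags "$workdir/metrics-deployer.txt" "$workdir/ose-metrics-deployer.txt"; then
    echo "tag listings match"
else
    echo "tag listings differ"
fi
```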
Tried testing 3.4.1.7 on a Mac. Here is the issue now.

$ oc cluster up --host-data-dir=/Users/veer/occlusterdata/hostdata --host-config-dir=/Users/veer/occlusterdata/hostconfig --image=registry.access.redhat.com/openshift3/ose --version=v3.4.1.7 --metrics=true --routing-suffix apps.10.61.135.205.xip.io

$ oc logs -f heapster-mmv9u -n openshift-infra
Endpoint Check in effect. Checking https://hawkular-metrics:443/hawkular/metrics/status
Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. Curl exit code: 28. Status Code 000
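For anyone reading the heapster log: curl exit code 28 means the operation timed out, and a status code of 000 means no HTTP response was received at all. The helper below is a small sketch that translates a few well-known curl exit codes; explain_curl_exit is hypothetical and is not part of the heapster endpoint check.

```shell
#!/bin/bash
# Hypothetical helper: translate a few documented curl exit codes.
explain_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "could not resolve host" ;;
        7)  echo "failed to connect to host" ;;
        28) echo "operation timed out" ;;
        35) echo "SSL/TLS handshake failed" ;;
        60) echo "server certificate could not be verified" ;;
        *)  echo "other curl error ($1)" ;;
    esac
}

# The code reported in the heapster log above.
explain_curl_exit 28

# To repeat the check by hand from somewhere that can resolve the service
# name (an assumption about your environment), something like:
#   curl -k -s -o /dev/null -w '%{http_code}\n' --max-time 10 \
#       https://hawkular-metrics:443/hawkular/metrics/status
#   explain_curl_exit $?
```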
Veer, That seems like an unrelated problem to this BZ -- we have chosen the correct images, at least.
Verified.

oc v3.5.0.33
kubernetes v1.5.2+43a9be4

$ sudo oc cluster up --image=registry.access.redhat.com/openshift3/ose --version=v3.4 --metrics=true
$ sudo oc get pod -n openshift-infra
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-ymn2g   1/1       Running     0          3m
hawkular-metrics-w5fz3       1/1       Running     0          3m
heapster-39tqg               1/1       Running     0          3m
metrics-deployer-pod-iu724   0/1       Completed   0          4m

The metrics images can now be pulled and the pods deploy successfully. However, I cannot curl the metrics route URL; another bug needs to be reported for that.
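The same check can be scripted. The sketch below reads an oc get pod listing on stdin and prints any pod whose STATUS is neither Running nor Completed; unhealthy_pods is a hypothetical helper, and it assumes the default column layout with STATUS in the third column.

```shell
#!/bin/bash
# Hypothetical helper: read `oc get pod` output on stdin and print the
# names of pods whose STATUS (third column) is neither Running nor
# Completed. The header line is skipped.
unhealthy_pods() {
    awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Sample input copied from the verification output in this comment.
unhealthy_pods <<'EOF'
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-ymn2g   1/1       Running     0          3m
hawkular-metrics-w5fz3       1/1       Running     0          3m
heapster-39tqg               1/1       Running     0          3m
metrics-deployer-pod-iu724   0/1       Completed   0          4m
EOF
```

Against a live cluster this would be used as: oc get pod -n openshift-infra | unhealthy_pods. A pod stuck in ImagePullBackOff, as in the original report, would be listed.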