Created attachment 1195798
logs

deployer status error when DYNAMICALLY_PROVISION_STORAGE is 'true'
OCP 3.3.0 / metrics

Description of problem:
Deploy the metrics stack with the parameters 'USE_PERSISTENT_STORAGE=true' and 'DYNAMICALLY_PROVISION_STORAGE=true' in a cloud-provider-enabled environment, and wait until the deployment is finished. The heapster, cassandra and metrics pods work well, and metrics can be retrieved from the web UI, but the deployer pod's status is Error.

Version-Release number of selected component (if applicable):
3.3.0

How reproducible:
always

Steps to Reproduce:
1.
    oc project ${PROJECT}
    oc create serviceaccount metrics-deployer
    oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:openshift-infra:heapster
    oc policy add-role-to-user edit system:serviceaccount:openshift-infra:metrics-deployer
    oc secrets new metrics-deployer nothing=/dev/null
    oc new-app metrics-deployer-template -p \
    IMAGE_PREFIX=registry.ops.openshift.com/openshift3/,\
    IMAGE_VERSION=3.3.0,\
    MASTER_URL=${MASTER_URL},\
    HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.${SUBDOMAIN},\
    MODE=deploy,\
    USE_PERSISTENT_STORAGE=true,\
    DYNAMICALLY_PROVISION_STORAGE=true,\
    CASSANDRA_NODES=1,\
    CASSANDRA_PV_SIZE=10,\
    USER_WRITE_ACCESS=false

2.
    [penli@dhcp-137-185 33]$ oc get po
    NAME                         READY     STATUS    RESTARTS   AGE
    hawkular-cassandra-1-bi7nk   1/1       Running   0          1m
    hawkular-metrics-t5bxo       1/1       Running   0          1m
    heapster-qz34g               1/1       Running   0          1m
    metrics-deployer-4wheq       0/1       Error     0          1m

3.
    [penli@dhcp-137-185 33]$ oc logs metrics-deployer-4wheq
    (...)
    ======== ERROR =========
    validate_deployment_artifacts:
    scripts/validate.sh: line 277: line[3]: unbound variable
    ========================
    --- validate_deployed_project ---

    VALIDATION FAILED
    (...)

Actual results:
The deployer pod's status is Error.

Expected results:
The deployer pod's status is finished.

Additional info:
Full log is attached.
Issue reproduced when USE_PERSISTENT_STORAGE=true and DYNAMICALLY_PROVISION_STORAGE=false.
I have run this multiple times on AWS without encountering this problem. Can you please describe your setup a bit? Is this reproducible?
Hi Matt, we ran the test on Google Cloud with Persistent Disk and the cloud provider enabled. Since Sep 5th is a holiday in the US, I can set up a Google Cloud environment for you on Sep 6th if you need one.
Marking this as low, as it doesn't actually prevent the pods from being deployed; it only affects our scripts, which try to validate that the install happened correctly.
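For context on the error text: bash prints `line[3]: unbound variable` when a script running under `set -u` reads an array element that was never set. The sketch below is not the code from scripts/validate.sh (line 277 is not shown in this report). It assumes a row of whitespace-separated output is split into an array named `line`, and uses a hypothetical helper, `fourth_field`, to show how a default expansion avoids the abort when a row has fewer than four fields.

```shell
#!/usr/bin/env bash
set -u  # treat expansion of unset variables (and unset array elements) as an error

# Hypothetical helper, not taken from validate.sh:
# print the 4th whitespace-separated field of a row, or an empty
# string when the row has fewer than four fields.
fourth_field() {
  local -a line
  read -r -a line <<< "$1"
  # A plain "${line[3]}" would fail here under `set -u` for short rows
  # with "line[3]: unbound variable". The ":-" default makes the
  # expansion fall back to an empty string instead.
  printf '%s\n' "${line[3]:-}"
}

fourth_field "name-1 Bound volume-1 10Gi RWO"   # row with five fields
fourth_field "name-1 Pending"                   # short row, no 4th field
```

Whether an empty fallback is the right behaviour for the validator depends on what it does with the field afterwards; the point is only that a row shorter than expected is enough to trigger the message seen in the deployer log.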
There is an issue when using a dynamic PVC on GCE: the node crashes. This should be an OCP issue, not a Metrics issue. I'll verify it once it's ready.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0066
*** Bug 1425932 has been marked as a duplicate of this bug. ***