Created attachment 1330847 [details]
ansible running log

Description of problem:
Deployed prometheus without a PV. Ansible failed with the error "'dict object' has no attribute 'nfs'". See the attached log file for details.

MSG:
AnsibleUndefinedVariable: {{ groups.nfs.0 }}: 'dict object' has no attribute 'nfs'
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-prometheus.retry

Version-Release number of the following components:
# rpm -qa | grep openshift-ansible
openshift-ansible-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-roles-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-docs-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-lookup-plugins-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-filter-plugins-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-playbooks-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-callback-plugins-3.7.0-0.126.6.git.0.a60fe67.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy prometheus without pv via ansible
Actual results:
Ansible failed with the error "'dict object' has no attribute 'nfs'".

Expected results:
Deployment should be successful

Additional info:
# Inventory file
[OSEv3:children]
masters
etcd

[masters]
${MASTER_URL} openshift_public_hostname=${MASTER_URL}

[etcd]
${MASTER_URL} openshift_public_hostname=${MASTER_URL}

[OSEv3:vars]
ansible_ssh_user=root
ansible_ssh_private_key_file="~/libra.pem"
deployment_type=openshift-enterprise
openshift_docker_additional_registries=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888

# prometheus
openshift_prometheus_state=present
openshift_prometheus_namespace=prometheus
openshift_prometheus_replicas=1
openshift_prometheus_node_selector={'role': 'node'}
openshift_prometheus_image_proxy=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/oauth-proxy:v3.7
openshift_prometheus_image_prometheus=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus:v3.7
openshift_prometheus_image_alertmanager=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus-alertmanager:v3.7
openshift_prometheus_image_alertbuffer=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus-alert-buffer:v3.7
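For readers unfamiliar with the error: the inventory above defines only the masters and etcd groups, so a lookup of the first host in an nfs group has nothing to resolve. The sketch below models that situation with a plain Python dict. It does not use Ansible or Jinja2, and the helper name first_host and the hostname master.example.com are hypothetical stand-ins, not taken from the playbooks.

```python
from typing import Dict, List, Optional


def first_host(groups: Dict[str, List[str]], group: str) -> Optional[str]:
    """Return the first host of a group, or None if the group is missing or empty."""
    hosts = groups.get(group, [])
    return hosts[0] if hosts else None


# Groups as the reported inventory would define them: there is no [nfs] section.
# "master.example.com" is a placeholder for ${MASTER_URL}.
groups = {"masters": ["master.example.com"], "etcd": ["master.example.com"]}

# Unguarded access, analogous in spirit to {{ groups.nfs.0 }}.
try:
    groups["nfs"][0]
    unguarded_error = ""
except KeyError as exc:
    unguarded_error = f"missing group: {exc}"

print(unguarded_error)
print(first_host(groups, "nfs"))
```

The guarded helper returns None instead of failing, which leaves the caller free to decide whether a missing nfs group is an error or simply means "no NFS-backed storage".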
Submitted fix. Still testing: https://github.com/openshift/openshift-ansible/pull/5459
The "'dict object' has no attribute 'nfs'" error is no longer shown, but the prometheus pod is stuck in Pending status. The PVCs are still created, and there is no PV for them to bind to.

# oc get pvc
NAME                      STATUS    VOLUME    CAPACITY   ACCESSMODES   STORAGECLASS   AGE
prometheus                Pending                                                      25m
prometheus-alertbuffer    Pending                                                      25m
prometheus-alertmanager   Pending                                                      25m

# oc get pv
NAME           CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                 STORAGECLASS   REASON    AGE
regpv-volume   17G        RWX           Retain          Bound     default/regpv-claim                            1h

ansible version
# rpm -qa | grep openshift-ansible
openshift-ansible-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-roles-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-docs-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-lookup-plugins-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-filter-plugins-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-playbooks-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-callback-plugins-3.7.0-0.174.0.git.0.01932ad.el7.noarch
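A check like the one above can be scripted when verifying a fix. The sketch below is a minimal example, assuming the claim list is available as JSON in the shape that `oc get pvc -o json` produces (a list object with `items`, each carrying `metadata.name` and `status.phase`). The sample data is a hand-written, trimmed stand-in built from the claim names in this comment, not captured output, and the function name pending_claims is hypothetical.

```python
import json
from typing import List


def pending_claims(pvc_list_json: str) -> List[str]:
    """Return the names of claims whose status.phase is 'Pending'."""
    items = json.loads(pvc_list_json).get("items", [])
    return [
        item["metadata"]["name"]
        for item in items
        if item.get("status", {}).get("phase") == "Pending"
    ]


# Hand-written, trimmed stand-in for the JSON claim list (assumption, not real output).
sample = json.dumps({
    "items": [
        {"metadata": {"name": "prometheus"}, "status": {"phase": "Pending"}},
        {"metadata": {"name": "prometheus-alertbuffer"}, "status": {"phase": "Pending"}},
        {"metadata": {"name": "prometheus-alertmanager"}, "status": {"phase": "Pending"}},
    ]
})

print(pending_claims(sample))
```

An empty result would indicate that no claim is left waiting for a volume.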
Created attachment 1341941 [details] prometheus pod info
Created attachment 1341942 [details] ansible inventory file
You can use temporary volumes by overriding the pvc defaults:

openshift_prometheus_storage_type=''
openshift_prometheus_alertmanager_storage_type=''
openshift_prometheus_alertbuffer_storage_type=''

Jeff, I am not sure it is safe to infer: 'no nfs' -> 'no pvc'. If NFS is not defined, do we still want to create pvc and allow for manual PV creation?
From a logging perspective, if you do not define storage vars, we assume you are going to use ephemeral storage and thus do not create any PVCs.
So should I assume:
`openshift_prometheus_storage_kind=nfs` <=> `openshift_prometheus_storage_type='pvc'`?

And the second question: should the default change to `openshift_prometheus_storage_type=''`?
These defaults seem reasonable to me.
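One possible reading of the defaults discussed in the comments above, written out as a small Python sketch. This illustrates the discussion only and is not the code from the linked pull requests. The function name resolve_storage_type is hypothetical, and letting an explicitly set storage type take precedence over the storage kind is an assumption.

```python
from typing import Mapping


def resolve_storage_type(inventory_vars: Mapping[str, str]) -> str:
    """Return 'pvc' or '' (ephemeral storage) under the proposed defaults."""
    # Assumption: an explicit storage type in the inventory always wins.
    explicit = inventory_vars.get("openshift_prometheus_storage_type")
    if explicit is not None:
        return explicit
    # Proposed mapping: storage kind 'nfs' implies a PVC.
    if inventory_vars.get("openshift_prometheus_storage_kind") == "nfs":
        return "pvc"
    # Proposed new default: no storage vars means ephemeral storage, so no PVC.
    return ""


print(repr(resolve_storage_type({})))
print(repr(resolve_storage_type({"openshift_prometheus_storage_kind": "nfs"})))
```

With the inventory from the original report, which sets no storage variables, this reading yields '' and therefore no PVC would be created.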
PR with fix: https://github.com/openshift/openshift-ansible/pull/5848
Please change to ON_QA, issue is fixed.

env:
# rpm -qa | grep openshift-ansible
openshift-ansible-callback-plugins-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-filter-plugins-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-playbooks-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-docs-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-lookup-plugins-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-roles-3.7.0-0.184.0.git.0.d407445.el7.noarch

# openshift version
openshift v3.7.0-0.184.0
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8

Images:
oauth-proxy/images/v3.7.0-54
prometheus/images/v3.7.0-54
prometheus-alertmanager/images/v3.7.0-54
prometheus-alert-buffer/images/v3.7.0-51
Set it to VERIFIED as per Comment 11
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188