Bug 1495446

Summary: Deploying Prometheus without a PV makes Ansible fail with 'dict object' has no attribute 'nfs'
Product: OpenShift Container Platform
Component: Installer
Version: 3.7.0
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Release: 3.7.0
Hardware: Unspecified
OS: Unspecified
Reporter: Junqi Zhao <juzhao>
Assignee: Zohar Gal-Or <zgalor>
QA Contact: Junqi Zhao <juzhao>
CC: aos-bugs, jcantril, jokerman, mmccomas
Type: Bug
Last Closed: 2017-11-28 22:13:02 UTC
Attachments:
Description              Flags
ansible running log      none
prometheus pod info      none
ansible inventory file   none

Description Junqi Zhao 2017-09-26 05:14:10 UTC
Created attachment 1330847: ansible running log

Description of problem:
Deploying Prometheus without a PV fails: Ansible reports that 'dict object' has no attribute 'nfs'.
See the attached log file for details.

MSG:
AnsibleUndefinedVariable: {{ groups.nfs.0 }}: 'dict object' has no attribute 'nfs'
    to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-prometheus.retry
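
The message indicates that a template asked for the first host of an `nfs` inventory group, while the inventory in this report defines only `masters` and `etcd`. The following is a minimal sketch in plain Python, not Ansible or openshift-ansible code: the `groups` dict, the placeholder hostname, and the helper names `unguarded_first_nfs_host` and `first_nfs_host` are all hypothetical. It contrasts an unguarded lookup with one that tolerates a missing group.

```python
# Plain-Python model of an inventory group map (group name -> host list).
# Hypothetical names and placeholder hosts; this is not Ansible code.
groups = {
    "OSEv3": ["master.example.com"],
    "masters": ["master.example.com"],
    "etcd": ["master.example.com"],
}


def unguarded_first_nfs_host(group_map):
    """Mirror the shape of `groups.nfs.0`: fails when there is no 'nfs' group."""
    return group_map["nfs"][0]


def first_nfs_host(group_map):
    """Return the first host of the 'nfs' group, or None if it is absent or empty."""
    hosts = group_map.get("nfs", [])
    return hosts[0] if hosts else None


if __name__ == "__main__":
    try:
        unguarded_first_nfs_host(groups)
    except KeyError as exc:
        print("unguarded lookup failed, missing group:", exc)
    print("guarded lookup:", first_nfs_host(groups))
```

The guarded form returns `None` instead of raising, which lets the caller decide whether a missing `nfs` group is an error or simply means no NFS-backed storage was requested.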

Version-Release number of the following components:
# rpm -qa | grep openshift-ansible
openshift-ansible-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-roles-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-docs-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-lookup-plugins-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-filter-plugins-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-playbooks-3.7.0-0.126.6.git.0.a60fe67.el7.noarch
openshift-ansible-callback-plugins-3.7.0-0.126.6.git.0.a60fe67.el7.noarch


How reproducible:
Always

Steps to Reproduce:
1. Deploy prometheus without pv via ansible

Actual results:
Ansible fails with: 'dict object' has no attribute 'nfs'

Expected results:
Deployment should be successful

Additional info:
# Inventory file
[OSEv3:children]
masters
etcd

[masters]
${MASTER_URL} openshift_public_hostname=${MASTER_URL}

[etcd]
${MASTER_URL} openshift_public_hostname=${MASTER_URL}

[OSEv3:vars]
ansible_ssh_user=root
ansible_ssh_private_key_file="~/libra.pem"
deployment_type=openshift-enterprise
openshift_docker_additional_registries=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888

# prometheus
openshift_prometheus_state=present
openshift_prometheus_namespace=prometheus

openshift_prometheus_replicas=1
openshift_prometheus_node_selector={'role': 'node'}

openshift_prometheus_image_proxy=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/oauth-proxy:v3.7
openshift_prometheus_image_prometheus=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus:v3.7
openshift_prometheus_image_alertmanager=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus-alertmanager:v3.7
openshift_prometheus_image_alertbuffer=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/prometheus-alert-buffer:v3.7

Comment 1 Zohar Gal-Or 2017-10-02 12:46:26 UTC
Submitted fix.
Still testing:
https://github.com/openshift/openshift-ansible/pull/5459

Comment 3 Junqi Zhao 2017-10-23 02:00:17 UTC
The 'dict object' has no attribute 'nfs' error is no longer shown, but the prometheus pod is in Pending status: the installer still creates PVCs, and there is no PV for them to bind to.

# oc get pvc
NAME                      STATUS    VOLUME    CAPACITY   ACCESSMODES   STORAGECLASS   AGE
prometheus                Pending                                                     25m
prometheus-alertbuffer    Pending                                                     25m
prometheus-alertmanager   Pending                                                     25m

# oc get pv
NAME           CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                 STORAGECLASS   REASON    AGE
regpv-volume   17G        RWX           Retain          Bound     default/regpv-claim                            1h
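
The listings above show three claims with no volume to bind to. A minimal sketch for picking the unbound claims out of `oc get pvc` text output follows. The function name `pending_claims` is hypothetical, and the sketch assumes the column layout shown above (name first, status second), which may differ in other `oc` versions.

```python
def pending_claims(oc_get_pvc_output):
    """Return the names of claims whose STATUS column reads 'Pending'."""
    names = []
    lines = oc_get_pvc_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        # NAME and STATUS are the first two whitespace-separated fields,
        # even when later columns (VOLUME, CAPACITY, ...) are empty.
        if len(fields) >= 2 and fields[1] == "Pending":
            names.append(fields[0])
    return names


SAMPLE = """\
NAME                      STATUS    VOLUME    CAPACITY   ACCESSMODES   STORAGECLASS   AGE
prometheus                Pending                                                     25m
prometheus-alertbuffer    Pending                                                     25m
prometheus-alertmanager   Pending                                                     25m
"""

if __name__ == "__main__":
    print(pending_claims(SAMPLE))
```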

openshift-ansible version:
# rpm -qa | grep openshift-ansible
openshift-ansible-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-roles-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-docs-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-lookup-plugins-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-filter-plugins-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-playbooks-3.7.0-0.174.0.git.0.01932ad.el7.noarch
openshift-ansible-callback-plugins-3.7.0-0.174.0.git.0.01932ad.el7.noarch

Comment 4 Junqi Zhao 2017-10-23 02:00:56 UTC
Created attachment 1341941: prometheus pod info

Comment 5 Junqi Zhao 2017-10-23 02:01:17 UTC
Created attachment 1341942: ansible inventory file

Comment 6 Zohar Gal-Or 2017-10-23 13:26:54 UTC
You can use temporary volumes by overriding the pvc defaults:

openshift_prometheus_storage_type=''
openshift_prometheus_alertmanager_storage_type=''
openshift_prometheus_alertbuffer_storage_type=''

Jeff,
I am not sure it is safe to infer: 'no nfs' -> 'no pvc'.
If NFS is not defined, do we still want to create pvc and allow for manual PV creation?

Comment 7 Jeff Cantrill 2017-10-23 13:30:12 UTC
From a logging perspective, if you do not define storage vars, we assume you are going to use ephemeral storage and thus do not create any PVCs.

Comment 8 Zohar Gal-Or 2017-10-23 13:40:10 UTC
So should I assume: 
`openshift_prometheus_storage_kind=nfs` <=> `openshift_prometheus_storage_type='pvc'` 

And the second question:
Should the default change to `openshift_prometheus_storage_type=''`?

Comment 9 Jeff Cantrill 2017-10-23 13:47:40 UTC
These defaults seem reasonable to me.
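
The rule discussed in Comments 6 to 9 can be sketched as follows. This is an illustrative model in plain Python, not the role's actual implementation: the function `resolve_storage_type` is hypothetical, and only the two inventory variable names are taken from this thread.

```python
def resolve_storage_type(inventory_vars):
    """Model of the proposed rule: 'pvc' only when storage is requested, else ''."""
    # An explicit setting always wins, including the empty string.
    explicit = inventory_vars.get("openshift_prometheus_storage_type")
    if explicit is not None:
        return explicit
    # Proposed equivalence from Comment 8: storage_kind=nfs implies a PVC.
    if inventory_vars.get("openshift_prometheus_storage_kind") == "nfs":
        return "pvc"
    # Proposed new default: no storage vars means ephemeral storage, no PVC.
    return ""


if __name__ == "__main__":
    print(repr(resolve_storage_type({})))
    print(repr(resolve_storage_type({"openshift_prometheus_storage_kind": "nfs"})))
```

Under this model, an inventory with no storage variables, like the one in the original report, would get ephemeral volumes and no pending claims.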

Comment 10 Zohar Gal-Or 2017-10-23 15:16:33 UTC
PR with fix: https://github.com/openshift/openshift-ansible/pull/5848

Comment 11 Junqi Zhao 2017-11-01 06:07:29 UTC
Please change the status to ON_QA; the issue is fixed.

env:
# rpm -qa | grep openshift-ansible
openshift-ansible-callback-plugins-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-filter-plugins-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-playbooks-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-docs-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-lookup-plugins-3.7.0-0.184.0.git.0.d407445.el7.noarch
openshift-ansible-roles-3.7.0-0.184.0.git.0.d407445.el7.noarch


# openshift version
openshift v3.7.0-0.184.0
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8

Images:
oauth-proxy/images/v3.7.0-54
prometheus/images/v3.7.0-54
prometheus-alertmanager/images/v3.7.0-54
prometheus-alert-buffer/images/v3.7.0-51

Comment 12 Junqi Zhao 2017-11-02 00:19:10 UTC
Set it to VERIFIED as per Comment 11

Comment 16 errata-xmlrpc 2017-11-28 22:13:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188