Bug 1652797 - OCP installation fails at deploy_gluster; heketi pod creation fails
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.11.z
Assignee: Jose A. Rivera
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-11-23 04:12 UTC by krishnaram Karthick
Modified: 2019-01-23 07:24 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-23 07:24:15 UTC
Target Upstream Version:


Attachments:
ansible_log (2.98 MB, text/plain), 2018-11-23 04:16 UTC, krishnaram Karthick
inventory_file (1.97 KB, text/plain), 2018-11-23 04:16 UTC, krishnaram Karthick

Description krishnaram Karthick 2018-11-23 04:12:22 UTC
Description of problem:

While trying to configure OCP with GlusterFS by running the deploy_cluster.yml playbook, the deployment failed at the task 'openshift_storage_glusterfs : Wait for heketi pod'.

<< Snippet of ansible logs >>

FAILED - RETRYING: Wait for heketi pod (6 retries left).
FAILED - RETRYING: Wait for heketi pod (5 retries left).
FAILED - RETRYING: Wait for heketi pod (4 retries left).
FAILED - RETRYING: Wait for heketi pod (3 retries left).
FAILED - RETRYING: Wait for heketi pod (2 retries left).
FAILED - RETRYING: Wait for heketi pod (1 retries left).
fatal: [dhcp46-77.lab.eng.blr.redhat.com]: FAILED! => {"attempts": 30, "changed": false, "results": {"cmd": "/usr/bin/oc get pod --selector=glusterfs=heketi-storage-pod -o json -n glusterfs", "results": [{"apiVersion": "v1", "items": [{"apiVersion": "v1", "kind": "Pod", "metadata": {"annotations": {"openshift.io/deployment-config.latest-version": "1", "openshift.io/deployment-config.name": "heketi-storage", "openshift.io/deployment.name": "heketi-storage-1", "openshift.io/scc": "privileged"}, "creationTimestamp": "2018-11-23T08:06:53Z", "generateName": "heketi-storage-1-", "labels": {"deployment": "heketi-storage-1", "deploymentconfig": "heketi-storage", "glusterfs": "heketi-storage-pod", "heketi": "storage-pod"}, "name": "heketi-storage-1-gfrx7", "namespace": "glusterfs", "ownerReferences": [{"apiVersion": "v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicationController", "name": "heketi-storage-1", "uid": "a066e28b-eef6-11e8-bea6-005056a5c898"}], "resourceVersion": "8808", "selfLink": "/api/v1/namespaces/glusterfs/pods/heketi-storage-1-gfrx7", "uid": "bcc20f58-eef6-11e8-bea6-005056a5c898"}, "spec": {"containers": [{"env": [{"name": "HEKETI_USER_KEY", "value": "LVjDhBCkCaMu5SKjU4khRRUIbiY+VZFlptSbh76HofM="}, {"name": "HEKETI_ADMIN_KEY", "value": "8WTHLIOxdStgkyT/iF44btmgSEm0BR3z0K6W54XtmX8="}, {"name": "HEKETI_EXECUTOR", "value": "kubernetes"}, {"name": "HEKETI_FSTAB", "value": "/var/lib/heketi/fstab"}, {"name": "HEKETI_SNAPSHOT_LIMIT", "value": "14"}, {"name": "HEKETI_KUBE_GLUSTER_DAEMONSET", "value": "1"}, {"name": "HEKETI_IGNORE_STALE_OPERATIONS", "value": "true"}], "image": "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-volmanager-rhel7:3.11.1-1", "imagePullPolicy": "IfNotPresent", "livenessProbe": {"failureThreshold": 3, "httpGet": {"path": "/hello", "port": 8080, "scheme": "HTTP"}, "initialDelaySeconds": 30, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 3}, "name": "heketi", "ports": [{"containerPort": 
8080, "protocol": "TCP"}], "readinessProbe": {"failureThreshold": 3, "httpGet": {"path": "/hello", "port": 8080, "scheme": "HTTP"}, "initialDelaySeconds": 3, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 3}, "resources": {}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/lib/heketi", "name": "db"}, {"mountPath": "/etc/heketi", "name": "config"}, {"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "heketi-storage-service-account-token-l5gp6", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "imagePullSecrets": [{"name": "heketi-storage-service-account-dockercfg-s6d95"}], "nodeName": "dhcp46-123.lab.eng.blr.redhat.com", "priority": 0, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "heketi-storage-service-account", "serviceAccountName": "heketi-storage-service-account", "terminationGracePeriodSeconds": 30, "volumes": [{"glusterfs": {"endpoints": "heketi-db-storage-endpoints", "path": "heketidbstorage"}, "name": "db"}, {"name": "config", "secret": {"defaultMode": 420, "secretName": "heketi-storage-config-secret"}}, {"name": "heketi-storage-service-account-token-l5gp6", "secret": {"defaultMode": 420, "secretName": "heketi-storage-service-account-token-l5gp6"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2018-11-23T08:06:55Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2018-11-23T08:06:55Z", "message": "containers with unready status: [heketi]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": null, "message": "containers with unready status: [heketi]", "reason": "ContainersNotReady", "status": "False", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime": "2018-11-23T08:06:53Z", "status": "True", "type": "PodScheduled"}], 
"containerStatuses": [{"containerID": "docker://d2c2e9ff341d9d0d4401b6f7f933b2045761113edce5c32a6c98e407741cbf39", "image": "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-volmanager-rhel7:3.11.1-1", "imageID": "docker-pullable://brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/ocs/rhgs-volmanager-rhel7@sha256:66331811d464b3de64b87eb7b32aca0bb0d1a0f3c1e90c3e03c3bab00a93e2e5", "lastState": {"terminated": {"containerID": "docker://5779cdc062dbfc227d0f597d4def593c534e817fac17d344de187b8f8fffae58", "exitCode": 137, "finishedAt": "2018-11-23T08:10:02Z", "reason": "Error", "startedAt": "2018-11-23T08:08:37Z"}}, "name": "heketi", "ready": false, "restartCount": 2, "state": {"running": {"startedAt": "2018-11-23T08:10:09Z"}}}], "hostIP": "10.70.46.123", "phase": "Running", "podIP": "10.130.0.2", "qosClass": "BestEffort", "startTime": "2018-11-23T08:06:55Z"}}], "kind": "List", "metadata": {"resourceVersion": "", "selfLink": ""}}], "returncode": 0}, "state": "list"}

The complete Ansible log and the inventory file used will be attached to this bug shortly.
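The pod status buried in the failure output above is easier to read once the relevant fields are pulled out. The following is a minimal sketch, not part of openshift-ansible: the function name summarize_pods and the trimmed SAMPLE document are illustrative assumptions, and only the field names and values are taken from the `oc get pod ... -o json` output in the log. It reports the pod-level Ready condition, the restart count, and the last exit code, and applies the usual convention that an exit code above 128 means the process was killed by signal (code - 128).

```python
import json

SIGNAL_BASE = 128  # exit codes above this conventionally mean "killed by a signal"


def summarize_pods(pod_list):
    """Return one summary dict per container in an `oc get pod -o json` list."""
    summaries = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        # The pod counts as ready only if its "Ready" condition has status "True".
        pod_ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        for cs in status.get("containerStatuses", []):
            terminated = cs.get("lastState", {}).get("terminated") or {}
            exit_code = terminated.get("exitCode")
            signal = None
            if exit_code is not None and exit_code > SIGNAL_BASE:
                signal = exit_code - SIGNAL_BASE
            summaries.append({
                "pod": pod["metadata"]["name"],
                "container": cs["name"],
                "pod_ready": pod_ready,
                "restarts": cs.get("restartCount", 0),
                "last_exit_code": exit_code,
                "killed_by_signal": signal,
            })
    return summaries


# Trimmed, hand-copied subset of the fields in the failure output (assumption:
# everything not needed for the summary has been dropped).
SAMPLE = """
{
  "kind": "List",
  "items": [{
    "metadata": {"name": "heketi-storage-1-gfrx7"},
    "status": {
      "conditions": [
        {"type": "Initialized", "status": "True"},
        {"type": "Ready", "status": "False", "reason": "ContainersNotReady"}
      ],
      "containerStatuses": [{
        "name": "heketi",
        "ready": false,
        "restartCount": 2,
        "lastState": {"terminated": {"exitCode": 137, "reason": "Error"}}
      }]
    }
  }]
}
"""

for row in summarize_pods(json.loads(SAMPLE)):
    print(row)
```

For this pod the summary shows Ready as false, two restarts, and a last exit code of 137, which under the convention above corresponds to signal 9 (SIGKILL). On a live cluster the same function could be fed the output of the command shown in the log, `/usr/bin/oc get pod --selector=glusterfs=heketi-storage-pod -o json -n glusterfs`.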

Version-Release number of selected component (if applicable):
rpm -qa | grep 'ansible'
openshift-ansible-playbooks-3.11.43-1.git.0.fa69a02.el7.noarch
ansible-2.6.7-1.el7ae.noarch
openshift-ansible-roles-3.11.43-1.git.0.fa69a02.el7.noarch
openshift-ansible-docs-3.11.43-1.git.0.fa69a02.el7.noarch
openshift-ansible-3.11.43-1.git.0.fa69a02.el7.noarch


How reproducible:
1/1

Steps to Reproduce:
1. Try to configure OCP by running '/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml'

Actual results:
Installation fails at the 'Wait for heketi pod' task.

Expected results:
Installation should succeed.

Additional info:
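One way to read the restart recorded in the failure output is to compare how long the previous heketi container ran with the liveness probe settings in the pod spec. The sketch below is only a rough check under a simplified model, which is an assumption and not a description of the kubelet's exact behaviour: the first probe runs at initialDelaySeconds, each later probe follows after periodSeconds, the container is sent SIGTERM after failureThreshold consecutive failures, and it is killed once terminationGracePeriodSeconds has elapsed. The helper names are hypothetical; the timestamps and probe values are copied from the log.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def container_lifetime_seconds(started_at, finished_at):
    """Seconds between the startedAt and finishedAt timestamps of a container."""
    start = datetime.strptime(started_at, TIME_FORMAT)
    end = datetime.strptime(finished_at, TIME_FORMAT)
    return (end - start).total_seconds()


def estimated_liveness_kill_seconds(probe, grace_period_seconds):
    """Rough time from container start to a forced kill if every probe fails.

    Simplified model (assumption): ignores probe timeouts and scheduling jitter.
    """
    seconds_to_last_failure = (
        probe["initialDelaySeconds"]
        + (probe["failureThreshold"] - 1) * probe["periodSeconds"]
    )
    return seconds_to_last_failure + grace_period_seconds


# Values copied from the pod spec and lastState.terminated in the failure output.
LIVENESS_PROBE = {"initialDelaySeconds": 30, "periodSeconds": 10, "failureThreshold": 3}
TERMINATION_GRACE_PERIOD_SECONDS = 30

observed = container_lifetime_seconds("2018-11-23T08:08:37Z", "2018-11-23T08:10:02Z")
estimated = estimated_liveness_kill_seconds(LIVENESS_PROBE, TERMINATION_GRACE_PERIOD_SECONDS)
print(f"observed lifetime: {observed:.0f}s, estimated liveness kill: {estimated}s")
```

The previous container ran for 85 seconds, and the model gives roughly 80 seconds. That is consistent with, but does not prove, a container that never answered /hello on port 8080 and was killed after the liveness probe failed. The heketi container log from the attached ansible_log or from `oc logs` would be needed to confirm the cause.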

Comment 1 krishnaram Karthick 2018-11-23 04:16:16 UTC
Created attachment 1508179 [details]
ansible_log

Comment 2 krishnaram Karthick 2018-11-23 04:16:39 UTC
Created attachment 1508180 [details]
inventory_file

Comment 3 Jose A. Rivera 2018-11-27 16:57:19 UTC
Is the problem still reproducible? Has any other testing been able to proceed, or is this a full blocker?

Comment 4 krishnaram Karthick 2019-01-23 07:24:15 UTC
This issue is no longer seen. Closing the bug.

