Description of problem:
heketi-storage-copy-job fails during cns-deploy because it tries to pull the wrong image, "heketi/heketi:dev".

##########
heketi topology loaded.
Saving heketi-storage.json
secret "heketi-storage-secret" created
endpoints "heketi-storage-endpoints" created
service "heketi-storage-endpoints" created
job "heketi-storage-copy-job" created
Checking status of pods matching 'job-name=heketi-storage-copy-job':
heketi-storage-copy-job-10twl   0/1   ImagePullBackOff   0   5m
Timed out waiting for pods matching 'job-name=heketi-storage-copy-job'.
Error waiting for job 'heketi-storage-copy-job' to complete.
##########

#########
Events:
  FirstSeen  LastSeen  Count  From                                      SubObjectPath            Type     Reason      Message
  ---------  --------  -----  ----                                      -------------            ----     ------      -------
  26s        26s       1      default-scheduler                                                  Normal   Scheduled   Successfully assigned heketi-storage-copy-job-10twl to dhcp46-9.lab.eng.blr.redhat.com
  20s        20s       1      kubelet, dhcp46-9.lab.eng.blr.redhat.com  spec.containers{heketi}  Normal   BackOff     Back-off pulling image "heketi/heketi:dev"
  20s        20s       1      kubelet, dhcp46-9.lab.eng.blr.redhat.com                           Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "heketi" with ImagePullBackOff: "Back-off pulling image \"heketi/heketi:dev\""
  23s        8s        2      kubelet, dhcp46-9.lab.eng.blr.redhat.com  spec.containers{heketi}  Normal   Pulling     pulling image "heketi/heketi:dev"
  21s        5s        2      kubelet, dhcp46-9.lab.eng.blr.redhat.com  spec.containers{heketi}  Warning  Failed      Failed to pull image "heketi/heketi:dev": rpc error: code = 2 desc = unknown: Not Found
  21s        5s        2      kubelet, dhcp46-9.lab.eng.blr.redhat.com                           Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "heketi" with ErrImagePull: "rpc error: code = 2 desc = unknown: Not Found"
#########

Version-Release number of selected component (if applicable):
heketi-client-5.0.0-1.el7rhgs.x86_64
cns-deploy-5.0.0-1.el7rhgs.x86_64

How reproducible:
2/2

Steps to Reproduce:
1. Execute # cns-deploy /opt/topology.json --deploy-gluster

Actual results:
cns-deploy fails to complete: heketi-storage-copy-job fails because of an incorrect image reference.

Expected results:
cns-deploy should complete successfully.

Additional info:
--------------------
[root@dhcp47-18 ~]# oc get jobs
NAME                      DESIRED   SUCCESSFUL   AGE
heketi-storage-copy-job   1         0            3s

[root@dhcp47-18 ~]# oc describe job heketi-storage-copy-job
Name:           heketi-storage-copy-job
Namespace:      storage-project
Selector:       controller-uid=23a29c8e-4c29-11e7-9471-005056b3ded1
Labels:         deploy-heketi=support
Annotations:    <none>
Parallelism:    1
Completions:    1
Start Time:     Thu, 08 Jun 2017 14:32:03 +0530
Pods Statuses:  1 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:       controller-uid=23a29c8e-4c29-11e7-9471-005056b3ded1
                job-name=heketi-storage-copy-job
  Containers:
   heketi:
    Image:      heketi/heketi:dev
    Port:
    Command:
      cp
      /db/heketi.db
      /heketi
    Environment:        <none>
    Mounts:
      /db from heketi-storage-secret (rw)
      /heketi from heketi-storage (rw)
  Volumes:
   heketi-storage:
    Type:               Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:      heketi-storage-endpoints
    Path:               heketidbstorage
    ReadOnly:           false
   heketi-storage-secret:
    Type:       Secret (a volume populated by a Secret)
    SecretName: heketi-storage-secret
    Optional:   false
Events:
  FirstSeen  LastSeen  Count  From            SubObjectPath  Type    Reason            Message
  ---------  --------  -----  ----            -------------  ----    ------            -------
  13s        13s       1      job-controller                 Normal  SuccessfulCreate  Created pod: heketi-storage-copy-job-10twl

[root@dhcp47-18 ~]# oc get pods
NAME                             READY     STATUS             RESTARTS   AGE
deploy-heketi-1-f5kvs            1/1       Running            0          1m
glusterfs-rprj8                  1/1       Running            0          2m
glusterfs-sss5j                  1/1       Running            0          2m
glusterfs-vc452                  1/1       Running            0          2m
heketi-storage-copy-job-10twl    0/1       ImagePullBackOff   0          19s
storage-project-router-3-59wqf   1/1       Running            1          2h

[root@dhcp47-18 ~]# oc describe pod heketi-storage-copy-job-10twl
Name:                   heketi-storage-copy-job-10twl
Namespace:              storage-project
Security Policy:        privileged
Node:                   dhcp46-9.lab.eng.blr.redhat.com/10.70.46.9
Start Time:             Thu, 08 Jun 2017 14:32:03 +0530
Labels:                 controller-uid=23a29c8e-4c29-11e7-9471-005056b3ded1
                        job-name=heketi-storage-copy-job
Annotations:            kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"Job","namespace":"storage-project","name":"heketi-storage-copy-job","uid":"23a29c8e-4c29-11e7-9471-005056b...
                        openshift.io/scc=privileged
Status:                 Pending
IP:                     10.129.0.8
Controllers:            Job/heketi-storage-copy-job
Containers:
  heketi:
    Container ID:
    Image:              heketi/heketi:dev
    Image ID:
    Port:
    Command:
      cp
      /db/heketi.db
      /heketi
    State:              Waiting
      Reason:           ImagePullBackOff
    Ready:              False
    Restart Count:      0
    Environment:        <none>
    Mounts:
      /db from heketi-storage-secret (rw)
      /heketi from heketi-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zh2vk (ro)
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  heketi-storage:
    Type:               Glusterfs (a Glusterfs mount on the host that shares a pod's lifetime)
    EndpointsName:      heketi-storage-endpoints
    Path:               heketidbstorage
    ReadOnly:           false
  heketi-storage-secret:
    Type:       Secret (a volume populated by a Secret)
    SecretName: heketi-storage-secret
    Optional:   false
  default-token-zh2vk:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-zh2vk
    Optional:   false
QoS Class:      BestEffort
Node-Selectors: <none>
Tolerations:    <none>
Events:
  FirstSeen  LastSeen  Count  From                                      SubObjectPath            Type     Reason      Message
  ---------  --------  -----  ----                                      -------------            ----     ------      -------
  26s        26s       1      default-scheduler                                                  Normal   Scheduled   Successfully assigned heketi-storage-copy-job-10twl to dhcp46-9.lab.eng.blr.redhat.com
  20s        20s       1      kubelet, dhcp46-9.lab.eng.blr.redhat.com  spec.containers{heketi}  Normal   BackOff     Back-off pulling image "heketi/heketi:dev"
  20s        20s       1      kubelet, dhcp46-9.lab.eng.blr.redhat.com                           Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "heketi" with ImagePullBackOff: "Back-off pulling image \"heketi/heketi:dev\""
  23s        8s        2      kubelet, dhcp46-9.lab.eng.blr.redhat.com  spec.containers{heketi}  Normal   Pulling     pulling image "heketi/heketi:dev"
  21s        5s        2      kubelet, dhcp46-9.lab.eng.blr.redhat.com  spec.containers{heketi}  Warning  Failed      Failed to pull image "heketi/heketi:dev": rpc error: code = 2 desc = unknown: Not Found
  21s        5s        2      kubelet, dhcp46-9.lab.eng.blr.redhat.com                           Warning  FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "heketi" with ErrImagePull: "rpc error: code = 2 desc = unknown: Not Found"
--------------------
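
For anyone triaging a similar failure, the pod listing can be scanned for image pull errors instead of reading it by eye. The following is only a minimal sketch, not part of cns-deploy: the function name `pods_with_image_pull_failures` is hypothetical, and it assumes the five-column `oc get pods` layout (NAME, READY, STATUS, RESTARTS, AGE) shown in this report.

```python
# Hypothetical triage helper; not part of cns-deploy or heketi.
# Assumes the five-column `oc get pods` layout shown in this report.
IMAGE_PULL_FAILURES = {"ImagePullBackOff", "ErrImagePull"}


def pods_with_image_pull_failures(get_pods_output):
    """Return names of pods whose STATUS column reports an image pull failure."""
    failing = []
    for line in get_pods_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows with fewer than five columns.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        name, status = fields[0], fields[2]
        if status in IMAGE_PULL_FAILURES:
            failing.append(name)
    return failing


# Sample input copied from the `oc get pods` output in this report.
sample = """\
NAME                             READY     STATUS             RESTARTS   AGE
deploy-heketi-1-f5kvs            1/1       Running            0          1m
glusterfs-rprj8                  1/1       Running            0          2m
heketi-storage-copy-job-10twl    0/1       ImagePullBackOff   0          19s
storage-project-router-3-59wqf   1/1       Running            1          2h
"""

print(pods_with_image_pull_failures(sample))
```

Any pod this returns can then be inspected with `oc describe pod <name>` to see which image reference failed, as was done above.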
Verified as fixed in cns-deploy-5.0.0-2

################
cns-deploy.log --verbose
Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    1d
Using namespace "storage-project".
Checking that heketi pod is not running ...
Checking status of pods matching 'glusterfs=heketi-pod':
No resources found.
Timed out waiting for pods matching 'glusterfs=heketi-pod'.
OK
template "deploy-heketi" created
serviceaccount "heketi-service-account" created
template "heketi" created
template "glusterfs" created
role "edit" added: "system:serviceaccount:storage-project:heketi-service-account"
Marking 'dhcp46-122.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp46-122.lab.eng.blr.redhat.com" labeled
Marking 'dhcp46-9.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp46-9.lab.eng.blr.redhat.com" labeled
Marking 'dhcp46-134.lab.eng.blr.redhat.com' as a GlusterFS node.
node "dhcp46-134.lab.eng.blr.redhat.com" labeled
Deploying GlusterFS pods.
daemonset "glusterfs" created
Waiting for GlusterFS pods to start ...
Checking status of pods matching 'glusterfs-node=pod':
glusterfs-3m19h   1/1   Running   0   52s
glusterfs-nl9lf   1/1   Running   0   52s
glusterfs-p749w   1/1   Running   0   52s
OK
service "deploy-heketi" created
route "deploy-heketi" created
deploymentconfig "deploy-heketi" created
Waiting for deploy-heketi pod to start ...
Checking status of pods matching 'glusterfs=heketi-pod':
deploy-heketi-1-fswnt   1/1   Running   0   1m
OK
Determining heketi service URL ... OK
Creating cluster ... ID: bd901d3bbabf347a5718dfd99b467d19
Creating node dhcp46-122.lab.eng.blr.redhat.com ... ID: 15fd7dde406c7d83781c07ce66ebd550
Adding device /dev/sdd ... OK
Adding device /dev/sde ... OK
Adding device /dev/sdf ... OK
Creating node dhcp46-9.lab.eng.blr.redhat.com ... ID: e96157370ed8695d016242d9671bfede
Adding device /dev/sdd ... OK
Adding device /dev/sde ... OK
Adding device /dev/sdf ... OK
Creating node dhcp46-134.lab.eng.blr.redhat.com ... ID: f20a07ceb0223c2a4873d0f036539483
Adding device /dev/sdd ... OK
Adding device /dev/sde ... OK
Adding device /dev/sdf ... OK
heketi topology loaded.
Saving heketi-storage.json
secret "heketi-storage-secret" created
endpoints "heketi-storage-endpoints" created
service "heketi-storage-endpoints" created
job "heketi-storage-copy-job" created
Checking status of pods matching 'job-name=heketi-storage-copy-job':
heketi-storage-copy-job-1g2sc   0/1   Completed   0   11s
deploymentconfig "deploy-heketi" deleted
route "deploy-heketi" deleted
service "deploy-heketi" deleted
job "heketi-storage-copy-job" deleted
pod "deploy-heketi-1-fswnt" deleted
secret "heketi-storage-secret" deleted
service "heketi" created
route "heketi" created
deploymentconfig "heketi" created
Waiting for heketi pod to start ...
Checking status of pods matching 'glusterfs=heketi-pod':
deploy-heketi-1-fswnt   1/1   Terminating   0   1m
OK
Determining heketi service URL ... OK
heketi is now running and accessible via http://heketi-storage-project.cloudapps.mystorage.com/
Ready to create and provide GlusterFS volumes.
################
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:2881
Marking qe-test-coverage as "-", since the preferred mode of deployment is Ansible.