Bug 1572008 - After 3.7 upgrade, deployments fail with 'Error: image <image> not found'
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Master
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 3.7.z
Assignee: Tomáš Nožička
QA Contact: Wang Haoran
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-26 00:20 UTC by emahoney
Modified: 2018-08-22 05:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-22 05:23:13 UTC
Target Upstream Version:
Embargoed:



Description emahoney 2018-04-26 00:20:21 UTC
Description of problem:
After upgrading from 3.6 to 3.7, a deployment from a Jenkins pipeline job fails when the DC has a trigger of "type: ImageChange" with 'automatic: false'. We see the errors below on the node.

~~~
Apr 23 11:08:11 host.example.com atomic-openshift-node[1319]: E0423 11:08:11.741279    1319 kuberuntime_manager.go:709] container start failed: ErrImagePull: rpc error: code = 2 desc = Error: image library/my-sample-web:latest not found
~~~

After changing the imageChangeParams line to "automatic: true", the ErrImagePull error is resolved.


Version-Release number of selected component (if applicable):
OCP 3.7.42-1

How reproducible:
N/a


Steps to Reproduce:
1.
2.
3.

Actual results: Deployments fail with an image pull error (ErrImagePull).


Expected results: Pods deployed successfully. 


Additional info:



Comment 2 emahoney 2018-04-26 00:21:43 UTC
Created attachment 1426928 [details]
working dc

Comment 5 Ben Parees 2018-04-26 13:04:12 UTC
They may be using Jenkins to create the DC, but it's the DC that isn't behaving properly. Moving to master team.

Comment 6 Michal Fojtik 2018-04-30 09:34:56 UTC
~~~
Apr 23 11:08:11 host.example.com atomic-openshift-node[1319]: E0423 11:08:11.741279    1319 kuberuntime_manager.go:709] container start failed: ErrImagePull: rpc error: code = 2 desc = Error: image library/my-sample-web:latest not found
~~~

This looks like the image stream name was not resolved properly; it seems like an initial deployment.

Why would you not want 'automatic: true' by default in the template? Were the images updated by the Jenkins pipeline or by some external process?

Comment 17 Michal Fojtik 2018-05-07 09:11:25 UTC
OK, a couple of things:

I think I remember there were some discussions about the behavior of `automatic: true` vs. `false` back in 3.7. If I remember properly (Tomas please correct me), you need to have `automatic: true` if you want to have images resolved.

For the Jenkins pipeline, I don't think the ImageChangeTrigger is a good fit, for a couple of reasons:

1) You are not guaranteed that the image which will be deployed is the version (tag) you expect it to be. If something pushes to the ImageStreamTag, you end up with a different image...

2) If you control the pipeline and the image build process and you know what tag (version) is the output of your process, then I would strongly suggest getting rid of the ImageChangeTrigger and just using `oc set image --source=imagestreamtag foo=image:1.0.x` in your Groovy scripts. That will guarantee the pipeline is consistent and the job is deploying the right image.

3) You can also add `automatic: true` which will enable the trigger and fix the resolution, but it will also cause the DC to trigger immediately when the image is updated...
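
For reference, a minimal sketch of the trigger stanza being discussed, as it would appear under `spec.triggers` in a DeploymentConfig. The container and image stream names (`my-sample-web`) are assumptions taken from the log line above, not from the reporter's actual DC.

~~~yaml
# Fragment of a DeploymentConfig spec (names are hypothetical).
triggers:
- type: ImageChange
  imageChangeParams:
    # true: the trigger is enabled and a new deployment starts whenever
    # the ImageStreamTag below is updated (point 3 above).
    # false: the trigger is defined but disabled.
    automatic: true
    containerNames:
    - my-sample-web            # container whose image field gets filled in
    from:
      kind: ImageStreamTag
      name: my-sample-web:latest
~~~

With `automatic: true`, anything that pushes to `my-sample-web:latest` rolls out a new deployment, which is the trade-off described in points 1 and 3.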

Comment 18 Tomáš Nožička 2018-05-07 11:50:02 UTC
 - Why are you setting an image when you are expecting it to be filled in from an image stream? That image is invalid, causing the errors. Put " " there.

 - Why do you want to have automatic:false and expect the image to be re-resolved automatically?


> I think I remember there were some discussions about the behavior of `automatic: true` vs. `false` back in 3.7. If I remember properly (Tomas please correct me), you need to have `automatic: true` if you want to have images resolved.

I'd need to check when I am back from PTO. But I think that it should work when you stop specifying the image which is conflicting with IS and there is no reason to do so.

Comment 22 Tomáš Nožička 2018-06-20 09:51:15 UTC
I've checked 3.7 and the automatic setting seems to work fine there. Any chance this "jenkins pipeline" is doing an oc apply, which as a result also rewrites the image (the same way the image trigger controller would have done)?

Specifying *both* ICT and an image is undefined behaviour! 

But since you're doing that, might I ask why you don't stop? I never got my questions answered.

 - Why are you setting an image when you are expecting it to be filled in from an image stream? That image is invalid, causing the errors. Put " " there.

As I suspect this is the case of oc apply, using *either* ICT, or an image should fix this for you. Well, you should use *either* ICT, or an image anyways.

If that fixes the issue, please feel free to close it.
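
To illustrate the "ICT only" variant, here is a minimal sketch of a DC that leaves the container image as the " " placeholder and lets the image change trigger fill it in. All names (`my-sample-web`, the `app` label) are assumptions for illustration; the reporter's real DC is only in the attachment.

~~~yaml
# Hypothetical DeploymentConfig: image comes from the image stream only.
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: my-sample-web
spec:
  replicas: 1
  selector:
    app: my-sample-web
  template:
    metadata:
      labels:
        app: my-sample-web
    spec:
      containers:
      - name: my-sample-web
        image: " "             # placeholder; resolved by the trigger below
  triggers:
  - type: ImageChange
    imageChangeParams:
      automatic: true
      containerNames:
      - my-sample-web          # must match the container name above
      from:
        kind: ImageStreamTag
        name: my-sample-web:latest
~~~

The other variant is the reverse: drop the ImageChange trigger and set `image:` explicitly to the reference you want deployed. Either way, an `oc apply` of this file no longer fights with the trigger controller over the image field.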

Comment 23 Tomáš Nožička 2018-08-15 07:55:30 UTC
Did the suggestion made in the previous comment fix the issue for you? There has been no response in 2 months, so I am inclined to close it as NOTABUG.

Comment 24 emahoney 2018-08-21 14:25:13 UTC
We can close this BZ. After removing the ICT entirely and using only the IS to resolve images (with "automatic: true"), the image pulls were successful.

Thanks for your help.
-mahoney

