1543446 – Scheduled images from external registry randomly polled and triggered deployment by overriding default minimum interval setting of 15 minutes.

Bug 1543446 - Scheduled images from external registry randomly polled and triggered deployment by overriding default minimum interval setting of 15 minutes.

Summary: Scheduled images from external registry randomly polled and triggered deploym...

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	ImageStreams
Sub Component:
Version:	3.5.0
Hardware:	Unspecified
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	3.9.0
Assignee:	Jim Minter
QA Contact:	Dongbo Yan
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2018-02-08 13:26 UTC by giriraj rajawat
Modified:	2021-06-10 14:32 UTC (History)
CC List:	4 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Due to a fault in the scheduler implementation, the ScheduledImageImportMinimumIntervalSeconds setting was not correctly observed, causing OpenShift to attempt to import scheduled images at the wrong intervals. This has been resolved.
Clone Of:
Environment:
Last Closed:	2018-03-28 14:27:28 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2018:0489	0	None	None	None	2018-03-28 14:27:56 UTC

Internal Links: 1546097

Description giriraj rajawat 2018-02-08 13:26:15 UTC

Description of problem:



Scheduled images from external registry randomly polled and triggered deployment by overriding default minimum interval of 15 minutes.

Document states by default image periodically check for updates in every 15 minutes. If cluster wide settings are not set. 

https://docs.openshift.com/container-platform/3.5/architecture/core_concepts/builds_and_image_streams.html

"ScheduledImageImportMinimumIntervalSeconds:
The minimum number of seconds that can elapse between when image streams scheduled for background import are checked against the upstream repository. The default value is 15 minutes"


Version-Release number of selected component (if applicable):

Tested on 3.5 & 3.7 


How reproducible:


Steps to Reproduce:

1. oc new-project importscheduled
2. oc new-app --docker-image=docker.io/giriraj789/webapp:v1 --allow-missing-images=true -n importscheduled
3. oc tag docker.io/giriraj789/webapp:v1 webapp:v1 --scheduled
4. Change source code of image and push changes to external docker registry 


Actual results:

Image polling happening randomly, Hence causing deployment trigger before period.

Expected results:
Deployment should be triggered post crossing 15 minutes of minimum interval period.


Additional info:


Image stream description:


oc describe is/webapp
Name:			webapp
Namespace:		importscheduled
Created:		19 minutes ago
Labels:			app=webapp
Annotations:		openshift.io/generated-by=OpenShiftNewApp
			openshift.io/image.dockerRepositoryCheck=2018-02-08T12:57:00Z
Docker Pull Spec:	docker-registry.default.svc:5000/importscheduled/webapp
Image Lookup:		local=false
Unique Images:		3
Tags:			1

v1
  updates automatically from registry docker.io/giriraj789/webapp:v1

  * docker.io/giriraj789/webapp@sha256:3428c7dd63325449b789cbc30459a946e7129f62c1d349bb5f167699b14e316f
      5 seconds ago
    docker.io/giriraj789/webapp@sha256:2791a384e11ce113800912c5510fa1e2e50b1e0cdacc793a8178eb3906898819
      7 minutes ago
    docker.io/giriraj789/webapp@sha256:828e10f55cc3d2c52ccb240e3797665deb4679c2726045e11f3544bd1cc5df58
      19 minutes ago





JouralLOGS 


$ journalctl -u atomic-openshift-master-controllers.service -f | grep -i import
Feb 08 07:49:30 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:49:30.610580  108489 event.go:218] Event(v1.ObjectReference{Kind:"Pod", Namespace:"importscheduled", Name:"webapp-2-deploy", UID:"813e6b4a-0cce-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"590374", FieldPath:""}): type: 'Normal' reason: 'Scheduled' Successfully assigned webapp-2-deploy to node1.192.168.124.111.nip.io
Feb 08 07:49:33 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:49:33.607526  108489 replication_controller.go:451] Too few "importscheduled"/"webapp-2" replicas, need 1, creating 1
Feb 08 07:49:33 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:49:33.623583  108489 event.go:218] Event(v1.ObjectReference{Kind:"ReplicationController", Namespace:"importscheduled", Name:"webapp-2", UID:"81391bc8-0cce-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"590391", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: webapp-2-8wmgc
Feb 08 07:49:33 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:49:33.635217  108489 event.go:218] Event(v1.ObjectReference{Kind:"Pod", Namespace:"importscheduled", Name:"webapp-2-8wmgc", UID:"830b6a30-0cce-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"590393", FieldPath:""}): type: 'Normal' reason: 'Scheduled' Successfully assigned webapp-2-8wmgc to node1.192.168.124.111.nip.io
Feb 08 07:49:43 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:49:43.984804  108489 replication_controller.go:517] Too many "importscheduled"/"webapp-1" replicas, need 0, deleting 1
Feb 08 07:49:43 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:49:43.984841  108489 controller_utils.go:616] Controller webapp-1 deleting pod importscheduled/webapp-1-smhbj
Feb 08 07:49:44 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:49:44.005584  108489 event.go:218] Event(v1.ObjectReference{Kind:"ReplicationController", Namespace:"importscheduled", Name:"webapp-1", UID:"d835b451-0ccc-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"590425", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: webapp-1-smhbj
Feb 08 07:57:00 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:57:00.856153  108489 event.go:218] Event(v1.ObjectReference{Kind:"Pod", Namespace:"importscheduled", Name:"webapp-3-deploy", UID:"8d9dbd37-0ccf-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"591146", FieldPath:""}): type: 'Normal' reason: 'Scheduled' Successfully assigned webapp-3-deploy to node1.192.168.124.111.nip.io
Feb 08 07:57:03 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:57:03.926815  108489 replication_controller.go:451] Too few "importscheduled"/"webapp-3" replicas, need 1, creating 1
Feb 08 07:57:03 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:57:03.941851  108489 event.go:218] Event(v1.ObjectReference{Kind:"ReplicationController", Namespace:"importscheduled", Name:"webapp-3", UID:"8d9a0c39-0ccf-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"591167", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: webapp-3-cdf2f
Feb 08 07:57:03 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:57:03.948172  108489 event.go:218] Event(v1.ObjectReference{Kind:"Pod", Namespace:"importscheduled", Name:"webapp-3-cdf2f", UID:"8f747ff6-0ccf-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"591168", FieldPath:""}): type: 'Normal' reason: 'Scheduled' Successfully assigned webapp-3-cdf2f to node1.192.168.124.111.nip.io
Feb 08 07:57:16 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:57:16.251545  108489 replication_controller.go:517] Too many "importscheduled"/"webapp-2" replicas, need 0, deleting 1
Feb 08 07:57:16 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:57:16.251610  108489 controller_utils.go:616] Controller webapp-2 deleting pod importscheduled/webapp-2-8wmgc
Feb 08 07:57:16 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[108489]: I0208 07:57:16.294596  108489 event.go:218] Event(v1.ObjectReference{Kind:"ReplicationController", Namespace:"importscheduled", Name:"webapp-2", UID:"81391bc8-0cce-11e8-bf6d-52540011b288", APIVersion:"v1", ResourceVersion:"591204", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: webapp-2-8wmgc


Master config snippet:

imageConfig:
  format: openshift3/ose-${component}:${version}
  latest: false
#imagePolicyConfig:
#MaxScheduledImageImportsPerMinute: 2
#ScheduledImageImportMinimumIntervalSeconds: 900
kind: MasterConfig
kubeletClientInfo:

Comment 1 Jim Minter 2018-02-08 23:05:14 UTC

I haven't yet been able to reproduce an issue that involves irregular or unexpected high frequency schedule timings.

Giriraj, I don't understand what you mean by the commented out items in your master config snippet.  Do you have a test environment I can look at?

It would also be useful if you could run the OpenShift master with --loglevel=3 and show me the timestamps of log lines containing "Scheduled import of stream".

Comment 2 giriraj rajawat 2018-02-09 07:07:01 UTC

Jim, 

Prior this testing , I have tested with cluster wide parameters "ScheduledImageImportMinimumIntervalSeconds: 900" however it didn't impact the randomness of polling. 

Present environment doesn't have any configuration and ideally as per document the minimum interval for poll expected after 15 minutes /900 seconds.

The result shows the polling executed before the default time i.e. 15 minutes.

Logs with level 3. (in short poll intervals 1:21, 1:32, 1:43)



journalctl -u atomic-openshift-master-controllers.service -f | grep -i import
Feb 09 01:21:13 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:21:13.905311    1809 scheduled_image_controller.go:179] Scheduled import of stream blue/webapp...
Feb 09 01:21:13 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:21:13.905367    1809 imagestream_controller.go:223] Importing stream blue/webapp partial=true...
Feb 09 01:21:19 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:21:19.641599    1809 imagestream_controller.go:152] Queued import of stream blue/webapp...
Feb 09 01:32:28 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:32:28.905210    1809 scheduled_image_controller.go:179] Scheduled import of stream blue/webapp...
Feb 09 01:32:28 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:32:28.905241    1809 imagestream_controller.go:223] Importing stream blue/webapp partial=true...
Feb 09 01:32:34 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:32:34.412533    1809 imagestream_controller.go:152] Queued import of stream blue/webapp...
Feb 09 01:43:43 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:43:43.905205    1809 scheduled_image_controller.go:179] Scheduled import of stream blue/webapp...
Feb 09 01:43:43 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:43:43.905235    1809 imagestream_controller.go:223] Importing stream blue/webapp partial=true...
Feb 09 01:43:49 master1.192.168.124.101.nip.io atomic-openshift-master-controllers[1809]: I0209 01:43:49.658860    1809 imagestream_controller.go:152] Queued import of stream blue/webapp...



Followed below steps for reproducing :


1. Created a project 
2. Spun a pod(webapp) from image on external docker hub public registry using new-app --docker-image=<public-image-name>
3. updated tag with --scheduled , for auto update (updates automatically from registry docker.io/giriraj789/webapp:v1)
4. Made changes in source of image and pushed to public docker registry from jumpbox an addition system wherein image was baked.

Do let me know if the whole process of testing make sense or we need to make some changes in any of the logic.


Regards,
Giriraj

Comment 3 Jim Minter 2018-02-13 23:12:27 UTC

https://github.com/openshift/origin/pull/18604 should help with this, but more may be needed.

Comment 5 Dongbo Yan 2018-02-22 08:32:09 UTC

Verified
openshift v3.9.0-0.47.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8
Deployment is triggered post crossing 15 minutes of minimum interval period.

Comment 8 errata-xmlrpc 2018-03-28 14:27:28 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489

Note You need to log in before you can comment on or make changes to this bug.