Bug 1876500 - [OCP 3.11] lastTriggeredImageID not honored when applying manifest
Summary: [OCP 3.11] lastTriggeredImageID not honored when applying manifest
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
high
medium
Target Milestone: ---
: 4.7.0
Assignee: Adam Kaplan
QA Contact: wewang
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-09-07 10:33 UTC by Mario Abajo
Modified: 2021-04-08 20:38 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-22 18:11:58 UTC
Target Upstream Version:


Attachments (Terms of Use)

Description Mario Abajo 2020-09-07 10:33:37 UTC
Description of problem:
In gitops scenarios in witch you can constantly reapply some manifest of a project, when reapplying a buildconfig, even with the "lastTriggeredImageID" field set, a new and sometimes unnecessary build will be triggered. Indeed, the "lastTriggeredImageID" field is cleared after reapplying the manifests, triggering the rebuild always.

There is an upstream issue that discuss that but was abandoned:
https://github.com/openshift/origin/issues/23696

Version-Release number of selected component (if applicable):
Openshift 3.11.272

How reproducible:
Always

Steps to Reproduce:
$ oc new-project test
$ cat << EOF > ruby-sample-build
kind: "BuildConfig"
apiVersion: "v1"
metadata:
  name: "ruby-sample-build" 
spec:
  runPolicy: "Serial" 
  triggers: 
    - imageChange:
      lastTriggeredImageID: docker-registry.default.svc:5000/openshift/ruby@sha256:946e8625e40e8b894a11b780b1921c0809cec61b5bd04e00c74f33246de33809
      type: "ImageChange"
  source: 
    git:
      uri: "https://github.com/openshift/ruby-hello-world"
  strategy: 
    sourceStrategy:
      from:
        kind: "ImageStreamTag"
        name: "ruby:latest"
        namespace: openshift
  output: 
    to:
      kind: "ImageStreamTag"
      name: "origin-ruby-sample:latest"
  postCommit: 
      script: "bundle exec rake test"
EOF

$ oc create imagestream origin-ruby-sample
$ oc apply -f ruby-sample-build
### Wait until build finishes, you will see the lastTriggeredImageID when it creates the image
$ oc get bc/ruby-sample-build -o yaml | grep lastTriggeredImageID
### Modify, if necessary, the "lastTriggeredImageID" in the file and reapply it

$ oc apply -f ruby-sample-build

$ oc get buildconfig/ruby-sample-build -o yaml | grep lastTriggeredImageID
      {"apiVersion":"build.openshift.io/v1","kind":"BuildConfig","metadata":{"annotations":{},"name":"ruby-sample-build","namespace":"test"},"spec":{"output":{"to":{"kind":"ImageStreamTag","name":"origin-ruby-sample:latest"}},"postCommit":{"script":"bundle exec rake test"},"runPolicy":"Serial","source":{"git":{"uri":"https://github.com/openshift/ruby-hello-world"}},"strategy":{"sourceStrategy":{"from":{"kind":"ImageStreamTag","name":"ruby:latest","namespace":"openshift"}}},"triggers":[{"imageChange":null,"lastTriggeredImageID":"docker-registry.default.svc:5000/openshift/ruby@sha256:946e8625e40e8b894a11b780b1921c0809cec61b5bd04e00c74f33246de33809","type":"ImageChange"}]}}

### ^^ No "lastTriggeredImageID" field is displayed and a new build is triggered.

$ oc get pods
NAME                            READY     STATUS      RESTARTS   AGE
pod/ruby-sample-build-1-build   0/1       Completed   0          3m
pod/ruby-sample-build-2-build   1/1       Running     0          33s



Actual results:
lastTriggeredImageID field is cleared when loading the manifest

Expected results:
lastTriggeredImageID taken into account and not cleared so build are not always triggered but only when necessary.

Additional info:

Comment 1 Adam Kaplan 2020-09-08 12:37:44 UTC
This bug is related to the upstream GitHub issue. Fixing it requires an API change that we probably won't be able to backport (we need to move the `lastTriggeredImageID` to a status field).

Comment 8 Adam Kaplan 2020-09-29 18:06:13 UTC
Naively running `oc apply -f` against a BuildConfig YAML that has image change triggers does not work because the image change trigger adds `lastImageChangeTriggeredID: <sha>` to the BuildConfig spec. Because of the way the image change triggers API is implemented, the `lastImageChangeTriggeredID` is removed if a subsequent call to `oc apply -f` is made and the BuildConfig YAML does not contain the `lastImageChangeTriggeredID`.

In my investigation, the best options to work around this issue are as follows:

1. Write a "pre-process" script which updates the BuildConfig triggers YAML in git with the `lastImageChangeTriggeredID` in from the cluster, and incorporate this into the GitOps workflow.
2. Disable the ImageChange trigger and use a combination of scheduled imagestream imports [1] and a CronJob to run builds on a regular interval with the latest imported images. The CronJob could be added to the cluster-wide project template to ensure all namespaces run builds at regular intervals [2].
3. Use `oc patch` commands to reconcile the state of the BuildConfig. The following repo utilizes a script and proposed workflow to accomplish this task [3].

[1] https://docs.openshift.com/container-platform/4.5/openshift_images/image-streams-manage.html#images-imagestreams-import_image-streams-managing
[2] https://docs.openshift.com/container-platform/4.5/applications/projects/configuring-project-creation.html
[3] https://github.com/adambkaplan/openshift-gitops-imagechange-triggers

Comment 10 Adam Kaplan 2020-10-22 18:11:58 UTC
After much discussion, the engineering team has determined that we do not have a feasible means of fixing this issue.

The triggers API field is problematic for two reasons:

1. `triggers:` is a set-like array that does not define a patchMergeKey [1].
   Updating this array via `oc apply -f` therefore fully replaces the contents of the array, not individual elements.
2. `lastTriggeredImageID` is a data element that belongs in the BuildConfig `status` subresource.
   `spec` is not an appropriate place to store information that represents cluster state.

Fixing 1 requires us to introduce a field that acts as the merge key.
To maintain backwards compatibility, this field would need to be optional.
Technically this is allowed, but a) could lead to buggy behavior, and b) would not work with server-side apply, which is currently in beta [2].
Therefore, engineering is not in favor of introducing this API element.
 
Fixing 2 likewise introduces behavior that could break backwards compatibility.
In this scenario we would deprecate the `lastTriggeredImageID` field in `spec`, record this information in both `spec` and `status`, and then moving forward use the data in `status` to trigger builds.
Unfortunately, the `lastTriggeredImageID` field has a peculiar feature where if it is removed or set to the empty string, a new build is created (basically, a forced image change trigger).
We lose this capability if we only use data in `status` to drive image change triggers in builds.

Therefore, we have decided to close this issue as WONTFIX.
The best option to work around this issue for this particular GitOps scenario is to add jobs/scripts that update the BuildConfig manifests in git with the lastTriggeredImageIDs found on the cluster.

[1] https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#merge-individual-elements-of-a-list-of-complex-elements
[2] https://kubernetes.io/docs/reference/using-api/api-concepts/#server-side-apply

Comment 13 Adam Kaplan 2020-11-16 22:37:13 UTC
There has been further discussion among the engineering team, and we have reversed course on our decision.

Our review revealed that the behavior identified in comment 10 is not officially documented [1]. The API godoc in fact notes that this field is meant for internal purposes only [2].
We are therefore going to deprecate the `lastTriggeredImageId` field in `spec` and move it to a new `status` field. This will be done in two phases:

1. Phase 1 - officially deprecate the field, and begin recording the trigger IDs in status [3].
2. Phase 2 - drive ImageChange triggers from status only.

Phase 1 will include the official deprecation notice, as well as an alert if deprecated behavior is detected on the cluster.

This work will be tracked on issues.redhat.com, under the issues referenced below. This bug will remain closed, but with the DEFERRED status.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1876500#c10
[2] https://github.com/openshift/api/blob/master/build/v1/types.go#L993-L994
[3] https://issues.redhat.com/browse/BUILD-189
[4] https://issues.redhat.com/browse/BUILD-190

Comment 14 Rolfe Dlugy-Hegwer 2021-04-08 20:38:32 UTC
Provided a release note in https://issues.redhat.com/browse/BUILD-186 "builds: Record Last Triggered Image ID in status"


Note You need to log in before you can comment on or make changes to this bug.