Bug 1543916 - Pipeline builds do not get pruned correctly
Status: VERIFIED
Product: OpenShift Container Platform
Classification: Red Hat
Component: RFE
Version: 3.9.0
Hardware: x86_64 Linux
Priority: medium Severity: medium
Target Milestone: ---
Target Release: 3.11.0
Assigned To: Ben Parees
QA Contact: Wenjing Zheng
Depends On:
Blocks:
Reported: 2018-02-09 09:57 EST by Corey Daley
Modified: 2018-09-09 21:04 EDT (History)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug




External Trackers:
Tracker: Red Hat Knowledge Base (Solution)
ID: 3607641
Priority: None
Status: None
Summary: None
Last Updated: 2018-09-09 20:15 EDT

Description Corey Daley 2018-02-09 09:57:42 EST
Description of problem:
Pipeline builds do not get pruned according to the successfulBuildsHistoryLimit/failedBuildsHistoryLimit settings in the BuildConfig

Version-Release number of selected component (if applicable):
oc v3.9.0-alpha.4+65697ed-228
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://192.168.50.14:8443
openshift v3.9.0-alpha.4+65697ed-228
kubernetes v1.9.1+a0ce1bc657

How reproducible:
Every time

Steps to Reproduce:
1. Create a Jenkins pipeline in OpenShift
2. Set the successfulBuildsHistoryLimit/failedBuildsHistoryLimit in the BuildConfig
3. Run enough builds that the oldest ones exceed the configured limits and should get pruned

Actual results:
Errors appear in the OpenShift logs stating that the system:serviceaccount:openshift-infra:build-config-change-controller service account is forbidden from deleting the old builds.

Expected results:
The old builds should be pruned according to the settings in the BuildConfig

Additional info:

I0209 07:34:29.438274 31819 util.go:82] Pruning old build: cdaley/nodejs-sample-pipeline-1
I0209 07:34:29.438737 31819 rbac.go:116] RBAC DENY: user "system:serviceaccount:openshift-infra:build-config-change-controller" groups ["system:serviceaccounts" "system:serviceaccounts:openshift-infra" "system:authenticated"] cannot "delete" resource "builds.build.openshift.io" named "nodejs-sample-pipeline-1" in namespace "cdaley"
I0209 07:34:29.438848 31819 authorization.go:59] Forbidden: "/apis/build.openshift.io/v1/namespaces/cdaley/builds/nodejs-sample-pipeline-1", Reason: "User \"system:serviceaccount:openshift-infra:build-config-change-controller\" cannot delete builds.build.openshift.io in project \"cdaley\""
Comment 1 Corey Daley 2018-02-09 19:37:29 EST
It looks like the builds never get pruned due to the following line(s):

https://github.com/openshift/origin/blob/master/pkg/build/controller/build/build_controller.go#L329

https://github.com/openshift/origin/blob/master/pkg/build/controller/build/build_controller.go#L378-L381

Basically if the build strategy is JenkinsPipelineStrategy we are relying on Jenkins to do all creating/updating/deletion of the job.

I believe the fix here is to update the openshift-client plugin to at least 3.0.0, and let Jenkins handle deleting the old builds based on the successfulBuildsHistoryLimit and failedBuildsHistoryLimit options in the BuildConfig, just to keep things consistent.
Comment 2 Ben Parees 2018-02-12 10:05:32 EST
I don't think the "shouldIgnore" logic should apply to the build pruning logic. We should prune pipeline builds. The "shouldIgnore" check was really intended to say "ignore this build in that we aren't going to create a build pod for it and monitor the pod state."
Comment 3 Ben Parees 2018-04-11 12:50:00 EDT
Notes to self: as I recall, the hard problem here is that if the controller on the OpenShift side deletes the build, then the Jenkins sync plugin may say "hey, I have this Jenkins job run and there's no corresponding build object, let me create one".

The solutions to that are either:

1) As Corey suggested, make the sync plugin responsible for pruning pipeline builds.
2) Make the sync plugin smart enough not to create a build object for job runs that are already completed (it may already be that smart, but that may also not be enough to close all the potential timing windows for getting this right).

Having only a single entity responsible for creating/deleting pipeline builds is definitely "safer", though possibly harder to implement, and it results in us having pruning code in two places.

Corey, if you have any other recollections around this, please add them.
Comment 4 Corey Daley 2018-04-11 13:41:35 EDT
Ben, 
Since we have the sync plugin deleting job runs if the associated openshift build is deleted, I don't think that we have the issue with jenkins recreating the builds, so it seems like it would be safe to have OpenShift prune pipeline builds and then the Sync plugin would clean up the Jenkins jobs.  Of course some tests should/would be created around this scenario.
Comment 5 Ben Parees 2018-04-11 14:14:11 EDT
> Since we have the sync plugin deleting job runs if the associated openshift build is deleted, I don't think that we have the issue with jenkins recreating the builds

My fear is the timing between the sync plugin seeing the delete event and the sync plugin seeing that the build is missing. I can (because I'm a pessimist) envision a case where a build is pruned, then a relist happens, the build is not in the list, the sync plugin starts a build for it, and then we see the delete event for the build and delete the job run.
Comment 6 Corey Daley 2018-04-11 18:34:17 EDT
We also need to update to openshift-client 3.x, which is being held up by https://github.com/fabric8io/kubernetes-client/issues/1046
Comment 7 Corey Daley 2018-08-07 09:35:34 EDT
The OpenShift client has now been updated to openshift-client 3.x
Comment 8 Ben Parees 2018-08-07 10:01:06 EDT
This is actually implemented, right, Corey? Delivered in 3.11?
Comment 9 Corey Daley 2018-08-07 10:04:40 EDT
Ben, 
Yes, I was just tracking down the commit for it to post here.

This bug is fixed by https://github.com/openshift/origin/commit/37de5d244bf82e61e9d4d10bc913dbe44794a855
Comment 10 Ben Parees 2018-08-07 10:07:48 EDT
I'm going to throw this straight into ON_QA since I'm confident it's in a build at this point.
Comment 11 Wenjing Zheng 2018-08-14 02:26:00 EDT
Yes, pipeline builds can be pruned now.
Verified with the version below:
registry.dev.redhat.io/openshift3/jenkins-2-rhel7@sha256:8b9cc096eaa54eafe905c79ad0b7b43a31137c73a51e91e3ee8e10e6f22734a1
