Bug 1537317
| Summary: | Jenkins build takes a long time to start when freshly deployed | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Øystein Bedin <obedin> |
| Component: | Build | Assignee: | Gabe Montero <gmontero> |
| Status: | CLOSED WORKSFORME | QA Contact: | Wenjing Zheng <wzheng> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.7.1 | CC: | aos-bugs, bparees, obedin |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-01-23 19:23:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Øystein Bedin
2018-01-22 22:52:20 UTC
I think some fixes were done in the sync plugin for this problem, but I would have expected them to be in the 3.7 image already.

One of our main systems with this issue is running OpenShift v3.7.14 (with Jenkins 2.46.3 and OpenShift Sync Plugin 0.1.24). Please let us know what version(s) we potentially need to upgrade to, or if there's anything else we can do to check if we have those fixes.

Andy Block found the following (hardcoded) 5 min "poll". This seems very much aligned with our experience, as most builds start around 5-7 min in. Can we make this a configurable parameter, if nothing else?
https://github.com/openshift/jenkins-sync-plugin/blob/master/src/main/java/io/fabric8/jenkins/openshiftsync/BaseWatcher.java#L52-L57

We do a watch on the relevant resources; that interval is not a poll, it's the period at which we do a full resync. Again, I think you're hitting a bug where we did not process the original watch event properly, so you'd have to wait for a resync interval. When things are working properly (as they are with the bug fix), that is not the case.

Yeah, there was a similar problem reported in October that was fixed with aef662b9b7869cc8f82c737b0bda31fa567abbbf in https://github.com/openshift/jenkins-sync-plugin

Typically it stems from the BC events for the watch arriving before the build events. The above commit initiates an immediate fetch of the builds when the build config watch event arrives, to expedite / bypass the default relist interval.

But yes, that should have gotten into v3.7 ... the corresponding sync plugin version was v0.1.32.

Øystein - could you at least confirm you are at that version of the sync plugin with your 3.7 image? I just want to make sure there are not some earlier 3.7 images out there.

Next, as this is a timing bug, I am not able to reproduce it with my cluster. The timing of the watch events and when the build is getting scheduled is simply different for me.
So could you also reproduce again and provide:
1) the Jenkins master logs
2) the output of `oc get events` from the namespace in question after this occurs
3) the output from `oc get <build name in question> -o yaml` and `oc get <build config name in question> -o yaml`

At the moment, that fetch after the BC event is a one-time fetch ... I'm wondering if it needs to have some retry to it if no builds are initially found.

Thanks for the detailed info, Gabe. As mentioned above, the sync plugin is on version 0.1.24, so it sounds like it is too old. Our v3.7 cluster was installed a couple of weeks ago (7th or 8th of January), but maybe this was made available after that? I'll investigate and see what we can do to get it updated, and report back if this is still a problem. Thanks for your support.

I checked another cluster that was installed a week or so later, and it has version 0.1.32. We'll get the first cluster updated and report back.

Ah - sorry - I missed the version being noted in https://bugzilla.redhat.com/show_bug.cgi?id=1537317#c2

Yeah, let's see what happens when you bump the version with the cluster in question. If there is still an issue, let's then get the three items I noted. I'll hold on the bugzilla for now, and give you all some time to give it a go and report back.

@Gabe - we have validated that the later version of the plugin works much better - thank you!! BTW: The v3.7 install done 16 days ago had the 'latest' tag in the ImageStream, so we had to update it to 'v3.7' and re-import the images to make it work. This issue can be closed. Thanks for your support and work on this!

Thanks for the quick update, Øystein. I'll go ahead and close this out as already fixed.

On your image stream tag note ...
it may be a question of versions of installers and existing image stream definitions, but moving forward, the image stream latest tag will have a qualified tag to the docker image. For example, for our upcoming 3.9 release, we've got:
https://github.com/openshift/origin/blob/master/examples/image-streams/image-streams-centos7.json#L1083-L1123
and
https://github.com/openshift/origin/blob/master/examples/image-streams/image-streams-rhel7.json#L983-L1025
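
For readers following the thread, the difference between waiting for the full resync and fetching builds as soon as the build config watch event arrives can be illustrated with a toy model. This is only a sketch under stated assumptions, not the sync plugin's code: the class `ResyncSketch`, its methods, and the in-memory "server" list are all hypothetical names invented for illustration. The only value taken from the thread is the 5 minute resync period. The model assumes the build's own watch event was not processed, so the build can only be picked up by some later list of builds.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; none of these names come from the sync plugin. */
class ResyncSketch {
    /** Period of the full resync (relist); 5 minutes, as mentioned in the thread. */
    static final long RESYNC_INTERVAL_MS = 5 * 60 * 1000L;

    /** Hypothetical stand-in for the builds the API server knows about. */
    private final List<String> serverBuilds = new ArrayList<>();
    /** Builds this model has picked up and "started". */
    private final Set<String> started = new LinkedHashSet<>();
    /** Whether a build config event triggers an immediate list of builds. */
    private final boolean fetchOnBuildConfigEvent;

    ResyncSketch(boolean fetchOnBuildConfigEvent) {
        this.fetchOnBuildConfigEvent = fetchOnBuildConfigEvent;
    }

    void serverCreatesBuild(String name) {
        serverBuilds.add(name);
    }

    /** Lists every build on the "server" and starts any not yet started. */
    void listAndStartBuilds() {
        started.addAll(serverBuilds);
    }

    /** Build config watch event: with the flag on, list builds right away. */
    void onBuildConfigEvent() {
        if (fetchOnBuildConfigEvent) {
            listAndStartBuilds();
        }
    }

    /** Periodic full resync: always lists builds. */
    void onResyncTick() {
        listAndStartBuilds();
    }

    /**
     * Worst-case simulated delay before a build starts when its own watch
     * event is assumed to have been missed. Returns -1 if it never starts.
     */
    static long worstCaseDelayMs(boolean fetchOnBuildConfigEvent) {
        ResyncSketch sync = new ResyncSketch(fetchOnBuildConfigEvent);
        sync.serverCreatesBuild("build-1");
        // Assumption: the build watch event is lost, so only the
        // build config event arrives at simulated time 0.
        sync.onBuildConfigEvent();
        if (sync.started.contains("build-1")) {
            return 0L;
        }
        // Otherwise the build waits for the next resync, which can be
        // up to one full interval away.
        sync.onResyncTick();
        return sync.started.contains("build-1") ? RESYNC_INTERVAL_MS : -1L;
    }

    public static void main(String[] args) {
        System.out.println("without immediate fetch: " + worstCaseDelayMs(false) + " ms");
        System.out.println("with immediate fetch: " + worstCaseDelayMs(true) + " ms");
    }
}
```

In this model the first case waits a full resync interval (300000 ms) and the second starts the build with no added delay, which matches the roughly 5 minute start-up lag reported with the older plugin version and its disappearance after the upgrade.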