Bug 1406306 - jenkinsPipelineStrategy BC won't work after master/node services restart and Jenkins pod will log "SEVERE: Failed to update job"
Summary: jenkinsPipelineStrategy BC won't work after master/node services restart and ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.4.z
Assignee: Jimmi Dyson
QA Contact: Wang Haoran
URL:
Whiteboard:
Duplicates: 1411336
Depends On:
Blocks:
Reported: 2016-12-20 08:59 UTC by Xingxing Xia
Modified: 2021-09-09 12:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A race condition when restarting BuildConfig watches after disconnection from the API server.
Consequence: Jenkins job creation was requested multiple times for the same BuildConfig, which produced the error message documented in this bug.
Fix: Creation requests are now properly synchronized to prevent conflicts.
Result: Jenkins jobs are created reliably, regardless of API server disconnections.
Clone Of:
Environment:
Last Closed: 2017-02-16 20:54:26 UTC
Target Upstream Version:
Embargoed:
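
The Doc Text above attributes the failure to job creation being requested more than once for the same BuildConfig. As a minimal sketch of that idea only (this is not the code from the openshift-sync-plugin; JobSyncSketch, JobRegistry and runConcurrentUpserts are hypothetical names, and a plain map stands in for the Jenkins item registry), the following Java program puts the check-then-create step under one lock, so the first request creates the job and later requests update it:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class JobSyncSketch {

    /** Hypothetical stand-in for the Jenkins item registry; not the plugin's real code. */
    static final class JobRegistry {
        private final Map<String, String> jobs = new HashMap<>();
        private final Object lock = new Object();
        int created;
        int updated;

        /** Mimics the failure in the log: creating an existing item is rejected. */
        private void create(String name, String configXml) {
            if (jobs.containsKey(name)) {
                throw new IllegalArgumentException(
                        "Jenkins already contains an item '" + name + "'");
            }
            jobs.put(name, configXml);
        }

        /** Check and create happen under one lock, so only the first caller creates. */
        void upsertJob(String name, String configXml) {
            synchronized (lock) {
                if (jobs.containsKey(name)) {
                    jobs.put(name, configXml);
                    updated++;
                } else {
                    create(name, configXml);
                    created++;
                }
            }
        }
    }

    /** Returns {created, updated, failures} after concurrent upserts of one job. */
    static int[] runConcurrentUpserts(int threads) {
        JobRegistry registry = new JobRegistry();
        AtomicInteger failures = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // release all workers at the same moment
                    registry.upsertJob("xxia-pipeline-sample-pipeline", "<flow-definition/>");
                } catch (IllegalArgumentException | InterruptedException e) {
                    failures.incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (registry.lock) {
            return new int[] {registry.created, registry.updated, failures.get()};
        }
    }

    public static void main(String[] args) {
        int[] r = runConcurrentUpserts(8);
        System.out.println("created=" + r[0] + " updated=" + r[1] + " failures=" + r[2]);
    }
}
```

With eight concurrent requests the sketch reports one create, seven updates and no failures. Without the synchronized block, two workers could both find the job missing and both call create, and the second would hit the same IllegalArgumentException seen in the Jenkins log.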



Description Xingxing Xia 2016-12-20 08:59:28 UTC
Description of problem:
New Jenkins pipeline builds always stay in status New. The Jenkins pod logs:
SEVERE: Failed to update job
java.lang.IllegalArgumentException: Jenkins already contains an item 'xxia-pipeline-sample-pipeline'
......

Version-Release number of selected component (if applicable):
$ openshift version  # on master
openshift v3.4.0.38
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

$ docker version # on master
......
Server:
 Version:         1.12.4
 API version:     1.24
 Package version: docker-common-1.12.4-3.el7.x86_64
 Go version:      go1.7.4
 Git commit:      ea46b4a
 Built:           Tue Dec 13 16:30:11 2016
 OS/Arch:         linux/amd64

$ docker images | grep jenkins # on jenkins pod's node
brew-*.redhat.com:8888/openshift3/jenkins-2-rhel7  <none>  bae576076f66        12 days ago         678.2 MB

How reproducible:
Sometimes

Steps to Reproduce:
1. Create project xxia-pipeline
2. Create jenkins pipeline BC
$ oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/jenkins/pipeline/samplepipeline.yaml
3. Wait for jenkins pod to be running
4. Start jenkins pipeline builds
$ oc start-build sample-pipeline

Actual results:
4. At first, pipeline builds succeed. Later the pipeline stops working, and new pipeline builds always stay in status "New":
[tester@pc_f25 oc]$ oc get build
NAME                       TYPE              FROM          STATUS     STARTED        DURATION
nodejs-mongodb-example-1   Source            Git@343689f   Complete   23 hours ago   2m58s
...
sample-pipeline-1          JenkinsPipeline                 Complete   23 hours ago   5m44s
...
sample-pipeline-4          JenkinsPipeline                 New

Check pod:
[tester@pc_f25 oc]$ oc get pod
NAME                             READY     STATUS      RESTARTS   AGE
jenkins-1-oj83m                  1/1       Running     1          23h
...

[tester@pc_f25 oc]$ oc logs pod/jenkins-1-oj83m
The log repeats the following entry many times:

......

Dec 20, 2016 8:16:08 AM io.fabric8.jenkins.openshiftsync.BuildConfigWatcher onInitialBuildConfigs
SEVERE: Failed to update job
java.lang.IllegalArgumentException: Jenkins already contains an item 'xxia-pipeline-sample-pipeline'
    at hudson.model.ItemGroupMixIn.createProjectFromXML(ItemGroupMixIn.java:264)
    at jenkins.model.Jenkins.createProjectFromXML(Jenkins.java:3689)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$2.call(BuildConfigWatcher.java:214)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$2.call(BuildConfigWatcher.java:159)
    at hudson.security.ACL.impersonate(ACL.java:221)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher.upsertJob(BuildConfigWatcher.java:159)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher.onInitialBuildConfigs(BuildConfigWatcher.java:129)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher.access$200(BuildConfigWatcher.java:65)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$1.doRun(BuildConfigWatcher.java:88)
    at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:50)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Dec 20, 2016 8:16:18 AM io.fabric8.jenkins.openshiftsync.BuildConfigWatcher onInitialBuildConfigs
SEVERE: Failed to update job
java.lang.IllegalArgumentException: Jenkins already contains an item 'xxia-pipeline-sample-pipeline'
    at hudson.model.ItemGroupMixIn.createProjectFromXML(ItemGroupMixIn.java:264)
    at jenkins.model.Jenkins.createProjectFromXML(Jenkins.java:3689)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$2.call(BuildConfigWatcher.java:214)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$2.call(BuildConfigWatcher.java:159)
    at hudson.security.ACL.impersonate(ACL.java:221)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher.upsertJob(BuildConfigWatcher.java:159)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher.onInitialBuildConfigs(BuildConfigWatcher.java:129)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher.access$200(BuildConfigWatcher.java:65)
    at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$1.doRun(BuildConfigWatcher.java:88)
    at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:50)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Expected results:
4. Pipeline build should always work

Additional info:

Comment 1 Jimmi Dyson 2017-01-04 16:04:04 UTC
Was the BuildConfig deleted & recreated by any chance?

Comment 2 Xingxing Xia 2017-01-05 10:01:39 UTC
(In reply to Jimmi Dyson from comment #1)
> Was the BuildConfig deleted & recreated by any chance?

No. Restarting the master/node services on the master and the node is enough to reproduce this bug:

On master:
[root@master]# systemctl restart atomic-openshift-master.service atomic-openshift-node.service

On node:
[root@node ~]# systemctl restart atomic-openshift-node.service

On CLI:
[xxia@localhost ~]$ oc get build
NAME                       TYPE              FROM          STATUS     STARTED          DURATION
nodejs-mongodb-example-1   Source            Git@343689f   Complete   27 minutes ago   55s
sample-pipeline-1          JenkinsPipeline                 Complete   28 minutes ago   2m4s
sample-pipeline-2          JenkinsPipeline                 New

^ New pipeline build is stuck in "New" status

[xxia@localhost ~]$ oc logs pod/<jenkins-pod>
...
Jan 05, 2017 9:51:50 AM io.fabric8.jenkins.openshiftsync.BuildConfigWatcher onInitialBuildConfigs
SEVERE: Failed to update job
java.lang.IllegalArgumentException: Jenkins already contains an item 'xxia-proj-sample-pipeline'
	at hudson.model.ItemGroupMixIn.createProjectFromXML(ItemGroupMixIn.java:264)
	at jenkins.model.Jenkins.createProjectFromXML(Jenkins.java:3689)
	at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$2.call(BuildConfigWatcher.java:214)
	at io.fabric8.jenkins.openshiftsync.BuildConfigWatcher$2.call(BuildConfigWatcher.java:159)
	at hudson.security.ACL.impersonate(ACL.java:221)
...

^ The log still repeats entries like this

Comment 3 Xingxing Xia 2017-01-05 10:15:59 UTC
(In reply to Xingxing Xia from comment #2)

> On CLI:

Sorry, I forgot to paste one step here, which was run after the restart of the master/node services:
[xxia@localhost ~]$ oc start-build sample-pipeline
build "sample-pipeline-2" started

> [xxia@localhost ~]$ oc get build
> NAME                       TYPE              FROM          STATUS     STARTED          DURATION
> nodejs-mongodb-example-1   Source            Git@343689f   Complete   27 minutes ago   55s
> sample-pipeline-1          JenkinsPipeline                 Complete   28 minutes ago   2m4s
> sample-pipeline-2          JenkinsPipeline                 New

Comment 4 Ben Parees 2017-01-10 18:19:28 UTC
*** Bug 1411336 has been marked as a duplicate of this bug. ***

Comment 5 yasun 2017-01-11 02:19:05 UTC
The problem can be reproduced on dev-preview-online too.

[root@ymsun tmp]# oc get build
NAME                        TYPE              FROM          STATUS     STARTED        DURATION
nodejs-mongodb-example-11   Source            Git@343689f   Complete   18 hours ago   33s
sample-pipeline-11          JenkinsPipeline                 Complete   18 hours ago   1m13s
sample-pipeline-12          JenkinsPipeline                 New  

Jenkins log:
Jan 11, 2017 2:18:04 AM io.fabric8.jenkins.openshiftsync.BuildConfigWatcher onInitialBuildConfigs
SEVERE: Failed to update job
java.lang.IllegalArgumentException: Jenkins already contains an item 'wavellite1-1-sample-pipeline'
at hudson.model.ItemGroupMixIn.createProjectFromXML(ItemGroupMixIn.java:264)
....

Version:
openshift v3.4.0.38
kubernetes v1.4.0+776c994

Comment 6 yasun 2017-01-11 02:27:28 UTC
Also, the build cannot be canceled:
[root@ymsun tmp]# oc cancel-build sample-pipeline-12
error: build wavellite1-1/sample-pipeline-12 failed to cancel: timed out waiting for the condition
error: failure during the build cancellation

Comment 8 Jimmi Dyson 2017-01-12 10:25:33 UTC
I cannot recreate this locally. Could you share the full Jenkins log please (don't worry about how big it is)?

Comment 10 Xingxing Xia 2017-01-12 10:53:54 UTC
I can reproduce it easily. Full logs attached:
$ oc logs pod/jenkins-1-9havp > jenkins-pod.logs

Comment 11 Jimmi Dyson 2017-01-18 11:47:52 UTC
Fix (hopefully) in https://github.com/jenkinsci/openshift-sync-plugin/pull/35. Will release soon & will assign to @tdawson for packaging once available.

Comment 12 Jimmi Dyson 2017-01-18 14:08:41 UTC
v0.1.7 now available from http://updates.jenkins-ci.org/download/plugins/openshift-sync/. Over to @tdawson for packaging please.

Comment 14 Troy Dawson 2017-01-19 23:07:19 UTC
rpm updated:
  jenkins-plugin-openshift-sync-0.1.7-1.el7
images made:
  openshift3/jenkins-1-rhel7:1.651.2-47
  openshift3/jenkins-2-rhel7:2.19-14
rpm is in the 3.4 testing repos
images are in the testing registries.

Comment 15 Dongbo Yan 2017-01-20 09:39:40 UTC
Tested with
brew-pulp.../openshift3/jenkins-1-rhel7     fbb2479081b2    
brew-pulp.../openshift3/jenkins-2-rhel7      5119d00f1c73

jenkins-plugin-openshift-pipeline-1.0.37-1.el7.x86_64
jenkins-plugin-openshift-sync-0.1.7-1.el7.x86_64
jenkins-plugin-openshift-login-0.9-1.el7.x86_64

openshift v3.4.1.0
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Comment 16 Rohana Rezel 2017-01-25 06:18:28 UTC
I submitted a PR to fix the issue:

https://github.com/openshift/jenkins/pull/236

Comment 17 Troy Dawson 2017-02-16 20:54:26 UTC
Original issue is fixed and in the latest released version.
Closing bug.

