Bug 1554902 - Pipeline doesn't start right after clicking button
Summary: Pipeline doesn't start right after clicking button
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ---
: 3.10.0
Assignee: Gabe Montero
QA Contact: Wenjing Zheng
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-03-13 14:26 UTC by Rutvik
Modified: 2021-06-10 15:17 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The build watch maintained by the openshift jenkins sync plugin would no longer function, even while watchers on other api object types still functioned. Consequence: The finding of a build would then fall upon the background build list thread, which by default runs at 5 minute intervals Fix: add better logging around unexpected closure of the openshift jenkins sync plugin watches, and add reconnect when those clousures occur; add the ability for customers to configure the relist interval. Result: customers do not have to wait up to 5 minutes for the pipeline strategy builds to start
Clone Of:
Environment:
Last Closed: 2018-07-30 19:10:40 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:1816 0 None None None 2018-07-30 19:11:12 UTC

Description Rutvik 2018-03-13 14:26:44 UTC
Description of problem:

While using openshift sync plugin for implementing cross-project pipeline environment, it is observed that When clicking the button to start the pipeline it takes 2-5 minutes before it actually starts.

Scenario:
- Jenkins is running in namespace A
- A pipeline is defined in namespace B.

In a couple of tests, it seems that if the number of stages in a pipeline is more then it's taking an unpredictable time to initiate the builds.

Comment 1 Ben Parees 2018-03-13 15:42:48 UTC
Are you using slaves? It can take time for the slaves images to pull and start before your pipeline will run.

What are you considering the "start" of the pipeline?

Comment 2 Gabe Montero 2018-03-13 15:47:11 UTC
Yep, Ben's question could be on point wrt what is going on here.

In addition to answering that, please get the customer to provide:

a) the pipeline script used

b) the jenkins pod logs

c) the console logs for one of the runs where the delay was observed

Comment 8 Gabe Montero 2018-03-14 15:37:15 UTC
OK, the data provided helped progress things a bit but was not precisely what I needed.  It did make me realize we need some more.

In addition to what was provided before

1) The blue ocean pipeline view is not sufficient, we need the console logs ... i.e. if he clicked "View logs" from that blue ocean view he provided

2) Lets get the openshift  build yaml ... i.e oc get build <build name> -o yaml
since he is starting the build from the jenkins console, we can try to gauge the time spent between the build getting created (i.e. when he hits start build from the console, and when the job gets started (which we'll see in the jenkins logs)

3) run oc get events -o yaml in the build namespace after everything is done ... we'll see if something is amiss on the openshift side 

4) we need the actual Jenkinsfile contents ... they are not embedded in the BC ... the BC references the Jenkinsfile from the git repo

One *POSSIBLE* cause, though I am not ready to declare that yet, is this exception that occurs while executing steps in the Jenkinsfile:

WARNING: Failed to resolve token macros
org.jenkinsci.plugins.tokenmacro.MacroEvaluationException: Unrecognized macro 'TEST_COUNTS' in '${TEST_COUNTS,var="pass"}'
	at org.jenkinsci.plugins.tokenmacro.Parser.processToken(Parser.java:326)
	at org.jenkinsci.plugins.tokenmacro.Action$KiHW1UeqOdqAwZul.run(Unknown Source)
	at org.parboiled.matchers.ActionMatcher.match(ActionMatcher.java:96)
	at org.parboiled.parserunners.BasicParseRunner.match(BasicParseRunner.java:77)
	at org.parboiled.MatcherContext.runMatcher(MatcherContext.java:351)
	at org.parboiled.matchers.SequenceMatcher.match(SequenceMatcher.java:46)
	at org.parboiled.parserunners.BasicParseRunner.match(BasicParseRunner.java:77)
	at org.parboiled.MatcherContext.runMatcher(MatcherContext.java:351)
	at org.parboiled.matchers.FirstOfMatcher.match(FirstOfMatcher.java:41)
	at org.parboiled.parserunners.BasicParseRunner.match(BasicParseRunner.java:77)
	at org.parboiled.MatcherContext.runMatcher(MatcherContext.java:351)
	at org.parboiled.matchers.FirstOfMatcher.match(FirstOfMatcher.java:41)
	at org.parboiled.parserunners.BasicParseRunner.match(BasicParseRunner.java:77)
	at org.parboiled.MatcherContext.runMatcher(MatcherContext.java:351)
	at org.parboiled.matchers.ZeroOrMoreMatcher.match(ZeroOrMoreMatcher.java:39)
	at org.parboiled.parserunners.BasicParseRunner.match(BasicParseRunner.java:77)
	at org.parboiled.MatcherContext.runMatcher(MatcherContext.java:351)
	at org.parboiled.matchers.SequenceMatcher.match(SequenceMatcher.java:46)
	at org.parboiled.parserunners.BasicParseRunner.match(BasicParseRunner.java:77)
	at org.parboiled.MatcherContext.runMatcher(MatcherContext.java:351)
	at org.parboiled.parserunners.BasicParseRunner.run(BasicParseRunner.java:72)
	at org.parboiled.parserunners.ReportingParseRunner.runBasicMatch(ReportingParseRunner.java:86)
	at org.parboiled.parserunners.ReportingParseRunner.run(ReportingParseRunner.java:66)
	at org.parboiled.parserunners.AbstractParseRunner.run(AbstractParseRunner.java:81)
	at org.parboiled.parserunners.AbstractParseRunner.run(AbstractParseRunner.java:76)
	at org.jenkinsci.plugins.tokenmacro.Parser.process(Parser.java:68)
	at org.jenkinsci.plugins.tokenmacro.TokenMacro.expand(TokenMacro.java:204)
	at org.jenkinsci.plugins.tokenmacro.TokenMacro.expand(TokenMacro.java:200)
	at jenkins.plugins.hipchat.impl.DefaultCardProvider.getAttributes(DefaultCardProvider.java:60)
	at jenkins.plugins.hipchat.impl.DefaultCardProvider.getCard(DefaultCardProvider.java:37)
	at jenkins.plugins.hipchat.workflow.HipChatSendStep$HipChatSendStepExecution.run(HipChatSendStep.java:192)
	at jenkins.plugins.hipchat.workflow.HipChatSendStep$HipChatSendStepExecution.run(HipChatSendStep.java:115)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
	at hudson.security.ACL.impersonate(ACL.java:260)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)


Unclear to date whether it has any bearing.  Hopefully the other information will clear that up.

Comment 10 Gabe Montero 2018-03-15 17:35:02 UTC
OK the build yaml i.e. my item 2 from the #Comment 8 would be the final piece here.

Also, let's confirm the they are using v3.7 of the openshift jenkins image.

Getting more and more convinced though that watches are becoming less and less of a reliable notification mechanism here, and we essentially have the same thing as https://github.com/openshift/jenkins-sync-plugin/issues/209

Comment 20 wewang 2018-04-08 08:22:35 UTC
verified in brew-pulp-xxxxxx/openshift3/jenkins-2-rhel7 v3.10(v3.10.0.20180406.113022) 838c3e9ab0d9  with sync plugin:v1.0.10

steps:
1. Create jenkins resource with pipeline build 
 $oc new-app -f https://raw.githubusercontent.com/openshift/origin/master/examples/jenkins/pipeline/samplepipeline.yaml

2. Start a pipeline build and wait complete 
 $oc start-build sample-pipeline

3. Log in to jenkins, got to manage jenkins -> configure system, change the 	Build list interval to 400

4. go into pods to check value buildListInterval is 400
      $ oc rsh jenkins-1-nncsv
sh-4.2$ cd /var/lib/jenkins
sh-4.2$ more io.fabric8.jenkins.openshiftsync.GlobalPluginConfiguration.xml
<?xml version='1.1' encoding='UTF-8'?>
<io.fabric8.jenkins.openshiftsync.GlobalPluginConfiguration plugin="openshift-sync.10">
  <enabled>true</enabled>
  <server></server>
  <namespaces>
    <string>wewang</string>
  </namespaces>
  <jobNamePattern></jobNamePattern>
  <skipOrganizationPrefix></skipOrganizationPrefix>
  <skipBranchSuffix></skipBranchSuffix>
  <buildListInterval>400</buildListInterval>
  <buildConfigListInterval>300</buildConfigListInterval>
  <secretListInterval>300</secretListInterval>
  <configMapListInterval>300</configMapListInterval>
  <imageStreamListInterval>300</imageStreamListInterval>
</io.fabric8.jenkins.openshiftsync.GlobalPluginConfiguration>

5)Manage Jenkins -> System Log, and then click "add new log recorder".  type logwatch name, the click "add" and in the logger field, type in "io.fabric8.jenkins.openshiftsync.BuildWatcher" and click save.

6. Start a pipeline build
   $oc start-build sample-pipeline

7. Check the BuildWatcher and see build resources list at an internal
Apr 08, 2018 8:16:37 AM FINE io.fabric8.jenkins.openshiftsync.BuildWatcher
listing Build resources
Apr 08, 2018 8:16:37 AM FINE io.fabric8.jenkins.openshiftsync.BuildWatcher
handled Build resources
Apr 08, 2018 8:16:37 AM INFO io.fabric8.jenkins.openshiftsync.BuildWatcher reconcileRunsAndBuilds
Reconciling job runs and builds
Apr 08, 2018 8:16:37 AM INFO io.fabric8.jenkins.openshiftsync.BuildWatcher reconcileRunsAndBuilds
Checking runs for BuildConfig wewang/sample-pipeline

Comment 21 wewang 2018-04-08 08:30:28 UTC
more logs about step7
you can see  the listed  resources in every 400s(8:23:17-8:16:37=400s)


Apr 08, 2018 8:16:37 AM FINE io.fabric8.jenkins.openshiftsync.BuildWatcher
listing Build resources
Apr 08, 2018 8:16:37 AM FINE io.fabric8.jenkins.openshiftsync.BuildWatcher
handled Build resources
Apr 08, 2018 8:16:37 AM INFO io.fabric8.jenkins.openshiftsync.BuildWatcher reconcileRunsAndBuilds
Reconciling job runs and builds
Apr 08, 2018 8:16:37 AM INFO io.fabric8.jenkins.openshiftsync.BuildWatcher reconcileRunsAndBuilds
Checking runs for BuildConfig wewang/sample-pipeline
Apr 08, 2018 8:23:17 AM FINE io.fabric8.jenkins.openshiftsync.BuildWatcher
listing Build resources
Apr 08, 2018 8:23:17 AM FINE io.fabric8.jenkins.openshiftsync.BuildWatcher
handled Build resources
Apr 08, 2018 8:23:17 AM INFO io.fabric8.jenkins.openshiftsync.BuildWatcher reconcileRunsAndBuilds
Reconciling job runs and builds
Apr 08, 2018 8:23:17 AM INFO io.fabric8.jenkins.openshiftsync.BuildWatcher reconcileRunsAndBuilds
Checking runs for BuildConfig wewang/sample-pipeline

Comment 23 errata-xmlrpc 2018-07-30 19:10:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816


Note You need to log in before you can comment on or make changes to this bug.