Bug 1568529 - Failed builds appear in the Jenkins UI but not in the OpenShift pipeline UI [NEEDINFO]
Summary: Failed builds appear in the Jenkins UI but not in the OpenShift pipeline UI
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: Ben Parees
QA Contact: Wenjing Zheng
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-17 17:09 UTC by Sten Turpin
Modified: 2018-04-24 17:31 UTC
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-24 17:31:17 UTC
Target Upstream Version:
bparees: needinfo? (sten)


Attachments (Terms of Use)
build objects list (41.13 KB, image/png)
2018-04-20 19:00 UTC, guilherme.camposo
jenkins view (141.46 KB, image/png)
2018-04-20 19:00 UTC, guilherme.camposo

Description Sten Turpin 2018-04-17 17:09:57 UTC
Description of problem: Failed builds appear in the Jenkins UI but not in the OpenShift pipeline UI 


Version-Release number of selected component (if applicable):  atomic-openshift-3.7.23-1.git.5.83efd71.el7.x86_64


How reproducible: always 


Steps to Reproduce:
1. Fail a build


Actual results:
1. Build output shows in Jenkins
2. Build is not visible in the OpenShift pipeline UI

Expected results:
The build should be visible in the OpenShift pipeline UI

Additional info:

Comment 3 Ben Parees 2018-04-18 03:45:38 UTC
Does the build object corresponding to the Jenkins job run exist in OpenShift? Can you share the YAML for it?
Does the build object itself show up in the OpenShift web console?

I assume, though you didn't make it clear, that you are referring to not seeing pipeline #10 (and maybe #9) in the web console. Are you also expecting to see build #8? I can't tell from your screenshot what the state of build #8 is in Jenkins. Did you cancel/abort it? Is it still running?

What other recreate steps can you provide?

How did you define the BuildConfig? How did you launch it? How did you cause it to fail?
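A quick way to answer these questions from the command line (a sketch; the project name "sidi" appears in the log later in this thread, but the build name shown is hypothetical):

```shell
# List the build objects OpenShift still has for the project
oc get builds -n sidi

# Dump the full YAML for one build to inspect its phase and status reason
# (the build name "gcom-epg-sync-uat-pipeline-10" is hypothetical)
oc get build gcom-epg-sync-uat-pipeline-10 -n sidi -o yaml
```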

Comment 4 Ben Parees 2018-04-20 13:20:00 UTC
Sten, bump.

Comment 5 guilherme.camposo 2018-04-20 19:00:35 UTC
Created attachment 1424646 [details]
build objects list

Comment 6 guilherme.camposo 2018-04-20 19:00:59 UTC
Created attachment 1424647 [details]
jenkins view

Comment 7 guilherme.camposo 2018-04-20 19:15:53 UTC
(In reply to Ben Parees from comment #3)
> Does the build object corresponding to the Jenkins job run exist in
> OpenShift? Can you share the YAML for it?
> Does the build object itself show up in the OpenShift web console?
> 
> I assume, though you didn't make it clear, that you are referring to not
> seeing pipeline #10 (and maybe #9) in the web console. Are you also
> expecting to see build #8? I can't tell from your screenshot what the state
> of build #8 is in Jenkins. Did you cancel/abort it? Is it still running?
> 
> What other recreate steps can you provide?
> 
> How did you define the BuildConfig? How did you launch it? How did you
> cause it to fail?

Hi Ben, 

thanks for replying. 

Builds 8, 9, and 10 failed, and the customer can't find them in OpenShift, either in the web console or via oc get builds (as the new attachment shows).

# Build 8
It was aborted; it took too long to start the build job.

# Build 9
It timed out trying to start a deployment.

Log:
Starting "Trigger OpenShift Deployment" with deployment config "gcom-epg-sync-uat" from the project "sidi".
Operation will timeout after 600000 milliseconds


# Build 10
It timed out trying to start a deployment, same as build 9.

They were all started using the web console's "start pipeline" button.

Comment 8 Ben Parees 2018-04-20 19:21:14 UTC
Well, something deleted them.

Given that they launched the build from the OpenShift console, as best I can tell from what you said (when you say "web console", please be more specific in the future; there is a Jenkins web console and an OpenShift web console), the fact that the job got started in Jenkins means the OpenShift build object was created at some point.

So again, it would be useful if they can recreate this, confirm that the OpenShift build object initially exists, and perhaps determine under what conditions it is being deleted.

I don't believe we normally delete any pipeline build objects in OpenShift, which makes me think the user took some additional action here.

Comment 9 guilherme.camposo 2018-04-24 14:46:07 UTC
(In reply to Ben Parees from comment #8)
> Well, something deleted them.
> 
> Given that they launched the build from the OpenShift console, as best I can
> tell from what you said (when you say "web console", please be more specific
> in the future; there is a Jenkins web console and an OpenShift web console),
> the fact that the job got started in Jenkins means the OpenShift build
> object was created at some point.
> 
> So again, it would be useful if they can recreate this, confirm that the
> OpenShift build object initially exists, and perhaps determine under what
> conditions it is being deleted.
> 
> I don't believe we normally delete any pipeline build objects in OpenShift,
> which makes me think the user took some additional action here.

Hi Ben, 

I just got the information from the customer.

They started the pipeline using the OpenShift web console and the builds were created, so they saw the builds fail in the OpenShift web console. After a while (they can't pinpoint the exact time), the build history for those 3 builds vanished. They didn't delete the builds manually.

Comment 10 Ben Parees 2018-04-24 15:01:11 UTC
Are they running oc adm prune builds, perhaps in a cron job?

Comment 11 guilherme.camposo 2018-04-24 15:41:10 UTC
(In reply to Ben Parees from comment #10)
> Are they running oc adm prune builds, perhaps in a cron job?

They are using OpenShift Dedicated, so they don't run anything to administer the cluster themselves. Maybe the OSD team can provide you with further information. This customer's cluster ID is gsat-corp.

Comment 12 Ben Parees 2018-04-24 17:31:17 UTC
After discussing with the ops team: they are running the oc adm prune builds command every 3 hours on the cluster. By default, that will remove all but 1 failed build for each BuildConfig. I would assume this is what is removing your builds; any changes to the pruning policy would have to be discussed with that team.

The next step is for a ticket to be opened requesting that the prune behavior be changed (assuming that is still desired).
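For reference, a sketch of the pruning presumably in effect (the flags are real oc adm prune builds options; the values shown are, to the best of my knowledge, the 3.x defaults, which match the "all but 1 failed build" behavior described above):

```shell
# Dry run: report which builds would be pruned, without deleting anything
oc adm prune builds --keep-complete=5 --keep-failed=1 --keep-younger-than=60m

# Add --confirm to actually delete. With --keep-failed=1, only the most
# recent failed build per BuildConfig is kept, so earlier failures vanish.
oc adm prune builds --keep-complete=5 --keep-failed=1 --keep-younger-than=60m --confirm
```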

