Created attachment 1272696 [details] completed build json Description of problem: Some of the Completed/Failed builds are missing duration proj9 cakephp-mysql-example-19 Source Git@0014dde Complete 2 hours ago 20s proj9 cakephp-mysql-example-2 Source Git@0014dde Complete 2 hours ago 8s proj9 cakephp-mysql-example-20 Source Git@0014dde Complete 2 hours ago proj9 cakephp-mysql-example-21 Source Git@0014dde Complete 2 hours ago 5s proj9 cakephp-mysql-example-22 Source Git@0014dde Complete 2 hours ago 48s proj9 cakephp-mysql-example-23 Source Git@0014dde Complete 2 hours ago proj9 cakephp-mysql-example-24 Source Git@0014dde Complete 2 hours ago 35s proj9 cakephp-mysql-example-25 Source Git@0014dde Complete 2 hours ago proj9 cakephp-mysql-example-26 Source Git@0014dde Complete 2 hours ago 46s proj9 cakephp-mysql-example-27 Source Git@0014dde Complete 2 hours ago 42s proj13 cakephp-mysql-example-38 Source Git@0014dde Failed (PushImageToRegistryFailed) 5 minutes ago 4m47s proj15 cakephp-mysql-example-36 Source Git@0014dde Failed About an hour ago proj17 cakephp-mysql-example-36 Source Git@0014dde Failed About an hour ago 2m15s proj19 cakephp-mysql-example-36 Source Git@0014dde Failed About an hour ago proj2 cakephp-mysql-example-36 Source Git@0014dde Failed About an hour ago 1m36s Version-Release number of selected component (if applicable): openshift v3.5.5.7 kubernetes v1.5.2+43a9be4 etcd 3.1.0 Client: Version: 1.12.6 API version: 1.24 Package version: docker-common-1.12.6-16.el7.x86_64 Go version: go1.7.4 Git commit: 3a094bd/1.12.6 Built: Tue Mar 21 13:30:59 2017 OS/Arch: linux/amd64 Server: Version: 1.12.6 API version: 1.24 Package version: docker-common-1.12.6-16.el7.x86_64 Go version: go1.7.4 Git commit: 3a094bd/1.12.6 Built: Tue Mar 21 13:30:59 2017 OS/Arch: linux/amd64 Steps to Reproduce: 1. Create 20 cakephp build configs 2. start 20 concurrent builds 3. This happens after few builds are completed fine Actual results: Some builds are missing duration column Expected results: All the builds should have duration column populated Additional info: Please see attached description of Failed and Completed builds.
Created attachment 1272698 [details] failed build desc
Created attachment 1272699 [details] completed build desc
the build in https://bugzilla.redhat.com/attachment.cgi?id=1272696 at least shows a started timestamp and a completion timestamp, which makes me think you're hitting this: https://github.com/openshift/origin/issues/12549 which was fixed here: https://github.com/openshift/origin/pull/12569
Tested with following version of openshift and docker, I still see the problem. Please see attachments for details root@ip-172-31-4-124: ~ # openshift version openshift v3.6.65 kubernetes v1.6.1+5115d708d7 etcd 3.1.0 root@ip-172-31-4-124: ~ # docker version Client: Version: 1.12.6 API version: 1.24 Package version: docker-1.12.6-19.git92b10e4.el7.x86_64 Go version: go1.7.4 Git commit: 92b10e4/1.12.6 Built: Tue May 2 15:06:29 2017 OS/Arch: linux/amd64 Server: Version: 1.12.6 API version: 1.24 Package version: docker-1.12.6-19.git92b10e4.el7.x86_64 Go version: go1.7.4 Git commit: 92b10e4/1.12.6 Built: Tue May 2 15:06:29 2017 OS/Arch: linux/amd64
Created attachment 1277151 [details] build json v3.6.65
Created attachment 1277152 [details] build desc v3.6.65
I'm not seeing a missing duration, but I am seeing an incorrect duration (it's reported as 0s): Started: Mon, 08 May 2017 13:53:19 EDT Duration: 0s FetchInputs: 1s CommitContainer: 14s Assemble: 59s PostCommit: 0s PushImage: 14s The startTimestamp value is the same as the completionTimestamp value. This is likely because we introduced logic to force-set the startTimestamp to the same value as the completionTimestamp, if the startTimestamp was not already set when the build completes. That should only happen when a build completes so fast that the pod controller doesn't even see the pod "start", so we never updated the startTimestamp value. However this build clearly took quite a bit of time, so it's not obvious to me why the pod controller would not have seen the pod in a running state and updated the build with a start timestamp before the completion occurred. if you can recreate this with loglevel 5 logs for both the master and the build in question, that would be helpful for us to track down the chain of events leading to this.
Hi Ben, This is how I see the completed builds where duration is missing in oc get builds command proj6 cakephp-mysql-example-7 Source Git@0014dde Complete 13 minutes ago 4s proj6 cakephp-mysql-example-8 Source Git@0014dde Complete 11 minutes ago proj6 cakephp-mysql-example-9 Source Git@0014dde Complete 8 minutes ago 6s proj7 cakephp-mysql-example-1 Source Git@0014dde Complete 40 minutes ago 2m17s proj7 cakephp-mysql-example-10 Source Git@0014dde Complete 8 minutes ago proj7 cakephp-mysql-example-11 Source Git@0014dde Complete 5 minutes ago proj7 cakephp-mysql-example-12 Source Git@0014dde Complete 2 minutes ago proj7 cakephp-mysql-example-2 Source Git@0014dde Complete 30 minutes ago 11s I am re-running the test with loglevel 5 will update the bug again.
Created attachment 1277163 [details] logs with loglevel5 v3.6.65
do you know the name of one of the builds that had a zero/missing duration in this run?
Yes, proj26/builds/cakephp-mysql-example-15 Logs of this build with logleve 5 is also attached.
Created attachment 1277421 [details] build json v3.6.65
found the issue, PR here: https://github.com/openshift/origin/pull/14131
Verified on following version openshift v3.6.94 kubernetes v1.6.1+5115d708d7 etcd 3.1.0 Complete many iterations of 50 concurrent builds where lot of builds also failed due to context deadline exceeded errors from docker. All the builds have Duration field populated.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716