Bug 1365375 - [online_production] Failed to push image to registry during STI build
Summary: [online_production] Failed to push image to registry during STI build
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Image Registry
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Alexey Gladkov
QA Contact: Wei Sun
URL:
Whiteboard:
Duplicates: 1365855 1366326
Depends On:
Blocks: OSOPS_V3
Reported: 2016-08-09 06:15 UTC by Bing Li
Modified: 2016-08-18 12:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-18 12:27:55 UTC
Target Upstream Version:
Embargoed:


Description Bing Li 2016-08-09 06:15:44 UTC
Version-Release number of selected component (if applicable):
Dev_preview_PROD
OpenShift Master: v3.2.1.10-1-g668ed0a
Kubernetes Master: v1.2.0-36-g4a3f9c5

How reproducible:
Sometimes

Steps to Reproduce:
1. oc new-app https://github.com/openshift/nodejs-ex
2. Check the build process.

Actual results:
2. In the dev_preview_PROD environment, the build sometimes fails to push the image to the registry. Part of the build log is listed below:
...
I0809 02:01:45.667105       1 sti.go:334] Successfully built bingli-prod/nodejs-ex-4:4b18d114
I0809 02:01:45.697901       1 cleanup.go:23] Removing temporary directory /tmp/s2i-build339799339
I0809 02:01:45.697921       1 fs.go:156] Removing directory '/tmp/s2i-build339799339'
I0809 02:01:45.710856       1 sti.go:268] Using provided push secret for pushing 172.30.47.227:5000/bingli-prod/nodejs-ex:latest image
I0809 02:01:45.710877       1 sti.go:272] Pushing 172.30.47.227:5000/bingli-prod/nodejs-ex:latest image ...
I0809 02:02:15.839267       1 sti.go:277] Registry server Address: 
I0809 02:02:15.839292       1 sti.go:278] Registry server User Name: serviceaccount
I0809 02:02:15.839299       1 sti.go:279] Registry server Email: serviceaccount
I0809 02:02:15.839306       1 sti.go:284] Registry server Password: <<non-empty>>
F0809 02:02:15.839315       1 builder.go:204] Error: build error: Failed to push image. Response from registry is: Error parsing HTTP response: unexpected end of JSON input: ""


Additional info:
[user3@bingli ~]$ oc get build
NAME          TYPE      FROM          STATUS     STARTED          DURATION
nodejs-ex-1   Source    Git@0e748ed   Failed     26 minutes ago   1m1s
nodejs-ex-2   Source    Git@0e748ed   Failed     23 minutes ago   1m11s
nodejs-ex-3   Source    Git@0e748ed   Complete   14 minutes ago   2m26s
nodejs-ex-4   Source    Git@0e748ed   Failed     10 minutes ago   1m9s
[user3@bingli ~]$ oc get secret
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-d4and    kubernetes.io/dockercfg               1         24d
builder-token-bas4d        kubernetes.io/service-account-token   3         24d
builder-token-tbhw7        kubernetes.io/service-account-token   3         24d
default-dockercfg-dm8j3    kubernetes.io/dockercfg               1         24d
default-token-fhqmy        kubernetes.io/service-account-token   3         24d
default-token-uwg2y        kubernetes.io/service-account-token   3         24d
deployer-dockercfg-1zw07   kubernetes.io/dockercfg               1         24d
deployer-token-p4cws       kubernetes.io/service-account-token   3         24d
deployer-token-v5xqh       kubernetes.io/service-account-token   3         24d
[user3@bingli ~]$ oc get bc nodejs-ex -o json
{
    "kind": "BuildConfig",
    "apiVersion": "v1",
    "metadata": {
        "name": "nodejs-ex",
        "namespace": "bingli-prod",
        "selfLink": "/oapi/v1/namespaces/bingli-prod/buildconfigs/nodejs-ex",
        "uid": "5af208b6-5df4-11e6-a1a5-0e3d364e19a5",
        "resourceVersion": "96695839",
        "creationTimestamp": "2016-08-09T05:44:36Z",
        "labels": {
            "app": "nodejs-ex"
        },
        "annotations": {
            "openshift.io/generated-by": "OpenShiftNewApp"
        }
    },
    "spec": {
        "triggers": [
            {
                "type": "GitHub",
                "github": {
                    "secret": "2Yi6C0MPUa_Ei_fo8d0d"
                }
            },
            {
                "type": "Generic",
                "generic": {
                    "secret": "ZUpbr3uIAjqDIwI5opyu"
                }
            },
            {
                "type": "ConfigChange"
            },
            {
                "type": "ImageChange",
                "imageChange": {
                    "lastTriggeredImageID": "registry.access.redhat.com/rhscl/nodejs-4-rhel7:latest"
                }
            }
        ],
        "runPolicy": "Serial",
        "source": {
            "type": "Git",
            "git": {
                "uri": "https://github.com/openshift/nodejs-ex"
            }
        },
        "strategy": {
            "type": "Source",
            "sourceStrategy": {
                "from": {
                    "kind": "ImageStreamTag",
                    "namespace": "openshift",
                    "name": "nodejs:4"
                }
            }
        },
        "output": {
            "to": {
                "kind": "ImageStreamTag",
                "name": "nodejs-ex:latest"
            }
        },
        "resources": {},
        "postCommit": {}
    },
    "status": {
        "lastVersion": 4
    }
}

Comment 2 Steve Speicher 2016-08-10 18:31:03 UTC
Could this registry failure be a result of https://bugzilla.redhat.com/show_bug.cgi?id=1364870 ?

Comment 3 Michal Minar 2016-08-11 09:15:52 UTC
*** Bug 1365855 has been marked as a duplicate of this bug. ***

Comment 4 Cesar Wong 2016-08-11 16:39:36 UTC
*** Bug 1366326 has been marked as a duplicate of this bug. ***

Comment 5 Michal Fojtik 2016-08-12 09:53:05 UTC
(In reply to Steve Speicher from comment #2)
> Could this registry failure be a result of
> https://bugzilla.redhat.com/show_bug.cgi?id=1364870 ?

We think it is. We should get that fix in ASAP to verify that. Also, the registry logs indicate a timeout when connecting to the OpenShift API server, which might be an infra issue or an overloaded API server. Is this consistently reproducible or a flake?

Comment 6 Bing Li 2016-08-12 10:25:06 UTC
It can be reproduced easily in the online production environment.
I don't think I'm the only one who has hit this issue, because several duplicate bugs have been filed about it :)

Comment 7 Michal Fojtik 2016-08-12 10:47:58 UTC
I talked to Alex, and the problem seems to be an I/O timeout when contacting the API server to verify the OpenShift user. Alex will try to put together a fix that retries when we hit the I/O timeout. We also need to do a better job of error reporting when this happens, so that the builder can retry the push if we hit a capacity problem.
