Bug 1466659

Summary: Build should fail when start-build is run with an empty dir
Product: Red Hat Software Collections
Component: rh-ruby24-container
Version: rh-ruby24
Reporter: XiuJuan Wang <xiuwang>
Assignee: Pavel Valena <pvalena>
QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Status: CLOSED EOL
Severity: low
Priority: medium
Keywords: Regression
CC: aos-bugs, bparees, cewong, hhorak, jokerman, mmccomas, tborcin, xiuwang
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-05-05 07:09:27 UTC
Type: Bug

Description XiuJuan Wang 2017-06-30 07:51:21 UTC
Description of problem:
The build completes when start-build is run with an empty dir, but the subsequent pod deployment fails because the builder image contains no application code.
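
For reference, a minimal sketch of how to observe the failed deployment (pod and dc names follow the steps below; the empty-dir path is illustrative):

```
$ oc start-build ruby-hello-world --from-dir=/tmp/empty
$ oc get pods                  # the deploy pod is expected to fail / crash-loop
$ oc logs dc/ruby-hello-world  # no application code, so the server cannot start
```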

Version-Release number of selected component (if applicable):
oc v3.6.126.4
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:
always

Steps to Reproduce:
1. Create an app:
oc new-app ruby:latest https://github.com/openshift/ruby-hello-world.git
2. Start a build with an empty dir:
oc start-build ruby-hello-world --from-dir=/tmp/test
3. Check the build logs.

Actual results:

The build completes, and the build logs read as follows:

---> Installing application source ...
---> Building your Ruby application from source ...
WARNING: Rubygem Rack is not installed in the present image.
Add rack to your Gemfile in order to start the web server.

Pushing image 172.30.44.192:5000/xiu/ruby-hello-world:latest ...
Push successful

Expected results:
The build should fail.

Additional info:

Comment 4 XiuJuan Wang 2017-07-03 02:36:30 UTC
This affects all ruby versions and all nodejs images.

Comment 5 Pavel Valena 2017-07-11 15:35:49 UTC
Hello XiuJuan,

I am afraid I was not able to reproduce the issue (on `oc cluster up`).

```
$ ls `pwd`/tmp # empty

$ oc start-build ruby-hello-world --from-dir=`pwd`/tmp
Uploading directory "/home/pvalena/Work/RH/tmp11/tmp" as binary input for the build ...
WARNING: the provided file may not be an archive (tar, tar.gz, or zip), use --from-file to prevent extraction
build "ruby-hello-world-4" started

$ oc logs po/ruby-hello-world-4-build
Receiving source from STDIN as archive ...
---> Installing application source ...
mv: cannot stat '/tmp/src/*': No such file or directory
error: build error: non-zero (13) exit code from centos/ruby-23-centos7@sha256:b013adb164ca694c7c36df8538ed1b14647cad625c935f21ea479709c2cd578e

$ oc version
oc v1.5.1+7b451fc
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://10.43.2.245:8443
openshift v1.5.1+7b451fc
kubernetes v1.5.2+43a9be4
```

Am I missing something? Do I have to run `oc cluster up` as root? Could you send me (via IRC) access to some cluster where I can reproduce the issue?

I will try with a newer oc and let you know.

Comment 6 Pavel Valena 2017-07-11 21:40:34 UTC
With the newer version, the creation of a new app seems to get stuck every time.

```
$ oc new-app ruby:latest https://github.com/openshift/ruby-hello-world.git
--> Found image 757f7e5 (8 days old) in image stream "openshift/ruby" under tag "latest" for "ruby:latest"

    Ruby 2.3
    --------
    Platform for building and running Ruby 2.3 applications

    Tags: builder, ruby, ruby23, rh-ruby23

    * The source repository appears to match: ruby
    * A source build using source code from https://github.com/openshift/ruby-hello-world.git will be created
      * The resulting image will be pushed to image stream "ruby-hello-world:latest"
      * Use 'start-build' to trigger a new build
    * This image will be deployed in deployment config "ruby-hello-world"
    * Port 8080/tcp will be load balanced by service "ruby-hello-world"
      * Other containers can access this service through the hostname "ruby-hello-world"

--> Creating resources ...
    imagestream "ruby-hello-world" created
    buildconfig "ruby-hello-world" created
    deploymentconfig "ruby-hello-world" created
    service "ruby-hello-world" created
--> Success
    Build scheduled, use 'oc logs -f bc/ruby-hello-world' to track its progress.
    Run 'oc status' to view your app.

$ oc logs -f bc/ruby-hello-world
Error from server (Timeout): Timeout: timed out waiting for build ruby-hello-world-1 to start after 10s

$ oc get all
NAME                  TYPE      FROM      LATEST
bc/ruby-hello-world   Source    Git       1

NAME                        TYPE      FROM      STATUS    STARTED   DURATION
builds/ruby-hello-world-1   Source    Git       Pending

NAME                  DOCKER REPO                                  TAGS      UPDATED
is/ruby-hello-world   172.30.1.1:5000/myproject/ruby-hello-world

NAME                  REVISION   DESIRED   CURRENT   TRIGGERED BY
dc/ruby-hello-world   0          1         0         config,image(ruby-hello-world:latest)

NAME                   CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
svc/ruby-hello-world   172.30.231.138   <none>        8080/TCP   13m

NAME                          READY     STATUS              RESTARTS   AGE
po/ruby-hello-world-1-build   0/1       ContainerCreating   0          13m

$ oc version
oc v3.6.0-alpha.2+3c221d5
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://127.0.0.1:8443
openshift v3.6.0-alpha.2+3c221d5
kubernetes v1.6.1+5115d708d7

```

Comment 7 Ben Parees 2017-07-11 22:51:49 UTC
If you oc describe the pod (ruby-hello-world-1-build), you'll get more info about what it's stuck doing. My guess would be that it's pulling the image, but if not that, your docker daemon might be in a bad state and need to be recycled.
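
In case it's useful, a minimal sketch of both checks (assuming a systemd host; the pod name is the one from comment 6):

```
$ oc describe po/ruby-hello-world-1-build  # check the Events section for why it is stuck
$ sudo systemctl restart docker            # recycle the docker daemon if it looks wedged
```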

Comment 8 Pavel Valena 2017-07-11 23:10:20 UTC
(In reply to Ben Parees from comment #7)
> If you oc describe the pod (ruby-hello-world-1-build), you'll get more info

Yes, I found that out in the meantime. :)

> about what it's stuck doing. My guess would be that it's pulling the image,
> but if not that, your docker daemon might be in a bad state and need to be
> recycled.

I've tried to 'revert' to a clean install (I removed /var/lib/origin and all docker images, restarted the docker service and the oc cluster, etc.), but the issue persists.

```
$ oc describe bc/ruby-hello-world
Name:           ruby-hello-world
Namespace:      myproject
Created:        55 seconds ago
Labels:         app=ruby-hello-world
Annotations:    openshift.io/generated-by=OpenShiftNewApp
Latest Version: 1

Strategy:       Source
URL:            https://github.com/openshift/ruby-hello-world.git
From Image:     ImageStreamTag openshift/ruby:latest
Output to:      ImageStreamTag ruby-hello-world:latest

Build Run Policy:       Serial
Triggered by:           Config, ImageChange
Webhook GitHub:
        URL:    https://127.0.0.1:8443/oapi/v1/namespaces/myproject/buildconfigs/ruby-hello-world/webhooks/x4AlxPIJvQwMIx354utx/github
Webhook Generic:
        URL:            https://127.0.0.1:8443/oapi/v1/namespaces/myproject/buildconfigs/ruby-hello-world/webhooks/2TtzsNc9LkkY18Tseca7/generic
        AllowEnv:       false

Build                   Status          Duration                Creation Time
ruby-hello-world-1      pending         waiting for 55s         2017-07-12 00:59:43 +0200 CEST

Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath   Type            Reason                       Message
  ---------     --------        -----   ----                    -------------   --------        ------                       -------
  55s           55s             1       build-config-controller                 Warning         BuildConfigInstantiateFailed gave up on Build for BuildConfig myproject/ruby-hello-world due to fatal error: the LastVersion(1) on build config myproject/ruby-hello-world does not match the build request LastVersion(0)

$ oc describe builds/ruby-hello-world-1
Name:           ruby-hello-world-1
Namespace:      myproject
Created:        2 minutes ago
Labels:         app=ruby-hello-world
                buildconfig=ruby-hello-world
                openshift.io/build-config.name=ruby-hello-world
                openshift.io/build.start-policy=Serial
Annotations:    openshift.io/build-config.name=ruby-hello-world
                openshift.io/build.number=1
                openshift.io/build.pod-name=ruby-hello-world-1-build

Status:         Pending
Duration:       waiting for 2m47s

Build Config:   ruby-hello-world
Build Pod:      ruby-hello-world-1-build

Strategy:       Source
URL:            https://github.com/openshift/ruby-hello-world.git
From Image:     DockerImage centos/ruby-23-centos7@sha256:b013adb164ca694c7c36df8538ed1b14647cad625c935f21ea479709c2cd578e
Output to:      ImageStreamTag ruby-hello-world:latest
Push Secret:    builder-dockercfg-vldbr

Build trigger cause:    Image change
Image ID:               <none>
Image Name/Kind:        ruby:latest / ImageStreamTag

Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath   Type            Reason          Message
  ---------     --------        -----   ----                    -------------   --------        ------          -------
  2m            2m              1       default-scheduler                       Normal          Scheduled       Successfully assigned ruby-hello-world-1-build to localhost
  2m            2s              13      kubelet, localhost                      Warning         FailedSync      Error syncing pod, skipping: failed to "CreatePodSandbox" for "ruby-hello-world-1-build_myproject(a09aaca0-668c-11e7-a3de-507b9d787fd0)" with CreatePodSandboxError: "CreatePodSandbox for pod \"ruby-hello-world-1-build_myproject(a09aaca0-668c-11e7-a3de-507b9d787fd0)\" failed: rpc error: code = 2 desc = failed to create a sandbox for pod \"ruby-hello-world-1-build\": Error response from daemon: cgroup-parent for systemd cgroup should be a valid slice named as \"xxx.slice\""

```

I am not sure how to fix this though. I will investigate further tomorrow.
Thanks for your help.

Comment 9 Ben Parees 2017-07-11 23:15:29 UTC
What host operating system are you on?

Comment 10 Pavel Valena 2017-07-11 23:26:02 UTC
(In reply to Ben Parees from comment #9)
> What host operating system are you on?

Fedora 24

docker-1.10.3-55.gite03ddb8.fc24.x86_64

Comment 11 Pavel Valena 2017-07-11 23:53:34 UTC
OK, I think I've found the reason why the build does not start.

```
$ oc logs -f bc/ruby-hello-world
Cloning "https://github.com/openshift/ruby-hello-world.git" ...
error: Unable to update build status: Get https://172.30.0.1:443/oapi/v1/namespaces/myproject/builds/ruby-hello-world-1: dial tcp 172.30.0.1:443: getsockopt: no route to host
error: Unable to update build status: Get https://172.30.0.1:443/oapi/v1/namespaces/myproject/builds/ruby-hello-world-1: dial tcp 172.30.0.1:443: getsockopt: no route to host
error: build error: fatal: unable to access 'https://github.com/openshift/ruby-hello-world.git/': Could not resolve host: github.com; Unknown error

```

This was produced on my Fedora 25 machine; my Fedora 24 machine times out on the same request.

Both are, however, stuck on the same issue AFAICT.

Comment 12 Pavel Valena 2017-07-11 23:57:02 UTC
Note that the service is running.

```
$ curl -k https://172.30.0.1/oapi/v1/namespaces/myproject/builds/ruby-hello-world-1
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "User \"system:anonymous\" cannot get builds in project \"myproject\"",
  "reason": "Forbidden",
  "details": {
    "name": "ruby-hello-world-1",
    "kind": "builds"
  },
  "code": 403
}

```

Comment 13 Ben Parees 2017-07-12 00:13:07 UTC
Comment 11 points to a networking configuration issue. Make sure you shut down dnsmasq on your machine before starting cluster up.
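
A minimal sketch of that sequence, assuming a systemd host:

```
$ sudo systemctl stop dnsmasq
$ oc cluster down
$ oc cluster up
```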

Comment 14 Pavel Valena 2017-07-12 17:31:18 UTC
> Comment 11 points to a networking configuration issue. Make sure you shut
> down dnsmasq on your machine before starting cluster up.

Yes, dnsmasq was running. Stopping it (and starting everything anew) does not, however, solve the issue. Do you think using minishift could help?

Comment 15 Ben Parees 2017-07-12 18:04:46 UTC
Using minishift may help since it runs in a VM, but there's really no reason you shouldn't be able to get oc cluster up to work on a stock Fedora system.

Comment 16 Ben Parees 2017-07-12 18:05:14 UTC
You may need to shut down firewalld; I can't remember if that is still necessary.
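
If so, again assuming a systemd host:

```
$ sudo systemctl stop firewalld
```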

Comment 17 Ben Parees 2017-07-12 18:05:47 UTC
Adding Cesar to CC; he may be able to help you through your oc cluster up problems.

Comment 18 Cesar Wong 2017-07-12 18:43:05 UTC
There was a fix we made to cluster up to match the cgroup driver to whatever Docker is using: https://github.com/openshift/origin/pull/13964. From the error, it could be that the cgroup driver is not getting set correctly. Make sure that you are using a recent 'oc' binary.

Also, if you are running more recent versions of origin, Docker 1.10 may be too old. You should try upgrading to 1.12.x
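
A minimal sketch for checking both the cgroup driver and the Docker version (assuming the standard Docker CLI; the exact output fields can vary between versions):

```
$ docker info | grep -i 'cgroup driver'          # should match what 'oc cluster up' configures
$ docker version --format '{{.Server.Version}}'  # 1.10 may be too old; 1.12.x is suggested above
```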

Comment 21 XiuJuan Wang 2018-05-07 09:36:04 UTC
I can't reproduce this bug when running start-build with an empty dir on OCP 3.9 (3.9.27) and 3.10 (3.10.0.31).

Comment 26 Joe Orton 2020-03-05 13:04:10 UTC
Red Hat does not currently plan to provide any further changes to this collection in a Red Hat Software Collections update release.

This software collection is nearing the retirement date (April 2020) after which customers are encouraged to upgrade to a later release (where available).

Please contact Red Hat Support if you have further questions, or refer to the support lifecycle page for more information. https://access.redhat.com/support/policy/updates/rhscl/

Comment 27 Joe Orton 2020-05-05 07:09:27 UTC
In accordance with the Red Hat Software Collections Product Life Cycle, the support period for this collection has ended.

New bug fix, enhancement, and security errata updates, as well as technical support services will no longer be made available for this collection.

Customers are encouraged to upgrade to a later release.

Please contact Red Hat Support if you have further questions, or refer to the support lifecycle page for more information. https://access.redhat.com/support/policy/updates/rhscl/