Description of problem:
With OpenShift installed on cri-o, s2i (sti) and docker builds always fail: the build pod ends in Error. The build logs suggest that the build process cannot reach the external network, although the db pods can reach it successfully.

Version-Release number of selected component (if applicable):
openshift v3.7.0-0.184.0
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8
cri-o: 1.0.1

How reproducible:
Always

Steps to Reproduce:
[root@qe-dma-master-etcd-1 cri-o.0]# oc get po
NAME                          READY     STATUS    RESTARTS   AGE
django-psql-example-1-build   0/1       Error     0          4m
postgresql-1-gnb2b            1/1       Running   0          4m
ruby-ex-1-build               0/1       Error     0          25s

[root@qe-dma-master-etcd-1 cri-o.0]# oc logs ruby-ex-1-build
---> Installing application source ...
---> Building your Ruby application from source ...
---> Running 'bundle install --deployment --without development:test' ...
Fetching source index from https://rubygems.org/
Retrying fetcher due to error (2/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/
Retrying fetcher due to error (3/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/
Retrying fetcher due to error (4/4): Bundler::HTTPError Could not fetch specs from https://rubygems.org/
Could not fetch specs from https://rubygems.org/
error: build error: non-zero (13) exit code from registry.access.redhat.com/rhscl/ruby-24-rhel7@sha256:79c08145aa2d6a680a3dc8b0f026b48b77b970bfcfe184898b3cb6eaf65276d0

[root@qe-dma-master-etcd-1 cri-o.0]# oc logs django-psql-example-1-build
---> Installing application source ...
---> Installing dependencies ...
Collecting django<1.12,>=1.11 (from -r requirements.txt (line 1))
Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', gaierror(-2, 'Name or service not known'))': /simple/django/
Could not find a version that satisfies the requirement django<1.12,>=1.11 (from -r requirements.txt (line 1)) (from versions: )
No matching distribution found for django<1.12,>=1.11 (from -r requirements.txt (line 1))
error: build error: non-zero (13) exit code from registry.access.redhat.com/rhscl/python-35-rhel7@sha256:be9df8f0385cb443c5c8ceabfa8b98aa3f213fa60ef1cd40c3649f650693df2e

Actual results:

Expected results:
Builds complete successfully.

Additional info:
]# docker run -ti --entrypoint /bin/bash registry.access.redhat.com/openshift3/ose-sti-builder:latest
[root@3951d95c493e origin]# ping rubygems.org
PING rubygems.org (151.101.2.2) 56(84) bytes of data.
64 bytes from 151.101.2.2 (151.101.2.2): icmp_seq=1 ttl=55 time=10.2 ms
64 bytes from 151.101.2.2 (151.101.2.2): icmp_seq=2 ttl=55 time=9.97 ms
64 bytes from 151.101.2.2 (151.101.2.2): icmp_seq=3 ttl=55 time=9.97 ms
This blocks our build testing in the cri-o environment.
The assemble container is supposed to be launched using the network namespace from the build pod. Mrunal, can you take a look? DeShuai, if you run the build with loglevel 5, I think we'll dump more information about the way the assemble container is being launched.
Created attachment 1345529
build-logs.txt

Added build-loglevel=5 to get the detailed build logs.
Here's the network mode value we used to launch the container in question:
NetworkMode: netns:/proc/50018/ns/net
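For anyone repeating this check on a node, the value above can be split into the PID and the namespace path it points at. This is only a minimal Python sketch, assuming the mode string always has the form `netns:/proc/<pid>/ns/net` as quoted in this comment. `parse_netns_mode` and `resolv_conf_path` are hypothetical helper names, not part of OpenShift, cri-o, or docker. The `/proc/<pid>/root/...` path is the standard Linux procfs view of that process's root filesystem; whether PID 50018 is the right process to inspect is an assumption.

```python
import re

# Assumed format, taken from the value quoted above: netns:/proc/<pid>/ns/net
_NETNS_RE = re.compile(r"^netns:(/proc/(\d+)/ns/net)$")


def parse_netns_mode(mode):
    """Return (pid, netns_path) for a 'netns:/proc/<pid>/ns/net' string."""
    match = _NETNS_RE.match(mode.strip())
    if match is None:
        raise ValueError("unexpected network mode: %r" % mode)
    return int(match.group(2)), match.group(1)


def resolv_conf_path(pid):
    """Path to the resolv.conf seen by process <pid>, via procfs."""
    # /proc/<pid>/root is the procfs view of that process's root filesystem.
    return "/proc/%d/root/etc/resolv.conf" % pid
```

Calling `parse_netns_mode("netns:/proc/50018/ns/net")` gives the PID and namespace path, and `resolv_conf_path(pid)` gives a file that can be read on the node (as root) to see which resolver configuration that process has.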
This also looks like it could be a DNS issue in the container, so perhaps a problem w/ the resolv.conf in the crio pod, which is being mounted into the assemble container. DeShuai, can your other pods (not build pods) successfully perform DNS resolution?
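To separate the two possibilities discussed here (name resolution failing vs. no route to the external network), something like the following could be run inside a pod. It is a rough sketch, not the check used in this bug: `classify` is a hypothetical helper, the default port 443 is an assumption, and the `resolver` and `connector` parameters exist only so the logic can be exercised without network access. A `socket.gaierror` at the first step corresponds to the `gaierror(-2, 'Name or service not known')` seen in the pip log.

```python
import socket


def classify(host, port=443, resolver=socket.getaddrinfo,
             connector=socket.create_connection, timeout=5):
    """Return 'dns', 'connect', or 'ok' for a host/port pair."""
    # Step 1: name resolution. A gaierror here points at resolv.conf / DNS.
    try:
        infos = resolver(host, port, 0, socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"

    # Step 2: TCP connect to the first resolved address.
    address = infos[0][4][:2]  # (ip, port) for both IPv4 and IPv6 entries
    try:
        conn = connector(address, timeout)
    except OSError:
        return "connect"
    conn.close()
    return "ok"
```

For example, `classify("rubygems.org")` returning `"dns"` in the build container but `"ok"` in another pod would point at the resolver configuration rather than at routing.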
Yes that would work.
https://github.com/openshift/origin/pull/17094
docs: https://github.com/openshift/openshift-docs/pull/6552
code: https://github.com/openshift/origin/pull/17314
Ignore comment 13. The relevant PR (comment 11) has merged.
Tested in OpenShift cluster v3.8.22. Can't reproduce this bug in the cri-o env; s2i and docker builds work well.
s2i and docker builds work well in OpenShift cluster v3.9.0-0.9.0. Moving this bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489