Description:
Following the instructions on page https://github.com/openshift/origin/tree/master/examples/sample-app to do an STI build, it always fails with the error "Could not fetch specs from https://rubygems.org/"

Version-Release number of selected component (if applicable):
openshift v0.5.2.0-176-gc386339
kubernetes v0.17.0-441-g6b6b47a
puddle 3.0/2015-05-26.2/

How reproducible:
always

Steps to Reproduce:
1. Follow https://github.com/openshift/origin/tree/master/examples/sample-app and trigger a build
2. osc build-logs ruby-sample-build-1

Actual results:
I0526 21:53:46.564011 1 docker.go:364] Waiting for container
I0526 21:53:47.055761 1 sti.go:392] ---> Installing application source
I0526 21:53:47.073206 1 sti.go:392] ---> Building your Ruby application from source
I0526 21:53:47.073246 1 sti.go:392] ---> Running 'bundle install --deployment'
I0526 21:54:27.779183 1 sti.go:392] Fetching source index from https://rubygems.org/
I0526 21:55:07.892978 1 sti.go:392] Could not fetch specs from https://rubygems.org/
I0526 21:55:08.000742 1 docker.go:370] Container exited
I0526 21:55:09.364931 1 cleanup.go:24] Removing temporary directory /tmp/sti376064972

Expected results:
Should work

Additional info:
xjia - it looks like either rubygems.org is not accessible from the build container, or dns lookup is not working. Can you please check that you can access rubygems.org from the node itself?
Yep, I have checked my environment. Both the node and the container can connect to "rubygems.org" (using curl -k https://rubygems.org), but I have no idea why it always fails to fetch data from rubygems.org.
Is there any way that I could access your environment to troubleshoot?
It's not just downloading from rubygems.org; there is also a problem when trying to install dependencies in a container built from the perl image:

I0528 03:42:56.500439 1 docker.go:354] Starting container
I0528 03:42:56.691054 1 docker.go:364] Waiting for container
I0528 03:42:56.907048 1 sti.go:392] ---> Installing application source
I0528 03:42:56.948766 1 sti.go:392] ---> Installing modules from cpanfile ...
E0528 03:43:37.563564 1 sti.go:418] ! Finding Module::CoreList on cpanmetadb failed.
E0528 03:43:37.563628 1 sti.go:418] ! Finding Module::CoreList on cpanmetadb failed.
related to https://github.com/openshift/origin/issues/2482
I have a very similar issue with EAP STI. It seems that the openshift-master has to configure docker containers (I can see that /etc/resolv.conf is different in containers created by openshift and in containers created manually using docker), but these settings are not passed to the subsequent STI builder (i.e. the EAP image). When I try to use STI, the ose-sti-builder downloads the sources (resolving the git URL correctly and accessing the server), but once the EAP part starts (in a new container with the EAP image), /etc/resolv.conf is in its default state and connections outside the container don't work (I have tried both resolving DNS and accessing an IP address directly). It blocks xPaaS testing of Beta4.
In my scenario, I did not see DNS resolver issues, but I still can not fetch data from rubygems.org (the same behaviour as the initial report). I could see the STI build container has the correct DNS resolvers:

# cat /var/lib/docker/containers/71a1d75b1af017af4d306fa99832ecbdd0e6004f1e5759aaf6b091df951589e5/resolv.conf
nameserver 192.168.1.192
nameserver 10.11.5.19
search jialiu.cluster.local cluster.local openstacklocal cluster.local

71a1d75b1af017af4d306fa99832ecbdd0e6004f1e5759aaf6b091df951589e5 is the STI builder container UUID. Because the container is already terminated after the STI build failure, I have no chance to log into it to check DNS. So I logged into the router/docker-registry container to check its connection to rubygems.org; I even tried "bundle install" just like the STI build does, and everything went well.

# docker exec -t -i <router-container-ID> /bin/sh
sh-4.2# cat /etc/resolv.conf
nameserver 192.168.1.192
nameserver 10.11.5.19
search default.cluster.local cluster.local openstacklocal cluster.local

The same resolver order as the STI build container. 192.168.1.192 is the master where SkyDNS is running; 10.11.5.19 is the office network DNS resolver.

sh-4.2# dig @192.168.1.192 rubygems.org

; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.1 <<>> @192.168.1.192 rubygems.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 22025
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;rubygems.org. IN A

;; Query time: 2 msec
;; SERVER: 192.168.1.192#53(192.168.1.192)
;; WHEN: Fri May 29 04:52:32 EDT 2015
;; MSG SIZE rcvd: 30

sh-4.2# dig rubygems.org

; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.1 <<>> rubygems.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 20393
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;rubygems.org. IN A

;; Query time: 1 msec
;; SERVER: 192.168.1.192#53(192.168.1.192)
;; WHEN: Fri May 29 04:53:59 EDT 2015
;; MSG SIZE rcvd: 30

This is weird; it seems that inside the container the 2nd resolver is not used when the 1st resolver can not resolve the name. But interestingly, I could run the following commands successfully without any change to /etc/resolv.conf in the container:

sh-4.2# yum install rubygem-bundler
sh-4.2# curl -k https://rubygems.org/
sh-4.2# gem install rack
sh-4.2# bundle install
I have additional information about how the build goes:
1) build is started
2) a docker container from the ose-pod image is created
3) a docker container from the ose-sti-builder image is created
   - this container has access to the network and is set up to use the container created in 2) as its NetworkMode
   - this container downloads the sources from github
4) a docker container from the eap-openshift image is created
   - this container is not using the container created in 2) as its NetworkMode
   - this container has no access to the internet (via IP or DNS)
5) build fails
It seems that the sti builder is creating the docker container itself, which causes the problem, because in the current setup, containers created directly by docker do not have access anywhere.
(In reply to Johnny Liu from comment #8)
Yes, when I connect to the router, everything works just fine (resolving DNS, downloading from github), but if I just create a docker container (e.g. docker run --rm -it fedora bash), it is unable to connect to the internet.

By doing 'watch docker ps' I found out that a new container is started by the STI build. That container is created via docker run, so it has no access anywhere (verified, since the maven build takes time to fail).
(In reply to Tomas Schlosser from comment #10) > (In reply to Johnny Liu from comment #8) > Yes, when I connect to router, everything works just fine (resolving DNS, > downloading from github), but if I just create docker container (e.g. docker > run --rm -it fedora bash), it is unable to connect to the internet. > > By doing 'watch docker ps' I found out that new container is started by STI > build. That container is created by docker run, so it has no access anywhere > (verified it since maven build takes time to fail). Agree.
The issue is caused by the container run by the STI builder not getting the same nameserver configuration as the builder pod.

The solution is to manually add the IP of the openshift master to each node's resolv.conf.

We will print out a warning when starting the node if this is not the case. However, this needs to be done manually or by deploy scripts. I'll move this bug to ON_QA when the documentation has been updated.
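As a config sketch, the workaround described in this comment would make each node's /etc/resolv.conf look something like the following (the IPs are examples taken from this report: 192.168.1.192 is the master running SkyDNS, 10.11.5.19 is the upstream office resolver; resolv.conf only allows comments on their own lines):

```
; generated by /usr/sbin/dhclient-script, master IP added manually
search openstacklocal cluster.local
; openshift master (SkyDNS)
nameserver 192.168.1.192
; upstream/office DNS resolver
nameserver 10.11.5.19
```

Note that per the later discussion (comment 15), the master should come first and the node's original nameservers after it, to stay consistent with other pods.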
John, assigning this one to you, since you are going to be working on the documentation.
(In reply to Cesar Wong from comment #12) > The issue is caused by the container run by the STI builder not getting the > same nameserver configuration as the builder pod. > > The solution is to manually add the ip of the openshift master to each > node's resolv.conf > > We will print out a warning if this is not the case when starting the node. > However, this needs to be done manually or by deploy scripts. I'll move this > bug to ON_QA when documentation has been updated. We just reconfigured skydns not to recurse. Are you adding only the master's ip to the resolv.conf of the containers spawned by S2I builder? I think we'd need to be consistent with other pods and add the master first, then the node's nameservers after that. See https://github.com/openshift/origin/pull/2569 for discussion regarding disabling recursion in skydns.
(In reply to Cesar Wong from comment #12)
> The issue is caused by the container run by the STI builder not getting the
> same nameserver configuration as the builder pod.
I don't think this is a DNS issue; I have tried curl with an IP address and got "No route to host".
> The solution is to manually add the ip of the openshift master to each
> node's resolv.conf
>
> We will print out a warning if this is not the case when starting the node.
> However, this needs to be done manually or by deploy scripts. I'll move this
> bug to ON_QA when documentation has been updated.
I have tried adding the master to each node's resolv.conf. I tried adding it as the second nameserver (after the real-world one) as well as the first nameserver. Neither setup works and the STI build still fails.

I have updated openshift to the latest version:
openshift-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
openshift-master-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
tuned-profiles-openshift-node-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
openshift-sdn-ovs-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
openshift-node-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
I went through the docker inspect output again, and it seems that the main difference between a manually run container and a container run by OSE is the NetworkMode. The container created by docker run has NetworkMode set to "bridge", while the openshift-pod container (which provides the network to the sti-builder) has NetworkMode set to "" (empty string). I didn't find a way to reproduce this using the docker run command, so I can't check whether it would solve the problem.
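The NetworkMode difference described above can be pulled out directly with docker inspect's format flag instead of scanning the full JSON; a sketch (the container ID is a placeholder, substitute one from `docker ps`):

```shell
# Print only how the container joined the network.
# An app container started by the sti-builder should show
# "container:<pod-container-id>" if it shares the pod's network namespace;
# a plain `docker run` container shows "bridge" (or "" on some versions).
docker inspect -f '{{.HostConfig.NetworkMode}}' <container-id>
```

This makes it easy to compare a manually started container against one spawned during a build without diffing two full inspect dumps.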
(In reply to Cesar Wong from comment #12)
> The issue is caused by the container run by the STI builder not getting the
> same nameserver configuration as the builder pod.
>
> The solution is to manually add the ip of the openshift master to each
> node's resolv.conf
>
> We will print out a warning if this is not the case when starting the node.
> However, this needs to be done manually or by deploy scripts. I'll move this
> bug to ON_QA when documentation has been updated.
My behaviour is the same as described in comment 15; it seems adding the master IP to each node's resolv.conf does not resolve this issue. Here are my testing steps:

1. openshift version
# openshift version
openshift v0.5.2.2-14-gef0f6ad
kubernetes v0.17.1-804-g496be63
# docker images|grep sti
docker-buildvm-rhose.usersys.redhat.com:5000/openshift3_beta/ose-sti-image-builder v0.5.2.2 ad33ee97468d 3 days ago 445.4 MB
docker-buildvm-rhose.usersys.redhat.com:5000/openshift3_beta/ose-sti-builder v0.5.2.2 63a3596cbba6 3 days ago 289.1 MB

2. According to comment 12, add the master IP to /etc/resolv.conf on the nodes.
# cat /etc/resolv.conf
; generated by /usr/sbin/dhclient-script
search openstacklocal cluster.local
nameserver 192.168.1.192
nameserver 10.11.5.19

3. Trigger the STI build.

4. During the STI build, the following docker container is spawned. (NOTE: this container does not have an associated ose-pod container)
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b964e9349b4d openshift/ruby-20-rhel7:latest "/bin/sh -c 'tar -C 3 seconds ago Up 2 seconds 8080/tcp loving_pike

5. Before this container is terminated, check its resolv.conf to make sure the master IP is there.
# docker inspect b964e9349b4d|grep res
"CpuShares": 0,
"MacAddress": "",
"CpuShares": 0,
"GlobalIPv6Address": "",
"IPAddress": "10.1.0.8",
"LinkLocalIPv6Address": "fe80::42:aff:fe1:8",
"MacAddress": "02:42:0a:01:00:08",
"ResolvConfPath": "/var/lib/docker/containers/b964e9349b4dbfa6efc4687db959d08e8cedfef667ab42782fbb51560bf87138/resolv.conf",
# cat /var/lib/docker/containers/b964e9349b4dbfa6efc4687db959d08e8cedfef667ab42782fbb51560bf87138/resolv.conf
; generated by /usr/sbin/dhclient-script
search openstacklocal cluster.local
nameserver 192.168.1.192
nameserver 10.11.5.19

But the build log still shows the same error:
$ osc build-logs ruby-sample-build-1
Switched to a new branch 'beta3'
Branch beta3 set up to track remote branch beta3 from origin.
I0601 02:16:57.711274 1 sti.go:392] ---> Installing application source
I0601 02:16:57.722962 1 sti.go:392] ---> Building your Ruby application from source
I0601 02:16:57.723158 1 sti.go:392] ---> Running 'bundle install --deployment'
I0601 02:17:38.268773 1 sti.go:392] Fetching source index from https://rubygems.org/
I0601 02:18:18.333187 1 sti.go:392] Could not fetch specs from https://rubygems.org/
F0601 02:18:19.510684 1 builder.go:75] Build error: non-zero (13) exit code from openshift/ruby-20-rhel7
Assigning back to myself to investigate further
Late today, Cesar and I did some debugging related to the NetworkMode not being set properly on containers created by the sti-builder pod; we found that those containers couldn't even ping the default router. Hopefully we'll have a fix to test tomorrow.
Seems like this is an SDN network configuration issue. After restarting the openshift-node service, openshift-sdn-kube-subnet-setup.sh is called and "ip route" shows the following, but this causes containers (started by "docker run" or spawned by the STI builder) to lose their network connection.

# ip route
default via 192.168.1.1 dev eth0
10.1.0.0/24 dev tun0 proto kernel scope link src 10.1.0.1
10.1.0.0/16 dev tun0 proto kernel scope link
169.254.0.0/16 dev eth0 scope link metric 1002
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.193

Checking all the steps of openshift-sdn-kube-subnet-setup.sh, the following lines cause the container to lose its network connection:

<-->
# delete the subnet routing entry created because of lbr0
ip route del ${subnet} dev lbr0 proto kernel scope link src ${subnet_gateway} || true
<-->

I am not sure if this is by design; what is the reason??? At least, if I comment out this line, ip route shows the following:

# ip route
default via 192.168.1.1 dev eth0
10.1.0.0/24 dev lbr0 proto kernel scope link src 10.1.0.1
10.1.0.0/24 dev tun0 proto kernel scope link src 10.1.0.1
10.1.0.0/16 dev tun0 proto kernel scope link
169.254.0.0/16 dev eth0 scope link metric 1002
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.193

Then containers can connect outside successfully.
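A minimal sketch of the workaround described in this comment, for anyone who does not want to edit the setup script itself: re-add the lbr0 subnet route that openshift-sdn-kube-subnet-setup.sh deletes. The subnet and gateway values are the ones from this environment and will differ elsewhere; run as root on the affected node.

```shell
# Values taken from this report's node (see the script variables above):
subnet=10.1.0.0/24
subnet_gateway=10.1.0.1

# Restore the subnet routing entry for lbr0 that the setup script removed;
# this is what made containers on lbr0 reachable again in the test above.
ip route add ${subnet} dev lbr0 proto kernel scope link src ${subnet_gateway}
```

Note this is a diagnosis aid, not the eventual fix; the route deletion may be intentional in the SDN design, which is exactly the open question in this comment.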
Rajat, assigning this one to you, given that in SDN environments this is a networking issue. Containers not started by openshift have no access to the outside network.
(In reply to Johnny Liu from comment #20)
I have tried that setup as well, and it brings more problems:
- deployments end with "dial tcp: i/o timeout"
- pods on the same node can't connect to each other (e.g. a build on node1 can't use the docker-registry on node1)
- liveness probes don't work (because curl from a node to its containers doesn't work)
So it won't work for us even as a workaround.
Fixed with https://github.com/openshift/origin/pull/2719 @johnny @xjia Could we ask you to test this before it is merged? Thanks.
It seems 3.0/2015-06-02.3 already merged the PR mentioned in comment 23, so I re-tested this bug with 3.0/2015-06-02.3; the container still can not get an outside network connection.
Adding more log info that may help your debugging. Node log messages:

Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + lock_file=/var/lock/openshift-sdn.lock
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + subnet_gateway=10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + subnet=10.1.0.0/24
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + cluster_subnet=10.1.0.0/16
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + subnet_mask_len=24
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + tun_gateway=10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + printf 'Container network is "%s"; local host has subnet "%s" and gateway "%s".\n' 10.1.0.0/16 10.1.0.0/24 10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: Container network is "10.1.0.0/16"; local host has subnet "10.1.0.0/24" and gateway "10.1.0.1".
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + TUN=tun0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + lockwrap setup
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + flock 200
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + setup
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + rm -f /etc/openshift-sdn/config.env
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl del-br br0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-br br0 -- set Bridge br0 fail-mode=secure
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl set bridge br0 protocols=OpenFlow13
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl del-port br0 vxlan0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ovs-vsctl: no port named vxlan0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + true
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-port br0 vxlan0 -- set Interface vxlan0 type=vxlan options:remote_ip=flow options:key=flow ofport_request=1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-port br0 tun0 -- set Interface tun0 type=internal ofport_request=2
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link del vlinuxbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link add vlinuxbr type veth peer name vovsbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vlinuxbr up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vovsbr up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vlinuxbr txqueuelen 0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vovsbr txqueuelen 0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl del-port br0 vovsbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ovs-vsctl: no port named vovsbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + true
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-port br0 vovsbr -- set Interface vovsbr ofport_request=9
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set lbr0 down
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + brctl delbr lbr0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + brctl addbr lbr0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip addr add 10.1.0.1/24 dev lbr0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set lbr0 up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + brctl addif lbr0 vlinuxbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip addr add 10.1.0.1/24 dev tun0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set tun0 up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip route add 10.1.0.0/16 dev tun0 proto kernel scope link
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -t nat -D POSTROUTING -s 10.1.0.0/16 '!' -d 10.1.0.0/16 -j MASQUERADE
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -t nat -A POSTROUTING -s 10.1.0.0/16 '!' -d 10.1.0.0/16 -j MASQUERADE
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -D INPUT -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -D INPUT -i tun0 -m comment --comment 'traffic from docker for internet' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ iptables -nvL INPUT --line-numbers
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ grep 'state RELATED,ESTABLISHED'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ awk '{print $1}'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + lineno=
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I INPUT -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I INPUT 1 -i tun0 -m comment --comment 'traffic from docker for internet' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ iptables -nvL FORWARD --line-numbers
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ grep 'reject-with icmp-host-prohibited'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ tail -n 1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ awk '{print $1}'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + fwd_lineno=
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I FORWARD -d 10.1.0.0/16 -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I FORWARD -s 10.1.0.0/16 -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + [[ -z '' ]]
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + DOCKER_NETWORK_OPTIONS='-b=lbr0 --mtu=1450'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + grep -q '^DOCKER_NETWORK_OPTIONS='\''-b=lbr0 --mtu=1450'\''' /etc/sysconfig/docker-network
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + systemctl daemon-reload
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + systemctl restart docker.service
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip route del 10.1.0.0/24 dev lbr0 proto kernel scope link src 10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + mkdir -p /etc/openshift-sdn
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + echo 'export OPENSHIFT_SDN_TAP1_ADDR=10.1.0.1'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + echo 'export OPENSHIFT_CLUSTER_SUBNET=10.1.0.0/16'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: I0603 16:15:50.534760 86703 kube.go:78] Output of adding table=0,cookie=0xac,priority=100,ip,nw_dst=10.1.1.0/...: (<nil>)
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: I0603 16:15:50.546071 86703 kube.go:80] Output of adding table=0,cookie=0xac,priority=100,arp,nw_dst=10.1.1.0...: (<nil>)
Jun 03 16:15:50 minion1.cluster.local systemd[1]: Started OpenShift Node.
# brctl show
bridge name     bridge id               STP enabled     interfaces
lbr0            8000.0eb084ed433d       no              vlinuxbr

# ovs-vsctl show
bc8dae8c-d22c-4dce-9d9b-10a816018729
    Bridge "br0"
        fail_mode: secure
        Port "veth2bc9197"
            Interface "veth2bc9197"
        Port "tun0"
            Interface "tun0"
                type: internal
        Port "veth347c6dc"
            Interface "veth347c6dc"
        Port "br0"
            Interface "br0"
                type: internal
        Port vovsbr
            Interface vovsbr
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {key=flow, remote_ip=flow}
        Port "veth5c96ddf"
            Interface "veth5c96ddf"
    ovs_version: "2.3.1-git3282e51"

# ip route
default via 192.168.1.1 dev eth0
10.1.0.0/24 dev tun0 proto kernel scope link src 10.1.0.1
10.1.0.0/16 dev tun0 proto kernel scope link
169.254.0.0/16 dev eth0 scope link metric 1002
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.193

# iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-PORTALS-CONTAINER all -- 0.0.0.0/0 0.0.0.0/0 /* handle Portals; NOTE: this must be before the NodePort rules */
KUBE-NODEPORT-CONTAINER all -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL /* handle service NodePorts; NOTE: this must be the last rule in the chain */

Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-PORTALS-HOST all -- 0.0.0.0/0 0.0.0.0/0 /* handle Portals; NOTE: this must be before the NodePort rules */
KUBE-NODEPORT-HOST all -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL /* handle service NodePorts; NOTE: this must be the last rule in the chain */
DOCKER all -- 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 10.1.0.0/24 0.0.0.0/0
MASQUERADE all -- 10.1.0.0/16 !10.1.0.0/16
MASQUERADE tcp -- 10.1.0.5 10.1.0.5 tcp dpt:1936
MASQUERADE tcp -- 10.1.0.5 10.1.0.5 tcp dpt:443
MASQUERADE tcp -- 10.1.0.5 10.1.0.5 tcp dpt:80

Chain DOCKER (1 references)
target prot opt source destination
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:1936 to:10.1.0.5:1936
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:443 to:10.1.0.5:443
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 to:10.1.0.5:80
I think this is actually an SDN issue. Two containers can't reach each other, and a container can't reach the host's gateway:

[root@ose3-master openshift-ansible]# docker run -it google/golang /bin/bash
root@e94e34575453:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
19: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
    link/ether 02:42:0a:01:00:04 brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.4/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe01:4/64 scope link
       valid_lft forever preferred_lft forever
root@e94e34575453:/# ping 10.1.0.1
PING 10.1.0.1 (10.1.0.1) 56(84) bytes of data.
64 bytes from 10.1.0.1: icmp_req=1 ttl=64 time=0.141 ms
^C
--- 10.1.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.141/0.141/0.141/0.000 ms
root@e94e34575453:/# ping 192.168.133.1
PING 192.168.133.1 (192.168.133.1) 56(84) bytes of data.
^C
--- 192.168.133.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

*** other container on different host ***
[root@ose3-node1 ~]# docker run -it google/golang /bin/bash
root@744ba0f177d9:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
15: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
    link/ether 02:42:0a:01:01:02 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.2/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe01:102/64 scope link
       valid_lft forever preferred_lft forever
root@744ba0f177d9:/# ping 10.1.0.4
PING 10.1.0.4 (10.1.0.4) 56(84) bytes of data.
^C
--- 10.1.0.4 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms
The following command is required on all nodes:

sysctl -w net.bridge.bridge-nf-call-iptables=0

It will eventually be fixed in ansible/vagrant. Being tracked here: https://github.com/detiber/openshift-ansible/issues/33
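For reference, `sysctl -w` only changes the running kernel, so the workaround above is lost on reboot. A sketch of how to also persist it, assuming the standard sysctl.d mechanism (the drop-in file name 99-openshift-sdn.conf is an example, not anything this bug prescribes):

```shell
# Apply the workaround immediately (run as root on every node):
sysctl -w net.bridge.bridge-nf-call-iptables=0

# Persist it across reboots via a sysctl drop-in file
# (file name is an example; any *.conf under /etc/sysctl.d works):
echo 'net.bridge.bridge-nf-call-iptables = 0' > /etc/sysctl.d/99-openshift-sdn.conf
sysctl -p /etc/sysctl.d/99-openshift-sdn.conf
```

Note this setting disables iptables processing for bridged traffic, which is why it interacts with the MASQUERADE/FORWARD rules shown in the node logs above.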
Version: 3.0/2015-06-03.2/
Verify: It can download successfully.

[xuan@master sample-app]$ osc build-logs ruby-sample-build-1 -n jia
I0603 21:55:02.909533 1 cfg.go:50] Problem accessing /root/.dockercfg: stat /root/.dockercfg: no such file or directory
I0603 21:55:02.911822 1 sti.go:67] Creating a new S2I builder with build request: api.Request{BaseImage:"openshift/ruby-20-centos7:latest", DockerConfig:(*api.DockerConfig)(0xc20803b680), DockerCfgPath:"", PullAuthentication:docker.AuthConfiguration{Username:"", Password:"", Email:"", ServerAddress:""}, PreserveWorkingDir:false, Source:"git://github.com/openshift/ruby-hello-world.git", Ref:"", Tag:"172.30.168.21:5000/jia/origin-ruby-sample:latest", Incremental:true, RemovePreviousImage:false, Environment:map[string]string{"OPENSHIFT_BUILD_NAME":"ruby-sample-build-1", "OPENSHIFT_BUILD_NAMESPACE":"jia", "OPENSHIFT_BUILD_SOURCE":"git://github.com/openshift/ruby-hello-world.git"}, CallbackURL:"", ScriptsURL:"", Location:"", ForcePull:false, WorkingDir:"", LayeredBuild:false, InstallDestination:"", Quiet:false, ContextDir:""}
I0603 21:55:02.915286 1 docker.go:170] Image openshift/ruby-20-centos7:latest available locally
I0603 21:55:02.918042 1 sti.go:73] Starting S2I build from jia/ruby-sample-build-1 BuildConfig ...
I0603 21:55:02.918084 1 sti.go:114] Building 172.30.168.21:5000/jia/origin-ruby-sample:latest
I0603 21:55:02.918740 1 clone.go:26] Cloning into /tmp/sti640436419/upload/src
I0603 21:55:05.204811 1 docker.go:170] Image openshift/ruby-20-centos7:latest available locally
I0603 21:55:05.204858 1 docker.go:222] Image contains STI_SCRIPTS_URL set to 'image:///usr/local/sti'
I0603 21:55:05.204907 1 download.go:55] Using image internal scripts from: image:///usr/local/sti/assemble
I0603 21:55:05.204923 1 download.go:55] Using image internal scripts from: image:///usr/local/sti/run
I0603 21:55:05.207682 1 docker.go:170] Image openshift/ruby-20-centos7:latest available locally
I0603 21:55:05.207702 1 docker.go:222] Image contains STI_SCRIPTS_URL set to 'image:///usr/local/sti'
I0603 21:55:05.207722 1 download.go:55] Using image internal scripts from: image:///usr/local/sti/save-artifacts
I0603 21:55:05.207741 1 sti.go:185] Using assemble from image:///usr/local/sti
I0603 21:55:05.207756 1 sti.go:185] Using run from image:///usr/local/sti
I0603 21:55:05.207766 1 sti.go:185] Using save-artifacts from image:///usr/local/sti
I0603 21:55:05.208856 1 sti.go:122] Clean build will be performed
I0603 21:55:05.208892 1 sti.go:125] Performing source build from git://github.com/openshift/ruby-hello-world.git
I0603 21:55:05.208906 1 sti.go:133] Building 172.30.168.21:5000/jia/origin-ruby-sample:latest
I0603 21:55:05.208921 1 sti.go:330] Using image name openshift/ruby-20-centos7:latest
I0603 21:55:05.208998 1 environment.go:52] Setting 'RACK_ENV' to 'production'
I0603 21:55:05.210884 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/.gitignore as src/.gitignore
I0603 21:55:05.211088 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/.sti/bin/README as src/.sti/bin/README
I0603 21:55:05.211183 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/.sti/environment as src/.sti/environment
I0603 21:55:05.211272 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Dockerfile as src/Dockerfile
I0603 21:55:05.211360 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Gemfile as src/Gemfile
I0603 21:55:05.211495 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Gemfile.lock as src/Gemfile.lock
I0603 21:55:05.211605 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/README.md as src/README.md
I0603 21:55:05.211697 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Rakefile as src/Rakefile
I0603 21:55:05.211790 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/app.rb as src/app.rb
I0603 21:55:05.211932 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/config/database.rb as src/config/database.rb
I0603 21:55:05.212036 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/config/database.yml as src/config/database.yml
I0603 21:55:05.214260 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/config.ru as src/config.ru
I0603 21:55:05.214494 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/db/migrate/20141102191902_create_key_pair.rb as src/db/migrate/20141102191902_create_key_pair.rb
I0603 21:55:05.214648 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/models.rb as src/models.rb
I0603 21:55:05.214764 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/run.sh as src/run.sh
I0603 21:55:05.214919 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/views/main.erb as src/views/main.erb
I0603 21:55:05.218025 1 docker.go:268] Base directory for STI scripts is '/usr/local/sti'. Untarring destination is '/tmp'.
I0603 21:55:05.218066 1 docker.go:294] Creating container using config: {Hostname: Domainname: User: Memory:0 MemorySwap:0 CPUShares:0 CPUSet: AttachStdin:false AttachStdout:true AttachStderr:false PortSpecs:[] ExposedPorts:map[] Tty:false OpenStdin:true StdinOnce:true Env:[RACK_ENV=production OPENSHIFT_BUILD_NAME=ruby-sample-build-1 OPENSHIFT_BUILD_NAMESPACE=jia OPENSHIFT_BUILD_SOURCE=git://github.com/openshift/ruby-hello-world.git] Cmd:[/bin/sh -c tar -C /tmp -xf - && /usr/local/sti/assemble] DNS:[] Image:openshift/ruby-20-centos7:latest Volumes:map[] VolumesFrom: WorkingDir: MacAddress: Entrypoint:[] NetworkDisabled:false SecurityOpts:[] OnBuild:[] Labels:map[]}
I0603 21:55:07.298492 1 docker.go:301] Attaching to container
I0603 21:55:07.305606 1 docker.go:354] Starting container
I0603 21:55:08.064878 1 docker.go:364] Waiting for container
I0603 21:55:09.147178 1 sti.go:392] ---> Installing application source
I0603 21:55:09.223986 1 sti.go:392] ---> Building your Ruby application from source
I0603 21:55:09.224019 1 sti.go:392] ---> Running 'bundle install --deployment'
I0603 21:55:14.053936 1 sti.go:392] Fetching gem metadata from https://rubygems.org/..........
I0603 21:55:17.846857 1 sti.go:392] Installing rake (10.3.2)
I0603 21:55:18.203740 1 sti.go:392] Installing i18n (0.6.11)
I0603 21:55:21.842195 1 sti.go:392] Installing json (1.8.1)
I0603 21:55:23.026363 1 sti.go:392] Installing minitest (5.4.2)
I0603 21:55:23.527062 1 sti.go:392] Installing thread_safe (0.3.4)
I0603 21:55:23.912921 1 sti.go:392] Installing tzinfo (1.2.2)
I0603 21:55:24.448167 1 sti.go:392] Installing activesupport (4.1.7)
I0603 21:55:24.664853 1 sti.go:392] Installing builder (3.2.2)
I0603 21:55:24.865857 1 sti.go:392] Installing activemodel (4.1.7)
I0603 21:55:25.125443 1 sti.go:392] Installing arel (5.0.1.20140414130214)
I0603 21:55:25.719118 1 sti.go:392] Installing activerecord (4.1.7)
I0603 21:55:31.490834 1 sti.go:392] Installing mysql2 (0.3.16)
I0603 21:55:32.685277 1 sti.go:392] Installing rack (1.5.2)
I0603 21:55:32.934857 1 sti.go:392] Installing rack-protection (1.5.3)
I0603 21:55:33.153033 1 sti.go:392] Installing tilt (1.4.1)
I0603 21:55:33.602519 1 sti.go:392] Installing sinatra (1.4.5)
I0603 21:55:33.728385 1 sti.go:392] Installing sinatra-activerecord (2.0.3)
I0603 21:55:33.728701 1 sti.go:392] Using bundler (1.3.5)
I0603 21:55:33.763013 1 sti.go:392] Your bundle is complete!
I0603 21:55:33.763054 1 sti.go:392] It was installed into ./bundle I0603 21:55:33.803384 1 sti.go:392] ---> Cleaning up unused ruby gems I0603 21:55:34.712984 1 docker.go:370] Container exited I0603 21:55:34.713013 1 docker.go:376] Invoking postExecution function I0603 21:55:34.713109 1 environment.go:52] Setting 'RACK_ENV' to 'production' I0603 21:55:34.713146 1 docker.go:408] Committing container with config: {Hostname: Domainname: User: Memory:0 MemorySwap:0 CPUShares:0 CPUSet: AttachStdin:false AttachStdout:false AttachStderr:false PortSpecs:[] ExposedPorts:map[] Tty:false OpenStdin:false StdinOnce:false Env:[RACK_ENV=production OPENSHIFT_BUILD_NAME=ruby-sample-build-1 OPENSHIFT_BUILD_NAMESPACE=jia OPENSHIFT_BUILD_SOURCE=git://github.com/openshift/ruby-hello-world.git] Cmd:[/usr/local/sti/run] DNS:[] Image: Volumes:map[] VolumesFrom: WorkingDir: MacAddress: Entrypoint:[] NetworkDisabled:false SecurityOpts:[] OnBuild:[] Labels:map[]} I0603 21:55:44.180502 1 sti.go:249] Successfully built 172.30.168.21:5000/jia/origin-ruby-sample:latest I0603 21:55:44.180540 1 sti.go:250] Tagged 811131ea1b64dd87c8fd5625b885f7d9ba6d91b61d98a25310d96b4fef12083a as 172.30.168.21:5000/jia/origin-ruby-sample:latest I0603 21:55:48.197948 1 cleanup.go:24] Removing temporary directory /tmp/sti640436419 I0603 21:55:48.198003 1 fs.go:99] Removing directory '/tmp/sti640436419' I0603 21:55:48.202826 1 cfg.go:50] Problem accessing /root/.dockercfg: stat /root/.dockercfg: no such file or directory I0603 21:55:48.202856 1 sti.go:92] Pushing 172.30.168.21:5000/jia/origin-ruby-sample:latest image ...