Bug 1225410
| Summary: | Fail to do STI build for sample-app example | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | xjia <xjia> |
| Component: | Build | Assignee: | Rajat Chopra <rchopra> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Johnny Liu <jialiu> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.0.0 | CC: | akostadi, cewong, ejacobs, jialiu, libra-bugs, maschmid, sdodson, tschloss, wzheng, xtian, zroubali |
| Target Milestone: | --- | Keywords: | TestBlocker |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openshift-0.5.2.2-0.git.19.8dc4a9a.el7ose.x86_64 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-11-23 14:43:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
xjia
2015-05-27 10:22:10 UTC
xjia - it looks like either rubygems.org is not accessible from the build container, or DNS lookup is not working. Can you please check that you can access rubygems.org from the node itself?

Yep, I have checked my environment. Both the node and the container can connect to "rubygems.org" (using curl -k https://rubygems.org), but I have no idea why it always fails to fetch data from rubygems.org. Is there any way that I could access your environment to troubleshoot?

It is not just downloading from rubygems.org; there is also a problem when trying to install dependencies in a container built from the Perl image:

I0528 03:42:56.500439 1 docker.go:354] Starting container
I0528 03:42:56.691054 1 docker.go:364] Waiting for container
I0528 03:42:56.907048 1 sti.go:392] ---> Installing application source
I0528 03:42:56.948766 1 sti.go:392] ---> Installing modules from cpanfile ...
E0528 03:43:37.563564 1 sti.go:418] ! Finding Module::CoreList on cpanmetadb failed.
E0528 03:43:37.563628 1 sti.go:418] ! Finding Module::CoreList on cpanmetadb failed.

I have a very similar issue with EAP STI. It seems that the openshift-master has to configure the docker containers (I can see that /etc/resolv.conf is different in containers created by openshift and in containers created manually using docker), but these settings are not passed to the subsequent STI builder (i.e. the EAP image). When I try to use STI, the ose-sti-builder downloads the sources (resolving the git URL correctly and accessing the server), but once the EAP part starts (in a new container with the EAP image), /etc/resolv.conf is in its default state and connections outside the container do not work (I have tried to resolve DNS and to access an IP address). It blocks xPaaS testing of Beta4.

In my scenarios, I did not see DNS resolver issues, but I still cannot fetch data from rubygems.org (the same behaviour as the initial report). I can see that the STI build container has the correct DNS resolver:

# cat /var/lib/docker/containers/71a1d75b1af017af4d306fa99832ecbdd0e6004f1e5759aaf6b091df951589e5/resolv.conf
nameserver 192.168.1.192
nameserver 10.11.5.19
search jialiu.cluster.local cluster.local openstacklocal cluster.local

71a1d75b1af017af4d306fa99832ecbdd0e6004f1e5759aaf6b091df951589e5 is the STI builder container UUID. Because the container is already terminated after the sti build failure, I have no chance to log into it to check the DNS setup. So I logged into the router/docker-registry docker container to check its connection to rubygems.org; I even tried "bundle install", just like what the sti build does, and everything went well.

# docker exec -t -i <router-container-ID> /bin/sh
sh-4.2# cat /etc/resolv.conf
nameserver 192.168.1.192
nameserver 10.11.5.19
search default.cluster.local cluster.local openstacklocal cluster.local

The same resolver order, just like what the sti build container has. 192.168.1.192 is the master where SkyDNS is running; 10.11.5.19 is the office network DNS resolver.

sh-4.2# dig @192.168.1.192 rubygems.org

; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.1 <<>> @192.168.1.192 rubygems.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 22025
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;rubygems.org. IN A

;; Query time: 2 msec
;; SERVER: 192.168.1.192#53(192.168.1.192)
;; WHEN: Fri May 29 04:52:32 EDT 2015
;; MSG SIZE rcvd: 30

sh-4.2# dig rubygems.org

; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.1 <<>> rubygems.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 20393
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;rubygems.org. IN A

;; Query time: 1 msec
;; SERVER: 192.168.1.192#53(192.168.1.192)
;; WHEN: Fri May 29 04:53:59 EDT 2015
;; MSG SIZE rcvd: 30

This is weird; it seems that, inside the container, the 2nd resolver is not used when the 1st resolver cannot resolve the name. But interestingly, I could run the following commands successfully without any change to /etc/resolv.conf in the container:

sh-4.2# yum install rubygem-bundler
sh-4.2# curl -k https://rubygems.org/
sh-4.2# gem install rack
sh-4.2# bundle install

I have additional information about how the build goes:
1) build is started
2) docker container from the ose-pod image is created
3) docker container from the ose-sti-builder image is created - this container has access to the network, is set up to use the container created in 2) as its NetworkMode, and downloads the sources from github
4) docker container from the eap-openshift image is created - this container is not using the container created in 2) as its NetworkMode, and has no access to the internet (either by IP or by DNS)
5) build fails

It seems that the sti builder is creating the docker container itself, which causes problems, because in the current setup, docker containers created by docker do not have access anywhere.

(In reply to Johnny Liu from comment #8)
Yes, when I connect to the router, everything works just fine (resolving DNS, downloading from github), but if I just create a docker container (e.g. docker run --rm -it fedora bash), it is unable to connect to the internet. By doing 'watch docker ps' I found out that a new container is started by the STI build. That container is created by docker run, so it has no access anywhere (verified, since the maven build takes time to fail).

(In reply to Tomas Schlosser from comment #10)
> (In reply to Johnny Liu from comment #8)
> Yes, when I connect to the router, everything works just fine (resolving DNS,
> downloading from github), but if I just create a docker container (e.g. docker
> run --rm -it fedora bash), it is unable to connect to the internet.
>
> By doing 'watch docker ps' I found out that a new container is started by the
> STI build. That container is created by docker run, so it has no access anywhere
> (verified, since the maven build takes time to fail).

Agree.

The issue is caused by the container run by the STI builder not getting the same nameserver configuration as the builder pod. The solution is to manually add the IP of the openshift master to each node's resolv.conf. We will print out a warning if this is not the case when starting the node. However, this needs to be done manually or by deploy scripts. I'll move this bug to ON_QA when documentation has been updated.

Jhon, assigning this one to you, since you are going to be working on the documentation.

(In reply to Cesar Wong from comment #12)
> The issue is caused by the container run by the STI builder not getting the
> same nameserver configuration as the builder pod.
>
> The solution is to manually add the ip of the openshift master to each
> node's resolv.conf
>
> We will print out a warning if this is not the case when starting the node.
> However, this needs to be done manually or by deploy scripts. I'll move this
> bug to ON_QA when documentation has been updated.

We just reconfigured skydns not to recurse. Are you adding only the master's IP to the resolv.conf of the containers spawned by the S2I builder? I think we'd need to be consistent with other pods and add the master first, then the node's nameservers after that. See https://github.com/openshift/origin/pull/2569 for discussion regarding disabling recursion in skydns.

(In reply to Cesar Wong from comment #12)
> The issue is caused by the container run by the STI builder not getting the
> same nameserver configuration as the builder pod.

I don't think this is a DNS issue; I have tried curl with an IP address, with the result "No route to host".

> The solution is to manually add the ip of the openshift master to each
> node's resolv.conf
>
> We will print out a warning if this is not the case when starting the node.
> However, this needs to be done manually or by deploy scripts. I'll move this
> bug to ON_QA when documentation has been updated.

I have tried adding the master to each node's resolv.conf. I tried adding it as the second nameserver (after the real-world one) as well as the first nameserver. None of these setups work and the STI build still fails. I have updated openshift to the latest version:

openshift-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
openshift-master-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
tuned-profiles-openshift-node-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
openshift-sdn-ovs-0.5.2.2-0.git.13.685a58e.el7ose.x86_64
openshift-node-0.5.2.2-0.git.13.685a58e.el7ose.x86_64

I went through the docker inspect output again, and it seems that the main difference between a manually run container and a container run by OSE is the NetworkMode. The container created by docker run has the NetworkMode set to "bridge", while the openshift-pod container (that serves the network to the sti-builder) has the NetworkMode set to "" (empty string). I didn't find a way to reproduce this using the docker run command, so I can't check if it would solve the problem.

(In reply to Cesar Wong from comment #12)
> The issue is caused by the container run by the STI builder not getting the
> same nameserver configuration as the builder pod.
>
> The solution is to manually add the ip of the openshift master to each
> node's resolv.conf
>
> We will print out a warning if this is not the case when starting the node.
> However, this needs to be done manually or by deploy scripts. I'll move this
> bug to ON_QA when documentation has been updated.

My behaviour is the same as what is described in comment 15; it seems that adding the master IP to the node's resolv.conf does not resolve this issue. Here are my testing steps:

1. openshift version
# openshift version
openshift v0.5.2.2-14-gef0f6ad
kubernetes v0.17.1-804-g496be63
# docker images|grep sti
docker-buildvm-rhose.usersys.redhat.com:5000/openshift3_beta/ose-sti-image-builder v0.5.2.2 ad33ee97468d 3 days ago 445.4 MB
docker-buildvm-rhose.usersys.redhat.com:5000/openshift3_beta/ose-sti-builder v0.5.2.2 63a3596cbba6 3 days ago 289.1 MB

2. According to comment 12, add the master IP to /etc/resolv.conf on the nodes.
# cat /etc/resolv.conf
; generated by /usr/sbin/dhclient-script
search openstacklocal cluster.local
nameserver 192.168.1.192
nameserver 10.11.5.19

3. Trigger an sti build.

4. During the sti build, the following docker container is spawned. (NOTE: this container does not have an associated ose-pod container)
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b964e9349b4d openshift/ruby-20-rhel7:latest "/bin/sh -c 'tar -C 3 seconds ago Up 2 seconds 8080/tcp loving_pike

5. Before this container is terminated, check its resolv.conf to make sure the master IP is there.
# docker inspect b964e9349b4d|grep res
"CpuShares": 0,
"MacAddress": "",
"CpuShares": 0,
"GlobalIPv6Address": "",
"IPAddress": "10.1.0.8",
"LinkLocalIPv6Address": "fe80::42:aff:fe1:8",
"MacAddress": "02:42:0a:01:00:08",
"ResolvConfPath": "/var/lib/docker/containers/b964e9349b4dbfa6efc4687db959d08e8cedfef667ab42782fbb51560bf87138/resolv.conf",
# cat /var/lib/docker/containers/b964e9349b4dbfa6efc4687db959d08e8cedfef667ab42782fbb51560bf87138/resolv.conf
; generated by /usr/sbin/dhclient-script
search openstacklocal cluster.local
nameserver 192.168.1.192
nameserver 10.11.5.19

But there is still the same error in the build log:

$ osc build-logs ruby-sample-build-1
Switched to a new branch 'beta3'
Branch beta3 set up to track remote branch beta3 from origin.
I0601 02:16:57.711274 1 sti.go:392] ---> Installing application source
I0601 02:16:57.722962 1 sti.go:392] ---> Building your Ruby application from source
I0601 02:16:57.723158 1 sti.go:392] ---> Running 'bundle install --deployment'
I0601 02:17:38.268773 1 sti.go:392] Fetching source index from https://rubygems.org/
I0601 02:18:18.333187 1 sti.go:392] Could not fetch specs from https://rubygems.org/
F0601 02:18:19.510684 1 builder.go:75] Build error: non-zero (13) exit code from openshift/ruby-20-rhel7

Assigning back to myself to investigate further.

Late today Cesar and I did some debugging, and we hope to test a fix tomorrow related to the NetworkMode not being set properly on containers created by the sti-builder pod; we found that those containers couldn't even ping the default router. Hopefully we'll have a fix tomorrow.

It seems like this is an sdn network configuration issue. After restarting the openshift-node service, openshift-sdn-kube-subnet-setup.sh is called, and "ip route" shows the following. But this causes containers (started by "docker run" or spawned by the sti builder) to lose their network connection.
# ip route
default via 192.168.1.1 dev eth0
10.1.0.0/24 dev tun0 proto kernel scope link src 10.1.0.1
10.1.0.0/16 dev tun0 proto kernel scope link
169.254.0.0/16 dev eth0 scope link metric 1002
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.193
Checking all the steps of openshift-sdn-kube-subnet-setup.sh, the following line causes the container to lose its network connection:
<-->
# delete the subnet routing entry created because of lbr0
ip route del ${subnet} dev lbr0 proto kernel scope link src ${subnet_gateway} || true
<-->
I am not sure if this is by design; what is the reason?
At least, if I comment out this line, ip route shows the following:
# ip route
default via 192.168.1.1 dev eth0
10.1.0.0/24 dev lbr0 proto kernel scope link src 10.1.0.1
10.1.0.0/24 dev tun0 proto kernel scope link src 10.1.0.1
10.1.0.0/16 dev tun0 proto kernel scope link
169.254.0.0/16 dev eth0 scope link metric 1002
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.193
Then the container can connect outside successfully.
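The missing-route symptom described above lends itself to a quick scripted check. The following is a hypothetical diagnostic sketch, not part of the original report: the helper name `check_lbr0_route` is invented here, and the 10.1.0.0/24 subnet is taken from this environment.

```shell
#!/bin/sh
# check_lbr0_route: read `ip route` output on stdin and succeed only if the
# given pod subnet is routed via the lbr0 bridge (the route that
# openshift-sdn-kube-subnet-setup.sh deletes). Hypothetical helper, not part
# of OpenShift; dots in the subnet are matched loosely by grep.
check_lbr0_route() {
    subnet="$1"
    grep -q "^${subnet} dev lbr0 " -
}

# Example against the broken routing table quoted in this bug:
broken="default via 192.168.1.1 dev eth0
10.1.0.0/24 dev tun0 proto kernel scope link src 10.1.0.1"

if printf '%s\n' "$broken" | check_lbr0_route "10.1.0.0/24"; then
    echo "lbr0 route present"
else
    echo "lbr0 route missing"    # this branch runs for the broken table
fi
```

On an affected node this would be run against live output, e.g. `ip route | check_lbr0_route 10.1.0.0/24`.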
Rajat, assigning this one to you, given that in SDN environments this is a networking issue. Containers not started by openshift have no access to the outside network.

(In reply to Johnny Liu from comment #20)
I have tried that setup as well, and it brings these problems:
- deployments end with "dial tcp: i/o timeout"
- pods on the same node can't connect to each other (e.g. a build on node1 can't use the docker-registry on node1)
- liveness probes don't work (because curl from a node to its containers doesn't work)

So it won't work for us even as a workaround.

Fixed with https://github.com/openshift/origin/pull/2719

@johnny @xjia Could we ask you to test this before it is merged? Thanks.

It seems 3.0/2015-06-02.3 already merged the PR mentioned in comment 23, so I re-tested this bug with 3.0/2015-06-02.3; the container still cannot get an outside network connection. Adding more log info that I hope could help your debugging.
Node log message:
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + lock_file=/var/lock/openshift-sdn.lock
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + subnet_gateway=10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + subnet=10.1.0.0/24
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + cluster_subnet=10.1.0.0/16
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + subnet_mask_len=24
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + tun_gateway=10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + printf 'Container network is "%s"; local host has subnet "%s" and gateway "%s".\n' 10.1.0.0/16 10.1.0.0/24 10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: Container network is "10.1.0.0/16"; local host has subnet "10.1.0.0/24" and gateway "10.1.0.1".
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + TUN=tun0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + lockwrap setup
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + flock 200
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + setup
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + rm -f /etc/openshift-sdn/config.env
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl del-br br0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-br br0 -- set Bridge br0 fail-mode=secure
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl set bridge br0 protocols=OpenFlow13
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl del-port br0 vxlan0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ovs-vsctl: no port named vxlan0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + true
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-port br0 vxlan0 -- set Interface vxlan0 type=vxlan options:remote_ip=flow options:key=flow ofport_request=1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-port br0 tun0 -- set Interface tun0 type=internal ofport_request=2
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link del vlinuxbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link add vlinuxbr type veth peer name vovsbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vlinuxbr up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vovsbr up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vlinuxbr txqueuelen 0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set vovsbr txqueuelen 0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl del-port br0 vovsbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ovs-vsctl: no port named vovsbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + true
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ovs-vsctl add-port br0 vovsbr -- set Interface vovsbr ofport_request=9
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set lbr0 down
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + brctl delbr lbr0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + brctl addbr lbr0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip addr add 10.1.0.1/24 dev lbr0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set lbr0 up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + brctl addif lbr0 vlinuxbr
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip addr add 10.1.0.1/24 dev tun0
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip link set tun0 up
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip route add 10.1.0.0/16 dev tun0 proto kernel scope link
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -t nat -D POSTROUTING -s 10.1.0.0/16 '!' -d 10.1.0.0/16 -j MASQUERADE
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -t nat -A POSTROUTING -s 10.1.0.0/16 '!' -d 10.1.0.0/16 -j MASQUERADE
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -D INPUT -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -D INPUT -i tun0 -m comment --comment 'traffic from docker for internet' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ iptables -nvL INPUT --line-numbers
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ grep 'state RELATED,ESTABLISHED'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ awk '{print $1}'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + lineno=
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I INPUT -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I INPUT 1 -i tun0 -m comment --comment 'traffic from docker for internet' -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ iptables -nvL FORWARD --line-numbers
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ grep 'reject-with icmp-host-prohibited'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ tail -n 1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: ++ awk '{print $1}'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + fwd_lineno=
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I FORWARD -d 10.1.0.0/16 -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + iptables -I FORWARD -s 10.1.0.0/16 -j ACCEPT
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + [[ -z '' ]]
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + DOCKER_NETWORK_OPTIONS='-b=lbr0 --mtu=1450'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + grep -q '^DOCKER_NETWORK_OPTIONS='\''-b=lbr0 --mtu=1450'\''' /etc/sysconfig/docker-network
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + systemctl daemon-reload
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + systemctl restart docker.service
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + ip route del 10.1.0.0/24 dev lbr0 proto kernel scope link src 10.1.0.1
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + mkdir -p /etc/openshift-sdn
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + echo 'export OPENSHIFT_SDN_TAP1_ADDR=10.1.0.1'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: + echo 'export OPENSHIFT_CLUSTER_SUBNET=10.1.0.0/16'
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: I0603 16:15:50.534760 86703 kube.go:78] Output of adding table=0,cookie=0xac,priority=100,ip,nw_dst=10.1.1.0/...: (<nil>)
Jun 03 16:15:50 minion1.cluster.local openshift-node[86703]: I0603 16:15:50.546071 86703 kube.go:80] Output of adding table=0,cookie=0xac,priority=100,arp,nw_dst=10.1.1.0...: (<nil>)
Jun 03 16:15:50 minion1.cluster.local systemd[1]: Started OpenShift Node.
# brctl show
bridge name bridge id STP enabled interfaces
lbr0 8000.0eb084ed433d no vlinuxbr
# ovs-vsctl show
bc8dae8c-d22c-4dce-9d9b-10a816018729
Bridge "br0"
fail_mode: secure
Port "veth2bc9197"
Interface "veth2bc9197"
Port "tun0"
Interface "tun0"
type: internal
Port "veth347c6dc"
Interface "veth347c6dc"
Port "br0"
Interface "br0"
type: internal
Port vovsbr
Interface vovsbr
Port "vxlan0"
Interface "vxlan0"
type: vxlan
options: {key=flow, remote_ip=flow}
Port "veth5c96ddf"
Interface "veth5c96ddf"
ovs_version: "2.3.1-git3282e51"
# ip route
default via 192.168.1.1 dev eth0
10.1.0.0/24 dev tun0 proto kernel scope link src 10.1.0.1
10.1.0.0/16 dev tun0 proto kernel scope link
169.254.0.0/16 dev eth0 scope link metric 1002
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.193
# iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
KUBE-PORTALS-CONTAINER all -- 0.0.0.0/0 0.0.0.0/0 /* handle Portals; NOTE: this must be before the NodePort rules */
KUBE-NODEPORT-CONTAINER all -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL /* handle service NodePorts; NOTE: this must be the last rule in the chain */
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-PORTALS-HOST all -- 0.0.0.0/0 0.0.0.0/0 /* handle Portals; NOTE: this must be before the NodePort rules */
KUBE-NODEPORT-HOST all -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL /* handle service NodePorts; NOTE: this must be the last rule in the chain */
DOCKER all -- 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 10.1.0.0/24 0.0.0.0/0
MASQUERADE all -- 10.1.0.0/16 !10.1.0.0/16
MASQUERADE tcp -- 10.1.0.5 10.1.0.5 tcp dpt:1936
MASQUERADE tcp -- 10.1.0.5 10.1.0.5 tcp dpt:443
MASQUERADE tcp -- 10.1.0.5 10.1.0.5 tcp dpt:80
Chain DOCKER (1 references)
target prot opt source destination
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:1936 to:10.1.0.5:1936
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:443 to:10.1.0.5:443
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 to:10.1.0.5:80
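For completeness, here is a hypothetical check (not from the original report) that the SDN's cluster MASQUERADE rule is present in NAT-table output such as the above; the helper name `has_cluster_masquerade` is invented, and 10.1.0.0/16 is the cluster subnet from this environment.

```shell
#!/bin/sh
# has_cluster_masquerade: read `iptables -L -n -t nat` output on stdin and
# succeed only if the cluster-subnet MASQUERADE rule is present.
# Hypothetical helper; 10.1.0.0/16 is the cluster subnet from this report.
has_cluster_masquerade() {
    grep -Eq '^MASQUERADE +all +-- +10\.1\.0\.0/16 +!10\.1\.0\.0/16' -
}

# Example against the POSTROUTING lines quoted above:
printf 'MASQUERADE  all  --  10.1.0.0/16  !10.1.0.0/16\n' \
    | has_cluster_masquerade && echo "cluster MASQUERADE rule present"
```

Note that, as this bug shows, the rule being present is necessary but not sufficient: bridged traffic can still be dropped elsewhere.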
I think this is actually an SDN issue. Two containers can't reach each other, and the container can't reach the host's gateway:
[root@ose3-master openshift-ansible]# docker run -it google/golang /bin/bash
root@e94e34575453:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
19: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
link/ether 02:42:0a:01:00:04 brd ff:ff:ff:ff:ff:ff
inet 10.1.0.4/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe01:4/64 scope link
valid_lft forever preferred_lft forever
root@e94e34575453:/# ping 10.1.0.1
PING 10.1.0.1 (10.1.0.1) 56(84) bytes of data.
64 bytes from 10.1.0.1: icmp_req=1 ttl=64 time=0.141 ms
^C
--- 10.1.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.141/0.141/0.141/0.000 ms
root@e94e34575453:/# ping 192.168.133.1
PING 192.168.133.1 (192.168.133.1) 56(84) bytes of data.
^C
--- 192.168.133.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms
*** other container on different host***
[root@ose3-node1 ~]# docker run -it google/golang /bin/bash
root@744ba0f177d9:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
15: eth0: <BROADCAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
link/ether 02:42:0a:01:01:02 brd ff:ff:ff:ff:ff:ff
inet 10.1.1.2/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe01:102/64 scope link
valid_lft forever preferred_lft forever
root@744ba0f177d9:/# ping 10.1.0.4
PING 10.1.0.4 (10.1.0.4) 56(84) bytes of data.
^C
--- 10.1.0.4 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms
The following command is required on all nodes:

sysctl -w net.bridge.bridge-nf-call-iptables=0

This will eventually get fixed in ansible/vagrant. Being tracked here: https://github.com/detiber/openshift-ansible/issues/33

Version:
3.0/2015-06-03.2/
Verify:
It can download successfully.
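As a side note on the sysctl workaround given earlier: `sysctl -w` does not survive a reboot. A minimal sketch of persisting the setting follows; the drop-in filename is an assumption, not something specified in this bug.

```shell
# /etc/sysctl.d/99-bridge-nf.conf -- hypothetical drop-in filename
# Persists the workaround from this bug across reboots; apply immediately
# with `sysctl --system`. The bridge netfilter module must be loaded for
# this key to exist.
net.bridge.bridge-nf-call-iptables = 0
```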
[xuan@master sample-app]$ osc build-logs ruby-sample-build-1 -n jia
I0603 21:55:02.909533 1 cfg.go:50] Problem accessing /root/.dockercfg: stat /root/.dockercfg: no such file or directory
I0603 21:55:02.911822 1 sti.go:67] Creating a new S2I builder with build request: api.Request{BaseImage:"openshift/ruby-20-centos7:latest", DockerConfig:(*api.DockerConfig)(0xc20803b680), DockerCfgPath:"", PullAuthentication:docker.AuthConfiguration{Username:"", Password:"", Email:"", ServerAddress:""}, PreserveWorkingDir:false, Source:"git://github.com/openshift/ruby-hello-world.git", Ref:"", Tag:"172.30.168.21:5000/jia/origin-ruby-sample:latest", Incremental:true, RemovePreviousImage:false, Environment:map[string]string{"OPENSHIFT_BUILD_NAME":"ruby-sample-build-1", "OPENSHIFT_BUILD_NAMESPACE":"jia", "OPENSHIFT_BUILD_SOURCE":"git://github.com/openshift/ruby-hello-world.git"}, CallbackURL:"", ScriptsURL:"", Location:"", ForcePull:false, WorkingDir:"", LayeredBuild:false, InstallDestination:"", Quiet:false, ContextDir:""}
I0603 21:55:02.915286 1 docker.go:170] Image openshift/ruby-20-centos7:latest available locally
I0603 21:55:02.918042 1 sti.go:73] Starting S2I build from jia/ruby-sample-build-1 BuildConfig ...
I0603 21:55:02.918084 1 sti.go:114] Building 172.30.168.21:5000/jia/origin-ruby-sample:latest
I0603 21:55:02.918740 1 clone.go:26] Cloning into /tmp/sti640436419/upload/src
I0603 21:55:05.204811 1 docker.go:170] Image openshift/ruby-20-centos7:latest available locally
I0603 21:55:05.204858 1 docker.go:222] Image contains STI_SCRIPTS_URL set to 'image:///usr/local/sti'
I0603 21:55:05.204907 1 download.go:55] Using image internal scripts from: image:///usr/local/sti/assemble
I0603 21:55:05.204923 1 download.go:55] Using image internal scripts from: image:///usr/local/sti/run
I0603 21:55:05.207682 1 docker.go:170] Image openshift/ruby-20-centos7:latest available locally
I0603 21:55:05.207702 1 docker.go:222] Image contains STI_SCRIPTS_URL set to 'image:///usr/local/sti'
I0603 21:55:05.207722 1 download.go:55] Using image internal scripts from: image:///usr/local/sti/save-artifacts
I0603 21:55:05.207741 1 sti.go:185] Using assemble from image:///usr/local/sti
I0603 21:55:05.207756 1 sti.go:185] Using run from image:///usr/local/sti
I0603 21:55:05.207766 1 sti.go:185] Using save-artifacts from image:///usr/local/sti
I0603 21:55:05.208856 1 sti.go:122] Clean build will be performed
I0603 21:55:05.208892 1 sti.go:125] Performing source build from git://github.com/openshift/ruby-hello-world.git
I0603 21:55:05.208906 1 sti.go:133] Building 172.30.168.21:5000/jia/origin-ruby-sample:latest
I0603 21:55:05.208921 1 sti.go:330] Using image name openshift/ruby-20-centos7:latest
I0603 21:55:05.208998 1 environment.go:52] Setting 'RACK_ENV' to 'production'
I0603 21:55:05.210884 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/.gitignore as src/.gitignore
I0603 21:55:05.211088 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/.sti/bin/README as src/.sti/bin/README
I0603 21:55:05.211183 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/.sti/environment as src/.sti/environment
I0603 21:55:05.211272 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Dockerfile as src/Dockerfile
I0603 21:55:05.211360 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Gemfile as src/Gemfile
I0603 21:55:05.211495 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Gemfile.lock as src/Gemfile.lock
I0603 21:55:05.211605 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/README.md as src/README.md
I0603 21:55:05.211697 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/Rakefile as src/Rakefile
I0603 21:55:05.211790 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/app.rb as src/app.rb
I0603 21:55:05.211932 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/config/database.rb as src/config/database.rb
I0603 21:55:05.212036 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/config/database.yml as src/config/database.yml
I0603 21:55:05.214260 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/config.ru as src/config.ru
I0603 21:55:05.214494 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/db/migrate/20141102191902_create_key_pair.rb as src/db/migrate/20141102191902_create_key_pair.rb
I0603 21:55:05.214648 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/models.rb as src/models.rb
I0603 21:55:05.214764 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/run.sh as src/run.sh
I0603 21:55:05.214919 1 tar.go:133] Adding to tar: /tmp/sti640436419/upload/src/views/main.erb as src/views/main.erb
I0603 21:55:05.218025 1 docker.go:268] Base directory for STI scripts is '/usr/local/sti'. Untarring destination is '/tmp'.
I0603 21:55:05.218066 1 docker.go:294] Creating container using config: {Hostname: Domainname: User: Memory:0 MemorySwap:0 CPUShares:0 CPUSet: AttachStdin:false AttachStdout:true AttachStderr:false PortSpecs:[] ExposedPorts:map[] Tty:false OpenStdin:true StdinOnce:true Env:[RACK_ENV=production OPENSHIFT_BUILD_NAME=ruby-sample-build-1 OPENSHIFT_BUILD_NAMESPACE=jia OPENSHIFT_BUILD_SOURCE=git://github.com/openshift/ruby-hello-world.git] Cmd:[/bin/sh -c tar -C /tmp -xf - && /usr/local/sti/assemble] DNS:[] Image:openshift/ruby-20-centos7:latest Volumes:map[] VolumesFrom: WorkingDir: MacAddress: Entrypoint:[] NetworkDisabled:false SecurityOpts:[] OnBuild:[] Labels:map[]}
I0603 21:55:07.298492 1 docker.go:301] Attaching to container
I0603 21:55:07.305606 1 docker.go:354] Starting container
I0603 21:55:08.064878 1 docker.go:364] Waiting for container
I0603 21:55:09.147178 1 sti.go:392] ---> Installing application source
I0603 21:55:09.223986 1 sti.go:392] ---> Building your Ruby application from source
I0603 21:55:09.224019 1 sti.go:392] ---> Running 'bundle install --deployment'
I0603 21:55:14.053936 1 sti.go:392] Fetching gem metadata from https://rubygems.org/..........
I0603 21:55:17.846857 1 sti.go:392] Installing rake (10.3.2)
I0603 21:55:18.203740 1 sti.go:392] Installing i18n (0.6.11)
I0603 21:55:21.842195 1 sti.go:392] Installing json (1.8.1)
I0603 21:55:23.026363 1 sti.go:392] Installing minitest (5.4.2)
I0603 21:55:23.527062 1 sti.go:392] Installing thread_safe (0.3.4)
I0603 21:55:23.912921 1 sti.go:392] Installing tzinfo (1.2.2)
I0603 21:55:24.448167 1 sti.go:392] Installing activesupport (4.1.7)
I0603 21:55:24.664853 1 sti.go:392] Installing builder (3.2.2)
I0603 21:55:24.865857 1 sti.go:392] Installing activemodel (4.1.7)
I0603 21:55:25.125443 1 sti.go:392] Installing arel (5.0.1.20140414130214)
I0603 21:55:25.719118 1 sti.go:392] Installing activerecord (4.1.7)
I0603 21:55:31.490834 1 sti.go:392] Installing mysql2 (0.3.16)
I0603 21:55:32.685277 1 sti.go:392] Installing rack (1.5.2)
I0603 21:55:32.934857 1 sti.go:392] Installing rack-protection (1.5.3)
I0603 21:55:33.153033 1 sti.go:392] Installing tilt (1.4.1)
I0603 21:55:33.602519 1 sti.go:392] Installing sinatra (1.4.5)
I0603 21:55:33.728385 1 sti.go:392] Installing sinatra-activerecord (2.0.3)
I0603 21:55:33.728701 1 sti.go:392] Using bundler (1.3.5)
I0603 21:55:33.763013 1 sti.go:392] Your bundle is complete!
I0603 21:55:33.763054 1 sti.go:392] It was installed into ./bundle
I0603 21:55:33.803384 1 sti.go:392] ---> Cleaning up unused ruby gems
I0603 21:55:34.712984 1 docker.go:370] Container exited
I0603 21:55:34.713013 1 docker.go:376] Invoking postExecution function
I0603 21:55:34.713109 1 environment.go:52] Setting 'RACK_ENV' to 'production'
I0603 21:55:34.713146 1 docker.go:408] Committing container with config: {Hostname: Domainname: User: Memory:0 MemorySwap:0 CPUShares:0 CPUSet: AttachStdin:false AttachStdout:false AttachStderr:false PortSpecs:[] ExposedPorts:map[] Tty:false OpenStdin:false StdinOnce:false Env:[RACK_ENV=production OPENSHIFT_BUILD_NAME=ruby-sample-build-1 OPENSHIFT_BUILD_NAMESPACE=jia OPENSHIFT_BUILD_SOURCE=git://github.com/openshift/ruby-hello-world.git] Cmd:[/usr/local/sti/run] DNS:[] Image: Volumes:map[] VolumesFrom: WorkingDir: MacAddress: Entrypoint:[] NetworkDisabled:false SecurityOpts:[] OnBuild:[] Labels:map[]}
I0603 21:55:44.180502 1 sti.go:249] Successfully built 172.30.168.21:5000/jia/origin-ruby-sample:latest
I0603 21:55:44.180540 1 sti.go:250] Tagged 811131ea1b64dd87c8fd5625b885f7d9ba6d91b61d98a25310d96b4fef12083a as 172.30.168.21:5000/jia/origin-ruby-sample:latest
I0603 21:55:48.197948 1 cleanup.go:24] Removing temporary directory /tmp/sti640436419
I0603 21:55:48.198003 1 fs.go:99] Removing directory '/tmp/sti640436419'
I0603 21:55:48.202826 1 cfg.go:50] Problem accessing /root/.dockercfg: stat /root/.dockercfg: no such file or directory
I0603 21:55:48.202856 1 sti.go:92] Pushing 172.30.168.21:5000/jia/origin-ruby-sample:latest image ...