Bug 1459193 - Service is unreachable on the newly added node while manually scaling up nodes in Flannel network mode.
Status: CLOSED CURRENTRELEASE
Product: OpenShift Container Platform
Classification: Red Hat
Component: Reference Architecture
Version: 3.4.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assigned To: Mark Lamourine
QA Contact: Gan Huang
Reported: 2017-06-06 09:55 EDT by Gan Huang
Modified: 2017-07-03 14:48 EDT
Last Closed: 2017-07-03 14:48:56 EDT
Type: Bug


Description Gan Huang 2017-06-06 09:55:45 EDT
Description of problem:
An S2I build failed on the newly added node after manually scaling up nodes, because the docker-registry service was unreachable from that node.

Version-Release number of selected component (if applicable):
openshift-heat-templates-0.9.9-2.el7ost.noarch
openshift-ansible-3.4.89-1.git.0.ac29ce8.el7.noarch
flannel-0.7.0-1.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create an openshift-on-openstack stack using the following parameters template:

https://github.com/ganhuang/shell-learning/blob/master/ocp-on-osp-scritps/ocp34-on-osp10/ocp-templates/ha-master-dedicated-flannel.yaml

2. Manually edit `node_count` to 2, and update the stack

3. After scaling completes, trigger an S2I build and make sure the app is assigned to the newly added node

Actual results:
# oc build-logs dancer-mysql-example-1
<--snip-->
Pushing image 172.30.10.181:5000/ghuang-test/dancer-mysql-example:latest ...
Registry server Address: 
Registry server User Name: serviceaccount
Registry server Email: serviceaccount@example.org
Registry server Password: <<non-empty>>
error: build error: Failed to push image: Put http://172.30.10.181:5000/v1/repositories/ghuang-test/dancer-mysql-example/: dial tcp 172.30.10.181:5000: getsockopt: no route to host


[root@ha-master-dedicated-flannel-master-0 ~]# oc get po -o wide --all-namespaces=true
<--snip-->
ghuang-test    dancer-mysql-example-1-build    0/1       Error       0          43m       172.30.37.2     ha-master-dedicated-flannel-node-5m6m3589.ocp3.ghuang.com
<--snip-->
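
To confirm that the build pod really landed on the newly added node (step 3), the wide pod listing can be filtered by node name. The following is a minimal sketch, assuming the eight-column layout seen in the row above (NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE); the helper names `parse_wide_rows` and `pods_on_node` are hypothetical and not part of `oc`.

```python
from typing import List, NamedTuple


class PodRow(NamedTuple):
    namespace: str
    name: str
    status: str
    node: str


def parse_wide_rows(text: str) -> List[PodRow]:
    """Parse data rows of `oc get po -o wide --all-namespaces=true` output.

    Assumption: eight whitespace-separated columns per data row
    (NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE).
    The header line and lines with any other field count are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or fields[0] == "NAMESPACE":
            continue
        rows.append(PodRow(fields[0], fields[1], fields[3], fields[7]))
    return rows


def pods_on_node(text: str, node: str) -> List[PodRow]:
    """Return only the pods scheduled on the given node."""
    return [row for row in parse_wide_rows(text) if row.node == node]
```

Feeding it the listing above with the new node's hostname should return the `dancer-mysql-example-1-build` row with status `Error`.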

After logging in to the newly added node, curl against the docker-registry service fails:
[cloud-user@ha-master-dedicated-flannel-node-5m6m3589 ~]$ curl 172.30.10.181:5000                                                       
curl: (7) Failed connect to 172.30.10.181:5000; No route to host

The same request works on the other nodes.
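
The curl check can be repeated on every node with a small probe that separates "no route to host" from a refused connection or a timeout. This is only a sketch: the function name `probe_tcp` and the returned strings are made up for illustration, and it uses nothing beyond the Python standard library.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and classify the outcome.

    Returns "ok", "timeout", "refused", "no route to host",
    or "error: <message>" for anything else.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is the errno behind curl's "No route to host".
        if exc.errno == errno.EHOSTUNREACH:
            return "no route to host"
        return "error: %s" % exc
```

For example, `probe_tcp("172.30.10.181", 5000)` run on each node would show which nodes can reach the registry service address quoted above.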

Expected results:
The S2I build should succeed in all cases.

Additional info:
Comment 1 Mark Lamourine 2017-06-16 07:40:00 EDT
The scaleup playbook does not include two rules that are needed to complete the firewall configuration. These rules are present in the deployment playbook but not in the scaleup playbook.

Adding these two rules, conditional on sdn_flannel, should resolve this.

+  - name: Set up masquerading on flannel interface
+    shell: iptables -t nat -A POSTROUTING -o {{ flannel_interface }} -j MASQUERADE
+
+  - name: Make iptables rules permanent
+    shell: /usr/libexec/iptables/iptables.init save
+
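
Whether a node already has the masquerade rule can be checked from the output of `iptables -t nat -S POSTROUTING`, which prints rules in the same `-A POSTROUTING ...` form used in the task above. The sketch below only parses text that has already been collected; the function names are hypothetical, and the interface name is whatever `flannel_interface` resolves to in the deployment (`eth1` in the usage note is an assumed example, not taken from this bug).

```python
from typing import List, Optional


def _option(tokens: List[str], flag: str) -> Optional[str]:
    """Return the value that follows `flag`, or None if it is absent."""
    if flag in tokens:
        index = tokens.index(flag)
        if index + 1 < len(tokens):
            return tokens[index + 1]
    return None


def has_masquerade_rule(rules_text: str, interface: str) -> bool:
    """Look for `-A POSTROUTING -o <interface> -j MASQUERADE` in rule text.

    `rules_text` is expected to be the output of
    `iptables -t nat -S POSTROUTING`, one rule per line.
    """
    for line in rules_text.splitlines():
        tokens = line.split()
        if tokens[:2] != ["-A", "POSTROUTING"]:
            continue  # policy lines (-P) and other chains are ignored
        if (_option(tokens, "-o") == interface
                and _option(tokens, "-j") == "MASQUERADE"):
            return True
    return False
```

On a node where `has_masquerade_rule(output, "eth1")` is false, the first task has presumably not been applied, which would match the behaviour seen on the scaled-up node.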
Comment 2 Mark Lamourine 2017-06-30 07:48:51 EDT
Fixed upstream:

https://github.com/redhat-openstack/openshift-on-openstack/commit/9b9f90f44bb9d032ee11e7dcf7ad30370ffdd10a

Creating a package for testing.
Comment 3 Mark Lamourine 2017-07-02 19:35:39 EDT
Fixed, and the OCP version corrected, in this RPM build:

https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=570510
Comment 4 Gan Huang 2017-07-03 02:30:49 EDT
Verified with openshift-heat-templates-0.9.9-5.el7ost.noarch

Manual scaling succeeds with the Flannel network enabled. Services can be accessed successfully from both the newly added node and the existing node.

# openshift version
openshift v3.4.1.37
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

# rpm -q flannel
flannel-0.7.1-1.el7.x86_64
