Bug 1459193 - Service is unreachable on the newly added node while manually scaling up nodes in Flannel network mode.
Product: OpenShift Container Platform
Classification: Red Hat
Component: Reference Architecture
Unspecified Unspecified
Priority: high
Severity: high
Assigned To: Mark Lamourine
Gan Huang
Depends On:
Reported: 2017-06-06 09:55 EDT by Gan Huang
Modified: 2017-07-03 14:48 EDT

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2017-07-03 14:48:56 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Gan Huang 2017-06-06 09:55:45 EDT
Description of problem:
An S2I build failed on the newly added node after manually scaling up nodes, because the docker-registry service was unreachable from that node.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create an openshift-on-openstack stack using the parameters template


2. Manually edit `node_count` to 2, and update the stack

3. Trigger an S2I build after scaling completes, and make sure the app is assigned to the newly added node

Actual results:
# oc build-logs dancer-mysql-example-1
Pushing image ...
Registry server Address: 
Registry server User Name: serviceaccount
Registry server Email: serviceaccount@example.org
Registry server Password: <<non-empty>>
error: build error: Failed to push image: Put dial tcp getsockopt: no route to host

[root@ha-master-dedicated-flannel-master-0 ~]# oc get po -o wide --all-namespaces=true
ghuang-test    dancer-mysql-example-1-build    0/1       Error       0          43m     ha-master-dedicated-flannel-node-5m6m3589.ocp3.ghuang.com

After logging in to the newly added node, curl to the docker-registry service fails:
[cloud-user@ha-master-dedicated-flannel-node-5m6m3589 ~]$ curl                                                       
curl: (7) Failed connect to; No route to host

The same request succeeds on the other nodes.

Expected results:
The S2I build should succeed on any node, including newly added ones.

Additional info:
Comment 1 Mark Lamourine 2017-06-16 07:40:00 EDT
The scaleup playbook does not include two rules which are needed to complete the firewall configuration.  These rules are present in the deployment playbook but not in scaleup.

Adding these two rules, conditional on sdn_flannel, should resolve this.

+  - name: Set up masquerading on flannel interface
+    shell: iptables -t nat -A POSTROUTING -o {{ flannel_interface }} -j MASQUERADE
+  - name: Make iptables rules permanent
+    shell: /usr/libexec/iptables/iptables.init save
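
For illustration only, the masquerade rule from the tasks above could also be applied by hand on an affected node. The sketch below is not the playbook change itself. It assumes the rule should be added only when it is missing (using the iptables -C check), so that repeated runs do not append duplicates. The function name ensure_masquerade, the IPTABLES override variable, and the interface name eth1 are hypothetical; substitute whatever flannel_interface resolves to in the deployment.

```shell
#!/bin/sh
# Sketch: add the flannel masquerade rule only if it is not already present.
# IPTABLES can be overridden (for example with a stub) for dry runs.
IPTABLES="${IPTABLES:-iptables}"

ensure_masquerade() {
    iface="$1"
    # -C exits 0 when an identical rule already exists in the chain.
    if $IPTABLES -t nat -C POSTROUTING -o "$iface" -j MASQUERADE 2>/dev/null; then
        echo "present"
    else
        # -A appends the rule, matching the first task above.
        $IPTABLES -t nat -A POSTROUTING -o "$iface" -j MASQUERADE && echo "added"
    fi
}

# Example (requires root; "eth1" is an assumed interface name):
#   ensure_masquerade eth1
```

After the rule is added, it still has to be persisted, which is what the second task does with /usr/libexec/iptables/iptables.init save.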
Comment 2 Mark Lamourine 2017-06-30 07:48:51 EDT
Fixed upstream - 


Creating a package for testing.
Comment 3 Mark Lamourine 2017-07-02 19:35:39 EDT
Fixed and OCP version corrected in RPM 

Comment 4 Gan Huang 2017-07-03 02:30:49 EDT
Verified with openshift-heat-templates-0.9.9-5.el7ost.noarch

Manual scaling succeeds with the Flannel network enabled. Services can be accessed successfully from both the newly added node and the existing nodes.
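
A check of this kind can be repeated on each node with a small helper. This is a minimal sketch, not the procedure actually used for verification. The function name check_service and the CURL override variable are hypothetical, and the service URL must be supplied by the caller because the registry address is not recorded in this report. Any non-zero curl exit status (including exit 7, "Failed connect", as seen in the description) is reported as unreachable.

```shell
#!/bin/sh
# Sketch: report whether a service URL answers from the current node.
# CURL can be overridden (for example with a stub) for dry runs.
CURL="${CURL:-curl}"

check_service() {
    url="$1"
    # -s: silent, -o /dev/null: discard the body, --max-time: bound the wait.
    if $CURL -s -o /dev/null --max-time 5 "$url"; then
        echo "reachable: $url"
    else
        echo "unreachable: $url"
        return 1
    fi
}

# Example (the URL is a placeholder for the docker-registry service address):
#   check_service "http://<registry-service-ip>:<port>/"
```

Running the helper on both an existing node and the newly added node gives a quick before-and-after comparison for the fix.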

# openshift version
openshift v3.4.1.37
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

# rpm -q flannel
