Bug 1534720
| Summary: | Invalid OVS rules when the node IP is updated | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ravi Sankar <rpenta> |
| Component: | Networking | Assignee: | Ravi Sankar <rpenta> |
| Status: | CLOSED ERRATA | QA Contact: | Hongan Li <hongli> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.9.0 | CC: | aos-bugs, bmeng, hongli, jkaur, weliang |
| Target Milestone: | --- | ||
| Target Release: | 3.9.0 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Problem: Updating the node IP created invalid OVS rules, which resulted in unexpected traffic behavior.
Fix: A node IP update is now handled correctly by waiting for the latest HostSubnet record, so no unnecessary OVS flow rules are created.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-03-28 14:19:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Ravi Sankar
2018-01-15 19:41:03 UTC
Proposed fix: openshift-node should not read the local hostsubnet record if the node IP is not updated. This is very easy to reproduce. The OpenShift master doesn't need to be heavily loaded. Still wondering why this issue was not filed earlier. Do we need to back-port this fix?

Fixed in https://github.com/openshift/origin/pull/18117

Commits pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/f6e67a0d61b6597ba281ac44fb7311fb2c74ee3d
Bug 1534720 - SDN node should fetch latest local HostSubnet for the node

https://github.com/openshift/origin/commit/0201b094575868e7f79af6ee672fead00828cc33
Merge pull request #18117 from pravisankar/fix-subnets
Automatic merge from submit-queue (batch tested with PRs 18117, 18049).
Bug 1534720 - SDN node should fetch latest local HostSubnet for the node

*** Bug 1530931 has been marked as a duplicate of this bug. ***

Verified in openshift v3.9.0-0.38.0, but failed to reach the node and the pods on the node from the master. I checked the hostsubnet and OVS rules after updating nodeIP in node-config.yaml: the hostsubnet looks updated, and there are no OVS rules for the current node IP in tables 10, 50 and 90. The problem is that the node and the pods on the node cannot be reached. If I revert to the original nodeIP in node-config.yaml, the problem is gone.

@hongli, using a dind environment to test in v3.9.0-0.41, I cannot reproduce the original problem. Both the hostsubnet and the new OVS rules are updated with the new nodeIP and work fine. After switching to the new nodeIP, the master and the other node do not have any issue reaching that new nodeIP on the testing node. Here is the dind command to create two NICs (eth0 and eth1) in one node:
./dind-cluster.sh start -ar -n redhat/openshift-ovs-multitenant

@Weibin, thanks for your help and verification. I was just using a secondary IP on eth0 to test it, like:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:86:27:7c brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.12/24 brd 172.16.1.255 scope global dynamic eth0
       valid_lft 73436sec preferred_lft 73436sec
    inet 172.16.1.13/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
So does that mean we cannot use a secondary IP for nodeIP?

@hongli, before trying the dind setup, I did the same as you and defined a secondary IP on eth0, and I found that the other nodes could not communicate with the testing node through this secondary IP. I checked with our developers about using a secondary IP in an OpenShift environment, and they confirmed that, because of the network security configuration in AWS or OpenStack, the security policy may block secondary IP traffic. Also, Ravi's original bug was found by configuring a second NIC (eth1), not a secondary IP on the same NIC (eth0).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489