Bug 1464191
| Summary: | [GSS] [OCP 3.2.1] openshift-sdn partially deletes openflow rules leading to "no route to host" for service | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Francesco Marchioni <fmarchio> | |
| Component: | Networking | Assignee: | Ben Bennett <bbennett> | |
| Status: | CLOSED WONTFIX | QA Contact: | Meng Bo <bmeng> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 3.2.1 | CC: | aloughla, aos-bugs, atragler, bbennett, pweil, rkhan, sukulkar, wmeng | |
| Target Milestone: | --- | |||
| Target Release: | 3.2.1 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1538220 1538227 | Environment: | ||
| Last Closed: | 2017-10-16 12:32:25 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1538220, 1538227 | |||
Description
Francesco Marchioni
2017-06-22 15:35:15 UTC
Related case: 01875046

> So, the problem was on deletion. Either the table1 rule was deleted and the
> delete returned an error, causing the 2 table8 deletion commands to be ignored,
> or the table1 rule delete worked and returned no error, but the 1st table8
> rule command failed and led to ignoring the 2nd table8 delete.
This can't really happen. add-flow and delete-flows can generally fail only if the flow is syntactically invalid in some way (in which case *everyone* would see the bug).
But anyway, if something is going wrong with our manipulation of the OVS flows, there ought to be errors logged about it. Can you get a copy of the atomic-openshift-node logs from one of the nodes that is experiencing problems? (If the logs are huge, then grep for just "controller.go" to start with.)
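For very large logs, the same filtering can be done in a few lines of Python. This is only a sketch of the "grep for controller.go" step: the function name `filter_log_lines` and the sample lines are hypothetical placeholders, not real atomic-openshift-node output.

```python
from typing import Iterable, List


def filter_log_lines(lines: Iterable[str], needle: str = "controller.go") -> List[str]:
    """Keep only the lines that mention `needle` (a plain substring match, like grep)."""
    return [line.rstrip("\n") for line in lines if needle in line]


# Placeholder lines for illustration only; real node logs will look different.
sample = [
    "placeholder line 1 controller.go example message\n",
    "placeholder line 2 some_other_file.go unrelated message\n",
    "placeholder line 3 controller.go example error\n",
]

matches = filter_log_lines(sample)
for line in matches:
    print(line)
```

To run it against a real log, pass an open file object instead of `sample`; the file is read line by line, so it does not have to fit in memory at once.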
We've looked over the code again. Although the flow changes are not explicitly done in a transaction, they will never fail, since the commands do not error out if they are syntactically correct; if the rule already exists, nothing happens. Given that this has happened only once, and that we are moving to OVN rather than programming OVS directly, I'd rather not take the risk of changing the code now to attempt to address this.
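To make the hypothesis quoted earlier in the thread concrete, here is a toy model of what "not done in a transaction" could mean if a delete did fail. It is not the openshift-sdn code: the flow table is a plain Python set, the rule names are made up, and the failure is injected artificially (the comments above argue that such a failure should not occur for syntactically valid flows).

```python
from typing import Callable, List, Set

DeleteFn = Callable[[Set[str], str], None]


def make_delete(fail_on: str) -> DeleteFn:
    """Build a hypothetical delete function that raises for one chosen rule."""
    def delete(flows: Set[str], rule: str) -> None:
        if rule == fail_on:
            raise RuntimeError(f"injected failure deleting {rule}")
        flows.discard(rule)  # deleting a rule that is absent is a no-op
    return delete


def delete_stop_on_error(flows: Set[str], rules: List[str], delete: DeleteFn) -> List[str]:
    """Delete rules in order and stop at the first error.

    Returns the rules that were never attempted.
    """
    for i, rule in enumerate(rules):
        try:
            delete(flows, rule)
        except RuntimeError:
            return rules[i + 1:]
    return []


def delete_best_effort(flows: Set[str], rules: List[str], delete: DeleteFn) -> List[str]:
    """Attempt every delete and return the rules whose delete raised."""
    failed = []
    for rule in rules:
        try:
            delete(flows, rule)
        except RuntimeError:
            failed.append(rule)
    return failed


# Made-up rule names standing in for one table1 rule and two table8 rules.
rules = ["table1-rule", "table8-rule-a", "table8-rule-b"]

flows_a = set(rules)
skipped = delete_stop_on_error(flows_a, rules, make_delete("table8-rule-a"))

flows_b = set(rules)
failed = delete_best_effort(flows_b, rules, make_delete("table8-rule-a"))
```

In the stop-on-error variant the second table8 rule is never attempted and stays in the table alongside the one that failed, which is the partial-deletion state the reporter describes. The best-effort variant leaves only the rule whose delete actually failed.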