Bug 1993845 - Enabling internalTrafficPolicy=Local found two issues in test cases
Summary: Enabling internalTrafficPolicy=Local found two issues in test cases
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Martin Kennelly
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-16 09:19 UTC by Martin Kennelly
Modified: 2021-08-31 08:14 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-31 08:14:22 UTC
Target Upstream Version:
Embargoed:


Links
System  ID                                 Private  Priority  Status  Summary  Last Updated
Github  kubernetes kubernetes pull 104408  0        None      None    None     2021-08-27 12:44:39 UTC
Github  kubernetes kubernetes pull 104409  0        None      None    None     2021-08-27 12:44:39 UTC

Description Martin Kennelly 2021-08-16 09:19:15 UTC
Description of problem:
When re-enabling the following tests after the Kubernetes rebase to 1.22.0 (https://bugzilla.redhat.com/show_bug.cgi?id=1986307):
[sig-network] Services should respect internalTrafficPolicy=Local Pod to Pod (hostNetwork: true) [Feature:ServiceInternalTrafficPolicy] [Suite:openshift/conformance/parallel] [Suite:k8s] 
[sig-network] Services should respect internalTrafficPolicy=Local Pod (hostNetwork: true) to Pod (hostNetwork: true) [Feature:ServiceInternalTrafficPolicy] [Suite:openshift/conformance/parallel] [Suite:k8s] 

I encountered two problems.

1) Pod has insufficient privileges to bind to hostport 80.
~ $ /agnhost netexec --http-port 80
2021/08/13 15:19:55 Started HTTP server on port 80
2021/08/13 15:19:55 Started UDP server on port  8081
2021/08/13 15:19:55 listen tcp :80: bind: permission denied



2) Comparison of FQDN and hostname fails
See test/e2e/network/service.go, lines 2259 and 2341. The test calls execHostnameTest with node0.Name, which on OCP is an FQDN, and compares it with the response from the agnhost /hostname endpoint, which is the plain hostname (https://pkg.go.dev/k8s.io/kubernetes@v1.18.0-alpha.0/test/images/agnhost?readme=expanded#readme-serve-hostname). The two strings differ, so the comparison fails on OCP.
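
To illustrate the mismatch in issue 2, here is a minimal Go sketch that compares a node name with a reported hostname by their first DNS label. It is not the code from the upstream fix; the helper names (shortName, sameHost) and the example host names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// shortName returns the first DNS label of name, lower-cased.
// Hypothetical helper, not taken from the Kubernetes test code.
func shortName(name string) string {
	return strings.ToLower(strings.SplitN(name, ".", 2)[0])
}

// sameHost reports whether a and b share the same first DNS label.
func sameHost(a, b string) bool {
	return shortName(a) == shortName(b)
}

func main() {
	nodeName := "worker-0.example.internal" // hypothetical FQDN, as node0.Name might look on OCP
	reported := "worker-0"                  // hypothetical plain hostname, as /hostname might return

	fmt.Println("exact match:      ", nodeName == reported)
	fmt.Println("first-label match:", sameHost(nodeName, reported))
}
```

Comparing only the first label treats hosts with the same short name in different domains as equal, which is probably acceptable inside a single test cluster but would be too loose elsewhere.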


Version-Release number of selected component (if applicable):
Kubernetes 1.22.0


How reproducible:
Build openshift-tests with the Kubernetes 1.22 test cases (see the PR on origin, rebase-1.22.0-rc.0).
Test against a nightly build of OCP 4.9.



I have produced two fixes that enable the test cases to pass against upstream Kubernetes:

Issue 1: https://github.com/martinkennelly/kubernetes/tree/fix_local_test_bind_denied

Issue 2: https://github.com/martinkennelly/kubernetes/tree/fix_fqdn_hostname_mismatch

Comment 1 Martin Kennelly 2021-08-16 09:27:42 UTC
For issue 1: either we increase the pod's privileges or move to an unprivileged port (1024 or higher). I went for the latter.
I will disable the two test cases until this is resolved upstream.
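
A minimal Go sketch of the trade-off in issue 1, assuming the usual Linux default where ports below 1024 need extra privileges (a node can change this through net.ipv4.ip_unprivileged_port_start). The helper names and the port 8080 are hypothetical and are not taken from the upstream fix.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// firstUnprivilegedPort is the common Linux default; binding below it normally
// requires CAP_NET_BIND_SERVICE. Assumption: the sysctl
// net.ipv4.ip_unprivileged_port_start has not been changed on the node.
const firstUnprivilegedPort = 1024

// isPrivilegedPort reports whether binding to port normally needs extra privileges.
func isPrivilegedPort(port int) bool {
	return port > 0 && port < firstUnprivilegedPort
}

// tryListen attempts to bind a TCP listener on port and classifies the result.
func tryListen(port int) string {
	ln, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		if errors.Is(err, os.ErrPermission) {
			return "permission denied"
		}
		return "error: " + err.Error()
	}
	defer ln.Close()
	return "ok"
}

func main() {
	// 80 is the port from the failing test; 8080 is a hypothetical replacement.
	for _, port := range []int{80, 8080} {
		fmt.Printf("port %d: privileged=%t listen=%s\n", port, isPrivilegedPort(port), tryListen(port))
	}
}
```

Run as an unprivileged user, the listen on port 80 would be expected to fail in the same way as the agnhost log above, while the higher port should bind unless something else already holds it.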

Comment 2 Dan Winship 2021-08-16 12:21:13 UTC
Fixes look good, though I'd add a comment to the code in the second one rather than only explaining it in the commit message.

Can you push those PRs upstream and link to them from here so I'll see them?

Then once they merge upstream you'll need to cherry-pick them into https://github.com/openshift/kubernetes, as explained in the README.openshift.md there.

Comment 3 Dan Winship 2021-08-16 12:22:52 UTC
(Though cherry-picking them is only relevant if we're actually planning to enable the alpha feature gate in 4.9, which I guess we probably aren't, so probably you don't actually have to do that.)

Comment 4 Martin Kennelly 2021-08-17 08:46:38 UTC
Comment added to code.

PRs:

1) Pod has insufficient privileges to bind to hostport 80.
https://github.com/kubernetes/kubernetes/pull/104409


2) Comparison of FQDN and hostname fails
https://github.com/kubernetes/kubernetes/pull/104408

Yes, but I may as well do this when it's merged so we have it done for the future.

Comment 5 Dan Winship 2021-08-17 13:30:35 UTC
(In reply to Martin Kennelly from comment #4)
> Yes, but I may as well do this when it's merged so we have it done for the
> future.

If we don't need the fix until OCP 4.10 then it doesn't have to be cherry-picked, because it will get pulled in as part of the rebase to kube 1.23.

Comment 6 Martin Kennelly 2021-08-26 10:30:54 UTC
Dan, isn't OCP 4.9 based on Kubernetes 1.22, where this feature is in beta? https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
If so, we need to cherry-pick the fixes back.

Comment 7 Martin Kennelly 2021-08-26 12:04:46 UTC
One test case was missing from this BZ: "[sig-network] Services should respect internalTrafficPolicy=Local Pod (hostNetwork: true) to Pod [Feature:ServiceInternalTrafficPolicy]"

It was also disabled, pending the upstream fix here: https://github.com/kubernetes/kubernetes/pull/104409/

Comment 8 Dan Winship 2021-08-26 13:22:49 UTC
ah, kube_features.go claims it's still alpha in the comment:

	// owner: @maplain @andrewsykim
	// kep: http://kep.k8s.io/2086
	// alpha: v1.21
	//
	// Enables node-local routing for Service internal traffic
	ServiceInternalTrafficPolicy featuregate.Feature = "ServiceInternalTrafficPolicy"

but sets it to beta in defaultKubernetesFeatureGates:

	ServiceInternalTrafficPolicy:                   {Default: true, PreRelease: featuregate.Beta},

so it looks like they forgot to update the comment.

So yes, it would be good to cherry-pick the fixes. (And maybe also fix the comment upstream to indicate its status correctly.)
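
As a rough illustration of why the table entry matters more than the comment, here is a self-contained Go sketch. The types are simplified local stand-ins for the featuregate package, not the real API, and enabledByDefault is a hypothetical helper.

```go
package main

import "fmt"

// prerelease and featureSpec are simplified local stand-ins for the
// corresponding featuregate types; they are assumptions for illustration only.
type prerelease string

const (
	Alpha prerelease = "ALPHA"
	Beta  prerelease = "BETA"
)

type featureSpec struct {
	Default    bool
	PreRelease prerelease
}

// defaultGates mirrors the single entry quoted above from
// defaultKubernetesFeatureGates.
var defaultGates = map[string]featureSpec{
	"ServiceInternalTrafficPolicy": {Default: true, PreRelease: Beta},
}

// enabledByDefault reports whether the named gate is known and on by default.
func enabledByDefault(name string) bool {
	spec, ok := defaultGates[name]
	return ok && spec.Default
}

func main() {
	name := "ServiceInternalTrafficPolicy"
	fmt.Printf("%s: stage=%s enabledByDefault=%t\n",
		name, defaultGates[name].PreRelease, enabledByDefault(name))
}
```

With Default set to true, the feature is on in a 1.22-based cluster without anyone enabling the gate, so the tests are relevant to OCP 4.9 regardless of the stale "alpha" comment.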

