Bug 2044824

Summary: Failing test in periodics: [sig-network] Services should respect internalTrafficPolicy=Local Pod and Node, to Pod (hostNetwork: true) [Feature:ServiceInternalTrafficPolicy] [Skipped:Network/OVNKubernetes] [Suite:openshift/conformance/parallel] [Suite:k8s]
Product: OpenShift Container Platform Reporter: Pierre Prinetti <pprinett>
Component: Installer Assignee: Martin André <m.andre>
Installer sub component: OpenShift on OpenStack QA Contact: Jon Uriarte <juriarte>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: high CC: aos-bugs, dgoodwin, juriarte, pprinett
Version: 4.10 Keywords: Triaged
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 2050247 Environment:
Last Closed: 2022-08-10 10:43:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2050247    

Comment 1 Martin André 2022-01-25 10:29:03 UTC
It might just be a short hostname vs. FQDN issue, where OpenStack behaves differently from the other cloud platforms:

fail [k8s.io/kubernetes.0/test/e2e/network/util.go:170]: Expected
    <string>: bl3gfvsx-1cd93-5v8rq-worker-0-5kmrj.novalocal
to equal
    <string>: bl3gfvsx-1cd93-5v8rq-worker-0-5kmrj

In that case the test would need to be fixed.
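
For illustration, a minimal sketch of a comparison that tolerates the short-name/FQDN difference seen in the failure above. The helpers shortName and sameHost are hypothetical names introduced here; this is not necessarily how the e2e test compares hostnames or how it was eventually fixed.

```go
package main

import (
	"fmt"
	"strings"
)

// shortName returns the first DNS label of host, i.e. everything before
// the first dot. A name without a dot is returned unchanged.
// (Hypothetical helper, not part of the Kubernetes e2e framework.)
func shortName(host string) string {
	if i := strings.IndexByte(host, '.'); i >= 0 {
		return host[:i]
	}
	return host
}

// sameHost reports whether a and b share the same short name,
// ignoring case, so that "node.novalocal" matches "node".
func sameHost(a, b string) bool {
	return strings.EqualFold(shortName(a), shortName(b))
}

func main() {
	got := "bl3gfvsx-1cd93-5v8rq-worker-0-5kmrj.novalocal"
	want := "bl3gfvsx-1cd93-5v8rq-worker-0-5kmrj"

	fmt.Println(got == want)         // false: the strict comparison from the failure
	fmt.Println(sameHost(got, want)) // true: short names match
}
```

The trade-off is that two distinct hosts in different domains with the same first label would also compare equal, which may or may not be acceptable for the test.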

Comment 2 Martin André 2022-01-25 12:15:02 UTC
According to https://github.com/openshift/kubernetes/commit/752a532c3d0819759f98821a94f26193b494c3d5, the agnhost "/hostname" endpoint should return the hostname and not the FQDN. That would mean the test is fine, and the "/hostname" endpoint is the one that needs fixing.

Comment 3 Martin André 2022-01-25 13:05:04 UTC
The test seems to have been failing consistently in the OpenStack 4.10 parallel job since Jan 21 11:20:31.

Comment 4 Martin André 2022-01-25 14:04:47 UTC
Continuing my exploration of the agnhost image.

Looks like the "/hostname" endpoint implementation [1] simply returns the output of os.Hostname() [2], which in turn gets the hostname from the uname syscall or /proc/sys/kernel/hostname [3].

[1] https://github.com/kubernetes/kubernetes/blob/master/test/images/agnhost/netexec/netexec.go#L652
[2] https://pkg.go.dev/os#Hostname
[3] https://cs.opensource.google/go/go/+/refs/tags/go1.17.6:src/os/sys_linux.go;drc=refs%2Ftags%2Fgo1.17.6;l=12

Which makes me think the assumption from https://github.com/openshift/kubernetes/commit/752a532c3d0819759f98821a94f26193b494c3d5 is incorrect.

Comment 5 Martin André 2022-01-26 10:23:17 UTC
Posted an upstream patch at https://github.com/kubernetes/kubernetes/pull/107786 to fix the failing test.

Comment 6 Pierre Prinetti 2022-01-28 18:49:42 UTC
Setting blocker- because this bug has been triaged as sev LOW

Comment 9 Devan Goodwin 2022-02-03 14:37:32 UTC
Can we get this bug to verified so we can proceed with merging to 4.10?

Comment 10 Devan Goodwin 2022-02-03 14:42:15 UTC
Modified would also work according to the bot. Presumably we need https://github.com/openshift/origin/pull/26805 to merge to origin so we can verify.

Comment 13 Martin André 2022-02-23 15:50:12 UTC
Self-verifying: we no longer see the issue in CI.

Comment 16 errata-xmlrpc 2022-08-10 10:43:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069