Description of problem:
ovnkube-trace requires iproute to be installed in the source and destination pods, because it execs `ip -o link show` inside them. This essentially breaks ovnkube-trace for most use cases, as it is easier to find application pods without iproute installed than with it. Reasons include (but are not limited to):
- ubi8 images don't install iproute by default, and most people don't take care to install it explicitly.
- s2i builders create ubi8-based images, so s2i builds inherit the problem described in the previous point.
- It is becoming more and more typical to use microcontainers of different kinds, which may include only the application binary or a very small root filesystem where, again, the ip binary is not available.

Not to mention that doing an exec in the application pod is not a very good practice.

Version-Release number of selected component (if applicable):
4.7 (but 4.8 seems to contain the same or similar code)

How reproducible:
Always, if the source or destination pod doesn't have iproute (the ip binary) installed.

Steps to Reproduce:
1. Follow https://access.redhat.com/solutions/5887511 with a source or destination pod that doesn't have the iproute package installed

Actual results:
ovnkube-trace doesn't work

Expected results:
ovnkube-trace works

Additional info:
The best way to go would be to not rely on any tools installed in the application container, but to enter the network namespace of the container instead. Taking into account that ovnkube-trace already performs execs on the ovnkube-node-XXX pod of the node, and that the ovnkube-node-XXX pod mounts the root filesystem in /host, it should be trivial to leverage that to get the pid of the sandbox and nsenter its network namespace.
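The approach suggested under "Additional info" could be sketched roughly as follows. This is only an illustration under assumptions that are not confirmed in this report: that crictl and nsenter are available on the node, that the ovnkube-node-XXX container is allowed to `chroot /host`, and that the host's /proc is visible after the chroot. The helper names are hypothetical, and they only print the commands (a dry run) rather than running them. How the sandbox pid is obtained from the sandbox ID depends on the container runtime and is left out.

```shell
# Sketch only: compose the commands that would inspect a pod's interfaces from
# its network namespace, instead of exec-ing `ip` inside the application pod.
# Assumption: crictl and nsenter exist on the node and /host is the node root.

# Hypothetical helper: print the command that looks up the pod sandbox ID.
# $1 = pod name, $2 = pod namespace
sandbox_id_cmd() {
  printf 'chroot /host crictl pods --name %s --namespace %s -q' "$1" "$2"
}

# Hypothetical helper: print the command that runs `ip -o link show` in the
# network namespace of the given sandbox pid, using the node's own ip binary.
# $1 = pid of the pod sandbox (how to obtain it is runtime-specific)
netns_link_cmd() {
  printf 'chroot /host nsenter -t %s -n ip -o link show' "$1"
}

# Dry run: show what would be executed inside the ovnkube-node-XXX pod.
sandbox_id_cmd alertmanager-main-0 openshift-monitoring; echo
netns_link_cmd 1234; echo
```

The printed commands would then be passed to an exec on the ovnkube-node-XXX pod of the node hosting the application pod, so nothing needs to be installed in the application container itself.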
The bug seems to be handled upstream in ovn-kubernetes by https://github.com/ovn-org/ovn-kubernetes/pull/1975 and https://github.com/ovn-org/ovn-kubernetes/pull/2308. Waiting for https://github.com/openshift/ovn-kubernetes/pull/618 to be merged so that the fix is available downstream.
Fix was merged via https://github.com/openshift/ovn-kubernetes/pull/619 into ovn-kubernetes downstream (https://github.com/openshift/ovn-kubernetes/commit/b97d1a3cace41a0dd92ce2e337b0b3c8ffb1b078).
❯ oc version
Client Version: 4.7.0-202107292319.p0.git.8b4b094.assembly.stream-8b4b094
Server Version: 4.9.0-0.nightly-2021-08-07-175228
Kubernetes Version: v1.21.1+8268f88

❯ POD=$(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-master -o name | head -1 | awk -F '/' '{print $NF}')
❯ oc cp -n openshift-ovn-kubernetes $POD:/usr/bin/ovnkube-trace ovnkube-trace
Defaulting container name to northd.
tar: Removing leading `/' from member names
❯ chmod +x ovnkube-trace
❯ ./ovnkube-trace -dst alertmanager-main-0 -dst-namespace openshift-monitoring -src alertmanager-main-0 -src-namespace openshift-monitoring -tcp
I0811 19:04:16.998694 507 ovs.go:96] Maximum command line arguments set to: 191102
I0811 19:04:16.998848 507 ovnkube-trace.go:463] Log level set to: 0
ovn-trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output to "openshift-monitoring_alertmanager-main-0"
I0811 19:04:23.227740 507 ovnkube-trace.go:750] ovn-trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output to "openshift-monitoring_alertmanager-main-0"
ovn-trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output to "openshift-monitoring_alertmanager-main-0"
I0811 19:04:24.038886 507 ovnkube-trace.go:792] ovn-trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output to "openshift-monitoring_alertmanager-main-0"
ovs-appctl ofproto/trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output:15
Final flow:
I0811 19:04:24.529076 507 ovnkube-trace.go:836] ovs-appctl ofproto/trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output:15
Final flow:
ovs-appctl ofproto/trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output:15
Final flow:
I0811 19:04:25.051374 507 ovnkube-trace.go:880] ovs-appctl ofproto/trace indicates success from alertmanager-main-0 to alertmanager-main-0 - matched on output:15
Final flow:
ovn-trace command Completed normally

Everything working as expected. Marking as verified.
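For repeated verification runs, the check done by eye above could be scripted. The following is a minimal sketch; the helper name `trace_succeeded` is hypothetical, and it assumes that the three message fragments seen in the transcript above are what a successful run prints, which may differ in other ovnkube-trace versions.

```shell
# Hypothetical helper: succeed only if the captured ovnkube-trace output
# contains an ovn-trace success line, an ovs-appctl ofproto/trace success
# line, and the final completion line seen in the transcript above.
# $1 = captured output of ovnkube-trace (stdout and stderr combined)
trace_succeeded() {
  printf '%s\n' "$1" | grep -q 'ovn-trace indicates success' &&
    printf '%s\n' "$1" | grep -q 'ovs-appctl ofproto/trace indicates success' &&
    printf '%s\n' "$1" | grep -q 'ovn-trace command Completed normally'
}
```

It could be used, for example, as `out=$(./ovnkube-trace ... 2>&1); trace_succeeded "$out" && echo verified`, with the same arguments as in the run above.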
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759