Bug 1752453 - Multus no longer finds binaries in /opt/multus/bin
Summary: Multus no longer finds binaries in /opt/multus/bin
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.2.0
Assignee: Douglas Smith
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-09-16 12:01 UTC by Daniel Grimm
Modified: 2019-10-16 06:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:41:11 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | intel multus-cni pull 378 | 0 | None | closed | [entrypoint] Adds --additional-bin-dir option to entrypoint | 2020-09-01 19:50:35 UTC
Github | openshift cluster-network-operator pull 320 | 0 | None | closed | Bug 1752453: Multus should execute CNI plugins in /opt/multus/bin | 2020-09-01 19:50:34 UTC
Github | openshift multus-cni pull 29 | 0 | None | closed | Bug 1752453: Adds additional bin dir functionality | 2020-09-01 19:50:34 UTC
Github | openshift multus-cni pull 30 | 0 | None | closed | Bug 1752453: Adds additional bin dir (backport 4.2) | 2020-09-01 19:50:34 UTC
Red Hat Product Errata | RHBA-2019:2922 | 0 | None | None | None | 2019-10-16 06:41:23 UTC

Description Daniel Grimm 2019-09-16 12:01:05 UTC
Description of problem:
In OCP 4.1.x, CNI binaries could be copied into /opt/multus/bin, where they would be picked up by multus. This no longer works in 4.2. Previously, multus would search two directories for binaries:
- /var/lib/cni/bin
- /opt/multus/bin

In 4.2, it only finds binaries in /var/lib/cni/bin.
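
For illustration only, the two-directory lookup described above can be sketched as a small shell function. This is not Multus code; the function name `find_cni_plugin` and its argument order are hypothetical, and the real lookup may work differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of Multus): print the path of the first
# executable file named $1 found in the directories given after it.
find_cni_plugin() {
    plugin=$1
    shift
    for dir in "$@"; do
        # Accept only executable regular files; skip directories.
        if [ -f "$dir/$plugin" ] && [ -x "$dir/$plugin" ]; then
            printf '%s\n' "$dir/$plugin"
            return 0
        fi
    done
    # Nothing matched in any of the listed directories.
    return 1
}
```

Calling `find_cni_plugin bridge /var/lib/cni/bin /opt/multus/bin` would mimic the 4.1.x behaviour described here, while passing only `/var/lib/cni/bin` would mimic what is reported for 4.2.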

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-09-14-171119

How reproducible:
Always

Steps to Reproduce:
1. Install ServiceMesh operator in 4.2
2. deploy control plane
3. to circumvent race condition (Bug 1732598), run this on every node:
  rm /etc/kubernetes/cni/net.d/80-openshift-network.conf && systemctl restart crio
4. deploy services into the mesh.

Actual results:
Pods fail to start with errors similar to: "Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox"

Expected results:
Pods start and traffic is routed through the sidecars.

Additional info:
If you don't perform step 3, pods will start, but traffic is not routed through the sidecars because the CNI binary is never executed.

Note that I couldn't find this path documented anywhere, but 4.1.x versions showed it in error messages. See this issue for reference: https://issues.jboss.org/browse/MAISTRA-582

Comment 1 Casey Callendrello 2019-09-16 12:18:56 UTC
Doug,

Even if unintentional, it seems like an API change, which it would be nice to avoid. Do you think we can fix this for 4.2?

Comment 2 Douglas Smith 2019-09-16 14:18:00 UTC
Hey guys thanks for filing this and for the heads up.

I'm starting an archaeological dig to figure out where/when we had `/opt/multus/bin` and how it got removed from the multiple CNI binary search paths.

Comment 3 Douglas Smith 2019-09-16 15:56:39 UTC
Just a quick update: even after some searching, I can't find a reference to `/opt/multus/bin` in the Multus history, which isn't making this easier to unravel. (I have, however, found a reference to it in debug logs in this GitHub comment: https://github.com/intel/multus-cni/issues/243 )

However, I did discover an issue where the "binDir" Multus configuration parameter (documented here: https://github.com/intel/multus-cni/blob/master/doc/configuration.md#multus-cni-configuration-reference) is not being picked up. And I'm wondering if it's related. I'll update once I know more.

Comment 4 Douglas Smith 2019-09-17 20:24:10 UTC
My current plan is to ensure that the `binDir` configuration option in Multus works as intended and provide a way to use that to add the `/opt/multus/bin` as a directory that can be used (as I cannot locate in history how this worked before, which is a mystery to me, but, I digress)

I've figured out a way to make it work, however, I feel that my approach is somewhat ham-fisted, and I've asked my colleague Tomofumi to take a look at it.

This upstream pull request is here: https://github.com/intel/multus-cni/pull/376

This will likely also require a downstream PR to Multus to bring these changes in (once we figure out the right method by which to accomplish this), as well as a companion change to the cluster-network-operator in order to express which `binDir` to use in addition to the regularly used /var/lib/cni/bin (which would be /opt/multus/bin).

Comment 5 Daniel Grimm 2019-09-18 10:04:13 UTC
Thank you for the update Doug. I found that we write PATH into CNI_PATH: https://github.com/openshift/containernetworking-plugins/blob/ab8f244f28035eb06c0f5416bd1964c541b38a6e/pkg/testutils/cmd.go#L36 - maybe the PATH has changed?

Comment 6 Daniel Grimm 2019-09-18 11:24:16 UTC
Never mind my last comment; it doesn't look like that is happening in multus-cni.

Comment 7 Douglas Smith 2019-09-18 16:21:50 UTC
Upstream PR #376 has been merged. However, I also noticed an oversight: I need to expose the `binDir` CNI configuration option via the entrypoint (on a quick read-through I thought that might already have been an option, but I was looking at the wrong bin dir option).

https://github.com/intel/multus-cni/pull/378

Once these two changes come downstream, we'll also need a PR to the CNO to add `--additional-bin-dir=/opt/multus/bin`.

Comment 8 Douglas Smith 2019-09-19 15:59:13 UTC
We have some pending changes with pull requests up:

* https://github.com/openshift/multus-cni/pull/30 -- brings the upstream changes previously referenced into 4.2 release
* https://github.com/openshift/cluster-network-operator/pull/320 -- integrates those changes into the cluster-network-operator by setting the --additional-bin-dir=/opt/multus/bin for the Multus entrypoint.
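
As a rough sketch of the idea behind the `--additional-bin-dir` option, the snippet below shows how an entrypoint-style script could append an extra directory to the default one. It is not the actual Multus entrypoint: the function name `build_bin_dirs` and the colon-separated, PATH-style output are assumptions made purely for illustration.

```shell
#!/bin/sh
# Hypothetical sketch (not the real Multus entrypoint): start from the
# default CNI bin dir and append any --additional-bin-dir=<path> values.
build_bin_dirs() {
    bin_dirs=/var/lib/cni/bin
    for arg in "$@"; do
        case $arg in
            --additional-bin-dir=*)
                extra=${arg#--additional-bin-dir=}
                # Skip an empty value instead of appending a bare colon.
                if [ -n "$extra" ]; then
                    bin_dirs="$bin_dirs:$extra"
                fi
                ;;
        esac
    done
    # Colon-separated list, an assumed format chosen for this sketch.
    printf '%s\n' "$bin_dirs"
}
```

With the argument from the CNO change, `build_bin_dirs --additional-bin-dir=/opt/multus/bin` yields both directories; without it, only the default is returned.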

Comment 10 Anurag saxena 2019-09-20 20:32:20 UTC
On a UPI vSphere cluster with one master and one worker, /opt/multus/bin is still not being created on 4.2.0-0.nightly-2019-09-20-090334.


$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-09-20-090334   True        False         5h17m   4.2.0-0.nightly-2019-09-20-090334


$ oc get pods -n openshift-multus
NAME                                READY   STATUS    RESTARTS   AGE
multus-admission-controller-c8kjf   1/1     Running   1          5h35m
multus-xsw8f                        1/1     Running   1          5h35m
multus-z2st6                        1/1     Running   1          5h35m

$ ll /opt/multus/bin
ls: cannot access /opt/multus/bin: No such file or directory

$ ll /var/lib/cni/bin
total 107904
-rwxr-xr-x. 1 root root  4016716 Sep 20 13:24 bandwidth
-rwxr-xr-x. 1 root root  4459109 Sep 20 13:24 bridge
-rwxr-xr-x. 1 root root 11188325 Sep 20 13:24 dhcp
-rwxr-xr-x. 1 root root  5706536 Sep 20 13:24 firewall
-rwxr-xr-x. 1 root root  2937585 Sep 20 13:24 flannel
-rwxr-xr-x. 1 root root  3982836 Sep 20 13:24 host-device
-rwxr-xr-x. 1 root root  3465116 Sep 20 13:24 host-local
-rwxr-xr-x. 1 root root  4141112 Sep 20 13:24 ipvlan
-rwxr-xr-x. 1 root root  3060578 Sep 20 13:24 loopback
-rwxr-xr-x. 1 root root  4212551 Sep 20 13:24 macvlan
-rwxr-xr-x. 1 root root 35627186 Sep 20 13:24 multus
-rwxr-xr-x. 1 root root  6005728 Sep 20 13:24 openshift-sdn
-rwxr-xr-x. 1 root root  3945238 Sep 20 13:24 portmap
-rwxr-xr-x. 1 root root  4389701 Sep 20 13:24 ptp
-rwxr-xr-x. 1 root root  3269265 Sep 20 13:24 sbr
-rwxr-xr-x. 1 root root  2761580 Sep 20 13:24 static
-rwxr-xr-x. 1 root root  3138879 Sep 20 13:24 tuning
-rwxr-xr-x. 1 root root  4141022 Sep 20 13:24 vlan

Comment 11 Anurag saxena 2019-09-20 20:33:44 UTC
Not sure whether installing the ServiceMesh operator in 4.2 is mandatory to check this.

Comment 12 Douglas Smith 2019-09-22 06:46:18 UTC
Hrmmm, this change shouldn't actually create /opt/multus/bin -- just enable you to put plugins there.

Daniel -- was /opt/multus/bin existing when you were putting plugins there originally, or is it a requirement? Or can that directory be created at the time you drop the binary in /opt/multus/bin?

If it is a requirement, we can look into getting a PR into the machine config operator to have the directory created.

Comment 13 Daniel Grimm 2019-09-23 08:22:25 UTC
We're creating the directory as part of dropping the binary, if it doesn't exist. Not sure if it exists by default on 4.1.
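
A minimal sketch of that "create the directory if missing, then drop the binary" step might look like the following. The helper name `install_cni_plugin` is hypothetical and this is not the actual installer used by the ServiceMesh operator; it only illustrates the pattern.

```shell
#!/bin/sh
# Hypothetical helper: copy a plugin binary into a target directory,
# creating the directory first when it does not exist yet.
install_cni_plugin() {
    src=$1
    # Default target is the directory discussed in this bug.
    dest_dir=${2:-/opt/multus/bin}
    # mkdir -p succeeds whether or not the directory already exists.
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    # Make sure the copied plugin is executable.
    chmod 0755 "$dest_dir/$(basename "$src")"
}
```

On a node this would typically need root, for example via `sudo`, since /opt/multus/bin is not writable by an ordinary user.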

Comment 14 Casey Callendrello 2019-09-23 13:06:56 UTC
Yeah, I don't think this is a problem. It's OK if the directory doesn't exist, but we will look there if it exists.

Back to you, Anurag. You'll need to make the directory as part of the test.

Comment 15 Anurag saxena 2019-09-23 15:18:50 UTC
Okay, it seems that /opt/multus/bin doesn't exist in 4.1 (although /opt/cni/bin is present), but /opt/multus/bin can be created.

I will go ahead and check on 4.2. Thanks

Comment 16 Anurag saxena 2019-09-23 18:21:25 UTC
Verifying this bug on 4.2.0-0.nightly-2019-09-22-222738. Daniel, please re-open this if it conflicts at your end.

I can create /opt/multus/bin on 4.2 now

[core@compute-0 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-09-22-222738   True        False         5m56s   Cluster version is 4.2.0-0.nightly-2019-09-22-222738

[core@compute-0 ~]$ sudo mkdir /opt/multus/
[core@compute-0 ~]$ sudo mkdir /opt/multus/bin
[core@compute-0 ~]$ ll /opt/multus/bin
total 0

Thanks!

Comment 17 errata-xmlrpc 2019-10-16 06:41:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922