Bug 1636493

Summary: failed to find plugin "bridge" in path [/opt/cni/bin]
Product: OpenShift Container Platform
Reporter: Nicholas Schuetz <nick>
Component: Containers
Assignee: Giuseppe Scrivano <gscrivan>
Status: CLOSED NOTABUG
QA Contact: weiwei jiang <wjiang>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.11.0
CC: aos-bugs, gscrivan, jmalde, jokerman, mmccomas, mpatel, mtaru, nick, nschuetz, ocasalsa, pabraham, sdodson, tsweeney, wmeng, wsun
Target Milestone: ---
Keywords: Reopened
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-05-26 06:24:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: ansible inventory
Flags: none

Description Nicholas Schuetz 2018-10-05 14:16:21 UTC
If deploying 3.11.x on the docker container runtime, DO NOT install the cri-o package.  If you do, you'll get this error and the install won't work.

Comment 1 Mrunal Patel 2018-10-08 16:23:17 UTC
This isn't a supported operation. You have to select cri-o at install time in the installer. From the comments it looks like cri-o was installed afterwards. Can you confirm that?

Comment 2 Nicholas Schuetz 2018-10-09 14:59:52 UTC
Created attachment 1492116 [details]
ansible inventory

Comment 3 Nicholas Schuetz 2018-10-09 15:00:54 UTC
No.  Cri-o was the runtime from the start/install.  Hosts file attached.

Comment 4 Giuseppe Scrivano 2018-11-29 14:22:54 UTC
"bridge" is not supposed to be in /opt/cni/bin; it is installed in /usr/libexec/cni as part of the containernetworking-plugins rpm.
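
To see which of the two directories mentioned so far actually holds the plugin on a given node, a check along these lines may help. This is only a sketch: locate_cni_plugin is a hypothetical helper name, and the two directories are simply the ones quoted in the error message and in this comment.

```shell
#!/bin/sh
# Sketch: report the first directory that contains an executable CNI plugin.
# locate_cni_plugin is a hypothetical helper, not part of OpenShift or CRI-O.
locate_cni_plugin() {
    plugin="$1"
    shift
    for d in "$@"; do
        if [ -x "$d/$plugin" ]; then
            echo "$d/$plugin"
            return 0
        fi
    done
    return 1
}

# Directories taken from the error message and from this comment.
locate_cni_plugin bridge /opt/cni/bin /usr/libexec/cni \
    || echo "bridge not found in either directory"
```

On RPM-based hosts, `rpm -ql containernetworking-plugins` lists the files that package installed and can be used to cross-check the result.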

The bridge plugin should not be used on OpenShift.  What files do you have under /etc/cni/net.d/?  There should be only 80-openshift-network.conf and openshift-sdn.conf.

I am not sure the installer should delete other configuration files if they are already present.
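
The expectation stated in this comment (only 80-openshift-network.conf and openshift-sdn.conf under /etc/cni/net.d/) can be checked with a short script. A minimal sketch, assuming a POSIX shell; find_unexpected_cni_confs is a hypothetical helper name and the script only lists files, it changes nothing.

```shell
#!/bin/sh
# Sketch: list files in a CNI config directory other than the two that
# comment 4 says should be present.
# find_unexpected_cni_confs is a hypothetical helper name.
find_unexpected_cni_confs() {
    dir="${1:-/etc/cni/net.d}"
    for f in "$dir"/*; do
        [ -e "$f" ] || continue    # directory is empty or does not exist
        case "$(basename "$f")" in
            80-openshift-network.conf|openshift-sdn.conf) ;;  # expected files
            *) echo "$f" ;;                                   # anything else
        esac
    done
}

# Defaults to /etc/cni/net.d when no directory is given.
find_unexpected_cni_confs "$@"
```

No output means the directory contains nothing beyond the two expected files.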

Comment 5 Nick Schuetz 2018-12-10 17:46:46 UTC
Here's what I have:

# ls -lh /etc/cni/net.d/
total 20K
-rw-r--r--. 1 root root 294 Oct 26 12:10 100-crio-bridge.conf
-rw-r--r--. 1 root root  54 Oct 26 12:10 200-loopback.conf
-rw-r--r--. 1 root root  83 Dec 10 09:57 80-openshift-network.conf
-rw-r--r--. 1 root root 483 Nov 15 23:01 87-podman-bridge.conflist
-rw-r--r--. 1 root root  82 Oct 11 17:03 openshift-sdn.conf
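
The thread does not show the contents of these files, so which of them request the "bridge" plugin is an assumption based on their names (100-crio-bridge.conf and 87-podman-bridge.conflist look like candidates). A sketch for confirming it on a node; list_bridge_confs is a hypothetical helper name and the pattern assumes the usual JSON form `"type": "bridge"`.

```shell
#!/bin/sh
# Sketch: print the CNI config files in a directory that reference the
# "bridge" plugin type. list_bridge_confs is a hypothetical helper name.
list_bridge_confs() {
    dir="${1:-/etc/cni/net.d}"
    # -l prints only the names of matching files; errors from a missing
    # or empty directory are silenced.
    grep -l -E '"type"[[:space:]]*:[[:space:]]*"bridge"' "$dir"/* 2>/dev/null
}

list_bridge_confs "$@" || echo "no config references the bridge plugin"
```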

Comment 6 Nick Schuetz 2018-12-10 19:31:30 UTC
The OpenShift uninstaller does not remove any of these files. Removing them all per https://access.redhat.com/solutions/3449671 did resolve the issue after a full uninstall.

Comment 7 Petter Abrahamsson 2018-12-14 22:22:12 UTC
I'm seeing the same error when trying to set up local volumes [1] on 3.11 running cri-o (from install).

# ls -lh /etc/cni/net.d
total 16K
-rw-r--r--. 1 root root 294 Nov 15 20:45 100-crio-bridge.conf
-rw-r--r--. 1 root root  54 Nov 15 20:45 200-loopback.conf
-rw-r--r--. 1 root root  83 Dec  5 13:46 80-openshift-network.conf
-rw-r--r--. 1 root root  82 Dec  1 10:19 openshift-sdn.conf

Based on the comments above and the KB article, I'm not sure what the proper resolution is.
Happy to provide inventory file if that's helpful.

[1] https://docs.openshift.com/container-platform/3.11/install_config/configuring_local.html

Comment 8 Petter Abrahamsson 2018-12-29 02:57:48 UTC
Just wanted to follow up and say that my issue was resolved after deleting all the files but 80-openshift-network.conf and restarting the nodes.
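
The workaround in this comment can be sketched as a script that moves the other files aside rather than deleting them, so they can be restored if needed. This is an illustration under assumptions, not the KB procedure: quarantine_cni_confs and the backup path are hypothetical, and the keep-list holds only the file named in this comment (comment 4 also lists openshift-sdn.conf as expected).

```shell
#!/bin/sh
# Sketch: move every file except 80-openshift-network.conf out of a CNI
# config directory into a backup directory.
# quarantine_cni_confs and the backup location are hypothetical choices.
quarantine_cni_confs() {
    dir="$1"
    backup="$2"
    mkdir -p "$backup" || return 1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case "$(basename "$f")" in
            80-openshift-network.conf) ;;                  # kept in place
            *) mv -- "$f" "$backup"/ && echo "moved $f" ;;
        esac
    done
}

# Example (as root, on a node; restart the node afterwards, as in comment 8):
# quarantine_cni_confs /etc/cni/net.d /root/cni-net.d.bak
```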

Comment 9 gerodrig 2019-01-25 11:13:42 UTC
Hello, it seems that this keeps happening after normal installs.
Would it be possible to add a check to the installer for these extra files, or fix the cases where it's not correctly removing the files?
Can we have an update on the status of the bug?

Thank you.

Gerard.

Comment 10 Giuseppe Scrivano 2019-01-28 10:32:33 UTC
@gerodrig, there is a KB article explaining what to do when this issue occurs.  I am not sure the installer should delete these files, which were created separately.

I think the KB article is enough. Scott, what do you think?

Comment 11 Scott Dodson 2019-01-28 13:18:12 UTC
Without clearly documented reproducer steps I don't know if we should do anything more. Even with all the additional comments in this bug it's still not clear to me what workflow produces this problem. If someone is doing something extra outside of the installer then I'm not worried about it, but if a simple installation on clean hosts according to our documented process results in a broken install, we should fix that in the installer.

Comment 12 gerodrig 2019-02-15 12:34:59 UTC
(In reply to Mrunal Patel from comment #1)
> This isn't a supported operation. You have to select cri-o at install time
> in the installer. From the comments it looks like cri-o was installed
> afterwards. Can you confirm that?

Hello Mrunal,

Can you confirm that installing cri-o via rpm either *before* or after the installation is an unsupported operation?

Looking at the documentation [1], it's not clear that the only supported way to install cri-o is to let the installer do it via the inventory variables.


[1] https://access.redhat.com/documentation/en-us/openshift_container_platform/3.11/html-single/cri-o_runtime/

Comment 13 Mrunal Patel 2019-02-19 22:15:07 UTC
CRI-O is only supported through the installer to guarantee that the nodes are in a good state and don't need any further manual steps.

Comment 14 gerodrig 2019-02-20 14:09:15 UTC
(In reply to Mrunal Patel from comment #13)
> CRI-O is only supported through the installer to guarantee that the nodes
> are in a good state and don't need any further manual steps.

Thank you very much for the confirmation.

I've opened a documentation bug to make it clearer: https://bugzilla.redhat.com/show_bug.cgi?id=1679130

Comment 16 weiwei jiang 2019-02-25 07:20:20 UTC
(In reply to Mrunal Patel from comment #13)
> CRI-O is only supported through the installer to guarantee that the nodes
> are in a good state and don't need any further manual steps.

So how should QE handle this bug, given that it is in ON_QA status?

Comment 21 Giuseppe Scrivano 2020-05-26 06:24:05 UTC
I am not sure how to proceed here. The last comment doesn't specify which files needed to be deleted, but deleting the extra files is the solution provided in the KB.

The account that reopened the issue is now disabled, so I cannot ask for more information.

Please reopen if it is still an issue.