Bug 1625873

Summary: Could not find csr for node when using cni plugin
Product: OpenShift Container Platform
Reporter: zhaozhanqi <zzhao>
Component: Installer
Assignee: Michael Gugino <mgugino>
Status: CLOSED ERRATA
QA Contact: sheng.lao <shlao>
Severity: high
Docs Contact:
Priority: high
Version: 3.11.0
CC: aos-bugs, ccoleman, cdc, jokerman, juriarte, mmccomas, pcameron, shlao, wmeng, zzhao
Target Milestone: ---   
Target Release: 3.11.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1628964 1632862
Environment:
Last Closed: 2018-10-11 07:25:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1628964, 1632862, 1651773    

Description zhaozhanqi 2018-09-06 07:17:02 UTC
Description of problem:
The installer failed with "Could not find csr for nodes: ip-172-18-9-45.ec2.internal", "state": "unknown" while setting up an OCP 3.11 cluster with the cni plugin:

os_sdn_network_plugin_name: cni

Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.10.35-1.git.0.e5b821eNone.noarch.rpm
rpm -q ansible
ansible --version

How reproducible:

Steps to Reproduce:
1. Set up an OCP 3.11 cluster with the parameter:
  os_sdn_network_plugin_name: cni

2. The installation job fails.
3. Check the CSRs:
  oc get csr

Actual results:

fatal: [ec2-52-203-112-231.compute-1.amazonaws.com]: FAILED! => {"attempts": 30, "changed": false, "msg": "Could not find csr for nodes: ip-172-18-9-45.ec2.internal", "state": "unknown"}

In step 3, the CSR always stays in 'pending' status.
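
To see which requests are stuck, the output of `oc get csr -o json` can be filtered for requests that carry no approval decision yet. The following is only a minimal sketch, assuming the usual CertificateSigningRequest list layout (`items[].status.conditions`); the helper name `pending_csrs` and the sample data are hypothetical and not taken from the failing cluster:

```python
import json


def pending_csrs(csr_list):
    """Return (name, username) pairs for CSRs that are neither approved nor denied."""
    pending = []
    for item in csr_list.get("items", []):
        # A pending CSR has no Approved/Denied entry in status.conditions.
        conditions = (item.get("status") or {}).get("conditions") or []
        decided = any(c.get("type") in ("Approved", "Denied") for c in conditions)
        if not decided:
            username = (item.get("spec") or {}).get("username", "")
            pending.append((item["metadata"]["name"], username))
    return pending


# Hypothetical sample shaped like `oc get csr -o json`; the CSR names are made up.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "csr-approved"},
     "spec": {"username": "system:node:node-a"},
     "status": {"conditions": [{"type": "Approved"}]}},
    {"metadata": {"name": "csr-waiting"},
     "spec": {"username": "system:node:ip-172-18-9-45.ec2.internal"},
     "status": {}}
  ]
}
""")

if __name__ == "__main__":
    for name, user in pending_csrs(SAMPLE):
        print(name, user)
```

On a real cluster the input would come from `json.load(sys.stdin)` with `oc get csr -o json` piped in; each pending name can then be inspected with `oc describe csr <name>`.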

Expected results:

The installation completes without this error.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 3 Scott Dodson 2018-09-06 12:33:36 UTC
Please gather the output of `oc get nodes -o yaml` and `oc get csr -o yaml`.
Please provide a complete inventory.

Comment 6 sheng.lao 2018-09-07 10:55:38 UTC
The problem is easy to reproduce with the configuration:
   os_sdn_network_plugin_name: cni

Comment 8 Casey Callendrello 2018-09-07 15:10:50 UTC
I wonder if there is a control-plane component that should be "hostNetwork: true" but isn't.

This would mean that it doesn't come up until the network comes up. When os_sdn_network_plugin_name is "cni", no network is installed and the user is expected to provide their own.

This is NOT a networking bug, AFAICT.
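
One way to check the hypothesis above is to list the control-plane pods that do not set `hostNetwork: true`. This is a rough sketch only, assuming the pod list comes from something like `oc get pods -n kube-system -o json`; the namespace, the helper name `pods_without_host_network`, and the sample pod names are assumptions for illustration:

```python
def pods_without_host_network(pod_list):
    """Return names of pods whose spec does not enable hostNetwork."""
    return [
        pod["metadata"]["name"]
        for pod in pod_list.get("items", [])
        # hostNetwork defaults to false when the field is absent.
        if not (pod.get("spec") or {}).get("hostNetwork", False)
    ]


# Hypothetical sample shaped like `oc get pods -o json`; pod names are made up.
SAMPLE_PODS = {
    "items": [
        {"metadata": {"name": "uses-host-network"}, "spec": {"hostNetwork": True}},
        {"metadata": {"name": "needs-pod-network"}, "spec": {}},
    ]
}

if __name__ == "__main__":
    print(pods_without_host_network(SAMPLE_PODS))
```

Any pod this returns depends on the pod network, so it cannot start until a CNI plugin has been provided.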

Comment 9 zhaozhanqi 2018-09-10 10:09:24 UTC
Hmm, I'm using 'cni' to test the PR https://github.com/openvswitch/ovn-kubernetes/pull/385

@phil, could you help check this?

Comment 10 Phil Cameron 2018-09-10 15:05:08 UTC
zzhao, I am investigating. I am trying to set up a 3.11 cluster and am still debugging. This is among the problems I am seeing.

Comment 11 Phil Cameron 2018-09-10 15:18:53 UTC
Casey, (Comment 9)
When I set "os_sdn_network_plugin_name='cni'", redhat/openshift-ovs-multitenant is still installed and the daemonsets are up and running.
Part of the instructions in PR-385 is to delete the daemonsets.
The 'cni' may be causing some other install item to not work as expected.

This looks to me like an installer bug, not a network bug.

Comment 12 Michael Gugino 2018-09-10 16:18:21 UTC
The logic to run this in master's config was added in commit b17728d542

Clayton, can you clarify what's supposed to be happening here?  You are the author of that commit.

Comment 13 Michael Gugino 2018-09-12 17:06:00 UTC
PR created in master: https://github.com/openshift/openshift-ansible/pull/10033

Comment 14 Scott Dodson 2018-09-13 20:28:01 UTC
PR for release-3.11: https://github.com/openshift/openshift-ansible/pull/10054

Comment 15 Scott Dodson 2018-09-13 20:41:47 UTC
These fixes missed today's build so I've produced a new build via brew with them for testing.

https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=765897

Comment 16 sheng.lao 2018-09-14 06:43:51 UTC
Fixed at: openshift-ansible-playbooks-3.11.5-1.git.0.5a01a3c.el7_5.noarch.rpm

Comment 18 errata-xmlrpc 2018-10-11 07:25:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652

Comment 19 Red Hat Bugzilla 2023-09-14 04:34:18 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days