Bug 1625873 - Could not find csr for node when using cni plugin [NEEDINFO]
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.11.0
Assigned To: Michael Gugino
QA Contact: sheng.lao
Depends On:
Blocks: 1628964 1632862
Reported: 2018-09-06 03:17 EDT by zhaozhanqi
Modified: 2018-10-11 03:26 EDT

Clones: 1628964 1632862
Last Closed: 2018-10-11 03:25:56 EDT
Type: Bug
mgugino: needinfo? (ccoleman)




External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2652 None None None 2018-10-11 03:26 EDT

Description zhaozhanqi 2018-09-06 03:17:02 EDT
Description of problem:
Hit "Could not find csr for nodes: ip-172-18-9-45.ec2.internal", "state": "unknown" while setting up an OCP 3.11 cluster with the cni plugin:

os_sdn_network_plugin_name: cni

Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.10.35-1.git.0.e5b821eNone.noarch.rpm
rpm -q ansible
ansible --version

How reproducible:

Steps to Reproduce:
1. Set up an OCP 3.11 cluster with the parameter:
  os_sdn_network_plugin_name: cni

2. The installation job fails.
3. Check the CSRs:
  oc get csr

Actual results:

fatal: [ec2-52-203-112-231.compute-1.amazonaws.com]: FAILED! => {"attempts": 30, "changed": false, "msg": "Could not find csr for nodes: ip-172-18-9-45.ec2.internal", "state": "unknown"}

In step 3, the CSR always stays in 'Pending' status.
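
One way to confirm which requests are stuck is to filter the `oc get csr` listing on its condition column. This is a minimal sketch: it assumes the tabular layout NAME AGE REQUESTOR CONDITION, and the helper name `pending_csrs` is hypothetical. Approving requests by hand, shown only in the comments, works around the symptom and is not the fix that shipped for this bug.

```shell
# Print the names of CSRs whose last column (CONDITION) is exactly "Pending".
# Assumption: input is the default table printed by `oc get csr`
# (NAME AGE REQUESTOR CONDITION). Check this against your oc version.
pending_csrs() {
  awk '$1 != "NAME" && $NF == "Pending" { print $1 }'
}

# Usage on a live cluster (not executed here):
#   oc get csr | pending_csrs
# Manual workaround, one request at a time:
#   oc get csr | pending_csrs | xargs -n 1 oc adm certificate approve
```

Filtering on the last column keeps the helper independent of the requestor name, which varies from node to node.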

Expected results:

The error does not occur.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Comment 3 Scott Dodson 2018-09-06 08:33:36 EDT
Please gather `oc get nodes -o yaml` `oc get csr -o yaml`
Please provide a complete inventory.
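
The two dumps requested above could be gathered into one directory for attaching to the bug. This is only a sketch: the function name, the `OC` override variable, and the output file names are illustrative assumptions, and the inventory still has to be attached separately.

```shell
# Save `oc get nodes -o yaml` and `oc get csr -o yaml` into a directory.
# Assumptions: OC (defaults to "oc") and the file names are hypothetical.
collect_csr_diagnostics() {
  out_dir="${1:-csr-diagnostics}"
  oc_bin="${OC:-oc}"
  mkdir -p "$out_dir" || return 1
  "$oc_bin" get nodes -o yaml > "$out_dir/nodes.yaml"
  "$oc_bin" get csr -o yaml > "$out_dir/csr.yaml"
}

# Usage on a master host (not executed here):
#   collect_csr_diagnostics /tmp/bz1625873
```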
Comment 6 sheng.lao 2018-09-07 06:55:38 EDT
It is easy to reproduce the problem with configuration:
   os_sdn_network_plugin_name: cni
Comment 8 Casey Callendrello 2018-09-07 11:10:50 EDT
I wonder if there is a control-plane component that should be "hostNetwork: true" but isn't.

This would mean that it doesn't come up until the network comes up. When os_sdn_network_plugin_name is "cni", no network is installed and the user is expected to provide their own.

This is NOT a networking bug, AFAICT.
Comment 9 zhaozhanqi 2018-09-10 06:09:24 EDT
Hmm, I'm using 'cni' to test the PR https://github.com/openvswitch/ovn-kubernetes/pull/385

@phil, could you help check this?
Comment 10 Phil Cameron 2018-09-10 11:05:08 EDT
zzhao@redhat.com I am investigating. I am trying to set up a 3.11 cluster and am still debugging. This is among the problems I have hit.
Comment 11 Phil Cameron 2018-09-10 11:18:53 EDT
Casey, (Comment 9)
When I set "os_sdn_network_plugin_name='cni'", redhat/openshift-ovs-multitenant is still installed and the daemonsets are up and running.
Part of the instructions in PR-385 is to delete the daemonsets.
The 'cni' setting may be causing some other install item to not work as expected.

This looks to me like an installer bug, not a network bug.
Comment 12 Michael Gugino 2018-09-10 12:18:21 EDT
The logic to run this in master's config was added in commit b17728d542

Clayton, can you clarify what's supposed to be happening here?  You are the author of that commit.
Comment 13 Michael Gugino 2018-09-12 13:06:00 EDT
PR created in master: https://github.com/openshift/openshift-ansible/pull/10033
Comment 14 Scott Dodson 2018-09-13 16:28:01 EDT
PR for release-3.11: https://github.com/openshift/openshift-ansible/pull/10054
Comment 15 Scott Dodson 2018-09-13 16:41:47 EDT
These fixes missed today's build so I've produced a new build via brew with them for testing.

https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=765897
Comment 16 sheng.lao 2018-09-14 02:43:51 EDT
Fixed at: openshift-ansible-playbooks-3.11.5-1.git.0.5a01a3c.el7_5.noarch.rpm
Comment 18 errata-xmlrpc 2018-10-11 03:25:56 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
