Bug 1753059

Summary: openshift-dns fails to start due to NetworkPluginNotReady message:Network plugin returns error: Missing CNI default network
Product: OpenShift Container Platform Reporter: Douglas Smith <dosmith>
Component: Networking Assignee: Dan Mace <dmace>
Networking sub component: router QA Contact: Hongan Li <hongli>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: low CC: aos-bugs
Version: 4.2.0   
Target Milestone: ---   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1780213 Environment:
Last Closed: 2020-01-23 11:06:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1780213    

Description Douglas Smith 2019-09-18 01:16:33 UTC
Description of problem: openshift-dns fails to start with:

Sep 17 17:39:43.716 W ns/openshift-dns pod/dns-default-tvnxp network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: Missing CNI default network (11 times)

Version-Release number of selected component (if applicable):

How reproducible: Appears in 4.2 nightly e2e runs

Additional info: The message shows up in the event logs of this run: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-serial-4.2/541

To view the event logs, click "monitor cluster while tests execute" and then click "open stdout".
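For anyone triaging similar runs, the event line quoted in the description can be broken into its fields with a rough sketch like the one below. The line layout (timestamp, level, ns/..., pod/..., message, optional repeat count) is an assumption inferred only from the single example in this report, and the names `parse_event` and `is_network_not_ready` are hypothetical helpers, not part of any OpenShift tooling.

```python
import re

# Assumed layout of a monitor event line, inferred only from the one
# example quoted in this report; other lines may not match this pattern.
EVENT_RE = re.compile(
    r"^(?P<timestamp>\w{3} \d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[A-Z]) "
    r"ns/(?P<namespace>\S+) pod/(?P<pod>\S+) "
    r"(?P<message>.*?)"
    r"(?: \((?P<count>\d+) times\))?$"
)


def parse_event(line):
    """Return a dict of fields for a matching event line, or None."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    event = m.groupdict()
    # A line without a "(N times)" suffix is treated as a single occurrence.
    event["count"] = int(event["count"]) if event["count"] else 1
    return event


def is_network_not_ready(event):
    """True when the message carries the NetworkPluginNotReady reason."""
    return "reason:NetworkPluginNotReady" in event["message"]
```

Filtering on the namespace and on `is_network_not_ready` would make it easier to see whether the message stops once the network comes up or keeps repeating for the whole run.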

Comment 1 Casey Callendrello 2019-09-18 16:06:28 UTC
Did this actually fail? DNS has tolerations, so it's scheduled on NotReady nodes. But it should still start up.

Doug, did you see a degraded operator? Otherwise this is NOTABUG since it's a transient state.

Comment 3 Casey Callendrello 2019-09-18 16:32:34 UTC
Yeah, the DNS pod has a NotReady toleration. So this error is expected. Once the network is up, it starts just fine.

The "fix" is either to accept this error as ignorable or remove the toleration.
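To illustrate the two options, here is a minimal sketch of how a toleration decides whether a pod may land on a not-ready node. It follows the general Kubernetes matching rules (an `Exists` operator with an empty key tolerates every taint; an empty effect matches every effect). The taint key `node.kubernetes.io/not-ready` with effect `NoSchedule` and the example pod specs are assumptions for illustration; the actual toleration on the DNS daemonset is not quoted in this bug.

```python
# Assumed taint placed on a node whose network is not ready yet.
NOT_READY_TAINT = {"key": "node.kubernetes.io/not-ready", "effect": "NoSchedule"}


def tolerates(toleration, taint):
    """Return True if a single toleration matches the given taint."""
    # An empty or missing effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # Exists with an empty key tolerates every taint.
        return key == "" or key == taint["key"]
    # Equal: both key and value have to match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def pod_tolerates(pod_spec, taint):
    """Return True if any toleration in the pod spec matches the taint."""
    return any(tolerates(t, taint) for t in pod_spec.get("tolerations", []))


# Hypothetical pod specs: one with a not-ready toleration, one without.
with_toleration = {"tolerations": [{"key": "node.kubernetes.io/not-ready", "operator": "Exists"}]}
without_toleration = {"tolerations": []}
```

With the toleration present, the pod is placed before the CNI network exists and logs the transient error. Without it, the scheduler would hold the pod back until the node reports Ready.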

Comment 4 Casey Callendrello 2019-09-18 16:44:26 UTC
Kicking over to Network Edge to decide whether or not they want to remove the toleration.

Comment 5 Dan Mace 2019-09-18 18:47:19 UTC
Thanks Casey. I think we should talk about removing the toleration to smooth things out unless there's some reason for the current behavior.

Moving to 4.3 because we aren't going to block the release unless the transient state has some serious downstream effect.

Comment 7 Hongan Li 2019-11-08 08:41:07 UTC
Verified with 4.3.0-0.nightly-2019-11-06-230519; the issue did not reproduce.

Comment 9 errata-xmlrpc 2020-01-23 11:06:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.