Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1548525

Summary: Cannot start atomic-openshift-node when using networkpolicy plugin while upgrade from 3.7 to 3.9
Product: OpenShift Container Platform Reporter: Vikas Laad <vlaad>
Component: Networking Assignee: Casey Callendrello <cdc>
Networking sub component: openshift-sdn QA Contact: zhaozhanqi <zzhao>
Status: CLOSED EOL Docs Contact:
Severity: high    
Priority: unspecified CC: aos-bugs, bbennett, danw, erich, wmeng, wsun, xtian
Version: 3.8.0   
Target Milestone: ---   
Target Release: 3.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-12-05 21:55:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vikas Laad 2018-02-23 18:34:07 UTC
Description of problem:
While upgrading from 3.7 to 3.9, the upgrade fails with the following error:

Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-node[77664]: E0223 18:19:02.519017   77664 networkpolicy.go:130] Unable to query NetworkPolicies (networkpolicies.networking.k8s.io is forbidden: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list networkpolicies.networking.k8s.io at the cluster scope: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list all networkpolicies.networking.k8s.io in the cluster) - please ensure your nodes have access to view NetworkPolicy (eg, 'oc adm policy reconcile-cluster-roles')
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-node[77664]: F0223 18:19:02.519049   77664 network.go:44] SDN node startup failed: networkpolicies.networking.k8s.io is forbidden: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list networkpolicies.networking.k8s.io at the cluster scope: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list all networkpolicies.networking.k8s.io in the cluster
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: atomic-openshift-node.service: main process exited, code=exited, status=255/n/a
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal dnsmasq[38697]: setting upstream servers from DBus
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal dnsmasq[38697]: using nameserver 172.31.0.2#53
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: Failed to start OpenShift Node.
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: Unit atomic-openshift-node.service entered failed state.
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: atomic-openshift-node.service failed.
Feb 23 18:19:03 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-master-api[21430]: I0223 18:19:03.104565   21430 rest.go:362] Starting watch for /api/v1/namespaces, rv=27363 labels= fields= timeout=9m59s
Feb 23 18:19:03 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-master-api[21430]: I0223 18:19:03.283532   21430 rest.go:362] Starting watch for /api/v1/nodes, rv=27363 labels= fields=metadata.name=ip-172-31-62-223.us-west-2.compute.internal timeout=9m50s
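
For triage, the "is forbidden" message in the log above carries the three facts that matter: which user was denied, which verb was attempted, and on which resource. The following is a minimal sketch, not part of the original report, of pulling those fields out of such a line. The function name parse_forbidden and the regular expression are illustrative assumptions, and the sample string is a shortened copy of the fatal line above.

```python
import re

# Assumed pattern for the Kubernetes authorization error text seen in the log:
#   <resource> is forbidden: User "<user>" cannot <verb> ...
FORBIDDEN_RE = re.compile(
    r'(?P<resource>[\w.\-]+) is forbidden: '
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) '
)


def parse_forbidden(line):
    """Return (user, verb, resource) from a 'forbidden' log line, or None."""
    match = FORBIDDEN_RE.search(line)
    if match is None:
        return None
    return match.group("user"), match.group("verb"), match.group("resource")


# Shortened copy of the fatal log line from the report (timestamp prefix dropped).
sample = (
    'F0223 18:19:02.519049   77664 network.go:44] SDN node startup failed: '
    'networkpolicies.networking.k8s.io is forbidden: User '
    '"system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list '
    'networkpolicies.networking.k8s.io at the cluster scope'
)

print(parse_forbidden(sample))
```

Here the denied user is the node's own identity (system:node:...), the verb is list, and the resource is networkpolicies.networking.k8s.io, which matches the hint in the error text that the node role lacks permission to view NetworkPolicy objects.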


Version-Release number of selected component (if applicable):
openshift v3.8.32
kubernetes v1.8.5+440f8d36da
etcd 3.2.8

How reproducible:
During the upgrade from 3.7 to 3.9 with NetworkPolicy plugin

Steps to Reproduce:
1. Create OCP 3.7 cluster with networkpolicy plugin
2. Run v3_9 playbook to upgrade
3. Playbook fails

Actual results:
Upgrade fails

Expected results:
Upgrade should succeed.

Additional info:
Looks like it was fixed in 3.9 but needs to be backported to 3.8.
https://bugzilla.redhat.com/show_bug.cgi?id=1523153

Comment 1 Dan Winship 2018-02-23 18:35:16 UTC
https://github.com/openshift/ose/pull/1082

Comment 2 Vikas Laad 2018-02-27 14:51:03 UTC
This blocks the upgrade scenario.

Comment 4 Weihua Meng 2018-03-07 04:45:54 UTC
Could you please make a 3.8 rpm build for testing?
Thanks.

Comment 6 Weihua Meng 2018-03-09 03:01:44 UTC
Fixed.
The upgrade from 3.7 to 3.9 succeeded with redhat/openshift-ovs-networkpolicy.
The latest 3.8 and 3.9 builds were used:
v3.8.33-1
v3.9.3-1
and
openshift-ansible-3.9.3-1.git.0.e166207.el7.noarch for ansible-playbook

Both RPM and containerized installs were tested.