Bug 1548525 - Cannot start atomic-openshift-node when using networkpolicy plugin while upgrade from 3.7 to 3.9
Keywords:
Status: CLOSED EOL
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.8.0
Assignee: Casey Callendrello
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-23 18:34 UTC by Vikas Laad
Modified: 2019-12-05 21:55 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-12-05 21:55:35 UTC
Target Upstream Version:
Embargoed:


Description Vikas Laad 2018-02-23 18:34:07 UTC
Description of problem:
While upgrading from 3.7 to 3.9, the upgrade fails with the following error:

Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-node[77664]: E0223 18:19:02.519017   77664 networkpolicy.go:130] Unable to query NetworkPolicies (networkpolicies.networking.k8s.io is forbidden: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list networkpolicies.networking.k8s.io at the cluster scope: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list all networkpolicies.networking.k8s.io in the cluster) - please ensure your nodes have access to view NetworkPolicy (eg, 'oc adm policy reconcile-cluster-roles')
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-node[77664]: F0223 18:19:02.519049   77664 network.go:44] SDN node startup failed: networkpolicies.networking.k8s.io is forbidden: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list networkpolicies.networking.k8s.io at the cluster scope: User "system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list all networkpolicies.networking.k8s.io in the cluster
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: atomic-openshift-node.service: main process exited, code=exited, status=255/n/a
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal dnsmasq[38697]: setting upstream servers from DBus
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal dnsmasq[38697]: using nameserver 172.31.0.2#53
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: Failed to start OpenShift Node.
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: Unit atomic-openshift-node.service entered failed state.
Feb 23 18:19:02 ip-172-31-19-190.us-west-2.compute.internal systemd[1]: atomic-openshift-node.service failed.
Feb 23 18:19:03 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-master-api[21430]: I0223 18:19:03.104565   21430 rest.go:362] Starting watch for /api/v1/namespaces, rv=27363 labels= fields= timeout=9m59s
Feb 23 18:19:03 ip-172-31-19-190.us-west-2.compute.internal atomic-openshift-master-api[21430]: I0223 18:19:03.283532   21430 rest.go:362] Starting watch for /api/v1/nodes, rv=27363 labels= fields=metadata.name=ip-172-31-62-223.us-west-2.compute.internal timeout=9m50s
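
For triage across many nodes, the "is forbidden" message in the log above can be picked apart mechanically. The following is only a minimal sketch, not part of OpenShift: the helper name parse_forbidden and the regular expression are hypothetical, and the pattern assumes the message wording stays exactly as it appears in this report.

```python
import re
from typing import Dict, Optional

# Assumed message shape, taken from the log lines in this report:
#   <resource> is forbidden: User "<user>" cannot <verb> ...
FORBIDDEN_RE = re.compile(
    r'(?P<resource>[\w.\-]+) is forbidden: '
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) '
)


def parse_forbidden(line: str) -> Optional[Dict[str, str]]:
    """Return the resource, user and verb of a 'forbidden' log line, or None."""
    match = FORBIDDEN_RE.search(line)
    return match.groupdict() if match else None


# Shortened copy of the fatal line reported above.
sample = (
    'F0223 18:19:02.519049   77664 network.go:44] SDN node startup failed: '
    'networkpolicies.networking.k8s.io is forbidden: User '
    '"system:node:ip-172-31-19-190.us-west-2.compute.internal" cannot list '
    'networkpolicies.networking.k8s.io at the cluster scope'
)
info = parse_forbidden(sample)
print(info)
```

Here the extracted user is the node identity (system:node:...) and the verb is list, which matches the diagnosis in the log itself: the node lacks permission to list NetworkPolicy objects cluster-wide.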


Version-Release number of selected component (if applicable):
openshift v3.8.32
kubernetes v1.8.5+440f8d36da
etcd 3.2.8

How reproducible:
During the upgrade from 3.7 to 3.9 with NetworkPolicy plugin

Steps to Reproduce:
1. Create OCP 3.7 cluster with networkpolicy plugin
2. Run v3_9 playbook to upgrade
3. Playbook fails

Actual results:
Upgrade fails

Expected results:
Upgrade should succeed.

Additional info:
Looks like it was fixed in 3.9 but needs to be backported to 3.8.
https://bugzilla.redhat.com/show_bug.cgi?id=1523153

Comment 1 Dan Winship 2018-02-23 18:35:16 UTC
https://github.com/openshift/ose/pull/1082

Comment 2 Vikas Laad 2018-02-27 14:51:03 UTC
This blocks upgrade scenario.

Comment 4 Weihua Meng 2018-03-07 04:45:54 UTC
Could you please make a 3.8 rpm build for test?
Thanks.

Comment 6 Weihua Meng 2018-03-09 03:01:44 UTC
Fixed.
The upgrade from 3.7 to 3.9 with redhat/openshift-ovs-networkpolicy succeeds.
The latest 3.8 and 3.9 builds were used:
v3.8.33-1
v3.9.3-1
and
openshift-ansible-3.9.3-1.git.0.e166207.el7.noarch for ansible-playbook

Both RPM and containerized installs were tested.

