Description of problem:
The following bug still happens: Bug 1382380 - Upgrade from 3.2 to 3.3 fails with "could not get EgressNetworkPolicies"

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.3.41-1.git.0.a1a327b.el7.noarch

How reproducible:
Always. One customer hit it, and I can reproduce the issue in my lab as well.

Steps to Reproduce:
1. Upgrade 3.2 to 3.3 using /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_3/upgrade.yml

Actual results:
atomic-openshift-node is restarted without a 3.3 master:

Nov 07 17:19:08 tkimura-oseha-01.usersys.redhat.com atomic-openshift-node[25808]: F1107 17:19:08.480417 25808 node.go:343] error: SDN node startup failed: could not get EgressNetworkPolicies: the server could not find the requested resource

Expected results:
No failure.

Additional info:
The upstream fix is https://github.com/openshift/openshift-ansible/pull/2593/files. It is included in the changelog of atomic-openshift-utils, but the node restart code still exists.
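A quick way to confirm whether a node hit this failure is to scan the atomic-openshift-node journal for the SDN startup error. The snippet below is a minimal sketch: the sample log line is copied from the report above, and on a live host you would feed it the output of `journalctl -u atomic-openshift-node` instead of the hard-coded sample.

```shell
# Minimal sketch: detect the SDN startup failure signature in node logs.
# The sample line is copied from the failure report; on a live host,
# replace the printf with: journalctl -u atomic-openshift-node --no-pager
log_sample='F1107 17:19:08.480417 25808 node.go:343] error: SDN node startup failed: could not get EgressNetworkPolicies: the server could not find the requested resource'

if printf '%s\n' "$log_sample" | grep -q 'could not get EgressNetworkPolicies'; then
    echo "node hit the EgressNetworkPolicies failure"
fi
```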
I think this might actually be https://github.com/openshift/openshift-ansible/pull/2637, which was fixed in 3.4 but did not get backported to 3.3. In this scenario your failure *was* during node upgrade, not master upgrade as in the bug/PR you linked above. However, it appears that despite being upgraded, the masters never actually got restarted. I will try to reproduce this and submit a PR for the backport.
I can't quite reproduce the failure, but I can reproduce the condition where the master API is not restarted before proceeding to the node upgrade. This was actually already backported and will be available in openshift-ansible-3.3.42-1 or greater (one version beyond the one where this was reported).
FYI, I used an HA setup for the reproducer: 3 master/etcd hosts and 2 node hosts.
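For reference, a BYO inventory matching that reproducer topology might look like the sketch below. This is a hypothetical fragment, not the actual inventory from the report; all hostnames and any omitted variables are placeholders.

```ini
# Hypothetical HA reproducer inventory: 3 master/etcd hosts, 2 node hosts.
[OSEv3:children]
masters
etcd
nodes

[masters]
master1.example.com
master2.example.com
master3.example.com

[etcd]
master1.example.com
master2.example.com
master3.example.com

[nodes]
node1.example.com
node2.example.com
```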
No EgressNetworkPolicies error, so moving the bug to VERIFIED.
This was fixed in openshift-ansible-3.3.42-1.