Description of problem:
Because openvswitch is not restarted during the upgrade, new pods cannot be deployed and the router is unavailable between nodes. "docker ps" shows the openvswitch container still running an untagged image. After restarting openvswitch, the container runs openshift3/openvswitch:v3.2.0.1 and the issue is resolved.

Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.0.57-

How reproducible:
Always

Steps to Reproduce:
1) Run the upgrade playbook:
   ansible-playbook -i config/oseatomic /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_1_to_v3_2/upgrade.yml | tee upgrade.log
2) Check pod status:
   oc get pods
   oc logs docker-registry-2-deploy
3) Check the running containers with docker ps.
4) Restart openvswitch, then check the pods and containers again.

Actual results:
2) # oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-1-lj68f    1/1       Running   1          33m
docker-registry-2-deploy   0/1       Error     0          14m
router-1-86wk0             1/1       Running   1          2h
router-2-deploy            0/1       Error     0          14m

# oc logs docker-registry-2-deploy
F0315 06:40:09.924072 1 deployer.go:69] couldn't get deployment default/docker-registry-2: Get https://172.30.0.1:443/api/v1/namespaces/default/replicationcontrollers/docker-registry-2: dial tcp 172.30.0.1:443: i/o timeout

3) # docker ps
CONTAINER ID   IMAGE                             COMMAND                  CREATED          STATUS          PORTS   NAMES
1b82f2a34947   openshift3/node:v3.2.0.1          "/usr/local/bin/origi"   8 minutes ago    Up 8 minutes            atomic-openshift-node
a620a50a3602   openshift3/openvswitch            "/usr/local/bin/ovs-r"   13 minutes ago   Up 13 minutes           openvswitch

4) # docker ps
CONTAINER ID   IMAGE                             COMMAND                  CREATED          STATUS          PORTS   NAMES
e38794410531   openshift3/node:v3.2.0.1          "/usr/local/bin/origi"   6 minutes ago    Up 6 minutes            atomic-openshift-node
e2e31e84de0c   openshift3/openvswitch:v3.2.0.1   "/usr/local/bin/ovs-r"   6 minutes ago    Up 6 minutes            openvswitch

Expected results:
openvswitch should be restarted or reloaded during the upgrade so that it runs the upgraded image.

Additional info:
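Until the upgrade playbook handles this, a manual workaround that matches the observation in step 4 is to restart openvswitch on each affected node so it picks up the versioned image. This is a sketch based on the service/container names visible in the docker ps output above (containerized install assumed), not the official fix:

# Restart the containerized openvswitch service so the new image is used
systemctl restart openvswitch
# Restart the node service so it reattaches to the restarted openvswitch
systemctl restart atomic-openshift-node
# Confirm the openvswitch container now runs the versioned image
docker ps | grep openvswitch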
Basically, I think the following needs to happen: on upgrade from 3.1 -> 3.2, we will need to modify the systemd unit files for the containerized components to include the latest dependencies and ordering prior to restarting services/docker on the hosts.

Brenton, one thing I'm wondering about: would it make sense to re-apply the systemd unit templates in this case rather than modifying parts of the files on disk? We could use the relative path to the templates from the roles (or a symlink in the upgrade playbook directory).
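To illustrate the kind of dependency and ordering changes in question: the containerized node unit needs directives along these lines so that restarting docker (or openvswitch) during the upgrade also cycles the dependent services. This is a rough sketch of the relevant [Unit] section, not the exact shipped template:

[Unit]
# Start only after docker and openvswitch are up
After=docker.service
After=openvswitch.service
# Pull in docker, and stop/restart together with it
Requires=docker.service
PartOf=docker.service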
Yeah, that would probably make more sense. How much refactoring would be needed to make that possible? Would we create separate roles for the systemd units, or are you saying we'd simply create a new task in the upgrade playbooks that reuses the existing templates and sets any needed variables appropriately?
I'm suggesting using the existing templates in the roles directories from the upgrade playbooks. The easiest way would probably be to symlink the templates into the upgrade subdirectory.
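A rough sketch of the symlink approach (the template name and relative paths here are illustrative, not the actual repo layout):

# From the upgrade playbook directory, expose the role's unit template
mkdir -p templates
ln -s ../../../../../roles/openshift_node/templates/openvswitch.docker.service templates/

A task in the upgrade playbook could then render the symlinked template with Ansible's template module and restart the service before the rest of the service/docker restarts happen.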
I found this bug has been fixed in atomic-openshift-utils-3.0.64; shall we move it to ON_QA?
We're having to do a bit of refactoring of the way the systemd units are installed. I'd prefer to retest this bug once I'm done with that.
openvswitch is restarted during the upgrade with openshift-ansible-3.0.67-1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:1064