Description of problem:
Running an upgrade against containerized OCP fails at task [openshift_node : Wait for node to be ready]. The node service cannot start. One of the reasons appears to be that the old node service file was not updated to drop its dependency on openvswitch, even though the ovs service file had already been removed before this task.

# cat /etc/systemd/system/atomic-openshift-node.service|grep openv
After=openvswitch.service
Wants=openvswitch.service
PartOf=openvswitch.service
-v /lib/modules:/lib/modules -v /etc/origin/openvswitch:/etc/openvswitch \

# systemctl status openvswitch.service
Unit openvswitch.service could not be found.

# systemctl status atomic-openshift-node -l
● atomic-openshift-node.service
   Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2018-04-25 04:17:03 EDT; 1h 8min ago
 Main PID: 22580 (code=exited, status=1/FAILURE)

Apr 25 04:17:02 qe-jliu-c39-master-etcd-1 atomic-openshift-node[22580]: I0425 04:17:02.536408 22637 factory.go:116] Factory "docker" was unable to handle container "/system.slice/var-lib-origin-openshift.local.volumes-pods-047fedd5\\x2d4861\\x2d11e8\\x2da006\\x2d42010af00014-volumes-kubernetes.io\\x7esecret-sdn\\x2dtoken\\x2d82s7t.mount"
Apr 25 04:17:02 qe-jliu-c39-master-etcd-1 atomic-openshift-node[22580]: I0425 04:17:02.536418 22637 factory.go:109] Factory "systemd" can handle container "/system.slice/var-lib-origin-openshift.local.volumes-pods-047fedd5\\x2d4861\\x2d11e8\\x2da006\\x2d42010af00014-volumes-kubernetes.io\\x7esecret-sdn\\x2dtoken\\x2d82s7t.mount", but ignoring.
Apr 25 04:17:02 qe-jliu-c39-master-etcd-1 atomic-openshift-node[22580]: I0425 04:17:02.536431 22637 manager.go:930] ignoring container "/system.slice/var-lib-origin-openshift.local.volumes-pods-047fedd5\\x2d4861\\x2d11e8\\x2da006\\x2d42010af00014-volumes-kubernetes.io\\x7esecret-sdn\\x2dtoken\\x2d82s7t.mount"
Apr 25 04:17:02 qe-jliu-c39-master-etcd-1 atomic-openshift-node[22580]: I0425 04:17:02.718150 22637 docker_server.go:73] Stop docker server
Apr 25 04:17:02 qe-jliu-c39-master-etcd-1 atomic-openshift-node[23611]: atomic-openshift-node
Apr 25 04:17:02 qe-jliu-c39-master-etcd-1 systemd[1]: atomic-openshift-node.service: main process exited, code=exited, status=1/FAILURE
Apr 25 04:17:03 qe-jliu-c39-master-etcd-1 systemd[1]: Stopped atomic-openshift-node.service.
Apr 25 04:17:03 qe-jliu-c39-master-etcd-1 systemd[1]: Unit atomic-openshift-node.service entered failed state.
Apr 25 04:17:03 qe-jliu-c39-master-etcd-1 systemd[1]: atomic-openshift-node.service failed.
Apr 25 05:00:36 qe-jliu-c39-master-etcd-1 systemd[1]: Cannot add dependency job for unit atomic-openshift-node.service, ignoring: Unit not found.

Version-Release number of the following components:
openshift-ansible-3.10.0-0.28.0.git.0.439cb5c.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run an upgrade against containerized OCP without setting openshift_use_system_containers

Actual results:
Upgrade failed.

Expected results:
Upgrade succeeds.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
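The manual check in the description (grep the node unit file for openvswitch, then ask systemd whether that unit exists) can be wrapped in a small helper. This is only a sketch: the function name list_unit_deps is hypothetical and not part of openshift-ansible, and the directive list is a common subset of systemd dependency settings, not an exhaustive one.

```shell
#!/bin/sh
# list_unit_deps UNIT_FILE UNIT_NAME
# Hypothetical helper: print the dependency directives in UNIT_FILE that
# name UNIT_NAME. The directive list below is a common subset only.
list_unit_deps() {
    unit_file=$1
    unit_name=$2
    # First keep only dependency directives, then match the unit name
    # literally (-F) so the dots in "openvswitch.service" are not treated
    # as regex wildcards.
    grep -E '^(After|Before|Wants|Requires|Requisite|BindsTo|PartOf)=' "$unit_file" \
        | grep -F -- "$unit_name"
}

# Example, using the paths from the description:
#   list_unit_deps /etc/systemd/system/atomic-openshift-node.service openvswitch.service
#   systemctl status openvswitch.service
# If the first command prints directives and the second reports that the
# unit could not be found, the node unit file still references a removed unit.
```

Unlike the plain grep for "openv" above, this does not match the ExecStart volume mount line, so it reports only the After=, Wants= and PartOf= entries.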
The containerized upgrade cannot proceed. This bug needs to be fixed ASAP.
https://github.com/openshift/openshift-ansible/pull/8239 WIP
Fix is available in openshift-ansible-3.10.0-0.35.0
Verification is blocked by bz1575897. Removing testblocker first.
Version: openshift-ansible-3.10.0-0.41.0.git.0.88119e4.el7.noarch

The original issue that prevented the node service from starting has been fixed. The upgrade against containerized OCP still fails, but that failure is tracked separately in bz1575507. Verifying this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816