Description of problem:
upgrade failed due to crio client and server mismatch

Version-Release number of the following components:
openshift-ansible-3.10.101-1.git.0.5f32198.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. install OCP v3.9 with cri-o container runtime.
2. upgrade to v3.10

Actual results:
upgrade failed.

TASK [openshift_node : Check that node image is present] ***********************
task path: /home/slave2/workspace/Run-Ansible-Playbooks-Nextge/private-openshift-ansible/roles/openshift_node/tasks/prepull.yml:2
Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
<ec2-3-90-247-103.compute-1.amazonaws.com> ESTABLISH SSH CONNECTION FOR USER: root
<ec2-3-90-247-103.compute-1.amazonaws.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/home/slave2/workspace/Run-Ansible-Playbooks-Nextge/private/config/keys/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/home/slave2/.ansible/cp/%C ec2-3-90-247-103.compute-1.amazonaws.com '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''
<ec2-3-90-247-103.compute-1.amazonaws.com> (1, '\n{"changed": true, "end": "2019-01-23 02:36:13.533830", "stdout": "", "cmd": ["crictl", "images", "-q", "registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.10"], "failed": true, "delta": "0:00:00.016518", "stderr": "W0123 02:36:13.531278 67461 util_unix.go:75] Using \\"/var/run/crio/crio.sock\\" as endpoint is deprecated, please consider using full url format \\"unix:///var/run/crio/crio.sock\\".\\ntime=\\"2019-01-23T02:36:13-05:00\\" level=fatal msg=\\"listing images failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\\" ", "rc": 1, "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "_raw_params": "crictl images -q registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.10", "removes": null, "creates": null, "chdir": null, "stdin": null}}, "start": "2019-01-23 02:36:13.517312", "msg": "non-zero return code"}\n', '')
fatal: [ec2-3-90-247-103.compute-1.amazonaws.com]: FAILED! => {
    "changed": true,
    "cmd": [
        "crictl",
        "images",
        "-q",
        "registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.10"
    ],
    "delta": "0:00:00.016518",
    "end": "2019-01-23 02:36:13.533830",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "crictl images -q registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.10",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "warn": true
        }
    },
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2019-01-23 02:36:13.517312",
    "stderr": "W0123 02:36:13.531278 67461 util_unix.go:75] Using \"/var/run/crio/crio.sock\" as endpoint is deprecated, please consider using full url format \"unix:///var/run/crio/crio.sock\".\ntime=\"2019-01-23T02:36:13-05:00\" level=fatal msg=\"listing images failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" ",
    "stderr_lines": [
        "W0123 02:36:13.531278 67461 util_unix.go:75] Using \"/var/run/crio/crio.sock\" as endpoint is deprecated, please consider using full url format \"unix:///var/run/crio/crio.sock\".",
        "time=\"2019-01-23T02:36:13-05:00\" level=fatal msg=\"listing images failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService\" "
    ],
    "stdout": "",
    "stdout_lines": []
}

info when upgrade failed:
[root@ip-172-18-31-212 ~]# crictl --version
crictl version 1.0.0-beta.0
[root@ip-172-18-31-212 ~]# rpm -q cri-o
cri-o-1.9.14-1.git4e220eb.el7.x86_64
[root@ip-172-18-31-212 ~]# rpm -q cri-tools
cri-tools-1.0.0-5.rhaos3.10.git2e22a75.el7.x86_64
[root@ip-172-18-31-212 ~]# oc get node -owide
NAME                            STATUS   ROLES     AGE   VERSION             EXTERNAL-IP      OS-IMAGE                                      KERNEL-VERSION              CONTAINER-RUNTIME
ip-172-18-11-104.ec2.internal   Ready    master    1h    v1.9.1+a0ce1bc657   34.229.101.173   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14
ip-172-18-12-197.ec2.internal   Ready    <none>    1h    v1.9.1+a0ce1bc657   54.166.154.56    Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14
ip-172-18-15-193.ec2.internal   Ready    master    1h    v1.9.1+a0ce1bc657   54.224.233.49    Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14
ip-172-18-17-45.ec2.internal    Ready    <none>    1h    v1.9.1+a0ce1bc657   3.90.205.150     Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14
ip-172-18-25-141.ec2.internal   Ready    <none>    1h    v1.9.1+a0ce1bc657   54.160.180.155   Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14
ip-172-18-3-134.ec2.internal    Ready    compute   1h    v1.9.1+a0ce1bc657   52.203.131.75    Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14
ip-172-18-30-239.ec2.internal   Ready    compute   1h    v1.9.1+a0ce1bc657   34.228.55.131    Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14
ip-172-18-31-212.ec2.internal   Ready    master    1h    v1.9.1+a0ce1bc657   3.90.247.103     Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.1.3.el7.x86_64   cri-o://1.9.14

Expected results:
upgrade succeeded.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
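The fatal message in the log above is the signature of this failure: crictl requests the gRPC service runtime.v1alpha2.ImageService and the running cri-o 1.9.14 server answers "Unimplemented ... unknown service". A minimal sketch of a check for that signature in captured crictl stderr follows. The function name is_cri_api_mismatch is hypothetical and not part of openshift-ansible or cri-tools; it only pattern-matches the two substrings seen in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of openshift-ansible or cri-tools).
# Returns 0 when the given crictl stderr text contains the gRPC
# "Unimplemented" / "unknown service runtime.*" pair seen in this report,
# which indicates the client speaks a CRI API version the server lacks.
is_cri_api_mismatch() {
    case "$1" in
        *"code = Unimplemented"*"unknown service runtime."*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On an affected node this could be used as: is_cri_api_mismatch "$(crictl images -q 2>&1)" && echo "crictl and cri-o disagree on the CRI API version"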
What version of the installer was used to install 3.9? The upgrade playbooks only assert that cri-tools is installed. It should have been installed when 3.9 was installed with cri-o, but in your case it was installed during the upgrade, in this task:

TASK [openshift_control_plane : Ensure cri-tools installed] ********************
OCP v3.9 was installed by openshift-ansible-3.9.65-1.git.0.a14009a.el7.noarch

I did not see this task during OCP v3.9 install.
TASK [openshift_control_plane : Ensure cri-tools installed]
(In reply to Weihua Meng from comment #4)
> OCP v3.9 was installed by
> openshift-ansible-3.9.65-1.git.0.a14009a.el7.noarch
>
> I did not see this task during OCP v3.9 install.
> TASK [openshift_control_plane : Ensure cri-tools installed]

Sorry, I meant that this was the task from your upgrade log that installed cri-tools. Because the package wasn't previously installed, the task pulled the latest version.

Taking another look at the 3.9 codebase, cri-tools would only have been installed in 3.9 if the cluster had been upgraded from a release prior to 3.9, which seems like a problem unto itself. We'll have to look into removing the dependency on cri-tools in the 3.9 to 3.10 upgrade codepath, or find some other way to make sure that we install a 3.9 version.

The workaround is to install cri-tools while running 3.9, before enabling the 3.10 repo.
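Before starting the upgrade, it may help to confirm which release stream the installed cri-tools package came from. The sketch below parses the output of rpm -q cri-tools. It assumes the release field embeds the stream as "rhaosX.Y", as in cri-tools-1.0.0-5.rhaos3.10.git2e22a75.el7.x86_64 from this report; that a 3.9 build carries "rhaos3.9" is an assumption, not something confirmed in this bug. The function names rhaos_stream and cri_tools_matches are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; assume the RPM release field embeds "rhaosX.Y".

# Print the X.Y stream found in a name-version-release string,
# or nothing when no "rhaosX.Y" component is present.
rhaos_stream() {
    printf '%s\n' "$1" |
        sed -n 's/.*\.rhaos\([0-9][0-9]*\.[0-9][0-9]*\)\..*/\1/p'
}

# Return 0 when the package's stream equals the expected release,
# e.g. cri_tools_matches "$(rpm -q cri-tools)" 3.9
cri_tools_matches() {
    [ "$(rhaos_stream "$1")" = "$2" ]
}
```

On a 3.9 node, after installing cri-tools and before enabling the 3.10 repo, something like cri_tools_matches "$(rpm -q cri-tools)" 3.9 || echo "cri-tools is not from the 3.9 stream" would flag the mismatch described above.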
The workaround works.

The latest released openshift-ansible is openshift-ansible-3.10.89-1.git.0.14ed1cb.el7.noarch
It has the same issue, so this is not a regression.
Testing 3.9 crio cluster upgrades.
Proposed https://github.com/openshift/openshift-ansible/pull/11146
Fixed. openshift-ansible-3.10.112-1.git.0.7823ef0.el7.noarch
*** Bug 1680278 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0405