Description of problem:
Upgrade failed at task [openshift_node : Copy node container image to ostree storage].

FAILED - RETRYING: Copy node container image to ostree storage (3 retries left).
FAILED - RETRYING: Copy node container image to ostree storage (2 retries left).
FAILED - RETRYING: Copy node container image to ostree storage (1 retries left).
fatal: [x]: FAILED! => {"attempts": 3, "changed": false, "cmd": ["atomic", "pull", "--storage=ostree", "docker:registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.11"], "delta": "0:00:01.013669", "end": "2018-09-17 05:39:11.586435", "msg": "non-zero return code", "rc": 1, "start": "2018-09-17 05:39:10.572766", "stderr": "time=\"2018-09-17T05:39:11Z\" level=fatal msg=\"Error initializing source docker-daemon:registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.11: Error loading image from docker engine: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?\" ", "stderr_lines": ["time=\"2018-09-17T05:39:11Z\" level=fatal msg=\"Error initializing source docker-daemon:registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.11: Error loading image from docker engine: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?\" "], "stdout": "", "stdout_lines": []}

======================
This is likely caused by the preceding task [openshift_node : stop docker to kill static pods], a change merged in PR 10030. Docker is stopped at the point where the image pull runs:
[root@ip-172-18-14-104 ~]# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/docker.service.d
           └─custom.conf
           /usr/lib/systemd/system/docker.service.d
           └─flannel.conf
   Active: inactive (dead) since Mon 2018-09-17 05:37:11 UTC; 4min 39s ago

[root@ip-172-18-14-104 ~]# atomic pull --storage=ostree docker:registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.11
FATA[0000] Error initializing source docker-daemon:registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.11: Error loading image from docker engine: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Version-Release number of the following components:
ansible-2.6.4-1.el7ae.noarch
openshift-ansible-3.11.7-1.git.0.911481d.el7_5.noarch

How reproducible:
always

Steps to Reproduce:
1. Install OCP v3.10 on Atomic Host with a system container node and without the service catalog deployed.
2. Upgrade the above OCP cluster.

Actual results:
Upgrade failed.

Expected results:
Upgrade succeeds.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
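
For illustration only, the ordering problem can be sketched in a few lines of Python. This is not the openshift-ansible code and not necessarily how the installer addresses the issue; it only assumes that, as the error output shows, `atomic pull` with a `docker:` source needs a running Docker daemon. The helper names `docker_is_active` and `plan_pull` are hypothetical, and starting docker before the pull is just one possible ordering.

```python
import subprocess

# Image reference taken from the failing task in this report.
NODE_IMAGE = "registry.reg-aws.openshift.com:443/openshift3/ose-node:v3.11"


def docker_is_active(run=subprocess.run):
    """Return True if systemd reports docker.service as active.

    `run` is injectable so the check can be exercised without systemd.
    """
    # `systemctl is-active --quiet UNIT` exits 0 only when the unit is active.
    return run(["systemctl", "is-active", "--quiet", "docker"]).returncode == 0


def plan_pull(image, docker_active):
    """Hypothetical helper: list the commands to run, in order.

    If docker is stopped (as it is after the 'stop docker to kill static
    pods' task), start it before attempting the pull.
    """
    commands = []
    if not docker_active:
        commands.append(["systemctl", "start", "docker"])
    # Same command line as the failing Ansible task.
    commands.append(["atomic", "pull", "--storage=ostree", f"docker:{image}"])
    return commands
```

On an affected host, the plan could be executed with `for cmd in plan_pull(NODE_IMAGE, docker_is_active()): subprocess.run(cmd, check=True)`. With docker inactive, the plan contains two commands; with docker running, only the pull.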
This blocks upgrade testing against system container nodes.
PR Created in master: https://github.com/openshift/openshift-ansible/pull/10125
PR for release-3.11: https://github.com/openshift/openshift-ansible/pull/10135
PR 10135 has been merged into openshift-ansible-3.11.9-1; please check the bug.
Verified on openshift-ansible-3.11.9-1.git.0.63f7970.el7_5.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3537