Bug 1493714 - installer removes /var/lib/docker/* when cri-o variables are passed in inv file
Summary: installer removes /var/lib/docker/* when cri-o variables are passed in inv file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Target Milestone: ---
Target Release: 3.7.0
Assignee: Giuseppe Scrivano
QA Contact: Johnny Liu
URL:
Whiteboard: aos-scalability-37
Depends On:
Blocks: 1489014
Reported: 2017-09-20 18:35 UTC by Vikas Laad
Modified: 2017-11-28 22:12 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 22:12:03 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2017:3188
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update
Last Updated: 2017-11-29 02:34:54 UTC

Description Vikas Laad 2017-09-20 18:35:27 UTC
Description of problem:
The installer removes /var/lib/docker/* from the system where OpenShift is being installed, then fails with the following error:

FAILED! => {"changed": true, "cmd": ["docker", "run", "--rm", "openshift3/ose:latest", "version"], "delta": "0:00:00.018003", "end": "2017-09-20 18:09:54.266171", "failed": true, "rc": 127, "start": "2017-09-20 18:09:54.248168", "stderr": "/usr/bin/docker-current: Error response from daemon: open /var/lib/docker/image/overlay2/imagedb/content/sha256/76ace9dc9d4f7d1a84ad01e7e73ef31594085b36e26906f593ee7c7a1fb8c0d9: no such file or directory.\nSee '/usr/bin/docker-current run --help'.", "stderr_lines": ["/usr/bin/docker-current: Error response from daemon: open /var/lib/docker/image/overlay2/imagedb/content/sha256/76ace9dc9d4f7d1a84ad01e7e73ef31594085b36e26906f593ee7c7a1fb8c0d9: no such file or directory.", "See '/usr/bin/docker-current run --help'."], "stdout": "", "stdout_lines": []}


Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.7.0-0.126.4.git.0.3fc2b9b.el7.noarch

rpm -q ansible
ansible-2.3.2.0-2.el7.noarch

ansible --version
ansible 2.3.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.5 (default, May  3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]

How reproducible:
Reproducible when the following variables are passed in the inventory:
openshift_use_crio=true
openshift_crio_systemcontainer_image_registry_override=<repo>


Steps to Reproduce:
1. Run openshift-ansible/playbooks/byo/config.yml with the attached inventory file

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated
attached

Expected results:
The installer should complete without removing /var/lib/docker/* and without failing.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Attached inv and log file.

Comment 3 Giuseppe Scrivano 2017-09-22 09:17:08 UTC
CRI-O shouldn't change anything in the Docker configuration.  The "docker run --rm openshift3/ose:latest version" command is used only when the version is not specified.  From the log, this looks more like a temporary failure to me, since Docker falls back to docker.io.

Is "/var/lib/docker/" present on the machine after the installer fails?

Could you try again with and without openshift_use_crio=true?  Does it make any difference?

There is a PR that might affect this: https://github.com/openshift/openshift-ansible/pull/5490

Comment 4 Mike Fiedler 2017-09-22 12:16:06 UTC
/var/lib/docker before running openshift-ansible/playbooks/byo/config.yml with the attached inventory:

root@ip-172-31-54-223: ~ # ls -l /var/lib/docker
total 12
drwx------.  2 root root    6 Sep 19 18:30 containers
drwx------.  3 root root   22 Sep 19 18:30 image
drwxr-x---.  3 root root   19 Sep 19 18:30 network
drwx------. 63 root root 8192 Sep 19 18:37 overlay2
drwx------.  2 root root    6 Sep 19 18:30 swarm
drwx------.  2 root root    6 Sep 19 18:37 tmp
drwx------.  2 root root    6 Sep 19 18:30 trust
drwx------.  2 root root   25 Sep 19 18:30 volumes
root@ip-172-31-54-223: ~ # docker images
REPOSITORY                                                        TAG                 IMAGE ID            CREATED             SIZE
docker.io/ravielluri/image                                        controller          b4f6b442b51a        2 days ago          2.999 GB
pbench-controller                                                 latest              b4f6b442b51a        2 days ago          2.999 GB
collectd                                                          latest              d405ef38b02b        2 days ago          480.8 MB
docker.io/ravielluri/image                                        collectd            d405ef38b02b        2 days ago          480.8 MB
docker.io/ravielluri/image                                        agent               8ded7c8f3e6b        2 days ago          599.1 MB
pbench-agent                                                      latest              8ded7c8f3e6b        2 days ago          599.1 MB
registry.ops.openshift.com/openshift3/node                        v3.7.0-0.126.4      2109952b23df        6 days ago          1.24 GB
registry.ops.openshift.com/openshift3/ose-haproxy-router          v3.7.0-0.126.4      a3e7f7913c33        6 days ago          1.072 GB
registry.ops.openshift.com/openshift3/ose-sti-builder             v3.7.0-0.126.4      fd0beae44d01        6 days ago          1.053 GB
registry.ops.openshift.com/openshift3/ose-docker-builder          v3.7.0-0.126.4      e7e42f8b34d1        6 days ago          1.053 GB
registry.ops.openshift.com/openshift3/ose-deployer                v3.7.0-0.126.4      3aed7ac8060d        6 days ago          1.053 GB
registry.ops.openshift.com/openshift3/ose                         v3.7.0-0.126.4      76ace9dc9d4f        6 days ago          1.053 GB
registry.ops.openshift.com/openshift3/ose-docker-registry         v3.7.0-0.126.4      e7eb8c7b8274        6 days ago          467.1 MB
registry.ops.openshift.com/openshift3/ose-keepalived-ipfailover   v3.7.0-0.126.4      1807f62ce0e0        6 days ago          385 MB
registry.ops.openshift.com/openshift3/ose-pod                     v3.7.0-0.126.4      fd9520b20499        6 days ago          208.6 MB
registry.ops.openshift.com/openshift3/ruby-20-rhel7               latest              e5833aa7cf85        8 months ago        443.4 MB
registry.ops.openshift.com/openshift3/python-33-rhel7             latest              e18350a7786c        8 months ago        521.4 MB
registry.ops.openshift.com/openshift3/php-55-rhel7                latest              2d6fbdfafa33        8 months ago        568.9 MB
registry.ops.openshift.com/openshift3/nodejs-010-rhel7            latest              226d0b1b7987        8 months ago        430.3 MB
registry.ops.openshift.com/openshift3/perl-516-rhel7              latest              45c996c39407        8 months ago        474.6 MB


After running openshift-ansible/playbooks/byo/config.yml:

root@ip-172-31-54-223: ~ # ls -lrt /var/lib/docker
total 0
root@ip-172-31-54-223: ~ # docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
root@ip-172-31-54-223: ~ # 


Nothing else was done on the system apart from running the playbook with the attached inventory.
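
The before/after comparison above can be scripted so that the loss is caught as soon as the playbook returns. The following is a minimal sketch and not part of openshift-ansible: count_entries is a hypothetical helper name, and it assumes a find that supports -mindepth/-maxdepth (GNU findutils). It has to run as root to be able to read /var/lib/docker.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with openshift-ansible).
# Prints the number of entries directly under a directory,
# or 0 if the directory is empty or does not exist.
count_entries() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo 0
        return 0
    fi
    # -mindepth/-maxdepth restrict the listing to direct children.
    find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Usage sketch (the inventory path is a placeholder):
#   before=$(count_entries /var/lib/docker)
#   ansible-playbook -i <inventory> openshift-ansible/playbooks/byo/config.yml
#   after=$(count_entries /var/lib/docker)
#   [ "$after" -ge "$before" ] || echo "WARNING: /var/lib/docker lost entries" >&2
```

A plain entry count is deliberately coarse: it is enough to flag the "total 0" state shown above, but it would not notice individual images being removed.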

Comment 5 Mike Fiedler 2017-09-22 16:11:45 UTC
I removed the following 2 lines from the attached inventory and installed on an instance booted from the same Atomic Host AMI:

openshift_use_crio=true  
openshift_crio_systemcontainer_image_registry_override=registry.ops.openshift.com/openshift3

and the containerized install completed successfully.

Comment 6 Giuseppe Scrivano 2017-09-22 23:00:53 UTC
Proposed fix:

https://github.com/projectatomic/atomic-system-containers/pull/117

Comment 8 Giuseppe Scrivano 2017-10-04 14:38:47 UTC
What image are you using for CRI-O?

The fix went into the CRI-O system container:

https://github.com/projectatomic/atomic-system-containers/pull/120

Comment 9 Giuseppe Scrivano 2017-10-04 14:50:54 UTC
I'll ask Jhon since it seems the image is not yet updated:

# docker run --rm registry.ops.openshift.com/openshift3/cri-o cat /set_mounts.sh
#!/bin/sh

findmnt /var/lib > /dev/null || mount --rbind --make-shared /var/lib /var/lib
findmnt /var/lib/containers > /dev/null || mount --bind --make-shared /var/lib/containers /var/lib/containers
findmnt /var/lib/origin > /dev/null || mount --bind --make-shared /var/lib/origin /var/lib/origin
mount --make-shared /run
findmnt /run/systemd > /dev/null || mount --bind --make-rslave /run/systemd /run/systemd

I get the same result with the image from brew.

Was the change in https://github.com/projectatomic/atomic-system-containers/pull/120 pulled in?
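
For most paths, the script above uses findmnt to decide whether the path is already a mount point before bind-mounting it with shared propagation. To see what it would find on a given host, the mount state of the same paths can be listed. This is only a sketch under assumptions: show_propagation is a hypothetical helper name, and the PROPAGATION output column depends on the installed util-linux version of findmnt supporting it.

```shell
#!/bin/sh
# Hypothetical helper, for inspection only; it does not change any mounts.
# For each path given, print the mount target and its propagation mode,
# or "<path> not-a-mountpoint" if findmnt does not report it as a mount.
show_propagation() {
    for p in "$@"; do
        findmnt -n -o TARGET,PROPAGATION "$p" 2>/dev/null \
            || echo "$p not-a-mountpoint"
    done
}

# Usage sketch, with the paths taken from set_mounts.sh:
#   show_propagation /var/lib /var/lib/containers /var/lib/origin /run /run/systemd
```

Running it before and after starting the CRI-O system container would show which of these paths became mount points and with which propagation mode.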

Comment 11 Giuseppe Scrivano 2017-10-05 07:31:48 UTC
Thanks.  The image on brew should be fine now.

Comment 15 Vikas Laad 2017-10-09 16:28:25 UTC
Verified with the cri-o 3.7 image; the openshift-ansible playbook completes fine.

Comment 19 errata-xmlrpc 2017-11-28 22:12:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

