Bug 1340324 - [docker1.10]Need more robust method to utilize docker client in node container
Summary: [docker1.10]Need more robust method to utilize docker client in node container
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Release
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.2.1
Assignee: Scott Dodson
QA Contact: Ma xiaoqiang
URL:
Whiteboard:
Depends On:
Blocks: 1342762
Reported: 2016-05-27 02:14 UTC by Scott Dodson
Modified: 2016-07-04 00:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Due to newer releases of docker changing the path of the docker executable, containerized nodes could fail to initialize the SDN because they could not execute docker properly. This bug fix updates the containerized node image to accommodate this change, and as a result containerized nodes work properly with current and future versions of docker.
Clone Of:
Clones: 1342762
Environment:
Last Closed: 2016-06-27 15:07:21 UTC
Target Upstream Version:
Embargoed:


Links
System ID: Red Hat Product Errata RHBA-2016:1343
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Enterprise 3.2.1.1 bug fix and enhancement update
Last Updated: 2016-06-27 19:04:05 UTC

Description Scott Dodson 2016-05-27 02:14:54 UTC
Description of problem:
The SDN components of OpenShift call /usr/bin/docker. To ensure that the client matches the version of the daemon, we currently mount /usr/bin/docker and /usr/bin/docker-current (if it exists) from the host into the node container.

docker-1.10 will introduce a dependency on libseccomp. If the node container continues to rely on /usr/bin/docker and /usr/bin/docker-current being mounted from the host, we will also need to start mounting libseccomp into the container. We can do this, but it seems we would be continuously chasing new dependencies this way.
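
To see whether a given client binary has picked up such a dependency, its `ldd` output can be filtered. The following is only a sketch: `has_lib` is a hypothetical helper, not part of any OpenShift tooling, and it assumes the binary is dynamically linked so that `ldd` lists its libraries.

```shell
# Hypothetical helper: reads `ldd` output on stdin and succeeds if any
# line names the given shared library (e.g. "libseccomp").
has_lib() {
    grep -q "^[[:space:]]*$1\.so"
}
```

On a host this might be used as `ldd /usr/bin/docker-current | has_lib libseccomp && echo "needs libseccomp"`.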

One way to handle this may be to provide a chroot wrapper in the node container. This seems to work, but I need to test it more thoroughly.

https://github.com/openshift/origin/pull/9046
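
A minimal sketch of what such a chroot wrapper could look like follows. It is not necessarily the script from the pull request. It assumes the host root filesystem is mounted at /rootfs inside the node container, and the names `HOST_ROOT`, `CHROOT_BIN`, and `host_docker` are hypothetical.

```shell
#!/bin/sh
# Sketch of a chroot wrapper (hypothetical; not the script shipped in the image).
# Assumption: the host root filesystem is mounted at /rootfs in the node container.
HOST_ROOT="${HOST_ROOT:-/rootfs}"
# Overridable so the function can be exercised without root privileges.
CHROOT_BIN="${CHROOT_BIN:-chroot}"

# Run the host's docker client inside the host's root filesystem, so the
# client and every library it links against come from the host.
host_docker() {
    "$CHROOT_BIN" "$HOST_ROOT" docker "$@"
}

# Dispatch only when this file is installed under the name "docker".
case "${0##*/}" in
    docker) host_docker "$@" ;;
esac
```

Because the client runs inside the host's root filesystem, new library dependencies such as libseccomp would not need to be mounted into the container individually.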

Comment 2 Scott Dodson 2016-06-04 19:05:59 UTC
To reproduce, install a containerized node using docker-1.10. Exec into the node container and attempt to use docker; you'll get an error. Update to an image that fixes this by adding the wrapper script at /usr/local/bin/docker, then try docker again and it should work.

Comment 4 Johnny Liu 2016-06-08 12:34:37 UTC
Tested this bug with openshift3/node:v3.2.1.1 (2344f942d5a0); the docker wrapper is working well. An STI build and deployment completed successfully.


# docker exec -ti c5ddbdb49619 /bin/sh
sh-4.2# docker version
Client:
 Version:         1.10.3
 API version:     1.22
 Package version: docker-common-1.10.3-31.el7.x86_64
 Go version:      go1.4.2
 Git commit:      4779225/1.10.3
 Built:           
 OS/Arch:         linux/amd64

Server:
 Version:         1.10.3
 API version:     1.22
 Package version: docker-common-1.10.3-31.el7.x86_64
 Go version:      go1.4.2
 Git commit:      4779225/1.10.3
 Built:           
 OS/Arch:         linux/amd64
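
The client/server match visible in the output above could also be checked mechanically. This is only a sketch: `versions_match` is a hypothetical helper, and it assumes the `docker version` layout shown here, with exactly one `Version:` line each for client and server.

```shell
# Hypothetical helper: reads `docker version` output on stdin and succeeds
# only if exactly two "Version:" lines are present and they are equal.
versions_match() {
    awk '$1 == "Version:" { v[++n] = $2 }
         END { exit !(n == 2 && v[1] == v[2]) }'
}
```

Inside the node container this might be run as `docker version | versions_match && echo "client matches daemon"`.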


Found some minor issues in the node systemd unit file:
# cat /etc/systemd/system/atomic-openshift-node.service
[Unit]
After=atomic-openshift-master.service
After=docker.service
After=openvswitch.service
PartOf=docker.service
Requires=docker.service
Requires=openvswitch.service
Wants=atomic-openshift-master.service
Requires=atomic-openshift-node-dep.service
After=atomic-openshift-node-dep.service

[Service]
EnvironmentFile=/etc/sysconfig/atomic-openshift-node
EnvironmentFile=/etc/sysconfig/atomic-openshift-node-dep
ExecStartPre=-/usr/bin/docker rm -f atomic-openshift-node
ExecStart=/usr/bin/docker run --name atomic-openshift-node --rm --privileged --net=host --pid=host --env-file=/etc/sysconfig/atomic-openshift-node -v /:/rootfs:ro -e CONFIG_FILE=${CONFIG_FILE} -e OPTIONS=${OPTIONS} -e HOST=/rootfs -e HOST_ETC=/host-etc -v /var/lib/origin:/var/lib/origin -v /etc/origin/node:/etc/origin/node -v /etc/localtime:/etc/localtime:ro -v /etc/machine-id:/etc/machine-id:ro -v /run:/run -v /sys:/sys:ro -v /usr/bin/docker:/usr/bin/docker:ro -v /var/lib/docker:/var/lib/docker -v /lib/modules:/lib/modules -v /etc/origin/openvswitch:/etc/openvswitch -v /etc/origin/sdn:/etc/openshift-sdn -v /etc/systemd/system:/host-etc/systemd/system -v /var/log:/var/log -v /dev:/dev $DOCKER_ADDTL_BIND_MOUNTS openshift3/node:${IMAGE_VERSION}
ExecStartPost=/usr/bin/sleep 10
ExecStop=/usr/bin/docker stop atomic-openshift-node
SyslogIdentifier=atomic-openshift-node
Restart=always
RestartSec=5s

[Install]
WantedBy=docker.service


Now that we use chroot to run against the host's rootfs, I think we should clean up the systemd unit file, e.g.:
1. Remove the "/etc/sysconfig/atomic-openshift-node-dep" file and the "EnvironmentFile=/etc/sysconfig/atomic-openshift-node-dep" line.
2. Remove "-v /usr/bin/docker:/usr/bin/docker:ro", as was done in https://github.com/openshift/origin/pull/9046
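
The two unit-file edits proposed above could be sketched as a stdin-to-stdout filter. `cleanup_unit` is a hypothetical name, and the sketch only illustrates the text changes; it does not delete the sysconfig file or touch the `atomic-openshift-node-dep.service` dependencies.

```shell
# Hypothetical filter: reads a unit file on stdin and writes it to stdout
# without the -dep EnvironmentFile line and without the docker bind mount.
cleanup_unit() {
    sed -e '/^EnvironmentFile=\/etc\/sysconfig\/atomic-openshift-node-dep$/d' \
        -e 's# -v /usr/bin/docker:/usr/bin/docker:ro##'
}
```

For review purposes it might be run as `cleanup_unit < /etc/systemd/system/atomic-openshift-node.service | diff /etc/systemd/system/atomic-openshift-node.service -`, which shows the resulting changes without modifying the installed file.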

Comment 5 Scott Dodson 2016-06-08 13:05:52 UTC
Jianlin,

I agree that we should clean that up, but I don't think we can do that until we can be sure that all, or a majority, of users are using these newer images. I think what we should do is leave it in for 3.2 installs and stop adding it for 3.3 installs. What do you think?

Comment 6 Brenton Leanhardt 2016-06-08 13:24:28 UTC
I definitely agree that, if it's not a blocking issue, we should consider this ON_QA and track the remaining problem in another bug.

Comment 7 Johnny Liu 2016-06-12 05:39:51 UTC
Verified this bug with the atomicOpenShift-errata/3.2/2016-06-09.2 puddle; the same behavior as in comment 4 is seen. Based on comments 5 and 6, moving this bug to "Verified".

Comment 9 errata-xmlrpc 2016-06-27 15:07:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1343

