Bug 1534933 - [3.9] installer need provide a way to add docker auth to kubelet for auto pulling infra image from an authenticated registry in system container env without cri-o enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.9.0
Assignee: Michael Gugino
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On: 1514324
Blocks:
Reported: 2018-01-16 10:19 UTC by Gaoyun Pei
Modified: 2018-03-28 14:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1514324
Environment:
Last Closed: 2018-03-28 14:19:53 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0489 0 None None None 2018-03-28 14:20:16 UTC

Comment 1 Michael Gugino 2018-01-17 21:57:50 UTC
I am researching how to properly bind mount the credentials into the atomic system container.  The systemd unit file is created dynamically when the container is started.

Comment 2 Michael Gugino 2018-01-17 22:53:11 UTC
The issue appears to be that runc does not support something like 'DOCKER_ADDTL_BIND_MOUNTS.'  I will need to reach out to the syscontainer team to see if something is patched in that I haven't found or what the best solution is.

Currently, the workaround appears to be editing the container's config.json, as that is the only place I can find where runc reads mount directives.

Comment 3 Scott Dodson 2018-01-18 13:35:08 UTC
Giuseppe,

Do you have a suggestion for how to handle mounting in auth credentials for the node when running as a system container?

Comment 4 Giuseppe Scrivano 2018-01-18 13:47:09 UTC
Reporting my answer from IRC here:

This mechanism for adding custom bind mounts is already used by the etcd system container: roles/etcd/tasks/system_container.yml (line 81)

which requires this change in the system container:

https://github.com/projectatomic/atomic-system-containers/blob/master/etcd/config.json.template#L219

To work with the kubelet, this change should end up in https://github.com/openshift/origin/blob/master/images/node/system-container/config.json.template

Please make sure the new variable gets a default empty string in the manifest.json file: https://github.com/openshift/origin/blob/master/images/node/system-container/manifest.json

FYI: this value is passed at installation time.  Values passed at installation time are checked by atomic install/update, so if the installer is launched with the same configuration there is no change in the deployment of the container (i.e. it is idempotent).  If anything changes, such as a different value for ADDTL_BIND_MOUNTS, then the configuration is updated and the container restarted.

Please let me know if I can help in any way.
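
As a rough illustration of the mechanism described above, the following Python sketch renders a heavily trimmed, hypothetical config.json template with a `$`-style placeholder. The placeholder name `ADDTL_MOUNTS`, the template text, and the helper names are assumptions made for this example only; they are not copied from the real config.json.template or manifest.json. It shows why an empty-string default keeps the rendered file valid, and how comparing rendered output gives the "no change, no restart" behaviour mentioned in the comment.

```python
import json
from string import Template

# Hypothetical, trimmed stand-in for a config.json.template.
# The placeholder name ADDTL_MOUNTS is an assumption for illustration.
CONFIG_TEMPLATE = Template("""{
  "mounts": [
    {"type": "bind", "source": "/etc/origin", "destination": "/etc/origin",
     "options": ["bind", "rw"]}
    $ADDTL_MOUNTS
  ]
}""")

# Plays the role of a manifest.json default: an empty string means
# "no extra mounts" and still renders to valid JSON.
MANIFEST_DEFAULTS = {"ADDTL_MOUNTS": ""}


def render_config(values=None):
    """Render the template with supplied values layered over the defaults."""
    merged = dict(MANIFEST_DEFAULTS)
    merged.update(values or {})
    return json.loads(CONFIG_TEMPLATE.substitute(merged))


def needs_update(old_values, new_values):
    """Return True only when the rendered configuration would differ."""
    return render_config(old_values) != render_config(new_values)


# Extra mount passed at installation time; note the leading comma, since
# the value is appended to an existing JSON array.
extra = (',{"type": "bind", "source": "/var/lib/origin/.docker", '
         '"destination": "/root/.docker", "options": ["bind", "ro"]}')

print(len(render_config()["mounts"]))
print(len(render_config({"ADDTL_MOUNTS": extra})["mounts"]))
print(needs_update({}, {}))
print(needs_update({}, {"ADDTL_MOUNTS": extra}))
```

Running the installer twice with the same values corresponds to the `needs_update({}, {})` case: the rendered configuration is identical, so nothing would need to be redeployed.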

Comment 5 Michael Gugino 2018-01-18 18:20:00 UTC
Based on the above comments, it's necessary to patch both origin and openshift-ansible to enable this feature.

openshift-ansible patch submitted: https://github.com/openshift/openshift-ansible/pull/6783

Comment 6 Michael Gugino 2018-01-18 18:38:23 UTC
Origin PR Created: https://github.com/openshift/origin/pull/18162

Comment 8 Johnny Liu 2018-01-25 03:30:55 UTC
This bug is targeted to 3.9, but it was attached to the 3.7/3.6/3.5 errata; moving it to correct that.

Comment 9 Gan Huang 2018-01-25 09:28:32 UTC
atomic-openshift-node failed to start:

Jan 25 04:20:51 host-192-168-100-13 atomic-openshift-node[29854]: container_linux.go:274: starting container process caused "process_linux.go:366: container init caused \"rootfs_linux.go:57: mounting \\\"/var/lib/origin/.docker\\\" to rootfs \\\"/var/lib/containers/atomic/atomic-openshift-node.0/rootfs\\\" at \\\"/var/lib/containers/atomic/atomic-openshift-node.0/rootfs
/root/.docker\\\" caused \\\"no such device\\\"\""


Tested with openshift-ansible master branch

openshift version: v3.9.0-0.23.0

Comment 10 Giuseppe Scrivano 2018-01-25 11:44:10 UTC
is there a newline after /var/lib/containers/atomic/atomic-openshift-node.0/rootfs?  It looks like the destination is "\n/root/.docker".

@Gan, do you still have the running cluster?  If so, could you attach your /var/lib/containers/atomic/atomic-openshift-node.0/config.json file?

Comment 12 Giuseppe Scrivano 2018-01-25 12:24:38 UTC
@Gan thanks!

It looks like runc gives that error when the "bind" option is not specified.  I could easily reproduce it locally by adding a new mount:

,{"type" : "bind", "destination" : "/root/.docker", "source" : "/tmp/.docker", "options" : ["ro"]}

# touch /tmp/.docker
#  runc run foo
container_linux.go:296: starting container process caused "process_linux.go:398: container init caused \"rootfs_linux.go:58: mounting \\\"/tmp/.docker\\\" to rootfs \\\"/var/lib/containers/atomic/node-v3.9.0-0.23.0.0/rootfs\\\" at \\\"/var/lib/containers/atomic/node-v3.9.0-0.23.0.0/rootfs/root/.docker.config.json.foo\\\" caused \\\"no such device\\\"\""


Adding "bind" to the mount options solved the issue for me:

https://github.com/openshift/openshift-ansible/pull/6865
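
A small check along these lines could catch the problem before the container is started. This is only a sketch: the function name `missing_bind_option` is hypothetical, and it assumes that `rbind` is accepted as an alternative to `bind` in the options list.

```python
def missing_bind_option(mounts):
    """Return destinations of bind-type mounts whose options lack bind/rbind."""
    return [
        m.get("destination")
        for m in mounts
        if m.get("type") == "bind"
        # Assumption: either "bind" or "rbind" satisfies runc.
        and not {"bind", "rbind"} & set(m.get("options", []))
    ]


# The mount used to reproduce the failure: type is "bind", but the
# options list only contains "ro".
broken = [{"type": "bind", "destination": "/root/.docker",
           "source": "/tmp/.docker", "options": ["ro"]}]

# The same mount with "bind" added to the options.
fixed = [{"type": "bind", "destination": "/root/.docker",
          "source": "/tmp/.docker", "options": ["bind", "ro"]}]

print(missing_bind_option(broken))
print(missing_bind_option(fixed))
```

In practice the `mounts` list would come from the `mounts` array of the container's config.json, loaded with `json.load`.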

Comment 13 Scott Dodson 2018-01-25 15:37:22 UTC
Additional changes merged

Comment 16 Gaoyun Pei 2018-01-29 03:11:20 UTC
Verified this bug with openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch.rpm against an authenticated registry.

Enabled OpenShift using system containers in a 3.9 cluster setup. After installation, the router and registry pods are running well, and the sti-build test also passes.

[root@ip-172-18-10-52 ~]# oc get pod
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-1-j6fqq    1/1       Running   2          24m
registry-console-1-5lc72   1/1       Running   1          23m
router-1-qnvgj             1/1       Running   2          24m

Auth credentials are mounted in the node system container.
[root@ip-172-18-2-155 ~]# runc exec atomic-openshift-node ls /root/.docker/
config.json

Comment 19 errata-xmlrpc 2018-03-28 14:19:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489

