Bug 1534933
| Summary: | [3.9] installer need provide a way to add docker auth to kubelet for auto pulling infra image from an authenticated registry in system container env without cri-o enabled | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Gaoyun Pei <gpei> |
| Component: | Installer | Assignee: | Michael Gugino <mgugino> |
| Status: | CLOSED ERRATA | QA Contact: | Gaoyun Pei <gpei> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.9.0 | CC: | aos-bugs, ghuang, gpei, gscrivan, jialiu, jokerman, mgugino, mmccomas, wmeng, wsun |
| Target Milestone: | --- | ||
| Target Release: | 3.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1514324 | Environment: | |
| Last Closed: | 2018-03-28 14:19:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1514324 | ||
| Bug Blocks: | |||
Comment 1
Michael Gugino
2018-01-17 21:57:50 UTC
The issue appears to be that runc does not support anything like 'DOCKER_ADDTL_BIND_MOUNTS'. I will need to reach out to the system container team to see whether something has been patched in that I haven't found, or what the best solution is. Currently, the workaround appears to be editing the container's config.json, as that is the only place I can tell runc gets mount directives from.

Giuseppe, do you have a suggestion for how to handle mounting in auth credentials for the node when running as a system container?

Reporting my answer from IRC here:

This mechanism for adding custom bind mounts is already used by the etcd system container: roles/etcd/tasks/system_container.yml (line 81), which requires this change in the system container: https://github.com/projectatomic/atomic-system-containers/blob/master/etcd/config.json.template#L219

To work with the kubelet, this change should end up in https://github.com/openshift/origin/blob/master/images/node/system-container/config.json.template

Please make sure the new variable gets a default empty string in the manifest.json file: https://github.com/openshift/origin/blob/master/images/node/system-container/manifest.json

FYI: this value is passed at installation time. Values passed at installation time are checked by atomic install/update, so if the installer is launched with the same configuration, the deployment of the container does not change (i.e. it is idempotent). If anything changes, such as a different value for ADDTL_BIND_MOUNTS, the configuration is updated and the container is restarted.

Please let me know if I can help in any way.

Based on the above comments, it's necessary to patch both origin and openshift-ansible to enable this feature.

openshift-ansible patch submitted: https://github.com/openshift/openshift-ansible/pull/6783

Origin PR created: https://github.com/openshift/origin/pull/18162

This bug is targeted at 3.9, but it was attached to the 3.7/3.6/3.5 errata, so this needs correcting.
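The template mechanism and the idempotency behaviour described in the comments above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual atomic or openshift-ansible code: the trimmed template, the `ADDTL_MOUNTS` placeholder name, and the helpers `render_config` and `needs_update` are all hypothetical stand-ins chosen for the example.

```python
import json
from string import Template

# Hypothetical, heavily trimmed stand-in for a config.json.template.
# ADDTL_MOUNTS is an assumed placeholder name, not copied from the real file.
CONFIG_TEMPLATE = Template("""{
  "mounts": [
    {"type": "bind", "source": "/etc/origin", "destination": "/etc/origin",
     "options": ["bind", "rw"]}
    $ADDTL_MOUNTS
  ]
}""")

# Mirrors the advice to give the new variable an empty default in manifest.json.
MANIFEST_DEFAULTS = {"ADDTL_MOUNTS": ""}


def effective_values(values=None):
    """Merge user-supplied values over the manifest defaults."""
    merged = dict(MANIFEST_DEFAULTS)
    merged.update(values or {})
    return merged


def render_config(values=None):
    """Substitute the values into the template and parse the resulting JSON."""
    return json.loads(CONFIG_TEMPLATE.substitute(effective_values(values)))


def needs_update(deployed_values, requested_values):
    """Idempotency check: redeploy only when the effective values differ."""
    return effective_values(deployed_values) != effective_values(requested_values)


# Extra mount passed at installation time; note the leading comma, because the
# fragment is appended to an existing JSON array.
extra = (',{"type": "bind", "source": "/var/lib/origin/.docker", '
         '"destination": "/root/.docker", "options": ["bind", "ro"]}')

default_config = render_config()
custom_config = render_config({"ADDTL_MOUNTS": extra})
```

With the empty default, the rendered file stays valid JSON even when no extra mounts are requested, which is why the default in manifest.json matters.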
atomic-openshift-node failed to start:

Jan 25 04:20:51 host-192-168-100-13 atomic-openshift-node[29854]: container_linux.go:274: starting container process caused "process_linux.go:366: container init caused \"rootfs_linux.go:57: mounting \\\"/var/lib/origin/.docker\\\" to rootfs \\\"/var/lib/containers/atomic/atomic-openshift-node.0/rootfs\\\" at \\\"/var/lib/containers/atomic/atomic-openshift-node.0/rootfs /root/.docker\\\" caused \\\"no such device\\\"\""

Tested with the openshift-ansible master branch; openshift version: v3.9.0-0.23.0

Is there a newline after /var/lib/containers/atomic/atomic-openshift-node.0/rootfs? It looks like the destination is "\n/root/.docker".

@Gan, do you still have the running cluster? If so, could you attach your /var/lib/containers/atomic/atomic-openshift-node.0/config.json file?

@Gan thanks!

It looks like that error is given by runc when the "bind" option is not specified. I could easily reproduce it locally by adding a new mount:
,{"type" : "bind", "destination" : "/root/.docker", "source" : "/tmp/.docker", "options" : ["ro"]}
# touch /tmp/.docker
# runc run foo
container_linux.go:296: starting container process caused "process_linux.go:398: container init caused \"rootfs_linux.go:58: mounting \\\"/tmp/.docker\\\" to rootfs \\\"/var/lib/containers/atomic/node-v3.9.0-0.23.0.0/rootfs\\\" at \\\"/var/lib/containers/atomic/node-v3.9.0-0.23.0.0/rootfs/root/.docker.config.json.foo\\\" caused \\\"no such device\\\"\""
Adding "bind" to the mount options solved the issue for me:
https://github.com/openshift/openshift-ansible/pull/6865
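
As an illustration of the fix discussed above, the following Python sketch adds the "bind" option to any bind-type mount in a runc-style config that lacks it. It is only a sketch under stated assumptions: the helper name `ensure_bind_option` is hypothetical, the sample mount is the one from the reproduction, and the actual change in the linked pull request may look different.

```python
import json


def ensure_bind_option(config):
    """Return a copy of the config where every mount of type "bind" carries
    "bind" (or an existing "rbind") in its options list."""
    fixed = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    for mount in fixed.get("mounts", []):
        if mount.get("type") != "bind":
            continue
        options = mount.setdefault("options", [])
        if "bind" not in options and "rbind" not in options:
            options.insert(0, "bind")
    return fixed


# The mount from the reproduction: type is "bind", but the options only say "ro".
broken = {
    "mounts": [
        {"type": "bind", "destination": "/root/.docker",
         "source": "/tmp/.docker", "options": ["ro"]}
    ]
}

fixed = ensure_bind_option(broken)
print(json.dumps(fixed["mounts"][0]["options"]))
```

Running the helper twice gives the same result as running it once, so it can be applied safely to a config that has already been corrected.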
Additional changes merged.

Verified this bug with openshift-ansible-3.9.0-0.31.0.git.0.e0a0ad8.el7.noarch.rpm against an authenticated registry.

Set up a 3.9 cluster with OpenShift running as system containers. After installation, the router and registry pods are running well, and the sti-build test also passes.

[root@ip-172-18-10-52 ~]# oc get pod
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-1-j6fqq    1/1       Running   2          24m
registry-console-1-5lc72   1/1       Running   1          23m
router-1-qnvgj             1/1       Running   2          24m

The auth credentials are mounted in the node system container.

[root@ip-172-18-2-155 ~]# runc exec atomic-openshift-node ls /root/.docker/
config.json

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489