Bug 1448384
| Summary: | The generated /etc/docker/daemon.json was not usable when using the docker system container | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Gan Huang <ghuang> |
| Component: | Installer | Assignee: | Steve Milner <smilner> |
| Status: | CLOSED ERRATA | QA Contact: | Gan Huang <ghuang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.6.0 | CC: | aos-bugs, ghuang, jhonce, jokerman, mmccomas, sdodson, smilner |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-08-10 05:23:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1448372 | ||
| Bug Blocks: | |||
I see the issue. The strings are being turned into unicode instances inside the template. I'll put something together.

Created https://github.com/openshift/openshift-ansible/pull/4106 and added Gan as a reviewer.

Only the daemon.json below allowed container-engine to start successfully, after I deleted the illegal parameters one by one. Also, `/var/run/docker.pid` needs to be deleted manually even though docker has been stopped.
# cat /etc/docker/daemon.json
{
"api-cors-header": "",
"bip": "",
"bridge": "",
"cgroup-parent": "",
"cluster-store": "",
"cluster-store-opts": {},
"cluster-advertise": "",
"debug": true,
"default-gateway": "",
"default-gateway-v6": "",
"default-ulimits": {},
"disable-legacy-registry": false,
"dns": [],
"dns-opts": [],
"dns-search": [],
"exec-root": "",
"fixed-cidr": "",
"fixed-cidr-v6": "",
"graph": "",
"group": "",
"hosts": [],
"icc": false,
"insecure-registries": [u'test.registry.com:8888', u'registry.ops.openshift.com'],
"ip": "0.0.0.0",
"iptables": true,
"ipv6": false,
"ip-forward": false,
"ip-masq": false,
"labels": [],
"live-restore": true,
"log-level": "",
"log-opts": null,
"max-concurrent-downloads": 3,
"max-concurrent-uploads": 5,
"mtu": 0,
"oom-score-adjust": -500,
"raw-logs": false,
"registry-mirrors": [],
"userns-remap": ""
}
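
The `u'...'` entries in listings like the one above are what Python 2 emits when a list of unicode strings is rendered in its default string form instead of being serialized as JSON. A minimal sketch of the difference, written for Python 3 (where the `u` prefix no longer appears, but the single quotes and the capitalized `True` are just as invalid); the variable names are illustrative, and this is not the actual template code from openshift-ansible:

```python
import json

registries = ["test.registry.com:8888", "registry.ops.openshift.com"]
selinux_enabled = True

# Interpolating Python values directly yields Python syntax, not JSON:
# single-quoted strings and a capitalized True.
broken = '{"insecure-registries": %s, "selinux-enabled": %s}' % (
    registries,
    selinux_enabled,
)

# Serializing the whole structure with json.dumps yields valid JSON.
valid = json.dumps(
    {"insecure-registries": registries, "selinux-enabled": selinux_enabled},
    indent=2,
)


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True


print(is_valid_json(broken))  # False
print(is_valid_json(valid))   # True
```

In an Ansible/Jinja2 template the usual equivalent is a JSON-serializing filter such as `to_json`; whether the linked pull requests took that route is not stated in this report.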
Added https://github.com/projectatomic/atomic-system-containers/pull/65 to remove /etc/sysconfig/docker usage from the system container.

https://github.com/projectatomic/atomic-system-containers/pull/65 merged. Waiting for CI on https://github.com/openshift/openshift-ansible/pull/4106

Both are now merged.

Test with latest openshift-ansible master branch and latest "container-engine" (IMAGE ID: edd29b7740cd)

Still have some issues not addressed.

1)
> May 05 05:39:51 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[51741]: time="2017-05-05T05:39:51-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'T' looking for beginning of value\n"

It's caused by capital "T" which can't be recognized by Docker daemon
# grep T /etc/docker/daemon.json
"selinux-enabled": True,

2)
Problem still persists after fixing the issue above, seems "blocked-registries" not supported in /etc/docker/daemon.json
> May 10 01:41:53 qe-ghuang-master-nfs-1.localdomain runc[23807]: time="2017-05-10T05:41:53Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives don't match any configuration option: blocked-registries\n"

3)
There still many duplicated settings that I couldn't figure out:
> May 10 01:42:37 qe-ghuang-master-nfs-1.localdomain runc[24688]: time="2017-05-10T05:42:37Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: add-registry: (from flag: [registry.access.redhat.com], from file: [registry.access.redhat.com]), runtimes: (from flag: [oci], from file: map[oci:map[path:/usr/libexec/docker/docker-runc-current]]), authorization-plugins: (from flag: [rhel-push-plugin], from file: [rhel-push-plugin]), containerd: (from flag: /run/containerd.sock, from file: /run/containerd.sock), default-runtime: (from flag: oci, from file: oci), exec-opts: (from flag: [native.cgroupdriver=systemd], from file: [native.cgroupdriver=systemd]), storage-driver: (from flag: devicemapper, from file: ), selinux-enabled: (from flag: true, from file: true), storage-opts: (from flag: [dm.fs=xfs dm.thinpooldev=/dev/mapper/rhel-docker--pool dm.use_deferred_removal=true], from file: []), userland-proxy-path: (from flag: /usr/libexec/docker/docker-proxy-current, from file: /usr/libexec/docker/docker-proxy-current)\n"
...

(In reply to Gan Huang from comment #7)
> Test with latest openshift-ansible master branch and latest
> "container-engine" (IMAGE ID: edd29b7740cd)
>
> Still have some issues not addressed.
> 1)
> > May 05 05:39:51 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[51741]: time="2017-05-05T05:39:51-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'T' looking for beginning of value\n"
>
> It's caused by capital "T" which can't be recognized by Docker daemon
> # grep T /etc/docker/daemon.json
> "selinux-enabled": True,

Ah, I see. Will fix. I should have noticed that...

> 2)
> Problem still persists after fixing the issue above, seems
> "blocked-registries" not supported in /etc/docker/daemon.json
>

Interesting. The code seems to indicate it is supported but I'll check again.

> > May 10 01:41:53 qe-ghuang-master-nfs-1.localdomain runc[23807]: time="2017-05-10T05:41:53Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives don't match any configuration option: blocked-registries\n"
>
> 3)
> There still many duplicated settings that I couldn't figure out:
>
> > May 10 01:42:37 qe-ghuang-master-nfs-1.localdomain runc[24688]: time="2017-05-10T05:42:37Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: add-registry: (from flag: [registry.access.redhat.com], from file: [registry.access.redhat.com]), runtimes: (from flag: [oci], from file: map[oci:map[path:/usr/libexec/docker/docker-runc-current]]), authorization-plugins: (from flag: [rhel-push-plugin], from file: [rhel-push-plugin]), containerd: (from flag: /run/containerd.sock, from file: /run/containerd.sock), default-runtime: (from flag: oci, from file: oci), exec-opts: (from flag: [native.cgroupdriver=systemd], from file: [native.cgroupdriver=systemd]), storage-driver: (from flag: devicemapper, from file: ), selinux-enabled: (from flag: true, from file: true), storage-opts: (from flag: [dm.fs=xfs dm.thinpooldev=/dev/mapper/rhel-docker--pool dm.use_deferred_removal=true], from file: []), userland-proxy-path: (from flag: /usr/libexec/docker/docker-proxy-current, from file: /usr/libexec/docker/docker-proxy-current)\n"

The duplicated settings should be fixed with https://bugzilla.redhat.com/show_bug.cgi?id=1448372 as it removes the use of /etc/sysconfig/docker from the system container.

Added https://github.com/openshift/openshift-ansible/pull/4147 for lower case T and renaming blocked to block.
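
The "specified both as a flag and in the configuration file" failure quoted above occurs when the same directive reaches the daemon twice. A rough pre-flight check, assuming the daemon's command-line arguments are available as a list of strings; `flag_names`, `flags_in_conflict`, and the sample arguments are hypothetical and not part of Docker or openshift-ansible:

```python
import json


def flag_names(argv):
    """Extract option names from arguments such as --name=value or --name."""
    names = set()
    for arg in argv:
        if arg.startswith("--"):
            names.add(arg[2:].split("=", 1)[0])
    return names


def flags_in_conflict(argv, daemon_json_text):
    """Return sorted directive names set both as a flag and in the file."""
    file_keys = set(json.loads(daemon_json_text))
    return sorted(flag_names(argv) & file_keys)


# Illustrative inputs only; real values would come from the unit's
# command line and from /etc/docker/daemon.json.
sample_argv = [
    "--userland-proxy-path=/usr/libexec/docker/docker-proxy-current",
    "--selinux-enabled",
]
sample_file = json.dumps(
    {
        "userland-proxy-path": "/usr/libexec/docker/docker-proxy-current",
        "debug": True,
    }
)

print(flags_in_conflict(sample_argv, sample_file))  # ['userland-proxy-path']
```

Some daemon flags are spelled differently from their configuration-file key (for example `--insecure-registry` versus `insecure-registries`), so a plain name comparison like this can miss conflicts; the daemon's own error message remains the authoritative check.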
Merged

Tested with latest openshift-ansible including PR4147, selinux-enabled is still set to "True", and it seems `block-registries` should be renamed to `docker-registry`

Proposed fix: https://github.com/openshift/openshift-ansible/pull/4158

These changes worked for me and have been merged.

Tested with latest openshift-ansible. container-engine still failed to start
May 11 21:55:28 qe-ghuang-master-nfs-1.localdomain runc[13140]: time="2017-05-11T21:55:28-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: /usr/libexec/docker/docker-proxy-current, from file: /usr/libexec/docker/docker-proxy-current)\n"
It restarted successfully after removing "userland-proxy-path": "/usr/libexec/docker/docker-proxy-current" from /etc/docker/daemon.json.

Found it. The init.sh was still referencing that outside of the json. PR: https://github.com/openshift/openshift-ansible/pull/4174

Verified with openshift-ansible-3.6.68-1.git.0.9cbe2b7.el7.noarch.rpm
container-engine can be started properly.

Reopening this issue:
Seems `add-registry`, `block-registry`, `insecure-registries` in /etc/docker/daemon.json didn't work as expected, installer failed at:
TASK [openshift_version : Set precise containerized version to configure if openshift_release specified] ***
Monday 05 June 2017 06:17:15 +0000 (0:00:00.114) 0:03:37.158 ***********
fatal: [openshift-140.lab.sjc.redhat.com]: FAILED! => {
"changed": true,
"cmd": [
"docker",
"run",
"--rm",
"openshift3/ose:v3.6",
"version"
],
"delta": "0:00:02.489766",
"end": "2017-06-05 02:17:18.754778",
"failed": true,
"rc": 125,
"start": "2017-06-05 02:17:16.265012",
"warnings": []
}
STDERR:
Unable to find image 'openshift3/ose:v3.6' locally
Trying to pull repository docker.io/openshift3/ose ...
/usr/bin/docker-current: unauthorized: authentication required.
See '/usr/bin/docker-current run --help'.
fatal: [openshift-125.lab.sjc.redhat.com]: FAILED! => {
"changed": true,
"cmd": [
"docker",
"run",
"--rm",
"openshift3/ose:v3.6",
"version"
],
"delta": "0:00:02.515127",
"end": "2017-06-05 02:17:17.958299",
"failed": true,
"rc": 125,
"start": "2017-06-05 02:17:15.443172",
"warnings": []
}
STDERR:
Unable to find image 'openshift3/ose:v3.6' locally
Trying to pull repository docker.io/openshift3/ose ...
/usr/bin/docker-current: unauthorized: authentication required.
See '/usr/bin/docker-current run --help'.
With the same config I can't replicate the issue (docker 1.12.6):
$ sudo docker pull openshift3/ose:v3.6
Trying to pull repository brew-pulp.xxxxxxxxxx:8888/openshift3/ose ...
sha256:0c8ae9030a2479d5a9407d5adb90476ee055bac8c16c2253750b515d4c0fa6b6: Pulling from brew-pulp.xxxxxxxxxx:8888/openshift3/ose

- What's the version of the docker command that is present?
- Can you provide the log of the run?
- Is this only happening when doing containerized installs or is there failure with rpm install as well?

It looks like in the latest build Jhon moved the daemon.json file to container-deamon.json and, thus, the changes we are making to daemon.json are not being picked up. Once I copied container-deamon.json to daemon.json it started to work once more. I'm going to update the installer to modify container-deamon.json instead.

*container-daemon.json

Merged.

Tested against openshift-ansible-3.6.98-1.git.0.e651d65.el7.noarch.rpm
/etc/docker/container-daemon.json is generated and docker system container is running well.

atomic-1.17.2-4.git2760e30.el7.x86_64
runc-1.0.0-6.gite800860.el7.x86_64

# atomic -v
1.17.1
# runc -v
runc version 1.0.0-rc3
commit: cafb8d8755dc2b990fc73fbf7bff62f534da9219-dirty
spec: 1.0.0-rc5
# docker version
Client:
Version: 1.12.6
API version: 1.24
Package version: docker-1.12.6-28.git1398f24.el7.x86_64
Go version: go1.7.4
Git commit: 1398f24/1.12.6
Built: Wed May 17 01:16:44 2017
OS/Arch: linux/amd64

Server:
Version: 1.12.6
API version: 1.24
Package version: docker-1.12.6-31.git3a6eaeb.el7.x86_64
Go version: go1.7.6
Git commit: 3a6eaeb/1.12.6
Built: Tue Jun 6 12:45:07 2017
OS/Arch: linux/amd64
# atomic images list|grep container-engine
xxx/rhel7/container-engine latest 7d4eccca7dfc 2017-06-11 21:58 ostree

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
Description of problem:
The generated /etc/docker/daemon.json was not usable when installing the docker system container

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.53-1.git.0.03f33da.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Trigger the installation using docker system container
#cat inventory_hosts
<--snip-->
openshift_docker_use_system_container=True
openshift_docker_systemcontainer_image_registry_override=test.registry.xxx.com/rhel7/
2. Check the container-engine status
3.

Actual results:
Check the status of container-engine
#journalctl -u container-engine
<--snip-->
level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'u' looking for beginning of value\n"
<--snip-->
#cat /etc/docker/daemon.json
{
"api-cors-header": "",
"authorization-plugins": ["rhel-push-plugin"],
"bip": "",
"bridge": "",
"cgroup-parent": "",
"cluster-store": "",
"cluster-store-opts": {},
"cluster-advertise": "",
"debug": true,
"default-gateway": "",
"default-gateway-v6": "",
"default-runtime": "oci",
"containerd": "/var/run/containerd.sock",
"default-ulimits": {},
"disable-legacy-registry": false,
"dns": [],
"dns-opts": [],
"dns-search": [],
"exec-opts": ["native.cgroupdriver=systemd"],
"exec-root": "",
"fixed-cidr": "",
"fixed-cidr-v6": "",
"graph": "",
"group": "",
"hosts": [],
"icc": false,
"insecure-registries": [u'test.registry.com:8888', u'registry.ops.openshift.com'],
"ip": "0.0.0.0",
"iptables": false,
"ipv6": false,
"ip-forward": false,
"ip-masq": false,
"labels": [],
"live-restore": true,
"log-level": "",
"log-opts": {},
"max-concurrent-downloads": 3,
"max-concurrent-uploads": 5,
"mtu": 0,
"oom-score-adjust": -500,
"pidfile": "",
"raw-logs": false,
"registry-mirrors": [],
"runtimes": {
"oci": {
"path": "/usr/libexec/docker/docker-runc-current"
}
},
"selinux-enabled": True,
"storage-driver": "",
"storage-opts": [],
"tls": true,
"tlscacert": "",
"tlscert": "",
"tlskey": "",
"tlsverify": true,
"userns-remap": "",
"add-registry": [u'test.registry.com:8888', u'registry.access.redhat.com'],
"blocked-registries": [u'registry.hacker.com'],
"userland-proxy-path": "/usr/libexec/docker/docker-proxy-current"
}

Expected results:
No errors

Additional info:
After removing the "u" character in the image prefix, restarting container-engine may hit other issues:
May 05 05:39:51 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[51741]: time="2017-05-05T05:39:51-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'T' looking for beginning of value\n"
May 05 05:47:29 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[62757]: time="2017-05-05T05:47:29-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: runtimes: (from flag: [oci], from file: map[oci:map[path:/usr/libexec/docker/docker-runc-current]]), authorization-plugins: (from flag: [rhel-push-plugin], from file: [rhel-push-plugin]), containerd: (from flag: /run/containerd.sock, from file: /run/containerd.sock), default-runtime: (from flag: oci, from file: oci), exec-opts: (from flag: [native.cgroupdriver=systemd], from file: [native.cgroupdriver=systemd])\n"
May 05 05:47:49 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[63646]: time="2017-05-05T05:47:49-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '}' looking for beginning of object key string\n"