Red Hat Bugzilla – Bug 1448384
The generated /etc/docker/daemon.json is invalid when using the docker system container
Last modified: 2017-08-16 15:51 EDT
Description of problem:
The generated /etc/docker/daemon.json is invalid when installing with the docker system container.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.53-1.git.0.03f33da.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Trigger the installation using docker system container
#cat inventory_hosts
<--snip-->
openshift_docker_use_system_container=True
openshift_docker_systemcontainer_image_registry_override=test.registry.xxx.com/rhel7/
2. Check the container-engine status
3.

Actual results:
Check the status of container-engine
#journalctl -u container-engine
<--snip-->
level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'u' looking for beginning of value\n"
<--snip-->

#cat /etc/docker/daemon.json
{
    "api-cors-header": "",
    "authorization-plugins": ["rhel-push-plugin"],
    "bip": "",
    "bridge": "",
    "cgroup-parent": "",
    "cluster-store": "",
    "cluster-store-opts": {},
    "cluster-advertise": "",
    "debug": true,
    "default-gateway": "",
    "default-gateway-v6": "",
    "default-runtime": "oci",
    "containerd": "/var/run/containerd.sock",
    "default-ulimits": {},
    "disable-legacy-registry": false,
    "dns": [],
    "dns-opts": [],
    "dns-search": [],
    "exec-opts": ["native.cgroupdriver=systemd"],
    "exec-root": "",
    "fixed-cidr": "",
    "fixed-cidr-v6": "",
    "graph": "",
    "group": "",
    "hosts": [],
    "icc": false,
    "insecure-registries": [u'test.registry.com:8888', u'registry.ops.openshift.com'],
    "ip": "0.0.0.0",
    "iptables": false,
    "ipv6": false,
    "ip-forward": false,
    "ip-masq": false,
    "labels": [],
    "live-restore": true,
    "log-level": "",
    "log-opts": {},
    "max-concurrent-downloads": 3,
    "max-concurrent-uploads": 5,
    "mtu": 0,
    "oom-score-adjust": -500,
    "pidfile": "",
    "raw-logs": false,
    "registry-mirrors": [],
    "runtimes": {
        "oci": {
            "path": "/usr/libexec/docker/docker-runc-current"
        }
    },
    "selinux-enabled": True,
    "storage-driver": "",
    "storage-opts": [],
    "tls": true,
    "tlscacert": "",
    "tlscert": "",
    "tlskey": "",
    "tlsverify": true,
    "userns-remap": "",
    "add-registry": [u'test.registry.com:8888', u'registry.access.redhat.com'],
    "blocked-registries": [u'registry.hacker.com'],
    "userland-proxy-path": "/usr/libexec/docker/docker-proxy-current"
}

Expected results:
No errors

Additional info:
After removing the "u" character in the image prefix, restarting container-engine may hit other issues:

May 05 05:39:51 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[51741]: time="2017-05-05T05:39:51-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'T' looking for beginning of value\n"

May 05 05:47:29 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[62757]: time="2017-05-05T05:47:29-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: runtimes: (from flag: [oci], from file: map[oci:map[path:/usr/libexec/docker/docker-runc-current]]), authorization-plugins: (from flag: [rhel-push-plugin], from file: [rhel-push-plugin]), containerd: (from flag: /run/containerd.sock, from file: /run/containerd.sock), default-runtime: (from flag: oci, from file: oci), exec-opts: (from flag: [native.cgroupdriver=systemd], from file: [native.cgroupdriver=systemd])\n"

May 05 05:47:49 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[63646]: time="2017-05-05T05:47:49-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '}' looking for beginning of object key string\n"
I see the issue. The strings are being turned into unicode instances inside the template. I'll put something together.
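A minimal Python sketch of the failure mode described above, assuming the template interpolates Python values directly into the JSON text. The variable names are illustrative and are not taken from openshift-ansible. Under Python 2 the list repr carries the u'...' prefix seen in the generated file; under Python 3 the prefix is gone, but the single quotes and the capitalized True are still not JSON. Serializing each value with json.dumps avoids both problems.

```python
import json

# Illustrative values only; these names are not from the installer.
registries = ["test.registry.com:8888", "registry.ops.openshift.com"]
selinux_enabled = True

# Naive interpolation falls back to Python's str()/repr(), which is not JSON:
# strings come out single-quoted and the boolean is rendered as "True".
naive = '{"insecure-registries": %s, "selinux-enabled": %s}' % (
    registries,
    selinux_enabled,
)

# json.dumps emits double-quoted strings and lowercase booleans.
safe = '{"insecure-registries": %s, "selinux-enabled": %s}' % (
    json.dumps(registries),
    json.dumps(selinux_enabled),
)

print(naive)
print(safe)
```

In an Ansible template the analogous approach would probably be a JSON filter such as to_json on each value; whether the linked pull request takes that route is not stated in this report.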
Created https://github.com/openshift/openshift-ansible/pull/4106 and added Gan as a reviewer.
After deleting the illegal parameters one by one, only the following daemon.json allowed container-engine to start successfully. Also, `/var/run/docker.pid` has to be deleted manually even though docker has been stopped.

# cat /etc/docker/daemon.json
{
    "api-cors-header": "",
    "bip": "",
    "bridge": "",
    "cgroup-parent": "",
    "cluster-store": "",
    "cluster-store-opts": {},
    "cluster-advertise": "",
    "debug": true,
    "default-gateway": "",
    "default-gateway-v6": "",
    "default-ulimits": {},
    "disable-legacy-registry": false,
    "dns": [],
    "dns-opts": [],
    "dns-search": [],
    "exec-root": "",
    "fixed-cidr": "",
    "fixed-cidr-v6": "",
    "graph": "",
    "group": "",
    "hosts": [],
    "icc": false,
    "insecure-registries": [u'test.registry.com:8888', u'registry.ops.openshift.com'],
    "ip": "0.0.0.0",
    "iptables": true,
    "ipv6": false,
    "ip-forward": false,
    "ip-masq": false,
    "labels": [],
    "live-restore": true,
    "log-level": "",
    "log-opts": null,
    "max-concurrent-downloads": 3,
    "max-concurrent-uploads": 5,
    "mtu": 0,
    "oom-score-adjust": -500,
    "raw-logs": false,
    "registry-mirrors": [],
    "userns-remap": ""
}
Added https://github.com/projectatomic/atomic-system-containers/pull/65 to remove /etc/sysconfig/docker usage from the system container.
https://github.com/projectatomic/atomic-system-containers/pull/65 merged. Waiting for CI on https://github.com/openshift/openshift-ansible/pull/4106
Both are now merged.
Tested with the latest openshift-ansible master branch and the latest "container-engine" (IMAGE ID: edd29b7740cd).

Some issues are still not addressed.

1)
> May 05 05:39:51 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[51741]: time="2017-05-05T05:39:51-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'T' looking for beginning of value\n"

This is caused by the capital "T", which the Docker daemon cannot parse:
# grep T /etc/docker/daemon.json
"selinux-enabled": True,

2)
The problem persists after fixing the issue above; it seems "blocked-registries" is not supported in /etc/docker/daemon.json:
> May 10 01:41:53 qe-ghuang-master-nfs-1.localdomain runc[23807]: time="2017-05-10T05:41:53Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives don't match any configuration option: blocked-registries\n"

3)
There are still many duplicated settings that I couldn't figure out:
> May 10 01:42:37 qe-ghuang-master-nfs-1.localdomain runc[24688]: time="2017-05-10T05:42:37Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: add-registry: (from flag: [registry.access.redhat.com], from file: [registry.access.redhat.com]), runtimes: (from flag: [oci], from file: map[oci:map[path:/usr/libexec/docker/docker-runc-current]]), authorization-plugins: (from flag: [rhel-push-plugin], from file: [rhel-push-plugin]), containerd: (from flag: /run/containerd.sock, from file: /run/containerd.sock), default-runtime: (from flag: oci, from file: oci), exec-opts: (from flag: [native.cgroupdriver=systemd], from file: [native.cgroupdriver=systemd]), storage-driver: (from flag: devicemapper, from file: ), selinux-enabled: (from flag: true, from file: true), storage-opts: (from flag: [dm.fs=xfs dm.thinpooldev=/dev/mapper/rhel-docker--pool dm.use_deferred_removal=true], from file: []), userland-proxy-path: (from flag: /usr/libexec/docker/docker-proxy-current, from file: /usr/libexec/docker/docker-proxy-current)\n"
...
(In reply to Gan Huang from comment #7)
> Test with latest openshift-ansible master branch and latest
> "container-engine" (IMAGE ID: edd29b7740cd)
>
> Still have some issues not addressed.
> 1)
> > May 05 05:39:51 host-8-175-193.host.centralci.eng.rdu2.redhat.com runc[51741]: time="2017-05-05T05:39:51-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'T' looking for beginning of value\n"
>
> It's caused by capital "T" which can't be recognized by Docker daemon
> # grep T /etc/docker/daemon.json
> "selinux-enabled": True,

Ah, I see. Will fix. I should have noticed that...

> 2)
> Problem still persists after fixing the issue above, seems
> "blocked-registries" not supported in /etc/docker/daemon.json
>

Interesting. The code seems to indicate it is supported but I'll check again.

> > May 10 01:41:53 qe-ghuang-master-nfs-1.localdomain runc[23807]: time="2017-05-10T05:41:53Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives don't match any configuration option: blocked-registries\n"
>
> 3)
> There still many duplicated settings that I couldn't figure out:
>
> > May 10 01:42:37 qe-ghuang-master-nfs-1.localdomain runc[24688]: time="2017-05-10T05:42:37Z" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: add-registry: (from flag: [registry.access.redhat.com], from file: [registry.access.redhat.com]), runtimes: (from flag: [oci], from file: map[oci:map[path:/usr/libexec/docker/docker-runc-current]]), authorization-plugins: (from flag: [rhel-push-plugin], from file: [rhel-push-plugin]), containerd: (from flag: /run/containerd.sock, from file: /run/containerd.sock), default-runtime: (from flag: oci, from file: oci), exec-opts: (from flag: [native.cgroupdriver=systemd], from file: [native.cgroupdriver=systemd]), storage-driver: (from flag: devicemapper, from file: ), selinux-enabled: (from flag: true, from file: true), storage-opts: (from flag: [dm.fs=xfs dm.thinpooldev=/dev/mapper/rhel-docker--pool dm.use_deferred_removal=true], from file: []), userland-proxy-path: (from flag: /usr/libexec/docker/docker-proxy-current, from file: /usr/libexec/docker/docker-proxy-current)\n"

The duplicated settings should be fixed with https://bugzilla.redhat.com/show_bug.cgi?id=1448372 as it removes the use of /etc/sysconfig/docker from the system container.
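The "specified both as a flag and in the configuration file" error can be approximated with a small Python sketch. It mirrors only the behavior the log implies and is not Docker's actual implementation; the function name and the abridged sample data are hypothetical. Note that the log reports storage-driver and storage-opts even though their values in the file are empty, which suggests that the mere presence of a key counts as a conflict.

```python
def conflicting_directives(flag_names, file_config):
    """Return directive names set both as a flag and in the config file.

    Only key presence is compared: an empty value in the file still counts,
    as with "storage-driver" in the log quoted in this report.
    """
    return sorted(set(flag_names) & set(file_config))


# Sample data modeled on the log in this report (abridged, hypothetical).
flags = {"storage-driver", "selinux-enabled", "userland-proxy-path"}
config = {
    "storage-driver": "",
    "selinux-enabled": True,
    "debug": True,
}

print(conflicting_directives(flags, config))
```

Under this reading, a directive has to be dropped from one side entirely; blanking its value in daemon.json would not be enough.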
Added https://github.com/openshift/openshift-ansible/pull/4147 for lower case T and renaming blocked to block.
Merged
Tested with the latest openshift-ansible including PR4147: selinux-enabled is still set to "True", and it seems `block-registries` should be renamed to `docker-registry`. Proposed fix: https://github.com/openshift/openshift-ansible/pull/4158
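A capitalized True like the one reported here can be caught before restarting container-engine by parsing the generated file first. The following is a minimal sketch using Python's standard json module; the function name is hypothetical and the fragments are modeled on lines quoted in this report. Python's parser words its errors differently from the Go parser that produces the daemon's log messages, so only the reported position is comparable.

```python
import json


def first_json_error(text):
    """Return (line, column, message) for the first parse error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return (exc.lineno, exc.colno, exc.msg)
    return None


# Fragments modeled on the generated file quoted in this report.
python_bool = '{"selinux-enabled": True}'
unicode_repr = '{"insecure-registries": [u\'test.registry.com:8888\']}'
fixed = '{"insecure-registries": ["test.registry.com:8888"], "selinux-enabled": true}'

for sample in (python_bool, unicode_repr, fixed):
    print(first_json_error(sample))
```

To check a real file, its contents could be read and passed to the same function, for example the text of /etc/docker/daemon.json.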
These changes worked for me and have been merged.
Tested with the latest openshift-ansible. container-engine still failed to start:

May 11 21:55:28 qe-ghuang-master-nfs-1.localdomain runc[13140]: time="2017-05-11T21:55:28-04:00" level=fatal msg="unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: userland-proxy-path: (from flag: /usr/libexec/docker/docker-proxy-current, from file: /usr/libexec/docker/docker-proxy-current)\n"

It restarted successfully after removing "userland-proxy-path": "/usr/libexec/docker/docker-proxy-current" from /etc/docker/daemon.json.
Found it. The init.sh was still referencing that outside of the json. PR: https://github.com/openshift/openshift-ansible/pull/4174
Verified with openshift-ansible-3.6.68-1.git.0.9cbe2b7.el7.noarch.rpm container-engine can be started properly.
Reopening this issue:

Seems `add-registry`, `block-registry`, `insecure-registries` in /etc/docker/daemon.json didn't work as expected, installer failed at:

TASK [openshift_version : Set precise containerized version to configure if openshift_release specified] ***
Monday 05 June 2017 06:17:15 +0000 (0:00:00.114) 0:03:37.158 ***********
fatal: [openshift-140.lab.sjc.redhat.com]: FAILED! => {
    "changed": true,
    "cmd": [
        "docker",
        "run",
        "--rm",
        "openshift3/ose:v3.6",
        "version"
    ],
    "delta": "0:00:02.489766",
    "end": "2017-06-05 02:17:18.754778",
    "failed": true,
    "rc": 125,
    "start": "2017-06-05 02:17:16.265012",
    "warnings": []
}

STDERR:

Unable to find image 'openshift3/ose:v3.6' locally
Trying to pull repository docker.io/openshift3/ose ...
/usr/bin/docker-current: unauthorized: authentication required.
See '/usr/bin/docker-current run --help'.

fatal: [openshift-125.lab.sjc.redhat.com]: FAILED! => {
    "changed": true,
    "cmd": [
        "docker",
        "run",
        "--rm",
        "openshift3/ose:v3.6",
        "version"
    ],
    "delta": "0:00:02.515127",
    "end": "2017-06-05 02:17:17.958299",
    "failed": true,
    "rc": 125,
    "start": "2017-06-05 02:17:15.443172",
    "warnings": []
}

STDERR:

Unable to find image 'openshift3/ose:v3.6' locally
Trying to pull repository docker.io/openshift3/ose ...
/usr/bin/docker-current: unauthorized: authentication required.
See '/usr/bin/docker-current run --help'.
With the same config I can't replicate the issue (docker 1.12.6):

$ sudo docker pull openshift3/ose:v3.6
Trying to pull repository brew-pulp.xxxxxxxxxx:8888/openshift3/ose ...
sha256:0c8ae9030a2479d5a9407d5adb90476ee055bac8c16c2253750b515d4c0fa6b6: Pulling from brew-pulp.xxxxxxxxxx:8888/openshift3/ose

- What's the version of the docker command that is present?
- Can you provide the log of the run?
- Is this only happening when doing containerized installs or is there failure with rpm install as well?
It looks like in the latest build Jhon moved the daemon.json file to container-deamon.json and, thus, the changes we are making to daemon.json are not being picked up. Once I copied container-deamon.json to daemon.json it started to work once more. I'm going to update the installer to modify container-deamon.json instead.
*container-daemon.json
PR: https://github.com/openshift/openshift-ansible/pull/4370
Merged.
Tested against openshift-ansible-3.6.98-1.git.0.e651d65.el7.noarch.rpm

/etc/docker/container-daemon.json is generated and docker system container is running well.

atomic-1.17.2-4.git2760e30.el7.x86_64
runc-1.0.0-6.gite800860.el7.x86_64

# atomic -v
1.17.1

# runc -v
runc version 1.0.0-rc3
commit: cafb8d8755dc2b990fc73fbf7bff62f534da9219-dirty
spec: 1.0.0-rc5

# docker version
Client:
 Version: 1.12.6
 API version: 1.24
 Package version: docker-1.12.6-28.git1398f24.el7.x86_64
 Go version: go1.7.4
 Git commit: 1398f24/1.12.6
 Built: Wed May 17 01:16:44 2017
 OS/Arch: linux/amd64

Server:
 Version: 1.12.6
 API version: 1.24
 Package version: docker-1.12.6-31.git3a6eaeb.el7.x86_64
 Go version: go1.7.6
 Git commit: 3a6eaeb/1.12.6
 Built: Tue Jun 6 12:45:07 2017
 OS/Arch: linux/amd64

# atomic images list|grep container-engine
xxx/rhel7/container-engine latest 7d4eccca7dfc 2017-06-11 21:58 ostree
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716