Created attachment 1341220 [details]
Errors from journalctl when running oc cluster up --service-catalog=true

Description of problem:
I am trying to run 'oc cluster up --service-catalog=true --routing-suffix=172.18.0.1.nip.io --public-hostname=172.18.0.1 --host-pv-dir=/persistedvolumes --image=openshift/origin --version=latest'

This worked with skopeo-containers-0.1.24-4.dev.git28d4e08.fc27.x86_64, but it fails with skopeo-containers-0.1.24-6.dev.git28d4e08.fc27.x86_64. It appears an oc cluster up without the service-catalog still works, but an oc cluster up with the new service-catalog and skopeo-containers-0.1.24-6.dev.git28d4e08.fc27.x86_64 breaks.

Version-Release number of selected component (if applicable):
skopeo-containers-0.1.24-6.dev.git28d4e08.fc27.x86_64 works.

How reproducible:
Always

Steps to Reproduce:
1. Install skopeo-containers-0.1.24-6.dev.git28d4e08.fc27.x86_64
2. Run oc cluster up --service-catalog=true

Actual results:
oc cluster up hangs for a long time and probably eventually fails

Expected results:
oc cluster up works

Additional info:
Works with skopeo-containers-0.1.24-4.dev.git28d4e08.fc27.x86_64, or if you delete /usr/share/rhel/secrets in the newer skopeo-containers.
"Version-Release number of selected component (if applicable): skopeo-containers-0.1.24-6.dev.git28d4e08.fc27.x86_64 works." Should just say: "Version-Release number of selected component (if applicable): skopeo-containers-0.1.24-6.dev.git28d4e08.fc27.x86_64" skopeo-containers-0.1.24-6.dev.git28d4e08.fc27.x86_64 is the package that is not working.
Thanks for the bug report. Adding cewong on cc. Could you please paste the output of 'oc cluster up --loglevel=4'?
Created attachment 1341259 [details] oc cluster up with loglevel 4
Hi Jason,

This looks like an issue with either the service catalog API server or controller. Can you please:

1. Run cluster up with --loglevel=4 and, while it's printing out "polling for service catalog api server endpoint availability", hit Ctrl+C.
2. Run 'docker exec -ti origin bash' to access the shell inside the origin container.
3. From there run 'oc get pods -n kube-service-catalog'. You will likely see one or two pods in CrashLoopBackOff status.
4. Obtain the logs of each with 'oc log -n kube-service-catalog [pod-name]' and attach them to the BZ, along with the output of the 'oc get pods' command above.
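For convenience, the same sequence in one place (the pod names in brackets are placeholders to be filled in from the 'oc get pods' output):

$ oc cluster up --service-catalog=true --loglevel=4    (Ctrl+C while it polls for the service catalog API server endpoint)
$ docker exec -ti origin bash
# oc get pods -n kube-service-catalog
# oc log -n kube-service-catalog [apiserver-pod-name]
# oc log -n kube-service-catalog [controller-manager-pod-name]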
It works fine if I delete /usr/share/rhel/secrets, so I do not believe the container is non-functional on its own. My hunch is something along these lines: /rootfs/run/secrets is being used to mount this dir, and then when the secrets the apiserver requires are mounted onto the now read-only fs there, it falls apart, hence the "mkdir /var/lib/docker/devicemapper/mnt/34e7025d5413c6b81e3a5e6e9e4266ca8ec68e7c9cef85625d12f8857a02dc31/rootfs/run/secrets/kubernetes.io: read-only file system" in the original log. I see the same on all the persistent-volume-setup pods, docker-registry-1-deploy, etc.

$ docker exec -ti origin bash
[root@jmontleo origin]# oc get pods --all-namespaces
NAMESPACE              NAME                                  READY     STATUS               RESTARTS   AGE
default                docker-registry-1-deploy              0/1       ContainerCannotRun   0          2m
default                persistent-volume-setup-02vs7         0/1       ContainerCannotRun   0          53s
default                persistent-volume-setup-1v98s         0/1       ContainerCannotRun   0          41s
default                persistent-volume-setup-2rwxv         0/1       ContainerCannotRun   0          2m
default                persistent-volume-setup-2wjjd         0/1       ContainerCannotRun   0          46s
default                persistent-volume-setup-2ztm2         0/1       ContainerCannotRun   0          19s
default                persistent-volume-setup-4bd8l         0/1       ContainerCreating    0          5s
default                persistent-volume-setup-58qt1         0/1       ContainerCannotRun   0          8s
default                persistent-volume-setup-6kxsq         0/1       ContainerCannotRun   0          26s
default                persistent-volume-setup-7d47f         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-7kb9q         0/1       ContainerCannotRun   0          22s
default                persistent-volume-setup-9f6f0         0/1       ContainerCannotRun   0          15s
default                persistent-volume-setup-jnwnl         0/1       ContainerCannotRun   0          50s
default                persistent-volume-setup-jxgzg         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-m748k         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-mk0xh         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-mnlbq         0/1       ContainerCannotRun   0          38s
default                persistent-volume-setup-n9j4p         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-p6clq         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-pk6zn         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-qrh33         0/1       ContainerCannotRun   0          33s
default                persistent-volume-setup-qt2b1         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-rss82         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-td5sx         0/1       ContainerCannotRun   0          30s
default                persistent-volume-setup-twrqv         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-v10b8         0/1       ContainerCannotRun   0          1m
default                persistent-volume-setup-zl461         0/1       ContainerCannotRun   0          11s
default                persistent-volume-setup-zt99b         0/1       ContainerCannotRun   0          56s
default                router-1-deploy                       0/1       ContainerCannotRun   0          2m
kube-service-catalog   apiserver-273091082-jtglp             1/2       CrashLoopBackOff     4          2m
kube-service-catalog   controller-manager-3676386035-l04nc   0/1       CrashLoopBackOff     4          2m

# oc log -n kube-service-catalog controller-manager-3676386035-l04nc
W1020 17:22:53.689022   23366 cmd.go:403] log is DEPRECATED and will be removed in a future version. Use logs instead.
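One way to check this hunch directly is to look at the mount options docker applies to /run/secrets inside a fresh container (a sketch; the fedora image is just an assumption, any image with grep will do):

$ docker run --rm fedora grep ' /run/secrets ' /proc/mounts

If the hunch is right, the mount flags in the output should include 'ro'.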
container_linux.go:247: starting container process caused "process_linux.go:364: container init caused \"rootfs_linux.go:54: mounting \\\"/var/lib/origin/openshift.local.volumes/pods/ba2cd7c2-b5b9-11e7-b32d-64006a559656/volumes/kubernetes.io~secret/service-catalog-controller-token-smgtf\\\" to rootfs \\\"/var/lib/docker/devicemapper/mnt/c96d3bac59427d2b2d5c0cafd40cd5a8d1d31e380561adeb444598deec488bf8/rootfs\\\" at \\\"/var/lib/docker/devicemapper/mnt/c96d3bac59427d2b2d5c0cafd40cd5a8d1d31e380561adeb444598deec488bf8/rootfs/run/secrets/kubernetes.io/serviceaccount\\\" caused \\\"mkdir /var/lib/docker/devicemapper/mnt/c96d3bac59427d2b2d5c0cafd40cd5a8d1d31e380561adeb444598deec488bf8/rootfs/run/secrets/kubernetes.io: read-only file system\\\"\""

# oc log -n kube-service-catalog apiserver-273091082-jtglp -c apiserver
W1020 17:23:55.713227   29352 cmd.go:403] log is DEPRECATED and will be removed in a future version. Use logs instead.
container_linux.go:247: starting container process caused "process_linux.go:364: container init caused \"rootfs_linux.go:54: mounting \\\"/var/lib/origin/openshift.local.volumes/pods/ba1f124a-b5b9-11e7-b32d-64006a559656/volumes/kubernetes.io~secret/service-catalog-apiserver-token-2wv8t\\\" to rootfs \\\"/var/lib/docker/devicemapper/mnt/549517d5c2ccc12e4d790bba40dc74760b508e6f6859cbf5ed7daf3c2271a24a/rootfs\\\" at \\\"/var/lib/docker/devicemapper/mnt/549517d5c2ccc12e4d790bba40dc74760b508e6f6859cbf5ed7daf3c2271a24a/rootfs/run/secrets/kubernetes.io/serviceaccount\\\" caused \\\"mkdir /var/lib/docker/devicemapper/mnt/549517d5c2ccc12e4d790bba40dc74760b508e6f6859cbf5ed7daf3c2271a24a/rootfs/run/secrets/kubernetes.io: read-only file system\\\"\""

# oc log -n default docker-registry-1-deploy
W1020 17:24:18.975057   30866 cmd.go:403] log is DEPRECATED and will be removed in a future version. Use logs instead.
container_linux.go:247: starting container process caused "process_linux.go:364: container init caused \"rootfs_linux.go:54: mounting \\\"/var/lib/origin/openshift.local.volumes/pods/b9e9c343-b5b9-11e7-b32d-64006a559656/volumes/kubernetes.io~secret/deployer-token-9c7pj\\\" to rootfs \\\"/var/lib/docker/devicemapper/mnt/a29a6fe0a19acb099df839c2ce865c3c0e03c83cd58fd7f0279e6c9d9d913a5a/rootfs\\\" at \\\"/var/lib/docker/devicemapper/mnt/a29a6fe0a19acb099df839c2ce865c3c0e03c83cd58fd7f0279e6c9d9d913a5a/rootfs/run/secrets/kubernetes.io/serviceaccount\\\" caused \\\"mkdir /var/lib/docker/devicemapper/mnt/a29a6fe0a19acb099df839c2ce865c3c0e03c83cd58fd7f0279e6c9d9d913a5a/rootfs/run/secrets/kubernetes.io: read-only file system\\\"\""
skopeo-0.1.24-7.gitdd2c3e3.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
skopeo-0.1.24-7.gitdd2c3e3.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-6f4b15d9e3
skopeo-0.1.24-7.gitdd2c3e3.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-6f4b15d9e3
Same issue with skopeo-0.1.24-7.gitdd2c3e3.fc27
I am seeing the same thing on my F27 box as well
Jason, are you saying there was some recentish skopeo version where this is known to work, i.e. there's a regression? If so, what is that version where it worked?
Answering my own question. I think the commit that broke this is http://pkgs.fedoraproject.org/cgit/rpms/skopeo.git/commit/?id=faffe8847b203c873735e636f503b694e43052e1

Downgrading (--nodeps --force) to `skopeo-containers-0.1.24-1.dev.gita41cd0a.fc27.x86_64.rpm` reverts the break.

The above commit causes /usr/share/rhel/secrets to exist on the F27 host, which is then mounted by default at /run/secrets in the container, read-only. This prevents k8s from mounting into /run/secrets/kubernetes.io, which in turn prevents `oc cluster up` from running on F27, regardless of --service-catalog.
skopeo-containers-0.1.24-4.dev.git28d4e08.fc27.x86_64 was the last version that worked. The only things I saw change between -4 and -6 were the addition of the secrets and some changes to config options. I reverted the config options and saw no difference, leading me to believe it was the addition of the secrets. If I delete the new secrets directory, /usr/share/rhel/secrets, I can get -6 and -7 to work as well.
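For anyone hitting this in the meantime, the two workarounds mentioned in this thread boil down to the following (a sketch: the rm will be undone by the next skopeo-containers update, and the downgrade assumes you still have the older rpm on disk):

# Workaround 1: remove the secrets directory introduced in -6
$ sudo rm -rf /usr/share/rhel/secrets

# Workaround 2: force a downgrade to the last known-good build
$ sudo rpm -Uvh --oldpackage --nodeps --force skopeo-containers-0.1.24-4.dev.git28d4e08.fc27.x86_64.rpm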
skopeo-0.1.24-7.gitdd2c3e3.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
docker-1.13.1-36.git8fd0ebb.fc26, skopeo-0.1.24-7.gitdd2c3e3.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
docker-1.13.1-36.git8fd0ebb.fc26, skopeo-0.1.24-7.gitdd2c3e3.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
atomic-1.19.1-5.fc26, docker-1.13.1-37.git166a52e.fc26, skopeo-0.1.24-7.gitdd2c3e3.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
atomic-1.19.1-5.fc26, docker-1.13.1-38.git166a52e.fc26, skopeo-0.1.24-7.gitdd2c3e3.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
atomic-1.19.1-6.fc26, docker-1.13.1-40.git877b6df.fc26, skopeo-0.1.24-7.gitdd2c3e3.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
atomic-1.19.1-6.fc26, docker-1.13.1-40.git877b6df.fc26, skopeo-0.1.24-7.gitdd2c3e3.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-cbf83e5281
skopeo-0.1.24-7.gitdd2c3e3.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
I am still experiencing this problem when trying to start a container. Here is my system info:

CentOS Linux release 7.3.1611 (Core)

Installed RPMs:
kubernetes-client-1.8.1-1.el7.x86_64
kubernetes-node-1.8.1-1.el7.x86_64
skopeo-containers-0.1.25-2.git7fd6f66.el7.x86_64
docker-common-1.13.1-35.git8fd0ebb.el7.x86_64
docker-rhel-push-plugin-1.13.1-35.git8fd0ebb.el7.x86_64
docker-1.13.1-35.git8fd0ebb.el7.x86_64

All packages were retrieved from http://cbs.centos.org/repos/virt7-container-common-candidate/x86_64/os/Packages/

I get the following error from `kubectl describe pods k8s-test-885307647-bc9ch`:

Warning  Failed  10s  kubelet, ip-10-91-34-46.ec2.internal  Error: failed to start container "k8s-test": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:364: container init caused \"rootfs_linux.go:54: mounting \\\"/var/lib/kubelet/pods/935fe3e3-c886-11e7-a4d4-0e6c81959008/volumes/kubernetes.io~secret/default-token-7x2xh\\\" to rootfs \\\"/var/lib/docker/devicemapper/mnt/8222150ea4e887d4de776a4a47d6bebc3658214a7891bb6530570886f8e30671/rootfs\\\" at \\\"/var/lib/docker/devicemapper/mnt/8222150ea4e887d4de776a4a47d6bebc3658214a7891bb6530570886f8e30671/rootfs/run/secrets/kubernetes.io/serviceaccount\\\" caused \\\"mkdir /var/lib/docker/devicemapper/mnt/8222150ea4e887d4de776a4a47d6bebc3658214a7891bb6530570886f8e30671/rootfs/run/secrets/kubernetes.io: read-only file system\\\"\""
Re-opening: the CentOS build is just a recompile of Fedora Rawhide, so quite likely the issue isn't fixed yet.
Yesterday, I ran into this issue, too, after upgrading from Fedora 26 to Fedora 27 and testing the OpenShift Docker test installation (https://docs.openshift.org/latest/getting_started/administrators.html#running-in-a-docker-container): F26 works, F27 does not. See also: https://github.com/openshift/origin/issues/15038#issuecomment-344554311
atomic-1.19.1-6.fc26, docker-1.13.1-40.git877b6df.fc26, skopeo-0.1.24-7.gitdd2c3e3.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.
This is a docker issue; the secrets directory should not be read-only. This should work the same way on RHEL as it does on Fedora. Removing this directory from skopeo-containers is just masking the real issue.
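A minimal way to reproduce the underlying failure independent of Kubernetes (a sketch, assuming a host where skopeo-containers has created /usr/share/rhel/secrets so docker sets up the /run/secrets mount; the fedora image is just an example) is to attempt the same mkdir that kubelet performs when mounting the serviceaccount secret:

$ docker run --rm fedora mkdir /run/secrets/kubernetes.io

On an affected docker this should fail with "read-only file system"; with the patched docker it should succeed.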
same issue: https://bugzilla.redhat.com/show_bug.cgi?id=1511375
https://github.com/projectatomic/docker/commit/fef7588ca66af99707c314be5c63ee7c184f036d https://github.com/projectatomic/docker/commit/9099cbba16481a2979d750ab566633762395ad34 https://github.com/projectatomic/docker/commit/97d36c360d9973e498c2b73c4f783ab97d6dce36 https://github.com/projectatomic/docker/commit/612ed23ae365f586a76aca9869e1c4cb79d8957e https://github.com/projectatomic/docker/commit/4402c09586c72e0c32b90d72bd24304f609e2b7a The patches above are now taking care of this bug in Fedora while making sure we don't regress on https://bugzilla.redhat.com/show_bug.cgi?id=1440389 Could someone rebuild for fedora?
docker-1.13.1-42.git4402c09.fc27, skopeo-0.1.25-2.git7fd6f66.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-3da8ad596a
docker-1.13.1-42.git4402c09.fc27, skopeo-0.1.25-2.git7fd6f66.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-3da8ad596a
I confirm that upgrading to docker-1.13.1-42.git4402c09.fc27 from updates-testing fixes the same issue when running Kubernetes master with hack/local-up-cluster.sh and attempting to create a pod:

Type     Reason                 Age               From                Message
----     ------                 ----              ----                -------
Normal   Scheduled              43s               default-scheduler   Successfully assigned test-pod to 127.0.0.1
Normal   SuccessfulMountVolume  43s               kubelet, 127.0.0.1  MountVolume.SetUp succeeded for volume "default-token-7dmn8"
Warning  Failed                 25s               kubelet, 127.0.0.1  Error: failed to start container "test-pod": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:364: container init caused \"rootfs_linux.go:54: mounting \\\"/var/lib/kubelet/pods/da9bccd5-cea0-11e7-b3ea-525400533a7a/volumes/kubernetes.io~secret/default-token-7dmn8\\\" to rootfs \\\"/var/lib/docker/overlay2/fc1c26de88ac6064c898caec40e5f2f136522032ca3c3019e9abee1f5d97856b/merged\\\" at \\\"/var/lib/docker/overlay2/fc1c26de88ac6064c898caec40e5f2f136522032ca3c3019e9abee1f5d97856b/merged/run/secrets/kubernetes.io/serviceaccount\\\" caused \\\"mkdir /var/lib/docker/overlay2/fc1c26de88ac6064c898caec40e5f2f136522032ca3c3019e9abee1f5d97856b/merged/run/secrets/kubernetes.io: read-only file system\\\"\""
Warning  Failed                 21s               kubelet, 127.0.0.1  Error: failed to start container "test-pod": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:364: container init caused \"rootfs_linux.go:54: mounting \\\"/var/lib/kubelet/pods/da9bccd5-cea0-11e7-b3ea-525400533a7a/volumes/kubernetes.io~secret/default-token-7dmn8\\\" to rootfs \\\"/var/lib/docker/overlay2/4967a8d508134f2fec14c321185a61db734afa43eac78e74cffbffd49aa48d3f/merged\\\" at \\\"/var/lib/docker/overlay2/4967a8d508134f2fec14c321185a61db734afa43eac78e74cffbffd49aa48d3f/merged/run/secrets/kubernetes.io/serviceaccount\\\" caused \\\"mkdir /var/lib/docker/overlay2/4967a8d508134f2fec14c321185a61db734afa43eac78e74cffbffd49aa48d3f/merged/run/secrets/kubernetes.io: read-only file system\\\"\""
Normal   Pulling                6s (x3 over 42s)  kubelet, 127.0.0.1  pulling image "registry.access.redhat.com/rhel7"
Normal   Pulled                 3s (x3 over 25s)  kubelet, 127.0.0.1  Successfully pulled image "registry.access.redhat.com/rhel7"
Normal   Created                3s (x3 over 25s)  kubelet, 127.0.0.1  Created container
Warning  Failed                 3s                kubelet, 127.0.0.1  Error: failed to start container "test-pod": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:364: container init caused \"rootfs_linux.go:54: mounting \\\"/var/lib/kubelet/pods/da9bccd5-cea0-11e7-b3ea-525400533a7a/volumes/kubernetes.io~secret/default-token-7dmn8\\\" to rootfs \\\"/var/lib/docker/overlay2/c87d2519dec16b046a1b94356b5f595d586e439dfbc94340f7339f7c31b43662/merged\\\" at \\\"/var/lib/docker/overlay2/c87d2519dec16b046a1b94356b5f595d586e439dfbc94340f7339f7c31b43662/merged/run/secrets/kubernetes.io/serviceaccount\\\" caused \\\"mkdir /var/lib/docker/overlay2/c87d2519dec16b046a1b94356b5f595d586e439dfbc94340f7339f7c31b43662/merged/run/secrets/kubernetes.io: read-only file system\\\"\""
(In reply to Jan Pazdziora from comment #33) > I confirm that upgrading to docker-1.13.1-42.git4402c09.fc27 from > updates-testing fixes the same issue when running Kubernetes master with > hack/local-up-cluster.sh and attempting to create pod: > Jan, could you triple check this again? I'm not able to reproduce with that docker version
The docker-1.13.1-42.git4402c09.fc27 build fixes the issue.
docker-1.13.1-42.git4402c09.fc27, skopeo-0.1.25-2.git7fd6f66.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
No package that fixes this is yet in 'updates' for f26, although docker-1.13.1-44.git584d391.fc26 is available in 'updates-testing' (dnf --enablerepo=updates-testing update docker)
Seems like -42 has been obsoleted. Anyway, since -44 is a security update (though SELinux should protect against it already), could you please check and add karma to https://bodhi.fedoraproject.org/updates/FEDORA-2017-3976710f1e so we can get that into testing?
Confirmed that -44.fc26 fixes this issue; +1 karma added.
+1 for f26, at least one of these fixed this for me:

$ sudo dnf --enablerepo=updates-testing update docker
Upgrading:
 docker                   x86_64  2:1.13.1-44.git584d391.fc26  updates-testing   20 M
 docker-common            x86_64  2:1.13.1-44.git584d391.fc26  updates-testing   83 k
 docker-rhel-push-plugin  x86_64  2:1.13.1-44.git584d391.fc26  updates-testing  1.6 M
 skopeo                   x86_64  0.1.27-1.git93876ac.fc26     updates-testing  4.6 M
 skopeo-containers        x86_64  0.1.27-1.git93876ac.fc26     updates-testing   16 k
*** Bug 1544377 has been marked as a duplicate of this bug. ***