Description of problem:
The OpenShift EFK stack, which is part of the infra, deploys fluentd as a daemonset to collect node and container logs. It has mounted various parts of the host (since its inception in v3.2) to collect application and journal logs [1]. Since the mount propagation feature landed [2], log collection is completely broken, which is blocking bug fixing in the current stack and testing for the release of the ES5 stack.

[1] https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_logging_fluentd/templates/2.x/fluentd.j2#L215-L245
[2] https://github.com/kubernetes/kubernetes/issues/61058
    https://github.com/kubernetes/kubernetes/pull/61126

Version-Release number of selected component (if applicable):
openshift v3.10.0-alpha.0+435f98f-619 and kubernetes v1.10.0+b81c8f8

How reproducible:
Always

Steps to Reproduce:
1. Deploy openshift EFK stack
You can use the whole /var/lib/docker instead of /var/lib/docker/containers in your pod:

  spec:
    containers:
    - name: varlibdocker
      mountPath: /var/lib/docker
      readOnly: true

  ...
    volumes:
    - name: varlibdocker
      hostPath:
        path: /var/lib/docker

A proper fix would require us to change the API in Kubernetes, which is a long and tedious process that does not work well with urgent bugs.
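The snippet above elides where the mount entry sits inside the container definition. As a rough sketch only, the same workaround in a complete pod spec might look like the following. The pod name, container name, and image are hypothetical placeholders and do not come from the logging templates; only the volume name and the /var/lib/docker paths come from the comment.

```yaml
# Illustration only: minimal pod mounting the whole /var/lib/docker read-only.
apiVersion: v1
kind: Pod
metadata:
  name: fluentd-hostpath-example     # hypothetical name
spec:
  containers:
  - name: fluentd                    # hypothetical container name
    image: example/fluentd:latest    # placeholder, not the real logging image
    volumeMounts:
    - name: varlibdocker             # must match the volume name below
      mountPath: /var/lib/docker
      readOnly: true
  volumes:
  - name: varlibdocker
    hostPath:
      path: /var/lib/docker          # parent directory instead of .../containers
```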
Jeff - we made rslave the default mount propagation in 1.10, and since docker explicitly marks /var/lib/docker/containers as a private mount, it can't be mounted within a pod. Mounting /var/lib/docker still works because it is within the "/" file system and can be mounted as rslave. If this workaround does not work, we can disable the mount propagation feature in the cluster. Providing private mount as an optional parameter will require an API change upstream and will take time.
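For reference, Kubernetes exposes propagation per volume mount through the mountPropagation field, and the HostToContainer value is the one documented as equivalent to rslave. The sketch below only spells out explicitly what the 1.10 default described above amounts to. It assumes the MountPropagation feature is enabled in the cluster; the pod name, container name, and image are hypothetical.

```yaml
# Illustration only: an explicit rslave-style mount of /var/lib/docker.
apiVersion: v1
kind: Pod
metadata:
  name: propagation-example          # hypothetical name
spec:
  containers:
  - name: reader                     # hypothetical container name
    image: example/reader:latest     # placeholder image
    volumeMounts:
    - name: varlibdocker
      mountPath: /var/lib/docker
      readOnly: true
      mountPropagation: HostToContainer   # host mounts propagate into the container (rslave)
  volumes:
  - name: varlibdocker
    hostPath:
      path: /var/lib/docker
```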
For now I have opened a PR to disable mount propagation via ansible - https://github.com/openshift/openshift-ansible/pull/7936
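The contents of that PR are not reproduced here. As an assumption about what disabling the feature could look like on a node, the kubelet feature gate MountPropagation can be turned off through kubeletArguments in an OpenShift 3.x node-config.yaml. The fragment below is a sketch of that idea, not the change made in the PR.

```yaml
# Illustration only: fragment of node-config.yaml (assumed location and layout).
kubeletArguments:
  feature-gates:
  - MountPropagation=false   # turns off the kubelet mount propagation feature gate
```

The node service would need a restart for a change like this to take effect.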
Long-term, I want Kubernetes to revert to "private" propagation by default (i.e. the same as in 1.9 and earlier): https://github.com/kubernetes/kubernetes/pull/62462
Revert PR for Openshift as well - https://github.com/openshift/origin/pull/19364
(In reply to Jan Safranek from comment #1)
> You can use whole /var/lib/docker instead of /var/lib/docker/containers in
> your pod:
>
>   spec:
>     containers:
>     - name: varlibdocker
>       mountPath: /var/lib/docker
>       readOnly: true
>
>   ...
>     volumes:
>     - name: varlibdocker
>       hostPath:
>         path: /var/lib/docker
>
>
> Proper fix would require us to change API in Kubernetes, which is long and
> tedious process that does not work well with urgent bugs.

There is one interesting scenario: with the following parameters set, the workaround is not needed and the fluentd pods start up.

openshift_logging_use_ops=true
openshift_logging_es_cluster_size=2
openshift_logging_es_ops_cluster_size=2

# oc get pod
NAME                                          READY     STATUS    RESTARTS   AGE
logging-curator-1-rjbpp                       1/1       Running   0          47m
logging-curator-ops-1-676fp                   1/1       Running   0          47m
logging-es-data-master-i39dne2b-1-ckqkp       2/2       Running   0          46m
logging-es-data-master-tumll5zj-1-k6rzh       2/2       Running   0          46m
logging-es-ops-data-master-4z0dr5nh-1-cx67r   2/2       Running   0          46m
logging-es-ops-data-master-vj9lewcb-1-7zk2l   2/2       Running   0          46m
logging-fluentd-b6dcw                         1/1       Running   0          46m
logging-fluentd-s28sd                         1/1       Running   0          46m
logging-kibana-1-frx6j                        2/2       Running   0          48m
logging-kibana-ops-1-cpkrs                    2/2       Running   0          47m

The workaround is still needed without the following settings:

openshift_logging_es_cluster_size=2
openshift_logging_es_ops_cluster_size=2

For more info, see the attached fluentd ds file.
Created attachment 1428003
fluentd ds file
The issue is fixed; fluentd pods can start up now. /var/lib/docker is used as the hostPath for fluentd:

        - mountPath: /var/lib/docker
          name: varlibdockercontainers
          readOnly: true

      - hostPath:
          path: /var/lib/docker
          type: ""
        name: varlibdockercontainers

Image version: v3.10.0-0.54.0.0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816