Bug 1569106 - 3.10: logging-fluentd pod fails to start: Path /var/lib/docker/containers is mounted on /var/lib/docker/containers but it is not a shared or slave mount.
Summary: 3.10: logging-fluentd pod fails to start: Path /var/lib/docker/containers is mounted on /var/lib/docker/containers but it is not a shared or slave mount.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.10.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.10.0
Assignee: ewolinet
QA Contact: Anping Li
URL:
Whiteboard: aos-scalability-310
Depends On:
Blocks:
 
Reported: 2018-04-18 15:42 UTC by Mike Fiedler
Modified: 2019-06-20 13:51 UTC (History)
11 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
undefined
Clone Of:
Environment:
Last Closed: 2018-12-20 21:11:48 UTC
Target Upstream Version:
Embargoed:



Description Mike Fiedler 2018-04-18 15:42:10 UTC
Description of problem:

  Normal   Created                6m (x3 over 6m)   kubelet, ip-172-31-54-175.us-west-2.compute.internal  Created container                  
  Warning  Failed                 6m (x3 over 6m)   kubelet, ip-172-31-54-175.us-west-2.compute.internal  Error: failed to start container "fluentd-elasticsearch": Error response from daemon: linux mounts: Path /var/lib/docker/containers is mounted on /var/lib/docker/containers but it is not a shared or slave mount.
  Normal   Pulled                 6m (x3 over 6m)   kubelet, ip-172-31-54-175.us-west-2.compute.internal  Container image "registry.reg-aws.openshift.com:443/openshift3/logging-fluentd:v3.10" already present on machine
  Warning  BackOff                1m (x21 over 6m)  kubelet, ip-172-31-54-175.us-west-2.compute.internal  Back-off restarting failed container
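
For context, the propagation mode of the path named in the error can be checked directly on the affected node; a minimal diagnostic sketch (the hostname and the "private" value shown are illustrative, not captured from this cluster):

[root@node ~]# findmnt -o TARGET,PROPAGATION /var/lib/docker/containers
TARGET                     PROPAGATION
/var/lib/docker/containers private

Docker reports the "not a shared or slave mount" error when the host path being bind-mounted is itself a mount point whose propagation is neither shared nor slave.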


Version-Release number of selected component (if applicable): v3.10.0-0.22.0


How reproducible:  Always


Steps to Reproduce:
1.  Install logging with the inventory below - adjust as needed.


Actual results:

Install is successful but logging-fluentd pods are stuck in a crash loop with the error "Error: failed to start container "fluentd-elasticsearch": Error response from daemon: linux mounts: Path /var/lib/docker/containers is mounted on /var/lib/docker/containers but it is not a shared or slave mount."

Expected results:

logging-fluentd starts successfully and collects pod logs

Additional info:

[OSEv3:children]
masters
etcd


[masters]
ip-172-31-40-189

[etcd]
ip-172-31-40-189



[OSEv3:vars]
deployment_type=openshift-enterprise

openshift_deployment_type=openshift-enterprise
openshift_release=v3.10
openshift_docker_additional_registries=registry.reg-aws.openshift.com


openshift_logging_install_logging=true
openshift_logging_master_url=https://ec2-18-236-98-55.us-west-2.compute.amazonaws.com:8443
openshift_logging_master_public_url=https://ec2-18-236-98-55.us-west-2.compute.amazonaws.com:8443
openshift_logging_kibana_hostname=kibana.34.223.229.29.xip.io
openshift_logging_image_prefix=registry.reg-aws.openshift.com:443/openshift3/
openshift_logging_image_version=v3.10
openshift_logging_es_cluster_size=1
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=10Gi
openshift_logging_es_pvc_storage_class_name=gp2
openshift_logging_fluentd_read_from_head=false
openshift_logging_use_mux=false
openshift_logging_curator_nodeselector={"region": "infra"}
openshift_logging_kibana_nodeselector={"region": "infra"}
openshift_logging_es_nodeselector={"region": "infra"}

Comment 1 Rich Megginson 2018-04-18 15:58:38 UTC
Yep, being worked on from multiple angles.

In the meantime, change the mount to /var/lib/docker in the ds/logging-fluentd
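
A manual way to apply that workaround (a sketch only; the logging namespace, volume layout, and container name are taken from this report and may differ in other deployments):

[root@master ~]# oc -n logging edit ds/logging-fluentd
# change the hostPath volume that currently points at /var/lib/docker/containers, and the
# matching volumeMount in the fluentd-elasticsearch container, to /var/lib/docker, then let
# the daemonset re-create its pods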

Comment 3 Jeff Cantrill 2018-05-02 14:42:23 UTC
Believe this is resolved by https://github.com/openshift/origin/pull/19364.  Please verify

Comment 4 Mike Fiedler 2018-05-03 18:27:29 UTC
Verified on v3.10.0-0.32.0.   logging-fluentd pods start fine.

Comment 5 Anping Li 2018-05-07 08:15:08 UTC
The issue appears again in 3.10.0-0.33.0.  The kubernetes version, v1.10.0+b81c8f8, is the same as in v3.10.0-0.32.0.

atomic-openshift-node-3.10.0-0.33.0.git.0.db310d4.el7.x86_64
[root@ip-172-18-25-83 ~]# oc version
oc v3.10.0-0.32.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

openshift v3.10.0-0.32.0
kubernetes v1.10.0+b81c8f8


ft.com:443/openshift3/logging-fluentd:v3.10.0"
  24m		24m		3	kubelet, ip-172-18-0-191.ec2.internal	spec.containers{fluentd-elasticsearch}	Normal		Created			Created container
  24m		24m		3	kubelet, ip-172-18-0-191.ec2.internal	spec.containers{fluentd-elasticsearch}	Warning		Failed			Error: failed to start container "fluentd-elasticsearch": Error response from daemon: linux mounts: Path /var/lib/docker/containers is mounted on /var/lib/docker/containers but it is not a shared or slave mount.
  24m		24m		2	kubelet, ip-172-18-0-191.ec2.internal	spec.containers{fluentd-elasticsearch}	Normal		Pulled			Container image "registry.reg-aws.openshift.com:443/openshift3/logging-fluentd:v3.10.0" already present on machine
  24m		4m		88	kubelet, ip-172-18-0-191.ec2.internal	spec.containers{fluentd-elasticsearch}	Warning		BackOff			Back-off restarting failed container

Comment 6 Mike Fiedler 2018-05-08 18:09:41 UTC
Marking this bug as VERIFIED in comment 4 was incorrect (sorry @anli, I thought I had QA on it).

This issue is NOT reproducible when the container runtime is CRI-O - the fluentd pod starts fine with the mount at /var/lib/docker/containers.

This issue IS reproducible on logging-fluentd v3.10.0-0.32.0 and v3.10.0-0.37.0 when the container runtime is Docker 1.13.  The pod fails to start.  If the mount is changed to /var/lib/docker as a workaround (see comment 1), the pod does start.

When I verified in comment 4, I was using a CRI-O runtime and thus missed the problem.  

This bz is correctly in ASSIGNED state.
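
For anyone retesting, the runtime in use on each node can be confirmed up front; a small sketch (column contents are illustrative):

[root@master ~]# oc get nodes -o wide
# the CONTAINER-RUNTIME column distinguishes docker://1.13.x nodes from cri-o:// nodes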

Comment 7 Rich Megginson 2018-05-09 15:42:57 UTC
reassigning to Containers

Comment 8 Daniel Walsh 2018-05-14 18:59:24 UTC
But we don't intend to fix this in docker; we want fluentd to use the higher-level directory.

Comment 9 Rich Megginson 2018-05-14 19:03:38 UTC
(In reply to Daniel Walsh from comment #8)
> But we don't intend to fix this in docker, we want fluentd to use the higher
> level directory.

So it is ok for fluentd to mount /var/lib/docker into the fluentd container? Ok, then we'll reassign back to Logging to fix in openshift-ansible.

Comment 10 Daniel Walsh 2018-05-14 20:07:26 UTC
I would prefer that they did not, but that is better than the conflict that we have now, where oci-umount is causing fluentd issues.

Comment 11 Rich Megginson 2018-05-14 20:12:38 UTC
@ewolinetz - please resurrect your openshift-ansible patch to change the mount point to /var/lib/docker

Comment 12 Rich Megginson 2018-05-14 20:13:40 UTC
> @ewolinetz - please resurrect your openshift-ansible patch to change the
> mount point to /var/lib/docker

including Eric

Comment 14 Junqi Zhao 2018-05-16 07:17:45 UTC
Please change to ON_QA; the issue is fixed in openshift-ansible-3.10.0-0.47.0.
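
A quick way to confirm the change on an upgraded cluster is to check which host paths the daemonset now mounts; a sketch (namespace assumed to be logging, expected value follows comments 11-12):

[root@master ~]# oc -n logging get ds/logging-fluentd -o yaml | grep 'path: /var/lib/docker'
# the hostPath entries should now show /var/lib/docker rather than /var/lib/docker/containers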

