Bug 1389617 - Encountered fluentd/ES stuck after logging deployment
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Logging
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: ewolinet
QA Contact: Xia Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-10-28 03:11 UTC by Xia Zhao
Modified: 2016-12-09 21:49 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-09 21:49:59 UTC
Target Upstream Version:
Embargoed:


Attachments
deployer_log_short_write (23.49 KB, text/plain)
2016-11-04 09:24 UTC, Xia Zhao

Comment 1 Luke Meyer 2016-10-28 15:02:06 UTC
This is not a problem we've encountered before. From a bit of Googling it seems "error: short write" is an OS-level error message that usually indicates the filesystem is either out of space or out of inodes. If you're seeing it both in ES and fluentd logs, that suggests they're using the same filesystem (probably the root filesystem) and it's full.

Check:
# df -h /
# df -i /

This checks space and inodes (respectively) for the root filesystem.
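
As a rough sketch, the same check can be applied to every mounted filesystem rather than only the root. The helper name `over_threshold` and the 90% threshold are illustrative assumptions, not part of any tool. It assumes the six-column layout that `df -P` and `df -i` print and mount points without spaces.

```shell
#!/bin/sh
# over_threshold THRESHOLD
# Hypothetical helper: reads df-style output on stdin and prints the mount
# point and percentage of every filesystem whose fifth column (Use% or IUse%)
# is at or above THRESHOLD. Rows whose percentage is not numeric are skipped.
over_threshold() {
    threshold="$1"
    awk -v t="$threshold" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)              # "60%" -> "60"
        if (pct ~ /^[0-9]+$/ && pct + 0 >= t)
            print $6 " " $5            # mount point, then percentage
    }'
}

# Usage (assumed 90% warning level):
#   df -P | over_threshold 90    # block usage
#   df -i | over_threshold 90    # inode usage
```

Empty output from both invocations means no filesystem is near its space or inode limit, which makes a full disk an unlikely cause of "short write".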

Comment 2 Xia Zhao 2016-10-31 05:46:18 UTC
Thanks for the info. I will check this if I encounter it again. The origin instance is automatically deleted every day, so I don't have a repro environment to check at the moment.

Comment 3 Xia Zhao 2016-10-31 10:48:00 UTC
@Eric @Luke

I reproduced it on my env, and it seems the disk is not full:

# df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda2                                        25G   15G   11G  60% /
devtmpfs                                         1.9G     0  1.9G   0% /dev
tmpfs                                            1.8G     0  1.8G   0% /dev/shm
tmpfs                                            1.8G   18M  1.8G   2% /run
tmpfs                                            1.8G     0  1.8G   0% /sys/fs/cgroup
/dev/mapper/docker--vg-openshift--xfs--vol--dir  5.0G   33M  5.0G   1% /mnt/openshift-xfs-vol-dir
tmpfs                                            354M     0  354M   0% /run/user/0
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/7623b7b6-9f43-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/registry-token-z5kq8
tmpfs                                            1.8G  8.0K  1.8G   1% /root/openshift.local.volumes/pods/762f2a36-9f43-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/server-certificate
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/762f2a36-9f43-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/router-token-povpt
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/9d76ac1f-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                            1.8G  8.0K  1.8G   1% /root/openshift.local.volumes/pods/9d76ac1f-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/9d9c6c26-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                            1.8G  8.0K  1.8G   1% /root/openshift.local.volumes/pods/9d94f11b-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/9d94f11b-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                            1.8G  8.0K  1.8G   1% /root/openshift.local.volumes/pods/9d9c6c26-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/9ddb3633-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                            1.8G  8.0K  1.8G   1% /root/openshift.local.volumes/pods/9ddb3633-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                            1.8G  8.0K  1.8G   1% /root/openshift.local.volumes/pods/9e304ad9-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                            1.8G  8.0K  1.8G   1% /root/openshift.local.volumes/pods/9e35502a-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/9e35502a-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/9e304ad9-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/fa2a476c-9f47-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/logging-deployer-token-4u5t6
tmpfs                                            1.8G   12K  1.8G   1% /root/openshift.local.volumes/pods/0702a8e7-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/certs
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/0702a8e7-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-curator-token-scqen
tmpfs                                            1.8G   20K  1.8G   1% /root/openshift.local.volumes/pods/0886c19e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana-proxy
tmpfs                                            1.8G   12K  1.8G   1% /root/openshift.local.volumes/pods/0886c19e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/0886c19e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-kibana-token-r7td0
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/08f5d06f-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-kibana-token-r7td0
tmpfs                                            1.8G   20K  1.8G   1% /root/openshift.local.volumes/pods/08f5d06f-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana-proxy
tmpfs                                            1.8G   12K  1.8G   1% /root/openshift.local.volumes/pods/08f5d06f-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana
tmpfs                                            1.8G   12K  1.8G   1% /root/openshift.local.volumes/pods/09411ddd-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/certs
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/09411ddd-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-curator-token-scqen
tmpfs                                            1.8G   12K  1.8G   1% /root/openshift.local.volumes/pods/77c8592e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/certs
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/77c8592e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-fluentd-token-og3cv
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/5a3303bb-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-elasticsearch-token-2q8w1
tmpfs                                            1.8G   32K  1.8G   1% /root/openshift.local.volumes/pods/5a3303bb-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/elasticsearch
tmpfs                                            1.8G   16K  1.8G   1% /root/openshift.local.volumes/pods/613f9490-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-elasticsearch-token-2q8w1
tmpfs                                            1.8G   32K  1.8G   1% /root/openshift.local.volumes/pods/613f9490-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/elasticsearch


# df -i
Filesystem                                        Inodes  IUsed    IFree IUse% Mounted on
/dev/xvda2                                      26212288 186481 26025807    1% /
devtmpfs                                          480226    399   479827    1% /dev
tmpfs                                             452376      1   452375    1% /dev/shm
tmpfs                                             452376   1130   451246    1% /run
tmpfs                                             452376     13   452363    1% /sys/fs/cgroup
/dev/mapper/docker--vg-openshift--xfs--vol--dir  5222400      4  5222396    1% /mnt/openshift-xfs-vol-dir
tmpfs                                             452376      1   452375    1% /run/user/0
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/7623b7b6-9f43-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/registry-token-z5kq8
tmpfs                                             452376      7   452369    1% /root/openshift.local.volumes/pods/762f2a36-9f43-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/server-certificate
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/762f2a36-9f43-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/router-token-povpt
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/9d76ac1f-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                             452376      5   452371    1% /root/openshift.local.volumes/pods/9d76ac1f-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/9d9c6c26-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                             452376      5   452371    1% /root/openshift.local.volumes/pods/9d94f11b-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/9d94f11b-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                             452376      5   452371    1% /root/openshift.local.volumes/pods/9d9c6c26-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/9ddb3633-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                             452376      5   452371    1% /root/openshift.local.volumes/pods/9ddb3633-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                             452376      5   452371    1% /root/openshift.local.volumes/pods/9e304ad9-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                             452376      5   452371    1% /root/openshift.local.volumes/pods/9e35502a-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-dockercfg-9jav7-push
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/9e35502a-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/9e304ad9-9f44-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/builder-token-apusg
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/fa2a476c-9f47-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/logging-deployer-token-4u5t6
tmpfs                                             452376      9   452367    1% /root/openshift.local.volumes/pods/0702a8e7-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/certs
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/0702a8e7-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-curator-token-scqen
tmpfs                                             452376     13   452363    1% /root/openshift.local.volumes/pods/0886c19e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana-proxy
tmpfs                                             452376      9   452367    1% /root/openshift.local.volumes/pods/0886c19e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/0886c19e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-kibana-token-r7td0
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/08f5d06f-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-kibana-token-r7td0
tmpfs                                             452376     13   452363    1% /root/openshift.local.volumes/pods/08f5d06f-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana-proxy
tmpfs                                             452376      9   452367    1% /root/openshift.local.volumes/pods/08f5d06f-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/kibana
tmpfs                                             452376      9   452367    1% /root/openshift.local.volumes/pods/09411ddd-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/certs
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/09411ddd-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-curator-token-scqen
tmpfs                                             452376      9   452367    1% /root/openshift.local.volumes/pods/77c8592e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/certs
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/77c8592e-9f48-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-fluentd-token-og3cv
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/5a3303bb-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-elasticsearch-token-2q8w1
tmpfs                                             452376     19   452357    1% /root/openshift.local.volumes/pods/5a3303bb-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/elasticsearch
tmpfs                                             452376     11   452365    1% /root/openshift.local.volumes/pods/613f9490-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/aggregated-logging-elasticsearch-token-2q8w1
tmpfs                                             452376     19   452357    1% /root/openshift.local.volumes/pods/613f9490-9f56-11e6-acfc-0e7e0c141386/volumes/kubernetes.io~secret/elasticsearch

Comment 4 Luke Meyer 2016-10-31 20:47:01 UTC
@Xia I'm mystified as to what's happening if it's not a full filesystem somewhere (and you're right, I don't see any evidence of one). Are those the full logs from the ES and fluentd containers? If not, I would appreciate having them attached whenever this happens again. There's no need to wait more than a couple of minutes for it to come up (by that point it probably won't). Also, are you giving ES persistent storage or just using ephemeral storage?

Aside from the "error: short write" it's pretty strange if ES doesn't output more than that initial line. I almost wonder if docker is having trouble with its volumes - technically the default devicemapper storage is unsupportable... BTW what's the docker version?
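
For the version and storage questions, something like the following could be run on the node. `docker version` and `docker info` are standard Docker CLI commands. The helper name `storage_driver` is hypothetical, and it assumes `docker info` prints a line of the form `Storage Driver: <name>`.

```shell
#!/bin/sh
# storage_driver
# Hypothetical helper: reads `docker info` output on stdin and prints the
# value of the first "Storage Driver:" line (for example "devicemapper").
storage_driver() {
    sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p' | head -n 1
}

# Usage on the node:
#   docker version
#   docker info 2>/dev/null | storage_driver
```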

Comment 8 Avesh Agarwal 2016-11-01 15:42:12 UTC
A PR has been opened for the short write issue: https://github.com/openshift/origin/pull/11680

Comment 9 Xia Zhao 2016-11-04 09:22:16 UTC
@Eric  @Avesh  

Encountered this error inside the deployer pod on devenv-rhel7_5264, which seemed to contain https://github.com/openshift/origin/pull/11680:

$ oc get po
NAME                            READY     STATUS      RESTARTS   AGE
logging-auth-proxy-1-build      0/1       Completed   0          57m
logging-curator-1-build         0/1       Completed   0          57m
logging-deployer-2n5nm          0/1       Error       0          13m
logging-deployer-stbt5          0/1       Error       0          7m
logging-deployment-1-build      0/1       Completed   0          57m
logging-elasticsearch-1-build   0/1       Completed   0          57m
logging-fluentd-1-build         0/1       Completed   0          57m
logging-kibana-1-build          0/1       Completed   0          57m


$ oc logs -f logging-deployer-stbt5
.....
+ generate_support_objects
++ cat /etc/deploy/scratch/oauth-secret
+ oc new-app -f templates/support.yaml --param
OAUTH_SECRET=NUeyQxGzCXY7IEp97Cn9NMTbtDGz2BVOH9DR4DjM2yS5GmZJsPGQBMKtjMVBVREl
--param KIBANA_HOSTNAME=kibana.router.default.svc.cluster.local --param
KIBANA_OPS_HOSTNAME=kibana-ops.router.default.svc.cluster.local --param
IMAGE_PREFIX_DEFAULT=docker.io/openshift/origin- --param
IMAGE_VERSION_DEFAULT=v1.3.0-rc1
--> Deploying template logging-support-template-maker for
"templates/support.yaml"
     With parameters:

OAUTH_SECRET=NUeyQxGzCXY7IEp97Cn9NMTbtDGz2BVOH9DR4DjM2yS5GmZJsPGQBMKtjMVBVREl
error: short write

New deployer log attached.
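
A minimal sketch for collecting the same evidence from every failed pod at once, assuming the five-column `oc get po` layout shown above. The helper name `failed_pods` is hypothetical.

```shell
#!/bin/sh
# failed_pods
# Hypothetical helper: reads `oc get po` output on stdin and prints the name
# of every pod whose STATUS column (third field) is exactly "Error".
failed_pods() {
    awk 'NR > 1 && $3 == "Error" { print $1 }'
}

# Usage: search the logs of each failed pod for the error message.
#   oc get po | failed_pods | while read -r pod; do
#       echo "== $pod"
#       oc logs "$pod" | grep -n 'short write'
#   done
```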

Comment 10 Xia Zhao 2016-11-04 09:24:58 UTC
Created attachment 1217335 [details]
deployer_log_short_write

Comment 11 Avesh Agarwal 2016-11-04 13:01:43 UTC
The PR https://github.com/openshift/origin/pull/11680 has been merged in origin to address the short write issue.

Comment 12 Xia Zhao 2016-11-07 10:36:33 UTC
Tested with devenv-rhel7_5323 and encountered the following error during the ES build:

logging-elasticsearch-1-build   0/1       ContainerCannotRun   0          6m

events:

failed to "StartContainer" for "docker-build" with RunContainerError: "runContainer: Error response from daemon: {\"message\":\"linux runtime spec devices: lstat /dev/dm-14: no such file or directory\"}"

I will try this later.

Comment 13 Jeff Cantrill 2016-11-07 13:31:00 UTC
Please use the images from the qa registry instead of trying to build them yourself as those in the registry are the ones that will be released.

Comment 14 Xia Zhao 2016-11-07 13:58:58 UTC
(In reply to Jeff Cantrill from comment #13)
> Please use the images from the qa registry instead of trying to build them
> yourself as those in the registry are the ones that will be released.

Hi Jeff, 

Do you mean there are already origin images available for testing the Trello cards for 3.4.0? Could you please let me know the exact address of the "qa registry"? I'd like to give them a try, thanks.

Thanks,
Xia

Comment 15 ewolinet 2016-11-07 16:54:56 UTC
Xia,

You should be able to test this with the images available on brew-pulp. 

Jeff is referring to the QE registry, which should have the most up-to-date images. I'm not 100% sure what that address is, though.


Regarding the error when trying to do a local build, I believe that is Docker-related.

Comment 16 Xia Zhao 2016-11-08 07:23:18 UTC
Verified with the latest OCP puddle and the latest logging images on brew; the original issue did not reproduce.

Images tested:
openshift3/logging-deployer         5f4ddb8cf40d
openshift3/logging-elasticsearch    9b9452c0f8c2
openshift3/logging-kibana           7fc9916eea4d
openshift3/logging-auth-proxy       0d8e3202af61
openshift3/logging-fluentd          22aea57e7576
openshift3/logging-curator          9af78fc06248

# openshift version
openshift v3.4.0.22+5c56720
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

