Bug 1659097 - Heketi logs getting rolled back even without a pod restart
Summary: Heketi logs getting rolled back even without a pod restart
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: heketi
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: John Mulligan
QA Contact: Neha Berry
Depends On:
Reported: 2018-12-13 14:57 UTC by Neha Berry
Modified: 2019-01-25 00:34 UTC
CC: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-01-25 00:34:59 UTC
Target Upstream Version:


Description Neha Berry 2018-12-13 14:57:44 UTC
Description of problem:

We had a highly scaled-up OCS 3.10 setup on which multiple creates/deletes and other activities had been in progress for a considerable number of days.

It is seen that the heketi logs keep overflowing and then rotating, due to which we lose some of the details from the current logs.

As per general expectation, the first few lines of a heketi log are:

# oc logs heketi-storage-1-vn4qn |head
Heketi 7.0.0
[heketi] INFO 2018/12/13 13:33:35 Loaded ssh executor

But in this scaled setup, the log contents keep rotating, and the first few lines are not as above.

Following are some instances where the logs were collected from the same heketi pod but the contents had changed/rotated. Thus we lose important log content when collecting logs for troubleshooting.

[root@ip-10-3-244-116 tc-12]# cat heketi_logs_gdb3|head

[cmdexec] INFO 2018/12/12 02:40:33 Check Glusterd service status in node openshift-storage201.exexample2.com
[kubeexec] DEBUG 2018/12/12 02:40:33 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: openshift-storage201.exexample2.com Pod: glusterfs-7vtj8 Command: systemctl status glusterd
Result: ● glusterd.service - GlusterFS, a clustered file-system server


[root@ip-10-3-244-116 tc-4]# cat heketi.logs.mid |head
[negroni] Started GET /queue/6853743577588d448174ddd57050970e
[negroni] Completed 200 OK in 57.271µs
[kubeexec] DEBUG 2018/12/11 04:49:19 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: openshift-storage202.exexample2.com Pod: glusterfs-44zms Command: rmdir /var/lib/heketi/mounts/vg_17c6cc109412c9aeaa2b5d1ae44df792/brick_87b63778aa18f9170508cc887d5a2248

root@ip-10-3-244-116 ec2-user]# oc logs heketi-1-rvrwb|head
[negroni] Completed 200 OK in 1.959989ms
[negroni] Started GET /volumes/02cc8a3b2f51ad6ab3e01c1a38c68c66
[negroni] Completed 200 OK in 1.973788ms


Version-Release number of selected component (if applicable):

OCS 3.10 

How reproducible:
On a scaled setup with a very large number of requests being sent to heketi, it was easily reproducible.

Additional info:
We are not sure whether there is a specific setting on the docker side that can cause this problem.

[root@openshift-worker233 ec2-user]# docker ps|grep heketi
22e694113d1b        874c20c00022                                                                                                          "/usr/sbin/heketi-..."   7 hours ago         Up 7 hours                              k8s_heketi_heketi-1-rvrwb_storage-project_919d46f3-fa20-11e8-8686-0e2dc3edd788_11
db995aa7bfb7        registry.access.redhat.com/openshift3/ose-pod:v3.9.30                                                                 "/usr/bin/pod"           6 days ago          Up 6 days                               k8s_POD_heketi-1-rvrwb_storage-project_919d46f3-fa20-11e8-8686-0e2dc3edd788_0

[root@openshift-worker233 ec2-user]# rpm -qa|grep docker
[root@openshift-worker233 ec2-user]#

Comment 3 John Mulligan 2018-12-13 16:39:53 UTC
Heketi does not manage its own logging. There's nothing heketi can do here except change the amount of logging it generates, which is probably not what you want. Changing this would be an OpenShift-level configuration.
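(Editorial note: since heketi writes to stdout and runs in a container, log retention and rotation are governed by the container runtime's log driver, not by heketi itself. On Docker-based nodes using the `json-file` log driver, rotation is controlled by the `max-size` and `max-file` options. A minimal sketch of the relevant `/etc/docker/daemon.json` fragment follows; the specific values are illustrative assumptions and should be tuned for your environment:)

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5"
  }
}
```

On RHEL-based OpenShift 3.x nodes the same options can alternatively be passed via `OPTIONS='--log-opt max-size=50m --log-opt max-file=5'` in `/etc/sysconfig/docker`; the docker daemon must be restarted for either change to take effect. Note that `oc logs` only reads the current (and, with `--previous`, the prior) log file, so raising these limits is what preserves more history.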
