Bug 1659097 - Heketi logs getting rolled back even without a pod restart
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: heketi
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: John Mulligan
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-12-13 14:57 UTC by Neha Berry
Modified: 2019-01-25 00:34 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-25 00:34:59 UTC
Target Upstream Version:



Description Neha Berry 2018-12-13 14:57:44 UTC
Description of problem:
----------------------------

We had a highly scaled-up OCS 3.10 setup on which multiple creates, deletes, and other activities had been in progress for a considerable number of days.

The heketi logs keep overflowing and then rotating, so we lose some of the details from the current logs.

As per general expectation, the first few lines of a heketi log are:

# oc logs heketi-storage-1-vn4qn |head
Heketi 7.0.0
[heketi] INFO 2018/12/13 13:33:35 Loaded ssh executor


But in this scaled setup, the log contents keep rotating and the first few lines are not as above.


Following are some instances where logs were collected from the same heketi pod but the contents had changed/rotated. Thus we lose important log contents while collecting logs for troubleshooting.



-------------
[root@ip-10-3-244-116 tc-12]# cat heketi_logs_gdb3|head

[cmdexec] INFO 2018/12/12 02:40:33 Check Glusterd service status in node openshift-storage201.exexample2.com
[kubeexec] DEBUG 2018/12/12 02:40:33 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: openshift-storage201.exexample2.com Pod: glusterfs-7vtj8 Command: systemctl status glusterd
Result: ● glusterd.service - GlusterFS, a clustered file-system server

--------------------

[root@ip-10-3-244-116 tc-4]# cat heketi.logs.mid |head
Result: 
[negroni] Started GET /queue/6853743577588d448174ddd57050970e
[negroni] Completed 200 OK in 57.271µs
[kubeexec] DEBUG 2018/12/11 04:49:19 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: openshift-storage202.exexample2.com Pod: glusterfs-44zms Command: rmdir /var/lib/heketi/mounts/vg_17c6cc109412c9aeaa2b5d1ae44df792/brick_87b63778aa18f9170508cc887d5a2248


--------------
[root@ip-10-3-244-116 ec2-user]# oc logs heketi-1-rvrwb|head
[negroni] Completed 200 OK in 1.959989ms
[negroni] Started GET /volumes/02cc8a3b2f51ad6ab3e01c1a38c68c66
[negroni] Completed 200 OK in 1.973788ms


--------------
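
The comparison above (a complete log starts with the "Heketi <version>" banner, a rotated one does not) can be turned into a quick check on a saved log file. This is only a sketch: `has_startup_banner` is a hypothetical helper name, and it assumes the banner format shown in the `Heketi 7.0.0` sample earlier in this description.

```shell
# Hypothetical helper: succeeds (exit 0) if the first non-empty line of
# the given log file looks like the Heketi startup banner, e.g.
# "Heketi 7.0.0". Fails (exit 1) otherwise, which suggests the start of
# the log has already been rotated away.
has_startup_banner() {
    # awk prints the first line that contains at least one field.
    first_line=$(awk 'NF { print; exit }' "$1")
    case "$first_line" in
        "Heketi "[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (file name is illustrative):
#   oc logs heketi-1-rvrwb > heketi.log
#   has_startup_banner heketi.log || echo "start of log is missing"
```

A failed check only says that the beginning of the log is gone; it does not say how much was lost.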

Version-Release number of selected component (if applicable):
------------

OCS 3.10 

How reproducible:
-----------
Easily reproducible on a scaled setup with a very large number of requests being sent to heketi.



Additional info:
-------------- 
We are not sure whether any specific docker-side setting can cause this problem.

[root@openshift-worker233 ec2-user]# docker ps|grep heketi
22e694113d1b        874c20c00022                                                                                                          "/usr/sbin/heketi-..."   7 hours ago         Up 7 hours                              k8s_heketi_heketi-1-rvrwb_storage-project_919d46f3-fa20-11e8-8686-0e2dc3edd788_11
db995aa7bfb7        registry.access.redhat.com/openshift3/ose-pod:v3.9.30                                                                 "/usr/bin/pod"           6 days ago          Up 6 days                               k8s_POD_heketi-1-rvrwb_storage-project_919d46f3-fa20-11e8-8686-0e2dc3edd788_0


[root@openshift-worker233 ec2-user]# rpm -qa|grep docker
docker-common-1.13.1-75.git8633870.el7_5.x86_64
docker-client-1.13.1-75.git8633870.el7_5.x86_64
python-docker-pycreds-1.10.6-4.el7.noarch
python-docker-2.4.2-1.3.el7.noarch
atomic-openshift-docker-excluder-3.9.43-1.git.0.7ad1066.el7.noarch
docker-rhel-push-plugin-1.13.1-75.git8633870.el7_5.x86_64
docker-1.13.1-75.git8633870.el7_5.x86_64
[root@openshift-worker233 ec2-user]#
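
One docker-side setting worth checking is the container's logging driver and its options, since the driver decides how much log output is retained. A minimal sketch, run on the worker node; `show_log_config` is a hypothetical helper name, and nothing here is confirmed as the cause in this report.

```shell
# Hypothetical helper: print the logging driver and its options for one
# container, using docker inspect's Go-template output.
show_log_config() {
    docker inspect \
        --format '{{.HostConfig.LogConfig.Type}} {{json .HostConfig.LogConfig.Config}}' \
        "$1"
}

# Usage (container ID taken from the `docker ps` output above):
#   show_log_config 22e694113d1b
```

With the `json-file` driver, the `max-size` and `max-file` options control rotation; other drivers retain logs according to their own settings.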

Comment 3 John Mulligan 2018-12-13 16:39:53 UTC
Heketi does not manage its own logging. There's nothing heketi can do here except reduce the amount of log output it generates, which is probably not what you want. Changing the rotation behavior would be an OpenShift-level configuration.
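
Since retention is controlled outside heketi, one possible workaround (not part of the original report) is to stream the pod's log to a file as it is written, so that later rotation does not discard lines already captured. A minimal sketch, assuming the `oc` client is logged in to the right project; `capture_pod_logs` and the output path are hypothetical names.

```shell
# Hypothetical helper: follow a pod's log and append every line, with a
# timestamp, to a local file. Runs until the stream ends or is interrupted.
capture_pod_logs() {
    pod=$1
    out=$2
    oc logs -f --timestamps "$pod" >> "$out"
}

# Usage (pod name from this report, output path illustrative):
#   capture_pod_logs heketi-1-rvrwb /var/tmp/heketi-1-rvrwb.log &
```

The stream ends when the container exits, so the capture has to be restarted after a pod or container restart.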

