Bug 844695

Summary: excessive logging in self-heal daemon should be reduced
Product: [Community] GlusterFS
Reporter: Raghavendra Bhat <rabhat>
Component: replicate
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED DUPLICATE
Severity: high
Priority: high
Version: 3.3.0
CC: amarts, gluster-bugs
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Clone Of:
: 844758
Last Closed: 2012-09-12 11:05:49 UTC
Type: Bug
Bug Blocks: 844758, 844804    
Attachments:
log analysing script which parses the log file and prints the number of times each function has logged (flags: none)

Description Raghavendra Bhat 2012-07-31 12:15:22 UTC
Created attachment 601512
log analysing script which parses the log file and prints the number of times each function has logged

Description of problem:

The self-heal daemon logs so heavily that its log file grows to several GB, consuming a large amount of disk space. Logging should be reduced by preventing the same messages from being repeated, and by either dropping unimportant messages or logging them at a lower level such as DEBUG or TRACE.

A script is attached to this bug. It analyzes the glusterfs log file given as its argument and prints each function that logged, along with the number of times it logged.
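The attached script is not reproduced in this report. As a rough illustration of the kind of counting it is described as doing, here is a minimal shell sketch. It assumes log lines of the form "[timestamp] LEVEL [file.c:line:function] message" with a single-letter level. The function name count_functions and its optional second argument are hypothetical and are not taken from the attachment.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the script attached to this bug.
# Assumed log line layout (an assumption about the glusterfs log format):
#   [2012-07-31 12:15:22.123456] I [afr-common.c:1226:afr_launch_self_heal] 0-vol-replicate-0: message
#
# count_functions LOGFILE [LEVELS]
#   LEVELS is a set of single-letter log levels to include, e.g. "EW".
#   It defaults to "A-Z", i.e. every level.
count_functions() {
    log_file=$1
    levels=${2:-A-Z}
    # Pull the function name out of the "[file:line:function]" field,
    # then count how many log lines each function produced.
    sed -n "s/^\[[^]]*\] [$levels] \[[^]:]*:[0-9]*:\([^]]*\)\].*/\1/p" "$log_file" |
        sort | uniq -c |
        awk '{ printf "%s:%d\n", $2, $1 }'
}
```

Under these assumptions, "count_functions <logfile>" would print one "function:count" line per logging function, and "count_functions <logfile> EW" would restrict the count to lines logged at error or warning level. Lines that do not match the assumed layout are silently skipped.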

This is the output of the log analyzer script run on a self-heal daemon log file which had grown to 15-20 MB.

afr_detect_self_heal_by_iatt:7353
afr_detect_self_heal_by_lookup_status:243
afr_dir_exclusive_crawl:20
afr_launch_self_heal:19993
afr_notify:15
afr_self_heal_completion_cbk:19993
afr_sh_common_lookup_resp_handler:5745
afr_sh_data_open_cbk:240
afr_sh_entry_expunge_entry_cbk:1956
afr_sh_entry_fix:1
afr_sh_metadata_post_nonblocking_inodelk_cbk:486
afr_sh_post_nonblocking_entry_cbk:3
cleanup_and_exit:2
client3_1_entrylk_cbk:3
client3_1_getxattr_cbk:3
client3_1_inodelk_cbk:243
client3_1_open_cbk:240
client3_1_unlink_cbk:4
client_ping_cbk:3
client_rpc_notify:42
client_set_lk_version_cbk:54
client_setvolume_cbk:108
_crawl_proceed:12
gf_add_cmdline_options:12
main:2
mgmt_cbk_spec:4
mgmt_getspec_cbk:5
notify:81
reincarnate:1
_remove_stale_index:9
rpc_client_ping_timer_expired:3
rpc_clnt_reconfig:24
saved_frames_unwind:6
select_server_supported_programs:54
sh_loop_driver_done:18008
socket_connect_finish:24
__socket_keepalive:2
__socket_proto_state_machine:41
__socket_rwv:18
socket_server_event_handler:2
========= Error Functions ========
afr_notify:3
afr_self_heal_completion_cbk:243
afr_sh_common_lookup_resp_handler:5745
afr_sh_data_open_cbk:240
afr_sh_metadata_post_nonblocking_inodelk_cbk:486
afr_sh_post_nonblocking_entry_cbk:3
cleanup_and_exit:2
client3_1_entrylk_cbk:3
client3_1_getxattr_cbk:3
client3_1_inodelk_cbk:243
client3_1_open_cbk:240
client3_1_unlink_cbk:4
client_ping_cbk:3
_crawl_proceed:12
_remove_stale_index:4
saved_frames_unwind:6
socket_connect_finish:24
__socket_keepalive:2
__socket_proto_state_machine:41
__socket_rwv:18
socket_server_event_handler:2


The "Error Functions" section also lists functions which have logged at WARNING level.



Comment 1 Amar Tumballi 2012-08-01 07:31:34 UTC
Raghavendra Bhat,

Please use the patch below, so that the output can be filtered more easily with Unix scripts.
----------------
Subject: [PATCH] make output 'sort -n' friendly

---
 log_analyzer.sh |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/log_analyzer.sh b/log_analyzer.sh
index cea10b5..7e1f483 100755
--- a/log_analyzer.sh
+++ b/log_analyzer.sh
@@ -27,7 +27,7 @@ function analyze ()
     for i in $(cat /tmp/uniq_function)
     do
         count=$(grep $i /tmp/functions_file | wc -l)
-        echo "$i:$count"
+        printf "%d\t%s\n" $count $i
     done
 
     #grep  "time of crash:" $log_file 2>/dev/null;
-------------

After applying the above patch, try "log_analyzer.sh <logfile> | sort -n"; I feel that gives a better viewing experience.
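To make the difference between the two output formats concrete, here is a small sketch using three counts from the description. The helper names sort_by_count_colon and sort_by_count_tab are hypothetical and exist only for this illustration. The original "name:count" lines can still be ordered by count, but only by telling sort about the separator and the key field; the patched "count<TAB>name" lines need nothing beyond sort -n.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Original "name:count" format: split on ':' and compare
# the second field numerically.
sort_by_count_colon() {
    sort -t: -k2,2n
}

# Patched "count<TAB>name" format: the count leads the line,
# so a plain numeric sort is enough.
sort_by_count_tab() {
    sort -n
}

# Sample counts taken from the analyzer output in the description.
printf '%s:%d\n' afr_notify 15 afr_launch_self_heal 19993 main 2 | sort_by_count_colon
printf '%d\t%s\n' 15 afr_notify 19993 afr_launch_self_heal 2 main | sort_by_count_tab
```

Both pipelines list the functions in ascending order of count, so the noisiest function ends up on the last line; appending "| tail" would then show only the heaviest loggers.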

Comment 2 Pranith Kumar K 2012-09-12 11:05:49 UTC

*** This bug has been marked as a duplicate of bug 844697 ***