Bug 1772006 - NULL dict messages flooding fuse mount log
Summary: NULL dict messages flooding fuse mount log
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-11-13 12:21 UTC by Ravishankar N
Modified: 2019-11-16 09:07 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-16 09:07:27 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 23708 0 None Merged afr: fix log flooding 2019-11-16 09:07:26 UTC

Description Ravishankar N 2019-11-13 12:21:25 UTC
Description of problem:
Commit "ccf33e789 - dict.c: remove redundant checks (6 days ago) <Yaniv Kaul>"
removed some NULL checks in certain dict functions. This caused flooding of fuse mount logs when I/O was done on the mount:

Message:
W [dict.c:1478:dict_get_with_refn] (-->/usr/local/lib/libglusterfs.so.0(dict_get_uint32+0x4d) [0x7ff9121ec963] -->/usr/local/lib/libglusterfs.so.0(dict_get_with_ref+0x90) [0x7ff9121eb93f] -->/usr/local/lib/libglusterfs.so.0(+0x229be) [0x7ff9121eb9be] ) 0-dict: dict OR key (glusterfs.lk.lkmode) is NULL [Invalid argument]

Two observed code paths in AFR that caused these logs:
==============================================================
#0  dict_get_with_refn (this=0x0, key=0x7ff8ffa90e16 "glusterfs.lk.lkmode", keylen=19, data=0x7ff900b49690) at dict.c:1478
#1  0x00007ff9121eb93f in dict_get_with_ref (this=0x0, key=0x7ff8ffa90e16 "glusterfs.lk.lkmode", data=0x7ff900b49690) at dict.c:1467
#2  0x00007ff9121ec963 in dict_get_uint32 (this=0x0, key=0x7ff8ffa90e16 "glusterfs.lk.lkmode", val=0x7ff900b496c8) at dict.c:1894
#3  0x00007ff8ffa631fd in afr_is_lock_mode_mandatory (xdata=0x0) at afr-common.c:667
#4  0x00007ff8ffa6d0c0 in afr_discover_unwind (frame=0x7ff8ec00ee78, this=0x7ff8f801caa0) at afr-common.c:3650
#5  0x00007ff8ffa6dddd in afr_discover_done (frame=0x7ff8ec00ee78, this=0x7ff8f801caa0) at afr-common.c:3745
#6  0x00007ff8ffa6bd4b in afr_lookup_metadata_heal_check (frame=0x7ff8ec00ee78, this=0x7ff8f801caa0) at afr-common.c:3400
#7  0x00007ff8ffa6e166 in afr_discover_cbk (frame=0x7ff8ec00ee78, cookie=0x2, this=0x7ff8f801caa0, op_ret=0, op_errno=0, inode=0x7ff8f81b22f8, buf=0x7ff900b499b0, xdata=0x7ff8f0024798, postparent=0x7ff900b49910) at afr-common.c:3786
#8  0x00007ff8ffb1dfae in client4_0_lookup_cbk (req=0x7ff8ec01ef08, iov=0x7ff8ec01ef38, count=1, myframe=0x7ff8ec006078) at client-rpc-fops_v2.c:2631
#9  0x00007ff91219769a in rpc_clnt_handle_reply (clnt=0x7ff8f8069740, pollin=0x7ff8f0011fd0) at rpc-clnt.c:768
#10 0x00007ff912197bcb in rpc_clnt_notify (trans=0x7ff8f8069a40, mydata=0x7ff8f8069770, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7ff8f0011fd0) at rpc-clnt.c:935
#11 0x00007ff912193a4a in rpc_transport_notify (this=0x7ff8f8069a40, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7ff8f0011fd0) at rpc-transport.c:545
#12 0x00007ff900c1ca5e in socket_event_poll_in_async (xl=0x7ff8f8010b60, async=0x7ff8f00120e8) at socket.c:2601
#13 0x00007ff900c13e67 in gf_async (async=0x7ff8f00120e8, xl=0x7ff8f8010b60, cbk=0x7ff900c1ca07 <socket_event_poll_in_async>) at ../../../../libglusterfs/src/glusterfs/async.h:189
#14 0x00007ff900c1cbed in socket_event_poll_in (this=0x7ff8f8069a40, notify_handled=true) at socket.c:2642
#15 0x00007ff900c1db8a in socket_event_handler (fd=16, idx=4, gen=1, data=0x7ff8f8069a40, poll_in=1, poll_out=0, poll_err=0, event_thread_died=0 '\000') at socket.c:3033
#16 0x00007ff91227504b in event_dispatch_epoll_handler (event_pool=0xdd10e0, event=0x7ff900b4a02c) at event-epoll.c:650
#17 0x00007ff91227553e in event_dispatch_epoll_worker (data=0xe1b930) at event-epoll.c:763
#18 0x00007ff911f915a2 in start_thread () from /lib64/libpthread.so.0
#19 0x00007ff911bdd023 in clone () from /lib64/libc.so.6
==============================================================
#0  dict_get_with_refn (this=0x0, key=0x7f893c059250 "trusted.afr.testvol-client-0", keylen=28, data=0x7f8944378460) at dict.c:1478
#1  0x00007f8955a1a93f in dict_get_with_ref (this=0x0, key=0x7f893c059250 "trusted.afr.testvol-client-0", data=0x7f8944378460) at dict.c:1467
#2  0x00007f8955a1c6d4 in dict_get_ptr (this=0x0, key=0x7f893c059250 "trusted.afr.testvol-client-0", ptr=0x7f89443784b0) at dict.c:2247
#3  0x00007f8943294d79 in afr_accused_fill (this=0x7f893c01caa0, xdata=0x0, accused=0x7f8944378590 "", type=AFR_ENTRY_TRANSACTION) at afr-common.c:1569
#4  0x00007f8943295173 in afr_readables_fill (frame=0x7f8934152848, this=0x7f893c01caa0, inode=0x7f89300272a8, data_accused=0x7f8944378590 "", metadata_accused=0x7f8944378570 "", data_readable=0x7f8944378580 "\001\001\001<\211\177",
    metadata_readable=0x7f8944378560 "\001\001\001\064\211\177", replies=0x7f8934173550) at afr-common.c:1655
#5  0x00007f89432954b7 in afr_replies_interpret (frame=0x7f8934152848, this=0x7f893c01caa0, inode=0x7f89300272a8, start_heal=0x0) at afr-common.c:1711
#6  0x00007f89432223cf in __afr_dir_write_finalize (frame=0x7f8934152848, this=0x7f893c01caa0) at afr-dir-write.c:98
#7  0x00007f8943223a40 in __afr_dir_write_cbk (frame=0x7f8934152848, cookie=0x2, this=0x7f893c01caa0, op_ret=0, op_errno=2, buf=0x7f8944378910, preparent=0x7f8944378870, postparent=0x7f89443787d0, preparent2=0x0, postparent2=0x0,
    xdata=0x0) at afr-dir-write.c:237
#8  0x00007f8943226d68 in afr_mkdir_wind_cbk (frame=0x7f8934152848, cookie=0x2, this=0x7f893c01caa0, op_ret=0, op_errno=2, inode=0x7f89300272a8, buf=0x7f8944378910, preparent=0x7f8944378870, postparent=0x7f89443787d0, xdata=0x0)
    at afr-dir-write.c:629
#9  0x00007f894333cdc4 in client4_0_mkdir_cbk (req=0x7f893c0020f8, iov=0x7f893c002128, count=1, myframe=0x7f893c003cd8) at client-rpc-fops_v2.c:217
==============================================================

Version-Release number of selected component (if applicable):
Master.

How reproducible:
Always

Steps to Reproduce:
1. Create dist-rep volume and fuse mount it.
2. perform I/O- mkdir/rmdir, touch, stat, rm etc.


Actual results:
Fuse log is flooded with above messages.

Expected results:
No spurious logs.

Additional info:
Not sure if it is better to revert Yaniv's patch or retain it, but fix AFR (and possibly any other xlators that may do call dict_* functions on a null dict in future). In any case, I'm sending out the AFR fix to see what the consensus is.

Comment 1 Worker Ant 2019-11-13 12:24:31 UTC
REVIEW: https://review.gluster.org/23708 (afr: fix log flooding) posted (#1) for review on master by Ravishankar N

Comment 2 Worker Ant 2019-11-16 09:07:27 UTC
REVIEW: https://review.gluster.org/23708 (afr: fix log flooding) merged (#3) on master by Amar Tumballi


Note You need to log in before you can comment on or make changes to this bug.