Bug 1596686 - key = trusted.glusterfs.protect.writes [Invalid argument]; key = glusterfs.avoid.overwrite [Invalid argument]
Summary: key = trusted.glusterfs.protect.writes [Invalid argument]; key = glusterfs.avoid.overwrite [Invalid argument]
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: unspecified
Target Milestone: ---
Assignee: Krutika Dhananjay
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-06-29 12:51 UTC by kcao22003
Modified: 2018-08-29 12:44 UTC
CC List: 9 users

Fixed In Version: glusterfs-4.1.3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-29 12:44:28 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description kcao22003 2018-06-29 12:51:04 UTC
Description of problem:

We are currently using gluster version 4.0.0. We have three gluster server nodes
and a gluster volume, MyVol, with quota enabled and a usage limit set to an arbitrary value such as 1GB.

We see a lot of logging messages at INFO level in /var/log/glusterfs/bricks/mnt-[xxx]-[VOLUME_NAME].log:

[<TIMESTAMP1>] I [dict.c:491: dict_get] ( -->/usr/lib64/glsuterfs/4.0.0/xlator/storage/posix.so(+0x25953) [0x7f73e0c62953] --> /usr/lib64/glusterfs/4.0.0/xlator/storage/posix.so(+0xc132) [0x7f73e0c49132] --> /lib64/libglusterfs.so.0(dict_get +0x10c) [0x7f73e836487c] 0-dict: !this || key = trusted.glusterfs.protect.writes [Invalid argument]

[<TIMESTAMP2>] I [dict.c:491: dict_get] ( -->/usr/lib64/glsuterfs/4.0.0/xlator/storage/posix.so(+0xc1b7) [0x7f73e0c491b7] --> /usr/lib64/glusterfs/4.0.0/xlator/storage/posix.so(+0xc132) [0x7f73e0c49132] --> /lib64/libglusterfs.so.0(dict_get +0x10c) [0x7f73e836487c] 0-dict: !this || key = glusterfs.avoid.overwrite [Invalid argument]


These two messages (with varying hex addresses) are written to the same log file over and over, and they filled up the disk within a few days.

Does anyone know what is going on, or of any issue related to these messages?

Thank you for your support.


Additional info:

1. gluster v status

   Status of volume: MyVol
   Gluster Process               TCP Port    RDMA Port   Online   PID
   --------------------------------------------------------
   Brick server1:/mnt/xxx/MyVol  49152         0           Y      109
   Brick server2:/mnt/xxx/MyVol  49152         0           Y      109
   Brick server3:/mnt/xxx/MyVol  49152         0           Y      109
   Self-heal Daemon on localhost NA            NA          Y      91
   Quota-Daemon on localhost     NA            NA          Y      100
   Self-heal Daemon on server1   NA            NA          Y      91
   Quota-Daemon on server1       NA            NA          Y      100
   Self-heal Daemon on server2   NA            NA          Y      91
   Quota-Daemon on server2       NA            NA

Comment 2 kcao22003 2018-06-29 12:53:54 UTC
Here's the source code I found that is related to this report: 
https://github.com/gluster/glusterfs/blob/release-4.0/libglusterfs/src/dict.c

Comment 3 Sunil Kumar Acharya 2018-06-29 16:24:54 UTC
Closing this bug. Please raise an upstream bug for all the upstream gluster
versions.

Comment 4 kcao22003 2018-06-29 17:52:51 UTC
"Please raise an upstream bug for all the upstream gluster
versions." 

I am confused and still do not understand why this bug report was closed. 
If this is not the right place to file this bug, please let me know the correct place to do so. Thank you very much for your support.

Comment 5 Nithya Balachandran 2018-07-10 05:32:17 UTC
(In reply to kcao22003 from comment #0)
> Description of problem:
> 
> We are currently using gluster version 4.0.0. We have three gluster server
> nodes
> and a gluster volume, MyVol with quota being enabled and a limit usage is
> set to a arbitrary number such as 1GB.
> 
> We see a lot of logging messages at INFO level in
> /var/log/glusterfs/bricks/mnt-[xxx]-[VOLUME_NAME].log:
> 
> [<TIMESTAMP1>] I [dict.c:491: dict_get] (
> -->/usr/lib64/glsuterfs/4.0.0/xlator/storage/posix.so(+0x25953)
> [0x7f73e0c62953] -->
> /usr/lib64/glusterfs/4.0.0/xlator/storage/posix.so(+0xc132) [0x7f73e0c49132]
> --> /lib64/libglusterfs.so.0(dict_get +0x10c) [0x7f73e836487c] 0-dict: !this
> || key = trusted.glusterfs.protect.writes [Invalid argument]
> 
> [<TIMESTAMP2>] I [dict.c:491: dict_get] (
> -->/usr/lib64/glsuterfs/4.0.0/xlator/storage/posix.so(+0xc1b7)
> [0x7f73e0c491b7] -->
> /usr/lib64/glusterfs/4.0.0/xlator/storage/posix.so(+0xc132) [0x7f73e0c49132]
> --> /lib64/libglusterfs.so.0(dict_get +0x10c) [0x7f73e836487c] 0-dict: !this
> || key = glusterfs.avoid.overwrite [Invalid argument]
> 
> 
> These two messages (with different HEX codes) are written to the same log
> file over and over and filled up the disk within a few days.
> 
> Does anyone know what is going on and any/or issue related to these
> messages? 
> 
> Thank you for your support.
> 

Were you running a rebalance when you saw these messages?

Comment 6 kcao22003 2018-07-24 16:04:23 UTC
No. We were not running any rebalance. Is there any additional information you need me to provide to determine the cause of this issue? If yes, please let me know. Thank you for your support.

Comment 7 Chad Cropper 2018-08-09 21:51:29 UTC
I am experiencing the exact same issue after upgrade from 3.12.6 to 4.1.0 on CentOS7. I'm not sure what logs I can provide to help. Let me know.

Comment 8 Chad Cropper 2018-08-09 22:05:37 UTC
(In reply to Chad Cropper from comment #7)
> I am experiencing the exact same issue after upgrade from 3.12.6 to 4.1.0 on
> CentOS7. I'm not sure what logs I can provide to help. Let me know.

I set my logs to WARNING instead of INFO just to keep the logs from filling /var every 5 minutes.

Comment 9 Nithya Balachandran 2018-08-10 04:38:00 UTC
(In reply to Chad Cropper from comment #8)
> (In reply to Chad Cropper from comment #7)
> > I am experiencing the exact same issue after upgrade from 3.12.6 to 4.1.0 on
> > CentOS7. I'm not sure what logs I can provide to help. Let me know.
> 
> I set my logs to WARNING instead of INFO just to keep the logs from filling
> /var every 5 minutes.

Please paste the message here.

Comment 10 Nithya Balachandran 2018-08-10 05:09:26 UTC
This has been fixed in master. We will backport it to the release branches.


https://review.gluster.org/#/c/glusterfs/+/20250/

Comment 11 Worker Ant 2018-08-10 05:40:47 UTC
REVIEW: https://review.gluster.org/20694 (storage/posix: Fix excessive logging in WRITE fop path) posted (#1) for review on release-4.1 by Krutika Dhananjay

Comment 12 Worker Ant 2018-08-15 18:39:11 UTC
COMMIT: https://review.gluster.org/20694 committed in release-4.1 by "Krutika Dhananjay" <kdhananj> with a commit message- storage/posix: Fix excessive logging in WRITE fop path

        Backport of: https://review.gluster.org/#/c/glusterfs/+/20250

I was running some write-intensive tests on my volume, and in a matter
of 2 hrs, the 50GB space in my root partition was exhausted. On inspecting
further, figured that excessive logging in bricks was the cause -
specifically in posix write when posix_check_internal_writes() does
dict_get() without a NULL-check on xdata.

Change-Id: I89de57a3a90ca5c375e5b9477801a9e5ff018bbf
fixes: bz#1596686
Signed-off-by: Krutika Dhananjay <kdhananj>
(cherry picked from commit 81701e4d92ae7b1d97e5bc955703719f2e9e773a)

Comment 13 Shyamsundar 2018-08-29 12:44:28 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.1.3, please open a new bug report.

glusterfs-4.1.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-August/000111.html
[2] https://www.gluster.org/pipermail/gluster-users/

