Description of problem:
------------------------
Added bricks to an EC 2*(4+2) volume to make it 4*(4+2) with continuous IO. IO: Bonnie, kernel untars, lots of finds/ls. I see a steady stream of these error messages:

<snip>
[2017-07-19 13:44:12.966014] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (584b3aa0-c456-4291-b25a-bbb4ce68ad04) [Invalid argument]
[2017-07-19 13:44:12.969403] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (601482f0-96fb-40bc-86d4-010cc64b61d6) [Invalid argument]
[2017-07-19 13:44:12.973036] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (d0c85164-52f2-48b9-9489-9193337bdcdb) [Invalid argument]
[2017-07-19 13:44:12.977283] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (c02ea4ac-b686-4666-8b74-c2079057ae9f) [Invalid argument]
[2017-07-19 13:44:12.980771] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (05686e26-e251-4118-8521-53e0890d33e4) [Invalid argument]
[2017-07-19 13:44:12.983944] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (72f29f40-17de-491a-9345-7fa9e7b71ee8) [Invalid argument]
[2017-07-19 13:44:12.987362] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (34bb4e69-87df-4215-8528-22603cb3ccd7) [Invalid argument]
[2017-07-19 13:44:12.990829] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (718a24c9-1010-4b94-9926-e31f1c46462a) [Invalid argument]
[2017-07-19 13:44:12.995798] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (ac971b73-31f7-4cc3-88e8-6792ff5aa80f) [Invalid argument]
[2017-07-19 13:44:13.014622] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (b47d478d-797d-4443-89cf-c4e5b02e83d6) [Invalid argument]
[2017-07-19 13:44:13.028056] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (6c46a760-c6bb-429f-ac78-89960d10d537) [Invalid argument]
[2017-07-19 13:44:13.028478] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (e7ae2c5e-2082-4495-947d-3b7bf3017b77) [Invalid argument]
[2017-07-19 13:44:13.035634] E [MSGID: 123001] [io-cache.c:565:ioc_open_cbk] 2-butcher-io-cache: inode context is NULL (859caaf3-b152-4241-ae85-a54d8aaf10fb) [Invalid argument]
</snip>

The log file is heavily bloated:

[root@gqac028 ~]# cat /var/log/glusterfs/gluster-mount.log | grep "inode context is NULL" | wc -l
13871
[root@gqac028 ~]# ll -h /var/log/glusterfs/gluster-mount.log
-rw------- 1 root root 30M Jul 19 23:46 /var/log/glusterfs/gluster-mount.log
[root@gqac028 ~]#

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs-3.8.4-34.el7.x86_64

How reproducible:
-----------------
1/1

Additional info:
----------------
Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 1ad04ea0-4b0c-44d2-aa32-15ca6b1657eb
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (4 + 2) = 24
Transport-type: tcp
Bricks:
Brick1: gqas007.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick2: gqas003.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick3: gqas009.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick4: gqas016.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick5: gqas007.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick6: gqas003.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick7: gqas009.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick8: gqas016.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick9: gqas007.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick10: gqas003.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick11: gqas009.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick12: gqas016.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick13: gqas007.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick14: gqas003.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick15: gqas009.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick16: gqas016.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick17: gqas007.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick18: gqas003.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick19: gqas007.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick20: gqas003.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick21: gqas009.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick22: gqas016.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick23: gqas007.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick24: gqas003.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Options Reconfigured:
features.uss: enable
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: off
[root@gqas003 ~]#
Upstream patch: https://review.gluster.org/#/c/17855/
Upstream patch https://review.gluster.org/#/c/5029/ fixes the issue properly. I think we should abandon https://review.gluster.org/#/c/17855/ and take https://review.gluster.org/#/c/5029/ as the fix.
Upstream 3.12 patch: https://review.gluster.org/#/c/17889/
Tried this use case once on glusterfs-3.8.4-37; could not reproduce it. Moving this to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774