Bug 1529095

Summary: /usr/sbin/glusterfs crashing on Red Hat OpenShift Container Platform node
Product: [Community] GlusterFS Reporter: Raghavendra G <rgowdapp>
Component: write-behind Assignee: bugs <bugs>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 3.12 CC: atumball, bkunal, bugs, csaba, rgowdapp, rhinduja, rhs-bugs, sreber, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.12.5 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1528558 Environment:
Last Closed: 2018-02-01 04:43:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1516638, 1528558, 1529096    
Bug Blocks: 1529094    

Comment 1 Worker Ant 2017-12-26 10:39:20 UTC
REVIEW: https://review.gluster.org/19091 (performance/write-behind: fix bug while handling short writes) posted (#1) for review on release-3.12 by Raghavendra G

Comment 2 Worker Ant 2018-01-02 09:52:12 UTC
COMMIT: https://review.gluster.org/19091 committed in release-3.12 by "Raghavendra G" <rgowdapp> with the commit message: performance/write-behind: fix bug while handling short writes

The variable "fulfilled" in wb_fulfill_short_write is not reset to 0
before handling each member of the list.

This has some interesting consequences:

* If we break from the loop while processing the last member of the
  list head->winds, req is reset to head because the list is
  circular. However, head is already fulfilled and can potentially be
  freed, so we end up adding a freed request to the wb_inode->todo
  list. This is the root cause of the crash tracked by the bug
  associated with this patch (note that we saw a freed "holder" in
  the todo list).

* If we break from the loop while processing any member of
  head->winds other than the last, req is set to the next member in
  the list, skipping the current request even though it is not
  entirely synced. This can lead to data corruption.

The fix is simple: change the code so that "fulfilled" reflects only
whether the current request is fulfilled and does not carry over
state from previous requests in the list.
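
The stale-flag problem can be illustrated with a small standalone model.
This is only a sketch and not the GlusterFS code: resume_index, lens,
written and reset_flag are hypothetical names, an array stands in for
the circular head->winds list, and the return value stands in for the
request that would be put back on the todo list.

```c
#include <stddef.h>

/*
 * Hypothetical model: lens[i] is the size of the i-th queued write and
 * `written` is the byte count reported by a short write. Returns the
 * index of the request from which syncing should resume (n means
 * "walked past the end", which in a circular list is the head again).
 * reset_flag != 0 models the fix; reset_flag == 0 models the stale flag.
 */
size_t resume_index(const size_t *lens, size_t n, size_t written,
                    int reset_flag)
{
    int fulfilled = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (reset_flag)
            fulfilled = 0;        /* flag describes only this request */

        if (written >= lens[i]) {
            written -= lens[i];   /* request fully covered */
            fulfilled = 1;
        } else {
            written = 0;          /* partially covered: bytes run out here */
        }

        if (written == 0) {
            if (fulfilled)
                i++;              /* this request is done, resume at next */
            break;
        }
    }
    return i;
}
```

With three 4-byte requests and 6 bytes written, the model returns 1 when
the flag is reset (the half-written second request is retried) and 2
when it is not (the second request is skipped). With two 4-byte requests
and 6 bytes written, the stale flag yields 2, i.e. past the last member,
which corresponds to wrapping around to the already fulfilled head.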

>Change-Id: Ia3d6988175a51c9e08efdb521a7b7938b01f93c8
>BUG: 1528558
>Signed-off-by: Raghavendra G <rgowdapp>

(cherry picked from commit 0bc22bef7f3c24663aadfb3548b348aa121e3047)
Change-Id: Ia3d6988175a51c9e08efdb521a7b7938b01f93c8
BUG: 1529095
Signed-off-by: Raghavendra G <rgowdapp>

Comment 3 Jiffin 2018-02-01 04:43:00 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.5, please open a new bug report.

glusterfs-3.12.5 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-devel/2018-February/054356.html
[2] https://www.gluster.org/pipermail/gluster-users/