Bug 1553129 - Memory corruption is causing crashes, hangs and invalid answers
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: protocol
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1554235
Reported: 2018-03-08 11:27 UTC by Xavi Hernandez
Modified: 2018-06-20 18:01 UTC (History)

Fixed In Version: glusterfs-v4.1.0
Doc Text:
Clone Of:
Clones: 1554235
Environment:
Last Closed: 2018-06-20 18:01:56 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Xavi Hernandez 2018-03-08 11:27:34 UTC
Description of problem:

I've only detected this problem by running some regression tests in a loop. I haven't seen it on a normally running system.

I'm not yet sure about the root cause of the memory corruption, but some clues seem to indicate that it happens at the protocol/client layer. Still investigating.

Version-Release number of selected component (if applicable): mainline


How reproducible:

very rare

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Jeff Darcy 2018-03-08 13:28:11 UTC
One useful trick, from when I had to debug one of these in server code a while ago, is to use gdb's "find" function to search for the mem-pool header/footer around the pointer you're looking at. If it's a use-after-free situation, which is the most common cause of memory corruption, that and a little luck can conclusively identify a culprit.
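
To illustrate the trick above, here is a minimal C sketch of the scan that gdb's "find" command performs: walk backwards from a suspect pointer until a header marker turns up, then confirm that the matching footer is intact. The marker values, the 8-byte header layout, and the names put_block and find_block are assumptions made for this illustration; they are not taken from the GlusterFS mem-pool code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_MAGIC 0xCAFEBABEu /* assumed header marker, illustration only */
#define FTR_MAGIC 0xBAADF00Du /* assumed footer marker, illustration only */
#define HDR_SIZE  8u          /* assumed layout: 4-byte magic + 4-byte payload size */
#define FTR_SIZE  4u

/* Write header and footer around a payload of `size` bytes at mem[off].
 * Returns the offset of the payload. */
static size_t put_block(unsigned char *mem, size_t off, uint32_t size)
{
    uint32_t hdr = HDR_MAGIC, ftr = FTR_MAGIC;

    memcpy(mem + off, &hdr, 4);
    memcpy(mem + off + 4, &size, 4);
    memcpy(mem + off + HDR_SIZE + size, &ftr, 4);
    return off + HDR_SIZE;
}

/* Scan backwards from ptr_off for a header whose block contains ptr_off
 * and whose footer is intact. Returns the header offset, or -1. */
static long find_block(const unsigned char *mem, size_t len, size_t ptr_off)
{
    if (len < HDR_SIZE + FTR_SIZE || ptr_off >= len)
        return -1;

    /* Never start so close to the end that a header read would overrun. */
    size_t start = ptr_off;
    if (start > len - HDR_SIZE - FTR_SIZE)
        start = len - HDR_SIZE - FTR_SIZE;

    for (size_t off = start + 1; off-- > 0;) {
        uint32_t magic, size, footer;

        memcpy(&magic, mem + off, 4);
        if (magic != HDR_MAGIC)
            continue;
        memcpy(&size, mem + off + 4, 4);
        if (size > len - off - HDR_SIZE - FTR_SIZE)
            continue; /* block would run past the region */
        if (ptr_off >= off + HDR_SIZE + size + FTR_SIZE)
            continue; /* block ends before the suspect pointer */
        memcpy(&footer, mem + off + HDR_SIZE + size, 4);
        if (footer == FTR_MAGIC)
            return (long)off;
    }
    return -1;
}
```

In this sketch a header with a damaged footer is skipped rather than reported. In a real debugging session that mismatch is itself a strong hint, since it suggests something wrote past the end of the block.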

Comment 2 Worker Ant 2018-03-09 22:32:33 UTC
REVIEW: https://review.gluster.org/19691 (protocol/client: fix memory corruption) posted (#1) for review on master by Xavi Hernandez

Comment 3 Worker Ant 2018-03-10 18:00:57 UTC
COMMIT: https://review.gluster.org/19691 committed in master by "Xavi Hernandez" <xhernandez> with the commit message: protocol/client: fix memory corruption

Some accesses to the saved_fds list were protected by the wrong
mutex (lock instead of fd_lock).

Additionally, the retrieval of fdctx from the fd's context and any
checks done on it are now also protected by fd_lock, so that fdctx
cannot become stale right after it is retrieved.

Change-Id: If2910508bcb7d1ff23debb30291391f00903a6fe
BUG: 1553129
Signed-off-by: Xavi Hernandez <xhernandez>
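
The locking rule described in the commit message above can be sketched in C with pthreads. This is a simplified illustration, not the actual patch: the structs and the functions conf_init, save_fd, get_remote_fd and release_fd are hypothetical, and only the names lock, fd_lock, saved_fds and fdctx come from the commit message. The point it shows is that the lookup of the context and the checks on it happen in one critical section under fd_lock, the same mutex that guards the list.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-fd context, standing in for fdctx. */
struct fd_ctx {
    int remote_fd;
    struct fd_ctx *next;
};

/* Hypothetical fd object holding a pointer to its context. */
struct file {
    struct fd_ctx *ctx;
};

struct client_conf {
    pthread_mutex_t lock;    /* guards other state; must NOT be used for saved_fds */
    pthread_mutex_t fd_lock; /* guards saved_fds and every context reachable from it */
    struct fd_ctx *saved_fds;
};

static void conf_init(struct client_conf *conf)
{
    pthread_mutex_init(&conf->lock, NULL);
    pthread_mutex_init(&conf->fd_lock, NULL);
    conf->saved_fds = NULL;
}

/* Attach a context to the fd and link it into saved_fds, under fd_lock. */
static void save_fd(struct client_conf *conf, struct file *f, struct fd_ctx *ctx)
{
    pthread_mutex_lock(&conf->fd_lock);
    ctx->next = conf->saved_fds;
    conf->saved_fds = ctx;
    f->ctx = ctx;
    pthread_mutex_unlock(&conf->fd_lock);
}

/* Retrieve the context AND validate it in one critical section, so it
 * cannot be released between the lookup and the check. */
static int get_remote_fd(struct client_conf *conf, struct file *f, int *remote_fd)
{
    int ret = -1;

    pthread_mutex_lock(&conf->fd_lock);
    struct fd_ctx *ctx = f->ctx;
    if (ctx != NULL && ctx->remote_fd >= 0) {
        *remote_fd = ctx->remote_fd;
        ret = 0;
    }
    pthread_mutex_unlock(&conf->fd_lock);
    return ret;
}

/* Detach the context from the fd and unlink it from saved_fds, under
 * fd_lock. The caller owns the returned context afterwards. */
static struct fd_ctx *release_fd(struct client_conf *conf, struct file *f)
{
    pthread_mutex_lock(&conf->fd_lock);
    struct fd_ctx *ctx = f->ctx;
    f->ctx = NULL;
    for (struct fd_ctx **pp = &conf->saved_fds; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == ctx) {
            *pp = ctx->next;
            break;
        }
    }
    pthread_mutex_unlock(&conf->fd_lock);
    return ctx;
}
```

If get_remote_fd dropped fd_lock between reading f->ctx and checking it, a concurrent release_fd could free the context in that window, which is the kind of use-after-free that shows up as rare, hard-to-reproduce corruption.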

Comment 4 Shyamsundar 2018-06-20 18:01:56 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem persists with glusterfs-v4.1.0, please open a new bug report.

glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/

