Bug 1392299
Summary: | [SAMBA-mdcache] Read hangs and leads to disconnect of samba share while creating IOs from one client & reading from another client | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Vivek Das <vdas> |
Component: | read-ahead | Assignee: | Poornima G <pgurusid> |
Status: | CLOSED ERRATA | QA Contact: | Vivek Das <vdas> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | rhgs-3.2 | CC: | amukherj, nbalacha, pgurusid, rcyriac, rhs-bugs, rhs-smb, sbhaloth, storage-qa-internal |
Target Milestone: | --- | ||
Target Release: | RHGS 3.2.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-3.8.4-6 | Doc Type: | Bug Fix |
Doc Text: |
In some situations, read operations were skipped by the io-cache translator, which led to a hung client mount. This has been corrected so that the client mount process works as expected for read operations.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2017-03-23 06:16:28 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1388292, 1399015, 1399018, 1399023, 1399024 | ||
Bug Blocks: | 1351528, 1351530 |
Description
Vivek Das
2016-11-07 07:02:52 UTC
Sosreports and samba logs are available at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1392299

From comment#3: This bug is not reproducible, i.e. it works absolutely fine, when we disable read-ahead for the volume.

< gluster volume set volname read-ahead off >

Looks like this is related to read-ahead, not readdir-ahead. Updating the component.

Fix posted upstream: http://review.gluster.org/15901

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead of STACK_WIND(). One such case is when inode_ctx for that file is not present (this can happen if readdirp was called, which populates md-cache and serves all the lookups from cache).

Consider the following graph:

...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:

ioc_readv()
{
...
    if (!inode_ctx)
        STACK_WIND_TAIL (frame, FIRST_CHILD (frame->this),
                         FIRST_CHILD (frame->this)->fops->readv, fd,
                         size, offset, flags, xdata);
        /* Ideally, this stack_wind should wind to readdir-ahead:readv()
         * but it winds to read-ahead:readv(). See below for explanation.
         */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
    frame->this = obj;
    /* for the above mentioned graph, frame->this will be readdir-ahead
     * frame->this = FIRST_CHILD (frame->this) i.e. readdir-ahead, which
     * is as expected
     */
...
    THIS = obj;
    /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
     * to "FIRST_CHILD (frame->this)" and frame->this was pointing
     * to readdir-ahead in the previous statement.
     */
...
    fn (frame, obj, params);
    /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
     * as fn expands to "FIRST_CHILD (frame->this)->fops->readv" and
     * frame->this was pointing to readdir-ahead in the first statement
     */
...
}

Thus, readdir-ahead's readv() implementation will be skipped, and ra_readv() will be called with frame->this = "readdir-ahead" and this = "read-ahead". This can lead to corruption / hang / other problems.
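The macro re-evaluation described in the RCA can be reproduced outside GlusterFS. The following is a minimal sketch, not the GlusterFS source: xl_t, frame_t, WIND_TAIL_BUGGY, WIND_TAIL_FIXED and the helper functions are hypothetical stand-ins invented for illustration, and the "fixed" variant only shows the general remedy of evaluating the arguments once before frame->this is changed. It is not claimed to be what either upstream patch does.

```c
#include <stddef.h>

/* Hypothetical stand-ins for xlator_t and call_frame_t (assumption:
 * simplified to a single child and a single fop). */
typedef struct xl xl_t;
typedef struct frame { xl_t *this_xl; } frame_t;
typedef const char *(*readv_fn)(frame_t *frame, xl_t *this_xl);

struct xl {
    const char *name;
    xl_t       *child;   /* first child in this toy graph */
    readv_fn    readv;
};

#define FIRST_CHILD(x) ((x)->child)

/* Each toy translator reports which readv() implementation ran. */
static const char *rda_readv(frame_t *frame, xl_t *this_xl)
{ (void)frame; (void)this_xl; return "readdir-ahead:readv"; }

static const char *ra_readv(frame_t *frame, xl_t *this_xl)
{ (void)frame; (void)this_xl; return "read-ahead:readv"; }

/* Buggy shape: obj and fn are pasted in textually, so they are
 * evaluated again AFTER frame->this_xl has been overwritten. */
#define WIND_TAIL_BUGGY(frame, obj, fn) \
    ((frame)->this_xl = (obj), (fn)((frame), (obj)))

/* General remedy: function arguments are evaluated exactly once,
 * before the body touches frame->this_xl. */
static const char *wind_tail_once(frame_t *frame, xl_t *next, readv_fn fn)
{
    frame->this_xl = next;
    return fn(frame, next);
}
#define WIND_TAIL_FIXED(frame, obj, fn) wind_tail_once((frame), (obj), (fn))

/* Wind from io-cache with the buggy macro; returns the readv() reached. */
const char *wind_buggy(void)
{
    xl_t ra  = { "read-ahead",    NULL, ra_readv  };
    xl_t rda = { "readdir-ahead", &ra,  rda_readv };
    xl_t ioc = { "io-cache",      &rda, NULL      };
    frame_t frame = { &ioc };
    return WIND_TAIL_BUGGY(&frame, FIRST_CHILD(frame.this_xl),
                           FIRST_CHILD(frame.this_xl)->readv);
}

/* Same wind with single evaluation of the arguments. */
const char *wind_fixed(void)
{
    xl_t ra  = { "read-ahead",    NULL, ra_readv  };
    xl_t rda = { "readdir-ahead", &ra,  rda_readv };
    xl_t ioc = { "io-cache",      &rda, NULL      };
    frame_t frame = { &ioc };
    return WIND_TAIL_FIXED(&frame, FIRST_CHILD(frame.this_xl),
                           FIRST_CHILD(frame.this_xl)->readv);
}
```

In this toy graph, wind_buggy() reaches read-ahead's readv() and skips readdir-ahead, mirroring the behaviour described in the RCA, while wind_fixed() reaches readdir-ahead's readv() as intended.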
But in this particular case, when the 'frame->this' and 'this' passed to ra_readv() don't match, it causes ra_readv() to call ra_readv() again! Thus the logic of read-ahead's readv() falls apart and leads to a hang.

Have posted another patch for review: http://review.gluster.org/#/c/15923/

Once this is merged, will backport this patch (http://review.gluster.org/#/c/15923/) to downstream.

http://review.gluster.org/15901, which is already merged, also fixes the issue, but the right way to fix it would be http://review.gluster.org/#/c/15923/

Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/91496/

Followed the steps to reproduce with glusterfs-3.8.4-6 and read-ahead: on; it works fine, so moving it to Verified state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days