Bug 1340085
Summary: | Directory creation(mkdir) fails when the remove brick is initiated for replicated volumes accessing via nfs-ganesha | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Shashank Raj <sraj> | |
Component: | replicate | Assignee: | Jiffin <jthottan> | |
Status: | CLOSED ERRATA | QA Contact: | Shashank Raj <sraj> | |
Severity: | urgent | Docs Contact: | ||
Priority: | unspecified | |||
Version: | rhgs-3.1 | CC: | amukherj, jthottan, kkeithle, nchilaka, ndevos, pkarampu, rcyriac, rgowdapp, rhinduja, rhs-bugs, sashinde, skoduri, sraj, storage-qa-internal | |
Target Milestone: | --- | Keywords: | Regression, ZStream | |
Target Release: | RHGS 3.1.3 | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | glusterfs-3.7.9-7 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1340623 | Environment: | |
Last Closed: | 2016-06-23 05:24:33 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1311817, 1340623, 1340992, 1342178 |
Description
Shashank Raj
2016-05-26 12:28:12 UTC
sosreports, logs and packet_trace can be found under http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1340085

Why can't we have the back traces of the relevant processes to narrow down the issue?

These are my initial findings: On a v3 mount, when we perform "gluster v remove-brick <volname> start", all ongoing mkdir calls fail with "mkdir: cannot create directory `new': Input/output error". Creation of files is fine. I didn't hit any IO hang in my setup. (I got a similar log message in ganesha.log and ganesha-gfapi.log.) All further mkdir requests fail until "gluster v remove-brick <volname> commit" is performed.

Copying the RCA from the cloned upstream bug: "DHT expects GF_PREOP_CHECK_FAILED to be present in xdata_rsp in case of mkdir failures because of stale layout. But AFR was unwinding null xdata_rsp in case of failures. This was leading to mkdir failures just after remove-brick. Unwind the xdata_rsp in case of failures to make sure the response from brick reaches dht."

Upstream patch http://review.gluster.org/14553 posted for review.

Verified this bug with the latest glusterfs-3.7.9-7 and nfs-ganesha-2.3.1-7 builds; during remove-brick operations, no failures for directories are seen. Ran all the automated remove-brick cases which earlier reported this problem, and all passed without any issues. Based on the above observation, marking this bug as Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240
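
For readers unfamiliar with the translator stack, the RCA quoted above can be illustrated with a small toy model. This is only a sketch, not GlusterFS code: the function names (brick_mkdir, afr_unwind, dht_mkdir), the tuple-shaped reply, the use of EIO as the brick's error, and the "refresh layout and retry once" step in DHT are all assumptions made for illustration. Only the key GF_PREOP_CHECK_FAILED and the "AFR unwinds null xdata_rsp on failure" behaviour come from the RCA text.

```python
import errno

# Toy model only: names mirror the RCA wording, not the real GlusterFS C API.
GF_PREOP_CHECK_FAILED = "GF_PREOP_CHECK_FAILED"


def brick_mkdir(layout_is_stale):
    """Hypothetical brick reply as (op_ret, op_errno, xdata_rsp)."""
    if layout_is_stale:
        # Assumption: the brick rejects the mkdir and tags the reply.
        return -1, errno.EIO, {GF_PREOP_CHECK_FAILED: 1}
    return 0, 0, {}


def afr_unwind(reply, forward_xdata_on_failure):
    """Hypothetical AFR layer passing the brick reply up to DHT."""
    op_ret, op_errno, xdata = reply
    if op_ret < 0 and not forward_xdata_on_failure:
        # Pre-fix behaviour described in the RCA: xdata_rsp is dropped.
        xdata = None
    return op_ret, op_errno, xdata


def dht_mkdir(forward_xdata_on_failure):
    """Hypothetical DHT layer: retry once if the failure is tagged."""
    # First attempt is made with a stale layout (just after remove-brick).
    op_ret, op_errno, xdata = afr_unwind(
        brick_mkdir(layout_is_stale=True), forward_xdata_on_failure
    )
    if op_ret < 0 and xdata and GF_PREOP_CHECK_FAILED in xdata:
        # Assumption: DHT refreshes its layout and retries the mkdir.
        op_ret, op_errno, xdata = afr_unwind(
            brick_mkdir(layout_is_stale=False), forward_xdata_on_failure
        )
    return op_ret, op_errno


if __name__ == "__main__":
    print("xdata dropped on failure:  ", dht_mkdir(forward_xdata_on_failure=False))
    print("xdata forwarded on failure:", dht_mkdir(forward_xdata_on_failure=True))
```

In this model, dropping xdata_rsp leaves DHT unable to tell a stale-layout failure from any other error, so the mkdir surfaces as an Input/output error; forwarding it lets the retry path run.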