Bug 1332199
| Summary: | Self Heal fails on a replica3 volume with 'disk quota exceeded' | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Sweta Anandpara <sanandpa> |
| Component: | quota | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED ERRATA | QA Contact: | Nag Pavan Chilakam <nchilaka> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.1 | CC: | amukherj, asrivast, pkarampu, rgowdapp, rhinduja, rhs-bugs, storage-qa-internal |
| Target Milestone: | --- | Keywords: | Regression, ZStream |
| Target Release: | RHGS 3.1.3 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.7.9-4 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| Clones: | 1332994 | Environment: | |
| Last Closed: | 2016-06-23 05:20:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1311817, 1332994, 1335283, 1335686 | | |
Description
Sweta Anandpara
2016-05-02 13:30:07 UTC
```
[qe@rhsqe-repo 1332199]$ hostname
rhsqe-repo.lab.eng.blr.redhat.com
[qe@rhsqe-repo 1332199]$ pwd
/home/repo/sosreports/1332199
[qe@rhsqe-repo 1332199]$ ls -l
total 136516
-rwxr-xr-x. 1 qe qe 35482436 May 2 19:00 sosreport-sysreg-prod-20160502173626.tar.xz
-rwxr-xr-x. 1 qe qe 36357356 May 2 19:02 sosreport-sysreg-prod-20160502173627.tar.xz
-rwxr-xr-x. 1 qe qe 37035712 May 2 19:05 sosreport-sysreg-prod-20160502173628.tar.xz
-rwxr-xr-x. 1 qe qe 30910008 May 2 19:04 sosreport-sysreg-prod-20160502173629.tar.xz
```

The package glusterfs-debuginfo is now installed on the setup; that should help with further debugging.

Quota is receiving write requests (issued as part of self-heal) with a pid of 0. For quota to skip its enforcement checks, the pid has to be a negative number. The gdb transcript below shows the zero pid on the incoming request; a sketch of how to reproduce such a session is given after the closing comment.

```
(gdb) p req->pid
$3 = 0
(gdb) p *req
$4 = {trans = 0x7f781c00ebc0, svc = 0x7f782402f840, prog = 0x7f7824031200, xid = 7048,
  prognum = 1298437, progver = 330, procnum = 13, type = 0, uid = 0, gid = 0, pid = 0,
  lk_owner = {len = 8, data = "ĝ\256\324%\177", '\000' <repeats 1017 times>}, gfs_id = 0,
  auxgids = 0x7f78280370ec, auxgidsmall = {0 <repeats 128 times>}, auxgidlarge = 0x0,
  auxgidcount = 0, msg = {{iov_base = 0x7f7837473a44, iov_len = 44},
  {iov_base = 0x7f7837493d00, iov_len = 35}, {iov_base = 0x0, iov_len = 0} <repeats 14 times>},
  count = 2, iobref = 0x7f781c0185d0, rpc_status = 0, rpc_err = 0, auth_err = 0,
  txlist = {next = 0x7f782803741c, prev = 0x7f782803741c}, payloadsize = 0,
  cred = {flavour = 390039, datalen = 28, authdata = '\000' <repeats 19 times>,
  "\bĝ\256\324%\177", '\000' <repeats 373 times>},
  verf = {flavour = 0, datalen = 0, authdata = '\000' <repeats 399 times>},
  synctask = _gf_false, private = 0x0, trans_private = 0x0, hdr_iobuf = 0x0, reply = 0x0}
(gdb) c
Continuing.
[Thread 0x7f7821a02700 (LWP 19940) exited]

Breakpoint 1, quota_writev (frame=0x7f7834b9d3e8, this=0x7f7824019dc0, fd=0x7f78240cdebc,
    vector=0x7f781c018c38, count=1, off=0, flags=0, iobref=0x7f781c0185d0, xdata=0x0)
    at quota.c:1810
1810    {
(gdb) p frame->root->pid
$5 = 0
```

After debugging this issue, we found that the multi-threaded self-heal feature introduced this regression. Please mark this as a blocker.

Upstream patch http://review.gluster.org/14211 posted for review.

QATP:
===
(all tests run with x3 volumes)
TC#1: ran the case mentioned while raising the bug ==> passed
TC#2: failed ==> raised bug 1341190, "conservative merge happening on a x3 volume for a deleted file"
TC#3: same as TC#1, but checked against a data-size limit instead of an inode limit ==> passed

Since TC#1 passed, moving to VERIFIED. Retried TC#1 and TC#3 with multi-threaded self-heal set to 16 threads (cluster.shd-max-threads: 16) ==> both passed; a sketch of the corresponding CLI steps is given after the closing comment.

Tested version: 3.7.9-6

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240
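For anyone retracing the analysis above: a minimal sketch of how such a gdb session can be reproduced, assuming glusterfs-debuginfo is installed on the brick node. The process-selection command is illustrative, and the reserved internal-client pid named in the comments (GF_CLIENT_PID_SELF_HEALD) is the negative pid the self-heal frames are expected to carry.

```sh
# Attach to a brick process of the affected volume (selection here is illustrative;
# quota enforcement runs in the brick-side translator stack, so glusterfsd is the target).
gdb -p "$(pgrep -f glusterfsd | head -n1)"

# Inside gdb, break on the quota translator's writev fop:
#   (gdb) break quota_writev
#   (gdb) continue
# Then trigger a self-heal (for example, bring a downed brick back up) and,
# when the breakpoint fires:
#   (gdb) print frame->root->pid
# With this bug, the pid prints as 0 for heal traffic, whereas internal clients
# are expected to carry a reserved negative pid (such as GF_CLIENT_PID_SELF_HEALD),
# which is what lets quota skip enforcement for self-heal writes.
```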
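A sketch of the verification flow implied by the QATP, assuming a replica-3 volume named testvol; the directory path and limit values are illustrative, while the volume option and quota subcommands are standard gluster CLI.

```sh
# Multi-threaded self-heal daemon, as exercised in the retest (QATP above).
gluster volume set testvol cluster.shd-max-threads 16

# Quota setup: TC#1 exercised an inode-count limit, TC#3 a data-size limit.
gluster volume quota testvol enable
gluster volume quota testvol limit-objects /dir 100    # inode limit (TC#1)
gluster volume quota testvol limit-usage /dir 10GB     # size limit (TC#3)

# After a brick goes down and comes back, heal should complete without
# 'disk quota exceeded' errors in the shd or brick logs.
gluster volume heal testvol info
```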