Bug 1635967
Summary: | Heal on arbitered brick(node) is in pending state after in-service upgrade from RHGS3.4.0 to RHGS3.4.0-async | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Bala Konda Reddy M <bmekala>
Component: | arbiter | Assignee: | Ravishankar N <ravishankar>
Status: | CLOSED DUPLICATE | QA Contact: | Karan Sandha <ksandha>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | rhgs-3.4 | CC: | amukherj, bmekala, ravishankar, rhs-bugs, sanandpa, sankarshan, storage-qa-internal, vdas
Target Milestone: | --- | Keywords: | ZStream
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-10-22 15:53:20 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Bala Konda Reddy M
2018-10-04 06:37:45 UTC
Comment

What's the relation of this bug with the async change we did in glusterd?

Comment 6
Bala Konda Reddy M

Ravi,
I haven't restarted or rebooted anything on the cluster.

Attaching the sosreports, brick dumps and shd dumps:
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/bmekala/bug.1635967/

Comment 7

(In reply to Bala Konda Reddy M from comment #6)
> Ravi,
> I haven't restarted or rebooted anything on the cluster.
>
> Attaching the sosreports, brick dumps and shd dumps:
> http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/bmekala/bug.1635967/

The brick statedump taken earlier shows ACTIVE locks on the bricks dated 2018-10-05:

inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=402201c8af7f0000, client=0x7f8a800a6e20, connection-id=dhcp37-75.lab.eng.blr.redhat.com-2118-2018/10/03-08:40:08:786539-arb_1-client-0-0-0, granted at 2018-10-05 16:58:43
lock-dump.domain.domain=arb_1-replicate-0:metadata

I haven't checked the shd logs yet, but I think there might have been a network disconnect between the shd and the bricks in order for the locks to be released. At this point I'm almost sure that, with the test description in this BZ resembling https://bugzilla.redhat.com/show_bug.cgi?id=1637802#c0 (which is for fixing the one raised by Vijay, BZ 1636902), the same fix should address this issue as well.

Comment 8
Ravishankar N

Hi Bala,
Further to comment #7, do you have any objections to closing this as a duplicate of BZ 1636902? The issue should not occur with glusterfs-3.12.2-23, which contains the fix for the stale lock issue.

Bala Konda Reddy M

(In reply to Ravishankar N from comment #8)
> Hi Bala,
> Further to comment #7, do you have any objections to closing this as a
> duplicate of BZ 1636902? The issue should not occur with glusterfs-3.12.2-23,
> which contains the fix for the stale lock issue.

Yes, mark it as a duplicate.

*** This bug has been marked as a duplicate of bug 1636902 ***
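Note for anyone triaging similar reports: the statedump excerpt quoted in comment #7 can be scanned for ACTIVE inodelk entries with a short script. The sketch below is only illustrative; the script name and the example statedump path are assumptions, not taken from this bug's attachments, and the parsing is based solely on the entry format shown above.

```python
import re
import sys

# Matches brick statedump lock entries of the form quoted in comment #7, e.g.:
#   inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, ..., granted at 2018-10-05 16:58:43
LOCK_LINE = re.compile(r"^inodelk\.inodelk\[\d+\]\((ACTIVE|BLOCKED)\)=(.*)$")


def iter_inodelk_entries(path):
    """Yield (state, details) tuples for every inodelk line in a statedump file."""
    with open(path) as dump:
        for line in dump:
            match = LOCK_LINE.match(line.strip())
            if match:
                yield match.group(1), match.group(2)


if __name__ == "__main__":
    # Hypothetical usage: python scan_locks.py /var/run/gluster/<brick>.dump.<timestamp>
    for dump_path in sys.argv[1:]:
        for state, details in iter_inodelk_entries(dump_path):
            if state == "ACTIVE":
                print(f"{dump_path}: ACTIVE inodelk -> {details}")
```

Run against the brick statedumps collected above, this would print any inodelk entries still marked ACTIVE, such as the lock granted at 2018-10-05 16:58:43 in the arb_1-replicate-0 metadata domain.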