Bug 1409102
| Summary: | [Arbiter] IO Failure and mount point inaccessible after killing a brick | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Karan Sandha <ksandha> |
| Component: | rpc | Assignee: | Milind Changire <mchangir> |
| Status: | CLOSED ERRATA | QA Contact: | Karan Sandha <ksandha> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rhgs-3.2 | CC: | amukherj, ksandha, mchangir, rcyriac, rgowdapp, rhs-bugs, sheggodu, smali |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.4.0 | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | rebase | ||
| Fixed In Version: | glusterfs-3.12.2-1 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-09-04 06:29:55 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1503134 | ||
Description
Karan Sandha
2016-12-29 14:06:04 UTC
My gut feeling is that this is the same as bug [1]. [1] was hit when protocol/client received events in the order:

CONNECT DISCONNECT DISCONNECT CONNECT

However, in this bz I think protocol/client received events in the order:

DISCONNECT CONNECT CONNECT DISCONNECT

We still need to work out whether such an ordering is possible: there can be only one event at a time from a socket due to EPOLL_ONESHOT, but the events can be on different sockets, since transport/socket uses a new socket for every new connection.

Another point to note is that [2] fixes [1] by making the following two steps atomic in rpc-client:

1. setting priv->connected=0
2. notifying higher layers of a DISCONNECT event

However, if there are indeed racing events, could a CONNECT and a DISCONNECT race between transport/socket and rpc-client and change their order? Something to ponder.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1385605
[2] http://review.gluster.org/15916

rjosoph,

This is reproducible only very intermittently, but when the issue is hit it leaves the whole system in a hung state. I have the statedump taken at the time the issue was hit. It is placed at the location itself. pstack output was not taken.

Thanks & regards
Karan Sandha

Patches [1][2] are merged in rhgs-3.3.0. Should we close this bug as fixed?

[1] https://code.engineering.redhat.com/gerrit/#/c/99220/
[2] http://review.gluster.org/15916

regards,
Raghavendra

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607
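As an illustration of the two event orderings discussed in the comments above, here is a minimal toy model in Python. It is not GlusterFS code: `ToyRpcClient`, `on_event`, and `final_state` are hypothetical names invented for this sketch, and the model assumes a simple "last event wins" connected flag. The lock only mirrors the idea from [2] of updating the flag and notifying higher layers as one atomic step. The sketch shows where such a flag ends up under each ordering; it does not establish that this is what happened in this bug.

```python
import threading


class ToyRpcClient:
    """Hypothetical stand-in for an rpc-client; NOT actual GlusterFS code."""

    def __init__(self):
        self._lock = threading.Lock()
        self.connected = False
        self.notified = []  # events passed to higher layers, in delivery order

    def on_event(self, event):
        # Update the flag and notify higher layers under one lock, so no
        # other event can be interleaved between the two steps.
        with self._lock:
            self.connected = (event == "CONNECT")
            self.notified.append(event)


def final_state(events):
    """Feed events in the given order and return the resulting connected flag."""
    client = ToyRpcClient()
    for event in events:
        client.on_event(event)
    return client.connected


# Ordering reported for bug 1385605 in the comment above.
bug_1385605_order = ["CONNECT", "DISCONNECT", "DISCONNECT", "CONNECT"]
# Ordering suspected for this bug in the comment above.
suspected_order = ["DISCONNECT", "CONNECT", "CONNECT", "DISCONNECT"]

if __name__ == "__main__":
    print("bug 1385605 order ->", final_state(bug_1385605_order))
    print("suspected order   ->", final_state(suspected_order))
```

In this toy model the suspected ordering leaves the flag cleared because the last event delivered is a DISCONNECT, even though each individual event is handled atomically. Atomic handling inside the client does not by itself protect against events that arrive already reordered.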