Bug 1577672
| Summary: | Brick-mux regressions failing for over 8+ weeks on master | |||
|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Shyamsundar <srangana> | |
| Component: | tests | Assignee: | bugs <bugs> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | mainline | CC: | atumball, bugs | |
| Target Milestone: | --- | Keywords: | Triaged | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-5.0 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1582286 | Environment: | ||
| Last Closed: | 2018-10-05 04:34:24 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
Description
Shyamsundar
2018-05-13 21:59:39 UTC
REVIEW: https://review.gluster.org/20022 (glusterd: address test failures with brick mux enabled) posted (#2) for review on master by Atin Mukherjee

REVIEW: https://review.gluster.org/20036 (afr: fix bug-1363721.t failure) posted (#1) for review on master by Ravishankar N

REVIEW: https://review.gluster.org/20037 (changelog: fix br-state-check.t failure for brick_mux) posted (#1) for review on master by MOHIT AGRAWAL

COMMIT: https://review.gluster.org/20036 committed in master by "Ravishankar N" <ravishankar> with a commit message- afr: fix bug-1363721.t failure

Problem:
In the .t, when the only good brick was brought down, writes on the fd were still succeeding on the bad bricks. The inflight split-brain check was marking the write as a failure, but since the write succeeded on all the bad bricks, afr_txn_nothing_failed() was set to true and we were unwinding writev with success to DHT and then catching the failure in post-op in the background.

Fix:
Don't wind the FOP phase if the write_subvol (which is populated with readable subvols obtained in pre-op cbk) does not have at least 1 good brick which was up when the transaction started.

Note: This fix is not related to brick multiplexing. I ran the .t 10 times with this fix and brick-mux enabled without any failures.
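The check described in the "Fix" paragraph above can be illustrated with a small sketch. This is not the AFR code (AFR is written in C); the function names, the boolean-list representation of bricks, and the "wind"/"fail" return values are all assumptions made for illustration. Only the rule itself comes from the commit message: the FOP phase is wound only if at least one readable (good) brick was up when the transaction started.

```python
from typing import Sequence


def has_good_brick_up(write_subvol: Sequence[bool],
                      up_at_txn_start: Sequence[bool]) -> bool:
    """Hypothetical helper: True if some brick is both readable (good)
    and was up when the transaction started."""
    return any(good and up for good, up in zip(write_subvol, up_at_txn_start))


def start_fop_phase(write_subvol: Sequence[bool],
                    up_at_txn_start: Sequence[bool]) -> str:
    """Hypothetical stand-in for the decision made before winding a write.

    Returns "fail" when the write must be failed up front instead of being
    sent to the bad bricks only, and "wind" otherwise.
    """
    if not has_good_brick_up(write_subvol, up_at_txn_start):
        return "fail"
    return "wind"


if __name__ == "__main__":
    # Scenario from the commit message: the only good brick (index 0) is
    # down, the bad bricks are up. The write is not wound.
    print(start_fop_phase([True, False, False], [False, True, True]))
    # Same volume with the good brick still up: the write proceeds.
    print(start_fop_phase([True, False, False], [True, True, True]))
```

Failing before the FOP phase means the caller sees the error directly, rather than a success that is contradicted later by a background post-op.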
Change-Id: I915c9c366aa32cd342b1565827ca2d83cb02ae85
updates: bz#1577672
Signed-off-by: Ravishankar N <ravishankar>

COMMIT: https://review.gluster.org/20037 committed in master by "Amar Tumballi" <amarts> with a commit message- changelog: fix br-state-check.t failure for brick_mux

Problem: br-state-check.t sometimes crashes when run with brick multiplexing, and the command in the test case takes 2 minutes to detach a brick.

Solution: Update the changelog xlator code to wait on all connections before cleaning up the rpc threads, and to clean up the rpc object only in the non-brick-mux scenario.

BUG: 1577672
Change-Id: I16e257c1e127744a815000b87bd8b7b8d9c51e1b
fixes: bz#1577672
Signed-off-by: Mohit Agrawal <moagrawa>

COMMIT: https://review.gluster.org/20022 committed in master by "Amar Tumballi" <amarts> with a commit message- glusterd: address test failures with brick mux enabled

This patch addresses the following:

1. On volume stop, for the last brick, pmap_registry_remove () is invoked by glusterd.
2. If a brick process is sigkilled, remove all the associated brick instances from the portmap.
3. Bump up PROCESS_UP_TIMEOUT to 45.
4. gf_attach to kill a brick takes more time in mux (which is an issue that needs a fix), but in the interim, give br-state-check.t more time to complete (there are 2 kill_bricks, each taking 120 seconds, and the test usually passes in 30 odd seconds, hence bumping this up to 350 seconds).
5. The test bug-1559004-EMLINK-handling.t takes ~950 seconds at times on master without mux; in mux cases, when it fails, it is almost at the last iteration, hence bumping the timeout for this test case to reduce regression error rates.

Updates: bz#1577672
Change-Id: I1922675e112baca4c125c4c094eaa42a11e34e67
Signed-off-by: Atin Mukherjee <amukherj>

https://build.gluster.org/job/regression-test-with-multiplex/

Master branch is now stable for all brick-mux regressions!

This bug is getting closed because a release has been made available that should address the reported issue.
In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/
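The portmap handling in items 1 and 2 of the glusterd commit message above can be pictured with a toy registry. This is only a sketch under stated assumptions: the class, its methods, the port number, and the brick paths are hypothetical and do not reproduce glusterd's C data structures or the signature of pmap_registry_remove (). It shows why, with brick multiplexing, one port can carry several brick instances, so a killed process has to take all of its entries with it.

```python
from collections import defaultdict
from typing import Dict, List, Set


class PortmapSketch:
    """Hypothetical port -> brick-instances registry (illustration only)."""

    def __init__(self) -> None:
        self._bricks_by_port: Dict[int, Set[str]] = defaultdict(set)

    def register(self, port: int, brick: str) -> None:
        # With brick mux, several bricks may be registered on one port.
        self._bricks_by_port[port].add(brick)

    def remove_brick(self, port: int, brick: str) -> None:
        # Loosely models item 1: a single brick entry is removed on volume
        # stop; the port entry disappears once its last brick is gone.
        bricks = self._bricks_by_port.get(port)
        if bricks is None:
            return
        bricks.discard(brick)
        if not bricks:
            del self._bricks_by_port[port]

    def remove_all(self, port: int) -> List[str]:
        # Loosely models item 2: the brick process was killed, so every
        # brick instance it hosted is dropped at once.
        return sorted(self._bricks_by_port.pop(port, set()))

    def bricks(self, port: int) -> List[str]:
        return sorted(self._bricks_by_port.get(port, set()))


if __name__ == "__main__":
    pm = PortmapSketch()
    pm.register(49152, "/bricks/b1")  # example port and paths, not from the bug
    pm.register(49152, "/bricks/b2")
    print(pm.remove_all(49152))  # both multiplexed bricks are removed together
    print(pm.bricks(49152))      # nothing stale is left behind
```

Leaving stale entries behind after a kill is the kind of state that could confuse later lookups, which is presumably why the patch removes every associated instance rather than only one.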