Bug 1630145
Summary: | Geo-rep: Few workers fails to start with out any failure | ||
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Kotresh HR <khiremat> |
Component: | geo-replication | Assignee: | Kotresh HR <khiremat> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.1 | CC: | bugs |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-4.1.5 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | 1614799 | Environment: | |
Last Closed: | 2018-09-26 14:02:57 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1614799 | ||
Bug Blocks: | 1623749 |
Description
Kotresh HR
2018-09-18 05:46:27 UTC
REVIEW: https://review.gluster.org/21201 (geo-rep: Fix deadlock during worker start) posted (#1) for review on release-4.1 by Kotresh HR

COMMIT: https://review.gluster.org/21201 committed in release-4.1 by "Shyamsundar Ranganathan" <srangana> with the commit message:

geo-rep: Fix deadlock during worker start

Analysis:
The monitor process spawns monitor threads (one per brick). Each monitor thread forks worker and agent processes. Each monitor thread, while initializing, updates the monitor status file. The update is synchronized using flock. The race is that one thread can fork a worker while another thread has the status file open, so the worker process ends up holding a reference to that fd.

Cause:
flock gets unlocked either by explicitly unlocking it or by closing all duplicate fds referring to the file. The code relied on fd close, hence a reference held in the worker/agent process after fork could cause the deadlock.

Fix:
1. flock is unlocked explicitly.
2. The status file is also updated in appropriate places so that the reference is not leaked to the worker/agent process.

With this fix, both the deadlock and possible fd leaks are solved.

Backport of:
> Patch: https://review.gluster.org/20704
> BUG: bz#1614799
> Change-Id: I0d1ce93072dab07d0dbcc7e779287368cd9f093d
> Signed-off-by: Kotresh HR <khiremat>

fixes: bz#1630145
Change-Id: I0d1ce93072dab07d0dbcc7e779287368cd9f093d
Signed-off-by: Kotresh HR <khiremat>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-4.1.5, please open a new bug report.

glusterfs-4.1.5 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-September/000113.html
[2] https://www.gluster.org/pipermail/gluster-users/
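
The cause described in the commit message (an flock that stays held because a forked process inherited the fd) can be illustrated with a minimal sketch. This is not the geo-rep code: the function names `try_lock` and `demo`, the temporary file standing in for the monitor status file, and the pipe used to keep the child alive are all hypothetical and exist only for illustration. It assumes a Linux or other POSIX system where `fcntl.flock` maps to flock(2) on a local filesystem.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return True if an exclusive flock on `path` can be taken without blocking."""
    fd = os.open(path, os.O_RDWR)  # a fresh open file description
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:  # EWOULDBLOCK: somebody else still holds the lock
        return False
    finally:
        os.close(fd)  # sole reference to this description, so this releases it


def demo(explicit_unlock):
    """Hypothetical reproduction; returns True if the lock is free afterwards."""
    tmp_fd, path = tempfile.mkstemp()  # stand-in for the monitor status file
    os.close(tmp_fd)

    fd = os.open(path, os.O_RDWR)
    fcntl.flock(fd, fcntl.LOCK_EX)  # "monitor thread" takes the lock

    r, w = os.pipe()
    pid = os.fork()  # "worker" forked while the status file is open
    if pid == 0:
        # Child: inherits `fd`, which refers to the same open file description.
        os.close(w)
        os.read(r, 1)  # wait until the parent closes its end of the pipe
        os._exit(0)

    os.close(r)
    if explicit_unlock:
        fcntl.flock(fd, fcntl.LOCK_UN)  # the fix: unlock explicitly
    os.close(fd)  # without LOCK_UN the child's copy keeps the lock alive

    free = try_lock(path)

    os.close(w)  # let the child exit
    os.waitpid(pid, 0)
    os.unlink(path)
    return free


if __name__ == "__main__":
    print("close only:      lock free =", demo(explicit_unlock=False))
    print("explicit unlock: lock free =", demo(explicit_unlock=True))
```

In the close-only case the lock remains held for as long as the child keeps its inherited fd, which is the situation that blocks the next status file update. Unlocking explicitly releases the lock regardless of inherited copies, although the fd itself is still leaked to the child unless it is closed before the fork.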