Bug 1059092 - gsyncd.conf goes corrupt - loses state_file entry - leads to "defunct" geo-rep status
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Avra Sengupta
QA Contact:
URL:
Whiteboard:
Depends On: 1058999 1162142
Blocks:
Reported: 2014-01-29 07:31 UTC by Avra Sengupta
Modified: 2014-11-11 08:27 UTC (History)

Fixed In Version: glusterfs-3.6.0beta1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1058999
Environment:
Last Closed: 2014-11-11 08:27:24 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 2 Anand Avati 2014-01-29 14:30:28 UTC
REVIEW: http://review.gluster.org/6856 (gluserd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#1) for review on master by Avra Sengupta (asengupt)

Comment 3 Avra Sengupta 2014-01-29 14:41:05 UTC
In the config file we have observed several missing entries, including state_file and pid_file, which are crucial for the start and stop operations of gsyncd processes. While the status and pid files themselves might be present, the entries that point to the location of these files are missing. We are investigating the circumstances that could have led to the deletion of these entries, as none of the gsyncd/glusterd operations remove entries from the config file. It also doesn't look like corruption, as the rest of the entries in the config file are fine.

Meanwhile, we have sent this patch (http://review.gluster.org/6856), which fixes the failure of stop force. With this patch, if entries like state_file or pid-file are missing in gsyncd.conf, or if gsyncd.conf itself is missing, glusterd looks for the missing configs in gsyncd_template.conf.

stop force will successfully stop an already running session even if the state-file entries are missing in both the config file and the template, as long as either of them has a pid-file entry. If the pid-file entry is missing from the config of an already started session, stop force will fetch it from the config template and stop the session. However, if the pid-file entry is missing in both the config and the template, stop force will fail with an appropriate error stating that the pid-file entry is missing.
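
The lookup order described above can be sketched roughly as follows. This is only an illustration in Python, not the glusterd code from the patch (which is not shown in this report). It assumes that both files are INI-style and can be read with configparser, and that hyphenated and underscored spellings (pid-file, pid_file) name the same option. The helper names load_options, lookup and stop_force are hypothetical.

```python
import configparser


def load_options(text):
    """Parse INI-style text and flatten all sections into one dict.

    Assumption: gsyncd.conf and gsyncd_template.conf are INI-style.
    Hyphens in option names are normalised to underscores.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    options = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            options[key.replace("-", "_")] = value
    return options


def lookup(key, conf, template):
    """Return the value from the config, else from the template, else None."""
    key = key.replace("-", "_")
    if key in conf:
        return conf[key]
    return template.get(key)


def stop_force(conf, template):
    """Decide whether stop force can proceed (hypothetical helper).

    Only the pid-file entry is required; a missing state-file entry
    does not block stopping a running session.
    """
    pid_file = lookup("pid_file", conf, template)
    if pid_file is None:
        return (False, "pid-file entry is missing in both config and template")
    return (True, pid_file)


if __name__ == "__main__":
    # Hypothetical file contents: the config has lost both entries,
    # the template still carries the pid-file entry.
    conf = load_options("[session]\nlog_level = INFO\n")
    template = load_options("[session]\npid-file = /tmp/example/gsyncd.pid\n")
    print(stop_force(conf, template))
    print(stop_force(conf, {}))
```

A missing gsyncd.conf can be modelled here by passing an empty dict as conf, in which case every lookup falls through to the template.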

This patch is currently under review and has been thoroughly unit-tested. But as it involves major changes in a critical code path, it would be preferable to have a proper QE regression run on it as well.

Comment 4 Anand Avati 2014-02-03 10:21:15 UTC
REVIEW: http://review.gluster.org/6856 (gluserd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#2) for review on master by Avra Sengupta (asengupt)

Comment 5 Anand Avati 2014-02-04 14:00:42 UTC
REVIEW: http://review.gluster.org/6856 (gluserd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#3) for review on master by Avra Sengupta (asengupt)

Comment 6 Anand Avati 2014-02-07 14:38:29 UTC
REVIEW: http://review.gluster.org/6856 (gluserd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#4) for review on master by Avra Sengupta (asengupt)

Comment 7 Anand Avati 2014-02-10 07:45:31 UTC
REVIEW: http://review.gluster.org/6856 (glusterd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#5) for review on master by Avra Sengupta (asengupt)

Comment 8 Anand Avati 2014-02-14 10:42:03 UTC
REVIEW: http://review.gluster.org/6856 (glusterd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#6) for review on master by Avra Sengupta (asengupt)

Comment 9 Anand Avati 2014-03-20 07:56:20 UTC
REVIEW: http://review.gluster.org/6856 (glusterd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#7) for review on master by Avra Sengupta (asengupt)

Comment 10 Anand Avati 2014-04-30 09:33:54 UTC
REVIEW: http://review.gluster.org/6856 (glusterd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf) posted (#8) for review on master by Avra Sengupta (asengupt)

Comment 11 Anand Avati 2014-05-02 03:21:37 UTC
COMMIT: http://review.gluster.org/6856 committed in master by Vijay Bellur (vbellur) 
------
commit 3d4a31d304064f88d2d1e414346c790f099743b5
Author: Avra Sengupta <asengupt>
Date:   Wed Jan 29 03:06:19 2014 +0000

    glusterd/geo-rep: Looks for state_file and pid-file in gsyncd_template.conf
    
    If entries like state_file or pid-file are missing in the gsyncd.conf
    or if the gsyncd.conf is also missing, glusterd looks for the missing
    configs in the gsyncd_template.conf
    
    status will display "Config Corrupted" as long as the entry is missing in
    the config file.  Missing state-file entry in both config and template
    will not allow starting a geo-rep session.
    
    However stop force will successfully stop an already running session,
    if the state-file entries are missing in both the config file and
    the template, as long as either of them have a pid-file entry.
    
    if the pid-file entry is missing in the gsyncd.conf file, starting a
    geo-rep session will not be allowed.
    
    if the pid-file entry is missing in an already started session, then
    stop force will fetch it from the config template and stop the session.
    
    if the pid-file entry is missing in both the config and the template,
    stop force will fail with appropriate error stating pid-file entry is missing.
    
    Change-Id: I81d7cbc4af085d82895bbef46ca732555aa5365d
    BUG: 1059092
    Signed-off-by: Avra Sengupta <asengupt>
    Reviewed-on: http://review.gluster.org/6856
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
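
The start and status rules in the commit message above can be summarised in a small sketch. This is an illustrative reading in Python, not the committed glusterd code. It assumes the two files have already been parsed into plain dicts keyed by state_file and pid_file, and it reads "the entry" in the status rule as either of those two options. The function names can_start and is_config_corrupted are hypothetical.

```python
REQUIRED = ("state_file", "pid_file")


def can_start(conf, template):
    """Return (allowed, reason) for starting a geo-rep session.

    conf and template are dicts of already-parsed options (assumption).
    """
    # The pid-file entry must be present in gsyncd.conf itself;
    # the template is not consulted for start.
    if "pid_file" not in conf:
        return (False, "pid-file entry is missing in gsyncd.conf")
    # The state-file entry may come from either the config or the template.
    if "state_file" not in conf and "state_file" not in template:
        return (False, "state-file entry is missing in both config and template")
    return (True, "")


def is_config_corrupted(conf):
    """True when status should display "Config Corrupted".

    The template does not matter here: the status stays corrupted
    for as long as the entry is missing in the config file.
    """
    return any(key not in conf for key in REQUIRED)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    conf = {"pid_file": "/tmp/example/gsyncd.pid"}
    template = {"state_file": "/tmp/example/status"}
    print(can_start(conf, template))
    print(is_config_corrupted(conf))
```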

Comment 12 Niels de Vos 2014-09-22 12:35:26 UTC
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify if the release solves this bug report for you. In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 13 Niels de Vos 2014-11-11 08:27:24 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

