Bug 1035586 - gluster volume status shows incorrect information for brick process
Summary: gluster volume status shows incorrect information for brick process
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-11-28 07:09 UTC by Ravishankar N
Modified: 2014-11-11 08:24 UTC (History)
3 users

Fixed In Version: glusterfs-3.6.0beta1
Clone Of:
Environment:
Last Closed: 2014-11-11 08:24:54 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Ravishankar N 2013-11-28 07:09:46 UTC
Description of problem:
After killing the gluster processes and restarting glusterd, `gluster volume status` shows the wrong status and PID for the brick process.

Version-Release number of selected component (if applicable):
Mainline

How reproducible:
Always

Steps to Reproduce:
1. Create a 1x2 replicate volume using 2 nodes and check the status on node 1:

[root@tuxmv3 ~]# gluster v status
Status of volume: repvol
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.194:/brick/brick1                        49152   Y       23687
Brick 10.70.42.203:/brick/brick1                        49152   Y       24016
NFS Server on localhost                                 2049    Y       23701
Self-heal Daemon on localhost                           N/A     Y       23705
NFS Server on 10.70.42.203                              2049    Y       24030
Self-heal Daemon on 10.70.42.203                        N/A     Y       24034

Task Status of Volume repvol
------------------------------------------------------------------------------


2. Run `pkill gluster` on node 1.
3. Run `service glusterd start` on node 1.
4. `gluster volume status` on node 1 shows its brick as still offline, with an unrelated PID, but the brick _is_ online as verified with the `ps` command (see the illustrative check under Additional info below).

[root@tuxmv3 ~]# gluster v status
Status of volume: repvol
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.194:/brick/brick1                        N/A     N       23927
Brick 10.70.42.203:/brick/brick1                        49152   Y       24016
NFS Server on localhost                                 2049    Y       23918
Self-heal Daemon on localhost                           N/A     Y       23922
NFS Server on 10.70.42.203                              2049    Y       24030
Self-heal Daemon on 10.70.42.203                        N/A     Y       24034

Task Status of Volume repvol
------------------------------------------------------------------------------
There are no active volume tasks

[root@tuxmv3 ~]# ps aux|grep gluster
root     23784  0.0  0.7 396652 16264 ?        Ssl  06:57   0:00 /usr/local/sbin/glusterd --pid-file=/var/run/glusterd.pid
root     23911  0.0  0.9 507960 20476 ?        Ssl  06:57   0:00 /usr/local/sbin/glusterfsd -s 10.70.42.194 --volfile-id repvol.10.70.42.194.brick-brick1 -p /var/lib/glusterd/vols/repvol/run/10.70.42.194-brick-brick1.pid -S /var/run/8a99d8b0a161ffd90effca760c4fb4d3.socket --brick-name /brick/brick1 -l /usr/local/var/log/glusterfs/bricks/brick-brick1.log --xlator-option *-posix.glusterd-uuid=d3a9c269-c19c-4a6b-adde-3645711b3d37 --brick-port 49152 --xlator-option repvol-server.listen-port=49152
root     23918  0.0  3.6 330168 74492 ?        Ssl  06:57   0:00 /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /usr/local/var/log/glusterfs/nfs.log -S /var/run/1c28951fd59c7b60bb283e440cd9890d.socket
root     23922  0.0  1.1 326596 22832 ?        Ssl  06:57   0:00 /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /usr/local/var/log/glusterfs/glustershd.log -S /var/run/01ae7ed01b52557e860cc9c51b9adee2.socket --xlator-option *replicate*.node-uuid=d3a9c269-c19c-4a6b-adde-3645711b3d37
root     24047  0.0  0.0 103248   916 pts/1    S+   07:07   0:00 grep --color=auto gluster
[root@tuxmv3 ~]#
Actual results:
`gluster volume status` reports the brick as offline (Online = N, Port = N/A) with a stale PID, even though the brick process is running.

Expected results:
The correct PID and online status of the brick process.

Additional info:
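For illustration only (this is not glusterd source): the Online column is derived from whether a live process still holds a lock on the brick's pidfile, so a minimal C sketch of such a check, using the pidfile path from the ps output above, looks roughly like this:

/*
 * Illustrative sketch only, not glusterd code: roughly how one can tell
 * whether any live process still holds a lock on a brick pidfile, which is
 * the information the Online column is derived from.  The pidfile path is
 * taken from the ps output above.
 */
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
        const char *pidfile =
            "/var/lib/glusterd/vols/repvol/run/10.70.42.194-brick-brick1.pid";
        FILE *fp = fopen(pidfile, "r");
        if (!fp) {
                perror("fopen");
                return 1;
        }

        int pid = 0;
        if (fscanf(fp, "%d", &pid) != 1)
                fprintf(stderr, "pidfile is empty or malformed\n");

        /* Ask the kernel whether a lock on the file would conflict with an
         * existing one; if not, no brick process is holding the pidfile. */
        struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        if (fcntl(fileno(fp), F_GETLK, &lk) == -1) {
                perror("fcntl");
                fclose(fp);
                return 1;
        }

        /* In the buggy state, the pidfile contains a stale PID (23927 here)
         * and nobody holds the lock, so the brick is reported as offline even
         * though glusterfsd (pid 23911) is running. */
        printf("pidfile PID: %d, lock held: %s\n",
               pid, lk.l_type == F_UNLCK ? "no (reported offline)" : "yes");
        fclose(fp);
        return 0;
}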

Comment 2 Anand Avati 2014-01-24 17:50:27 UTC
REVIEW: http://review.gluster.org/6786 (glusterd: Fix race in pid file update) posted (#1) for review on master by Ravishankar N (ravishankar)

Comment 3 Anand Avati 2014-02-03 13:45:13 UTC
COMMIT: http://review.gluster.org/6786 committed in master by Vijay Bellur (vbellur) 
------
commit 8cf2a36dad6c8bac7cd3a05455fd555544ebb457
Author: Ravishankar N <ravishankar>
Date:   Fri Jan 24 17:27:38 2014 +0530

    glusterd: Fix race in pid file update
    
    This patch only removes lines of code. For personal gratification, giving a
    detailed explanation of what the problem was.
    
    When glusterd spawns the local brick process, say when a reboot of the node
    occurs, glusterd_brick_start() and subsequently the
    glusterd_volume_start_glusterfs() function get called twice; from
    glusterd_spawn_daemons() and glusterd_do_volume_quorum_action() respectively.
    This causes a race, best described by pseudo-code of the current behaviour.
    
    glusterd_volume_start_glusterfs()
    {
           if(!brick process running) {
    
             step-a) reap pid file( i.e. unlink it)
             step-b) fork a brick process which creates and locks pid file and
                     binds the process to a socket.
    
            }
    
    }
    
    Time            Event
    ----            -----
    T1              Call-1 arrives, completes step-a, starts step-b
    T2              Call-2 arrives, enters step-a as Call-1's forked child is not
                    yet running.
    T3              Call-1's forked child is alive, creates the pidfile and locks it, binds
                    its address to a socket.
    T4              Call-2 performs step-a, i.e. unlinks the pid file created by Call-1 !!
                    (files can still be unlinked despite a lockf lock on them)
    T5              Call-2 does step-b, and the forked child process creates a *new* pid file
                    with its pid and locks this file.
    T6              But Call-2's brick process is not able to bind to the socket as it
                    is already in use (courtesy T3) and hence exits (so no locks anymore on the pidfile).
    
    Result:
    - Pid file now contains the PID of an extinct brick process.
    - `gluster volume status` shows this PID value. It also notices that there is no
    lock held on the pid file by the currently running brick process (created by Call-1)
    and hence shows N/A for the online status.
    
    Also, as a result of the events at T4, "ls -l /proc/<brick process PID>/fd" shows
    the open pidfile as (deleted).
    
    Fix:
    1. Do not unlink the pid file, i.e. avoid step-a. Now at T5, Call-2's brick process
    cannot obtain the lock on the pid file (Call-1's process still holds it) and exits.
    
    2. Unrelated, but remove the lock-unlock sequence in glusterfs_pidfile_setup(),
    which does not seem to be doing anything useful.
    
    Change-Id: I18d3e577bb56f68d85d3e0d0bfd268e50ed4f038
    BUG: 1035586
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/6786
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
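
For readers unfamiliar with the lockf-based pidfile pattern the fix relies on, here is a minimal, self-contained C sketch (illustrative only; the file path and function name are hypothetical, not the actual glusterd/glusterfsd code):

/*
 * Illustrative sketch only, not the actual glusterd/glusterfsd code.
 * The pidfile is never unlinked; ownership is decided purely by lockf(),
 * so a second racing process fails to acquire the lock and exits.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int acquire_pidfile(const char *path)
{
        int fd = open(path, O_RDWR | O_CREAT, 0644);  /* note: no unlink first */
        if (fd == -1)
                return -1;

        if (lockf(fd, F_TLOCK, 0) == -1) {
                /* Another brick process already owns the pidfile: give up. */
                close(fd);
                return -1;
        }

        /* We own the lock: record our PID in the (possibly pre-existing) file. */
        if (ftruncate(fd, 0) == -1 || dprintf(fd, "%d\n", getpid()) < 0) {
                close(fd);
                return -1;
        }
        /* Keep fd open so the lock is held for the lifetime of the process. */
        return fd;
}

int main(void)
{
        /* Hypothetical path, for illustration only. */
        if (acquire_pidfile("/tmp/example-brick.pid") == -1) {
                fprintf(stderr, "pidfile already locked; exiting\n");
                return EXIT_FAILURE;
        }
        pause();  /* stand-in for the brick process's main loop */
        return 0;
}

With no unlink in the path, the second racing spawn simply fails at lockf() and exits, so the pidfile keeps the first brick's PID and remains locked by it.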

Comment 4 Niels de Vos 2014-09-22 12:33:00 UTC
A beta release for GlusterFS 3.6.0 has been made available [1]. Please verify whether this release solves the bug for you. In case the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 5 Niels de Vos 2014-11-11 08:24:54 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

