Bug 988286 - restarting glusterd doesn't start the brick, nfs and self-heal daemon process
Status: CLOSED WORKSFORME
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assigned To: Bug Updates Notification Mailing List
QA Contact: Sudhir D
Reported: 2013-07-25 05:07 EDT by spandura
Modified: 2014-01-17 06:50 EST

Doc Type: Bug Fix
Last Closed: 2014-01-17 06:50:46 EST
Type: Bug

Attachments
SOS Reports (20.94 KB, application/x-gzip)
2013-07-25 05:32 EDT, spandura

Description spandura 2013-07-25 05:07:06 EDT
Description of problem:
========================
In a 1 x 2 replicate volume (storage_node1 and storage_node2), all brick, NFS, self-heal daemon, and glusterd processes were killed on both nodes (killall glusterfs glusterfsd glusterd). glusterd was then started (service glusterd start) on one of the nodes. glusterd fails to start the brick, NFS, and self-heal daemon processes on the node where it was restarted.

Version-Release number of selected component (if applicable):
============================================================
root@rhs-client11 [Jul-25-2013-14:30:55] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.12rhs.beta6-1.el6rhs.x86_64

root@rhs-client11 [Jul-25-2013-14:30:58] >gluster --version
glusterfs 3.4.0.12rhs.beta6 built on Jul 23 2013 16:20:03

How reproducible:
=================
Often

Steps to Reproduce:
=====================
1. Create a 1 x 2 replicate volume and start it.

2. Create an NFS mount and a FUSE mount. Create files and directories from both mounts.

3. Run "killall glusterfs glusterfsd glusterd" on both storage nodes.

4. On one of the storage nodes, start glusterd (service glusterd start).
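
Steps 3 and 4 can be sketched for a single storage node as follows. This is only an illustration, not the exact script used for this report: the DRY_RUN variable and the run and repro_on_node helpers are hypothetical names, and by default the script only prints the commands instead of executing them. Step 3 still has to be repeated on the second node.

```shell
#!/bin/sh
# Sketch of steps 3 and 4 on one storage node.
# DRY_RUN is a hypothetical switch: "1" (the default) prints each command,
# any other value executes it.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_on_node() {
    # Step 3: kill the NFS and self-heal daemons (glusterfs),
    # the bricks (glusterfsd), and the management daemon (glusterd).
    run killall glusterfs glusterfsd glusterd
    # Step 4: start only glusterd.
    run service glusterd start
    # Check whether the brick, NFS, and self-heal processes came back.
    run gluster volume status
}

repro_on_node
```

Running it with DRY_RUN=0 as root performs the steps for real, so it should only be used on a test node.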

Actual results:
================
1. The brick, NFS, and self-heal daemon processes are not started.

2. Even after glusterd is restarted multiple times, none of these processes start.

root@rhs-client11 [Jul-25-2013-13:55:18] >killall glusterfs glusterfsd glusterd
root@rhs-client11 [Jul-25-2013-13:55:31] >
root@rhs-client11 [Jul-25-2013-13:55:36] >service glusterd start
Starting glusterd:                                         [  OK  ]
root@rhs-client11 [Jul-25-2013-13:55:44] >
root@rhs-client11 [Jul-25-2013-13:55:44] >
root@rhs-client11 [Jul-25-2013-13:55:44] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	N/A
 
There are no active volume tasks
root@rhs-client11 [Jul-25-2013-13:55:47] >
root@rhs-client11 [Jul-25-2013-13:55:48] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	N/A
 
There are no active volume tasks
root@rhs-client11 [Jul-25-2013-13:55:49] >
root@rhs-client11 [Jul-25-2013-13:55:53] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	N/A
 
There are no active volume tasks
root@rhs-client11 [Jul-25-2013-13:55:54] >
root@rhs-client11 [Jul-25-2013-13:55:59] >ps -ef | grep gluster
root     23631 23619  0 11:49 pts/2    00:00:00 tail -f /var/log/glusterfs/glustershd.log
root     24906     1  0 13:55 ?        00:00:00 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
root     25058 22781  0 13:56 pts/0    00:00:00 grep gluster

root@rhs-client11 [Jul-25-2013-14:01:14] >service glusterd status
glusterd (pid  24906) is running...
root@rhs-client11 [Jul-25-2013-14:01:22] >
root@rhs-client11 [Jul-25-2013-14:01:22] >service glusterd restart
Starting glusterd:                                         [  OK  ]
root@rhs-client11 [Jul-25-2013-14:01:28] >
root@rhs-client11 [Jul-25-2013-14:01:29] >service glusterd status
glusterd (pid  25231) is running...
root@rhs-client11 [Jul-25-2013-14:01:31] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	

Expected results:
================
glusterd should restart the brick, NFS, and self-heal daemon processes.
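
One way to check this from a script, instead of reading the status table by eye, is to filter the output of "gluster v status" for rows whose Online column is "N". The sketch below assumes the column layout shown in this report (process name, then Port, Online, Pid as the last three whitespace-separated fields); offline_processes is a hypothetical helper name, not part of the gluster CLI.

```shell
#!/bin/sh
# offline_processes: read `gluster v status` output on stdin and print the
# name of every brick, NFS server, or self-heal daemon row that is offline.
# Assumption: the last three fields of each row are Port, Online, and Pid.
offline_processes() {
    awk '
        /^(Brick|NFS Server|Self-heal Daemon) / && NF >= 4 && $(NF - 1) == "N" {
            # Rebuild the process name from everything before the last 3 fields.
            name = $1
            for (i = 2; i <= NF - 3; i++) name = name " " $i
            print name
        }'
}
```

Usage: "gluster v status vol_rep | offline_processes". With the expected behaviour the command prints nothing on the restarted node; with the behaviour reported here it prints the brick, NFS server, and self-heal daemon rows.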

Additional info:
=================
root@rhs-client11 [Jul-25-2013-14:35:00] >gluster v info
 
Volume Name: vol_rep
Type: Replicate
Volume ID: f7928cb5-76bf-4a9f-93b2-a4ce3073519b
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/brick1/b0
Brick2: rhs-client12:/rhs/brick1/b1
Comment 2 spandura 2013-07-25 05:32:11 EDT
Created attachment 778159
SOS Reports
Comment 3 Pranith Kumar K 2014-01-03 08:22:36 EST
Tried to reproduce the issue. It works fine.

root@pranithk-vm3 - ~/RPMS 
18:22:56 :) ⚡ killall glusterfs glusterd glusterfsd

root@pranithk-vm3 - ~/RPMS 
18:23:03 :) ⚡ ps aux | grep gluster
root     16432  0.0  0.0 103244   832 pts/0    S+   18:23   0:00 grep gluster

root@pranithk-vm3 - ~/RPMS 
18:23:06 :) ⚡ gluster volume start r2
Connection failed. Please check if gluster daemon is operational.

root@pranithk-vm3 - ~/RPMS 
18:23:12 :( ⚡ service glusterd start
Starting glusterd:                                         [  OK  ]

root@pranithk-vm3 - ~/RPMS 
18:23:24 :) ⚡ ps aux | grep gluster
root     16479  3.1  0.8 360520 16456 ?        Ssl  18:23   0:00 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
root     16778  0.6  0.9 642520 19700 ?        Ssl  18:23   0:00 /usr/sbin/glusterfsd -s 10.70.43.148 --volfile-id r2.10.70.43.148.brick-2 -p /var/lib/glusterd/vols/r2/run/10.70.43.148-brick-2.pid -S /var/run/d3d3
root     16790  2.6  3.0 389888 62400 ?        Ssl  18:23   0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/55e1ee0bebt
root     16795  1.3  0.9 329196 20504 ?        Ssl  18:23   0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustersh1
root     16814  0.0  0.0 103248   840 pts/0    S+   18:23   0:00 grep gluster

root@pranithk-vm3 - ~/RPMS 
18:23:27 :) ⚡ rpm -qa | grep gluster
glusterfs-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-libs-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-api-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-api-devel-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-devel-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.53rhs-1.el6rhs.x86_64
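
The ps listing above can also be checked mechanically. The sketch below labels each Gluster daemon it finds based on the command lines shown in that listing (glusterd, glusterfsd for bricks, and glusterfs with volfile-id gluster/nfs or gluster/glustershd). classify_gluster_procs is a hypothetical helper name, and the patterns assume the /usr/sbin install paths seen in this comment.

```shell
#!/bin/sh
# classify_gluster_procs: read `ps` output on stdin and print one label per
# Gluster daemon found: glusterd, brick, nfs, or self-heal.
# Assumption: binaries live under /usr/sbin, as in the listing in this comment.
classify_gluster_procs() {
    awk '
        /\/usr\/sbin\/glusterd( |$)/                                        { print "glusterd" }
        /\/usr\/sbin\/glusterfsd /                                          { print "brick" }
        /\/usr\/sbin\/glusterfs .*--volfile-id gluster\/nfs( |$)/           { print "nfs" }
        /\/usr\/sbin\/glusterfs .*--volfile-id gluster\/glustershd( |$)/    { print "self-heal" }
    '
}
```

Usage: "ps aux | classify_gluster_procs". After a successful restart all four labels should appear (with one "brick" line per local brick); in the failing case from the description only "glusterd" would be printed.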
Comment 4 Vivek Agarwal 2014-01-17 06:50:46 EST
Closing this bug based on comment 3.
