Bug 963169 - glusterd : 'gluster volume start <volname> force' is unable to start brick process and it fails but 'gluster volume stop' followed by 'gluster volume start' starts that brick process [NEEDINFO]
Status: CLOSED NOTABUG
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.1
Hardware: x86_64 Linux
Priority: medium  Severity: medium
Assigned To: krishnan parthasarathi
QA Contact: Rajesh Madaka
Depends On:
Blocks:
Reported: 2013-05-15 06:12 EDT by Rachana Patel
Modified: 2018-01-24 23:44 EST (History)
7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-08-26 02:35:48 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
sanandpa: needinfo? (rmadaka)


Attachments: None
Description Rachana Patel 2013-05-15 06:12:45 EDT
Description of problem:
glusterd: 'gluster volume start <volname> force' is unable to start the brick process and fails, but 'gluster volume stop' followed by 'gluster volume start' does start that brick process.

Version-Release number of selected component (if applicable):
3.4.0.8rhs-1.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:
1. Had a cluster of 4 servers and a volume (DHT/Distribute) with 3 bricks.

[root@fred ~]# gluster v info sanity
 
Volume Name: sanity
Type: Distribute
Volume ID: f72df54d-410c-4f34-b181-65d8bd0cdcc4
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: fan.lab.eng.blr.redhat.com:/rhs/brick1/sanity
Brick2: mia.lab.eng.blr.redhat.com:/rhs/brick1/sanity
Brick3: fred.lab.eng.blr.redhat.com:/rhs/brick1/sanity

2. Detach a server and probe it again:
[root@mia ~]# gluster peer detach fred.lab.eng.blr.redhat.com
peer detach: failed: Brick(s) with the peer fred.lab.eng.blr.redhat.com exist in cluster
[root@mia ~]# gluster peer detach fred.lab.eng.blr.redhat.com force
peer detach: success
[root@mia ~]# gluster peer probe fred.lab.eng.blr.redhat.com 
peer probe: success

- Also detach server fan and add it back:

[root@mia ~]# gluster peer status
Number of Peers: 4

Hostname: mia.lab.eng.blr.redhat.com
Uuid: 1698dc55-2245-4b20-9b8c-60fbe77a06ff
State: Peer in Cluster (Connected)

Hostname: fan.lab.eng.blr.redhat.com
Uuid: c6dfd028-d46f-4d20-a9c6-17c04e7fb919
State: Peer in Cluster (Connected)

Hostname: cutlass.lab.eng.blr.redhat.com
Uuid: 8969af20-77e0-41a5-bb8e-500d1a238f1b
State: Peer in Cluster (Connected)

Hostname: fred.lab.eng.blr.redhat.com
Port: 24007
Uuid: ababf76c-a741-4e27-a6bb-93da035d8fd7
State: Peer in Cluster (Connected)

3. Check gluster volume status:

[root@fred ~]# gluster v status

Status of volume: sanity
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/sanity	N/A	N	4380
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/sanity	49154	Y	1623
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/sanity	N/A	N	N/A
NFS Server on localhost					2049	Y	4411
NFS Server on 8969af20-77e0-41a5-bb8e-500d1a238f1b	2049	Y	3549
NFS Server on 1698dc55-2245-4b20-9b8c-60fbe77a06ff	2049	Y	1632
NFS Server on c6dfd028-d46f-4d20-a9c6-17c04e7fb919	2049	Y	4386
 
There are no active volume tasks
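The offline bricks can be picked out of that status output mechanically. A minimal sketch, using sample rows copied from the output above instead of invoking gluster itself:

```shell
# List bricks whose Online column is N in 'gluster volume status' output.
# The sample rows below are taken from the status output shown above.
status='Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/sanity N/A N 4380
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/sanity 49154 Y 1623
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/sanity N/A N N/A'

# Column 4 is the Online flag; column 2 names the brick.
offline=$(printf '%s\n' "$status" | awk '$1 == "Brick" && $4 == "N" { print $2 }')
printf '%s\n' "$offline"
```

On a live system the same awk filter can be fed from `gluster volume status <volname>` directly.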

Verify whether glusterfsd is running on that server; it is running:

[root@fan ~]# ps -aef | grep glusterfsd
root      1605     1  0 May14 ?        00:00:00 /usr/sbin/glusterfsd -s fan.lab.eng.blr.redhat.com --volfile-id sanity.fan.lab.eng.blr.redhat.com.rhs-brick1-sanity -p /var/lib/glusterd/vols/sanity/run/fan.lab.eng.blr.redhat.com-rhs-brick1-sanity.pid -S /var/run/5013ac74e2050c547e6087ce611cbe45.socket --brick-name /rhs/brick1/sanity -l /var/log/glusterfs/bricks/rhs-brick1-sanity.log --xlator-option *-posix.glusterd-uuid=c6dfd028-d46f-4d20-a9c6-17c04e7fb919 --brick-port 49154 --xlator-option sanity-server.listen-port=49154
root      1616     1  0 May14 ?        00:00:00 /usr/sbin/glusterfsd -s fan.lab.eng.blr.redhat.com --volfile-id t1.fan.lab.eng.blr.redhat.com.rhs-brick1-t1 -p /var/lib/glusterd/vols/t1/run/fan.lab.eng.blr.redhat.com-rhs-brick1-t1.pid -S /var/run/d221c6eaad62743f6a0336c357372761.socket --brick-name /rhs/brick1/t1 -l /var/log/glusterfs/bricks/rhs-brick1-t1.log --xlator-option *-posix.glusterd-uuid=c6dfd028-d46f-4d20-a9c6-17c04e7fb919 --brick-port 49155 --xlator-option t1-server.listen-port=49155
root      4464  3106  0 01:51 pts/0    00:00:00 grep glusterfsd
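The brick path and listen port can be recovered from such a glusterfsd command line. A sketch assuming only the `--brick-name` and `--brick-port` options visible in the ps output above; `cmdline` is a shortened copy of the first entry:

```shell
# Extract the brick path and port from a glusterfsd command line.
# Shortened copy of the first ps entry above, for illustration.
cmdline='/usr/sbin/glusterfsd -s fan.lab.eng.blr.redhat.com --brick-name /rhs/brick1/sanity --brick-port 49154'

brick=$(printf '%s\n' "$cmdline" | sed -n 's/.*--brick-name \([^ ]*\).*/\1/p')
port=$(printf '%s\n' "$cmdline" | sed -n 's/.*--brick-port \([0-9]*\).*/\1/p')
echo "brick=$brick port=$port"
```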


4. Try to start the volume with force in order to bring the brick processes online. It always fails, saying that the commit failed:


[root@fan ~]# gluster volume start sanity force
volume start: sanity: failed: Commit failed on localhost. Please check the log file for more details.


5. Stop the volume and start it again. The brick processes are online now:

[root@fan ~]# gluster volume status sanity
Status of volume: sanity
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick fan.lab.eng.blr.redhat.com:/rhs/brick1/sanity	49157	Y	4978
Brick mia.lab.eng.blr.redhat.com:/rhs/brick1/sanity	49154	Y	4846
Brick fred.lab.eng.blr.redhat.com:/rhs/brick1/sanity	49159	Y	4831
NFS Server on localhost					2049	Y	4988
NFS Server on 1698dc55-2245-4b20-9b8c-60fbe77a06ff	2049	Y	4856
NFS Server on ababf76c-a741-4e27-a6bb-93da035d8fd7	2049	Y	4842
NFS Server on 8969af20-77e0-41a5-bb8e-500d1a238f1b	2049	Y	3986
 
There are no active volume tasks

Actual results:
'gluster volume start <volname> force' fails with "Commit failed on localhost" and the brick processes remain offline.

Expected results:
'start force' should bring the brick processes back and status should show them online.

Additional info:
Comment 3 Rachana Patel 2013-05-21 02:53:18 EDT
The log always says:

W [syncop.c:32:__run] 0-management: re-running already running task
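A quick way to confirm the same condition on another setup is to count that warning in the glusterd log. A minimal sketch; the `log` variable stands in for the usual glusterd log (e.g. /var/log/glusterfs/etc-glusterfs-glusterd.vol.log), and only the warning line quoted above is assumed (the timestamp is illustrative):

```shell
# Count occurrences of the syncop warning seen when 'start force' fails.
# Sample line stands in for the real glusterd log; timestamp is illustrative.
log='[2013-05-15 10:12:45.000000] W [syncop.c:32:__run] 0-management: re-running already running task'

hits=$(printf '%s\n' "$log" | grep -c 're-running already running task')
echo "warning seen $hits time(s)"
```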
Comment 4 Nagaprasad Sathyanarayana 2014-05-06 07:43:43 EDT
Dev ack to 3.0 RHS BZs
Comment 5 Atin Mukherjee 2014-05-08 08:33:55 EDT
The second step mentioned here, i.e. peer detach with force, would now fail, as a recent change in the peer detach functionality (http://review.gluster.org/5325) does not allow a peer to be detached (even with force) if it holds a brick.
Comment 7 Atin Mukherjee 2014-08-12 02:16:51 EDT
Hi Rachana,

Can you please try to reproduce this bug? I believe it is no longer valid, based on comment 5.

~Atin
Comment 8 Atin Mukherjee 2014-08-26 02:35:48 EDT
Closing this bug; please re-open if it gets reproduced.
Comment 9 Sweta Anandpara 2018-01-15 03:12:46 EST
Resetting the needinfo to the current glusterd QE for the question asked in comment 7.
