Bug 1069040
| Summary: | Procedure for replacing a completely failed peer with one of the same hostname and IP does not work | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Brad Hubbard <bhubbard> |
| Component: | glusterfs | Assignee: | Raghavendra G <rgowdapp> |
| Status: | CLOSED NOTABUG | QA Contact: | Sudhir D <sdharane> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 2.1 | CC: | abelur, asrivast, hamiller, nlevinki, nsathyan, pkarampu, ravishankar, sasundar, sauchter, spandura, vbellur, vumrao |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-04-01 02:23:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1073815 | | |
| Attachments: | | | |
Description
Brad Hubbard
2014-02-24 02:43:50 UTC
Created attachment 866848 [details]
sosreport2
Created attachment 866850 [details]
sosreport3
This works for me. From any of the two healthy peers:

# gluster peer detach <IP_of_the_peer_that_went_down> force
# gluster peer probe <new_peer_with_same_IP>

Then check whether the brick and self-heal daemon processes on the new peer are alive using:

# gluster volume status <vol_name>

If not, run:

# gluster volume start <vol_name> force

Now we can run "gluster volume heal <vol_name> full" to trigger the heal to the bricks of the new peer.

Also, the "store.c:1957:glusterd_store_retrieve_volume] 0-: Unknown key: brick-X" messages are just spurious messages and not really errors. A patch that fixes them has just been merged upstream: http://review.gluster.org/#/c/7314/

I am not sure this addresses the replacing of the bricks. Let's try that one as well before confirming the steps. Pranith

Tried out the steps given in comment #6 for a couple of iterations and verified that they work fine. It was observed that if "gluster volume heal <vol_name> full" was issued before the connections between processes were established (after "gluster volume start force"), self-heal did not happen. In such a case, just wait for a couple of minutes and run the command again. This will trigger the heal.

That is considerably different from the community procedure mentioned and not really intuitive for our customers. Have we documented it anywhere?

Here is the possible cause for the issue:
==========================================
Steps used to re-create
~~~~~~~~~~~~~~~~~~~~~~~~
1. Set up a 2-node cluster (node1 {king} and node2 {hicks}).
2. Create a 1 x 2 replicate volume 'vol_rep'. Start the volume. Create a FUSE mount and add files/dirs to it.
3. Bring node2 offline (crash/re-provision).
[root@king vol_rep]# gluster peer status
Number of Peers: 1
Hostname: hicks
Uuid: 093ebaa2-3dc2-4317-a2b5-3461ff08b0e7
State: Peer in Cluster (Disconnected)
Note: when node2 comes back online, it has the same hostname/IP but a different glusterd UUID.
Also, by default glusterd is started automatically when the node is rebooted. In this case, when node2 comes online, node1 sees node2 and changes its state from "Disconnected" to "Connected".
[root@king vol_rep]# gluster peer status
Number of Peers: 1
Hostname: hicks
Uuid: 093ebaa2-3dc2-4317-a2b5-3461ff08b0e7
State: Peer in Cluster (Connected)
Even though node2 had a different UUID, node1 established the connection to node2.
4. Stop glusterd on node2.
[root@hicks ~]# cat /var/lib/glusterd/glusterd.info
UUID=32e87425-8309-45bc-9d91-2cbb3e431326
operating-version=2
[root@hicks ~]#
[root@hicks ~]# service glusterd stop
Stopping glusterd: [ OK ]
[root@hicks ~]#
5. On node2, edit /var/lib/glusterd/glusterd.info and change the glusterd UUID back to the old value: "093ebaa2-3dc2-4317-a2b5-3461ff08b0e7".
6. Create the bricks on node2 and set the "volume-id" extended attribute on them.
[root@hicks ~]# mkdir /rhs/bricks/b2
[root@hicks ~]# setfattr -n trusted.glusterfs.volume-id -v "0x1d105460d1e9478f97d0d44fa2068114" /rhs/bricks/b2/
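The 0x value passed to setfattr is simply the volume's UUID with the dashes removed. A minimal sketch of deriving it, using the vol_rep Volume ID from this report:

```shell
#!/bin/sh
# Derive the trusted.glusterfs.volume-id value from a volume UUID:
# strip the dashes and prefix "0x". The UUID below is the vol_rep
# Volume ID shown later in `gluster v info` in this report.
VOLUME_ID="1d105460-d1e9-478f-97d0-d44fa2068114"
XATTR_VALUE="0x$(printf '%s' "$VOLUME_ID" | tr -d '-')"
echo "$XATTR_VALUE"   # prints 0x1d105460d1e9478f97d0d44fa2068114
# Then, on the rebuilt node:
#   setfattr -n trusted.glusterfs.volume-id -v "$XATTR_VALUE" /rhs/bricks/b2/
```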
7. Restart glusterd on node2.
8. As soon as glusterd on node2 is restarted, node1 puts node2's glusterd in the "Peer Rejected" state.
[root@king vol_rep]# gluster peer status
Number of Peers: 1
Hostname: hicks
Uuid: 093ebaa2-3dc2-4317-a2b5-3461ff08b0e7
State: Peer Rejected (Disconnected)
9. From node2, peer probe node1.
[root@hicks ~]# service glusterd start
Starting glusterd: [ OK ]
[root@hicks ~]# gluster peer status
Number of Peers: 0
[root@hicks ~]#
[root@hicks ~]# gluster v info
No volumes present
[root@hicks ~]# gluster peer probe king
peer probe: success.
[root@hicks ~]# gluster peer status
Number of Peers: 1
Hostname: king
Uuid: 62c5a2a0-c058-47c1-af1f-bac54a33d2d8
State: Accepted peer request (Connected)
[root@king vol_rep]# gluster peer status
Number of Peers: 1
Hostname: hicks
Uuid: 093ebaa2-3dc2-4317-a2b5-3461ff08b0e7
State: Accepted peer request (Connected)
NOTE: On node1, node2 is in the "Accepted peer request" state, and on node2, node1 is in the "Accepted peer request" state. For proper functionality, the peers must be in the "Peer in Cluster" state.
10. Restart glusterd on both nodes (node1 and node2).
NODE1:
======
[root@king vol_rep]# service glusterd restart
Starting glusterd: [ OK ]
[root@king vol_rep]#
[root@king vol_rep]# gluster peer status
Number of Peers: 1
Hostname: hicks
Uuid: 093ebaa2-3dc2-4317-a2b5-3461ff08b0e7
State: Sent and Received peer request (Connected)
[root@king vol_rep]#
[root@king vol_rep]#
NODE2:
======
[root@hicks ~]# service glusterd restart
Starting glusterd: [ OK ]
[root@hicks ~]#
[root@hicks ~]# gluster peer status
Number of Peers: 1
Hostname: king
Uuid: 62c5a2a0-c058-47c1-af1f-bac54a33d2d8
State: Sent and Received peer request (Connected)
NOTE: At this point, even after glusterd is restarted, node1's NFS and glustershd processes are not started.
[root@king vol_rep]# gluster v status
Status of volume: vol_rep
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick king:/rhs/bricks/b1 49152 Y 3810
NFS Server on localhost N/A N N/A
Self-heal Daemon on localhost N/A N N/A
Task Status of Volume vol_rep
------------------------------------------------------------------------------
There are no active volume tasks
11. From node2, sync the volume information from node1.
[root@hicks ~]# gluster v info
No volumes present
[root@hicks ~]#
[root@hicks ~]#
[root@hicks ~]# gluster v status
No volumes present
[root@hicks ~]#
[root@hicks ~]#
[root@hicks ~]# gluster volume sync king all
Sync volume may make data inaccessible while the sync is in progress. Do you want to continue? (y/n) y
volume sync: success
[root@hicks ~]#
[root@hicks ~]# gluster v info
Volume Name: vol_rep
Type: Replicate
Volume ID: 1d105460-d1e9-478f-97d0-d44fa2068114
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: king:/rhs/bricks/b1
Brick2: hicks:/rhs/bricks/b2
[root@hicks ~]# gluster v status
Status of volume: vol_rep
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick hicks:/rhs/bricks/b2 49152 Y 3391
NFS Server on localhost N/A N N/A
Self-heal Daemon on localhost N/A N N/A
NOTE: Even after the volume sync, the NFS and glustershd processes on node1 and node2 are not started.
glusterd on both nodes is still in the "Sent and Received peer request (Connected)" state; the expected state is "Peer in Cluster (Connected)".
For the "Unknown key:" errors, refer to these bugs:
==================================================
https://bugzilla.redhat.com/show_bug.cgi?id=1036551
https://bugzilla.redhat.com/show_bug.cgi?id=1056910
For recovering a failed peer as mentioned in comment 6
=============================================================
For any node that crashed / was re-provisioned and went offline:
1. Peer detach the crashed node
gluster peer detach <hostname_of_node_that_crashed> force
2. Note down the volume-id of the volume
VOLUME_ID=$(gluster v info | grep "Volume ID" | cut -d ":" -f 2 | tr -d ' ')
3. When node that had previously crashed comes online execute the following on that node.
a. create the brick directories
mkdir <brick_directories>
b. Set the extended attribute "trusted.glusterfs.volume-id" on the brick directories (the value is the volume ID with the dashes removed, prefixed with "0x"):
setfattr -n "trusted.glusterfs.volume-id" -v "0x$(echo "$VOLUME_ID" | tr -d '-')" <brick_directories>
4. From any of the storage nodes, peer probe the previously crashed node.
gluster peer probe <hostname_of_node_that_crashed>
5. Trigger the heal: "gluster volume heal <volume_name> full"
Validation
===========
1. Check the peer status. The peer should be in the "Peer in Cluster (Connected)" state.
2. Check the volume status. All the brick, NFS, and glustershd processes should be started.
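The recovery steps above can be sketched as shell, grouped into one function for readability even though, as the comments note, the individual commands run on different nodes. This is an illustrative sketch, not a supported tool; volume, peer, and brick names are placeholders.

```shell
#!/bin/sh
# Sketch of the recovery steps above. The commands run on different
# nodes (see the comments); they are grouped here only for readability.

# Parse the Volume ID out of `gluster volume info` text on stdin.
# (Avoids the `tr -d ' \s'` pitfall: \s is not a tr escape.)
volume_id() { awk -F': *' '/^Volume ID/ { print $2 }'; }

recover_crashed_peer() {
    vol=$1; peer=$2; brick=$3
    # 1. On a healthy node: forget the crashed glusterd instance.
    gluster peer detach "$peer" force
    # 2. Note down the volume-id before rebuilding the node.
    volid=$(gluster volume info "$vol" | volume_id)
    # 3. On the rebuilt node: recreate the brick directory and
    #    restore its volume-id extended attribute.
    mkdir -p "$brick"
    setfattr -n trusted.glusterfs.volume-id \
             -v "0x$(printf '%s' "$volid" | tr -d '-')" "$brick"
    # 4. From any storage node: probe the rebuilt peer.
    gluster peer probe "$peer"
    # 5. Once peers show "Peer in Cluster (Connected)", trigger the heal.
    gluster volume heal "$vol" full
}
```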
The procedure outlined in comment 6 seems to work fine. I tested it twice on a three-node pool and it worked both times, no problems. I'll get this documented in KCS in the next day or so, but this needs to be included in the documentation.

I've created the following KCS documents for the two scenarios:

- How can I replace a completely failed peer with a machine with a different hostname/IP in Red Hat Storage 2 Update 1
  https://access.redhat.com/site/solutions/720413
- How can I replace a completely failed peer with a machine with the same hostname/IP in Red Hat Storage 2 Update 1
  https://access.redhat.com/site/solutions/773533

I will update the documentation bug. Thanks for your efforts.

Closing as NOTABUG.

Dev ack to 3.0 RHS BZs
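The comment-6 procedure, including the wait-before-heal caveat noted earlier, can be sketched as below. The `all_online` helper and the polling interval are illustrative additions, not part of the documented steps; the volume and peer arguments are placeholders.

```shell
#!/bin/sh
# Sketch of the same-hostname/IP replacement steps from comment 6,
# to be run from a healthy peer.

# True (exit 0) when every Brick/NFS/Self-heal row of
# `gluster volume status` output on stdin shows Y in the Online column.
all_online() {
    awk '/^(Brick|NFS|Self-heal)/ && $(NF-1) != "Y" { bad=1 } END { exit bad }'
}

replace_failed_peer() {
    vol=$1; peer=$2
    gluster peer detach "$peer" force    # forget the dead instance
    gluster peer probe "$peer"           # re-add the rebuilt node (same IP)
    # Restart brick/self-heal processes if they did not come up.
    gluster volume status "$vol" | all_online ||
        gluster volume start "$vol" force
    # Heal can silently no-op until connections are established, so
    # wait (as comment 6 advises, a couple of minutes may be needed).
    until gluster volume status "$vol" | all_online; do sleep 10; done
    gluster volume heal "$vol" full
}
```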