Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1004546

Summary: peer probe can deadlock in "Sent and Received peer request" for both servers after server build
Product: [Community] GlusterFS
Reporter: Todd Stansell <todd+rhbugs>
Component: glusterd
Assignee: Kaushal <kaushal>
Status: CLOSED INSUFFICIENT_DATA
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: mainline
CC: amukherj, banio, bugs, jbyers, todd+rhbugs
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-15 14:41:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (flags: none):
  * admin01 logs from failure after kickstart
  * admin02 logs from failure after kickstart
  * admin01 logs from after removing peer state and restarting glusterd
  * admin02 logs from after removing peer state and restarting glusterd
  * admin01 logs from success after kickstart
  * admin02 logs from success after kickstart

Description Todd Stansell 2013-09-04 21:53:13 UTC
Description of problem:

Occasionally, after rebuilding a node in a replica 2 cluster, the initial peer probe from the rebuilt node leaves both peers stuck in the "Sent and Received peer request" state; volume information is never exchanged and the rebuilt node never moves to the "Peer in Cluster" state.

The only way out of this I've found is to stop glusterd on both nodes, remove the state= parameter from the /var/lib/glusterd/peers/<uuid> file, and then start glusterd again.  After glusterd starts, the negotiation between the two peers restarts from the "Establishing Connection" state and things work as expected.
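Concretely, the manual recovery looks roughly like this (a sketch based on the steps above, not an official procedure; the argument is the UUID-named peer file, and the same steps are run on both nodes):

#!/bin/bash
# Manual recovery sketch for the stuck "Sent and Received peer request" state.
# Argument: the stuck peer's UUID (the file name under /var/lib/glusterd/peers/).
peerfile="/var/lib/glusterd/peers/$1"

service glusterd stop
# Drop the state= parameter so negotiation restarts from "Establishing Connection".
sed -i '/^state=/d' "$peerfile"
service glusterd start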

Version-Release number of selected component (if applicable):
3.4.0

How reproducible:
It seems to happen every time I change which host is being rebuilt, but not if I rebuild the same node again.  I'm not 100% sure of this pattern, but that is how it appears.

Steps to Reproduce:
1. begin with a replica 2 cluster
2. shut down services and kickstart one server
3. restore previous uuid in /var/lib/glusterd/glusterd.info
4. start glusterd
5. run: gluster peer probe $peer
6. restart glusterd
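For clarity, steps 3-6 on the rebuilt node look roughly like this as a shell sketch (it assumes the pre-rebuild UUID was saved off the node before kickstart and that glusterd.info stores it as a UUID= line; the peer hostname matches the init script further below):

#!/bin/bash
# Sketch of reproduction steps 3-6 on the rebuilt node.
# Argument: the UUID this node had before it was kickstarted.
old_uuid=$1
peer=admin01.mgmt    # the surviving node, as in the init script below

# Step 3: restore the previous uuid in /var/lib/glusterd/glusterd.info.
sed -i "s/^UUID=.*/UUID=$old_uuid/" /var/lib/glusterd/glusterd.info

# Steps 4-6: start glusterd, probe the peer, then restart glusterd.
service glusterd start
gluster peer probe "$peer"
service glusterd restart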

Actual results:

Peer status shows both peers in "Sent and Received peer request", as each seems to be waiting for an ACC from the other side.

Expected results:

The peer should end up in the "Peer in Cluster" state, with volume information exchanged and the bricks started.

Additional info:

In our situation, we've written kickstart scripts to automate the peer probe and the rejoining of the cluster.  During kickstart we preserve the uuid of the server (step 3) and then set up an init script that runs soon after glusterd starts on first boot.  The script generated in our test while rebuilding admin02.mgmt is as follows (showing our exact steps):

#!/bin/bash
# initialize glusterfs config
#

# Source function library.
. /etc/init.d/functions

me=admin02.mgmt
peer=admin01.mgmt
gluster peer probe $peer
# wait up to 5 seconds for the peer to reach "Peer in Cluster"
for i in 1 2 3 4 5; do
    echo -n "Checking for Peer in Cluster .. $i .. "
    out=`gluster peer status 2>/dev/null | grep State:`
    echo $out
    if echo $out | grep "Peer in Cluster" >/dev/null; then
        break
    fi
    sleep 1
done
# restart glusterd after we've attempted a probe
service glusterd restart

# wait up to 5 seconds for volume info to appear after the restart
for i in 1 2 3 4 5; do
    echo "Checking for volume info .. $i"
    out=`gluster volume info 2>/dev/null | grep -v "^No "`
    if [ -n "$out" ] ; then
        break
    fi
    sleep 1
done
#----------------------------------------------------

One of the failures we've observed showed the following on the console:

  Running /etc/rc3.d/S21glusterfs-init start
  peer probe: success
  Checking for Peer in Cluster .. 1 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 2 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 3 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 4 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 5 .. State: Accepted peer request (Connected)
  Stopping glusterd:[  OK  ]
  Starting glusterd:[  OK  ]
  Checking for volume info .. 1
  Checking for volume info .. 2
  Checking for volume info .. 3
  Checking for volume info .. 4
  Checking for volume info .. 5

After this, if we look at peer status, it shows both nodes in the "Sent and Received peer request" status.

One of the times this procedure works, we get the following output:

  Running /etc/rc3.d/S21glusterfs-init start
  peer probe: success
  Checking for Peer in Cluster .. 1 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 2 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 3 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 4 .. State: Accepted peer request (Connected)
  Checking for Peer in Cluster .. 5 .. State: Accepted peer request (Connected)
  Stopping glusterd:[  OK  ]
  Starting glusterd:[  OK  ]
  Checking for volume info .. 1
  Checking for volume info .. 2

And at this point, it joins the cluster and starts the bricks.

I will attach the etc-glusterfs-glusterd logs, in DEBUG mode, from both servers in three different situations to help show what's going on.

  * The logs with -194603 suffix are from the failed attempt above to kickstart admin02.
  * The logs with -200414 are after I shut down glusterd on both nodes and removed state= from the peer files, causing them to start over and join the cluster.
  * The logs with -215614 are a second full kickstart of admin02 where it succeeded as expected.

The only pattern I can find is that when I switch which node is getting kickstarted, it seems to fail every time.  If I continue to kickstart the same node, it seems to keep succeeding.

Todd

Comment 1 Todd Stansell 2013-09-04 21:54:21 UTC
Created attachment 793876 [details]
admin01 logs from failure after kickstart

Comment 2 Todd Stansell 2013-09-04 21:54:55 UTC
Created attachment 793877 [details]
admin02 logs from failure after kickstart

Comment 3 Todd Stansell 2013-09-04 21:55:37 UTC
Created attachment 793878 [details]
admin01 logs from after removing peer state and restarting glusterd

Comment 4 Todd Stansell 2013-09-04 21:56:00 UTC
Created attachment 793879 [details]
admin02 logs from after removing peer state and restarting glusterd

Comment 5 Todd Stansell 2013-09-04 21:56:27 UTC
Created attachment 793880 [details]
admin01 logs from success after kickstart

Comment 6 Todd Stansell 2013-09-04 21:56:52 UTC
Created attachment 793881 [details]
admin02 logs from success after kickstart

Comment 7 Banio Carpenter 2014-10-10 20:17:07 UTC
I can confirm that this is still not fixed, although for the workaround to work I had to change the line in the /var/lib/glusterd/<uuid> file from:
state=5
to:
state=3
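In script form, that edit looks roughly like this (a sketch, not a documented procedure; the peer file path follows the original report, and glusterd is stopped while editing, as in the original workaround):

#!/bin/bash
# Sketch of the state=5 -> state=3 edit; run on both nodes.
# Argument: the peer's UUID (the file name under /var/lib/glusterd/peers/).
peerfile="/var/lib/glusterd/peers/$1"

service glusterd stop
sed -i 's/^state=5$/state=3/' "$peerfile"
service glusterd start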

gluster version: 3.5.2

OS: CentOS

Please let me know if any further information is needed.

Comment 8 Atin Mukherjee 2017-01-30 06:11:29 UTC
(In reply to Todd Stansell from comment #0)
> Description of problem:
> 
> Occasionally after rebuilding a node in a replica 2 cluster, the initial
> peer probe from the rebuilt node will cause both peers to be in a "Sent and
> Received peer request" state, never exchanging volume information and
> letting the rebuilt node move to the "Peer in Cluster" state. 
> 
> The only way out of this I've found is to stop glusterd on both nodes,
> remove the state= parameter from the /var/lib/glusterd/peers/<uuid> file and
> then start glusterd up again.  After starting glusterd, the negotiation
> between the two start from the "Establishing Connection" state and things
> work as expected.
> 
> Version-Release number of selected component (if applicable):
> 3.4.0
> 
> How reproducible:
> It seems to happen every time I change which host is being rebuilt, but not
> if I rebuild the same node.  I'm not 100% sure of this pattern, but it seems
> this way.
> 
> Steps to Reproduce:
> 1. begin with a replica 2 cluster
> 2. shut down services and kickstart one server
> 3. restore previous uuid in /var/lib/glusterd/glusterd.info

Why are we trying to restore the previous UUID? If it's a fresh setup, then you should retain the original UUIDs.

Comment 9 Atin Mukherjee 2017-08-08 15:43:55 UTC
Bump, can the needinfo be addressed?

Comment 10 Todd Stansell 2017-08-08 23:20:17 UTC
I can't provide anything more. I left that company years ago.