In a cluster with two networks, glusterd on a newly added peer sometimes fails to restart. The restart fails because glusterd cannot resolve bricks belonging to the peer that probed the new peer into the cluster. Resolution fails only if the bricks were created on the second network of the initiator peer, because the new peer doesn't know about the initiator's second network.
This is caused by a race that hadn't been encountered before. The analysis is as follows.
Assume A, B and C are the peers. A and B form a cluster and have probed each other on both networks. C is probed from A.
During the probe, C is first validated by A. Once C is accepted, A sends an update to both B and C to inform them of each other. The update C gets from A doesn't have A's second network information. C can only get this information when B sends an update to C.
The problem here was that B didn't send an update to C. This happens because whether B sends an update to C depends on the order in which the connection between B and C is established.
B and C both try to establish connections to each other once they receive A's update and learn of each other. If B establishes the connection first, it sends an update to C. But if C establishes the connection first, B will not send an update to C.
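To make the two orderings concrete, here is a rough toy model of the sequence described above. It is only a sketch of the analysis, not glusterd code: the function names (`probe_c`, `can_resolve_bricks`) and the address labels (`A-net1`, `A-net2`) are made up for illustration.

```python
# Toy model of the probe sequence described above; not glusterd code.
# Peer names match the analysis; the address labels are hypothetical.

def probe_c(first_to_connect):
    """Return the addresses of A that C knows after A probes C.

    first_to_connect is "B" or "C": whichever peer establishes the
    B<->C connection first.
    """
    # B already knows both of A's networks; C starts with nothing.
    known_by_b = {"A": {"A-net1", "A-net2"}}
    known_by_c = {}

    # A validates C, then sends its update. As described above, this
    # update does not carry A's second network.
    known_by_c["A"] = {"A-net1"}

    # Only B can fill the gap, and it sends its update only if it
    # established the connection to C first.
    if first_to_connect == "B":
        known_by_c["A"] |= known_by_b["A"]

    return known_by_c["A"]


def can_resolve_bricks(known_addresses, brick_address="A-net2"):
    # Bricks created on A's second network resolve only if C knows it.
    return brick_address in known_addresses


for first in ("B", "C"):
    known = probe_c(first)
    print(first, "connects first:", sorted(known),
          "bricks resolvable:", can_resolve_bricks(known))
```

In this model the only thing that differs between the two runs is which side connects first, which is why the failure shows up only some of the time.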
This is the first time this situation was observed. It doesn't always happen.
REVIEW: http://review.gluster.org/11625 (glusterd: Send friend update even for EVENT_RCVD_ACC) posted (#1) for review on master by Kaushal M (firstname.lastname@example.org)
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.
glusterfs-3.8.0 has been announced on the Gluster mailinglists, packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist and the update infrastructure for your distribution.