Cause:
Corosync forms a new membership and tries to send messages during recovery.
Consequence:
Messages are not fully sent and other nodes receive them corrupted.
Fix:
Set the maximum message size properly, leaving room for the extra header added during recovery.
Result:
Messages are always fully sent so other nodes receive them correctly.
Created attachment 1630613
8.1.z-bz1765619-1-totemsrp-Reduce-MTU-to-left-room-second-mcast
totemsrp: Reduce MTU to left room second mcast
Messages sent during the recovery phase are encapsulated, so each such
message carries the extra size of an mcast structure. This is not a big
problem for UDPU, because most switches are able to fragment and defragment
packets, but it is a problem for knet, because totempg uses the maximum
packet size (65536 bytes), and when another header is added during
retransmission the packet becomes too large.
The solution is to reduce the MTU by 2 * sizeof (struct mcast).
Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Fabio M. Di Nitto <fdinitto>
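The size arithmetic in the commit message can be illustrated with a minimal sketch. This is not the corosync code: `struct mcast_hdr` is a hypothetical stand-in for `struct mcast` (the real layout differs), and `usable_mtu`, `wire_size` and `TRANSPORT_MAX_PACKET` are names invented for this example. It only shows why reserving room for two headers keeps an encapsulated recovery message within the transport limit, while reserving room for one does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for corosync's struct mcast; the real layout differs. */
struct mcast_hdr {
    uint8_t  type;
    uint8_t  encapsulated;
    uint16_t flags;
    uint32_t seq;
    uint32_t ring_seq;
    uint32_t source_nodeid;
};

/* Maximum packet size quoted in the commit message (assumed transport limit). */
#define TRANSPORT_MAX_PACKET ((size_t)65536)

/*
 * Payload budget handed to the upper layer when room is reserved for
 * `reserved_headers` copies of the header.
 */
size_t usable_mtu(size_t transport_mtu, size_t reserved_headers)
{
    size_t reserve = reserved_headers * sizeof(struct mcast_hdr);

    assert(transport_mtu > reserve);
    return transport_mtu - reserve;
}

/*
 * Bytes on the wire for a payload carrying `headers` copies of the header:
 * 1 for a regular message, 2 for a message encapsulated during recovery.
 */
size_t wire_size(size_t payload_len, size_t headers)
{
    return payload_len + headers * sizeof(struct mcast_hdr);
}
```

With two headers reserved, a payload that fills the whole budget still fits after encapsulation: `wire_size(usable_mtu(TRANSPORT_MAX_PACKET, 2), 2)` equals `TRANSPORT_MAX_PACKET`. With only one header reserved, the same call exceeds the limit by exactly `sizeof(struct mcast_hdr)`.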
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:4264