Description of problem:
If there are too many journal segments during recovery, the MDS will fail internal heartbeats.

See also: https://bugzilla.redhat.com/show_bug.cgi?id=1713527#c10

Version-Release number of selected component (if applicable):
3.1

How reproducible:
A test case needs to be written.
The MDS should also detect this situation and explain what is delaying recovery during up:replay.
Replaying lots of segments does not cause an unhealthy heartbeat. The original issue is that the MDS trims lots of log segments after it has recovered, and that trimming causes the unhealthy heartbeat.
(In reply to Yan, Zheng from comment #2)
> Replaying lots of segments does not cause unhealthy heartbeat. The origin
> issue is that trimming lots of log segments after mds recovered, which
> causes unhealthy heartbeat

But the trimming occurs during up:active? The original issue was that the MDS was stuck in up:replay. The trimming issue once the MDS hits up:active is bz1714814.
Yes, trimming happens when the MDS is active. The original issue is that there were lots of log segments, and replaying them took a long time. After journal replay finished, the MDS started to trim the log, which caused the unhealthy heartbeat. The MDS was then replaced by a new MDS.
Yes, you're right. The description of the issue in the case confused me.
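
To illustrate the sequence described in the comments above, here is a toy Python model of why trimming a large backlog of log segments in one uninterrupted pass can miss an internal heartbeat, while the same work split into batches does not. This is not the Ceph MDS code. The `Heartbeat` class, the `trim_segments` function, the cost of one tick per segment, and the grace value of 15 ticks are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Toy internal heartbeat: unhealthy if not reset within `grace` ticks."""
    grace: int
    last_reset: int = 0
    missed: bool = False

    def check(self, now: int) -> None:
        # Record a miss if too much simulated time passed since the last reset.
        if now - self.last_reset > self.grace:
            self.missed = True

    def reset(self, now: int) -> None:
        self.check(now)
        self.last_reset = now


def trim_segments(num_segments: int, batch_size: int, grace: int) -> bool:
    """Trim `num_segments` segments at an assumed cost of one tick each.

    The heartbeat is reset only between batches. Returns True if the
    heartbeat stayed healthy for the whole trim.
    """
    hb = Heartbeat(grace=grace)
    now = 0
    remaining = num_segments
    while remaining > 0:
        batch = min(batch_size, remaining)
        now += batch        # simulated time spent trimming this batch
        remaining -= batch
        hb.reset(now)       # the only point where the heartbeat is touched
    return not hb.missed


if __name__ == "__main__":
    # One pass over the whole backlog: the heartbeat is starved.
    print(trim_segments(10_000, batch_size=10_000, grace=15))
    # Same backlog in small batches: the heartbeat is reset in time.
    print(trim_segments(10_000, batch_size=10, grace=15))
```

In this model the total work is identical in both calls. Only the interval between heartbeat resets differs, which matches the observation that a long replay followed by a large trim is what led to the MDS being replaced.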