"Behind on trimming (105235/128)max_segments" explains this. There are a lot of log segments (105235, against a limit of 128), and replaying them takes a long time. Ask the customer not to ignore the 'MDSs behind on trimming' health warning next time.
Connected clients do not affect MDS journal replay (it is unlikely that they do I/O on the metadata pool). The best option for now is to wait until journal replay finishes, because resetting the journal and scanning the whole filesystem may also take a very long time. Disabling all MDS debug logging can speed up journal replay.
For /cases/02388834/ceph-mds.storageM3-STG-NGN1.log.tgz/ceph-mds.storageM3-STG-NGN1.log: the recovering MDS reported "heartbeat map not healthy" while it was in the rejoin stage. The MDS was likely iterating over all inodes. To prevent the monitor from replacing the MDS, set the monitor's mds_beacon_grace config to 300 or more.
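To illustrate why a larger grace helps, here is a minimal sketch of the monitor's decision, assuming it replaces an MDS once no beacon has arrived for longer than mds_beacon_grace. The function name, the 15-second default and the 60-second stall are assumptions for illustration, not values taken from the log.

```python
# Assumed default for mds_beacon_grace, in seconds (not taken from the log).
DEFAULT_BEACON_GRACE_S = 15.0


def monitor_would_replace(seconds_since_last_beacon: float,
                          beacon_grace_s: float = DEFAULT_BEACON_GRACE_S) -> bool:
    """Return True if the beacon gap exceeds the grace period.

    Simplified model: the monitor treats the MDS as failed once it has
    not seen a beacon for more than beacon_grace_s seconds.
    """
    return seconds_since_last_beacon > beacon_grace_s


# Hypothetical 60-second stall while the MDS iterates over all inodes.
stall_s = 60.0
print(monitor_would_replace(stall_s))         # gap exceeds the assumed default
print(monitor_would_replace(stall_s, 300.0))  # gap fits within a 300 s grace
```

With a grace of 300, a stall of up to five minutes in rejoin would not cause the MDS to be replaced under this model.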
The default for mds_log_max_segments is 128. Decrease it by 100 every 10 seconds until it reaches 128. There are a lot of log segments in this case; when the MDS becomes active, it tries to trim all of them, which creates a lot of OSD requests.
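The step-down described above can be sketched as a small schedule generator. This is only an illustration: the starting value is whatever mds_log_max_segments is currently set to (not stated here), and set_value is a hypothetical callback. How the value is actually applied depends on the Ceph release; on releases with the centralized config store it could be `ceph config set mds mds_log_max_segments <n>`.

```python
import time
from typing import Callable, Iterator

DEFAULT_MAX_SEGMENTS = 128  # default mds_log_max_segments, per the comment above


def step_down_schedule(start: int, floor: int = DEFAULT_MAX_SEGMENTS,
                       step: int = 100) -> Iterator[int]:
    """Yield successive values, lowering `start` by `step` until `floor`."""
    if step <= 0:
        raise ValueError("step must be positive")
    value = start
    while value > floor:
        # Never go below the floor on the final step.
        value = max(value - step, floor)
        yield value


def apply_schedule(start: int, set_value: Callable[[int], None],
                   interval_s: float = 10.0,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Call set_value for each step, pausing interval_s between steps.

    set_value is a hypothetical hook that applies the new limit.
    Returns the number of updates made.
    """
    count = 0
    for value in step_down_schedule(start):
        if count:
            sleep(interval_s)
        set_value(value)
        count += 1
    return count


# Dry run from a hypothetical starting value of 500: only prints the steps.
apply_schedule(500, lambda v: print("mds_log_max_segments =", v),
               sleep=lambda s: None)
```

Lowering the limit in small steps spreads the trimming work over time, so the OSDs are not hit by all the trim requests at once.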
No new findings from the log. It still looks like http://tracker.ceph.com/issues/40028