Description of problem:
When MDSs are upgraded to 12.2.3+, all online MDSs will suicide after the first upgraded MDS goes online.

Version-Release number of selected component (if applicable):
N/A yet

How reproducible:
100%

Steps to Reproduce:
1. Take RHCS3.0 cluster and upgrade an MDS to a release based on 12.2.3.

Actual results:
MDSs running 12.2.2 or earlier will suicide.

Expected results:
MDSs running 12.2.2 or earlier continue functioning.

Additional info:
Caused by this backport: https://github.com/ceph/ceph/pull/18782
I think it's caused by:

commit cb8eff43b1abd8c268df9e57906d677ff4be8d95
Author: Yan, Zheng <zyan>
Date:   Wed Oct 18 20:58:15 2017 +0800

    mds: don't rdlock locks in replica object while auth mds is recovering

    Auth mds may take xlock on the lock and change the object when replaying
    unsafe requests. To guarantee new requests and replayed unsafe requests
    (on auth mds) get processed in proper order, we shouldn't rdlock locks in
    replica object while auth mds of the object is recovering

    Signed-off-by: "Yan, Zheng" <zyan>
    (cherry picked from commit 0afbc0338e1b9f32340eaa74899d8d43ac8608fe)

The commit modified CInode::encode_replica and CInode::_encode_locks_state_for_replica.
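For readers unfamiliar with the rule stated in the commit message, here is a minimal toy sketch of it in Python. It is not Ceph code: `AuthState`, `ReplicaLock`, `can_rdlock` and `try_rdlock` are hypothetical names made up for illustration, and the real logic lives in the C++ MDS locking code. The sketch only shows the ordering idea: a read lock on a replica is refused while the auth MDS of the object is still recovering, and is granted once the auth MDS is active again.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AuthState(Enum):
    """Hypothetical states of the auth MDS for an object."""
    ACTIVE = auto()
    RECOVERING = auto()  # auth MDS may still be replaying unsafe requests


@dataclass
class ReplicaLock:
    """Toy stand-in for a lock on a replica object (not Ceph's SimpleLock)."""
    auth_state: AuthState
    readers: int = 0

    def can_rdlock(self) -> bool:
        # Rule from the commit message: do not rdlock a lock in a replica
        # object while the auth MDS of that object is recovering.
        return self.auth_state is not AuthState.RECOVERING

    def try_rdlock(self) -> bool:
        if not self.can_rdlock():
            # A real caller would wait and retry once the auth MDS is active.
            return False
        self.readers += 1
        return True


lock = ReplicaLock(auth_state=AuthState.RECOVERING)
first = lock.try_rdlock()          # refused: auth MDS is recovering
lock.auth_state = AuthState.ACTIVE
second = lock.try_rdlock()         # granted: auth MDS is active again
print(first, second, lock.readers)
```

Refusing the read lock, rather than granting it and revoking it later, keeps new requests from observing the object before the replayed unsafe requests on the auth MDS have been applied.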
Hi Erin,

Can you please also add the changes made in the RHEL installation guide to the Ubuntu installation guide and the Container Guide?

Regards,
Vasishta Shastry
AQE, Ceph
Pushing this to assigned state based on comments 34 and 35.
Moving this bz to verified state; the doc text for RHEL, Ubuntu and Container looks good.