Bug 2135990
| Summary: | [CEE] MDS pods CrashLoopBackoff | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | James Biao <jbiao> |
| Component: | ceph | Assignee: | Venky Shankar <vshankar> |
| ceph sub component: | CephFS | QA Contact: | Elad <ebenahar> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | urgent | CC: | bniver, gfarnum, hyelloji, madam, mmanjuna, muagarwa, ocs-bugs, odf-bz-bot, tnielsen, vshankar, xiubli |
| Version: | 4.9 | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-11-23 06:27:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
James Biao
2022-10-19 03:09:30 UTC
FWIW, in Server::reconnect_tick():
```
    for (auto session : remaining_sessions) {
      // Keep sessions that have specified timeout. These sessions will prevent
      // mds from going to active. MDS goes to active after they all have been
      // killed or reclaimed.
      if (session->info.client_metadata.find("timeout") !=
          session->info.client_metadata.end()) {
        dout(1) << "reconnect keeps " << session->info.inst
                << ", need to be reclaimed" << dendl;
        client_reclaim_gather.insert(session->get_client());
        continue;
      }

      dout(1) << "reconnect gives up on " << session->info.inst << dendl;

      mds->clog->warn() << "evicting unresponsive client " << *session
                        << ", after waiting " << elapse1
                        << " seconds during MDS startup";
```
Is the MDS waiting for a session to be reclaimed?
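To make the question concrete, here is a minimal standalone sketch of the decision the excerpt appears to make. It is not Ceph code: `SketchSession`, `ReconnectOutcome`, and `classify_remaining` are hypothetical names, and client IDs are plain integers for illustration. The only behavior assumed is what the excerpt and its comment state: a session whose client metadata carries a "timeout" key is kept in a reclaim set, any other remaining session is given up on, and the MDS does not go active until the kept sessions have all been killed or reclaimed.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a client session; NOT Ceph's Session class.
struct SketchSession {
  int client_id = 0;
  std::map<std::string, std::string> client_metadata;
};

// Hypothetical result type, for illustration only.
struct ReconnectOutcome {
  std::set<int> reclaim_gather;  // kept sessions, waiting to be reclaimed
  std::vector<int> evicted;      // sessions the loop gives up on

  // Assumption taken from the comment in the excerpt: the MDS can go
  // active only once every kept session has been killed or reclaimed.
  bool can_go_active() const { return reclaim_gather.empty(); }
};

// Mirrors the branch in the excerpt: the presence of a "timeout" metadata
// key decides whether a remaining session is kept or evicted.
ReconnectOutcome classify_remaining(const std::vector<SketchSession>& remaining) {
  ReconnectOutcome out;
  for (const auto& session : remaining) {
    if (session.client_metadata.find("timeout") !=
        session.client_metadata.end()) {
      out.reclaim_gather.insert(session.client_id);
      continue;
    }
    out.evicted.push_back(session.client_id);
  }
  return out;
}
```

If this reading is right, a single remaining session that set "timeout" and is never reclaimed or killed would be enough to make `can_go_active()` stay false in the sketch, which is the situation the question above is asking about.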