Moving this over to the node/kubelet team for further investigation, since this doesn't look like a scheduler problem.
A reasonably simple improvement to kubelet's healthz checks would have prevented the node from becoming wedged. Unfortunately, there doesn't seem to be much interest in merging the PR that implements it. See issue https://github.com/kubernetes/kubernetes/issues/98981 and PR https://github.com/kubernetes/kubernetes/pull/94210
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.