Description of problem:
See upstream issue for more background:
MTC creates 'stage pods' to ensure that a PV is mounted to a Pod. The approach is to launch a stage pod on the same node as the existing Pod that is consuming the PV.
'Block' storage PVs are typically RWO, so only one pod may mount them at a time. Our understanding was that this restriction was enforced at the Node level, so 2 pods could mount the same RWO PV if they were scheduled on the same Node.
Having seen the behavior with IBM Block storage in IBM ROKS, we are questioning that understanding and need to reexamine how we approach stage pods.
Here's what we'll need to do to resolve this:
1) Swap the order of the quiesce and "create stage pods" phases, since we will need to create stage pods for PVCs if we're quiescing.
2) Make sure that, for PVCs that are mounted by more than one pod, only one of these pods gets the restic annotation. This keeps us from failing restore when we have ROX PVCs that must be mounted RWO for restore, and it keeps restic from attempting to back up/restore a volume more than once.
3) Only create new stage pods for the disconnected and quiesced PVCs. Use live application pods for the PVCs that are going to be live through stage backup, and add restic annotations to these live application pods for the volumes to back up.
4) On stage restore, convert live application pods to stage pods, mounting only the PVCs that have corresponding restic annotations.
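As a rough illustration of steps 2 and 3 (not the MTC implementation, which lives in the PRs referenced below), the selection logic could be sketched as follows. The pod representation, the function names, and the rule that quiescing means every PVC needs a stage pod are assumptions made for this example. The annotation key is the one Velero's restic integration reads on pods.

```python
from collections import defaultdict

# Pod annotation read by Velero's restic integration (comma-separated volume names).
RESTIC_ANNOTATION = "backup.velero.io/backup-volumes"


def assign_restic_volumes(pods):
    """Step 2: pick exactly one pod per PVC to carry the restic annotation.

    `pods` is a hypothetical, simplified representation:
    [{"name": str, "volumes": {volume_name: pvc_name}}, ...]
    Returns {pod_name: [volume names to annotate]}.
    """
    claimed = set()  # PVCs already assigned to some pod
    result = defaultdict(list)
    # Sort pods and volumes so the choice is deterministic across runs.
    for pod in sorted(pods, key=lambda p: p["name"]):
        for volume, pvc in sorted(pod["volumes"].items()):
            if pvc in claimed:
                continue  # another pod already backs up this PVC
            claimed.add(pvc)
            result[pod["name"]].append(volume)
    return dict(result)


def pvcs_needing_stage_pods(all_pvcs, pods, quiesce):
    """Step 3: PVCs that need a newly created stage pod.

    Assumption: when quiescing, application pods are scaled down, so every
    PVC needs a stage pod; otherwise only PVCs not mounted by any live pod do.
    """
    if quiesce:
        return set(all_pvcs)
    mounted = {pvc for pod in pods for pvc in pod["volumes"].values()}
    return set(all_pvcs) - mounted


pods = [
    {"name": "app-a", "volumes": {"data": "pvc-shared", "logs": "pvc-logs"}},
    {"name": "app-b", "volumes": {"data": "pvc-shared"}},
]
assignments = assign_restic_volumes(pods)
# Annotation value for each selected pod, e.g. {RESTIC_ANNOTATION: "data,logs"}
annotations = {
    name: {RESTIC_ANNOTATION: ",".join(vols)} for name, vols in assignments.items()
}
stage_pvcs = pvcs_needing_stage_pods(
    ["pvc-shared", "pvc-logs", "pvc-orphan"], pods, quiesce=False
)
```

Here `app-b` receives no annotation because `pvc-shared` is already covered by `app-a`, and only the disconnected `pvc-orphan` gets a new stage pod when the migration is not quiescing.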
These two PRs implement the solution and resolve the stage pod failure.
REMOTE CLUSTER: AWS OCP 3.11 GP2
LOCAL CLUSTER: ROKS OCP 4.7 ibmc-block-gold (controller + UI)
REPLICATION REPOSITORY: AWS S3
- name: MIG_CONTROLLER_REPO
- name: MIG_CONTROLLER_TAG
- name: MIG_UI_REPO
- name: MIG_UI_TAG
We were able to migrate PVCs with "indirect" migrations from the local to the remote cluster and from the remote to the local cluster.
Moved to VERIFIED.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Moderate: Migration Toolkit for Containers (MTC) 1.6.0 security & bugfix update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.