Description of problem:

bootkube.sh is not re-entrant (and will not run properly) if it was stopped midway through running.

> Jul 23 18:02:39 bootstrap systemd[1]: Stopped Bootstrap a Kubernetes cluster.
> Jul 23 18:02:46 bootstrap systemd[1]: Started Bootstrap a Kubernetes cluster.
> Jul 23 18:03:01 bootstrap bootkube.sh[8114]: Starting etcd certificate signer...
> Jul 23 18:03:02 bootstrap bootkube.sh[8114]: error creating container storage: the container name "etcd-signer" is already in use by "01b3ece5e73cee6a197fdc641cd362a05e40c26611a9b2a230700ab26e614af1". You have to remove that container to be able to reuse that name.: that name is already in use
> Jul 23 18:03:02 bootstrap bootkube.sh[8114]: 01b3ece5e73cee6a197fdc641cd362a05e40c26611a9b2a230700ab26e614af1
> Jul 23 18:03:02 bootstrap systemd[1]: bootkube.service: Main process exited, code=exited, status=125/n/a
> Jul 23 18:03:02 bootstrap systemd[1]: bootkube.service: Failed with result 'exit-code'.

Version-Release number of selected component (if applicable): 4.1.x

How reproducible: 100%

Steps to Reproduce:
1. Start a UPI install
2. SSH to the bootstrap system
3. sudo systemctl stop bootkube.service
4. Debug the install issue
5. sudo systemctl start bootkube.service

Actual results: See the error above.

Expected results: When started, bootkube.service should either clean up any previous invocations and restart, or skip the steps it has already completed (which it does in some cases). Alternatively, a bootkube_cleanup.sh script should be provided to run between the stop and the start (and it should be messaged to the user should bootkube.sh fail).

Additional info:
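As a hypothetical manual workaround (this is not an official script), the leftover container named in the error could be removed before restarting the service. The sketch below assumes the bootstrap host's container runtime is podman (overridable via the hypothetical `RUNTIME` variable, which exists only so the sketch can be exercised elsewhere); the container name `etcd-signer` is taken from the error log above.

```shell
#!/bin/sh
# Hypothetical workaround sketch: remove leftover bootstrap containers so
# bootkube.service can start cleanly again. The function name and the
# RUNTIME override are illustrative assumptions, not part of bootkube.sh.

cleanup_bootstrap_containers() {
    runtime="${RUNTIME:-podman}"   # container runtime on the bootstrap host
    for name in "$@"; do
        # `podman container exists` exits 0 iff a container with that name exists
        if "$runtime" container exists "$name" 2>/dev/null; then
            "$runtime" rm -f "$name" >/dev/null 2>&1
            echo "removed leftover container: $name"
        fi
    done
}

# On the bootstrap host, before `sudo systemctl start bootkube.service`:
#   cleanup_bootstrap_containers etcd-signer
```

A simpler one-off equivalent on the bootstrap host would be `sudo podman rm -f etcd-signer`; the function form just makes the check-then-remove step repeatable for any other names that collide.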
If the user wants to configure the bootstrap host, the recommended mechanism is through Ignition. But I think we will try to make bootkube re-entrant. Updating priority because a better method is available.
The reason this needs to be re-entrant is not to make or adjust configuration. Often it is to complete or continue a failed install, for example after a misconfigured load balancer. Pods can get out of sync, and having a way to restart or continue the install process to completion, rather than re-deploying the system, can save man-hours. Given that the installer only gives you ~24 hours to complete the install, the time pressure can be too much for some customers.
bootkube.sh retries over and over, but this appears to require a change to the inputs. Without a clear use case, closing this.
Hello, I am using openshift-installer v4.3.12 and encountering this issue. I attempted v4.3.18 of the openshift-installer, but the error where my "etcd-signer" container already exists by name persists, and it causes the bootkube.sh service to crash (and loop). I am unsure how to get around this problem other than destroying my bootstrap machine and all of my masters/workers and totally rebuilding, which would be incredibly time-consuming. I am installing OpenShift v4.3.8 on bare-metal UPI. I tried several times to delete my installation directory and make fresh attempts with both the v4.3.12 and v4.3.18 installers.
Please open a new bug with full details of what you're seeing, including the log bundle from `openshift-install gather bootstrap`.
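The gather invocation mentioned above can be sketched as follows. This is a hypothetical convenience wrapper, not part of the installer; the IPs are placeholders, and the `INSTALLER` override exists only so the sketch can be exercised without the real binary. `openshift-install gather bootstrap` connects to the hosts over SSH and writes a log bundle to the current directory.

```shell
# Hypothetical wrapper sketch around `openshift-install gather bootstrap`.
# The function name, IPs, and INSTALLER variable are illustrative assumptions.
gather_bootstrap_logs() {
    bootstrap="$1"; shift
    cmd="${INSTALLER:-openshift-install} gather bootstrap --bootstrap $bootstrap"
    # Each remaining argument is a control-plane host; --master may be repeated.
    for m in "$@"; do
        cmd="$cmd --master $m"
    done
    $cmd
}

# Usage from the install workstation:
#   gather_bootstrap_logs 192.0.2.10 192.0.2.11 192.0.2.12 192.0.2.13
```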