Description of problem:
NFD pods disappear after cluster upgrade
upgrading from 4.2.7 to 4.3.0-0.nightly-2019-11-21-122827
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Deploy 4.2.7 cluster
2. git clone https://github.com/openshift/cluster-nfd-operator
3. make deploy
4. oc adm upgrade --to-image registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-11-21-122827 --force --allow-explicit-upgrade
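Steps 2 to 4 above could be scripted roughly as follows. This is only a sketch: the `run` helper, the `reproduce_nfd_upgrade` function and the `DRY_RUN` variable are hypothetical names introduced here, and `make -C` stands in for changing into the cloned directory before running `make deploy`. It assumes the 4.2.7 cluster from step 1 already exists and that `oc` is logged in to it.

```shell
#!/bin/sh
# Release image copied from step 4 of the report.
RELEASE_IMAGE="registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-11-21-122827"

# run (hypothetical helper): print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reproduce_nfd_upgrade (hypothetical): steps 2-4 of the report in order.
# Assumes step 1 (a deployed 4.2.7 cluster) is already done.
reproduce_nfd_upgrade() {
    run git clone https://github.com/openshift/cluster-nfd-operator || return 1
    # Equivalent to "cd cluster-nfd-operator && make deploy".
    run make -C cluster-nfd-operator deploy || return 1
    run oc adm upgrade --to-image "$RELEASE_IMAGE" --force --allow-explicit-upgrade
}
```

Setting `DRY_RUN=1` before calling `reproduce_nfd_upgrade` prints the three commands without touching the cluster.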
Actual results:
before upgrade:
[ematysek@jump ~]$ oc get all -n openshift-nfd
NAME                   READY   STATUS    RESTARTS   AGE
pod/nfd-master-gvj7n   1/1     Running   0          32s
pod/nfd-master-kmtx8   1/1     Running   0          32s
pod/nfd-master-wg47s   1/1     Running   0          32s
pod/nfd-worker-csdwm   1/1     Running   2          33s
pod/nfd-worker-nxqrv   1/1     Running   2          33s
pod/nfd-worker-qnjxn   1/1     Running   2          33s
after upgrade:
[ematysek@jump cluster-nfd-operator]$ oc get all -n openshift-nfd
No resources found in openshift-nfd namespace.
Expected results:
NFD pods should still exist
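A before/after comparison like the one above can be reduced to a single number. The helper below is a hypothetical sketch (the name `count_running_nfd_pods` is not from the report); it assumes the default column layout of `oc get` output, with the pod name in the first column and STATUS in the third.

```shell
# count_running_nfd_pods (hypothetical helper): read `oc get pods` or
# `oc get all` style output on stdin and print how many nfd-master or
# nfd-worker pods have STATUS Running.
count_running_nfd_pods() {
    awk '{ sub(/^pod\//, "", $1) }   # drop the "pod/" prefix that `oc get all` adds
         $1 ~ /^nfd-(master|worker)-/ && $3 == "Running" { n++ }
         END { print n + 0 }'        # n + 0 prints 0 when nothing matched
}
```

Usage, assuming a logged-in `oc`: `oc get all -n openshift-nfd | count_running_nfd_pods`. Run before and after the upgrade, the two counts should match if the pods survive.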
(In reply to Eric Matysek from comment #0)
> Description of problem:
> NFD pods disappear after cluster upgrade
> upgrading from 4.2.7 to 4.3.0-0.nightly-2019-11-21-122827
> Version-Release number of selected component (if applicable):
> How reproducible:
> Steps to Reproduce:
> 1. Deploy 4.2.7 cluster
> 2. git clone https://github.com/openshift/cluster-nfd-operator
> 3. make deploy
> 4. oc adm upgrade --to-image
> --force --allow-explicit-upgrade
> Actual results:
> before upgrade:
> [ematysek@jump ~]$ oc get all -n openshift-nfd
> NAME READY STATUS RESTARTS AGE
> pod/nfd-master-gvj7n 1/1 Running 0 32s
> pod/nfd-master-kmtx8 1/1 Running 0 32s
> pod/nfd-master-wg47s 1/1 Running 0 32s
> pod/nfd-worker-csdwm 1/1 Running 2 33s
> pod/nfd-worker-nxqrv 1/1 Running 2 33s
> pod/nfd-worker-qnjxn 1/1 Running 2 33s
> after upgrade:
> [ematysek@jump cluster-nfd-operator]$ oc get all -n openshift-nfd
> No resources found in openshift-nfd namespace.
> Expected results:
> NFD pods should still exist
> Additional info:
You were testing the master branch against a specific OpenShift release. Master will not always work against a given release. Please install NFD from OperatorHub on OCP 4.2 and try the upgrade path.
Verified this works with the NFD version in public OperatorHub as well
And by works I mean the bug is present... sorry for the typo
Can you provide any logs or events?
oc get events -n openshift-nfd
oc get events -n openshift-nfd-operator
Is the operator still running?
oc logs -f <operator-name> -n openshift-nfd-operator
What is the status of the DaemonSets in openshift-nfd?
oc describe ds -n openshift-nfd
Can you also try to deploy NFD via OLM, and then do the upgrade?
OLM is responsible for updating the day 2 operators; deploying manually also means updating manually.
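The diagnostics requested above could be gathered in one pass with something like the sketch below. The function name `collect_nfd_diagnostics` and the default output directory `nfd-diag` are hypothetical. The operator pod name has to be passed in by the caller, since the comment leaves it as `<operator-name>`, and `-f` is dropped from `oc logs` so the script does not block while following the log.

```shell
# collect_nfd_diagnostics (hypothetical helper): save the requested events,
# operator log and DaemonSet descriptions into a directory.
#   $1 = operator pod name (required; not known from the report)
#   $2 = output directory (optional; defaults to ./nfd-diag)
collect_nfd_diagnostics() {
    operator_pod="$1"
    out_dir="${2:-nfd-diag}"
    if [ -z "$operator_pod" ]; then
        echo "usage: collect_nfd_diagnostics <operator-pod> [out-dir]" >&2
        return 2
    fi
    mkdir -p "$out_dir" || return 1

    # stderr is captured too, so error messages end up in the same files.
    oc get events -n openshift-nfd > "$out_dir/events-nfd.txt" 2>&1
    oc get events -n openshift-nfd-operator > "$out_dir/events-nfd-operator.txt" 2>&1
    oc logs "$operator_pod" -n openshift-nfd-operator > "$out_dir/operator.log" 2>&1
    oc describe ds -n openshift-nfd > "$out_dir/daemonsets.txt" 2>&1
}
```

The resulting files can then be attached to the bug instead of pasting each command's output separately.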
This might fix the issue: https://github.com/openshift/cluster-nfd-operator/pull/45
Verification blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1778904
PR 49 is not merged into the release-4.2 branch, so it has no effect on a 4.2.x cluster.
https://github.com/openshift/cluster-nfd-operator/pull/51 for release-4.2
I don't know why this BZ is talking about 4.2. This is a 4.3 BZ. I see that this fix is supposedly merged and that 1778904 is VERIFIED. So I'm moving this back ON_QA.
Upgraded successfully without nfd pods disappearing!
[ematysek@jump ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-01-08-181129   True        False         13m     Cluster version is 4.3.0-0.nightly-2020-01-08-181129
[ematysek@jump ~]$ oc get pods
NAME               READY   STATUS    RESTARTS   AGE
nfd-master-cb57h   1/1     Running   0          14m
nfd-master-pv9c7   1/1     Running   0          14m
nfd-master-rs8tz   1/1     Running   0          14m
nfd-worker-khrhj   1/1     Running   2          14m
nfd-worker-wnwlv   1/1     Running   2          14m
nfd-worker-wxzq7   1/1     Running   2          14m
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.