Description of problem:
NFD pods disappear after cluster upgrade
upgrading from 4.2.7 to 4.3.0-0.nightly-2019-11-21-122827

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Deploy 4.2.7 cluster
2. git clone https://github.com/openshift/cluster-nfd-operator
3. make deploy
4. oc adm upgrade --to-image registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-11-21-122827 --force --allow-explicit-upgrade

Actual results:
before upgrade:
[ematysek@jump ~]$ oc get all -n openshift-nfd
NAME                   READY   STATUS    RESTARTS   AGE
pod/nfd-master-gvj7n   1/1     Running   0          32s
pod/nfd-master-kmtx8   1/1     Running   0          32s
pod/nfd-master-wg47s   1/1     Running   0          32s
pod/nfd-worker-csdwm   1/1     Running   2          33s
pod/nfd-worker-nxqrv   1/1     Running   2          33s
pod/nfd-worker-qnjxn   1/1     Running   2          33s

after upgrade:
[ematysek@jump cluster-nfd-operator]$ oc get all -n openshift-nfd
No resources found in openshift-nfd namespace.

Expected results:
NFD pods should still exist

Additional info:
(In reply to Eric Matysek from comment #0)

You were testing the master branch against a specific OpenShift release. Master will not always work. Please install NFD from OperatorHub in OCP 4.2 and try the upgrade path.
Verified this works with the NFD version in public OperatorHub as well
And by works I mean the bug is present... sorry for the typo
Can you provide any logs or events?
    oc get events -n openshift-nfd
    oc get events -n openshift-nfd-operator

Is the operator still running?
    oc logs -f <operator-name> -n openshift-nfd-operator

What is the status of the DaemonSets in openshift-nfd?
    oc describe ds -n openshift-nfd
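The diagnostics requested above could be gathered in one pass with a small helper. This is only a sketch: the function name `collect_nfd_diagnostics` and the `OC` override variable are hypothetical, and it assumes the operator lives in the `openshift-nfd-operator` namespace as in the commands above. It lists the operator pods rather than following logs, since the operator pod name is not known in advance.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cluster-nfd-operator): collects the
# events, operator pod list, and DaemonSet status asked for above.
# Set OC to another binary (e.g. OC=echo) for a dry run; defaults to `oc`.
collect_nfd_diagnostics() {
    local oc="${OC:-oc}"
    local ns
    for ns in openshift-nfd openshift-nfd-operator; do
        echo "== events in ${ns} =="
        "$oc" get events -n "$ns"
    done
    echo "== operator pods in openshift-nfd-operator =="
    "$oc" get pods -n openshift-nfd-operator
    echo "== DaemonSets in openshift-nfd =="
    "$oc" describe ds -n openshift-nfd
}
```

Running `collect_nfd_diagnostics > nfd-diag.txt 2>&1` before and after the upgrade would give two files to attach to the bug. The operator log can then be followed with the `oc logs -f` command above, using a pod name from the listing.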
Can you also try to deploy NFD via OLM, and then do the upgrade? OLM is responsible for updating day 2 operators; deploying manually also means updating manually.
This might fix the issue: https://github.com/openshift/cluster-nfd-operator/pull/45
Verification blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1778904
PR 49 is not merged into the release-4.2 branch, so it has no effect on a 4.2.x cluster
https://github.com/openshift/cluster-nfd-operator/pull/51 for release-4.2
I don't know why this BZ is talking about 4.2. This is a 4.3 BZ. I see that this fix is supposedly merged and that 1778904 is VERIFIED. So I'm moving this back ON_QA.
Upgraded successfully without nfd pods disappearing!

[ematysek@jump ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-01-08-181129   True        False         13m     Cluster version is 4.3.0-0.nightly-2020-01-08-181129

[ematysek@jump ~]$ oc get pods
NAME               READY   STATUS    RESTARTS   AGE
nfd-master-cb57h   1/1     Running   0          14m
nfd-master-pv9c7   1/1     Running   0          14m
nfd-master-rs8tz   1/1     Running   0          14m
nfd-worker-khrhj   1/1     Running   2          14m
nfd-worker-wnwlv   1/1     Running   2          14m
nfd-worker-wxzq7   1/1     Running   2          14m
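A check like the one above could be scripted so that the pod counts before and after an upgrade are easy to compare. The following is a minimal sketch, not something used in this verification: the function name `count_running` is hypothetical, and it assumes the default table output of `oc get pods` (NAME in column 1, STATUS in column 3).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `oc get pods` table output on stdin and prints
# how many pods whose name starts with the given prefix are in Running state.
count_running() {
    local prefix="$1"
    # NR > 1 skips the header row; index(...) == 1 is a name-prefix match.
    awk -v p="$prefix" \
        'NR > 1 && index($1, p) == 1 && $3 == "Running" { n++ } END { print n + 0 }'
}
```

For example, `oc get pods -n openshift-nfd | count_running nfd-worker` could be recorded before the upgrade and compared with the same command afterwards. A drop to 0 would correspond to the failure originally reported.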
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062