Bug 1805394

Summary: NFD pods disappear after cluster upgrade
Product: OpenShift Container Platform
Reporter: Eric Matysek <ematysek>
Component: Node Feature Discovery Operator
Assignee: Zvonko Kosic <zkosic>
Status: CLOSED DUPLICATE
QA Contact: Eric Matysek <ematysek>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.2.z
CC: mpatel, sdodson, sejug, wabouham, zkosic
Target Milestone: ---
Keywords: Reopened, TestBlocker
Target Release: 4.2.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-06 14:30:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1775849
Bug Blocks:

Description Eric Matysek 2020-02-20 17:35:09 UTC
Description of problem:
NFD pods disappear from all worker & master nodes when upgrading between 4.2.x versions.

Version-Release number of selected component (if applicable):
4.2.x

How reproducible:
100%

Steps to Reproduce:
1. Deploy OpenShift 4.2.18
2. Deploy NFD from the OperatorHub web console
3. Upgrade OpenShift to 4.2.19

Actual results:
openshift-nfd namespace empty

Expected results:
Should have 1 pod per node in openshift-nfd namespace
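
The expected result above can be checked mechanically. The following is only a minimal sketch, not part of the original report: the function name nodes_without_pods is hypothetical, and it assumes its inputs are the parsed JSON of `oc get nodes -o json` and `oc get pods -n openshift-nfd -o json`. It reads only the standard metadata.name, spec.nodeName and status.phase fields.

```python
from typing import List


def nodes_without_pods(nodes: dict, pods: dict) -> List[str]:
    """Return the names of nodes that have no Running pod scheduled on them.

    `nodes` and `pods` are assumed to be parsed Kubernetes List objects,
    i.e. dicts with an "items" array.
    """
    # Collect every node name that hosts at least one Running pod.
    covered = {
        pod.get("spec", {}).get("nodeName")
        for pod in pods.get("items", [])
        if pod.get("status", {}).get("phase") == "Running"
    }
    # Any node not in that set is missing its pod.
    return sorted(
        node["metadata"]["name"]
        for node in nodes.get("items", [])
        if node["metadata"]["name"] not in covered
    )
```

An empty return value would match the expected result. After the upgrade described here, with the openshift-nfd namespace empty, the function would return every node in the cluster.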

Additional info:

Comment 2 Zvonko Kosic 2020-02-27 15:56:42 UTC
We have verified this bug for 4.4 (https://bugzilla.redhat.com/show_bug.cgi?id=1782948)
and for 4.3 (https://bugzilla.redhat.com/show_bug.cgi?id=1775849).
This one should target 4.2.z; I will set the 4.3 bug as 'Depends On'.

Comment 3 Zvonko Kosic 2020-02-28 13:35:14 UTC
*** Bug 1784678 has been marked as a duplicate of this bug. ***

Comment 4 Zvonko Kosic 2020-03-02 15:16:21 UTC

*** This bug has been marked as a duplicate of bug 1785307 ***

Comment 5 Eric Matysek 2020-03-02 21:40:14 UTC
I don't think this bug is a duplicate, I re-tested this from 4.2.18 to 4.2.19 on the day I opened this bug and all NFD pods disappeared.

Comment 6 Scott Dodson 2020-03-03 01:50:34 UTC
(In reply to Eric Matysek from comment #5)
> I don't think this bug is a duplicate, I re-tested this from 4.2.18 to
> 4.2.19 on the day I opened this bug and all NFD pods disappeared.

That bug was only fixed in 4.2.20, please test with that release and re-close as a dupe if confirmed.

Comment 7 Eric Matysek 2020-03-03 17:51:56 UTC
Are we essentially saying NFD is not supported on 4.2 versions below 4.2.20? I tried to deploy NFD on a 4.2.19 cluster today to test upgrading to 4.2.20 and it failed to deploy.

pod/nfd-operator-7f9b9b65b-9sh27   0/1     CrashLoopBackOff   2          47s

I am going to upgrade the cluster to 4.2.20 and retry, then hopefully test the upgrade to 4.2.21.

Comment 8 Zvonko Kosic 2020-03-03 18:03:09 UTC
What is the log of the operator?

Comment 9 Eric Matysek 2020-03-03 18:51:53 UTC
{"level":"info","ts":1583261443.6853044,"logger":"cmd","msg":"Go Version: go1.11.13"}
{"level":"info","ts":1583261443.6853268,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1583261443.685332,"logger":"cmd","msg":"Version of operator-sdk: v0.4.0+git"}
{"level":"info","ts":1583261443.6858435,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1583261443.823679,"logger":"leader","msg":"Found existing lock with my name. I was likely restarted."}
{"level":"info","ts":1583261443.8237095,"logger":"leader","msg":"Continuing as the leader."}
{"level":"info","ts":1583261443.937325,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1583261443.9374692,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1583261443.9375896,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1583261443.9376585,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1583261443.9377139,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1583261443.9377654,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"nodefeaturediscovery-controller","source":"kind source: /, Kind="}
{"level":"error","ts":1583261443.9377759,"logger":"cmd","msg":"","error":"no kind is registered for the type v1.SecurityContextConstraints in scheme \"k8s.io/client-go/kubernetes/scheme/register.go:60\"","stacktrace":"github.com/openshift/cluster-nfd-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/cluster-nfd-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nmain.main\n\t/go/src/github.com/openshift/cluster-nfd-operator/cmd/manager/main.go:92\nruntime.main\n\t/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/proc.go:201"}
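
Only the last line of the log above is at error level. As a rough sketch for triaging logs like this one (not something used in this bug), the helper below picks out the error entries from JSON-per-line operator output. The function name error_entries is hypothetical, and the only assumption about the log format is the "level" key visible in the lines above.

```python
import json
from typing import Iterable, List


def error_entries(lines: Iterable[str]) -> List[dict]:
    """Return the parsed log entries whose "level" field is "error".

    Blank lines and lines that are not valid JSON objects are skipped.
    """
    found = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Not a structured log line (for example a panic trace); ignore it.
            continue
        if isinstance(entry, dict) and entry.get("level") == "error":
            found.append(entry)
    return found
```

Fed the log above line by line, this would be expected to return a single entry, the one whose "error" field reports that no kind is registered for v1.SecurityContextConstraints.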

Comment 10 Eric Matysek 2020-03-03 22:09:20 UTC
Adding a link to the new bug covering the failure to deploy NFD on 4.2.x.

Comment 12 Zvonko Kosic 2020-03-04 17:10:36 UTC
OK, the error you're getting is not related to this BZ; you created the right one. I remember Walid was able to deploy NFD in 4.2 from OperatorHub. Did you test this?
Besides that, we will cherry-pick the fix from 4.3 to 4.2.

Comment 16 Zvonko Kosic 2020-03-06 14:30:51 UTC

*** This bug has been marked as a duplicate of bug 1805427 ***

Comment 17 Zvonko Kosic 2020-03-06 14:31:26 UTC
(In reply to Eric Matysek from comment #9)
> {"level":"info","ts":1583261443.6853044,"logger":"cmd","msg":"Go Version:
> go1.11.13"}
> {"level":"info","ts":1583261443.6853268,"logger":"cmd","msg":"Go OS/Arch:
> linux/amd64"}
> {"level":"info","ts":1583261443.685332,"logger":"cmd","msg":"Version of
> operator-sdk: v0.4.0+git"}
> {"level":"info","ts":1583261443.6858435,"logger":"leader","msg":"Trying to
> become the leader."}
> {"level":"info","ts":1583261443.823679,"logger":"leader","msg":"Found
> existing lock with my name. I was likely restarted."}
> {"level":"info","ts":1583261443.8237095,"logger":"leader","msg":"Continuing
> as the leader."}
> {"level":"info","ts":1583261443.937325,"logger":"cmd","msg":"Registering
> Components."}
> {"level":"info","ts":1583261443.9374692,"logger":"kubebuilder.controller",
> "msg":"Starting
> EventSource","controller":"nodefeaturediscovery-controller","source":"kind
> source: /, Kind="}
> {"level":"info","ts":1583261443.9375896,"logger":"kubebuilder.controller",
> "msg":"Starting
> EventSource","controller":"nodefeaturediscovery-controller","source":"kind
> source: /, Kind="}
> {"level":"info","ts":1583261443.9376585,"logger":"kubebuilder.controller",
> "msg":"Starting
> EventSource","controller":"nodefeaturediscovery-controller","source":"kind
> source: /, Kind="}
> {"level":"info","ts":1583261443.9377139,"logger":"kubebuilder.controller",
> "msg":"Starting
> EventSource","controller":"nodefeaturediscovery-controller","source":"kind
> source: /, Kind="}
> {"level":"info","ts":1583261443.9377654,"logger":"kubebuilder.controller",
> "msg":"Starting
> EventSource","controller":"nodefeaturediscovery-controller","source":"kind
> source: /, Kind="}
> {"level":"error","ts":1583261443.9377759,"logger":"cmd","msg":"","error":"no
> kind is registered for the type v1.SecurityContextConstraints in scheme
> \"k8s.io/client-go/kubernetes/scheme/register.go:60\"","stacktrace":"github.
> com/openshift/cluster-nfd-operator/vendor/github.com/go-logr/zapr.
> (*zapLogger).Error\n\t/go/src/github.com/openshift/cluster-nfd-operator/
> vendor/github.com/go-logr/zapr/zapr.go:128\nmain.main\n\t/go/src/github.com/
> openshift/cluster-nfd-operator/cmd/manager/main.go:92\nruntime.main\n\t/opt/
> rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/proc.go:
> 201"}

This is the error tracked in this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1805427

Comment 18 Zvonko Kosic 2020-03-06 14:44:02 UTC
This error has nothing to do with the NFD pods disappearing after upgrade. Let's first fix 1805427, and then we can tackle the update.

Comment 19 Red Hat Bugzilla 2023-09-14 05:53:01 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days