Bug 1784678 - OCP 4.2.12: upi on baremetal - openshift-nfd namespace disappears with nfd pods several hours after deploying Node Feature Discovery operator from operatorHub
Keywords:
Status: CLOSED DUPLICATE of bug 1805394
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Feature Discovery Operator
Version: 4.2.z
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.2.z
Assignee: Carlos Eduardo Arango Gutierrez
QA Contact: Walid A.
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-12-18 03:55 UTC by Walid A.
Modified: 2020-02-28 13:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-28 13:35:14 UTC
Target Upstream Version:
Embargoed:



Description Walid A. 2019-12-18 03:55:20 UTC
Description of problem:
This is on a UPI baremetal OCP 4.2.12 staging cluster running on OpenStack. After deploying the Node Feature Discovery (NFD) operator from OperatorHub in the OCP console and creating an instance of that operator, NFD is deployed successfully and the nfd pods are created in the openshift-nfd namespace. All the nodes get the NFD-specific labels.
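
The label check described above could be scripted roughly as follows. This is only a sketch: the function name `count_nfd_nodes` is made up for illustration, and it assumes the NFD labels use the upstream `feature.node.kubernetes.io/` prefix, which this report does not state explicitly.

```shell
#!/usr/bin/env bash
# Sketch: count the nodes that carry at least one NFD label.
# Assumption (not from the report): NFD labels start with the upstream
# prefix "feature.node.kubernetes.io/".
# count_nfd_nodes is a hypothetical helper name.
count_nfd_nodes() {
  # One output line per node; grep -c counts the lines containing the prefix.
  oc get nodes --show-labels --no-headers | grep -cF 'feature.node.kubernetes.io/'
}

# Usage:
#   count_nfd_nodes    # compare the result with the total number of nodes
```

On the 3 master and 3 worker cluster used here, a result equal to the total node count would match the statement that all nodes were labeled.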

After letting the cluster sit for several hours (at least 6-8 hours, or overnight), the openshift-nfd namespace disappeared along with all the nfd pods.

The NFD operator still shows as installed, and the operator pod in the namespace I created is still running.

Version-Release number of selected component (if applicable):
Server Version: 4.2.12
Kubernetes Version: v1.14.6+32dc4a0


How reproducible:
Seen twice

Steps to Reproduce:
1. UPI baremetal cluster install of OCP 4.2.12 on openstack, 3 master and 3 worker nodes
2. From the OCP console, logged in as the kubeadmin user with the kubeadmin-password, create a new project called test-nfd.
3. From Console Operators -> Operator Hub:  Search for "NFD", and click on Node Feature Discovery operator icon, then install.  Choose: 
- install in namespace you created 
- 4.2 for update channel
- Approval strategy:  Automatic
4. Create instance of that operator

Actual results:
The NFD operator is deployed successfully in the test-nfd namespace and is running. A new namespace called "openshift-nfd" is created, along with nfd-master and nfd-worker pods for the nodes. NFD labels are created successfully on all nodes. The only problem is that after several hours (6-8 hours), the "openshift-nfd" namespace disappears along with all the nfd pods.

$  oc get pods -n openshift-nfd
No resources found.
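
To narrow down when the namespace goes away, a polling loop along these lines might help. It is a minimal sketch, not something that was run for this report: the helper names `ns_exists` and `watch_ns`, the 60-second default interval, and the log line format are all illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: poll for a namespace and print a UTC timestamp once it is gone.
# ns_exists and watch_ns are hypothetical helper names; the interval and
# message format are assumptions, not taken from the report.

# Return 0 if `oc get namespace` succeeds for the given name.
# Caveat: any other oc failure (for example an unreachable API server)
# is also treated as "not found" by this simple check.
ns_exists() {
  oc get namespace "$1" >/dev/null 2>&1
}

# $1 = namespace, $2 = poll interval in seconds (default 60)
watch_ns() {
  local ns="$1" interval="${2:-60}"
  while ns_exists "$ns"; do
    sleep "$interval"
  done
  echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) namespace ${ns} not found"
}

# Usage (blocks until the namespace disappears):
#   watch_ns openshift-nfd 60
```

The printed timestamp could then be matched against the operator pod logs and the must-gather data for the same time window.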

Expected results:
The openshift-nfd namespace should not disappear, and the nfd-master and nfd-worker pods in that namespace should stay running.

$  oc get pods -n openshift-nfd
NAME               READY   STATUS    RESTARTS   AGE
nfd-master-bzsb5   1/1     Running   0          71s
nfd-master-dstjq   1/1     Running   0          71s
nfd-master-t84wf   1/1     Running   0          71s
nfd-worker-2c6bw   1/1     Running   2          72s
nfd-worker-glb55   1/1     Running   2          72s
nfd-worker-tsnj5   1/1     Running   2          72s

Additional info:
A link to the must-gather logs and the output of various oc commands is provided in the next comment.

Comment 2 Walid A. 2019-12-20 20:45:38 UTC
Hitting the same issue on an AWS IPI-installed OCP 4.2.1 cluster.
A link to the logs will be in the next comment.

Comment 6 Carlos Eduardo Arango Gutierrez 2020-02-06 13:30:16 UTC
Assigned to Eduardo Arango -> Cherry pick fix from master

Comment 7 Zvonko Kosic 2020-02-28 13:35:14 UTC

*** This bug has been marked as a duplicate of bug 1805394 ***

