Bug 1947430

Summary: Openshift 4 has a zombie problem
Product: OpenShift Container Platform Reporter: tmicheli
Component: oc Assignee: Maciej Szulik <maszulik>
Status: CLOSED ERRATA QA Contact: zhou ying <yinzhou>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.6.z CC: andbartl, aos-bugs, jokerman, mfojtik, shishika, tmicheli
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: All   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1949022 (view as bug list) Environment:
Last Closed: 2021-05-12 12:18:10 UTC Type: Bug
Bug Depends On: 1949024, 2016500    
Bug Blocks:    

Description tmicheli 2021-04-08 13:15:04 UTC
Description of problem:
After running `oc debug node/<nodename>`, the namespace created for the debug session is not always deleted. The behaviour appears to be random.

Version-Release number of selected component (if applicable):
OCP 4.6.z

How reproducible:


Steps to Reproduce:
1. oc debug node/<nodename>
2. exit the debug session
3. The leftover namespace does not appear every time, and it can appear even when the session is ended cleanly with `exit`.

Actual results:
# oc get ns | grep -i debug
openshift-debug-node-7k4p88q4kk                    Active        78d
openshift-debug-node-7shqwcjgwf                    Active        57d
openshift-debug-node-c5zp2njj2c                    Active        48d
openshift-debug-node-fr768h6jw2                    Active        48d
openshift-debug-node-gxtb2                         Active        21d
openshift-debug-node-hz2z7r9vrs                    Active        26h
openshift-debug-node-pdlxr86vtp                    Active        57d
openshift-debug-node-t4vp6cgxzl                    Active        62d
openshift-debug-node-z2n2s6gmvq                    Active        86d
openshift-debug-node-zjlvczvwfd                    Active        48d

Expected results:
- Debug namespace deleted properly

Additional info:
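The leftover namespaces in the listing under "Actual results" all share the `openshift-debug-node-` prefix, so they can be picked out by name. The following is a minimal sketch, not an official cleanup procedure: the function name `list_debug_namespaces` is hypothetical, and the `default` row in the sample input is illustrative rather than taken from the affected cluster.

```shell
#!/bin/sh
# Read rows in `oc get ns` format on stdin and print the names of
# namespaces that look like leftovers from `oc debug node/...`.
# Assumption: leftovers are identified only by the name prefix seen
# in this report; the function name is hypothetical.
list_debug_namespaces() {
  awk '$1 ~ /^openshift-debug-node-/ { print $1 }'
}

# Sample input: two rows copied from the report and one illustrative
# unrelated namespace that the filter should skip.
printf '%s\n' \
  'openshift-debug-node-gxtb2                         Active        21d' \
  'default                                            Active        90d' \
  'openshift-debug-node-hz2z7r9vrs                    Active        26h' |
  list_debug_namespaces
```

Against a live cluster the same filter could be fed with `oc get ns --no-headers | list_debug_namespaces`. Each name should be reviewed before it is passed to `oc delete namespace <name>`, since a namespace with a recent age may belong to a debug session that is still running.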

Comment 2 Maciej Szulik 2021-04-09 11:12:59 UTC
The additional namespace for debugging was introduced as part of the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1812813
In the end we decided to revert that change, and that happened in https://github.com/openshift/oc/pull/668.
The revert was never backported to 4.6. Is the customer willing to use the 4.7 binary, which does not create that namespace?

Comment 6 Maciej Szulik 2021-04-13 09:21:02 UTC
https://github.com/openshift/oc/pull/808 backports the revert to 4.6.

Comment 7 zhou ying 2021-04-19 02:31:01 UTC
Checked the oc build that includes the related PR; the debug project is no longer created:

In the first terminal, run the debug command:
[root@localhost ~]# ./oc debug node/ip-10-0-129-86.us-east-2.compute.internal
Starting pod/ip-10-0-129-86us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.129.86
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# 


In the second terminal, check for the debug project:
[root@localhost ~]# oc get project |grep -i debug
[root@localhost ~]#

Comment 13 errata-xmlrpc 2021-05-12 12:18:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.28 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1487

Comment 14 Maciej Szulik 2021-11-04 12:02:25 UTC
*** Bug 2020098 has been marked as a duplicate of this bug. ***