Bug 1822211 - taint a node cause debug node to fail
Summary: taint a node cause debug node to fail
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oc
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.5.0
Assignee: Jan Chaloupka
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-08 13:56 UTC by Raif Ahmed
Modified: 2024-03-25 15:48 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-17 15:56:56 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1780318 0 high CLOSED Machine Config Daemon Daemon Set does not set universal Toleration (and therefore gets booted if taints are set on a nod... 2024-03-25 15:33:23 UTC
Red Hat Bugzilla 1813479 0 medium CLOSED openshift-dns daemonset doesn't include toleration to run on nodes with taints 2024-10-01 16:31:12 UTC
Red Hat Knowledge Base (Solution) 4976641 0 None None None 2020-04-09 10:26:26 UTC

Description Raif Ahmed 2020-04-08 13:56:43 UTC
Description of problem:

Tainting a node causes oc debug node to fail.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Add taints to the node:
oc patch node invd086.nxdi.nl-htc01.nxp.com --type=merge -p '{"spec":{"taints": [{ "key":"infra", "value":"reserved", "effect":"NoSchedule"},{ "key":"infra", "value":"reserved", "effect":"NoExecute"}]}}'

2. Try to debug the node:

oc debug node/invd086.nxdi.nl-htc01.nxp.com

3. The following event is generated: "Generated from taint-controller, Marking for deletion Pod dummy/invd094nxdinl-htc01nxpcom-debug"

Actual results:

The debug pod is deleted.

Expected results:

The debug pod runs and has a toleration that lets it run on any node.

Additional info:

I was able to work around this issue by adding a default toleration to the namespace as an annotation; the toleration was then added to the debug pod created in that namespace.

oc patch namespace dummy --type=merge -p '{"metadata": {"annotations": { "scheduler.alpha.kubernetes.io/defaultTolerations": "[{\"operator\": \"Exists\"}]"}}}'
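
The annotation value in the workaround above is itself a JSON string nested inside the JSON patch, which is why the inner quotes are backslash-escaped. As a minimal sketch (the variable names are illustrative and not part of the original report), the patch can be assembled in the shell and inspected before it is applied. The oc patch call is left commented out because it needs a live cluster:

```shell
# Namespace name taken from the report; adjust as needed.
ns="dummy"

# Toleration list that matches every taint (operator Exists, no key).
tolerations='[{"operator": "Exists"}]'

# The annotation value must be a JSON string, so escape the inner double quotes.
escaped=$(printf '%s' "$tolerations" | sed 's/"/\\"/g')

# Build the merge patch around the escaped value.
patch=$(printf '{"metadata": {"annotations": {"scheduler.alpha.kubernetes.io/defaultTolerations": "%s"}}}' "$escaped")

# Print the patch for review.
printf '%s\n' "$patch"

# To apply it against a real cluster:
# oc patch namespace "$ns" --type=merge -p "$patch"
```

Building the value this way keeps the toleration list readable and leaves the escaping to a single sed call, which is easier to check than hand-escaped quotes.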

Comment 1 Harshal Patil 2020-04-17 15:56:56 UTC
It's working as expected, although this could be added to the documentation or a blog post as a simple recipe for debugging tainted nodes.

Comment 2 Raif Ahmed 2020-04-17 16:34:45 UTC
OK, I will write a blog post on taints in more detail.

Comment 3 yhe 2021-03-03 08:42:50 UTC
You may also need to pass the --to-namespace option to the oc debug node command so that the debug pod is created in the dummy namespace. I have updated the corresponding KCS as well.

$ oc debug node/worker-0.example.redhat.com --to-namespace dummy
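
Putting the two steps from this report together, a possible recipe looks like the sketch below. It assumes the dummy namespace already exists and reuses the example node name from this comment. The run helper and the DRY_RUN switch are hypothetical conveniences, not oc features: by default the script only prints the commands, and it executes them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: debug a tainted node through a namespace that carries default tolerations.
NODE="worker-0.example.redhat.com"   # example node name from the comment above
NS="dummy"                           # namespace used in the report
DRY_RUN="${DRY_RUN:-1}"              # 1 = print commands only, 0 = execute them

# Merge patch that annotates the namespace with a tolerate-everything default.
PATCH='{"metadata": {"annotations": {"scheduler.alpha.kubernetes.io/defaultTolerations": "[{\"operator\": \"Exists\"}]"}}}'

# Print the command in dry-run mode (display only, arguments are not re-quoted),
# otherwise run it unchanged.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

recipe() {
    # Step 1: add the default toleration annotation to the namespace.
    run oc patch namespace "$NS" --type=merge -p "$PATCH"
    # Step 2: create the debug pod in that namespace so it picks up the toleration.
    run oc debug "node/$NODE" --to-namespace "$NS"
}

output=$(recipe)
printf '%s\n' "$output"
```

The order matters in this sketch: the namespace is annotated first so that the debug pod created by the second command is admitted with the toleration already in place.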

